Is SCO Math-Challenged?

Friday, August 15 2003 @ 06:14 AM EDT

Contributed by: PJ

When SCO "terminated" Sequent's license on August 13, it said:

". . .Sequent-IBM has nevertheless contributed approximately 148 files of direct Sequent UNIX code to the Linux 2.4 and 2.5 kernels, containing 168,276 lines of code. This Sequent code is critical NUMA and RCU multi-processor code previously lacking in Linux."

I got the following email from a programmer, in which he challenges those numbers:

". . . NUMA and RCU implementation is at most no more than a dozen or two files in Linux. Each one is likely 1500 lines on average. So how does that get to be 148 files and 168276 lines? I think there are 2 possibilities and one of them is very interesting IMHO because they would have to be saying that they are extending their theory of derived works to large parts of linux.

"1. They are counting multiple revisions of the files.

"2. They are counting everything that NUMA and RCU _touches_, not just the implementation itself.

"Two other interesting things are the precision with which they state this is clearly bogus. Does that include whitespace? How about comments? Why not 168277 lines? The only way I can see that happening is pick a version (which one? there are over a hundred major releases from Linus alone in 2.4 and 2.5 series), find the RCU and NUMA functions/headers and count every line in every file that includes an RCU/NUMA header or calls an RCU/NUMA function. Anything else would require SCO to have access to the Sequent code (or 3. Just make things up).

". . . Defending these exact numbers is going to be a burden . . ."
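The arithmetic behind that challenge is easy to check. Here is a minimal Python sketch that uses only the figures quoted above: SCO's 148 files and 168,276 lines, and the email's "a dozen or two files" at "likely 1500 lines on average". The variable names are made up for illustration, and nothing is measured from an actual kernel tree.

```python
# Figures quoted in the text; nothing here is measured from any source tree.
SCO_FILES = 148
SCO_LINES = 168_276

# Average file size implied by SCO's own numbers.
sco_average = SCO_LINES / SCO_FILES

# The emailed estimate: "a dozen or two files", "likely 1500 lines on average".
EST_LINES_PER_FILE = 1500
estimate_low = 12 * EST_LINES_PER_FILE
estimate_high = 24 * EST_LINES_PER_FILE

print(f"SCO's claim implies about {sco_average:.0f} lines per file")
print(f"Emailed estimate: {estimate_low} to {estimate_high} lines in total")
print(f"SCO's total is {SCO_LINES / estimate_high:.1f}x the high end of that estimate")
```

SCO's figures work out to an average of 1,137 lines per file, which is not far from the programmer's guess of 1,500. The gap is in the file count: even two dozen files at 1,500 lines each give 36,000 lines, so SCO's total is more than four times the high end of the estimate.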

I noticed Adam Baker commented on that same 8/13 story, and he did some calculations of his own:

"I've just had a quick grep through the (2.4.19) kernel source and other than trivia such as calling rcu_init() at startup, RCU seems to consist of one source file (kernel/rcupdate.c) with a corresponding header file and NUMA of one architecture independent file, most of which will be ignored by the compiler in some configurations because of #ifdef statements and less than 20 architecture specific source files a few of which would be used in any particular kernel that supported a specific NUMA machine. Large chunks of this code would also be Linux specific.

"If you exclude the areas which aren't really part of the core kernel such as networking, filesystems, device drivers, SCO's GPLed ABI stuff and sound libraries then the kernel only consists of 86 architecture independent source files and a further 85 to support the i386 platform and altogether they only total about 100,000 lines."
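Any count like these depends on the counting rule, which is why the earlier email asks about whitespace and comments. Here is a rough Python sketch of three possible rules applied to the same text. The `count_lines` helper and the sample snippet are hypothetical, the comment stripping is only an approximation (it ignores comment markers inside string literals), and nothing here is known to reflect how SCO or anyone quoted above actually counted.

```python
import re

# Approximate C comment patterns; comment markers inside string literals
# are not handled, so treat the third number as a rough figure.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT = re.compile(r"//[^\n]*")

def count_lines(source):
    """Return (all lines, non-blank lines, non-blank lines once comments are removed)."""
    all_lines = source.splitlines()
    non_blank = [ln for ln in all_lines if ln.strip()]
    stripped = LINE_COMMENT.sub("", BLOCK_COMMENT.sub("", source))
    code_only = [ln for ln in stripped.splitlines() if ln.strip()]
    return len(all_lines), len(non_blank), len(code_only)

# Hypothetical snippet for illustration, not taken from any kernel.
SAMPLE = """/*
 * Hypothetical snippet, not taken from any kernel.
 */

int counter;   /* trailing comment */

// line comment
int bump(void) { return ++counter; }
"""

print(count_lines(SAMPLE))  # (8, 6, 2)
```

The same eight-line snippet counts as 8, 6, or 2 lines depending on the rule, so a figure stated to the exact line means little unless the rule and the kernel version are stated with it.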

I'm not a programmer, so I can't speak to this, but when I get information from two sources I trust, it's time to put it up on Groklaw. Did SCO flunk math, as well as GPL Summer School? Or is this foreshadowing an attempted land grab for "derivative" code?

P.S. I just got an email from Roberto Dohnert. He did the math and he says: "To be exact, Numa and RCU come out to 29 files with 1836 lines of code."