Andrew Morton saw Groklaw's coverage of the "Linux is not forking like Unix" article, and he has now graciously provided his speaker's notes from SDForum, on the theme of "the interface between open source software development and the software-using business world." He says, "It's very close to what was said." I know, knowing you like I do, that you will enjoy it much more than any third-party report about what he allegedly said. I found it fascinating reading, and I'm happy I can share it with you now.
You might like to check out Lamlaw's November 22 article on this speech.
SDForum, 16 Nov 2004.
Will today talk about several issues related to the interface between open source software development and the software-using business world. These two big and quite different communities have come together in recent years with considerable success and surprisingly little friction. I'll be looking at matters related to the development, or creation, of open source software rather than, say, the adoption of open source software.
I'll talk for maybe 20 minutes and will leave 15 minutes or so for a q-n-a discussion. All this is just one guy's opinion and I can and do make mistakes. So I'd really seriously ask that if people have disagreements with what I say or if they perceive insufficiencies in it, please let's bring those up in the discussion -- otherwise I'll just end up spouting the same garbage next time I stand up in front of some long-suffering people such as yourselves.
- software engineer, working on Linux kernel.
- Along with LT, have overall responsibility for development, delivery and quality of the public Linux kernel, available from kernel.org. Mainly do that by collecting, integrating and re-releasing the work of the many members of the kernel development team.
The public kernel is an input to the kernel which is released by Linux distributors such as Red Hat and Suse/Novell.
- It takes thousands of software packages to make up a distribution, and the kernel is just one of them. But it is the single thing which defines that distribution as being a "linux" distribution.
- My words are of course most applicable to the kernel project, but can be generalised to many of the most important open source software projects.
- It's interesting to note that the most important and successful open source products implement what one could describe as "legacy infrastructure".
Let's look at those two words:
- These products are implementing something which has been done many times before: operating system kernels, runtime software libraries, window managers, http servers and their variously-tiered tools, mail servers, various forms of file server, image manipulation programs, programming language compilers and interpreters, word processors, spreadsheets, database management, etc.
Many of the above are technologies thirty or more years old. Legacy stuff which everyone knows how to implement. All the intellectual property value was wrung out of these technologies years ago and anyone who ships such products commercially is, to a large extent, providing to their customers a low-margin maintenance and support function.
Why did I describe it as "infrastructure"?
- Many of these successful open source products are implementing functions which other, higher-level software builds upon. The operating system, the libraries, the low-level network servers, the database tools, etc.
All of these provide basic infrastructure which will sit underneath non-open-source software products which are developed and marketed in the conventional commercial manner. ISVs are concentrating their investment and their innovation on higher-level customer-facing products while open source provides the legacy backend of the software stack.
- So the term "legacy infrastructure" places successful open source software into its commercial, historical and IT engineering context.
The rule is not universally true, of course. There are some open source products which are indeed at the state of the art of research in their fields and which are competitive with commercial products. Examples of this would include projects such as valgrind (a form of software debugging tool) and the Ogg Vorbis project, which continues to deliver world-class media streaming codecs.
But such projects are the exception in the open-source world: frankly, if an open-source team is working well together, developing and delivering leading-edge software which others find valuable then that team should go and form a company and take a shot at getting rich with it -- this is not the space where open source licensing makes sense.
Let's look briefly at the resourcing for open source development.
- In the Linux kernel project, pretty well all of the main developers are working fulltime for technology companies of one form or another. The days of the bearded geek working in his basement purely for his own satisfaction are long gone on such projects.
Companies pay staff engineers to work on open source software products for several reasons.
- because they have a commercial interest in the quality of those products.
- so they have some leverage on the product's future feature set.
- so they have staff at hand who understand and can support the product.
- if they manufacture hardware: so the product supports that hardware well.
- Rather than directly hiring their own engineers, companies will also fund open source development by entering into contractual arrangements with other parties, mainly Linux distributors, for all of the same reasons. The main distributors of Linux are employing a large number of world-class engineers who work purely on open source products.
- Companies who contribute to open source projects place their engineers into a peer-to-peer relationship with the rest of the development team, thus gaining influence over the project and total visibility into the development and planning processes.
All these good things do not come for free. They only really come when the company (or, more specifically, individuals within that company) become recognised contributors to the project.
It's a sad fact of life that if someone pops up out of nowhere with a question or a contribution, they will have a hard time getting attention. This is not because of spite. We're not saying "you haven't helped us before so we're not talking to you". The reason why newcomers tend to face barriers is derived from the open source trust model -- if a contributor has a track record then we can take their changes with a degree of comfort. But if we've never heard from the contributor before then we basically need to go through their code line-by-line, and it's a ton more work for already busy people.
One of the things which I do is to try to prevent such things from simply falling on the floor. If someone is having procedural or process problems with the kernel team then they should contact me directly and I can generally offer advice, grease wheels, make things happen, kick heads or whatever else needs doing.
- Companies contribute engineering resources to open source projects for two strategic reasons:
- Firstly: resource pooling. Maintaining an entire OS is expensive, but with open source you get to pool development resources with the other users of the product while retaining many of the benefits of an in-house development project.
- And the second main reason why companies contribute to open source is to avoid vendor lock-in. One way to obtain your low-level software is to simply license it from another IT vendor, and the cost of this could well be similar to the cost of using and contributing to an open source equivalent. But with open source you get full access to all the technology, access to the product's key developers, full rights to modify the product if you need to do so, and good visibility into the product's roadmap. In fact, you can to some extent control that roadmap if you're prepared to put appropriate resourcing into it.
So the one-sentence summary of open source from a technology businessperson's point of view would be: a source of legacy infrastructure software whose development is cost-optimised via resource pooling and which naturally provides protection against vendor lock-in.
How do new features find their way into "legacy infrastructure" open source projects such as the Linux kernel? In other words, what is the requirements analysis and planning process?
- First up, with a legacy project, the feature set tends to be well understood.
We're implementing 30-year-old technology, so we're working to all that prior understanding of how these things should traditionally operate. This removes a lot of uncertainty from the design process.
And to a large extent we're strongly guided by well-established standards: POSIX, IEEE, IETF, PCI, various hardware specs, etc. There's little room for controversy here.
- Generally, new features are small (less than one person-year) and can be handled by one or two individuals. This is especially true of the kernel which, although a huge project, is really an agglomeration of thousands of small projects. Linus has always been fanatical about maintaining the quality and sanity of interfaces between subsystems, and this stands us in good stead when adding new components.
This agglomeration of many small subsystems fits well into the disconnected, distributed development team model which we use.
If the project was a large greenfield thing, such as, say, an integrated security system for the whole of San Jose airport then open source development methodologies would, I suspect, simply come undone: the amount of up-front planning and the team and schedule coordination to deliver such a greenfield product is much higher than with "legacy infrastructure" products.
The resourcing of projects in the open source "legacy infrastructure" world is interesting. We find that the assignment of engineering resources to feature work is very much self-levelling, in that if someone out there has sufficient need for a new feature, then they will put the financial and engineering resources into its development. And if nobody ends up contributing a particular feature, well, it turns out that nobody really wanted the feature anyway, so we don't want it in the kernel. The system is quite self-correcting in that regard.
Of course, the same happens in conventional commercial software development: if management keeps on putting engineers onto features which nobody actually wants then they won't be in management for very long. One hopes. But in the open source world we really do spend zero time being concerned with programmer resource allocation issues -- the top-level kernel developers never sit around a table deciding which features deserve our finite engineering resources for the next financial year. Either features come at us or they do not. We just don't get involved at that level.
And this works. Again, because of the nature of the product: a bundle of well-specified and relatively decoupled features. If one day we decided that we needed to undertake a massive rewrite of major subsystems which required 15 person years of effort then yes, we'd have a big management problem. But that doesn't happen with "legacy infrastructure" projects.
Development processes and workflow
- All work is performed via email. Preferably on public mailing lists so a record of discussions is available on the various web archives. I dislike private design discussions because they cut people out of the loop, reduce the opportunity for others to correct mistakes, and you just end up repeating yourself when the end product of the discussion comes out.
- Internet messaging via the IRC system is used a little bit, but nothing serious happens there -- for a start it's unarchived so for the previously mentioned reasons I and others tend to chop IRC design discussions off and ask that they be taken to email.
- We never ever use phone conferences.
- The emphasis upon email is, incidentally, a great leveller for people who are not comfortable with English -- they can take as much time as they need understanding and composing communications with the rest of the team.
- Contributors send their code submissions as source code patches to the relevant mailing lists for review and testing by other developers. The review process is very important. Especially to top-level maintainers such as myself. I don't understand the whole kernel and I don't have the time or expertise to go through every patch. But I very much like to see that someone I trust has given a patch a good look-over.
- When a patch has passed the review process it will be merged into one of the many kernel development trees out there. The USB tree, the SCSI tree, the ia64 tree, the audio driver tree, etc. Each one of these trees has a single top-level maintainer.
I run an uber-tree called the "mm kernels" which integrates the latest version of Linus's tree with all the other top-level trees (32 at the last count). On top of that I add all the patches which I've collected from various other people or have written myself -- this ranges from 200 to 700 extra patches. I bundle the whole lot together and push it out for testing maybe twice a week.
When we're confident that a particular set of patches has had sufficient test and review we will push that down into Linus's tree, which is the core public kernel, at kernel.org.
- Vendors such as Red Hat and Suse will occasionally take a kernel.org kernel and will add various fixes and features. They will go through a several-month QA cycle and will then release the final output as their production kernel.
- The preferred form of bug reports from testers is an email to the relevant mailing list. We go through a diagnosis and resolution process on the public list, hopefully resulting in a fix. This whole process follows a many-to-many model: everyone gets to see the problem resolution in progress and people can and do chip in with additional suggestions and insights.
This process turns out to be quite effective.
- We do have a formal web-based kernel bug reporting system, using bugzilla. But the bugzilla process is one-to-one rather than many-to-many and ends up being much less effective because of this. I screen all bugzilla entries and usually I'll bounce them directly up to email if I think the problem needs attention.
The mailing lists are high volume and it does take some time to follow them. But if a company wishes their engineering staff to become as effective as possible with the open source products, reading and contributing to the development lists is an important function and engineer time should be set aside for this.
The new kernel development model
People may have heard about this, and it does significantly affect consumers of the public kernel.
In previous kernel release cycles over the past ten years we've followed a model of a 2-, 3- or 4-year development cycle, after which the kernel is declared "stable". Linus will hand that stable kernel off to a lieutenant and Linus will then fork off a new unstable kernel for ongoing development. The forked-off stable kernel is the prime source base for kernel consumers for the next several years and we are very cautious and conservative about what changes are made to it. This had the downside that the stable kernel tended to lag in features and device support, so vendors ended up adding a lot of their own patches before releasing that kernel in distributions. It had the upside that the codebase was as reliable as we could make it.
In July of this year we tossed all that out the window, because we'd discovered that we could keep a massive rate of change flowing into the tree without destabilising it too much. This is partly because we've improved our processes and tools (the adoption of the bitkeeper revision control system helped here). It is also a reflection of the increasing number and skill of the kernel development team. It is also a reflection of the increased quality of the kernel itself: the need for massive destabilising rewrites which break the whole world simply isn't there any more, so we don't need those long development cycles.
So we're currently running on a roughly two-month development cycle. We pack a whole lot of new features and enhancements into each release, but they're months apart rather than years apart. So the kernel continues to make good progress, but at a steadier rate.
This does mean that the production kernel is not as bug-free as it would be if we were concentrating only upon stability, but we do continue to modify these new processes and the kernel's quality does continue to improve even as we add new features to it.