managing open source software
Mark Harrison <markh@usai.asiainfo.com>
Mark Harrison is Chief Software Architect at AsiaInfo Software
R&D in Beijing, China. He is grateful to all software developers
who make their software easy to build and install.
The first obstacle was management and the arguments regarding free software so prevalent at the time. You have heard them all before. How could something free be any good? We don't want software from some dope-smoking college student. Where's the sales rep to take us out to lunch? Where do we get support?

Things sat for a while until the Camel book was published. Thus emboldened, I took my copy to our VP, seeking permission once again. This time I was successful. He immediately understood what Larry was doing: "Ha Ha. Give away some software; clean up on the book end. What a bunch of suckers. I like this guy!" This was not quite the reasoning I had intended to use, but I let it go at that. "Yes, sir, I think you've got him pegged. No use trying to pull the wool over your eyes!" [1]

The same thing happened when I started using Tcl/Tk. Upon publication of his book, John Ousterhout went from "suspected college radical" to "cunning entrepreneur" [2].

Once I had the blessing of the powers that be, I hit the second obstacle: managing the configuration and deployment of our open source software. My first tries were rather haphazard -- variations on tarring /usr/local and putting it on the distribution tape (quite a mistake, as we will see). I learned a lot along the way. I hope I can help you avoid mistakes and deploy your open source solutions with a minimum of fuss and trouble.

Although I emphasize the case where you need to ship some of the open source packages as part of your own product or install onto a computer that is not part of your local network, the process is still useful even if you are just installing the software locally. Ignore the steps that don't apply to you.

Getting Started

The first thing to do is to decide on the directory structure that will work best for your project. Several guidelines make this easier.

Don't install the software into /usr/local.
This will work fine for your site, but when you try to install the software at the customer site, you will probably conflict with things the customer has already placed there. Instead, choose an installation prefix that will be unique to your organization. Get permission from your customers to create that directory on their system. Nobody else will install software there, so you don't need to worry about version conflicts, etc. As an example, we use the prefix /aitools so that executables are installed in /aitools/bin.

Build from a directory that is not a subdirectory of your installation directory. I initially made the mistake of building in /aitools/src. I wanted to test my installation procedure on a clean directory. Having the source code, etc., sitting there made this a lot more difficult because I was naturally reluctant to delete it regularly.

Don't put anything in the installation directory besides the software you wish to ship to the customer site. I installed emacs and the other usual tools there and immediately had a huge directory that would have used up all the disk space on the customer system.

For the rest of this article, I will continue to use this directory structure as an example:
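As a rough sketch, a layout along these lines can be set up as follows. It assumes the installation area /aitools and the separate download and source areas /aitools2/dist and /aitools2/src used in this article; the make_layout function and its optional root argument are illustrative names only, added so the layout can be tried in a scratch directory without root access.

```shell
#!/bin/sh
# make_layout [ROOT]: create the example directory layout.
# ROOT is a hypothetical convenience; leave it empty to create the
# real /aitools and /aitools2 trees.
make_layout() {
    root="${1:-}"
    # Installation area: only what ships to the customer goes here.
    mkdir -p "$root/aitools/bin" || return 1
    # Download and build areas, deliberately outside the install prefix.
    mkdir -p "$root/aitools2/dist" "$root/aitools2/src" || return 1
}
```

Calling make_layout /tmp/scratch creates the same tree under /tmp/scratch, which is handy for rehearsing the installation procedure on a clean directory.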
Source Control

Most open source software packages are distributed as tar files, usually with the version number as part of the filename. All downloaded distributions and patches are stored in /aitools2/dist. I untar these files under /aitools2/src and import them into the source repository.

It is usually easier to configure the software before importing. If the configuration process creates any important files, it will save you the trouble of having to add them to the repository separately.

Of course, you will commit your changes to the repository whenever you apply a patch. This is one area where the patch program and CVS work very well together. Performing the following steps will make it very easy to keep track of the patches that have been applied:
    cvs update # make sure files are up to date

Of course, any local modifications you make will be similarly tracked. If you fix a problem, don't forget to send a note to the package maintainer!

If you use a checkout-based version control system, you can either inspect the patch file to see which files will be affected or check out everything and release the checkout for all the files that are not modified by the patch.

On a related note, if you use a filesystem-based version control system such as ClearCase, you may need to scan the build procedure and replace commands that affect directory entries (such as chmod) with the appropriate commands from the version control system (such as cleartool protect -chmod).

In general, I don't recommend overlaying major distribution versions on top of each other, even though this is how the package maintainers keep the files. Unless you are seriously maintaining the package yourself, it seems to be easier to consider each release of the package as an independent product. Using Tcl as an example, your directory structure would look like this:
/aitools/src/
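Returning to the patch workflow described earlier in this section, here is a minimal sketch of one way to wrap it, assuming you are in the top of a CVS working directory for the package. The apply_patch name is hypothetical, and the -p1 strip level is an assumption that depends on how the patch file was generated.

```shell
#!/bin/sh
# apply_patch PATCHFILE: update, patch, and commit in one step.
# Run from the top of a checked-out CVS working directory.
apply_patch() {
    patchfile="$1"
    [ -r "$patchfile" ] || { echo "cannot read $patchfile" >&2; return 1; }
    cvs update || return 1                  # make sure files are up to date
    # -p1 is an assumption; check the paths inside the patch file first.
    patch -p1 < "$patchfile" || return 1
    # Record which patch was applied in the commit message.
    cvs commit -m "Applied $(basename "$patchfile")"
}
```

If a patch creates new files, they still need a cvs add before the commit will pick them up, so it is worth checking the output of patch before relying on a wrapper like this.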
Configuration and Building

One of the most frequently made mistakes during the configuration process is not reading the configuration instructions. Be sure to do this, paying special attention to the specification of the installation area.

When you configure the software, specify /aitools as the prefix for the installation area. After the software is configured, build and test it according to the package instructions.

If you are going to be creating packages, you should take before and after snapshots of the installation area to assist in making the list of files to be included in the package.

The majority of open source software currently uses GNU autoconf (a wonderful system!) to configure the package. The steps discussed so far for a typical autoconf-based package would go like this:

First, we unpack and configure the source code.
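A minimal sketch of this step, assuming a gzip-compressed tarball that unpacks into a directory named after itself and a package that ships a standard autoconf configure script. The unpack_and_configure function and its argument order are illustrative, not part of any tool.

```shell
#!/bin/sh
# unpack_and_configure TARBALL SRCDIR PREFIX
#   TARBALL  absolute path, e.g. a file under /aitools2/dist
#   SRCDIR   where source trees live, e.g. /aitools2/src
#   PREFIX   installation prefix, e.g. /aitools
unpack_and_configure() {
    tarball="$1"; srcdir="$2"; prefix="$3"
    # Assumption: pkg-1.0.tar.gz unpacks into a directory called pkg-1.0.
    pkgdir=$(basename "$tarball" .tar.gz)
    (
        # Subshell, so the caller's working directory is left alone.
        cd "$srcdir" || exit 1
        gzip -dc "$tarball" | tar xf - || exit 1
        cd "$pkgdir" || exit 1
        ./configure --prefix="$prefix"
    )
}
```

Passing the prefix as an explicit argument keeps the installation area visible at the call site, which is the setting most worth double-checking against the package's own instructions.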
Next, we need to put the package under version control. With CVS, we usually import the package, delete the source tree, and then check the package out. This results in a properly controlled CVS directory.
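The import, delete, and checkout sequence might be sketched like this, assuming CVSROOT already points at an initialized repository. The import_and_checkout function is a hypothetical wrapper; the vendor and release tags are whatever your site chooses (CVS tags must start with a letter and cannot contain dots, so something like tcl8_0 rather than 8.0).

```shell
#!/bin/sh
# import_and_checkout SRCDIR PKG VENDOR_TAG RELEASE_TAG
# Requires CVSROOT to point at an existing repository.
import_and_checkout() {
    srcdir="$1"; pkg="$2"; vendor="$3"; release="$4"
    [ -n "$pkg" ] || { echo "package name required" >&2; return 1; }
    (
        # Import the configured source tree as module PKG.
        cd "$srcdir/$pkg" || exit 1
        cvs import -m "Import of $pkg" "$pkg" "$vendor" "$release" || exit 1
        # Discard the uncontrolled tree, then recreate it from the
        # repository so it becomes a proper CVS working directory.
        cd "$srcdir" || exit 1
        rm -rf "$pkg"
        cvs checkout "$pkg"
    )
}
```

Because the original tree is deleted only after the import succeeds, a failed import leaves the unpacked source in place.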
Next, we build and test the package.
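A small sketch of the build and test step, keeping a log of each phase so the results can be handed to QA later. The build_and_test function and the TEST_TARGET override are illustrative; the name of the test target varies by package (Tcl uses "test", many GNU packages use "check"), so consult the package instructions.

```shell
#!/bin/sh
# build_and_test PKGDIR: build the package, run its tests, keep logs.
build_and_test() {
    pkgdir="$1"
    (
        cd "$pkgdir" || exit 1
        make > build.log 2>&1 ||
            { echo "build failed, see build.log" >&2; exit 1; }
        # TEST_TARGET is a hypothetical override for packages whose
        # test target is not called "test".
        make "${TEST_TARGET:-test}" > test.log 2>&1 ||
            { echo "tests failed, see test.log" >&2; exit 1; }
    )
}
```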
And finally, install the software into /aitools. We take the chance to make a snapshot of the installed files at this time.
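One simple way to take the before and after snapshots is to list the installation area on each side of the install and compare the two lists. The sketch below uses only find, sort, and comm; the snapshot and new_files function names and the list filenames are illustrative.

```shell
#!/bin/sh
# snapshot DIR: print a sorted list of every file and symlink under DIR.
snapshot() {
    (cd "$1" && find . \( -type f -o -type l \) -print | LC_ALL=C sort)
}

# new_files BEFORE AFTER: print entries that appear only in AFTER.
new_files() {
    LC_ALL=C comm -13 "$1" "$2"
}

# Typical use (keep the list files outside the installation area):
#   snapshot /aitools > before.list
#   make install
#   snapshot /aitools > after.list
#   new_files before.list after.list > package.list
```

This compares names only, so a file that the install overwrites in place will not show up in the difference. Installing into a clean directory, as recommended earlier, avoids that problem.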
Quality Assurance and Project Management

If you are preparing your open source software for shipment to a customer, you will need to take into account your company's QA and Project Management groups.

Your QA group probably has some standard procedures for documenting and executing test cases on your company's products. If so, this is a chance to let your open source software shine. By the very nature of open source development (source releases, multiple platforms, many people building and testing the software), most open source projects end up with an impressively large collection of test cases. Many of these are in the form of automatically verifiable regression tests. So take the opportunity to write up a test document using your company's standard format. The sheer volume of tests (for Tcl/Tk, nearly 10,000 combined tests) can garner quite a few supporters for your favorite software.

Likewise, your project management group probably has a standard scheme for assigning product codes and the like. Follow up with this, and get each piece of software its own product code, project number, or whatever it is your company uses to track its software. It may seem like a bureaucratic headache, but cooperating in this area makes your open source software seem a lot more "normal."

Building the Packages

Once the software is built and installed, you need to create an installation package. There are several options for creating the package, but the one I like best these days is RPM, the Red Hat Package Manager. Originally written to manage Red Hat Linux, it now runs on all major UNIX platforms. I don't have the space to go into all of its features, but you can get more information at <http://www.redhat.org>.

Most software packages install both runtime files (executables, shared libraries, runtime support files, etc.) and development related files (header files, documents, static libraries, demos, etc.).
Depending on your situation, you can create either a package consisting of all the files, both runtime and development, or two separate packages for the runtime and development related files. Which you choose depends on your particular circumstance. Will the customer benefit from having the full installation? Do you have the disk space to spare?

Splitting the packages (if that is what you wish to do) is relatively straightforward. The easiest way to do it is to copy all the files into a temporary location and start deleting all the development files. You will eventually be left with only the executables and runtime support files. When you have this list, take the complement, and that becomes your development list.

Installation Procedures

Now that you have your package(s) built, you are ready to deploy to your customer site. You can include your newly built open source runtime packages with your standard distribution media, and just add a step in your installation procedure to install those prerequisite packages first.

If you use a package manager such as RPM, you can specify that these packages are prerequisites to your own packages and rely on the fact that RPM will make sure the packages are installed in the proper order. Once the packages are installed, you use the standard package maintenance commands to query and verify your installation.

Relocating Packages

In some cases, you might not have the luxury of being able to specify a particular directory where your runtime tools will be installed. If this applies to you, there are three options open to you.
Relocating Programs

It is not difficult to relocate packages, although there may be some system dependencies in doing so. The basic procedure is:
Creating Standalone Programs

Sometimes it is desirable to eliminate all external dependencies on a software package, thereby eliminating the need to ship a runtime system for that package.

If the package you are using is provided as simply a set of library files, you can either build the software with static libraries or include the correct linker flags (such as -Bstatic) to cause the linker to include the library in the executable file. You will pay the cost of larger executables and possibly less efficient memory usage, but this may be an acceptable trade-off for your project.

If you are using one of the popular open source languages, you may be able to use the special features of that language to eliminate the need for shipping the runtime files normally associated with that language. Two examples of this should suffice.

If you are using Python, you can use freeze to create a C file that, when compiled and linked, will produce a standalone executable with no external dependencies. If you are using Tcl, you can use tcl2c, one of the components of the Plus Patches, to do the same thing.

Similar features exist for most of the common languages. Consult your documentation or resident language guru for details.

Summary

There are countless examples where open source software has made a significant positive impact on commercial development projects. Proper source and project management is a critical component in making your open source efforts a success.

Notes

[1] Of course, those people who have read the last paragraph of the Perl README might have a better understanding of Larry's true motivation.

[2] We can see a pattern starting to form. It seems people in business relate to people out to make a buck.
First posted: 17th February 1999 jr Last changed: 17th February 1999 jr