Using pre-compiled RPMs and compiling applications from source
IBM e-business architect Chris Walden is your guide through a nine-part developerWorks series on moving your operational skills from a Windows to a Linux environment. In this final part, we download and compile a software package, discuss the pros and cons of automated package management, and get to know the RPM system.
One of the first things you notice when you install Linux is how many packages are available with your distribution. Most distributions come with the Linux operating system, installation tools, and administration tools. Then they include Internet tools, development tools, office tools, games, and some things that you haven't even heard of. It is not uncommon for a Linux distribution to come with thousands of available packages. If you didn't select "install everything," then only a subset of these packages was installed.
Now you may be wondering "How do I remove packages I don't want? How do I install things I missed? Can I use software that didn't come with my distribution?"
While Linux was installing, you probably noticed a lot of information about RPMs being installed. RPM stands for Red Hat Package Manager, a contribution by Red Hat that has become a standard for managing software on Red Hat and UnitedLinux as well as on many other distributions.
Essentially, an RPM is a package containing software for Linux that is ready to install and run on a particular machine architecture. For example, we installed the webmin package from an RPM in "Part 3. Introduction to Webmin." All of the software initially loaded in your distribution was installed from RPMs.
Anatomy of an RPM
An RPM is a package of files. It carries metadata generated from the package's .spec file: information about the package, its function, and its dependencies (the packages that must be in place before it can run). That metadata also includes a manifest of the files in the package, where they must be placed on the system, and what their initial permissions will be. The RPM can also contain a pre-install script, which is written by the package developer. Then the RPM contains the compiled binary files. Finally, the RPM can contain a post-install script.
When an RPM is installed, the system first looks to see if the dependencies for the package are satisfied. If not, then the installation terminates unless you specify options to force an install anyway.
If all is clear, the pre-install script runs. This script can do anything. Normally it creates users and directories. However, it can do many types of dynamic configuration, even custom-compile source code for the running system.
Know where your RPMs have been
When RPMs install, they copy files onto your system and execute scripts. Because RPM is run as root during installation, all of these functions are performed as root. It is therefore important that you know the origin of an RPM before you install it on your system. Just as with Windows software, malicious code can be hidden inside an RPM as easily as inside any other package. RPMs from your distribution's vendor are generally safe, but be cautious about downloading and installing things from unknown sources.
If the pre-install script completes successfully, then the binary files are copied onto the system according to the manifest. Once all of the files have been copied and their permissions are set, then the post-install script is run. Again, this script can do almost anything.
Once all of that is completed, the information about the package is added to the RPM database, and the installation is complete. With this simple system, it is possible to perform all of the functions that could be done with a more elaborate commercial installer.
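The install sequence described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how rpm itself is implemented: the Package class, the install function, and the package names "editor" and "libtext" are all hypothetical, the "filesystem" and "database" are plain dictionaries, and the handling of a failed post-install script (it is only logged) is a guess, since the text does not say what happens in that case.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


def _succeeds() -> bool:
    """Default script: does nothing and reports success."""
    return True


@dataclass
class Package:
    """Hypothetical stand-in for an RPM: metadata, two scripts, a manifest."""
    name: str
    requires: Set[str] = field(default_factory=set)
    manifest: Dict[str, int] = field(default_factory=dict)  # path -> permissions
    pre_install: Callable[[], bool] = _succeeds
    post_install: Callable[[], bool] = _succeeds


def install(pkg: Package, database: Dict[str, Package],
            filesystem: Dict[str, int], force: bool = False) -> List[str]:
    """Run the install steps in the order the text describes; return a log."""
    log: List[str] = []

    # 1. Check dependencies against the packages already recorded.
    missing = sorted(pkg.requires - set(database))
    if missing and not force:
        raise RuntimeError(f"{pkg.name}: unmet dependencies: {', '.join(missing)}")
    log.append("dependencies checked")

    # 2. Run the pre-install script; files are copied only if it succeeds.
    if not pkg.pre_install():
        raise RuntimeError(f"{pkg.name}: pre-install script failed")
    log.append("pre-install script ran")

    # 3. Copy files and set their permissions according to the manifest.
    for path, mode in pkg.manifest.items():
        filesystem[path] = mode
    log.append(f"{len(pkg.manifest)} file(s) copied")

    # 4. Run the post-install script (assumption: its result is only logged).
    ok = pkg.post_install()
    log.append("post-install script ran" if ok else "post-install script failed")

    # 5. Record the package in the database.
    database[pkg.name] = pkg
    log.append("recorded in database")
    return log


if __name__ == "__main__":
    db: Dict[str, Package] = {}
    fs: Dict[str, int] = {}
    install(Package("libtext"), db, fs)
    editor = Package("editor", requires={"libtext"},
                     manifest={"/usr/bin/editor": 0o755})
    for step in install(editor, db, fs):
        print(step)
```

Installing "editor" before "libtext" in this model raises an error unless force=True is passed, which mirrors the behavior of terminating on unmet dependencies unless you force the install.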
The RPM database
The piece of the RPM system that adds elegance is the RPM database. This database typically lives in the /var/lib/rpm directory and holds information about every RPM installed on the system. The database knows the dependency relationships between packages and will warn you if removing a package could cause other packages to break. The database knows about every file that was originally installed with a package and its original state on the system. It also knows the locations of the documentation and configuration files for each package. This may sound like a lot of information, and it is. But it isn't bloated and bulky. On a system containing 1,066 packages, comprising 203,272 files, the database files are only 45 MB! RPM uses the database to check dependencies when packages are installed and removed. Users can also query the database for information on packages.
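To make the kinds of questions the database can answer more concrete, here is a minimal in-memory sketch in Python. It is an illustration only: the PackageDatabase class and its method names are hypothetical, the real database is an on-disk format rather than Python dictionaries, and the package names and file paths are made up. The comments note the rpm command-line queries that each method loosely resembles.

```python
from typing import Dict, Iterable, List, Optional, Set


class PackageDatabase:
    """Hypothetical toy model of the facts the RPM database records."""

    def __init__(self) -> None:
        self._requires: Dict[str, Set[str]] = {}  # package -> packages it needs
        self._files: Dict[str, Set[str]] = {}     # package -> files it installed

    def add(self, name: str, requires: Iterable[str] = (),
            files: Iterable[str] = ()) -> None:
        """Record a newly installed package."""
        self._requires[name] = set(requires)
        self._files[name] = set(files)

    def files_of(self, name: str) -> List[str]:
        """List a package's files (loosely comparable to `rpm -ql`)."""
        return sorted(self._files[name])

    def owner_of(self, path: str) -> Optional[str]:
        """Find which package installed a file (loosely like `rpm -qf`)."""
        for name, files in self._files.items():
            if path in files:
                return name
        return None

    def dependents_of(self, name: str) -> List[str]:
        """List packages that need this one (loosely like `rpm -q --whatrequires`)."""
        return sorted(p for p, req in self._requires.items() if name in req)

    def remove(self, name: str, force: bool = False) -> None:
        """Forget a package, refusing if others depend on it (loosely like `rpm -e`)."""
        blockers = self.dependents_of(name)
        if blockers and not force:
            raise RuntimeError(
                f"{name} is needed by: {', '.join(blockers)}")
        del self._requires[name]
        del self._files[name]


if __name__ == "__main__":
    db = PackageDatabase()
    db.add("libtext", files=["/usr/lib/libtext.so"])
    db.add("editor", requires=["libtext"], files=["/usr/bin/editor"])
    print(db.owner_of("/usr/bin/editor"))
    print(db.dependents_of("libtext"))
```

In this model, removing "libtext" while "editor" is still recorded raises an error, which corresponds to the warning that removing a package could break others.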