Monthly Archives: February 2008

Breaking packages

I’ve used the term “breaking packages” a few times. As I said, I maintain my Linux boxes without a package manager. So, how did I get these Linux boxes?

My main Linux computer has over half a million files in its filesystem, and over 3000 separate executables. Where did they all come from? You need some way to start out; your computer isn’t going to do much without a kernel, a shell, and a compiler.

In 1994, I installed Slackware on a 486-based computer. This computer had about 180 MB of hard drive space (nowadays that wouldn’t even hold half of the kernel source tree) and 16 MB of RAM. At that time, Slackware didn’t really have a package manager. It had packages, which were just compressed tar files of compiled binaries, grouped by function. If you weren’t interested in networking, you didn’t download the networking file. If you weren’t interested in LaTeX, you didn’t download that file. Because of this very coarse granularity, there were only a few dozen “packages”. Functions like “upgrade”, “install”, and “find package owning file” weren’t present. An upgrade was the same as an install: you extracted the new package into the filesystem, and it would probably replace the old one. To find out which package provided a certain file, you could look in per-package lists of files.
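To illustrate, a package of that era was essentially just a tarball extracted at the filesystem root. Here’s a toy version of the “install” step, using scratch directories instead of a real root (all names and paths are made up for the demonstration):

```shell
#!/bin/sh
# Build a toy "package" and install it the Slackware-1994 way:
# extract the tarball at the target root.  An "upgrade" was just
# extracting a newer tarball over top of the old files.
set -e
pkgdir=$(mktemp -d)    # where we assemble the fake package
root=$(mktemp -d)      # stands in for the real filesystem root

# A fake package containing one "binary".
mkdir -p "$pkgdir/usr/bin"
echo 'echo hello' > "$pkgdir/usr/bin/hello"
(cd "$pkgdir" && tar czf "$pkgdir.tgz" usr)

# "Install": untar the package at the root.
(cd "$root" && tar xzf "$pkgdir.tgz")
ls "$root/usr/bin"     # -> hello
```

The per-package file list mentioned above was just the output of `tar tzf` on the same tarball.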

So, I never really had a package manager on that system. When I needed new programs, I downloaded the source code, compiled it, and installed it. When I moved to a new system, I brought backup images or a live hard drive to the new computer; I didn’t start with a blank hard drive, I started with the hard drive from the old computer I was replacing. Over the years, I have replaced every executable that was installed in 1994 (I know this because all of the files installed then were in a.out format, and I have only ELF binaries on my computer now).
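The a.out-versus-ELF check is easy to repeat: every ELF executable begins with a fixed four-byte magic number, 0x7f followed by the letters “ELF”, while a.out binaries start differently. A minimal check on a Linux box (`file(1)` would report the format more verbosely):

```shell
# Dump the first four bytes of a known executable.  On a modern Linux
# system this prints the ELF magic: 177 (octal for 0x7f), E, L, F.
head -c 4 "$(command -v ls)" | od -An -c
```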

Sometimes, though, I’ve started with a computer that had a distribution installed on it. At a previous job, my laptop came with Mandrake Linux installed on it. I tried to keep the distribution alive for a while, but eventually got impatient with the package management system and broke the packages.

So, if you give me a new Linux computer and tell me it’s mine to modify, a good first step for me is to kill the package manager. On an RPM-based system, that’s generally achieved by recursively deleting the directory /var/lib/rpm. After that, the rpm command will stop working, and I have the finer control and more difficult task of managing the box myself.

Breaking packages on the MythTV box

As I mentioned earlier, I have a MythTV computer, installed from packages, but I’ve broken some of the packages. Here are some of the issues that I had with the packages, and how I solved them.

Two of the package manager drawbacks I’ve mentioned previously appear here: the one-size-fits-all approach to software packaging, and the failure to receive timely updates.

The MythTV box is on old hardware. Because it has hardware assistance for both MPEG encoding and decoding, I didn’t need a new computer with a fast CPU. The fact that this is old hardware, with a 7-year-old BIOS, may be why I had problems, but I found it easier to break the packages than to try to solve the problems under the constraints of the package system.

First, the MythTV box controls an infra-red LED attached to its serial port, allowing it to change the channels on a digital cable box. This requires the use of the LIRC package, and the lirc_serial kernel module. Well, at the time I set this up, the lirc_serial module was having problems with the SMP kernel. The system would generate an oops quite regularly when it wanted to change channels. Looking at the oops logs, I could see that there were problems specifically with SMP. My MythTV box has only one CPU, so I didn’t need an SMP kernel, but because some users will have SMP computers, the KnoppMyth distribution ships with an SMP kernel. I tried to find a non-SMP kernel for the system, without success. So, the easiest way to fix the problem was just to download a recent kernel source tree from kernel.org, copy the configuration file from the Knoppix kernel, and reconfigure it as non-SMP. The spontaneous reboots stopped occurring. The package manager still believes that it knows what kernel is running on the computer, but that isn’t what is really installed.
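The rebuild amounted to a stock kernel.org build that reuses the packaged configuration. A sketch of the steps, with the kernel version as a placeholder and the config filename guessed from convention (the real names will differ):

```shell
# Fetch a recent kernel and reuse the distribution's configuration,
# then switch off SMP.  Version number and paths are illustrative.
cd /usr/src
wget https://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.24.tar.bz2
tar xjf linux-2.6.24.tar.bz2
cd linux-2.6.24

cp /boot/config-$(uname -r) .config   # start from the packaged kernel's config
make oldconfig                        # answer prompts for options new to this kernel
make menuconfig                       # turn off "Symmetric multi-processing support"

make                                  # build the kernel and modules
make modules_install install          # install modules, kernel image, and System.map
```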

When I installed the MythTV box, the software was still a bit immature, and a stability fix in the form of version 0.20 came out several months later. I waited a few weeks with no update to the distribution, and no word of when an update might become available. Eventually, I grew impatient and downloaded the source code of 0.20 myself, recompiled it on the MythTV box, and installed it over top of the existing programs.
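Installing a source build over a packaged one is the usual configure-and-make dance, pointed at the same prefix the package used. A sketch, with the tarball URL and prefix as assumptions rather than the exact commands I ran:

```shell
# Build MythTV 0.20 from source and install it over the packaged copy.
# URL and prefix are illustrative; the key point is matching --prefix
# to wherever the package manager originally put the files.
wget http://www.mythtv.org/mc/mythtv-0.20.tar.bz2
tar xjf mythtv-0.20.tar.bz2
cd mythtv-0.20
./configure --prefix=/usr
make
make install        # overwrites the package-owned binaries in place
```

After this, of course, the package database still lists the old version — one more way the box’s packages are “broken”.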

There was one other impact of the one-size-fits-all approach that caused difficulties with the MythTV box. I was regularly recording a television show between 6:00AM and 6:30AM. A few minutes before the end of the show, the recording would have problems: the audio would break up, and the video would jump. It appeared that the program was losing frames of data, either because it was losing interrupts, or because it couldn’t get the data to the disk quickly enough.

Because it happened at about the same time every day, I suspected a cron job. I got a root shell on the box, and asked for the list of all root-owned cron jobs with the command “crontab -l”. This reported that there were no root-owned cron jobs. I mistrusted this result, and did more investigation. As I mentioned in the first post, distribution packagers often break up a configuration file into a set of separate files. They did that with cron jobs, which means that the command-line tool that ought to tell you all about root-owned cron jobs didn’t report the full set of such processes. A bit of digging around in /etc showed that the slocate database update was being run at that time. This process scans the entire disk, making a list of the files on it. While probably useful in a general context, it’s an unnecessary operation on an appliance box that isn’t changing, particularly when it generates so much bus traffic that the primary function of the box is degraded.

My solution was to change the /etc/crontab file (which is, itself, not viewed by “crontab -l”) so that a cron job would be skipped if there were any users (reported by the ‘fuser’ command) of either of the two video input devices, /dev/video0 and /dev/video1.
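The resulting /etc/crontab entry might look something like this (the time and the updatedb path are illustrative; the fuser guard is the real point):

```shell
# /etc/crontab fragment (illustrative).  fuser -s exits 0 when any of
# the named files is in use, so the || skips the disk scan whenever a
# recording has either capture device open.
25 6 * * *  root  fuser -s /dev/video0 /dev/video1 || /usr/bin/updatedb
```

Note that the system crontab, unlike a per-user one, carries a user field (“root” here) between the schedule and the command.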

My hardware environment

I have two computers that I use for my work. One is an x86 laptop, a ThinkPad T42 with a built-in ATI video controller (Mobility Radeon 9600). The other is a quad-core x86_64 box with an NVidia card (GeForce 6600). My work involves a lot of scientific computation, sometimes multi-threaded, and I need hardware-accelerated 3D rendering to analyze the results. So, I’m running on two architectures, with two different video cards.

The laptop is fairly standard, so I won’t discuss it further. My big box has the following hardware:

  • Intel DP35DP motherboard
  • Intel Core2 Quad CPU, Q6600, 2.4GHz per core
  • 4 GB RAM
  • Two 160 GB SATA disks
  • One 500 GB SATA disk
  • Two 120 GB EIDE disks

I’ll discuss later why I have so many hard drives.

Because I sit next to this box all day, I’ve put a lot of effort into making it quiet. My laptop makes more noise than the big box.

Why not use a distribution and a package manager?

I have a few Linux computers, but they do not use a package manager. They’re not “redhat” computers, or “debian”, or “ubuntu”. Once, 13 years ago, they were Slackware. Briefly. I administer these boxes manually, for lack of a better word.

Maintaining a Linux computer manually is a fair amount of work. Installing new software is not always trivial, and sometimes things break in subtle ways that may take some effort to debug. I plan to start recording my adventures here, in part so that I can come back and see what I did the next time I upgrade something and it misbehaves in a familiar manner. Because I do things manually, I tend to run into problems that the majority of Linux users don’t experience. I often have to look on the web for answers to questions, so I hope my experiences can help out other people who, for whatever reason, come across one of these unusual problems.

What do I have against distributions and package managers? Nothing, really. They are very useful. I do have one computer that was installed from packages, a MythTV computer that I installed from a KnoppMyth CD. This is a good example of a place where package managers are useful. The computer is an appliance that I set up once, and then don’t ever modify. It’s not exposed to the Internet, and it isn’t going to change much. I don’t need to install new software on it, because it’s a dedicated single-purpose machine that already does what I want it to do. And yet, I’ve “broken the packages” on the box. There are files ostensibly under control of the knoppix package manager that I have replaced with recompiled binaries, and which I am maintaining myself now. I’ll talk about that in a later post.

Here are some of the things that I think are good and useful about distributions and package managers (note that there are some exceptions to these rules, but most package managers supply at least some of these benefits):

  • They supply the entire filesystem in compiled form, allowing a new computer to be set up and running in under an hour with reasonable defaults, usually after asking just a handful of questions.
  • They usually are associated with a good setup tool that can configure the software correctly for the hardware attached to your computer.
  • They have a good, general-purpose kernel with modules ready to handle many situations.
  • They keep track of dependencies to help to ensure that interdependent packages are correctly installed, so that the user doesn’t end up with an installed package that fails to work correctly.
  • They provide a single location for access to updates and security fixes. A user can simply ask the package manager to do an “update to latest packages”, and expect that they have all of the updates provided by the distribution.
  • If you have a dozen new computers to set up, possibly even on different architectures, it’s not a very big job with the correct installation media available.
  • Probably most importantly, distributions and package managers provide an easy way for people to administer their Linux computer without having to become Linux experts. The computer is a tool used to perform other activities, and a distribution lets the person work with the tool, instead of spending a lot of time maintaining the tool.

So, why don’t I use package managers? There are a few drawbacks to the use of package managers, and for me, they outweigh the benefits. Other people will have different priorities. I would never suggest to a newcomer to Linux that they should be going distribution-free. A person who maintains a large collection of computers on dissimilar hardware might also be poorly served by breaking the distributions (though I have actually done exactly that).

What don’t I like about package managers and distributions? Well, here’s a collection of drawbacks:

  • It isn’t always clear what your computer is doing. There may be packages or services installed that you don’t want, doing things you don’t understand. Somewhere in the 200 packages that were installed when you set up the computer, you may have wound up with, say, an FTP daemon you didn’t ask to have. When you’re installing software manually, you’re more likely to install only the things you really need.
  • Distributions tend to ship with older code. Distributors have to freeze their versions and do extensive testing, and by the time the packages are shipped there may have been improvements, bugfixes, or security fixes that didn’t make it into the base media.
  • Bugfixes and security fixes can be delayed as you wait for the distributor to build updated packages. While most Linux distributors get security fixes out within a small number of days, there is still some delay between the time a fix is produced and the time that updated packages are available.
  • Distributions are set up to be good for the general case, but there will be times when they do the wrong thing for a particular special use.
  • Package installers are generally forbidden from interacting with the user, otherwise a new install would be a tedious exercise in configuring every package as it came along. Consequently, packages are usually dropped in with some default configuration.
  • Many programs come with multiple compile-time configuration options. A media player may have support for multiple codecs, output devices, companion devices, and so on. A distribution will usually turn on as many of these options as possible. Some of these options might not be of interest to a specific user, but that user is still forced to install other packages holding libraries he or she doesn’t expect to use. These dependent libraries increase the interconnectedness of the packages, which can make what would be a simple upgrade of one package into a huge transaction that touches a dozen other packages and the kernel.
  • Because the package model works best when each file is owned by exactly one package, even when a file controls the behaviour of multiple packages, distributions tend, when possible, to break such files into per-package fragments collected in some other place. This can make it hard to figure out exactly what a specific application is doing.
  • Distributions and package managers don’t insulate the user in all cases. Some users with unusual requirements may still end up having to install software by hand and figure out how to tie the new software into the system correctly, and sometimes the package management system makes such efforts more difficult.
  • Most importantly, for me, a package manager hides too much of what is happening. You don’t have to learn how to configure a program, you don’t know what files it’s installing, it’s a bit too much of a black box for my tastes.

Given all this, I’ve decided that I prefer not to use package managers. Consequently, I’ve been manually maintaining my Linux computers for over 13 years now.