I’ve been building rpm packages for some time. Most of the time this has been for my custom postfix packages for RedHat and Fedora. From time to time I’ve been building special packages for my own or business use either rebuilding packages originally intended for another distribution or building a package from scratch for some new software.
Below are some thoughts on rpm as I know it now, and on things which would make my life as a packager easier.
- No explicit separation of base and external packages.
- Problems with building for more than one “distribution” or version of the same distribution.
- Problems with building from sources
- No facilities for automatically tracking and notifying of upstream changes.
- No standard way of building rpms to support different options, or to query the package to see which of those options may have been applied to a binary package
No explicit separation of base and external packages.
This first statement might seem silly. However I’m sure when rpm was first designed the assumption was that all packaged software would be provided by a single group (in this case RedHat) and that packages would fit together properly.
In practice today that is never fully true. RedHat, or the OS vendor as is now the more general case, provides a large number of packages for the distribution that they are building. That’s fine and is great. However, for some people the version of certain packages, or the lack of certain packages, may be a problem that needs to be resolved. There are plenty of third party repos, such as Dag Wieers’, Fresh RPMs, EPEL and various others that I should probably mention, that fill in this gap. However the packages provided by these external repos may conflict with existing software provided by the OS vendor. That’s not helpful.
Other Unixes, specifically the BSDs, solve this problem by installing their “external” software into a separate location (/usr/local instead of /usr, where only base software is installed). Perhaps external packaged software should be done the same way? That avoids conflicts but needs to be explicitly requested by the OS vendors, who up until recently have not really suggested this be done.
Problems with building for more than one “distribution” or version of the same distribution
rpm(8) assumes you are building software on an OS. It doesn’t know or care which one you are building on, nor does it try to provide any helpful hint of this in the binary package names. Source packages MAY be the same for various different distributions, but it is most convenient if the binary packages can be distinguished. It may also be necessary for the packager to build the package differently, or provide different build options, depending on the distribution he is building on. Currently there is no standard way to detect the name and version of the distribution you are building on, so that this information can be used during the packaging process and later in the package name.
It’s true that we have all worked around this, by adding a .rhel5 suffix somewhere in the release name of the binary packages but it’s not ideal. It’s also true that we packagers solve this problem differently. Some packages require the builder to correctly specify the distribution, some distributions I believe have macros defined which help provide the information and on my packages I’ve written a small script which tries to look at the running environment to determine which distribution it’s building on.
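To make the idea of such a detection script concrete, here is a minimal sketch in Python. It is not my actual script: the function name dist_tag, the KNOWN_DISTROS table and the resulting tags are assumptions for illustration, and it only understands release strings of the form found in files like /etc/redhat-release.

```python
import re

# Hypothetical table mapping text found in a release string to a short tag.
KNOWN_DISTROS = [
    ("red hat enterprise linux", "rhel"),
    ("centos", "centos"),
    ("fedora", "fc"),
]


def dist_tag(release_text):
    """Return a short tag such as 'rhel5' for a release string, or None."""
    lowered = release_text.lower()
    # Only the major version is used, e.g. "release 5.3" gives "5".
    match = re.search(r"release\s+(\d+)", lowered)
    if match is None:
        return None
    major = match.group(1)
    for needle, tag in KNOWN_DISTROS:
        if needle in lowered:
            return tag + major
    # Unknown distribution: let the caller decide what to do.
    return None
```

The returned tag could then be appended to the release field of the binary package, which is exactly the kind of thing a standard macro would make unnecessary.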
A standard way of solving this would be so much nicer. Ideally a standard hook which all distributions could be persuaded to follow which would remove the issue from the packager, and let him get on with his job of building the software. I’d love the distributions to talk to each other more and agree on something like this, and provide either an rpm which provides this information or alternatively some sort of standard build macros which can be queried. That would be enough.
Problems with building from sources
When building packages a common problem occurs: some package is missing and the build can not proceed. This can be quite frustrating.
Perhaps the required package has not been declared as a build dependency. If you are the packager of this package then you need to fix the dependency. However rpm will not resolve that for you. If you are building software which you are not familiar with, you may not know which package is required, or which package contains a missing binary or library.
Most people don’t use rpm on its own to install software, but use yum or its equivalents to find binary dependencies and install them together with the package you request. Yum can also let you search for packages on a remote repository.
It would be nice if there were something similar for the package builder:
- a facility to query a remote repo for a specific file (providing either its full path or simply its name) to find out which packages it can be found in. This would save a lot of time, and currently the remote yum repos at least don’t contain a database of all files contained in the packages they provide. The information is in each rpm, but not directly available in each repo.
- a facility to check, download and install dependent (for the BUILD) packages, prior to running the standard build procedure, and to remove those packages afterwards.
That would make packagers keener to add BuildRequires information to their own packages and also to mention this to any packagers whose packages they depend on. It would also ensure that you don’t end up installing and leaving huge amounts of unneeded software on your server.
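The second facility could be sketched roughly as follows. This is only an illustration under stated assumptions: build_requires and build_plan are hypothetical names, the parser handles plain BuildRequires: lines and does not expand spec macros, and the yum command lines it produces are just one plausible way to install and later remove the missing packages.

```python
import re


def build_requires(spec_text):
    """Collect the names listed on BuildRequires: lines of a spec file."""
    names = []
    for line in spec_text.splitlines():
        match = re.match(r"\s*BuildRequires\s*:\s*(.+)", line)
        if match is None:
            continue
        # Entries may be separated by commas and/or spaces and may carry
        # version constraints such as "pcre-devel >= 6.0".
        tokens = match.group(1).replace(",", " ").split()
        skip_next = False
        for token in tokens:
            if skip_next:
                skip_next = False  # this token was a version number
                continue
            if token in ("<", "<=", "=", ">=", ">"):
                skip_next = True
                continue
            if token not in names:
                names.append(token)
    return names


def build_plan(spec_text, already_installed):
    """Return (install_cmd, remove_cmd) for the build dependencies that
    are not yet present; both are empty lists if nothing is missing."""
    missing = [n for n in build_requires(spec_text)
               if n not in already_installed]
    if not missing:
        return [], []
    return (["yum", "install", "-y"] + missing,
            ["yum", "remove", "-y"] + missing)
```

Only the packages that were missing before the build appear in the removal command, so software that was already on the server is left alone.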
No facilities for automatically tracking and notifying of upstream changes
No package is built in a vacuum. All packages are really a derivative work of someone else’s software, packaged for easy installation. So while we are building an rpm of my_package version 1.2.3 today, tomorrow it’s pretty likely that we will be building version 1.2.4 when it’s released. Note the last phrase. rpm provides a way to indicate the sources of the software it is building, but there is no easy way to check if any of the components have been updated.
That would be a nice feature to include, if not in rpm itself, then in a build wrapper such as yum build … It assumes that you provide some information as to which version is current and then ask to be informed if any newer upstream version is available. vcheck by Marco Götze does just this, and in fact OpenPKG uses this successfully with a specially packaged version of rpm. Again this saves time.
It can even be used to download the newer version of the software, replace the existing spec file with the newer version and attempt an automatic build of the new version. Complicated but nice, and in most cases it would probably work fine, leaving packagers to worry only about those packages which fail to build correctly with the new software version, or about the testing of the newly built packages. Again a feature I’d like to see used more widely, as it would mean faster updates to software, which again is what most people want.
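The core of such a check is small. The sketch below is not how vcheck or rpm compare versions; it is a simplified illustration with hypothetical function names which looks only at the numeric parts of a version string and ignores letters and pre-release suffixes.

```python
import re


def version_key(version):
    """Split '1.2.10' into (1, 2, 10) so that comparison is numeric."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def newer_versions(current, candidates):
    """Return the candidates newer than current, oldest first."""
    current_key = version_key(current)
    newer = [c for c in candidates if version_key(c) > current_key]
    return sorted(newer, key=version_key)
```

Given the packaged version and a list of versions seen upstream (however that list is obtained), a non-empty result is the signal to notify the packager or to attempt an automatic rebuild.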
No standard way of building rpms to support different options, or to query the package to see which of those options may have been applied to a binary package
This does not affect all packagers but does affect a few. In my case some people want to build postfix with mysql support. Some people want postgres. Others want pcre included, others want VDA support, or certain special patches. While we can build a package with all these options, it does mean that all dependent packages must also be installed, so again that’s not very helpful. Some people may not need all these options. I’d rather provide the options and let those who need to rebuild the package do so with the options they need. The same is almost certainly true for a lot of other software. In the end the distributions make a standard build and that’s it.
If you rebuild by hand with different options these changes are NOT apparent. You generally can’t query the package to see how it was built. That again is something provided by OpenPKG which is very nice. It’s also possible to rebuild the packages using the same options as the installed software, thus avoiding mistakes. OpenPKG uses an option to rpm --define 'with_pcre 1' --define 'with_xxxx 2' … but also saves these values so they can be queried later.
So it would be nice if rpm, or a build wrapper for it, provided a standard way to be able to specify options, build with those options and later query a package to see which options it was built with. This would not affect most people but for those of us with special requirements we could safely build the software with the required options and later be able to see how the build was made. Much cleaner and safer for us all.
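A build wrapper along these lines might look something like the sketch below. The names define_args and options_summary are hypothetical, the with_ prefix simply follows the --define convention mentioned above, and it assumes the spec file tests these macros to decide what to build.

```python
def define_args(options):
    """Turn {'pcre': True, 'mysql': False} into --define arguments."""
    args = []
    for name in sorted(options):
        value = 1 if options[name] else 0
        args += ["--define", "with_%s %d" % (name, value)]
    return args


def options_summary(options):
    """One line of text recording which options were enabled."""
    enabled = sorted(name for name, on in options.items() if on)
    listed = " ".join("with_" + name for name in enabled)
    return "Build options: " + (listed or "(none)")
```

The argument list would be passed to the build command, and the summary line stored somewhere that can be queried later, for example in the package description. That is a workaround rather than the standard mechanism I am asking for.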
I should recognise that it is possible to work around many of these issues, but it would certainly be nice if some help were provided to the packager to make his or her life easier. If building were consistent across the different distributions then it would be possible to make builds for one distribution work on another. Right now that’s pretty hard. You can’t install a SuSE rpm on RedHat or vice versa, at least not without making minor adjustments, and in the end it means that 2 sets of people are working on packaging the same software at the same time. Rather inefficient. Perhaps it’s true that the distributions compete with each other. Nevertheless, while building distribution-specific solutions to many of the issues I’ve indicated above does solve the problem, in my opinion it solves it in the wrong place. It would be better solved by improving rpm or its build wrapper and doing it once.
I’ve not seen many people comment on problems with package building. Many people probably have a lot more experience than I do, or build a lot more packages than me, yet I’ve not seen any posts or comments suggesting that some of the issues I address above perhaps need a general solution. So what do you think?