The BeleniX Package Manager

As mentioned in earlier posts, I have been spending some time bringing modern package management to BeleniX, which is critical to all future work. I spent about 3 months of part-time effort to bring “spkg” to the table; it provides a whole host of package management services. Not many people, however, are aware of all the possibilities available via this utility, so I have attempted to capture details of the features and technology, pointwise, below. Note that “spkg” will soon be renamed to “pac”. Since spkg still has a few features at the implementation stage, those are marked with a TODO.

  1. Networked Package Repository: Every distribution needs a package repository available over the network for users to easily install the software they need without having to hunt the web. BeleniX currently provides 2 main package repositories. The software collection is presently limited but growing. The main repository is at http://pkg.belenix.org/. The package repository is hosted via Apache 2.2 running in an OpenSolaris Zone with the package collection sitting in a ZFS pool on high-performance SCSI disks thanks to http://www.genunix.org/. The remote repository can be accessed either via HTTP or FTP. The BeleniX package repositories run Rsync to allow easy mirroring. So one of the core approaches is to avoid having to implement custom repository servers. This reduces development/management complexity and overheads.
    • Multiple Repository Sites: “spkg” supports multiple repositories transparently. One can configure additional repository sites in /etc/spkg.conf.
    • Mirror support with parallel downloading: Each repository site can have one or more mirrors. “spkg” uses the axel downloader, which can do parallel chunked downloading of the same file from multiple mirror sites.
    • Multiple release trains: The repository structure used for BeleniX allows multiple release trains to be hosted off the same repository site. That is, we can have multiple different trains based off the base distro. For example, there is presently a separate release train for Crossbow builds on pkg.belenix.org.
    • Release-specific tagging: In addition to the above, there is a way to create so-called tagged namespaces in the repository. At the point of each BeleniX release a tag namespace is created with the release name. This namespace refers to all the packages and revisions in the repository at that time. New packages published into trunk are not automatically visible in the tag namespace. However, they can be explicitly “promoted” into a release tag namespace. This tagging allows base OS package versions to be grouped by release, in turn allowing upgrades to a specific release tag rather than always to the latest trunk.
    • Insane Compression: Packages in the BeleniX repository are compressed using 7Zip, which gives some of the smallest compressed sizes available among common archivers.
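The release tagging described above can be pictured with a small in-memory model. This is only a sketch under assumptions: the class, its method names and the version strings are hypothetical, and the real repository is a directory layout served over HTTP/FTP, not a Python object.

```python
class Repository:
    """Toy model of a trunk plus release tag namespaces (hypothetical layout)."""

    def __init__(self):
        self.trunk = {}   # package name -> latest published version
        self.tags = {}    # tag name -> {package name -> pinned version}

    def publish(self, name, version):
        """Publish a package into trunk; existing tags are unaffected."""
        self.trunk[name] = version

    def create_tag(self, tag):
        """Snapshot everything currently in trunk under a release tag."""
        self.tags[tag] = dict(self.trunk)

    def promote(self, tag, name):
        """Explicitly make the trunk version of a package visible in a tag."""
        self.tags[tag][name] = self.trunk[name]

    def resolve(self, name, tag=None):
        """Look a package up in a tag namespace, or in trunk if no tag is given."""
        namespace = self.trunk if tag is None else self.tags[tag]
        return namespace.get(name)
```

Publishing into trunk after `create_tag` leaves the tag untouched until `promote` is called, which is what lets an upgrade target a specific release rather than the latest trunk.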
  2. Local Package Repository: The utility can support fetching packages from a local repository inside a directory or on a CDROM.
  3. Catalogs and Metadata: The package repositories need to provide complete package metadata apart from dependency info. This allows the package manager to display complete information about packages whether they are installed or not. BeleniX provides a text based catalog and full package metadata including the package info manifest and the pathname list provided by the package. The catalog format is derived from the one used for Blastwave.
    • Catalog Search: “spkg search” provides the ability to search the catalog.
    • Content/Info search (TODO): The ability to do a full-text search of package info and pathname lists. This is currently being implemented.
    • Catalog signing: The catalog file contains the entire package list with filenames, package name, version, SHA-1 checksum etc. The catalog is signed via an ECC key and verified by the client.
    • Package info queries: There are obvious options to list package information and compare installed packages vs those in the repository.
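As a rough illustration of how a client can use such a catalog, the sketch below parses catalog entries, searches them by name and checks a downloaded package against its SHA-1 checksum. The four-field line format (`name version filename sha1`) and the function names are assumptions made for the example, not the actual BeleniX catalog layout, and verification of the ECC signature on the catalog itself is not shown.

```python
import hashlib


def parse_catalog(text):
    """Parse lines of 'name version filename sha1' into a dict keyed by name.

    The field layout is hypothetical; blank lines and '#' comments are skipped.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, version, filename, sha1 = line.split()
        entries[name] = {"version": version, "filename": filename, "sha1": sha1}
    return entries


def search(entries, term):
    """Case-insensitive substring search over package names."""
    term = term.lower()
    return sorted(name for name in entries if term in name.lower())


def verify_package(data, entry):
    """Return True if the downloaded bytes match the catalog's SHA-1 checksum."""
    return hashlib.sha1(data).hexdigest() == entry["sha1"]
```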
  4. Package Grouping and Clustering: So-called Group packages are supported; these are essentially references to a group of actual packages constituting a logical collection. These groups can be defined around functional or use-case scenarios. Group packages are a special construct, handled differently from an empty package with a set of dependencies. For example, uninstalling a group package removes all packages it references, but removing a normal package does not normally remove its dependencies. There is a recursive uninstall option discussed below, but that is a different behavior. Package clusters are special purpose package collections internal to the package manager and not visible to the end-user. For example, there is a base cluster that defines the base OS package collection. There is an obvious livecd cluster coming soon.
  5. Dependency Resolution: This is at the core of any package management utility: the ability to automatically compute the dependency tree while installing a package and automatically install any missing dependencies needed by the package in question. The dependency tree is scanned during upgrade as well. The dependency resolution in “spkg” as of now does not handle version-specific dependencies. This has computational implications (for example, directed graph optimization) and is planned for a future date. The BeleniX package manager computes dependencies ahead of time, creating a so-called action plan. It does not download and scan packages one at a time. This is possible due to the presence of the catalog and full metadata locally. The dependency solver is quite fast, requiring just a few seconds to compute a complete install/upgrade plan for, say, 800 packages. The dependency solver also takes care of incompatible packages and circular dependencies. Once the dependencies are computed, a topological sort implementation is used to figure out the correct processing order.
  6. Dependency Partitioning: In BeleniX package management, dependencies are partitioned into 2 sets: base OS and layered software. Dependencies are not allowed to cross the set boundaries. That is, a layered software dependency on base OS packages is not entertained and is ignored even if it is present. Thus, for example, if Firefox expresses a dependency on a core base OS package such as SUNWcsl, then upgrading Firefox will not result in the base OS getting upgraded! This is possible by checking against packages in the base cluster. The presence of a base OS is obviously a given. Ideally such dependencies should not even be present, but we live in an imperfect world.
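The plan-then-sort approach described in item 5 can be sketched as follows. This is a minimal illustration, not the spkg solver: the function name and the shape of the `depends` mapping are assumptions, versions and incompatible packages are ignored, and a dependency cycle is simply reported as an error here rather than handled.

```python
from collections import deque


def install_plan(requested, depends, installed=()):
    """Return an install order in which every package follows its dependencies.

    depends maps a package name to the names it depends on, as read from
    locally available metadata. Already installed packages are skipped.
    """
    installed = set(installed)

    # 1. Collect the transitive closure of packages that are still missing.
    needed = set()
    stack = list(requested)
    while stack:
        pkg = stack.pop()
        if pkg in needed or pkg in installed:
            continue
        needed.add(pkg)
        stack.extend(depends.get(pkg, ()))

    # 2. Topological sort (Kahn's algorithm) over the needed sub-graph.
    pending = {p: {d for d in depends.get(p, ()) if d in needed} for p in needed}
    dependents = {p: [] for p in needed}
    for pkg, deps in pending.items():
        for dep in deps:
            dependents[dep].append(pkg)

    ready = deque(sorted(p for p, deps in pending.items() if not deps))
    order = []
    while ready:
        pkg = ready.popleft()
        order.append(pkg)
        for dependent in sorted(dependents[pkg]):
            pending[dependent].discard(pkg)
            if not pending[dependent]:
                ready.append(dependent)

    if len(order) != len(needed):
        cycle = sorted(needed - set(order))
        raise ValueError("circular dependency among: %s" % ", ".join(cycle))
    return order
```

Because the whole plan is computed from metadata before anything is downloaded, a function like this can also back a dry run that only prints the resulting order.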
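The partitioning rule in item 6 amounts to a filter applied before dependency resolution. A minimal sketch, assuming the base cluster is available as a set of package names (the function name is hypothetical, and the names other than SUNWcsl are only illustrative):

```python
def partition_dependencies(package, depends, base_cluster):
    """Drop dependencies that cross the base OS / layered software boundary.

    A dependency is kept only if it lives in the same set (base cluster or
    layered software) as the package that declares it.
    """
    in_base = package in base_cluster
    return [dep for dep in depends if (dep in base_cluster) == in_base]
```

For instance, with `base_cluster = {"SUNWcsl"}`, a layered package declaring `["SUNWcsl", "gtk"]` keeps only `gtk`, so resolving its upgrade never pulls in a base OS package.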
  7. Install and Upgrade: Install and upgrade are separate tasks in spkg. Packages can be referred to by their package name or Common Name. Due to historical reasons, package names in the OpenSolaris world can be cryptic starting with a capitalized prefix (‘SUNW’, ‘SFE’, ‘SFW’ etc.). Thus a user-friendly name called Common Name is also assigned. Package Rename Support(TODO) will be coming soon. Package names can change as long as the Common Name remains same. While providing a package name one can also specify a version number using the following syntax: <package name>@<version>(<revision>). Upgrade can be of the following types:
    • Install/Upgrade of one or more packages
    • Upgrade of the base operating system: It is possible to just upgrade to a new OpenSolaris build via the release tag mechanism alluded to earlier without having to upgrade every single installed package.
    • Upgrade of all installed packages.
    • Transactional Upgrades: Base OS or entire system upgrades in spkg are transactional. This is made possible by adding some spkg masala on top of Snap Upgrade. Snap Upgrade allows creating new boot environments, and spkg maintains persistent transaction state. Thus it is possible to hit CTRL + C in the middle of an upgrade process and resume or cancel it later. Resume begins from the point at which it was interrupted. You can even start an upgrade, interrupt it, point spkg to a different mirror and continue the upgrade, as long as the mirror is up to date. In addition, attempting to upgrade a single package in the base OS cluster will automatically cause a new boot environment to be created. Base OS packages in the live environment are not upgraded in place. Canceling a pending upgrade transaction will also destroy the associated boot environment.
  8. Uninstall: The package manager supports both normal and recursive uninstall, normal uninstall being the default. Obviously dependent packages are checked before attempting to remove a package. Checking for dependents (which packages depend on package X?) is normally an expensive process, since it requires scanning through the entire SVR4 package database to fetch dependency metadata for every installed package. For SVR4 this equates to scanning the /var/sadm/pkg subdirectory tree. In spkg we optimize our approach to avoid this directory scanning as much as possible. The first time, the directory scanning is done and an inverse dependency table is built up. This table is then stored persistently on disk and loaded every time. A check is included to rebuild the table only if the /var/sadm/pkg directory contents have changed. This improves performance by many orders of magnitude.
  9. Recursive Uninstall: In a simple sense this means removing package X and its entire dependency tree. However, this simple approach has problems in that it can cause unintended package removal. For example, say the user installs OpenOffice and then installs add-on software HakunaMatata that depends on the OpenOffice package. Now if the user uninstalls package HakunaMatata with the recursive option, it will end up removing OpenOffice! To get around this problem one has to identify how the package came into the system. If it is via user selection or can’t be determined, then it is not safe to remove. If it is via automatic dependency resolution, then it can be removed. Thus the BeleniX package manager records this information when installing packages. Finally, the recursive option is ignored if base OS packages are being removed. One does not want the rug to be pulled from under one’s feet!
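The `<package name>@<version>(<revision>)` syntax mentioned under item 7 could be parsed along these lines. This is a sketch only: the function name is hypothetical, and treating the revision as optional whenever a version is given is an assumption, not something the syntax description confirms.

```python
import re

# name, optionally followed by @version, optionally followed by (revision)
_SPEC = re.compile(
    r"^(?P<name>[^@()\s]+)"
    r"(?:@(?P<version>[^@()\s]+)(?:\((?P<revision>[^@()\s]+)\))?)?$"
)


def parse_spec(spec):
    """Split 'name@version(revision)' into (name, version, revision).

    Missing parts are returned as None; malformed input raises ValueError.
    """
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError("invalid package specifier: %r" % spec)
    return match.group("name"), match.group("version"), match.group("revision")
```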
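The cached inverse dependency table from item 8 can be sketched like this. The JSON cache format, the fingerprint used to detect changes and the `scan` callable (standing in for the expensive walk over the package database) are all assumptions for illustration; the cache file is assumed to live outside the scanned directory.

```python
import json
import os


def directory_stamp(pkg_dir):
    """Cheap fingerprint: the directory's mtime plus its sorted entry names."""
    return [os.stat(pkg_dir).st_mtime_ns, sorted(os.listdir(pkg_dir))]


def invert(depends):
    """Turn {pkg: [deps]} into {dep: [packages that depend on it]}."""
    table = {}
    for pkg, deps in depends.items():
        for dep in deps:
            table.setdefault(dep, []).append(pkg)
    return {dep: sorted(pkgs) for dep, pkgs in table.items()}


def load_dependents(pkg_dir, cache_file, scan):
    """Return the inverse table, calling the expensive scan only when needed."""
    stamp = directory_stamp(pkg_dir)
    try:
        with open(cache_file) as fh:
            cached = json.load(fh)
        if cached["stamp"] == stamp:
            return cached["table"]
    except (OSError, ValueError, KeyError, TypeError):
        pass  # missing or unreadable cache: fall through and rebuild

    table = invert(scan(pkg_dir))
    with open(cache_file, "w") as fh:
        json.dump({"stamp": stamp, "table": table}, fh)
    return table
```

With the table loaded, answering "which packages depend on X?" is a single dictionary lookup instead of a scan of every installed package.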
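The rule in item 9 can be sketched as a walk over the dependency tree that only removes packages recorded as automatically installed. The `reasons` mapping with its `"user"` / `"auto"` values and the function name are hypothetical; the extra check that nothing outside the removal set still needs a package is a conservative addition in this sketch rather than something the description above spells out.

```python
def recursive_uninstall(target, depends, reasons, base_cluster=()):
    """Return the packages to remove for a recursive uninstall of target.

    depends maps each installed package to the packages it depends on.
    reasons maps a package to "user" or "auto"; anything else is unsafe.
    """
    base_cluster = set(base_cluster)
    if target in base_cluster:
        return [target]  # the recursive option is ignored for base OS packages

    remove = [target]
    queue = list(depends.get(target, ()))
    while queue:
        pkg = queue.pop(0)
        if pkg in remove or pkg in base_cluster:
            continue
        if reasons.get(pkg) != "auto":
            continue  # user-selected or unknown origin: not safe to remove
        still_needed = any(
            pkg in deps and other not in remove
            for other, deps in depends.items()
        )
        if still_needed:
            continue  # another installed package still depends on it
        remove.append(pkg)
        queue.extend(depends.get(pkg, ()))
    return remove
```

In the HakunaMatata example, OpenOffice is recorded as user-selected, so it and everything beneath it stay installed.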
  10. Dry Run mode: It is possible to run spkg with the ‘-n’ option to cause it to compute the action plan and show what actions it will perform but not actually perform those actions.
  11. Alternate root image support: Just like the SVR4 package tools, spkg supports a ‘-R <directory>’ option to perform actions in an alternate root image in a directory.
  12. Incremental Mirror Support(TODO): There is work underway to support local incremental mirrors. An incremental mirror is one which only contains metadata in a directory when created initially. It does not have packages. When the user points to the local repo and starts installing packages, those are automatically fetched from the remote site and stored locally if they are not present locally.
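The fetch-on-miss behavior planned for incremental mirrors might look roughly like this. Since the feature is still a TODO, everything here is hypothetical: the function name, the flat directory layout, and the `fetch_remote` callable that stands in for the real download from the remote site.

```python
import os


def fetch_package(filename, local_repo, fetch_remote):
    """Return the local path of a package, downloading it on first use.

    fetch_remote is a hypothetical callable that takes a filename and
    returns the package contents as bytes.
    """
    path = os.path.join(local_repo, filename)
    if not os.path.exists(path):
        data = fetch_remote(filename)
        tmp = path + ".part"
        with open(tmp, "wb") as fh:
            fh.write(data)
        os.replace(tmp, path)  # only complete downloads become visible
    return path
```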
  13. Non-Root User Image Support(TODO): One of the few good concepts in IPS is support for Alternate Root images in a directory that can be installed and managed by ordinary users without needing superuser privileges. This allows, for example, maintaining and updating Java application suite contexts. It should be possible to tweak SVR4 packaging enough to get this working on BeleniX, but this needs further investigation.
  14. Getting Rid of SVR4(TODO): The SVR4 packaging implementation in OpenSolaris is aged, ugly and broken even though the specification is quite advanced. We need to move away from this in BeleniX. The exact alternative has not yet been decided but we will of course utilize an existing opensource implementation.
  15. Clean up server side repository management implementation(TODO): The repository is currently being managed via a Perl script which I wrote based on an earlier script for managing Blastwave-style repositories. The script is best described as one huge ugly hack that I am ashamed of! I am having great misgivings about my choice of Perl and need to re-write this piece in Python.

The choice of Python made this implementation process easy. Python based utilities run fast and have low resource overheads. The code is easy to maintain. Going forward, the plan is to leverage better XML-based repository management by porting Createrepo and utilizing the advanced dependency solver from Smart, which also enables version-based dependencies. Createrepo is meant for RPM and will need some porting to whatever packaging we eventually decide to use. There are some interesting observations on package management at the following link: http://www.mancoosi.org/edos/manager.html.


3 thoughts on “The BeleniX Package Manager”

  1. Nandan Kumar

    Hi Moinakg,

    I was trying to work on creating a live DVD from the Solaris installation on my desktop. I went through the distribution constructor and hadoop cluster project, but they were not of much help. Can you provide me some pointers on this?

    Thanks,
    Nandan

  2. moinakg Post author

    I use a custom distribution constructor for belenix since there are differences at some crucial points regarding how the belenix and opensolaris ISO builds are done.
    If you are using OpenSolaris distro constructor you need to setup a local IPS repository for all practical purposes: http://blogs.sun.com/migi/entry/create_your_own_opensolaris_ips
    Then go through the distro constructor docs. Using the BeleniX constructor is something I will blog about later in comprehensive posts about how to go from opensolaris source to getting a bootable ISO.

  3. Rob Jones

    The package manager works well on RC2 but takes a good deal of CPU and memory, halting other operations on slower machines. This is a great step forward and I have been using it on OpenSolaris. However, please consider maybe a wider brief with theme package availability?

