Implementation
Overview
GSPM is based on two existing open source programs:
- DPKG (and its associated programs, such as apt-get)
- "Modules", which lets users choose which packages to use by setting environment variables (such as PATH and LD_LIBRARY_PATH)
DPKG is used to make sure everything is installed with its dependencies. Modules is used to provide an easy switch between packages. The two are wrapped together so that a user specifies a package to use: if it is already installed, Modules simply switches to it; if it is not, DPKG installs it first and Modules then switches to it.
Implementing the Requirements on GSPM
(See also the Requirements page)
R001: the end-user may be non-root on his own machine.
This requirement is satisfied by the tarball version of GSPM, which can be installed anywhere in the system. The variable $GSROOT is set so that the tools/objects can adjust themselves to the installation directory.
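As an illustration of how a tool might adjust itself to the installation directory, here is a minimal sketch that resolves a file relative to $GSROOT. The helper name gsroot_path and the example path bin/gs-use are assumptions made for this example only; they are not part of GSPM.

```python
import os
from pathlib import PurePosixPath


def gsroot_path(relative_path, environ=None):
    """Return relative_path resolved against $GSROOT (hypothetical helper)."""
    env = os.environ if environ is None else environ
    root = env.get("GSROOT")
    if not root:
        # Without GSROOT the tool cannot know where GSPM was unpacked.
        raise RuntimeError("GSROOT is not set")
    return str(PurePosixPath(root) / relative_path)
```

Because nothing is hard-coded, the same tool works whether the tarball was unpacked in a home directory or in a shared location.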
R002: overlap team central installations with people-specific installations
This can be accomplished with the new environment modules approach. The required object/tool is searched for first in the people-specific installation, then in the central installation. There can even be more than two installations to search.
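The search order described above can be sketched as follows. This is only an illustration of "first match wins" over an ordered list of installations; the function name and the directory names used in the usage note are hypothetical.

```python
import os
from pathlib import PurePosixPath


def find_in_installations(relative_path, roots, exists=os.path.exists):
    """Return the first existing root/relative_path, searching roots in order.

    roots is ordered from most specific (people-specific installation)
    to least specific (central installation). The exists predicate is
    injectable so the logic can be tried without touching the disk.
    """
    for root in roots:
        candidate = str(PurePosixPath(root) / relative_path)
        if exists(candidate):
            return candidate
    return None
```

For example, with roots ["/home/alice/gspm", "/opt/gspm-central"], a tool present in both places is taken from the personal installation, and a tool present only centrally is still found.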
R003a: many releases can be installed at the same time
We are using the Debian tools and conventions. If a package with the same name is already installed, the new one replaces the old one. So, if we need two versions installed, we need to change the name of the package. For example, to install gcc 3.4 and gcc 4.0, the package names should not be just "gcc", but "gcc3.4" and "gcc4.0". So, it is possible. (More on versions below.)
R003b: releases can be used in parallel, in different terminal sessions
When using environment modules, all sorts of things are controlled by environment variables. Each terminal has its own set of variables, and can be configured to use a different set of tools and objects.
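To make the per-terminal idea concrete, the sketch below builds two independent environments from the same starting point by prepending a directory to a variable. It only models what a modulefile does to variables such as PATH; the helper name and the directories in the usage note are made up for the example.

```python
def prepend_path(env, variable, directory, sep=":"):
    """Return a copy of env with directory prepended to variable.

    The original mapping is left untouched, which mirrors the fact that
    changing one terminal's environment does not affect another's.
    """
    updated = dict(env)
    current = updated.get(variable, "")
    updated[variable] = directory + sep + current if current else directory
    return updated
```

Starting from {"PATH": "/usr/bin"}, one session could prepend a gcc 3.4 directory and another a gcc 4.0 directory, and each would resolve "gcc" to its own release.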
R004: detect at configuration time the set of tools and objects that are compatible/required
Maybe we need to talk more about that. If I understand correctly, the configuration tool needs access to all installed tools and objects so that it can select the ones it wants? A list of installed packages can be provided if that helps.
R005: each object shall be able to define its own installation process
We simply provide packages with files in them. These files can be pre-compiled binaries (so the user just uses them) or source code that the user can browse/change/compile. When distributing source code, the package can use its own installation process. Note that this can be dangerous, because we may not know where the user will install it (possibly outside the GSPM directory).
Package Conventions on GSPM
More on versions
Using SystemC as an example: there is a vendor version of a package, like "2.0.1" or "2.2". When it is a source package like systemc and we want to supply binary versions, there can be variations in the way it is compiled. We call this the "variation version", like "-gcc4.0.1" (when the generated libsystemc.a was compiled with gcc 4.0.1) or "-gcc3.4-pt-O3" (when compiled with pthreads and optimization).
These two versions (vendor and variation), together with the package name, form the new package name, so many releases and variations can be installed at the same time. Installing one will not replace any other. But suppose one of these packages has a serious problem (it was incorrectly packaged, or does not set paths correctly). Then we really do want to replace the package. So we need yet another version, which we call the "package version" and which is just a number like 1, 2, 3. For example, the package named "systemc2.2-gcc4.0.1" will have not just the version "2.2-gcc4.0.1"; the package version is also appended, giving "2.2-gcc4.0.1-1".
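The naming rule can be written down as two small functions. This is a sketch of the convention as described in this section, not code from GSPM; the function names are invented for the example.

```python
def gs_package_name(packagename, vendor_version, variation_version=""):
    """Package name = base name + vendor version + variation version."""
    return f"{packagename}{vendor_version}{variation_version}"


def gs_full_version(vendor_version, variation_version, package_version):
    """Version string = vendor + variation, with the package version appended."""
    return f"{vendor_version}{variation_version}-{package_version}"
```

With the values used in the text, gs_package_name("systemc", "2.2", "-gcc4.0.1") gives "systemc2.2-gcc4.0.1" and gs_full_version("2.2", "-gcc4.0.1", 1) gives "2.2-gcc4.0.1-1". Bumping only the package version keeps the name unchanged, which is what lets a repaired package replace a broken one.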
It is easy to see that the variation version can grow very long if a pre-compiled package depends on specific versions of many others (for example, compiled for systemc2.1 and gcc 3.4) or has many different configuration options (with optimization, with debug, using pthreads, etc.). For this example, the variation part of the version alone could be as long as "-gcc3.4-systemc2.1-O3-dbg-pthreads", and remember that both the vendor and variation parts are appended to the package name.
To alleviate this problem, the vendor can choose to use short codes to represent the variations. GreenSocs suggests "c1", "c2", etc. There should then be a table mapping each code to its list of dependencies and compiler options, so that users can select among them when installing by hand. But users should prefer to use a specific GreenSocs tool to install packages (see section Working with a set of packages).
Package file name
Package file names should follow the convention:
<packagename><vendorversion><variationversion>_<packageversion>_<arch>.deb
I favour <packagename><vendorversion>.c<variationversioncode>_<packageversion>_<arch>.deb, where the special character sequence .c is not permitted in either the packagename or the vendorversion.
Note that the <variationversion> can be either the long form or the short (coded) form described in the previous section. Examples:
systemc2.2-gcc3.4-O3-dbg_1_i386.deb
systemc2.2c1_1_i386.deb
Debian Policy for package name, version and architecture (taken from http://www.debian.org/doc/debian-policy)
Package names (which for GreenSocs is <packagename><vendorversion><variationversion>) must consist only of lower case letters (a-z), digits (0-9), plus (+) and minus (-) signs, and periods (.). They must be at least two characters long and must start with an alphanumeric character.
The version may contain only alphanumerics and the characters . + - : (full stop, plus, hyphen, colon) and should start with a digit. But in the GreenSocs scheme it will be only <packageversion>, which is just a sequential number.
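The name and version rules quoted above, together with the file-name convention, can be checked mechanically. The sketch below encodes only the wording quoted on this page (it is not a full implementation of Debian Policy, and it treats "should start with a digit" as a hard rule); all function names are hypothetical.

```python
import re

# Lower-case letters, digits, '+', '-', '.'; at least two characters;
# first character alphanumeric (as quoted above).
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9+.-]+")
# Alphanumerics and '.', '+', '-', ':'; starts with a digit (as quoted above).
_VERSION_RE = re.compile(r"[0-9][A-Za-z0-9.+:-]*")


def is_valid_package_name(name):
    return _NAME_RE.fullmatch(name) is not None


def is_valid_version(version):
    return _VERSION_RE.fullmatch(str(version)) is not None


def deb_file_name(package_name, package_version, arch):
    """Build <name>_<packageversion>_<arch>.deb after checking name and version."""
    if not is_valid_package_name(package_name):
        raise ValueError(f"invalid package name: {package_name!r}")
    if not is_valid_version(package_version):
        raise ValueError(f"invalid version: {package_version!r}")
    return f"{package_name}_{package_version}_{arch}.deb"
```

For instance, deb_file_name("systemc2.2c1", 1, "i386") yields "systemc2.2c1_1_i386.deb". Note that under the lower-case rule a name containing an upper-case letter, such as the "O" in a "-O3" variation, would be rejected by this check, so variation strings may need to be lower-cased.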
The architecture field can include the following sets of values:
- A unique single word identifying a machine architecture, using the format os-arch, though the OS part is sometimes elided, as when the OS is Linux.
- "all", which indicates an architecture-independent package.
- "any", which indicates a package available for building on any architecture.
Virtual packages
All packages with the same <packagename> will provide a virtual package called <packagename>-virtual. It is a fake package: there is no file for it, only a reference to its name. It is transparent to the user, but package developers must know about it, as it is important for dependency resolution (when selecting a default package version, see next section).
The default package version to install to satisfy a dependency
In its dependency field, a package can list the names (exact names) and versions (equal, less than, greater than) of the other packages on which it depends. As the GS package names are really a tuple <packagename><vendorversion><variationversion>, the way to depend on just "any version of gcc" is to create a so-called "selector package" (this concept does not exist in Debian). This kind of package does not have any content, just meta information. Its only use is to select a default version to install. For example (using the vertical bar to OR the packages): gcc4.0.1 | gcc-virtual. With this list, if gcc-virtual is already installed, some gcc version is installed and the dependency is satisfied; otherwise, the gcc that will be installed is gcc4.0.1 (the first package in the list).
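The behaviour expected from an OR'ed dependency such as "gcc4.0.1 | gcc-virtual" can be modelled in a few lines. This is a simplified sketch of the rule stated above, not of how DPKG or apt-get resolve dependencies internally; the function name and the shape of the provides mapping are assumptions.

```python
def resolve_alternatives(dependency, installed, provides):
    """Return the package to install for an OR'ed dependency, or None.

    dependency: string such as "gcc4.0.1 | gcc-virtual"
    installed:  set of installed package names
    provides:   mapping from package name to the virtual names it provides
    """
    alternatives = [alt.strip() for alt in dependency.split("|")]
    available = set(installed)
    for package in installed:
        available.update(provides.get(package, ()))
    if any(alt in available for alt in alternatives):
        return None  # already satisfied, nothing to install
    return alternatives[0]  # the first alternative is the default
```

If gcc3.4 is installed and provides gcc-virtual, the dependency is already satisfied and nothing is installed; on a system with no gcc at all, the default gcc4.0.1 is chosen.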
Where the tool/object files go
The idea is that each vendor has a directory inside the GSPM directory. The vendor is free to do what it wants within this directory, but GreenSocs will give some advice on best practices. For example, packages that can have many versions installed should be copied to a hierarchy like vendor/base_package_name/version/, or some other hierarchy that includes the version, so that files from different versions do not overlap.
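A small sketch of the suggested vendor/base_package_name/version/ layout shows why versions do not collide. The function name, the root directory and the vendor name used in the usage note are illustrative only.

```python
from pathlib import PurePosixPath


def install_dir(gspm_root, vendor, base_package_name, version):
    """Return the per-version directory under the vendor's area."""
    return str(PurePosixPath(gspm_root) / vendor / base_package_name / version)
```

For a root of "/opt/gspm" and a vendor called "greensocs", systemc 2.1 and systemc 2.2 end up in ".../systemc/2.1" and ".../systemc/2.2", so a file such as lib/libsystemc.a exists once per version rather than being overwritten.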
Working with a set of packages
Many packages will have strong dependencies on each other, and users will normally work with a set of packages, for example selecting exact versions of gcc, systemc and tlm for a work session. We propose a file called pkgset that contains a working set of packages. Note that this set can either (1) be created by an entity that certifies that the packages are compatible; or (2) be created by the user, as a shortcut for selecting their current working set of packages. There can be as many pkgset files as necessary.
Selecting a package and version to install and use
As there can be many variations of the same package, an "intelligent" tool is necessary to help users select the package they want. The gs-use tool serves this purpose. It receives a pkgset as a parameter and prepares the environment to use that set of packages. This means that the tool installs any packages that are not yet installed and runs the modulefile to load the environment variables. The user can also select some <packagename> to install/use by giving it as an additional parameter. The first time gs-use is run with a new pkgset it may take some time to install new packages, but later runs with the same pkgset will quickly load all those packages into the environment.
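The two steps performed by gs-use (install what is missing, then load everything) can be outlined as below. This is a sketch only: the pkgset file format is not defined on this page, so the parser assumes one package name per line with "#" comments, and both function names are hypothetical.

```python
def parse_pkgset(text):
    """Parse an ASSUMED pkgset format: one package per line, '#' starts a comment."""
    packages = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            packages.append(name)
    return packages


def plan_gs_use(pkgset, installed, extra_packages=()):
    """Return (to_install, to_load) for a pkgset plus any additional packages.

    to_install: requested packages that are not installed yet
    to_load:    every requested package, in order, for the modules step
    """
    requested = list(pkgset) + [p for p in extra_packages if p not in pkgset]
    to_install = [p for p in requested if p not in installed]
    return to_install, requested
```

On the first run with a new pkgset, to_install lists the missing packages; on later runs with the same pkgset it is empty and only the loading step remains, which matches the "slow the first time, quick afterwards" behaviour described above.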
OLD
Name and dependency
The names of packages that can have multiple versions installed must have the version appended to the name. So, to install a specific version, use "gs-apt-get gcc4.0.1". There will always be a package with the base name, like "gcc". This package will install some default version of the tool/object. So, a package's dependency list can contain just a name like "gcc" when any version will satisfy the dependency, or can select a specific version with "gcc4.0.1", for example.
Posted January 8th, 2008 by MarkBurton