After BeleniX 0.8 Alpha, which included exciting features such as Google Gadgets, WebKit, and KDE 4.2.4 (all built with GCC 4.4.0) in addition to Gnome 2.26, BeleniX 0.8 Beta1 is now available with improvements (bug fixes and functionality) to the KDE 4.3.1 desktop and other apps, plus new package additions. Several patches/fixes for various packages were taken from the Fedora Core 11 repository.

You need to use the Network Installer in order to install this 0.8 Beta1 Release. The Network Installer will not touch your current environment in any way. It creates a new Boot Environment and installs into that. Your current environment remains as the default one.

You can see the full announcement here: http://www.belenix.org/content/BeleniX-08-Beta1-available-Network-Installer

I want to get FreeCAD working on BeleniX and have been going through the dependencies. One of them is OpenCASCADE, which I started to build one week back. Since then it has been a tale of pain, until finally, one week later, I do have a successful build.

Firstly, the software is enormous, with thousands of files. For example, after a make install I find that it installs 15600+ header files! I started building it with GCC 4.4.1 on BeleniX. Secondly, their documentation mentions support for building with the Sun Forte Compiler on Solaris 8 – primitive info. Obviously the combination of OpenSolaris platform + Mesa + Gcc 4.4.1 is untested. Once I started I came across some of the usual issues: the configure script assumes Sun Studio/Forte and has options not supported by Gcc, some headers assumed Sun Studio/Forte, bcopy needed a proper declaration, usage of ieee_handler had to be replaced with fex_set_handling, etc. After those I started coming across variable names like SS and CS that conflict with predefined macros for x86 register names on OpenSolaris. I have seen this on many occasions in KDE and other software.

However after manually patching about 15 files from 15 build failures, I started to wonder how many more. So to test I ran a simple command: find . -type f | xargs grep -w SS. Believe it or not there were hundreds of matches! From a hunch I started a round-robin search with all the possible register names and for the record the following are used: CS, SS, DS, GS, FS, ES, ESP, EIP in about 465 different files. The only option now was to whip up a simple shell script to do a global search and replace. The resulting generated patchset is huge and I am not keeping it as a patch! I have embedded the script in the pkgbuild spec file.
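For illustration, the global search and replace could be sketched roughly like this in Python. This is not the shell script embedded in the spec file: the function name and the occ_ prefix are made up for the example, and a purely textual rename like this will also touch comments and string literals, so the result still needs a build to verify it.

```python
import re

# Register names the post reports as clashing with x86 macros on OpenSolaris.
REGISTER_NAMES = ("CS", "SS", "DS", "GS", "FS", "ES", "ESP", "EIP")

# \b restricts matches to whole words, the same idea as `grep -w`,
# so identifiers such as CLASS or SS_1 are left alone.
_PATTERN = re.compile(r"\b(" + "|".join(REGISTER_NAMES) + r")\b")


def rename_registers(source, prefix="occ_"):
    """Return source text with clashing identifiers prefixed.

    `prefix` is a hypothetical choice; any prefix that makes the
    name distinct from the register macros would do.
    """
    return _PATTERN.sub(lambda m: prefix + m.group(1), source)


if __name__ == "__main__":
    print(rename_registers("int SS = ESP + CLASS;"))
```

Running the function over every file found by the find command, and writing the result back, gives the same effect as a scripted search and replace.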

At the end of all this I found that the Makefile does not have 100% DESTDIR support in spite of using the GNU autotools. So I had to patch Makefile.in, and that resulted in a full build. After packaging I discovered a packaging issue and had to re-run make install. Even though the source tree was already built, that resulted in another full build! Looks like broken Makefiles.

After several exasperating days I do have a package. I have faced this problem of common variable names clashing in the namespace in several different pieces of software, for example in Celestia 1.6, which I built last week. It uses the obvious “sun” to represent Sol. This is however a predefined macro in Gcc on OpenSolaris. Granted, this can be worked around by using “-Usun” (unsafe?) in CFLAGS, and OpenSolaris exposing register defines in headers by default looks like a bug. It is nevertheless a really, REALLY BAD IDEA to use obvious, common, short variable names in your software.

Ugly Beast errr Ritz on the road

Here comes another weirdo from Maruti’s stable. It may not be entirely apparent in this picture, but seen in person it appears as a cross between a car and an animal! From the back it is a frog’s rear with jutting wheels added, from the front it is a car reminiscent of the Hyundai i10, and from the side it looks as if Bluto has delivered a mighty kick to the rear, making it concave.

It does have some good aspects in terms of handling, mileage, price, etc. However a car needs to look like a car, not a parody. This does not mean that Maruti does not make good-looking cars, just that this is not one of them. Hats off to Maruti for producing such an object on wheels and hats off to the Indian consumer for actually buying it! I find it hard to believe that this is actually selling!

Ksysguard working on OpenSolaris

Anyone who tried the earlier KDE 4.3 packages for BeleniX may have noticed that Ksysguard (CTRL+ESC or Kmenu -> Applications -> System -> System Monitor) basically shows a blank slate. The process list is empty, and CPU and Network stats are unavailable. Too few sensors are exposed.

I spent the last few days hacking on that component and got an initial working version that implements all the basic functionality for the OpenSolaris platform. There are still bugs to iron out and new sensors to add (using DTrace here can open up lots of possibilities). The current patch is here. The kdebase4-workspace package has been published into the BeleniX repository.

I am plugging in some obligatory screenshots of my Vbox VM running BeleniX 0.8 Alpha:

The current 4.3.1 release of KDE is now available on BeleniX 0.8 Alpha. See this link for the details: http://www.belenix.org/content/KDE-431-now-available-BeleniX-08-Alpha. I have borrowed patches from the work of the KDE-Solaris team and Fedora Core 11 repository.

Currently 0.8 Alpha is only available via the network installer. We will be starting to work on building a LiveCD ISO soon. In addition 0.8 Alpha is based on the OpenSolaris source drop for build 114. This will be updated to a more recent build. There are other things to look at, like lofi bypass mode, which should make it practical to use encrypted lofi on iSCSI targets – the advantage being end-to-end encryption. We also need to rework the ramdisk compression piece for the latest kernels and fix some outstanding corruption issue. Then there is developer documentation for developing software on BeleniX, a Hudson-based bulk build setup for regular bulk builds on the BeleniX repo, an installer written entirely in Python 2.6, a Gcc 4 build of Firefox with Profile Driven Optimization, a Gcc build of OpenJDK on OpenSolaris, RPM5 packaging with Smart Package Manager, and lots of other stuff. One of the goals is to use an Open Toolchain end-to-end. In that respect it is also important for us to look at a Gcc 4.2 build of the OpenSolaris kernel.

For me personally it is amazing to see how much BeleniX has progressed from the early days of a commandline-only ramdisk-only barebones kernel boot to single-user mode in an image assembled by hand. I manually went through and included individual files back in Sept 2005! Today some people may not realize it but BeleniX is a first-class OpenSolaris environment and a first-class KDE environment. People have been using it daily for months and it has been used in a multi-user build-server environment, like our build server in Moscow. Of course we face the problem of lack of developers, so developers are more than welcome!

In addition, few people may know that the OpenSolaris distro from SUN owes its origin to BeleniX. Every technology that I developed for BeleniX during the 2.5 years prior to the OpenSolaris distro coming out was used. In fact the first Beta release was based on BeleniX 0.4.1 with IPS and the Caiman installer put in and KDE replaced by Gnome – I was part of the core team working on that! See LiveCD Architecture Overview Diagram and LiveCD Features Timeline. Sadly there is not even a shred of information or documentation that alludes to this except for a sole reference in the OpenSolaris Bible.

Lost on the Dee - BUS

The earlier KDE 4.2.4 that I built for BeleniX had a weird persistent issue of various DBUS clients timing out after not receiving a response to their messages. It would happen erratically, but when it did happen the desktop would be very slow to come up, applications would open only after some time, and the “.xsession-errors” file would fill up with DBUS timeout messages. At that time there were many other issues to resolve and this problem got ignored, not least due to its erratic nature.

The problem however persisted even when I built the latest KDE 4.3.1. It was much less effort to update to 4.3.1 since the build recipes were already present and 4.3.1 had far fewer bugs than 4.2.4. When testing in VirtualBox I started seeing that this time the timeouts were consistent. So I started poking around with dbus-monitor and qdbusviewer and eventually found that clicking on the kded node in qdbusviewer caused a timeout, with the corresponding messages coming up in dbus-monitor. So kded4 was stuck. Next I used pstack on kded4, and the stack showed that it was stuck on a write to the Gamin file descriptor. Promptly I got a stack of the gam_server process and found that one of its threads was blocked on a write. Using pfiles I saw both descriptors pointing to the same unix domain socket – AHA, so they were deadlocked.

Now Gamin is a drop in replacement for FAM – File Alteration Monitor that can monitor files and directories and provide notification of their changes to consumers. Gamin uses Inotify on Linux. The OpenSolaris port done by the JDS team uses File Event Notification that is similar to Linux Inotify but uses a more generic Event Ports framework. Now KDE uses the KDirWatch class that in turn communicates with Gamin and I was using the OpenSolaris Gamin port. It appears that KDirWatch uses a single thread while Gamin can send back events at any time, even when the consumer has issued a subscribe call and it has not returned. Indeed the OpenSolaris port of Gamin sends back events in the new subscription processing flow. There are additional calls in KDirWatch around calls to FAMMonitorFile and FAMMonitorDirectory with comments about avoiding a deadlock. But that is not enough as I could clearly see. This looks like a design shortcoming to me. Ideally KDirWatch should use one thread to handle async notifications and invoke subscription requests in another thread.
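The design I have in mind can be sketched with a toy example in Python. None of this is the KDirWatch or Gamin API: fake_monitor, Watcher, the line-based EVENT/ACK protocol and the burst size are all invented for illustration. The point is only that a dedicated reader thread keeps draining the socket, so a subscribe call can never end up blocked against a peer that is itself blocked writing events back.

```python
import queue
import socket
import threading

EVENTS_PER_SUBSCRIBE = 1000  # hypothetical burst size, chosen for the demo


def fake_monitor(conn):
    """Toy stand-in for a file monitor: answers each request with a burst
    of events followed by an acknowledgement, all on the same socket."""
    with conn, conn.makefile("rb") as requests, conn.makefile("wb") as replies:
        for request in requests:
            path = request.decode().strip()
            for i in range(EVENTS_PER_SUBSCRIBE):
                replies.write(f"EVENT {path} {i}\n".encode())
            replies.write(f"ACK {path}\n".encode())
            replies.flush()


class Watcher:
    """Client that reads notifications in one thread and subscribes from another."""

    def __init__(self, conn):
        self._conn = conn
        self.events = queue.Queue()   # async notifications land here
        self._acks = queue.Queue()    # replies to subscribe requests
        threading.Thread(target=self._drain, daemon=True).start()

    def _drain(self):
        # Reader thread: always consuming, so the peer's writes never stall.
        with self._conn.makefile("rb") as stream:
            for line in stream:
                kind, _, rest = line.decode().strip().partition(" ")
                (self._acks if kind == "ACK" else self.events).put(rest)

    def subscribe(self, path, timeout=5.0):
        # Caller thread: send the request, then wait for the ack that the
        # reader thread hands over.
        self._conn.sendall(f"{path}\n".encode())
        return self._acks.get(timeout=timeout)


def demo():
    client_sock, server_sock = socket.socketpair()
    threading.Thread(target=fake_monitor, args=(server_sock,), daemon=True).start()
    watcher = Watcher(client_sock)
    acked = watcher.subscribe("/tmp")
    pending = watcher.events.qsize()
    client_sock.shutdown(socket.SHUT_RDWR)
    client_sock.close()
    return acked, pending


if __name__ == "__main__":
    print(demo())
```

Here the events sent during the subscription flow simply queue up for later processing instead of filling the socket buffer while the client waits for its reply.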

OK, all good, but now what am I to do? Getting so close to finishing the KDE 4.3.1 build, I was in no mood to sit down and start changing KDirWatch. One alternative was to disable FAM and use Polling Mode, but that would be horrible. Eventually I modified the Gamin patch for OpenSolaris to not send back some of the events during the subscription flow, and that did the trick for now. The DBUS timeouts are solved. I am not sure what the impact of this will be on Gnome 2.26, but at least KDE, which is the primary desktop for BeleniX, is working. BTW KDE 4.3.1 on BeleniX 0.8 Alpha is now available. I will put up a separate post on that.

Cute-E Four Point Five

Qt 4.5

Having reached a working KDE 4.2.4 desktop milestone, I have been racing to get to 4.3.1. 4.2.x has enough problems and 4.3.x has enough fixes and improvements to warrant a quick move. One of the requirements for 4.3.1 is Qt 4.5 and having already a build recipe for 4.4 I thought it won’t take much time apart from the compilation time itself. But Alas, badly mistaken was I!

It turned out to be a lot of “fun” for 5 days before I could get a working Qt 4.5 built. The first time I built 4.5 all text was appearing as square boxes. Suspecting some locale issues in my older build env, I set up a fresh new one using the install_belenix script, but no joy. Cursing my bad luck I sat down for the arduous task of digging through the Qt text rendering and font handling code. To cut a long story short, it eventually took me 5 evenings of a wild goose chase through multiple functions in multiple libraries to identify an iconv issue. I am using GNU libiconv, and the way Qt 4.5 caches the iconv handle seems to cause a problem. Subsequent googling with more specific search terms turned up this link: http://mail.kde.org/pipermail/kde-freebsd/2009-April/005059.html

The FreeBSD developers had faced the exact same issue back in April. Eventually I patched the code just enough to avoid the caching and finally got text that a human could read (not a monkey BTW :-P ). After this things have been pretty smooth and I have made good progress except for another sticky issue with building the Soprano bindings in KDEbindings. I have disabled the Soprano bindings for now.

I came across this amusing post on TG Daily: Metallica fans are monkeys. This is nice. I never had any fondness for rock music. I absolutely hate the sound of canisters, utensils, hacksaws and what not creating a racket near my ears! I generally listen to classical both Indian and Western, Pop, and other traditional Indian music. So that makes me a Human Being – YEEEEEHAH!

It was a very long story getting to a functional KDE 4.2.4 on BeleniX. The amount of effort needed to integrate KDE 4 and iron out issues is immense indeed and took months, not least because of the humongous dependency tree of KDE 4. Of course the work is still ongoing: there are still bugs to fix and an update to the latest 4.3 release to do. Thanks to several guys like Sriram Narayanan, Kunal Ghosh, Kaya Saman and others who helped to test and find bugs/workarounds. Of course all this is part-time (weekend, evening) effort outside of my day job.

KDE 4.2.4 in BeleniX

The above screenshot shows KDE 4.2.4 desktop on BeleniX running Konqueror with Webkit as the rendering engine, Krdc, and Lotus Notes inside Wine with the Clock from Google Gadgets.

One of the simplifying but interesting things was the usage of Gcc 4.4 and the new Graphite optimizations in some places. In addition I borrowed patches and build recipes from the KDE-Solaris team’s efforts and the Fedora repository. There were many challenges and there is still quite a bit of work to do. There are several patches that I’d have to submit to upstream KDE projects.

One of the most recent adventures was to get Amarok2 built properly. At first, I needed MySQL embedded, so I hacked the MySQL 5.0 Makefile in the SFW consolidation to build it. However Amarok would refuse to start, first giving symbol errors and then, after a few hacks, coring somewhere in libQtCore after trying to initialize Innodb. The attempt to initialize Innodb confused me till I read the Amarok2 build guide, which states that 5.1 is needed. So back to the SFW repository, when it hit me again: the SFW repo compiles using Studio and MySQL uses C++ stuff. Arrgh. I spent a whole day creating a new Spec file for building MySQL 5.1 with Gcc4. That was quite challenging to get right, as was getting embedded MySQL as a shared lib. Anyway the Innodb thing went away after that, but the coring persisted.

It was coring because the dynamic_cast operator was failing to cast a SqlCollection object to one of its parents, SqlStorage. Weird! Eventually I played with the compiler options and changed -march=pentium3 to -march=pentiumpro, added -frtti, and finally dynamic_cast started working again. Then the Amarok2 screen finally came up, and then it cored again after 10 seconds. This time embedded MySQL was linked into Amarok2 as a static lib, so I rewhacked the Spec file to build a shared lib and got it right after several attempts. Finally I made a silent prayer and had Amarok2 working without coring. This can be a Gcc 4.4.0 issue as well. We will be moving to Gcc 4.4.1 shortly, with the patch to let it build Wine added.

BeleniX uses packages from the SFW repo. One of the ongoing activities is to migrate package builds from SFW to spec files in the BeleniX CVS repo and build with Gcc. The SFW gate packaging is weird in some respects. Not all features are enabled for some packages, for instance no embedded server in MySQL. In addition the most horrible thing is that “11.11” is used as the package version for every package! What sense does it make to do this? It requires ugliness in spkg version comparison. There should be other ways to tie SFW package builds to ON build releases.

Another intention of ours is to get a working Firefox 3.5.x build on BeleniX using Gcc 4.4. We are able to get a working debug-enabled build, but the release build crashes in a stub function. In addition, since SUN Java is being used, the Java plugin won’t work in a Gcc Firefox build on OpenSolaris. This is because SUN Java for osol is built using the SUN Studio C++ compiler. We will have to investigate getting OpenJDK built using Gcc4 on OpenSolaris. We do have XULRunner built, however, since that is needed by Google Gadgets.

Well that is enough for now, more stories later.
