X keyboard crap

I’m getting this in my log when I try to start X:

[  7780.894] (EE) Error compiling keymap (server-0)
[  7780.894] (EE) XKB: Couldn’t compile keymap
[  7780.894] XKB: Failed to compile keymap
[  7780.894] Keyboard initialization failed. This could be a missing or incorrect setup of xkeyboard-config.
[  7780.894]
Fatal server error:
[  7780.894] Failed to activate core devices.

What’s going on? “Couldn’t compile keymap” has to be one of the most useless error messages ever. Why can’t you compile the keymap??!

Update:

I renamed the “xkbcomp” executable and replaced it with a script which logged options and output before executing the original. I’m seeing this output:

The XKEYBOARD keymap compiler (xkbcomp) reports:
> Error:            Cannot open “/tmp/server-0.xkm” to write keyboard descriptio
>                   Exiting

… However, I don’t understand why it’s unable to create a file in /tmp. I’ve verified the file doesn’t already exist before xkbcomp is run, and that all users can create files in /tmp (the “t” bit is set).

Once again, the error message is bad: please tell my why you can’t open the file. (Hint: it’s called perror).

Update: (Solved!):

Turns out the permissions were the problem. They were:  rwxr-xrwt, i.e. the group was denied write permission. I didn’t think this was a problem seeing as “other” users are allowed to write files, but apparently that’s not how Linux permission checking works.

Advertisements

Trac is super crap

A couple of projects I work on use Trac (0.12.1, the latest at time of writing) for bug/issue tracking. I recently got an email from a user of our Trac system, saying that she couldn’t seem to reset her password – that is, the system told her that it had reset her password and sent her an email with the new password, but in fact the email with the new password never arrived.

I went to scan the trac logs, but it turns out they were disabled by default (first problem). I enabled the logs and reset my own password, but no joy – nothing untoward appears in the logs. I checked and double-checked the mail settings, but they all looked ok. I then altered the configuration, switching from SMTP to sendmail and specifying a freshly-crafted wrapper script which logs when it’s called as the “sendmail” executable. With this I can verify that the mail is not sent – the script doesn’t appear to be being executed, and the mail server logs show no activity.

Finally I stumble on another setting:

acct_mgr.notification.accountchangelistener = enabled

And with this, finally, finally, Trac starts sending the new password emails when I do a password reset. Now, this is a documented setting, but really, I mean really, couldn’t there have been some warning in the logs? Or could maybe the setting to allow users to reset their password also enable the “account change listener” (whatever that is)? Is it too much to ask that an obviously broken configuration be either made redundant or at least identified as such by the software?

Well, I’ve solved the original problem. It’s at this point, though, that I realise a much more significant problem exists: The Trac AccountManagerPlugin password reset functionality allows anyone to reset anyone else’s password, and, get this, this is where it gets good, they can have the new password sent to any email address. [Update: no, this is wrong; once you actually enable the acountchangelistener setting as described above, you get an error if the specified email address doesn’t match the account. However, if you don’t enable that setting, then anyone can reset your password; however they won’t receive the new password at the email address they specify].

When the password reset functionality is enabled, if you register as a new Trac user you will see text telling you that providing your email address is optional, but if you do provide it you will be able to “reset your password if you ever forget it”. That’s clearly bollocks [if accountmanagerlistener isn’t enabled], since the password reset function doesn’t verify that the email address is correct for the user [in that case].

The real problem, I guess, is that Trac doesn’t provide any decent built-in account management. I mean, should the account manager – a most basic piece of functionality – really be a plugin? I don’t think so. (Worse than that, to use the plugin with Trac 0.12, you have to use a subversion pull of the plugin rather than a released version!).

And that is why Trac is super crap.

Lies, damned lies, and Rails documentation

I was recently playing with Rails again. A while back a site I was working on had an issue where the parameters from an uploaded form were all coming in as file objects – even the parts that didn’t correspond to uploaded files. The client in this case was some Java software using the Apache Httpclient library. We’d only recently upgraded Rails, and it turned out we were seeing the problem described here:

http://blogs.sun.com/mandy/entry/rack_and_content_type_fo

The problem was easily worked around. This led me on a hunt, however, for some documentation on how exactly you are meant to handle form uploads in Rails. About the only vaguely official-looking documentation I can find at this point is the Action Controller Overview guide, which has this to say:

Note that parameter values are always strings; Rails makes no attempt to guess or cast the type

Clearly baloney. If one of the form parts is a file upload, then what you get is an object with several methods defined including “read” to return the contents of the file as a string, as well as things like “original_filename” which tell you the filename as specified by the uploader. None of this is documented anywhere.

It’s this “we don’t need to document it” attitude that really annoys me about Rails. Sure, there’s all this magical stuff you can do with it and it has all these built-in goodies that can save you hours of coding just by using magic variable names (except for those times of course when you have to spend hours debugging, because you used one of those magic variable names by mistake and without realising). But there’s just no reliability with the API. The params hash for instance – what exactly am I allowed to do with a value I pull from the hash? Can I be sure that it will either be a string or one of these mysterious file upload objects? This isn’t the only omission in the documentation; for instance what does the “session” method in ActionController::Request actually return? What am I allowed to call on it?

This is the real strength of languages with strong typing – programs written in them are, to a certain degree, self-documenting. If I know what type a method actually returns then I know for certain what methods I can call. No nasty surprises when the framework developers suddenly decide to make it return a different kind of object!

Edit 27/8/2010: Another major strength of statically-typed languages that I forgot to mention is of course the ability to refactor code with the assistance of automated tools. As discussed in this blog post by Cedric (and I love the title).

FUSE, and ways in which it sucks.

I’ve just started toying around with Fuse. The possibilities of writing comprehensive filesystems in user space interests me, not the least because it makes development much easier before you move to kernel space.

I’ll say it right out, Fuse is certainly not the worst conceived / designed/ implemented piece of software by a long shot. The documentation isn’t great, but it’s not totally lacking either (which immediately puts Fuse a cut above a whole swathe of other open source projects). It’s not perfect however.

My main annoyance is the multi-threading support which, frankly, shouldn’t exist; i.e., it should be handled by the application and not the library. It is, generally, better when software layers are thin; it should be up to the app to do appropriate locking etc., and the library (in this case Fuse) should just specify in the documentation where the locking is needed (or rather, more usefully, where it is not needed). Ok, quibbles, and providing more functionality is hardly a crime; after all, the low-level layer is still available and (mostly) unobfuscated by the higher-level functionality.

The real issue is that linking against libfuse pulls in the pthread shared library as well, which is not so good. The pthread library incurs overhead (other than taking up memory/address space) because when it is loaded, locking and unlocking mutexes suddenly become real operations – which means fputc() and a whole bunch of other library functions become a weeny bit slower. (Don’t believe me? do “nm” on /lib/libc.so and /lib/libpthread.so and notice that they both define pthread_mutex_lock).  Fuse should really be split into two librarys, libfuse and libfuse_mt, and only the latter should require pthreads. (Or, as I suggested earlier, just ditch the multi-thread functionality altogether, leave it up to the application to pull in pthreads if it really needs it).

Another annoyance is the fuse_chan_receive function:

int fuse_chan_recv(struct fuse_chan **ch, char *buf, size_t size);

Notice that first argument? Not just a pointer, it’s a pointer-to-a-pointer. Why, great Zeus, why??! The function is meant to receive data from a channel, in what crazy circumstance would you want it to change the channel pointer itself to point at something else? And in fact, the function delegates to a another function via a pointer (embedded in **ch) to a fuse_chan_ops structure; in the library itself, as far as I can see, there are only two implementations (one to receive directly from the kernel, and another called mt_chan_receive) and neither of them does actually alter the value of *ch. If you have a look at the implementation of fuse_loop() though you’ll notice the author clearly had the possibility in mind, since he copies the channel into a temporary pointer variable (tmpch) before using fuse_chan_recv – so yes, the bad API design has caused increased code complexity.

The comments above the declaration of fuse_chan_recv don’t give any clue as to why it would ever modify the *ch value and incidentally ch is incorrectly documented as being a “pointer to the channel” when it is of course really a “pointer to a pointer to the channel”.

Oh, and the mt_chan_recieve function I mentioned earlier – well, it presumably has something to do with the multi-threading support, but it’s hard to be sure because the function lacks anything bearing the slightest resemblance to a comment, and there certainly isn’t any other explanation of what it does nor how it works.

Intel Xorg drivers in a state of chaos?

My primary desktop machine has Intel’s G33 graphics chipset. It’s fairly low-end but it’s fine for my needs. For ages now I’ve been running on Xorg server 1.6.0,. Mesa 7.3rc2, and xf86-video-intel-2.6.3 and suffered only occasional X crashes, although I was certainly limited to run one X session at once (I used to like running two, essentially using it as an alternative to the desktop-switchers that are common now; it was something I could live with out, but it always bothered me that it didn’t work).

Just recently, a gcc 4.4.1 release candidate was tagged and so I decided to upgrade a few of these system components. Firstly, Xorg-sever 1.6.2, the latest stable release. I also updated my kernel to the latest 2.6.30.2. libdrm went to 2.4.12 (the latest). Mesa went to 7.4.4 (7.5 has been released as “stable” with a warning that “you might want to wait for 7.5.1, or stick with 7.4 series for now”). And, I built the xf86-video-intel driver version 2.8.0 (which supports only UXA style of acceleration, and “really likes” Kernel Mode Setting though is still meant to function without it).

Results:

Without KMS: Ok, but when I run a 3D app (Xscreensaver’s “antmaze”) I just get a blank screen. Running a second X server gives me a corrupted display and promptly crashes, forcing me to reboot via a remote log-in.

With KMS: Ok, but no 3D still. When I switch to a different VT I get an instant freeze, no chance to even start another X server (though I can still do a remote log-in).

Neither of these was really acceptable so I’ve gone back to the 2.7.1 xf86-video-intel driver. This driver supports the older “EXA” acceleration method, plus Intel’s own UXA method which they seem to think is the truth, the light, the way, but which has caused performance regressions and screen corruption in the past. (The even older “XAA” was ditched a while ago, I guess). I’m not even going to bother trying to use KMS again so I test between EXA and UXA:

With UXA: Despite the fact that Intel seem to think UXA is the way of the future, it gave me only problems. Specifically, running antmaze gave a window which flickered constantly between what antmaze should look like, and what appeared to be a copy of my desktop (resulting in one of those crazy infinite recursion effects). I managed to get a screen shot, below.

With EXA: Everything is Ok. Performance sucks a bit, but at least everything looks like it should and I don’t get crashes. Even better, it seems I am now able to run two X servers on the same machine and switch between them. I won’t be surprised if I still get the odd X crash, but that’s really just the status quo.

Screen capture when using xf86-video-intel 2.7.1 with UXA acceleration
Screen capture when using xf86-video-intel 2.7.1 with UXA acceleration

Frankly, it bothers me a lot that the 2.8.0 driver has seriously regressed from 2.7.1. For quite a while now it’s been touch-and-go with the intel drivers, I’ve seen quite a few serious problems in different releases.

Also, I’m totally unconvinced that UXA is the right solution for now. It may well have technical merits if done right but it really seems like it would be better to get EXA working properly – and fast – first, and worry about other improvements (like UXA) once the EXA driver is in a stable state, and is about as good performance-wise as it can be.

Hopefully now that EXA has been abandoned (even if it was too early) the focus on UXA will cause it to stabilise and improve.

Update 31/08/09: I’m now running Xorg server 1.6.3 and xf86-video-intel 2.8.1, which has been released since I originally wrote this blog entry. I’m pleased to say that the crashes (non-KMS) are fixed and I can run two Xservers side-by-side without problems. OpenGL was horribly slow and caused disk thrashing, until I also upgraded to Mesa 7.5. With KMS there are still problems; when I switch away from the X server to a text-mode terminal, no mode-change happens – I just keep seeing the X screen; if I switch back to it (CTRL+ALT+F7) I can continue working in X. Exiting X also fails to restore text mode.

Update 29/09/2009: xf86-video-intel 2.9.0 has the same problem as 2.8.1. Reported to the Xorg bug database. Let’s see what happens.

Update 1/10/2009: Ha, the old “not a bug” response. I thought about fighting this but didn’t want to devolve into a “open bug” / “close bug” ping-pong match; anyway, to summarise:

  1. Loading fbcon kind of solves the problem; I can switch away from the X session. However, it turns my text-mode virtual terminals into framebuffer VTs (which I don’t want).
  2. Technically it shouldn’t be impossible to have what I want – that is, to have KMS and still have text-mode terminals. But it’s probably a kernel-side issue rather than anything to do with the xf86-video-intel driver. (I may even look and see if I can fix it myself, at some point; I can’t explain why, I just really like text terminals).
  3. I personally think “NOTABUG” is the wrong resolution: there is a documentation bug. The fbcon requirement should be documented – either in the X driver man page, or the README that comes with the source bundle, or in the linux kernel documentation (the help text for the i915 driver). “NOTOURBUG” may have been appropriate.
  4. For now I’ll stick with non-KMS (can we call it “userspace mode-setting”? I like that). Although I’m worried that UMS will disappear from the driver at some point in the future, just like EXA and DRI1, murdered in the night.

Subversive

Eclipse was the first “real” Java IDE I used and I’ve stuck with it for a while now. We migrated a project from CVS to Subversion a while back and, after a brief and highly unsuccessful fling with Subclipse (I’m sure it’s improved in the meantime, but I’ve never gotten around to trying it out again) I started using Subversive to provide Subversion support within Eclipse. Most of the basic things work but it has the following annoyances:

1. On my home machine, for some reason when I synchronize, it always asks for my username and password. Actually it doesn’t so much ask, seeing as the fields are already filled in and I just have to click “ok”, but it always presents the dialog.

2. On my work machine, on the other hand, it doesn’t do that. What it does instead is ask for the proxy password, despite the fact that I have entered the proxy password in the Eclipse preferences. Annoyingly, it asks for the password again from time to time as I do commits, synchronizes, whatever.

3. A whole lot of corner cases seem to be broken, especially with regards to tagging. For instance if I choose a folder under trunk and “New -> tag” (which incidentally is a really awkward menu choice, why can’t “tag” be one of the top-level options?) and then give it a new tag name, it creates the tag and puts the contents of the folder under the tag, but not the folder itself. If I create the folder under the tag first, and supply that combined tag name and folder name as the tag name (i.e. “TAG_NAME/folder_name”), it then DOES put the folder underneath the folder with the same name that I’ve already created (TAG_NAME/folder_name/folder_name)! WTF! To get it to work I have to create the tag without creating the folder and then specify “TAG_NAME/folder_name” as the tag. It’s borked.

4. Some tag operations don’t automatically refresh the tree in the SVN repository browsing view, even though they should. (I can’t remember exactly which ones, but I did spent a lot of time today creating and removing tags while trying to solve the problem in [3]).

5. What the hell is the “ROOT” folder? It just seems to contain a copy of the repository underneath it. What’s the point?

6. What does “Add revision link” (in the context menu on a folder in the SVN repository browsing view) actually do?