Mysql, and C99 aliasing rules. A detective story.

October 25, 2009

When running mythfrontend after upgrading Mysql (to 5.1.40) I got:

Driver error was [2/1210]:
QMYSQL3: Unable to execute statement
Database error was:
Incorrect arguments to mysql_stmt_execute

After a bit of Google-ing I found a Myth bug ticket which seems to refer to the same (or a very similar) problem. The problem is apparently due to a Mysql bug and has something to do with an invalid cast and gcc optimizations, which makes it easy enough to work around – except of course that the bug was apparently fixed many versions ago.

However, conveniently, changing my configure options for Mysql (from -O3 to -O1 in my CFLAGS and CXXFLAGS) does make the problem go away. I suspected alias analysis to be the problematic optimization, so I checked whether “-O3 -fno-strict-aliasing” was enough – and it was. I resolved to track down the bug.

I started by compiling Mysql twice, in two different trees – one with my regular options and one with -fno-strict-aliasing. Then, I copied object files from the first tree to the second and repeated “make install” until the problem appeared (actually, I had to copy the .o file, as well as the stupidly hidden .libs/*.o file that libtool creates, and then touch the *.lo file; the build system is broken enough that just changing an .o file won’t cause the library to rebuilt). Fortunately I found the right file first try: it was libmysql.o (or rather, libmysql.c).

Unfortunately, libmysql.c is over  5000 lines long, and I didn’t know which function was causing the problem, so I had no idea where to look in the code. I proceeded by compiling the file both with and without “-fno-strict-aliasing” and with the “-save-temps” option so I could grab the generated assembly (i.e. libmysql.s) for each case. Then I did a manual binary search by hand-editing functions from one file into the other, assembling it and re-building the library until I was able to find which function had the problem. It turned out to be “cli_stmt_execute”. At that point I decided to look through the generated assembly and see whether I could spot where it didn’t seem to match up with what was expected.  First, though, I got the size of the function down by marking various other static function that it was calling with __attribute__((noinline)), each time re-testing to make sure that the problem still occurred.

Anyway, enough with the suspense. Here is the problematic line of code (2553 in libmysql.c):

    for (param= stmt->params; param < param_end; param++)
        store_param_type((char**) &net->write_pos, param);

The problem of course is the cast “(char **)” which casts to an incompatible type, thus the value net->write_pos is accessed as a “char *” when it is declared as a “uchar *” (“uchar” is a typedef for “unsigned char”). Now “char” and “unsigned char” are compatible types, but “char *” and “unsigned char *” are not. That means, if you access some value via a “char *” reference then you must always do so and never through any incompatible reference including “unsigned char *”. Doing so breaks the strict aliasing rules.

In this case, store_param_type is actually meant to modify net->writepos:

    static void store_param_type(char **pos, MYSQL_BIND *param)
    {
        uint typecode= param->buffer_type | (param->is_unsigned ? 32768 : 0);
        int2store(*pos, typecode);
        *pos+= 2;
    }

But, as gcc inlines the call, it effectively sees the “*pos+=2″ as updating a location that must be different to the source location (net->write_pos) seeing as the types are incompatible, in other words, it effectively sees “*pos = *otherpos + 2″ (where otherpos is an “unsigned char **”). Then, it sees that *otherpos must be constant for the duration of the loop, seeing as a value of a compatible type is not written. This means it can defer the statement and collapse its multiple instances (one per iteration of the loop) to a single instance. The result? kaboom.

The solution? Change the type of the “pos” parameter for store_param_type() from “char **” to “unsigned char **” (and remove the cast from the call in cli_stmt_execute()). The function isn’t used anywhere else so this makes sense even if it wasn’t causing a bug.

And the lesson to be learnt from this: be very careful with pointer casts. Avoid them whenever possible. And learn, I mean really learn the aliasing rules. There are way too many pieces of software out there written by people who do not understand them and the result is a buggy software.

Mysql bug filed.


FUSE, and ways in which it sucks.

October 14, 2009

I’ve just started toying around with Fuse. The possibilities of writing comprehensive filesystems in user space interests me, not the least because it makes development much easier before you move to kernel space.

I’ll say it right out, Fuse is certainly not the worst conceived / designed/ implemented piece of software by a long shot. The documentation isn’t great, but it’s not totally lacking either (which immediately puts Fuse a cut above a whole swathe of other open source projects). It’s not perfect however.

My main annoyance is the multi-threading support which, frankly, shouldn’t exist; i.e., it should be handled by the application and not the library. It is, generally, better when software layers are thin; it should be up to the app to do appropriate locking etc., and the library (in this case Fuse) should just specify in the documentation where the locking is needed (or rather, more usefully, where it is not needed). Ok, quibbles, and providing more functionality is hardly a crime; after all, the low-level layer is still available and (mostly) unobfuscated by the higher-level functionality.

The real issue is that linking against libfuse pulls in the pthread shared library as well, which is not so good. The pthread library incurs overhead (other than taking up memory/address space) because when it is loaded, locking and unlocking mutexes suddenly become real operations – which means fputc() and a whole bunch of other library functions become a weeny bit slower. (Don’t believe me? do “nm” on /lib/libc.so and /lib/libpthread.so and notice that they both define pthread_mutex_lock).  Fuse should really be split into two librarys, libfuse and libfuse_mt, and only the latter should require pthreads. (Or, as I suggested earlier, just ditch the multi-thread functionality altogether, leave it up to the application to pull in pthreads if it really needs it).

Another annoyance is the fuse_chan_receive function:

int fuse_chan_recv(struct fuse_chan **ch, char *buf, size_t size);

Notice that first argument? Not just a pointer, it’s a pointer-to-a-pointer. Why, great Zeus, why??! The function is meant to receive data from a channel, in what crazy circumstance would you want it to change the channel pointer itself to point at something else? And in fact, the function delegates to a another function via a pointer (embedded in **ch) to a fuse_chan_ops structure; in the library itself, as far as I can see, there are only two implementations (one to receive directly from the kernel, and another called mt_chan_receive) and neither of them does actually alter the value of *ch. If you have a look at the implementation of fuse_loop() though you’ll notice the author clearly had the possibility in mind, since he copies the channel into a temporary pointer variable (tmpch) before using fuse_chan_recv – so yes, the bad API design has caused increased code complexity.

The comments above the declaration of fuse_chan_recv don’t give any clue as to why it would ever modify the *ch value and incidentally ch is incorrectly documented as being a “pointer to the channel” when it is of course really a “pointer to a pointer to the channel”.

Oh, and the mt_chan_recieve function I mentioned earlier – well, it presumably has something to do with the multi-threading support, but it’s hard to be sure because the function lacks anything bearing the slightest resemblance to a comment, and there certainly isn’t any other explanation of what it does nor how it works.


POSIX write()

October 3, 2009

Take a look at:

http://www.opengroup.org/onlinepubs/000095399/functions/write.html

What a mess – different requirements for regular files, pipes, and “other devices supporting non-blocking operation”.  For pipes, there are reasons for this (atomic writes), but I think they should have been abstracted out (why can’t other devices have atomic writes? Why isn’t there a general mechanism to determine maximum size of an atomic write for a given file descriptor?).

I also notice, and I think this is particularly stupid, that if write() is interrupted by a signal before transferring any data it returns -1 rather than 0. If it transfers some data before being interrupted, it returns the amount of transferred data. Why make a special case out of 0?!! This forces increased complexity in the application, which can not assume that the return from write is equal to the number of bytes actually written, and for which -1 is in almost any other case an abortive error.

Unfortunately there is no discussion of the topic I was most interested in: atomicity/ordering of reads and writes to regular files. Consider:

  1. Process A issues a “read()” call to read a particular part of a file. The block device is currently busy so the request is queued.
  2. Process B issues a “write()” request which writes to the same part of the file as process A requested.

The question is now: can the data written by process B be returned to process A, or must the data that was in the file at the time of the read() call being issued? Also, is it allowed that process B might see part of the data that process A wrote, and part of what was in the file at the time of the read request?

Read the rest of this entry »