Clever hacks are just not.

Just recently I wrote about the clever “dual ABI” hack found in GCC 5’s implementation of the standard C++ library. I quipped at the end of that post:

I suspect the right way to handle changing ABI is to do it the way it’s always been done – by bumping the soname.

Having recently done some work on a C++ project, I’ve discovered an issue present in the dual ABI implementation. What seemed like trivial code to handle an exception from attempting to open a non-existent file just wasn’t working. I worked it down to the following test case:

#include <fstream>
#include <iostream>
#include <typeinfo>

int main(int argc, char **argv)
{
    using namespace std;
    using std::ios;
    
    ifstream a_file;
    a_file.exceptions(ios::badbit | ios::failbit);
    
    try {
        a_file.open("a-non-existent-file", ios::in);
    }
    catch (ios_base::failure &exc) {
        cout << "Caught exception on attempt to open non-existing file." << endl;
    }
    catch (exception &exc) {
        cout << "Caught standard exception: " << typeid(exc).name() << endl;
    }

    return 0;
}

Surprisingly, this fails to work properly (prints the wrong message) when compiled with a straight “g++ -o simple simple.cc”. It works correctly when compiled instead with “g++ -D_GLIBCXX_USE_CXX11_ABI=0 -o simple simple.cc”. I would have filed a bug, but it seems the problem is known.

This is a pretty major bug. Any C++ program that handles I/O errors by catching the appropriate exceptions just isn’t going to work. Furthermore, this isn’t just a minor problem; having to separate out implementations of iosbase and derived classes for the two ABIs would significantly reduce the benefit of even having the dual ABI. The problem with “clever” hacks like the dual ABI libstdc++ is that they are (1) often not so clever and are (2) always most definitely hacks. The proposed solution? Heap more hacks on top so the first hack works:

One option to make both forms work would be to hack the EH runtime to make handlers for std::ios_base::failure able to catch std::ios_base::failure[abi:cxx11] objects, and vice versa, creating an object of the other type on the fly.

Urgh. Just urgh.

What really annoys me most about this nonsense is that the one sane option I have – stripping out the ‘abi_tag’ nonsense from the source, re-compiling libstdc++ and calling it libstdc++.so.7 – would probably cause me problems further down the track, because third-party binaries are going to want the new ABI with the old soname (and even if I didn’t care about those, I’d still be one version number up from the official libstdc++ from that point on, which feels like it could cause problems).

Advertisements

Tale of Two ABIs

With GCC 5.2 just having been released, I figured it was time to upgrade my system to the latest-and-greatest in GNU compiler technology. So far the new compiler seems fine, but there’s a subtle issue that has been introduced to do with the “dual ABI” in the standard C++ library implementation, libstdc++. A Red Hat developer (“rhjason”) writes about it here. If I try to boil it down to the essence:

  • The C++11 standard has some complexity requirements which require re-engineered implementations of certain standard classes, among them std::string and std::list. This thereby requires an ABI change (perhaps the sizes of the structures have changed, or inline functions exist which now need to be implemented differently; given that std::list is a template, there’s really no getting around this).

    “Complexity requirements” refers to the algorithmic complexity (think ‘big-O’ notation). In particular std::list size() must now be O(1) – that is, it must take a constant amount of time regardless of the number of elements in the list (GCC bug). Previously the size() method worked by counting the elements in the list, which is O(n); there’s an explanation (in the form of an argument against) here. (I’m not sure if I agree with this argument).

  • Changing the ABI normally means changing the soname, the name of the dynamic library that you link against. In this case it would have been a bump from libstdc++.so.6 to libstdc++.so.7.

    This does, admittedly, come with problems. It’s possible to have some program (A) link against a library (B) and also against the C++ library (libstdc++.so.7). However, it may be the case that the library B was linked against the older version of the C++ library (libstdc++.so.6). When you execute A, then, it will dynamically link against two versions of the library. Because these libraries have (at least some) common symbol names, one will override the other. That means, for instance, that the library B will call some function but have the libstdc++.so.7 execute, which has the wrong ABI. There’s also the possibility of passing standard library objects (such as strings) between A and B, for which the ABI differs. Either case might lead to data structure corruption, incorrect behavior and possibly crashing.

    (Note that this problem is not limited to the C++ library. Any library which changes its ABI and bumps its soname potentially causes similar issues).

  • To combat these problems, the GCC/libstdc++ folk decided to put the old and new ABI into a single dynamic library (libstdc++.so.6).

    In many cases, nested inline namespaces (explanation) are used to avoid symbol clashes between the old and new ABI; however this is not possible in every case (because you can’t have a namespace inside a class, for instance). So, the compiler now understands ‘__attribute (( abi_tag(“cxx11”) ))’, which can be applied to an inline namespace or a declaration.

    (I do not understand why the tag should ever actually need to be applied to a namespace, and indeed it’s not really done in libstdc++. It seems that you do not need to supply the abi argument if you do – so you can have just ‘__attribute ((abi_tag))’ – but neither is it at all clear to me what the purpose of this would be. With a few experiments I see that it has some effect on warnings when -Wabi-tag is used, and might affect the abi tag that functions/variables inside the namespace “inherit” if they use an abi-tagged type).

  • You can select the ABI to compile against using the _GLIBCXX_USE_CXX11_ABI preprocessor macro (set it to 1 for the new ABI, or 0 for the old ABI). If the macro is not set it will be set to 1 when you include a C++ header.

So, this allows two ABIs to exist in a single dynamic library. As an added bonus you may get link-time errors if you try to link to a library using the other ABI. However, there is at least one significant issue with this strategy; Allan McRae (an Arch Linux guy) writes about it here:

This discovered an issue when building software using the new C++ ABI with clang, which builds against the new ABI (as instructed in the GCC header file), but does not know about the abi_tag attribute. This results in problems such as (for example) any function in a library with a std::string return type will be mangled with a [abi:cxx11] ABI tag. Clang does not handle these ABI tags, so will not add the tag to the mangled name and then linking will fail.

In other words, the GCC people have basically decided that other compilers don’t exist. This is, I have to say, pretty shitty. The LLVM guys are now forced to choose between supporting this non-standard abi_tag attribute or supporting only the non-C++11 ABI when using GNU libstdc++. Although I hope they go with the former (because it’ll make life easier for me personally) one could understand if they decided not to. I suspect the right way to handle changing ABI is to do it the way it’s always been done – by bumping the soname.

Compiler Bugs Worst Bugs

I think that compiler bugs – the kind where they produce the wrong code, i.e. an incorrect compilation – are perhaps the worst kind of bug, because they can be very difficult to identify and they can cause subtle issues (including security issues) in other code that should work correctly. That’s why this GCC bug bothers me a lot. Not only is it a “wrong code” bug, but it is easy to reproduce without using any language extensions or unusual compiler options.

Just -O2 is needed when compiling the following program:

#include <assert.h>

unsigned int global;

unsigned int two()
{
    return 2 * global;
}

unsigned int six()
{
    return 3 * two();
}

unsigned int f()
{
    return two() * 2 + six() * 5;
}

void g(const unsigned int from_f)
{
    const unsigned int thirty_four = two() * 2 + six() * 5;
    assert(from_f == thirty_four);
}

int main()
{
    global = 1;
    const unsigned int f_result = f();
    g(f_result);
}

It’s easy to reproduce, and it’s obviously wrong. Somehow the compiler is managing to mess up the calculation of (2 * 2 + 6 * 5). And yet, it’s classified as Priority 2. And furthermore, GCC 4.9.3 was just recently released, with this bug still present. It makes me start to wonder about the quality of GCC. I’m waiting for this to be fixed before I move to the 4.9 series (5.x series is way too new for my liking, though I might skip over 4.9 if 5.2 is released in the near future).

I’d like to run my compiler benchmarking tests on LLVM 3.6.1 and GCC, but I’ll hold off for a bit. I have done some quick testing with LLVM though and I have to say it is exceeding expectations.

Updated C compiler benchmarks, Feb 2012

With several new compiler releases it’s time to update my compiler benchmark results (last round of results here, description of the benchmark programs here). Note that the tests were on a different machine this time and the number of iterations was tweaked so numerical results aren’t comparable.

Without further ado:

So, what’s interesting in this set of results? Generally, note that LLVM and GCC now substantially compete for dominance. Notably, GCC canes LLVM in bm4, where it appears that GCC generates very good “memmove” code, whereas LLVM generates much more concise but apparently also much slower code; also in bm8 (trivial loop removal – though neither compiler actually performs this optimization, GCC apparently generates faster code). On the other hand LLVM beats GCC quite handily in bm6 (a common subexpression elimination problem).

GCC 4.6.2 improves quite a bit over 4.5.2 in bm5 (essentially a common subexpression refactoring test). However, it’s slightly worse in bm3 and for some reason there is a huge drop in performance for the bm7 test (stack placement of returned structure).

The Firm suite, surprisingly, mostly loses ground with 1.20.0 and remains uncompetitive.

Edit 25/03/12: I’ve noticed a flaw in bm6, which when corrected causes GCC to perform much worse – about 0.5 seconds rather than the 0.288 reported above.

Updated C compiler benchmarks: llvm-2.7, libfirm-1.18.1

Since I did my previous C compiler benchmark, there have been a few new software releases, including Gcc 4.5, llvm 2.7 (and 2.6) and libfirm 1.18 (and 1.17). Here are results from benchmarking these compilers:


Most important things: Gcc version is now 4.4.4, though this hasn’t made much difference to the results. llvm-2.7 is benchmarked both with the “native” clang frontend, and the gcc-4.2 based frontend; note the difference in bm1, 3 and 6.

In the case of LLVM, results have improved significantly in bm2 and bm5.

For LibFirm, results have improved dramatically for bm5, but are also significantly worse for bm3.

Gcc is still (subjectively) the overall winner, though it does get beaten in a few of the benchmarks (bm2,6,7) by LLVM and even in a couple (bm5,7) by LibFirm.

Gcc 4.5 has been released but has yet to be benchmarked. Edit 13/7/2010: Gcc 4.5 benchmarked, and the results are so similar to those for Gcc 4.4.4 that it’s not worth reporting them separately.

There are three new tests:

bm8 tests the compiler’s ability to remove trivial loops (loops with empty bodies, but a large number of iterations). The loops are implemented with “goto” rather than a C loop structure, and there is an inner and outer loop. No tested compiler manages to  remove the loops.

bm9 tests the compiler’s ability to break up a loop into two separate loops, when doing so avoids an efficiency problem caused by a potential aliasing pointer. The function in the benchmark takes two arguments: an int array and an int pointer “p”. The inner part of the loop uses the value pointed at by “p” as part of an expression assigned to each array element in turn. Because “p” might alias one of the array elements, the stored value must be re-loaded each iteration, unless the compiler breaks the loop into two. None of the tested compilers perform this optimization.

bm10 tests that a shift-left of a value inside a large loop can be reduced to a constant (due to having shifted all the bits out). None of the tested compilers manage this.

It’s disappointing that some optimizations which potentially yield huge benefit are not implemented in any of the tested free compilers; however, GCC 4.5 has been released and is yet to be tested (edit 13/7/2010: tested, no surprises). Also, GCC includes some options for loop optimization which are not turned on at any optimization level, due to them requiring external libraries at build time; it would be interesting to see whether these would make any difference in any of the tests (13/7/2010. They don’t).

Further update 4/8/2010: I’ve just tested with gcc 4.5.1. No very noticable differences were seen from the performance of 4.4.4, however, I am running a new kernel and this seems to slightly affect the numbers for 4.4.4; for instance, bm3 comes down to about the 0.162 mark, more-or-less even with LLVM.