Pirx un[blog]ged

Mike is reading four blogs. This is none of them.

How many chars do you have?

C++ is always good for a surprise. After years of coding in C++, I thought I knew the native data types inside out. And then this happened:

#include <cstdlib>   // EXIT_SUCCESS
#include <iostream>  // std::cout, std::endl

template < typename T >
struct foo {
    foo()   {
        std::cout << "to specialize" << std::endl;
    }
};

template <>
struct foo < signed char > {
    foo()   {
        std::cout << "signed char" << std::endl;
    }
};

template <>
struct foo < unsigned char > {
    foo()   {
        std::cout << "unsigned char" << std::endl;
    }
};

int main(int argc, char* argv[])
{
    foo< char > c;
    foo< signed char > sc;
    return EXIT_SUCCESS;
}

If you run this code, it prints:

to specialize
signed char

A char is neither a signed char nor an unsigned char, so foo< char > falls back to the primary template. I did not expect that.

C++ data types are somewhat fuzzy. An int can be 16, 32 or, in rare cases, 64 bits wide, or even 8, 24 or 48 bits on unusual toolchains (the standard itself requires at least 16). It depends on the target architecture. But in all cases a signed int is the same type as an int, a signed short is the same as a short and a signed long is the same as a long. Do you see the pattern? The keyword signed looks redundant there. Only the char type gives it a reason to exist.

To sum up, C++ has three different char types, as described in section 3.9.1, Fundamental types, of the C++ standard:

... Plain char, signed char, and unsigned char are three distinct types. A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (3.11); that is, they have the same object representation. ...

Yeah, there is a ton of legacy code out there, but this is truly mind-boggling. Please fix it!

