# C compiler quirks I have encountered

<div class="social">
	<a href="https://www.reddit.com/r/C_Programming/comments/924ou1/c_compiler_quirks/?st=jnnow6h0&sh=2f4b4bc3"><img src="/_/imgs/social-reddit.png"></a>
</div>

Date: 2018-07-26 \
Git: <https://gitlab.com/mort96/blog/blob/published/content/00000-home/00011-c-compiler-quirks.md>

[In a previous blog post](https://mort.coffee/home/obscure-c-features),
I wrote about some weird features of C, the C preprocessor,
and GNU extensions to C that I used in my testing library,
[Snow](https://github.com/mortie/snow).

This post will be about some of the weird compiler and language
quirks, limitations, and annoyances I've come across.
I don't mean to bash compilers or the specification;
most of these quirks have good technical or practical reasons.

## Compilers lie about what version of the standard they support

There's a handy macro, called `__STDC_VERSION__`, which describes the version
of the C standard your C implementation conforms to. We can use
`#if (__STDC_VERSION__ >= 201112L)` to check if our C implementation conforms to
C11 or higher (C11 was published in December 2011, hence `2011 12`). That's
really useful if, say, you're a library author and
[have a macro which uses \_Generic](https://github.com/mortie/snow/blob/1ca97d03b8eaf824d19d10a41fa382a562392bd5/snow/snow.h#L395),
but also have alternative ways of doing the same and
[want to warn people when they use the C11-only macro in an older
compiler](https://github.com/mortie/snow/blob/1ca97d03b8eaf824d19d10a41fa382a562392bd5/snow/snow.h#L430).

In theory, this should always work; any implementation of C which conforms to
all of C11 will define `__STDC_VERSION__` as `201112L`, while any
implementation which doesn't conform to C11, but conforms to some earlier
version, will define `__STDC_VERSION__` to be less than `201112L`. Therefore,
unless the \_Generic feature gets removed in a future version of the standard,
`__STDC_VERSION__ >= 201112L` means that we can safely use \_Generic.
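
To make that concrete, here is a minimal sketch of such a check. `TYPE_NAME` and `same_type_name` are hypothetical names made up for this example (they are not taken from Snow), and the sketch assumes the compiler reports `__STDC_VERSION__` honestly:

```c
#include <string.h>

/*
 * TYPE_NAME is a hypothetical macro for illustration; it is not part of Snow.
 * With C11 or newer it dispatches on the static type of its argument,
 * otherwise it falls back to a fixed string.
 */
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)
# define TYPE_NAME(x) _Generic((x), \
	int: "int", \
	double: "double", \
	default: "other")
#else
# define TYPE_NAME(x) "unknown (needs C11)"
#endif

/* Returns 1 if the two type names are equal, 0 otherwise. */
static int same_type_name(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}
```

In a C11 build, `TYPE_NAME(1)` selects the `int` branch and `TYPE_NAME(1.5)` selects the `double` branch; in an older dialect every call yields the fallback string instead of a syntax error.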

Sadly, the real world is not that clean.
As early as GCC 4.7, you could enable C11 by passing in `-std=c11`,
which would set `__STDC_VERSION__` to `201112L`,
but the first release to actually implement all non-optional features of C11
was GCC 4.9.
That means, if we just check the value of `__STDC_VERSION__`,
users on GCC 4.7 and GCC 4.8 who use `-std=c11`
will see really confusing error messages instead of our nice error message.
Annoyingly, GCC 4.7 and 4.8 happen to still be extremely widespread
versions of GCC.
(Relevant: [GCC Wiki's C11Status page](https://gcc.gnu.org/wiki/C11Status))

The solution still seems relatively simple; just don't use `-std=c11`.
More recent compilers default to C11 anyways,
and there's no widely used compiler that I know of
which will default to setting `__STDC_VERSION__` to C11
without actually supporting all of C11.
That works well enough, but there's one problem:
GCC 4.9 supports all of C11 just fine, but only if we give it `-std=c11`.
GCC 4.9 also seems to be one of those annoyingly widespread versions of GCC,
so we'd prefer to encourage users to set `-std=c11` and make the
macros which rely on \_Generic work in GCC 4.9.

Again, the solution seems obvious enough, if a bit ugly:
if the compiler is GCC, we only use `_Generic` if the GCC version is 4.9 or
greater and `__STDC_VERSION__` is C11.
If the compiler is not GCC, we just trust it if it says it supports C11.
This should in theory work perfectly:

``` C
#if (__STDC_VERSION__ >= 201112L)
# ifdef __GNUC__
#  if (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#   define IS_C11
#  endif
# else
#  define IS_C11
# endif
#endif
```

Our new `IS_C11` macro should now always be defined if we can use \_Generic
and always not be defined when we can't use \_Generic, right?

Wrong. It turns out that in their quest to support code written for GCC, Clang
also defines the `__GNUC__`, `__GNUC_MINOR__`, and `__GNUC_PATCHLEVEL__`
macros, specifically to fool code which checks for GCC into thinking Clang is
GCC.
<del>However, it doesn't really go far enough;
it defines the `__GNUC_*` variables to correspond to the version of _clang_,
not the version of GCC which Clang claims to imitate.
Clang gained support for C11 in 3.6, but using our code,
we would conclude that it doesn't support C11 because `__GNUC__`
is 3 and `__GNUC_MINOR__` is 6.</del>
Update: it turns out that Clang always pretends to be GCC 4.2, but the same
issue still applies; `__GNUC__` is 4, and `__GNUC_MINOR__` is 2, so it fails
our version check.
We can solve this by adding a special case for when `__clang__` is defined:

``` C
#if (__STDC_VERSION__ >= 201112L)
# if defined(__GNUC__) && !defined(__clang__)
#  if (__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9))
#   define IS_C11
#  endif
# else
#  define IS_C11
# endif
#endif
```

Now our code works with both Clang and with GCC, and should work with all other
compilers which don't try to imitate GCC - but for every compiler which does
imitate GCC, we would have to add a new special case.
This is starting to smell a lot like user agent strings.

The Intel compiler is at least nice enough to define `__GNUC__` and
`__GNUC_MINOR__` according to the version of GCC installed on the system;
so even though our version check is completely irrelevant in the Intel
compiler,
at least it will only prevent an otherwise C11-compliant Intel compiler
from using \_Generic if the user has an older version of GCC installed.

> **User**: Hi, I'm using the Intel compiler, and your library claims
> my compiler doesn't support C11, even though it does.

> **You**: Upgrading GCC should solve the issue. What version of GCC do you
> have installed?

> **User**: ...but I'm using the Intel compiler, not GCC.

> **You**: Still, what version of GCC do you have?

> **User**: 4.8, but I really don't see how that's relevant...

> **You**: Try upgrading GCC to at least version 4.9.

(Relevant: [Intel's Additional Predefined Macros page](https://software.intel.com/en-us/node/524490))
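
As a sketch of what "one more special case" could look like, the check below only applies the GCC version test when neither `__clang__` nor `__INTEL_COMPILER` (the macro Intel's classic compiler defines) is present. `SKETCH_REAL_GCC` and `SKETCH_IS_C11` are hypothetical names for this example, and the list of impostors is an assumption that would have to grow for every other compiler that defines `__GNUC__`:

```c
/*
 * Sketch: only apply the GCC version check to "real" GCC.
 * SKETCH_REAL_GCC and SKETCH_IS_C11 are hypothetical names used for
 * illustration; they do not come from Snow.
 */
#if defined(__GNUC__) && !defined(__clang__) && !defined(__INTEL_COMPILER)
# define SKETCH_REAL_GCC 1
#else
# define SKETCH_REAL_GCC 0
#endif

/* Trust __STDC_VERSION__ unless we are on a real GCC older than 4.9. */
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)
# if !SKETCH_REAL_GCC || __GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 9)
#  define SKETCH_IS_C11 1
# endif
#endif

/* Give the macro a value in every case so it can be used in expressions. */
#ifndef SKETCH_IS_C11
# define SKETCH_IS_C11 0
#endif
```

Defining the result as `0` or `1`, rather than defined or undefined, is just a convenience here: it lets the macro be used in both `#if` and ordinary `if` conditions.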

## \_Pragma in macro arguments

C has had pragma directives for a long time.
They're a useful way to tell our compiler something implementation-specific;
something which there's no way to say using only standard C.
For example, using GCC, we could use a pragma directive to tell our compiler
to ignore a warning for a couple of lines,
without changing warning settings globally:

``` C
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
// my_float being 0 indicates a horrible failure case.
if (my_float == 0)
	abort();
#pragma GCC diagnostic pop
```

We might also want to define a macro which outputs the above code,
so C99 introduced the `_Pragma` operator,
which works like `#pragma`,
but can be used in macros.
Once this code goes through the preprocessor,
it will do exactly the same as the above code:

``` C
#define abort_if_zero(x) \
	_Pragma("GCC diagnostic push") \
	_Pragma("GCC diagnostic ignored \"-Wfloat-equal\"") \
	if (x == 0) \
		abort(); \
	_Pragma("GCC diagnostic pop")

abort_if_zero(my_float);
```

Now, imagine that we want a macro to trace certain lines;
a macro which takes a line of code,
and prints that line of code while executing the line.
This code looks completely reasonable, right?

``` C
#define trace(x) \
	fprintf(stderr, "TRACE: %s\n", #x); \
	x

trace(abort_if_zero(my_float));
```

However, if we run _that_ code through GCC's preprocessor, we see this mess:

``` C
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
#pragma GCC diagnostic pop
fprintf(stderr, "TRACE: %s\n", "abort_if_zero(my_float)"); if (my_float == 0) abort();
```

The pragmas all got bunched up at the top!
From what I've heard, this isn't against the C standard,
because the standard is not entirely clear on what happens when you send in
\_Pragma operators as macro arguments,
but it sure surprised me when I encountered it nonetheless.

For the Snow library, this means that there are certain warnings which
I would have loved to only disable for a few lines, but which I have to disable
for all code following the `#include <snow/snow.h>` line.

Side note:
Clang's preprocessor does exactly what one would expect,
and produces this output:

``` C
fprintf(stderr, "TRACE: %s\n", "abort_if_zero(my_float)");
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
 if (my_float == 0) abort();
#pragma GCC diagnostic pop
```
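
When the code that needs the warning disabled can be moved into a function, one way to sidestep the problem is to keep the pragmas at file scope, so they never pass through a macro argument. The following is only a sketch under that assumption; it is not what Snow does, `is_exactly_zero` is a hypothetical helper, and `trace` is wrapped in `do { ... } while (0)` here purely to make it a single statement:

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * The pragmas surround a helper function at file scope, so they are
 * handled as ordinary directives and never end up inside a macro argument.
 * is_exactly_zero is a hypothetical helper, not something Snow provides.
 */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wfloat-equal"
static inline int is_exactly_zero(double x)
{
	return x == 0;
}
#pragma GCC diagnostic pop

/* No _Pragma inside this macro, so nesting it in trace() is harmless. */
#define abort_if_zero(x) \
	do { \
		if (is_exactly_zero(x)) \
			abort(); \
	} while (0)

/* Prints the traced statement to stderr, then executes it. */
#define trace(x) \
	do { \
		fprintf(stderr, "TRACE: %s\n", #x); \
		x; \
	} while (0)
```

With these definitions, `trace(abort_if_zero(my_float));` contains no pragmas for the preprocessor to reorder. The trade-off is that the argument is converted to `double`, and the approach does not help when the offending code has to stay inside the macro body.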

## Line numbers in macro arguments

Until now, the quirks I've shown have been issues you could potentially
encounter in decent, real-world code.
If this quirk has caused issues for you however,
it might be a sign that you're slightly over-using macros.

All testing code in Snow happens within macro arguments.
This allows for what I think is a really nice looking API,
[and allows all testing code to be disabled just by changing one macro definition](https://github.com/mortie/snow/blob/1ca97d03b8eaf824d19d10a41fa382a562392bd5/snow/snow.h#L29-L33).
This is a small example of a Snow test suite:

``` C
#include <stdio.h>
#include <snow/snow.h>

describe(files, {
	it("writes to files", {
		FILE *f = fopen("testfile", "w");
		assertneq(f, NULL);
		defer(remove("testfile"));
		defer(fclose(f));

		char str[] = "hello there";
		asserteq(fwrite(str, 1, sizeof(str), f), sizeof(str));
	});
});

snow_main();
```

If that `assertneq` or `asserteq` fails,
we would like and expect to see a line number.
Unfortunately, after the code goes through the preprocessor,
the entire nested macro expansion ends up on a single line.
All line number information is lost.
`__LINE__` just returns the number of the last line of the macro invocation,
which is 14 in this case.
_All_ `__LINE__` expressions inside the block we pass to `describe`
will return the same number.
I have googled around a bunch for a solution to this issue,
but none of the solutions I've looked at actually solve the issue.
The only actual solution I can think of is to write my own preprocessor.
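
A small experiment for checking how a given compiler numbers these lines might look like the sketch below. `RUN`, `line_a`, `line_b`, and `record_lines` are hypothetical names for this example; `RUN` merely stands in for a macro like Snow's `it`:

```c
/* RUN is a hypothetical stand-in for a macro which takes a block of code. */
#define RUN(block) do block while (0)

static int line_a;
static int line_b;

/*
 * Stores what __LINE__ expands to on two different source lines
 * inside a single macro argument.
 */
static void record_lines(void)
{
	RUN({
		line_a = __LINE__;
		line_b = __LINE__;
	});
}
```

After calling `record_lines()`, comparing `line_a` and `line_b` shows whether the preprocessor kept the two lines apart. With the GCC behaviour described above, the two values would be identical; other preprocessors may number them differently.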

## Some warnings can't be disabled with pragma

Like the above example,
this is probably an issue you shouldn't have come across in production code.

First, some background.
In Snow, both the code which is being tested and the test cases can be in the
same file.
This is to make it possible to test static functions and other functionality
which isn't part of the component's public API.
The idea is that at the bottom of the file, after all non-testing code,
one should include `<snow/snow.h>` and write the test cases.
In a non-testing build,
all the testing code will be removed by the preprocessor,
because the `describe(...)` macro expands to nothing
unless `SNOW_ENABLED` is defined.

My personal philosophy is that your regular builds should _not_ have `-Werror`,
and that your testing builds should have as strict warnings as possible and be
compiled with `-Werror`.
Your users may be using a different compiler version from you,
and that compiler might produce some warnings which you haven't fixed yet.
Being a user of a rolling release distro, with a very recent version of GCC,
I have way too often had to edit someone else's Makefile and remove `-Werror`
just to make their code compile.
Compiling the test suite with `-Werror` and regular builds without `-Werror`
has none of the drawbacks of using `-Werror` for regular builds,
and most or all of the advantages
(at least if you don't accept contributions which break your test suite).

This all means that I want to be able to compile all files with at least
`-Wall -Wextra -Wpedantic -Werror`, even if the code includes `<snow/snow.h>`.
However, Snow contains code which produces warnings (and therefore errors)
with those settings;
among other things, it uses some GNU extensions
which aren't actually part of the C standard.

I would like to let users of Snow compile their code with at least
`-Wall -Wextra -Wpedantic -Werror`,
but Snow has to disable at least `-Wpedantic` for all code after the inclusion
of the library.
In theory, that shouldn't be an issue, right? 
We just include `#pragma GCC diagnostic ignored "-Wpedantic"` somewhere.

Well, as it turns out, disabling `-Wpedantic` with a pragma doesn't disable all
the warnings enabled by `-Wpedantic`;
there are some warnings which are impossible to disable once they're enabled.
One such warning is about using directives (like `#ifdef`) inside macro
arguments.
As I explained earlier,
_everything_ in Snow happens inside of macro arguments.
That means that when compiling with `-Wpedantic`,
this code produces a warning which it's impossible to disable
without removing `-Wpedantic` from the compiler's arguments:

``` C
describe(some_component, {
#ifndef __MINGW32__
	it("does something which can't be tested on mingw", {
		/* ... */
	});
#endif
});
```

That's annoying, because it's perfectly legal in GNU's dialect of C.
The only reason we can't do it is that it just so happens to be impossible to
disable that particular warning with a pragma.

To be completely honest, this issue makes complete sense.
I imagine the preprocessor stage, which is where macros are expanded,
doesn't care much about pragmas.
It feels unnecessary to implement pragma parsing for the
preprocessor _just_ in order to let people compile files with `-Wpedantic` but
still selectively disable this particular warning.
That doesn't make it less annoying though.

Funnily enough, I encountered this issue while writing Snow's test suite.
My solution was to just define
[a macro called NO\_MINGW](https://github.com/mortie/snow/blob/1ca97d03b8eaf824d19d10a41fa382a562392bd5/test/test.c#L124-L128)
which is empty if `__MINGW32__` is defined,
and expands to the contents of its arguments otherwise.
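
A macro like that could be sketched as follows. This is a reconstruction from the description above rather than a copy of the linked source, and `steps_compiled_in` is a hypothetical function that exists only to show the effect:

```c
/*
 * Expands to nothing on MinGW, and to its arguments everywhere else.
 * The conditional lives outside of any macro argument, so no directive
 * ever appears inside a macro invocation.
 */
#ifdef __MINGW32__
# define NO_MINGW(...)
#else
# define NO_MINGW(...) __VA_ARGS__
#endif

/* Counts how many steps survive preprocessing: 1 on MinGW, 2 elsewhere. */
static int steps_compiled_in(void)
{
	int steps = 1;
	NO_MINGW(steps += 1;)
	return steps;
}
```

Using `...` and `__VA_ARGS__` rather than a single named parameter means the wrapped code may contain commas, which matters when the argument is a whole test case.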