arena allocation for virtual edges

I’ve been slowly addressing the ASan-reported leaks, one-by-one. One of the thorny remaining ones is repeated instances of something like:

Direct leak of 16384 byte(s) in 128 object(s) allocated from:
    #0 0x7bae93260340 in calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
    #1 0x7bae8d5c94f2 in gv_calloc /builds/smattr/graphviz/lib/dotgen/../util/alloc.h:36
    #2 0x7bae8d5c9623 in gv_alloc /builds/smattr/graphviz/lib/dotgen/../util/alloc.h:47
    #3 0x7bae8d5d03a0 in new_virtual_edge /builds/smattr/graphviz/lib/dotgen/fastgr.c:133
    #4 0x7bae8d5d3656 in virtual_edge /builds/smattr/graphviz/lib/dotgen/fastgr.c:172
    #5 0x7bae8d660491 in minmax_edges2 /builds/smattr/graphviz/lib/dotgen/rank.c:363
    #6 0x7bae8d662b50 in dot1_rank /builds/smattr/graphviz/lib/dotgen/rank.c:440
    #7 0x7bae8d65c540 in collapse_cluster /builds/smattr/graphviz/lib/dotgen/rank.c:265
    #8 0x7bae8d65c6b7 in collapse_sets /builds/smattr/graphviz/lib/dotgen/rank.c:282
    #9 0x7bae8d662aaa in dot1_rank /builds/smattr/graphviz/lib/dotgen/rank.c:434
    #10 0x7bae8d662f3c in dot_rank /builds/smattr/graphviz/lib/dotgen/rank.c:455
    #11 0x7bae8d55a72b in dotLayout /builds/smattr/graphviz/lib/dotgen/dotinit.c:303
    #12 0x7bae8d55f154 in doDot /builds/smattr/graphviz/lib/dotgen/dotinit.c:442
    #13 0x7bae8d55fd27 in dot_layout /builds/smattr/graphviz/lib/dotgen/dotinit.c:490
    #14 0x7bae92629a46  (/usr/bin/../lib+0x4dca46) (BuildId: 3a2f27213dc054d320bd67a21615ff80d08f9cdf)
    #15 0x5d533199da82 in main /builds/smattr/graphviz/cmd/dot/dot.c:72
    #16 0x7bae916853b7  (/lib/x86_64-linux-gnu/libc.so.6+0x2a3b7) (BuildId: 5f3f024b472f38389da3a2f567b3d0eaa8835ca2)
    #17 0x7bae9168547a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a47a) (BuildId: 5f3f024b472f38389da3a2f567b3d0eaa8835ca2)
    #18 0x5d533199d694 in _start (/usr/bin/dot+0x2694) (BuildId: f9f435ad1d158e77d7a66c05758b59aae800d875)

I’ve convinced myself these seem intractable to fix by inserting a missing free call somewhere, as is the usual solution to leaks. The reason why takes a little explaining…

When you first investigate one of these leaks, the cause seems obvious: new_virtual_edge allocates an edge pair and then immediately loses the pointer to this. But things are not so simple. It returns a pointer to something inside this edge pair, which is later used to recover the containing edge pair. When everything is working correctly, this looks something like this simplification:

typedef struct { Agedge_t out, in; } Agedgepair_t;

Agedge_t *new_virtual_edge() {
  Agedgepair_t *e = malloc(sizeof(*e));
  e->in = ;
  e->out = ;
  return &e->out;
}

void foo(void) {
  Agedge_t *out = new_virtual_edge();
  
  // recover edge pair
  Agedgepair_t *e = (Agedgepair_t *)((char *)out - offsetof(Agedgepair_t, out));
  free(e);
}

So where does this go wrong? Well in the leak scenarios, something like the following seems to occur:

  1. new_virtual_edge allocates an edge pair
  2. &e->in and &e->out are stored in some collections
  3. &e->out is discovered to be no longer necessary, and removed from collections

So the naive fix seems to be to rediscover the edge pair and deallocate it when removing &e->out. But we can’t do this because &e->in is still in collections elsewhere and this leads to a use-after-free. We also can’t rediscover the edge pair and deallocate it when (eventually) removing &e->in because there is no field of it that tells us it is part of an edge pair. I.e. (Agedgepair_t *)((char *)in - offsetof(Agedgepair_t, in)) on an arbitrary in-edge will often give you a pointer into garbage and then crash because not all in-edges are members of edge pairs. Essentially at the point at which we lose the last recoverable reference to the edge pair, other parts of the edge pair are still in use.

The way out of this conundrum I’m proposing is building and using an arena allocator for these edge pairs. This way, we stop trying to track these edge pairs and just free them all when agclose-ing a graph.

We used to actually have an arena allocator but I (perhaps in retrospect unwisely) removed it in 84b2983e. The one I’m proposing now introducing would be vastly simpler: no free, just basic bump-pointer allocation.

If I am wrong about any of the above and there is a simpler way to fix these leaks, I would love to be corrected.