Idea: Gradually Forgetting what was Learnt
Most natural systems learn by gradually forgetting old knowledge. The reason is not that this works particularly well, but that it is simple: no history needs to be stored if the mechanism simply says "take the current knowledge, multiply it by 0.9, and add the new knowledge multiplied by 0.1".
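A minimal sketch of that rule (assuming a single numeric value per piece of knowledge; the names are illustrative, not from any existing code):

    def update(old_knowledge, new_knowledge, lam=0.1):
        # Exponential moving average: only the current value is stored.
        # Old knowledge fades by (1 - lam), new knowledge enters with weight lam.
        return (1 - lam) * old_knowledge + lam * new_knowledge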
In general, the factors are (1 - lambda) for the old knowledge and lambda for the new, which yields exponential decay of old knowledge. The decay step can be applied once per learning step (a generation of knowledge), or scaled by the elapsed time; in the latter case, old knowledge erodes even when no new knowledge is added.
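One possible reading of the time-scaled variant (a sketch; the scale parameter is an assumption, standing for how much elapsed time corresponds to one decay step):

    def decayed_weight(lam, elapsed, scale=1.0):
        # Fraction of old knowledge that survives after `elapsed` time,
        # with one (1 - lam) step per `scale` time units.
        return (1 - lam) ** (elapsed / scale)

    def update_timed(old_knowledge, new_knowledge, lam, elapsed, scale=1.0):
        w = decayed_weight(lam, elapsed, scale)
        # When elapsed == scale this reduces to the per-generation rule above;
        # with nothing new to learn, old knowledge still erodes by the factor w.
        return w * old_knowledge + (1 - w) * new_knowledge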
To add or subtract a message later, all that is needed is to determine how many times (1 - lambda) has been applied since the message was recorded: the difference in learning generation, or the time difference. For that, the message needs to log its generation or timestamp, and so does the wordbase.
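A sketch of how that bookkeeping could look (the class and method names are made up for illustration; counts decay once per generation):

    from collections import Counter

    class WordBase:
        def __init__(self, lam=0.1):
            self.lam = lam
            self.generation = 0      # the wordbase logs its own generation
            self.counts = Counter()

        def new_generation(self):
            # One decay step: every stored count shrinks by (1 - lambda).
            self.generation += 1
            for word in self.counts:
                self.counts[word] *= (1 - self.lam)

        def apply_message(self, words, message_generation, sign=+1):
            # Add (sign=+1) or subtract (sign=-1) a message logged at an
            # earlier generation. Its counts are scaled by the decay that
            # has been applied since that generation.
            age = self.generation - message_generation
            weight = (1 - self.lam) ** age
            for word in words:
                self.counts[word] += sign * weight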
When the values stored in the wordbase are logarithms of word counts, the computations simplify yet again: each generation adds the constant log(1 - lambda) to the stored value (a subtraction, since the constant is negative) instead of multiplying the count by (1 - lambda). Exponential decay becomes linear. This also helps greatly when composing results or comparing wordbases from different versions.
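In log space the decay is a constant offset, which also makes it easy to bring two wordbases to the same generation before comparing or composing them (a sketch; function names are illustrative):

    import math

    def decay_log_count(log_count, lam, generations=1):
        # Multiplying a count by (1 - lam) each generation becomes adding
        # log(1 - lam), a negative constant, to the stored log-count:
        # exponential decay of counts turns into linear decay of log-counts.
        return log_count + generations * math.log(1 - lam)

    def align_generations(log_count, lam, from_generation, to_generation):
        # Shift a log-count recorded at one generation to another generation,
        # so wordbases from different versions can be composed or compared.
        return log_count + (to_generation - from_generation) * math.log(1 - lam)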