Separate the single monolithic `tokeniser` C file into multiple language-specific ones.
As the title says, we need to split the main tokeniser into language-specific ones; otherwise it will quickly become a maintenance headache as we add new languages. Each language-specific tokeniser would live at a path like `src/languages/[language]/tokeniser.c`. This layout should also make it easier to add new languages to the codebase.
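To make the proposed split concrete, here is a minimal sketch of how the per-language files could plug into a small shared lookup table. Everything in it is an assumption for illustration, not the existing code: the `LanguageTokeniser` struct, the `find_tokeniser` helper, the `c_tokenise` and `python_tokenise` functions, and the choice of `c` and `python` as example languages are all hypothetical, and the "tokenisers" are stubs that only count whitespace-separated words. It is shown as one file so it compiles on its own; the comments mark where each piece might live.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared interface (could live in a common header).
 * Each language exposes one function that tokenises a source string
 * and returns the number of tokens found. */
typedef size_t (*tokenise_fn)(const char *source);

typedef struct {
    const char *language;   /* matches the [language] directory name */
    tokenise_fn tokenise;   /* entry point from that language's tokeniser.c */
} LanguageTokeniser;

/* Stub helper: counts whitespace-separated words. A real tokeniser
 * would apply language-specific lexing rules instead. */
static size_t count_words(const char *source) {
    size_t count = 0;
    int in_word = 0;
    for (; *source != '\0'; source++) {
        if (isspace((unsigned char)*source)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    return count;
}

/* Would live in src/languages/c/tokeniser.c (hypothetical example). */
size_t c_tokenise(const char *source) {
    return count_words(source);
}

/* Would live in src/languages/python/tokeniser.c (hypothetical example). */
size_t python_tokenise(const char *source) {
    return count_words(source);
}

/* Shared registry: adding a language means adding one directory
 * and one row here, without touching the other tokenisers. */
static const LanguageTokeniser TOKENISERS[] = {
    { "c",      c_tokenise },
    { "python", python_tokenise },
};

/* Returns the tokeniser registered for `language`, or NULL if none. */
const LanguageTokeniser *find_tokeniser(const char *language) {
    size_t n = sizeof TOKENISERS / sizeof TOKENISERS[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(TOKENISERS[i].language, language) == 0) {
            return &TOKENISERS[i];
        }
    }
    return NULL;
}
```

A caller would look up the tokeniser by language name, for example `find_tokeniser("c")`, check the result for `NULL`, and then call its `tokenise` function. Whether a registry like this is worth having, or whether the build should just select one file per language, is an open design choice.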