Proposal: Formalizing what definitions and references are

Definitions

We have three choices for what we consider to be a definition:

  1. Callable objects. E.g. function and class definitions.
  2. Any statement binding an object to a name, even if that object isn't callable. E.g. variable assignments, type definitions, etc.
  3. Something in between 1 and 2.

Option 1 is the bare minimum and allows us to build up a call graph. The problem is that we'd miss important non-callable constructs, such as type aliases in TypeScript:

type Programmer = {
  name: string;
  knownFor: string[];
};

const ada: Programmer = {
  name: 'Ada Lovelace',
  knownFor: ['Mathematics', 'Computing', 'First Programmer']
};

Types aren't callable, but it would be useful if an agent could "jump into" the type definition when it sees a reference. Even classes aren't callable in some languages, like Ruby, but we'd miss out on a lot of information if we didn't treat them as definitions.

Option 2 is the ideal solution and it's what the LSP does. If we treat every binding of an object to a name (e.g. types, instance variables, etc.) as a definition, we'd build a much richer knowledge graph. But doing this would increase the scope of our work and delay our initial release.

Option 3 is the middle ground and it's what I propose. We'd bias towards capturing only callables but make language-specific exceptions when necessary. Each language owner should make their own judgment call on which non-callables to include or exclude in the first iteration. Eventually, our goal would be to implement option 2 and capture every definition in the namespace. For now, let's capture a subset of these non-callables and document the exceptions we make.
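
To make option 3 concrete, here is a minimal sketch of how per-language exceptions could be recorded and checked. Everything in it is hypothetical: the NodeKind names, the POLICIES table, and the isDefinition helper are illustrations, not part of any existing parser or of our codebase, and the exceptions listed are examples rather than decisions.

```typescript
// Hypothetical syntax-node kinds; real parsers will use their own names.
type NodeKind = "function" | "method" | "class" | "type_alias" | "variable";

// One policy per language: what counts as callable there, plus the
// non-callable kinds the language owner chose to include (assumed values).
interface LanguagePolicy {
  callables: NodeKind[];
  exceptions: NodeKind[];
}

// Fallback for languages without an explicit policy: callables only (option 1).
const DEFAULT_POLICY: LanguagePolicy = {
  callables: ["function", "method", "class"],
  exceptions: [],
};

const POLICIES: Record<string, LanguagePolicy> = {
  // Type aliases are not callable, but are included as an exception.
  typescript: { callables: ["function", "method", "class"], exceptions: ["type_alias"] },
  // Ruby classes are not callable, but are included as an exception.
  ruby: { callables: ["method"], exceptions: ["class"] },
};

// A node is a definition if it is callable in its language or is a
// documented exception for that language.
function isDefinition(language: string, kind: NodeKind): boolean {
  const policy = POLICIES[language] ?? DEFAULT_POLICY;
  return policy.callables.includes(kind) || policy.exceptions.includes(kind);
}
```

Keeping the exceptions in one table would also double as the documentation of which non-callables each language captures, and moving towards option 2 would mostly mean growing the exceptions lists.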

References

There are two choices for how we define a reference:

  1. A function call (e.g. foo(), MyClass(), etc.).
  2. Any occurrence of a definition's name (e.g. in a type annotation such as my_obj: MyClass = null).

If we choose option 1, we miss out on important references like type annotations or functions-as-arguments, as I described in #17.
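
As an illustration of the gap, the sketch below marks which occurrences each option would record. The summarize function and the variable names are made up for this example; only the Programmer type comes from the snippet earlier in this proposal.

```typescript
type Programmer = {
  name: string;
  knownFor: string[];
};

// `Programmer` in the parameter annotation: a reference under option 2 only.
function summarize(p: Programmer): string {
  return `${p.name}: ${p.knownFor.join(", ")}`;
}

// `Programmer` in the variable annotation: a reference under option 2 only.
const ada: Programmer = {
  name: "Ada Lovelace",
  knownFor: ["Mathematics", "Computing"],
};

// A direct call: recorded as a reference to `summarize` under both options.
const viaCall = summarize(ada);

// `summarize` passed as an argument: it is never called by name here,
// so option 1 would not link this line back to its definition.
const viaArgument = [ada].map(summarize);
```

Under option 1, only the viaCall line would produce an edge to summarize, and nothing would point at Programmer at all.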

Option 2 is ideal, but would increase the scope of our work and delay our initial release.

I propose we start with option 1 and work towards option 2 in future iterations.

Edited by Jonathan Shobrook