Indexing with respect to variable lower bounds
First, let's note the following points:
- Fortran - It has variable lower bounds, i.e., a user can set the smallest index of each dimension of an array as they wish. Also, a subroutine/function accepts arrays dynamically, i.e., an array with any lower bound is accepted provided it has the same rank.
- Python - The smallest index is always 0. However, in LPython we can also pass an array dynamically (i.e., without specifying the details of its dimensions, just as in Fortran).
- LLVM - It uses an array descriptor that stores the lower and upper bounds at runtime. The generated code also automatically offsets indices according to the lower bound. So there is nothing to worry about for the LLVM backend.
- C++/C - These backends offset indices to adjust for the lower bound. If the lower bound is available at compile time, all is good. But when indexing an input array of a subroutine (see the Fortran and Python points above), the lower bound is only available at run time, so the backend simply offsets by 1 in LFortran and by 0 in LPython.
- Offsetting indices in ASR is not an option - The reason is the flexibility of lower bounds in Fortran and the differences in indexing semantics between frontend languages. Creating `ArrayRef` nodes in optimisation passes without ambiguity would be impossible, because the offset differs from one frontend language to another. Also, once an ASR is generated, it is impossible to know which frontend language it came from (unless we are willing to use fragile checks). See also https://github.com/lcompilers/lpython/issues/683.
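To make the C/C++ problem concrete, here is a small Python model of the index offsetting. It is only an illustration, not actual backend output; `flat_index`, the sample data and the variable names are hypothetical.

```python
# Hypothetical model: a 1-D array stored as a flat, 0-based buffer,
# the way the generated C/C++ code would store it.

def flat_index(i, lower_bound):
    """Map a source-level index to a 0-based position in the buffer."""
    return i - lower_bound

# Models the Fortran declaration `integer :: a(-2:2)` holding 10, 20, 30, 40, 50.
buffer = [10, 20, 30, 40, 50]
actual_lb = -2

# Offsetting by the actual lower bound finds the right element: a(0) is buffer[2].
correct_pos = flat_index(0, actual_lb)
correct = buffer[correct_pos]

# Inside a subroutine the lower bound is unknown at compile time, so the
# callee falls back to the frontend default (1 for LFortran in this model).
assumed_lb = 1
wrong_pos = flat_index(0, assumed_lb)

print(correct_pos, correct)  # position and value with the real lower bound
print(wrong_pos)             # negative position: out of bounds in C
```

With the real lower bound, `a(0)` maps to position 2. With the assumed default of 1, it maps to position -1, which would be an out-of-bounds access in C.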
As can be seen, the C++/C backends aren't robust enough, specifically for LFortran, where the lower bound can be anything. To fix this robustly (a temporary fix was merged in !1816), I have the following solutions in mind:
- Use the LLVM approach - By this I mean supporting arrays via an array descriptor in the C/C++ backends, exactly as LLVM does. Then we will be able to support calls like `lower_bound(a, 1)` in the C/C++ backends as well. We will also be able to offset the indexing of an input array of a subroutine/function correctly. However, this would limit us from using abstract C/C++ features for arrays.
- Support only 1 (the default for Fortran) as the lower bound in LFortran and 0 in LPython for the C/C++ backends - This will allow us to offset indices without worrying about lower bounds other than 0 and 1. To differentiate between LCompilers semantics (i.e., choosing between 0 and 1 in LFortran/LPython), we can store the default lower bound for a frontend language in a variable/macro and pass it as an argument to `asr_to_c` and `asr_to_cpp`. This is a restrictive approach, but it will produce simpler code and make it easier to use C/C++ features for arrays.
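A rough Python sketch of how the two options differ. Everything here is an assumption for illustration: `Dim`, `ArrayDescriptor`, `get_item`, `get_item_default` and the two constants are hypothetical names, the column-major layout is assumed to match Fortran, and neither function mirrors the actual LLVM descriptor layout or the `asr_to_c`/`asr_to_cpp` signatures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dim:
    lower_bound: int  # smallest valid index of this dimension
    length: int       # number of elements along this dimension


@dataclass
class ArrayDescriptor:
    data: List[int]   # flat storage, assumed column-major
    dims: List[Dim]   # one entry per dimension


def lower_bound(a: ArrayDescriptor, dim: int) -> int:
    """Option 1: the bound is read from the descriptor at run time (dim is 1-based)."""
    return a.dims[dim - 1].lower_bound


def get_item(a: ArrayDescriptor, *indices: int) -> int:
    """Option 1: offset each index by the lower bound stored in the descriptor."""
    pos, stride = 0, 1
    for idx, d in zip(indices, a.dims):
        off = idx - d.lower_bound
        if not 0 <= off < d.length:
            raise IndexError(f"index {idx} out of bounds")
        pos += off * stride
        stride *= d.length
    return a.data[pos]


# Option 2: a single default per frontend, fixed when the backend is invoked.
FORTRAN_DEFAULT_LB = 1
PYTHON_DEFAULT_LB = 0


def get_item_default(data: List[int], i: int, default_lb: int) -> int:
    """Option 2: every array is assumed to start at the frontend's default bound."""
    return data[i - default_lb]


a = ArrayDescriptor(data=[10, 20, 30, 40, 50], dims=[Dim(lower_bound=-2, length=5)])
print(lower_bound(a, 1), get_item(a, 0))
print(get_item_default([10, 20, 30], 1, FORTRAN_DEFAULT_LB))
```

Option 1 carries the bounds with the array, so any lower bound works at the cost of an extra structure around the data. Option 2 keeps the array a plain buffer but is only correct when the array really starts at the default.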
@certik Let me know what you think.