## Containing geometry information for the sparse patterns

Dear Victor and Volker,

I had a discussion with Volker about the possibility of ELSI containing the geometry information, so that sparse patterns carry supercell/auxiliary-cell information.

My views are these:

- You can choose a direct path with auxiliary arrays that de-reference the supercell information.

I.e.

```
integer :: N
integer :: ptr(N+1)
integer :: nnz
integer :: col(nnz)
integer :: nsc(3) ! number of auxiliary supercells along each lattice vector
integer :: sc_index(nnz) ! auxiliary for geometry
! supercell offsets (integers) for a given sc_index,
! i.e. for the primary unit cell one would have:
!   all(index_sc(:,<primary unit-cell index>) == 0)
integer :: index_sc(3,product(nsc))
! coordinates in the supercell structure
! One does not necessarily need the supercell dimension since it could be calculated
! on the fly. It depends on your needs and performance issues
real :: xa(3,na,product(nsc)) ! coordinates in the supercell structure
real :: unit_cell(3,3) ! lattice vectors for unit-cell (not supercell)
```

In the above you will have an auxiliary array `sc_index` which holds the information about locality. The `col` array will always contain values in `1<=col(:)<=N`.

However, my experience is that this is not really needed.
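For reference, the per-element lookup in this first approach could look like the sketch below. The module name, the function `sc_translation` and the kind parameter `dp` are made up for illustration, and the sketch assumes that `unit_cell(:,i)` holds the i-th lattice vector; none of this is existing ELSI API.

```fortran
module sc_lookup_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Cartesian translation of the supercell that sparse element `ind`
  ! couples to. Assumption: unit_cell(:,i) is the i-th lattice vector.
  pure function sc_translation(ind, sc_index, index_sc, unit_cell) result(t)
    integer, intent(in) :: ind              ! position in the sparse arrays, 1..nnz
    integer, intent(in) :: sc_index(:)      ! supercell index per sparse element
    integer, intent(in) :: index_sc(:,:)    ! integer offsets, shape (3, product(nsc))
    real(dp), intent(in) :: unit_cell(3,3)  ! lattice vectors of the unit cell
    real(dp) :: t(3)
    ! De-reference the supercell, then scale the lattice vectors by its offsets.
    t = matmul(unit_cell, real(index_sc(:, sc_index(ind)), dp))
  end function sc_translation
end module sc_lookup_sketch
```

The extra cost of this approach is the `sc_index` array itself, i.e. one more integer per non-zero element.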

- A second approach would be to *hide* the supercell information in the `col` array.

```
integer :: N
integer :: ptr(N+1)
integer :: nnz
integer :: col(nnz)
integer :: nsc(3) ! number of auxiliary supercells along each lattice vector
! supercell offsets (integers) for a given supercell index
integer :: index_sc(3,product(nsc))
! coordinates in the supercell structure
! One does not necessarily need the supercell dimension since it could be calculated
! on the fly. It depends on your needs and performance issues
real :: xa(3,na,product(nsc)) ! coordinates in the supercell structure
real :: unit_cell(3,3) ! lattice vectors for unit-cell (not supercell)
```

Here the `col` array is bounded by `1<=col(:)<=N*product(nsc)`.
With this approach one can recover the *actual* column and supercell with this small conversion:

```
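! zero-based supercell index: 0 <= sc_idx < product(nsc)
sc_idx = (col(...)-1) / N
! column in the primary unit cell: 1 <= unit_col <= N
unit_col = col(...) - sc_idx * N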
```

In this way you save an array of size `nnz` which contains integers `1<=sc_index<=product(nsc)`.
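The packing and unpacking of the `col` values can be sketched as a pair of small helpers. The names `col_packing_sketch`, `decode_col` and `encode_col` are hypothetical; the arithmetic is the conversion given above, with the supercell index coming out zero-based.

```fortran
module col_packing_sketch
  implicit none
contains
  ! Split a packed column index into a zero-based supercell index and
  ! the column inside the primary unit cell (1 <= unit_col <= N).
  pure subroutine decode_col(packed, N, sc_idx, unit_col)
    integer, intent(in) :: packed, N
    integer, intent(out) :: sc_idx, unit_col
    sc_idx = (packed - 1) / N          ! integer division truncates
    unit_col = packed - sc_idx * N
  end subroutine decode_col

  ! Inverse operation: build the packed column index.
  pure function encode_col(sc_idx, unit_col, N) result(packed)
    integer, intent(in) :: sc_idx, unit_col, N
    integer :: packed
    packed = sc_idx * N + unit_col
  end function encode_col
end module col_packing_sketch
```

As a worked case, with `N = 4` the packed column `10` decodes to `sc_idx = 2` and `unit_col = 2`. If `index_sc` is indexed from 1, the corresponding offsets would then be `index_sc(:, sc_idx + 1)`.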

This second approach *limits* the size of one's sparse array. However, given that *any* large structure would only have 3,3,3 supercells (i.e. `product(nsc) == 27`), you should be able to handle sparse matrices up to `2**31/27 ~ 79536431` rows. In any case, for very large structures one should probably go for `long` integers, in which case this will never be an issue.
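The row limit quoted above can be checked with a few lines. This is only a sketch: the module and function names are invented here, and it uses `huge()` (i.e. `2**31 - 1` for 32-bit integers) as the largest representable column index.

```fortran
module sc_limits_sketch
  use iso_fortran_env, only: int32, int64
  implicit none
contains
  ! Largest N such that N*product(nsc) still fits in a 32-bit integer.
  pure function max_rows_int32(nsc) result(n)
    integer, intent(in) :: nsc(3)
    integer(int32) :: n
    n = huge(1_int32) / int(product(nsc), int32)
  end function max_rows_int32

  ! Same limit when the column indices are stored as 64-bit integers.
  pure function max_rows_int64(nsc) result(n)
    integer, intent(in) :: nsc(3)
    integer(int64) :: n
    n = huge(1_int64) / int(product(nsc), int64)
  end function max_rows_int64
end module sc_limits_sketch
```

For `nsc = [3, 3, 3]` the 32-bit variant gives `79536431`, matching the estimate in the text.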

- Instead of populating ELSI with these abstractions of geometries etc., another possibility is to have a few procedure pointers which the hosting program populates, i.e.
`elsi_setup_geometry_funcs(get_sc_coord=host_sc_coord_func, get_phase=host_phase_calc, ...)`

or whatever. In this last approach ELSI need not worry about having another data level. The only requirement is that the interfaces for these procedures are well-defined.

I think this last approach would require the least of you, since you would not have to accommodate different basis sets and how they potentially order things in the host code.
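To make the callback idea concrete, here is a minimal sketch of how such a registration routine could be declared. Everything in it is an assumption for illustration: the abstract interfaces, their argument lists, the module name and the stored pointer names are invented, and only the two keyword names `get_sc_coord` and `get_phase` are taken from the call above.

```fortran
module geometry_callbacks_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  abstract interface
    ! Hypothetical signature: coordinate of atom `ia` in supercell `isc`.
    function sc_coord_iface(ia, isc) result(xyz)
      import :: dp
      integer, intent(in) :: ia, isc
      real(dp) :: xyz(3)
    end function sc_coord_iface
    ! Hypothetical signature: phase factor for supercell `isc` at k-point `k`.
    function phase_iface(isc, k) result(ph)
      import :: dp
      integer, intent(in) :: isc
      real(dp), intent(in) :: k(3)
      complex(dp) :: ph
    end function phase_iface
  end interface

  ! Procedure pointers the library would call instead of owning geometry data.
  procedure(sc_coord_iface), pointer :: get_sc_coord_ptr => null()
  procedure(phase_iface), pointer :: get_phase_ptr => null()

contains

  ! The host program hands over its own routines once, at setup time.
  subroutine elsi_setup_geometry_funcs(get_sc_coord, get_phase)
    procedure(sc_coord_iface) :: get_sc_coord
    procedure(phase_iface) :: get_phase
    get_sc_coord_ptr => get_sc_coord
    get_phase_ptr => get_phase
  end subroutine elsi_setup_geometry_funcs
end module geometry_callbacks_sketch
```

With something like this, the host keeps its own ordering and storage of the geometry, and the library only ever sees the agreed interfaces.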

Well, these are just some thoughts! :)