# Loop tiling

Example:

```
do i = 1, n
    A(i) = sin(B(i))
end do
```

gets transformed into

```
do i = 1, n - modulo(n,32), 32
    do j = 0, 31
        A(i+j) = sin(B(i+j))
    end do
end do
! Remainder loop for the last modulo(n,32) elements
do i = n - modulo(n,32) + 1, n
    A(i) = sin(B(i))
end do
```

and then, using array sections, into

```
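do i = 1, n - modulo(n,32), 32
    A(i:i+31) = sin(B(i:i+31))
end do
! Remainder: the last modulo(n,32) elements (zero-sized if n is a multiple of 32)
A(n-modulo(n,32)+1:n) = sin(B(n-modulo(n,32)+1:n))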
```

Now `sin` operates on a vector instead of a scalar.
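To make the transformation concrete, here is a minimal sketch of the original loop and the array-section form as two subroutines that can be compared directly. The module name `tiling_sketch`, the subroutine names `sin_scalar` and `sin_sections`, and the parameters `dp` and `tile` are illustrative choices, not names from any existing code. The sketch assumes `A` and `B` have the same size.

```fortran
module tiling_sketch
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: tile = 32   ! tile size used in the text
contains

    ! Original form: one scalar sin call per element.
    subroutine sin_scalar(B, A)
        real(dp), intent(in)  :: B(:)
        real(dp), intent(out) :: A(:)   ! assumed to have the same size as B
        integer :: i
        do i = 1, size(B)
            A(i) = sin(B(i))
        end do
    end subroutine sin_scalar

    ! Tiled form: sin is applied to one 32-element section at a time,
    ! followed by a remainder section for the leftover elements.
    subroutine sin_sections(B, A)
        real(dp), intent(in)  :: B(:)
        real(dp), intent(out) :: A(:)   ! assumed to have the same size as B
        integer :: i, n, nfull
        n = size(B)
        nfull = n - modulo(n, tile)     ! largest multiple of tile that is <= n
        do i = 1, nfull, tile
            A(i:i+tile-1) = sin(B(i:i+tile-1))
        end do
        ! Zero-sized section (a no-op) when n is a multiple of tile.
        A(nfull+1:n) = sin(B(nfull+1:n))
    end subroutine sin_sections

end module tiling_sketch
```

Calling both subroutines on the same input, for sizes such as 1, 32, 64 and 100, is a simple way to check that the tile loop and the remainder section together cover every element exactly once.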

There are two sources of speedup:

- The tiling itself keeps the small 32-element array section in L1 cache, which gives faster access and operations.
- Implementations of special functions such as `sin` are typically faster when operating on a vector, because a more efficient implementation can be used. Replacing the scalar `sin` with a vector `sin` should therefore be faster.
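One way to see why a vector `sin` can be implemented more efficiently is that it can process a whole tile in branch-free passes, which a compiler can map to SIMD instructions. The following is only a sketch of that idea, not the implementation of any particular library: the module `vector_sin_sketch` and the kernel `sin_tile` are hypothetical, and the range reduction plus degree-13 Taylor polynomial is chosen for readability. It is accurate to roughly 1e-9 for moderate arguments and is not a substitute for a production-quality `sin`.

```fortran
module vector_sin_sketch
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: tile = 32
    real(dp), parameter :: pi = 3.141592653589793238462643383279_dp
    ! Taylor coefficients of sin(r)/r as a polynomial in r**2.
    real(dp), parameter :: c3  = -1.0_dp/6.0_dp
    real(dp), parameter :: c5  =  1.0_dp/120.0_dp
    real(dp), parameter :: c7  = -1.0_dp/5040.0_dp
    real(dp), parameter :: c9  =  1.0_dp/362880.0_dp
    real(dp), parameter :: c11 = -1.0_dp/39916800.0_dp
    real(dp), parameter :: c13 =  1.0_dp/6227020800.0_dp
contains

    ! Hypothetical tile kernel: every statement is a whole-array operation
    ! with no data-dependent branches.
    subroutine sin_tile(x, y)
        real(dp), intent(in)  :: x(tile)
        real(dp), intent(out) :: y(tile)
        real(dp) :: r(tile), r2(tile), sgn(tile)
        integer  :: k(tile)

        ! Pass 1: range reduction, x = r + k*pi with r in [-pi/2, pi/2],
        ! so that sin(x) = (-1)**k * sin(r).
        k = nint(x / pi)
        r = x - k*pi
        sgn = merge(1.0_dp, -1.0_dp, modulo(k, 2) == 0)

        ! Pass 2: odd polynomial in r, evaluated with Horner's scheme in r**2.
        r2 = r*r
        y = sgn * r * (1.0_dp + r2*(c3 + r2*(c5 + r2*(c7 + r2*(c9 &
            + r2*(c11 + r2*c13))))))
    end subroutine sin_tile

end module vector_sin_sketch
```

A scalar `sin` called once per element has to repeat the setup and any argument checks for each call. A tile kernel like this one applies the same straight-line arithmetic to all 32 elements, which is the kind of code that vectorizes well.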