Skip to content
GitLab
    • GitLab: the DevOps platform
    • Explore GitLab
    • Install GitLab
    • How GitLab compares
    • Get started
    • GitLab docs
    • GitLab Learn
  • Pricing
  • Talk to an expert
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
    • Switch to GitLab Next
    Projects Groups Topics Snippets
  • Register
  • Sign in
  • tezos tezos
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributor statistics
    • Graph
    • Compare revisions
    • Locked files
  • Issues 1,963
    • Issues 1,963
    • List
    • Boards
    • Service Desk
    • Milestones
    • Iterations
    • Requirements
  • Merge requests 260
    • Merge requests 260
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
    • Test cases
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Packages and registries
    • Packages and registries
    • Package Registry
    • Container Registry
    • Infrastructure Registry
  • Monitor
    • Monitor
    • Incidents
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Code review
    • Insights
    • Issue
    • Repository
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • TezosTezos
  • tezostezos
  • Merge requests
  • !4009

Tenderbake: Optimizing round_and_offset using quadratic equations with the simplified datatype for round_durations

  • Review changes

  • Download
  • Email patches
  • Plain diff
Merged Hai Nguyen Van requested to merge hai@round_repr_simplify_round_and_offset into master Dec 08, 2021
  • Overview 29
  • Commits 3
  • Pipelines 10
  • Changes 4

TL;DR

Fix #2224 (closed)

This MR improves the speed of round/offset calculation with a 5x speed-up based on the simplified datatype brought by MR !3984 (merged).

┌───────────┬──────────┬──────────┬──────────┬──────────┬────────────┐
│ Name      │ Time/Run │  mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │
├───────────┼──────────┼──────────┼──────────┼──────────┼────────────┤
│ binsearch │ 248.50us │ 133.75kw │    8.09w │    8.09w │    100.00% │
│ sqrt      │  50.94us │  20.15kw │    0.43w │    0.43w │     20.50% │
└───────────┴──────────┴──────────┴──────────┴──────────┴────────────┘

Problem

We want to optimize function round_and_offset in module round_repr.ml to use an explicit solution driven by a quadratic equation. We denote:

  • Input: \tau_0 \in \mathbb{N}^*, the first round duration
  • Input: \tau_{\varepsilon} \in \mathbb{N}^*, the delay increment per round
  • Input: \text{level\_offset} \in \mathbb{R}^+
  • Find the round and its offset

Let us define the sequence \tau_k as the round duration of round k:

{(\tau_k)}_{k\in\mathbb N} ~:~ 
\left\{\begin{matrix}
\tau_0 &=& \tau_0\\ 
\tau_{k+1} &=& \tau_{k} + \tau_{\varepsilon}
\end{matrix}\right.

Example

\tau_0 \tau_{\varepsilon} \text{level\_offset} round its offset
30 15 32 1 2
30 15 75 2 0

The equation

From a level_offset, we want to find the greatest round such that the offset (w.r.t. to this round) is the least (and non-negative).

In other words, the equation is to maximize \text{round} such that \text{offset} \geq 0 and

\begin{aligned}
\sum_{k=0}^{\text{round} - 1} \tau_k + \text{offset} &= \text{level\_offset}
\end{aligned}

We deliberately minimize the \text{offset} to 0 to solve the equation in \text{round}

\begin{aligned}
\sum_{k=0}^{\text{round} - 1} \tau_k &= \text{level\_offset}
\end{aligned}

💡 It is possible to unfold the sum due to its commutative properties, we have

\begin{aligned}
\sum_{k=0}^{\text{round} - 1} \tau_k &= \tau_0 + (\tau + \tau_{\varepsilon}) + (\tau + 2 \tau_{\varepsilon}) + (\tau + 3\tau_{\varepsilon}) + \dots + (\tau + (\text{round} - 1) \times \tau_{\varepsilon}) \\
&= \text{round} \times \tau_0 + \left ( \sum_{k = 1}^{\text{round} - 1} k \right ) \tau_{\varepsilon}\\
&= \text{round} \times \tau_0 + \frac{(\text{round} - 1) \times \text{round}}{2}\times \tau_{\varepsilon}
\end{aligned}

The equation to solve is now explicited as

\begin{aligned}
\text{round} \times \tau_0 + \frac{(\text{round} - 1) \times \text{round}}{2} \times \tau_{\varepsilon} &= \text{level\_offset}
\end{aligned}

In order to further use a lemma below, we need to get rid of this division

\begin{aligned}
2 \times \text{round} \times \tau_0 + (\text{round} - 1) \times \text{round} \times \tau_{\varepsilon} &= 2 \times \text{level\_offset}
\end{aligned}

After expanding the terms, the quadratic equation in \text{round} is

\begin{aligned}
\tau_{\varepsilon} \times \text{round}^2 + \left ( 2 \tau_0 - \tau_{\varepsilon} \right ) \times \text{round} - 2 \times \text{level\_offset} &= 0
\end{aligned}

Thereby, the discriminant is

\begin{aligned}
\Delta = \left ( 2 \tau_0 - \tau_{\varepsilon} \right )^2 + 8 \times \tau_{\varepsilon} \times \text{level\_offset}
\end{aligned}

Hence,

\begin{aligned}
\text{round} = \frac{\tau_{\varepsilon} - 2 \tau_0 + \sqrt \Delta}{2 \tau_{\varepsilon}}
\end{aligned}

Once upon a time in the world of integer arithmetics

To recall, we want to compute in integer arithmetics. This means we should postpone floating-prone computations to top-level formulae. The problem now lies in the square root above \Delta which impossibly ensures the yielded value to be integer. Nevertheless, a sound approximation is possible using a lemma on quotients with floor.

Lemma (equation 3.11, Graham-Knuth-Patashnik, p. 72).

\forall x \in \mathbb R \quad \forall m \in \mathbb Z \quad \forall n \in \mathbb N^*
\quad \left \lfloor \frac{m + x}{n} \right \rfloor = \left \lfloor \frac{m + \left \lfloor x\right \rfloor}{n} \right \rfloor

From that we can state that,

\begin{aligned}
\left \lfloor \text{round} \right \rfloor &= \left \lfloor \frac{\tau_{\varepsilon} - 2 \tau_0 + \sqrt \Delta}{2 \tau_{\varepsilon}} \right \rfloor  \\
&= \left \lfloor \frac{\tau_{\varepsilon} - 2 \tau_0 + \left \lfloor \sqrt \Delta \right \rfloor}{2 \tau_{\varepsilon}} \right \rfloor
\end{aligned}

Finally, we can compute integer square-root \left \lfloor \sqrt \Delta \right \rfloor using either Newton-Raphson or 4-adic induction [Kreitz, 2003].

Space

For space-complexity, \Delta can grow really big but still linearly in O (\text{level\_offset}). We need to ensure that \Delta does not overflow 2^{63}-1.

\begin{aligned}
\Delta &\quad\leq\quad 2^{63} - 1\\
\left ( 2 \tau_0 - \tau_{\varepsilon} \right )^2 + 8 \times \tau_{\varepsilon} \times \text{level\_offset} &\quad\leq\quad 2^{63} - 1\\
\text{level\_offset} &\quad\leq\quad \frac{2^{63} - 1 - (2 \tau_0 - \tau_{\varepsilon})^2}{8\tau_{\varepsilon}}
\end{aligned}

If we have a \tau_{0} = 30 and \tau_{\varepsilon} = 15, this means that a reasonably large upper bound for \text{level\_offset} is 10^{16} ~ \text{sec}. In other words, 10^9 years. The risk of reaching the overflow is thereby low.

Time

In terms of time-complexity w.r.t. to \text{level\_offset}, is this optimal? Not everything is given for free and what costs the most is the integer square root operation \left \lfloor\sqrt{\Delta} \right \rfloor. The 4-adic induction and the Newton-Raphson method provide a logarithmic algorithm and this would make the solution O (\text{log}~ \text{level\_offset}). In this problem, we choose Newton-Raphson as 4-adic induction tends to reach overflow on call stack.

Benchmarks

All the following values were measured with @richard.bonichon's micro-benchmarks repository (using the core_bench library).

Initially, out first benchmarks were conducted on a uniformly distributed sample of values of \text{level\_offset} (ranging from 0 to 1000). The following table exhibits the results using a uniform distribution of sample and show a x5 speed-up.

┌───────────┬──────────┬──────────┬──────────┬──────────┬────────────┐
│ Name      │ Time/Run │  mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │
├───────────┼──────────┼──────────┼──────────┼──────────┼────────────┤
│ binsearch │ 849.44us │ 244.80kw │   16.02w │   16.02w │    100.00% │
│ sqrt      │ 176.01us │  22.46kw │    0.72w │    0.72w │     20.72% │
└───────────┴──────────┴──────────┴──────────┴──────────┴────────────┘

In reality, we can argue that the values for this input are not uniformly distributed. Here is a chart of how values for \text{level\_offset} are distributed in practice during a 12-hour span:

level_offset_distribution

We repeat the benchmark experiment using the statistical distribution as the sample, and we measured the following values. We can observe that the x5 speed-up is preserved even for high incidence around input 0:

┌───────────┬──────────┬──────────┬──────────┬──────────┬────────────┐
│ Name      │ Time/Run │  mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │
├───────────┼──────────┼──────────┼──────────┼──────────┼────────────┤
│ binsearch │ 248.50us │ 133.75kw │    8.09w │    8.09w │    100.00% │
│ sqrt      │  50.94us │  20.15kw │    0.43w │    0.43w │     20.50% │
└───────────┴──────────┴──────────┴──────────┴──────────┴────────────┘

These experiments tend to show that this solution remain effective whether the distribution is uniform or Pareto.

Edited Jan 20, 2022 by Hai Nguyen Van
Assignee
Assign to
Reviewers
Request review from
Time tracking
Source branch: hai@round_repr_simplify_round_and_offset