
Run

7 emulators · 625 tests

Results as of this run. The arrow shows each target's movement since the previous run it was tested in. The suite grew this run, so a downward arrow can be the new tests biting rather than a target regressing.

Suite grew from 601 to 625 tests this run.

That's 24 new tests measured against every target. Movement below compares to the previous run, so a fall here is as likely to be the stricter suite as a real regression.
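The effect of a growing suite on a score can be sketched with a little arithmetic. The example below is a minimal illustration, not the dashboard's own scoring code: it assumes a score is simply passed tests divided by total tests, and the pass counts are hypothetical (only the suite sizes 601 and 625 come from this page). How unsupported tests are counted is not stated here, so they are ignored. The function names are also invented for illustration.

```python
def pass_rate(passed: int, total: int) -> float:
    """Pass rate as a percentage of the suite."""
    return 100.0 * passed / total


def split_movement(prev_passed: int, prev_total: int,
                   cur_passed_old: int, new_passed: int, new_total: int) -> dict:
    """Split a score change (in percentage points) into two parts:
    movement on the carried-over tests, and movement caused by the new tests.
    """
    prev = pass_rate(prev_passed, prev_total)
    # Score this run if only the carried-over tests were counted.
    same_suite = pass_rate(cur_passed_old, prev_total)
    # Score this run on the full, larger suite.
    cur = pass_rate(cur_passed_old + new_passed, prev_total + new_total)
    return {
        "regression_pp": same_suite - prev,
        "suite_growth_pp": cur - same_suite,
        "total_pp": cur - prev,
    }


# Hypothetical target: passed 590 of 601 last run, still passes the same 590,
# and passes 12 of the 24 new tests.
result = split_movement(prev_passed=590, prev_total=601,
                        cur_passed_old=590, new_passed=12, new_total=24)
for name, value in result.items():
    print(f"{name}: {value:+.2f}pp")
```

In this hypothetical case the target did not regress on any carried-over test, yet its score still falls by close to two percentage points, all of it attributable to the new tests.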

What changed in the suite this run


Grew to 625 tests, up 24 on the previous run: eleven more in Tier 1 and thirteen more in Tier 3, tightening coverage of core operations and the strict edge cases.

  1. live (AWS) · full coverage
    100% ground truth
    Tier 1 100%
    Tier 2 100%
    Tier 3 100%
  2. 0.9.13 · full coverage
    97.9% (-2.1pp)
    Tier 1 97.8%
    Tier 2 100.0%
    Tier 3 97.1%
  3. v0.1.0 · 25 unsupported
    96.8% (-3.2pp)
    Tier 1 96.8%
    Tier 2 100.0%
    Tier 3 95.7%
  4. efa6922b024a · full coverage
    96.5% (-3.0pp)
    Tier 1 96.8%
    Tier 2 98.1%
    Tier 3 95.2%
  5. 2026.5.0 · full coverage
    88.3% (-0.7pp)
    Tier 1 98.7%
    Tier 2 96.1%
    Tier 3 68.8%
  6. d89f8fcc6b1a · full coverage
    87.5% (-0.7pp)
    Tier 1 98.7%
    Tier 2 91.3%
    Tier 3 68.8%
  7. 4.0.0 · 43 unsupported
    84.7% (+0.7pp)
    Tier 1 98.4%
    Tier 2 16.7%
    Tier 3 83.7%
  8. 89d602080662 · full coverage
    66.1% (-2.3pp)
    Tier 1 83.1%
    Tier 2 76.7%
    Tier 3 35.1%
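The overall figure for a target is not the plain average of its three tier figures. For 4.0.0, for example, the tier figures average to about 66.3%, while the overall score is 84.7%. This page does not say how the overall score is derived. One plausible reading, offered here only as an assumption, is a mean weighted by the number of tests in each tier. The sketch below shows that calculation with hypothetical tier sizes and pass rates; none of these numbers come from the suite.

```python
def weighted_overall(tiers: list[tuple[float, int]]) -> float:
    """Overall pass rate as a test-count-weighted mean.

    tiers: (pass_rate_percent, test_count) pairs, one per tier.
    """
    total_tests = sum(count for _, count in tiers)
    return sum(rate * count for rate, count in tiers) / total_tests


# Hypothetical tiers: (pass rate in %, number of tests). Not real suite data.
tiers = [(98.0, 300), (50.0, 100), (80.0, 200)]

overall = weighted_overall(tiers)
simple_mean = sum(rate for rate, _ in tiers) / len(tiers)

print(f"weighted overall: {overall:.1f}%")
print(f"unweighted mean:  {simple_mean:.1f}%")
```

Under this assumption a weak result in a small tier pulls the overall score down far less than the tier figure alone suggests, which would be consistent with a low Tier 2 figure sitting beside a much higher overall score.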