Run
7 emulators · 625 tests
Results as of this run. The arrow shows each target's movement since the previous run it was tested in. The suite grew this run, so a downward arrow can be the new tests biting rather than a target regressing.
Suite grew from 601 to 625 tests this run.
That's 24 new tests measured against every target. Movement below compares to the previous run, so a fall here is as likely to reflect the stricter suite as a real regression.
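As a rough check on that claim, the sketch below estimates the largest fall that the 24 new tests alone could cause. It assumes the headline score is a simple unweighted pass fraction over the whole suite, which this page does not state. The function name is hypothetical, and the previous scores are reconstructed from the rounded figures shown (current score minus movement), so treat the output as approximate.

```python
def max_drop_from_new_tests(prev_rate_pct, old_size=601, new_size=625):
    """Largest fall, in percentage points, that the added tests alone could cause.

    Assumption: the score is passes / total tests, with no per-tier weighting.
    Worst case: the target keeps every earlier pass and fails every new test.
    """
    prev_passes = prev_rate_pct / 100 * old_size
    worst_rate_pct = prev_passes / new_size * 100
    return prev_rate_pct - worst_rate_pct


# (current score %, movement in pp) for the three largest falls on this page.
observed = {
    "0.9.13": (97.9, -2.1),
    "v0.1.0": (96.8, -3.2),
    "efa6922b024a": (96.5, -3.0),
}

for name, (now, delta) in observed.items():
    prev = now - delta  # reconstructed previous score; subject to rounding
    ceiling = max_drop_from_new_tests(prev)
    print(f"{name}: fell {-delta:.1f}pp, new-test ceiling {ceiling:.2f}pp")
```

Under that assumption a target previously at 100% could fall by as much as about 3.84pp purely from the new tests, so none of the falls listed here is large enough, on its own, to prove a regression.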
What changed in the suite this run
Suite on
Grew to 625 tests, up 24 on the previous run: eleven more in Tier 1 and thirteen more in Tier 3, tightening coverage of core operations and the strict edge cases.
- live (AWS) · full coverage · 100% (ground truth) · Tier 1 100% · Tier 2 100% · Tier 3 100%
- 0.9.13 · full coverage · 97.9% (-2.1pp) · Tier 1 97.8% · Tier 2 100.0% · Tier 3 97.1%
- v0.1.0 · 25 unsupported · 96.8% (-3.2pp) · Tier 1 96.8% · Tier 2 100.0% · Tier 3 95.7%
- efa6922b024a · full coverage · 96.5% (-3.0pp) · Tier 1 96.8% · Tier 2 98.1% · Tier 3 95.2%
- 2026.5.0 · full coverage · 88.3% (-0.7pp) · Tier 1 98.7% · Tier 2 96.1% · Tier 3 68.8%
- d89f8fcc6b1a · full coverage · 87.5% (-0.7pp) · Tier 1 98.7% · Tier 2 91.3% · Tier 3 68.8%
- 4.0.0 · 43 unsupported · 84.7% (+0.7pp) · Tier 1 98.4% · Tier 2 16.7% · Tier 3 83.7%
- 89d602080662 · full coverage · 66.1% (-2.3pp) · Tier 1 83.1% · Tier 2 76.7% · Tier 3 35.1%