Time2Throttle

1. Purpose

An independent, reproducible methodology for measuring thermal and battery behaviour of smart glasses and head-mounted wearables under realistic data workloads. Vendor runtime claims diverge consistently from real-world performance; no standardized, independent, published measurement exists for this category. This document defines how Time2Throttle produces its numbers — before the first number is produced. Reproducibility is the product: the workload app, raw telemetry, and this methodology are published so anyone can re-run any test.

2. Scope & device tiers

Tier	Devices	What we can measure
Instrumented	Android-based devices (most enterprise HMDs; Android-based consumer AR) — profiler app sideloaded	Full telemetry: thermal headroom & status, power draw, pipeline metrics + external surface temperature
Black-box	Closed ecosystems (no third-party installs)	External-only: standardized-usage runtime, surface temperature, charge analysis, claim delta. Reproducibility limitation: black-box results cannot be fully reproduced by third parties due to the absence of internal telemetry access. This limitation is disclosed inline with every published black-box result.
Reference	Phone-class Android 11+ devices	Methodology development & calibration only — phone results are never published as wearable findings
Out of scope v0.1: VR headsets (different thermal envelope), audio-only frames without camera/display load.

3. Test environment

Standard ambient: 23 °C ± 1 °C, 50% ± 5% relative humidity, still air. All published scores at standard ambient.
Stress ambient: 40 °C ± 1 °C, 50% ± 5% relative humidity, reported separately, never averaged with standard.
Mounting: worn-equivalent — head-form or insulated stand replicating skin-adjacent dissipation. Open-desk bench testing is invalid (overstates dissipation). (Specific head-form material and thermal mass specifications to be detailed in future revisions)
State: battery 100% at start; fixed declared display brightness; charging disabled; radios active (the radio is part of the workload); vendor thermal modes at factory default (deviations documented).
Runs: minimum 3 per workload per configuration; true median of the three runs reported; divergence >10% on any primary metric is flagged in the result.

4. Instrumentation

Internal (instrumented tier): thermal headroom sampled at 1 Hz and thermal status transition events (Android Thermal API); instantaneous current draw and charge-counter delta (BatteryManager); pipeline actuals (encode bitrate/resolution/fps, bytes transmitted); all telemetry timestamped at source, device–server clock offset measured per session. Devices below modern API levels: battery temperature + readable system thermal zones, with per-device capability documented. External (all tiers): surface temperature via thermal camera or contact probes at ≥3 declared skin-contact points (temple arms, nose bridge/forehead pad). Surface temperature is the primary comfort metric — internal sensors alone do not constitute a valid result. Reference ceiling: 39 °C surface (skin-contact comfort/safety convention).

5. Workload profiles

All workloads are generated by the open-source Time2Throttle workload app (versioned; identical APK across instrumented devices; scripted standardized usage for black-box devices).

ID	Name	Definition	Simulates
W0	Idle baseline	Display on, no capture, no upload, 30 min	Attribution floor
W1	Sustained uplink	Camera capture → hardware encode (declared settings) → continuous upload, until throttle event or battery ≤10%. Network conditions: local network, minimum 100 Mbps available bandwidth, maximum 5 ms latency. Controlled network removes congestion as a variable in thermal and battery outcomes.	Remote-assist / streaming — the canonical heavy use
W2	Burst	W1 load in 90 s bursts, 5 min gaps, 10 cycles	Inspection / evidence-capture work
W3	Display-heavy	Max-brightness rendering loop (Standardized 50% APL scene), no capture	Display attribution
Configuration matrix (W1 minimum): Config A — full quality (declared maximum sustainable settings). Config B — reduced pipeline (~50% bitrate, reduced fps/resolution, declared). The A/B delta quantifies how much of the thermal/battery problem is attributable to the data pipeline versus fixed loads.

6. Metrics & definitions

Metric	Definition
Time-to-Throttle (T2T)	Minutes from workload start to the first performance limit — with cause recorded: thermal-triggered or power-triggered. Heat and degraded batteries both end the session; the cause column says which did.
Surface plateau	Maximum stabilized surface temperature at the hottest declared contact point (°C)
Real runtime	Minutes from 100% to 10% battery under workload (10% floor prevents deep-discharge damage so retail units can be re-tested)
Claim delta	Measured runtime vs vendor-claimed runtime (%) — reported for every device
Pipeline slice	Share of average power draw attributable to encode + radio + pipeline processing, derived from W0/W1/W3 decomposition. Known limitation: this decomposition assumes linear and independent contribution of components to power draw. Non-linear interactions — particularly thermal throttling suppressing loads during W1 that were present in isolation during W0 and W3 — are a known limitation of this approach. This will be revisited in a future methodology version.
Config delta	Improvement of Config B over Config A: Δ time-to-throttle and Δ runtime (%)
Recovery time	Time from throttle event to the point at which both of the following conditions are met: (1) thermal status returns to NONE per Android Thermal API; (2) surface temperature at the hottest contact point drops below 37 °C. Whichever condition occurs last is the recorded recovery endpoint. This dual-condition definition ensures recovery is confirmed at both the SoC and skin-contact level.

7. Reporting & publication rules

Every published result includes: device, firmware version, methodology version, full configuration, raw telemetry archive (downloadable), and thermal images of contact points.
Headline format per device: Claim delta · T2T (with cause) · Surface plateau · Real runtime (W1, Config A, 23 °C).
Negative and boring results are published. No tested device is exempt after testing begins.
Errata are append-only: results are timestamped and hash-anchored at publication; corrections are added, history is never silently edited.
Composite scoring is deferred to a later methodology version — premature scores invite disputes before the per-metric data has earned trust.

8. Independence rules

No paid scores. Scores and results are never purchasable, sponsorable, or removable.
Retail units first. Test devices are bought at retail. Vendor-supplied units are disclosed inline with that device's results.
Uncompromised paid testing. Paid services (e.g., pre-release testing) run under this exact public methodology, current version; a vendor may choose whether a pre-release result is published — never what it says.
Transparent links. Affiliate links, where used, are added after the verdict and never influence it.
Conflict disclosure. Any commercial relationship with a measured vendor is disclosed inline with that vendor's results.
Append-only methodology. This methodology changes only by public versioned revision with changelog — never retroactively.

9. Critique invited

This is v0.2, published before any result on purpose. If you build, deploy, or test wearables and see a flaw — in the workloads, the mounting, the metrics, the statistics — open an issue. Credited corrections enter the changelog.

Test Methodology