1. Purpose
An independent, reproducible methodology for measuring thermal and battery behaviour of smart glasses and head-mounted wearables under realistic data workloads. Vendor runtime claims diverge consistently from real-world performance; no standardized, independent, published measurement exists for this category. This document defines how Time2Throttle produces its numbers — before the first number is produced. Reproducibility is the product: the workload app, raw telemetry, and this methodology are published so anyone can re-run any test.
2. Scope & device tiers
| Tier | Devices | What we can measure |
|---|---|---|
| Instrumented | Android-based devices (most enterprise HMDs; Android-based consumer AR) — profiler app sideloaded | Full telemetry: thermal headroom & status, power draw, pipeline metrics + external surface temperature |
| Black-box | Closed ecosystems (no third-party installs) | External-only: standardized-usage runtime, surface temperature, charge analysis, claim delta. Reproducibility limitation: black-box results cannot be fully reproduced by third parties due to the absence of internal telemetry access. This limitation is disclosed inline with every published black-box result. |
| Reference | Phone-class Android 11+ devices | Methodology development & calibration only — phone results are never published as wearable findings |
| Out of scope v0.1: VR headsets (different thermal envelope), audio-only frames without camera/display load. |
3. Test environment
- Standard ambient: 23 °C ± 1 °C, 50% ± 5% relative humidity, still air. All published scores at standard ambient.
- Stress ambient: 40 °C ± 1 °C, 50% ± 5% relative humidity, reported separately, never averaged with standard.
- Mounting: worn-equivalent — head-form or insulated stand replicating skin-adjacent dissipation. Open-desk bench testing is invalid (overstates dissipation). (Specific head-form material and thermal mass specifications to be detailed in future revisions)
- State: battery 100% at start; fixed declared display brightness; charging disabled; radios active (the radio is part of the workload); vendor thermal modes at factory default (deviations documented).
- Runs: minimum 3 per workload per configuration; true median of the three runs reported; divergence >10% on any primary metric is flagged in the result.
4. Instrumentation
Internal (instrumented tier): thermal headroom sampled at 1 Hz and thermal status transition events (Android Thermal API); instantaneous current draw and charge-counter delta (BatteryManager); pipeline actuals (encode bitrate/resolution/fps, bytes transmitted); all telemetry timestamped at source, device–server clock offset measured per session. Devices below modern API levels: battery temperature + readable system thermal zones, with per-device capability documented. External (all tiers): surface temperature via thermal camera or contact probes at ≥3 declared skin-contact points (temple arms, nose bridge/forehead pad). Surface temperature is the primary comfort metric — internal sensors alone do not constitute a valid result. Reference ceiling: 39 °C surface (skin-contact comfort/safety convention).
5. Workload profiles
All workloads are generated by the open-source Time2Throttle workload app (versioned; identical APK across instrumented devices; scripted standardized usage for black-box devices).
| ID | Name | Definition | Simulates |
|---|---|---|---|
| W0 | Idle baseline | Display on, no capture, no upload, 30 min | Attribution floor |
| W1 | Sustained uplink | Camera capture → hardware encode (declared settings) → continuous upload, until throttle event or battery ≤10%. Network conditions: local network, minimum 100 Mbps available bandwidth, maximum 5 ms latency. Controlled network removes congestion as a variable in thermal and battery outcomes. | Remote-assist / streaming — the canonical heavy use |
| W2 | Burst | W1 load in 90 s bursts, 5 min gaps, 10 cycles | Inspection / evidence-capture work |
| W3 | Display-heavy | Max-brightness rendering loop (Standardized 50% APL scene), no capture | Display attribution |
| Configuration matrix (W1 minimum): Config A — full quality (declared maximum sustainable settings). Config B — reduced pipeline (~50% bitrate, reduced fps/resolution, declared). The A/B delta quantifies how much of the thermal/battery problem is attributable to the data pipeline versus fixed loads. |
6. Metrics & definitions
| Metric | Definition |
|---|---|
| Time-to-Throttle (T2T) | Minutes from workload start to the first performance limit — with cause recorded: thermal-triggered or power-triggered. Heat and degraded batteries both end the session; the cause column says which did. |
| Surface plateau | Maximum stabilized surface temperature at the hottest declared contact point (°C) |
| Real runtime | Minutes from 100% to 10% battery under workload (10% floor prevents deep-discharge damage so retail units can be re-tested) |
| Claim delta | Measured runtime vs vendor-claimed runtime (%) — reported for every device |
| Pipeline slice | Share of average power draw attributable to encode + radio + pipeline processing, derived from W0/W1/W3 decomposition. Known limitation: this decomposition assumes linear and independent contribution of components to power draw. Non-linear interactions — particularly thermal throttling suppressing loads during W1 that were present in isolation during W0 and W3 — are a known limitation of this approach. This will be revisited in a future methodology version. |
| Config delta | Improvement of Config B over Config A: Δ time-to-throttle and Δ runtime (%) |
| Recovery time | Time from throttle event to the point at which both of the following conditions are met: (1) thermal status returns to NONE per Android Thermal API; (2) surface temperature at the hottest contact point drops below 37 °C. Whichever condition occurs last is the recorded recovery endpoint. This dual-condition definition ensures recovery is confirmed at both the SoC and skin-contact level. |
7. Reporting & publication rules
- Every published result includes: device, firmware version, methodology version, full configuration, raw telemetry archive (downloadable), and thermal images of contact points.
- Headline format per device: Claim delta · T2T (with cause) · Surface plateau · Real runtime (W1, Config A, 23 °C).
- Negative and boring results are published. No tested device is exempt after testing begins.
- Errata are append-only: results are timestamped and hash-anchored at publication; corrections are added, history is never silently edited.
- Composite scoring is deferred to a later methodology version — premature scores invite disputes before the per-metric data has earned trust.
8. Independence rules
- No paid scores. Scores and results are never purchasable, sponsorable, or removable.
- Retail units first. Test devices are bought at retail. Vendor-supplied units are disclosed inline with that device's results.
- Uncompromised paid testing. Paid services (e.g., pre-release testing) run under this exact public methodology, current version; a vendor may choose whether a pre-release result is published — never what it says.
- Transparent links. Affiliate links, where used, are added after the verdict and never influence it.
- Conflict disclosure. Any commercial relationship with a measured vendor is disclosed inline with that vendor's results.
- Append-only methodology. This methodology changes only by public versioned revision with changelog — never retroactively.
9. Critique invited
This is v0.2, published before any result on purpose. If you build, deploy, or test wearables and see a flaw — in the workloads, the mounting, the metrics, the statistics — open an issue. Credited corrections enter the changelog.