Test Methodology

v0.2 — DRAFT, open for critique. No results have been published under this methodology yet. It will be versioned and changelogged; corrections are append-only.

1. Purpose

An independent, reproducible methodology for measuring thermal and battery behaviour of smart glasses and head-mounted wearables under realistic data workloads. Vendor runtime claims diverge consistently from real-world performance; no standardized, independent, published measurement exists for this category. This document defines how Time2Throttle produces its numbers — before the first number is produced. Reproducibility is the product: the workload app, raw telemetry, and this methodology are published so anyone can re-run any test.

2. Scope & device tiers

| Tier | Devices | What we can measure |
| --- | --- | --- |
| Instrumented | Android-based devices (most enterprise HMDs; Android-based consumer AR), profiler app sideloaded | Full telemetry: thermal headroom & status, power draw, pipeline metrics + external surface temperature |
| Black-box | Closed ecosystems (no third-party installs) | External-only: standardized-usage runtime, surface temperature, charge analysis, claim delta. Reproducibility limitation: black-box results cannot be fully reproduced by third parties due to the absence of internal telemetry access. This limitation is disclosed inline with every published black-box result. |
| Reference | Phone-class Android 11+ devices | Methodology development & calibration only; phone results are never published as wearable findings |

Out of scope (v0.2): VR headsets (different thermal envelope), audio-only frames without camera/display load.

3. Test environment

  • Standard ambient: 23 °C ± 1 °C, 50% ± 5% relative humidity, still air. All headline results are measured at standard ambient.
  • Stress ambient: 40 °C ± 1 °C, 50% ± 5% relative humidity, reported separately, never averaged with standard.
  • Mounting: worn-equivalent, using a head-form or insulated stand that replicates skin-adjacent dissipation. Open-desk bench testing is invalid because it overstates dissipation. Head-form material and thermal mass specifications will be detailed in a future revision.
  • State: battery 100% at start; fixed declared display brightness; charging disabled; radios active (the radio is part of the workload); vendor thermal modes at factory default (deviations documented).
  • Runs: minimum 3 per workload per configuration; the median across runs is reported; run-to-run divergence greater than 10% on any primary metric is flagged in the result.
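The run rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's tooling: the function name `summarize_runs` and the sample values are hypothetical, and the divergence formula, (max − min) / median, is an assumption because the methodology does not yet define how divergence is computed.

```python
from statistics import median

DIVERGENCE_LIMIT = 0.10  # the 10% flagging threshold from the run rule
MIN_RUNS = 3             # minimum runs per workload per configuration


def summarize_runs(values):
    """Return (median, relative_spread, flagged) for one metric across runs."""
    if len(values) < MIN_RUNS:
        raise ValueError("at least 3 runs are required per workload per configuration")
    mid = median(values)
    # Assumption: divergence = (max - min) / median. Swap in the official
    # definition once the methodology specifies one.
    spread = (max(values) - min(values)) / mid
    return mid, spread, spread > DIVERGENCE_LIMIT


# Hypothetical time-to-throttle values (minutes) from three runs.
mid, spread, flagged = summarize_runs([41.0, 44.5, 47.0])
print(f"median={mid} min, spread={spread:.1%}, flagged={flagged}")
```

Reporting the spread alongside the flag keeps the raw disagreement visible instead of reducing it to a yes/no.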

4. Instrumentation

Internal (instrumented tier): thermal headroom sampled at 1 Hz, plus thermal status transition events (Android Thermal API); instantaneous current draw and charge-counter delta (BatteryManager); pipeline actuals (encode bitrate, resolution, fps, and bytes transmitted). All telemetry is timestamped at source, and the device–server clock offset is measured per session. Devices on API levels that predate the Thermal API (thermal status requires Android 10 / API 29; thermal headroom requires Android 11 / API 30) fall back to battery temperature plus readable system thermal zones, with per-device capability documented.

External (all tiers): surface temperature via thermal camera or contact probes at ≥3 declared skin-contact points (temple arms, nose bridge/forehead pad). Surface temperature is the primary comfort metric; internal sensors alone do not constitute a valid result. Reference ceiling: 39 °C surface (skin-contact comfort/safety convention).
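As an illustration of the external-measurement rule, the sketch below validates a set of surface readings against the minimum of three contact points and the 39 °C reference ceiling. The function name, the contact-point labels, and the readings are hypothetical. Treating the ceiling as exceeded only when the reading is strictly greater than 39 °C is also an assumption.

```python
SURFACE_CEILING_C = 39.0  # reference ceiling stated in this section
MIN_CONTACT_POINTS = 3    # minimum declared skin-contact points


def check_surface(readings_c):
    """readings_c maps a contact-point name to its surface temperature in °C."""
    if len(readings_c) < MIN_CONTACT_POINTS:
        raise ValueError("need at least 3 declared skin-contact points")
    hottest = max(readings_c, key=readings_c.get)
    peak = readings_c[hottest]
    return {
        "hottest_point": hottest,
        "peak_c": peak,
        # Assumption: "over ceiling" means strictly above 39 °C.
        "over_ceiling": peak > SURFACE_CEILING_C,
    }


# Hypothetical readings for one device.
result = check_surface(
    {"left_temple": 36.8, "right_temple": 39.4, "nose_bridge": 35.1}
)
print(result)
```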

5. Workload profiles

All workloads are generated by the open-source Time2Throttle workload app (versioned; identical APK across instrumented devices; scripted standardized usage for black-box devices).

| ID | Name | Definition | Simulates |
| --- | --- | --- | --- |
| W0 | Idle baseline | Display on, no capture, no upload, 30 min | Attribution floor |
| W1 | Sustained uplink | Camera capture → hardware encode (declared settings) → continuous upload, until throttle event or battery ≤10%. Network conditions: local network, minimum 100 Mbps available bandwidth, maximum 5 ms latency. Controlled network removes congestion as a variable in thermal and battery outcomes. | Remote-assist / streaming (the canonical heavy use) |
| W2 | Burst | W1 load in 90 s bursts, 5 min gaps, 10 cycles | Inspection / evidence-capture work |
| W3 | Display-heavy | Max-brightness rendering loop (standardized 50% APL scene), no capture | Display attribution |

Configuration matrix (W1 minimum): Config A — full quality (declared maximum sustainable settings). Config B — reduced pipeline (~50% bitrate, reduced fps/resolution, declared). The A/B delta quantifies how much of the thermal/battery problem is attributable to the data pipeline versus fixed loads.
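To make the W2 burst definition concrete, here is a small sketch that expands "90 s bursts, 5 min gaps, 10 cycles" into load windows. The function name is hypothetical. It assumes the run starts with a burst and ends with the last burst (no trailing gap), which the workload table does not specify.

```python
BURST_S = 90     # burst length from the W2 definition
GAP_S = 5 * 60   # gap length from the W2 definition
CYCLES = 10      # number of bursts


def w2_schedule(burst_s=BURST_S, gap_s=GAP_S, cycles=CYCLES):
    """Return (start_s, end_s) load windows measured from workload start."""
    windows = []
    t = 0
    for _ in range(cycles):
        windows.append((t, t + burst_s))
        t += burst_s + gap_s
    return windows


windows = w2_schedule()
load_s = sum(end - start for start, end in windows)
# Assumption: the run ends when the last burst ends.
total_s = windows[-1][1]
print(f"{len(windows)} bursts, {load_s} s under load, {total_s} s total")
```

Under these assumptions the device is under W1-level load for 15 minutes of a 60-minute run.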

6. Metrics & definitions

| Metric | Definition |
| --- | --- |
| Time-to-Throttle (T2T) | Minutes from workload start to the first performance limit, with the cause recorded as thermal-triggered or power-triggered. Heat and degraded batteries both end the session; the cause column says which did. |
| Surface plateau | Maximum stabilized surface temperature at the hottest declared contact point (°C) |
| Real runtime | Minutes from 100% to 10% battery under workload (10% floor prevents deep-discharge damage so retail units can be re-tested) |
| Claim delta | Measured runtime vs vendor-claimed runtime (%), reported for every device |
| Pipeline slice | Share of average power draw attributable to encode + radio + pipeline processing, derived from W0/W1/W3 decomposition. Known limitation: this decomposition assumes that components contribute linearly and independently to power draw. It does not capture non-linear interactions, particularly thermal throttling suppressing loads during W1 that were present in isolation during W0 and W3. This will be revisited in a future methodology version. |
| Config delta | Improvement of Config B over Config A: Δ time-to-throttle and Δ runtime (%) |
| Recovery time | Time from throttle event to the point at which both of the following conditions are met: (1) thermal status returns to NONE per Android Thermal API; (2) surface temperature at the hottest contact point drops below 37 °C. Whichever condition occurs last is the recorded recovery endpoint. This dual-condition definition ensures recovery is confirmed at both the SoC and skin-contact level. |
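Several of the metrics above reduce to short calculations. The sketch below is illustrative only, and every function name, status label, and number in it is hypothetical. It makes three assumptions the table does not state: deltas are signed and expressed relative to the reference value (the vendor claim, or Config A); the pipeline slice is read as the share of W1 average power above the W0 floor, which leaves out W3; and recovery is taken at the first sample where both conditions hold at once.

```python
def pct_delta(measured, reference):
    """Signed percentage difference of measured relative to reference."""
    return (measured - reference) / reference * 100.0


def pipeline_slice(p_w0_mw, p_w1_mw):
    """Share of W1 average power above the W0 floor.

    Assumption: loads add linearly and independently, and W3 is not used.
    """
    return (p_w1_mw - p_w0_mw) / p_w1_mw


def recovery_time_s(throttle_t, samples):
    """samples: time-ordered (t_s, thermal_status, hottest_surface_c) tuples.

    Returns seconds from throttle_t to the first sample where the status is
    "NONE" and the surface is below 37 °C, or None if that never happens.
    """
    for t, status, surface_c in samples:
        if t >= throttle_t and status == "NONE" and surface_c < 37.0:
            return t - throttle_t
    return None


# Hypothetical values for one device.
claim_delta = pct_delta(measured=150.0, reference=240.0)   # runtime, minutes
config_delta = pct_delta(measured=52.0, reference=40.0)    # T2T, B vs A
share = pipeline_slice(p_w0_mw=800.0, p_w1_mw=2000.0)
recovery = recovery_time_s(
    1200,
    [(1200, "SEVERE", 40.1), (1260, "MODERATE", 39.0),
     (1320, "NONE", 37.8), (1380, "NONE", 36.9)],
)
print(claim_delta, config_delta, share, recovery)
```

In the recovery example the thermal status reaches NONE at 1320 s but the surface is still above 37 °C, so the endpoint is the later sample at 1380 s.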

7. Reporting & publication rules

  • Every published result includes: device, firmware version, methodology version, full configuration, raw telemetry archive (downloadable), and thermal images of contact points.
  • Headline format per device: Claim delta · T2T (with cause) · Surface plateau · Real runtime (W1, Config A, 23 °C).
  • Negative and boring results are published. No tested device is exempt after testing begins.
  • Errata are append-only: results are timestamped and hash-anchored at publication; corrections are added, history is never silently edited.
  • Composite scoring is deferred to a later methodology version — premature scores invite disputes before the per-metric data has earned trust.
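One way to picture the append-only, hash-anchored errata rule is a simple hash chain, sketched below. This is an assumption about mechanism, not the project's publication pipeline: the methodology does not say where or how hashes are anchored, and the record fields used here are hypothetical.

```python
import hashlib
import json


def record_hash(record):
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, entry):
    """Append an entry that commits to the hash of the previous one."""
    prev = log[-1]["hash"] if log else None
    body = {"prev": prev, "entry": entry}
    log.append({**body, "hash": record_hash(body)})
    return log


def verify(log):
    """Return True if no entry was edited, removed, or reordered."""
    prev = None
    for item in log:
        body = {"prev": item["prev"], "entry": item["entry"]}
        if item["prev"] != prev or item["hash"] != record_hash(body):
            return False
        prev = item["hash"]
    return True


# Hypothetical result followed by a correction appended later.
log = []
append_entry(log, {"type": "result", "device": "example-device", "t2t_min": 44.5})
append_entry(log, {"type": "erratum", "note": "firmware version corrected"})
print(verify(log))
```

Because each entry includes the previous entry's hash, editing a published result changes every later hash, so a silent edit is detectable by anyone who kept a copy of the latest hash.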

8. Independence rules

  1. No paid scores. Scores and results are never purchasable, sponsorable, or removable.
  2. Retail units first. Test devices are bought at retail. Vendor-supplied units are disclosed inline with that device's results.
  3. Uncompromised paid testing. Paid services (e.g., pre-release testing) run under this exact public methodology, current version; a vendor may choose whether a pre-release result is published — never what it says.
  4. Transparent links. Affiliate links, where used, are added after the verdict and never influence it.
  5. Conflict disclosure. Any commercial relationship with a measured vendor is disclosed inline with that vendor's results.
  6. Append-only methodology. This methodology changes only by public versioned revision with changelog — never retroactively.

9. Critique invited

This is v0.2, published before any result on purpose. If you build, deploy, or test wearables and see a flaw — in the workloads, the mounting, the metrics, the statistics — open an issue. Credited corrections enter the changelog.

Changelog

  • v0.1 — founding public draft (June 2026).
  • v0.2 — six revisions (June 2026):
      1. §3 humidity added as a controlled variable: standard ambient 50% ± 5% RH; stress ambient 50% ± 5% RH.
      2. §3 minimum run count raised from 2 to 3; language updated to reflect that a true median now exists.
      3. §5 W1 network conditions specified: local network, minimum 100 Mbps available bandwidth, maximum 5 ms latency; note added that this removes network congestion as a variable.
      4. §6 Recovery Time redefined as a precise dual-condition endpoint: thermal status NONE per Android Thermal API and surface temperature below 37 °C at the hottest contact point; whichever condition occurs last is the recorded endpoint.
      5. §6 Pipeline Slice, known limitation note added: W0/W1/W3 decomposition assumes linear and independent component contributions; non-linear interactions (particularly throttling during W1 suppressing loads present in W0/W3) are acknowledged; to be revisited in a future version.
      6. §2 Black-box tier, reproducibility limitation disclosed inline: results cannot be fully reproduced by third parties due to absence of internal telemetry; limitation disclosed on every published black-box result.