Testing

FOMS 22 Testing Session

CE device testing

WebDriver, LG/Samsung simulators

Lack of M1 support
Lack of DRM support
Lack of fidelity to real devices in general

Could run simulators with subset of tests automatically, trigger real devices when needed
BrowserStack/hosted vs physical labs
Containerizing Safari/iOS would really help

GitHub Actions has macOS runners with Safari for PR testing

Generic WebDriver Server vs full WebDriver - could it be extended?

Could potentially build a custom protocol in a web-app to add WebDriver capabilities on top of a platform that doesn't support WebDriver natively
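
A minimal sketch of what the in-app half of such a bridge might look like, assuming a WebSocket channel back to the test controller; the endpoint, message shapes, and startBridge helper are all hypothetical, not part of any existing WebDriver backend:

```typescript
// Hypothetical in-app command bridge: a controller (e.g. a Generic WebDriver
// Server backend) connects over WebSocket and forwards a small subset of
// WebDriver-like commands to the page.

type Command =
  | { id: number; name: "navigate"; url: string }
  | { id: number; name: "executeScript"; script: string; args: unknown[] };

type Reply = { id: number; value?: unknown; error?: string };

function startBridge(endpoint: string): void {
  const socket = new WebSocket(endpoint);

  socket.onmessage = (event: MessageEvent<string>) => {
    const cmd = JSON.parse(event.data) as Command;
    const reply: Reply = { id: cmd.id };
    try {
      switch (cmd.name) {
        case "navigate":
          // Navigating tears down this page (and the socket); the controller
          // has to reconnect after load, as it would with a real endpoint.
          window.location.href = cmd.url;
          break;
        case "executeScript":
          // WebDriver-style "execute script": run a function body with the
          // supplied arguments and send back its return value.
          reply.value = new Function(cmd.script)(...cmd.args);
          break;
      }
    } catch (e) {
      reply.error = String(e);
    }
    socket.send(JSON.stringify(reply));
  };
}

// Loaded by the device under test, pointing at the controller, e.g.:
// startBridge("ws://test-controller.local:4444/bridge");
```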

Playback testing

Observable playback states

A/V sync is a problem
QR-coded test videos with timed audio pulses, a camera pointed at the screen, recordings analyzed afterwards
Done this way at Pinterest, Chrome, Mux

Performance testing

CPU/memory testing

Long frames
Memory leaks
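
A sketch of how long frames and heap growth could be watched from inside the page, assuming a Chromium-based target (Long Tasks API plus the non-standard performance.memory); the helper names and sampling interval are illustrative:

```typescript
const longTasks: PerformanceEntry[] = [];
new PerformanceObserver((list) => {
  // Each "longtask" entry is, by definition, a main-thread block of >= 50 ms.
  longTasks.push(...list.getEntries());
}).observe({ type: "longtask", buffered: true });

const heapSamples: number[] = [];
setInterval(() => {
  const mem = (performance as any).memory; // non-standard, Chrome-only
  if (mem) heapSamples.push(mem.usedJSHeapSize);
}, 10_000);

// After the playback scenario, report the numbers so the test runner can
// compare them against the thresholds set for this reference device.
export function reportPerf() {
  return {
    longTaskCount: longTasks.length,
    heapGrowthBytes:
      heapSamples.length > 1
        ? heapSamples[heapSamples.length - 1] - heapSamples[0]
        : 0,
  };
}
```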

Testing on reference hardware

When switching hardware or OS, need to set new thresholds/refs
Pre-warm to avoid issues with background tasks (updates, caching)
Do multiple runs, look at median, mean, standard deviation
Need multiple reference devices to avoid per-device regressions
Need some procedure to normalize data (standardized network conditions, a fixed number of pre-warm runs during which no data is collected)
With all this, you can set fixed thresholds for performance
Also keep historical performance records to find regressions/progressions, with a narrow range of commits per record so you can work out which commit was responsible for a change
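
A sketch of that normalization procedure: discard a fixed number of pre-warm runs, then compare the median of the remaining runs against a per-device threshold. The metric, sample values, and threshold are made-up examples:

```typescript
interface RunStats {
  mean: number;
  median: number;
  stdDev: number;
}

function summarize(samplesMs: number[], preWarmRuns = 3): RunStats {
  const runs = samplesMs.slice(preWarmRuns); // drop pre-warm runs
  const mean = runs.reduce((a, b) => a + b, 0) / runs.length;
  const sorted = [...runs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
  const stdDev = Math.sqrt(
    runs.reduce((acc, x) => acc + (x - mean) ** 2, 0) / runs.length
  );
  return { mean, median, stdDev };
}

// Per-reference-device threshold, re-baselined whenever hardware or OS changes.
const STARTUP_THRESHOLD_MS = 900; // hypothetical value for one device

const stats = summarize([1200, 1050, 980, 850, 870, 860, 840, 880]);
if (stats.median > STARTUP_THRESHOLD_MS) {
  throw new Error(
    `Startup regression: median ${stats.median} ms > ${STARTUP_THRESHOLD_MS} ms`
  );
}
```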

Performance analysis in production

Look at real-time production data
Deploy weekly, stage to canary
Compare canary data to production
Early adopters tend to be on high-end devices, so a naive comparison of canary to production tends to be misleading
Must find representative comparison groups that control for other variables (region, high-end/low-end)
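
A sketch of comparing canary vs. production within matched cohorts so the early-adopter bias doesn't skew the result; the metric record shape and the cohort keys (region, device tier) are assumptions:

```typescript
interface MetricRecord {
  channel: "canary" | "production";
  region: string;
  deviceTier: "low" | "mid" | "high";
  startupMs: number;
}

function cohortKey(r: MetricRecord): string {
  return `${r.region}/${r.deviceTier}`;
}

function compareByCohort(records: MetricRecord[]): Map<string, number> {
  const buckets = new Map<string, { canary: number[]; prod: number[] }>();
  for (const r of records) {
    const key = cohortKey(r);
    const bucket = buckets.get(key) ?? { canary: [], prod: [] };
    (r.channel === "canary" ? bucket.canary : bucket.prod).push(r.startupMs);
    buckets.set(key, bucket);
  }
  // Report the canary/production ratio per cohort; only cohorts with data
  // on both sides are comparable.
  const ratios = new Map<string, number>();
  for (const [key, { canary, prod }] of buckets) {
    if (canary.length && prod.length) {
      const avg = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
      ratios.set(key, avg(canary) / avg(prod));
    }
  }
  return ratios;
}
```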

Chrome DevTools data via WebdriverIO and the Lighthouse API

Performance tools are often broken in WebdriverIO, maybe because the Lighthouse API is unstable

WebdriverIO is not very actively maintained
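
One option discussed implicitly here is bypassing the WebdriverIO integration and calling the Lighthouse Node API directly; a sketch roughly following the usage shown in Lighthouse's own documentation (option names should be checked against the installed version):

```typescript
import lighthouse from "lighthouse";
import * as chromeLauncher from "chrome-launcher";

async function perfScore(url: string): Promise<number> {
  // Launch a headless Chrome and point Lighthouse at its debugging port.
  const chrome = await chromeLauncher.launch({ chromeFlags: ["--headless"] });
  try {
    const result = await lighthouse(url, {
      port: chrome.port,
      output: "json",
      onlyCategories: ["performance"],
    });
    // `lhr` is the Lighthouse Result object; category scores are 0..1.
    return result?.lhr.categories.performance.score ?? 0;
  } finally {
    await chrome.kill();
  }
}

perfScore("https://example.com/player-test").then((s) =>
  console.log(`Performance score: ${s * 100}`)
);
```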

Noisy environment or cold startup testing (on purpose)
Apple asks people to do a system diagnosis report for bugs, with low-level data to see how noisy an environment is
Facebook does perf testing on PRs, with custom CPU configs (power modes, etc.) to normalize CPU speed and other data
Apple uses a network link conditioner to simulate specific network conditions
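
For Chromium-based targets, something similar to a link conditioner can be approximated with the DevTools Protocol's Network.emulateNetworkConditions; a sketch using Puppeteer as the CDP client (the throughput/latency numbers are illustrative, not a standard profile):

```typescript
import puppeteer from "puppeteer";

async function runThrottled(url: string): Promise<void> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Open a raw CDP session and apply network throttling to this page.
  const client = await page.target().createCDPSession();
  await client.send("Network.enable");
  await client.send("Network.emulateNetworkConditions", {
    offline: false,
    latency: 400, // added round-trip latency in ms
    downloadThroughput: (500 * 1024) / 8, // ~500 kbit/s, in bytes/s
    uploadThroughput: (500 * 1024) / 8,
  });

  await page.goto(url);
  // ...run the playback scenario and collect metrics here...
  await browser.close();
}

runThrottled("https://example.com/player-test").catch(console.error);
```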

Actions

Could potentially build a custom protocol in a web-app to add WebDriver capabilities on top of a platform that doesn't support WebDriver natively (add to Generic WebDriver Server)

Standardize an open system for A/V sync: generate QR test streams, open-source the analysis, recommend a camera system

Alternative: analyze on-device in JS by copying pixel data to a canvas and reading out the frame ID, maybe using WebAudio to detect the audio pulse; no camera needed, could even be headless (see the sketch below)
Alternative: use small binary codes (black & white pixels in the last scanline) instead of QR codes
A standardized no-camera system would be most valuable if it worked well enough
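
A sketch of what that no-camera variant might look like, assuming the test stream encodes a frame ID as black/white cells in its last scanline and plays an audible pulse on known frames; the bit width, thresholds, and helper names are all made up:

```typescript
const BITS = 16; // hypothetical frame-ID width encoded in the last scanline

function readFrameId(video: HTMLVideoElement, canvas: HTMLCanvasElement): number {
  // Note: EME/DRM-protected video generally can't be read back this way.
  const ctx = canvas.getContext("2d")!;
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  ctx.drawImage(video, 0, 0);
  // Sample the centre of each bit cell on the last scanline.
  const cellWidth = canvas.width / BITS;
  let id = 0;
  for (let bit = 0; bit < BITS; bit++) {
    const x = Math.floor(bit * cellWidth + cellWidth / 2);
    const [r, g, b] = ctx.getImageData(x, canvas.height - 1, 1, 1).data;
    if ((r + g + b) / 3 > 128) id |= 1 << bit; // white cell => bit set
  }
  return id;
}

function watchAudioPulse(audioCtx: AudioContext, source: AudioNode, onPulse: () => void) {
  const analyser = audioCtx.createAnalyser();
  source.connect(analyser);
  const buf = new Uint8Array(analyser.fftSize);
  const poll = () => {
    analyser.getByteTimeDomainData(buf);
    // Crude pulse detection: any sample far from the 128 midpoint.
    if (buf.some((v) => Math.abs(v - 128) > 100)) onPulse();
    requestAnimationFrame(poll);
  };
  poll();
}

// Usage idea: hook the video element up via
// audioCtx.createMediaElementSource(video), then compare the frame ID read
// when the pulse fires against the frame ID the stream schedules the pulse
// on, converting the difference to milliseconds via the stream's frame rate.
```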

Performance testing seems to unavoidably require a separate test runner and/or framework from the rest of your testing