I want a memory-efficient way to get video frame pixel data from MediaStream

Submitted by David Sanders

Getting ImageData from a MediaStream (local webcam, remote feed via WebRTC, captured from a video using HTMLMediaElement.captureStream(), etc.) can generate a lot of garbage if you’re doing processing on each frame and don’t care about it once you’ve processed it. Trying to process 1080p video at 30 FPS (Frames Per Second) will generate 230 MB/sec of discarded ImageData objects, which is a lot. 4K video at 30 FPS would be nearly 1 GB/sec.

Currently, freeing this memory is left to garbage collection, and how well implementations handle it varies. Chromium (as of July 2019) can mostly keep up, Firefox will end up with a footprint of gigabytes of memory usage.

The lack of a stable memory footprint impacts the usability of this sort of work in the real-world. On low-memory devices, a blip slowing down garbage collection can cause hundreds of megabytes of garbage to go momentarily uncollected, resulting in tabs crashing or the system running out of memory. No stable memory footprint means it's a crapshoot as to whether the web app can work on someone's environment, and it changes every release as garbage collection gets better or worse.

Workarounds for the limitation mostly don't exist. Instead, we have to capture video at a lower resolution so that the garbage load is smaller. Alternately, we need to use lower frame rates.

Tagged
JavaScript
This was presented at