javascript - MediaRecorder ignoring VideoFrame.timestamp - Stack Overflow


I would like to generate a video. I am using MediaRecorder to record a track generated by MediaStreamTrackGenerator.

Generating each frame takes some time, let's say 1 second, and I would like to generate the video at 10 fps.

Therefore, when I create a frame, I use timestamp and duration to indicate the real time of the frame.

const ms = 1_000_000; // microseconds per second (VideoFrame times are in µs)
const fps = 10;
const frame = new VideoFrame(await createImageBitmap(canvas), {
  timestamp: (ms * 1) / fps,
  duration: ms / fps,
});

Unfortunately, if generating each frame takes 1 second, the video plays at 1 frame/sec rather than 10 fps, despite the timestamp and duration I set.

How can I encode the video frames at the desired frame rate?

Bonus: when I open the downloaded video in VLC, it has no duration. Can this be set?


CodePen for reproduction: https://codepen.io/AmitMY/pen/OJxgPoG (this example works in Chrome. If you use Safari, change video/webm to video/mp4.)

Things I tried that aren't a good solution for me:

  1. Storing all frames in some cache, then playing them back at the desired speed and recording that playback. It is unreliable, inconsistent, and memory intensive.

Asked Jun 20, 2022 at 21:17 by Amit

2 Answers

Foreword

So... I've been investigating this for two days now and it's a complete mess. I don't have a full answer, but here's what I've tried and figured out so far.

The situation

First, I put together this diagram of the Web Codecs / Insertable Streams API to better understand how everything links together:

[Diagram: MediaStream, StreamTrack, VideoFrame, TrackProcessor, TrackGenerator, ...]

The most common use case / flow is that you have a MediaStream, such as a video camera feed or an existing video (playing on canvas), which you'd then "break into" different MediaStreamTracks - usually an audio and a video track, though the API actually supports subtitle, image and shared-screen tracks as well.

So you break a MediaStream into a MediaStreamTrack of "video" kind, which you then feed to MediaStreamTrackProcessor to actually break the video track into individual VideoFrames. You can then do frame-by-frame manipulation and when you're done, you're supposed to stream those VideoFrames into MediaStreamTrackGenerator, which in turn turns those VideoFrames into a MediaStreamTrack, which in turn you can stuff into a MediaStream to make a sort of "Full Media Object" aka. something that contains Video and Audio tracks.
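
The flow described above can be sketched with plain Web Streams. This is only an illustration under stated assumptions: plain objects stand in for VideoFrame, the ReadableStream stands in for MediaStreamTrackProcessor.readable, and the WritableStream stands in for MediaStreamTrackGenerator.writable. The helper names (makeFrameSource, makeRetimer, makeCollector, runPipeline) are hypothetical and are not part of any browser API.

```javascript
// Stand-in for MediaStreamTrackProcessor.readable: emits `count` fake frames.
function makeFrameSource(count) {
  let index = 0;
  return new ReadableStream({
    pull(controller) {
      if (index >= count) {
        controller.close();
        return;
      }
      controller.enqueue({ index, timestamp: null });
      index++;
    },
  });
}

// Frame-by-frame manipulation step: stamps each fake frame with a
// timestamp in microseconds derived from its index and the target fps.
function makeRetimer(fps) {
  return new TransformStream({
    transform(frame, controller) {
      const timestamp = Math.round((frame.index * 1_000_000) / fps);
      controller.enqueue({ ...frame, timestamp });
    },
  });
}

// Stand-in for MediaStreamTrackGenerator.writable: collects what it receives.
function makeCollector(received) {
  return new WritableStream({
    write(frame) {
      received.push(frame);
    },
  });
}

// source -> transform -> sink, resolving with the frames the sink received.
async function runPipeline(count, fps) {
  const received = [];
  await makeFrameSource(count)
    .pipeThrough(makeRetimer(fps))
    .pipeTo(makeCollector(received));
  return received;
}

runPipeline(3, 10).then((frames) => {
  console.log(frames.map((f) => f.timestamp)); // [ 0, 100000, 200000 ]
});
```

In a browser, the same pipeThrough / pipeTo shape would presumably connect a real processor to a real generator, with the transform step responsible for closing any VideoFrame it replaces.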

Interestingly enough, I couldn't get a MediaStream to play on a <video> element directly, but I think that this is a hard requirement if we want to accomplish what OP wants.

As it currently stands, even when we have all the VideoFrames ready to go and turned into a MediaStream, we still have to, for some reason, record it twice to create a proper Blob which <video> accepts - think of this step pretty much as a "rendering" step of a professional video editing software, the only difference being that we already have the final frames, so why can't we just create a video out of those?

As far as I know, everything here that works for video also works for audio. There is an audio counterpart called AudioFrame (named AudioData in the current WebCodecs specification), though the documentation page is missing as I am writing this.

Encoding and Decoding

Furthermore, regarding VideoFrames and AudioFrames, there's also API support for encoding and decoding of those, which I actually tried in the hopes that encoding a VideoFrame with VP8 would somehow "bake" that duration and timestamp into it, as at least the duration of VideoFrame does not seem to do anything.

Here's my encoding / decoding code when I tried playing around with it. Note that this whole encoding and decoding business + codecs is one hell of a deep rabbit hole. I have no idea how I found this for example, but it did tell me that Chromium doesn't support hardware accelerated VP8 on Windows (no thanks to the codec error messages, which just babbled something about "cannot used closed codec"):

const createFrames = async (ctx, fps, streamWriter, width, height) => {
    const getRandomRgb = () => {
        var num = Math.round(0xffffff * Math.random());
        var r = num >> 16;
        var g = num >> 8 & 255;
        var b = num & 255;
        return 'rgb(' + r + ', ' + g + ', ' + b + ')';
    }

    const encodedChunks = [];
    const videoFrames = [];

    const encoderOutput = (encodedChunk) => {
        encodedChunks.push(encodedChunk);
    }

    const encoderError = (err) => {
        //console.error(err);
    }

    const encoder = new VideoEncoder({
        output: encoderOutput,
        error: encoderError
    })

    encoder.configure({
        //codec: "avc1.64001E",
        //avc:{format:"annexb"},
        codec: "vp8",
        hardwareAcceleration: "prefer-software", // VP8 with hardware acceleration not supported
        width: width,
        height: height,
        displayWidth: width,
        displayHeight: height,
        bitrate: 3_000_000,
        framerate: fps,
        bitrateMode: "constant",
        latencyMode: "quality"
    });

    const ft = 1 / fps;
    const micro = 1_000_000;
    const ft_us = Math.floor(ft * micro);

    for(let i = 0; i < 10; i++) {
        console.log(`Writing frames ${i * fps}-${(i + 1) * fps}`);
        ctx.fillStyle = getRandomRgb();
        ctx.fillRect(0,0, width, height);

        ctx.fillStyle = "white";
        ctx.textAlign = "center";
        ctx.font = "80px Arial";
        ctx.fillText(`${i}`, width / 2, height / 2);

        for(let j = 0; j < fps; j++) {
            //console.log(`Writing frame ${i}.${j}`);
            const offset = i > 0 ? 1 : 0;
            const timestamp = i * ft_us * fps + j * ft_us;
            const duration = ft_us;

            var frameData = ctx.getImageData(0, 0, width, height);

            var buffer = frameData.data.buffer;

            const frame = new VideoFrame(buffer, 
            { 
                format: "RGBA",
                codedWidth: width,
                codedHeight: height,
                colorSpace: {
                    primaries: "bt709",
                    transfer: "bt709",
                    matrix: "bt709",
                    fullRange: true
                },
                timestamp: timestamp,
                duration: ft_us
            });
            
            encoder.encode(frame, { keyFrame: false });
            videoFrames.push(frame);
        }  
    }

    //return videoFrames;
    
    await encoder.flush();
    //return encodedChunks;

    const decodedChunks = [];
    
    const decoder = new VideoDecoder({
        output: (frame) => {
            decodedChunks.push(frame);
        },
        error: (e) => {
            console.log(e.message);
        }
    });

    decoder.configure({
        codec: 'vp8',
        codedWidth: width,
        codedHeight: height
    });

    encodedChunks.forEach((chunk) => {
        decoder.decode(chunk);
    });

    await decoder.flush();

    return decodedChunks;
}

Frame calculations

Regarding your frame calculations, I did things a bit differently. Consider the following image and code:

const fps = 30;
const ft = 1 / fps;
const micro = 1_000_000;
const ft_us = Math.floor(ft * micro);

Ignoring how long it takes to create one frame (it should be irrelevant here if we can set the frame duration), here's what I figured.

We want to play the video at 30 frames per second (fps). We generate 10 colored rectangles which we want to show on the screen for 1 second each, resulting in a video length of 10 seconds. This means that, in order to actually play the video at 30fps, we need to generate 30 frames for each rectangle. If we could set a frame duration, we could technically have only 10 frames with a duration of 1 second each, but then the fps would actually be 1 frame per second. We're doing 30fps though.

An fps of 30 gives us a frametime (ft) of 1 / 30 seconds, aka. the time that each frame is shown on the screen. We generate 30 frames for 1 rectangle -> 30 * (1 / 30) = 1 second checks out. The other thing here is that VideoFrame duration and timestamp do not accept seconds or milliseconds, but microseconds, so we need to turn that frametime (ft) to frametime in microseconds (ft_us), which is just (1 / 30) * 1 000 000 = ~33 333us.
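
As a small worked check of the arithmetic above, here is a sketch of the frame-time conversion. The function names are hypothetical. It also shows a side effect of truncating with Math.floor: at 30 fps, 30 truncated frame times add up to slightly less than one second.

```javascript
const MICROSECONDS_PER_SECOND = 1_000_000;

// Frame time in whole microseconds, truncated like ft_us in the answer.
function frameTimeUs(fps) {
  return Math.floor(MICROSECONDS_PER_SECOND / fps);
}

// Microseconds missing from each second of video because of the truncation.
function truncationLossPerSecondUs(fps) {
  return MICROSECONDS_PER_SECOND - frameTimeUs(fps) * fps;
}

console.log(frameTimeUs(30)); // 33333
console.log(truncationLossPerSecondUs(30)); // 10
console.log(frameTimeUs(10)); // 100000
console.log(truncationLossPerSecondUs(10)); // 0
```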

Calculating the final duration and timestamp for each frame is a bit tricky as we are now looping twice, one loop for each rectangle and one loop for each frame of a rectangle at 30fps.

The timestamp for a frame j of rectangle i is (in English):

<i> * <frametime in us> * <fps> + <j> * <frametime in us> (+ <offset 0 or 1>)

Where <i> * <frametime in us> * <fps> gets us how many microseconds all previous rectangles take and <j> * <frametime in us> gets us how many microseconds all previous frames of the current rectangle take. We also supply an optional offset: 0 when we're making the very first frame of the very first rectangle and 1 otherwise, so that we avoid overlapping.

const fps = 30;
const ft = 1 / fps;
const micro = 1_000_000;
const ft_us = Math.floor(ft * micro);

// For each colored rectangle
for(let i = 0; i < 10; i++) {
    // For each frame of colored rectangle at 30fps
    for(let j = 0; j < fps; j++) {
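        const offset = i > 0 ? 1 : 0;
        const timestamp = i * ft_us * fps + j * ft_us /* + offset */;
        const duration = ft_us; // each frame is shown for one frame time

        // `source` stands for the frame's image (e.g. a canvas or ImageBitmap)
        new VideoFrame(source, { duration, timestamp });
        ...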
    }
}

This should get us 10 * 30 = 300 frames in total, for a video length of 10 seconds when played at 30 fps.
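
To check those totals, the nested loops can be run without creating any VideoFrame at all. The sketch below only builds the timestamp / duration pairs; buildTimeline and timestampFromIndexUs are hypothetical names. The second function is an alternative of my own (an assumption, not part of the approach above) that derives the timestamp from the global frame index so the truncation error does not accumulate.

```javascript
// Builds { timestamp, duration } pairs (in µs) for `rectangles` images shown
// for one second each at `fps`, using the per-frame formula from above.
function buildTimeline(rectangles, fps) {
  const ftUs = Math.floor(1_000_000 / fps);
  const timeline = [];
  for (let i = 0; i < rectangles; i++) {
    for (let j = 0; j < fps; j++) {
      timeline.push({ timestamp: i * ftUs * fps + j * ftUs, duration: ftUs });
    }
  }
  return timeline;
}

// Alternative (assumption): compute the timestamp from the global frame
// index and round once, instead of multiplying a truncated frame time.
function timestampFromIndexUs(frameIndex, fps) {
  return Math.round((frameIndex * 1_000_000) / fps);
}

const timeline = buildTimeline(10, 30);
const last = timeline[timeline.length - 1];
console.log(timeline.length); // 300
console.log(last.timestamp + last.duration); // 9999900
console.log(timestampFromIndexUs(300, 30)); // 10000000
```

The 100 µs difference over 10 seconds is harmless for playback, but it shows why the end of the last frame does not land exactly on the 10-second mark.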

My latest try and ReadableStream test

I've refactored everything so many times without luck, but here is my current solution where I try to use ReadableStream to pass the generated VideoFrames to MediaStreamTrackGenerator (skipping the recording step), generate a MediaStream from that and try to give the result to srcObject of a <video> element:

const streamTrackGenerator = new MediaStreamTrackGenerator({ kind: 'video' });
const streamWriter = streamTrackGenerator.writable;
const chunks = await createFrames(ctx, fps, streamWriter, width, height); // array of VideoFrames
let idx = 0;

await streamWriter.ready;

const frameStream = new ReadableStream({
    start(controller) {
        controller.enqueue(chunks[idx]);
        idx++;
    },

    pull(controller) {
        if(idx >= chunks.length) {
            controller.close();
        }
        else {
            controller.enqueue(chunks[idx]);
            idx++;
        }
    },

    cancel(reason) {
        console.log("Cancelled", reason);
    }

});

await frameStream.pipeThrough(new TransformStream({ 
    transform (chunk, controller) {
        console.log(chunk); // debugging
        controller.enqueue(chunk) // passthrough
    }
 })).pipeTo(streamWriter);

const mediaStreamTrack = streamTrackGenerator.clone();

const mediaStream = new MediaStream([mediaStreamTrack]);

const video = document.createElement('video');
video.style.width = `${width}px`;
video.style.height = `${height}px`;
document.body.appendChild(video);
video.srcObject  = mediaStream;
video.setAttribute('controls', 'true')

video.onloadedmetadata = function(e) {
    video.play().catch(e => alert(e.message))
};

Try with VP8 encoding + decoding and trying to give VideoFrames to MediaSource via SourceBuffers

More info on MediaSource and SourceBuffers. This one is also me trying to exploit the MediaRecorder.start() function with the timeslice parameter in conjunction with MediaRecorder.requestData() to try and record frame-by-frame:

const init = async () => {
    const width = 256;
    const height = 256;
    const fps = 30;
    

    const createFrames = async (ctx, fps, streamWriter, width, height) => {
        const getRandomRgb = () => {
            var num = Math.round(0xffffff * Math.random());
            var r = num >> 16;
            var g = num >> 8 & 255;
            var b = num & 255;
            return 'rgb(' + r + ', ' + g + ', ' + b + ')';
        }

        const encodedChunks = [];
        const videoFrames = [];

        const encoderOutput = (encodedChunk) => {
            encodedChunks.push(encodedChunk);
        }

        const encoderError = (err) => {
            //console.error(err);
        }

        const encoder = new VideoEncoder({
            output: encoderOutput,
            error: encoderError
        })

        encoder.configure({
            //codec: "avc1.64001E",
            //avc:{format:"annexb"},
            codec: "vp8",
            hardwareAcceleration: "prefer-software",
            width: width,
            height: height,
            displayWidth: width,
            displayHeight: height,
            bitrate: 3_000_000,
            framerate: fps,
            bitrateMode: "constant",
            latencyMode: "quality"
        });

        const ft = 1 / fps;
        const micro = 1_000_000;
        const ft_us = Math.floor(ft * micro);

        for(let i = 0; i < 10; i++) {
            console.log(`Writing frames ${i * fps}-${(i + 1) * fps}`);
            ctx.fillStyle = getRandomRgb();
            ctx.fillRect(0,0, width, height);

            ctx.fillStyle = "white";
            ctx.textAlign = "center";
            ctx.font = "80px Arial";
            ctx.fillText(`${i}`, width / 2, height / 2);

            for(let j = 0; j < fps; j++) {
                //console.log(`Writing frame ${i}.${j}`);
                const offset = i > 0 ? 1 : 0;
                const timestamp = i * ft_us * fps + j * ft_us;
                const duration = ft_us;

                var frameData = ctx.getImageData(0, 0, width, height);

                var buffer = frameData.data.buffer;

                const frame = new VideoFrame(buffer, 
                { 
                    format: "RGBA",
                    codedWidth: width,
                    codedHeight: height,
                    colorSpace: {
                        primaries: "bt709",
                        transfer: "bt709",
                        matrix: "bt709",
                        fullRange: true
                    },
                    timestamp: timestamp,
                    duration: ft_us
                });
                
                encoder.encode(frame, { keyFrame: false });
                videoFrames.push(frame);
            }  
        }

        //return videoFrames;
        
        await encoder.flush();
        //return encodedChunks;

        const decodedChunks = [];
        
        const decoder = new VideoDecoder({
            output: (frame) => {
                decodedChunks.push(frame);
            },
            error: (e) => {
                console.log(e.message);
            }
        });

        decoder.configure({
            codec: 'vp8',
            codedWidth: width,
            codedHeight: height
        });

        encodedChunks.forEach((chunk) => {
            decoder.decode(chunk);
        });

        await decoder.flush();

        return decodedChunks;
    }

    const canvas = new OffscreenCanvas(256, 256);
    const ctx = canvas.getContext("2d");

    const recordedChunks = [];
    const streamTrackGenerator = new MediaStreamTrackGenerator({ kind: 'video' });
    const streamWriter = streamTrackGenerator.writable.getWriter();
    const mediaStream = new MediaStream();
    mediaStream.addTrack(streamTrackGenerator);

    const mediaRecorder = new MediaRecorder(mediaStream, {
        mimeType: "video/webm", 
        videoBitsPerSecond: 3_000_000
    });

    mediaRecorder.addEventListener('dataavailable', (event) => {
        recordedChunks.push(event.data);
        console.log(event)
    });

    mediaRecorder.addEventListener('stop', (event) => {
        console.log("stopped?")
        console.log('Frames written');
        console.log('Stopping MediaRecorder');
        console.log('Closing StreamWriter');

        const blob = new Blob(recordedChunks, {type: mediaRecorder.mimeType});
        const url = URL.createObjectURL(blob);

        const video = document.createElement('video');
        video.src = url;
        document.body.appendChild(video);
        video.setAttribute('controls', 'true')
        video.play().catch(e => alert(e.message))
    });

    
    console.log('StreamWrite ready');
    console.log('Starting mediarecorder');

    console.log('Creating frames');
    const chunks = await createFrames(ctx, fps, streamWriter, width, height);

    // The timeslice argument is in milliseconds, so one frame at 30 fps is ~33 ms.
    mediaRecorder.start(Math.round(1000 / fps));

    for (const chunk of chunks) {
        await streamWriter.ready;
        //await new Promise(resolve => setTimeout(resolve, 1))
        await streamWriter.write(chunk);
        mediaRecorder.requestData();
    }
    
    //await streamWriter.ready;
    //streamWriter.close();
    //mediaRecorder.stop();

    /*const mediaSource = new MediaSource();
    
    const video = document.createElement('video');
    document.body.appendChild(video);
    video.setAttribute('controls', 'true')

    const url = URL.createObjectURL(mediaSource);
    video.src = url;

    mediaSource.addEventListener('sourceopen', function() {
        var mediaSource = this;
        const sourceBuffer = mediaSource.addSourceBuffer('video/webm; codecs="vp8"');

    let allocationSize = 0;
    chunks.forEach((c) => { allocationSize += c.byteLength});

    var buf = new ArrayBuffer(allocationSize);
    
    chunks.forEach((chunk) => {
        chunk.copyTo(buf);
    });

    sourceBuffer.addEventListener('updateend', function() {
        //mediaSource.endOfStream();
        video.play();
    });

    sourceBuffer.appendBuffer(buf);
    });*/

    //video.play().catch(e => alert(e.message))

    /*mediaStream.getTracks()[0].stop();

    const blob = new Blob(chunks, { type: "video/webm" });
    const url = URL.createObjectURL(blob);

    const video = document.createElement('video');
    video.srcObject = url;
    document.body.appendChild(video);
    video.setAttribute('controls', 'true')
    video.play().catch(e => alert(e.message))*/

    //mediaRecorder.stop();
}

Conclusion / Afterwords

After all that I tried, I had the most problems with turning Frames into Tracks and Tracks into Streams etc. There is so much (poorly documented) converting from one thing to another and half of it is done with streams, which also lack a lot of documentation. There doesn't even seem to be any meaningful way to create custom ReadableStreams and WritableStreams without the use of NPM packages.

I never got VideoFrame duration working. What surprised me the most is that basically nothing else in the process mattered with regards to video or frame length other than adjusting the hacky await new Promise(resolve => setTimeout(resolve, 1000)) timing, but even with that, the recording was really inconsistent. If there was any lag during recording, it would show on the recording; I had recordings where some rectangles were shown for half a second and other ones for 2 seconds. Interestingly enough, the whole recording process would sometimes break completely if I removed the arbitrary setTimeout. A program that would break without the timeout would work with await new Promise(resolve => setTimeout(resolve, 1)). This is usually a clue that this has something to do with JS Event Loops, as setTimeouts with 0ms timings tell JS to "wait for next event loop round".
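
The event loop point can be illustrated in isolation. The sketch below (demoEventLoopOrder is a hypothetical name) records the order in which synchronous code, a promise callback, and a 0 ms setTimeout run. It does not explain why the recorder needed the timeout; it only shows that a setTimeout callback runs on a later event loop turn, after the current script and its pending promise callbacks, which may give other queued work a chance to run in between.

```javascript
// Resolves with the order in which three kinds of work were executed.
function demoEventLoopOrder() {
  const order = [];
  return new Promise((resolve) => {
    // Timer callback: runs on a later turn of the event loop.
    setTimeout(() => {
      order.push("timeout");
      resolve(order);
    }, 0);
    // Promise callback (microtask): runs right after the current code finishes.
    Promise.resolve().then(() => order.push("microtask"));
    // Synchronous code: runs immediately.
    order.push("sync");
  });
}

demoEventLoopOrder().then((order) => {
  console.log(order.join(" -> ")); // sync -> microtask -> timeout
});
```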

I'm still going to work on this a bit, but I'm doubtful I'll make any further progress. I'd like to get this to work without the use of MediaRecorder and by utilizing streams to work out resource issues.

One really interesting thing that I bumped into was that MediaStreamTrackGenerator is actually old news. The w3 documentation only really talks about VideoTrackGenerator and there's an interesting take on how to basically build a VideoTrackGenerator from the existing MediaStreamTrackGenerator. Also note this part specifically:

Interestingly enough, this tells us that MediaStreamTrackGenerator.clone() returns a MediaStreamTrack, which I tried to put to use, but without success.

Anyway, I hope this might give you some new ideas or clarify some things. Maybe you'll figure out something I didn't. Have a good one and do tell if you have questions or figure something out!

Further reading

  • w3 VideoFrame and duration

Edit 1

Forgot to mention that I used OffscreenCanvas and its context instead of a normal Canvas. As we're also talking about performance here, I figured I'd try and see how OffscreenCanvas works.

I also used the second constructor of VideoFrame, that is, I gave it an ArrayBuffer instead of a bitmap image like in your code.

Although you have an accepted Answer, I'll add my two-cents worth of advice...

"Generating each frame takes some time, let's say 1 second, and I would like to generate the video at 10 fps. If generating each frame takes 1 second, despite indicating timestamp and duration, the video is played at 1frame/sec, not 10fps.

How can I encode the video frames at the desired frame rate?"

Encoding your for loop of 10 bitmaps at 10 frames per second gives you a video with a duration of 1 second (it steps through all 10 frames during that 1-second interval).

What you want, then, is a new frame every 100 ms until these 10 frames add up to 1000 ms.

To achieve that 10 FPS, you simply...

  • First pause the recorder with mediaRecorder.pause();
  • Now generate your bitmap (this process can take any length of time)
  • When frame/bitmap is ready, then resume the recorder for 100ms with mediaRecorder.resume();
  • To achieve the 100ms per frame, you can use a Timer that re-pauses the recording.

Think of it as using a camcorder, where you:
press record -> await 100 ms of capture -> pause -> new frame -> repeat from "press record" until you have 10 frames.

Here is a quick-ish example as a starting point (eg: readers should improve upon it):

<!DOCTYPE html>
<html>
<body>

<button onclick="recorder_Setup()">Create Video</button>

<h2 id="demo"></h2>

<script>

//# Create canvas for dummy frames
const canvas = document.createElement('canvas');
canvas.width = 256;
canvas.height = 256;
document.body.appendChild(canvas);
const ctx = canvas.getContext('2d');


const recordedChunks = [];
var mediaRecorder; var generator; var writer 
var stream; var frame; var frameCount = 0;

//# needed to pause the function, whilst recorder stores frames at the specified interval
const sleep = ( sleep_time ) => { return new Promise(resolve => setTimeout(resolve, sleep_time) ) }

//# setup recorder
recorder_Setup(); //# create and start recorder here

function getRandomRgb() 
{
  var num = Math.round(0xffffff * Math.random());
  var r = num >> 16;
  var g = num >> 8 & 255;
  var b = num & 255;
  return 'rgb(' + r + ', ' + g + ', ' + b + ')';
}
 
function recorder_Setup()
{

    //# create media generator track
    generator = new MediaStreamTrackGenerator({kind: 'video'});
    writer = generator.writable.getWriter();
    stream = new MediaStream();
    stream.addTrack(generator);
    
    var myObj = {
                    mimeType: "video/webm", 
                    videoBitsPerSecond: 3_000_000 // 3 Mbps (bits, not bytes)
                };
                
    mediaRecorder = new MediaRecorder( stream, myObj );
    mediaRecorder.addEventListener('dataavailable',  (event) => { onFrameData( event.data ); }  );
    mediaRecorder.addEventListener("stop", (event) => { recorder_Stop() } );
    
    //# start the recorder... and start adding frames
    mediaRecorder.start();
    recorder_addFrame();

}

function onFrameData(  input )
{
    //console.log( "got frame data... frame count v2 : " + frameCount );
    recordedChunks.push( input );
}


async function recorder_addFrame ()
{
    mediaRecorder.pause();
    
    await new Promise(resolve => setTimeout(resolve, 1000) ) 
    
    //# add text for frame number
    ctx.fillStyle = "#808080";
    ctx.fillRect(0, 0, 256, 256);
    
    ctx.font = "30px Arial"; ctx.fillStyle = "#FFFFFF";
    ctx.fillText("frame : " + frameCount ,10,50);
    
    //# add color fill for frame pixels
    ctx.fillStyle = getRandomRgb();
    ctx.fillRect(0, 70, 256, 180);

    //# note "timestamp" and "duration" don't mean anything here...
    frame = new VideoFrame( await createImageBitmap(canvas), {timestamp: 0, duration: 0} );
    
    console.log( "frame count v1 : " + frameCount );
    
    frameCount++;
    
    //# When ready to write a frame, resume the recorder for the required interval
    //# (eg: 10 FPS = 1000/10 = 100 ms per frame within each 1000 ms of video)...
    mediaRecorder.resume();
    await sleep(100);
    
    writer.write(frame);
    frame.close();
    
    if( frameCount >= 10 ) { mediaRecorder.stop(); }
    else { recorder_addFrame(); }

}

function recorder_Stop()
{
    console.log("recorder stopped");

    stream.getTracks().forEach(track => track.stop());

    const blob = new Blob(recordedChunks, {type: mediaRecorder.mimeType});
    const url = URL.createObjectURL(blob);

    const video = document.createElement('video');
    video.src = url;
    document.body.appendChild(video);
    video.setAttribute('controls', 'true')
    video.setAttribute('muted', 'true')
    //video.play().catch(e => alert(e.message))

}

</script>

</body>
</html>