video streaming - Is there a way that browsers understand to interleave an mjpeg stream served through HTTP with an MP3 and play

I'm serving an MJPEG stream and an MP3 stream from an ESP32. The individual JPEG frames are compressed by the camera module itself, but the microcontroller is not powerful enough to decode them and re-encode them into an H.264 stream in real time. I also use a microphone, and the ESP32 compresses the audio to MP3 in real time.

I'm serving the MJPEG in a raw HTTP response like so:

HTTP/1.1 200 OK
Content-Type: multipart/x-mixed-replace;boundary=123456789000000000000987654321
Transfer-Encoding: chunked
Access-Control-Allow-Origin: *
X-Framerate: 60
--123456789000000000000987654321
Content-Type: image/jpeg
Content-Length: 15021
X-Timestamp: 556.000000

...jpg_frame_bytes...
--123456789000000000000987654321
Content-Type: image/jpeg
Content-Length: 14680
X-Timestamp: 557.000000

...jpg_frame_bytes...

and so on

The HTTP response for the MP3 looks like this:

HTTP/1.1 200 OK
Content-Type: audio/mpeg
Access-Control-Allow-Origin: *

...mp3_data...
...mp3_data...
...mp3_data...

Opening both in a browser (Chrome) works fine: the MJPEG starts playing immediately, while the MP3 buffers for a few seconds before it starts to play. That is annoying, but it works.

I know that MJPEG itself does not support sound, but I have also read that various IP cameras include audio data either in the JPEG metadata or interleaved with the MJPEG stream in some other way.

Is there a way to send MJPEG and MP3 in a single HTTP response such that browsers (or at least Chrome) will understand it and play both more or less in sync?

Edit: I looked at MKV as a container, which apparently supports both MJPEG video and MP3 audio, but the only MKV muxing libraries I found are too big to fit on the ESP32.

asked yesterday by Crumml, edited yesterday

1 Answer

Without good timestamps for the elementary streams, a container like Matroska/WebM isn't going to help you anyway.

Since your microcontroller timing is essentially "best effort", what I suggest you do is play both streams as you are now, but drift your MP3 stream as close to realtime as possible.

There are several reasons why a browser may buffer your MP3 stream excessively. One common issue is that it doesn't know or trust the content type it was given and has to buffer some 256 KB of data to "sniff" the type. Your Content-Type header looks good here, so I doubt that's the case. The most likely cause is that the browser wants to be sure the data rate can keep up for uninterrupted playback. It doesn't know (or care) that this is a realtime stream, and it wants a full buffer before starting so that the buffer doesn't repeatedly underrun and annoy the user. These buffers tend to be fixed byte sizes, which take a while to fill at low bitrates.
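To put rough numbers on the fixed-size buffer point, here is a small TypeScript sketch of the arithmetic. The function name `secondsToFill` and the 64 KiB buffer size are assumptions made for illustration only; actual browser buffer sizes are not documented in this thread and will differ.

```typescript
// Time needed to fill a fixed-size byte buffer from a constant-bitrate stream.
// The buffer size used in the example is a hypothetical figure, not a measured browser value.
function secondsToFill(bufferBytes: number, bitrateKbps: number): number {
  if (bufferBytes < 0 || bitrateKbps <= 0) {
    throw new RangeError("bufferBytes must be >= 0 and bitrateKbps must be > 0");
  }
  const bitsPerSecond = bitrateKbps * 1000; // MP3 bitrates are quoted in units of 1000 bit/s
  return (bufferBytes * 8) / bitsPerSecond;
}

const exampleBufferBytes = 64 * 1024; // hypothetical 64 KiB start-up buffer
console.log(secondsToFill(exampleBufferBytes, 32).toFixed(1));  // prints "16.4" (32 kbit/s stream)
console.log(secondsToFill(exampleBufferBytes, 128).toFixed(1)); // prints "4.1" (128 kbit/s stream)
```

Under these assumed numbers, quadrupling the bitrate cuts the start-up delay to a quarter, which is why a low-bitrate microphone stream can feel so slow to start.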

So, how to work around it? Once playback has started, nudge the playback rate up to something like 1.05 or even 1.1. Eventually playback will catch up to as close to realtime as it can get and will then automatically continue at 1.0. At that point you'll have a very small buffer, but the audio will be close to realtime and close to the MJPEG stream.
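A minimal TypeScript sketch of that rate nudge might look like the following. The names `MediaLike`, `bufferedAhead`, `chooseRate` and `nudgeTowardsLive`, the 0.5 second target, and the 1.1 fast rate are all assumptions for illustration. Unlike the description above, this sketch sets the rate back to 1.0 explicitly once the buffer is small, rather than relying on the browser to settle on its own.

```typescript
// Minimal structural view of a media element; a real HTMLAudioElement satisfies it.
interface MediaLike {
  currentTime: number;
  playbackRate: number;
  buffered: { length: number; end(index: number): number };
}

// Seconds of audio that have been received but not yet played (0 if nothing is buffered).
function bufferedAhead(media: MediaLike): number {
  const ranges = media.buffered;
  if (ranges.length === 0) return 0;
  return Math.max(0, ranges.end(ranges.length - 1) - media.currentTime);
}

// Speed up while more than targetSeconds are buffered, otherwise play at normal speed.
// Both default values are assumed starting points, not tuned numbers.
function chooseRate(aheadSeconds: number, targetSeconds = 0.5, fastRate = 1.1): number {
  return aheadSeconds > targetSeconds ? fastRate : 1.0;
}

// Apply the policy once and return the rate now in effect.
// Intended to be called periodically, e.g. from a timer or a "timeupdate" handler.
function nudgeTowardsLive(media: MediaLike): number {
  const rate = chooseRate(bufferedAhead(media));
  if (media.playbackRate !== rate) media.playbackRate = rate;
  return rate;
}
```

In a page this could be driven with something like `setInterval(() => nudgeTowardsLive(audio), 500)`, where `audio` is the `<audio>` element playing the MP3 stream. A target that is too small risks underruns, so the 0.5 second value is only a guess to tune against the actual stream.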
