I want to record a user's screen alongside their webcam and display the result as an overlay, like this:
I assume that while recording I can display the multiple streams in two separate video elements and overlay them with CSS.
However, how do I save the result as an overlay of the two videos?
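For reference, a rough sketch of how I imagine displaying the two streams during recording (the element IDs and layout values below are just placeholders):

// Display-only sketch: two streams in two <video> elements, webcam overlaid with CSS.
// #screenVideo and #camVideo are hypothetical <video autoplay muted> elements inside
// a container that has position: relative.
async function showOverlayPreview() {
  const screenStream = await navigator.mediaDevices.getDisplayMedia({ video: true });
  const camStream = await navigator.mediaDevices.getUserMedia({ video: true });
  const screenVideo = document.getElementById("screenVideo");
  const camVideo = document.getElementById("camVideo");
  screenVideo.srcObject = screenStream;
  camVideo.srcObject = camStream;
  // Position the webcam feed over the bottom-left corner of the screen share
  camVideo.style.position = "absolute";
  camVideo.style.left = "0";
  camVideo.style.bottom = "0";
  camVideo.style.width = "25%";
}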
- You tried this? stackoverflow.com/questions/37404860/… – amiad Commented Mar 23, 2022 at 22:34
- Have you tried RecordRTC - Record Camera and Screen into Single WebM. Repo - github.com/muaz-khan/RecordRTC Demo - webrtc-experiment.com/RecordRTC/simple-demos/… – Olie Cape Commented Mar 24, 2022 at 8:14
- Thanks! I'll have a look at RecordRTC – amitairos Commented Mar 24, 2022 at 12:20
- Not clear what expected format the recorded "video" should be output as. I mean FLV and MOV can both hold RGB pixel data. Is any format acceptable because you plan to manually convert with a tool like FFmpeg? Maybe you want a video file that's already playable in browser (without converting), so you actually need MP4 (for all browsers) or WebM (for Chrome/Firefox)... As you can see, your question is not telling us enough to help you. The image shown above, is it your current result with CSS layering? – VC.One Commented Mar 29, 2022 at 8:45
2 Answers
This can be achieved in pure JS as follows -
- Fetch Webcam Stream via getUserMedia()
- Fetch Screen Share Stream via getDisplayMedia()
- Merge both streams using some math & canvas operations
- Use canvas.captureStream() to generate the composite video stream.
- Use AudioContext to merge audio clips (especially needed if using both microphone & system audio).
- Use MediaStream constructor to create a new stream using - the video from the new stream + audio from audioContext Destination Node as follows -
new MediaStream([...newStream.getVideoTracks(), ...audioDestination.stream.getTracks()]);
- Use newly generated MediaStream as required (i.e. replace in RTCPeerConnection, etc.).
- In this example - MediaRecorder API is used to record the resulting composite/picture-in-picture video.
- Recording begins when the "Record Resulting Stream" button is clicked. The final result can be downloaded upon clicking the "Stop Recording and Download Resulting Stream" button.
PS: The embedded snippet won't be able to fetch the camera/screen share, so please use this CodePen link to see it in action.
let localCamStream,
localScreenStream,
localOverlayStream,
rafId,
cam,
screen,
mediaRecorder,
audioContext,
audioDestination,
overlay; // <video> element that previews the composite stream
let mediaWrapperDiv = document.getElementById("mediaWrapper");
let startWebcamBtn = document.getElementById("startWebcam");
let startScreenShareBtn = document.getElementById("startScreenShare");
let mergeStreamsBtn = document.getElementById("mergeStreams");
let startRecordingBtn = document.getElementById("startRecording");
let stopRecordingBtn = document.getElementById("stopRecording");
let stopAllStreamsBtn = document.getElementById("stopAllStreams");
let canvasElement = document.createElement("canvas");
let canvasCtx = canvasElement.getContext("2d");
let encoderOptions = {
mimeType: "video/webm; codecs=vp9"
};
let recordedChunks = [];
let audioTracks = [];
/**
* Internal Polyfill to simulate
* window.requestAnimationFrame
* since the browser will kill canvas
* drawing when tab is inactive
*/
const requestVideoFrame = function(callback) {
return window.setTimeout(function() {
callback(Date.now());
}, 1000 / 60); // 60 fps - just like requestAnimationFrame
};
/**
* Internal polyfill to simulate
* window.cancelAnimationFrame
*/
const cancelVideoFrame = function(id) {
clearTimeout(id);
};
async function startWebcamFn() {
localCamStream = await navigator.mediaDevices.getUserMedia({
video: true,
audio: {
// "communications" is a Windows-specific default device id;
// on other platforms you may want plain `audio: true` instead
deviceId: {
exact: "communications"
}
}
});
if (localCamStream) {
cam = await attachToDOM("justWebcam", localCamStream);
}
}
async function startScreenShareFn() {
localScreenStream = await navigator.mediaDevices.getDisplayMedia({
video: true,
audio: true
});
if (localScreenStream) {
screen = await attachToDOM("justScreenShare", localScreenStream);
}
}
async function stopAllStreamsFn() {
[
...(localCamStream ? localCamStream.getTracks() : []),
...(localScreenStream ? localScreenStream.getTracks() : []),
...(localOverlayStream ? localOverlayStream.getTracks() : [])
].map((track) => track.stop());
localCamStream = null;
localScreenStream = null;
localOverlayStream = null;
cancelVideoFrame(rafId);
mediaWrapperDiv.innerHTML = "";
document.getElementById("recordingState").innerHTML = "";
}
async function makeComposite() {
if (cam && screen) {
canvasCtx.save();
canvasElement.setAttribute("width", `${screen.videoWidth}px`);
canvasElement.setAttribute("height", `${screen.videoHeight}px`);
canvasCtx.clearRect(0, 0, screen.videoWidth, screen.videoHeight);
canvasCtx.drawImage(screen, 0, 0, screen.videoWidth, screen.videoHeight);
canvasCtx.drawImage(
cam,
0,
Math.floor(screen.videoHeight - screen.videoHeight / 4),
Math.floor(screen.videoWidth / 4),
Math.floor(screen.videoHeight / 4)
); // this is just a rough calculation to offset the webcam stream to bottom left
let imageData = canvasCtx.getImageData(
0,
0,
screen.videoWidth,
screen.videoHeight
); // this makes it work
canvasCtx.putImageData(imageData, 0, 0); // properly on safari/webkit browsers too
canvasCtx.restore();
rafId = requestVideoFrame(makeComposite);
}
}
async function mergeStreamsFn() {
document.getElementById("mutingStreams").style.display = "block";
await makeComposite();
audioContext = new AudioContext();
audioDestination = audioContext.createMediaStreamDestination();
let fullVideoStream = canvasElement.captureStream();
let existingAudioStreams = [
...(localCamStream ? localCamStream.getAudioTracks() : []),
...(localScreenStream ? localScreenStream.getAudioTracks() : [])
];
// Create one source node per available audio track (mic and/or system audio);
// this also avoids an error when neither stream has an audio track
existingAudioStreams.forEach((audioTrack) => {
audioTracks.push(
audioContext.createMediaStreamSource(new MediaStream([audioTrack]))
);
});
audioTracks.map((track) => track.connect(audioDestination));
console.log(audioDestination.stream);
localOverlayStream = new MediaStream([...fullVideoStream.getVideoTracks()]);
let fullOverlayStream = new MediaStream([
...fullVideoStream.getVideoTracks(),
...audioDestination.stream.getTracks()
]);
console.log(localOverlayStream, existingAudioStreams);
if (localOverlayStream) {
overlay = await attachToDOM("pipOverlayStream", localOverlayStream);
mediaRecorder = new MediaRecorder(fullOverlayStream, encoderOptions);
mediaRecorder.ondataavailable = handleDataAvailable;
overlay.volume = 0;
cam.volume = 0;
screen.volume = 0;
cam.style.display = "none";
// localCamStream.getAudioTracks().map(track => { track.enabled = false });
screen.style.display = "none";
// localScreenStream.getAudioTracks().map(track => { track.enabled = false });
}
}
async function startRecordingFn() {
mediaRecorder.start();
console.log(mediaRecorder.state);
console.log("recorder started");
document.getElementById("pipOverlayStream").style.border = "10px solid red";
document.getElementById(
"recordingState"
).innerHTML = `${mediaRecorder.state}...`;
}
async function attachToDOM(id, stream) {
let videoElem = document.createElement("video");
videoElem.id = id;
videoElem.width = 640;
videoElem.height = 360;
videoElem.autoplay = true;
videoElem.setAttribute("playsinline", true);
videoElem.srcObject = new MediaStream(stream.getTracks());
mediaWrapperDiv.appendChild(videoElem);
return videoElem;
}
function handleDataAvailable(event) {
console.log("data-available");
if (event.data.size > 0) {
recordedChunks.push(event.data);
console.log(recordedChunks);
download();
}
}
function download() {
var blob = new Blob(recordedChunks, {
type: "video/webm"
});
var url = URL.createObjectURL(blob);
var a = document.createElement("a");
document.body.appendChild(a);
a.style = "display: none";
a.href = url;
a.download = "result.webm";
a.click();
window.URL.revokeObjectURL(url);
}
function stopRecordingFn() {
mediaRecorder.stop();
document.getElementById(
"recordingState"
).innerHTML = `${mediaRecorder.state}...`;
}
startWebcamBtn.addEventListener("click", startWebcamFn);
startScreenShareBtn.addEventListener("click", startScreenShareFn);
mergeStreamsBtn.addEventListener("click", mergeStreamsFn);
stopAllStreamsBtn.addEventListener("click", stopAllStreamsFn);
startRecordingBtn.addEventListener("click", startRecordingFn);
stopRecordingBtn.addEventListener("click", stopRecordingFn);
div#mediaWrapper,
div#buttonWrapper {
display: flex;
flex: 1 1 100%;
flex-flow: row nowrap;
}
div#mediaWrapper video {
border: 1px solid black;
margin: 1px;
max-width: 33%;
height: auto;
}
div#mediaWrapper video#pipOverlayStream {
max-width: 100% !important;
}
div#buttonWrapper button {
border-radius: 0.25rem;
color: #ffffff;
display: inline-block;
font-size: 1rem;
font-weight: bold;
line-height: 1.6;
padding: 0.375rem 0.75rem;
text-align: center;
-webkit-user-select: none;
-moz-user-select: none;
-ms-user-select: none;
user-select: none;
vertical-align: middle;
margin: 5px;
cursor: pointer;
}
div#buttonWrapper button#startWebcam {
background-color: #007bff;
border: 1px solid #007bff;
}
div#buttonWrapper button#startScreenShare {
background-color: #17a2b8;
border: 1px solid #17a2b8;
}
div#buttonWrapper button#mergeStreams {
background-color: #28a745;
border: 1px solid #28a745;
}
div#buttonWrapper button#startRecording {
background-color: #17a2b8;
border: 1px solid #17a2b8;
}
div#buttonWrapper button#stopRecording {
background-color: #000000;
border: 1px solid #000000;
}
div#buttonWrapper button#stopAllStreams {
background-color: #dc3545;
border: 1px solid #dc3545;
}
<main>
<p>This demo is a proof-of-concept solution for this <a href="https://stackoverflow.com/questions/71557879" target="_blank" rel="noopener noreferrer">StackOverflow question</a> and <a href="https://stackoverflow.com/questions/37404860" target="_blank"
rel="noopener noreferrer">also this one</a> - as long as you make the required changes<br>i.e. <b>mimeType: "video/mp4; codecs=h264"</b> instead of <b>mimeType: "video/webm; codecs=vp9"</b><br>AND<br><b>type: "video/mp4"</b> instead of <b>type: "video/webm"</b><br>AND<br><b>result.mp4</b> instead of <b>result.webm</b></p>
<h2>Click on "Start Webcam" to get started. </h2>
<h3>
Core Idea:<br>
<ol>
<li>Fetch Webcam Stream via getUserMedia()</li>
<li>Fetch Screen Share Stream via getDisplayMedia()</li>
<li>Merge both streams using some math & canvas operations</li>
<li>Use canvas.captureStream() to generate the composite video stream.</li>
<li>Use AudioContext to merge audio clips (especially needed if using both microphone & system audio).</li>
<li>Use MediaStream constructor to create a new stream using - the video from the new stream + audio from audioContext Destination Node as follows -<br><br>
<code>new MediaStream([...newStream.getVideoTracks(), ...audioDestination.stream.getTracks()]);</code>
</li><br>
<li>Use newly generated MediaStream as required (i.e. replace in RTCPeerConnection, etc.).</li>
<li>In this example - MediaRecorder API is used to record the resulting composite/picture-in-picture video. Recording begins when the "Record Resulting Stream" button is clicked. The final result can be downloaded upon clicking the "Stop Recording and
Download Resulting Stream" button</li>
</ol>
</h3>
<div id="mediaWrapper"></div>
<div id="buttonWrapper">
<button id="startWebcam" title="Start Webcam">Start Webcam</button>
<button id="startScreenShare" title="Start Screen Share">Start Screen Share</button>
<button id="mergeStreams" title="Merge Streams">Merge Streams</button>
<button id="startRecording" title="Record Resulting Stream">Record Resulting Stream</button>
<button id="stopRecording" title="Stop Recording and Download Resulting Stream">Stop Recording and Download Resulting Stream</button>
<button id="stopAllStreams" title="Stop All Streams">Stop All Streams</button>
</div>
<div id="helpText">
<h1 id="recordingState"></h1><br>
<h2 id="mutingStreams">
Note: In a WebRTC setting, you wouldn't be hearing your own voice or the screen share audio via the "video" tag. The same has been simulated here by ensuring that all video tags have a "volume = 0". Removing this will create a loopback hell which you
do not want.<br><br> Another way to avoid this issue is to ensure that the video tags are created with ONLY video stream tracks using <em style="color: blue;">new MediaStream([...source.getVideoTracks()])</em> during the srcObject assignment.
</h2>
<h1>
Remember to send the correct stream (with both audio and video) to the rest of the peers though.
</h1>
</div>
</main>
Easier solution
Grab each frame from the first video, draw it onto a canvas, and then draw the matching frame from the second video onto the same canvas as well (using either custom code or a library like html2canvas). Next, the easiest thing to do would be to send the frames one by one to the server and run a simple FFmpeg command (something like ffmpeg -i img%03d.png -c:v libx264 -pix_fmt yuv420p out.mp4) to generate the MP4, which you would then send back to the client.
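For the client-side half of this approach, a rough sketch might look like the following (the /frames upload endpoint is hypothetical, and the ffmpeg step runs on the server):

// Composite both videos onto a canvas and upload each frame as a numbered PNG,
// matching the img%03d.png pattern expected by the ffmpeg command above.
const frameCanvas = document.createElement("canvas");
const frameCtx = frameCanvas.getContext("2d");
let frameIndex = 0;

async function captureAndUploadFrame(screenVideo, camVideo) {
  frameCanvas.width = screenVideo.videoWidth;
  frameCanvas.height = screenVideo.videoHeight;
  frameCtx.drawImage(screenVideo, 0, 0, frameCanvas.width, frameCanvas.height);
  // Webcam in the bottom-left quarter, mirroring the overlay layout
  frameCtx.drawImage(camVideo, 0, frameCanvas.height * 0.75, frameCanvas.width / 4, frameCanvas.height / 4);

  const blob = await new Promise((resolve) => frameCanvas.toBlob(resolve, "image/png"));
  const formData = new FormData();
  formData.append("frame", blob, `img${String(frameIndex++).padStart(3, "0")}.png`);
  await fetch("/frames", { method: "POST", body: formData }); // hypothetical endpoint
}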
More complicated solution
Rendering the video on the client side is also possible: you can generate .webm files directly in the browser using a library called whammy.js. Once again, you would need to draw all the frames to a canvas and then add each one to the Whammy video object via encoder.add().
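A minimal sketch of that approach, assuming the classic whammy.js API (Whammy.Video(fps), encoder.add(canvas), and a synchronous encoder.compile() that returns a WebM Blob):

// Add canvas frames to the encoder as they are drawn, then compile to a WebM blob.
// Note: Whammy relies on canvas.toDataURL("image/webp"), which effectively limits it to Chrome.
const encoder = new Whammy.Video(30); // 30 fps

function addFrame(canvas) {
  encoder.add(canvas); // the canvas should already contain the composited screen + webcam frame
}

function finishRecording() {
  const webmBlob = encoder.compile();
  const url = URL.createObjectURL(webmBlob);
  const a = document.createElement("a");
  a.href = url;
  a.download = "result.webm";
  a.click();
  URL.revokeObjectURL(url);
}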