Is it possible to use the W3C Web Speech API to write JavaScript code that generates an audio file (WAV, Ogg, or MP3) of a voice speaking a given text? I mean, I want to do something like:
window.speechSynthesis.speak(new SpeechSynthesisUtterance("0 1 2 3"))
but I want the generated sound to be written to a file instead of being output to the speakers.
asked Aug 2, 2016 at 18:13 by user983447
1 Answer
This is not possible using the Web Speech API alone; see "Re: MediaStream, ArrayBuffer, Blob audio result from speak() for recording?" and "How to implement option to return Blob, ArrayBuffer, or AudioBuffer from window.speechSynthesis.speak() call".
It is possible with a library, however, for example espeak or meSpeak; see "How to create or convert text to audio at chromium browser?".
// Fetch the meSpeak library source and inject it as an inline script.
fetch("https://gist.githubusercontent.com/guest271314/f48ee0658bc9b948766c67126ba9104c/raw/958dd72d317a6087df6b7297d4fee91173e0844d/mespeak.js")
  .then(response => response.text())
  .then(text => {
    const script = document.createElement("script");
    script.textContent = text;
    document.body.appendChild(script);
    // Load the configuration and the English voice in parallel.
    return Promise.all([
      new Promise(resolve => {
        meSpeak.loadConfig("https://gist.githubusercontent.com/guest271314/8421b50dfa0e5e7e5012da132567776a/raw/501fece4fd1fbb4e73f3f0dc133b64be86dae068/mespeak_config.json", resolve);
      }),
      new Promise(resolve => {
        meSpeak.loadVoice("https://gist.githubusercontent.com/guest271314/fa0650d0e0159ac96b21beaf60766bcc/raw/82414d646a7a7ef11bb04ddffe4091f78ef121d3/en.json", resolve);
      })
    ]);
  })
  .then(() => {
    // takes approximately 14 seconds to get here
    console.log(meSpeak.isConfigLoaded());
    // rawdata: "mime" asks meSpeak to return the audio as a data URL
    // instead of playing it.
    console.log(meSpeak.speak("what it do my ninja", {
      amplitude: 100,
      pitch: 5,
      speed: 150,
      wordgap: 1,
      variant: "m7",
      rawdata: "mime"
    }));
  })
  .catch(err => console.log(err));
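The snippet above only logs the value returned by meSpeak.speak(). To turn it into an actual file, something like the following might work. It is a minimal sketch that assumes the returned value is a base64 data URL string; dataUrlToBytes and downloadDataUrl are hypothetical helper names, not part of meSpeak.

```javascript
// Hypothetical helper: decode a base64 data URL into its MIME type and bytes.
function dataUrlToBytes(dataUrl) {
  const comma = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || comma === -1) {
    throw new TypeError("expected a data URL");
  }
  const header = dataUrl.slice("data:".length, comma);
  if (!header.endsWith(";base64")) {
    throw new TypeError("expected base64-encoded data");
  }
  const mimeType = header.slice(0, -";base64".length) || "application/octet-stream";
  const binary = atob(dataUrl.slice(comma + 1));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return { mimeType, bytes };
}

// Hypothetical helper, browser only: offer the decoded audio as a download.
function downloadDataUrl(dataUrl, filename) {
  const { mimeType, bytes } = dataUrlToBytes(dataUrl);
  const url = URL.createObjectURL(new Blob([bytes], { type: mimeType }));
  const link = document.createElement("a");
  link.href = url;
  link.download = filename;
  document.body.appendChild(link);
  link.click();
  link.remove();
  // Release the object URL once the download has had a chance to start.
  setTimeout(() => URL.revokeObjectURL(url), 1000);
}
```

Under that assumption, downloadDataUrl(meSpeak.speak("0 1 2 3", { rawdata: "mime" }), "speech.wav") would save the spoken text as a file; the filename is only an example.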
There is also a workaround using MediaRecorder, which depends on the system's hardware; see "How to capture generated audio from window.speechSynthesis.speak() call?".
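A rough sketch of the MediaRecorder idea follows. It records the default audio input while the utterance plays, so it only captures the synthesized voice if that input is routed from the system's audio output (for example a "monitor" device); otherwise it records the microphone. recordSpeech is a hypothetical name, and the env parameter exists only so the function can be tried with stand-ins outside a browser.

```javascript
// Hypothetical helper: record the audio input while `text` is being spoken.
// Resolves with a Blob in whatever container the browser's MediaRecorder uses.
function recordSpeech(text, env = globalThis) {
  return env.navigator.mediaDevices.getUserMedia({ audio: true }).then(stream => {
    const recorder = new env.MediaRecorder(stream);
    const utterance = new env.SpeechSynthesisUtterance(text);
    const chunks = [];
    recorder.ondataavailable = event => chunks.push(event.data);

    const recorded = new Promise((resolve, reject) => {
      recorder.onstop = () => {
        // Release the input device, then hand back the recording.
        stream.getTracks().forEach(track => track.stop());
        resolve(new env.Blob(chunks, { type: recorder.mimeType }));
      };
      utterance.onerror = event => {
        reject(new Error("speech synthesis failed: " + event.error));
        recorder.stop();
      };
    });

    // Stop recording as soon as the utterance has finished.
    utterance.onend = () => recorder.stop();
    recorder.start();
    env.speechSynthesis.speak(utterance);
    return recorded;
  });
}
```

In a browser, recordSpeech("0 1 2 3").then(blob => console.log(blob)) would log the recorded Blob, which can then be saved or played back through an object URL. The result is typically WebM or Ogg rather than WAV or MP3, depending on the browser.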