javascript - How can I make my web browser speak programmatically? - Stack Overflow

Is it possible to have a website speak a welcome message to users programmatically?

Suppose I wanted to greet users with an audio message upon successful login to my website. I know that I could record a greeting message (e.g. as an MP3) and play that, but I want to do this programmatically, since every user's name is different.

For example, I might want to say Welcome, John Doe when John Doe logs in.

How could I do this with plain JavaScript?

NOTE: This is not intended for use in a production system, but rather intended to be used as a smaller portion of a bigger UX experiment.

Asked Jan 29, 2018 at 15:04 by grizzthedj; edited Feb 4, 2018 at 17:39.
  • 41 Don’t. That is awful UX. – Martin Bean Commented Jan 29, 2018 at 16:38
  • 4 If you just want to do this for yourself or for fun, knock yourself out, but if you are building something which faces the public, then avoid this as it is not a good user experience – Huangism Commented Jan 29, 2018 at 17:29
  • 1 I would suggest that in a canonical Question/Answer, the Answer should contain substantial relevant details on the subject matter; see What is a canonical question/answer, and what is their purpose?. This Answer currently omits critical details on how the Web Speech API is actually implemented in different browsers. Usage of window.speechSynthesis.getVoices() and the onvoiceschanged event cannot be ignored for interop. – guest271314 Commented Jan 29, 2018 at 19:00
  • 3 "greet users with an audio message upon successful login" is surely "awful ux". Although there may be practical reasons to use audio this way, that is not one of them under normal circumstances. – No Results Found Commented Jan 29, 2018 at 19:17
  • 2 Related: Should I add sound effects to my web site? Why is sound sparingly used on websites? Should we use a sound/jingle when users arrive on our site or open our app? – Bernhard Barker Commented Jan 29, 2018 at 20:31

4 Answers

For window.speechSynthesis.speak() to render audio output in the Chromium browser (on Linux), the user needs to have speech-dispatcher installed and must launch the browser with the --enable-speech-dispatcher flag.

  • How to use Web Speech API at chromium?

The onvoiceschanged event handler and window.speechSynthesis.getVoices() need to be used to populate the list of available voices. The API is not straightforward: .getVoices() may need to be called twice before the returned array contains any SpeechSynthesisVoice objects.
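As a rough sketch of that pattern, the helper below checks getVoices() once and, if the list is still empty, waits for onvoiceschanged before reading it again. The name loadVoices and the timeoutMs parameter are my own and not part of the Web Speech API; the synth argument is assumed to behave like window.speechSynthesis.

```javascript
// Sketch: resolve with the voice list once the browser has populated it.
// `synth` is expected to behave like window.speechSynthesis. `timeoutMs` is a
// hypothetical safety limit, not something the Web Speech API defines.
function loadVoices(synth, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const initial = synth.getVoices();
    if (initial.length > 0) {
      resolve(initial); // voices were already available on the first call
      return;
    }
    let timer = null;
    const finish = () => {
      clearTimeout(timer);
      synth.onvoiceschanged = null;
      resolve(synth.getVoices()); // second call, after the list has changed
    };
    // Browsers that fill the list asynchronously fire `voiceschanged`.
    synth.onvoiceschanged = finish;
    // Give up after the timeout and return whatever is there (possibly []).
    timer = setTimeout(finish, timeoutMs);
  });
}
```

In a browser this could be called as loadVoices(window.speechSynthesis).then(voices => console.log(voices.map(v => v.name))). Whether the list ever fills still depends on the platform having a speech engine installed.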

Note that calls to .speak() can be placed in a queue without being rendered as audio output, which is not immediately evident. Calling window.speechSynthesis.cancel() clears the queue, after which audio output may be rendered unexpectedly.

  • speechSynthesis.getVoices() is empty array in Chromium Fedora

You can then use window.speechSynthesis.speak().
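A minimal sketch of that call, with the queue cleared first because of the stuck-queue behaviour noted above. The function name speakNow is hypothetical, and passing synth and Utterance as parameters is only my assumption so the function can be tried with stand-ins outside a browser.

```javascript
// Sketch: clear any stale queue, then speak `text`.
// `synth` and `Utterance` default to the browser globals when they exist.
function speakNow(
  text,
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  if (!synth || !Utterance) {
    return false; // Web Speech API not available here
  }
  synth.cancel(); // drop queued utterances that never produced audio
  synth.speak(new Utterance(text));
  return true;
}
```

In a page, speakNow('Welcome, John Doe') would be enough. The return value only says whether the API was found, not whether audio was actually produced.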

I have been trying for some time now to get SSML parsing enabled by default in the Chromium browser on *nix, without using an external web service that either requires some form of end-user agreement (EUA) or is not free as in beer.

The list of parties I have contacted and questions I have asked to achieve this is quite lengthy, for example:

  • SSML parsing implementation at browsers

  • How to extract SSML parsing code of espeak to implement SSML parsing at SpeechSynthesisUtterance?

  • How to set SSML parsing to on at user configuration file?

  • Why hasn't Issue 88072 and Issue 795371 been answered? Are Internals>SpeechSynthesis and Blink>Speech dead?

Firefox on *nix also does not parse SSML.

Perhaps with more interest from users at large we can finally get this feature enabled by default.

There are workarounds for SSML parsing that do not use an external web service. The question at the first link below is still unanswered, though it includes PHP code that calls the binary using shell_exec() after a $_POST to a local server:

  • How to programmatically send a unix socket command to a system server autospawned by browser or convert JavaScript to C++ source code for Chromium?

  • SpeechSynthesisSSMLParser

Note that there are several bugs in the current Web Speech API implementations, notably that changing the volume property of a SpeechSynthesisUtterance has no effect on audio output in either Chromium or Firefox:

  • Setting SpeechSynthesisUtterance.volume does not change volume of audio output of speechSynthesis.speak()

There is also a bug when using .pause() and .resume(), which I encountered when trying to programmatically parse the <break> element of SSML:

  • "speak speak slash" is audio output of .speak() following two calls to .speak(), .pause() and .resume()
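One possible way to approximate <break> without .pause() and .resume() is to split the text at the break elements and speak each chunk as its own utterance, with a timer in between. This is only a sketch under my own assumptions: splitOnBreaks and speakChunks are hypothetical names, only the time="…ms" and time="…s" forms are recognised, every other tag is stripped, and this is not a real SSML parser.

```javascript
// Sketch: split an SSML string at <break> elements into plain-text chunks.
// Each chunk records how long to wait before it is spoken.
function splitOnBreaks(ssml, defaultPauseMs = 500) {
  const breakTag = /<break\b([^>]*)>/gi;
  const chunks = [];
  let lastIndex = 0;
  let pauseBefore = 0;
  const pushChunk = (raw) => {
    // Strip any remaining tags and collapse whitespace.
    const text = raw.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
    if (text) {
      chunks.push({ text, pauseBeforeMs: pauseBefore });
      pauseBefore = 0;
    }
  };
  let match;
  while ((match = breakTag.exec(ssml)) !== null) {
    pushChunk(ssml.slice(lastIndex, match.index));
    const time = /time\s*=\s*["'](\d+(?:\.\d+)?)(ms|s)["']/i.exec(match[1]);
    pauseBefore += time
      ? Number(time[1]) * (time[2].toLowerCase() === 's' ? 1000 : 1)
      : defaultPauseMs; // <break/> without a usable time attribute
    lastIndex = breakTag.lastIndex;
  }
  pushChunk(ssml.slice(lastIndex));
  return chunks;
}

// Sketch: speak the chunks one after another, waiting before each one.
// `synth` and `Utterance` stand for window.speechSynthesis and
// SpeechSynthesisUtterance.
function speakChunks(chunks, synth, Utterance) {
  const next = (i) => {
    if (i >= chunks.length) return;
    setTimeout(() => {
      const utterance = new Utterance(chunks[i].text);
      utterance.onend = () => next(i + 1); // continue when this chunk is done
      synth.speak(utterance);
    }, chunks[i].pauseBeforeMs);
  };
  next(0);
}
```

For example, splitOnBreaks('Welcome,<break time="500ms"/> John Doe') yields two chunks, the second with pauseBeforeMs set to 500. Prosody across the chunk boundary will differ from what a real SSML engine would produce.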

An alternative to the apparently dead Web Speech API is speak.js, which was created by porting espeak to JavaScript, or meSpeak.js, which is a fork of speak.js. espeak-ng is now actively maintained. For example, using a modified version of meSpeak.js:

  • generate audio file with W3C Web Speech API

or using online dictionaries which serve voice files reflecting the word

  • How to create or convert text to audio at chromium browser?

Interestingly, after posting that Answer the "gstatic" "dictionary" no longer served the audio files.

Fortunately, we have

  • mozilla/voice-web

This is a web, Android and iOS app for collecting speech donations for the Common Voice project.

which is quite active.


We can also use Native Messaging in both Chromium/Chrome and Firefox to interact with the native shell and call the binary itself:

  • How to parse JSON from stdin at Chrome Native Messaging host?

  • How to parse JSON from stdin at Native Messaging host?

  • Chrome Native Messaging throwing error when sending a base64 string to client
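For orientation, Native Messaging exchanges JSON messages over stdin/stdout, each one preceded by a 32-bit length in native byte order. The sketch below shows that framing in JavaScript, assuming a host written for Node.js (the linked questions use other languages). The function names and the { text: … } message shape are hypothetical.

```javascript
// Detect the platform byte order, since the length prefix uses native order.
const LITTLE_ENDIAN = new Uint8Array(new Uint32Array([1]).buffer)[0] === 1;

// Sketch: encode a value as one Native Messaging frame:
// 4-byte length prefix followed by the UTF-8 encoded JSON body.
function encodeNativeMessage(value) {
  const body = new TextEncoder().encode(JSON.stringify(value));
  const frame = new Uint8Array(4 + body.length);
  new DataView(frame.buffer).setUint32(0, body.length, LITTLE_ENDIAN);
  frame.set(body, 4);
  return frame;
}

// Sketch: decode one complete frame, or return null if more bytes are needed.
function decodeNativeMessage(frame) {
  if (frame.length < 4) return null;
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const length = view.getUint32(0, LITTLE_ENDIAN);
  if (frame.length < 4 + length) return null;
  const body = frame.subarray(4, 4 + length);
  return JSON.parse(new TextDecoder().decode(body));
}
```

A host would write encodeNativeMessage({ text: 'Welcome, John Doe' }) to its standard output and read frames from standard input. What the host then does with the text, such as passing it to the espeak binary, is outside this sketch.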

This code achieves the expected result with minimal modification using Native Messaging:

  • Chrome Native messaging with PHP

or as a drastic measure, change the binary

  • How to set options of commands called by browser?

(Opinion, supported by facts, follows.)

There is a substantial web service market for speech synthesis technologies, both in the generation of speech ("[L]yrebird") and in its recognition, for profit; e.g., "*lexa"; "*olly"; (*bm) "*atson *luemix"; (*oogle) "*ctions"; etc.

It is up to open source developers to continue efforts directed towards maintaining open source (FOSS; FLOSS) speech synthesis technologies at open source browsers. If we want these technologies to be implemented in browsers by default, open source developers have to compose the code to make that happen.

This is possible with the SpeechSynthesisUtterance interface of the Web Speech API. More info on this here.

The JavaScript below will say "Welcome, John Doe" when executed in Chrome. Make sure the volume is up!

const message = new SpeechSynthesisUtterance('Welcome, John Doe'); 
window.speechSynthesis.speak(message);
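Since every user's name is different, the greeting text can be built at login time. The following is a small sketch along those lines. greetingFor and greetUser are names I made up, and where the display name comes from (session, profile, etc.) is left open.

```javascript
// Sketch: build the greeting text for a given display name.
function greetingFor(displayName) {
  const name = String(displayName ?? '').trim();
  return name ? `Welcome, ${name}` : 'Welcome';
}

// Sketch: speak the greeting if the Web Speech API is available.
// Returns false when there is nothing to speak with.
function greetUser(displayName) {
  if (typeof window === 'undefined' || !('speechSynthesis' in window)) {
    return false; // older browser, or not running in a browser at all
  }
  const message = new SpeechSynthesisUtterance(greetingFor(displayName));
  window.speechSynthesis.speak(message);
  return true;
}
```

Calling greetUser('John Doe') from the click handler of the login button is probably the safer place, because some browsers only produce speech after a user gesture on the page.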

The Web Speech API also provides a speech recognition interface. The following code will print spoken words to the browser's console.

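// Chrome exposes the constructor with a webkit prefix.
const recognition = new webkitSpeechRecognition();
recognition.onresult = function(event) {
  // Log the top alternative of each new result.
  for (let i = event.resultIndex; i < event.results.length; ++i) {
    console.log(event.results[i][0].transcript);
  }
};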

To start capturing speech, run recognition.start();
To stop capturing speech, run recognition.stop();
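A slightly more defensive variant, as a sketch: it looks for both the unprefixed and the webkit-prefixed constructor and returns null when neither exists. The names transcriptsFrom and createRecognizer, the scope parameter, and the 'en-US' language setting are my assumptions.

```javascript
// Sketch: collect the top transcript of each new result in a result event.
function transcriptsFrom(event) {
  const lines = [];
  for (let i = event.resultIndex; i < event.results.length; i += 1) {
    lines.push(event.results[i][0].transcript);
  }
  return lines;
}

// Sketch: create a recognizer that passes each transcript to `onText`.
// `scope` defaults to the global object (window in a browser).
function createRecognizer(onText, scope = globalThis) {
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Recognition) {
    return null; // speech recognition is not supported here
  }
  const recognition = new Recognition();
  recognition.lang = 'en-US'; // assumption: English input
  recognition.onresult = (event) => {
    transcriptsFrom(event).forEach((text) => onText(text));
  };
  recognition.onerror = (event) => {
    console.warn('Speech recognition error:', event.error);
  };
  return recognition;
}
```

Usage would look like const r = createRecognizer(text => console.log(text)); followed by r.start() if r is not null. The browser will ask for microphone permission the first time.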

Given this is experimental technology, it is not going to be perfect, and it is not supported in all browsers and versions. Check the browser compatibility table for supported browsers and versions.

I made a function that makes life easier. You only have to call it with the text and a language code, for example speak('hello world', 'en') for English; see other codes.

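function speak(text, language) {
    const synth = window.speechSynthesis;
    const utterance = new SpeechSynthesisUtterance(text);
    // Match on the base language only, so 'en' matches 'en-US', 'en-GB', etc.
    const base = language.split('-')[0].toLowerCase();
    const voice = synth.getVoices().find(v => v.lang.split('-')[0].toLowerCase() === base);
    // getVoices() can be empty before voices load; fall back to a language hint.
    if (voice) utterance.voice = voice;
    else utterance.lang = language;
    synth.speak(utterance);
}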
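If the regional variant matters (say 'en-GB' rather than any English voice), the lookup could prefer an exact match before falling back to the base language. This is a sketch of that idea only. pickVoice is a hypothetical helper, and the underscore handling is a defensive assumption for platforms that might report tags like en_US.

```javascript
// Sketch: choose a voice for a language tag.
// Prefers an exact match ('en-GB'), then any voice with the same base
// language ('en'), and returns null when nothing fits.
function pickVoice(voices, language) {
  const normalize = (tag) => tag.replace(/_/g, '-').toLowerCase();
  const wanted = normalize(language);
  const base = wanted.split('-')[0];
  return (
    voices.find((v) => normalize(v.lang) === wanted) ||
    voices.find((v) => normalize(v.lang).split('-')[0] === base) ||
    null
  );
}
```

The result can be assigned to utterance.voice when it is not null, for example with pickVoice(window.speechSynthesis.getVoices(), 'en-GB').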

Check the Web_Speech_API documentation

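// `name` is assumed to hold the text to speak, e.g. the user's display name.
const utterance = new SpeechSynthesisUtterance(name);
const voices = speechSynthesis.getVoices();
// getVoices() may return an empty array until the voices have loaded.
if (voices.length > 0) utterance.voice = voices[0];
speechSynthesis.speak(utterance);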