javascript - Real time speech recognition using WebRTC, Node.js and speech recognition engine - Stack Overflow

A. What I am trying to implement.

A web application allowing real-time speech recognition inside web browser (like this).

B. Technologies I am currently thinking of using to achieve A.

  • JavaScript
  • Node.js
  • WebRTC
  • Microsoft Speech API or Pocketsphinx.js or something else (cannot use Web Speech API)

C. Very basic workflow

  1. Web browser establishes connection to Node server (server acts as a signaling server and also serves static files)
  2. Web browser acquires audio stream using getUserMedia() and sends user's voice to Node server
  3. Node server passes audio stream being received to speech recognition engine for analysis
  4. Speech recognition engine returns result to Node server
  5. Node server sends text result back to initiating web browser
  6. (Node server performs step 1 to 5 to process requests from other browsers)

D. Questions

  1. Would Node.js be suitable to achieve C?
  2. How could I pass received audio streams from my Node server to a speech recognition engine running separately from the server?
  3. Could my speech recognition engine be running as another Node application (if I use Pocketsphinx)? So my Node server communicates to my Node speech recognition server.

Asked Jun 1, 2014 at 20:53 by jpen; edited Jun 2, 2014 at 15:01
  • The source code behind your link is at src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech. You may want to look at how they implement it to inform your architecture. – Robert Rowntree, Jun 1, 2014 at 20:57

2 Answers

Would Node.js be suitable to achieve C?

Yes, although nothing in your workflow strictly requires Node. Some people run the server side on GStreamer instead; for an example, see

http://kaljurand.github.io/dictate.js/

Node should be fine too.
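
For step 2 of the workflow, the browser's Web Audio API delivers samples as 32-bit floats in the range -1 to 1, while engines such as Pocketsphinx consume 16-bit PCM, so a conversion is needed either in the browser or on the Node server. A minimal sketch of that conversion, assuming mono audio; `floatTo16BitPCM` is an illustrative name, not part of any library:

```javascript
// Convert Web Audio float samples (range -1..1) to 16-bit signed PCM.
// `floatTo16BitPCM` is a hypothetical helper name, not a library function.
function floatTo16BitPCM(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale toward -32768, positive values toward 32767.
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return pcm;
}
```

Resampling is a separate concern: the capture rate in the browser is often higher than the rate the acoustic model was trained on, so check what your engine expects before streaming.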

How could I pass received audio streams from my Node server to a speech recognition engine running separately from the server?

There are many ways for two Node processes to communicate. One of them is http://socket.io; plain sockets also work. Which one to pick depends on your requirements for fault tolerance and scalability.
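
As a rough illustration of the plain-socket option, the sketch below uses Node's built-in `net` module and prefixes every chunk with its length, because TCP delivers a byte stream and does not preserve chunk boundaries. The framing scheme, the helper names (`encodeFrame`, `FrameDecoder`, `connectToRecognizer`), and the idea that the recognizer replies with UTF-8 text frames are all assumptions made for the example, not something the tools above prescribe.

```javascript
const net = require('net');

// Prefix a chunk with its byte length (4-byte big-endian) so the receiver
// can find chunk boundaries in the TCP byte stream.
function encodeFrame(chunk) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(chunk.length, 0);
  return Buffer.concat([header, chunk]);
}

// Reassembles frames from arbitrarily split 'data' events.
class FrameDecoder {
  constructor() {
    this.pending = Buffer.alloc(0);
  }

  // Feed raw bytes; returns the complete frame payloads seen so far
  // (possibly an empty array).
  push(data) {
    this.pending = Buffer.concat([this.pending, data]);
    const frames = [];
    while (this.pending.length >= 4) {
      const size = this.pending.readUInt32BE(0);
      if (this.pending.length < 4 + size) break; // wait for more bytes
      frames.push(this.pending.subarray(4, 4 + size));
      this.pending = this.pending.subarray(4 + size);
    }
    return frames;
  }
}

// Hypothetical wiring: the front-end Node server connects to a separate
// recognizer service. Host, port, and the text reply format are assumptions.
function connectToRecognizer(host, port, onText, onError) {
  const socket = net.connect(port, host);
  const decoder = new FrameDecoder();
  socket.on('data', (data) => {
    for (const frame of decoder.push(data)) onText(frame.toString('utf8'));
  });
  socket.on('error', onError);
  return {
    sendAudio: (pcmBuffer) => socket.write(encodeFrame(pcmBuffer)),
    close: () => socket.end(),
  };
}
```

The recognizer side would use the same `FrameDecoder` on incoming audio and `encodeFrame` for its text replies. A library such as socket.io handles framing and reconnection for you, at the cost of an extra dependency.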

Could my speech recognition engine be running as another Node application (if I use Pocketsphinx)? So my Node server communicates to my Node speech recognition server.

Yes, sure. You can create a Node module that wraps the pocketsphinx API.

UPDATE: check this, it should be similar to what you need:

http://github.com/cmusphinx/node-pocketsphinx
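
To give an idea of what such a wrapper might look like on the recognizer side, here is a small sketch that keeps one session per connected browser. The engine interface (`start`, `feed`, `stop`) is a placeholder invented for this example and is not the node-pocketsphinx API; consult that project's README for the real calls. `FakeEngine` exists only so the sketch runs without a speech engine installed.

```javascript
// One session per connected browser. `engine` is any object exposing the
// hypothetical methods start(), feed(pcmBuffer) and stop() -> string.
// These names are placeholders, NOT the node-pocketsphinx API.
class RecognitionSession {
  constructor(engine) {
    this.engine = engine;
    this.active = false;
  }

  // Forward a chunk of audio, opening an utterance on first use.
  write(pcmBuffer) {
    if (!this.active) {
      this.engine.start();
      this.active = true;
    }
    this.engine.feed(pcmBuffer);
  }

  // Close the utterance and return the recognized text ('' if no audio).
  end() {
    if (!this.active) return '';
    this.active = false;
    return this.engine.stop();
  }
}

// Stand-in engine used only to exercise the wrapper: it reports how many
// bytes it received instead of doing any recognition.
class FakeEngine {
  start() { this.bytes = 0; }
  feed(buf) { this.bytes += buf.length; }
  stop() { return `received ${this.bytes} bytes`; }
}

const session = new RecognitionSession(new FakeEngine());
session.write(Buffer.alloc(3));
session.write(Buffer.alloc(5));
console.log(session.end());
```

Keeping the engine behind a small interface like this means the front-end server does not care whether the recognizer is Pocketsphinx or something else.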

You should contact Andre Natal, who has shown demos similar to this at last fall's Firefox Summit, and is now on a Google Summer of Code project implementing offline speech recognition in Firefox/FxOS: http://cmusphinx.sourceforge.net/2014/04/speech-projects-on-gsoc-2014/
