
openai api - ChatGPT Realtime API event conversation.item.created does not include base64 audio or transcription - Stack Overflow


According to the OpenAI Realtime API documentation, the conversation.item.created event should include base64 audio. When I use the API over WebRTC, however, the event contains neither the base64 audio nor a transcript.

Do you have any suggestions?

{
    "event_id": "event_1920",
    "type": "conversation.item.created",
    "previous_item_id": "msg_002",
    "item": {
        "id": "msg_003",
        "object": "realtime.item",
        "type": "message",
        "status": "completed",
        "role": "user",
        "content": [
            {
                "type": "input_audio",
                "transcript": "hello how are you", // this field is null
                "audio": "base64encodedaudio==" // this field does not exist
            }
        ]
    }
}
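
To make the mismatch easier to see, here is a minimal sketch that reports which fields are actually populated on each content part of a conversation.item.created event. The helper name `summarizeItemContent` and the sample object `receivedEvent` are hypothetical; the sample mirrors what I receive over WebRTC (transcript is null, audio is absent), not the documented example.

```javascript
// Hypothetical helper: list which fields are populated on each content part
// of a conversation.item.created event.
function summarizeItemContent(event) {
  if (!event || event.type !== "conversation.item.created") {
    return [];
  }
  const parts = (event.item && event.item.content) || [];
  return parts.map((part, index) => ({
    index,
    type: part.type,
    // A usable transcript is a non-empty string (null or missing counts as absent).
    hasTranscript: typeof part.transcript === "string" && part.transcript.length > 0,
    // Same check for the base64 audio payload.
    hasAudio: typeof part.audio === "string" && part.audio.length > 0,
  }));
}

// Sample shaped like the event I actually receive (assumed values).
const receivedEvent = {
  event_id: "event_1920",
  type: "conversation.item.created",
  previous_item_id: "msg_002",
  item: {
    id: "msg_003",
    object: "realtime.item",
    type: "message",
    status: "completed",
    role: "user",
    content: [{ type: "input_audio", transcript: null }],
  },
};

console.log(summarizeItemContent(receivedEvent));
```

Calling this from the data channel listener for every incoming event would show at a glance whether any item ever arrives with the fields filled in.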

This is how I create the session on the backend (PHP, Laravel):

$data = [
    'model' => $model,
    'modalities' => ['audio', 'text'],
    'instructions' => $instruction ?? 'You are a friendly assistant.',
    'voice' => $voice,
    'input_audio_transcription' => [
        'model' => 'whisper-1',
    ],
    'turn_detection' => [
        'type' => 'server_vad',
    ],
];

$response = Http::withToken($apiKey)
    ->withHeaders(['Content-Type' => 'application/json'])
    ->post($url, $data);

This is how I connect over WebRTC on the front end:

      this.aiSession = await this.createRTSession();

      // Create a peer connection
      const pc = new RTCPeerConnection();

      // Set up to play remote audio from the model
      const audioEl = document.createElement("audio");
      audioEl.autoplay = true;


      pc.ontrack = e => {
        const remoteAudioStream = new MediaStream();
        remoteAudioStream.addTrack(e.track);
        audioEl.srcObject = remoteAudioStream;
        this.animate(remoteAudioStream);
      };

      // Add local audio track for microphone input in the browser
      const ms = await this.getMicStream();
      this.startRecording(ms);

      pc.addTrack(ms.getTracks()[0]);

      // Set up data channel for sending and receiving events
      const dc = pc.createDataChannel("oai-events");


      dc.addEventListener("message", (e) => {
        let $data = JSON.parse(e.data);
        // Realtime server events appear here!
        console.log($data.type, $data);
      });

      const offer = await pc.createOffer();
      await pc.setLocalDescription(offer);

      const baseUrl = ""; // URL of the Realtime endpoint
      const model = this.aiSession.model;
      const sdpResponse = await fetch(`${baseUrl}?model=${model}`, {
        method: "POST",
        body: offer.sdp,
        headers: {
          Authorization: `Bearer ${this.ephemeralToken}`,
          "Content-Type": "application/sdp"
        },
      });

      const answer = {
        type: "answer",
        sdp: await sdpResponse.text(),
      };

      await pc.setRemoteDescription(answer);
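
One thing that may be worth checking: as far as I understand, when input_audio_transcription is enabled the user transcript is delivered asynchronously in a separate server event, conversation.item.input_audio_transcription.completed, which carries item_id and transcript fields. Below is a minimal sketch of a handler that pairs that event with the earlier conversation.item.created event. The `transcripts` map and the `handleServerEvent` function are hypothetical names, and the simulated events at the end are made-up values for illustration only.

```javascript
// Hypothetical in-memory store: item id -> transcript text (null until known).
const transcripts = new Map();

// Dispatch on the server event type and record transcripts by item id.
function handleServerEvent(event) {
  switch (event.type) {
    case "conversation.item.created":
      // The item exists, but its transcript may still be null at this point.
      if (!transcripts.has(event.item.id)) {
        transcripts.set(event.item.id, null);
      }
      break;
    case "conversation.item.input_audio_transcription.completed":
      // The transcript arrives later and refers back to the item by id.
      transcripts.set(event.item_id, event.transcript);
      break;
    default:
      // Other event types are ignored in this sketch.
      break;
  }
  return transcripts;
}

// Simulated event sequence (assumed values).
handleServerEvent({
  type: "conversation.item.created",
  item: { id: "msg_003", content: [{ type: "input_audio", transcript: null }] },
});
handleServerEvent({
  type: "conversation.item.input_audio_transcription.completed",
  item_id: "msg_003",
  content_index: 0,
  transcript: "hello how are you",
});

console.log(transcripts.get("msg_003"));
```

In the code above this would be wired in by calling handleServerEvent(JSON.parse(e.data)) inside the existing dc "message" listener.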
