
Predictions.convert with SpeechToText always returns empty transcription or "stream too big" error when sending WAV (PCM16) from browser #3014

@brightversion1

Description


Environment information

Framework: React (TypeScript)

AWS Amplify Version: 6.6.6 (e.g. @aws-amplify/predictions)

Browser: Chrome (latest, desktop)

OS: Windows 11

Device: Desktop

Audio Capture: navigator.mediaDevices.getUserMedia with AudioWorklet

Encoding: PCM16, WAV, 16kHz, mono

Describe the bug


# 🐞 Bug Report: Predictions.convert Speech-to-Text

## Problem Description
When calling `Predictions.convert` with short WAV audio (1–3 seconds), the result is always:

```json
{
  "fullText": ""
}
```

Occasionally, instead of empty text, the call fails with:

```
Error from AWS Predictions: Error: Your stream is too big. Reduce the frame size and try your request again
```

This happens even with very small audio clips (~70–140 KB).
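
As a rough sanity check on those sizes (back-of-the-envelope arithmetic, not from the report): 16 kHz mono PCM16 audio is a fixed 32,000 bytes per second plus a 44-byte header, so 70–140 KB corresponds to roughly 2–4.5 seconds of audio.

```ts
// Back-of-the-envelope size check for 16 kHz mono PCM16 WAV files.
const bytesPerSecond = 16000 * 2; // sampleRate * bytesPerSample (mono)
const wavSize = (seconds: number) => 44 + seconds * bytesPerSecond;
console.log(wavSize(2)); // 64044  (~63 KB)
console.log(wavSize(4)); // 128044 (~125 KB)
```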



## 🔍 Notes

- Audio was tested at 16 kHz, mono, PCM16.
- Other sample rates and stereo were also tested → still empty.
- Verified that the WAVs play correctly in the browser via an `Audio()` element (see the playback sketch below).
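
For reference, the playback check above can be done along these lines (a minimal sketch; the helper name is illustrative):

```ts
// Wrap the encoded WAV bytes in a Blob and play them back in the browser,
// confirming the file is audible and well-formed enough to decode.
function playWav(bytes: Uint8Array) {
  const blob = new Blob([bytes], { type: "audio/wav" });
  const url = URL.createObjectURL(blob);
  const audio = new Audio(url);
  audio.onended = () => URL.revokeObjectURL(url);
  void audio.play();
}
```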

Reproduction steps

## 🔬 Steps to Reproduce

1. Record microphone input using `navigator.mediaDevices.getUserMedia`.
2. Capture raw PCM samples with an `AudioWorkletNode`.
3. Merge the Int16 samples and encode them into a 16-bit PCM WAV file at 16 kHz.
4. Pass the resulting `ArrayBuffer` to `Predictions.convert`.

### Recording Code

```ts
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const audioContext = new AudioContext({ sampleRate: 16000 });
await audioContext.audioWorklet.addModule("/recorderWorklet.js");

const source = audioContext.createMediaStreamSource(stream);
const workletNode = new AudioWorkletNode(audioContext, "recorder-processor");

// audioBuffer is created with getBuffer() (see "Buffer Management" below)
const audioBuffer = getBuffer();

workletNode.port.onmessage = (event) => {
  const int16Array = event.data as Int16Array;
  audioBuffer.addData(int16Array);
};

source.connect(workletNode).connect(audioContext.destination);
```
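
The worklet module itself is not included in the report. A minimal sketch of what `/recorderWorklet.js` might contain, assuming it converts the Float32 frames to Int16 on the audio thread:

```js
// recorderWorklet.js — runs on the audio rendering thread.
class RecorderProcessor extends AudioWorkletProcessor {
  process(inputs) {
    const channel = inputs[0]?.[0]; // first input, first (mono) channel
    if (channel) {
      // Clamp Float32 samples to [-1, 1] and scale to the Int16 range.
      const int16 = new Int16Array(channel.length);
      for (let i = 0; i < channel.length; i++) {
        const s = Math.max(-1, Math.min(1, channel[i]));
        int16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
      }
      this.port.postMessage(int16); // received by the onmessage handler above
    }
    return true; // keep the processor alive
  }
}
registerProcessor("recorder-processor", RecorderProcessor);
```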

### Buffer Management

```ts
const getBuffer = () => {
  let buffer: Int16Array[] = [];

  // Accepts a single Int16Array chunk or an array of chunks.
  const add = (raw: Int16Array | Int16Array[]) => {
    if (Array.isArray(raw)) {
      buffer = buffer.concat(raw);
    } else {
      buffer.push(raw);
    }
    return buffer;
  };

  const reset = () => { buffer = []; };

  return {
    reset,
    addData: add,
    getData: () => buffer,
  };
};
```
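
`mergeInt16Arrays` is called below but not included in the report; a plausible implementation, assuming it simply concatenates the collected chunks in order:

```ts
// Concatenate the Int16Array chunks gathered by getBuffer() into one array.
function mergeInt16Arrays(chunks: Int16Array[]): Int16Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const merged = new Int16Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    merged.set(chunk, offset);
    offset += chunk.length;
  }
  return merged;
}
```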

### Convert to WAV and Call Predictions

```ts
import { Predictions } from "@aws-amplify/predictions";

const convertFromBuffer = async () => {
  const mergedPCM = mergeInt16Arrays(audioBuffer.getData());

  // Encode to WAV (PCM16 LE)
  const wavBytes = encodeWAV(mergedPCM, 16000);

  // Copy the exact byte range into a standalone ArrayBuffer
  const wavArrayBuffer: ArrayBuffer = wavBytes.buffer.slice(
    wavBytes.byteOffset,
    wavBytes.byteOffset + wavBytes.byteLength
  );

  console.log("WAV length (bytes):", wavBytes.byteLength);
  console.log("ArrayBuffer length:", wavArrayBuffer.byteLength);

  try {
    const { transcription } = await Predictions.convert({
      transcription: {
        source: { bytes: wavArrayBuffer },
        language: "en-US",
      },
    });

    console.log("Transcription result:", transcription);
  } catch (error) {
    console.error("AWS Predictions Error:", error);
  }
};
```
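
One way to rule out an encoding problem (an illustrative diagnostic, not part of the report) is to download the exact bytes being sent and inspect them offline, e.g. in Audacity or with `ffprobe`:

```ts
// Save the generated WAV locally so it can be inspected outside the browser.
function downloadWav(bytes: Uint8Array, filename = "capture.wav") {
  const blob = new Blob([bytes], { type: "audio/wav" });
  const url = URL.createObjectURL(blob);
  const a = document.createElement("a");
  a.href = url;
  a.download = filename;
  a.click();
  URL.revokeObjectURL(url);
}
```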

### WAV Encoder

```ts
// Builds a canonical 44-byte WAV header followed by little-endian PCM16 samples.
function encodeWAV(int16Array: Int16Array, sampleRate: number): Uint8Array {
  const buffer = new ArrayBuffer(44 + int16Array.length * 2);
  const view = new DataView(buffer);

  const writeString = (view: DataView, offset: number, str: string) => {
    for (let i = 0; i < str.length; i++) {
      view.setUint8(offset + i, str.charCodeAt(i));
    }
  };

  const bytesPerSample = 2;

  writeString(view, 0, "RIFF");
  view.setUint32(4, 36 + int16Array.length * bytesPerSample, true); // RIFF chunk size
  writeString(view, 8, "WAVE");
  writeString(view, 12, "fmt ");
  view.setUint32(16, 16, true);                                     // fmt chunk size
  view.setUint16(20, 1, true);                                      // audio format: 1 = PCM
  view.setUint16(22, 1, true);                                      // channels: mono
  view.setUint32(24, sampleRate, true);                             // sample rate
  view.setUint32(28, sampleRate * bytesPerSample, true);            // byte rate (mono)
  view.setUint16(32, bytesPerSample, true);                         // block align (mono)
  view.setUint16(34, 16, true);                                     // bits per sample
  writeString(view, 36, "data");
  view.setUint32(40, int16Array.length * bytesPerSample, true);     // data chunk size

  // Write the samples as little-endian Int16
  let offset = 44;
  for (let i = 0; i < int16Array.length; i++, offset += 2) {
    view.setInt16(offset, int16Array[i], true);
  }

  return new Uint8Array(buffer);
}
```
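
To double-check the header this encoder produces, here is an illustrative dump of the 44-byte canonical WAV header fields (the helper name and expected values are assumptions based on the encoder above):

```ts
// Parse and log the canonical 44-byte WAV header for manual inspection.
function dumpWavHeader(buf: ArrayBuffer) {
  const view = new DataView(buf);
  const tag = (o: number) =>
    String.fromCharCode(
      view.getUint8(o), view.getUint8(o + 1), view.getUint8(o + 2), view.getUint8(o + 3)
    );
  console.log({
    riff: tag(0),                            // expected "RIFF"
    riffSize: view.getUint32(4, true),       // file size minus 8
    wave: tag(8),                            // expected "WAVE"
    fmt: tag(12),                            // expected "fmt "
    audioFormat: view.getUint16(20, true),   // 1 = PCM
    channels: view.getUint16(22, true),      // 1 = mono
    sampleRate: view.getUint32(24, true),    // 16000
    byteRate: view.getUint32(28, true),      // 32000 for 16 kHz mono PCM16
    blockAlign: view.getUint16(32, true),    // 2 for mono PCM16
    bitsPerSample: view.getUint16(34, true), // 16
    data: tag(36),                           // expected "data"
    dataSize: view.getUint32(40, true),      // PCM payload bytes
  });
}
```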

## ⚠️ Observed Behavior

- Always returns an empty transcription: `{ fullText: "" }`.
- Sometimes errors with "Your stream is too big".
- Happens even with small WAVs (2–3 seconds, 70–140 KB).

## ✅ Expected Behavior

- Short WAVs (≤3 seconds, ≤200 KB) should return valid transcriptions.
- If the format is invalid, the API should return a descriptive error instead of silently returning `{ fullText: "" }`.

## ❓ Questions for AWS Team

1. What is the maximum supported audio size/duration for `Predictions.convert`?
2. Which formats are supported? The docs suggest PCM16 WAV or FLAC. Are others (WebM/Opus, MP3) valid?
3. Why does `{ fullText: "" }` come back with no error? Does this indicate a decoding failure (e.g. a bad WAV header)?
4. Why is a ~140 KB (3-second) WAV sometimes rejected as "too big"?
