# Dudoxx Omni — Common Pitfalls (All Services)

> One-page failure-mode catalog spanning STT, TTS, LLM, the wire envelope, and operations. Each entry has a symptom, root cause, and fix.

---

## STT pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| HTTP 403 on WS handshake | Neither `Authorization: Token …` nor `Sec-WebSocket-Protocol` supplied | Send either one; with both missing, the server closes before accepting the upgrade |
| Close 1003 `unsupported-data` | encoding the normalizer can't decode | Use `linear16` 16 kHz mono OR add encoding to `audio_normalize.py` |
| Close 1008 `channels-too-many` / `sample-rate-too-high` | exceeds `DDX_STT_AUDIO_MAX_*` caps | Resample / downmix client-side, or raise env caps |
| Close 1011 `NET-0001` | 10s without audio AND without KeepAlive | Send `{"type":"KeepAlive"}` every 5s |
| `Cannot unfreeze partially…` server warnings | NeMo `model.transcribe()` race; cosmetic | No client action; finalize-only `_final_lock` keeps finals correct |
| Long broadcasts emit only partials | No natural utterance gap → VAD never fires `speech_end` | Send `{"type":"Finalize"}` on a 30s timer |
| Concurrent N=3 sessions — two clients see 100% WER | Per-session VAD missing → shared `SileroVadDetector` | Already fixed: `vad_factory` returns a fresh detector per session |
| First minute of audio mis-detected as wrong language | Auto-detect window too short (<10s) | Already fixed: 20s sliding window for stable detection |
| "Mm" / "Uh" hallucinations on quiet mic | RMS gate disabled | Already fixed: silence-skip RMS gate on trailing 1s of window |
| Browser MediaRecorder webm-opus chunks rejected | Server treats opus as bytestream not full-frame | Send full webm chunks (`mediaRecorder.start(250)` 250ms tick); server normalizer accepts full-frame webm-opus |
| Italian transcription unreliable | No Italian primer + RAI broadcast 403 | Upstream issue; add IT TTS clip to `app/services/lang_primers/` |
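
The KeepAlive and Finalize fixes in the table above both reduce to sending a small JSON control message on a timer. A minimal sketch, assuming the client exposes an async `send(text)` callable (for example a WebSocket's send method); `periodic_send`, `run_control_timers`, and the `stop` event are hypothetical names, not part of the STT API:

```python
import asyncio
import json


async def periodic_send(send, payload, interval, stop):
    """Send `payload` as a JSON text frame every `interval` seconds until `stop` is set.

    `send` is any async callable taking a str (assumed: a WebSocket send method).
    """
    while not stop.is_set():
        try:
            # Sleep for one interval, but wake immediately if `stop` is set.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            # The interval elapsed without a stop request: emit the control message.
            await send(json.dumps(payload))


async def run_control_timers(send, stop, keepalive_s=5.0, finalize_s=30.0):
    """Run the KeepAlive and Finalize timers side by side until `stop` is set."""
    await asyncio.gather(
        periodic_send(send, {"type": "KeepAlive"}, keepalive_s, stop),
        periodic_send(send, {"type": "Finalize"}, finalize_s, stop),
    )
```

The Finalize timer targets continuous audio such as broadcasts; where speech has natural gaps, the KeepAlive timer alone is likely sufficient.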

---

## TTS pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| HTTP 401 | missing `X-API-Key` / `?api_key` | Add the header / query param |
| HTTP 422 unsupported `language` | not in `SpeakRequest._LANGUAGE_PATTERN` | Use BCP-47 short tag (`en`, `fr`, `de`, `it`) |
| HTTP 422 unsupported `sample_rate` | not in `{16000, 22050, 24000, 48000}` | Pick a supported rate |
| Last word missing in CUDA TTS | client passed `subtalker_dosample=true` override | Remove override or set `false`; server default is `false` |
| WS closes 1011 mid-utterance | engine exception (Qwen3 / Kokoro init) | Check `logs/tts.log`; `./ddx-manage.sh restart --prod tts` |
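| SSE frames buffer at NGINX | `proxy_buffering` is on | Add `proxy_buffering off;` to the location, or have the upstream respond with `X-Accel-Buffering: no` (a response header NGINX honors, not a request header set via `proxy_set_header`) |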
| `audio_b64` payload too large | one-shot response holding 10s+ utterance | Switch to `/v1/speak/sse` for long text |
| Edits to `dashboard.html` don't show up | FastAPI caches the file at startup | `./ddx-manage.sh restart --prod tts`, then hard-refresh the browser (Cmd-Shift-R) |
| Port-already-in-use after restart | Old process was still releasing the port during teardown when the new one started | Run `./ddx-manage.sh start --prod tts` again |
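
A client-side pre-flight check can catch the 422 on `sample_rate` before the request is sent. A minimal sketch: the set of rates is copied from the table above, while `SUPPORTED_SAMPLE_RATES` and `nearest_supported_rate` are hypothetical helper names, not part of the TTS API:

```python
SUPPORTED_SAMPLE_RATES = (16000, 22050, 24000, 48000)  # copied from the table above


def nearest_supported_rate(requested_hz: int) -> int:
    """Return `requested_hz` if supported, else the closest supported rate.

    Ties resolve to the lower rate because `min` keeps the first minimum.
    """
    if requested_hz <= 0:
        raise ValueError(f"sample rate must be positive, got {requested_hz}")
    return min(SUPPORTED_SAMPLE_RATES, key=lambda rate: abs(rate - requested_hz))
```

If the helper substitutes a different rate, resampling the returned audio to the rate the caller originally wanted remains the client's job.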

---

## LLM pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| HTTP 404 `model_not_found` on chat | requested `model` not in registry | `GET /v1/models` first, pick an `id` |
| HTTP 404 + `detail="server not in real-model mode"` on `/v1/models/load` | `DDX_LLM_USE_REAL_MODEL=0` | Set `DDX_LLM_USE_REAL_MODEL=1`, then `./ddx-manage.sh restart --prod llm` |
| HTTP 422 `context_too_long` | total tokens exceed model's `context_window` | Truncate history or pick a larger-context model |
| HTTP 503 `model_unavailable` | model file missing / load failed | Check `model_error` in `/health`; verify `models/cache/` |
| Stream stalls at first chunk | upstream still loading the model | Poll `GET /v1/models/current` until `loaded === true` before streaming |
| CORS error in browser | origin not in `DDX_LLM_CORS_ORIGINS` | Add origin or proxy through Next.js (preferred) |
| Mock mode never returns text | mock always returns `tool_calls` | Switch to real mode (`DDX_LLM_USE_REAL_MODEL=1`) or handle `tool_calls` in client |
| `/v1/embeddings` returns 404 | dudoxx layer doesn't expose embeddings | Use a separate embeddings service |
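
The stall at the first chunk can be avoided by polling until the model reports itself loaded before opening a stream. A minimal sketch, assuming `/v1/models/current` returns a JSON object with a boolean `loaded` field; `fetch_status` is a hypothetical callable that performs the GET and returns the decoded body, injected so the choice of HTTP client stays with the caller:

```python
import time
from typing import Callable, Mapping


def wait_until_loaded(
    fetch_status: Callable[[], Mapping],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> bool:
    """Poll `fetch_status` until it reports `loaded` is true or `timeout_s` elapses.

    Returns True once loaded, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch_status().get("loaded") is True:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Only start the chat stream once this returns True; on False, the `model_error` field in `/health` is the next place to look.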

---

## Wire-envelope pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| Client breaks on a new field | strict schema parser rejects unknown keys | Envelope is additive — clients must ignore unknown keys |
| Mixing v1 and v2 paths in one session | breaking changes ship at `/v2/*` | Pick one path per session; v1 stays immutable |
| Hand-edited generated bindings drift | `ddx-mlx-envelopes` regenerated, custom edits lost | Never edit `dist/`; change schema + `make gen` |
| Visemes missing in TTS Audio frames | `emit_visemes: false` (default) or backend doesn't ship visemes | Pass `emit_visemes: true`; only CUDA backend emits PB-15 visemes |
| Word timestamps off by primer length | language primer prepended | Already fixed: server shifts word timestamps user-relative |
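
Ignoring unknown keys is easiest to enforce at the parsing boundary. A minimal sketch of a tolerant parser; the `Envelope` dataclass and its `type`, `seq`, and `payload` fields are hypothetical stand-ins for illustration, not the real `ddx-mlx-envelopes` schema:

```python
from dataclasses import dataclass, field, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class Envelope:
    # Hypothetical fields for illustration only.
    type: str
    seq: int = 0
    payload: Mapping[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: Mapping[str, Any]) -> "Envelope":
        """Build an Envelope from `raw`, silently dropping keys it does not know."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})
```

A frame that gains a new field later still parses, because the unknown key is filtered out before the constructor runs; a missing required field still raises `TypeError`.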

---

## Operations / `ddx-manage.sh` pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| `ddx-manage.sh start` hangs the agent | foreground watch-mode | Always use `--prod` for service work; `--dev` is for human terminals |
| `restart` reports "did not become healthy" but `lsof -i :PORT` empty | port freed during teardown | Re-run `start --prod <svc>` |
| Service silently using mock model | `DDX_LLM_USE_REAL_MODEL` not exported | `DDX_LLM_USE_REAL_MODEL=1 ./ddx-manage.sh restart --prod llm` |
| RSS climbing under load | Uvicorn workers too low | `DDX_UVICORN_WORKERS=4 ./ddx-manage.sh restart --prod tts` (~3.8GB/worker, RTF 2.0–2.7×) |
| Logs noisy with `Cannot unfreeze partially…` under load | NeMo internal race; recovered by 3-attempt retry | Cosmetic; suppression via `try_acquire(timeout=0)` is a planned tweak |
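
When `restart` reports an unhealthy service, checking whether anything is still listening on the port shows whether re-running `start` is safe. A minimal sketch that approximates the `lsof -i :PORT` check with a TCP connect; `port_is_listening` is a hypothetical helper and `127.0.0.1` is an assumed bind address:

```python
import socket


def port_is_listening(port: int, host: str = "127.0.0.1", timeout_s: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout_s)
        # connect_ex returns 0 on success and an errno value on failure.
        return sock.connect_ex((host, port)) == 0
```

If this returns False after a failed restart, the port is free and `./ddx-manage.sh start --prod <svc>` can be re-run.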

---

## Browser / Next.js pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| `ws://` rejected from `https://` page | mixed content | Open `wss://` to your edge; bridge to `ws://` upstream server-side |
| API key visible in browser dev tools | client called TTS / LLM directly | Always proxy via Next.js Route Handler; keep keys in `process.env` (no `NEXT_PUBLIC_`) |
| Server Action used for read-only data | misuse of Server Actions | Use Server Components + service functions for reads; Server Actions are for writes only |
| Hardcoded UI string in TSX | bypassed next-intl | Use `t('key')`; add to `src/locales/{en,de,fr}/<ns>.json` |
| `await params` missed in `page.tsx` | `params` is a Promise (async since Next.js 15; synchronous access removed in Next.js 16) | `const { locale } = await params;` |
| Custom WS server doesn't pass cookies upstream | `next` route handlers don't see WS upgrades | Use a tiny `server.mjs` wrapping `next` + `ws` (see STT skill) |

---

## NestJS pitfalls

| Symptom | Cause | Fix |
|---|---|---|
| `WebSocketGateway` doesn't accept connections | default adapter is socket.io | `app.useWebSocketAdapter(new WsAdapter(app))` from `@nestjs/platform-ws` |
| Streaming controller never flushes | response held in memory | Set `'X-Accel-Buffering': 'no'`, `Connection: keep-alive`, write chunks directly |
| Guard order wrong (auth fires after roles) | Nest order: global → controller → handler | Stack as `ApiTokenGuard → JwtAuthGuard → RolesGuard` (cardinal NestJS rule) |
| `ConfigService.get('TTS_API_KEY')` returns undefined | env not loaded or typo | Use `getOrThrow('TTS_API_KEY')` and load `.env` via `ConfigModule.forRoot({ isGlobal: true })` |

---

## Python pitfalls

| Symptom | Cause | Fix |
|---|---|---|
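| `websockets.connect` rejects custom headers | older `websockets` API | Pin `websockets>=13` and use `additional_headers=[(...)]` with the new asyncio client (`websockets.asyncio.client.connect`, the default `websockets.connect` from version 14); `extra_headers` belongs to the legacy client |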
| WS `max_size` too small for long streams | default 1 MiB | `max_size=2**24` (16 MiB) for STT, `max_size=None` for TTS audio frames |
| Client receives partial JSON | reading `bytes` instead of `str` | Filter `if not isinstance(msg, str): continue` before `json.loads` |
| Async cancellation leaks tasks | no timeout on consumer | Wrap in `asyncio.wait_for(task, timeout=N)`, which cancels `task` on timeout; cancel any remaining sibling tasks when handling `TimeoutError` |
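
The last two rows combine naturally into one consumer: skip binary frames before decoding, and bound the whole loop with a timeout. A minimal sketch using only the standard library, assuming `messages` is any async iterable of frames (a `websockets` connection can be iterated this way); `consume_json` and `consume_with_timeout` are hypothetical names:

```python
import asyncio
import json


async def consume_json(messages, handle):
    """Decode text frames as JSON and pass them to `handle`; skip binary frames."""
    async for msg in messages:
        if not isinstance(msg, str):
            continue  # bytes frames carry audio, not JSON
        handle(json.loads(msg))


async def consume_with_timeout(messages, handle, timeout):
    """Run the consumer, giving up after `timeout` seconds.

    Returns True if the stream ended normally, False on timeout.
    asyncio.wait_for cancels the consumer itself when the timeout fires.
    """
    try:
        await asyncio.wait_for(consume_json(messages, handle), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False
```

The timeout here bounds the whole session rather than the gap between frames; a per-frame idle timeout would need `wait_for` around each individual receive instead.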

---

## Reference

- STT: `ddx-cuda-live-stt/STT_API_USAGE.md`, `STT_FULL_CAPABILITIES.md`
- TTS: `ddx-cuda-live-tts/TTS_API_USAGE.md`, `TTS_FULL_CAPABILITIES.md`
- LLM: `ddx-mlx-llm/LLM_API_USAGE.md`, `LLM_API_ENDPOINTS.md`
- Envelope: `ddx-prd-specs/envelopes/README.md`
- Service control: `./ddx-manage.sh status|start|stop|restart|logs <svc>` (svc ∈ stt, tts, usage, llm, web, all)