Qwen3-TTS-12Hz-0.6B-Base-QTS

Qwen3-TTS-12Hz-0.6B-Base-QTS is a distribution repository for model artifacts produced by yetanother.ai/xlai (Qwen3 TTS native stack).

This Hugging Face repository is intended to contain stable, downloadable runtime artifacts only:

one shared qwen3-tts-vocoder.onnx
qwen3-tts-reference-codec.onnx and qwen3-tts-reference-codec-preprocess.json for ICL voice cloning in xlai-qts-core
one or more GGUF variants such as qwen3-tts-0.6b-f16.gguf
optional additional GGUF variants such as qwen3-tts-0.6b-q8_0.gguf

It is not the source-of-truth repository for code, export logic, or developer documentation. Those live in yetanother.ai/xlai.

Relationship To `xlai`

GitHub yetanother.ai/xlai: source code, export scripts (scripts/qts), Rust runtime (xlai-qts-core)
Hugging Face dsh0416/Qwen3-TTS-12Hz-0.6B-Base-QTS: exported runtime artifacts

Recommended maintenance flow:

1. Change behavior in the GitHub repository first.
2. Export artifacts from a known Git commit (uv run export-model-artifacts, or uv run xlai-qts-hf-release).
3. Publish only the built model files to this Hugging Face repository, preferably from the tagged GitHub Actions release workflow in xlai.
4. Keep this model card aligned with the GitHub docs, but do not treat this repository as a second source repository.

Expected root layout:

```text
{{ROOT_LAYOUT}}
README.md
SHA256SUMS
```
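
Since the root layout includes a SHA256SUMS file, downloaded artifacts can be checked against it before use. The helper below is a minimal sketch, not part of xlai: the function name `verify_artifacts` is hypothetical, and it assumes SHA256SUMS uses the standard `sha256sum` output format and that GNU coreutils is available (for `--ignore-missing`).

```shell
# Verify downloaded artifacts against the SHA256SUMS file in the same directory.
# Assumption: SHA256SUMS is in standard `sha256sum` format
# ("<hex digest>  <filename>"); this card does not confirm the format.
verify_artifacts() {
  dir="${1:-.}"
  if [ ! -f "$dir/SHA256SUMS" ]; then
    echo "SHA256SUMS not found in $dir" >&2
    return 2
  fi
  # --ignore-missing skips entries for files that were not downloaded,
  # e.g. a quantization variant you chose not to fetch (GNU coreutils only).
  (cd "$dir" && sha256sum --check --ignore-missing SHA256SUMS)
}
```

Calling `verify_artifacts /path/to/models` returns a non-zero status if any present file fails its checksum, or if no listed file is present at all.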

Notes:

qwen3-tts-vocoder.onnx is shared across all GGUF variants in this repository.
qwen3-tts-reference-codec.onnx and qwen3-tts-reference-codec-preprocess.json are required for ICL voice cloning in xlai; x-vector-only cloning does not need them.
The Rust runtime expects the GGUF, vocoder ONNX, and (for ICL) reference-codec files to live in the same directory by default (xlai_qts_core::ModelPaths).
Not every release must ship every quantization variant.
For the current artifact set, q8_0 is the recommended default download and f16 is the reference-quality export.
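
Because not every release ships every variant, a script that consumes these files may want to pick whichever GGUF is present. The sketch below prefers q8_0 and falls back to f16, following the recommendation above. The function name `pick_gguf` is hypothetical, the file names are taken from the examples in this card, and this is not a description of how xlai-qts-core itself resolves model files.

```shell
# Pick a GGUF file from a model directory: prefer q8_0, fall back to f16.
# Assumption: file names follow the examples in this card
# (qwen3-tts-0.6b-<variant>.gguf); other variants are ignored.
pick_gguf() {
  dir="$1"
  for variant in q8_0 f16; do
    candidate="$dir/qwen3-tts-0.6b-$variant.gguf"
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "no known GGUF variant found in $dir" >&2
  return 1
}
```

The function prints the chosen path on standard output, so it can be used as `gguf=$(pick_gguf /path/to/models)`.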

At the moment, the xlai exporter supports:

Other quantization types may appear in future releases once the export and validation pipeline is ready.

See the source repository for current usage and export documentation:

Typical local layout:

```text
models/
  qwen3-tts-0.6b-f16.gguf
  qwen3-tts-vocoder.onnx
  qwen3-tts-reference-codec.onnx
  qwen3-tts-reference-codec-preprocess.json
```
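
A local directory can be checked against this layout before it is handed to the runtime. The following is a minimal sketch under stated assumptions: the function name `check_model_dir` and its `icl` / `xvector` mode argument are hypothetical, and the check only mirrors the file requirements described in the notes above (at least one GGUF, the shared vocoder, and the reference-codec files for ICL).

```shell
# Check that a model directory has the files described in this card.
# Usage: check_model_dir <dir> [icl|xvector]   (mode names are hypothetical)
check_model_dir() {
  dir="$1"
  mode="${2:-icl}"
  missing=0
  # At least one GGUF variant must be present.
  if ! ls "$dir"/*.gguf >/dev/null 2>&1; then
    echo "missing: at least one *.gguf file" >&2
    missing=1
  fi
  # The vocoder is always needed; reference-codec files only for ICL cloning.
  required="qwen3-tts-vocoder.onnx"
  if [ "$mode" = "icl" ]; then
    required="$required qwen3-tts-reference-codec.onnx qwen3-tts-reference-codec-preprocess.json"
  fi
  for name in $required; do
    if [ ! -f "$dir/$name" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_model_dir /path/to/models xvector` succeeds without the reference-codec files, while the default `icl` mode reports them as missing.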

Example CLI usage:

```bash
cargo run -p xlai-qts-cli -- synthesize \
  --model-dir /path/to/models \
  --text "hello" \
  --out target/hello.wav
```

Current source repository snapshot:

Current artifact checksums:

For future releases, it is recommended to record:

Base upstream model: