Episode workflow: init → ingest → draft → review → pick
This guide ties together the MVP pipeline steps for a single episode and shows where each step writes in the workspace.
1. Initialize a workspace (init)
podcast init --episode-id ep_001 --workspace ./workspaces/ep_001
Notes:
- --workspace must not exist; the command creates it.
- If you omit --workspace, the default is ./episodes/<episode_id>.
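Because init refuses an existing workspace, a wrapper script may want to resolve the path and check it before calling the command. The following is a minimal sketch for a POSIX shell; the helper names default_workspace and workspace_is_free are hypothetical and not part of the podcast CLI.

```shell
# Hypothetical helpers; not part of the podcast CLI.

# Print the default workspace path that applies when --workspace is omitted.
default_workspace() {
  printf './episodes/%s\n' "$1"
}

# Succeed only if nothing exists at the given workspace path yet.
workspace_is_free() {
  [ ! -e "$1" ]
}

# Usage (commented out so the sketch has no side effects):
#   ws=$(default_workspace ep_001)
#   workspace_is_free "$ws" && podcast init --episode-id ep_001 --workspace "$ws"
```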
2. Export and transcribe audio
Export a master mix from Ultraschall/Reaper to MP3 first. The per-track FLAC files in the Reaper media folder are
not suitable for transcription unless one track is explicitly marked as the mix source. podcast transcribe expects a
single audio input and prefers auphonic.input_file from episode.yaml. The recommended transcription path is
podcast-transcript with the voxhelm backend.
# Example episode.yaml snippet
cat >> ./workspaces/ep_068/episode.yaml <<'YAML'
auphonic:
  input_file: /Users/jochen/Documents/REAPER Media/pp_068/pp_068.mp3
YAML
# Transcribe through podcast-pipeline using podcast-transcript + Voxhelm
VOXHELM_API_BASE=https://voxhelm.home.xn--wersdrfer-47a.de \
VOXHELM_API_KEY=your_voxhelm_token_here \
podcast transcribe \
  --workspace ./workspaces/ep_068 \
  --command transcribe \
  --arg=--backend \
  --arg=voxhelm
Notes:
- Voxhelm requires a reachable service instance and a valid API key.
- podcast transcribe imports the generated plain-text transcript into transcript/<mode>/transcript.txt inside the workspace.
- podcast-transcript still keeps its own artifacts under TRANSCRIPT_DIR (default ~/.podcast-transcripts/transcripts/).
- If the workspace has multiple possible audio inputs, set auphonic.input_file explicitly before running the command.
- If you want to override the backend or pass extra transcription flags, add more --arg entries; for example --arg=--language --arg=de.
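Since every forwarded flag and value needs its own --arg entry, a small helper can build those entries from an ordinary argument list. This is a sketch only; forward_args is a hypothetical name, and it assumes no forwarded word contains whitespace.

```shell
# Hypothetical helper; not part of the podcast CLI.
# Print one --arg=<word> entry per input word, one per line.
forward_args() {
  for word in "$@"; do
    printf '%s%s\n' '--arg=' "$word"
  done
}

# Usage (commented out; relies on word splitting, so values must not
# contain whitespace):
#   podcast transcribe --workspace ./workspaces/ep_068 --command transcribe \
#     $(forward_args --backend voxhelm --language de)
```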
3. Draft text assets (draft)
podcast draft runs the transcript chunking + summary + candidate generation pipeline. It reuses an existing workspace
if one is present (clearing stale chunks/summaries on re-run).
podcast draft \
  --workspace ./workspaces/ep_068 \
  --episode-id ep_068 \
  --host Jochen --host Dominik \
  --candidates 3
The --host flag is repeatable and persists host names to episode.yaml. On subsequent runs without --host, the
stored names are reused automatically. Host names are injected into all LLM prompts (summarization and candidate
generation) to prevent hallucinated speaker names.
When the workspace already has transcript/transcript.txt from podcast transcribe, podcast draft reuses it. Pass
--transcript /path/to/file.txt only when creating a new workspace from an external transcript or when you want to
replace the transcript currently stored in the workspace.
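A script that wraps podcast draft can use the presence of the workspace transcript to decide whether --transcript is needed. A minimal sketch, assuming the transcript/transcript.txt path named above; has_workspace_transcript is a hypothetical helper.

```shell
# Hypothetical helper; not part of the podcast CLI.
# Succeed if the workspace already holds a transcript that draft can reuse.
has_workspace_transcript() {
  [ -f "$1/transcript/transcript.txt" ]
}

# Usage (commented out so the sketch has no side effects):
#   ws=./workspaces/ep_068
#   if has_workspace_transcript "$ws"; then
#     podcast draft --workspace "$ws" --episode-id ep_068
#   else
#     podcast draft --workspace "$ws" --episode-id ep_068 --transcript /path/to/file.txt
#   fi
```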
Outputs:
- transcript/ contains the ingested transcript + chunk files.
- summaries/ contains chunk summaries and the episode summary.
- copy/candidates/<asset_id>/ contains candidate JSON + Markdown + HTML files.
4. Run the review loop (review)
podcast review \
  --workspace ./workspaces/ep_001 \
  --episode-id ep_001 \
  --asset-id description \
  --max-iterations 3
Notes:
- Add --fake-runner to use the built-in stub creator/reviewer.
- Review iterations are written under copy/reviews/<asset_id>/ and protocol state under copy/protocol/<asset_id>/.
- When the loop converges, the selected draft is written to copy/selected/<asset_id>.*.
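One way to tell from a script whether the loop produced a selected draft is to look for files matching copy/selected/<asset_id>.*. The sketch below does only that; selected_files is a hypothetical helper, and treating "files exist" as "the loop converged" is an assumption, since podcast pick writes to the same directory.

```shell
# Hypothetical helper; not part of the podcast CLI.
# Print the selected files for an asset, one per line.
# Returns non-zero if no selected file exists.
selected_files() {
  workspace=$1
  asset_id=$2
  found=1
  for path in "$workspace/copy/selected/$asset_id".*; do
    # An unmatched glob stays literal, so skip paths that do not exist.
    [ -e "$path" ] || continue
    printf '%s\n' "$path"
    found=0
  done
  return "$found"
}

# Usage (commented out):
#   selected_files ./workspaces/ep_001 description || echo "nothing selected yet"
```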
5. Pick final copy (pick)
# Web UI (recommended) — opens a browser for full-text side-by-side comparison
podcast pick --workspace ./workspaces/ep_001 --web
# CLI — interactive prompt with truncated previews
podcast pick --workspace ./workspaces/ep_001
Notes:
- --web opens a local web UI for full-text comparison of all candidates per asset. Select candidates by clicking, then press “Done” to shut down the server.
- Without --web, the CLI prompts when multiple candidates exist and writes the selection to copy/selected/.
- Use --asset-id and --candidate-id to pick a specific candidate non-interactively (CLI only).
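For non-interactive picking you first need the candidate IDs. The sketch below lists the <uuid> part of each candidate_<uuid>.json file name for an asset. It assumes that this part is the value --candidate-id expects, which this guide does not confirm; list_candidate_ids is a hypothetical helper.

```shell
# Hypothetical helper; not part of the podcast CLI.
# Print the <uuid> part of every candidate_<uuid>.json file for an asset.
list_candidate_ids() {
  for path in "$1/copy/candidates/$2"/candidate_*.json; do
    # An unmatched glob stays literal, so skip paths that do not exist.
    [ -e "$path" ] || continue
    name=${path##*/}          # candidate_<uuid>.json
    name=${name#candidate_}   # <uuid>.json
    printf '%s\n' "${name%.json}"
  done
}

# Usage (commented out; <id> is one of the printed values):
#   list_candidate_ids ./workspaces/ep_001 description
#   podcast pick --workspace ./workspaces/ep_001 --asset-id description --candidate-id <id>
```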
Episode workspace layout
ep_001/
  episode.yaml
  state.json
  transcript/
    transcript.txt
    chapters.txt
    chunks/
      chunk_0001.txt
      chunk_0001.json
  summaries/
    chunks/
      chunk_0001.summary.json
    episode/
      episode_summary.json
      episode_summary.md
      episode_summary.html
  copy/
    candidates/<asset_id>/candidate_<uuid>.{json,md,html}
    reviews/<asset_id>/iteration_XX.<reviewer>.json
    protocol/<asset_id>/iteration_XX.{json,creator.json}
    protocol/<asset_id>/state.json
    selected/<asset_id>.{md,html,txt}
    provenance/<kind>/<ref>.json
  auphonic/
    downloads/
    outputs/
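To see how far a workspace has progressed, you can compare it against the top-level entries of this layout. The sketch below reports what is missing; missing_entries is a hypothetical helper, and the guide does not say which entries exist after which step, so a missing entry is not necessarily an error.

```shell
# Hypothetical helper; not part of the podcast CLI.
# Print every top-level entry from the layout that is absent in the workspace.
missing_entries() {
  for entry in episode.yaml state.json transcript summaries copy auphonic; do
    [ -e "$1/$entry" ] || printf '%s\n' "$entry"
  done
}

# Usage (commented out):
#   missing_entries ./workspaces/ep_001
```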
Copy/paste HTML into Wagtail
Use the HTML files produced by podcast pick (or by a converged review loop) when pasting into Wagtail RichText
fields.
- Open copy/selected/<asset_id>.html.
- In Wagtail, switch the RichText field to its HTML/source mode.
- Paste the HTML and save.
The HTML is generated deterministically from Markdown and supports headings, paragraphs, lists, links, inline code, and
emphasis. If you need plain text, use the .txt output instead.