# Podroma API Guide for Agents

This document is for AI agents and developer tools that want to
programmatically read podcast insights from **podroma.com**.

Podroma converts long-form YouTube podcasts into structured insights:
a short list of "what's actually worth knowing" from each episode, each
backed by a recent real-world example, plus a speaker-labeled transcript
suitable for chat-style Q&A.

All endpoints below are public — no API key, no authentication.

Base URL: `https://podroma.com`

## Core service

Given a YouTube podcast URL, Podroma:

1. Pulls the auto-caption transcript
2. Generates 3–5 key insights with real-world examples (via Claude with
   web search)
3. Adds speaker labels (`Joe Rogan:`, `Matt Damon:`, …) to the raw
   transcript
4. Categorizes the episode (Tech & AI, Business & Finance, Health, …)
5. Caches the result so future requests are instant

Each analyzed episode lives at a stable URL: `https://podroma.com/podcast/{id}`.

## Key endpoints

### 1. Cache check (free, fast)

`GET /api/check?platform=youtube&id={video_id_or_url}`

Tells you whether a video has already been transcribed and analyzed. Use
this **before** kicking off a transcribe call to avoid burning your
daily quota.

**Request**

```http
GET /api/check?platform=youtube&id=dQw4w9WgXcQ
```

The `id` parameter accepts a bare 11-character YouTube ID **or** a full
`https://www.youtube.com/watch?v=...` URL.

**Response**

```json
{
  "cached": true,
  "id": 73,
  "ready": true,
  "synthesis_url": "https://podroma.com/podcast/73"
}
```

`cached: false` means we've never processed this video. `cached: true`
with `ready: false` means it's in progress (synthesis hasn't finished).
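
As a rough illustration of the cache check, the sketch below builds the request URL and maps a response body to one of the three states described above. The helper names (`build_check_url`, `check_state`) and the state labels are hypothetical, not part of the Podroma API; only the endpoint path, query parameters, and the `cached` / `ready` fields come from this guide.

```python
from urllib.parse import urlencode

BASE_URL = "https://podroma.com"


def build_check_url(video_id_or_url: str) -> str:
    """Build the cache-check URL. urlencode escapes a full YouTube URL safely."""
    query = urlencode({"platform": "youtube", "id": video_id_or_url})
    return f"{BASE_URL}/api/check?{query}"


def check_state(body: dict) -> str:
    """Map a /api/check response body to a state label (labels are our own)."""
    if not body.get("cached"):
        return "missing"        # never processed
    if not body.get("ready"):
        return "in_progress"    # cached, but synthesis hasn't finished
    return "ready"
```

Passing the full watch URL through `urlencode` matters because the URL itself contains `?` and `=`, which would otherwise be read as part of the outer query string.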

### 2. Transcribe (create a new episode)

`POST /api/episodes/create`

Idempotent — if the URL has already been transcribed, this returns the
existing `id` without doing any work and without consuming a transcribe
slot. Otherwise it inserts a placeholder row and returns the new `id`.

**Request**

```http
POST /api/episodes/create
Content-Type: application/json

{
  "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
}
```

**Response**

```json
{
  "id": 142,
  "alreadyAnalyzed": false
}
```

After receiving the `id`, **trigger the actual transcribe work** by
calling:

`POST /api/library/item/{id}/regenerate`

This is the call that consumes a transcribe slot against the rate
limits. Processing takes ~60–120 seconds for a typical 1–2 hour podcast.
The call returns `{ "ok": true }` on success or
`{ "ok": true, "alreadyRegenerated": true }` if the row was already done.

Then poll the read endpoint (next section) until `ready: true`.
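
One possible shape for the polling step is sketched below. The `fetch_item` callable is a hypothetical stand-in for whatever HTTP client you use to call `GET /api/library/item/{id}`, and the 10-second interval and 300-second timeout are assumptions, not values published by Podroma.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_item: Callable[[], dict],
    interval_s: float = 10.0,   # assumed polling interval
    timeout_s: float = 300.0,   # assumed upper bound on total wait
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_item() until the body has ready == True, else raise TimeoutError."""
    waited = 0.0
    while True:
        item = fetch_item()
        if item.get("ready"):
            return item
        if waited >= timeout_s:
            raise TimeoutError(f"episode not ready after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Injecting `sleep` keeps the loop easy to exercise in tests without real delays.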

### 3. Fetch episode

`GET /api/library/item/{id}`

Returns the cached metadata + synthesis. Use the `include` query param
to also pull the transcript.

**Request**

```http
GET /api/library/item/73?include=transcript_diarized
```

**Response**

```json
{
  "id": 73,
  "title": "Joe Rogan Experience #2440 - Matt Damon & Ben Affleck",
  "author": "PowerfulJRE",
  "source_url": "https://www.youtube.com/watch?v=...",
  "source_type": "youtube",
  "created_at": 1733600000000,
  "category": "Entertainment & Culture",
  "synthesis_markdown": "## 1. ...\n\n...",
  "ready": true,
  "video_id": "abc123def45",
  "thumbnail_url": "https://i.ytimg.com/vi/abc123def45/maxresdefault.jpg",
  "transcript_diarized": "Joe Rogan: Welcome back...\n\nMatt Damon: Thanks Joe..."
}
```

`include` accepts a comma-separated list:

| Value                 | What you get                                          |
|-----------------------|-------------------------------------------------------|
| `transcript`          | the raw, unlabeled YouTube auto-caption text         |
| `transcript_diarized` | the same content with `Speaker Name:` labels added   |

Both can be passed together: `?include=transcript,transcript_diarized`.

The `synthesis_markdown` field uses GitHub-flavored markdown with
numbered `## N. <title>` headings — one per insight.
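
Because each insight starts with a `## N. <title>` heading, the synthesis can be split into individual insights with a small parser. The sketch below is one way to do it; `split_insights` and the keys of the returned dictionaries are hypothetical names chosen for illustration, and it assumes headings appear at the start of a line exactly as described.

```python
import re
from typing import Optional

# Matches headings such as "## 1. Some title" at the start of a line.
HEADING = re.compile(r"^## (\d+)\. (.+)$", re.MULTILINE)


def split_insights(synthesis_markdown: Optional[str]) -> list:
    """Split synthesis_markdown into a list of {number, title, body} dicts."""
    if not synthesis_markdown:
        return []  # the field is null while the episode is still being analyzed
    matches = list(HEADING.finditer(synthesis_markdown))
    insights = []
    for i, match in enumerate(matches):
        # An insight's body runs until the next heading, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(synthesis_markdown)
        insights.append({
            "number": int(match.group(1)),
            "title": match.group(2).strip(),
            "body": synthesis_markdown[match.end():end].strip(),
        })
    return insights
```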

### 4. Ask a question (Q&A)

`POST /api/library/item/{id}/chat`

Streams a Claude response grounded in the episode's transcript + cached
insights. Returns `text/plain` token-by-token (no SSE wrapper — just raw
chunked text).

**Request**

```http
POST /api/library/item/73/chat
Content-Type: application/json

{
  "messages": [
    { "role": "user", "content": "What did Matt say about Ben's directing process?" }
  ]
}
```

`messages` is the full conversation history (user + assistant turns).
Append your new user question at the end on each call.

**Response**

```
text/plain; charset=utf-8
(streaming chunks of the answer)
```

For Q&A on transcripts you supply yourself (not cached on Podroma):

`POST /api/chat` with `{ transcript: "...", messages: [...] }`.
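
A minimal sketch of a chat client using only the Python standard library follows. The helper names (`build_chat_request`, `decode_chunks`, `ask`) are hypothetical, and the 4096-byte read size is an arbitrary choice. An incremental decoder is used because a raw chunk boundary can fall in the middle of a multi-byte UTF-8 character.

```python
import codecs
import json
from typing import Iterable, Iterator
from urllib.request import Request, urlopen

BASE_URL = "https://podroma.com"


def build_chat_request(item_id: int, messages: list) -> Request:
    """Build the POST request for the per-episode chat endpoint."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/library/item/{item_id}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_chunks(chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode streamed bytes to text, tolerating characters split across chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


def ask(item_id: int, history: list, question: str) -> str:
    """Append the question, stream the answer, and record it in the history."""
    history.append({"role": "user", "content": question})
    with urlopen(build_chat_request(item_id, history)) as resp:
        # read1() returns as soon as some bytes are available; b"" signals the end.
        raw_chunks = iter(lambda: resp.read1(4096), b"")
        answer = "".join(decode_chunks(raw_chunks))
    history.append({"role": "assistant", "content": answer})
    return answer
```

Reusing the same `history` list across calls to `ask` keeps the full conversation in each request, as the endpoint expects.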

## Rate limits

Enforced per source IP and per session cookie:

| Limit                          | Value |
|--------------------------------|-------|
| Concurrent transcribes per IP  | 2     |
| Daily transcribes per IP       | 25    |
| Daily transcribes per session  | 10    |
| Daily Q&A questions per IP     | 50    |

A "transcribe" counts when `POST /api/library/item/{id}/regenerate`
actually does work — cache hits do not count.

When you exceed a limit, the API returns **HTTP 429** with this body:

```json
{
  "error": "Daily transcription limit reached (25 per IP per 24h). Try again tomorrow.",
  "scope": "daily_ip",
  "action": "transcribe",
  "limit": 25,
  "retry_after_seconds": 1800
}
```

Response headers include `Retry-After`, `X-RateLimit-Scope`,
`X-RateLimit-Action`, and `X-RateLimit-Limit`. The `scope` field tells
you which limit fired (`concurrent_ip`, `daily_ip`, or `daily_session`)
so you can branch accordingly.

The session cookie (`podroma_sid`) is set automatically by Podroma's
middleware on the first response. Send it back on subsequent requests if
you want the session-level limit to apply (otherwise only the per-IP
limits gate you).
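
To illustrate branching on a rate-limit response, the sketch below summarizes a 429 body into a small dictionary. `rate_limit_info`, the returned keys, and the 60-second fallback wait are assumptions made for this example; only the body fields and scope names come from the description above.

```python
from typing import Optional


def rate_limit_info(status: int, body: dict, default_wait_s: int = 60) -> Optional[dict]:
    """Summarize a 429 response; return None for any other status."""
    if status != 429:
        return None
    scope = body.get("scope")
    return {
        "scope": scope,
        "action": body.get("action"),
        # Fall back to an assumed default if the body omits the field.
        "wait_s": int(body.get("retry_after_seconds", default_wait_s)),
        # Daily limits usually mean "stop for now"; a concurrent limit may
        # clear as soon as an in-flight transcribe finishes.
        "is_daily": scope in ("daily_ip", "daily_session"),
    }
```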

## Typical agent flow

```text
1. GET  /api/check?platform=youtube&id=<VID>
   → if cached && ready, GOTO 4

2. POST /api/episodes/create { url: "<youtube url>" }
   → get { id }

3. POST /api/library/item/{id}/regenerate
   → wait ~60-120s, then poll step 4

4. GET  /api/library/item/{id}?include=transcript_diarized
   → if ready=true, use synthesis_markdown + transcript

5. POST /api/library/item/{id}/chat { messages: [...] }
   → ask follow-up questions
```
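
The flow above could be wired together roughly as follows. This is a sketch under stated assumptions: `http` is a hypothetical callable of the form `http(method, path, params=None, body=None)` that sends the request and returns the parsed JSON body, and the polling interval and poll count are arbitrary. Error handling (429, duration cap) is left out for brevity.

```python
import time
from typing import Callable


def get_episode(
    video_url: str,
    http: Callable[..., dict],          # hypothetical HTTP helper, see lead-in
    sleep: Callable[[float], None] = time.sleep,
    poll_s: float = 10.0,               # assumed polling interval
    max_polls: int = 30,                # assumed cap on polling attempts
) -> dict:
    """Run steps 1-4 of the flow and return the ready episode."""
    # Step 1: cache check.
    check = http("GET", "/api/check", params={"platform": "youtube", "id": video_url})
    if check.get("cached") and check.get("ready"):
        item_id = check["id"]
    else:
        # Step 2: create (idempotent), then step 3: trigger the transcribe work.
        created = http("POST", "/api/episodes/create", body={"url": video_url})
        item_id = created["id"]
        http("POST", f"/api/library/item/{item_id}/regenerate")
    # Step 4: poll the read endpoint until the episode is ready.
    for _ in range(max_polls):
        item = http(
            "GET",
            f"/api/library/item/{item_id}",
            params={"include": "transcript_diarized"},
        )
        if item.get("ready"):
            return item
        sleep(poll_s)
    raise TimeoutError(f"episode {item_id} not ready after {max_polls} polls")
```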

## Constraints

- **Maximum video duration: 4 hours.** Longer episodes are rejected at
  `POST /api/episodes/create` with HTTP 400 and `scope: "duration_cap"`
  in the body, before any transcribe slot is consumed.
- **YouTube only.** Spotify and Apple Podcasts aren't supported.
- **Auto-captions required.** Videos without YouTube captions can't be
  transcribed (we don't run our own ASR).
- **No deletion endpoint.** Episodes are public once analyzed.
- **No webhooks.** Poll the GET endpoint for status changes.

### Duration cap error shape

```json
{
  "error": "We can analyze podcast episodes up to 4 hours long. This episode is longer than that — try a shorter episode from the same podcast.",
  "scope": "duration_cap",
  "max_duration_sec": 14400,
  "video_duration_sec": 21630
}
```
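
A client can detect this error and report how far over the cap the video is. The helper below is a hypothetical sketch that relies only on the status code and the body fields shown above.

```python
from typing import Optional


def duration_overage_s(status: int, body: dict) -> Optional[int]:
    """Return seconds over the cap for a duration_cap error, else None."""
    if status == 400 and body.get("scope") == "duration_cap":
        return body["video_duration_sec"] - body["max_duration_sec"]
    return None
```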

## Schema gotchas

- `created_at` is a Unix epoch in **milliseconds**, not seconds.
- `transcript_diarized` may be null for very old rows that pre-date the
  diarization feature; fall back to `transcript` in that case.
- `synthesis_markdown` is null while the episode is still being analyzed
  (poll until `ready: true`).
- `category` may be null for rows that pre-date the categorizer; this
  doesn't affect the synthesis or transcript content.
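
The first two gotchas can be handled with small helpers like the ones sketched here. The function names are hypothetical. Note that the transcript fallback only works if the request asked for both fields with `?include=transcript,transcript_diarized`.

```python
from datetime import datetime, timezone
from typing import Optional


def created_at_utc(item: dict) -> datetime:
    """Convert created_at (epoch milliseconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(item["created_at"] / 1000, tz=timezone.utc)


def best_transcript(item: dict) -> Optional[str]:
    """Prefer the speaker-labeled transcript; fall back to the raw one."""
    return item.get("transcript_diarized") or item.get("transcript")
```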

## Contact

Questions, bug reports, or requests for a higher rate limit? Email
[insights@podroma.com](mailto:insights@podroma.com).
