# New Provider Playbook

This is the implementation checklist for adding a working channel with as little guesswork as possible.

## Definition Of Done

A provider is not done when it compiles. It is done when:

1. `/api/status` shows the channel with sensible options and grouping.
2. `/api/videos` returns real items for the default feed.
3. Search works.
4. Pagination works.
5. Thumbnails load.
6. `video.url` or at least one `formats[*].url` resolves to playable media.
7. If the site needs proxying, the `/proxy/...` route works.
8. `HOT_TUB_PROVIDER=<id> cargo check -q` passes.

## Files To Touch

Always:

- `build.rs`
- `src/providers/<channel_id>.rs`

Sometimes:

- `src/proxy.rs`
- `src/proxies/<channel_id>.rs`
- `src/proxies/<channel_id>thumb.rs`
- `prompts/new-channel.md` if you are improving the handoff prompt
- `docs/provider-catalog.md` if you add a new provider or proxy

## Step 1: Pick The Closest Template

Do not start from an empty file.

Choose the nearest match:

- API-first site with tags/uploader metadata: copy `vjav.rs`
- HTML site with background-loaded tags/uploaders: copy `hsex.rs`
- Site with multiple large catalogs like sites/networks/stars: copy `omgxxx.rs`
- Site whose media or thumbs need local proxying: copy `noodlemagazine.rs`, `pornhd3x.rs`, `spankbang.rs`, or `porndish.rs`
- Very simple archive/search site: copy a small provider from the `mainstream-tube` group

Before writing code, confirm the site shape:

1. home or latest feed URL
2. search URL and page 2 URL
3. detail page URL shape
4. player request or manifest request
5. thumbnail host and whether it needs referer/cookies
6. tag/category/uploader/studio routes if they exist
7. whether the site exposes JSON endpoints that are easier than HTML scraping

Use browser/network tooling for this if needed. Do not guess URL patterns from one page.

## Step 2: Register The Provider

Add the provider to `build.rs`:

- `id`: channel id used by `/api/videos`
- `module`: Rust file name
- `ty`: provider struct name

If this entry is missing, the server will not discover the provider.

## Step 3: Define Channel Metadata

Every provider should export:

```rust
pub const CHANNEL_METADATA: crate::providers::ProviderChannelMetadata =
    crate::providers::ProviderChannelMetadata {
        group_id: "...",
        tags: &["...", "...", "..."],
    };
```

Pick `group_id` from the existing set in `src/providers/mod.rs`:

- `meta-search`
- `mainstream-tube`
- `tiktok`
- `studio-network`
- `amateur-homemade`
- `onlyfans`
- `chinese`
- `jav`
- `fetish-kink`
- `hentai-animation`
- `ai`
- `gay-male`
- `live-cams`
- `pmv-compilation`

## Step 4: Build The Channel Surface

Implement `build_channel` or equivalent and return it from `get_channel`.

Required:

- `id`
- `name`
- `description`
- `favicon`
- `status`
- `nsfw`

Recommended:

- `cacheDuration: Some(1800)` unless the site is unusually stable
- use standard option IDs like `sort`, `filter`, `sites`, `category`, `stars`, `categories`
- keep options minimal at first; only expose filters that actually work in `get_videos`

The option `id` values matter more than the display `title`.

## Step 5: Model Provider Routing Explicitly

Create a local enum like:

```rust
enum Target {
    Latest,
    Search { query: String },
    Tag { slug: String },
    Uploader { id: String },
}
```

Then write one function that resolves `sort`, `query`, `filter`, `sites`, and related options into a `Target`.

This is easier to debug than scattering URL decisions across the provider.
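
A minimal sketch of such a resolver, using only the standard library. The `Options` struct, the `selected` helper, the priority order, and the mapping of `filter` to a tag and `sites` to an uploader are all assumptions for illustration; the real option type and semantics come from the server crate.

```rust
// Hypothetical option bag; the real request options type lives in the server crate.
#[derive(Debug, Default)]
struct Options {
    query: Option<String>,
    filter: Option<String>, // assumed here to carry a tag slug
    sites: Option<String>,  // assumed here to carry an uploader or site id
}

#[derive(Debug, PartialEq)]
enum Target {
    Latest,
    Search { query: String },
    Tag { slug: String },
    Uploader { id: String },
}

/// Returns the trimmed value unless it is missing or empty.
fn non_empty(value: &Option<String>) -> Option<String> {
    let v = value.as_deref()?.trim();
    if v.is_empty() {
        None
    } else {
        Some(v.to_string())
    }
}

/// Like `non_empty`, but also treats the "All" placeholder as "nothing selected".
fn selected(value: &Option<String>) -> Option<String> {
    non_empty(value).filter(|v| !v.eq_ignore_ascii_case("all"))
}

/// The single place where options become a routing decision.
/// Assumed priority: search query, then uploader, then tag, then latest.
fn resolve_target(options: &Options) -> Target {
    if let Some(query) = non_empty(&options.query) {
        return Target::Search { query };
    }
    if let Some(id) = selected(&options.sites) {
        return Target::Uploader { id };
    }
    if let Some(slug) = selected(&options.filter) {
        return Target::Tag { slug };
    }
    Target::Latest
}
```

Because every URL decision starts from one `Target` value, a wrong feed can be traced by logging that value once per request.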

## Step 6: Load Filter Catalogs In The Background If Needed

If the site exposes tags, uploaders, studios, networks, or stars:

- store them in `Arc<RwLock<Vec<FilterOption>>>`
- initialize them with an `All` option
- spawn a background thread in `new()`
- create a tiny Tokio runtime inside that thread
- fill the lists without blocking server startup

Patterns:

- `hsex.rs`
- `omgxxx.rs`
- `pornhd3x.rs`
- `vjav.rs`
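
The loading pattern above can be sketched with the standard library alone. `FilterOption` here is a hypothetical stand-in for the project type, `Catalog` and `snapshot` are illustrative names, and the `fetch` closure replaces the real scraping call. The Tokio runtime that the providers create inside the thread is left out so the sketch has no dependencies.

```rust
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

// Hypothetical stand-in for the project's FilterOption type.
#[derive(Debug, Clone, PartialEq)]
struct FilterOption {
    id: String,
    title: String,
}

struct Catalog {
    tags: Arc<RwLock<Vec<FilterOption>>>,
}

impl Catalog {
    /// Seeds the list with "All" and starts a loader thread.
    /// Returns immediately, so startup is never blocked on the site.
    /// The JoinHandle is returned only so callers (and tests) can wait if they want to.
    fn new<F>(fetch: F) -> (Self, JoinHandle<()>)
    where
        F: FnOnce() -> Vec<FilterOption> + Send + 'static,
    {
        let tags = Arc::new(RwLock::new(vec![FilterOption {
            id: "all".to_string(),
            title: "All".to_string(),
        }]));
        let shared = Arc::clone(&tags);
        let handle = thread::spawn(move || {
            // Do the slow work before taking the write lock.
            let loaded = fetch();
            if let Ok(mut list) = shared.write() {
                list.extend(loaded);
            }
        });
        (Self { tags }, handle)
    }

    /// Copies the current list; readers never wait on the network.
    fn snapshot(&self) -> Vec<FilterOption> {
        self.tags
            .read()
            .map(|list| list.clone())
            .unwrap_or_default()
    }
}
```

Until the loader finishes, `snapshot` returns just the `All` entry, which keeps the status response valid during startup.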

If tags or uploaders need stable IDs, keep a lookup map such as:

- `HashMap<String, String>` from title to site ID
- `HashMap<String, String>` from site ID to URL target

Normalize lookup keys to lowercase trimmed strings.
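
One way to enforce that normalization is to route every insert and lookup through the same helper. `normalize_key` and `TagIndex` are hypothetical names used only for this sketch.

```rust
use std::collections::HashMap;

/// Lowercases and trims a title so lookups ignore case and stray whitespace.
fn normalize_key(raw: &str) -> String {
    raw.trim().to_lowercase()
}

/// Hypothetical lookup from display title to site-local id.
#[derive(Default)]
struct TagIndex {
    title_to_id: HashMap<String, String>,
}

impl TagIndex {
    fn insert(&mut self, title: &str, site_id: &str) {
        self.title_to_id
            .insert(normalize_key(title), site_id.to_string());
    }

    fn id_for(&self, title: &str) -> Option<&str> {
        self.title_to_id
            .get(&normalize_key(title))
            .map(String::as_str)
    }
}
```

Keeping the map private behind `insert` and `id_for` means no call site can forget to normalize.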

## Step 7: Fetch Pages Through The Shared Requester

In `get_videos`, start with:

```rust
let mut requester = requester_or_default(&options, CHANNEL_ID, "get_videos");
```

Use it for HTML, JSON, and raw media requests.

Why:

- it preserves cookies
- it carries debug trace IDs
- it respects Burp proxying
- it can fall back to Jina or FlareSolverr

## Step 8: Parse Listing Cards First, Then Enrich Only If Needed

Preferred flow:

1. Fetch the archive or search page.
2. Parse a lightweight list of stubs.
3. Return list data directly if enough metadata is already present.
4. Fetch detail pages or JSON endpoints only for fields the card does not expose.

Use bounded concurrency for detail enrichment. Existing providers usually use `futures::stream` with `buffer_unordered`.
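
To illustrate the idea of a concurrency cap without pulling in dependencies, here is a std-only sketch that processes stubs in batches of scoped threads. This is a swapped-in technique, not the `futures::stream` plus `buffer_unordered` approach the existing providers use: it waits for each batch to finish before starting the next and keeps input order. `enrich_bounded` is a hypothetical name.

```rust
use std::thread;

/// Runs `enrich` over `stubs` with at most `limit` calls in flight at once.
/// Results are returned in input order.
fn enrich_bounded<T, R, F>(stubs: Vec<T>, limit: usize, enrich: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let limit = limit.max(1); // a limit of 0 would never make progress
    let mut out = Vec::with_capacity(stubs.len());
    let mut pending = stubs.into_iter();
    loop {
        // Take the next batch; its size is the concurrency cap.
        let batch: Vec<T> = pending.by_ref().take(limit).collect();
        if batch.is_empty() {
            break;
        }
        let enrich = &enrich;
        let results: Vec<R> = thread::scope(|scope| {
            let handles: Vec<_> = batch
                .into_iter()
                .map(|stub| scope.spawn(move || enrich(stub)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("enrichment thread panicked"))
                .collect()
        });
        out.extend(results);
    }
    out
}
```

Whichever mechanism is used, the cap is what keeps a 40-item listing page from opening 40 simultaneous detail requests against the site.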

## Step 9: Build High-Quality `VideoItem`s

Always fill:

- `id`
- `title`
- `url`
- `channel`
- `thumb`
- `duration`

Fill when available:

- `views`
- `rating`
- `uploader`
- `uploaderUrl`
- `uploaderId`
- `tags`
- `uploadedAt`
- `preview`
- `aspectRatio`
- `formats`

Rules:

- Keep `tags` as a list of displayable titles.
- Keep uploader data as structured fields, not mashed into the title.
- If you support uploader profiles, set `uploaderId` to a namespaced value like `<channel>:<site-local-id>`.
- Do not include `embed` unless the provider truly needs it.
- If direct media exists, prefer `formats` and keep `url` stable.
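
The namespacing rule for `uploaderId` can be sketched as a pair of helpers. Both function names are hypothetical, and treating only the first `:` as the separator is an assumption made so that site-local IDs may themselves contain colons.

```rust
/// Builds an uploader id of the form `<channel>:<site-local-id>`.
fn namespaced_uploader_id(channel: &str, site_local_id: &str) -> String {
    format!("{}:{}", channel.trim(), site_local_id.trim())
}

/// Splits a namespaced id back into (channel, site-local-id).
/// Only the first ':' is treated as the separator.
fn split_uploader_id(id: &str) -> Option<(&str, &str)> {
    let (channel, local) = id.split_once(':')?;
    if channel.is_empty() || local.is_empty() {
        return None;
    }
    Some((channel, local))
}
```

The channel prefix lets the server tell which provider owns an ID without guessing from its shape.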

## Step 10: Decide Whether A Proxy Is Required

Use no proxy when:

- page URLs are enough and the client can resolve media itself
- or direct media URLs already work cleanly

Use a redirect proxy when:

- the provider must turn a detail URL into a resolved media URL
- headers/cookies do not need full response rewriting

Use a media/image proxy when:

- the site requires a referer for every fetch
- thumbnails need cookie-backed access
- manifests contain relative URIs that must be rewritten
- the server must stream binary content itself
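
The three cases above can be written down as a small decision function, which is a useful way to record what was observed about the site. `SiteTraits`, `ProxyKind`, and `choose_proxy` are hypothetical names for this sketch and do not exist in the codebase.

```rust
#[derive(Debug, PartialEq)]
enum ProxyKind {
    None,
    Redirect,
    Media,
}

// Hypothetical facts gathered while inspecting the site.
#[derive(Debug, Default)]
struct SiteTraits {
    referer_on_every_fetch: bool,
    thumbs_need_cookies: bool,
    manifest_has_relative_uris: bool,
    server_must_stream: bool,
    media_url_needs_resolving: bool,
}

/// Picks the lightest proxy that still satisfies the site's requirements.
fn choose_proxy(site: &SiteTraits) -> ProxyKind {
    // Any of these forces the server to fetch or rewrite the bytes itself.
    let needs_media_proxy = site.referer_on_every_fetch
        || site.thumbs_need_cookies
        || site.manifest_has_relative_uris
        || site.server_must_stream;

    if needs_media_proxy {
        ProxyKind::Media
    } else if site.media_url_needs_resolving {
        // Resolving a detail URL to media needs no response rewriting.
        ProxyKind::Redirect
    } else {
        ProxyKind::None
    }
}
```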

If a proxy is needed:

1. add `src/proxies/<id>.rs`
2. wire the route in `src/proxy.rs`
3. generate provider URLs with `build_proxy_url(&options, "<id>", target)`

## Step 11: Implement Search Correctly

Check for three search modes:

1. native site search endpoint
2. tag/uploader shortcut search from preloaded filter catalogs
3. literal client-side substring search after fetch, triggered by quoted queries

Important server behavior:

- `#tag` becomes `tag`
- `"teacher"` becomes a literal post-fetch filter
- raw URL queries may bypass the provider through the `yt-dlp` fast path

Provider guidance:

- if the query matches a known tag/uploader shortcut, prefer the site’s direct archive URL instead of generic search
- otherwise fall back to the site’s keyword search
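
The query shapes described above can be illustrated with a small classifier. This is a sketch of the described behavior, not the server's actual code: `QueryKind`, `classify_query`, and `literal_match` are hypothetical names, and the case-insensitive literal match is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum QueryKind {
    Empty,
    Tag(String),     // `#tag` shortcut
    Literal(String), // quoted query: filter results after fetch
    RawUrl(String),  // may be taken by the yt-dlp fast path instead
    Keyword(String), // plain site search
}

fn classify_query(raw: &str) -> QueryKind {
    let q = raw.trim();
    if q.is_empty() {
        return QueryKind::Empty;
    }
    if let Some(tag) = q.strip_prefix('#') {
        let tag = tag.trim();
        if !tag.is_empty() {
            return QueryKind::Tag(tag.to_string());
        }
    }
    // The quote character is one byte, so slicing it off is safe.
    if q.len() >= 2 && q.starts_with('"') && q.ends_with('"') {
        let inner = q[1..q.len() - 1].trim();
        if !inner.is_empty() {
            return QueryKind::Literal(inner.to_string());
        }
    }
    if q.starts_with("http://") || q.starts_with("https://") {
        return QueryKind::RawUrl(q.to_string());
    }
    QueryKind::Keyword(q.to_string())
}

/// Post-fetch substring filter for literal queries (case-insensitive by assumption).
fn literal_match(title: &str, needle: &str) -> bool {
    title.to_lowercase().contains(&needle.to_lowercase())
}
```

A provider would typically check `Tag` and `Keyword` values against its preloaded catalogs first and only then fall back to the site's keyword search.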

## Step 12: Support Pagination Explicitly

Do not assume pagination is `?page=N`.

Confirm:

- archive page 2 URL shape
- search page 2 URL shape
- tag page 2 URL shape
- uploader page 2 URL shape

If the site uses infinite scroll or an XHR endpoint, document that in code comments and hit the underlying endpoint directly.
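
A sketch of a per-target page URL builder makes the point concrete. Every URL shape below is invented for illustration on `example.com`; confirm the real shapes on the site before copying any of them. `page_url` and `encode_component` are hypothetical helpers.

```rust
enum Target {
    Latest,
    Search { query: String },
    Tag { slug: String },
    Uploader { id: String },
}

/// Percent-encodes everything except RFC 3986 unreserved characters.
fn encode_component(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for b in raw.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Each target gets its own pagination shape; none of them is assumed to be `?page=N`.
fn page_url(base: &str, target: &Target, page: u32) -> String {
    let page = page.max(1);
    match target {
        // Hypothetical: page 1 of the archive is the bare home URL.
        Target::Latest if page == 1 => format!("{}/", base),
        Target::Latest => format!("{}/page/{}/", base, page),
        // Hypothetical: search paginates with a query parameter.
        Target::Search { query } => {
            format!("{}/search/?q={}&p={}", base, encode_component(query), page)
        }
        // Hypothetical: tags and uploaders paginate with a path segment.
        Target::Tag { slug } => format!("{}/tags/{}/{}/", base, encode_component(slug), page),
        Target::Uploader { id } => format!("{}/user/{}/{}/", base, encode_component(id), page),
    }
}
```

Keeping all four shapes in one `match` makes it obvious when one of them has not been verified against page 2 of the real site.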

## Step 13: Only Add `/api/uploaders` When The Site Has Real Uploader Identity

Uploader support is optional. Only implement it when the site exposes stable uploader pages or IDs.

Use `hsex.rs`, `omgxxx.rs`, or `vjav.rs` as the template.

Minimum expectations for `UploaderProfile`:

- stable `id`
- `name`
- `channel`
- `videoCount`
- `totalViews`

Nice to have:

- `avatar`
- `description`
- `videos`
- `layout`
- per-channel stats

## Validation Checklist

Run all of these:

```bash
cargo check -q
HOT_TUB_PROVIDER=<channel_id> cargo check -q
HOT_TUB_PROVIDER=<channel_id> cargo run --features debug
```

Then hit:

```bash
curl -s http://127.0.0.1:18080/api/status \
  -H 'User-Agent: Hot%20Tub/22c CFNetwork/1494.0.7 Darwin/23.4.0' | jq
```

```bash
curl -s http://127.0.0.1:18080/api/videos \
  -H 'Content-Type: application/json' \
  -d '{"channel":"<channel_id>","sort":"new","page":1,"perPage":10}' | jq
```

Also verify:

- search query works
- page 2 works
- tag shortcut works if implemented
- uploader shortcut works if implemented
- `yt-dlp '<video.url or first format url>'` resolves media
- thumbnail URL returns an image
- proxy route returns a `302` or working media body, whichever is expected
- if uploaders are implemented, `/api/uploaders` works with both `uploaderId` and `uploaderName`

## Common Failure Modes

- Forgot `build.rs` entry.
- Returned page URLs but no playable media/formats.
- Used a local requester instead of the shared one and lost cookies.
- Built `/proxy/...` URLs without `public_url_base`.
- Put human-readable titles into filter IDs, making routing brittle.
- Added huge option lists to the status response without background loading.
- Implemented search but not search pagination.
- Implemented proxies but forgot to test them independently with `curl -I`.

## Best Reference Matrix

- Rich uploader support: `vjav.rs`, `hsex.rs`, `omgxxx.rs`
- Tag and uploader lookup maps: `vjav.rs`, `hsex.rs`
- Background catalog loading: `hsex.rs`, `omgxxx.rs`, `pornhd3x.rs`
- Redirect proxy: `spankbang.rs` plus `src/proxies/spankbang.rs`
- Manifest or image proxy: `noodlemagazine.rs` plus `src/proxies/noodlemagazine.rs`
- Complex detail enrichment: `pornhd3x.rs`