# New Provider Playbook

This is the implementation checklist for adding a working channel with the least guessing.

## Definition Of Done

A provider is not done when it compiles. It is done when:

1. `/api/status` shows the channel with sensible options and grouping.
2. `/api/videos` returns real items for the default feed.
3. Search works.
4. Pagination works.
5. Thumbnails load.
6. `video.url` or at least one `formats[*].url` resolves to playable media.
7. If the site needs proxying, the `/proxy/...` route works.
8. `HOT_TUB_PROVIDER=<id> cargo check -q` passes.

## Files To Touch

Always:

- `build.rs`
- `src/providers/<channel_id>.rs`

Sometimes:

- `src/proxy.rs`
- `src/proxies/<channel_id>.rs`
- `src/proxies/<channel_id>thumb.rs`
- `prompts/new-channel.md` if you are improving the handoff prompt
- `docs/provider-catalog.md` if you add a new provider or proxy

## Step 1: Pick The Closest Template

Do not start from an empty file.

Choose the nearest match:

- API-first site with tags/uploader metadata: copy `vjav.rs`
- HTML site with background-loaded tags/uploaders: copy `hsex.rs`
- Site with multiple large catalogs like sites/networks/stars: copy `omgxxx.rs`
- Site whose media or thumbs need local proxying: copy `noodlemagazine.rs`, `pornhd3x.rs`, `spankbang.rs`, or `porndish.rs`
- Very simple archive/search site: copy a small provider from `mainstream-tube`

Before writing code, confirm the site shape:

1. home or latest feed URL
2. search URL and page 2 URL
3. detail page URL shape
4. player request or manifest request
5. thumbnail host and whether it needs referer/cookies
6. tag/category/uploader/studio routes if they exist
7. whether the site exposes JSON endpoints that are easier than HTML scraping

Use browser/network tooling for this if needed. Do not guess URL patterns from one page.

## Step 2: Register The Provider

Add the provider to `build.rs`:

- `id`: channel id used by `/api/videos`
- `module`: Rust file name
- `ty`: provider struct name

If this entry is missing, the server will not discover the provider.

## Step 3: Define Channel Metadata

Every provider should export:

```rust
pub const CHANNEL_METADATA: crate::providers::ProviderChannelMetadata =
    crate::providers::ProviderChannelMetadata {
        group_id: "...",
        tags: &["...", "...", "..."],
    };
```

Pick `group_id` from the existing set in `src/providers/mod.rs`:

- `meta-search`
- `mainstream-tube`
- `tiktok`
- `studio-network`
- `amateur-homemade`
- `onlyfans`
- `chinese`
- `jav`
- `fetish-kink`
- `hentai-animation`
- `ai`
- `gay-male`
- `live-cams`
- `pmv-compilation`

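As an illustration, a filled-in declaration might look like the sketch below. The struct is a local stand-in so the snippet compiles on its own; the real `ProviderChannelMetadata` lives in `src/providers/mod.rs`, and its field types are assumed here from the literal shown in this step. The example tags, `KNOWN_GROUP_IDS`, and `is_known_group` are hypothetical names that are not part of the codebase.

```rust
/// Local stand-in for `crate::providers::ProviderChannelMetadata`.
/// Field types are an assumption based on the literal in this step.
pub struct ProviderChannelMetadata {
    pub group_id: &'static str,
    pub tags: &'static [&'static str],
}

/// The group ids listed in this playbook (hypothetical constant).
pub const KNOWN_GROUP_IDS: &[&str] = &[
    "meta-search", "mainstream-tube", "tiktok", "studio-network",
    "amateur-homemade", "onlyfans", "chinese", "jav", "fetish-kink",
    "hentai-animation", "ai", "gay-male", "live-cams", "pmv-compilation",
];

/// Example values only; pick the group and tags that match the site.
pub const CHANNEL_METADATA: ProviderChannelMetadata = ProviderChannelMetadata {
    group_id: "mainstream-tube",
    tags: &["tube", "search", "hd"],
};

/// Hypothetical sanity check: true when `id` is one of the listed groups.
pub fn is_known_group(id: &str) -> bool {
    KNOWN_GROUP_IDS.contains(&id)
}
```

A check like `is_known_group(CHANNEL_METADATA.group_id)` in a unit test could catch a display name such as `"Mainstream Tube"` being used where the id is expected.
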
## Step 4: Build The Channel Surface

Implement `build_channel` or equivalent and return it from `get_channel`.

Required:

- `id`
- `name`
- `description`
- `favicon`
- `status`
- `nsfw`

Recommended:

- `cacheDuration: Some(1800)` unless the site is unusually stable
- use standard option IDs like `sort`, `filter`, `sites`, `category`, `stars`, `categories`
- keep options minimal at first; only expose filters that actually work in `get_videos`

The option `id` values matter more than the display `title`.

## Step 5: Model Provider Routing Explicitly

Create a local enum like:

```rust
enum Target {
    Latest,
    Search { query: String },
    Tag { slug: String },
    Uploader { id: String },
}
```

Then write one function that resolves `sort`, `query`, `filter`, `sites`, and related options into a `Target`.

This is easier to debug than scattering URL decisions across the provider.

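A minimal sketch of such a resolver follows. `RequestOptions` is a hypothetical, simplified stand-in for the real options type, and the precedence order (uploader, then tag filter, then query, then latest) is an assumption that each provider would adjust to its site.

```rust
#[derive(Debug, PartialEq)]
enum Target {
    Latest,
    Search { query: String },
    Tag { slug: String },
    Uploader { id: String },
}

/// Hypothetical, simplified view of the request options.
#[derive(Default)]
struct RequestOptions {
    query: Option<String>,
    filter: Option<String>,   // tag slug chosen from a filter list
    uploader: Option<String>, // site-local uploader id
}

/// Trim the value and treat empty strings as "not set".
fn non_empty(value: &Option<String>) -> Option<String> {
    value
        .as_deref()
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .map(str::to_owned)
}

/// Resolve the options into exactly one target.
/// Assumed precedence: uploader, tag filter, free-text query, latest feed.
fn resolve_target(opts: &RequestOptions) -> Target {
    if let Some(id) = non_empty(&opts.uploader) {
        return Target::Uploader { id };
    }
    // The `All` placeholder in a filter list means "no filter selected".
    if let Some(slug) = non_empty(&opts.filter).filter(|s| !s.eq_ignore_ascii_case("all")) {
        return Target::Tag { slug };
    }
    if let Some(query) = non_empty(&opts.query) {
        return Target::Search { query };
    }
    Target::Latest
}
```

With every routing decision in one function, a single debug log of the resolved `Target` shows which URL family a request will hit.
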
## Step 6: Load Filter Catalogs In The Background If Needed

If the site exposes tags, uploaders, studios, networks, or stars:

- store them in `Arc<RwLock<Vec<FilterOption>>>`
- initialize them with an `All` option
- spawn a background thread in `new()`
- create a tiny Tokio runtime inside that thread
- fill the lists without blocking server startup

Patterns:

- `hsex.rs`
- `omgxxx.rs`
- `pornhd3x.rs`
- `vjav.rs`

If tags or uploaders need stable IDs, keep a lookup map such as:

- `HashMap<String, String>` from title to site ID
- `HashMap<String, String>` from site ID to URL target

Normalize lookup keys to lowercase trimmed strings.

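The pattern can be sketched with the standard library alone. This is not the code from the reference providers: `FilterOption` is a local stand-in, `Catalog` and `normalize_key` are hypothetical names, and the Tokio runtime is replaced by a plain function pointer so the snippet has no dependencies. The constructor also returns the `JoinHandle`, which a real provider would likely drop; it is returned here only so the behavior can be checked deterministically.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Local stand-in for the project's filter option type.
#[derive(Clone, Debug, PartialEq)]
struct FilterOption {
    id: String,
    title: String,
}

/// Lowercase, trimmed key so lookups tolerate case and stray spaces.
fn normalize_key(raw: &str) -> String {
    raw.trim().to_lowercase()
}

struct Catalog {
    tags: Arc<RwLock<Vec<FilterOption>>>,
    tag_ids: Arc<RwLock<HashMap<String, String>>>, // normalized title -> site id
}

impl Catalog {
    /// Returns immediately; the list starts with only the `All` entry.
    /// `fetch` stands in for the real network call and yields (id, title) pairs.
    fn new(fetch: fn() -> Vec<(String, String)>) -> (Self, JoinHandle<()>) {
        let tags = Arc::new(RwLock::new(vec![FilterOption {
            id: "all".to_string(),
            title: "All".to_string(),
        }]));
        let tag_ids = Arc::new(RwLock::new(HashMap::new()));

        let (tags_bg, ids_bg) = (Arc::clone(&tags), Arc::clone(&tag_ids));
        let handle = thread::spawn(move || {
            // A real provider would build a small Tokio runtime here and
            // block on an async fetch; this sketch calls a plain function.
            for (id, title) in fetch() {
                if let (Ok(mut list), Ok(mut map)) = (tags_bg.write(), ids_bg.write()) {
                    map.insert(normalize_key(&title), id.clone());
                    list.push(FilterOption { id, title });
                }
            }
        });
        (Catalog { tags, tag_ids }, handle)
    }

    /// Look up a site id by display title using the normalized key.
    fn tag_id(&self, title: &str) -> Option<String> {
        self.tag_ids.read().ok()?.get(&normalize_key(title)).cloned()
    }
}
```

Because the `All` entry is pushed before the thread starts, a status request that arrives early still gets a valid, if short, option list.
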
## Step 7: Fetch Pages Through The Shared Requester

In `get_videos`, start with:

```rust
let mut requester = requester_or_default(&options, CHANNEL_ID, "get_videos");
```

Use it for HTML, JSON, and raw media requests.

Why:

- it preserves cookies
- it carries debug trace IDs
- it respects Burp proxying
- it can fall back to Jina or FlareSolverr

## Step 8: Parse Listing Cards First, Then Enrich Only If Needed

Preferred flow:

1. Fetch the archive or search page.
2. Parse a lightweight list of stubs.
3. Return list data directly if enough metadata is already present.
4. Fetch detail pages or JSON endpoints only for fields the card does not expose.

Use bounded concurrency for detail enrichment. Existing providers usually use `futures::stream` with `buffer_unordered`.

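The "enrich only if needed" decision in steps 3 and 4 can be sketched as a simple partition. `VideoStub`, its fields, and `split_for_enrichment` are hypothetical names for illustration, and which fields count as required on a card is an assumption. The sketch shows only the split, not the fetch or the concurrency limit.

```rust
/// Hypothetical lightweight record parsed from a listing card.
#[derive(Debug, Clone, PartialEq)]
struct VideoStub {
    detail_url: String,
    title: String,
    thumb: Option<String>,
    duration_secs: Option<u32>,
}

impl VideoStub {
    /// A card is complete when every field we need was on the listing page.
    fn is_complete(&self) -> bool {
        self.thumb.is_some() && self.duration_secs.is_some()
    }
}

/// Split stubs into (ready to return, still need a detail fetch).
fn split_for_enrichment(stubs: Vec<VideoStub>) -> (Vec<VideoStub>, Vec<VideoStub>) {
    stubs.into_iter().partition(VideoStub::is_complete)
}
```

Only the second list would then be sent through the bounded detail fetch, so a listing page whose cards are already complete costs a single request.
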
## Step 9: Build High-Quality `VideoItem`s

Always fill:

- `id`
- `title`
- `url`
- `channel`
- `thumb`
- `duration`

Fill when available:

- `views`
- `rating`
- `uploader`
- `uploaderUrl`
- `uploaderId`
- `tags`
- `uploadedAt`
- `preview`
- `aspectRatio`
- `formats`

Rules:

- Keep `tags` as a list of displayable titles.
- Keep uploader data as structured fields, not mashed into the title.
- If you support uploader profiles, set `uploaderId` to a namespaced value like `<channel>:<site-local-id>`.
- Do not include `embed` unless the provider truly needs it.
- If direct media exists, prefer `formats` and keep `url` stable.

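The namespaced `uploaderId` rule can be sketched with two small helpers. Both function names are hypothetical, and splitting on the first `:` only is an assumption, made so that site-local ids containing colons survive a round trip.

```rust
/// Build `<channel>:<site-local-id>`.
fn namespaced_uploader_id(channel: &str, local_id: &str) -> String {
    format!("{channel}:{local_id}")
}

/// Split a namespaced id back into (channel, site-local id).
/// Only the first `:` separates, so local ids may contain colons.
fn split_uploader_id(id: &str) -> Option<(&str, &str)> {
    let (channel, local) = id.split_once(':')?;
    if channel.is_empty() || local.is_empty() {
        return None;
    }
    Some((channel, local))
}
```

Keeping the channel in the id lets a later uploader lookup route to the right provider without guessing.
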
## Step 10: Decide Whether A Proxy Is Required

Use no proxy when:

- page URLs are enough and the client can resolve media itself
- or direct media URLs already work cleanly

Use a redirect proxy when:

- the provider must turn a detail URL into a resolved media URL
- headers/cookies do not need full response rewriting

Use a media/image proxy when:

- the site requires a referer for every fetch
- thumbnails need cookie-backed access
- manifests contain relative URIs that must be rewritten
- the server must stream binary content itself

If a proxy is needed:

1. add `src/proxies/<id>.rs`
2. wire the route in `src/proxy.rs`
3. generate provider URLs with `build_proxy_url(&options, "<id>", target)`

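The three cases above can be written down as a small decision function. This only restates the checklist as code: `ProxyKind`, `SiteTraits`, and `choose_proxy` are hypothetical names, and nothing like them is claimed to exist in the codebase.

```rust
#[derive(Debug, PartialEq)]
enum ProxyKind {
    /// Page or media URLs work as-is.
    Direct,
    /// The server resolves a detail URL and answers with a redirect.
    Redirect,
    /// The server fetches and streams (or rewrites) the content itself.
    Media,
}

/// Hypothetical summary of what was observed while probing the site.
#[derive(Default)]
struct SiteTraits {
    needs_server_side_resolution: bool,
    needs_referer_on_every_fetch: bool,
    thumbs_need_cookies: bool,
    manifest_has_relative_uris: bool,
    must_stream_binary: bool,
}

/// Pick the lightest proxy that still satisfies the observed constraints.
fn choose_proxy(t: &SiteTraits) -> ProxyKind {
    if t.needs_referer_on_every_fetch
        || t.thumbs_need_cookies
        || t.manifest_has_relative_uris
        || t.must_stream_binary
    {
        ProxyKind::Media
    } else if t.needs_server_side_resolution {
        ProxyKind::Redirect
    } else {
        ProxyKind::Direct
    }
}
```

The media conditions are checked first because any one of them rules out a plain redirect.
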
## Step 11: Implement Search Correctly

Check for three search modes:

1. native site search endpoint
2. tag/uploader shortcut search from preloaded filter catalogs
3. literal client-side substring search after fetch, triggered by quoted queries

Important server behavior:

- `#tag` becomes `tag`
- `"teacher"` becomes a literal post-fetch filter
- raw URL queries may bypass the provider through the `yt-dlp` fast path

Provider guidance:

- if the query matches a known tag/uploader shortcut, prefer the site’s direct archive URL instead of generic search
- otherwise fall back to the site’s keyword search

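To make the query shapes concrete, here is a sketch of a classifier that mirrors the behavior described above. It is an illustration, not the server's actual code: `QueryKind`, `classify_query`, and `matches_literal` are hypothetical names, and treating the literal match as case-insensitive is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum QueryKind {
    /// Raw URL, which may be handled before the provider is involved.
    Url(String),
    /// `#tag` with the leading `#` removed.
    Tag(String),
    /// `"phrase"` with the quotes removed; applied as a post-fetch filter.
    Literal(String),
    /// Anything else goes to the site's keyword search.
    Keyword(String),
}

fn classify_query(raw: &str) -> QueryKind {
    let q = raw.trim();
    if q.starts_with("http://") || q.starts_with("https://") {
        return QueryKind::Url(q.to_string());
    }
    if let Some(tag) = q.strip_prefix('#') {
        return QueryKind::Tag(tag.trim().to_string());
    }
    // The quote character is one byte, so slicing it off is boundary-safe.
    if q.len() >= 2 && q.starts_with('"') && q.ends_with('"') {
        return QueryKind::Literal(q[1..q.len() - 1].to_string());
    }
    QueryKind::Keyword(q.to_string())
}

/// Substring check for the literal mode (case-insensitive by assumption).
fn matches_literal(title: &str, needle: &str) -> bool {
    title.to_lowercase().contains(&needle.to_lowercase())
}
```

A provider would typically check a `Tag` or `Keyword` value against its preloaded catalogs first and only then fall back to the site's keyword search.
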
## Step 12: Support Pagination Explicitly

Do not assume pagination is `?page=N`.

Confirm:

- archive page 2 URL shape
- search page 2 URL shape
- tag page 2 URL shape
- uploader page 2 URL shape

If the site uses infinite scroll or an XHR endpoint, document that in code comments and hit the underlying endpoint directly.

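One way to keep pagination explicit is to record the confirmed URL shape per route and build page URLs from it. The sketch below is illustrative only: `Pagination` and `page_url` are hypothetical names, the three styles are common shapes rather than a complete list, and returning the base URL unchanged for page 1 is an assumption to verify against the site.

```rust
/// Hypothetical description of how a route paginates.
enum Pagination {
    /// `?<name>=N`, or `&<name>=N` if the URL already has a query string.
    QueryParam(&'static str),
    /// `/<prefix>/N/` appended to the base path.
    PathSegment(&'static str),
    /// Offset-based endpoints: `?<name>=(N-1)*per_page`.
    Offset { name: &'static str, per_page: u32 },
}

/// Build the URL for a 1-based page number.
fn page_url(base: &str, style: &Pagination, page: u32) -> String {
    let page = page.max(1);
    let sep = if base.contains('?') { '&' } else { '?' };
    match style {
        // Assumption: page 1 of these styles is the bare base URL.
        Pagination::QueryParam(_) | Pagination::PathSegment(_) if page == 1 => base.to_string(),
        Pagination::QueryParam(name) => format!("{base}{sep}{name}={page}"),
        Pagination::PathSegment(prefix) => {
            format!("{}/{prefix}/{page}/", base.trim_end_matches('/'))
        }
        Pagination::Offset { name, per_page } => {
            format!("{base}{sep}{name}={}", (page - 1) * per_page)
        }
    }
}
```

For example, `page_url("https://example.com/tag/x/", &Pagination::PathSegment("page"), 2)` yields `https://example.com/tag/x/page/2/`. Archive, search, tag, and uploader routes can each carry their own `Pagination` value when they differ.
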
## Step 13: Only Add `/api/uploaders` When The Site Has Real Uploader Identity

Uploader support is optional. Only implement it when the site exposes stable uploader pages or IDs.

Use `hsex.rs`, `omgxxx.rs`, or `vjav.rs` as the template.

Minimum expectations for `UploaderProfile`:

- stable `id`
- `name`
- `channel`
- `videoCount`
- `totalViews`

Nice to have:

- `avatar`
- `description`
- `videos`
- `layout`
- per-channel stats

## Validation Checklist

Run all of these:

```bash
cargo check -q
HOT_TUB_PROVIDER=<channel_id> cargo check -q
HOT_TUB_PROVIDER=<channel_id> cargo run --features debug
```

Then hit:

```bash
curl -s http://127.0.0.1:18080/api/status \
  -H 'User-Agent: Hot%20Tub/22c CFNetwork/1494.0.7 Darwin/23.4.0' | jq
```

```bash
curl -s http://127.0.0.1:18080/api/videos \
  -H 'Content-Type: application/json' \
  -d '{"channel":"<channel_id>","sort":"new","page":1,"perPage":10}' | jq
```

Also verify:

- search query works
- page 2 works
- tag shortcut works if implemented
- uploader shortcut works if implemented
- `yt-dlp '<video.url or first format url>'` resolves media
- thumbnail URL returns an image
- proxy route returns a `302` or working media body, whichever is expected
- if uploaders are implemented, `/api/uploaders` works with both `uploaderId` and `uploaderName`

## Common Failure Modes

- Forgot `build.rs` entry.
- Returned page URLs but no playable media/formats.
- Used a local requester instead of the shared one and lost cookies.
- Built `/proxy/...` URLs without `public_url_base`.
- Put human-readable titles into filter IDs, making routing brittle.
- Added huge option lists to the status response without background loading.
- Implemented search but not search pagination.
- Implemented proxies but forgot to test them independently with `curl -I`.

## Best Reference Matrix

- Rich uploader support: `vjav.rs`, `hsex.rs`, `omgxxx.rs`
- Tag and uploader lookup maps: `vjav.rs`, `hsex.rs`
- Background catalog loading: `hsex.rs`, `omgxxx.rs`, `pornhd3x.rs`
- Redirect proxy: `spankbang.rs` plus `src/proxies/spankbang.rs`
- Manifest or image proxy: `noodlemagazine.rs` plus `src/proxies/noodlemagazine.rs`
- Complex detail enrichment: `pornhd3x.rs`