hottub/docs/architecture.md
2026-04-10 22:57:27 +00:00
# Hottub Architecture
## Purpose
Hottub is a Rust server that exposes Hot Tub compatible endpoints for channel discovery, video search, uploader lookups, and site-specific proxying. Most work in this repo is adding or repairing a provider module under `src/providers/`.
## Top-Level Structure
- `src/main.rs`: server bootstrap, env loading, database pool, shared requester/cache, route mounting.
- `src/api.rs`: `/api/status`, `/api/videos`, `/api/uploaders`, `/api/test`, `/api/proxies`.
- `src/providers/mod.rs`: provider trait, provider registry, build-time provider selection, status decoration, runtime validation, panic/error guards.
- `src/providers/*.rs`: one module per channel/provider.
- `src/proxy.rs`: route table for `/proxy/...`.
- `src/proxies/*.rs`: redirect/media/thumb proxy implementations.
- `src/videos.rs`: request/response payloads, `VideoItem`, `VideoFormat`, `ServerOptions`.
- `src/status.rs`: status/channel/group payloads.
- `src/uploaders.rs`: uploader request/profile payloads.
- `src/util/requester.rs`: outbound HTTP with cookies, optional Burp proxying, Jina fallback, and FlareSolverr fallback.
- `build.rs`: compile-time provider registry generation and single-provider build support.
## Startup Flow
1. `main` loads `.env` and ensures `RUST_LOG` is set.
2. It creates the Diesel SQLite pool from `DATABASE_URL`.
3. It creates a shared `Requester`, enables Burp proxying when `PROXY != 0`, and builds the LRU video cache.
4. It configures provider runtime validation in `providers::configure_runtime_validation`.
5. It spawns a background thread that forces provider initialization via `providers::init_providers_now()`.
6. It starts an `ntex` HTTP server on `0.0.0.0:18080`.
## Runtime Environment
Important environment variables:
- `DATABASE_URL`: required SQLite path.
- `RUST_LOG`: defaults to `warn` if unset.
- `PROXY`: enables Burp proxying when not equal to `0`.
- `BURP_URL`: outbound proxy URL used when `PROXY` is enabled.
- `FLARE_URL`: FlareSolverr endpoint used as the last HTML-fetch fallback.
- `DOMAIN`: used by the `/` redirect target.
- `DISCORD_WEBHOOK`: enables `/api/test` and provider error reporting.
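The defaults and toggles above can be sketched as a small resolver. This is an illustration only: `RuntimeEnv` and `resolve_env` are hypothetical names, the map stands in for the process environment, and treating an unset `PROXY` as disabled is an assumption rather than documented behavior.

```rust
use std::collections::HashMap;

/// Hypothetical holder for a few of the settings listed above.
/// The real server may organise them differently.
#[derive(Debug, PartialEq)]
struct RuntimeEnv {
    database_url: String,
    rust_log: String,
    burp_proxy: Option<String>,
    flare_url: Option<String>,
}

/// Resolve settings from a key/value map (a stand-in for the process env).
fn resolve_env(vars: &HashMap<String, String>) -> Result<RuntimeEnv, String> {
    let get = |key: &str| vars.get(key).cloned();

    // DATABASE_URL is required.
    let database_url = get("DATABASE_URL").ok_or("DATABASE_URL is required")?;

    // RUST_LOG falls back to "warn" when unset.
    let rust_log = get("RUST_LOG").unwrap_or_else(|| "warn".to_string());

    // Proxying is on when PROXY is anything other than "0".
    // Assumption: an unset PROXY counts as disabled.
    let proxy_enabled = get("PROXY").map_or(false, |v| v != "0");
    let burp_proxy = if proxy_enabled { get("BURP_URL") } else { None };

    Ok(RuntimeEnv {
        database_url,
        rust_log,
        burp_proxy,
        flare_url: get("FLARE_URL"),
    })
}
```

Passing the variables in as a map keeps the sketch testable without touching the real environment.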
Bundled reference material:
- `docs/hottubapp/📡 Status - Hot Tub Docs.html`
- `docs/hottubapp/🎬 Videos - Hot Tub Docs.html`
- `docs/hottubapp/👤 Uploaders - Hot Tub Docs.html`
Those HTML files are useful when a provider author needs to confirm the expected client payload shape without reading Rust structs first.
## Build-Time Provider Selection
`build.rs` reads `HOT_TUB_PROVIDER` or `HOTTUB_PROVIDER`.
- If unset, every provider in `build.rs` is compiled and registered.
- If set, only that provider is compiled into the binary.
- In a single-provider build, `/api/videos` remaps `"channel": "all"` to the compiled provider.
Generated files in `OUT_DIR` are included by `src/providers/mod.rs`:
- `provider_modules.rs`
- `provider_registry.rs`
- `provider_metadata_fn.rs`
- `provider_selection.rs`
This means adding a new provider always requires updating `build.rs`.
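As a rough sketch of the selection rule, the two helpers below show how the variable lookup and the `"all"` remap could fit together. Both function names are hypothetical, and two details are assumptions: `HOT_TUB_PROVIDER` is checked before `HOTTUB_PROVIDER`, and blank values count as unset. In `build.rs` the inputs would come from `std::env::var`.

```rust
/// Pick the provider named by either variable, or `None` for a full build.
/// Assumption: the first non-blank value wins, checked in argument order.
fn selected_provider(hot_tub_provider: Option<&str>, hottub_provider: Option<&str>) -> Option<String> {
    hot_tub_provider
        .into_iter()
        .chain(hottub_provider)
        .map(str::trim)
        .find(|name| !name.is_empty())
        .map(str::to_string)
}

/// In a single-provider build, "all" resolves to the compiled provider.
/// Any other channel name is passed through unchanged.
fn resolve_channel<'a>(requested: &'a str, single: Option<&'a str>) -> &'a str {
    match (requested, single) {
        ("all", Some(only)) => only,
        _ => requested,
    }
}
```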
## HTTP Surface
### `/`
Returns a `302` redirect to `hottub://source?url=<DOMAIN-or-request-host>`.
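A minimal sketch of how the redirect target could be assembled. `source_redirect` is a hypothetical helper; whether the real handler URL-encodes the value or treats a blank `DOMAIN` as unset is not documented here, so both are assumptions.

```rust
/// Build the `hottub://` redirect target.
/// Assumptions: a blank DOMAIN falls back to the request host, and no URL encoding is applied.
fn source_redirect(domain: Option<&str>, request_host: &str) -> String {
    let host = domain
        .map(str::trim)
        .filter(|d| !d.is_empty())
        .unwrap_or(request_host);
    format!("hottub://source?url={host}")
}
```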
### `/api/status`
Builds the channel list by iterating `ALL_PROVIDERS` and calling `Provider::get_channel`.
Important behavior:
- The `User-Agent` is parsed into `ClientVersion`.
- A provider can hide itself by returning `None`.
- `providers::build_status_response` decorates channels with `groupKey`, top tags, runtime status, and sort order.
- Some heavy status filters are intentionally removed from the client-facing response. The server still accepts them in `/api/videos`.
### `/api/videos`
This is the main provider execution path.
Flow:
1. Parse `VideosRequest`.
2. Normalize `channel`, `sort`, `query`, `page`, and `perPage`.
3. Build `ServerOptions`.
4. If `query` is a full `http://` or `https://` URL, try the `yt-dlp -J` fast path first.
5. Otherwise call `provider.get_videos(...)` through `run_provider_guarded`.
6. For quoted queries like `"teacher"`, apply a literal substring filter after provider fetch.
7. Spawn a background prefetch for the next page.
8. For short videos (`duration <= 120`), populate `preview` from the main URL or first format.
Important behavior:
- Leading `#` is stripped from queries before provider dispatch.
- `"all"` uses `AllProvider` in a normal build, but resolves to the single compiled provider in a single-provider build.
- Older `Hot Tub/38` clients are patched by replacing `video.url` with the last format URL when formats exist.
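The query handling described above can be sketched as a small classifier. `QueryKind`, `classify_query`, and `matches_literal` are hypothetical names, not the types used in `src/api.rs`. Three details are assumptions: only one leading `#` is stripped, stripping happens before the URL check, and the literal filter is case-insensitive and matches on the title.

```rust
#[derive(Debug, PartialEq)]
enum QueryKind {
    /// Full URL: candidate for the `yt-dlp -J` fast path.
    Url(String),
    /// Quoted query: provider search followed by a literal substring filter.
    Literal(String),
    /// Anything else is handed to the provider as-is.
    Plain(String),
}

fn classify_query(raw: &str) -> QueryKind {
    let trimmed = raw.trim();
    // A leading '#' is stripped before dispatch (assumption: only one).
    let q = trimmed.strip_prefix('#').unwrap_or(trimmed);

    if q.starts_with("http://") || q.starts_with("https://") {
        return QueryKind::Url(q.to_string());
    }
    // A query wrapped in double quotes requests a literal match.
    if q.len() >= 2 && q.starts_with('"') && q.ends_with('"') {
        return QueryKind::Literal(q[1..q.len() - 1].to_string());
    }
    QueryKind::Plain(q.to_string())
}

/// Post-fetch literal filter (assumption: case-insensitive, applied to the title).
fn matches_literal(title: &str, needle: &str) -> bool {
    title.to_lowercase().contains(&needle.to_lowercase())
}
```

Classifying the query once up front keeps the fast path, the provider call, and the post-filter from each re-inspecting the raw string.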
### `/api/uploaders`
Uploader lookup is optional and provider-specific.
Important behavior:
- At least one of `uploaderId` or `uploaderName` is required.
- If `uploaderId` looks like `channel:id`, the server directly targets that provider.
- Otherwise it scans all providers and returns the best exact-name match.
- Only `hsex`, `omgxxx`, and `vjav` currently implement `get_uploader`.
- In practice, provider-owned uploader IDs should be namespaced, for example `vjav:12345` or `hsex:author_slug`.
### `/api/test`
Sends a Discord error test if `DISCORD_WEBHOOK` is configured.
### `/api/proxies`
Returns the background-fetched outbound proxy snapshot from `src/util/proxy.rs`.
## Core Data Structures
### `VideosRequest`
Defined in `src/videos.rs`. Common fields used by providers:
- `channel`
- `sort`
- `query`
- `page`
- `perPage`
- `featured`
- `category`
- `sites`
- `all_provider_sites`
- `filter`
- `language`
- `networks`
- `stars`
- `categories`
- `duration`
- `sexuality`
### `ServerOptions`
The server's normalized option bag. Providers should read from this instead of reparsing the raw API request.
Important fields:
- `public_url_base`: needed when generating `/proxy/...` URLs.
- `requester`: the shared request client with cookies/debug trace/proxy state.
- `sort`, `sites`, `filter`, `category`, `language`, `network`, `stars`, `categories`, `duration`, `sexuality`.
### `VideoItem`
Minimum useful fields for a provider:
- `id`
- `title`
- `url`
- `channel`
- `thumb`
- `duration`
High-value optional fields:
- `views`
- `rating`
- `uploader`
- `uploaderUrl`
- `uploaderId`
- `tags`
- `uploadedAt`
- `formats`
- `preview`
- `aspectRatio`
Avoid setting `embed` for new providers unless the site truly needs it.
### `VideoFormat`
Use `formats` when:
- the site returns a better direct media URL than the page URL
- HLS or multiple qualities exist
- extra HTTP headers such as `Referer` are required
Use `http_header` or `add_http_header` when the player endpoint needs request headers.
### `Channel` and `ChannelOption`
Each provider's `get_channel` returns the status metadata exposed by `/api/status`.
Typical option IDs used across the repo:
- `sort`
- `filter`
- `sites`
- `category`
- `language`
- `networks`
- `stars`
- `categories`
Use the same IDs when possible so the server and client behavior stay consistent.
### `UploaderProfile`
If a provider supports `/api/uploaders`, keep the ID routable:
- preferred format: `<channel>:<site-local-id>`
- examples in the repo: `vjav:<user_id>`, `hsex:<author>`, `omgxxx:<kind>:<id>`
This lets `src/api.rs` derive the owning provider immediately.
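To illustrate why the namespaced format is routable, here is a minimal sketch of splitting an ID on its first colon. `split_uploader_id` is a hypothetical helper and the `known` list stands in for the provider registry; the actual lookup in `src/api.rs` may differ.

```rust
/// Split `<channel>:<site-local-id>` on the first colon.
/// Returns `None` when there is no prefix, the local part is empty,
/// or the prefix is not a known provider.
fn split_uploader_id<'a>(uploader_id: &'a str, known: &[&str]) -> Option<(&'a str, &'a str)> {
    let (channel, local) = uploader_id.split_once(':')?;
    if local.is_empty() || !known.contains(&channel) {
        return None;
    }
    Some((channel, local))
}
```

Splitting on the first colon only means an ID shaped like `omgxxx:<kind>:<id>` keeps `<kind>:<id>` intact as the site-local part.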
## Provider Contract
Defined in `src/providers/mod.rs`:
- `async fn get_videos(...) -> Vec<VideoItem>`
- `fn get_channel(clientversion: ClientVersion) -> Option<Channel>`
- `async fn get_uploader(...) -> Result<Option<UploaderProfile>, String>` optional
The server wraps provider execution in:
- `run_provider_guarded` for video paths
- `run_uploader_provider_guarded` for uploader paths
Panics and reported errors trigger runtime validation and optional Discord reporting.
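The guard idea can be sketched with the standard library alone. This is a synchronous simplification: the real `run_provider_guarded` wraps async provider calls and also triggers validation and reporting, none of which is shown. `run_guarded` is a hypothetical name.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Run provider work, turning both reported errors and panics into `Err`.
fn run_guarded<T>(provider: &str, work: impl FnOnce() -> Result<T, String>) -> Result<T, String> {
    match catch_unwind(AssertUnwindSafe(work)) {
        Ok(Ok(value)) => Ok(value),
        // The provider reported an error itself.
        Ok(Err(err)) => Err(format!("{provider}: {err}")),
        // The provider panicked; recover the message if it is a string.
        Err(payload) => {
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic".to_string());
            Err(format!("{provider}: panicked: {msg}"))
        }
    }
}
```

Folding panics into the same `Err` path as reported errors lets the caller treat every provider failure uniformly.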
## Runtime Validation and Error Handling
`src/providers/mod.rs` includes a validation subsystem that:
- runs a small sample request against a provider after failures
- checks that enough video items exist
- tries media URLs or format URLs with a `Range` header
- marks repeated failures over time
This means a provider that returns page URLs but no real media URLs or formats may look fine in the client yet still fail runtime validation.
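A rough sketch of the sample check, under stated assumptions. `SampleItem` and `validate_sample` are hypothetical, and the `probe` closure stands in for the ranged HTTP request. Requiring every sampled item to have at least one URL that passes the probe is an assumption; the real subsystem may be more lenient.

```rust
/// Hypothetical, trimmed-down view of a video item for validation.
struct SampleItem {
    url: String,
    format_urls: Vec<String>,
}

/// Check a provider sample: enough items, and each item has at least one
/// URL (a format URL or the main URL) that the probe accepts.
fn validate_sample(
    items: &[SampleItem],
    min_items: usize,
    probe: impl Fn(&str) -> bool,
) -> Result<(), String> {
    if items.len() < min_items {
        return Err(format!("expected at least {min_items} items, got {}", items.len()));
    }
    for item in items {
        // Try format URLs first, then the main URL.
        let playable = item
            .format_urls
            .iter()
            .chain(std::iter::once(&item.url))
            .any(|u| probe(u));
        if !playable {
            return Err(format!("no playable media for {}", item.url));
        }
    }
    Ok(())
}
```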
## Requester Behavior
`src/util/requester.rs` is the standard outbound HTTP layer.
Capabilities:
- shared cookie jar across clones
- optional Burp proxying via `PROXY` and `BURP_URL`
- direct request retries for `429`
- Jina mirror fallback for blocked HTML fetches
- FlareSolverr fallback via `FLARE_URL`
- raw response helpers for media validation and custom headers
Use the shared requester from `ServerOptions` through `requester_or_default`. Do not instantiate a brand-new requester in normal provider fetch paths unless you have a very specific reason.
FlareSolverr note:
- `src/util/flaresolverr.rs` keeps a reusable session pool and rotates a ready session in for each solve.
- If a provider only works after anti-bot negotiation, the shared requester is the path that benefits from that solved session and cookie state.
## Proxy Subsystem
There are two proxy styles.
### Redirect proxies
These take a provider-specific endpoint and return `302 Location: <resolved-media-url>`.
Examples:
- `/proxy/spankbang/...`
- `/proxy/sxyprn/...`
- `/proxy/pornhd3x/...`
- `/proxy/vjav/...`
### Media or image proxies
These actively fetch media or thumbnails and stream or rewrite the response.
Examples:
- `/proxy/noodlemagazine/...`
- `/proxy/noodlemagazine-thumb/...`
- `/proxy/shooshtime-media/...`
- `/proxy/hanime-cdn/...`
If a site only needs a referer-preserving redirect, use a redirect proxy. If manifests, relative playlist entries, cookies, or binary thumbs need rewriting, use a media/image proxy.
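Either proxy style needs an absolute URL that points back at this server, which is where `public_url_base` comes in. The helper below is a sketch: `proxy_url` is a hypothetical name, and the shape of the path after the route segment is provider-specific and assumed here.

```rust
/// Join the public base URL, a proxy route, and a provider-specific target
/// into one `/proxy/...` URL without doubled slashes.
fn proxy_url(public_url_base: &str, route: &str, target: &str) -> String {
    format!(
        "{}/proxy/{}/{}",
        public_url_base.trim_end_matches('/'),
        route.trim_matches('/'),
        target.trim_start_matches('/')
    )
}
```

Trimming slashes at each joint means providers do not have to care whether the configured base ends with `/`.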
## Best Existing Templates
Use the closest existing provider instead of inventing a new style.
- `src/providers/vjav.rs`: rich API-backed provider with tags, uploader support, and detail enrichment.
- `src/providers/hsex.rs`: HTML scraping with background-loaded filters, uploader support, and direct HLS formats.
- `src/providers/omgxxx.rs`: large filter catalogs and uploader lookup by site/network identity.
- `src/providers/noodlemagazine.rs`: proxied media/thumbs, Jina fallback, and mirrored listing parsing.
- `src/providers/pornhd3x.rs`: complex filter catalogs, detail enrichment, and proxy-generated playback URLs.
- `src/providers/spankbang.rs`: anti-bot handling and a redirect-proxy-based media strategy.
## Important Gotchas
- New providers must export `CHANNEL_METADATA`.
- New providers must be listed in `build.rs` or they will never compile into the registry.
- If a provider returns proxied URLs, it usually also needs `options.public_url_base`.
- Keep filter IDs stable. The `title` is for display; the `id` is what the provider matches on.
- `categories` in `Channel` are not the same as `ChannelOption { id: "categories" }`.
- `/api/status` sanitizes some options away from the client-facing payload. That does not mean the provider option is useless in `/api/videos`.
- If a site needs per-request cookies or a solved user agent, rely on the shared requester.