fyptt search fix

Bare keyword queries are no longer hijacked to a category archive when
the query matches a category name (sexy/ass/tiktok/...); only an explicit
cat:/category: prefix or the categories filter routes to an archive.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Simon
2026-06-19 21:47:21 +00:00
parent 1bd06db894
commit d21c36e585
2 changed files with 13 additions and 8 deletions


@@ -16,7 +16,7 @@ This is the current implementation inventory as of this snapshot of the repo. Us
 | `erome` | `amateur-homemade` | no | no | HTML album scraper with hot/new feeds, keyword search, and uploader-slug shortcuts (`uploader:<name>`). |
 | `fikfap` | `tiktok` | yes | yes (thumbs only) | JSON-API provider for fikfap.com (TikTok-style swipe short clips); anonymous auth via a client-generated `Authorization-Anonymous` UUID header (no real login needed); listing via `GET api.fikfap.com/posts?sort=new\|trending\|random&amount=N&afterId=<lastPostId>` (cursor pagination — page N costs N sequential requests); search via `GET search?q=` (single fixed-size batch, no pagination — page 2+ returns empty); hashtag feeds via `GET hashtags/label/{label}/posts` and creator feeds via `GET profile/username/{user}/posts`, both also cursor-paginated; `tag:`/`hashtag:`/`#` and `user:`/`uploader:` query prefixes route directly; `categories` option exposes a small curated static hashtag list (no full catalog endpoint exists anonymously); `video.url` is the `fikfap.com/post/{id}` page (a client-rendered SPA, not yt-dlp-resolvable on its own); `videoStreamUrl` from the JSON response is sent directly as `formats[0].url` (signed Bunny CDN HLS `.m3u8`, ~24h token expiry) with `httpHeaders: {Referer: https://fikfap.com/}` — Hot Tub clients apply a format's `http_headers` across the whole HLS playback session (manifest, sub-playlists, and segments), so no proxying of the media itself is needed; thumbnails have no per-field header mechanism, so they're proxied via `/proxy/fikfap-thumb/...` to inject the same Referer; `get_uploader` implemented (`fikfap:<username>` IDs) using `GET profile/username/{user}`. |
 | `freepornvideosxxx` | `studio-network` | no | no | Studio-style scraper. |
-| `fyptt` | `tiktok` | no | no | HTML scraper for fyptt.to (Beaver Builder/WordPress short-form TikTok-style vertical porn); card selector `.fl-post-grid-post[class*="post-ID"]` with `category-{slug}` CSS class doubling as both listing tag and category-archive route; latest feed `/` (page N: `/page/N/`), search `/?s=query` (page N: `/page/N/?s=query`), category archives at bare top-level slugs like `/tiktok-ass/` (12 hardcoded categories exposed via `categories` option, also matched from free-text `cat:`/`category:` query prefixes or bare category-title queries); per-item enrichment fetches the detail page for the JSON-LD `embedURL` (one of three on-site player endpoints: `fypttstr.php`, `fypttjwstr.php`, or `fypttjwstrhls.php`) and `datePublished`, then fetches that embed URL to extract the actual signed `stream.fyptt.to` mp4 or `/hls/*.m3u8` URL (token expires ~2h, no Referer required) for `formats`; thumbnails (`fyptt.to/wp-content/uploads/...webp`) need no proxy; no duration metadata available on listing or detail pages (set to 0); no real uploader/model identity (the `girl-{slug}` CSS class is cosmetic only, not a linkable archive) so `/api/uploaders` is not implemented; `video.url` is the detail page URL (not yt-dlp resolvable directly — the player is sandboxed-iframe-only) so `formats` are populated instead; no proxy needed. |
+| `fyptt` | `tiktok` | no | no | HTML scraper for fyptt.to (Beaver Builder/WordPress short-form TikTok-style vertical porn); card selector `.fl-post-grid-post[class*="post-ID"]` with `category-{slug}` CSS class doubling as both listing tag and category-archive route; latest feed `/` (page N: `/page/N/`), search `/?s=query` (page N: `/page/N/?s=query`), category archives at bare top-level slugs like `/tiktok-ass/` (12 hardcoded categories exposed via the `categories` filter option, or via an explicit `cat:`/`category:` query prefix — bare keyword queries always go to WordPress search, never a category archive, because the category names ("sexy", "ass", "tiktok", "live", ...) are also the most common search terms); per-item enrichment fetches the detail page for the JSON-LD `embedURL` (one of three on-site player endpoints: `fypttstr.php`, `fypttjwstr.php`, or `fypttjwstrhls.php`) and `datePublished`, then fetches that embed URL to extract the actual signed `stream.fyptt.to` mp4 or `/hls/*.m3u8` URL (token expires ~2h, no Referer required) for `formats`; thumbnails (`fyptt.to/wp-content/uploads/...webp`) need no proxy; no duration metadata available on listing or detail pages (set to 0); no real uploader/model identity (the `girl-{slug}` CSS class is cosmetic only, not a linkable archive) so `/api/uploaders` is not implemented; `video.url` is the detail page URL (not yt-dlp resolvable directly — the player is sandboxed-iframe-only) so `formats` are populated instead; no proxy needed. |
 | `freeuseporn` | `fetish-kink` | no | no | Fetish archive pattern. |
 | `hanime` | `hentai-animation` | no | yes | Uses proxied CDN/thumb handling. |
 | `heavyfetish` | `fetish-kink` | no | no | Direct media handling. |
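
The URL layout described in the `fyptt` row above (latest feed, search, category archives, and their `/page/N/` variants) can be sketched as a small URL builder. This is a minimal illustration, not the provider's actual code: the `Target` enum shape, `BASE`, `encode_query`, and `build_url` are hypothetical names, the `https` scheme is assumed, and the paginated category form (`/{slug}/page/N/`) is an assumption based on the usual WordPress convention, since the row only documents pagination for the latest feed and search.

```rust
/// Hypothetical target type; the real provider's `Target` may differ.
#[derive(Debug, Clone, PartialEq)]
enum Target {
    Latest,
    Search { query: String },
    Category { slug: String },
}

/// Assumed base URL (the scheme is an assumption; the host comes from the row above).
const BASE: &str = "https://fyptt.to";

/// Minimal encoder for the `s=` value: unreserved bytes pass through,
/// spaces become `+`, everything else is percent-encoded.
fn encode_query(q: &str) -> String {
    let mut out = String::new();
    for b in q.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            b' ' => out.push('+'),
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Builds the listing URL for a target and a 1-based page number.
fn build_url(target: &Target, page: u32) -> String {
    // Page 1 has no `/page/N/` segment.
    let page_part = if page > 1 {
        format!("page/{page}/")
    } else {
        String::new()
    };
    match target {
        Target::Latest => format!("{BASE}/{page_part}"),
        Target::Search { query } => format!("{BASE}/{page_part}?s={}", encode_query(query)),
        // Assumption: category archives paginate as `/{slug}/page/N/`.
        Target::Category { slug } => format!("{BASE}/{slug}/{page_part}"),
    }
}
```

Under these assumptions, `build_url(&Target::Search { query: "tiktok ass".into() }, 2)` yields `https://fyptt.to/page/2/?s=tiktok+ass`, matching the `/page/N/?s=query` form in the row.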


@@ -123,15 +123,16 @@ impl FypttProvider {
         if let Some(query) = query {
             let q = query.trim();
             if !q.is_empty() {
+                // Only an explicit `cat:`/`category:` prefix routes to a category
+                // archive. Bare category-name words ("sexy", "ass", "tiktok", ...)
+                // are far more common as real search terms on this site, so they
+                // must fall through to keyword search rather than being hijacked.
                 if let Some(slug) = q.strip_prefix("cat:").or_else(|| q.strip_prefix("category:")) {
                     if let Some(known) = Self::category_slug_for(slug) {
                         return Target::Category { slug: known.to_string() };
                     }
                     return Target::Category { slug: slug.trim().to_string() };
                 }
-                if let Some(slug) = Self::category_slug_for(q) {
-                    return Target::Category { slug: slug.to_string() };
-                }
                 return Target::Search { query: q.to_string() };
             }
         }
@@ -460,10 +461,14 @@ mod tests {
     }
     #[test]
-    fn picks_category_target_from_title_match() {
-        match FypttProvider::pick_target(Some("Boobs")) {
-            Target::Category { slug } => assert_eq!(slug, "tiktok-boobs"),
-            other => panic!("expected Category, got {:?}", other),
+    fn category_name_word_routes_to_search_not_category() {
+        // "Boobs"/"sexy"/"tiktok" are category names but also common search
+        // terms; a bare query must search, not hijack to the category archive.
+        for word in ["Boobs", "sexy", "tiktok", "ass"] {
+            match FypttProvider::pick_target(Some(word)) {
+                Target::Search { query } => assert_eq!(query, word),
+                other => panic!("expected Search for {word:?}, got {:?}", other),
+            }
         }
     }
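
For readers without the full source, the post-change routing can be reproduced as a standalone sketch. It mirrors the control flow in the hunk above, but several parts are assumptions: the `CATEGORIES` table is a hypothetical two-entry stand-in for the 12 hardcoded categories (only the `Boobs` to `tiktok-boobs` mapping is confirmed by the removed test), `category_slug_for` is assumed to match titles or slugs case-insensitively, and the `Target::Latest` fallback for a missing or empty query is a guess at what the real function returns.

```rust
#[derive(Debug, PartialEq)]
enum Target {
    Latest,
    Search { query: String },
    Category { slug: String },
}

/// Hypothetical stand-in for the provider's category table: (title, slug).
const CATEGORIES: &[(&str, &str)] = &[("Boobs", "tiktok-boobs"), ("Ass", "tiktok-ass")];

/// Assumed lookup: case-insensitive match on either the title or the slug.
fn category_slug_for(name: &str) -> Option<&'static str> {
    let name = name.trim();
    CATEGORIES
        .iter()
        .find(|(title, slug)| {
            title.eq_ignore_ascii_case(name) || slug.eq_ignore_ascii_case(name)
        })
        .map(|(_, slug)| *slug)
}

fn pick_target(query: Option<&str>) -> Target {
    if let Some(query) = query {
        let q = query.trim();
        if !q.is_empty() {
            // Only an explicit prefix may route to a category archive.
            if let Some(slug) = q.strip_prefix("cat:").or_else(|| q.strip_prefix("category:")) {
                if let Some(known) = category_slug_for(slug) {
                    return Target::Category { slug: known.to_string() };
                }
                // Unknown names are passed through as a raw slug.
                return Target::Category { slug: slug.trim().to_string() };
            }
            // Bare words, including category names, always search.
            return Target::Search { query: q.to_string() };
        }
    }
    // Assumption: no usable query means the latest feed.
    Target::Latest
}
```

With this sketch, `pick_target(Some("Boobs"))` returns `Target::Search`, while `pick_target(Some("cat:Boobs"))` returns `Target::Category` with slug `tiktok-boobs`, which is the distinction the renamed test pins down.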