xfree frontpage

Simon
2026-06-23 10:47:30 +00:00
parent 0402e5ac76
commit 4dcdf5e8d1
2 changed files with 102 additions and 15 deletions

@@ -21,7 +21,7 @@ This is the current implementation inventory as of this snapshot of the repo. Us
| `freeuseporn` | `fetish-kink` | no | no | Fetish archive pattern. | | `freeuseporn` | `fetish-kink` | no | no | Fetish archive pattern. |
| `hanime` | `hentai-animation` | no | yes | Uses proxied CDN/thumb handling. | | `hanime` | `hentai-animation` | no | yes | Uses proxied CDN/thumb handling. |
| `heavyfetish` | `fetish-kink` | no | no | Direct media handling. | | `heavyfetish` | `fetish-kink` | no | no | Direct media handling. |
| `hentaihaven` | `hentai-animation` | no | no | HLS format builder pattern. | | `hentaihaven` | `hentai-animation` | no | no | HTML scraper for hentaihaven.xxx (WordPress/Madara theme), Cloudflare-protected so the provider is gated behind `FLARE_URL` in `skip_reason_for_provider` (mod.rs); the shared requester clears CF directly (wreq Firefox136 emulation currently passes for the listing/search/watch/episode/`player.php` GETs) and falls back to Jina/FlareSolverr. Latest feed `/hentai/page/{N}/`, search `/?s={query}` (search is single-page, so page>1 returns empty); listing/search cards link to series watch pages `https://hentaihaven.xxx/watch/{slug}/`. Per-series media resolution (the UUID exists nowhere in page HTML, so enrichment is unavoidable): watch page → episode links `…/watch/{slug}/episode-K` (in `manga-chapters-holder`) → episode page → `<iframe src="…/wp-content/plugins/player-logic/player.php?data=…">` → `player.php` → `<meta name="x-secure-token" content="sha512-…">` → decode token (strip `sha512-`, then three rounds of rot13→base64-decode, then `JSON.parse`) → `{en, iv, uri, hot_domains, …}` → POST `…/wp-content/plugins/player-logic/api.php` with `action=zarat_get_data_player_ajax&a={en}&b={iv}` (urlencoded; this one POST uses a dedicated `wreq` Chrome137 client, not the shared requester) → `{"status":true,"data":{"sources":[{"src":"…m3u8"}],"isOctopus":bool}}`. A multi-episode series collapses into one `VideoItem` titled `"… (N Episodes)"` with one `m3u8` `VideoFormat` per episode (`format_note`/`format_id` = "Episode K"); each format carries `Referer`/`Origin: https://hentaihaven.xxx` + a Firefox `User-Agent`. `video.url` is the `watch/{slug}/` page (no yt-dlp extractor exists for the site, so `formats` are populated rather than relying on `video.url`). Two CDN shapes are returned: newer content-addressed `octopusmanifest.org/{uuid}/playlist.m3u8` (no token, portable across IPs) and older signed `master-lengs.org/api/v3/hh/{slug}/master.m3u8?hash=…` (~2.5h). **Gotcha:** both CDNs (same IP) aggressively per-IP rate-limit/ban with a TCP RST on 80/443 once tripped, which looks like "host down" but is an IP ban; browsers play fine over HTTP/3 (QUIC) while TCP clients (curl/yt-dlp/wreq) get refused, so segment fetches can fail from a tripped IP even though the manifest URL is valid. Tags from the series "Genre(s)" block; `views` from the "Viewed … Total" counter; thumbnails (`img.hentaihaven.xxx`) load directly (no proxy/referer). Resolution is slow (each listing page = ~25 series × multi-episode player-API calls), so the provider is DB-first: it fetches the listing once for the ordered watch URLs, serves already-resolved `VideoItem`s from the `videos` SQLite table instantly (`db::upsert_video` to avoid duplicate-row staleness), and `spawn_refresh`es the whole page in the background (in-memory `VideoCache` soft-TTL 1h / hard-TTL 24h, per-listing in-flight guard). No `/api/uploaders` (no uploader identity), no proxy. |
| `hentaitv` | `hentai-animation` | no | yes | Next.js hentai site (hentai.tv) backed by a clean JSON API: `GET /api/browse?page=N&sort=<Label>&genres=<ExactName>` (`{videos:[28],total,pages}`, real pagination) and `GET /api/search?q=Q` (`{videos:[...]}`, single-page: `page` is ignored, so page>1 returns empty). Unlike `animeidhentai`, browse honors both `sort` (labels `Most Recent`/`Most Viewed`/`Trending`, mapped from option ids `new`/`views`/`trending`) and `genres` (the **exact case-sensitive** stored genre name, e.g. `Big Boobs`, `incest`), so genre archives go through `/api/browse?genres=` and paginate. The 68-genre catalogue (exact names) is background-loaded from the `/browse` page HTML (`"genres":[{"name","count"}]`, not exposed by the JSON API) and powers the `categories` filter plus keyword→genre routing. Each episode JSON has `slug`, `title`/`ep`, `tags[]`, `views`, `rating` (0-10 → ×10), `duration` ("MM:SS"), `brand` (studio → `uploader`), `thumb`/`backdrop`/`cover` (relative, served from `hentai.tv/uploads/...`, no referer), and `embedUrl=https://nhplayer.com/v/{embedId}/`. `video.url` is the reachable watch page `https://hentai.tv/hentai/{slug}`; `genre:`/`cat:`/`category:` prefixes and bare keywords that exactly match a genre route to the genre archive, everything else to search. Playback shares the **same nhplayer→`r2.1hanime.com` signed-CDN backend as `animeidhentai`**: `/proxy/hentaitv/{embedId}.mp4` is a redirect proxy that replicates nhplayer's PoW+DOM challenge (`player.php` → `player-core-v2.php` → `get-video-url-v2.php`, SHA-256-first-byte-zero PoW, ≥700ms dwell, fixed fingerprint) to mint a signed `?verify=<ts>-<sig>` URL: HEAD→200, GET→302 to the CDN URL (cached 150s). The CF wall is JA3-based, not IP-based, so the signed URL is verifiable from anywhere with `yt-dlp --impersonate chrome` even though plain `curl`/`wreq` get 403. `src/proxies/hentaitv.rs` is a near-copy of `src/proxies/animeidhentai.rs` (only `SITE_REFERER` differs). No `/api/uploaders` (brand is studio-only). | | `hentaitv` | `hentai-animation` | no | yes | Next.js hentai site (hentai.tv) backed by a clean JSON API: `GET /api/browse?page=N&sort=<Label>&genres=<ExactName>` (`{videos:[28],total,pages}`, real pagination) and `GET /api/search?q=Q` (`{videos:[...]}`, single-page: `page` is ignored, so page>1 returns empty). Unlike `animeidhentai`, browse honors both `sort` (labels `Most Recent`/`Most Viewed`/`Trending`, mapped from option ids `new`/`views`/`trending`) and `genres` (the **exact case-sensitive** stored genre name, e.g. `Big Boobs`, `incest`), so genre archives go through `/api/browse?genres=` and paginate. The 68-genre catalogue (exact names) is background-loaded from the `/browse` page HTML (`"genres":[{"name","count"}]`, not exposed by the JSON API) and powers the `categories` filter plus keyword→genre routing. Each episode JSON has `slug`, `title`/`ep`, `tags[]`, `views`, `rating` (0-10 → ×10), `duration` ("MM:SS"), `brand` (studio → `uploader`), `thumb`/`backdrop`/`cover` (relative, served from `hentai.tv/uploads/...`, no referer), and `embedUrl=https://nhplayer.com/v/{embedId}/`. `video.url` is the reachable watch page `https://hentai.tv/hentai/{slug}`; `genre:`/`cat:`/`category:` prefixes and bare keywords that exactly match a genre route to the genre archive, everything else to search. Playback shares the **same nhplayer→`r2.1hanime.com` signed-CDN backend as `animeidhentai`**: `/proxy/hentaitv/{embedId}.mp4` is a redirect proxy that replicates nhplayer's PoW+DOM challenge (`player.php` → `player-core-v2.php` → `get-video-url-v2.php`, SHA-256-first-byte-zero PoW, ≥700ms dwell, fixed fingerprint) to mint a signed `?verify=<ts>-<sig>` URL: HEAD→200, GET→302 to the CDN URL (cached 150s). The CF wall is JA3-based, not IP-based, so the signed URL is verifiable from anywhere with `yt-dlp --impersonate chrome` even though plain `curl`/`wreq` get 403. `src/proxies/hentaitv.rs` is a near-copy of `src/proxies/animeidhentai.rs` (only `SITE_REFERER` differs). No `/api/uploaders` (brand is studio-only). |
| `homoxxx` | `gay-male` | no | no | Gay category grouping example. | | `homoxxx` | `gay-male` | no | no | Gay category grouping example. |
| `hqporner` | `studio-network` | no | yes | Uses thumb and redirect proxy helpers. | | `hqporner` | `studio-network` | no | yes | Uses thumb and redirect proxy helpers. |
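
The `hentaihaven` row mentions an in-memory `VideoCache` with a 1h soft TTL and a 24h hard TTL. As a rough illustration of what such a two-tier policy usually means, here is a minimal sketch. The `Freshness` enum, the `classify` function, and the exact serve/refresh semantics are assumptions made for this example, not the repo's actual implementation.

```rust
use std::time::Duration;

/// Hypothetical freshness classes for a cached listing.
#[derive(Debug, PartialEq, Eq)]
enum Freshness {
    /// Younger than the soft TTL: serve as-is.
    Fresh,
    /// Between the soft and hard TTL: serve, but refresh in the background.
    Stale,
    /// Older than the hard TTL: do not serve, refetch first.
    Expired,
}

// TTL values taken from the table row; everything else is assumed.
const SOFT_TTL: Duration = Duration::from_secs(60 * 60);
const HARD_TTL: Duration = Duration::from_secs(24 * 60 * 60);

/// Classifies a cache entry by its age.
fn classify(age: Duration) -> Freshness {
    if age < SOFT_TTL {
        Freshness::Fresh
    } else if age < HARD_TTL {
        Freshness::Stale
    } else {
        Freshness::Expired
    }
}

fn main() {
    for minutes in [5u64, 90, 60 * 30] {
        let age = Duration::from_secs(minutes * 60);
        println!("{minutes} min old -> {:?}", classify(age));
    }
}
```

Under this reading, a `Stale` entry is what would trigger something like `spawn_refresh` while the old items are still returned to the caller.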

@@ -179,6 +179,89 @@ impl XfreeProvider {
Ok(video_items) Ok(video_items)
} }
/// Front-page feed used when there is no search query. This mirrors the
/// site's homepage, which dispatches `getAutoPop` against
/// `/api/post/?t=popular&nsfhp=true&limit=30&offset=N&lgbt=X` instead of the
/// `/api/2/search` endpoint. (`popular` is the real feed type the homepage
/// loads first; `posts` is only a Vuex store key and 404s as a `t=` value.)
/// The response body is the post array directly, not `body.posts`.
async fn front_page(
&self,
cache: VideoCache,
page: u8,
options: ServerOptions,
pool: DbPool,
) -> Result<Vec<VideoItem>> {
let sexuality = match options.sexuality.clone() {
Some(s) if !s.is_empty() => s,
_ => "1".to_string(),
};
// `page` is 1-based; saturate so a parsed page of 0 cannot underflow.
let offset = u32::from(page).saturating_sub(1) * 30;
let video_url = format!(
"{}/api/post/?t=popular&nsfhp=true&limit=30{}&lgbt={}",
self.url,
if page > 1 {
format!("&offset={offset}")
} else {
String::new()
},
sexuality,
);
// Check our Video Cache. Entries younger than 24 hours are returned as-is;
// older entries trigger a cache check but are still served.
let old_items = match cache.get(&video_url) {
Some((time, items)) => {
if time.elapsed().unwrap_or_default().as_secs() < 60 * 60 * 24 {
return Ok(items.clone());
} else {
let _ = cache.check().await;
return Ok(items.clone());
}
}
None => {
vec![]
}
};
let mut requester =
crate::providers::requester_or_default(&options, module_path!(), "missing_requester");
let text = match requester
.get_with_headers(
&video_url,
vec![
("Apiversion".to_string(), "1.0".to_string()),
(
"Accept".to_string(),
"application/json, text/plain, */*".to_string(),
),
("Referer".to_string(), "https://www.xfree.com/".to_string()),
],
Some(Version::HTTP_2),
)
.await
{
Ok(text) => text,
Err(e) => {
crate::providers::report_provider_error(
"xfree",
"front_page.request",
&format!("url={video_url}; error={e}"),
)
.await;
return Ok(old_items);
}
};
let video_items: Vec<VideoItem> = self
.get_video_items_from_json(text.clone(), &mut requester, pool)
.await;
if !video_items.is_empty() {
cache.remove(&video_url);
cache.insert(video_url.clone(), video_items.clone());
} else {
return Ok(old_items);
}
Ok(video_items)
}
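
The URL construction in `front_page` can be isolated into a pure function, which makes the paging arithmetic easy to check. The sketch below is illustrative only: `front_page_url` and `PAGE_SIZE` are hypothetical names, and treating page 0 like page 1 via saturating subtraction is an assumption, not behavior confirmed by the commit.

```rust
/// Hypothetical helper mirroring the URL built in `front_page`.
/// `page` is 1-based; the `offset` parameter is omitted for the first page.
fn front_page_url(base: &str, page: u8, sexuality: &str) -> String {
    const PAGE_SIZE: u32 = 30;
    // Saturating subtraction: page 0 is treated like page 1 (assumption).
    let offset = u32::from(page).saturating_sub(1) * PAGE_SIZE;
    let offset_param = if offset > 0 {
        format!("&offset={offset}")
    } else {
        String::new()
    };
    format!(
        "{}/api/post/?t=popular&nsfhp=true&limit={}{}&lgbt={}",
        base, PAGE_SIZE, offset_param, sexuality
    )
}

fn main() {
    println!("{}", front_page_url("https://www.xfree.com", 1, "1"));
    println!("{}", front_page_url("https://www.xfree.com", 3, "1"));
}
```

Because the resulting string doubles as the cache key, keeping it deterministic for a given page and `lgbt` value is what lets repeated requests hit the same cache entry.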
async fn get_video_items_from_json( async fn get_video_items_from_json(
&self, &self,
html: String, html: String,
@@ -201,12 +284,17 @@ impl XfreeProvider {
} }
}; };
for post in json // The search endpoint returns `{ body: { posts: [...] } }`, while the
.get("body") // front-page feed (`/api/post/`) returns `{ body: [...] }` directly.
// Mirror the site's own logic (`body.posts ? body.posts : body`).
let empty: Vec<serde_json::Value> = vec![];
let body = json.get("body");
let posts = body
.and_then(|v| v.get("posts")) .and_then(|v| v.get("posts"))
.and_then(|p| p.as_array()) .and_then(|p| p.as_array())
.unwrap_or(&vec![]) .or_else(|| body.and_then(|v| v.as_array()))
{ .unwrap_or(&empty);
for post in posts {
let id = post let id = post
.get("media") .get("media")
.and_then(|v| v.get("name")) .and_then(|v| v.get("name"))
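
The `body.posts ? body.posts : body` fallback in this hunk can be tried out in isolation. The sketch below uses a tiny stand-in `Json` enum so that it compiles with the standard library alone; the real code operates on `serde_json::Value`, and `extract_posts` and `obj` are hypothetical names introduced only for this illustration.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a JSON value (the real code uses `serde_json::Value`).
#[derive(Debug)]
enum Json {
    Str(String),
    Array(Vec<Json>),
    Object(HashMap<String, Json>),
}

impl Json {
    /// Looks up a key if this value is an object.
    fn get(&self, key: &str) -> Option<&Json> {
        match self {
            Json::Object(map) => map.get(key),
            _ => None,
        }
    }

    /// Returns the elements if this value is an array.
    fn as_array(&self) -> Option<&Vec<Json>> {
        match self {
            Json::Array(items) => Some(items),
            _ => None,
        }
    }
}

/// Builds a single-key object; a convenience for the examples.
fn obj(key: &str, value: Json) -> Json {
    Json::Object(HashMap::from([(key.to_string(), value)]))
}

/// Accepts both `{ body: { posts: [...] } }` and `{ body: [...] }`,
/// falling back to an empty slice for anything else.
fn extract_posts(root: &Json) -> &[Json] {
    let body = root.get("body");
    body.and_then(|b| b.get("posts"))
        .and_then(|p| p.as_array())
        .or_else(|| body.and_then(|b| b.as_array()))
        .map(Vec::as_slice)
        .unwrap_or_default()
}

fn main() {
    let search = obj("body", obj("posts", Json::Array(vec![Json::Str("a".into())])));
    let feed = obj("body", Json::Array(vec![Json::Str("a".into()), Json::Str("b".into())]));
    println!("search shape: {} post(s)", extract_posts(&search).len());
    println!("feed shape: {} post(s)", extract_posts(&feed).len());
}
```

Returning a borrowed slice avoids the temporary `&vec![]` that the removed code relied on, which is the same reason the commit introduces the named `empty` vector.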
@@ -319,16 +407,15 @@ impl Provider for XfreeProvider {
) -> Vec<VideoItem> { ) -> Vec<VideoItem> {
let page = page.parse::<u8>().unwrap_or(1); let page = page.parse::<u8>().unwrap_or(1);
let res = self let query = query.unwrap_or_default();
.to_owned() let res = if query.trim().is_empty() {
.query( // Empty query => front page feed, not the search endpoint.
cache, self.to_owned().front_page(cache, page, options, pool).await
page, } else {
&query.unwrap_or("null".to_string()), self.to_owned()
options, .query(cache, page, &query, options, pool)
pool, .await
) };
.await;
res.unwrap_or_else(|e| { res.unwrap_or_else(|e| {
eprintln!("xfree error: {e}"); eprintln!("xfree error: {e}");
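
The dispatch change in this hunk replaces the old literal `"null"` search term with an explicit front-page branch. A minimal sketch of that routing decision follows; the `Route` enum and `route_for` function are hypothetical and exist only to make the rule testable on its own.

```rust
/// Hypothetical routing outcome for an incoming request.
#[derive(Debug, PartialEq, Eq)]
enum Route {
    /// No usable query: serve the front-page feed.
    FrontPage,
    /// Non-empty query: use the search endpoint with this term.
    Search(String),
}

/// A missing, empty, or whitespace-only query selects the front page.
fn route_for(query: Option<String>) -> Route {
    let query = query.unwrap_or_default();
    if query.trim().is_empty() {
        Route::FrontPage
    } else {
        Route::Search(query)
    }
}

fn main() {
    println!("{:?}", route_for(None));
    println!("{:?}", route_for(Some("   ".to_string())));
    println!("{:?}", route_for(Some("cats".to_string())));
}
```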