Brain Dance ships its catalog through three JSON endpoints, no scraper required
Most VR studios make you parse their HTML page by page, run a headless browser to wait for the JavaScript to render, and pray Cloudflare doesn't decide your IP looks suspicious that hour. Brain Dance VR is the opposite. Their site is built on a Vue front-end that talks to three plain JSON endpoints. One returns the full catalog with paging. One returns a scene by id. One returns the trailer URL set. That is the entire API. No tokens, no rate limit games, no stale browser cache to fight.
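For a sense of how little code the catalog endpoint needs, here is a minimal sketch of a paging loop in Python. The base URL, the `/videos` path, and the `page` and `per_page` query parameters are placeholders I made up for illustration, not the studio's real routes. The only detail taken from the actual response is the `videos` array. The loop keeps requesting pages until one comes back empty.

```python
import json
from typing import Callable, Iterator
from urllib.request import urlopen

# Hypothetical base URL: the real endpoint paths are not listed in this post.
BASE_URL = "https://studio.example/api"


def fetch_page(page: int, per_page: int = 100) -> dict:
    """Fetch one catalog page as parsed JSON (URL shape is an assumption)."""
    url = f"{BASE_URL}/videos?page={page}&per_page={per_page}"
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_catalog(fetch: Callable[[int], dict] = fetch_page) -> Iterator[dict]:
    """Yield every video, requesting pages until one comes back empty."""
    page = 1
    while True:
        videos = fetch(page).get("videos", [])
        if not videos:
            return
        yield from videos
        page += 1
```

Passing the fetcher in as an argument keeps the loop testable without touching the network: hand it a fake function that slices a local list and it behaves the same way.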
I noticed this the same way you find a lot of scraping shortcuts: open the studio's site, hit Cmd-Opt-I for the network panel, and watch the requests roll in. The third one was a clean application/json response with a videos array of 312 items. Once you see that, you are basically done. The scraper is around 200 lines and it costs us a fraction of a cent per full sync.
There were two real puzzles though.
First was the trailer hover preview. PornBoxd shows a small autoplay loop when you hover over a video poster on the index, like Letterboxd's thumbnails but with motion. The studio's preview file was a 500x250 mp4 hosted on their own CDN. You can serve hover previews from R2 (Cloudflare's S3-compatible object storage), but R2 bills every range request as its own read operation (Class B in its pricing; Class A covers writes and listings), and a hover preview is basically nothing but range requests because each browser scrubs the file as the user moves their mouse. We moved hover trailers to Cloudflare Stream instead. Stream optimizes byte-range delivery and the cost shape matches our usage. R2 stays for static images, where you ask for the whole file once and you are done.
Second was the SDCard image. Some Brain Dance scenes carry an extra little overlay icon in the corner of the poster, the studio's branding flourish, and the scrape returns it as a separate image URL with no positioning data. We compose it on top of the regular thumbnail in the image processor (the cron job that downloads originals and uploads transformed versions to R2): pin it bottom-right and scale it to roughly 12% of the poster width. It looks correct now. Still, I am not sure overlays like this should exist at the catalog level. They feel like something the studio added because their internal CMS happened to have a free corner.
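The placement rule is just arithmetic, so here is a rough sketch of it in Python. The function name, the `Placement` type, and the `margin` parameter are hypothetical, and the real image processor may round or pad differently. It scales the icon to 12% of the poster width, preserves the icon's aspect ratio, and returns the top-left corner that pins it bottom-right.

```python
from typing import NamedTuple


class Placement(NamedTuple):
    x: int       # left edge of the overlay on the poster
    y: int       # top edge of the overlay on the poster
    width: int   # scaled overlay width in pixels
    height: int  # scaled overlay height in pixels


def overlay_placement(
    poster_w: int,
    poster_h: int,
    icon_w: int,
    icon_h: int,
    scale: float = 0.12,  # overlay width as a fraction of poster width
    margin: int = 0,      # hypothetical padding from the poster edges
) -> Placement:
    """Scale the icon to a fraction of the poster width and pin it bottom-right."""
    width = max(1, round(poster_w * scale))
    # Preserve the icon's aspect ratio when deriving the height.
    height = max(1, round(icon_h * width / icon_w))
    return Placement(
        x=poster_w - width - margin,
        y=poster_h - height - margin,
        width=width,
        height=height,
    )
```

For a 1000x500 poster and a 200x100 icon, this gives a 120x60 overlay whose top-left corner sits at (880, 440). The resulting box can be handed to whatever image library does the actual compositing.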
On a related note: I have started enjoying the studios that ship boring infrastructure. A Vue site with three endpoints is not impressive, but it costs us roughly nothing to keep in sync, the data is structured, and I never have to wake up to a "scraper failed" page. If more studios did this, we would have a hundred catalogs by now.