Devlog

Three new VR studios, added the easy way (and an old data mess that came with them)

Two new affiliate portals joined the deck this week: one covering four Naughty America sub-brands, the other covering a niche VR site nobody has heard of. The job sounded simple: pull their catalogues into PornBoxd, wire up the tracking links, move on to the next thing. That’s not how the week went.

Four walls and a welcome mat

First reconnaissance on the four Naughty America sites (Real Teens VR, Real Pornstars VR, Naughty America’s own VR section, and Virtual Sex World) was quick and discouraging. Three of them sit behind AWS’s captcha wall, the same kind that stops bots cold. No amount of proxy-hopping was going to make that economical. Normally that’s where the story ends for a studio.

Except the Naughty America affiliate network has something almost no adult studio bothers with: a real API. Public JSON endpoint, no password, full per-scene metadata: title, synopsis, runtime, cast split by gender, release date, thumbnails, trailers in every VR format. The kind of thing you get from a catalog that knows it’s a catalog.

That’s the welcome mat. A week’s worth of scraping fights, traded for an afternoon’s worth of walking through a front door. By evening, three studios were live:

  • Real Teens VR, 24 videos
  • Real Pornstars VR, 37 videos
  • Naughty America VR, 1,350 videos

The fourth site, Virtual Sex World, is publicly live but entirely empty, still running WordPress’s default “Hello world!” post from 2024. Parked. We’ll check back quarterly.

1,411 new videos, zero scrapes, zero proxy fights.

The old mess the new catalogue exposed

Somewhere during the big import, a stream of warning messages scrolled past in the logs:

“Chloe Couture absorbed into Chloe Cherry.” “Jaclyn Taylor absorbed into Ally Breelsen.” “Julia Ann absorbed into JULIA.”

Those aren’t merges. Those are different people, being silently stapled together as if they were the same person.

Here’s what was happening. Every performer profile on PornBoxd carries a list of aliases: stage names, former names, alternate spellings. It’s useful data, and it comes from a long-standing enrichment scraper that pulls it from a public performer database. When a new scene arrives crediting “Chloe Cherry,” the importer checks: does any existing actor have that name in their aliases? If so, link the scene there instead of creating a duplicate row.

The problem was that some of the aliases were wrong. The performer database has a lot of crowdsourced data, and occasionally two completely unrelated performers get listed as aliases of each other. Worse: male performers (who often work under multiple stage names across niches) were accumulating single-word aliases like “Chad” or “Blair”, names so generic they’d absorb anyone with the same first name from any future import.
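To make the failure mode concrete, here is a minimal sketch of that kind of alias lookup. The `Actor` class and `find_by_alias` function are illustrative names of my own, not the importer’s actual code; the point is only that an exact alias hit is trusted with no further questions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Actor:
    # Hypothetical in-memory stand-in for an actor row.
    name: str
    aliases: Set[str] = field(default_factory=set)


def find_by_alias(actors: List[Actor], credited: str) -> Optional[Actor]:
    """Return the first actor whose name or alias list contains the credit."""
    wanted = credited.casefold()
    for actor in actors:
        known = {actor.name.casefold()} | {a.casefold() for a in actor.aliases}
        if wanted in known:
            return actor
    return None


# One bad crowdsourced alias is enough: "Jaclyn Taylor" filed on an
# unrelated profile means her scenes get linked to that profile.
actors = [Actor("Ally Breelsen", {"Jaclyn Taylor"})]
absorbed = find_by_alias(actors, "Jaclyn Taylor")
```

With no check beyond string equality, the lookup is only as good as the worst alias in the table.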

Chasing it down uncovered the real root cause: the enrichment scraper had a bug where, when it had to fall back to a search query, it hardcoded a filter that only returned women. Any male actor whose exact profile URL didn’t resolve was getting stamped with whatever top female result came back for his name. Chad White’s aliases were literally a random collection of women’s stage names.
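In sketch form, the bug and one plausible shape of the patch look something like this. Everything here is assumed for illustration: the function names, the `q` and `gender` parameter names, and the idea that the profile’s own gender is passed through. None of it is the performer database’s real search API.

```python
from typing import Dict, Optional


def build_fallback_query_buggy(name: str) -> Dict[str, str]:
    # The bug: the filter is hardcoded, so every fallback search
    # can only ever return women, whoever is being looked up.
    return {"q": name, "gender": "female"}


def build_fallback_query(name: str, gender: Optional[str]) -> Dict[str, str]:
    # Sketch of a fix: filter by the performer's own gender,
    # and skip the filter entirely when it is unknown.
    params = {"q": name}
    if gender:
        params["gender"] = gender
    return params


query = build_fallback_query("Chad White", "male")
```

The buggy version would happily search for “Chad White” among women only and take the top hit, which is how his alias list filled up with strangers.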

§§§

Three fixes, in order

One: patch the enrichment scraper so it stops poisoning new records. Deployed this morning.

Two: harden the importer so even bad aliases already in the database can’t do damage. Now, when considering an alias match, the importer requires two sanity checks: the incoming name has to be more than one word (so “Chad” alone can’t match anyone), and the alias has to share at least one word with the canonical actor’s name (so “Jaclyn Taylor” on an “Ally Breelsen” profile no longer counts as “same person”). The narrow trade-off: genuine cross-niche stage names that share zero words now need a manual merge instead of auto-linking. Worth it.

Three: clean up what’s already in the database. Swept out nearly three thousand accidental self-references (actors listing their own name, lowercased, as an alias). Dropped the worst of the known false aliases. Merged a dozen actor rows that were duplicates under different capitalizations or spellings: “Cherie DeVille” and “Cherie Deville” as separate people, “Bridgette B.” and “Bridgette B” as separate people, that kind of thing. The remaining stragglers, mostly around JAV performers who legitimately use multiple stage names across their careers, now live in a review file I’ll work through one quiet afternoon at a time. The important part is that none of them can bleed into new imports anymore.
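The two sanity checks from the second fix are small enough to sketch. This is a minimal version under my own assumptions (the function names, case-insensitive comparison, and splitting on whitespace), not a copy of the importer:

```python
from typing import Set


def words(name: str) -> Set[str]:
    # Case-insensitive word set; splitting on whitespace is an assumption.
    return set(name.casefold().split())


def alias_match_allowed(incoming: str, alias: str, canonical: str) -> bool:
    """Decide whether an alias hit may link `incoming` to `canonical`."""
    if incoming.casefold() != alias.casefold():
        return False  # not an alias hit in the first place
    if len(incoming.split()) < 2:
        return False  # check 1: single-word names never match by alias
    # check 2: the alias must share at least one word with the canonical name
    return bool(words(alias) & words(canonical))


single_word = alias_match_allowed("Chad", "Chad", "Chad White")
unrelated = alias_match_allowed("Jaclyn Taylor", "Jaclyn Taylor", "Ally Breelsen")
spelling = alias_match_allowed("Cherie Deville", "Cherie Deville", "Cherie DeVille")
```

Here `single_word` and `unrelated` come back `False`, while the capitalization variant in `spelling` still links. A stage name that shares no word with the canonical name is rejected too, which is exactly the manual-merge trade-off described above.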

Also, the posters were sideways

One self-inflicted bug along the way: the first pass of the new importer grabbed the portrait thumbnail field from the API instead of the landscape one. Every new scene’s card was letterboxed with a tall skinny image in a wide slot. Noticed, fixed, re-imported.

And while I was at it, I noticed that the scene titles coming out of Real Teens VR were placeholder garbage like “Violet Rain | Age 19 | 33 min”, not actual scene titles, just the studio’s auto-generated filler because nobody at Naughty America ever bothered to write real ones for this sub-brand. Cleaned at the import layer, so the display now shows just the actor’s name. The URL slug keeps a scene-ID suffix so two scenes from the same performer don’t collide.
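A rough sketch of that import-layer cleanup, assuming the placeholder always follows the “Name | Age N | N min” shape from the example. The regex, the function names, and the scene ID value are mine, for illustration only:

```python
import re

# Matches placeholder titles shaped like "Violet Rain | Age 19 | 33 min".
PLACEHOLDER = re.compile(
    r"^(?P<name>[^|]+?)\s*\|\s*Age\s+\d+\s*\|\s*\d+\s*min$",
    re.IGNORECASE,
)


def clean_title(raw: str) -> str:
    """Reduce a placeholder title to the performer's name; leave real titles alone."""
    match = PLACEHOLDER.match(raw.strip())
    return match.group("name").strip() if match else raw.strip()


def scene_slug(title: str, scene_id: str) -> str:
    """Slugify the title and append the scene ID so repeat names don't collide."""
    base = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{base}-{scene_id}"


title = clean_title("Violet Rain | Age 19 | 33 min")
slug = scene_slug(title, "12345")  # "12345" is a made-up scene ID
```

Titles that don’t match the placeholder pattern pass through untouched, so real titles from the other sub-brands aren’t affected.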

What’s next

  • Virtual Pee, the other new affiliate portal. No friendly API this time, but the site serves clean HTML with all the metadata spelled out on each page: nationality, age, measurements, the works. Straightforward browser-style scrape at the cheap tier. Content comes first; starting on this tonight.
  • Affiliate wiring for both new networks. The outbound links on these 1,411 new scenes currently just point at the studios’ own scene pages, which works but doesn’t earn. Deferred for now.
  • The JAV alias cleanup: about sixty rows of domain-judgment calls waiting for the right kind of afternoon.