← Field Notes

Devlog

I was about to SSH into prod to onboard a new studio. Then I built a wizard.

After 140 manual studio onboardings, the SSH-and-pray ritual was up. A 3-step wizard plus a daily aggregator coverage cron now does it in 90 seconds.

By Mortimer CockburnMay 2, 20265 min read

Adding studio number 140 felt the same as adding studio number 5. That was the problem.

The flow that had served me from day one looked like this. Open four browser tabs, one each for POVR, SLR, VRSMASH, and vrporn. Search the studio name on each. Eyeball the catalog size on each one and decide which platform has the freshest catalog and the fullest scene list. SSH into prod. Run a one-shot ts-node script with the studio's name, the platform I picked, and the slug for that platform. Hope nothing broke. If it broke, edit a row in psql by hand. Make sure I remembered to flip trailer_hover_enabled on so the hover-preview pipeline picks it up. Deploy.

Six steps, every time. I have onboarded around 140 studios in the last three weeks. Every onboarding session erased an hour somewhere.

This week I built the wizard.

The data layer underneath it is a new table called aggregator_studio_coverage, one row per (source × source_studio_slug) tuple. A daily 02:00 UTC cron walks each of the four aggregator studio indexes and snapshots two columns per studio per platform: total_video_count and latest_release_date. First run on prod yielded 1418 rows.

The four extractors take different shapes because the four aggregators don't share much. VRPorn ships a Nuxt SSR payload that includes both numbers in one fetch. POVR needs two fetches: one for the listing page, one for the topmost-scene detail page so we can grab the latest date (POVR's "N scenes" header is unreliable, so we count scenes-on-page-1 times max-page-number to estimate the total). SLR is two fetches with the count coming from a header text and the date from a topmost detail. VRSMASH is the worst, two to twelve fetches per studio because we cap at a 10-page walk to keep the wall-clock bounded; studios over 320 scenes get marked with a 320+ capped flag in raw_metadata. The whole sequential 4-platform sweep takes about 42 minutes wall-clock, dominated by VRSMASH's 1500ms throttle (Cloudflare's WAF caps single-IP traffic on that origin around 250 requests per 15 minutes).

For studios that are already onboarded (i.e. linked to a row in our local studios table), the coverage cron short-circuits. It reads counts from our own videos table instead of going outbound. Zero outbound HTTP for those rows. That's load-bearing because it means the cron's cost grows with the number of un-onboarded studios on the four aggregators (a finite set), not with our own catalog growth.

The wizard itself sits inside /admin/studios as a 3-step modal.

Step 1 is identity. Display name, slug, format, orientation. Plus a "combine with similar" picker that runs a trigram search against studio_directory.name for entries similar to what I just typed. The picker exists because cross-platform name-mismatch is real. Bravomodelsmedia on POVR is the same studio as Bravomodels on SLR, but a naive name-match would create two rows. The picker checkbox-merges them.

Step 2 is canonical source plus slug pairings. Always-visible 4-cell coverage strip showing total_video_count and latest_release_date per platform. The operator picks which platform is canonical (the one whose catalog gets imported as the source of truth) and pastes in the per-platform slugs for the other three (used later for cross-source enrichment). The strip exists because the freshest catalog is not always the one I would have guessed. Some studios moved their primary distribution to SLR after VRPorn changed terms. Some are POVR-only. Eyeballing 4 numbers in one strip beats opening 4 tabs.

Step 3 is affiliate posture. Skip, NATS, or direct. NATS asks for site_id and platform; direct asks for the affiliate base URL and link mode. This is the only step that's still a pure form, because the affiliate side of onboarding can't be coverage-tracked.

The whole submit fires POST /admin/onboard-from-coverage, which is one transaction that creates the studios row, creates or links the studio_directory entry, populates the four aggregator_slug columns, links the four coverage rows, and (optionally) kicks off the first scrape via the existing scheduler endpoint. Default for new studios includes trailer_hover_enabled = TRUE so the trailers cron picks up hover previews automatically. That default came out of a previous bug where a different default had left 344 trailer rows sitting unprocessed for a day.

The bug-fix loop after first dogfood was satisfying in a "yes, this is exactly why we test in production" way.

POVR pages aren't 12 scenes each, they are about 42. My placeholder constant from the prototype was wrong. Total counts were off by 3.5x for every POVR studio in the first run. Caught when a studio that should have shown around 600 scenes showed around 170.

VRSMASH's "N videos" header text returns NULL on some studios. The bounded walk now returns NULL totals correctly, and the UI renders that as a placeholder dash instead of "0". Both the NULL-tolerance and the dash placeholder were fixes, not the original behavior.

ORDER BY on a column aliased back to its source name was lexically sorting numeric values. This bug has bitten me before. I have a memory rule about it now (feedback_postgres_aggregate_gotchas).

MAX(uuid) doesn't exist in Postgres. Cast to text first. Pinned in the same memory rule.

The wizard's first version forgot to set trailer_hover_enabled=TRUE on creation. Caught when a freshly-onboarded studio had no hover previews after the first scrape.

The next studio I onboarded after the bugs settled took 90 seconds end-to-end. Then I went and onboarded six more. Eyeballing 4 numbers in a coverage strip is meaningfully different from opening 4 tabs and squinting at headers.

Two things stay parked. First, the existing-studios pairing UI: most of my ~140 studios were onboarded before the wizard, so their aggregator_slug columns are empty and they don't get the cross-source enrichment hop. I need a follow-up tool to backfill those one studio at a time from the same modal. Second, the batch onboarding wizard, which would walk every unlinked aggregator studio in one session. I want to dogfood the per-studio modal flow first before automating it further.

Memory rule: a workflow that takes the same number of minutes whether you do it 5 times or 140 times is a workflow that should have been a tool ten studios ago.