How AI puts you on the jumbotron, explained
Every few days someone replies to one of our videos with the same question: “wait, how is this actually made?” Fair question. The clip looks like real broadcast footage (stadium lights, a score bug in the corner, a crowd you can almost hear), but it was built from a single selfie in about a minute. Here's exactly how, including the version that didn't work and why we threw it out.
The version we tried first (and scrapped)
The obvious approach is one step: hand a selfie straight to an image-to-video model and say “put this person on a stadium jumbotron, make it move.” We built that first. It looked fake, and consistently in the same way: because most selfies are tight, well-lit head-and-shoulders shots, the model treated the person like a cut-out and rendered the stadium as a flat backdrop behind them — green-screen energy. The lighting on the face never matched the scene, so your brain instantly flagged it as composited.
The lesson was simple but important: you can't animate your way to realism if the still frame already looks pasted-on. The realism has to be baked into the image before anything moves.
So we split it into two steps
What actually ships today is a two-stage pipeline, and each stage does one job well.
Step 1 — put you in the stadium (a still)
First we run an AI image editor that takes your selfie and re-renders it as a single broadcast still: you, on the big screen, in a real-looking ballpark, with a summer evening, telephoto crowd-cam framing, and an ESPN/Fox-style score bug in the corner. Crucially, the model relights your face to match the scene and places you within the environment rather than in front of it. This is the step that kills the green-screen look. If the still is convincing, you're 90% of the way there.
Step 2 — bring the still to life (the video)
Then we feed that finished still into an image-to-video model (we use Kling 3.0 via Magic Hour) to add motion — subtle crowd movement, a touch of camera life, the micro-expressions that sell “this is a live shot.” Because the model is animating a frame that already looks real, the motion reinforces the illusion instead of exposing it. The clip renders at roughly 8 seconds in 16:9 widescreen.
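To make the two-stage split concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `stage_still` and `animate_still` are hypothetical stand-ins passed in as plain callables, not the real image editor or the Magic Hour/Kling API, and the prompt text is only paraphrased from the description above.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt, paraphrased from the description of step 1.
STILL_PROMPT = (
    "Re-render this person on a stadium jumbotron: summer evening, "
    "telephoto crowd-cam framing, broadcast score bug in the corner, "
    "face relit to match the scene."
)


@dataclass
class Clip:
    video: bytes
    seconds: int
    aspect: str


def make_fan_cam(
    selfie: bytes,
    stage_still: Callable[[bytes, str], bytes],         # hypothetical step 1 backend
    animate_still: Callable[[bytes, int, str], bytes],  # hypothetical step 2 backend
    seconds: int = 8,
    aspect: str = "16:9",
) -> Clip:
    """Run step 1 (the still), then feed its output to step 2 (the motion)."""
    # Step 1: the realism is baked into a single still before anything moves.
    still = stage_still(selfie, STILL_PROMPT)
    # Step 2: the video model only ever sees the finished still, never the raw selfie.
    video = animate_still(still, seconds, aspect)
    return Clip(video=video, seconds=seconds, aspect=aspect)
```

The point of the shape is the data flow: the raw selfie never reaches the video model, so any green-screen look has to be fixed in the still or not at all.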
Why 16:9 and not a vertical clip
We actually started in 9:16 vertical (it's what phones shoot, and it's native to TikTok). We switched to 16:9 widescreen on purpose. A real jumbotron broadcast is horizontal, so a vertical frame subtly reads as “phone clip” and undercuts the “caught on TV” effect. The widescreen frame, with the score bug and the letterboxing of a broadcast, does a lot of the believability work before the viewer even processes the content. Small choice, big payoff.
The genuinely hard parts
Two things are hard and worth being honest about:
- Keeping your face your face. Identity drift — where the output looks like a cousin of you rather than you — is the number-one failure mode. It gets worse with low-light selfies, sunglasses, heavy filters, or photos where your face is small in the frame. A clear, front-facing, well-lit selfie is the single biggest thing you control. We wrote a whole guide on taking the perfect selfie for this.
- Cost and time per clip. Generating video is not free or instant: there's real compute behind every render, which is why it takes a few minutes rather than a few seconds. We'd rather wait and ship something that looks real than rush and ship something that looks like a meme template.
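As an illustration of the selfie factors listed above, here is a small Python sketch of a pre-upload check. It is not something this post claims the product runs. The thresholds are made-up values, and the face box and brightness figure are assumed to come from whatever face detector and image library you already use.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
MIN_FACE_AREA_FRACTION = 0.10  # face box should cover at least 10% of the frame
MIN_MEAN_LUMA = 60.0           # 0-255 scale; below this we call it low light


@dataclass
class SelfieStats:
    width: int        # image width in pixels
    height: int       # image height in pixels
    face_w: int       # face box width from any face detector (assumed upstream)
    face_h: int       # face box height
    mean_luma: float  # average brightness, 0 (black) to 255 (white)


def selfie_warnings(s: SelfieStats) -> list[str]:
    """Return human-readable warnings for selfies likely to cause identity drift."""
    warnings = []
    face_fraction = (s.face_w * s.face_h) / (s.width * s.height)
    if face_fraction < MIN_FACE_AREA_FRACTION:
        warnings.append("face is small in the frame")
    if s.mean_luma < MIN_MEAN_LUMA:
        warnings.append("photo looks low-light")
    return warnings
```

Sunglasses and heavy filters are left out on purpose: detecting them needs a real vision model, and a size-and-brightness check already covers two of the factors you can fix by retaking the photo.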
Why we text you the result instead of making you wait
Because step 2 takes a few minutes, staring at a loading bar is a bad experience. So we ask for your number, kick off the render, and text you the link the moment it's ready — you can close the tab and go on with your day. (We only ever message you about your video; reply STOP and we stop. More on that in our privacy policy.)
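The render-then-text flow can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not our production code: `start_render`, `get_video_url`, `send_sms`, and `opted_out` are hypothetical callables standing in for a render backend, an SMS provider, and an opt-out list, and a real system would more likely use a webhook or job queue than a polling loop.

```python
import time
from typing import Callable, Optional


def render_and_notify(
    start_render: Callable[[], str],                   # hypothetical: returns a job id
    get_video_url: Callable[[str], Optional[str]],     # hypothetical: None until ready
    send_sms: Callable[[str, str], None],              # hypothetical: (phone, message)
    phone: str,
    opted_out: Callable[[str], bool],                  # hypothetical: True after STOP
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> bool:
    """Start a render, poll until a URL is available, then text the link once.

    Returns True if a message was sent, False on timeout or opt-out.
    """
    job_id = start_render()
    for _ in range(max_polls):
        url = get_video_url(job_id)
        if url is not None:
            # Respect STOP: never message a number that has opted out.
            if opted_out(phone):
                return False
            send_sms(phone, f"Your video is ready: {url}")
            return True
        time.sleep(poll_seconds)
    return False
```

Because the waiting happens on the server side, nothing in this flow depends on the browser tab staying open.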
Is it “real”? No — and we never pretend otherwise
Let's be clear: these clips are AI-generated entertainment, not footage from an actual game. There's a small onthetron.com mark on every clip, and we say so up front. The fun isn't in fooling anyone; it's in the “ha, that looks exactly like the real fan cam” reaction when you send it to the group chat.
That's the whole pipeline. If you want to see it run, the easiest way is to just make one with your own selfie. It's free and takes about a minute.
Related reading
- Why fan-cam videos go viral on TikTok (and how to make yours pop). The fan-cam look hijacks a feeling we all recognize: being seen, celebrated, caught on the big screen. Here's why that format travels so well, and what we've noticed makes one actually take off.
- The most iconic jumbotron moments of all time. Proposals, dance-offs, kiss-cam chaos, the lone superfan: a tour of the big-screen moments that became part of sports culture, and what each one teaches about why the format works.
- What actually makes a good jumbotron video (we made a bunch). After generating a lot of these, the difference between a clip that lands and one that flops is surprisingly consistent. Here's what works, what falls flat, and the patterns behind both.