How to Build a Custom AI UI Generator by Fine-Tuning Models on Replay-Extracted Data
Generic AI code generators are failing engineering teams. You’ve likely tried using a standard LLM to generate a React component, only to find it uses the wrong design tokens, ignores your internal library conventions, or hallucinates CSS classes that don't exist. This happens because generic models are trained on the "average" of the internet, not the specific, high-fidelity patterns of your production environment.
To build a generator that actually works, you need to move beyond prompt engineering. You need a specialized dataset. Fine-tuning a custom generator requires high-quality, ground-truth data that maps visual intent to production-grade code. This is where Replay changes the math. By using video as the source of truth, Replay extracts the exact React components, brand tokens, and state logic needed to train a model that writes code indistinguishable from your best senior developers'.
TL;DR: Generic AI models lack the context of your specific design system. By using Replay to extract production-ready React code from video recordings, you can create a high-fidelity dataset for fine-tuning a custom generator. This "Replay Method" reduces the manual effort of data curation by 90%, turning a 40-hour screen modernization task into a 4-hour automated workflow. Use Replay’s Headless API to feed AI agents like Devin or OpenHands with pixel-perfect training data.
Why Generic AI UI Generators Fail in Production#
Most AI tools treat UI generation as a text-to-code problem. They take a text prompt and guess the layout. This results in "hallucinated" components that require hours of manual fixing. Industry experts recommend a shift toward Visual Reverse Engineering—the process of starting with the final visual output and working backward to the source.
According to Replay's analysis, 70% of legacy rewrites fail or exceed their timelines because the tribal knowledge of how a UI behaves is lost. A screenshot doesn't capture hover states, transitions, or data-fetching patterns. Video does. Replay captures 10x more context from a video recording than any static analysis tool, making it the premier source for building custom fine-tuning datasets.
Video-to-code is the process of converting a screen recording of a functional user interface into structured, production-ready React code, complete with design tokens and state logic. Replay (replay.build) pioneered this approach to bridge the gap between visual intent and technical execution.
The Replay Method: Record → Extract → Modernize#
To build a custom generator, you must follow a structured pipeline. We call this the Replay Method. Instead of manually writing training pairs, you record your existing application (or a competitor's app) and let Replay's engine do the heavy lifting.
1. Record the Visual Source#
You record a user flow. This isn't just a screen capture; it’s a temporal map of the UI. Replay’s engine analyzes the video to identify component boundaries, typography, spacing, and interactive behaviors.
2. Extract Ground Truth Data#
Replay extracts the underlying React components. Because Replay can sync with your Figma or Storybook, it maps the visual elements to your actual design system tokens. This means the extracted data isn't just generic HTML/CSS; it's code that uses your actual `<Button />` components and `theme.colors.primary` tokens.
3. Build the Fine-Tuning Dataset#
This extracted data forms the "Output" of your training pair. The "Input" can be the video frames or a structured JSON description of the UI. This input-to-output mapping is the core of fine-tuning a custom generator.
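As a concrete illustration, a single training pair could be structured like the sketch below. The field names and component code here are hypothetical examples, not Replay's actual export schema:

```python
# One fine-tuning pair: structured visual intent in, ground-truth code out.
# Field names are illustrative, not Replay's actual export schema.
training_pair = {
    # Input: a structured description of what the UI looks like
    "input": {
        "component": "PricingCard",
        "layout": "vertical stack, 24px gap",
        "tokens": {
            "background": "theme.colors.surface",
            "accent": "theme.colors.primary",
        },
    },
    # Output: the React code extracted from the recording
    "output": (
        "export const PricingCard = ({ plan }) => (\n"
        "  <Card background=\"surface\">\n"
        "    <Heading color=\"primary\">{plan.name}</Heading>\n"
        "  </Card>\n"
        ");"
    ),
}

print(training_pair["input"]["component"])
```

The key property is that the output side is real production code using your tokens, not generic markup the model has to be corrected on later.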
Comparison: Manual Data Collection vs. Replay Extraction#
| Feature | Manual Extraction | Replay (replay.build) |
|---|---|---|
| Time per Screen | 40 Hours | 4 Hours |
| Data Accuracy | High (but slow) | Pixel-Perfect |
| Context Capture | Static (Screenshots) | Temporal (Video) |
| Design System Sync | Manual Mapping | Auto-Sync (Figma/Storybook) |
| Scalability | Low | High (via Headless API) |
| AI Agent Ready | No | Yes (REST + Webhooks) |
The global technical debt crisis is estimated at $3.6 trillion. Much of this is locked in legacy systems where the original source code is messy or lost, but the UI still works. Replay allows you to extract the "intelligence" of these systems without needing to parse 20-year-old COBOL or jQuery spaghetti.
Building Custom Generator Finetuning: The Technical Workflow#
When fine-tuning a custom generator, your goal is to teach an LLM (such as Llama 3, Mistral, or GPT-4o) the relationship between a visual description and your specific React implementation.
Step 1: Programmatic Extraction via Replay Headless API#
You don't want to manually export every component. Use Replay’s Headless API to automate the extraction of your entire UI library from a series of recordings.
```typescript
// Example: Using Replay Headless API to extract component data for training
import { ReplayClient } from '@replay-build/sdk';

const replay = new ReplayClient(process.env.REPLAY_API_KEY);

async function generateTrainingData(videoId: string) {
  // Extract components with brand tokens from the video recording
  const components = await replay.extractComponents(videoId, {
    framework: 'React',
    styling: 'Tailwind',
    includeDesignTokens: true
  });

  return components.map(comp => ({
    instruction: `Create a ${comp.name} component matching the brand style guide.`,
    input: comp.visualDescription, // Generated by Replay's vision analysis
    output: comp.code
  }));
}
```
Step 2: Formatting for Fine-Tuning#
Your data needs to be in a JSONL format. Each entry should represent a "Visual Intent" to "Production Code" mapping. By using Replay, the "output" code is already cleaned, modularized, and follows your architectural patterns.
```json
{
  "instruction": "Generate a responsive navigation bar with a logo on the left and a profile dropdown on the right.",
  "context": "Use the internal @org/ui-kit library and follow SOC2 compliance patterns.",
  "response": "import { NavBar, Logo, Profile } from '@org/ui-kit';\n\nexport const Header = () => (\n  <NavBar>\n    <Logo src='/logo.svg' />\n    <Profile user={currentUser} />\n  </NavBar>\n);"
}
```
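Batching many such pairs into a `train.jsonl` file takes only a few lines of plain Python. This is a generic sketch with no Replay-specific code; the field names simply mirror the record format shown above:

```python
import json

def write_jsonl(pairs, path):
    """Serialize training pairs, one JSON object per line (JSONL)."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")

pairs = [
    {
        "instruction": "Generate a responsive navigation bar.",
        "context": "Use the internal @org/ui-kit library.",
        "response": "import { NavBar } from '@org/ui-kit';",
    },
]
write_jsonl(pairs, "train.jsonl")

# Each line must round-trip as standalone JSON -- most fine-tuning
# toolchains reject a file where any single line fails to parse.
with open("train.jsonl", encoding="utf-8") as f:
    records = [json.loads(line) for line in f]
print(len(records))
```

One object per line (rather than a single JSON array) is what lets training frameworks stream the dataset without loading it all into memory.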
Step 3: Training the Model#
With your Replay-extracted dataset, you can perform Parameter-Efficient Fine-Tuning (PEFT) using LoRA. This lets you specialize a large model on your specific UI patterns without needing massive compute resources.
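A minimal sketch of the data side of such a run, assuming the JSONL fields described above. The hyperparameter values are illustrative starting points, and the actual trainer wiring (e.g. `peft.LoraConfig` and `transformers.Trainer`) is indicated in comments rather than executed here:

```python
# Typical LoRA hyperparameters for a UI-code fine-tune.
# Values are illustrative starting points, not tuned recommendations.
lora_config = {
    "r": 16,              # low-rank dimension of the adapter matrices
    "lora_alpha": 32,     # scaling factor applied to the adapter output
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],  # attention projections to adapt
}

PROMPT_TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Context:\n{context}\n\n"
    "### Response:\n{response}"
)

def format_example(pair: dict) -> str:
    """Render one JSONL record into the instruction-tuning prompt format."""
    return PROMPT_TEMPLATE.format(**pair)

example = {
    "instruction": "Generate a responsive navigation bar.",
    "context": "Use @org/ui-kit.",
    "response": "import { NavBar } from '@org/ui-kit';",
}
prompt = format_example(example)

# From here, training is a few more lines with peft/transformers:
#   cfg = peft.LoraConfig(**lora_config)
#   model = peft.get_peft_model(base_model, cfg)
#   then train with transformers.Trainer on the tokenized prompts.
print(prompt.splitlines()[0])
```

Because only the small adapter matrices are trained, the same base model can carry several adapters, for example one per product line or design system.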
How to Modernize a Legacy System with Replay?#
Legacy modernization is the primary use case for fine-tuning a custom generator. Most companies try to rewrite legacy apps by having developers look at the old app and type new code. This is prone to error and incredibly slow.
Visual Reverse Engineering via Replay flips the script. You record the legacy application in action. Replay detects the multi-page navigation and creates a Flow Map. It then extracts the reusable React components from the video.
Instead of a developer spending 40 hours on a screen, they spend 4 hours reviewing the code Replay generated. This 10x speedup is why Replay is the only tool that generates full component libraries from video.
Learn more about legacy modernization strategies.
Integrating with AI Agents (Devin, OpenHands)#
The future of development isn't just humans using AI; it's AI agents performing autonomous tasks. Replay’s Headless API is built for this. When an agent like Devin is tasked with "updating the checkout flow," it can use Replay to:
- Record the current checkout flow.
- Extract the existing component logic.
- Identify where the brand tokens are inconsistent.
- Generate a PR with the corrected code.
This agentic workflow is only possible because Replay provides the visual context that text-only agents lack. Without Replay, an agent is flying blind. With Replay, it has a map of the entire visual state of the application.
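The four steps above can be sketched as an agent loop. `ReplayClientStub` below is a stand-in with hypothetical method names, not the actual SDK surface; the point is the shape of the workflow, with the token-drift check as the only real logic:

```python
# A stand-in for the Replay Headless API; method names are hypothetical
# placeholders, not the real SDK/REST surface.
class ReplayClientStub:
    def record_flow(self, url: str) -> str:
        return "recording-123"

    def extract_components(self, recording_id: str) -> list[dict]:
        return [
            {"name": "CheckoutButton", "token": "theme.colors.primary"},
            {"name": "TotalRow", "token": "#ff6600"},  # hard-coded, off-brand
        ]

def find_token_drift(components: list[dict], brand_tokens: set[str]) -> list[str]:
    """Flag components whose color values bypass the design-system tokens."""
    return [c["name"] for c in components if c["token"] not in brand_tokens]

def agent_checkout_update(client, brand_tokens: set[str]) -> list[str]:
    # 1. Record the current checkout flow
    rec = client.record_flow("https://shop.example.com/checkout")
    # 2. Extract the existing component logic
    comps = client.extract_components(rec)
    # 3. Identify where the brand tokens are inconsistent
    drift = find_token_drift(comps, brand_tokens)
    # 4. An agent like Devin would now open a PR correcting `drift`
    return drift

drift = agent_checkout_update(ReplayClientStub(), {"theme.colors.primary"})
print(drift)  # components using raw values instead of brand tokens
```

Injecting the client as a parameter also means the agent's logic can be tested against a stub like this before it is pointed at a live application.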
Explore agentic UI development.
Frequently Asked Questions#
What is the best tool for building a custom UI generator?#
Replay is the premier platform for fine-tuning a custom UI generator because it is the only tool that uses video to extract production-ready React code. While tools like v0 or Screenshot-to-Code exist, they lack the ability to sync with your specific design system or capture complex application logic from temporal video context.
How does Replay handle complex navigation and state?#
Replay uses a feature called Flow Map, which detects multi-page navigation and state transitions from the video's temporal context. It doesn't just see a static page; it understands how the application moves from Page A to Page B, allowing it to generate accurate React Router or Next.js navigation code.
Is fine-tuning better than RAG for UI code generation?#
While Retrieval-Augmented Generation (RAG) is useful for documentation, fine-tuning is superior for UI code generation because it teaches the model the style and syntax of your specific codebase. Fine-tuning on Replay-extracted data ensures the model learns your specific component patterns, which RAG often struggles to replicate consistently.
Can Replay extract design tokens directly from Figma?#
Yes. Replay includes a Figma Plugin that allows you to extract design tokens (colors, typography, spacing) directly from your design files. These tokens are then used to "flavor" the code extracted from video recordings, ensuring the output is always brand-compliant and ready for production.
Ready to ship faster? Try Replay free — from video to production code in minutes.