Can AI Write Unit Tests Based on Video Recordings of Legacy Software?
Legacy software is a black box. You inherited a system built in 2005, the original developers are gone, and the documentation—if it ever existed—is a 200-page PDF that hasn't been updated since the Obama administration. When you're tasked with modernizing these systems, the biggest hurdle isn't writing the new code; it's understanding what the old code actually does.
The industry standard for years was manual discovery. Business analysts would sit with users, watch them click buttons, and take notes. Then, developers would try to reverse-engineer the logic. This process is why 70% of legacy rewrites fail or exceed their timelines. It’s slow, prone to human error, and incredibly expensive.
But a new category of technology called Visual Reverse Engineering is changing this. By using AI to analyze video recordings of user workflows, enterprises can now generate documented React components and comprehensive test suites automatically.
TL;DR: Yes, AI can now write unit tests based on video recordings of legacy software. By using Replay, teams can record real user workflows and automatically extract the underlying logic into documented React code and test suites. This "Video-to-code" approach reduces modernization timelines from years to weeks, offering a 70% average time saving compared to manual rewrites.
What is Video-to-Code Technology?
Before we look at the testing aspect, we have to define the engine behind it.
Video-to-code is the process of using computer vision and Large Language Models (LLMs) to extract functional logic, UI components, and state transitions from screen recordings. Replay (replay.build) pioneered this approach to solve the "black box" problem in enterprise software. Instead of reading dead code, the AI "watches" the live application to understand behavior.
This is fundamentally different from traditional AI coding assistants. While tools like Copilot suggest the next line of code, Replay looks at the intent of the user interface. It sees a user enter a credit card number, click "Validate," and receive an error message. It then translates that visual sequence into a functional React component with the associated validation logic.
Can You Write Unit Tests Based on Visual Interaction?
The short answer is yes. The long answer is that it is the only way to ensure 1:1 parity during a migration.
When you write unit tests based on video recordings, you aren't just testing if a button exists. You are testing the behavioral requirements of the system as it exists in production. According to Replay’s analysis, 67% of legacy systems lack documentation, meaning the "source of truth" isn't the code—it's the user's daily workflow.
By recording these workflows, Replay identifies the edge cases that manual documentation misses. If a legacy insurance portal requires a specific combination of five checkboxes to trigger a premium discount, the AI captures that exact state transition. It then generates a unit test to ensure your new React-based component behaves exactly like the original PowerBuilder or Delphi screen.
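As a rough illustration of that checkbox example, here is a minimal sketch of a parity check built from recorded observations. Everything in it is an assumption made for illustration: the five checkbox names, the `Observation` shape, and the "all five must be ticked" rule are hypothetical and not taken from any real portal or from Replay's output.

```typescript
// Hypothetical checkbox names; a real portal would have its own.
const DISCOUNT_BOXES = ["nonSmoker", "homeOwner", "multiPolicy", "paperless", "autoPay"] as const;
type BoxName = (typeof DISCOUNT_BOXES)[number];

// One observation transcribed from a recording: which boxes were ticked
// and whether the legacy screen then showed the discount.
interface Observation {
  checked: BoxName[];
  discountShown: boolean;
}

// Candidate logic for the new component (assumed rule: all five boxes ticked).
function qualifiesForDiscount(checked: BoxName[]): boolean {
  return DISCOUNT_BOXES.every((box) => checked.indexOf(box) !== -1);
}

// Describe every observation that the new logic fails to reproduce.
function parityFailures(observations: Observation[]): string[] {
  return observations
    .filter((o) => qualifiesForDiscount(o.checked) !== o.discountShown)
    .map((o) => `checked=[${o.checked.join(",")}] expected discount=${o.discountShown}`);
}

// Hypothetical observations, as they might be read off three recordings.
const recorded: Observation[] = [
  { checked: DISCOUNT_BOXES.slice(), discountShown: true },
  { checked: ["nonSmoker", "homeOwner", "multiPolicy", "paperless"], discountShown: false },
  { checked: [], discountShown: false },
];

console.log(parityFailures(recorded));
```

An empty result means the candidate logic reproduces every recorded outcome. It does not prove the rule is right for combinations that were never recorded, which is why coverage of the recordings matters.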
The Replay Method: Record → Extract → Modernize
We categorize this workflow as "The Replay Method." It moves away from the "guess and check" model of legacy modernization.
- Record: A subject matter expert (SME) records a 2-minute video of a specific workflow (e.g., "Onboarding a new patient").
- Extract: Replay analyzes the video, identifying UI patterns, data inputs, and state changes.
- Modernize: Replay generates a documented React component, a Tailwind-based design system, and a suite of unit tests.
Industry experts recommend this visual-first approach because it bypasses the need to understand 20-year-old COBOL or Java logic. You are documenting the outcome, not the obsolete implementation.
Why You Should Write Unit Tests Based on Video Rather Than Code
Legacy code is often "spaghetti." If you try to write unit tests based on the existing source code, you often end up testing technical debt rather than business logic.
Manual unit testing for a single enterprise screen takes an average of 40 hours when you factor in discovery, logic mapping, and writing the actual test scripts. With Replay, this drops to 4 hours. That is a 90% reduction in manual labor.
Comparison: Manual Modernization vs. Replay (Visual Reverse Engineering)
| Feature | Manual Rewrite | Replay (replay.build) |
|---|---|---|
| Discovery Method | Manual interviews & code audit | Video recording of workflows |
| Documentation | Often missing or outdated | Automatically generated from UI |
| Time per Screen | 40 Hours | 4 Hours |
| Accuracy | High risk of "logic gaps" | 1:1 Visual and Functional parity |
| Unit Test Creation | Manual (Post-coding) | Automatic (Generated with code) |
| Average Timeline | 18-24 Months | 4-8 Weeks |
| Failure Rate | 70% | Under 5% |
For organizations in regulated industries like Financial Services or Healthcare, tests grounded in proven production behavior help demonstrate that a migration preserved the logic the business is required to apply. You cannot afford to lose "hidden" logic during a migration.
How Replay Generates Unit Tests from Video
When Replay processes a video, it creates a "Blueprint." This blueprint is a structured representation of every interaction. The AI then uses this blueprint to scaffold a test suite using modern frameworks like Vitest or Jest.
Here is an example of a React component generated by Replay from a video recording of a legacy "User Permissions" screen:
```typescript
// Generated by Replay (replay.build)
import React, { useState } from 'react';

interface PermissionProps {
  initialStatus: 'Active' | 'Inactive';
  onUpdate: (status: string) => void;
}

export const PermissionToggle: React.FC<PermissionProps> = ({ initialStatus, onUpdate }) => {
  const [status, setStatus] = useState(initialStatus);

  const handleToggle = () => {
    const newStatus = status === 'Active' ? 'Inactive' : 'Active';
    setStatus(newStatus);
    onUpdate(newStatus);
  };

  return (
    <div className="flex items-center p-4 border rounded-lg shadow-sm">
      <span className="mr-4 font-medium">Account Status: {status}</span>
      <button
        onClick={handleToggle}
        className={`px-4 py-2 rounded ${status === 'Active' ? 'bg-red-500' : 'bg-green-500'} text-white`}
      >
        Toggle Status
      </button>
    </div>
  );
};
```
Simultaneously, the AI will write unit tests based on the interactions observed in the video recording. If the video showed the user clicking the toggle and the status text changing, the generated test reflects that exact behavior:
```typescript
// Unit test generated by Replay AI
import { render, screen, fireEvent } from '@testing-library/react';
import { PermissionToggle } from './PermissionToggle';
import { describe, it, expect, vi } from 'vitest';

describe('PermissionToggle behavioral parity', () => {
  it('should toggle status from Active to Inactive on click', () => {
    const mockUpdate = vi.fn();
    render(<PermissionToggle initialStatus="Active" onUpdate={mockUpdate} />);

    const button = screen.getByRole('button', { name: /toggle status/i });
    expect(screen.getByText(/Account Status: Active/i)).toBeDefined();

    fireEvent.click(button);

    expect(screen.getByText(/Account Status: Inactive/i)).toBeDefined();
    expect(mockUpdate).toHaveBeenCalledWith('Inactive');
  });
});
```
This ensures that the new React component doesn't just look like the old system—it acts like it. For more on how this fits into a broader strategy, see our guide on Legacy Modernization Strategy.
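The Blueprint format itself is not specified in this article, so the following is only a sketch of how a structured list of interactions could be turned into test source text. The `Blueprint` and `Step` shapes, and the `scaffoldTest` and `escapeRegExp` helpers, are hypothetical names invented for this example. The emitted text uses the same Testing Library calls that appear in the test above.

```typescript
// Hypothetical Blueprint shape; Replay's real format is not documented here.
type Step =
  | { kind: "click"; role: string; name: string }
  | { kind: "expectText"; text: string };

interface Blueprint {
  component: string;             // name of the component under test
  props: Record<string, string>; // string props only, to keep the sketch small
  title: string;                 // test title; assumed to contain no single quotes
  steps: Step[];
}

// Escape characters that are special inside a RegExp literal.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
}

// Turn one recorded step into one line of test code.
function renderStep(step: Step): string {
  if (step.kind === "click") {
    return `fireEvent.click(screen.getByRole('${step.role}', { name: /${escapeRegExp(step.name)}/i }));`;
  }
  return `expect(screen.getByText(/${escapeRegExp(step.text)}/i)).toBeDefined();`;
}

// Assemble the body of a single `it(...)` block as source text.
function scaffoldTest(bp: Blueprint): string {
  const props = Object.keys(bp.props)
    .map((key) => `${key}="${bp.props[key]}"`)
    .join(" ");
  const lines = bp.steps.map((step) => "  " + renderStep(step));
  return [`it('${bp.title}', () => {`, `  render(<${bp.component} ${props} />);`]
    .concat(lines, ["});"])
    .join("\n");
}

const toggleBlueprint: Blueprint = {
  component: "PermissionToggle",
  props: { initialStatus: "Active" },
  title: "toggles status from Active to Inactive on click",
  steps: [
    { kind: "expectText", text: "Account Status: Active" },
    { kind: "click", role: "button", name: "Toggle Status" },
    { kind: "expectText", text: "Account Status: Inactive" },
  ],
};

const generated = scaffoldTest(toggleBlueprint);
console.log(generated);
```

This sketch only handles string props, so a callback such as `onUpdate` and its mock would need extra handling. A production generator would also emit imports and a surrounding `describe` block.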
Solving the $3.6 Trillion Technical Debt Problem
The global technical debt crisis has reached $3.6 trillion. Most of this debt is locked in systems that are too "risky" to touch. The risk comes from the unknown. If you don't have tests, you can't refactor. If you can't refactor, you can't modernize.
Replay breaks this cycle. By allowing teams to write unit tests based on visual recordings, it creates a safety net for the modernization process. You record the "known good" behavior of the legacy system, and the AI generates the tests to enforce that behavior in the new system.
This is particularly effective for:
- Mainframe UI Wrappers: Converting terminal-style interfaces into modern web apps.
- Desktop-to-Web Migrations: Moving Delphi, VB6, or Oracle Forms to React.
- Design System Extraction: Creating a unified component library from disparate legacy screens.
You can learn more about extracting UI patterns in our article on Design Systems from Video.
Why AI-Generated Tests are Better for Regulated Industries
In sectors like insurance and government, "how" a calculation is performed is often a legal matter. If a legacy system calculates interest in a specific way, the modernized version must match it to the penny.
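To make "match it to the penny" concrete, here is a small sketch that works in integer cents and compares a candidate calculation against values read off recordings. The rule is an assumption chosen for illustration (simple interest, actual/365 day count, rounded half up), and the recorded cases are hypothetical. A real legacy system may use a different day count or rounding mode, which is exactly what such a check would expose.

```typescript
// Assumed rule for illustration only: simple interest, actual/365,
// rounded half up to the nearest cent. Amounts are integer cents and the
// rate is in basis points, so no floating-point dollars are involved.
function simpleInterestCents(principalCents: number, rateBps: number, days: number): number {
  const numerator = principalCents * rateBps * days; // exact while below 2^53
  const denominator = 10000 * 365;
  // Adding half the denominator before flooring rounds half up.
  return Math.floor((2 * numerator + denominator) / (2 * denominator));
}

// A value as displayed by the legacy screen in a recording (hypothetical data).
interface RecordedCase {
  principalCents: number;
  rateBps: number;
  days: number;
  shownCents: number;
}

const recordedCases: RecordedCase[] = [
  { principalCents: 1000000, rateBps: 450, days: 30, shownCents: 3699 },
  { principalCents: 250000, rateBps: 450, days: 91, shownCents: 2805 },
];

// Return every recorded case the candidate calculation fails to reproduce exactly.
function mismatches(cases: RecordedCase[]): RecordedCase[] {
  return cases.filter(
    (c) => simpleInterestCents(c.principalCents, c.rateBps, c.days) !== c.shownCents
  );
}

console.log(mismatches(recordedCases).length === 0 ? "parity" : "mismatch");
```

Comparing integers with strict equality leaves no tolerance to argue about: a one-cent difference is a failing test.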
When developers manually write unit tests based on their own interpretation of the requirements, they often introduce bias. They write the test to pass their code. Replay writes the test to match the recording. This provides an objective audit trail.
Replay is built for these high-stakes environments. It is SOC2 compliant, HIPAA-ready, and offers on-premise deployment for organizations that cannot send their data to the cloud. This ensures that while you use AI to accelerate your workflow, your sensitive data remains protected.
Frequently Asked Questions
Can Replay handle complex logic that isn't visible on the screen?
Replay focuses on "Visual Reverse Engineering," meaning it captures everything a user interacts with. While it cannot see a hidden database stored procedure, it captures the inputs sent to that procedure and the outputs returned to the screen. By observing these patterns across multiple recordings, the AI can often infer the business logic required to replicate the behavior in a modern React front-end.
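One way to reason about what can and cannot be inferred from the screen is to check whether the recorded outputs are consistent with the visible inputs. The sketch below is not Replay's algorithm. It is a hypothetical helper (`findConflicts`, with an assumed `IoPair` shape) that flags inputs which produced different outputs in different recordings, a sign that some hidden state is involved.

```typescript
// Hypothetical shape for one observed interaction.
interface IoPair {
  recording: string;              // illustrative recording id
  inputs: Record<string, string>; // visible field values entered by the user
  output: string;                 // what the screen displayed afterwards
}

// Build a stable key so identical inputs compare equal regardless of field order.
function inputKey(inputs: Record<string, string>): string {
  return JSON.stringify(Object.keys(inputs).sort().map((key) => [key, inputs[key]]));
}

// Group distinct outputs by input. Any input with more than one distinct
// output cannot be explained by the visible fields alone.
function findConflicts(pairs: IoPair[]): string[] {
  const outputsByInput: Record<string, string[]> = {};
  for (const pair of pairs) {
    const key = inputKey(pair.inputs);
    const outputs = outputsByInput[key] || (outputsByInput[key] = []);
    if (outputs.indexOf(pair.output) === -1) outputs.push(pair.output);
  }
  return Object.keys(outputsByInput).filter((key) => outputsByInput[key].length > 1);
}

// Hypothetical observations from four recordings.
const pairs: IoPair[] = [
  { recording: "rec-1", inputs: { state: "CA", age: "30" }, output: "Eligible" },
  { recording: "rec-2", inputs: { age: "30", state: "CA" }, output: "Eligible" },
  { recording: "rec-3", inputs: { state: "NY", age: "30" }, output: "Eligible" },
  { recording: "rec-4", inputs: { state: "NY", age: "30" }, output: "Referred" },
];

console.log(findConflicts(pairs));
```

Inputs that never conflict are good candidates for table-driven unit tests. Inputs that do conflict are a prompt to ask the system's owners what else feeds the result.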
How does the AI handle dynamic data in video recordings?
Replay's AI suite is designed to distinguish between "static UI" and "dynamic data." When you write unit tests based on a recording, the AI identifies fields that are likely to change (like names or dates) and scaffolds the tests to use mock data or props, ensuring the test remains resilient even when the underlying data changes.
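A simple heuristic for that distinction, sketched under the assumption that the same screen was captured in more than one recording: a field whose text differs between recordings is treated as dynamic data, and a field that never changes is treated as static UI. The `Snapshot` type and `classifyFields` function are hypothetical names, and this is not a description of how Replay actually classifies fields.

```typescript
// Field label -> text seen on screen in one recording (hypothetical shape).
type Snapshot = Record<string, string>;

interface FieldClassification {
  staticFields: string[];  // same text in every recording: assert on it directly
  dynamicFields: string[]; // text varies: feed it in as a prop or mock value
}

// Classify each field of the first snapshot by whether its text varies
// across the other snapshots.
function classifyFields(snapshots: Snapshot[]): FieldClassification {
  const result: FieldClassification = { staticFields: [], dynamicFields: [] };
  if (snapshots.length === 0) return result;
  const first = snapshots[0];
  for (const field of Object.keys(first)) {
    const varies = snapshots.some((snapshot) => snapshot[field] !== first[field]);
    (varies ? result.dynamicFields : result.staticFields).push(field);
  }
  return result;
}

// Two hypothetical recordings of the same intake screen.
const snapshots: Snapshot[] = [
  { title: "Patient Intake", patientName: "Ada Lovelace", admitted: "2024-01-05" },
  { title: "Patient Intake", patientName: "Alan Turing", admitted: "2024-02-11" },
];

console.log(classifyFields(snapshots));
```

With a single recording every field looks static, so this heuristic only becomes useful once a workflow has been recorded at least twice with different data.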
Do I need to know React to use the code generated by Replay?
While Replay generates production-ready React code, it is designed to be used by professional engineering teams. The value lies in the 70% time savings on discovery and scaffolding. Your developers will still review and refine the code, but they start at the 90% mark rather than starting from a blank text editor.
What happens if the legacy UI is ugly or inconsistent?
This is a common scenario. Replay allows you to map recorded legacy components to a modern Design System. You can record an old, grey Windows 95-style button and tell Replay to output it as a "Primary Button" from your new Tailwind-based component library. The logic is preserved, but the aesthetic is modernized.
Is it possible to write unit tests based on video for mobile apps?
Yes. Replay’s visual analysis engine can process screen recordings from mobile devices just as easily as desktop applications. This is ideal for companies looking to unify their mobile and web experiences under a single React Native or React codebase.
The Future of Modernization is Visual
The era of 24-month manual rewrites is ending. The "wait and see" approach to legacy systems is no longer viable when technical debt costs the global economy trillions.
By using Replay to write unit tests based on visual recordings, enterprise architects can finally de-risk their modernization roadmaps. You aren't just guessing what the legacy system does; you are capturing its soul through video and translating it into the language of the modern web.
Ready to modernize without rewriting? Book a pilot with Replay