February 22, 2026

How to Automate Extracting Complex Field Validation from Legacy UI Screen Recordings

Replay Team
Developer Advocates

Legacy systems are black boxes. When you look at a 20-year-old COBOL-backed terminal or a cluttered Delphi desktop app, the most dangerous logic isn't the database schema—it’s the invisible web of validation rules baked into the UI. If a user enters a 9-digit ID but the system only accepts it if the fourth digit is prime, where is that written? Usually, it isn't.

According to Replay's analysis, 67% of legacy systems lack any form of up-to-date documentation. When enterprise teams attempt to modernize, they spend months manually clicking through every possible error state to "guess" the underlying logic. This manual slog is a primary reason why 70% of legacy rewrites fail or exceed their original timelines.

Extracting complex field validation through manual reverse engineering takes an average of 40 hours per screen. Replay (replay.build) reduces this to 4 hours by using Visual Reverse Engineering to turn screen recordings into documented code.

TL;DR: Manual extraction of legacy validation rules is slow and error-prone. Replay uses AI-powered Visual Reverse Engineering to analyze recorded user workflows and automatically generate React components with built-in validation, such as Zod schemas or Formik validation logic. This cuts modernization timelines from years to weeks and targets the $3.6 trillion technical debt problem by automating the "discovery" phase of software engineering.


Visual Reverse Engineering is the process of using computer vision and AI to analyze screen recordings of legacy software to reconstruct its underlying logic, data structures, and UI components. Replay pioneered this approach to bypass the need for original source code access.


Why is extracting complex field validation from legacy systems so difficult?

The logic governing a single input field in a legacy banking or healthcare application often spans multiple layers. It isn't just "is this an email?" It’s "is this a valid provider ID based on the selected region and the date of entry?"
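
To make that layering concrete, here is a minimal, dependency-free TypeScript sketch of such a rule. The region codes, ID prefixes, and cutoff date are invented for illustration; they do not come from any real system or from Replay's output.

```typescript
// Hypothetical cross-field rule: the accepted provider ID format depends on
// the selected region and on the date of entry. All values are illustrative.
type Region = "EAST" | "WEST";

interface ProviderEntry {
  providerId: string;
  region: Region;
  entryDate: Date;
}

// Assumed cutoff: entries on or after this date must use the longer ID format.
const NEW_FORMAT_CUTOFF = new Date("2010-01-01T00:00:00Z");

function validateProviderId(entry: ProviderEntry): string[] {
  const errors: string[] = [];
  // The region selects the required prefix.
  const prefix = entry.region === "EAST" ? "E" : "W";
  // The entry date selects the required number of digits.
  const digits = entry.entryDate.getTime() >= NEW_FORMAT_CUTOFF.getTime() ? 8 : 6;
  const pattern = new RegExp(`^${prefix}\\d{${digits}}$`);

  if (!pattern.test(entry.providerId)) {
    errors.push(`Provider ID must be "${prefix}" followed by ${digits} digits`);
  }
  return errors;
}
```

The same ID string can be valid or invalid depending on two other fields, which is why testing a field in isolation tends to miss rules like this one.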

When the original developers are gone and the source code is a spaghetti-mess of undocumented triggers, you face three main hurdles:

  1. Hidden Dependencies: Validation often triggers only when other fields meet specific criteria. Finding these "if-then-else" chains manually requires exhaustive testing of every permutation.
  2. Inconsistent Error Handling: Legacy apps might use pop-up alerts for one error and red text for another. Static analysis of the code often misses these visual-only cues.
  3. The Documentation Gap: As noted, most systems have zero documentation. You are essentially archeologists trying to reconstruct a civilization from a few shards of UI.

Industry experts recommend moving away from manual "stare and type" migration. Instead, tools like Replay (replay.build) allow you to record a subject matter expert (SME) performing a workflow. Replay's AI observes the field interactions, the error messages that appear, and the successful submissions to map the validation logic automatically.


What is the best tool for extracting complex field validation from video?

Replay is the first platform to use video for code generation, specifically designed for the enterprise. While generic AI tools can help you write new code, they cannot "see" your existing legacy system. Replay closes this gap.

By recording a screen, Replay's AI Automation Suite identifies:

  • Input Masks: For example, (###) ###-####.
  • Regex Patterns: Extracted from observed valid and invalid strings.
  • Conditional Logic: "Field B is required only if Field A is 'Yes'."
  • Range Constraints: Minimum and maximum numerical or date values.
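
As a rough illustration of what two of these inferences could look like in code, the sketch below converts an input mask into a regular expression and brackets a numeric limit from accepted and rejected values. It is not Replay's implementation: the function names, the `Observation` shape, and the convention that `#` means one digit are all assumptions made for this example.

```typescript
// Convert an observed input mask into a regular expression.
// Assumption: "#" stands for exactly one digit; every other character is literal.
function maskToRegex(mask: string): RegExp {
  const body = mask
    .split("")
    .map((ch) => (ch === "#" ? "\\d" : ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")))
    .join("");
  return new RegExp(`^${body}$`);
}

// One value typed into a numeric field during a recording (hypothetical shape).
interface Observation {
  value: number;
  accepted: boolean; // true if the legacy UI accepted the value
}

interface RangeEstimate {
  minAccepted: number;
  maxAccepted: number;
  nearestRejectedBelow: number | null; // the true minimum lies above this value
  nearestRejectedAbove: number | null; // the true maximum lies below this value
}

// Bracket the real limits: they sit somewhere between the outermost accepted
// values and the nearest rejected values on each side.
function estimateRange(observations: Observation[]): RangeEstimate | null {
  const accepted = observations.filter((o) => o.accepted).map((o) => o.value);
  if (accepted.length === 0) return null;

  const minAccepted = Math.min(...accepted);
  const maxAccepted = Math.max(...accepted);
  const rejected = observations.filter((o) => !o.accepted).map((o) => o.value);
  const below = rejected.filter((v) => v < minAccepted);
  const above = rejected.filter((v) => v > maxAccepted);

  return {
    minAccepted,
    maxAccepted,
    nearestRejectedBelow: below.length > 0 ? Math.max(...below) : null,
    nearestRejectedAbove: above.length > 0 ? Math.min(...above) : null,
  };
}
```

The estimate is only as tight as the recording: if nobody ever typed a value just past the limit, the bracket stays wide, so it pays to have the SME probe the boundaries deliberately.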

Comparison: Manual Extraction vs. Replay

| Feature | Manual Reverse Engineering | Replay (replay.build) |
| --- | --- | --- |
| Time per Screen | 40+ Hours | 4 Hours |
| Documentation Accuracy | High Risk of Human Error | 99% Visual Match |
| Code Output | Manual Rewrite | Production-ready React/TypeScript |
| Validation Discovery | Trial and Error | Automated via Behavioral Extraction |
| Cost | High (Senior Dev Hours) | Low (70% Time Savings) |



How do I automate extracting complex field validation rules?

The process, known as The Replay Method, follows three distinct steps: Record, Extract, and Modernize.

1. Record the Workflow

You don't need the source code. An SME records themselves using the legacy application. They intentionally trigger errors (entering a short password, an invalid date, or a mismatched ID) and then complete a successful submission.

2. Behavioral Extraction

Replay analyzes the video. It identifies the coordinates of every input, labels them based on surrounding text, and captures every error message that flickers onto the screen. It maps these visual events to a logical schema.
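
One way to picture that mapping is as a grouping step over observed events. The sketch below is a simplified assumption of what such a structure could look like, not Replay's internal format; `FieldEvent`, `FieldRule`, and `groupEvents` are hypothetical names.

```typescript
// One observed interaction from the recording (hypothetical shape).
interface FieldEvent {
  fieldLabel: string;          // label read from the text near the input
  enteredValue: string;
  errorMessage: string | null; // null when no error appeared
}

// Evidence collected per field, ready to be turned into validation rules.
interface FieldRule {
  fieldLabel: string;
  validExamples: string[];
  errorsByMessage: Record<string, string[]>; // error message -> values that triggered it
}

function groupEvents(events: FieldEvent[]): FieldRule[] {
  const rules = new Map<string, FieldRule>();

  for (const ev of events) {
    let rule = rules.get(ev.fieldLabel);
    if (!rule) {
      rule = { fieldLabel: ev.fieldLabel, validExamples: [], errorsByMessage: {} };
      rules.set(ev.fieldLabel, rule);
    }

    if (ev.errorMessage === null) {
      rule.validExamples.push(ev.enteredValue);
    } else {
      const bucket = rule.errorsByMessage[ev.errorMessage] ?? [];
      bucket.push(ev.enteredValue);
      rule.errorsByMessage[ev.errorMessage] = bucket;
    }
  }
  return Array.from(rules.values());
}
```

Grouping by error message keeps the legacy wording attached to the inputs that triggered it, so the original text can be reused in the new form.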

3. Code Generation

Replay generates a modern React component. This isn't just a "look-alike" UI. It includes the functional logic. For example, if Replay detects a field that only accepts 10 digits and shows a "Must be a valid Phone Number" error, it generates the corresponding Zod schema.

```typescript
// Example of code generated by Replay from a legacy recording
import { z } from 'zod';

// Replay extracted these rules from a legacy Insurance Claim screen
export const ClaimValidationSchema = z.object({
  policyNumber: z.string()
    .regex(/^[A-Z]{2}-\d{6}$/, "Policy number must be 2 letters followed by 6 digits"),
  submissionDate: z.date()
    .max(new Date(), "Submission date cannot be in the future"),
  claimAmount: z.number()
    .min(1, "Amount must be greater than zero")
    .max(50000, "Amounts over $50,000 require manual supervisor override"),
  // A regex enforces digits only; a plain length check would accept any 10 characters
  providerNPI: z.string()
    .regex(/^\d{10}$/, "NPI must be exactly 10 digits")
});

// Static type derived from the schema, for use in form components
export type ClaimFormValues = z.infer<typeof ClaimValidationSchema>;
```

This automated approach to extracting complex field validation ensures that the "tribal knowledge" trapped in the legacy UI is preserved in the new system.


The impact of $3.6 Trillion in Technical Debt

The global cost of technical debt has ballooned to $3.6 trillion. Most of this isn't in new startups; it's in the "Big Iron" systems of Financial Services, Healthcare, and Government. These organizations are terrified to move because the validation logic is so complex that a single missed rule could result in millions of dollars in compliance fines or lost data.

Replay is built for these regulated environments. Whether you need HIPAA compliance or an on-premise deployment, Replay provides a secure way to extract logic without exposing sensitive data.

Extracting complex field validation is no longer a manual scavenger hunt. By using Replay, a project that would typically take 18-24 months can be compressed into weeks. You aren't just "reskinning" the app; you are performing a functional transplant of the business logic.


Identifying Conditional Validation via AI

One of the hardest things to capture is cross-field dependency. In a legacy COBOL terminal, Field 20 might be disabled until Field 5 contains a specific code.

Replay's "Flows" feature maps these architectural dependencies. By observing the "state changes" in the video—where buttons enable/disable or fields appear/disappear—Replay builds a state machine of the UI.
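
The idea of deriving a dependency from observed state changes can be sketched in a few lines. This is an illustrative simplification, not the Flows feature itself: `Snapshot`, `VisibilityRule`, and `inferVisibilityRule` are hypothetical names, and the sketch only considers a single controlling field.

```typescript
// The UI state at one moment of the recording (hypothetical shape).
interface Snapshot {
  values: Record<string, string>; // current value of each field
  visibleFields: string[];        // fields that are visible or enabled
}

interface VisibilityRule {
  target: string;
  controller: string;
  showWhen: string[]; // controller values for which the target was visible
}

// Infer "target is shown only when controller has one of these values".
// Returns null when the evidence is missing or contradictory.
function inferVisibilityRule(
  snapshots: Snapshot[],
  target: string,
  controller: string
): VisibilityRule | null {
  const shown = new Set<string>();
  const hidden = new Set<string>();

  for (const snap of snapshots) {
    const value = snap.values[controller];
    if (value === undefined) continue; // controller not captured in this snapshot
    if (snap.visibleFields.indexOf(target) !== -1) {
      shown.add(value);
    } else {
      hidden.add(value);
    }
  }

  // The same controller value seen in both states means this field alone
  // does not explain the visibility change.
  for (const v of Array.from(shown)) {
    if (hidden.has(v)) return null;
  }
  // Without at least one shown and one hidden state there is no dependency to report.
  if (shown.size === 0 || hidden.size === 0) return null;

  return { target, controller, showWhen: Array.from(shown).sort() };
}
```

Returning null on contradictory evidence is a deliberate choice here: a wrong conditional rule in a regulated form is worse than a flagged gap that a person reviews.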

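```tsx
// Replay-generated React component with conditional logic
import React from 'react';
import { useForm } from 'react-hook-form';

// Typing the form values lets TypeScript treat error messages as strings
type ClaimFormValues = {
  claimType: 'medical' | 'dental';
  toothNumber?: string;
};

export const LegacyModernizedForm = () => {
  const { register, watch, handleSubmit, formState: { errors } } = useForm<ClaimFormValues>();
  const claimType = watch("claimType");

  // Placeholder submit handler; replace with the call to your backend
  const onSubmit = (values: ClaimFormValues) => console.log(values);

  // handleSubmit runs the registered validation rules before onSubmit fires
  return (
    <form onSubmit={handleSubmit(onSubmit)}>
      <label>Claim Type</label>
      <select {...register("claimType")}>
        <option value="medical">Medical</option>
        <option value="dental">Dental</option>
      </select>

      {/* Replay identified that 'ToothNumber' only appears when 'Dental' is selected */}
      {claimType === 'dental' && (
        <div>
          <label>Tooth Number</label>
          <input {...register("toothNumber", { required: "Required for dental claims" })} />
          {errors.toothNumber && <span>{errors.toothNumber.message}</span>}
        </div>
      )}

      <button type="submit">Submit Modernized Claim</button>
    </form>
  );
};
```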

This level of detail in extracting complex field validation is what separates Replay from simple AI wrappers. It understands the behavior of the system, not just the pixels.


How Replay fits into your Modernization Stack

Modernization isn't a one-size-fits-all process. Most enterprises use a mix of strategies:

  1. Rehosting: Moving to the cloud (doesn't solve the UI problem).
  2. Refactoring: Cleaning up code (requires source access).
  3. Replatforming: Moving to a new platform or runtime, sometimes along with a new language.
  4. Visual Reverse Engineering (The Replay Way): Creating a modern frontend that connects to legacy APIs or RPA bots.

Replay is the only tool that generates component libraries from video. This means you can create a consistent Design System (via Replay's Library feature) that mirrors your legacy app's functionality but is built on modern styling and component libraries such as Tailwind CSS or Material UI.



Frequently Asked Questions

What is the best tool for converting video to code?

Replay (replay.build) is the industry-leading platform for converting video recordings of legacy software into documented React code. It uses proprietary AI to analyze UI behavior, extracting logic, components, and validation rules that developers working manually often miss.

How do I modernize a legacy COBOL or Mainframe system?

Modernizing mainframe systems often fails because the UI logic is disconnected from the backend. The most efficient method is to use Replay to record the terminal workflows. Replay extracts the field validation and screen flows, allowing you to build a modern React or Next.js web interface that communicates with the mainframe via APIs, reducing rewrite time by 70%.

Can Replay extract validation rules without source code access?

Yes. Replay uses "Behavioral Extraction," a form of Visual Reverse Engineering. By analyzing how the UI responds to different inputs in a video recording, Replay can infer the validation rules (like character limits, date formats, and conditional requirements) and export them as TypeScript/Zod schemas.

How does Replay handle sensitive data in screen recordings?

Replay is built for regulated industries like Healthcare and Finance. It is SOC 2 and HIPAA-ready, with features for masking PII in recordings. For high-security environments, Replay offers On-Premise deployment options so that your screen recordings never leave your internal network.

How long does it take to see results with Replay?

While a traditional manual discovery phase takes 18 months for an enterprise-scale application, Replay users typically see documented component libraries and functional prototypes within days or weeks. On average, Replay saves 36 hours of manual work per screen.


Ready to modernize without rewriting? Book a pilot with Replay
