February 9, 2026 · 9 min read · extracting business rules

Extracting Business Rules from Legacy PHP 4 Applications

Replay Team
Developer Advocates

Your PHP 4 application is a ticking time bomb, but not for the reason you think. It’s not just the security vulnerabilities or the lack of containerization—it’s the fact that your company’s core intellectual property is trapped inside a procedural "black box" that no one living understands.

When organizations attempt to modernize these systems, they almost always default to "code archaeology." They assign their most expensive architects to sit in a room for six months, reading spaghetti code, trying to figure out if a specific `if-else` block from 2004 is still a valid regulatory requirement or just a bug that became a feature.

This manual approach is why 70% of legacy rewrites fail or exceed their timelines. You cannot solve a 20-year documentation gap by reading code line-by-line. The future of modernization isn't rewriting from scratch; it’s extracting business rules through visual reverse engineering.

TL;DR: Manual code archaeology is the primary cause of modernization failure; using Replay to extract business rules from user workflows reduces extraction time from 40 hours to 4 hours per screen.

The PHP 4 Extraction Crisis: Why Manual Audits Fail#

PHP 4 was the Wild West of web development. Released in 2000, it predates the PSR standards, Composer, and even a stable object model. Extracting business rules from these environments is uniquely difficult because of three architectural sins common to the era:

  1. Global State Chaos: The heavy use of `register_globals` and `$GLOBALS` means that a variable modified on line 10 of `header.php` could fundamentally change the tax calculation logic on line 2,500 of `checkout.php`.
  2. The Include Hell: Logic isn't encapsulated in classes; it's spread across hundreds of `include()` and `require()` statements, often nested five levels deep, creating a non-linear execution path that is impossible to map manually.
  3. The Intertwined UI and Logic: Business rules are frequently hardcoded directly into the code that prints HTML `<table>` tags. There is no separation of concerns.
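
To make the "Include Hell" problem concrete, here is a minimal TypeScript sketch that measures how deep an include chain goes, given a map of which file includes which. The file names and the `IncludeGraph` shape are hypothetical; in practice the map would have to be built by scanning the codebase for `include()` and `require()` calls, which this sketch does not attempt.

```typescript
// Hypothetical include graph: file -> files it pulls in via include()/require().
// Files with no entry are treated as including nothing.
type IncludeGraph = { [file: string]: string[] | undefined };

// Returns the length of the longest include chain starting at `file`
// (1 means the file includes nothing). Files already on the current
// chain are skipped, so circular includes still terminate.
function includeDepth(graph: IncludeGraph, file: string, ancestors: string[] = []): number {
  if (ancestors.indexOf(file) !== -1) return 0; // circular include: stop here
  let deepest = 0;
  for (const child of graph[file] ?? []) {
    deepest = Math.max(deepest, includeDepth(graph, child, [...ancestors, file]));
  }
  return 1 + deepest;
}

const exampleGraph: IncludeGraph = {
  "checkout.php": ["header.php", "cart.php"],
  "header.php": ["config.php"],
  "config.php": ["header.php"], // circular on purpose
  "cart.php": ["tax.php"],
  "tax.php": ["config.php", "regions.php"],
  "regions.php": ["header.php"],
};

const depth = includeDepth(exampleGraph, "checkout.php");
console.log(depth); // 6: checkout -> cart -> tax -> regions -> header -> config
```

Even in this toy graph the longest chain is six files long, and a change in `config.php` is reachable from every page.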

When you ask a developer to perform manual extraction, they aren't just reading code—they are performing a forensic investigation on a crime scene where the evidence has been tampered with for two decades.

The Cost of Manual Extraction vs. Visual Reverse Engineering#

| Metric | Manual Code Archaeology | Replay Visual Extraction |
| --- | --- | --- |
| Time per Screen | 40+ Hours | 4 Hours |
| Documentation Accuracy | 60% (Subjective) | 99% (Observed Behavior) |
| Risk of Logic Gap | High | Low |
| Required Domain Context | Expert Level | Minimal |
| Output Type | Static PDF/Wiki | Executable Code & Tests |

Stop Reading Code, Start Recording Behavior#

The biggest mistake in extracting business rules is assuming the source code is the "source of truth." In legacy systems, the source of truth is the running application.

If a claims adjuster in a healthcare legacy system enters a specific ICD-10 code and the system triggers a 20% co-pay, that behavior is the business rule. It doesn't matter if the underlying PHP code is a mess of nested conditionals and long-deprecated `mysql_query` calls.

Replay flips the script by using video as the source of truth. By recording real user workflows, Replay observes the inputs, the state changes, and the outputs. It doesn't just see the code; it sees the intent.

💰 ROI Insight: The average enterprise rewrite takes 18 months. By using Replay to automate the extraction of business rules and UI components, companies move from "discovery" to "functional prototype" in days rather than months, saving an average of 70% in labor costs.

The 4-Step Framework for Extracting Business Rules#

Step 1: Workflow Mapping#

Identify the critical paths. In a financial services app, this might be "Loan Application Submission." Instead of mapping the database schema, map the user journey. What are the 15 fields required to move from Step A to Step B?
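
One lightweight way to capture a user journey is as data rather than prose. The sketch below models a workflow step as a list of required fields and reports which ones are still missing before the user can advance. The step name and field names are illustrative assumptions, not taken from a real loan system.

```typescript
// Hypothetical description of one step in a user journey.
interface WorkflowStep {
  name: string;
  requiredFields: string[];
}

// Returns the required fields that are absent or blank in the submitted form data.
function missingFields(
  step: WorkflowStep,
  formData: Record<string, string | undefined>
): string[] {
  return step.requiredFields.filter((field) => (formData[field] ?? "").trim() === "");
}

const applicantDetails: WorkflowStep = {
  name: "Loan Application: Applicant Details",
  requiredFields: ["fullName", "state", "loanAmount"],
};

const missing = missingFields(applicantDetails, { fullName: "Ada Lovelace", state: " " });
console.log(missing); // ["state", "loanAmount"]
```

Writing the journey down this way gives the SME something concrete to confirm or correct before any recording starts.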

Step 2: Visual Recording with Replay#

A subject matter expert (SME) performs the workflow while Replay records the session. Replay doesn't just capture pixels; it captures the underlying DOM changes, network requests, and data transformations. This transforms a "black box" into a documented sequence of events.

Step 3: Logic Extraction and Audit#

Replay’s AI Automation Suite analyzes the recording to identify the "Invisible Rules."

  • Example: "If 'State' is 'California' and 'Loan Amount' > $500k, trigger 'Manual Review' flag."

Replay then generates the technical debt audit and the API contracts required to replicate this logic in a modern environment.
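
Once a rule like the California example is written down, it can be expressed as a small pure function that is easy to test in isolation. This is a sketch under assumptions: the `LoanApplication` shape and the function name are hypothetical, and only the state and the $500k threshold come from the example above.

```typescript
interface LoanApplication {
  state: string;
  loanAmount: number; // in US dollars
}

// Threshold from the example rule ($500k).
const MANUAL_REVIEW_THRESHOLD = 500000;

// True when the application must be flagged for manual review.
// The comparison is strictly "greater than", matching the rule as stated.
function requiresManualReview(app: LoanApplication): boolean {
  return app.state === "California" && app.loanAmount > MANUAL_REVIEW_THRESHOLD;
}

console.log(requiresManualReview({ state: "California", loanAmount: 750000 })); // true
console.log(requiresManualReview({ state: "California", loanAmount: 500000 })); // false
```

Keeping the rule free of UI and database concerns means the same function can back both the new front end and a server-side check.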

Step 4: Component and Test Generation#

Once the rules are extracted, Replay generates modern React components and E2E tests. The tests let you verify that the new system behaves like the old one for every recorded workflow, which is the evidence compliance and regulatory reviews in industries like Insurance and Government ask for.

Converting Legacy Procedural Logic to Modern TypeScript#

To understand the power of extracting business rules, look at how a typical PHP 4 "rule" is transformed.

The Legacy "Black Box" (PHP 4)#

This is what your developers are currently trying to decipher. It's untestable and fragile.

```php
<?php
// legacy_calc.php - Circa 2005
// No types, global variables, mixed logic
// ($region_code and $order_total arrive as globals, set somewhere else)
if ($HTTP_POST_VARS['user_type'] == "admin") {
    $discount = 0.20;
} elseif ($region_code == "NE" && $order_total > 1000) {
    // Hidden business rule: North East region gets 15% off high value orders
    $discount = 0.15;
} else {
    $discount = 0.05;
}

$final_price = $order_total - ($order_total * $discount);
echo "<table><tr><td>" . number_format($final_price, 2) . "</td></tr></table>";
```

The Replay Output (Modern React/TypeScript)#

Replay extracts the logic from the workflow and generates a clean, documented, and type-safe component. It preserves the business rule while discarding the technical debt.

```typescript
import React from 'react';

/**
 * Generated by Replay Visual Reverse Engineering
 * Source: /legacy_calc.php (Workflow: Admin Discount Calculation)
 * Business Rule: Region-based discounting logic for NE territory
 */
interface PricingProps {
  userType: 'admin' | 'standard';
  regionCode: string;
  orderTotal: number;
}

export const ModernPricingCalculator: React.FC<PricingProps> = ({ userType, regionCode, orderTotal }) => {
  // Same branch order as the legacy if / elseif / else.
  const calculateDiscount = (): number => {
    if (userType === 'admin') return 0.20;
    if (regionCode === 'NE' && orderTotal > 1000) return 0.15;
    return 0.05;
  };

  const discount = calculateDiscount();
  const finalPrice = orderTotal * (1 - discount);

  return (
    <div className="p-4 border rounded shadow-sm">
      <h3 className="text-lg font-bold">Order Summary</h3>
      <p className="text-2xl text-green-600">
        {new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' }).format(finalPrice)}
      </p>
      {/* Math.round avoids floating-point noise such as 15.000000000000002 */}
      <span className="text-sm text-gray-500">Applied Discount: {Math.round(discount * 100)}%</span>
    </div>
  );
};
```

⚠️ Warning: Attempting to "clean up" logic during extraction without automated E2E tests is the fastest way to break a legacy system. Always generate tests based on the original behavior before refactoring.
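
A minimal sketch of that idea: pin the legacy behavior down as input/output pairs first, then check any rewritten rule against them. The cases below were derived by hand from the branches of the PHP example, not from a real recording, and `findMismatches` is a hypothetical helper rather than part of any tool. Prices are compared in cents to sidestep formatting and floating-point differences.

```typescript
interface PricingInput {
  userType: string;
  regionCode: string;
  orderTotal: number;
}

interface RecordedCase {
  input: PricingInput;
  expectedCents: number; // final price the legacy page would show, in cents
}

// Hand-derived from the legacy branches; a real suite would use recorded sessions.
const legacyCases: RecordedCase[] = [
  { input: { userType: "admin", regionCode: "NE", orderTotal: 2000 }, expectedCents: 160000 },
  { input: { userType: "standard", regionCode: "NE", orderTotal: 2000 }, expectedCents: 170000 },
  // Boundary: exactly 1000 is NOT "greater than 1000", so only 5% applies.
  { input: { userType: "standard", regionCode: "NE", orderTotal: 1000 }, expectedCents: 95000 },
  { input: { userType: "standard", regionCode: "SW", orderTotal: 5000 }, expectedCents: 475000 },
];

// The rewritten rule under test.
function finalPriceCents({ userType, regionCode, orderTotal }: PricingInput): number {
  let discount = 0.05;
  if (userType === "admin") discount = 0.2;
  else if (regionCode === "NE" && orderTotal > 1000) discount = 0.15;
  return Math.round((orderTotal - orderTotal * discount) * 100);
}

// Returns every recorded case the new rule fails to reproduce.
function findMismatches(
  cases: RecordedCase[],
  rule: (input: PricingInput) => number
): RecordedCase[] {
  return cases.filter((c) => rule(c.input) !== c.expectedCents);
}

console.log(findMismatches(legacyCases, finalPriceCents).length); // 0
```

The boundary case is the one that matters most: a well-meaning refactor that changes `>` to `>=` would silently give a 15% discount on a $1,000 order, and this list would catch it.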

Why Technical Debt is a $3.6 Trillion Problem#

The global technical debt has reached $3.6 trillion because organizations treat modernization as a "one-and-done" event. They wait until the PHP 4 server literally dies before acting.

The real cost isn't just the developer hours—it's the opportunity cost. While your team spends 18 months manually extracting business rules, your competitors are shipping new features.

Replay changes the economics of this equation. By reducing the extraction time from 40 hours per screen to 4, you aren't just saving money; you're gaining 36 hours of innovation time per screen. In a 100-screen enterprise application, that’s 3,600 hours of engineering capacity returned to your roadmap.
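
The arithmetic behind that figure is simple enough to check yourself. The sketch below uses the article's own estimates (40 and 4 hours per screen); the function name and the 40-hour work week are assumptions added for illustration.

```typescript
// Hours returned to the roadmap when per-screen extraction time drops.
function hoursSaved(
  screens: number,
  manualHoursPerScreen: number,
  assistedHoursPerScreen: number
): number {
  return screens * (manualHoursPerScreen - assistedHoursPerScreen);
}

const saved = hoursSaved(100, 40, 4);
const engineerWeeks = saved / 40; // assuming a 40-hour work week

console.log(saved);         // 3600
console.log(engineerWeeks); // 90
```

Plugging in your own screen count and a more conservative per-screen estimate is a quick way to sanity-check the business case.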

Regulated Environments: SOC2, HIPAA, and On-Premise#

For our clients in Financial Services and Healthcare, "cloud-only" isn't an option. Extracting business rules from a 20-year-old insurance claims system requires strict data sovereignty.

Replay is built for these constraints. Whether you need an On-Premise deployment to keep sensitive data behind your firewall or a HIPAA-ready environment for patient data, the platform ensures that the process of "understanding what you have" doesn't create a new compliance liability.

📝 Note: Replay's AI Automation Suite can be configured to redact PII (Personally Identifiable Information) during the recording process, ensuring that business rules are extracted without compromising user privacy.
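
To illustrate what redaction during capture can look like in general, here is a small sketch that masks sensitive values in a recorded form payload. It is not Replay's implementation or configuration format; the key list, the SSN pattern, and the `redact` function are all assumptions made for the example.

```typescript
// Field names treated as sensitive; a hypothetical list for illustration only.
const SENSITIVE_KEYS: string[] = ["ssn", "dateOfBirth", "email"];

// US Social Security number written as 123-45-6789.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/g;

type Payload = Record<string, string>;

// Masks whole values for sensitive keys and SSN-shaped substrings everywhere else.
function redact(payload: Payload): Payload {
  const result: Payload = {};
  for (const key of Object.keys(payload)) {
    const value = payload[key];
    result[key] =
      SENSITIVE_KEYS.indexOf(key) !== -1
        ? "[REDACTED]"
        : value.replace(SSN_PATTERN, "[REDACTED]");
  }
  return result;
}

const redacted = redact({
  ssn: "123-45-6789",
  notes: "Caller confirmed SSN 123-45-6789",
  claimCode: "A10",
});
console.log(redacted);
// { ssn: "[REDACTED]", notes: "Caller confirmed SSN [REDACTED]", claimCode: "A10" }
```

The business-relevant value (`claimCode`) survives untouched, which is the point: the rule can still be extracted without the personal data.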

The Death of the "Big Bang" Rewrite#

The industry is moving away from the 24-month "Big Bang" rewrite. The risk is simply too high. Instead, we are seeing the rise of the Visual Strangler Fig Pattern.

By using Replay, you can extract one workflow at a time, generate the React equivalent, and "strangle" the old PHP 4 application piece by piece. You get the benefit of modernization in weeks, not years, and you never have to guess what a piece of code does ever again.
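
At its core, a strangler setup is a routing decision: requests for workflows that have been rebuilt go to the new application, and everything else still goes to the legacy one. The sketch below shows that decision as a pure function. The path prefixes and backend names are hypothetical, and a real deployment would usually make this choice in a reverse proxy or gateway rather than in application code.

```typescript
type Backend = "legacy-php" | "modern-react";

// Path prefixes of workflows that have already been extracted and rebuilt (hypothetical).
const migratedPrefixes: string[] = ["/loans/apply", "/pricing"];

// Picks the backend for a request path. A prefix matches the exact path
// or anything beneath it, so "/pricing/quote" matches but "/pricingx" does not.
function routeRequest(path: string, migrated: string[] = migratedPrefixes): Backend {
  const isMigrated = migrated.some(
    (prefix) => path === prefix || path.indexOf(prefix + "/") === 0
  );
  return isMigrated ? "modern-react" : "legacy-php";
}

console.log(routeRequest("/pricing/quote")); // "modern-react"
console.log(routeRequest("/claims/new"));    // "legacy-php"
```

Migrating another workflow then means adding one prefix to the list, and rolling back means removing it.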

Comparison of Modernization Strategies#

| Strategy | Risk Level | Time to Value | Documentation |
| --- | --- | --- | --- |
| Big Bang Rewrite | ❌ Extreme | 18-24 Months | None (Starting over) |
| Lift and Shift | ❌ High | 6-12 Months | None (Moving the mess) |
| Manual Refactor | ⚠️ Medium | 12-18 Months | Manual / Outdated |
| Replay Extraction | ✅ Low | 2-8 Weeks | Automated / Living |

Frequently Asked Questions#

How long does legacy extraction take with Replay?#

While a manual audit takes roughly 40 hours per screen, Replay reduces this to approximately 4 hours. For a standard enterprise module of 20 screens, that is about 80 hours of extraction work instead of 800, so you can move from a "black box" to a fully documented React codebase in roughly two weeks.

What about business logic preservation?#

This is Replay's core strength. Because we record the execution and state changes of the application, we capture the actual business logic as it functions in production. This eliminates the "But the code says X, while the user says Y" discrepancy common in legacy systems.

Can Replay handle non-web PHP 4 apps?#

Replay is designed for web-based interfaces. If your PHP 4 application is accessed via a browser (even an ancient version of Internet Explorer), Replay can record the workflows and extract the logic. For CLI-based legacy systems, we recommend a different discovery path.

Does it generate API documentation?#

Yes. As part of the extraction process, Replay's AI Automation Suite identifies the data structures being passed between the legacy front-end and back-end, automatically generating Swagger/OpenAPI contracts that can be used to build modern microservices.
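
To show the general idea of deriving a contract from observed traffic, the sketch below infers a JSON-Schema-style description from a single sample payload. It is a simplified illustration, not Replay's algorithm or a complete OpenAPI generator: `inferSchema` and the sample payload are hypothetical, arrays are assumed to be homogeneous, and one sample cannot reveal optional fields.

```typescript
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

interface SchemaNode {
  type: string;
  properties?: Record<string, SchemaNode>;
  items?: SchemaNode;
}

// Builds a schema-like description from one observed value.
function inferSchema(value: JsonValue): SchemaNode {
  if (value === null) return { type: "null" };
  if (Array.isArray(value)) {
    // Assumes homogeneous arrays: the first element stands in for all items.
    return value.length > 0
      ? { type: "array", items: inferSchema(value[0]) }
      : { type: "array" };
  }
  if (typeof value === "object") {
    const properties: Record<string, SchemaNode> = {};
    for (const key of Object.keys(value)) {
      properties[key] = inferSchema(value[key]);
    }
    return { type: "object", properties };
  }
  return { type: typeof value }; // "string", "number", or "boolean"
}

// Hypothetical payload observed between the legacy front-end and back-end.
const observedPayload: JsonValue = {
  orderTotal: 1200,
  regionCode: "NE",
  items: [{ sku: "A1" }],
};

const schema = inferSchema(observedPayload);
console.log(JSON.stringify(schema, null, 2));
```

A production contract would merge many observed samples and add required/optional markers, but even this rough shape is a useful starting point for reviewing what the legacy endpoints actually exchange.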


Ready to modernize without rewriting? Book a pilot with Replay - see your legacy screen extracted live during the call.
