
From Monolith to API-First: The Cloud Migration That Saved Evergen
They Were Losing $12,000 a Day. Every New Customer Made It Worse. Here's How We Fixed It—And Transformed Their Business.
Technologies: Go, Kafka, AWS, Kubernetes, MongoDB, ArgoCD, Docker, API Design
The Numbers
| Metric | Before | After |
|---|---|---|
| Cost per client | $50/month | $0.60/month |
| Processing speed | 20 seconds | 50 milliseconds |
| Uptime | 91% | 99.9% |
| Asset capacity | 400 sites (maxed) | Hundreds of thousands |
| Daily burn | $12,000 | Profitable |
Evergen had a real product in a market that doesn't wait. Energy pricing in Australia changes every 5 minutes. The problem wasn't "performance" or "cloud spend" as separate issues—it was unit economics driven by architecture: a fragile Azure monolith with business logic buried in hundreds of stored procedures. We reverse-engineered the system, rebuilt it as an API-first platform on AWS, ran two production systems in parallel for a year, then cut over with zero customer impact. And the company stopped dying.
Evergen had everything. AI that actually worked for energy optimization. Backing from AMP. 400 customers. Perfect market timing.
One problem: they were losing $30 on every single customer.
New Leadership, Ugly Truth
Ben Hutt came in as CEO. Ben Burns as COO. They started digging into the numbers.
What they found was ugly.
Cost per client was $50. Revenue was $20. The math was brutal. Sixty days of runway left. The faster they grew, the faster they died.
They looked at the engineering team. Three developers watching the clock until 5 PM. Zero documentation. Hundreds of MSSQL stored procedures containing all the business logic. The one engineer who understood the AI? Gone months ago. No notes. No comments. Nothing.
The existing team's solution? "Buy bigger MSSQL instances."
Ben and Ben knew they needed outside help. Fast.
What We Found When We Arrived
They called us in. We walked into the Sydney office and confirmed their fears.
The system was worse than the numbers suggested. The infrastructure wasn't just expensive—it was fragile. One bad deployment away from taking down the whole thing.
When I say "fragile," I don't mean "a few flaky alerts." I mean the kind of system where you don't deploy because you can't predict what'll break, and you can't predict what'll break because there are no tests, no docs, and half the business logic is hidden in SQL nobody has the confidence to touch.
That's not a tech problem. That's a company survival problem.
Then Azure Crashed
Mid-project, Azure deprecated our Kubernetes node version. No warning. No migration tools.
Our entire cluster went down during a routine deployment. We spent all night with Azure support trying to rebuild it.
Every time we needed to scale? Two-week approval cycle. We had to beg to pay them more money.
When we called AWS with the same questions, a solutions architect called back: "I'm just down the street. Want me to stop by after lunch?"
That's when we decided to move.
The Impossible Part
We had to reverse-engineer an AI that nobody understood.
Evergen's optimization engine made split-second decisions about when to buy and sell energy. It lived in hundreds of stored procedures. No documentation. No tests. Just code.
We spent months analyzing stored procedure dependencies. Mapping data flows. Testing edge cases with real energy data.
One memorable discovery: a procedure burning massive computing power 24/7 to create in-memory tables, then doing nothing with them. Legacy code from someone who'd left years ago.
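To make that kind of dependency mapping concrete, here is a minimal sketch in Go. It assumes each stored procedure has already been reduced to the tables it reads and the tables it writes. The `Proc` type, the `sinks` set, and every procedure and table name are hypothetical, invented for illustration; this is not the tooling used on the project.

```go
package main

import (
	"fmt"
	"sort"
)

// Proc is a hypothetical summary of one stored procedure:
// the tables it reads from and the tables it writes to.
type Proc struct {
	Name   string
	Reads  []string
	Writes []string
}

// UnreadWriters returns the procedures that write tables nobody consumes.
// A written table counts as consumed if some procedure reads it, or if it is
// listed in sinks (final outputs such as bids or reports).
func UnreadWriters(procs []Proc, sinks map[string]bool) []string {
	read := make(map[string]bool)
	for _, p := range procs {
		for _, t := range p.Reads {
			read[t] = true
		}
	}

	var dead []string
	for _, p := range procs {
		if len(p.Writes) == 0 {
			continue // read-only procedures are not candidates
		}
		used := false
		for _, t := range p.Writes {
			if read[t] || sinks[t] {
				used = true
				break
			}
		}
		if !used {
			dead = append(dead, p.Name)
		}
	}
	sort.Strings(dead) // stable output for reports
	return dead
}

func main() {
	// Illustrative data only.
	procs := []Proc{
		{Name: "load_prices", Writes: []string{"prices"}},
		{Name: "build_scratch", Reads: []string{"prices"}, Writes: []string{"scratch_tmp"}},
		{Name: "make_bids", Reads: []string{"prices"}, Writes: []string{"bids"}},
	}
	sinks := map[string]bool{"bids": true}
	fmt.Println(UnreadWriters(procs, sinks)) // [build_scratch]
}
```

A flagged procedure is only a candidate. Something outside the database may still read its output, so each hit needs a human check before anything is deleted.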
This is the part most people underestimate. It's not "rewrite it in Go." It's "prove the new system makes the same financial decisions as the old system," because in energy markets, being slightly wrong isn't a bug. It's lost money. Or compliance trouble. Or both.
What We Had to Rebuild
To understand why this was hard, you need to understand what Evergen's platform actually does. This isn't a simple CRUD app. It's a real-time AI that makes financial decisions every 5 minutes for thousands of energy assets.
Seven core systems we reverse-engineered from undocumented SQL:
Forecasting Engine
The AI predicts optimal buy/sell decisions by analyzing energy consumption patterns, solar generation forecasts, weather data, grid demand, and wholesale electricity prices. All of this feeds into a model that runs predictions for each individual site. The old system did this in stored procedures. We rebuilt it as a distributed microservice that could handle 100x the load.
The shift here wasn't "microservices because microservices." It was isolation. Forecasting is expensive compute, and it doesn't get to take down fleet control because it's having a bad day.
Real-Time Control
Every battery in the network can be controlled remotely—charge, discharge, or hold. The platform decides automatically, but operators can override when needed. The original system had control logic scattered across 47 different stored procedures. We consolidated it into a single, testable control service.
This was about confidence. If a human operator hits override at 2 AM, you don't want to wonder which stored procedure is going to win the argument.
Market Bidding
Evergen participates in wholesale energy markets, ancillary services, and Virtual Power Plant (VPP) programs. The platform automatically bids and rebids based on market conditions. Getting this wrong means losing money or violating market rules. The compliance requirements alone filled a 200-page document.
This is where "speed" becomes money in the most literal sense. Not in the "our dashboard feels snappy" sense. In the "you missed the price spike, you missed the revenue" sense.
Fleet Monitoring
Thousands of batteries, inverters, and solar panels across NSW, VIC, SA, QLD, and ACT. Each device streams telemetry data. The platform combines vendor APIs with inferred data to build a complete picture of fleet health. The old system processed this synchronously. We rebuilt it with Kafka for real-time streaming.
Synchronous processing here was a silent killer. It doesn't fail loudly. It just backs up, gets slower, drops things, and you wake up to "why didn't we see this battery go offline yesterday?"
Anomaly Detection
The AI detects when devices go offline, underperform, or behave unexpectedly. It alerts operators before customers notice problems. This directly impacts ROI—a dead battery earns nothing. We had to extract this logic from stored procedures that mixed alerting with billing with reporting.
Mixing concerns like that is how systems rot. Not because it's "unclean." Because it means you can't change one thing without side effects you don't understand until production.
Behind-the-Meter Optimization
For residential customers, the ML model creates personalized energy plans. It accounts for each site's weather, load patterns, electricity rates, and VPP participation. Every 5 minutes, it reassesses and adjusts. Customers using Evergen save an additional 26% on electricity bills compared to standard solar+battery setups. We had to ensure the new system matched the old system's optimization quality—any regression would show up in customer bills.
This one made me paranoid, in a good way. If you're wrong, you don't just get a few complaints. You lose trust. And in consumer energy, trust is the whole game.
Front-of-Meter Optimization
For utility-scale assets and commercial sites, the platform maximizes returns across multiple revenue streams: wholesale markets, frequency control, demand response, and network support programs. These markets have different rules, different settlement periods, and different penalties for non-compliance. The original system handled this with 89 stored procedures that nobody fully understood.
This is where "just rewrite it" dies. These markets aren't forgiving, and neither are auditors. You need determinism, traceability, and the ability to explain decisions.
All of this was buried in undocumented SQL. We had to understand it well enough to rebuild it, then prove the new system produced identical results before we could switch over.
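Proving "identical results" comes down to feeding both systems the same inputs and diffing what they decide. Here is a minimal sketch of such a parity check. The `Decision` shape, the action strings, and the tolerance parameter are assumptions for illustration, not the real validation harness.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Decision is a hypothetical record of what a system chose for one site
// in one 5-minute interval.
type Decision struct {
	SiteID string
	Action string  // "charge", "discharge", or "hold"
	KW     float64 // requested power
}

// Diff compares legacy and new decisions for the same interval and returns
// one message per mismatch. tolKW of 0 demands exact power equality.
func Diff(legacy, next []Decision, tolKW float64) []string {
	byID := make(map[string]Decision, len(next))
	for _, d := range next {
		byID[d.SiteID] = d
	}

	var out []string
	seen := make(map[string]bool, len(legacy))
	for _, old := range legacy {
		seen[old.SiteID] = true
		nw, ok := byID[old.SiteID]
		switch {
		case !ok:
			out = append(out, fmt.Sprintf("%s: missing in new system", old.SiteID))
		case nw.Action != old.Action:
			out = append(out, fmt.Sprintf("%s: action %s vs %s", old.SiteID, old.Action, nw.Action))
		case math.Abs(nw.KW-old.KW) > tolKW:
			out = append(out, fmt.Sprintf("%s: power %.3f kW vs %.3f kW", old.SiteID, old.KW, nw.KW))
		}
	}
	for _, nw := range next {
		if !seen[nw.SiteID] {
			out = append(out, fmt.Sprintf("%s: missing in legacy system", nw.SiteID))
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	legacy := []Decision{{"site-1", "discharge", 5.0}, {"site-2", "hold", 0}}
	next := []Decision{{"site-1", "discharge", 5.0}, {"site-2", "charge", 2.5}}
	for _, m := range Diff(legacy, next, 0) {
		fmt.Println(m) // site-2: action hold vs charge
	}
}
```

An empty diff across every interval and every site, over a long enough period of real market data, is the evidence that makes a cutover defensible.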
The API Ecosystem We Built
The old system was a monolith. The new system integrates with dozens of third-party services through a comprehensive API ecosystem:
External API
Gateway for third-party integrations. Energy retailers, grid operators, and partner platforms connect here.
Mobile API
Powers customer-facing apps. Real-time data, device control, VPP participation—all from a phone.
OEM APIs
Tesla Powerwall, LG RESU, Enphase, Fronius. Each vendor has different protocols. We normalized them all.
Device Onboarding API
What used to take manual database edits now happens through a clean API call.
Device Telemetry API
Real-time data from every asset. Battery state-of-charge, solar output, grid import/export—streaming constantly.
Forecasting API
Predictions for each device. Feeds into the optimization engine for proactive load balancing.
From a fragile monolith with business logic buried in stored procedures to a modern, API-first platform that other companies can build on top of.
Here's the part that's easy to miss: this wasn't "nice to have." It was the only way Evergen could scale beyond "we have 400 customers and we're scared to touch anything." An API-first platform forces boundaries. Boundaries force discipline. Discipline is what lets you ship changes without gambling the business.
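The vendor normalization behind the OEM APIs is a good example of such a boundary. A minimal sketch of the adapter pattern follows. The vendors are deliberately generic (`vendorA`, `vendorB`) and every payload field name is invented; none of this reflects a real manufacturer's API.

```go
package main

import (
	"errors"
	"fmt"
)

// Telemetry is the normalized shape the rest of the platform sees.
type Telemetry struct {
	DeviceID   string
	SoCPercent float64 // state of charge, 0..100
	PowerW     float64 // watts
}

// Adapter converts one vendor's raw payload into Telemetry.
type Adapter interface {
	Normalize(deviceID string, raw map[string]float64) (Telemetry, error)
}

// vendorA (hypothetical) reports charge as a 0..1 fraction and power in kW.
type vendorA struct{}

func (vendorA) Normalize(id string, raw map[string]float64) (Telemetry, error) {
	soc, okSoC := raw["soc_fraction"]
	kw, okKW := raw["power_kw"]
	if !okSoC || !okKW {
		return Telemetry{}, errors.New("vendorA: missing field")
	}
	return Telemetry{DeviceID: id, SoCPercent: soc * 100, PowerW: kw * 1000}, nil
}

// vendorB (hypothetical) already reports percent and watts, under other names.
type vendorB struct{}

func (vendorB) Normalize(id string, raw map[string]float64) (Telemetry, error) {
	soc, okSoC := raw["battery_pct"]
	w, okW := raw["watts"]
	if !okSoC || !okW {
		return Telemetry{}, errors.New("vendorB: missing field")
	}
	return Telemetry{DeviceID: id, SoCPercent: soc, PowerW: w}, nil
}

func main() {
	adapters := map[string]Adapter{"a": vendorA{}, "b": vendorB{}}

	t1, _ := adapters["a"].Normalize("dev-1", map[string]float64{"soc_fraction": 0.5, "power_kw": 1.5})
	t2, _ := adapters["b"].Normalize("dev-2", map[string]float64{"battery_pct": 50, "watts": 1500})
	fmt.Println(t1.SoCPercent == t2.SoCPercent, t1.PowerW == t2.PowerW) // true true
}
```

Adding a new battery vendor then means writing one more adapter, with no changes to forecasting, control, or bidding.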
Running Two Systems for a Year
Australia's energy market changes prices every 5 minutes. There's no maintenance window. There's no "we'll be down for a few hours."
So we ran two complete production systems simultaneously. For a year.
| Component | Old (Azure) | New (AWS) |
|---|---|---|
| Language | .NET | Go |
| Data layer | MSSQL + stored procedures | MongoDB (including time series data), AWS Athena, Kafka |
| Scaling | Buy bigger servers | Add more nodes |
| Deployment | Manual and scary | CI/CD with instant rollback |
Three engineers maintained the dying system. Everyone else built the replacement. Every feature got built twice. Every data flow got validated across both systems.
Expensive? Yes. But cheaper than failure.
The Friday Cutover
We broke the "never deploy on Friday" rule.
Everyone in one room. Azure console on the big screen. Rollback ready.
CTO Nick McGrath gave the signal. We killed Azure.
Thirty minutes of staring at monitoring dashboards. Hundreds of thousands of optimizations flowing through AWS.
It worked. Zero customer impact.
Why Speed Actually Matters Here
One customer made $50 in a single day from optimized energy trading. More than most people earn in a month from solar panels.
Here's why: Australia's electricity prices change every 5 minutes. Evergen's AI predicts price spikes and sells stored battery energy at 5-10x retail rates.
The old system took 20 seconds per run. That's about 7% of every 5-minute window gone. Sometimes you'd miss the spike entirely.
New system? 50 milliseconds. You catch everything.
"When Jack's team showed up, we were drowning. Technical debt everywhere. Bleeding money. They dug into our undocumented mess, figured out the AI, and built something that actually worked. During the midnight Azure crisis—when we were both trying to save the infrastructure—that's when I knew these were real partners, not contractors watching the clock. Today we're running hundreds of thousands of optimizations in real time. That was literally impossible before."— Nick McGrath, CTO, Evergen
"When Ben Burns and I came in, we knew something was wrong. We just didn't know how bad. Losing $30 on every customer in a growth market—that's not a tech problem, that's an existential problem. We called MadAppGang because we needed people who could move fast and wouldn't flinch at the mess. Jack's team didn't just fix the tech—they fixed the business model. $50 to $0.60 per client. That's the difference between a cautionary tale and Australia's top energy platform."— Ben Hutt, CEO, Evergen
What Made This Hard
Most firms would've walked away. Here's what they'd have seen:
- No documentation for any critical system
- An AI that one person understood (who was gone)
- Zero-downtime requirement in a 24/7 market
- Twelve months of running dual infrastructure
- Sixty days of runway when we started
- Company survival depending on us not screwing up
We jumped on the sinking ship and rebuilt it while it was sinking.
The Business After
Before (blocked by infrastructure):
- Couldn't take new customers. System was maxed.
- Couldn't pursue bigger clients. Too unreliable.
- Couldn't add new battery vendors. Too inflexible.
After (infrastructure out of the way):
- Signed EnergyAustralia
- Became Australia's #1 energy optimization platform
- Launched white-label solutions for other providers
- Grew engineering team from 3 to 70+
- Expanded to multiple continents
What We Learned
Infrastructure debt compounds. When you're losing $30 per customer in a $20/customer business, every day costs $12,000. Every week is $84,000. Every month is $360,000. The question isn't whether you can afford to fix it. It's whether you can afford not to.
Cloud support matters more than features. Pricing and feature lists look similar. What's different: how fast they respond when you're dying at 2 AM, whether scaling requires two weeks of approval, whether they treat you like a partner or an invoice.
Parallel infrastructure is expensive but necessary. For 24/7 businesses, it's the only way to migrate without betting everything on a single cutover.
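The arithmetic behind the first lesson is short enough to write down. This sketch uses only figures quoted in this case study ($20 revenue per client, $50 and $0.60 cost per client, $12,000 daily burn). It assumes revenue per client stayed at $20 after the migration, which the text does not state. Amounts are kept in whole cents or whole dollars to avoid floating-point rounding.

```go
package main

import "fmt"

// MarginCents returns revenue minus cost for one client, in cents.
func MarginCents(revenueCents, costCents int64) int64 {
	return revenueCents - costCents
}

// Burn extrapolates a flat daily loss, in dollars, over a number of days.
func Burn(dailyDollars, days int64) int64 {
	return dailyDollars * days
}

func main() {
	const revenue = 2000 // $20.00 per client

	fmt.Println(MarginCents(revenue, 5000)) // before: -3000, i.e. -$30.00
	fmt.Println(MarginCents(revenue, 60))   // after: 1940, i.e. +$19.40 (assumes unchanged revenue)

	fmt.Println(Burn(12000, 7), Burn(12000, 30)) // 84000 360000
}
```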
When This Matters to You
You need someone like us when:
- Your infrastructure costs are killing your unit economics
- You're losing customers because the system can't keep up
- You've got critical systems that only one person understands (or understood)
- You need to migrate without downtime
- Other vendors have told you it's impossible
But the migration was just the beginning.
We didn't just rescue the infrastructure. We built one of the world's most advanced energy orchestration platforms: the AI that makes financial decisions every five minutes for tens of thousands of assets.
Talk to Us
We've done this before. Not just Evergen—multiple companies where infrastructure was the bottleneck between where they were and where they needed to be.
60 minutes with our CTO. We'll tell you honestly whether we can help. No pitch deck. No "let's schedule a follow-up." Just a conversation about your infrastructure.
What's the infrastructure problem you've been putting off?
Industry: Energy Technology
Duration: 12 months
Team: 8 engineers
Stack: Go, Kafka, AWS, Kubernetes