How I Built an AI-Powered AWS Cost Optimization Engine (And Why We Created Our Own Test Data Generator)
Day 31-32 of building our FinOps AI platform - the integration challenges, breakthroughs, and why we ended up building a Rust tool that nobody asked for but everyone needs.
The Problem That Kept Us Up at Night
Picture this: You're building an AI engine that analyzes AWS infrastructure and suggests cost optimizations. Your machine learning models need realistic data to train on. Your testing needs to cover edge cases. Your demos need to show real value.
But here's the catch - you can't use production data (compliance nightmare), and mock data is... well, useless for anything beyond basic unit tests.
Terraform can model your infrastructure beautifully, but it can't generate the CloudWatch metrics and billing data that actually matter for cost optimization. We were stuck in a chicken-and-egg problem: we needed realistic AWS data to build our AI engine, but we needed the AI engine to understand what realistic data looks like.
Enter Fox: The Tool We Didn't Plan to Build
Sometimes the best products come from recognizing a problem and just building the solution. Fox emerged from a single day of focused development - a Rust-powered tool that bridges the gap between Terraform infrastructure definitions and the AWS metrics/billing data that actually drives cost decisions.
Here's what blew our minds during testing: Fox analyzed 9 AWS resources and generated 466,000+ CloudWatch metrics and 13,000+ Cost and Usage Report (CUR) billing records in seconds. Not random data - realistic patterns that mirror actual over-provisioning, idle resources, and optimization opportunities.
The magic? You just run `fox` in any Terraform directory. It parses your Terraform plans, understands your resource dependencies, and generates corresponding AWS data that looks exactly like what you'd see in a real environment.
```bash
# That's it. Seriously.
fox
```
Output? A complete `fox/` directory with:
- CloudWatch metrics showing realistic CPU, memory, and network patterns
- CUR billing data with proper resource attribution
- Even `terraform.tfstate` files with computed attributes
- Data patterns that actually trigger our AI optimization recommendations (see the sketch below)
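Fox itself is written in Rust and its internals aren't shown here, but to make "realistic patterns" concrete, here's a minimal sketch (in Go, for consistency with the rest of this post's examples) of the kind of generator behavior we mean: a diurnal CPU curve with noise, scaled so the instance reads as over-provisioned. Every name and constant is illustrative, not Fox's actual code.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// cpuSeries sketches one day of 5-minute CPU datapoints for an
// over-provisioned instance: a diurnal sine wave plus noise, scaled
// so utilization peaks near peak% and idles near base%.
func cpuSeries(base, peak float64) []float64 {
	points := make([]float64, 0, 288) // 288 five-minute samples per day
	for i := 0; i < 288; i++ {
		hour := float64(i) * 5.0 / 60.0
		// Peaks mid-day, troughs overnight.
		diurnal := (math.Sin((hour-6)/24*2*math.Pi) + 1) / 2
		v := base + (peak-base)*diurnal + rand.NormFloat64()*1.5
		points = append(points, math.Max(0, math.Min(100, v)))
	}
	return points
}

func main() {
	// A "busy" curve that still never exceeds ~20% CPU is exactly the
	// shape that should trigger a right-sizing recommendation.
	for i, v := range cpuSeries(3, 20)[:6] {
		fmt.Printf("t+%dm: %.1f%%\n", i*5, v)
	}
}
```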
The Full-Stack Integration Dance
While Fox solved our data problem, we were simultaneously wrestling with the full-stack integration of our AI optimization engine. This is where things got interesting (read: complex).
Real-Time AI Analysis with Human-Friendly UX
Our frontend needed to communicate with an AI engine that could take 5+ minutes to analyze complex AWS environments. Nobody wants to stare at a loading spinner for that long, so we built:
- Async processing with 2-second polling: Real-time progress updates without blocking the UI (a sketch follows this list)
- Task resumption: Store task IDs so users can navigate away and come back to completed analysis
- 5-minute timeout with graceful degradation: Because even AI has limits
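Our service code isn't public yet, so here's a minimal Go sketch of the polling loop, assuming a hypothetical GET /tasks/{id} endpoint that returns a JSON status field; only the 2-second interval and 5-minute budget come from our actual setup.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// pollTask checks a (hypothetical) task-status endpoint every 2 seconds
// until the task completes or the 5-minute budget runs out.
func pollTask(baseURL, taskID string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			// Graceful degradation: surface a timeout instead of hanging.
			return "", fmt.Errorf("analysis still running after 5m: %w", ctx.Err())
		case <-ticker.C:
			resp, err := http.Get(baseURL + "/tasks/" + taskID)
			if err != nil {
				continue // transient network errors: keep polling
			}
			var body struct {
				Status string `json:"status"`
			}
			err = json.NewDecoder(resp.Body).Decode(&body)
			resp.Body.Close()
			if err == nil && body.Status == "completed" {
				return body.Status, nil
			}
		}
	}
}

func main() {
	status, err := pollTask("http://localhost:8080", "task-123")
	fmt.Println(status, err)
}
```

Because the task ID is the only state the client needs, task resumption falls out for free: store the ID, and the same loop can be restarted after the user navigates back.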
The technical challenge? Coordinating JWT authentication between three services (frontend, backend, AI engine) while maintaining security and user experience.
From Mock Data to Real Intelligence
We completely rewrote our recommendations page to replace static mock data with live AI results. Sounds simple, right?
Not when you're dealing with:
- Row-level security policies that need proper project ID mapping
- Database schema changes that broke existing authentication
- Race conditions between UI navigation and API calls
- Inconsistent route handling that confused users
The breakthrough came when we standardized on `/ai-recommendations` as our canonical route and implemented proper Bearer token authentication at port 8092. Suddenly, everything clicked.
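As a rough picture of what "proper Bearer token authentication" means here: the backend validates the user's JWT and forwards it to the AI engine. The handler below is a simplified Go sketch with a hypothetical /analyze path; only the Bearer scheme and port 8092 come from our actual setup.

```go
package main

import (
	"io"
	"net/http"
	"strings"
)

// proxyToAIEngine forwards an authenticated request to the AI engine.
// The /analyze path is hypothetical; the Bearer scheme and port 8092
// match what we describe above.
func proxyToAIEngine(w http.ResponseWriter, r *http.Request) {
	auth := r.Header.Get("Authorization")
	if !strings.HasPrefix(auth, "Bearer ") {
		http.Error(w, "missing bearer token", http.StatusUnauthorized)
		return
	}
	// In the real service the JWT is verified (signature, expiry,
	// project ID claims for row-level security) before forwarding.

	req, err := http.NewRequest(http.MethodPost, "http://localhost:8092/analyze", r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	req.Header.Set("Authorization", auth) // pass the same token through

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		http.Error(w, "ai engine unreachable", http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	w.WriteHeader(resp.StatusCode)
	io.Copy(w, resp.Body)
}

func main() {
	http.HandleFunc("/ai-recommendations/analyze", proxyToAIEngine)
	http.ListenAndServe(":8080", nil)
}
```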
The Code Analysis Engine: Flow
Our Flow engine is where the rubber meets the road - translating AI recommendations into actual Terraform code changes.
This isn't just string replacement. Flow understands:
- Complex module structures and variable dependencies
- Risk assessment and confidence scoring for each change (see the sketch after this list)
- Rollback generation (because safety first)
- Context-aware mapping between cost optimizations and infrastructure code
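Flow's internal types aren't public, so here's a hedged sketch of the shape a recommendation takes on its way to becoming a code change; every field name is hypothetical, not Flow's actual schema.

```go
package main

import "fmt"

// Recommendation sketches how a cost optimization maps to a concrete
// Terraform change. Field names are illustrative only.
type Recommendation struct {
	ResourceAddr string  // e.g. module.web.aws_instance.app
	Change       string  // proposed HCL edit
	Rollback     string  // pre-generated inverse edit, applied on failure
	Risk         string  // "low" | "medium" | "high"
	Confidence   float64 // 0.0-1.0, from the AI engine's scoring
	MonthlySave  float64 // estimated USD saved per month
}

func main() {
	r := Recommendation{
		ResourceAddr: "module.web.aws_instance.app",
		Change:       `instance_type: "t3.large" -> "t3.medium"`,
		Risk:         "low",
		Confidence:   0.92,
		MonthlySave:  38.40,
	}
	fmt.Printf("%+v\n", r)
}
```

Carrying the rollback edit alongside the change is the "safety first" part: a failed apply can be reverted without re-running the analysis.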
We're building this with Claude Code integration, so developers can run `/init` in their workspace and get automatic project analysis with intelligent code change recommendations.
The technical architecture uses concurrent worker pools with structured zap logging (goodbye logrus!) and comprehensive Viper-based configuration management. It's designed to handle enterprise-scale Terraform codebases without breaking a sweat.
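To give a flavor of that architecture, here's a minimal sketch (not our production code) of the worker-pool-plus-zap shape: a fixed number of goroutines drain a channel of module paths and emit structured logs. In the real service, the worker count and paths would come from the Viper-managed config.

```go
package main

import (
	"sync"

	"go.uber.org/zap"
)

// analyzeModules sketches the concurrent worker-pool shape: a fixed
// number of goroutines drain a channel of module paths. The analysis
// step is stubbed here; in Flow it parses and scores real Terraform.
func analyzeModules(logger *zap.Logger, paths []string, workers int) {
	jobs := make(chan string)
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for p := range jobs {
				// Structured fields instead of logrus-style format strings.
				logger.Info("analyzing module",
					zap.Int("worker", id),
					zap.String("path", p),
				)
			}
		}(i)
	}

	for _, p := range paths {
		jobs <- p
	}
	close(jobs)
	wg.Wait()
}

func main() {
	logger, _ := zap.NewProduction()
	defer logger.Sync()
	analyzeModules(logger, []string{"modules/vpc", "modules/ecs"}, 4)
}
```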
Why This Matters Beyond Our Startup
Here's the thing about building in public - you start to see patterns that extend far beyond your own problem space.
The realistic test data problem is one every engineering team faces. How much time gets spent setting up complex test scenarios? Fox represents a new approach - generate the exact data patterns you need, when you need them, without the overhead of maintaining test environments.
The AI optimization recommendations we're generating aren't theoretical. They're based on real usage patterns and billing data (even when that data is synthetic). Our testing shows Fox-generated scenarios triggering the same optimization recommendations you'd see with production data.
This is infrastructure-as-code meeting AI/ML in a practical, deployable way. We're not just building another monitoring dashboard - we're creating intelligence that directly impacts bottom-line costs with measurable ROI.
The Integration That Changes Everything
What excites us most is how these pieces work together:
- Fox generates realistic AWS data for any Terraform configuration
- Flow analyzes that data and maps optimizations to code changes
- Our AI engine provides intelligent recommendations with confidence scoring
- Cloud Atlas Insight presents everything in a clean, actionable interface
The result? A complete workflow from "here's my Terraform code" to "here are the specific changes that will save you $X/month, with Y% confidence."
What's Next: The 90% Problem
We're solving what we call the "90% problem" in FinOps. Most cost optimization tools can identify obvious waste (the easy 10%), but struggle with the nuanced decisions that drive real savings:
- Is this EC2 instance actually idle, or does it have bursty workloads?
- Should we resize this RDS instance or optimize the application queries?
- Which of these 47 optimization recommendations should we prioritize?
Our integrated platform answers these questions with data-driven insights and concrete implementation paths.
Building in Public: The Honest Update
What's Working: Fox exceeded our wildest expectations for a day's work. The performance and data quality show real promise, though like everything else in this stack, it's still actively under development.
What's Reality: I'm working on basically every part of these tools simultaneously. Nothing is "done" - Fox needs refinement, Flow is still being built out, the UI integration has edge cases, and the authentication layer needs hardening. We'll hit issues, debug them, and improve as we go.
What's Challenging: Authentication flows across multiple services are never as simple as they seem. We're still debugging edge cases in our JWT implementation, and each integration reveals new complexity we hadn't anticipated.
What We Learned: Sometimes you need to build the tool that doesn't exist yet. Fox started as a necessity and became central to our entire testing strategy, but it's just one piece of a much larger puzzle we're still assembling.
What's Coming: Continued integration work across all components, fixing the inevitable bugs we'll discover, and the big milestone - demonstrating measurable cost savings on real customer infrastructure. The goal is a complete, working system, but we're very much in the "make it work, then make it better" phase.
Why We're Sharing This
Building enterprise software is hard. Building AI-powered enterprise software is harder. Building it in public, with real technical challenges and honest updates, hopefully helps other founders navigate similar problems.
We're not just building tools - we're building a new category at the intersection of DevOps, FinOps, and AI. If you're solving similar problems or just want to follow along, we'd love to connect.
Follow our build-in-public journey for more technical deep-dives, integration challenges, and the occasional breakthrough that makes it all worthwhile.
P.S. - If you're dealing with AWS cost optimization challenges or need realistic test data for your own infrastructure tools, let's talk. Fox might solve problems you didn't know you had.