
An AI Agent Built My Data Pipeline in 2 Hours
How I used Claude Code to connect data sources, build a semantic model, and create dashboards—turning a week-long project into a single afternoon.
I used to spend days building out client reporting systems. Connect the data sources, figure out the joins, set up the BI tool, build the dashboards, realize the numbers don't match, debug everything, repeat. It was a slog.
Last month, I built an entire data and reporting pipeline—from raw data sources to finished dashboards—in about two hours. And I didn't write a single line of SQL myself.
The secret? A Claude Code agent orchestrating the whole thing, combined with a modern semantic layer that makes the AI's job dramatically easier.
Let's jump in.
The Problem: Brittle Reporting That Breaks Constantly
If you've ever managed reporting for a business, you know the pain:
- Scattered logic: Joins and calculations live inside individual Looker blends or dashboard queries. Change one thing, break three others.
- Numbers that don't match: The revenue in Dashboard A doesn't match Dashboard B. Spend an afternoon figuring out why.
- Manual everything: Building a new report means recreating the same joins, the same filters, the same calculations. Every. Single. Time.
- No scalability: Each client becomes a bespoke snowflake. What worked for Client A doesn't transfer to Client B.
I've seen analytics engineers spend 40% of their time just maintaining dashboards. Not building new insights—just keeping the lights on.
The core issue? Most BI setups put the logic in the wrong place. The dashboards become the source of truth, when really the data model should be.
The Solution: Model-First Reporting + AI Orchestration
Here's the insight that changed everything for me: once you have a proper semantic layer—a central place where metrics, dimensions, and joins are defined once—an AI agent can do the rest.
Why? Because the agent isn't inventing logic each time. It's working with a consistent, documented data model. The definitions are already there. The AI just needs to assemble them into dashboards.
The stack I landed on:
- Claude Code: The orchestrator—plans, writes code, connects things, iterates
- Cube: The semantic layer—central definitions for metrics and dimensions
- Metabase: The BI tool—fast dashboarding, approachable for stakeholders
- Playwright: UI automation—creates dashboards through the Metabase interface
This combination gives you enterprise-grade reporting without the enterprise reporting team.
The Stack: Why These Tools?
Cube: The Semantic Layer
Cube is an open-source semantic layer that sits between your data sources and your BI tools. You define your metrics and dimensions once, in code, and every downstream tool uses those definitions.
The power here is consistency. "Revenue" means the same thing everywhere. "Customer count" uses the same logic whether you're in Metabase, building an embedded dashboard, or querying via API.
Cube supports all major data warehouses—Snowflake, BigQuery, Postgres, Databricks—and exposes your semantic model via SQL, REST, and GraphQL. This is what makes it AI-friendly: Claude Code can introspect the model, understand the relationships, and generate queries that actually work.
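To make this concrete, here's a minimal sketch of the kind of query a client (or an agent) sends to Cube's REST API. The `/cubejs-api/v1/load` endpoint and the query shape (measures, dimensions, `timeDimensions`) follow Cube's documented REST API; the cube and member names like `subscriptions.mrr` are hypothetical stand-ins, not a real model.

```python
import json

def build_cube_query(measures, dimensions=None, time_dimension=None, granularity=None):
    """Assemble a Cube REST API query payload from measure/dimension names."""
    query = {"measures": measures}
    if dimensions:
        query["dimensions"] = dimensions
    if time_dimension and granularity:
        query["timeDimensions"] = [{
            "dimension": time_dimension,      # e.g. a time dimension on the cube
            "granularity": granularity,        # day / week / month / year
            "dateRange": "last 12 months",
        }]
    return query

# Hypothetical member names for illustration only.
payload = build_cube_query(
    measures=["subscriptions.mrr"],
    time_dimension="subscriptions.signup_date",
    granularity="month",
)
print(json.dumps(payload, indent=2))
# A client would POST this as the `query` parameter to
# http://localhost:4000/cubejs-api/v1/load on a running Cube instance.
```

Because the payload is plain JSON describing *named* metrics rather than raw SQL, an agent can generate it reliably from the model's metadata.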
Gartner's 2025 guidance explicitly identified semantic technology as "non-negotiable for AI success." I'm seeing that play out firsthand.
Metabase: The BI Layer
Metabase is an open-source BI tool that's genuinely easy to use. Non-technical stakeholders can build their own queries. Technical users can drop into SQL when needed.
The self-hosted version is completely free. Just run `docker run -d -p 3000:3000 metabase/metabase` and you're up. For managed hosting, plans start at $500/month.
The reason I picked Metabase over alternatives: it's simple enough that an AI agent can navigate the UI via Playwright, and powerful enough for real business analytics.
Playwright: The UI Automation Layer
Here's where things get interesting. Claude Code doesn't just generate SQL or config files—it actually drives the Metabase UI to create dashboards.
Using Playwright (the browser automation framework), the agent can:
- Navigate to Metabase
- Create new questions using the query builder
- Configure filters and visualizations
- Build dashboards and arrange charts
- Set up date filters, pivot tables, and drill-downs
This closes the "last mile" gap. You're not stopping at generated code—you're shipping dashboards that stakeholders can actually use.
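To give a flavor of what that automation looks like, here's a minimal sketch of the kind of routine the agent generates. The selectors and URL path are hypothetical placeholders (real Metabase selectors differ, and the agent discovers them by inspecting the live page); the function takes a Playwright `Page`-like object so the flow itself is easy to test.

```python
# Sketch only: selectors and the /question/new path are assumptions, not
# Metabase's actual UI contract. `page` is a Playwright sync-API Page, or
# any object exposing the same goto/click methods.

def create_mrr_question(page, base_url="http://localhost:3000"):
    """Drive the query builder: open a new question, pick a measure, chart it, save."""
    page.goto(f"{base_url}/question/new")   # open the query builder
    page.click("text=MRR")                  # pick the measure (hypothetical selector)
    page.click("text=Line chart")           # set the visualization
    page.click("text=Save")                 # save the question
```

Keeping the browser object injectable is also how the agent recovers from flaky UI runs: the same function can be retried against a fresh page without rebuilding the whole session.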
What the Agent Actually Did: Step by Step
Let me walk through the actual build process. This was for a client with data in Stripe, HubSpot, and a Postgres application database.
Step 1: Inventory and Connect Data Sources
I started by describing the situation to Claude Code:
"I have three data sources: Stripe for payments, HubSpot for CRM data, and a Postgres database with application data. The goal is consolidated reporting on revenue, customer lifecycle, and product usage."
The agent asked clarifying questions:
- What are the core entities? (Customers, subscriptions, events)
- What's the grain of each data source? (Transaction-level for Stripe, contact-level for HubSpot)
- What are the must-have metrics? (MRR, churn rate, LTV, conversion rates)
Then it provided a connection plan with:
- Read-only credentials for each source (least privilege)
- Environment variable structure for secrets
- Sanity checks to run after connecting ("Can we query? Do row counts match expectations?")
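The sanity checks were simple row-count probes. Here's a sketch of the idea, written against the generic DB-API interface so it works with any driver; I'm using `sqlite3` as a stand-in for the real warehouse driver, and the table names are illustrative.

```python
import sqlite3  # stand-in; in practice this would be psycopg2 or the warehouse driver

def sanity_check(conn, tables):
    """Return row counts per table, raising early if a table is empty or unqueryable."""
    counts = {}
    cur = conn.cursor()
    for table in tables:
        # Table names come from a vetted inventory list, never from user input.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        n = cur.fetchone()[0]
        if n == 0:
            raise ValueError(f"{table} is empty: check the connection and permissions")
        counts[table] = n
    return counts
```

Running this immediately after connecting each source catches bad credentials and empty replicas before any modeling work begins.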
Step 2: Set Up the Data Stack
The agent recommended the Cube + Metabase stack and walked through setup:
For Cube:
- Local development configuration
- Docker Compose setup for production
- Connection strings for each data source
For Metabase:
- Docker deployment command
- Initial admin setup
- Connection to Cube's SQL endpoint
This took maybe 20 minutes of guided iteration. The agent handled the config files; I just approved and ran commands.
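For reference, the production setup boils down to a compose file along these lines. This is a sketch: the image tags, ports, and environment variable names follow the two projects' documented defaults, but verify them against the current docs before deploying.

```yaml
# Minimal Cube + Metabase sketch. Secrets come from the environment, never
# from the file itself. One data source shown; Cube supports several at once.
services:
  cube:
    image: cubejs/cube:latest
    ports:
      - "4000:4000"     # REST API and Playground
      - "15432:15432"   # SQL API (Postgres wire protocol, used by Metabase)
    environment:
      - CUBEJS_DEV_MODE=true
      - CUBEJS_DB_TYPE=postgres
      - CUBEJS_DB_HOST=${DB_HOST}
      - CUBEJS_DB_USER=${DB_USER}
      - CUBEJS_DB_PASS=${DB_PASS}
      - CUBEJS_PG_SQL_PORT=15432
    volumes:
      - ./model:/cube/conf/model   # the semantic model files live here
  metabase:
    image: metabase/metabase:latest
    ports:
      - "3000:3000"
```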
Step 3: Pull Schemas and Map Entities
Once connected, the agent introspected the schemas from each data source. It identified:
- Customers (from HubSpot contacts + Stripe customers)
- Subscriptions (from Stripe)
- Transactions (from Stripe charges)
- Product events (from Postgres)
It then proposed an entity relationship map, highlighting where the joins would happen (email as the common key between HubSpot and Stripe, user_id linking Postgres events to Stripe customers).
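Before committing to email as the join key, it's worth measuring how well the key actually covers both sides. Here's a sketch of that check with illustrative record shapes; the real data obviously has more fields.

```python
def join_key_coverage(hubspot_contacts, stripe_customers, key="email"):
    """Report how many records on each side share the join key (case-insensitive)."""
    hs = {c[key].strip().lower() for c in hubspot_contacts if c.get(key)}
    st = {c[key].strip().lower() for c in stripe_customers if c.get(key)}
    overlap = hs & st
    return {
        "hubspot_only": len(hs - st),   # contacts with no matching Stripe customer
        "stripe_only": len(st - hs),    # customers with no matching contact
        "matched": len(overlap),
    }
```

If `matched` comes back low, that's a signal to fall back to a secondary key or a lookup table before building joins on top of a leaky one.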
Step 4: Build the Semantic Model in Cube
This is where the agent really earned its keep. It generated Cube data model files that defined:
Measures:
- `mrr`: Monthly recurring revenue
- `total_revenue`: Lifetime revenue per customer
- `churn_rate`: Calculated as churned customers / total customers
- `conversion_rate`: Trials converted to paid
Dimensions:
- `signup_date`, `first_payment_date`, `churn_date`
- `plan_type`, `billing_interval`
- `acquisition_channel` (from HubSpot)
- `cohort_month` (derived from signup date)
Joins:
- Explicit, documented relationships between cubes
- Handles for the tricky cases (customers with multiple subscriptions, attribution to original acquisition channel)
The agent also flagged potential issues: "Watch out for fanout on the events table—a customer can have thousands of events, so we should aggregate before joining."
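An abridged sketch of what one of those generated files looks like, in Cube's YAML model syntax. The table and column names here are illustrative placeholders, not the client's actual schema.

```yaml
# Hypothetical cube: one Stripe-backed subscriptions model with a join,
# one filtered measure, and two dimensions.
cubes:
  - name: subscriptions
    sql_table: stripe.subscriptions

    joins:
      - name: customers
        sql: "{CUBE}.customer_id = {customers}.id"
        relationship: many_to_one

    measures:
      - name: mrr
        sql: amount / 100.0        # Stripe stores amounts in cents
        type: sum
        filters:
          - sql: "{CUBE}.status = 'active'"

    dimensions:
      - name: plan_type
        sql: plan
        type: string
      - name: signup_date
        sql: created_at
        type: time
```

Because the join, the filter, and the unit conversion live here, every downstream chart inherits them for free.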
Step 5: Connect Cube to Metabase
With the semantic model in place, the agent configured the connection between Cube and Metabase:
- Set up Cube's SQL API endpoint
- Added it as a database in Metabase
- Synced the schema so all cubes appeared as tables
- Configured display names so metrics were human-readable
Step 6: Build Dashboards via Playwright
Now for the fun part. I gave the agent a screenshot of an existing Looker dashboard and said: "Recreate this in Metabase."
The agent used Playwright to:
- Navigate to Metabase
- Create a new question using the MRR measure
- Add a date filter (relative: last 12 months)
- Set the visualization to a line chart
- Save and add to a new dashboard
- Repeat for each chart in the original
For pivot tables—which can be tricky—the agent figured out the right configuration after a couple of attempts. I'd take a screenshot of the issue, paste it into the conversation, and the agent would adjust.
Step 7: Validate and Iterate
The final step was validation. The agent ran reconciliation queries:
- Total revenue in Cube vs. raw Stripe data
- Customer counts matching between HubSpot and the semantic layer
- MRR calculations verified against known-good monthly totals
We caught one discrepancy (a timezone issue on the date dimension) and fixed it in the model. After that, everything matched.
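The reconciliation logic itself is simple: compare the semantic-layer total against the raw source within a small tolerance, so rounding differences don't trigger false alarms. A sketch (the tolerance and labels are my own choices, not a standard):

```python
def reconcile(label, semantic_total, source_total, tolerance=0.005):
    """Return True if the two totals agree within `tolerance` (relative drift)."""
    if source_total == 0:
        return semantic_total == 0
    drift = abs(semantic_total - source_total) / abs(source_total)
    if drift > tolerance:
        print(f"{label}: MISMATCH ({semantic_total} vs {source_total}, drift {drift:.2%})")
        return False
    return True

# e.g. reconcile("total_revenue", cube_total, stripe_total)
```

The timezone bug we caught showed up exactly this way: a drift just over the tolerance on the daily revenue numbers, traced back to the date dimension.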
The Results
| Metric | Before (Manual) | After (AI Agent) |
|---|---|---|
| Setup time | 3-5 days | ~2 hours |
| Dashboard iteration | 2-4 hours each | 10-15 minutes |
| Metric consistency | Low (logic scattered) | High (defined once) |
| Maintainability | Fragile | Robust |
| Cross-client reusability | None | High |
The biggest win isn't even the speed—it's the quality. The semantic layer means metrics are defined once and used everywhere. No more "why don't these numbers match?" debugging sessions.
What Didn't Work Perfectly
Let me be honest about the limitations:
Ambiguous specs require clarification. If I said "show me revenue by month," the agent would ask: "Booking date or payment date? Including refunds? Which revenue types?" This is actually good—it surfaces questions that would otherwise become bugs.
Pivot tables need iteration. Complex visualizations sometimes took 2-3 attempts to get right. The agent would get close, I'd screenshot the issue, and we'd refine.
Data modeling decisions need human judgment. The agent can propose a model, but someone needs to confirm that "customer" means what we think it means, and that the join logic matches business reality.
UI automation can fail. Occasionally Playwright would time out or click the wrong element. The agent's retry strategy handled most of these, but sometimes I'd need to intervene.
Why This Matters
Here's what I keep coming back to: this isn't about replacing analysts. It's about eliminating the toil so analysts can focus on actual analysis.
Building the infrastructure—the connections, the models, the dashboards—that's grunt work. An AI agent can do it faster and more consistently than I can.
The human value is in:
- Defining what metrics actually mean for the business
- Deciding which questions are worth answering
- Interpreting results and recommending actions
That's where I want to spend my time. Not debugging why two dashboards show different revenue numbers.
The Playbook: How to Replicate This
If you want to try this approach:
1. Start with a reporting inventory. What decisions do your dashboards support? What are the must-have metrics?
2. List your data sources. For each: what's the grain, what are the core entities, what are the keys for joining?
3. Set up a semantic layer. Cube is my recommendation, but dbt's semantic layer or Looker's modeling layer work too. The key is: define metrics once, in code.
4. Connect your BI tool. Metabase is great for speed. Tableau or Looker work if you're already invested there.
5. Use an AI agent for the last mile. Claude Code + Playwright can actually build the dashboards, not just generate SQL.
6. Validate against known-good numbers. Always reconcile back to source systems.
7. Document metric definitions. Treat them like product requirements. When someone asks "what is churn rate?"—there should be one canonical answer.
FAQ
Q: Does this work with Looker instead of Metabase?
Yes, though the Playwright automation is more complex because Looker's UI is less automation-friendly. I've had better results with Metabase for agent-driven builds.
Q: What about dbt instead of Cube?
dbt is great for transformation logic, but its metrics capabilities are newer (the dbt Semantic Layer is catching up). You could use dbt for transforms and Cube for the metrics layer.
Q: How do I handle sensitive data?
Use read-only database credentials. Keep secrets in environment variables, never in prompts. The agent doesn't need access to raw PII—just schema information and aggregate results.
Q: What if my data sources don't have good keys for joining?
This is a data quality problem that no tool can fully solve. But the agent can help identify where the gaps are and propose solutions (fuzzy matching on names, creating lookup tables, etc.).
Q: Can non-technical people use this approach?
They can use the resulting dashboards, absolutely. Setting up the semantic layer and orchestrating the agent still requires technical comfort—but it's a one-time setup cost, not ongoing maintenance.
Ready to Build Your Data Pipeline?
If you're spending days on reporting infrastructure that should take hours, there's a better way. The combination of semantic layers and AI agents is a force multiplier for anyone doing analytics work.
We've implemented this stack for several clients now—from early-stage startups to established businesses with complex data landscapes. The approach scales, and the ROI is immediate.
Check out our data automation services or explore our comparison of automation tools if you want to see how this fits into a broader automation strategy.
That's all I've got for now. Until next time.