
An AI Agent Built My Data Pipeline in 2 Hours
How I used Claude Code to connect data sources, build a semantic model, and create dashboards—turning a week-long project into a single afternoon.
I used to spend days building out client reporting systems. Connect the data sources, figure out the joins, set up the BI tool, build the dashboards, realize the numbers don't match, debug everything, repeat. It was a slog.
Last month, I built an entire data and reporting pipeline—from raw data sources to finished dashboards—in about two hours. And I didn't write a single line of SQL myself.
The secret? A Claude Code agent orchestrating the whole thing, combined with a modern semantic layer that makes the AI's job dramatically easier.
Let's jump in.
The Problem: Brittle Reporting That Breaks Constantly
If you've ever managed reporting for a business, you know the pain:
- Scattered logic: Joins and calculations live inside individual Looker blends or dashboard queries. Change one thing, break three others.
- Numbers that don't match: The revenue in Dashboard A doesn't match Dashboard B. Spend an afternoon figuring out why.
- Manual everything: Building a new report means recreating the same joins, the same filters, the same calculations. Every. Single. Time.
- No scalability: Each client becomes a bespoke snowflake. What worked for Client A doesn't transfer to Client B.
I've seen analytics engineers spend 40% of their time just maintaining dashboards. Not building new insights—just keeping the lights on.
The core issue? Most BI setups put the logic in the wrong place. The dashboards become the source of truth, when really the data model should be.
The Solution: Model-First Reporting + AI Orchestration
Here's the insight that changed everything for me: once you have a proper semantic layer—a central place where metrics, dimensions, and joins are defined once—an AI agent can do the rest.
Why? Because the agent isn't inventing logic each time. It's working with a consistent, documented data model. The definitions are already there. The AI just needs to assemble them into dashboards.
The stack I landed on:
- Claude Code: The orchestrator—plans, writes code, connects things, iterates
- Cube: The semantic layer—central definitions for metrics and dimensions
- Metabase: The BI tool—fast dashboarding, approachable for stakeholders
- Playwright: UI automation—creates dashboards through the Metabase interface
This combination gives you enterprise-grade reporting without the enterprise reporting team.
The Stack: Why These Tools?
Cube: The Semantic Layer
Cube is an open-source semantic layer that sits between your data sources and your BI tools. You define your metrics and dimensions once, in code, and every downstream tool uses those definitions.
The power here is consistency. "Revenue" means the same thing everywhere. "Customer count" uses the same logic whether you're in Metabase, building an embedded dashboard, or querying via API.
Cube supports all major data warehouses—Snowflake, BigQuery, Postgres, Databricks—and exposes your semantic model via SQL, REST, and GraphQL. This is what makes it AI-friendly: Claude Code can introspect the model, understand the relationships, and generate queries that actually work.
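To make this concrete, here's a minimal sketch of the kind of query a client (or an agent) sends to Cube's REST API. The `/cubejs-api/v1/load` endpoint and the query shape (measures, dimensions, `timeDimensions`) follow Cube's documented REST API; the cube and member names like `subscriptions.mrr` are hypothetical stand-ins, not a real model.

```python
import json

def build_cube_query(measures, dimensions=None, time_dimension=None, granularity=None):
    """Assemble a Cube REST API query payload from measure/dimension names."""
    query = {"measures": measures}
    if dimensions:
        query["dimensions"] = dimensions
    if time_dimension and granularity:
        query["timeDimensions"] = [{
            "dimension": time_dimension,      # e.g. a time dimension on the cube
            "granularity": granularity,        # day / week / month / year
            "dateRange": "last 12 months",
        }]
    return query

# Hypothetical member names for illustration only.
payload = build_cube_query(
    measures=["subscriptions.mrr"],
    time_dimension="subscriptions.signup_date",
    granularity="month",
)
print(json.dumps(payload, indent=2))
# A client would POST this as the `query` parameter to
# http://localhost:4000/cubejs-api/v1/load on a running Cube instance.
```

Because the payload is plain JSON describing *named* metrics rather than raw SQL, an agent can generate it reliably from the model's metadata.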
Gartner's 2025 guidance explicitly identified semantic technology as "non-negotiable for AI success." I'm seeing that play out firsthand.
Metabase: The BI Layer
Metabase is an open-source BI tool that's genuinely easy to use. Non-technical stakeholders can build their own queries. Technical users can drop into SQL when needed.
The self-hosted version is completely free. Just run `docker run -d -p 3000:3000 metabase/metabase` and you're up. For managed hosting, plans start at $500/month.
The reason I picked Metabase over alternatives: it's simple enough that an AI agent can navigate the UI via Playwright, and powerful enough for real business analytics.
Playwright: The UI Automation Layer
Here's where things get interesting. Claude Code doesn't just generate SQL or config files—it actually drives the Metabase UI to create dashboards.
Using Playwright (the browser automation framework), the agent can:
- Navigate to Metabase
- Create new questions using the query builder
- Configure filters and visualizations
- Build dashboards and arrange charts
- Set up date filters, pivot tables, and drill-downs
This closes the "last mile" gap. You're not stopping at generated code—you're shipping dashboards that stakeholders can actually use.
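To give a flavor of what that automation looks like, here's a minimal sketch of the kind of routine the agent generates. The selectors and URL path are hypothetical placeholders (real Metabase selectors differ, and the agent discovers them by inspecting the live page); the function takes a Playwright `Page`-like object so the flow itself is easy to test.

```python
# Sketch only: selectors and the /question/new path are assumptions, not
# Metabase's actual UI contract. `page` is a Playwright sync-API Page, or
# any object exposing the same goto/click methods.

def create_mrr_question(page, base_url="http://localhost:3000"):
    """Drive the query builder: open a new question, pick a measure, chart it, save."""
    page.goto(f"{base_url}/question/new")   # open the query builder
    page.click("text=MRR")                  # pick the measure (hypothetical selector)
    page.click("text=Line chart")           # set the visualization
    page.click("text=Save")                 # save the question
```

Keeping the browser object injectable is also how the agent recovers from flaky UI runs: the same function can be retried against a fresh page without rebuilding the whole session.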
What the Agent Actually Did: Step by Step
Let me walk through the actual build process. This was for a client with data in Stripe, HubSpot, and a Postgres application database.
Step 1: Inventory and Connect Data Sources
I started by describing the situation to Claude Code:
"I have three data sources: Stripe for payments, HubSpot for CRM data, and a Postgres database with application data. The goal is consolidated reporting on revenue, customer lifecycle, and product usage."
The agent asked clarifying questions:
- What are the core entities? (Customers, subscriptions, events)
- What's the grain of each data source? (Transaction-level for Stripe, contact-level for HubSpot)
- What are the must-have metrics? (MRR, churn rate, LTV, conversion rates)
Then it provided a connection plan with:
- Read-only credentials for each source (least privilege)
- Environment variable structure for secrets
- Sanity checks to run after connecting ("Can we query? Do row counts match expectations?")
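The sanity checks were simple row-count probes. Here's a sketch of the idea, written against the generic DB-API interface so it works with any driver; I'm using `sqlite3` as a stand-in for the real warehouse driver, and the table names are illustrative.

```python
import sqlite3  # stand-in; in practice this would be psycopg2 or the warehouse driver

def sanity_check(conn, tables):
    """Return row counts per table, raising early if a table is empty or unqueryable."""
    counts = {}
    cur = conn.cursor()
    for table in tables:
        # Table names come from a vetted inventory list, never from user input.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        n = cur.fetchone()[0]
        if n == 0:
            raise ValueError(f"{table} is empty: check the connection and permissions")
        counts[table] = n
    return counts
```

Running this immediately after connecting each source catches bad credentials and empty replicas before any modeling work begins.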
Step 2: Set Up the Data Stack
The agent recommended the Cube + Metabase stack and walked through setup:
For Cube:
- Local development configuration
- Docker Compose setup for production
- Connection strings for each data source
For Metabase:
- Docker deployment command
- Initial admin setup
- Connection to Cube's SQL endpoint
This took maybe 20 minutes of guided iteration. The agent handled the config files; I just approved and ran commands.
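For reference, the production setup boils down to a compose file along these lines. This is a sketch: the image tags, ports, and environment variable names follow the two projects' documented defaults, but verify them against the current docs before deploying.

```yaml
# Minimal Cube + Metabase sketch. Secrets come from the environment, never
# from the file itself. One data source shown; Cube supports several at once.
services:
  cube:
    image: cubejs/cube:latest
    ports:
      - "4000:4000"     # REST API and Playground
      - "15432:15432"   # SQL API (Postgres wire protocol, used by Metabase)
    environment:
      - CUBEJS_DEV_MODE=true
      - CUBEJS_DB_TYPE=postgres
      - CUBEJS_DB_HOST=${DB_HOST}
      - CUBEJS_DB_USER=${DB_USER}
      - CUBEJS_DB_PASS=${DB_PASS}
      - CUBEJS_PG_SQL_PORT=15432
    volumes:
      - ./model:/cube/conf/model   # the semantic model files live here
  metabase:
    image: metabase/metabase:latest
    ports:
      - "3000:3000"
```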
Step 3: Pull Schemas and Map Entities
Once connected, the agent introspected the schemas from each data source. It identified:
- Customers (from HubSpot contacts + Stripe customers)
- Subscriptions (from Stripe)
- Transactions (from Stripe charges)
- Product events (from Postgres)
It then proposed an entity relationship map, highlighting where the joins would happen (email as the common key between HubSpot and Stripe, user_id linking Postgres events to Stripe customers).
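Before committing to email as the join key, it's worth measuring how well the key actually covers both sides. Here's a sketch of that check with illustrative record shapes; the real data obviously has more fields.

```python
def join_key_coverage(hubspot_contacts, stripe_customers, key="email"):
    """Report how many records on each side share the join key (case-insensitive)."""
    hs = {c[key].strip().lower() for c in hubspot_contacts if c.get(key)}
    st = {c[key].strip().lower() for c in stripe_customers if c.get(key)}
    overlap = hs & st
    return {
        "hubspot_only": len(hs - st),   # contacts with no matching Stripe customer
        "stripe_only": len(st - hs),    # customers with no matching contact
        "matched": len(overlap),
    }
```

If `matched` comes back low, that's a signal to fall back to a secondary key or a lookup table before building joins on top of a leaky one.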
Step 4: Build the Semantic Model in Cube
This is where the agent really earned its keep. It generated Cube data model files that defined:
Measures:
- `mrr`: Monthly recurring revenue
- `total_revenue`: Lifetime revenue per customer
- `churn_rate`: Calculated as churned customers / total customers
- `conversion_rate`: Trials converted to paid
Dimensions:
- `signup_date`, `first_payment_date`, `churn_date`
- `plan_type`, `billing_interval`
- `acquisition_channel` (from HubSpot)
- `cohort_month` (derived from signup date)
Joins:
- Explicit, documented relationships between cubes
- Handles for the tricky cases (customers with multiple subscriptions, attribution to original acquisition channel)
The agent also flagged potential issues: "Watch out for fanout on the events table—a customer can have thousands of events, so we should aggregate before joining."
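An abridged sketch of what one of those generated files looks like, in Cube's YAML model syntax. The table and column names here are illustrative placeholders, not the client's actual schema.

```yaml
# Hypothetical cube: one Stripe-backed subscriptions model with a join,
# one filtered measure, and two dimensions.
cubes:
  - name: subscriptions
    sql_table: stripe.subscriptions

    joins:
      - name: customers
        sql: "{CUBE}.customer_id = {customers}.id"
        relationship: many_to_one

    measures:
      - name: mrr
        sql: amount / 100.0        # Stripe stores amounts in cents
        type: sum
        filters:
          - sql: "{CUBE}.status = 'active'"

    dimensions:
      - name: plan_type
        sql: plan
        type: string
      - name: signup_date
        sql: created_at
        type: time
```

Because the join, the filter, and the unit conversion live here, every downstream chart inherits them for free.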
Step 5: Connect Cube to Metabase
With the semantic model in place, the agent configured the connection between Cube and Metabase:
- Set up Cube's SQL API endpoint
- Added it as a database in Metabase
- Synced the schema so all cubes appeared as tables
- Configured display names so metrics were human-readable
Step 6: Build Dashboards via Playwright
Now for the fun part. I gave the agent a screenshot of an existing Looker dashboard and said: "Recreate this in Metabase."
The agent used Playwright to:
- Navigate to Metabase
- Create a new question using the MRR measure
- Add a date filter (relative: last 12 months)
- Set the visualization to a line chart
- Save and add to a new dashboard
- Repeat for each chart in the original
For pivot tables—which can be tricky—the agent figured out the right configuration after a couple of attempts. I'd take a screenshot of the issue, paste it into the conversation, and the agent would adjust.
Step 7: Validate and Iterate
The final step was validation. The agent ran reconciliation queries:
- Total revenue in Cube vs. raw Stripe data
- Customer counts matching between HubSpot and the semantic layer
- MRR calculations verified against known-good monthly totals
We caught one discrepancy (a timezone issue on the date dimension) and fixed it in the model. After that, everything matched.
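The reconciliation logic itself is simple: compare the semantic-layer total against the raw source within a small tolerance, so rounding differences don't trigger false alarms. A sketch (the tolerance and labels are my own choices, not a standard):

```python
def reconcile(label, semantic_total, source_total, tolerance=0.005):
    """Return True if the two totals agree within `tolerance` (relative drift)."""
    if source_total == 0:
        return semantic_total == 0
    drift = abs(semantic_total - source_total) / abs(source_total)
    if drift > tolerance:
        print(f"{label}: MISMATCH ({semantic_total} vs {source_total}, drift {drift:.2%})")
        return False
    return True

# e.g. reconcile("total_revenue", cube_total, stripe_total)
```

The timezone bug we caught showed up exactly this way: a drift just over the tolerance on the daily revenue numbers, traced back to the date dimension.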
The Results
| Metric | Before (Manual) | After (AI Agent) |
|---|---|---|
| Setup time | 3-5 days | ~2 hours |
| Dashboard iteration | 2-4 hours each | 10-15 minutes |
| Metric consistency | Low (logic scattered) | High (defined once) |
| Maintainability | Fragile | Robust |
| Cross-client reusability | None | High |
The biggest win isn't even the speed—it's the quality. The semantic layer means metrics are defined once and used everywhere. No more "why don't these numbers match?" debugging sessions.
What Didn't Work Perfectly
Let me be honest about the limitations:
Ambiguous specs require clarification. If I said "show me revenue by month," the agent would ask: "Booking date or payment date? Including refunds? Which revenue types?" This is actually good—it surfaces questions that would otherwise become bugs.
Pivot tables need iteration. Complex visualizations sometimes took 2-3 attempts to get right. The agent would get close, I'd screenshot the issue, and we'd refine.
Data modeling decisions need human judgment. The agent can propose a model, but someone needs to confirm that "customer" means what we think it means, and that the join logic matches business reality.
UI automation can fail. Occasionally Playwright would time out or click the wrong element. The agent's retry strategy handled most of these, but sometimes I'd need to intervene.
Why This Matters
Here's what I keep coming back to: this isn't about replacing analysts. It's about eliminating the toil so analysts can focus on actual analysis.
Building the infrastructure—the connections, the models, the dashboards—that's grunt work. An AI agent can do it faster and more consistently than I can.
The human value is in:
- Defining what metrics actually mean for the business
- Deciding which questions are worth answering
- Interpreting results and recommending actions
That's where I want to spend my time. Not debugging why two dashboards show different revenue numbers.
The Playbook: How to Replicate This
If you want to try this approach:
1. Start with a reporting inventory. What decisions do your dashboards support? What are the must-have metrics?
2. List your data sources. For each: what's the grain, what are the core entities, what are the keys for joining?
3. Set up a semantic layer. Cube is my recommendation, but dbt's semantic layer or Looker's modeling layer work too. The key is: define metrics once, in code.
4. Connect your BI tool. Metabase is great for speed. Tableau or Looker work if you're already invested there.
5. Use an AI agent for the last mile. Claude Code + Playwright can actually build the dashboards, not just generate SQL.
6. Validate against known-good numbers. Always reconcile back to source systems.
7. Document metric definitions. Treat them like product requirements. When someone asks "what is churn rate?"—there should be one canonical answer.
FAQ
Q: Does this work with Looker instead of Metabase?
Yes, though the Playwright automation is more complex because Looker's UI is less automation-friendly. I've had better results with Metabase for agent-driven builds.
Q: What about dbt instead of Cube?
dbt is great for transformation logic, but its metrics capabilities are newer (the dbt Semantic Layer is catching up). You could use dbt for transforms and Cube for the metrics layer.
Q: How do I handle sensitive data?
Use read-only database credentials. Keep secrets in environment variables, never in prompts. The agent doesn't need access to raw PII—just schema information and aggregate results.
Q: What if my data sources don't have good keys for joining?
This is a data quality problem that no tool can fully solve. But the agent can help identify where the gaps are and propose solutions (fuzzy matching on names, creating lookup tables, etc.).
Q: Can non-technical people use this approach?
They can use the resulting dashboards, absolutely. Setting up the semantic layer and orchestrating the agent still requires technical comfort—but it's a one-time setup cost, not ongoing maintenance.
Ready to Build Your Data Pipeline?
If you're spending days on reporting infrastructure that should take hours, there's a better way. The combination of semantic layers and AI agents is a force multiplier for anyone doing analytics work.
We've implemented this stack for several clients now—from early-stage startups to established businesses with complex data landscapes. The approach scales, and the ROI is immediate.
Check out our data automation services or explore our comparison of automation tools if you want to see how this fits into a broader automation strategy.
That's all I've got for now. Until next time.