Peter Pang, Co-Founder and CTO of CREAO – Interview Series

Peter Pang, Co-Founder and CTO of CREAO, brings a deep background in artificial intelligence research and machine learning infrastructure to the emerging agentic AI space. Before launching CREAO in 2025, Pang spent nearly six years at Meta as a Research Scientist working on generative AI initiatives tied to the LLaMA foundation models, including LLM agentic systems for data annotation and synthetic data generation. Prior to Meta, he worked at Apple as a Machine Learning Engineer focused on NLP, transfer learning, deep learning, and predictive systems, while his earlier years at Brookhaven National Laboratory exposed him to advanced scientific research and experimental systems. His experience across frontier AI research, large-scale machine learning, and autonomous systems now shapes CREAO’s technical vision of building AI agents capable of persistent execution, memory, and workflow automation.
CREAO is an AI-native platform focused on helping individuals and businesses create autonomous AI agents and workflows through natural language rather than traditional coding. The company positions its “Super Agent” as a persistent AI system capable of not only generating outputs, but also converting successful tasks into reusable agents that can operate continuously with memory, scheduling, and integrations across external tools and APIs. CREAO’s platform supports major frontier AI models including OpenAI, Anthropic, Google, and others, while emphasizing an AI-first operating model where agents increasingly handle operational execution instead of serving as one-time assistants. Founded in Palo Alto, the company is focused on advancing agentic AI workflows, no-code automation, and scalable AI orchestration for businesses and independent professionals.
You spent nearly six years at Meta working on LLaMA foundation models and agentic systems for synthetic data generation after earlier machine learning roles at Apple and research work at Brookhaven National Laboratory. What convinced you that it was the right moment to co-found Creao AI, and what problem did you feel existing AI platforms were still failing to solve?
At Meta, I watched model capability improve dramatically with every generation. I worked on scaling LLaMA. I saw what these models could do in controlled settings. The gap that kept widening wasn’t in what the models could accomplish — it was in what people actually got out of them.
Every platform gave you the same interface: a chat window. You ask, it answers, you do the work. The AI is a contractor who forgets everything between shifts. You run it again tomorrow. The system learns nothing between runs. Your business stays the same.
That was the problem. Not model intelligence. Execution. Nobody was building the system where AI could take an outcome all the way from conversation to execution, persist it, and repeat it without a human in the loop.
My background in physics taught me to think in systems, not components. At Brookhaven and through my PhD at Stony Brook, I spent years building experimental pipelines where the instrument, the data collection, and the analysis had to work as a closed loop or the results meant nothing. At Apple, I worked on multimodal models and saw the same pattern: raw capability without a system around it doesn’t compound.
When Kai and I started talking about CREAO, the thesis was specific: the industry doesn’t need another chatbot. It needs a closed loop — AI that builds tools, runs them, and gets better over time. The models were finally good enough. Nobody was building the loop.
CREAO describes itself as a platform where AI can both build tools and autonomously execute work. How do you define the difference between a true “closed-loop” AI system and the wave of AI copilots currently flooding the market?
A copilot sits beside you. You drive, it suggests. If you go to sleep, the copilot does nothing.
A closed-loop system completes the full cycle: observe, act, learn, improve. The human defines the outcome. The system handles execution, persistence, and iteration.
At CREAO, we think about this as a harness system. A single agent can draft an email, triage a bug, or summarize a report. But a single agent doesn’t compound. It runs, produces output, and stops. A harness system wraps around agents and turns them from one-shot tools into a self-improving engine. It follows a loop: Connection, Auditing, Solutions, Build, Self-Improve. The system plugs into your real data sources — your repos, your error logs, your ad accounts. It assesses current state. It proposes fixes. Agents get built to execute those fixes. The outputs feed back in. A new round of auditing begins. The loop tightens.
That’s structurally different from “I have an agent that does X.” This is a system that discovers what X should be, builds X, measures X, and evolves X without you telling it to.
Most copilots are AI-assisted. A closed loop is AI-operated. That’s the dividing line.
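To make the loop described above concrete, here is a minimal sketch in Python. It is not CREAO's implementation. The names `HarnessState`, `audit`, `propose`, `build`, and `run_harness` are hypothetical, the "data source" is just an in-memory list of issue names, and each built agent trivially resolves the one issue it was built for. The only point is the shape: audit the current state, propose fixes, build agents, let their output change the state, then audit again.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class HarnessState:
    """Hypothetical stand-in for the connected data sources (repos, logs, accounts)."""
    open_issues: list[str]
    history: list[str] = field(default_factory=list)


def audit(state: HarnessState) -> list[str]:
    """Auditing: assess the current state. Here, simply list the open issues."""
    return sorted(state.open_issues)


def propose(findings: list[str]) -> dict[str, str]:
    """Solutions: map each finding to a proposed fix (a placeholder label)."""
    return {issue: f"fix:{issue}" for issue in findings}


def build(solutions: dict[str, str]) -> list[Callable[[HarnessState], None]]:
    """Build: create one agent per proposed fix."""
    agents = []
    for issue, fix in solutions.items():
        # Default arguments bind the current issue and fix to this agent.
        def agent(state: HarnessState, issue: str = issue, fix: str = fix) -> None:
            state.open_issues.remove(issue)
            state.history.append(fix)
        agents.append(agent)
    return agents


def run_cycle(state: HarnessState) -> int:
    """One pass of audit, propose, build, execute. Returns the number of findings."""
    findings = audit(state)
    for agent in build(propose(findings)):
        agent(state)  # the agent's output changes the state the next audit sees
    return len(findings)


def run_harness(state: HarnessState, max_cycles: int = 5) -> int:
    """Self-improve: repeat cycles until an audit finds nothing or the cap is hit."""
    cycles = 0
    while cycles < max_cycles and run_cycle(state) > 0:
        cycles += 1
    return cycles
```

Calling `run_harness(HarnessState(open_issues=["slow-endpoint", "flaky-test"]))` runs one productive cycle and then a second audit that finds nothing, which is the "outputs feed back in" step in miniature.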
Your recent post on “harness engineering” and AI-first organizations generated massive attention online. What surprised you most about the reaction to the idea that 99% of production code at CREAO is now AI-written?
The post hit 1.8 million views. I wrote it as documentation for our team — a record of how we restructured and what we learned. I didn’t expect that kind of reach.
What surprised me was the split in reactions. One camp said this was irresponsible, that you can’t trust AI-generated code in production. The other camp said: “We’re seeing the exact same thing and nobody’s talking about it.”
That second group told me something important. There are teams all over the industry quietly going through this transition. A reporter I spoke with told me she’d talked to about five people on this topic. She said we were further along than anyone: “I don’t think anyone’s just totally rebuilt their entire workflow the way you have.”
The 99% number sounds provocative, but it’s just the outcome of removing human bottlenecks one by one. We unified everything into a single monorepo specifically so agents could see the full codebase. Our CI/CD pipeline has six deterministic phases. Three parallel AI review passes catch quality, security, and dependency issues on every PR. A self-healing loop detects errors, triages them, and verifies fixes automatically. With all of that in place, the human role shifts from writing code to designing the system that governs how agents write code. The code volume follows naturally.
What surprised me most is that more CTOs aren’t saying this publicly. I think many are doing it but are afraid of the perception.
In your article, you argue that most companies are still “AI-assisted” rather than truly “AI-first.” What are the biggest misconceptions leadership teams have when they believe they are already operating as AI-native organizations?
I see teams claim AI-first while running the same sprint cycles, the same Jira boards, the same weekly standups, the same QA sign-offs. They added AI to the loop. They didn’t redesign the loop.
The biggest misconception: “We use AI tools, therefore we’re AI-first.” An engineer opens Cursor. A PM drafts specs with ChatGPT. QA experiments with AI test generation. The workflow stays the same. Efficiency goes up 10 to 20 percent. Nothing structurally changes. That is AI-assisted.
Here’s the test: if you removed every AI tool tomorrow, would your process need to change, or just your tools? If the process stays the same, you’re not AI-first.
The second misconception is that AI-first is an engineering decision. If engineering ships features in hours but marketing takes a week to announce them, marketing is the bottleneck. If the product team still runs a monthly planning cycle, planning is the bottleneck. At CREAO, we pushed AI-native operations into every function: release notes generated from changelogs, feature intro videos created by AI, social media orchestrated and auto-published, health reports generated from production databases. Engineering, product, marketing, and growth run in one AI-native workflow. If one function operates at agent speed and another at human speed, the human-speed function constrains everything.
The third misconception is that this transition can be gradual. A common version of this is what people call vibe coding. Open Cursor, prompt until something works, commit, repeat. That produces prototypes. A production system needs stability, reliability, and security. You need a system that guarantees those properties when AI writes the code. You build the system. The prompts are disposable.
You wrote that the real breakthrough came when CREAO redesigned its entire engineering workflow around AI agents instead of simply adding AI tools into existing processes. Which part of that transition was the most difficult operationally or culturally for the team?
Both, but they’re different kinds of hard.
Operationally, the most difficult decision was unifying the codebase. Our old architecture was scattered across multiple independent systems. A single change might require touching three or four repositories. From a human engineer’s perspective, manageable. From an AI agent’s perspective, opaque. The agent can’t see the full picture. It can’t reason about cross-service implications. It can’t run integration tests locally.
I spent one week designing the new system and another week re-architecting the entire codebase using agents. That was a bold call. If it failed, we’d have a broken monorepo and a broken multi-repo setup. But the principle is clear: the more of your system you pull into a form the agent can inspect, validate, and modify, the more leverage you get. A fragmented codebase is invisible to agents. A unified one is legible.
Culturally, the hardest part was identity. Engineers find their value in the code they write. When AI writes 99% of the code, the question becomes: what is my job now? That’s not a process question. That’s existential.
I won’t pretend everyone was happy. When I stopped talking to people every day because my management time dropped from 60% to under 10%, some team members felt uncertain. What does the CTO not talking to me mean? What is my value in this new world? Those are reasonable concerns. Some people spend more time debating whether AI can do their work than doing the work.
But once people experienced the new workflow — where their job shifted from typing code to designing systems, defining SOPs, and building feedback loops — most found it more intellectually engaging. The engineers who struggled most were the ones whose identity was most tightly coupled to the act of writing code. The ones who adapted fastest saw themselves as problem-solvers first and coders second.
One of the more controversial observations in your piece was that junior engineers adapted faster than senior engineers in this new environment. Why do you think adaptability is becoming more important than accumulated technical experience in AI-first engineering teams?
I noticed a pattern I didn’t expect. Junior engineers with less traditional practice felt empowered. They had access to tools that amplified their impact. They didn’t carry a decade of habits to unlearn.
Senior engineers with strong traditional practice had the hardest time. Two months of their work could be completed in one hour by AI. That is a hard thing to accept after years of building a rare skill set.
The reason is structural. A senior engineer’s expertise lives in the mechanical execution of code — navigating complex systems, writing optimized implementations, conducting thorough reviews. Those skills were earned over years and they’re real. But in an AI-first environment, the mechanical skill of writing code is the part that’s being automated. What remains — and becomes more valuable — is the ability to evaluate, criticize, and direct AI.
I have a PhD in physics. The most useful thing my PhD taught me was how to question assumptions, stress-test arguments, and look for what’s missing. The ability to criticize AI is more valuable than the ability to produce code. Can you look at an architecture proposal and see the failure mode the agent missed? Can you look at a generated UI and know it’s wrong before the user tells you?
Junior engineers never had deep mechanical expertise to let go of. They just learned the new thing.
I’m not making a judgment. I’m describing what I observed. Seniority is still an advantage — the deep architectural thinking, the system design intuition — but only if the senior engineer is willing to operate at a different altitude. In this transition, adaptability matters more than accumulated skill.
CREAO rebuilt its infrastructure around monorepos, automated CI/CD pipelines, AI review systems, and self-healing loops that integrate tools like CloudWatch, Sentry, Linear, and Claude. How close are we to a future where software systems largely maintain and repair themselves?
We run a self-healing loop in production right now. Let me describe what it actually does.
Every morning at 9:00 AM UTC, an automated health workflow runs. Claude Sonnet queries CloudWatch, analyzes error patterns across all services, and generates an executive health summary delivered to the team. Nobody asked for it. One hour later, the triage engine clusters production errors from CloudWatch and Sentry, scores each cluster across nine severity dimensions — user impact, velocity, blast radius, business criticality, and five others — and auto-generates investigation tickets in Linear with sample logs, affected endpoints, and suggested investigation paths.
When an engineer pushes a fix, the same pipeline handles it. Three Claude review passes evaluate the PR — code quality, security, and dependency scanning. CI validates through a six-phase pipeline. After deployment, the triage engine re-checks. If the original errors are resolved, the ticket auto-closes.
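The triage step described above (cluster errors, score each cluster, open tickets for the serious ones) can be sketched roughly as follows. This is an illustration under stated assumptions, not CREAO's triage engine: it uses only the four severity dimensions named in the interview, weights them equally, clusters by a simple `(service, error_type)` key, and returns plain dictionaries instead of calling any real CloudWatch, Sentry, or Linear API. All names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Only the four dimensions named in the interview. Equal weighting is an assumption.
DIMENSIONS = ("user_impact", "velocity", "blast_radius", "business_criticality")


@dataclass
class ErrorEvent:
    service: str
    error_type: str
    scores: dict[str, float]  # one value in [0, 1] per dimension


def cluster(events: list[ErrorEvent]) -> dict[tuple[str, str], list[ErrorEvent]]:
    """Group events that share a service and error type."""
    clusters: dict[tuple[str, str], list[ErrorEvent]] = defaultdict(list)
    for event in events:
        clusters[(event.service, event.error_type)].append(event)
    return dict(clusters)


def severity(members: list[ErrorEvent]) -> float:
    """Average the per-event mean across all dimensions."""
    per_event = [
        sum(e.scores[d] for d in DIMENSIONS) / len(DIMENSIONS) for e in members
    ]
    return sum(per_event) / len(per_event)


def triage(events: list[ErrorEvent], threshold: float = 0.5) -> list[dict]:
    """Return ticket drafts for clusters at or above the threshold, worst first."""
    tickets = []
    for (service, error_type), members in cluster(events).items():
        score = severity(members)
        if score >= threshold:
            tickets.append({
                "title": f"[{service}] {error_type}",
                "severity": score,
                "samples": len(members),
            })
    return sorted(tickets, key=lambda t: t["severity"], reverse=True)
```

Re-running `triage` after a deployment and checking whether a cluster's ticket still appears is one simple way to model the "re-check and auto-close" step.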
On top of that, we built what we call the Agent Harness. A tri-judge grading panel — one Anthropic judge, one OpenAI judge, one Google judge — scores every live agent response. When scores drop, a six-job engineering pipeline turns those low scores into Linear tickets, draft PRs, and verified fixes. And for major changes, AI-gated grey rollouts route 10% of traffic to the new variant and promote through 20%, 50%, 100% only if scores hold.
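A score-gated rollout of this kind might look like the sketch below. The traffic stages (10, 20, 50, 100) come from the interview. Everything else is an assumption made for illustration: the three judge scores are combined with a plain mean, "scores hold" is read as "the panel score stays at or above a baseline", and a failed gate simply stops promotion at the last stage that passed. The function names are hypothetical.

```python
from statistics import mean

# Traffic percentages for the staged rollout, as described in the interview.
STAGES = (10, 20, 50, 100)


def panel_score(judge_scores: list[float]) -> float:
    """Combine three judge scores into one. A plain mean is an assumption."""
    if len(judge_scores) != 3:
        raise ValueError("expected exactly three judge scores")
    return mean(judge_scores)


def rollout(scores_by_stage: dict[int, list[float]], baseline: float) -> int:
    """Walk the stages in order and return the last percentage whose scores held.

    Returns 0 if the very first stage fails the gate.
    """
    reached = 0
    for stage in STAGES:
        if panel_score(scores_by_stage[stage]) < baseline:
            return reached  # gate failed: stop promoting
        reached = stage
    return reached
```

For example, with a baseline of 3.5 on a 1 to 5 scale, a variant whose panel score drops to about 2.3 at the 50% stage would stop at 20% rather than reach full traffic.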
We don’t have a QA team. We don’t have a staging environment. No one reads transcripts and scores agent replies by hand.
So how close are we? Closer than most people think for constrained domains — known error patterns, regressions, configuration drift. I’d say 70 to 80 percent of production maintenance can run this way with the right infrastructure. The part that still needs humans is ambiguous failures where the architecture itself might be wrong, or where the fix requires understanding a business decision not encoded in the system.
The practical takeaway: you don’t need to wait for perfection. Even partial self-healing dramatically changes team dynamics.
You’ve described a future where “one-person companies” could become common as AI agents replace large operational teams. Which types of companies or industries do you think will feel this shift first?
Companies where the ratio of operational overhead to core value creation is highest.
I believe one-person companies will become common. If one architect with agents can do the work of 100 people, many companies won’t need a second employee. Model capability is the clock driving this. I attribute the entire shift at CREAO to the last two months. Opus 4.5 couldn’t do what Opus 4.6 does. Next-gen models will accelerate it further.
The shift hits first wherever the work is digital, repeatable, and the feedback loops are clear. Content production, marketing operations, e-commerce, developer tooling, digital agencies. If your business is fundamentally about taking information, transforming it, and distributing it, a harness system can handle most of that loop.
We see it on our own platform. One use case we describe in our work: a solo founder connects GitHub, Sentry, and CloudWatch. The system audits error logs, deployment frequency, and infrastructure health. From that audit, it builds a bug triage agent, a feature discovery agent, a content generation pipeline, and an infrastructure optimization loop. Each agent runs and produces output, and that output changes the state. Next cycle, the audit sees a different picture. The system finds the new solutions. The founder doesn’t plan them.
At CREAO, we run our own operations this way. One agent replaced a three-person SEO workflow overnight. Another ran a content pipeline for two days before anyone checked the output. The output was garbage, and we killed it. Both happened in the same week. We call ourselves crash test dummies for the future of work.
Industries that feel it last: healthcare, heavy manufacturing, legal — anywhere the work is physical, heavily regulated, or requires human judgment in every decision. But even there, the administrative layers will compress.
CREAO recently raised $10 million led by Prosperity7 Ventures, the venture arm of Saudi Aramco. Beyond the funding itself, what strategic opportunities does this partnership open up for the company as autonomous AI agents become more enterprise-focused?
Prosperity7 manages a $3 billion diversified fund and operates across Silicon Valley, the Middle East, and Asia. They’ve built a portfolio spanning enterprise AI infrastructure — Arcee AI, Spirit AI, and others. The strategic fit goes beyond capital.
Their thesis matched ours. As Raed Twaily, their Executive Managing Director, put it publicly: “As AI adoption matures, the focus is shifting from models to execution. Creao AI is building the infrastructure layer that allows agents to operate autonomously, reliably, and continuously.” That’s exactly what we believe. The AI agents market is projected to hit $52 billion by 2030. The companies that capture real value won’t be the ones with the best model. They’ll be the ones with the best execution layer.
Practically, the $10 million — bringing our total to $25 million across three rounds in under a year — accelerates three things. First, engineering depth. We’re expanding the team to handle enterprise-grade integrations and agent-to-agent collaboration. The self-healing harness, the tri-judge grading panel, the AI-gated rollout system — these need to work at enterprise scale, not just for a 25-person team. Second, geographic reach. Prosperity7’s presence in the Middle East and Asia opens markets where autonomous AI adoption is accelerating faster than most people in Silicon Valley realize. Third, institutional credibility. When you’re asking enterprises to trust agents running their operations continuously, the backing matters.
We raised $25 million in under a year with zero paid marketing and 200,000 users through organic adoption. That speed reflects the market’s conviction that the execution layer is the next battleground.
Looking ahead, what does success look like for CREAO over the next few years? Is the long-term vision to build better AI agents, or are you ultimately trying to redefine how companies themselves are structured and operated?
Both. And they’re inseparable.
Near term, success means proving the closed-loop model works at scale. We have 200,000 users who arrived through organic adoption since our launch in September 2025. We ship to production three to eight times a day. We run our own engineering, marketing, and operations on the harness. The next step is demonstrating that businesses of all sizes can run the same way.
But better agents aren’t the endgame. The endgame is the harness — the system around the agents.
An agent is a tool. A harness system is a flywheel. The more it runs, the tighter it gets. The more data it sees, the better its audits. The better its audits, the sharper its solutions. The sharper its solutions, the more impactful its agents. The more impactful its agents, the more the data changes. And the loop continues.
The 20th-century model of a company is a hierarchy of humans organized into departments, each with managers, each with processes designed for human throughput. When agents can reliably execute most operational work, that model doesn’t just get more efficient — it starts to become unnecessary.
Everyone knows AI promises an explosion of productivity. But the industry is stuck in two traps. If humans still operate AI tools step-by-step, productivity hits a ceiling. And if humans are still the only ones building the tools, the real revolution hasn’t started. We’re building the system where AI does both: builds the tools and runs them.
Most founders I talk to still operate the traditional way. Some think about making the shift. Very few have done it. The tools exist for any team to do this. Nothing in our stack is proprietary. The competitive advantage is the decision to redesign everything around these tools, and the willingness to absorb the cost.
We built an agent platform. The agents rebuilt it. That’s the design principle. The harness tightens day by day.
Thank you for the great interview. Readers who wish to learn more should visit CREAO.












