15 posts tagged with "claude-code"

OpenSpec + Harness, Then We Added Engineers: What Breaks When Individual AI Acceleration Hits the Team

· 12 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

In "From Cloud Native Apps to AI Native Agent Platforms: The Belts Are the Problem," I used the factory electrification story to make an argument about AI platform adoption: factory owners in the 1890s replaced steam engines with electric motors and kept the same belts, shafts, and building layouts. For thirty years, productivity barely moved. The breakthrough came when they reorganized the factory around the new technology: workflow-first, not power-transmission-first.

That post argued at the platform layer: the decisions organizations make about how to architect and run AI-native applications. The electrification analogy there was about keeping the wrong infrastructure assumptions while adopting new technology.

This post is one layer down — the development lifecycle itself. What happens to a team's coordination model when the implementation loop accelerates by an order of magnitude? The same pattern applies: if the team keeps the existing process assumptions while individual engineers adopt AI-accelerated workflows, the system neutralizes the gain.

With OpenSpec + Superpowers + Harness, I've run enough iterations to say the individual story is real. Features that used to take 2-3 days take hours. The workflow knows what done means. I'm not watching in between.

Then someone on the team wanted to use the same workflow. That's when I found out where the bottleneck had moved.

Stacking OpenSpec and Superpowers, Then I Added a Harness: The Workflow That Knows What Done Means

· 11 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

Anthropic's engineering team published a post on harness design for long-running apps. One line stopped me:

"Generators self-assess poorly — confident praise even for mediocre output."

I've been running OpenSpec's apply phase with a code review skill baked in. It runs in the same agent context as the implementation. The reviewer and the implementer are the same session. I read that line and immediately knew: that's my problem. The reviewer runs inside the same context that just wrote the code. The confidence is real. The bias is invisible.

That was the crack. This post is the fix.

Stacking OpenSpec and Superpowers, Three Weeks Later: Five Frictions and a Plugin

· 20 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

I published Stacking OpenSpec and Superpowers on April 19th — three weeks ago. The headline number was concrete: a refactor shipped in three hours, eighty-six new tests, zero regressions. I meant every word of it. At that time, that stack — OpenSpec's propose/apply/archive triplet (the framework also offers explore, but I'd been skipping it) over the three Superpowers skills I lean on most (brainstorming, test-driven-development, requesting-code-review) — was the best discipline I'd found for spec-driven development with AI in the loop.

What I didn't say — because I didn't know yet — is that the stack had five hidden cracks. They didn't show on a single refactor. They showed up the third, fourth, fifth time I ran it on a new project, when the workflow had to carry weight rather than win a sprint.

This post is the evolution. It's what I'd add to that earlier post if I had to write it again today, after running the stack across a couple more projects — most cleanly on python-agent, where I deliberately let each friction point speak for itself before designing the fix.

I'll spend the first half listing the five friction points (where the earlier stack hurt) and the second half on what I did about them (a three-step methodology abstraction plus four command-level fixes). Then I'll show the four-command, four-phase shape I ended up with, why this is still agile in the form you'd recognize, and the plugin I packaged so you can run the whole thing without re-reading either post.

If you haven't read the earlier post, the original entry point is there. This post assumes OpenSpec's four-command base (explore/propose/apply/archive) and the three Superpowers skills I just mentioned (brainstorming, test-driven-development, requesting-code-review).

One thing I want to flag up front: nothing in the evolution is a clever invention. Each fix was a response to a specific moment where the earlier stack produced output I didn't trust. The pattern that emerged is practice → friction → fix → harden. Methodology evolves by being run, not by being designed. The plugin at the end is the byproduct, not the goal.

The Investment Operating System: How I Use AI to Manage My Portfolio Systematically

· 20 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is part 3 of the AI Wealth Management series, exploring how to use Claude Code and LLM Wiki for personal investing.

Most investors I know operate in reaction mode. A friend texts about a hot stock. An earnings beat hits the news. A pundit on X says the sector is turning. Something triggers a buy or sell, and the decision feels analytical — but it's really just noise with a story attached.

What changed my investing wasn't reading more research or finding better sources. It was building a system. Specifically, it was treating my investment process the way a software engineer treats a production system: with defined inputs, explicit logic, observable state, and predictable outputs.

Building AI Agent: From Complex Claude Skills to Production-Grade AI Agents

· 15 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is part 2 of the AI Wealth Management series, exploring how to use Claude Code and LLM Wiki for personal investing.

This post is for developers building AI systems. Specifically: how to develop complex, multi-step Claude Code commands and compose them into workflows — and what it actually takes to turn a working personal tool into a production-grade agent.

The domain is stock investing, but the patterns apply broadly. This post stands on its own — if you want background on what Claude Code is, how LLM Wiki works, and how to build a knowledge base from scratch, Part 1 covers that, but it's not a prerequisite here.

Building Your Personal Finance Knowledge Base with Claude Code

· 9 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is part 1 of the AI Wealth Management series, exploring how to use Claude Code and LLM Wiki for personal investing.

Most people with real expertise — in finance, law, immigration, or anything else — carry it around in their heads, where it helps no one and can't compound. This post is about changing that: building a structured knowledge base that AI can reason over directly, and that can eventually serve others or generate income.

I'll use personal finance as the example domain. North American finance is genuinely complex (401(k), Roth IRA, HSA, the wash sale rule, FBAR, cross-border compliance) and worth systematizing. But the method here is domain-agnostic. The same approach works for any field where you have accumulated expertise.

Stacking OpenSpec and Superpowers: A Combined SDD Workflow

· 11 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is a follow-up to "From Vibe Coding to Spec-Driven Development." That post documented introducing OpenSpec into an existing Finance project. This one covers a new project where I stacked OpenSpec with Superpowers from day one.

After three months of running OpenSpec on my Finance project, I'd formed a clear picture of what it's good at and where it struggles. On a personal wiki project I'd also been using Superpowers, and its brainstorming, TDD, and code-review skills were landing real hits.

So I started a new project — a UTR-based tennis team lineup app (tennis-lineup) — specifically to run both tools together and see how they compose. This post is the report.

[7/6] Claude Code: From Vibe Coding to Spec-Driven Development

· 13 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is an extended chapter to the 6-part Claude Code series. The first six chapters documented building a full-stack Finance app using Vibe Coding. This chapter covers what came next.

Those chapters covered the complete journey of using Claude Code for Vibe Coding: building a full-stack application from scratch and accumulating 40,000 lines of code. Vibe Coding delivered incredible speed, but as the project grew, a structural problem emerged:

AI writes code fast. AI also goes off-track fast.

When you describe a requirement in one sentence, AI might understand 70% of it and then sprint full-speed in that direction for two hours — only for you to realize the core logic is wrong and have to start over.

This isn't theoretical. Before adopting SDD, my real pain points in the Finance project were:

  • Unstructured workflow: I had to remind AI to organize requirements before writing code, otherwise it jumped straight to implementation
  • Missing design documentation: architectural issues only surfaced after implementation, making course corrections expensive
  • Inconsistent code quality: the same requirement could produce wildly different results across sessions
  • Tests routinely skipped: Vibe Coding tends toward "get it running first," making tests optional
  • Slow debugging: without clear task boundaries, bugs were hard to locate and back-and-forth with AI was inefficient

This chapter documents a methodology upgrade experiment: introducing Spec-Driven Development (SDD) into the Finance project using OpenSpec, completing three new features, and comparing results against prior Vibe Coding work.