26 posts tagged with "ai"

OpenSpec + Harness, Then We Added Engineers: What Breaks When Individual AI Acceleration Hits the Team

· 12 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

Team engineers working with OpenSpec and Harness workflow

In "From Cloud Native Apps to AI Native Agent Platforms: The Belts Are the Problem," I used the factory electrification story to make an argument about AI platform adoption: factory owners in the 1890s replaced steam engines with electric motors and kept the same belts, shafts, and building layouts. For thirty years, productivity barely moved. The breakthrough came when they reorganized the factory around the new technology: workflow-first, not power-transmission-first.

That post argued at the platform layer: the decisions organizations make about how to architect and run AI-native applications. The electrification analogy there was about keeping the wrong infrastructure assumptions while adopting new technology.

This post is one layer down — the development lifecycle itself. What happens to a team's coordination model when the implementation loop accelerates by an order of magnitude? The same pattern applies: if the team keeps the existing process assumptions while individual engineers adopt AI-accelerated workflows, the system neutralizes the gain.

With OpenSpec + Superpowers + Harness, I've run enough iterations to say the individual story is real. Features that used to take 2-3 days take hours. The workflow knows what done means. I'm not watching in between.

Then someone on the team wanted to use the same workflow. That's when I found out where the bottleneck had moved.

Stacking OpenSpec and Superpowers, Then I Added a Harness: The Workflow That Knows What Done Means

· 11 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

Anthropic's engineering team published a post on harness design for long-running apps. One line stopped me:

"Generators self-assess poorly — confident praise even for mediocre output."

I've been running OpenSpec's apply phase with a code review skill baked in. It runs in the same agent context as the implementation. The reviewer and the implementer are the same session. I read that line and immediately knew: that's my problem. The reviewer runs inside the same context that just wrote the code. The confidence is real. The bias is invisible.

That was the crack. This post is the fix.
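To make the same-context problem concrete, here is a minimal sketch of the separation, not the harness from the post. The `Session` class, the `Model` callable, and `implement_and_review` are hypothetical names introduced for illustration; the only idea taken from the text is that the reviewer should not share the implementer's context.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model interface (an assumption, not a real SDK):
# it receives the full message history and returns a text reply.
Model = Callable[[List[str]], str]


@dataclass
class Session:
    """One agent context. Everything in `history` is visible to the model."""
    model: Model
    history: List[str] = field(default_factory=list)

    def ask(self, prompt: str) -> str:
        self.history.append(prompt)
        reply = self.model(self.history)
        self.history.append(reply)
        return reply


def implement_and_review(spec: str, model: Model) -> Dict[str, object]:
    # The implementer works in its own session.
    implementer = Session(model)
    code = implementer.ask(f"Implement this spec:\n{spec}")

    # The reviewer starts from an empty history. It sees the spec and the
    # resulting code, but none of the implementer's reasoning or
    # self-assessment.
    reviewer = Session(model)
    verdict = reviewer.ask(
        f"Review against the spec.\nSPEC:\n{spec}\nCODE:\n{code}"
    )
    return {
        "code": code,
        "verdict": verdict,
        "reviewer_context": list(reviewer.history),
    }
```

The design choice is only about what each session can see: a reviewer that inherits the implementer's history inherits its confidence too, while a fresh session has to judge the output on the spec alone.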

Building an Agent from Scratch: LangGraph, Qdrant, and the Gaps Between the Docs

· 15 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is part of the Agent Engineering track. "Building a Personal Finance Knowledge Base with LLM Wiki" built the knowledge base. "Building an AI Agent: From Claude Skills to Production" extended it into a production personal tool. This post is the next step: building an actual agent application, one that runs independently, serves multiple users, and doesn't require you to be sitting at a terminal.

I already had a working LLM Wiki setup. Structured Markdown files, Claude Code reading them at query time, a set of slash commands that composed into useful workflows. It worked well — for me, running it myself, from my own machine.

Three things eventually broke that model. First, I wanted my wife to be able to use it without opening a terminal. Second, the knowledge base grew past what fit cleanly in a context window. Third, I wanted a morning scan to run on a schedule rather than requiring me to trigger it manually.

None of those problems is hard to state. Each one requires a fundamentally different architecture to solve.

This post is the engineering record of what I built: python-agent, a multi-user knowledge base agent running as a self-hosted Docker application. LangGraph for orchestration, Qdrant for vector storage, Flask + Vue for the web layer. The domain is personal finance, but nothing about the architecture is finance-specific.

This is not a feel-good tutorial. The implementation decisions were right more often than wrong, but the failure modes were real — I'll describe four of them in detail. The point is not to discourage you from building agents; it's to give you a more accurate map of the terrain than the docs provide.

The Senior Engineer's AI Trap: Why Experience Works Against You

· 10 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

The engineers I worry about most aren't the juniors struggling to break in.

They're the 10-year veterans who've added AI tools to their workflow, are producing work at roughly the same pace they were 18 months ago, and have concluded that AI is "useful but overhyped."

The junior who's struggling at least knows there's a problem to solve. The senior who's plateaued often doesn't.

Stacking OpenSpec and Superpowers, Three Weeks Later: Five Frictions and a Plugin

· 20 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

I published Stacking OpenSpec and Superpowers on April 19th — three weeks ago. The headline number was concrete: a refactor shipped in three hours, eighty-six new tests, zero regressions. I meant every word of it. At that time, that stack — OpenSpec's propose/apply/archive triplet (the framework also offers explore, but I'd been skipping it) over the three Superpowers skills I lean on most (brainstorming, test-driven-development, requesting-code-review) — was the best discipline I'd found for spec-driven development with AI in the loop.

What I didn't say — because I didn't know yet — is that the stack had five hidden cracks. They didn't show on a single refactor. They showed up the third, fourth, fifth time I ran it on a new project, when the workflow had to carry weight rather than win a sprint.

This post is the evolution. It's what I'd add to that earlier post if I had to write it again today, after running the stack across a couple more projects — most cleanly on python-agent, where I deliberately let each friction point speak for itself before designing the fix.

I'll spend the first half listing the five friction points (where the earlier stack hurt) and the second half on what I did about them (a three-step methodology abstraction plus four command-level fixes). Then I'll show the four-command, four-phase shape I ended up with, why this is still agile in the form you'd recognize, and the plugin I packaged so you can run the whole thing without re-reading either post.

If you haven't read the earlier post, the original entry point is there. This post assumes OpenSpec's four-command base (explore/propose/apply/archive) and the three Superpowers skills I just mentioned (brainstorming, test-driven-development, requesting-code-review).

One thing I want to flag up front: nothing in the evolution is a clever invention. Each fix was a response to a specific moment where the earlier stack produced output I didn't trust. The pattern that emerged is practice → friction → fix → harden. Methodology evolves by being run, not by being designed. The plugin at the end is the byproduct, not the goal.

My AI Writing Roadmap: Four Tracks From Infra to Agents

· 12 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

Four luminous nodes arranged in a continuous loop on a topographic dark map — the visual thesis: four AI writing tracks, one cube.

I write about AI on this blog. Not "AI in general" — that's too big to write coherently about as a single topic. You end up writing platitudes. Instead, I write across four tracks, each anchored to a place where I've actually paid the cost of learning.

Since the start of the year, I've accumulated enough AI-related posts that it's time to step back and lay out the structure they fit into. This post is the map: what the four tracks are, why they hang together, what's already published in each, and what's coming next. If you've ever landed on the blog from one post and wondered whether the rest is worth your time, this is the index.

The Investment Operating System: How I Use AI to Manage My Portfolio Systematically

· 20 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is part 3 of the AI Wealth Management series, exploring how to use Claude Code and LLM Wiki for personal investing.

Most investors I know operate in reaction mode. A friend texts about a hot stock. An earnings beat hits the news. A pundit on X says the sector is turning. Something triggers a buy or sell, and the decision feels analytical — but it's really just noise with a story attached.

What changed my investing wasn't reading more research or finding better sources. It was building a system. Specifically, it was treating my investment process the way a software engineer treats a production system: with defined inputs, explicit logic, observable state, and predictable outputs.
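As a rough illustration of what "defined inputs, explicit logic, observable state, and predictable outputs" can look like in code, here is a small sketch. It is not the system described in the post: the `Position` fields, the 5% drift band, and the rebalance rule are all assumptions chosen only to show the shape.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Position:
    """Defined input: one holding with its current and planned weight."""
    ticker: str
    weight: float          # current portfolio weight, 0..1
    target_weight: float   # weight the plan calls for, 0..1


@dataclass
class DecisionLog:
    """Observable state: every decision is recorded with its inputs."""
    entries: List[str] = field(default_factory=list)


def rebalance_action(pos: Position, band: float = 0.05) -> str:
    """Explicit logic: act only when drift exceeds the band (hypothetical rule)."""
    drift = pos.weight - pos.target_weight
    if drift > band:
        return "trim"
    if drift < -band:
        return "add"
    return "hold"


def run(positions: List[Position], log: DecisionLog) -> Dict[str, str]:
    """Predictable output: the same positions always yield the same actions."""
    actions: Dict[str, str] = {}
    for pos in positions:
        action = rebalance_action(pos)
        actions[pos.ticker] = action
        log.entries.append(
            f"{pos.ticker}: weight={pos.weight:.2f} "
            f"target={pos.target_weight:.2f} -> {action}"
        )
    return actions
```

Because the rule is a pure function of the inputs, a text from a friend or a headline cannot change the result; only a change to the positions or to the rule itself can.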

Building AI Agent: From Complex Claude Skills to Production-Grade AI Agents

· 15 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is part 2 of the AI Wealth Management series, exploring how to use Claude Code and LLM Wiki for personal investing.

This post is for developers building AI systems. Specifically: how to develop complex, multi-step Claude Code commands and compose them into workflows — and what it actually takes to turn a working personal tool into a production-grade agent.

The domain is stock investing, but the patterns apply broadly. This post stands on its own — if you want background on what Claude Code is, how LLM Wiki works, and how to build a knowledge base from scratch, Part 1 covers that, but it's not a prerequisite here.

Building Your Personal Finance Knowledge Base with Claude Code

· 9 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is part 1 of the AI Wealth Management series, exploring how to use Claude Code and LLM Wiki for personal investing.

Most people with real expertise — in finance, law, immigration, or anything else — carry it around in their heads, where it helps no one and can't compound. This post is about changing that: building a structured knowledge base that AI can reason over directly, and that can eventually serve others or generate income.

I'll use personal finance as the example domain. North American finance is genuinely complex, spanning 401(k), Roth IRA, HSA, the Wash Sale Rule, FBAR, and cross-border compliance, and it is worth systematizing. But the method here is domain-agnostic. The same approach works for any field where you have accumulated expertise.

Stacking OpenSpec and Superpowers: A Combined SDD Workflow

· 11 min read
Austin Xu
Cloud Platform Engineering Leader @ eBay

This is a follow-up to "From Vibe Coding to Spec-Driven Development." That post documented introducing OpenSpec into an existing Finance project. This one covers a new project where I stacked OpenSpec with Superpowers from day one.

After three months of running OpenSpec on my Finance project, I'd formed a clear picture of what it's good at and where it struggles. On a personal wiki project I'd also been using Superpowers, and its brainstorming, TDD, and code-review skills were landing real hits.

So I started a new project — a UTR-based tennis team lineup app (tennis-lineup) — specifically to run both tools together and see how they compose. This post is the report.