Every week brings a new model, a changed parameter, a deprecated feature, or an entirely new capability that renders yesterday’s workflow slightly obsolete. For a small company building six products simultaneously, the question was never whether to adopt AI development tools. It was how to build workflows that survive the constant churn.
Agentic development is a workflow pattern in which AI models operate as autonomous collaborators, executing code, reading project context, and making implementation decisions within explicit guardrails set by human engineers. The model works through problems end to end and reports the result rather than waiting for instructions at every step.
The stack
Glia runs on four systems, and the connections between them matter more than any individual tool. Claude handles the thinking; Claude Code handles the doing; Obsidian holds the memory; Supabase ties the operations together.
In practice, this means strategy, architecture decisions, content drafting, and task planning happen in conversation through claude.ai. When a decision produces code, it becomes a structured prompt handed to Claude Code, which operates from the terminal: editing files, running builds, executing tests, and returning structured evidence of what changed. The two roles never cross. Claude Code does not make design decisions. The chat agent does not touch the filesystem. That boundary sounds rigid, but it is the reason both work reliably.
Obsidian serves as persistent memory. Every project maintains a vault folder with dashboard files, session notes, technical specs, and handoff documents. When a new session begins, the agent reads the vault to understand where the project stands. When a session ends, the agent writes back what happened. Without the vault, every session starts from zero. With it, every session picks up where the last one left off.
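To make the loop concrete, here is a minimal sketch of the session bootstrap and write-back, assuming a Node environment. The vault path and file names are illustrative, not a fixed convention:

```typescript
import { readFileSync, appendFileSync } from "node:fs";
import { join } from "node:path";

// Illustrative vault layout; real projects name these files however they like.
const VAULT = "/vaults/project-x";
const CONTEXT_FILES = ["Claude Context.md", "Project State.md"];

// Session start: read the handoff documents before doing anything else.
function loadContext(): string {
  return CONTEXT_FILES
    .map((f) => readFileSync(join(VAULT, f), "utf8"))
    .join("\n\n---\n\n");
}

// Session end: append what happened so the next session starts warm.
function writeSessionNote(summary: string): void {
  const entry = `\n## ${new Date().toISOString()}\n${summary}\n`;
  appendFileSync(join(VAULT, "Session Log.md"), entry);
}
```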
The Model Context Protocol (MCP) is the layer that makes all of this feel like a single system rather than four disconnected tools. MCP connects Claude directly to both Obsidian and Supabase, so the agent can read vault files, query database tables, and write session records without leaving the conversation.
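The wiring itself is configuration rather than code. Sketched as a TypeScript object for readability, the server entries Claude loads look roughly like this; the package names and paths are assumptions, and the real file is Claude's JSON settings with the same shape:

```typescript
// Shape of the mcpServers block in Claude's settings, shown as a TS object.
// Server packages and the vault path below are illustrative.
const mcpServers = {
  obsidian: {
    command: "npx",
    args: ["-y", "mcp-obsidian", "/vaults/project-x"],
  },
  supabase: {
    command: "npx",
    args: ["-y", "@supabase/mcp-server-supabase", "--project-ref=<project-ref>"],
  },
};
```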
Context is everything
The single biggest factor in getting useful output from Claude is not the prompt. It is the context you provide before the prompt. A brilliant instruction given to a model with no background produces generic output. A straightforward instruction given to a model that understands your project, your conventions, and your current state produces work you can ship.
Every Glia project maintains a set of canonical context documents in the vault. A Claude Context file acts as the handoff note between sessions: what has been built, what decisions have been made, what questions remain open. A Project State file captures the technical ground truth: stack, hosting, database schema, deployment configuration. Running session logs record what happened, what was delivered, and what was deferred. The agent reads these before doing anything else, and the discipline of maintaining them after every session is the part that makes the system work. Context that is not written down does not exist for the next session.
We also maintain a routing table that maps task types to reference documents. When the session goal involves database work, the agent automatically fetches the schema safety checklist. When it involves a build, the agent fetches the code prompt template. The routing is explicit and declarative; we do not rely on the model to guess which references it needs.
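A minimal sketch of such a table, with hypothetical task types and document names:

```typescript
// Explicit, declarative routing: task type -> vault references to fetch.
// Keys and file names are illustrative.
const routingTable: Record<string, string[]> = {
  database: ["Schema Safety Checklist.md"],
  build: ["Code Prompt Template.md"],
  content: ["Brand Voice Guide.md"],
};

function referencesFor(taskType: string): string[] {
  return routingTable[taskType] ?? [];
}
```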
The build prompt pattern
Every piece of code that ships through Claude Code starts as a structured build prompt, not as an open-ended instruction typed into the terminal. This is the single practice that most improved our output quality, and it took several painful failures to learn why it matters.
A good build prompt opens with pre-flight checks: conditions the Code agent must verify before writing a single line. Confirming the current branch, checking that dependencies are installed, verifying that a specific file exists at an expected path. Pre-flight catches environmental drift before it contaminates the work.
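Here is a sketch of what pre-flight can look like when scripted rather than eyeballed, assuming a git project with an npm build; the expected branch and path are placeholders:

```typescript
import { execSync } from "node:child_process";
import { existsSync } from "node:fs";

// Fail fast on environmental drift before any file is touched.
function preflight(expectedBranch: string, requiredPath: string): void {
  const branch = execSync("git branch --show-current").toString().trim();
  if (branch !== expectedBranch) {
    throw new Error(`On ${branch}, expected ${expectedBranch}`);
  }
  if (!existsSync("node_modules")) {
    throw new Error("Dependencies not installed; run npm install first");
  }
  if (!existsSync(requiredPath)) {
    throw new Error(`Missing expected file: ${requiredPath}`);
  }
}
```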
Guardrails come next, and they matter more than most people expect. These define what the agent must not do: do not upgrade dependencies, do not modify files outside the specified paths, do not access the database, do not invent content copy. A Code agent without guardrails will helpfully “improve” things you did not ask it to improve, and debugging those improvements costs more than the original task. Guardrails are not about distrust. They are about scope.
The prompt then describes the work itself in enough detail that the agent does not need to make creative decisions. When the task involves user-facing text, the exact copy is included. When it involves a new component, the expected behaviour is specified. Ambiguity in a build prompt produces ambiguous output, reliably.
After the work comes a verification checklist: specific, observable conditions the agent must confirm. Build exit code zero. Page count in the dist folder matches expectation. No console errors on the target route. The checklist transforms “I think it worked” into “here is the evidence.”
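Scripted, the evidence-gathering can be as plain as this sketch, assuming an npm build that emits HTML pages into dist; the paths and counts are illustrative:

```typescript
import { execSync } from "node:child_process";
import { readdirSync } from "node:fs";

// Turn "I think it worked" into evidence: run the build, then assert
// observable conditions.
function verify(expectedPageCount: number): void {
  execSync("npm run build", { stdio: "inherit" }); // throws on a nonzero exit
  const pages = readdirSync("dist").filter((f) => f.endsWith(".html"));
  if (pages.length !== expectedPageCount) {
    throw new Error(`Expected ${expectedPageCount} pages, found ${pages.length}`);
  }
}
```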
Finally, stop conditions. When the agent hits a problem it cannot solve, it stops, documents what it found, and hands back. It does not improvise workarounds. This prevents the cascading failure mode where an agent spends forty minutes fixing a problem that did not need to exist.
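Put together, the five parts give every build prompt the same shape. Here is that shape as a TypeScript type with illustrative values; the field names are ours, and the prompt itself is plain text structured in this order:

```typescript
// The five sections of a build prompt, expressed as a type.
interface BuildPrompt {
  preflight: string[];      // verify before writing a single line
  guardrails: string[];     // hard "must not" constraints that bound scope
  work: string;             // the task itself, with exact copy where text is involved
  verification: string[];   // observable conditions that prove completion
  stopConditions: string[]; // when to halt, document, and hand back
}

const example: BuildPrompt = {
  preflight: ["On branch feature/pricing-page", "node_modules present"],
  guardrails: ["Do not upgrade dependencies", "Touch only src/pages/pricing/*"],
  work: "Add the pricing FAQ section using the verbatim copy below: ...",
  verification: ["Build exits 0", "dist contains exactly one new page", "No console errors on /pricing"],
  stopConditions: ["Any failing pre-flight check", "Any ambiguity about copy placement"],
};
```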
Staying current without drowning
We evaluate new Claude releases by testing them against our own products, not by reading the changelog and hoping for the best. When Anthropic ships a new model, we run it against a real task on a real project and compare the output to the previous version. If it is better, we adopt it. If the difference is negligible for our use cases, we wait.
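A sketch of that comparison, assuming the Anthropic TypeScript SDK and placeholder model IDs; the real evaluation is a human reading both outputs against a real task, not an automated score:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Run the same real task against the pinned model and the candidate,
// then compare the outputs by hand. Model IDs below are placeholders.
async function compare(task: string): Promise<void> {
  for (const model of ["claude-<pinned-version>", "claude-<candidate-version>"]) {
    const res = await client.messages.create({
      model,
      max_tokens: 4096,
      messages: [{ role: "user", content: task }],
    });
    console.log(`--- ${model} ---`);
    console.log(res.content);
  }
}
```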
The updates that matter most are changes to tool-use reliability, reasoning quality on multi-step tasks, and context window capacity. These directly affect how well the agent handles build prompts and context documents. Speed improvements are welcome but secondary. A fast model that misreads a guardrail is worse than a slow model that follows it precisely.
The updates that matter least are consumer features, interface redesigns, and capability additions outside our workflow. The temptation to chase every new release is real, particularly when the AI development community treats each one as an event. But chasing features is how you end up rebuilding your workflow every month instead of shipping products.
Our approach is conservative adoption. We pin to specific model versions in our project bootstraps, test new versions in controlled conditions, and only update the pin when the improvement is clear on our actual workload. Less exciting than the bleeding edge. Considerably more productive.
What we got wrong
The mistakes taught us more than the successes, and honesty about failures is the most useful thing a practitioner article can offer. Here are the ones that cost us the most time.
We over-relied on Claude for design decisions early on. The model generates functional layouts and sensible component structures, but it has no taste. It produces competent, generic interfaces that look like every other AI-generated page. Design direction, brand decisions, colour choices, the overall feel of a product: these need to come from a human with opinions. We learned this by shipping pages that were technically correct and aesthetically forgettable, then redesigning them by hand.
We underestimated the value of explicit copy. For the first few months, we described what text should say rather than providing the exact words. The model’s attempts were consistently adequate and consistently wrong in tone. Now every build prompt that touches user-facing content includes the verbatim copy. The agent’s job is placement, not composition.
We also fell into the automation trap. Not everything benefits from AI involvement. Quick design tweaks, one-off file renames, and simple copy edits are faster done by hand than briefed, prompted, verified, and reviewed. The overhead of structuring work for an AI agent only pays off when the task is complex enough to justify it. Knowing where that line falls took longer to learn than any technical skill.
The most expensive mistake was treating context as optional. Early sessions started without vault reads, relying on the model’s memory of previous conversations, which does not exist between sessions, or on pasting fragments of project state into the chat. The output was inconsistent and often wrong. The vault discipline solved this entirely, but we lost several weeks of productivity before we committed to it.