Learnings From Using 7 AI Coding Agents to Build Side Projects
- Harshal

- 3 days ago
- 9 min read
Approach, Learnings, and Opinions on Each Agent
While working on my side projects, I hit a simple constraint: my AI usage habits outpaced my plan limits. That constraint pushed me into a multi-agent workflow. I now rotate tools based on credits, strengths, and friction.
In this article, I share how I built a habit of using 7 agents together, along with my learnings and opinions on each.

Learnings
Do not tell two different agents to edit the same files. They overwrite each other's edits. I learned this after several failed releases: I asked multiple agents to work in parallel, and some of their changes did not show up in the production webapp. I was not aware of the recommended approach at the time, so I tried it from first principles and learned that this setup does not work. Use Git worktrees (or multiple branches) so that each agent works on the same area of the codebase in parallel but in a separate copy, then merge the results later. I have not yet started using worktrees or multiple branches; my workaround is to implement one plan while drafting future plans in parallel.
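A minimal sketch of the worktree setup described above, using a throwaway demo repo and hypothetical branch names (`agent-a-feature`, `agent-b-fixes`):

```shell
# Throwaway demo repo so the commands below are runnable end to end.
git init -q demo && cd demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# One worktree (separate working copy) per agent, each on its own
# branch, so parallel agents never overwrite each other's edits.
git worktree add -q ../demo-agent-a -b agent-a-feature
git worktree add -q ../demo-agent-b -b agent-b-fixes

# Point each agent at its own directory, let them work, then merge
# the results back into the main branch.
git merge -q agent-a-feature
git merge -q agent-b-fixes

# Remove the extra working copies once merged.
git worktree remove ../demo-agent-a
git worktree remove ../demo-agent-b
```

The branches survive after the worktrees are removed, so you can still inspect or re-merge each agent's work later.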
Agents that handle Git well are more valuable; some agents did a noticeably better job at it than others.
Allowlists work differently across agents. Cursor lets you allow certain actions across sessions. AMP and Droid let you allow all actions within one conversation. I prefer allowing specific actions across all conversations, with an override per conversation to allow all actions when needed. Codex needs you to allow each specific action, which is more cumbersome.
With multiple agents, you need a way to get notified when an agent completes its work. Then you can come back, review the work, and plan the next step. Otherwise you lose development time.
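One lightweight way to get that notification is a hypothetical shell wrapper around whatever command launches the agent; the `printf` line is a stand-in you could swap for `notify-send` (Linux), `osascript` (macOS), or a chat webhook:

```shell
# Hypothetical helper: run any long agent command and print an alert
# line when it finishes, so you know it is time to review the output.
notify_done() {
  "$@"
  status=$?
  # Swap this printf for notify-send, osascript, or a webhook call,
  # depending on how you want to be pinged.
  printf '[agent-done] %s exited with status %d\n' "$*" "$status"
  return "$status"
}

# Stand-in for a long-running agent invocation:
notify_done sleep 1
```

Anything the wrapper runs still gets its exit status passed through, so failures are visible in the alert line too.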
I sometimes used the plan mode from one agent, broke that plan into tasks, and gave those tasks to different agents. That was another way to use each agent for what it did best.
Extensions inside Cursor required me to connect them to MCP servers in every new profile or new project, and I had many Cursor projects. The MCP connection to Cursor's own agent carried across projects, which was much easier.
Why I Started Using Multiple Agents
In my previous Product Management job, I had access to enterprise plans for Cursor, n8n, and Lovable. I used AI a lot and coded fast. When I started side projects and kept using Cursor at the same rate on a free account, I exhausted my monthly limit in 3 days. I signed up for the paid plan and exhausted that in 1 week.
I chose to change my tool mix first instead of stopping my AI adoption habit. I found discounted and free AI coding tools, signed up for each, and experimented with multiple agents. I built a habit of using several agents together, treating them like a portfolio prioritized by availability of free credits and capabilities. I share this process to show why I ended up juggling many agents.
Note that I am on different plans for each tool, so this is not a fair comparison. My goal is to share the experience I had with each tool at the plan I could access. I am not trying to rank tools.
These are the agents I used:
Lovable: Paid plan (100 credits per month)
Cursor: Paid plan (~$20/month, with a monthly usage allowance)
Amp: Promotional offer ($15 daily free usage, free plan)
Factory Droid: Promotional offer (20 million tokens per month)
Codex (OpenAI): ChatGPT Plus subscriber, which includes limited free Codex usage
AntiGravity (Google Gemini): Free plan (very limited usage)
Comet (Perplexity): Paid plan for AI browsing
Multi-Agent Habit
My approach is plan first, then implement. At a high level it works like this:
I start with ChatGPT and Perplexity. I search the web, find examples and screenshots, and brainstorm with these agents. Paid plans give generous usage, so I get the database schema ready and screenshots from similar apps or mocks. I will add Mobbin to this mix.
I give that information to Lovable and get the first version of the web app.
I enable Lovable cloud and back up Lovable's work to GitHub.
I pull the code to my laptop and review it in Cursor.
I use Cursor's ask mode to query the codebase.
I use Lovable, Codex, and Cursor in plan mode to plan changes.
I use Amp and Factory Droid in edit mode to implement changes. Cursor's agent also implements changes effectively.
I use Codex to review the codebase.
I use Amp for version control: add, commit with clear messages, push, and resolve merge conflicts. I ask Cursor or Codex to create plans. I use Cursor, Amp, Factory Droid, and AntiGravity to implement those plans or parts of them.
For most tasks, I create a plan first, then implement it. That workflow keeps the work predictable and on track, with fewer surprises.
For database changes, a local agent gives me the SQL to run, or I explain the intent to Lovable and it applies the changes. When something breaks and local agents cannot fix it, I switch to Lovable as the most capable agent.
Overall, I research and scope with ChatGPT and Perplexity, build the first version in Lovable, then plan and implement across Cursor, Codex, Amp, and others, with Lovable as the fallback for database work and hard fixes.
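The version-control loop above boils down to plain git commands. A self-contained sketch of what I ask Amp to run, using a local bare repository as a stand-in for the GitHub remote:

```shell
# Local bare repo standing in for the GitHub remote.
git init -q --bare origin.git
git clone -q origin.git work
cd work
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "chore: initial commit"
git push -q origin HEAD

# The loop: stage everything, commit with a clear message, rebase
# onto the remote to surface merge conflicts early, then push.
echo "notes" > notes.txt
git add -A
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "docs: add project notes"
git pull -q --rebase origin HEAD
git push -q origin HEAD
```

Rebasing before pushing is what forces conflicts to show up on my machine, where an agent (or I) can resolve them, rather than as a rejected push.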

Lovable
I build the first version of most apps in Lovable. When Lovable credits run low, I back the project up to GitHub, pull it to my laptop, and use Cursor. I kept using the 5 daily Lovable credits and the Try to Fix button when the builder had issues. I used Lovable for database changes because I could not make those through GitHub. Lovable delivered consistent results, but I felt strong credit anxiety over small changes because each one still consumed many credits. Editors with token-based usage limits suited those small, frequent edits better, so I made those changes in local coding agents and synced back via GitHub. Across all my agents and projects, Lovable was the most effective at checking whether the webapp it built matched my intent and solved my actual need. Only Lovable could look across the database, telemetry, frontend, backend, session recordings, app screenshots, and web search to decide where to intervene and improve. This aligns with the OpenAI Codex team's findings.
Local coding agents (IDE, CLI) were better for specific, narrow edits. For example, to align a few icons or add text to icons, Lovable might make those edits but also introduce other changes. Local agents made more precise edits. I found it easier to spot check the work of local coding agents, and their explanation for every small change was also more detailed. They were also cheaper.
Cursor IDE
Cursor is my primary interface for writing blogs, Home Assistant and n8n automations, scripts and analysis, and websites, products, and prototypes. I kept using it with Lovable projects. The main limit was seeing what I had built on my local machine, so I asked the AI to change the repository to run both in Lovable and on localhost. Cursor's diff view is a strong feature: it shows the changes in every file and lets me accept or reject each edit. Neither the CLI agents nor the other IDE extensions offered that level of visibility. I switch between preview and markdown mode when needed. The Model Context Protocol (MCP) connections in Cursor carry across projects, so I did not have to reconfigure them per project. The Composer model worked well. Bugs and errors with the auto model selector pushed me to Claude Sonnet 4.5 in Cursor, but I then hit my paid plan's monthly usage limit. With multiple Cursor projects (Home Assistant, n8n, Writing, Building), each project had to load the Codex and Amp extensions on its own. That loading often got stuck and pushed me toward CLI-style agents.

Sourcegraph Amp IDE extension
I installed the Amp extension in Cursor. The interface differed from Cursor's in that it had no explicit mode switcher. I could tell it to invoke the oracle to plan, and that worked, but it made me anxious because it was not an explicit read-only mode. I could not choose a model, only between smart, rush, and deep modes, so I left it on smart. Each agent also referenced files differently. Watching Amp spawn sub-agents to tackle different parts of a task felt very satisfying, but it was harder to attach external files and images to it. Amp handled code tasks and Git well in smart mode; in rush mode it drew wrong conclusions about Git. In smart mode (Opus 4.6) it handled merges and commits with clear messages, though it often picked the wrong folder or file and applied changes in the wrong place. A promotional offer gave me a fixed amount of free usage per day. Within a few days my usage exceeded it, so I looked for other options.

OpenAI Codex IDE extension
OpenAI’s Codex extension runs in Cursor. With ChatGPT Plus I get some free usage per day, so I set it up and use it. The extension is slow at making code changes. That may be a Windows issue that improves on Linux; it runs heavy commands for every patch and even when reading a file, which I found confusing. Codex did not handle Git well, but it planned well and could enter plan mode. I used the Codex extension only for work I could leave running, and I accepted the long runtimes. The permission flow is painful: the extension does nothing until I repeatedly click allow. Codex was the first agent I gave full PC access, because it needed that to be useful. Eventually I used Codex only for creating plans. Asking it to edit code keeps it busy for a long time and blocks other work, while asking it to create a plan uses reads, not writes, so other agents can edit files in parallel.

Google Gemini AntiGravity
After I exhausted the usage limits of the options above, I moved to Google's AntiGravity, which is similar to Cursor: a reskinned Visual Studio Code IDE. The free plan includes some daily usage. I exhausted that within a few hours, but the interface matched Cursor and worked well. On the free version I hit per-model rate limits on the more capable models (Gemini Pro, GPT-OSS, Claude Opus and Sonnet). I switched to Gemini Flash, which has a higher allowance, but that model was often oversubscribed and AntiGravity showed a "servers are overloaded" error. The per-model rate-limit display was confusing: I could not tell whether I had hit my own limit or the service was saturated.
Perplexity Comet AI browser
I will write a separate article about the use cases I found for the AI browser. Here are two examples close to building side projects:
I use Comet to upload files in my web app or click through many buttons.
I use Comet when I need to create new devices or entities. It opens the Home Assistant dashboard and goes through the clicks to update the dashboard. I do the actual automation separately.
I still ran out of my monthly usage limits even though I was on the paid plan.
CLI form factor
I explored the CLI form factor because my Lenovo laptop was struggling with multiple Cursor windows open, each with multiple AI extensions and the Comet agentic browser. The CLI worked, but I did not like it. I found it harder to mix shell commands with AI prompts; in Cursor I had a separate terminal window. In the CLI I found it harder to review folders or files before passing them to a prompt and harder to run multiple agent tabs at once. I also could not easily see how much I had used versus my limit. My hypothesis is that the CLI form factor works better for developers on enterprise accounts who do not need to track usage.
Factory Droid AI CLI
Factory AI has a coding agent called Droid. I installed and tried it. It did a good job at coding, but the interface was tricky. I hit the same CLI limitations: no parallel threads, no easy way to queue messages, tagging files was hard, and reading long plans was hard because the terminal scrolled to the end and expected me to pick an option. Viewing diffs was difficult. I had a one-year promotional paid plan but ran out of monthly credits after one week. Droid handles Git well. Because it runs in the terminal, I could not connect a Model Context Protocol (MCP) server to it, so I stopped using it for MCP tasks. Factory Droid was the only CLI I tried that could handle images as inputs.
Sourcegraph Amp CLI
I also used Amp in its CLI form. It worked well. Starting March 2026, Amp will only be available on CLI, not as an IDE extension. That is disappointing because the IDE extension worked well. I couldn't pass it images as inputs.
Cursor CLI
I also used Cursor in its CLI form. I often noticed that I would give it a command and it would do nothing, with no visible explanation for the hang. I couldn't pass it images as inputs.
OpenAI Codex CLI
I also used Codex in its CLI form. It worked well, but only on WSL, not in PowerShell. I couldn't pass it images as inputs.
Claude Code
I didn't mention Claude Code, Claude Desktop, or Claude co-work because I don't have an Anthropic subscription for my side projects. I don't need one, given that most of my needs are met by the other products. But I definitely love the Anthropic models themselves and use them a lot through the other products.
Multi-agent coding is an Ops Problem
I expected multi-agent coding to be a model problem. It is more of an ops problem: credits, permissions, Git behavior, and feedback loops decide the outcome far more than which model you use.