Trying out an AI Paired Programmer

04/12/2023 Reading Time: 4 Minutes

I recently took ChatGPT (GPT-3) for a spin as a paired programmer. The project was a simple Google Sheets extension that detects outliers in a given cell range. It felt like a good test case: the math and logic are basic enough to write in plain JavaScript, but the project also adds a layer of complexity through the Google Sheets API and the differences between its client-side and server-side code. Compared to the magic others have shared from single-prompt ChatGPT responses, generating the code for this extension was not so straightforward. Still, the result did not disappoint, and the experience made me buy into the hype around the value GPT has been generating. While I don’t see GPT-3 as the replacement for developers that many fear, I do see the technology becoming an incredibly important asset to organizations as a paired programmer, as my experiences below support.
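For readers unfamiliar with this kind of check, here is a minimal sketch of the outlier logic in plain JavaScript, assuming the common interquartile range (IQR) rule. It is not the code ChatGPT generated for the extension; the function names (`quantile`, `findOutliers`) and the default multiplier of 1.5 are illustrative assumptions.

```javascript
// Hypothetical helper: linearly interpolated quantile of an
// already-sorted numeric array (q is between 0 and 1).
function quantile(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Hypothetical helper: returns the values that fall outside
// [Q1 - k * IQR, Q3 + k * IQR]. Non-numeric cells are ignored.
function findOutliers(values, k = 1.5) {
  const nums = values.filter(
    (v) => typeof v === "number" && Number.isFinite(v)
  );
  // With fewer than four points the quartiles say very little.
  if (nums.length < 4) return [];

  const sorted = [...nums].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const low = q1 - k * iqr;
  const high = q3 + k * iqr;

  return nums.filter((v) => v < low || v > high);
}
```

Calling `findOutliers([10, 12, 11, 13, 12, 11, 100])` would flag only `100`. Other rules, such as a z-score threshold, would slot into the same shape.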

Fighting Starting Point Developer Block

While every project is different, sometimes I find that the toughest part about starting a new project is determining the architecture and foundational code structure. It is a weird developer block I have where I overcomplicate this step of the process as if I were a veteran FAANG engineer with a highly opinionated approach to how things should be developed. Because of this, I was skeptical that I would end up seeing this project through to the end, but that sentiment immediately changed after ChatGPT took care of this with a simple kick-off prompt.

“can you write a boilerplate code for a google sheets extension that is focused on detecting data anomalies in the data in the present sheet”

This prompt alone completed 70% of my project! Magic! Definitely not what I had anticipated at the start, but as the project carried on I discovered that ChatGPT was great at generating code when the scope was simple and the task relatively low-effort. That was not the case for the remaining 30% of the project. This portion took up all of my effort and captured the reality that, despite what GPT can do, it still takes a good amount of human involvement through prompt engineering, code debugging, and code adjustments to get to a final product. It was a good reality check: while GPT is great at producing code, it still has a way to go before it can handle all facets of development, as I discovered below.

The Need for a Human Touch

All of this means that, for now, there is no reason to panic about your job if you are a developer. My experience took 87 requests altogether to get to the final product, and there were a variety of reasons why it took so long. Sometimes when I asked for an update to a piece of the code, it would refactor the code and change the function and its arguments; other times it would write code for the client that could only be run server-side, breaking the code it had written previously.

[Image: Server-side code for the wrong file]
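To make the client/server split concrete, here is a rough sketch of how the two sides of an Apps Script add-on typically talk to each other. The file names (`Code.gs`, `Sidebar.html`), the function names, and the `summary` element id are hypothetical and not taken from my project; in a real add-on the two halves live in separate files. The key point is that `SpreadsheetApp` only exists on the server, so browser code has to go through `google.script.run`.

```javascript
// --- Code.gs (server side: runs on Google's servers) ---

// Hypothetical server function: reads the user's current selection
// and returns plain data that the client can render.
function getSelectedValues() {
  const range = SpreadsheetApp.getActiveRange();
  return {
    a1: range.getA1Notation(),
    // getValues() returns a 2D array of rows, so flatten it.
    values: range.getValues().flat(),
  };
}

// --- Sidebar.html <script> (client side: runs in the browser) ---

// Hypothetical client function: SpreadsheetApp is not available here,
// so the call to the server goes through google.script.run.
function refreshSidebar() {
  google.script.run
    .withSuccessHandler((result) => {
      document.getElementById("summary").textContent =
        result.a1 + ": " + result.values.length + " cells read";
    })
    .withFailureHandler((err) => {
      document.getElementById("summary").textContent =
        "Error: " + err.message;
    })
    .getSelectedValues();
}
```

The bug described above is what you get when a call like `SpreadsheetApp.getActiveRange()` lands inside the browser script: the browser has no such object, so the call fails there.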

There was also a browser authentication bug that GPT was not good at handling, though it was at least helpful in pointing me toward where the issue might be occurring. When you are logged into multiple Google accounts at once in a browser, Apps Script can mix up which user is authenticated on the front end versus the server side, leading to a mismatch between client and server authentication. It is a tricky problem that I ended up troubleshooting on my own, and an example of the kind of development environment issue that is hard for GPT to catch.

In other instances, I ran up against the ChatGPT token limit of 2048 tokens, which meant that some responses were truncated and I had to rewrite my prompts to narrow the scope of my ask in order to get complete code. This happened frequently, and at some points I felt it would have been easier to just finish the project myself.

[Image: Hitting the token limit]

If I were working on a full-scale project, I don’t think I would use GPT-3 to the extent I did for this project. There are certainly better tools trained on codebases, like GitHub Copilot, that could be much more reliable for continual work with AI to generate code. Given how much effort the final parts of the project took, GPT feels better suited to kick-starting a project and providing ongoing support than to being a tool I would fully rely on.

Shaping the Corporate World: Training and Coaching Savings

Reflecting on the experience, I can confidently say that I will be using GPT-3 as a paired programmer for all of my future projects. If you are not a programmer and don’t know any languages, though, I would advise against trying to use GPT to build an application for you. As I laid out in the previous section, the model has flaws, and you might end up running in circles if GPT can’t troubleshoot certain issues with your code. Those who I believe will find the most utility in this model are organizations looking to cut down on training resources and costs for developers. Once you have GPT trained on your code base and guardrails in place, you basically have an engineering trainer that can pair with new developers and get them ramped up in a fraction of the time and cost it takes to train a person today. I’m sure many are already putting this belief into practice, and I can only imagine the number of startups that will be focused on this utility in the months to come.