Strengths & Weaknesses of GPT-3 for Enhancing Developer Efficiency
In this article, we will explore the usefulness of GPT-3 for developers by examining its performance in handling three common challenges that they may encounter in their work. These challenges include generating new code, explaining unfamiliar code constructs, and deciphering unusual error messages.
Introduction
Writing code can be a complex and cognitively demanding task, involving intellectual gymnastics such as debugging and memorizing syntax across different languages. To manage this cognitive load and improve efficiency, programmers need strategies and tools to help them navigate their daily work. One such tool is OpenAI's Generative Pre-trained Transformer 3 (GPT-3), an artificial intelligence (AI) language model that interacts with users in a conversational manner through its recently released ChatGPT-3 interface.
Use Case #1 - Generating new code
ChatGPT-3 can generate new code. Here are some sample prompts and accompanying responses from ChatGPT-3.
Prompt: Create boilerplate code for a React.js project
Response: Here is some boilerplate code for a React.js project:
import React from 'react';
import ReactDOM from 'react-dom';
class App extends React.Component {
  constructor(props) {
    super(props);
    this.state = {
      // Initialize state here
    };
  }
  render() {
    return (
      // Add JSX here
    );
  }
}
const rootElement = document.getElementById('root');
ReactDOM.render(<App />, rootElement);
This code creates an App class that extends React.Component and has a render method that returns JSX to be rendered to the page. The App component is then rendered to the element with the root ID. You can add your own state and JSX to customize the component.
That looks acceptable. GPT-3 even explained the code in plain English. The output can be corroborated by replacing the placeholder comments with real state and JSX, running the code, and confirming that it behaves as described.
A more challenging prompt is to solve Problem 2 from Project Euler, which is a series of problems that require mathematical and programming proficiency to solve.
Prompt: Solve Problem 2 from Project Euler:
Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:
1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.
Response: Here is a recursive solution to Problem 2 in Project Euler using Python:
# define a recursive function to compute the sum of the even-valued terms in the Fibonacci sequence
# whose values do not exceed a given maximum value
def even_fibonacci_sum(n, a=1, b=2, sum=0):
    # if the second term exceeds the maximum value, return the sum
    if b > n:
        return sum
    # if the second term is even, add it to the sum
    if b % 2 == 0:
        sum += b
    # call the function recursively with the next terms in the sequence
    return even_fibonacci_sum(n, b, a + b, sum)
# test the function
print(even_fibonacci_sum(4000000))
This function takes in three arguments: the maximum value (n), the first term (a), and the second term (b) of the Fibonacci sequence. It also has an optional argument sum that keeps track of the sum of the even-valued terms in the sequence.
The solution comes with comments and a brief explanation of how the code works. The recursive function accepts one mandatory argument (the maximum value n) and three optional arguments:
a: The first term of the sequence
b: The second term of the sequence
sum: The running sum of the even-valued terms
If b exceeds the maximum value specified, then the running sum is returned and the function completes. Otherwise, the function checks whether b is even, adds it to the running sum if so, and calls itself again with the next pair of terms. However, ChatGPT-3 did not correctly pick up on the fact that the last three arguments are optional, so its description of the solution is not accurate in this instance.
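One way to corroborate the recursive answer is to compare it against an independent implementation. The following is a minimal sketch of an iterative cross-check; the function name even_fibonacci_sum_iterative is made up for this illustration and is not part of ChatGPT-3's response.

```python
def even_fibonacci_sum_iterative(limit):
    """Hypothetical cross-check: sum the even Fibonacci terms (starting 1, 2) up to limit."""
    total = 0
    a, b = 1, 2  # the first two terms, as given in the problem statement
    while a <= limit:
        if a % 2 == 0:
            total += a
        a, b = b, a + b  # advance to the next pair of terms
    return total


print(even_fibonacci_sum_iterative(4_000_000))  # prints 4613732
```

If both versions print the same value for the same limit, that is reasonable evidence that the generated recursion walks the sequence correctly.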
Finally, let's come up with some fun and novel code based on a vague prompt, in a functional programming language called Standard ML.
Prompt: Write an interesting Standard ML program based on Bartleby's Parable.
Response: Here is a possible Standard ML program that is based on Bartleby's Parable.
(* The main function that represents Bartleby's actions *)
fun bartleby() =
  let
    val message = "I would prefer not to"
    val tasks = ["copy documents", "file papers", "answer letters"]
    val responses = map (fn task => print(task ^ "? " ^ message ^ "\n")) tasks
  in
    if (forall (fn response => response = message) responses) then
      print("Bartleby is uncooperative and does not complete any tasks.\n")
    else
      print("Bartleby completes some tasks, but not all.\n")
  end
(* Call the main function *)
bartleby()
This program defines a function bartleby that represents the actions of the character Bartleby from the parable. The function has a list of tasks that Bartleby is asked to do, and for each task Bartleby responds with the message "I would prefer not to". If Bartleby responds with this message for all tasks, then the function prints that Bartleby is uncooperative and does not complete any tasks. Otherwise, the function prints that Bartleby completes some tasks, but not all.
Impressively, ChatGPT-3 is able to generate a program that combines two totally unrelated topics: programming in SML and "Bartleby, the Scrivener", a short story written by Herman Melville in 1853 about a Wall Street clerk who refuses to do his boss's bidding, simply replying "I would prefer not to," a feeling shared by many programmers when the holidays approach. The output is a short SML program that appears to print a variation of this response for each task asked of him. However, there are a few problems, both on the programming side and on the story side.
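First of all, this code does not compile because the forall function is not defined in this context. One fix is to rewrite the if (forall (...)) expression in a recursive style, or to use List.all from the Standard ML Basis Library. Even then, responses is a list of unit values, because print returns unit, so comparing each element with the string message would still fail to type-check.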
Second, in the original parable, Bartleby refuses to complete any task asked of him, yet this code indicates that he may complete some tasks, but not all. It doesn't make sense to include this branch in the conditional statement based on the original story. This intersection of unrelated topics seems to prove difficult for the language model to process properly.
The code provided requires some intermediate familiarity with Standard ML to debug and fix successfully. Therefore, in this example, the programmer is not fully relieved from the task of writing idiomatic and correct code. GPT-3 performs well with close-ended requests, such as how to create simple code in a language. It seems to struggle with open-ended requests that have an indefinite solution, such as the query to write a program about Bartleby.
Use Case #2 - Explaining unfamiliar constructs
How does ChatGPT-3 handle unfamiliar syntax and programming constructs? In the following example, a Complex number class is created using a construct called an overloaded operator. This construct is not uncommon, but may be unfamiliar to some developers. Let's see if GPT-3 can make sense of this code.
Prompt: What does this C++ code mean?
class Complex {
  double re, im;
public:
  Complex(double re, double im) : re(re), im(im) {}
  Complex operator+(const Complex& other) const {
    return Complex(re + other.re, im + other.im);
  }
};
Response:
This C++ code defines a Complex class that represents complex numbers. The Complex class has two member variables, re and im, that represent the real and imaginary parts of the complex number, respectively. It also has a constructor that takes the real and imaginary parts as arguments and initializes the re and im member variables.
The Complex class also has a operator+ member function that overloads the + operator for the Complex class. This function takes another Complex object as an argument and returns a new Complex object that is the sum of the two operands. The operator+ function uses the re and im member variables of the Complex objects to compute the sum of the two complex numbers, and then it returns a new Complex object with the result.
This Complex class allows you to create Complex objects and add them using the + operator, as shown in the following example:
int main() {
  Complex a(1.0, 2.0); // a = 1.0 + 2.0i
  Complex b(3.0, 4.0); // b = 3.0 + 4.0i
  Complex c = a + b; // c = (1.0 + 3.0) + (2.0 + 4.0)i = 4.0 + 6.0i
  return 0;
}
In this example, we create two Complex objects a and b, and then add them using the + operator. The result is a new Complex object c that represents the sum of the two operands. Because we have overloaded the + operator for the Complex class, the + operator will automatically call the operator+ member function of the Complex class to add the two objects and return the result.
This is a good explanation of the Complex number class. GPT-3 was able to understand that this code relates to complex numbers based on the names of the class and member variables, even though the code was uncommented. It was also able to write an example of how to use the Complex class to add two complex numbers using the overloaded + operator. GPT-3 performs well when presented with a question about language syntax, a well-documented concept that lends itself to definitive answers.
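For readers more at home in Python, the same construct can be sketched with the __add__ special method, which Python calls when it evaluates a + b. This is only an analogy under the assumption that a frozen dataclass is an acceptable stand-in for the C++ class; it is not part of GPT-3's response.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    """Illustrative Python analogue of the C++ Complex class above."""
    re: float  # real part
    im: float  # imaginary part

    def __add__(self, other: "Complex") -> "Complex":
        # Called for `self + other`, much like operator+ in the C++ version.
        return Complex(self.re + other.re, self.im + other.im)


a = Complex(1.0, 2.0)
b = Complex(3.0, 4.0)
c = a + b
print(c)  # Complex(re=4.0, im=6.0)
```

Unlike the C++ class, the fields here are publicly readable, which makes the result easy to inspect.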
Use Case #3 - Deciphering misleading error messages
A useful skill for a programmer to develop is the ability to interpret unhelpful or misleading error messages. The following C++ code is intended to compile successfully and then trigger a segfault at run time.
segfault_ex.cpp
#include <iostream>
#include <vector>
int main() {
  std::vector<int> v{1,2,3};
  for (int i=1; i<=4; i++) { std::cout << v[i] << std::endl; }
  return 0;
}
This code should compile successfully, yet fail at run time due to out-of-bounds access of items in vector v (strictly speaking, reading past the end of a vector with operator[] is undefined behavior, so a segfault is likely but not guaranteed). Yet when we run g++ segfault_ex.cpp, which should compile the .cpp file to an executable, the following error message is generated at compilation:
segfault-example.cpp:5:20: error: expected ';' at end of declaration
std::vector<int> v{1,2,3};
^
;
1 error generated.
Helpfully, the error explains where a semicolon needs to be inserted. We could resolve the error by filling the vector on the next line with std::vector::push_back, but that isn't an elegant or idiomatic solution. If we research the rules for constructing vectors, we discover that this syntax is correct, as it uses an initializer list {1,2,3} to construct the vector.
Something doesn't make sense. The syntax used is valid, so why are we getting this error? Let's revisit the documentation and read it again carefully:
The initializer list syntax for vectors was not introduced until C++11! By default, the version of g++ on my machine compiles to C++98.
❯ g++ --version | head -1
Apple clang version 13.1.6 (clang-1316.0.21.2)
❯ g++ -dM -E -x c++ /dev/null | grep -F __cplusplus
#define __cplusplus 199711L
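The macro value printed above encodes the language standard as a year and month. As a rough aid for reading such output, the lookup below maps the published __cplusplus values to standard names; the helper name describe_cplusplus is hypothetical and exists only for this illustration.

```python
# Published values of the __cplusplus macro for each ISO C++ standard.
CPLUSPLUS_STANDARDS = {
    199711: "C++98",  # also reported when compiling as C++03
    201103: "C++11",
    201402: "C++14",
    201703: "C++17",
    202002: "C++20",
}


def describe_cplusplus(define_line):
    """Hypothetical helper: map a '#define __cplusplus <value>L' line to a standard name."""
    value = define_line.split()[-1].rstrip("Ll")  # drop the long-literal suffix
    return CPLUSPLUS_STANDARDS.get(int(value), "unknown")


print(describe_cplusplus("#define __cplusplus 199711L"))  # C++98
```

A value of 201103 or higher would mean the compiler already accepts C++11 features such as vector initializer lists.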
The type of debugging suddenly jumped from fixing simple syntax mistakes to changing language settings in the compiler. A lot of programmers will find this frustrating to diagnose and fix. Can GPT-3 save the day? Will it get distracted by the obvious segfault issue? Or will it dig deeper and discover which g++ flag needs to be used to compile to C++11 and allow the use of vector initializer lists?
Prompt: Why won't this C++ code compile when I run "g++ file.cpp"?
#include <iostream>
#include <vector>
int main() {
  std::vector<int> v{1,2,3};
  for (int i=1; i<=4; i++) { std::cout << v[i] << std::endl; }
  return 0;
}
Response:
It looks like there is an issue with the for loop in your code. The loop variable i is being initialized to 1, and the loop is continuing as long as i is less than or equal to 4. This means that the loop will attempt to access the elements at indices 1, 2, 3, and 4 of the vector v. However, the vector v only has 3 elements (at indices 0, 1, and 2), so attempting to access the element at index 4 will result in an out-of-bounds error.
To fix this issue, you can modify the loop condition to stop the loop before i becomes greater than the size of the vector. For example, you could change the loop condition to i < v.size() or i < 3 (since the vector has 3 elements).
GPT-3 successfully discovered the run-time segfault that results from accessing an out-of-bounds value, but it failed to explain that g++ is not compiling to the correct version of C++. Let's refine the prompt and try again:
Prompt: The error received for the previous code was:
segfault-example.cpp:5:20: error: expected ';' at end of declaration
std::vector<int> v{1,2,3};
^
;
1 error generated.
Response:
I apologize for the mistake in my previous response. It looks like there is a different issue with your code.
The error message expected ';' at the end of declaration is indicating that there is a problem with the line std::vector<int> v{1,2,3};. This line is using list initialization to create a new vector and initialize it with the elements 1, 2, and 3. The curly braces {} are used to enclose the list of elements, and a semicolon ; is used to terminate the statement. It looks like there is a typo in this line of code, and the semicolon is missing. To fix the error, add a semicolon ; after the closing curly brace }. After adding the semicolon, the code should compile without any issues.
Politely, GPT-3 tries to fix the error. However, the response misleads us into debugging the syntax once again, even after we provided additional hints that the problem resides not with the code, but with the compiler. The correct solution is to compile the program with the command g++ -std=c++11 segfault_ex.cpp so that the compiler uses the syntax rules of C++11 to generate the program with a vector initializer list.
This example demonstrates that GPT-3 is capable of syntax debugging, but falls short when the problem lies outside of syntax mistakes. As a result, a programmer must still have a good idea of what the solution space for a problem looks like and recognize when they are being led astray.
Conclusion
AI tools like GitHub Copilot and TabNine can be useful for programmers, but they have also faced controversy. GPT-3 is unlikely to remain immune from similar questions about where the code it generates actually comes from. Therefore, programmers should be cautious when relying on AI tools and take their results with a grain of salt.
GPT-3 excels at processing close-ended, complete problems. These are problems where the shape of the solution is evident in the distance and slowly reveals itself in a progressive manner, like a distant island that a castaway paddles towards during sunset. As long as they don't lose sight or give up, they'll reach their target.
On the other hand, GPT-3 struggles with open-ended problems. These problems have multiple viable solutions, but it is unclear which path to pursue. These solutions are like distant stars to a spaceship: they all look equally far away and appear about the same size and shape, so how does the pilot decide which to navigate to? Each star requires a lot of research just to determine its composition, yet the mere nature of a star doesn't provide any guarantees about what might be there once one arrives.
Regardless, GitHub Copilot, TabNine, ChatGPT-3, and other AI assistants mark the arrival of a new method of work for software engineers. After seeing the capabilities of these tools, some developers may fear that automation has finally come to take their jobs.
Rather than fear automation in the field of software engineering, programmers should embrace it as a way to improve productivity and stay current with technological advances. Automation has been a part of the software engineering industry from its inception - from Ada Lovelace's description of a language to run Charles Babbage's Analytical Engine, to text editors with macros, to GUI buttons that condense a series of commands into a single click.
Imagine a future where a knowledge worker, on their first day of work, is issued a badge, a personal computer, and an AI assistant tailored to the knowledge and training their job requires. One could ask the assistant to summarize complex documentation or get clarification on a process, among other use cases.
For programmers specifically, there are other ways that AI tools can be folded into their careers, such as facilitating directed learning, improving personal knowledge management, and maximizing learning from complex programming problems.
Ultimately, it is up to the individual programmer to use AI tools responsibly and effectively in their work. In this article, we explored the strengths and weaknesses of GPT-3 in executing three tasks a programmer encounters in their daily work: generating new code, explaining existing code, and debugging code. The same precautions one takes after consulting Google or Stack Overflow for help with these tasks should be taken when consulting an AI assistant as well.
By following best practices, such as not sharing proprietary software or information, and approaching results with a critical eye, programmers can take advantage of AI tools and enhance their productivity.