What generative AI means for technical hiring, now and in the future
ChatGPT, OpenAI’s artificial intelligence-powered chatbot, has inspired both awe and fear in the tech industry and beyond. Since its public launch in November 2022, ChatGPT has proven to be adept at drafting college essays, analyzing data, and even writing code—with the latter having obvious implications for technical hiring.
Working in partnership with OpenAI, CodeSignal’s data and engineering teams have analyzed the impact of ChatGPT on our live and asynchronous technical assessments, now and in the future. In this blog post, we’ll address a few of the most common questions about technical hiring and AI-powered tools like ChatGPT:
- What is ChatGPT, and how good is it at writing code?
- Is it possible to tell if a candidate used ChatGPT or another generative AI solution to solve a CodeSignal question?
- What’s the risk of a candidate using ChatGPT to cheat on a CodeSignal assessment?
- In the longer term, how should we respond to and even make use of ChatGPT in CodeSignal evaluations?
Overview of ChatGPT as a code assist tool
ChatGPT, short for Chat Generative Pre-trained Transformer, is a browser-based chatbot that excels at mimicking human speech. It works a bit like a search engine—but rather than producing a list of search results in response to a query, it generates an original and coherently worded (though often factually incorrect) answer.
The power of ChatGPT to do complex tasks (like writing and debugging code) lies in the innovative training of its AI model, which included billions of data points and human trainers. In fact, CodeSignal worked in close collaboration with OpenAI as they developed and trained the GPT-3 and Codex models that power ChatGPT. Some of our anonymized coding data and questions have even been used (with our consent!) in the training of ChatGPT models.
However, just as ChatGPT often makes convincing-sounding statements that are in fact false, it can also write code that looks correct but is not. For this reason, the popular developer forum Stack Overflow banned ChatGPT-generated answers, citing their unreliability. Stack Overflow gave the following reasoning for this decision:
"Overall, because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking or looking for correct answers."
— Stack Overflow, Dec. 2022
So, what does all this mean for technical hiring and for software engineering in general? Here at CodeSignal, we believe it means that a major shift in how software engineers work—and how they should be evaluated in the recruiting process—is on the horizon. However, we aren’t there yet, due to the limitations of current generative AI technology. Among other things, these limitations mean that ChatGPT isn’t effective at solving questions in a CodeSignal assessment, for a few reasons that we expand on below.
There are clear tells of ChatGPT-generated code
First, the use of generative AI tools, including ChatGPT, often leaves detectable indicators that CodeSignal currently tracks. Red flags that raise suspicion include:
- Over-commenting of code
- Unnecessary statements
- Code that is incompatible with the format of CodeSignal’s IDE
While any one of these attributes may appear in original code from a candidate, the combination of multiple red flags may indicate suspicious behavior. Proctoring and activity monitoring of CodeSignal assessments provide an additional layer of protection against cheating via ChatGPT.
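To make the idea concrete, here is a minimal sketch of how red-flag heuristics like these might be combined in practice. The specific indicators, thresholds, and function names below are hypothetical illustrations, not CodeSignal’s actual detection logic:

```python
# Hypothetical sketch: the heuristics and thresholds here are illustrative
# only and do not reflect CodeSignal's real detection system.

def comment_ratio(source: str) -> float:
    """Fraction of non-empty lines that are comments."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(ln.startswith("#") for ln in lines) / len(lines)

def red_flag_count(source: str) -> int:
    """Count simple red flags in a Python submission."""
    flags = 0
    if comment_ratio(source) > 0.4:            # over-commented code
        flags += 1
    if "input(" in source:                     # stdin reads don't fit a web IDE's harness
        flags += 1
    if 'if __name__ == "__main__"' in source:  # unnecessary boilerplate for an IDE task
        flags += 1
    return flags

# A single flag often appears in honest code; several together warrant review.
submission = '''
# Define the function
# It adds two numbers
# and returns the result
def add(a, b):
    # return the sum
    return a + b

if __name__ == "__main__":
    x = input()
'''
print(red_flag_count(submission))  # prints 3
```

As the example shows, the signal comes from the combination of flags rather than any one of them in isolation, which mirrors how a human reviewer would weigh the evidence.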
ChatGPT-based cheating is rare on CodeSignal assessments—and when it happens, it’s ineffective
Additionally, we’ve found that candidates very rarely use ChatGPT to cheat their way through the hiring process. CodeSignal has analyzed the activity logs for all of our Skills Evaluation Framework-backed assessments to identify where a coding assist tool may have been used. We are confident that less than 1.5% of CodeSignal assessments involve ChatGPT-based plagiarism. Where proctoring was enabled, the vast majority of these instances were caught during our review process and the results were marked as unverified.
Moreover, of this 1.5%, the vast majority of candidates did not score high enough to pass the assessment. From our extensive testing of ChatGPT on our Framework-backed assessments, we’ve found that in 91% of cases, ChatGPT can earn at most half of the available points. In other words: the chances that a candidate can use ChatGPT to answer questions and pass a Framework-backed CodeSignal assessment are extremely low.
The future of technical hiring must embrace, rather than shun, AI tools
At CodeSignal, we believe that AI-assisted coding solutions like ChatGPT and GitHub Copilot will eventually become the norm in software development. Looking to the future, we plan to regularly adapt our Framework-based assessments to continue mirroring what real-world developers do in their respective job categories.
In the meantime, CodeSignal looks forward to serving as a beta tester for OpenAI’s forthcoming ChatGPT detector tool. As the solution matures, we will be able to provide our customers with a consistent way to differentiate AI-written from human-written code within CodeSignal coding reports.
We believe ChatGPT is here to stay and will become a tool that software developers and engineers use in their everyday work. The future of technical hiring requires learning how to embrace this tool and innovate new ways of evaluating core engineering skills that incorporate ChatGPT and other AI resources.