Technical Interviews in the Age of LLMs

May 8, 2024

In the dialogue Meno, Plato describes Socrates conducting what might be history's first engineering technical interview. 

Meno -  I am certain that no one ever taught him.
Socrates - And yet he possesses the knowledge?
Meno - This fact, Socrates, is undeniable. 

~ Plato, Meno (380 BC)

Designing a successful technical interview was difficult before AI

Executing a good technical interview has never been easy. You’re attempting to evaluate qualities in a candidate that are by nature hard to measure: thought processes, self-awareness, judgment, understanding of things like efficiency and scalability, agility, autonomy. There’s no single checklist to go over, and even when the interview is conducted reasonably well, it can still feel like reading tea leaves and parsing fuzzy signals while trying to assemble as complete a picture as possible of a candidate. All the while, you’re attempting to match this obviously incomplete picture of a candidate to your team and your day-to-day work…in about an hour…with all the buzz of interview jitters. 

That’s WITHOUT LLMs in play. 

Less signal, more noise

Setting aside the usual challenges of designing a good technical interview that the industry has been wrestling with for literal decades, LLMs introduce a new challenge in that they are disproportionately good at solving the kinds of problems you are able to present to a candidate in the frame of a technical interview. 

Because technical interview questions are designed to be generalizable (the average candidate can understand the prompt quickly), narrowly scoped (the candidate can finish them in a 1-hour window), and limited in contextual complexity or uncertainty, LLMs are even better suited to solving interview prompts than they are to solving the day-to-day problems on the job. 

Unchecked usage of LLMs in interviews creates more noise in a process that was already too noisy – a candidate using Copilot may solve the prompt, but will they succeed in the job outside of the laboratory setting?

Minimizing false positives

This added noise from LLMs makes it harder to achieve a key goal of any good interview process: minimizing false positives (situations where candidates pass the interview loop with flying colors but aren’t successful in the actual role) without creating false negatives (not hiring someone with the potential to flourish in the role but who didn't pass the interview loop). 

Beyond the age-old wisdom on technical interviewing (e.g., get clear on the competencies of the role, use structured interviews, systematically train interviewers, the list goes on!), when interviewing in the age of LLMs, we’ve found you need to design the interview loop to enable an AI companion without crowding out the candidate. 

If a candidate is using an AI companion, we want to see how the candidate uses it to augment their superpowers rather than supplant or misrepresent their superpowers. 

Getting Tactical: Copilot v. ChatGPT v. Nothing 

For us, finding this balance started with deciding which tools to allow and when to allow them.

Comparing options:

  • Copilot more accurately simulates being on the job relative to no AI companion, and for engineers already using it, it may create a more “ergonomic” interview environment, allowing for more opportunities for higher-order thinking. But like an MS-Works spell checker, it tends to throw things in your face and offer solutions before being asked for them, eliminating the opportunity to observe how a candidate manages small decision points. (We see a theme here with early Microsoft deployments!) At its best, it offers a window into critical thinking, but it’s hard to control when these moments happen. When Copilot proposes a way to implement something, you sometimes get to observe how a candidate thinks about and evaluates its proposal. For example, if it gives a bad suggestion, does the candidate see it for what it is?

  • ChatGPT is more like an advanced Google search, with candidates having to explicitly ask for help. In an interview setting, it offers the opportunity to observe a candidate formulating a query and writing it out, and then analyzing the response they get. You can then ask the candidate what they got out of using it. With the transparency of process afforded here, it becomes easier to probe a candidate for understanding with ChatGPT than with Copilot.

  • Not allowing any AI companion starts to artificially constrain the interview process away from how work actually gets done. For example, we recently had an engineer ship a project written entirely in Python despite zero previous Python experience; their lack of familiarity with the language wasn't a problem due to their mastery of other languages, problem-solving skills, and an AI companion! That said, limiting some of the process to be LLM-free enables us to evaluate how the candidate will perform in the inevitable situations on the job where LLMs won’t be as useful.
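To make the “bad suggestion” scenario above concrete, here’s a hypothetical sketch (not from an actual interview prompt) of the kind of plausible-looking but subtly flawed code an AI companion might propose, and the fix we’d hope a candidate reaches after spotting the flaw:

```python
# Task: deduplicate a list of events while preserving their original order.

def dedupe_suggested(items):
    # A plausible AI suggestion: concise, but set() does not preserve order,
    # so the result can come back shuffled.
    return list(set(items))

def dedupe_fixed(items):
    # What we'd hope a candidate writes after catching the flaw:
    # dict keys preserve insertion order (Python 3.7+), so this dedupes
    # while keeping the first occurrence of each item in place.
    return list(dict.fromkeys(items))

events = ["login", "click", "login", "purchase", "click"]
print(dedupe_fixed(events))  # ['login', 'click', 'purchase']
```

A candidate who accepts the first version without questioning it reveals something very different from one who notices the ordering bug and reaches for the second.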

Matching tool options with the goals of each interview:

At Fractional, we’re hiring generalist engineers to work across dozens of customers; projects range from automating API integrations, to deploying LLMs to customize product recommendations, to building AI phone agents, and more. This means the day-to-day job will look like everything from writing clean, production-ready code for our customers’ environments, to solving novel engineering puzzles, to learning rapidly (new customer, new code base, new end users with each project!), to interfacing directly with customers to understand and scope for their needs. 

With this in mind, we’ve broken down our interview process into two types of technical interviews, each with its own approach to LLM usage (which we communicate to candidates in advance), to home in on evaluating different skills:  

  • Two hands-on-keyboard coding prompts assessing for code cleanliness and problem solving, productivity, judgment, and the potential to harness LLMs for superpowers. Here, because Copilot is prone to adding more noise, we say “yes” to ChatGPT but “no” to Copilot.

  • One whiteboard session assessing for project management, architecture design, decision making, autonomy, and customer engagement. This interview is structured similarly to a client call scoping a new project to get AI into production. Here, while we’re not necessarily against using ChatGPT, we’ve chosen a prompt where ChatGPT is limited in its usefulness, reflecting the inevitable problems on the job that are less suited for AI companions.


It’s worth noting that interview loops are part art, part science, and we’re always iterating. For example, we tried take-home assignments but concluded that they’re too error-prone for evaluating a candidate’s performance when you’re unable to observe the nuances of how they relied on ChatGPT or Copilot. 

TL;DR: Our current thinking 

This is where we’ve currently landed on navigating interviews in the age of LLMs:

  1. Accept that incorporating LLMs into the process is a question of “how,” not “if”: Coding with an AI companion will be how the job is done, and a strong generalist engineer should be able to identify moments when an AI companion is or isn’t likely to be useful.

  2. Get clear on your interview rounds and the goals for each round: We currently have two hands-on-keyboard coding prompts and one whiteboarding session, each measuring different competencies and with different approaches to LLM usage.

  3. Decide and communicate the rules of the road in advance: Based on the goals for each interview, we currently say “no” to using Copilot but “yes” to using ChatGPT for our two coding prompts, and our whiteboarding session is LLM-free. Candidates get these instructions in advance to minimize surprises. 

  4. Always supplement with real-world data points where possible: Interviews are still laboratory settings. References and examples of real-world projects continue to increase signal and reduce noise, especially in the age of LLMs.

Do you want to experience our interview process, or better yet, shape it as a member of the team? We are hiring generalist software engineers to join our San Francisco-based team. Get in touch with us if you want to help us take on the most challenging implementations of AI. 


Eddie Siegel is the Chief Technology Officer at Fractional AI. Before launching Fractional AI, Eddie was CTO of Xip, CEO of Wove, and held engineering leadership positions at LiveRamp and TowerData.
