In this, Goddard appears to be caught in the same predicament the AI boom has created for many of us. Three years in, companies have built tools that sound so fluent and humanlike they obscure the intractable problems lurking underneath: answers that read well but are wrong, models that are trained to be decent at everything but perfect at nothing, and the risk that your conversations with them will be leaked to the internet. Each time we use them, we bet that the time saved will outweigh the risks, and trust ourselves to catch the mistakes before they matter. For judges, the stakes are especially high: if they lose that bet, they face very public consequences, and the impact of such mistakes on the people they serve can be lasting.
“I’m not going to be the judge that cites hallucinated cases and orders,” Goddard says. “It’s really embarrassing, very professionally embarrassing.”
Still, some judges don’t want to get left behind in the AI age. With some in the AI sector suggesting that the supposed objectivity and rationality of AI models could make them better judges than fallible humans, some on the bench may come to believe that falling behind poses a bigger risk than getting too far out ahead.
A ‘crisis waiting to happen’
The risks of early adoption have set off alarm bells for Judge Scott Schlegel, who serves on the Fifth Circuit Court of Appeal in Louisiana. Schlegel has long blogged about the helpful role technology can play in modernizing the court system, but he has warned that AI-generated mistakes in judges’ rulings signal a “crisis waiting to happen,” one that would dwarf the problem of lawyers submitting filings with made-up cases.
Attorneys who make mistakes can get sanctioned, have their motions dismissed, or lose cases when the opposing party finds out and flags the errors. “When the judge makes a mistake, that’s the law,” he says. “I can’t go a month or two later and go ‘Oops, so sorry,’ and reverse myself. It doesn’t work that way.”
Consider child custody cases or bail proceedings, Schlegel says: “There are pretty significant consequences when a judge relies upon artificial intelligence to make the decision,” especially if the citations that decision relies on are made-up or incorrect.
This is not theoretical. In June, a Georgia appellate court judge issued an order that relied in part on made-up cases submitted by one of the parties, a mistake that went uncaught. In July, a federal judge in New Jersey withdrew an opinion after lawyers complained that it, too, contained hallucinations.
Unlike lawyers, who can be ordered by the court to explain why there are mistakes in their filings, judges face no such requirement to account for errors in their rulings, and there is little reason to think they’ll do it voluntarily. On August 4, a federal judge in Mississippi had to issue a new decision in a civil rights case after the original was found to contain incorrect names and serious errors. The judge did not fully explain what led to the errors, even after the state asked him to do so. “No further explanation is warranted,” the judge wrote.
These mistakes could erode the public’s faith in the legitimacy of courts, Schlegel says. Certain narrow, closely monitored applications of AI, such as summarizing testimony or getting quick feedback on writing, can save time, and they can produce good results if judges treat the work like that of a first-year associate, checking it thoroughly for accuracy. But most of the job of being a judge is dealing with what he calls the white-page problem: you’re presiding over a complex case with a blank page in front of you, forced to make difficult decisions. Thinking through those decisions, he says, is itself the work of being a judge. Getting help with a first draft from an AI undermines that purpose.
“If you’re making a decision on who gets the kids this weekend and somebody finds out you use Grok and you should have used Gemini or ChatGPT—you know, that’s not the justice system.”