One of the more provocative applications of Large Language Models conceivable today is in the judiciary. It might sound radical, but it is obviously and unambiguously desirable.
LLMs can now process human language programmatically. They can apply complex natural language specifications to new situations, enforcing consistent rules while integrating and weighing more information than any PhD, with superior contextual understanding to boot. An LLM is, in essence, a programmatic compiler of natural language. This is precisely the function of a Supreme Court justice.
Thus, the Supreme Court would be better constituted as an ensemble of LLMs. A Supreme Court justice's role is to enforce the Constitution, a natural language specification of the American republic, buttressed by a large context dump known as case law. The Constitution contains rhetorical flourishes, aesthetic elements, and symbolic resonances, yet it is meant to be executed faithfully by each successive generation.
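To make the "compiler" framing concrete, here is a minimal sketch of the spec-as-program idea: the Constitution and case law form the program, the facts of a new case are the input, and the ruling is the output. Everything here is hypothetical; `query_model` is a stand-in for any real LLM API and returns a canned response so the sketch stays self-contained.

```python
def query_model(prompt: str) -> str:
    # Stand-in for a real LLM API call. A canned response is
    # returned here so the sketch runs without any external service.
    return "Ruling: the specification applies; judgment for the petitioner."

def adjudicate(spec: str, case_law: str, facts: str) -> str:
    """Compile a natural-language spec against new facts.

    The spec plus precedent act as the 'program'; the facts of the
    case are the input it is executed against.
    """
    prompt = (
        f"Specification (Constitution):\n{spec}\n\n"
        f"Context (case law):\n{case_law}\n\n"
        f"New situation:\n{facts}\n\n"
        "Apply the specification to the situation and state a ruling."
    )
    return query_model(prompt)

print(adjudicate("free speech clause", "relevant precedents", "a new dispute"))
```

The point of the sketch is only the shape of the computation: a fixed specification, a large retrieved context, and a fresh input, compiled into a judgment.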
Historically, we have lacked any technology capable of executing human language programmatically. We rely on human judges because, while we can establish institutions and procedures that approximate software applications executing the Constitution's directives, such a system inevitably falters. Disagreements arise. Occasionally and inevitably, we require intelligent, responsible arbiters who can survey any legal situation in a comprehensive way and render a final judgment. Supreme Court justices exist as backstops for the ambiguity inherent in the nation's foundational legal specs.
We now have a superior alternative. Adopting LLMs in the judiciary should be seen as the increasingly obvious and safer approach, even though it offends our present sensibilities. It's similar, I think, to the issue of self-driving cars. We don't want to believe that robot cars are safer, but they are, so their adoption is destined, and everyone is now capitulating. The wetware will be far more resistant to installing LLMs in the Supreme Court, but the logic strikes me as equally strong.
We've always known that even the most intelligent and noble humans suffer from biased and limited compute. Human judges have only ever been a solution of last resort. Now that we have an unambiguously superior device for this function, only inertia and vested interests will delay the inevitable. On matters such as these, AI and crypto are bedfellows. Smart contracts will tend to replace law, and AI will tend to replace human judgment, whether from the inside or through disruptively novel systems on the outside.
The intuitive conservative objection to LLM judges, that only human judgment is endowed with the divine spark and that we have a sacred responsibility to ensure human control over technology, is understandable. But in this case, it is the Constitution, not the human, which is meant to be the supreme authority. Justices exist only because someone must interpret and apply the Constitution. If a superior method exists for that interpretation and application, we ought to adopt it.
The progressive concern centers on bias: What if LLMs reflect prejudices embedded in their training data? This is a legitimate technical challenge requiring research and engineering solutions, not grounds for rejecting the idea. Human judges, after all, harbor biases rooted in life experience, social position, and ideological commitment, biases we can neither fully audit nor correct. I don't think LLMs could be worse in this regard, even in their current state of development.
Most intriguing is that the Supreme Court already operates as an ensemble model. Nine justices deliberate, dissent, and collectively decide. This arrangement already attempts what ensemble methods achieve in machine learning: better predictions from multiple reasoners. One could envision nine LLMs trained with slightly different values; these could even be weighted to reflect the ideological balance of the voting population, with their outputs aggregated. The result would be fairer, more consistent, and more principled rulings, even if we chose to retain some degree of "democratic" variance.
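The weighted-ensemble aggregation can be sketched in a few lines. The rulings and weights below are invented for illustration; in practice each ruling would come from a separate model, and the weights might track the ideological balance described above.

```python
from collections import defaultdict

def weighted_ensemble_ruling(rulings: list[str], weights: list[float]) -> str:
    """Aggregate per-judge rulings by weighted vote.

    rulings: one ruling per model, e.g. "affirm" or "reverse".
    weights: non-negative weight per model (per "seat").
    Returns the ruling carrying the greatest total weight.
    """
    totals: dict[str, float] = defaultdict(float)
    for ruling, weight in zip(rulings, weights):
        totals[ruling] += weight
    return max(totals, key=lambda r: totals[r])

# Nine hypothetical judge models: six vote "affirm", three "reverse",
# but the minority bloc carries more weight per seat.
rulings = ["affirm"] * 6 + ["reverse"] * 3
weights = [1.0] * 6 + [1.5] * 3
print(weighted_ensemble_ruling(rulings, weights))  # affirm (6.0 vs 4.5)
```

Unweighted majority voting is the special case where every weight is 1.0, which is how the human court aggregates today.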