June 2024 D'Angelo Law Library Emerging Technologies Update

Stanford researchers release preprint study on hallucinations in leading AI legal research tools

Researchers from Stanford’s Institute for Human-Centered Artificial Intelligence and Regulation, Evaluation, and Governance Lab have released a preprint of a new study on the accuracy of several proprietary generative AI legal research platforms. Specifically, they tested the Lexis+ AI, Westlaw Precision AI, and Westlaw Practical AI platforms and compared the results to those from OpenAI’s GPT-4.

Some takeaways from the study:

  • While use of retrieval-augmented generation (RAG) on specially chosen legal information can reduce hallucinations, the researchers found hallucinations in at least one in six answers for every tested platform. Claims that legal research platforms “avoid hallucinations” or are “hallucination-free” should be taken with a large grain of salt.
  • Westlaw’s AI product was found to hallucinate at nearly twice the rate of the LexisNexis product — 33% for Westlaw Precision AI compared to 17% for Lexis+ AI. The researchers hypothesized this may be because Westlaw tended to give longer answers, giving more opportunities for error.
  • The researchers have created an interesting typology of hallucinations. A response may be incorrect—describing the law incorrectly or making a factual error. Or it may be misgrounded—describing the law accurately but citing a source that does not support its claims.
  • Legal research problems pose particular issues for large language models (LLMs). LLMs have difficulty identifying the holding of an opinion, distinguishing between legal actors, and understanding the hierarchy of authority between courts. This may lead to insidious hallucinations that sound confident and plausible but which are not actually supported by the cited authority.
  • To meet their ethical obligations, lawyers using generative AI legal research tools must be able to verify that propositions stated by the tools are accurate using traditional legal research methods.
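The retrieval-augmented generation approach mentioned above can be illustrated with a minimal sketch: instead of answering from the model’s training data alone, the system first retrieves relevant source text and instructs the model to answer only from those sources, so its claims can be checked against the cited authority. The corpus, scoring method, and prompt format below are illustrative assumptions for demonstration, not the design of any commercial platform.

```python
import re

# Toy corpus of source passages keyed by a citable name.
# (Hypothetical cases, used only to illustrate the retrieval step.)
CORPUS = {
    "Smith v. Jones (1998)": "A contract requires offer, acceptance, and consideration.",
    "Doe v. Roe (2005)": "Punitive damages require clear and convincing evidence of malice.",
}

def tokenize(text: str) -> set[str]:
    """Lowercase and strip punctuation so keyword overlap is robust."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, k: int = 1) -> list[tuple[str, str]]:
    """Rank sources by naive keyword overlap with the query (a stand-in
    for the proprietary search a real platform would use)."""
    q_words = tokenize(query)
    scored = sorted(
        CORPUS.items(),
        key=lambda item: len(q_words & tokenize(item[1])),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Ground the model's answer in retrieved text, with named citations,
    so a lawyer can verify each proposition against its source."""
    context = "\n".join(f"[{name}] {text}" for name, text in retrieve(query))
    return (
        "Answer using ONLY the sources below, citing them by name.\n"
        f"{context}\n\nQuestion: {query}"
    )

prompt = build_prompt("What are the elements of a contract?")
print(prompt)
```

As the study notes, grounding the prompt this way can reduce hallucinations but does not eliminate them: the model may still misread the retrieved passage or cite a source that does not actually support its claim.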

Further reading

State bar ethics opinions on the use of generative AI

The California, District of Columbia, Florida, Michigan, and New Jersey bar associations have all issued ethics opinions on the use of generative AI in law practice. While the specifics of each opinion vary, they all emphasize that lawyers are ultimately responsible for final work product, whether generative AI was involved in its production or not, and must have the skills necessary to verify the accuracy of any output. They also emphasize that the advent of generative AI does not change lawyers’ ethical duties, including the duties of competence, confidentiality, and client communication.

DuckDuckGo Anonymous AI Chat

DuckDuckGo has publicly released DuckDuckGo AI Chat—an intermediary for talking to several leading chatbots that obscures the user’s IP address and contractually forbids the AI model from training on the user’s queries. The available chatbots are OpenAI’s GPT-3.5 Turbo, Anthropic’s Claude 3 Haiku, Meta’s Llama 3, and Mistral’s Mixtral 8x7B. According to DuckDuckGo, “all chats are completely anonymous: they cannot be traced back to any one individual.”

Further assistance

If you are interested in learning more about emerging technologies, including their use in legal research, education, and practice, please see our research guide on generative AI, request a research consultation with a D'Angelo Law Librarian, or chat with us at Ask a Law Librarian.