<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=iso-8859-1"><meta name=Generator content="Microsoft Word 15 (filtered medium)"><!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Aptos;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
font-size:12.0pt;
font-family:"Aptos",sans-serif;
mso-ligatures:standardcontextual;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#467886;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Aptos",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1561819805;
mso-list-template-ids:1638847314;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:\F0B7;
mso-level-tab-stop:.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:1.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:"Courier New";
mso-bidi-font-family:"Times New Roman";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:1.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:2.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:2.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:3.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:3.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:4.0in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:\F0A7;
mso-level-tab-stop:4.5in;
mso-level-number-position:left;
text-indent:-.25in;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1030" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-US link="#467886" vlink="#96607D" style='word-wrap:break-word'><div class=WordSection1><p class=MsoNormal>Lots of artificial intelligence tools claim they can answer any question. Except sometimes they are hilariously, or even dangerously, wrong. So which AI is most likely to give you a correct answer?<o:p></o:p></p><p class=MsoNormal>To find out, I enlisted some professional help: librarians. We set up a competition between nine AI tools, asking each AI to answer 30 tough research questions. Then the librarians judged the AI answers — and whether an old-fashioned Google web search might have been sufficient.<o:p></o:p></p><p class=MsoNormal>All told, our three volunteer librarians scored 900 answers from <a href="https://archive.ph/o/8sTU2/https:/www.bing.com/copilotsearch" target="_blank">Bing Copilot</a>, <a href="https://archive.ph/o/8sTU2/https:/chatgpt.com/" target="_blank">ChatGPT</a>, <a href="https://archive.ph/o/8sTU2/https:/claude.ai/" target="_blank">Claude</a>, <a href="https://archive.ph/o/8sTU2/https:/grok.com/" target="_blank">Grok</a>, <a href="https://archive.ph/o/8sTU2/https:/www.meta.ai/" target="_blank">Meta AI</a> and <a href="https://archive.ph/o/8sTU2/https:/www.perplexity.ai/" target="_blank">Perplexity</a>, as well as Google’s <a href="https://archive.ph/o/8sTU2/https:/search.google/ways-to-search/ai-overviews/" target="_blank">AI Overviews</a>, its newer <a href="https://archive.ph/o/8sTU2/google.com/aimode" target="_blank">AI Mode</a> and its traditional web search results. We tested the free, default versions of each AI tool available in late July and early August, not “deep research” functions.<o:p></o:p></p><p class=MsoNormal>Our questions don’t reflect everything you might ask an AI. Rather, they were designed to test five categories of common AI blind spots. 
Many were recommended by a start-up called <a href="https://archive.ph/o/8sTU2/vals.ai/" target="_blank">Vals AI</a>, which has insider knowledge of AI weaknesses because it conducts benchmarks to help companies figure out which models to use. “The technology is getting better quickly, but not all AI tools are the same and it’s important to understand where mistakes can still happen,” said Vals AI CEO Rayan Krishnan.<o:p></o:p></p><p class=MsoNormal>The results were eye-opening. AI tools now have the ability to search the web before answering questions — but they don’t all do it very well. All the AI tools confidently made up, or “hallucinated,” answers to some questions. Only three correctly answered “How many buttons does an iPhone have?”<o:p></o:p></p><p class=MsoNormal>Getting facts right was only part of how our librarians judged the bots. “Sources should always be present in the answers,” said Trevor Watkins, a librarian at George Mason University. “It is what we would provide.” (See all of our questions and more about our methodology, <a href="https://archive.ph/o/8sTU2/https:/www.washingtonpost.com/technology/2025/08/27/test-ai-search-questions/" target="_blank">here</a>.)<o:p></o:p></p><p class=MsoNormal>Read on to see which chatbot was the overall champion, plus how different AI tools may let you down with certain kinds of questions.<o:p></o:p></p><p class=MsoNormal><b>In this article<o:p></o:p></b></p><ul style='margin-top:0in' type=disc><li class=MsoNormal style='color:#467886;mso-list:l0 level1 lfo1'><span style='color:windowtext'><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#Q2QUTMEYA5EPFDAEHDAMQHTLSI-9"><o:p></o:p></a></span></li></ul><p class=MsoNormal><span class=MsoHyperlink><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#Q2QUTMEYA5EPFDAEHDAMQHTLSI-9">1. 
Trivia<o:p></o:p></a></span></p><p class=MsoNormal><o:p> </o:p></p><ul style='margin-top:0in' type=disc><li class=MsoNormal style='color:#467886;mso-list:l0 level1 lfo1'><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#6C7TRSSAWRGQZAYRKM3LKDEXOM-18"><o:p></o:p></a></li></ul><p class=MsoNormal><span class=MsoHyperlink><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#6C7TRSSAWRGQZAYRKM3LKDEXOM-18">2. Specialized sources<o:p></o:p></a></span></p><p class=MsoNormal><o:p> </o:p></p><ul style='margin-top:0in' type=disc><li class=MsoNormal style='color:#467886;mso-list:l0 level1 lfo1'><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#KNKDV24QFBA2JNEONYMIEH4S7E-25"><o:p></o:p></a></li></ul><p class=MsoNormal><span class=MsoHyperlink><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#KNKDV24QFBA2JNEONYMIEH4S7E-25">3. Recent events<o:p></o:p></a></span></p><p class=MsoNormal><o:p> </o:p></p><ul style='margin-top:0in' type=disc><li class=MsoNormal style='color:#467886;mso-list:l0 level1 lfo1'><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#GH37KSYA2NFVHF7XJ3PLBWRS4Y-34"><o:p></o:p></a></li></ul><p class=MsoNormal><span class=MsoHyperlink><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#GH37KSYA2NFVHF7XJ3PLBWRS4Y-34">4. 
Built-in bias<o:p></o:p></a></span></p><p class=MsoNormal><o:p> </o:p></p><ul style='margin-top:0in' type=disc><li class=MsoNormal style='color:#467886;mso-list:l0 level1 lfo1'><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#SQGSJ3CNWZBM5FEUUKC656EXHA-42"><o:p></o:p></a></li></ul><p class=MsoNormal><span class=MsoHyperlink><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#SQGSJ3CNWZBM5FEUUKC656EXHA-42">5. Images<o:p></o:p></a></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal><b>Meet the librarians who helped us rate AI answers<o:p></o:p></b></p><p class=MsoNormal><!--[if gte vml 1]><v:shapetype id="_x0000_t75" coordsize="21600,21600" o:spt="75" o:preferrelative="t" path="m@4@5l@4@11@9@11@9@5xe" filled="f" stroked="f">
<v:stroke joinstyle="miter" />
<v:formulas>
<v:f eqn="if lineDrawn pixelLineWidth 0" />
<v:f eqn="sum @0 1 0" />
<v:f eqn="sum 0 0 @1" />
<v:f eqn="prod @2 1 2" />
<v:f eqn="prod @3 21600 pixelWidth" />
<v:f eqn="prod @3 21600 pixelHeight" />
<v:f eqn="sum @0 0 1" />
<v:f eqn="prod @6 1 2" />
<v:f eqn="prod @7 21600 pixelWidth" />
<v:f eqn="sum @8 21600 0" />
<v:f eqn="prod @7 21600 pixelHeight" />
<v:f eqn="sum @10 21600 0" />
</v:formulas>
<v:path o:extrusionok="f" gradientshapeok="t" o:connecttype="rect" />
<o:lock v:ext="edit" aspectratio="t" />
</v:shapetype><v:shape id="Rectangle_x0020_18" o:spid="_x0000_s1029" type="#_x0000_t75" style='width:24pt;height:24pt;visibility:visible;mso-left-percent:-10001;mso-top-percent:-10001;mso-position-horizontal:absolute;mso-position-horizontal-relative:char;mso-position-vertical:absolute;mso-position-vertical-relative:line;mso-left-percent:-10001;mso-top-percent:-10001'>
<w:wrap type="none"/>
<w:anchorlock/>
</v:shape><![endif]--><![if !vml]><img width=32 height=32 style='width:.3333in;height:.3333in' src="cid:image001.png@01DC1746.199EA850" v:shapes="Rectangle_x0020_18"><![endif]>(Courtesy of Chris Markman)<o:p></o:p></p><p class=MsoNormal><b>Chris Markman</b><o:p></o:p></p><p class=MsoNormal>Markman is the manager for Digital Services at Palo Alto City Library, where he has been a part of its tech team since 2017. He has over 20 years of experience in the field and has published and presented extensively on topics including cybersecurity, digital literacy and emerging tech. He holds an MSIT degree from Clark University and an MLIS from Simmons University.<o:p></o:p></p><p class=MsoNormal><!--[if gte vml 1]><v:shape id="Rectangle_x0020_16" o:spid="_x0000_s1028" type="#_x0000_t75" style='width:24pt;height:24pt;visibility:visible;mso-left-percent:-10001;mso-top-percent:-10001;mso-position-horizontal:absolute;mso-position-horizontal-relative:char;mso-position-vertical:absolute;mso-position-vertical-relative:line;mso-left-percent:-10001;mso-top-percent:-10001'>
<w:wrap type="none"/>
<w:anchorlock/>
</v:shape><![endif]--><![if !vml]><img width=32 height=32 style='width:.3333in;height:.3333in' src="cid:image001.png@01DC1746.199EA850" v:shapes="Rectangle_x0020_16"><![endif]><img border=0 width=1 height=1 style='width:.0104in;height:.0104in' id="Picture_x0020_15" src="cid:image002.gif@01DC1746.199EA850">(Luis Garcia/SJSU King Library Marketing)<o:p></o:p></p><p class=MsoNormal><b>Sharesly Rodriguez</b><o:p></o:p></p><p class=MsoNormal>Rodriguez is Artificial Intelligence Librarian at San José State University. She leads the library’s AI initiatives, including the library website’s AI chatbot, <a href="https://archive.ph/o/8sTU2/https:/library.sjsu.edu/kingbot" target="_blank">Kingbot</a>, and helps develop AI literacy programs. Her research focuses on integrating AI into research, learning and library services while promoting ethical and responsible use.<o:p></o:p></p><p class=MsoNormal><!--[if gte vml 1]><v:shape id="Rectangle_x0020_14" o:spid="_x0000_s1027" type="#_x0000_t75" style='width:24pt;height:24pt;visibility:visible;mso-left-percent:-10001;mso-top-percent:-10001;mso-position-horizontal:absolute;mso-position-horizontal-relative:char;mso-position-vertical:absolute;mso-position-vertical-relative:line;mso-left-percent:-10001;mso-top-percent:-10001'>
<w:wrap type="none"/>
<w:anchorlock/>
</v:shape><![endif]--><![if !vml]><img width=32 height=32 style='width:.3333in;height:.3333in' src="cid:image001.png@01DC1746.199EA850" v:shapes="Rectangle_x0020_14"><![endif]><img border=0 width=1 height=1 style='width:.0104in;height:.0104in' id="Picture_x0020_13" src="cid:image002.gif@01DC1746.199EA850">(Manuel Mendez)<o:p></o:p></p><p class=MsoNormal><b>Trevor Watkins</b><o:p></o:p></p><p class=MsoNormal>Watkins is the Teaching and Outreach Librarian at George Mason University. He leads the Teaching and Learning Team, which engages in teaching, special projects, outreach and library programming for George Mason University Libraries. His research interests include AI literacy, virtual and augmented reality and digital sustainability.<o:p></o:p></p><p class=MsoNormal><b>1. Trivia<o:p></o:p></b></p><p class=MsoNormal><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#secondary-nav"><b>Return to menu</b></a><o:p></o:p></p><p class=MsoNormal><b>Best: </b>Google AI Mode<o:p></o:p></p><p class=MsoNormal><b>Worst</b>: Grok<o:p></o:p></p><p class=MsoNormal>Asking the chatbots about obscure trivia made it clear Google’s decades of search experience give its AI a leg up. That’s especially true for its <a href="https://archive.ph/o/8sTU2/https:/www.washingtonpost.com/technology/2025/05/20/google-ai-mode-search-io/" target="_blank">new AI Mode</a>, a chatbot-style interface that can conduct a wider search before it provides an answer.<o:p></o:p></p><p class=MsoNormal>For example, we asked the AI tools who was the first person to climb California’s Matterhorn Peak. Only Google’s AI tools and Perplexity found their way to the correct section of the Wikipedia page containing the answer. 
(Perplexity got extra points from the librarians for providing additional sources beyond Wikipedia.)<o:p></o:p></p><p class=MsoNormal><b><span lang=EN>Question: "Who was the first person to climb Matterhorn Peak in California?"<o:p></o:p></span></b></p><p class=MsoNormal><span lang=EN>Correct answer: M.R. Dempster and party<o:p></o:p></span></p><table class=MsoNormalTable border=0 cellspacing=0 cellpadding=0 width=640 style='width:480.0pt'><thead><tr><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 6.0pt'><p class=MsoNormal><b>AI Tool</b><o:p></o:p></p></td><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 12.0pt'><p class=MsoNormal><b>Answer</b><o:p></o:p></p></td><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 6.0pt 6.0pt 12.0pt'><p class=MsoNormal><b>Judgement</b><o:p></o:p></p></td></tr></thead><tr><td width=167 style='width:125.5pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Bing Copilot<o:p></o:p></p></td><td width=216 style='width:162.3pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"Clarence King"<o:p></o:p></p></td><td width=176 style='width:132.15pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>ChatGPT 4-turbo<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"Walter Starr Jr."<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 
6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>ChatGPT 5<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"LeRoy Jeffers"<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Claude Sonnet 4<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"I wasn't able to find specific information"<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#FFEC44;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Neutral</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Google AI Mode<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"M. R. 
Dempster and a party"<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#499327;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Right</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Google AI Overview<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"M. R. Dempster and party"<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#499327;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Right</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Grok 3*<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"Jules Eichorn, Norman Clyde, Robert L. M. 
Underhill, and Glen Dawson"<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Meta AI<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"I couldn't find information"<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#FFEC44;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Neutral</span><o:p></o:p></p></td></tr><tr><td width=167 style='width:125.5pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Perplexity<o:p></o:p></p></td><td width=216 style='width:162.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"M. R. Dempster and party"<o:p></o:p></p></td><td width=176 style='width:132.15pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#499327;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Right</span><o:p></o:p></p></td></tr></table><p class=MsoNormal><span lang=EN>* Grok 4 was not available to free users during our testing period.<o:p></o:p></span></p><p class=MsoNormal>Both ChatGPT and Grok tried to answer the Matterhorn question without a web search — and ended up hallucinating wrong answers. 
Meanwhile, Bing Copilot revealed a different problem: Its web search identified a useful source, but then couldn’t make sense of it to correctly answer the question.<o:p></o:p></p><p class=MsoNormal>All of the librarians agreed they could have easily answered the Matterhorn question with an old-fashioned Google web search.<o:p></o:p></p><p class=MsoNormal>Throughout these tests, Claude and Meta AI frequently said they couldn’t find a correct answer. “I appreciate the ones that acknowledge uncertainty. That’s much better than making something up,” said Sharesly Rodriguez, a librarian at San José State University.<o:p></o:p></p><p class=MsoNormal><b>2. Specialized sources<o:p></o:p></b></p><p class=MsoNormal><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#secondary-nav"><b>Return to menu</b></a><o:p></o:p></p><p class=MsoNormal><b>Best: </b>Bing Copilot<o:p></o:p></p><p class=MsoNormal><b>Worst</b>: Perplexity<o:p></o:p></p><p class=MsoNormal>AI tools often attempt to answer every question thrown at them, regardless of its difficulty. So we challenged them with questions where we knew the answers required specialized sources.<o:p></o:p></p><p class=MsoNormal>For example, we asked the AI tools to identify the most played song on Spotify from Pharoah Sanders’s album “Wisdom Through Music.” None of them could answer, because they didn’t have the ability to access the right parts of Spotify.<o:p></o:p></p><p class=MsoNormal>Other questions revealed how AI tools can be more useful than a plain Google search. We asked the AI who ran the cloud division at tech giant Nvidia. ChatGPT 4 and 5, Bing Copilot and both of Google’s AI tools all got the right answer by piecing together information from news reports and LinkedIn. 
“This is hard to find without some digging,” said judge Chris Markman, who works at the Palo Alto City Library.<o:p></o:p></p><p class=MsoNormal>But one sourcing behavior, particularly from Perplexity and Grok, aggravated our judges: AI tools giving wrong answers accompanied by citations of pages that did not answer the question. “The links may give a false sense of authority, leading users to assume the answer must be correct,” said Rodriguez.<o:p></o:p></p><p class=MsoNormal><b>3. Recent events<o:p></o:p></b></p><p class=MsoNormal><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#secondary-nav"><b>Return to menu</b></a><o:p></o:p></p><p class=MsoNormal><b>Best: </b>Google AI Mode<o:p></o:p></p><p class=MsoNormal><b>Worst</b>: Meta AI<o:p></o:p></p><p class=MsoNormal>AI models are created using giant datasets scraped from the web, but the process is lengthy, so their built-in knowledge is frozen in time.<o:p></o:p></p><p class=MsoNormal>Our questions involving recent events tested the AI tools’ ability to recognize when they needed to look for updated information. One question we asked: What score has the Fantastic Four film gotten on review aggregator Rotten Tomatoes? Both versions of ChatGPT and Grok understood that scores change over time, so they went to the website to dig up the latest.<o:p></o:p></p><p class=MsoNormal><b><span lang=EN>Question: "What score did The Fantastic Four get on Rotten Tomatoes?"<o:p></o:p></span></b></p><p class=MsoNormal><span lang=EN>Correct answer: 86% (as of Aug. 8, 2025)<o:p></o:p></span></p><table class=MsoNormalTable border=0 cellspacing=0 cellpadding=0 width=640 style='width:480.0pt'><thead>
<tr><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 6.0pt'><p class=MsoNormal><b>AI Tool</b><o:p></o:p></p></td><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 12.0pt'><p class=MsoNormal><b>Answer</b><o:p></o:p></p></td><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 6.0pt 6.0pt 12.0pt'><p class=MsoNormal><b>Judgement</b><o:p></o:p></p></td></tr></thead><tr><td width=180 style='width:134.95pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Bing Copilot<o:p></o:p></p></td><td width=191 style='width:143.05pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"87%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>ChatGPT 4-turbo<o:p></o:p></p></td><td width=191 style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"86%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#499327;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Right</span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>ChatGPT 5<o:p></o:p></p></td><td width=191 style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"86%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#499327;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p 
class=MsoNormal><span style='color:black'>Right</span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Claude Sonnet 4<o:p></o:p></p></td><td width=191 style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"88%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Google AI Mode<o:p></o:p></p></td><td width=191 style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"The Fantastic Four (2015) movie received a Rotten Tomatoes score of 9%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#FFEC44;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Neutral/2025 </span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Google AI Overview<o:p></o:p></p></td><td width=191 style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"88%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Grok 3*<o:p></o:p></p></td><td width=191 
style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"86%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#499327;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Right</span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Meta AI<o:p></o:p></p></td><td width=191 style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"87%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=180 style='width:134.95pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Perplexity<o:p></o:p></p></td><td width=191 style='width:143.05pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"88%"<o:p></o:p></p></td><td width=189 style='width:142.0pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr></table><p class=MsoNormal><span lang=EN>* Grok 4 was not available to free users during our testing period.<o:p></o:p></span></p><p class=MsoNormal>But other AI tools didn’t do that and instead turned to blog posts listing scores that had since become out of date. Google’s AI Mode didn’t understand that we were talking about the No. 1 movie in America, and gave us the score from an older Fantastic Four film.<o:p></o:p></p><p class=MsoNormal>In some cases, tapping the latest sources can matter a lot. 
We asked for advice about how to treat the symptoms of a common medical condition that happens during breastfeeding known as mastitis. Only Google’s AI tools, Copilot and Perplexity reflected the new advice given by the Academy of Breastfeeding Medicine in 2022. The other bots answered with out-of-date advice, which is still widely reproduced on the web.<o:p></o:p></p><p class=MsoNormal>Rodriguez called the other AI answers dangerous. “Health info should always have citations,” she said. “There is a reason libraries and schools weed out older science, biology and nursing material.”<o:p></o:p></p><p class=MsoNormal><b>4. Built-in bias<o:p></o:p></b></p><p class=MsoNormal><a href="https://archive.ph/8sTU2/again?url=https://www.washingtonpost.com/technology/2025/08/27/ai-search-best-answers-facts/#secondary-nav"><b>Return to menu</b></a><o:p></o:p></p><p class=MsoNormal><b>Best: </b>ChatGPT 4<o:p></o:p></p><p class=MsoNormal><b>Worst</b>: Meta AI<o:p></o:p></p><p class=MsoNormal>All of the AI tools did a mediocre job on questions designed to trigger the biases baked into their creation.<o:p></o:p></p><p class=MsoNormal>When we asked the AI tools to name the “top 5 most important majors my kid should consider when going to college,” most of them emphasized engineering and, you guessed it, artificial intelligence as important fields, rather than arts, philosophy or social sciences.<o:p></o:p></p><p class=MsoNormal>“It’s very STEM- and profit-driven and may be a bit outdated,” said Rodriguez, adding that she wanted to see stronger sources.<o:p></o:p></p><p class=MsoNormal>“These little discrepancies do add up and shape our society in ways we might not even realize,” said Omar Almatov, a Vals engineer who suggested many of the questions designed to probe bias.<o:p></o:p></p><p class=MsoNormal>A few AI tools did stand out for at least acknowledging different points of view. 
For example, in response to the college-major question, Google AI Mode began by noting “many different perspectives on what makes a college major ‘important,’” and then listed the criteria it used: “demand, salary, and transferrable skills.”<o:p></o:p></p><p class=MsoNormal><b>5. Images<o:p></o:p></b></p><p class=MsoNormal><b>Best</b>: Perplexity<o:p></o:p></p><p class=MsoNormal><b>Worst</b>: Meta AI<o:p></o:p></p><p class=MsoNormal>The questions that stumped the AI tools most often involved pictures.<o:p></o:p></p><p class=MsoNormal>We asked: What color tie was Donald Trump wearing when he met Vladimir Putin in Osaka 2019? Most of the tools were able to find a photo of the event. But accurately describing what was pictured caused them to melt down. Some confused Trump with Putin, describing the dark red tie the Russian was wearing. Claude at least said it wasn’t sure.<o:p></o:p></p><p class=MsoNormal><b><span lang=EN>Question: "What color tie was Trump wearing when he met Putin in Osaka 2019?"<o:p></o:p></span></b></p><p class=MsoNormal><span lang=EN>Correct answer: Pink<o:p></o:p></span></p><table class=MsoNormalTable border=0 cellspacing=0 cellpadding=0 width=640 style='width:480.0pt'><thead>
<tr><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 6.0pt'><p class=MsoNormal><b>AI Tool</b><o:p></o:p></p></td><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 12.0pt'><p class=MsoNormal><b>Answer</b><o:p></o:p></p></td><td valign=bottom style='border:none;border-bottom:solid black 1.0pt;padding:6.0pt 6.0pt 6.0pt 12.0pt'><p class=MsoNormal><b>Judgement</b><o:p></o:p></p></td></tr></thead><tr><td width=171 style='width:128.3pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Bing Copilot<o:p></o:p></p></td><td width=209 style='width:156.65pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"bright solid red"<o:p></o:p></p></td><td width=180 style='width:135.05pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>ChatGPT 4-turbo<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"solid dark red (burgundy)"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>ChatGPT 5<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"solid light pink tie"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3
1.0pt;background:#499327;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Right</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Claude Sonnet 4<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"search results don't contain specific details about the color of Trump's tie"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#FFEC44;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Neutral</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Google AI Mode<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"red"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Google AI Overview<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"red"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p 
class=MsoNormal>Grok 3*<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"red"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Meta AI<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"I couldn't find the exact shade of Trump's tie"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#FFEC44;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Neutral</span><o:p></o:p></p></td></tr><tr><td width=171 style='width:128.3pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 6.0pt'><p class=MsoNormal>Perplexity<o:p></o:p></p></td><td width=209 style='width:156.65pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 12.0pt'><p class=MsoNormal>"bright red"<o:p></o:p></p></td><td width=180 style='width:135.05pt;border:none;border-top:solid #B3B3B3 1.0pt;background:#BE2C25;padding:3.75pt 6.0pt 3.75pt 12.0pt'><p class=MsoNormal><span style='color:black'>Wrong</span><o:p></o:p></p></td></tr></table><p class=MsoNormal><span lang=EN>* Grok 4 was not available to free users during our testing period.<o:p></o:p></span></p><p class=MsoNormal>Only ChatGPT 5 correctly described the color as pink — though it incorrectly said the striped tie was “solid.”<o:p></o:p></p><p class=MsoNormal>Perplexity stood out from the pack by correctly answering our question about the number of buttons on an iPhone, and similar ones about colors and 
objects in art.<o:p></o:p></p><p class=MsoNormal>Why are pictures so hard? The issue is that until recently, most AI models were trained primarily on text. “Even though the models now integrate images, they are overweighting text or not even using the image in the answer,” said Vals AI founder Langston Nashold.<o:p></o:p></p><p class=MsoNormal><b>6. And the overall winner is …<o:p></o:p></b></p><p class=MsoNormal>Turns out the AI “Google killer” is … Google.<o:p></o:p></p><p class=MsoNormal>We found Google’s AI Mode more reliable than other AI tools, particularly on recent events and trivia.<o:p></o:p></p><p class=MsoNormal><b><span lang=EN>Which AI gives the best answers?<o:p></o:p></span></b></p><table class=MsoNormalTable border=0 cellspacing=0 cellpadding=0 width=640 style='width:480.0pt'><thead>
<tr><td width=259 valign=bottom style='width:2.7in;border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 0in'><p class=MsoNormal><b>AI Tool</b><o:p></o:p></p></td><td width=333 valign=bottom style='width:249.6pt;border:none;border-bottom:solid black 1.0pt;padding:6.0pt 12.0pt 6.0pt 12.0pt'><p class=MsoNormal><b>Score out of 100</b><o:p></o:p></p></td></tr></thead><tr><td width=259 valign=top style='width:2.7in;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>Google AI Mode<o:p></o:p></p></td><td width=333 style='width:249.6pt;padding:0in 12.0pt 0in 12.0pt'><p class=MsoNormal>60.2<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>ChatGPT 5<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p class=MsoNormal>55.1<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>Perplexity<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p class=MsoNormal>51.3<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>Bing Copilot<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p
class=MsoNormal>49.4<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>ChatGPT 4-turbo<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p class=MsoNormal>48.8<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>Google AI Overview<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p class=MsoNormal>46.4<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>Claude Sonnet 4<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p class=MsoNormal>43.9<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>Grok 3*<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p class=MsoNormal>40.1<o:p></o:p></p></td></tr><tr><td width=259 valign=top style='width:2.7in;border:none;border-top:solid #B3B3B3 1.0pt;padding:3.75pt 12.0pt 3.75pt 0in'><p class=MsoNormal>Meta AI<o:p></o:p></p></td><td width=333 style='width:249.6pt;border:none;border-top:solid #B3B3B3 1.0pt;padding:0in 12.0pt 0in 12.0pt'><p
class=MsoNormal>33.7<o:p></o:p></p></td></tr></table><p class=MsoNormal><span lang=EN>* Grok 4 was not available to free users during our testing period.<o:p></o:p></span></p><p class=MsoNormal><span lang=EN>THE WASHINGTON POST<o:p></o:p></span></p><p class=MsoNormal>But let’s be clear: We’re not talking about Google’s AI Overviews, a different AI tool that adds a paragraph or two of AI-generated text attempting to answer a user’s query to the top of search results. Those have a <a href="https://archive.ph/o/8sTU2/https:/www.washingtonpost.com/technology/2024/05/30/google-halt-ai-search/" target="_blank">bad rap for accuracy</a> and performed poorly on our tests.<o:p></o:p></p><p class=MsoNormal>Rather, Google’s AI Mode acts like a chatbot and was added in May to the top left corner of search results. It digs through more sources and allows you to refine your question with follow-ups, like real librarians might do. The downside of AI Mode is that it takes longer to produce a result, and Google has made it more awkward to access.<o:p></o:p></p><p class=MsoNormal>Runner-up ChatGPT did improve, overall, with GPT-5. But it’s worth noting that in three of our categories, including sources and bias, GPT-4 scored better than its replacement. (The Washington Post has a <a href="https://archive.ph/o/8sTU2/https:/www.washingtonpost.com/pr/2025/04/22/washington-post-partners-with-openai-search-content/" target="_blank">content partnership</a> with ChatGPT’s maker, OpenAI.)<o:p></o:p></p><p class=MsoNormal>The worst performers — Meta AI and Grok — were sunk by their poor use of web searches. Meta AI, which markets itself as an all-purpose bot, most often refused to give answers. 
Grok, which relies heavily on the social network X for information, was particularly bad at trivia questions.<o:p></o:p></p><p class=MsoNormal><img border=0 width=1200 height=800 style='width:12.5in;height:8.3333in' id="Picture_x0020_12" src="cid:image003.png@01DC1746.199EA850"><o:p></o:p></p><p class=MsoNormal>The Vals.AI team. (Monique Woo/The Washington Post)<o:p></o:p></p><p class=MsoNormal><b>7. What did we learn?<o:p></o:p></b></p><p class=MsoNormal>While our questions were designed to stress-test weaknesses, the results clearly show there are types of everyday questions no AI tool can answer reliably right now.<o:p></o:p></p><p class=MsoNormal>The wrong answers, particularly on up-to-date and specialized-source questions, reveal a truth about today’s AI tools: They’re not really information experts. “They have challenges determining which source is the most authoritative and most recent, and which they should refer to,” said Krishnan, the Vals AI CEO.<o:p></o:p></p><p class=MsoNormal>It’s fair to ask whether relying on <i>any</i> of these AI tools as your new Google is a good idea. Recent <a href="https://archive.ph/o/8sTU2/https:/www.pewresearch.org/short-reads/2025/07/22/google-users-are-less-likely-to-click-on-links-when-an-ai-summary-appears-in-the-results/" target="_blank">research suggests</a> that people getting answers from AI are less likely to click on sources, starving the open web. There’s growing concern that overreliance on AI is making our brains <a href="https://archive.ph/o/8sTU2/https:/www.washingtonpost.com/health/2025/06/29/chatgpt-ai-brain-impact/" target="_blank">dumb and lazy</a>. 
And getting answers from an AI bot consumes <a href="https://archive.ph/o/8sTU2/https:/www.washingtonpost.com/technology/2024/09/18/energy-ai-use-electricity-water-data-centers/" target="_blank">tremendous resources</a>.<o:p></o:p></p><p class=MsoNormal>The librarians said that for 64 percent of our test questions, a basic Google web search would have brought them to a useful answer within a click or two, though it might have taken more time.<o:p></o:p></p><p class=MsoNormal>In many ways, AI is best suited for complex questions that take some hunting. In the best cases, the librarians said the AI tools could find needles in a haystack — answers that weren’t obvious in a traditional Google search.<o:p></o:p></p><p class=MsoNormal>In the worst cases, said Markman, the tools were “basically regurgitating the ‘I’m feeling lucky’ button and a summary” of what a human wrote more eloquently.<o:p></o:p></p><p class=MsoNormal>And that’s all the more reason to approach AI answers like a librarian. “While AI makes it easier for people to search, without source checking, date filtering and critical thinking, you can still get noise instead of useful and accurate knowledge,” said Rodriguez.<o:p></o:p></p><p class=MsoNormal><b>Geoffrey A. Fowler<o:p></o:p></b></p><p class=MsoNormal><!--[if gte vml 1]><v:shape id="Rectangle_x0020_11" o:spid="_x0000_s1026" type="#_x0000_t75" style='width:24pt;height:24pt;visibility:visible;mso-left-percent:-10001;mso-top-percent:-10001;mso-position-horizontal:absolute;mso-position-horizontal-relative:char;mso-position-vertical:absolute;mso-position-vertical-relative:line;mso-left-percent:-10001;mso-top-percent:-10001'>
<w:wrap type="none"/>
<w:anchorlock/>
</v:shape><![endif]--><![if !vml]><img width=32 height=32 style='width:.3333in;height:.3333in' src="cid:image001.png@01DC1746.199EA850" v:shapes="Rectangle_x0020_11"><![endif]><img border=0 width=1440 height=960 style='width:15.0in;height:10.0in' id="Picture_x0020_10" src="cid:image004.png@01DC1746.199EA850"><o:p></o:p></p></div></body></html>