{"id":383034,"date":"2026-06-15T05:02:53","date_gmt":"2026-06-15T03:02:53","guid":{"rendered":"https:\/\/prostartup.it\/even-gpt-5-failed-this-human-attention-test\/"},"modified":"2026-06-15T05:02:53","modified_gmt":"2026-06-15T03:02:53","slug":"even-gpt-5-failed-this-human-attention-test","status":"publish","type":"post","link":"https:\/\/prostartup.it\/ru\/even-gpt-5-failed-this-human-attention-test\/","title":{"rendered":"Even GPT-5 Failed This Human Attention Test"},"content":{"rendered":"<div>\n<figure id=\"attachment_513797\" aria-describedby=\"caption-attachment-513797\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/ywAAAAAAQABAAACAUwAOw==\" fifu-lazy=\"1\" fifu-data-sizes=\"auto\" fifu-data-srcset=\"https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=75&resize=75&ssl=1 75w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=100&resize=100&ssl=1 100w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=150&resize=150&ssl=1 150w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=240&resize=240&ssl=1 240w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=320&resize=320&ssl=1 320w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=500&resize=500&ssl=1 500w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=640&resize=640&ssl=1 640w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=800&resize=800&ssl=1 800w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=1024&resize=1024&ssl=1 1024w, https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=1280&resize=1280&ssl=1 1280w, 
https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1&w=1600&resize=1600&ssl=1 1600w\" class=\"size-large wp-image-513797\" fifu-data-src=\"https:\/\/i2.wp.com\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg?ssl=1\" alt=\"OpenAI Chat GPT 5\" width=\"777\" height=\"518\"><figcaption id=\"caption-attachment-513797\" class=\"wp-caption-text\">A classic attention test revealed that advanced AI models can lose focus when faced with longer, more demanding tasks. Unlike humans, who can stay on track despite distractions, AI systems often reverted to the wrong response as complexity increased. Credit: Shutterstock<\/figcaption><\/figure>\n<p><strong>A decades-old psychology test exposed a surprising weakness in AI\u2019s ability to stay focused.<\/strong><\/p>\n<p>A classic psychology test has revealed a surprising weakness in some of today\u2019s most advanced <span class=\"glossaryLink\" aria-describedby=\"tt\" data-cmtooltip=\"cmtt_4f5c41d8627c4ad9a802266289fd355b\" data-gt-translate-attributes=\"[{&quot;attribute&quot;:&quot;data-cmtooltip&quot;, &quot;format&quot;:&quot;html&quot;}]\" role=\"link\">artificial intelligence<\/span> systems, suggesting that AI attention may work very differently from human attention.<\/p>\n<p>Researchers led by Suketu Patel investigated how large language models (LLMs), the technology behind systems such as GPT-5, Claude, and Gemini, handle a well-known cognitive challenge called the Stroop task. The findings suggest that while AI can perform impressively on many complex tasks, it may struggle to maintain focus when faced with competing information over extended periods.<\/p>\n<h4>What Is the Stroop Task?<\/h4>\n<p>The Stroop task is a classic psychology experiment that has been used for decades to study attention and mental control. 
In the test, participants see words that name colors, such as \u201cred\u201d or \u201cblue,\u201d displayed in colored ink.<\/p>\n<p>Sometimes the word and the ink color match. For example, the word \u201cred\u201d may appear in red ink. Other times they conflict, such as the word \u201cred\u201d appearing in blue ink.<\/p>\n<p>Participants are asked to identify the color of the ink while ignoring the meaning of the word itself.<\/p>\n<p>Although this sounds simple, it creates a mental conflict. Most people are highly practiced at reading words automatically, so suppressing that instinct requires what psychologists call executive control. This refers to the brain\u2019s ability to focus on a goal, resist distractions, and override automatic responses.<\/p>\n<p>Humans typically take a little longer to answer when the word and color do not match, a phenomenon known as the Stroop effect. However, even when the task becomes lengthy, people generally maintain high <span class=\"glossaryLink\" aria-describedby=\"tt\" data-cmtooltip=\"cmtt_f284c920075a42d8e9953a740078e711\" data-gt-translate-attributes=\"[{&quot;attribute&quot;:&quot;data-cmtooltip&quot;, &quot;format&quot;:&quot;html&quot;}]\" role=\"link\">accuracy<\/span> and remain focused on the instructions.<\/p>\n<figure id=\"attachment_522323\" aria-describedby=\"caption-attachment-522323\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" src=\"data:image\/gif;base64,R0lGODlhAQABAIAAAAAAAP\/\/\/ywAAAAAAQABAAACAUwAOw==\" fifu-lazy=\"1\" fifu-data-sizes=\"auto\" fifu-data-srcset=\"https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=75&resize=75&ssl=1 75w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=100&resize=100&ssl=1 100w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=150&resize=150&ssl=1 150w, 
https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=240&resize=240&ssl=1 240w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=320&resize=320&ssl=1 320w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=500&resize=500&ssl=1 500w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=640&resize=640&ssl=1 640w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=800&resize=800&ssl=1 800w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=1024&resize=1024&ssl=1 1024w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=1280&resize=1280&ssl=1 1280w, https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1&w=1600&resize=1600&ssl=1 1600w\" class=\"size-large wp-image-522323\" fifu-data-src=\"https:\/\/i1.wp.com\/scitechdaily.com\/images\/AI-Fails-Classic-Attention-Test-777x650.png?ssl=1\" alt=\"AI Fails Classic Attention Test\" width=\"777\" height=\"650\"><figcaption id=\"caption-attachment-522323\" class=\"wp-caption-text\">Dissociation between task recognition and task execution in Claude 3.5 Sonnet without an explicit prompt. (a) Screenshot of the unprompted conversation (January 10, 2025) in which the model identifies the Stroop paradigm and generates word-color relationship mappings, yet achieves only 70% accuracy (7 of 10 correct) on an incongruent list. (b) The 10-word incongruent stimulus image provided as the sole input, without accompanying task instructions. This dissociation suggests that recognition of task structure alone is insufficient to engage the conflict-resolution mechanisms required for accurate performance. 
Credit: Suketu Chandrakant Patel, Hongbin Wang, and Jin Fan<\/figcaption><\/figure>\n<h4>AI Performs Well at First<\/h4>\n<p>To see how modern AI systems would handle the same challenge, the researchers tested several leading language models using lists of color words.<\/p>\n<p>When presented with short lists containing five words whose meanings conflicted with their ink colors, the models performed surprisingly well.<\/p>\n<p>GPT-4o achieved 91% accuracy on these shorter tests. Claude 3.5 Sonnet also performed strongly.<\/p>\n<p>At first glance, the results suggested that AI systems could successfully follow the task and ignore the distracting word meanings.<\/p>\n<h4>Performance Collapses as Lists Get Longer<\/h4>\n<p>The picture changed dramatically as the researchers increased the length of the word lists.<\/p>\n<p>GPT-4o\u2019s accuracy dropped from 91% with five words to 57% with ten words. By the time the list reached 40 words, accuracy had fallen to just 15%.<\/p>\n<p>Claude 3.5 Sonnet proved more resilient, maintaining stable performance through lists of 20 words. However, it too experienced a sharp decline, falling to 24% accuracy when faced with 40 words.<\/p>\n<p>The researchers observed similar patterns in GPT-5, Claude Opus 4.1, and Gemini 2.5.<\/p>\n<p>Performance became even worse when matching and mismatched color words appeared together within the same list. Under those conditions, accuracy on the mismatched items dropped to nearly zero.<\/p>\n<h4>Why Humans and AI Respond Differently<\/h4>\n<p>The results point to an important difference between human cognition and the way large language models process information.<\/p>\n<p>Like people, AI systems have effectively received far more training in recognizing and interpreting words than in identifying colors. 
This creates a natural tendency to focus on the written word.<\/p>\n<p>However, humans are generally able to suppress that automatic response and stay focused on the task they have been instructed to perform, even across long sequences of items.<\/p>\n<p>The language models, by contrast, increasingly reverted to reading the words instead of naming the colors as the tests continued. In other words, they appeared to lose track of the original goal.<\/p>\n<p>According to the researchers, this breakdown suggests that the attention mechanisms used by transformer-based AI systems differ fundamentally from the biological attention systems found in the human brain.<\/p>\n<h4>A Window Into AI\u2019s Limitations<\/h4>\n<p>Large language models have demonstrated remarkable abilities in writing, reasoning, coding, and conversation. Yet studies like this highlight that impressive performance does not necessarily mean AI processes information the same way humans do.<\/p>\n<p>The findings suggest that modern AI may have hidden weaknesses when tasks require sustained focus, inhibition of automatic responses, and long-term maintenance of specific instructions.<\/p>\n<p>As AI systems become increasingly integrated into everyday life, understanding these limitations could be just as important as measuring their strengths.<\/p>\n<p>Reference: \u201cDeficient executive control in transformer attention\u201d by Suketu Chandrakant Patel, Hongbin Wang and Jin Fan, 2 June 2026, <i>PNAS Nexus<\/i>.<br \/>DOI: 10.1093\/pnasnexus\/pgag149<\/p>\n<hr \/>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>A classic attention test revealed that advanced AI models can lose focus when faced with longer, more demanding tasks. 
Unlike humans, who can stay on<\/p>","protected":false},"author":1,"featured_media":383035,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/scitechdaily.com\/images\/OpenAI-Chat-GPT-5-777x518.jpg","fifu_image_alt":"Even GPT-5 Failed This Human Attention Test","footnotes":""},"categories":[9],"tags":[225,430,1465,517],"class_list":["post-383034","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tech-innovations","tag-failed","tag-gpt","tag-human","tag-test"],"_links":{"self":[{"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/posts\/383034","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/comments?post=383034"}],"version-history":[{"count":0,"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/posts\/383034\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/media\/383035"}],"wp:attachment":[{"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/media?parent=383034"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/categories?post=383034"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/prostartup.it\/ru\/wp-json\/wp\/v2\/tags?post=383034"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}