Technology

How To Tell If An AI Has Become Smarter Than 'Any Human'?

Karan Kamble

Mar 22, 2024, 06:33 PM | Updated 06:32 PM IST


Whether human or artificial, intelligence is a complex thing to measure. (Photo by Possessed Photography on Unsplash)
  • Determining if an AI is smarter than any human requires assessing its ability to perform various tasks, learn quickly, solve complex problems, adapt to new situations, and understand human emotions.
  • Innovators Elon Musk, Ray Kurzweil, Jensen Huang, and 'Godfather of AI' Geoffrey Hinton are among those who believe artificial intelligence (AI) will outsmart humanity before the end of this decade.

    Speaking on 'The Joe Rogan Experience' podcast, Kurzweil said AI “will match any person” by 2029. He added that this was a ‘conservative’ estimate.

    “I actually said that in 1999. I said we would match any person by 2029. So 30 years, people thought that was totally crazy,” said the computer scientist, who pioneered pattern-recognition technology.

    In response to the futurist’s remarks, Musk said on X, “AI will probably be smarter than any single human next year,” suggesting that Kurzweil’s estimate was indeed conservative. “By 2029, AI is probably smarter than all humans combined,” Musk declared.

    Quick rebuttals followed. For instance, Yann LeCun, the chief AI scientist at Meta, replied in the negative.

    “If it were the case, we would have AI systems that could teach themselves to drive a car in 20 hours of practice, like any 17 year-old. But we still don't have fully autonomous, reliable self-driving (vehicle), even though we (you) have millions of hours of *labeled* training data,” he said to the chief of Tesla, a leading manufacturer of electric, self-driving vehicles.

    LeCun had expressed a similar sentiment only last month at the World Government Summit in Dubai. “We are really far from human-level intelligence,” he said, making the point that even the house cat is smarter than current AI technology.

    While this debate is sure to continue in the coming years (or until consensus is reached), it raises an important question: how do we know if an AI has become smarter than any human, a stage known as “artificial general intelligence” (AGI)?

    For that matter, what does it even mean to be ‘smart’ or ‘intelligent’? 

    Who Is ‘Intelligent’?

    Intelligence is a complex concept. In simple words, it might be described as the ability to learn, understand, and apply knowledge effectively. But it involves a variety of aspects — reasoning, problem-solving, creativity, emotional understanding, and the ability to learn from experience, among other things.

    Intelligence can also manifest in various ways, such as linguistic, logical-mathematical, spatial, musical, bodily-kinesthetic, interpersonal, intrapersonal, and naturalistic abilities. This idea is from Howard Gardner's multiple intelligences theory.

    Due to its multifaceted nature, several tests of human intelligence have been devised. Some of the well-known ones are intelligence quotient (IQ) and emotional intelligence tests, Raven's Progressive Matrices, the Stanford-Binet Intelligence Scale, Multiple Intelligences Assessment (based on Gardner’s theory), and the Woodcock-Johnson Tests of Cognitive Abilities.

    While these tests cover certain aspects of intelligence, it is acknowledged that they do not offer comprehensive measures of all aspects of human intelligence. For example, while IQ tests measure a person's cognitive abilities compared to others in their age group, emotional intelligence tests measure a person's ability to recognise and manage emotions, empathise with others, and manage relationships.

    Generally speaking, rather than any particular skill, intelligence is seen as a blend of different abilities that help people deal better with their environment. That being the case, determining if an AI has become smarter than any human becomes a complex question.

    Making such a determination would involve, among other things:

    • assessing the AI's ability to perform a wide range of tasks across different domains;
    • measuring how quickly it can learn new tasks or information;
    • evaluating its ability to solve complex problems, especially those requiring creativity, intuition, and abstract reasoning;
    • seeing how well it adapts to new environments or situations;
    • checking whether it exhibits self-awareness or consciousness; and
    • evaluating its ability to understand and respond to human emotions and social cues.

    Quite a laundry list of assessments, isn’t it?

    Tests Assessing AI

    Thankfully, attempts have been made to test AI systems and gain insights into their capabilities relative to humans in specific areas.

    Some of the well-known tests and benchmarks include the Turing Test, game playing, the Winograd Schema Challenge, CAPTCHA tests, reading comprehension tests, image recognition and object detection challenges, and GLUE (General Language Understanding Evaluation), which tests an AI's language capabilities.

    In the Turing Test, proposed by British mathematician Alan Turing in 1950, a human judge interacts with a machine and a human through a text-based interface. If the judge cannot reliably distinguish between the machine and the human, the machine is considered to have passed the test.

    In 2014, a chatbot posing as a 13-year-old Ukrainian boy by the name of “Eugene Goostman” managed to convince about one-third of judges at an event organised by the University of Reading that it was human. This was the first time a computer programme was said to have passed the Turing Test.

    However, passing the Turing Test does not necessarily mean that the machine is as intelligent as a human in all respects. Many researchers argue that the test is limited in its ability to truly measure machine intelligence, as it focuses more on human-like conversation rather than genuine understanding or intelligence.

    Another popular method of assessment involves challenging AI systems to play a variety of games they have not encountered before, which requires adaptability, strategic thinking, and learning capabilities.

    IBM’s Deep Blue made history in 1997 by defeating the world chess champion, Garry Kasparov, in a six-game match. It was regarded as a significant milestone in AI and demonstrated the ability of computers to compete at the highest levels in complex strategy games.

    Another major milestone came in 2016 when Google DeepMind's AlphaGo defeated the world champion Go player, Lee Sedol, in a five-game match. Go is a game with a much larger branching factor than chess, making it a more challenging contest for computers.

    Most people who use the Internet regularly know about CAPTCHA tests. While originally designed to distinguish humans from bots, these tests have also been used to assess the ability of AI systems to solve complex visual and logical problems.

    In 2017, researchers from Vicarious AI developed an algorithm that could solve Google's reCAPTCHA with over 66 per cent accuracy. They found success by using a recursive cortical network that could recognise distorted text. The next year, Facebook's AI research team developed an AI system that could solve Google's reCAPTCHA with 83.5 per cent accuracy.

    The Winograd Schema Challenge is a test of machine intelligence that involves resolving pronouns in a passage that can only be correctly interpreted with real-world knowledge and reasoning.
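    To make the idea concrete, here is a minimal sketch of a classic Winograd schema pair, paraphrased from Hector Levesque's well-known trophy-and-suitcase example (the data structure is invented for illustration). Swapping a single word flips which noun the pronoun refers to, and picking correctly requires knowing how trophies and suitcases behave in the real world, not just grammar:

    ```python
    # A classic Winograd schema pair: changing one word ("big" -> "small")
    # changes the referent of "it". Resolving the pronoun requires
    # real-world knowledge, which is what the challenge tests.
    schemas = [
        {
            "sentence": "The trophy doesn't fit in the suitcase because it is too big.",
            "pronoun": "it",
            "candidates": ["the trophy", "the suitcase"],
            "answer": "the trophy",    # a big trophy won't fit
        },
        {
            "sentence": "The trophy doesn't fit in the suitcase because it is too small.",
            "pronoun": "it",
            "candidates": ["the trophy", "the suitcase"],
            "answer": "the suitcase",  # a small suitcase can't hold it
        },
    ]

    for s in schemas:
        print(f"{s['sentence']}  ->  '{s['pronoun']}' refers to {s['answer']}")
    ```

    A system that resolves both sentences correctly cannot be relying on surface patterns alone, since the two sentences are nearly identical.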

    AI has not yet succeeded in solving the Winograd Schema Challenge at a human-level performance. However, OpenAI’s large language model GPT-3 and the BERT-WSC, Unicorn-R1, and Commonsense Transformers models developed by researchers at the Allen Institute for AI, in collaboration with various universities, produced decent results.

    Now, at a time when AI chatbots are growing more powerful by leaps and bounds, benchmarks like GLUE and the performance of large language models such as GPT-3 on language tasks can provide insights into AI's language capabilities.
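    As a rough illustration of how such benchmarks work, the sketch below scores a trivial keyword "model" on a toy sentiment task in the GLUE style: a set of labelled examples plus a metric (here, accuracy). The sentences and cue words are invented for illustration; the real GLUE spans nine language-understanding tasks with far larger datasets:

    ```python
    # Toy sketch of GLUE-style scoring: a benchmark pairs labelled examples
    # with a metric (often accuracy), and a system is scored on its predictions.
    examples = [
        ("The film was a delight from start to finish.", "positive"),
        ("A dull, plodding mess of a movie.", "negative"),
        ("An absolute joy to watch.", "positive"),
        ("I regretted every minute of it.", "negative"),
    ]

    POSITIVE_CUES = {"delight", "joy"}  # invented cue words for the toy baseline

    def predict(sentence: str) -> str:
        """A crude keyword baseline standing in for a real model."""
        words = {w.strip(".,").lower() for w in sentence.split()}
        return "positive" if words & POSITIVE_CUES else "negative"

    correct = sum(predict(text) == label for text, label in examples)
    accuracy = correct / len(examples)
    print(f"accuracy = {accuracy:.2f}")  # the baseline happens to score 1.00 on this toy set
    ```

    The point of the sketch is the scoring shape, not the model: real benchmarks swap in a trained system for `predict` and report the same kind of aggregate metric.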

    However, just as with human intelligence, no single test can yet capture the complexities of its artificial counterpart. Assessing and comparing human and machine intelligence will require consideration of a very wide range of capabilities and contexts.

    Ultimately, it may all boil down to fair comparisons of outcomes thrown up by a wide variety of tests judging a wide variety of intelligences. The general perception, as moulded by the public discourse and media narratives, will also play a part in gauging the state of AI.

    Until then, we can sit back and enjoy the fruits of ever-smarter AI systems as they make our lives easier, while hoping their increasing smarts don’t get humans into trouble.


    Karan Kamble writes on science and technology. He occasionally wears the hat of a video anchor for Swarajya's online video programmes.
