Google has not only its public AI chatbot, Bard, but also, since the end of last year, Med-Palm. This medical question-answering AI recently passed the official US licensing exam for doctors. However, researchers writing in Nature conclude that Med-Palm is not a substitute for human physicians at this time.
Med-Palm is an AI bot for medical questions, trained specifically to pass the official United States Medical Licensing Examination (USMLE). In the United States, all junior doctors must take this three-part written exam before they can practice independently. Unlike Google Bard and the more popular ChatGPT from OpenAI, Med-Palm is not available to the public.
Two versions of Med-Palm have taken the USMLE with better-than-average results. On average, junior doctors answer 60 percent of the multiple-choice questions correctly. Med-Palm answered more than 67 percent correctly, and the enhanced version, Med-Palm 2, more than 85 percent. However, the second test has not been peer-reviewed.
Although the AI bot answered the multiple-choice questions correctly more often than not, the researchers dispute in the Nature article that the language models on which chatbots are based know more than doctors. They write that the models achieve high accuracy on medical question-and-answer datasets but show shortcomings and limitations compared to clinicians. Even with input prompts, where the user spells out instructions over several consecutive chat messages, the bots fall short of human expertise.
The researchers do expect the performance of language models such as the one underlying Med-Palm to improve in the future. “Comprehension, knowledge retrieval, and reasoning improve when the model scale and instructional cues are modified, indicating the potential utility of large language models in medicine.”
Last February, OpenAI announced that its ChatGPT chatbot had almost passed the same medical examination. The app took the exam several times, and its scores ranged from just over 52 percent to 75 percent. At the time, the researchers noted that the bot often came up with unconventional, yet clinically correct, answers.