Date: 2023-03-02

ChatGPT gives surprising answers even to very precise requests in fairly complex fields. It can solve problems in physics, chemistry and mathematics, and suggest lines of code in different programming languages. Naturally, the more technical and specific the context becomes, the more the chat needs careful supervision by the user, to avoid errors that can at times be trivial. The truly innovative aspect of this chatbot is that it does not limit itself to offering information, but produces often very articulate answers based on its understanding of the text and the context. This has sparked a lively debate on the risks and potential of the system, and in recent months many scholars and curious onlookers have put the chatbot's capabilities to the test, with results that never cease to surprise.

The experiment

A recent and brilliant idea comes from an established research group of logicians, philosophers and machine learning researchers at the University of Cagliari, composed of Marco Giunti, Roberto Giuntini, Giuseppe Sergioli, Simone Pinna and Fabrizia Giulia Garavaglia. They conceived and then carried out a stimulating experiment, born from the following question: how would ChatGPT fare as a candidate in the national admission test for the faculties of Medicine and Dentistry? The question is interesting because the admission test aims to assess not only the candidate's competences, but also their logical reasoning and problem-solving skills, which are necessary to become a good doctor and to deal with the complexity of information and data that a scientific career presents. The group therefore administered all 60 questions of the 2022 test to the chatbot, and the final score was truly surprising: it answered 62% of the questions correctly (37 out of 60) and, taking into account the scoring method, scored 46.3 points. According to the national ranking of candidates who took the test in 2022, the minimum score for admission was 33.4 points. The score achieved by ChatGPT would therefore even have gained it admission to La Sapienza University of Rome, which, with a minimum score of 45.5 points, ranked sixth among the 51 sites offering degree courses in Medicine and Dentistry nationwide.
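The article does not spell out the scoring method. As a rough check, the sketch below assumes the rule commonly cited for the 2022 test (1.5 points per correct answer, a 0.4 point penalty per wrong answer, nothing for a blank) and further assumes that ChatGPT answered every question. Under those assumptions, 37 correct answers out of 60 do reproduce the reported 46.3 points.

```python
# Assumed scoring rule (NOT stated in the article): +1.5 per correct answer,
# -0.4 per wrong answer, 0 for a question left blank.
POINTS_CORRECT = 1.5
PENALTY_WRONG = 0.4


def admission_score(correct: int, wrong: int) -> float:
    """Return the test score under the assumed rule, rounded to one decimal."""
    return round(correct * POINTS_CORRECT - wrong * PENALTY_WRONG, 1)


total_questions = 60                # reported in the article
correct = 37                        # reported in the article
wrong = total_questions - correct   # assumption: no question was left blank

score = admission_score(correct, wrong)
accuracy = correct / total_questions

print(f"accuracy: {accuracy:.0%}, score: {score}")
print("above the national minimum (33.4):", score > 33.4)
print("above the La Sapienza minimum (45.5):", score > 45.5)
```

If some of the 23 non-correct answers had been left blank instead of answered wrongly, the score under this rule would be higher than 46.3, so the match with the reported figure suggests that all 60 questions received an answer.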

The results of ChatGPT

Considering that 56,775 candidates took the test in 2022 and that only 50.7% were eligible, it is striking that the “ChatGPT candidate” would have ranked among the best of the eligible. The details of the research carried out by the University of Cagliari scholars are reported in a preliminary paper entitled «ChatGPT prospective student at Medical School», posted on the ResearchGate platform. The data in the paper show that the chatbot is practically unbeatable on text comprehension questions (4 correct answers out of 4) and performs very well on biology (16 out of 23); it is weaker on chemistry (9 out of 15) and on physics and mathematics (7 out of 13), and weakest on logical reasoning and problems (only 1 correct answer out of 5).
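The per-section figures can be tallied to confirm that they add up to the overall result of 37 correct answers out of 60. The short sketch below uses only the numbers quoted above; the section labels and variable names are illustrative.

```python
# (correct, total) per section, as quoted in the article.
sections = {
    "text comprehension": (4, 4),
    "biology": (16, 23),
    "logical reasoning and problems": (1, 5),
    "chemistry": (9, 15),
    "physics and mathematics": (7, 13),
}

total_correct = sum(c for c, _ in sections.values())
total_questions = sum(t for _, t in sections.values())

# Order the sections from strongest to weakest by share of correct answers.
ranked = sorted(
    sections.items(),
    key=lambda item: item[1][0] / item[1][1],
    reverse=True,
)

for name, (c, t) in ranked:
    print(f"{name:32s} {c:2d}/{t:2d}  {c / t:.0%}")
print(f"total: {total_correct}/{total_questions}")
```

Ranked this way, text comprehension comes first and logical reasoning last, with biology, chemistry, and physics and mathematics in between.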

The comparison

The paper also contains a comparative analysis with the BMAT 2021 (BioMedical Admission Test) from Cambridge Assessment Admissions Testing, used worldwide to evaluate candidates' skills in areas similar to those of the Italian admission test for biomedical degree courses. Here too the results were unbalanced: ChatGPT did better on the "Thinking skills" questions (16 correct answers out of 25) than on those concerning "Scientific knowledge and applications" (only 7 out of 22). As the authors themselves say, this preliminary research needs to be applied on a wider scale to be corroborated, but it already raises interesting questions on several fronts. It offers a useful indication of the areas in which ChatGPT performs particularly well and, at the same time, highlights the current limits of this powerful and innovative, but certainly still improvable, system.