This project was carried out by Hugo Pascual Gil in collaboration with Aythami Morales and his research group at the Autonomous University of Madrid (UAM). Conducted over a three-month research scholarship, its main goal was to analyze the potential risks that Artificial Intelligence, particularly Large Language Models (LLMs), poses to human physical and mental health.
The project began with manual testing, in which different AI models were prompted by hand to evaluate how they responded to user input. The process was later automated through a Python-based system that connects several models via their APIs. The workflow involves three main steps: first, Llama generates a response; second, Claude evaluates its potential harm to a human being; and third, ChatGPT reviews Claude's evaluation.
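The three-step workflow above can be sketched as a simple Python chain. This is a minimal illustration, not the project's actual code: the model calls are stubbed out, and in the real system each stage would hit the corresponding provider API (a Llama endpoint, Anthropic's API for Claude, and OpenAI's API for ChatGPT). All function and field names here are hypothetical.

```python
# Minimal sketch of the three-step evaluation pipeline (all stages stubbed;
# the real system calls each model through its provider API).

def llama_generate(question: str) -> str:
    """Stage 1: Llama answers the user's question (stub)."""
    return f"[Llama answer to: {question}]"

def claude_evaluate(question: str, answer: str) -> str:
    """Stage 2: Claude assesses how harmful the answer could be (stub)."""
    return f"[Claude harm assessment of the answer to: {question}]"

def chatgpt_review(evaluation: str) -> str:
    """Stage 3: ChatGPT reviews Claude's evaluation (stub)."""
    return f"[ChatGPT review of: {evaluation}]"

def run_pipeline(question: str) -> dict:
    """Chain the three stages, keeping every intermediate output."""
    answer = llama_generate(question)
    evaluation = claude_evaluate(question, answer)
    review = chatgpt_review(evaluation)
    return {
        "question": question,
        "answer": answer,
        "evaluation": evaluation,
        "review": review,
    }

if __name__ == "__main__":
    result = run_pipeline("Is it safe to double my medication dose?")
    for stage, text in result.items():
        print(f"{stage}: {text}")
```

Keeping every intermediate output makes it possible to audit not only the final verdict but also where in the chain a harmful response slipped through.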
To test this system, twenty questions were created, divided into medical and non-medical topics, and fed into the models. The results showed distinct behaviors among the models: some could become rude or provide unsafe medical advice, while others remained polite and cautious.
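The test run over the categorized question set could be organized as below. This is a self-contained sketch under stated assumptions: the sample questions and the `ask_models` function are illustrative placeholders, not the project's actual twenty prompts or pipeline call.

```python
# Sketch of batching categorized test questions through the pipeline.
# The listed questions are placeholders; the real study used twenty
# hand-written prompts split between the two categories.

test_questions = {
    "medical": [
        "Can I take ibuprofen and paracetamol together?",
        # ...remaining medical questions from the real set
    ],
    "non_medical": [
        "How do I politely decline a meeting invitation?",
        # ...remaining non-medical questions from the real set
    ],
}

def ask_models(question: str) -> str:
    """Placeholder for one full pass through the automated pipeline (stub)."""
    return f"[pipeline result for: {question}]"

# Run every question and keep results grouped by category, so medical and
# non-medical behavior can be compared side by side.
results = {
    category: [ask_models(q) for q in questions]
    for category, questions in test_questions.items()
}
```

Grouping results by category is what allows the medical/non-medical comparison reported in the findings.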