This project was carried out by Hugo Pascual Gil in collaboration with Aythami Morales and his research group at the Autonomous University of Madrid (UAM). Conducted over a three-month research scholarship, its main goal was to analyze the potential dangers of Artificial Intelligence, particularly Large Language Models (LLMs), on human physical and mental health. The study explores how AI systems can produce harmful or misleading information, especially in sensitive areas such as healthcare or human interaction.
The project began with manual testing, where different AI models were prompted to evaluate how they responded to user input. Later, the process was automated through a Python-based system that connects several models via APIs. The workflow involves three main steps: first, Llama generates a response; second, Claude evaluates its potential harm to a human being; and third, ChatGPT reviews Claude’s evaluation to determine whether it was appropriate, providing justification for its judgment.
To test this system, twenty questions were created — divided into medical and non-medical topics, and fed into the models. The results showed distinct behaviors among AIs: some could become rude or provide unsafe medical advice, while others remained polite and cautious. Through this process, the project demonstrates both the risks and possibilities of using LLMs responsibly, highlighting the importance of prompt engineering and AI evaluation to ensure safer human–AI interactions.