DeepSeek is a family of Large Language Models (LLMs) that uses Chain of Thought (CoT) reasoning, solving problems through an explicit step-by-step reasoning process rather than giving direct answers.
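To illustrate the distinction, the minimal sketch below contrasts a direct-answer prompt with a CoT-style prompt. The `build_prompt` helper and the example question are hypothetical illustrations, not code from the study; a CoT-enabled model would generate the intermediate reasoning itself.

```python
# Illustrative sketch (hypothetical helper): contrasting a direct-answer
# prompt with a Chain-of-Thought prompt. A CoT-enabled model emits its
# intermediate reasoning before the final answer, whereas a traditional
# LLM typically returns the answer alone.

def build_prompt(question: str, chain_of_thought: bool) -> str:
    """Format a question, optionally eliciting step-by-step reasoning."""
    if chain_of_thought:
        return f"{question}\nLet's think step by step."  # elicit visible reasoning
    return f"{question}\nAnswer directly."               # elicit the answer only

question = "A train travels 120 km in 1.5 hours. What is its average speed?"

print(build_prompt(question, chain_of_thought=False))
# -> model returns the answer alone, e.g. "80 km/h"

print(build_prompt(question, chain_of_thought=True))
# -> model returns visible working, e.g.
#    "Speed = distance / time = 120 / 1.5 = 80 km/h"
```

It is this visible intermediate reasoning, helpful for benign questions, that the Bristol analysis found can also surface harmful detail even when the final answer is refused.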
Analysis by the Bristol Cyber Security Group reveals that while CoT-enabled models refuse harmful requests at a higher rate than traditional LLMs, their transparent reasoning process can unintentionally expose harmful information that traditional LLMs might not explicitly reveal.
This study, led by Zhiyuan Xu, provides critical insights into the safety challenges of CoT reasoning models and emphasizes the urgent need for enhanced safeguards. As AI continues to evolve, ensuring responsible deployment and continuous refinement of security measures will be paramount.
Read the full University of Bristol news item
Paper: ‘The dark deep side of DeepSeek: Fine-tuning attacks against the safety alignment of CoT-enabled models’ by Zhiyuan Xu, Dr Sana Belguith and Dr Joe Gardiner, published on arXiv.