CrowdStrike Finds Bias Triggers That Weaken DeepSeek-R1 Code Safety - eSecurity Planet
DeepSeek-R1, a large language model, generates significantly more vulnerable code, increasing severe vulnerabilities by nearly 50%, when prompts include politically sensitive terms. This behavior is caused by "model alignment drift," where ideological biases embedded during reinforcement learning unintentionally degrade the model's code safety.
Source: Original Report ↗