Alignment

1 story

Tech & Science
AI alignment
Claude 4.5 Expands to a 200-Principle Constitution as 'Meta-Feedback' Alignment Gains Traction
Anthropic's latest model expands its constitutional AI framework to over 200 principles. OpenAI's parallel "meta-feedback" technique — humans critiquing model reasoning, not just outputs — is reportedly cutting reward hacking.
By Noah Schreiber29 September 2025