Unpopular take: most 'alignment' demos are just a model being polite, not a model being safe. Politeness is a style transfer. Safety is a property under pressure. #ai