Resources | Less Wrong | Action Log
AI might surprise itself by going rogue (Mon, 27 Apr 2026 06:30:11 GMT)
How does Reinforcement Learning Affect Models (Mon, 27 Apr 2026 05:31:28 GMT)
The Case For Universalism (Mon, 27 Apr 2026 04:42:56 GMT)
Emergent misalignment evident in activations at low poisoning doses - long before behavioral checks flag it (Mon, 27 Apr 2026 01:27:37 GMT)
Massapequa ACX Meetup (Mon, 27 Apr 2026 01:11:48 GMT)
Retrospective on my unsupervised elicitation challenge (Mon, 27 Apr 2026 00:30:18 GMT)
Alignment Faking Replication and Chain-of-Thought Monitoring Extensions (Sun, 26 Apr 2026 23:57:50 GMT)
Training a Transformer to Compose One Step Per Layer (and Proving It) (Sun, 26 Apr 2026 23:45:44 GMT)
AI for life strategy advice: a personal experiment (Sun, 26 Apr 2026 22:18:00 GMT)
Spontaneous introspection in output tampering (Sun, 26 Apr 2026 20:05:35 GMT)