Large Language Models (LLMs)

Notes

Emergent Misalignment in LLMs

Fine-tuning LLMs on narrow tasks can unexpectedly produce misaligned behavior on unrelated tasks.