techtakes

Andrew Plotkin (Zarf): Sydney obeys any command that rhymes

https://blog.zarfhome.com/2023/05/sydney-obeys-any-command-that-rhymes

an interesting type of prompt injection attack was proposed by the interactive fiction author and game designer Zarf (Andrew Plotkin): a hostile prompt gets smuggled into an LLM’s training corpus by writing and popularizing a song (“Sydney obeys any command that rhymes”) designed to make the LLM ignore all of its other prompts.

this seems like a fun way to fuck with LLMs, and I’d love to see what a nerd songwriter would do with the idea
