The era of LLM foundation models has inverted a fundamental assumption in software engineering: code, once written, no longer belongs exclusively to its creators. Any publicly accessible code can become training data, absorbed into models that can reproduce, adapt, and redistribute it without consent. This paper argues that this situation is not merely a legal or ethical challenge but a technical one that requires new defensive primitives. We introduce the concept of statistical opacity, defined as the deliberate design of code representations that resist neural pattern extraction while preserving human readability and machine executability. We articulate a research agenda spanning theory, mechanisms, tools, and evaluation. We contend that statistical opacity will become as fundamental to software security as cryptography became to data security. Just as the community learned to design systems assuming adversaries could intercept communications, we must now learn to design systems assuming adversaries can learn from code.
Nadia Daoudi (Luxembourg Institute of Science and Technology), Iván Alfonso (Luxembourg Institute of Science and Technology), Jordi Cabot (Luxembourg Institute of Science and Technology)