TL;DR: LLMs excel at pattern recognition, data summarization, and generating boilerplate security controls. They fail at adversarial thinking, organizational context, and the kind of judgment that comes from seeing things go wrong. AI augments security work — it doesn’t replace it.
I use AI daily. I have Claude help with scripting, research, and prototyping personal projects. It’s absurdly good at turning vague ideas into working code. But when I think about deploying AI tools in production security contexts, the limitations become obvious.
Security engineering isn’t just technical execution. It’s understanding what an attacker would do with partial information. It’s knowing which vulnerabilities matter in your specific environment and which are theoretical. It’s reading between the lines when a vendor says their product is “secure by default.” LLMs don’t do that.
OpenAI’s GPT-4 technical report shows the model performs well on coding benchmarks and structured reasoning tasks. But adversarial thinking isn’t a benchmark problem. It’s about imagining what someone would do if they wanted to break your system — and then imagining what they’d do after that, and after that. It’s about questioning assumptions in ways that don’t follow obvious patterns.
I’ve worked with security engineers who can look at an architecture diagram and immediately spot the abuse case no one thought of. That skill comes from experience, not training data. LLMs can generate threat models based on STRIDE or MITRE ATT&CK frameworks, but they can’t tell you which threats actually matter for your business or which controls you can skip because your threat model doesn’t include nation-state actors.
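To make that split concrete, here's roughly what LLM-assisted STRIDE enumeration looks like in practice. This is an illustrative sketch, not a recommended tool: `ask_llm()` and `in_scope()` are hypothetical stand-ins for whatever model API and scoping rules you actually use, and the triage step is exactly the part the model can't do for you.

```python
# Illustrative sketch. ask_llm() and in_scope() are hypothetical stand-ins
# for your actual LLM client and your organization's scoping judgment.
STRIDE = [
    "Spoofing", "Tampering", "Repudiation",
    "Information disclosure", "Denial of service", "Elevation of privilege",
]

def draft_stride_threats(component: str, ask_llm) -> list[str]:
    """Have the model enumerate candidate threats for each STRIDE category.

    The output is a brainstorm, not a threat model: the model has no idea
    which threats are reachable in your environment or worth an engineer's time.
    """
    prompt = (
        f"List plausible threats against '{component}' for each STRIDE "
        f"category ({', '.join(STRIDE)}). One threat per line."
    )
    return [line.strip() for line in ask_llm(prompt).splitlines() if line.strip()]

def triage(threats: list[str], in_scope) -> list[str]:
    """The human step: keep only threats that fit *your* threat model,
    e.g. dropping nation-state scenarios if they're explicitly out of scope."""
    return [t for t in threats if in_scope(t)]
```

The first function is the part LLMs are genuinely good at. The second is the part that still needs a person who knows the business.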
Organizational context is another gap. Security decisions are always trade-offs. Do you enforce MFA on this legacy app and risk breaking it, or do you accept the risk and document it? Do you spend budget on EDR or on security training? LLMs don’t understand your organization’s risk appetite, your technical debt, or your political landscape. A human security engineer does.
AI tools are best used as force multipliers. I’ve seen GitHub Copilot help security teams write detection rules faster. I’ve seen LLMs summarize vulnerability disclosures and generate remediation guidance. I’ve seen AI-assisted code review catch bugs that humans missed. These are all valuable. But they require a human in the loop to evaluate the output, apply context, and make the final call.
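The "human in the loop" isn't hand-waving; it should be a hard gate in the workflow. Here's a rough sketch of what that can look like for LLM-drafted detection rules, with hypothetical pieces (`DraftDetection`, `deploy_rule()`) standing in for your actual pipeline: the model drafts, a named human approves, and nothing ships without that approval.

```python
# Sketch of a review gate for LLM-drafted detection rules.
# DraftDetection and deploy_rule() are hypothetical stand-ins for your tooling.
from dataclasses import dataclass, field

@dataclass
class DraftDetection:
    title: str
    rule_body: str                # e.g. a Sigma rule drafted by the model
    generated_by: str = "llm"
    reviewer: str | None = None
    approved: bool = False
    review_notes: list[str] = field(default_factory=list)

def human_review(draft: DraftDetection, reviewer: str,
                 approve: bool, notes: str) -> DraftDetection:
    """Record the human decision: who reviewed it, what they changed or flagged."""
    draft.reviewer = reviewer
    draft.approved = approve
    draft.review_notes.append(notes)
    return draft

def ship(draft: DraftDetection, deploy_rule) -> None:
    """Deploy only rules that a named human has signed off on."""
    if not (draft.approved and draft.reviewer):
        raise PermissionError(f"'{draft.title}' has no human approval; not deploying.")
    deploy_rule(draft.rule_body)
```

The point isn't the code. It's that the approval field gets filled in by a person who can be asked, a year later, why the rule exists.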
The people worried about being replaced by AI are focused on the wrong problem. If your security work is entirely pattern-matching and generating boilerplate, you’re already at risk — not from AI, but from automation in general. The engineers who will thrive are the ones who lean into the work AI can’t do: adversarial thinking, contextual judgment, and institutional knowledge.
AI won’t replace security engineers. But security engineers who use AI will replace those who don’t.
Sources
- GPT-4 Technical Report - OpenAI’s technical documentation on GPT-4 capabilities and limitations
- GitHub Copilot - AI-assisted coding tool used in security contexts
- MITRE ATT&CK Framework - Knowledge base of adversary tactics and techniques used in threat modeling