Kernel code removals driven by LLM-created security reports

(lwn.net)

47 points | by edward 2 hours ago

6 comments

staticassertion 1 hour ago
They can't maintain the code so they are no longer going to maintain the code.
[-]
- traceroute66 1 hour ago
  > They can't maintain the code so they are no longer going to maintain the code.
  Yes, I don't see the point of maintaining technical debt just for the sake of it.
  The security environment in 2026 is such that legacy unmaintained code is a very real security risk for obscure zero-days to exploit to gain a foot in the door.
  Reading through the list I don't see it being an issue for the overwhelming majority of Linux users.
  Who, for example, still uses ISDN in 2026 ? Most telcos have stopped all new sales and existing ISDN circuits will be forcefully disconnected within 3–5 years as the telcos complete their FTTP build-outs and the copper network is subsequently decomissioned.
- sigmoid10 1 hour ago
  Seems like this should have happened anyways and LLMs just finally forced them to admit it.
  [-]
  - bastawhiz 33 minutes ago
    You're being downvoted but I think you're right in a lot of ways. If you read through the patches for some of the removals, the reasons come down to:
    - Nobody is familiar with the code
    - Almost all of the recent fixes are from static analysis
    - Nobody is even sure if anyone uses the code
    This feels a lot like CPython culling stdlib modules and making them pypi packages. The people who rely on those things have a little bit of extra work if they want a recent kernel version, and everyone else benefits (directly or indirectly) by way of there being less stuff that needs attention.
- fluidcruft 1 hour ago
  It's an interesting form of tree shaking.
  The overlap of bugs being found, nobody caring enough to bother read the reports or fix the code, and nobody caring that the modules are pushed out of main seems good.
- goalieca 34 minutes ago
  Maybe attackers would focus on these unused bits for very niche products, but generally no one would waste their time.
  In general, drivers make up the largest attack surface in the kernel and many of them are just along for the ride rather than being actively maintained and reviewed by researchers.
mmsc 24 minutes ago
Unmaintained code is a security issue in of itself, so this is of course a net benefit.
cozzyd 36 minutes ago
Seems like there should be some "level of maintenance" metric for modules and distros can pick which they include by default and which are packaged separately based on what they care about. Arch users will build the world but an EL user who needs an unmaintained module would have to explicitly install kmod-isdn or even build it themselves
ferguess_k 54 minutes ago
Are we already in the time, or close to the time, that well-trained LLMs are more efficient in finding security holes than all but the best developers out there, even for OS kernel code? Can someone educate me on this?
[-]
- yk 10 minutes ago
  My theory is, that a lot of security bugs are low hanging fruit for LLMs in the sense that it is a bit tedious but not that hard pattern matching. (Let's see the free occurs in foo(), so if I trigger bar() after foo() then I have a use after free, that should be possible if I trigger an exception in baz::init().)
- jcalvinowens 26 minutes ago
  My experience with these tools is that they generate absolutely enormous amounts of insidiously wrong false positives, and it actually takes a decent amount of skill to work through the 99% which is garbage with any velocity.
  Of course some people don't do that, and send all the reports anyway... and then scream from the hilltops about how incredible LLMs are when by sheer luck one happens to be right. Not only is that blatant p-hacking, it's incredibly antisocial.
  It's disingenuous marketing speak to say LLMs are "finding" any security holes at all: they find a thousand hypotheticals of which one or two might be real. A broken clock is right twice a day.
  [-]
  - NitpickLawyer 2 minutes ago
    Your experience seems to be at least 3-6 months old. Long time kernel maintainers have recently written on this subject. They say that ~3 months ago the quality and accuracy of the reports crossed a threshold and are now legitimately useful.
- traceroute66 48 minutes ago
  > well-trained LLMs are more efficient in finding security holes than all but the best developers out there, even for OS kernel code?
  No.
  Like everything else an LLM touches, it is prone to slop and hallucinations.
  You still need someone who knows what they are doing to review (and preferably manually validate) the findings.
  What all this recent hype carefully glosses over is the volume of false-positives. I guarantee you it is > 0 and most likely a fairly large number.
  And like most things LLM, the bigger the codebase the more likely the false-positives due to self-imposed context window constraints.
  Its all very well these blog posts saying "LLM found this serious bug in Firefox", well yeah but that's only because the security analyst filtered out all the junk (and knew what to ask the LLM in the prompt in the first place).
  [-]
  - stratos123 24 minutes ago
    A 0% false-positive rate is not necessary for LLM-powered security review to be a big deal. It was worthless a few months ago, when the models were terrible at actually finding vulnerabilities and so basically all the reports were confabulated, with a false positive rate of >95%. Nowadays things are much better - see e.g. [1] by a kernel maintainer.
    Another way to see this is that you mentioned "LLM found this serious bug in Firefox", but the actual number in that Mozilla report [2] was 14 high-severity bugs, and 90 minor ones. However you look at it, it's an impressive result for a security audit, and I dount that the Antropic team had to manually filter out hundreds-to-thousands of false-positives to produce it.
    They did have to manually write minimal exploits for each bug, because Opus was bad at it[3]. This is a problem that Mythos doesn't have. With access to Mythos, to repeat the same audit, you'd likely just need to make the model itself write all the exploits, which incidentally would also filter out a lot of the false positives. I think the hype is mostly justified.
    [1] https://lwn.net/Articles/1065620/
    [2] https://blog.mozilla.org/en/firefox/hardening-firefox-anthro...
    [3] https://www.anthropic.com/news/mozilla-firefox-security
- olmo23 50 minutes ago
  We are there. This is pretty much the reason why Mythos isn't being released publically.
  [-]
  - pocksuppet 11 minutes ago
    The reason Mythos isn't being released publicly is to drive up Anthropic's valuation by making big promises.
    [-]
    - dymk 2 minutes ago
      https://blog.mozilla.org/en/privacy-security/ai-security-zer...
      > As part of our continued collaboration with Anthropic, we had the opportunity to apply an early version of Claude Mythos Preview to Firefox. This week’s release of Firefox 150 includes fixes for 271 vulnerabilities identified during this initial evaluation.
- stratos123 43 minutes ago
  In terms of quantity, definitely yes (a single person managing a swarm of Opusi can already find much more real bugs than a security researcher, hence the rise in reports).
  In terms of quality ("are there bugs that professional humans can't see at any budget but LLMs can?") - it's not very clear, because Opus is still worse than a human specialist, but Mythos might be comparable. We'll just have to wait and see what results Project Glasswing gets.
  Either way, cybersecurity is going to get real weird real soon, because even slightly-dumb models can have a large effect if they are cheap and fast enough.
  EDIT: Mozilla thinks "no" to the second question, by the way: "Encouragingly, we also haven’t seen any bugs that couldn’t have been found by an elite human researcher.", when talking about the 271 vulnerabilities recently found by Mythos. https://blog.mozilla.org/en/firefox/ai-security-zero-day-vul...
  [-]
  - DanielHB 34 minutes ago
    There is also a huge surface area of security problems that can't happen in practice due to how other parts of the code work. A classic example is unsanitized input being used somewhere where untrusted users can't inject any input.
    Being flooded with these kind of reports can make the actual real problems harder to see.
  - chuckadams 30 minutes ago
    > Opusi
    The plural of "Opus" is "Opera". Might be a tad confusing tho :)
rasz 1 hour ago
Most if not all of the listed stuff could be converted to used mode code.
[-]
- dbdr 52 minutes ago
  *user-mode code.
jimmypk 36 minutes ago
[dead]