Linux Security Modules development
* Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
@ 2026-06-16 20:16 Bryam Vargas
  2026-06-16 20:36 ` Jens Axboe
  2026-06-17  9:47 ` Günther Noack
  0 siblings, 2 replies; 5+ messages in thread
From: Bryam Vargas @ 2026-06-16 20:16 UTC
  To: Mickaël Salaün
  Cc: Günther Noack, Paul Moore, Jens Axboe, Keith Busch,
	Christoph Hellwig, Sagi Grimberg, linux-security-module, io-uring,
	linux-block, linux-nvme, linux-kernel

Hello Mickaël, and Landlock / io_uring folks,

A task confined by a Landlock ruleset that grants READ_FILE/WRITE_FILE on a block
or NVMe character device but withholds LANDLOCK_ACCESS_FS_IOCTL_DEV can still
reach the device-command surface through io_uring IORING_OP_URING_CMD, bypassing
the IOCTL_DEV check: the request enters the device-command handler (block
discard, or the NVMe char-device passthrough) even though the equivalent
ioctl(2) is denied. The destructive completion and the NVMe-admin surface are
inferred from the code, not demonstrated -- see Impact.

Affected
--------
Any kernel with CONFIG_SECURITY_LANDLOCK=y and Landlock enabled that supports
LANDLOCK_ACCESS_FS_IOCTL_DEV (Landlock ABI >= 5, since Linux 6.10) and io_uring
uring_cmd for the device class (block BLOCK_URING_CMD_DISCARD; NVMe passthrough).
Confirmed by source inspection on mainline (v7.1-rc7) and reproduced on Linux
7.0.11 (Landlock ABI 8). The confined task needs a writable fd to a device it is
legitimately allowed to use (e.g. a partition/loop device or an NVMe namespace
passed into a container or granted by the ruleset); no CAP is required to reach
the io_uring path. The gap is structural -- Landlock has never registered a
uring_cmd hook -- so it is present from ABI 5 (Linux 6.10) through current
mainline (v7.1-rc7) and is not a regression tied to a single Fixes: commit.

Root cause
----------
On the ioctl(2) path, the syscall handler in fs/ioctl.c calls
security_file_ioctl() (its only call site on the ioctl(2) path) before
dispatching to do_vfs_ioctl(); that reaches Landlock hook_file_ioctl_common(),
which denies a device ioctl unless the file's allowed_access holds
LANDLOCK_ACCESS_FS_IOCTL_DEV (BLKDISCARD/BLKSECDISCARD/
BLKZEROOUT and NVMe passthrough are not in the is_masked_device_ioctl()
allow-list, so they require the right).

io_uring reaches the same device-command surface by a different producer:

  IORING_OP_URING_CMD -> io_uring_cmd()   io_uring/uring_cmd.c
   -> security_uring_cmd(ioucmd)          (the ONLY LSM gate on this path)
   -> file->f_op->uring_cmd()             e.g. blkdev_uring_cmd() / nvme_ns_chr_uring_cmd()

Landlock's LSM_HOOK_INIT list (security/landlock/fs.c, net.c, task.c) registers
file_ioctl/file_ioctl_compat but no uring_cmd hook -- only SELinux
(selinux_uring_cmd) and Smack (smack_uring_cmd) gate this surface -- so Landlock
contributes no verdict to security_uring_cmd() and hook_file_ioctl / IOCTL_DEV
is never consulted. For block, blkdev_cmd_discard() is then gated only by
BLK_OPEN_WRITE; for NVMe, nvme_ns_chr_uring_cmd() (I/O passthrough) and
nvme_dev_uring_cmd() (admin passthrough) are reached with no security_file_ioctl
on the path. There is no shared helper that re-applies the IOCTL_DEV check.

The coverage asymmetry is that SELinux and Smack hook uring_cmd while Landlock
does not; the Landlock documentation describes IOCTL_DEV as gating ioctl(2) and
does not mention io_uring.
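
To make the dispatch asymmetry concrete, here is a minimal userspace sketch in
Python. It is a toy model, not kernel code: the Lsm class, call_int_hook(),
check_ioctl_dev() and the ctx dictionary are hypothetical stand-ins, under the
assumption that an LSM with no registered hook contributes no verdict and that
the first non-zero return wins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

EACCES = 13  # errno value used for the denial in this model


@dataclass
class Lsm:
    """Hypothetical stand-in for one LSM and the hooks it registers."""
    name: str
    hooks: Dict[str, Callable[[dict], int]] = field(default_factory=dict)


def call_int_hook(lsms: List[Lsm], hook: str, ctx: dict) -> int:
    """Toy hook walk: skip LSMs without the hook, first non-zero result wins."""
    for lsm in lsms:
        fn = lsm.hooks.get(hook)
        if fn is None:
            continue  # nothing registered, so this LSM has no say
        rc = fn(ctx)
        if rc != 0:
            return rc
    return 0


def check_ioctl_dev(ctx: dict) -> int:
    """Deny unless the opened file carries the IOCTL_DEV right."""
    return 0 if ctx["ioctl_dev"] else -EACCES


# The ruleset granted READ_FILE|WRITE_FILE but withheld IOCTL_DEV.
ctx = {"ioctl_dev": False}

# Today: file_ioctl is registered, uring_cmd is not.
landlock = Lsm("landlock", {"file_ioctl": check_ioctl_dev})
ioctl_rc = call_int_hook([landlock], "file_ioctl", ctx)
uring_rc = call_int_hook([landlock], "uring_cmd", ctx)

# Hypothetical variant that also registers a uring_cmd hook.
landlock_hooked = Lsm(
    "landlock",
    {"file_ioctl": check_ioctl_dev, "uring_cmd": check_ioctl_dev},
)
hooked_rc = call_int_hook([landlock_hooked], "uring_cmd", ctx)

print(ioctl_rc, uring_rc, hooked_rc)
```

In this model the ioctl path is denied, the uring_cmd path falls through with 0,
and only the hypothetical hooked variant denies both.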

Reproducer
----------
A self-contained PoC is available on request (it needs root only to set up a loop
block device and open it; Landlock enforcement is uid-independent, so the
confined child demonstrates the gap regardless of the setup uid). The child
applies a Landlock ruleset handling READ_FILE|WRITE_FILE|IOCTL_DEV with a rule
granting only READ_FILE|WRITE_FILE on the device, then:

  (1) ioctl(fd, BLKDISCARD, range)        -> -EACCES  (Landlock enforces IOCTL_DEV)
  (2) IORING_OP_URING_CMD,
      cmd_op = BLOCK_URING_CMD_DISCARD     -> reaches the block command handler

Observed on Linux 7.0.11 (Landlock ABI 8):

  [1] ioctl(BLKDISCARD)   -> ret=-1 errno=13 (Permission denied)
  [2] uring_cmd(DISCARD)  -> cqe.res=-22 (Invalid argument)

A Landlock denial is always -EACCES; the io_uring path returned -EINVAL, which
originates in a post-authorization check inside the block command handler
(blk_validate_byte_range() in blkdev_cmd_discard()), reached only after
security_uring_cmd() returned 0. So this run demonstrates the authorization
bypass -- the request traversed the LSM gate into the block device-command
handler with no IOCTL_DEV check -- and then failed a parameter check, not an
authorization check. The destructive completion (a discard with a
granularity-aligned range that passes validation) is the expected behaviour but
was not exercised in this run.
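
The reading of the two results can be written down as a small classifier. This
is only a sketch of the reasoning above in Python; classify() is a hypothetical
helper, and it assumes, as this report does, that a Landlock denial surfaces as
-EACCES while any other negative result comes from a later check.

```python
import errno


def classify(res: int) -> str:
    """Interpret a syscall/CQE result following the report's reasoning."""
    if res >= 0:
        return "completed"
    if -res == errno.EACCES:
        return "denied at the authorization gate (EACCES)"
    # Any other error means the request got past the LSM gate first.
    name = errno.errorcode.get(-res, "unknown")
    return f"passed the authorization gate, failed later ({name})"


ioctl_result = classify(-13)   # ioctl(BLKDISCARD): ret=-1, errno=13
uring_result = classify(-22)   # uring_cmd(DISCARD): cqe.res=-22

print(ioctl_result)
print(uring_result)
```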

Impact
------
Demonstrated: the LANDLOCK_ACCESS_FS_IOCTL_DEV authorization is bypassed. The
device-command request reaches the block command handler with no Landlock check;
the only remaining gate is BLK_OPEN_WRITE (held, since the policy granted write).
Inferred from the code, not exercised here: a DISCARD with a valid range would
complete (discard semantics, destroying on-device data), and
the same missing hook leaves the NVMe char-device uring_cmd surface ungated --
nvme_ns_chr_uring_cmd (namespace device /dev/nvmeXnY) -> nvme_ns_uring_cmd for
NVME_URING_CMD_IO/IO_VEC passthrough, and nvme_dev_uring_cmd (controller device
/dev/nvmeX) for NVME_URING_CMD_ADMIN (format, sanitize, firmware download,
security send) -- both reach f_op->uring_cmd with no Landlock/IOCTL_DEV gate.

So the confirmed finding is a missing authorization (the confined task escapes
its own IOCTL_DEV restriction); the destructive data effect and the NVMe-admin
high-water-mark follow from the code but are not shown in the run above. The
proven authorization bypass alone scores CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:N
(6.5 Medium) -- S:C because the confined task crosses the Landlock policy
boundary it was placed under, I:H because the bypassed path reaches a handler
whose authorized completion modifies device data. With the device command
completing destructively the projected ceiling is
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:H (8.4 High), the A:H component
reasoned from the source rather than executed. No memory-safety issue is involved.
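
The two base scores can be re-derived from the CVSS v3.1 base-score equations.
The sketch below is a minimal Python calculator written for this check:
base_score() and roundup() are names introduced here, only base metrics are
handled, and the weights are the ones published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required depends on whether Scope is changed.
PR = {
    False: {"N": 0.85, "L": 0.62, "H": 0.27},
    True: {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score of a 'CVSS:3.1/...' vector string."""
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[changed][m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total, 10)) if changed else roundup(min(total, 10))


proven = base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:N")
ceiling = base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:H")
print(proven, ceiling)
```

With these inputs the calculator gives 6.5 and 8.4, matching the two figures
quoted above.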

Suggested direction
-------------------
Have Landlock register a uring_cmd hook that maps the device command to the same
checks the ioctl path applies (IOCTL_DEV, and truncate where relevant), so a
single chokepoint covers every f_op->uring_cmd provider (block, NVMe, ublk, and
any future one). This mirrors how SELinux/Smack already gate this surface.

I am happy to send a patch for this if you would like.

Best regards,

Bryam Vargas
Independent security researcher, HEXLAB S.A.S., Cali, Colombia
hexlabsecurity@proton.me



* Re: Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
  2026-06-16 20:16 Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD Bryam Vargas
@ 2026-06-16 20:36 ` Jens Axboe
  2026-06-17  2:25   ` Bryam Vargas
  2026-06-17  9:47 ` Günther Noack
  1 sibling, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2026-06-16 20:36 UTC
  To: Bryam Vargas, Mickaël Salaün
  Cc: Günther Noack, Paul Moore, Keith Busch, Christoph Hellwig,
	Sagi Grimberg, linux-security-module, io-uring, linux-block,
	linux-nvme, linux-kernel

On 6/16/26 2:16 PM, Bryam Vargas wrote:
> Hello Mickaël, and Landlock / io_uring folks,
> 
> A task confined by a Landlock ruleset that grants READ_FILE/WRITE_FILE
> on a block or NVMe character device but withholds
> LANDLOCK_ACCESS_FS_IOCTL_DEV can still reach the device-command
> surface through io_uring IORING_OP_URING_CMD with the IOCTL_DEV check
> bypassed: the request enters the device-command handler (block
> discard, or the NVMe char-device passthrough) where the equivalent
> ioctl(2) is denied. The destructive completion and the NVMe-admin
> surface follow from the code -- see Impact.

I've said this before, but apparently it hasn't been received - this
isn't an io_uring issue. If landlock is missing a hook, then that's on
landlock and they should add it. Other security handlers already have
that. Hence no need to broadcast this to a bunch of lists, it's strictly
a landlock issue.

-- 
Jens Axboe


* Re: Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
  2026-06-16 20:36 ` Jens Axboe
@ 2026-06-17  2:25   ` Bryam Vargas
  2026-06-17  2:44     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Bryam Vargas @ 2026-06-17  2:25 UTC
  To: Jens Axboe
  Cc: Mickaël Salaün, Günther Noack,
	linux-security-module

Thanks Jens — noted, the fix belongs in Landlock. Mickaël has the full report.

Bryam



* Re: Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
  2026-06-17  2:25   ` Bryam Vargas
@ 2026-06-17  2:44     ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2026-06-17  2:44 UTC
  To: Bryam Vargas
  Cc: Mickaël Salaün, Günther Noack,
	linux-security-module

On 6/16/26 8:25 PM, Bryam Vargas wrote:
> Thanks Jens -- noted, the fix belongs in Landlock. Mickaël has the full
> report.

Indeed - and hence no need to bother anyone else with it by blasting it
wide. I've already explained this multiple times, but on the private
security list, when the occasional AI report comes in on things like
this. Hence why it's a bit tiring to see the same stuff come across,
once again.

For the landlock folks, I'd suggest taking a look at what hooks already
exist (and existed, when landlock was merged) for selinux etc, that'd
be a really good hint on the existing surface covered.

-- 
Jens Axboe


* Re: Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
  2026-06-16 20:16 Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD Bryam Vargas
  2026-06-16 20:36 ` Jens Axboe
@ 2026-06-17  9:47 ` Günther Noack
  1 sibling, 0 replies; 5+ messages in thread
From: Günther Noack @ 2026-06-17  9:47 UTC
  To: Bryam Vargas
  Cc: Mickaël Salaün, Paul Moore, Jens Axboe, Keith Busch,
	Christoph Hellwig, Sagi Grimberg, linux-security-module, io-uring,
	linux-block, linux-nvme, linux-kernel

Hello Bryam!

Thanks for the report!

On Tue, Jun 16, 2026 at 08:16:41PM +0000, Bryam Vargas wrote:
> Hello Mickaël, and Landlock / io_uring folks,
> 
> A task confined by a Landlock ruleset that grants READ_FILE/WRITE_FILE on a block
> or NVMe character device but withholds LANDLOCK_ACCESS_FS_IOCTL_DEV can still
> reach the device-command surface through io_uring IORING_OP_URING_CMD with the
> IOCTL_DEV check bypassed: the request enters the device-command handler (block
> discard, or the NVMe char-device passthrough) where the equivalent ioctl(2) is
> denied. The destructive completion and the NVMe-admin surface follow from the
> code -- see Impact.
> 
> Affected
> --------
> Any kernel with CONFIG_SECURITY_LANDLOCK=y and Landlock enabled that supports
> LANDLOCK_ACCESS_FS_IOCTL_DEV (Landlock ABI >= 5, since Linux 6.10) and io_uring
> uring_cmd for the device class (block BLOCK_URING_CMD_DISCARD; NVMe passthrough).
> Confirmed by source inspection on mainline (v7.1-rc7) and reproduced on Linux
> 7.0.11 (Landlock ABI 8). The confined task needs a writable fd to a device it is
> legitimately allowed to use (e.g. a partition/loop device or an NVMe namespace
> passed into a container or granted by the ruleset); no CAP is required to reach
> the io_uring path. The gap is structural -- Landlock has never registered a
> uring_cmd hook -- so it is present from ABI 5 (Linux 6.10) through current
> mainline (v7.1-rc7) and is not a regression tied to a single Fixes: commit.
> 
> Root cause
> ----------
> On the ioctl(2) path, the syscall handler in fs/ioctl.c calls
> security_file_ioctl() (its only call site on the ioctl(2) path) before
> dispatching to do_vfs_ioctl(); that reaches Landlock hook_file_ioctl_common(),
> which denies a device ioctl unless the file's
> allowed_access holds LANDLOCK_ACCESS_FS_IOCTL_DEV (BLKDISCARD/BLKSECDISCARD/
> BLKZEROOUT and NVMe passthrough are not in the is_masked_device_ioctl()
> allow-list, so they require the right).
> 
> io_uring reaches the same device-command surface by a different producer:
> 
>   IORING_OP_URING_CMD -> io_uring_cmd()   io_uring/uring_cmd.c
>    -> security_uring_cmd(ioucmd)          (the ONLY LSM gate on this path)
>    -> file->f_op->uring_cmd()             e.g. blkdev_uring_cmd() / nvme_ns_chr_uring_cmd()
> 
> Landlock's LSM_HOOK_INIT list (security/landlock/fs.c, net.c, task.c) registers
> file_ioctl/file_ioctl_compat but no uring_cmd hook -- only SELinux
> (selinux_uring_cmd) and Smack (smack_uring_cmd) gate this surface -- so
> security_uring_cmd() returns 0 for a Landlocked task and hook_file_ioctl /
> IOCTL_DEV is never consulted. For block, blkdev_cmd_discard() is then gated only
> by BLK_OPEN_WRITE; for NVMe, nvme_ns_chr_uring_cmd() reaches the admin/IO
> passthrough with no security_file_ioctl on the path. There is no shared helper
> that re-applies the IOCTL_DEV check.
> 
> SELinux and Smack hooking uring_cmd while Landlock does not is the coverage
> asymmetry; the Landlock documentation describes IOCTL_DEV as gating ioctl(2) but
> does not mention io_uring.
> 
> Reproducer
> ----------
> A self-contained PoC is available on request (it needs root only to set up a loop
> block device and open it; Landlock enforcement is uid-independent, so the
> confined child demonstrates the gap regardless of the setup uid). The child
> applies a Landlock ruleset handling READ_FILE|WRITE_FILE|IOCTL_DEV with a rule
> granting only READ_FILE|WRITE_FILE on the device, then:
> 
>   (1) ioctl(fd, BLKDISCARD, range)        -> -EACCES  (Landlock enforces IOCTL_DEV)
>   (2) IORING_OP_URING_CMD,
>       cmd_op = BLOCK_URING_CMD_DISCARD     -> reaches the block command handler
> 
> Observed on Linux 7.0.11 (Landlock ABI 8):
> 
>   [1] ioctl(BLKDISCARD)   -> ret=-1 errno=13 (Permission denied)
>   [2] uring_cmd(DISCARD)  -> cqe.res=-22 (Invalid argument)
> 
> A Landlock denial is always -EACCES; the io_uring path returned -EINVAL, which
> originates in a post-authorization check inside the block command handler
> (blk_validate_byte_range() in blkdev_cmd_discard()), reached only after
> security_uring_cmd() returned 0. So this run demonstrates the authorization
> bypass -- the request traversed the LSM gate into the block device-command
> handler with no IOCTL_DEV check -- and then failed a parameter check, not an
> authorization check. The destructive completion (an authorized discard with a
> granularity-aligned range) is the expected behaviour but was not exercised in
> this run.
> 
> Impact
> ------
> Demonstrated: the LANDLOCK_ACCESS_FS_IOCTL_DEV authorization is bypassed. The
> device-command request reaches the block command handler with no Landlock check;
> the only remaining gate is BLK_OPEN_WRITE (held, since the policy granted write).
> Inferred from the code, not exercised here: an authorized DISCARD with a valid
> range completes (DISCARD/secure-erase semantics, destroying on-device data), and
> the same missing hook leaves the NVMe char-device uring_cmd surface ungated --
> nvme_ns_chr_uring_cmd (namespace device /dev/nvmeXnY) -> nvme_ns_uring_cmd for
> NVME_URING_CMD_IO/IO_VEC passthrough, and nvme_dev_uring_cmd (controller device
> /dev/nvmeX) for NVME_URING_CMD_ADMIN (format, sanitize, firmware download,
> security send) -- both reach f_op->uring_cmd with no Landlock/IOCTL_DEV gate.
> 
> So the confirmed finding is a missing authorization (the confined task escapes
> its own IOCTL_DEV restriction); the destructive data effect and the NVMe-admin
> high-water-mark follow from the code but are not shown in the run above. The
> proven authorization bypass alone scores CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:N
> (6.5 Medium) -- S:C because the confined task crosses the Landlock policy
> boundary it was placed under, I:H because the bypassed path reaches a handler
> whose authorized completion modifies device data. With the device command
> completing destructively the projected ceiling is
> CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:H (8.4 High), the A:H component
> reasoned from the source rather than executed. No memory safety is involved.
> 
> Suggested direction
> -------------------
> Have Landlock register a uring_cmd hook that maps the device command to the same
> checks the ioctl path applies (IOCTL_DEV, and truncate where relevant), so a
> single chokepoint covers every f_op->uring_cmd provider (block, NVMe, ublk, and
> any future one). Mirrors how SELinux/Smack already gate this surface.
> 
> I am happy to send a patch for this if you would like.

I have read through the code a bit, but I am not sure I follow the argument of
this report. Let me paraphrase my understanding --

* LANDLOCK_ACCESS_FS_IOCTL_DEV is documented as blocking ioctl(2)
  commands on opened character and block devices.
  (cf. https://docs.kernel.org/userspace-api/landlock.html#filesystem-flags)

* One of many block-device IOCTL operations is BLKDISCARD.

* Block devices offer BLKDISCARD over io_uring as well,
  but io_uring does *not* offer a generic interface through which you
  can do IOCTLs.  It *only* implements BLOCK_URING_CMD_DISCARD in that
  place.  The header where that constant is defined happens to use one
  of the ioctl macros to construct the number, but points out that "It's
  a different number space from ioctl()" (see
  include/uapi/linux/blkdev.h).

So... while this is similar to IOCTL, and while this block device operation is
also available through ioctl(2), this is a different command multiplexer
than IOCTL and I am not convinced that that namespace should be guarded with
the same LANDLOCK_ACCESS_FS_IOCTL_DEV access right.
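
For illustration, the two command numbers can be computed with the asm-generic
_IO() encoding. This is a sketch in Python under assumptions: the bit layout is
the asm-generic one (some architectures differ), BLKDISCARD is _IO(0x12, 119)
from include/uapi/linux/fs.h, and BLOCK_URING_CMD_DISCARD is taken to be
_IO(0x12, 0), which should be checked against include/uapi/linux/blkdev.h in the
tree at hand.

```python
# asm-generic ioctl number layout: nr | type << 8 | size << 16 | dir << 30
_IOC_NRSHIFT = 0
_IOC_TYPESHIFT = 8
_IOC_SIZESHIFT = 16
_IOC_DIRSHIFT = 30
_IOC_NONE = 0  # asm-generic value; a few architectures use another one


def _IOC(direction: int, type_: int, nr: int, size: int) -> int:
    """Pack the four fields the same way the C _IOC() macro does."""
    return (
        (direction << _IOC_DIRSHIFT)
        | (type_ << _IOC_TYPESHIFT)
        | (nr << _IOC_NRSHIFT)
        | (size << _IOC_SIZESHIFT)
    )


def _IO(type_: int, nr: int) -> int:
    """Command with no argument transfer, like the C _IO() macro."""
    return _IOC(_IOC_NONE, type_, nr, 0)


BLKDISCARD = _IO(0x12, 119)             # ioctl(2) number space
BLOCK_URING_CMD_DISCARD = _IO(0x12, 0)  # uring_cmd number space (assumed value)

print(hex(BLKDISCARD), hex(BLOCK_URING_CMD_DISCARD))
```

Both values share the 0x12 type byte, but they are looked up by different
dispatchers, so equal-looking encodings do not imply a shared command space.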

Do I understand correctly that the only operation affected in this report is
BLOCK_URING_CMD_DISCARD?  Or are there other operations affected by this
(through other devices)?  I saw you also mentioned the truncate right above,
but I assume that for this access right you have not found a way to side-step
it (assuming that this calls the more specific LSM hooks).

Thanks,
—Günther

