Date: Tue, 16 Jun 2026 20:16:41 +0000
To: Mickaël Salaün
From: Bryam Vargas
Cc: Günther Noack, Paul Moore, Jens Axboe, Keith Busch, Christoph Hellwig,
 Sagi Grimberg, linux-security-module@vger.kernel.org,
 io-uring@vger.kernel.org, linux-block@vger.kernel.org,
 linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Landlock: LANDLOCK_ACCESS_FS_IOCTL_DEV bypass via io_uring IORING_OP_URING_CMD
Message-ID: <20260616201633.275067-1-hexlabsecurity@proton.me>

Hello Mickaël, and Landlock / io_uring folks,

A task confined by a Landlock ruleset that grants READ_FILE/WRITE_FILE on a
block or NVMe character device but withholds LANDLOCK_ACCESS_FS_IOCTL_DEV can
still reach the device-command surface through io_uring IORING_OP_URING_CMD
with the IOCTL_DEV check bypassed: the request enters the device-command
handler (block discard, or the NVMe char-device passthrough) where the
equivalent ioctl(2) is denied. The destructive completion and the NVMe-admin
surface follow from the code -- see Impact.
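For reference, the policy described above can be sketched in userspace roughly
as follows. This is a minimal sketch and not the PoC itself:
confine_without_ioctl_dev(), handled_rights(), granted_rights() and the
rule_fd parameter are hypothetical names, and the fallback #define assumes the
uapi value of LANDLOCK_ACCESS_FS_IOCTL_DEV for headers that predate ABI 5. It
also assumes the device is opened for I/O only after the call returns, since
Landlock evaluates IOCTL_DEV against the access recorded when the file was
opened.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/landlock.h>
#include <stdint.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallback for userspace headers that predate Landlock ABI 5 (assumed uapi value). */
#ifndef LANDLOCK_ACCESS_FS_IOCTL_DEV
#define LANDLOCK_ACCESS_FS_IOCTL_DEV (1ULL << 15)
#endif

/* Rights the ruleset handles: each one is denied unless a rule grants it. */
uint64_t handled_rights(void)
{
    return LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_WRITE_FILE |
           LANDLOCK_ACCESS_FS_IOCTL_DEV;
}

/* Rights the single rule grants on the device: IOCTL_DEV is withheld. */
uint64_t granted_rights(void)
{
    return LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_WRITE_FILE;
}

/*
 * Hypothetical helper: confine the calling thread so that the device behind
 * rule_fd may be read and written but not sent device ioctls.
 * rule_fd is any fd referring to the device node (for example one opened
 * with O_PATH before confinement). Returns 0 on success, -1 on failure.
 */
int confine_without_ioctl_dev(int rule_fd)
{
    struct landlock_ruleset_attr ruleset = {
        .handled_access_fs = handled_rights(),
    };
    struct landlock_path_beneath_attr rule = {
        .allowed_access = granted_rights(),
        .parent_fd = rule_fd,
    };
    long ruleset_fd;
    int ret = -1;

    /* A rule may only grant rights that the ruleset handles. */
    assert((granted_rights() & ~handled_rights()) == 0);

    ruleset_fd = syscall(__NR_landlock_create_ruleset, &ruleset,
                         sizeof(ruleset), 0);
    if (ruleset_fd < 0)
        return -1;

    if (syscall(__NR_landlock_add_rule, (int)ruleset_fd,
                LANDLOCK_RULE_PATH_BENEATH, &rule, 0) == 0 &&
        prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == 0 &&
        syscall(__NR_landlock_restrict_self, (int)ruleset_fd, 0) == 0)
        ret = 0;

    close((int)ruleset_fd);
    return ret;
}
```

The only handled right that the rule does not grant is IOCTL_DEV, which is the
configuration under which ioctl(2) on the device is expected to fail with
-EACCES.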
Affected
--------

Any kernel with CONFIG_SECURITY_LANDLOCK=y and Landlock enabled that supports
LANDLOCK_ACCESS_FS_IOCTL_DEV (Landlock ABI >= 5, since Linux 6.10) and io_uring
uring_cmd for the device class (block BLOCK_URING_CMD_DISCARD; NVMe
passthrough). Confirmed by source inspection on mainline (v7.1-rc7) and
reproduced on Linux 7.0.11 (Landlock ABI 8). The confined task needs a
writable fd to a device it is legitimately allowed to use (e.g. a
partition/loop device or an NVMe namespace passed into a container or granted
by the ruleset); no capability is required to reach the io_uring path.

The gap is structural -- Landlock has never registered a uring_cmd hook -- so
it is present from ABI 5 (Linux 6.10) through current mainline (v7.1-rc7) and
is not a regression tied to a single Fixes: commit.

Root cause
----------

On the ioctl(2) path, the syscall handler in fs/ioctl.c calls
security_file_ioctl() (the only call site on that path) before dispatching to
do_vfs_ioctl(); that reaches Landlock's hook_file_ioctl_common(), which
denies a device ioctl unless the file's allowed_access holds
LANDLOCK_ACCESS_FS_IOCTL_DEV (BLKDISCARD/BLKSECDISCARD/BLKZEROOUT and NVMe
passthrough are not in the is_masked_device_ioctl() allow-list, so they
require the right).

io_uring reaches the same device-command surface by a different producer:

    IORING_OP_URING_CMD
      -> io_uring_cmd()                  io_uring/uring_cmd.c
      -> security_uring_cmd(ioucmd)      (the ONLY LSM gate on this path)
      -> file->f_op->uring_cmd()         e.g. blkdev_uring_cmd() /
                                         nvme_ns_chr_uring_cmd()

Landlock's LSM_HOOK_INIT list (security/landlock/fs.c, net.c, task.c)
registers file_ioctl/file_ioctl_compat but no uring_cmd hook -- only SELinux
(selinux_uring_cmd) and Smack (smack_uring_cmd) gate this surface -- so
security_uring_cmd() returns 0 for a Landlocked task and hook_file_ioctl /
IOCTL_DEV is never consulted.
For block, blkdev_cmd_discard() is then gated only by BLK_OPEN_WRITE; for
NVMe, nvme_ns_chr_uring_cmd() reaches the admin/IO passthrough with no
security_file_ioctl on the path. There is no shared helper that re-applies
the IOCTL_DEV check.

SELinux and Smack hooking uring_cmd while Landlock does not is the coverage
asymmetry; the Landlock documentation describes IOCTL_DEV as gating ioctl(2)
but does not mention io_uring.

Reproducer
----------

A self-contained PoC is available on request (it needs root only to set up a
loop block device and open it; Landlock enforcement is uid-independent, so
the confined child demonstrates the gap regardless of the setup uid). The
child applies a Landlock ruleset handling READ_FILE|WRITE_FILE|IOCTL_DEV with
a rule granting only READ_FILE|WRITE_FILE on the device, then:

  (1) ioctl(fd, BLKDISCARD, range)       -> -EACCES (Landlock enforces IOCTL_DEV)
  (2) IORING_OP_URING_CMD,
      cmd_op = BLOCK_URING_CMD_DISCARD   -> reaches the block command handler

Observed on Linux 7.0.11 (Landlock ABI 8):

  [1] ioctl(BLKDISCARD)   -> ret=-1 errno=13 (Permission denied)
  [2] uring_cmd(DISCARD)  -> cqe.res=-22 (Invalid argument)

A Landlock denial is always -EACCES; the io_uring path returned -EINVAL,
which originates in a post-authorization check inside the block command
handler (blk_validate_byte_range() in blkdev_cmd_discard()), reached only
after security_uring_cmd() returned 0. So this run demonstrates the
authorization bypass -- the request traversed the LSM gate into the block
device-command handler with no IOCTL_DEV check -- and then failed a parameter
check, not an authorization check. The destructive completion (an authorized
discard with a granularity-aligned range) is the expected behaviour but was
not exercised in this run.

Impact
------

Demonstrated: the LANDLOCK_ACCESS_FS_IOCTL_DEV authorization is bypassed.
The device-command request reaches the block command handler with no Landlock
check; the only remaining gate is BLK_OPEN_WRITE (held, since the policy
granted write).

Inferred from the code, not exercised here: an authorized DISCARD with a
valid range completes (DISCARD/secure-erase semantics, destroying on-device
data), and the same missing hook leaves the NVMe char-device uring_cmd
surface ungated -- nvme_ns_chr_uring_cmd (namespace char device /dev/ngXnY)
-> nvme_ns_uring_cmd for NVME_URING_CMD_IO/IO_VEC passthrough, and
nvme_dev_uring_cmd (controller device /dev/nvmeX) for NVME_URING_CMD_ADMIN
(format, sanitize, firmware download, security send) -- both reach
f_op->uring_cmd with no Landlock/IOCTL_DEV gate.

So the confirmed finding is a missing authorization (the confined task
escapes its own IOCTL_DEV restriction); the destructive data effect and the
NVMe-admin high-water mark follow from the code but are not shown in the run
above. The proven authorization bypass alone scores
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:N (6.5 Medium) -- S:C because the
confined task crosses the Landlock policy boundary it was placed under, I:H
because the bypassed path reaches a handler whose authorized completion
modifies device data. With the device command completing destructively, the
projected ceiling is CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:N/I:H/A:H (8.4 High),
the A:H component reasoned from the source rather than executed. No
memory-safety issue is involved.

Suggested direction
-------------------

Have Landlock register a uring_cmd hook that maps the device command to the
same checks the ioctl path applies (IOCTL_DEV, and truncate where relevant),
so a single chokepoint covers every f_op->uring_cmd provider (block, NVMe,
ublk, and any future one). This mirrors how SELinux/Smack already gate this
surface. I am happy to send a patch for this if you would like.
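For anyone who wants to re-derive the two CVSS figures quoted under Impact,
the v3.1 base-score arithmetic can be sketched as below. This is an
illustrative sketch, not an official calculator: struct cvss31, roundup1(),
pow15() and cvss31_base() are hypothetical names, and the caller is assumed to
supply the numeric weights from the CVSS v3.1 specification tables.

```c
#include <assert.h>

/* Numeric weights for one CVSS v3.1 base vector (taken from the spec tables). */
struct cvss31 {
    double av, ac, pr, ui;  /* exploitability weights */
    int scope_changed;      /* 1 for S:C, 0 for S:U */
    double c, i, a;         /* impact weights */
};

/* Spec "Roundup": smallest value with one decimal place that is >= x. */
static double roundup1(double x)
{
    long n = (long)(x * 100000.0 + 0.5);

    if (n % 10000 == 0)
        return (double)n / 100000.0;
    return (double)(n / 10000 + 1) / 10.0;
}

/* x raised to the 15th power, written out to avoid depending on libm. */
static double pow15(double x)
{
    double r = 1.0;

    for (int k = 0; k < 15; k++)
        r *= x;
    return r;
}

/* Base score for one vector, following the v3.1 base-metric equations. */
double cvss31_base(const struct cvss31 *m)
{
    /* Impact weights are 0 (None), 0.22 (Low) or 0.56 (High). */
    assert(m->c >= 0.0 && m->i >= 0.0 && m->a >= 0.0);

    double iss = 1.0 - (1.0 - m->c) * (1.0 - m->i) * (1.0 - m->a);
    double impact = m->scope_changed
        ? 7.52 * (iss - 0.029) - 3.25 * pow15(iss - 0.02)
        : 6.42 * iss;
    double exploitability = 8.22 * m->av * m->ac * m->pr * m->ui;

    if (impact <= 0.0)
        return 0.0;

    double raw = impact + exploitability;
    if (m->scope_changed)
        raw *= 1.08;
    if (raw > 10.0)
        raw = 10.0;
    return roundup1(raw);
}
```

With AV:L = 0.55, AC:L = 0.77, PR:L = 0.68 (the weight used when scope is
changed), UI:N = 0.85, High = 0.56 and None = 0, the first vector evaluates to
6.5 and the second to 8.4, matching the figures above.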

Best regards,
Bryam Vargas
Independent security researcher, HEXLAB S.A.S., Cali, Colombia
hexlabsecurity@proton.me