From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Andrii Nakryiko <andrii@kernel.org>, Jann Horn <jannh@google.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Alexei Starovoitov <ast@kernel.org>,
	Sasha Levin <sashal@kernel.org>,
	daniel@iogearbox.net, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 6.13 02/32] bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic
Date: Mon, 24 Feb 2025 06:16:08 -0500
Message-ID: <20250224111638.2212832-2-sashal@kernel.org>
In-Reply-To: <20250224111638.2212832-1-sashal@kernel.org>

From: Andrii Nakryiko <andrii@kernel.org>

[ Upstream commit 98671a0fd1f14e4a518ee06b19037c20014900eb ]

For all BPF maps we ensure that VM_MAYWRITE is cleared when
memory-mapping BPF map contents as an initially read-only VMA. This is
because in some cases the BPF verifier relies on the underlying data not
being modified afterwards by user space, so once something is mapped
read-only, it shouldn't be re-mmap'ed as read-write.

As such, it's not necessary to check VM_MAYWRITE in bpf_map_mmap() and
map->ops->map_mmap() callbacks: VM_WRITE should be consistently set for
read-write mappings, and if VM_WRITE is not set, there is no way for
user space to upgrade a read-only mapping to a read-write one.

This patch cleans up this VM_WRITE vs VM_MAYWRITE handling within
bpf_map_mmap(), which is an entry point for any BPF map mmap()-ing
logic. We also drop unnecessary sanitization of VM_MAYWRITE in BPF
ringbuf's map_mmap() callback implementation, as it is already performed
by common code in bpf_map_mmap().

Note, though, that in bpf_map_mmap_{open,close}() callbacks we can't
drop VM_MAYWRITE use, because it's possible (and is outside of the
subsystem's control) to have an initially read-write memory mapping,
which is subsequently downgraded to read-only by user space through
mprotect(). In such a case, from the BPF verifier's POV it's read-write
data throughout the lifetime of the BPF map, and it is counted as an
"active writer".

But its VMAs will start out as VM_WRITE|VM_MAYWRITE, then mprotect() can
change them to just VM_MAYWRITE (and no VM_WRITE), so when the mapping is
finally munmap()'ed and bpf_map_mmap_close() is called, vm_flags will be
just VM_MAYWRITE, but we still need to decrement the active writer count
with bpf_map_write_active_dec(), as it's still considered to be a
read-write mapping by the rest of the BPF subsystem.

Similar reasoning applies to bpf_map_mmap_open(), which is called
whenever mmap(), munmap(), and/or mprotect() forces the mm subsystem to
split the original VMA into multiple discontiguous VMAs.

Memory-mapping handling is a bit tricky, yes.

Cc: Jann Horn <jannh@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250129012246.1515826-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/bpf/ringbuf.c |  4 ----
 kernel/bpf/syscall.c | 10 ++++++++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c
index e1cfe890e0be6..1499d8caa9a35 100644
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -268,8 +268,6 @@ static int ringbuf_map_mmap_kern(struct bpf_map *map, struct vm_area_struct *vma
 		/* allow writable mapping for the consumer_pos only */
 		if (vma->vm_pgoff != 0 || vma->vm_end - vma->vm_start != PAGE_SIZE)
 			return -EPERM;
-	} else {
-		vm_flags_clear(vma, VM_MAYWRITE);
 	}
 	/* remap_vmalloc_range() checks size and offset constraints */
 	return remap_vmalloc_range(vma, rb_map->rb,
@@ -289,8 +287,6 @@ static int ringbuf_map_mmap_user(struct bpf_map *map, struct vm_area_struct *vma
 			 * position, and the ring buffer data itself.
 			 */
 			return -EPERM;
-	} else {
-		vm_flags_clear(vma, VM_MAYWRITE);
 	}
 	/* remap_vmalloc_range() checks size and offset constraints */
 	return remap_vmalloc_range(vma, rb_map->rb, vma->vm_pgoff + RINGBUF_PGOFF);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 5684e8ce132d5..60417b79639e5 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1061,15 +1061,21 @@ static int bpf_map_mmap(struct file *filp, struct vm_area_struct *vma)
 	vma->vm_ops = &bpf_map_default_vmops;
 	vma->vm_private_data = map;
 	vm_flags_clear(vma, VM_MAYEXEC);
+	/* If mapping is read-only, then disallow potentially re-mapping with
+	 * PROT_WRITE by dropping VM_MAYWRITE flag. This VM_MAYWRITE clearing
+	 * means that as far as BPF map's memory-mapped VMAs are concerned,
+	 * VM_WRITE and VM_MAYWRITE are equivalent, if one of them is set,
+	 * both should be set, so we can forget about VM_MAYWRITE and always
+	 * check just VM_WRITE
+	 */
 	if (!(vma->vm_flags & VM_WRITE))
-		/* disallow re-mapping with PROT_WRITE */
 		vm_flags_clear(vma, VM_MAYWRITE);
 
 	err = map->ops->map_mmap(map, vma);
 	if (err)
 		goto out;
 
-	if (vma->vm_flags & VM_MAYWRITE)
+	if (vma->vm_flags & VM_WRITE)
 		bpf_map_write_active_inc(map);
 out:
 	mutex_unlock(&map->freeze_mutex);
-- 
2.39.5

