From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev,
Bing-Jhong Billy Jheng <billy@starlabs.sg>,
Muhammad Ramdhan <ramdhan@starlabs.sg>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Dominique Martinet <dominique.martinet@atmark-techno.com>
Subject: [PATCH 5.15 05/87] bpf: Fix overrunning reservations in ringbuf
Date: Thu, 25 Jul 2024 16:36:38 +0200 [thread overview]
Message-ID: <20240725142738.631594000@linuxfoundation.org> (raw)
In-Reply-To: <20240725142738.422724252@linuxfoundation.org>
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann <daniel@iogearbox.net>
commit cfa1a2329a691ffd991fcf7248a57d752e712881 upstream.
The BPF ring buffer internally is implemented as a power-of-2 sized circular
buffer, with two logical and ever-increasing counters: consumer_pos is the
consumer counter to show which logical position the consumer consumed the
data, and producer_pos which is the producer counter denoting the amount of
data reserved by all producers.
Each time a record is reserved, the producer that "owns" the record will
successfully advance producer counter. In user space each time a record is
read, the consumer of the data advanced the consumer counter once it finished
processing. Both counters are stored in separate pages so that from user
space, the producer counter is read-only and the consumer counter is read-write.
One aspect that simplifies and thus speeds up the implementation of both
producers and consumers is how the data area is mapped twice contiguously
back-to-back in the virtual memory, allowing to not take any special measures
for samples that have to wrap around at the end of the circular buffer data
area, because the next page after the last data page would be first data page
again, and thus the sample will still appear completely contiguous in virtual
memory.
Each record has a struct bpf_ringbuf_hdr { u32 len; u32 pg_off; } header for
book-keeping the length and offset, and is inaccessible to the BPF program.
Helpers like bpf_ringbuf_reserve() return `(void *)hdr + BPF_RINGBUF_HDR_SZ`
for the BPF program to use. Bing-Jhong and Muhammad reported that it is however
possible to make a second allocated memory chunk overlapping with the first
chunk and as a result, the BPF program is now able to edit first chunk's
header.
For example, consider the creation of a BPF_MAP_TYPE_RINGBUF map with size
of 0x4000. Next, the consumer_pos is modified to 0x3000 /before/ a call to
bpf_ringbuf_reserve() is made. This will allocate a chunk A, which is in
[0x0,0x3008], and the BPF program is able to edit [0x8,0x3008]. Now, lets
allocate a chunk B with size 0x3000. This will succeed because consumer_pos
was edited ahead of time to pass the `new_prod_pos - cons_pos > rb->mask`
check. Chunk B will be in range [0x3008,0x6010], and the BPF program is able
to edit [0x3010,0x6010]. Due to the ring buffer memory layout mentioned
earlier, the ranges [0x0,0x4000] and [0x4000,0x8000] point to the same data
pages. This means that chunk B at [0x4000,0x4008] is chunk A's header.
bpf_ringbuf_submit() / bpf_ringbuf_discard() use the header's pg_off to then
locate the bpf_ringbuf itself via bpf_ringbuf_restore_from_rec(). Once chunk
B modified chunk A's header, then bpf_ringbuf_commit() refers to the wrong
page and could cause a crash.
Fix it by calculating the oldest pending_pos and check whether the range
from the oldest outstanding record to the newest would span beyond the ring
buffer size. If that is the case, then reject the request. We've tested with
the ring buffer benchmark in BPF selftests (./benchs/run_bench_ringbufs.sh)
before/after the fix and while it seems a bit slower on some benchmarks, it
is still not significantly enough to matter.
Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Bing-Jhong Billy Jheng <billy@starlabs.sg>
Reported-by: Muhammad Ramdhan <ramdhan@starlabs.sg>
Co-developed-by: Bing-Jhong Billy Jheng <billy@starlabs.sg>
Co-developed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Bing-Jhong Billy Jheng <billy@starlabs.sg>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240621140828.18238-1-daniel@iogearbox.net
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
kernel/bpf/ringbuf.c | 30 +++++++++++++++++++++++++-----
1 file changed, 25 insertions(+), 5 deletions(-)
--- a/kernel/bpf/ringbuf.c
+++ b/kernel/bpf/ringbuf.c
@@ -41,9 +41,12 @@ struct bpf_ringbuf {
* mapping consumer page as r/w, but restrict producer page to r/o.
* This protects producer position from being modified by user-space
* application and ruining in-kernel position tracking.
+ * Note that the pending counter is placed in the same
+ * page as the producer, so that it shares the same cache line.
*/
unsigned long consumer_pos __aligned(PAGE_SIZE);
unsigned long producer_pos __aligned(PAGE_SIZE);
+ unsigned long pending_pos;
char data[] __aligned(PAGE_SIZE);
};
@@ -141,6 +144,7 @@ static struct bpf_ringbuf *bpf_ringbuf_a
rb->mask = data_sz - 1;
rb->consumer_pos = 0;
rb->producer_pos = 0;
+ rb->pending_pos = 0;
return rb;
}
@@ -304,9 +308,9 @@ bpf_ringbuf_restore_from_rec(struct bpf_
static void *__bpf_ringbuf_reserve(struct bpf_ringbuf *rb, u64 size)
{
- unsigned long cons_pos, prod_pos, new_prod_pos, flags;
- u32 len, pg_off;
+ unsigned long cons_pos, prod_pos, new_prod_pos, pend_pos, flags;
struct bpf_ringbuf_hdr *hdr;
+ u32 len, pg_off, tmp_size, hdr_len;
if (unlikely(size > RINGBUF_MAX_RECORD_SZ))
return NULL;
@@ -324,13 +328,29 @@ static void *__bpf_ringbuf_reserve(struc
spin_lock_irqsave(&rb->spinlock, flags);
}
+ pend_pos = rb->pending_pos;
prod_pos = rb->producer_pos;
new_prod_pos = prod_pos + len;
- /* check for out of ringbuf space by ensuring producer position
- * doesn't advance more than (ringbuf_size - 1) ahead
+ while (pend_pos < prod_pos) {
+ hdr = (void *)rb->data + (pend_pos & rb->mask);
+ hdr_len = READ_ONCE(hdr->len);
+ if (hdr_len & BPF_RINGBUF_BUSY_BIT)
+ break;
+ tmp_size = hdr_len & ~BPF_RINGBUF_DISCARD_BIT;
+ tmp_size = round_up(tmp_size + BPF_RINGBUF_HDR_SZ, 8);
+ pend_pos += tmp_size;
+ }
+ rb->pending_pos = pend_pos;
+
+ /* check for out of ringbuf space:
+ * - by ensuring producer position doesn't advance more than
+ * (ringbuf_size - 1) ahead
+ * - by ensuring oldest not yet committed record until newest
+ * record does not span more than (ringbuf_size - 1)
*/
- if (new_prod_pos - cons_pos > rb->mask) {
+ if (new_prod_pos - cons_pos > rb->mask ||
+ new_prod_pos - pend_pos > rb->mask) {
spin_unlock_irqrestore(&rb->spinlock, flags);
return NULL;
}
next prev parent reply other threads:[~2024-07-25 14:52 UTC|newest]
Thread overview: 99+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-25 14:36 [PATCH 5.15 00/87] 5.15.164-rc1 review Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 01/87] gcc-plugins: Rename last_stmt() for GCC 14+ Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 02/87] filelock: Remove locks reliably when fcntl/close race is detected Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 03/87] ARM: 9324/1: fix get_user() broken with veneer Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 04/87] ACPI: processor_idle: Fix invalid comparison with insertion sort for latency Greg Kroah-Hartman
2024-07-25 14:36 ` Greg Kroah-Hartman [this message]
2024-07-25 14:36 ` [PATCH 5.15 06/87] scsi: core: Fix a use-after-free Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 07/87] scsi: core: alua: I/O errors for ALUA state transitions Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 08/87] scsi: qedf: Dont process stag work during unload and recovery Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 09/87] scsi: qedf: Wait for stag work during unload Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 10/87] scsi: qedf: Set qed_slowpath_params to zero before use Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 11/87] ACPI: EC: Abort address space access upon error Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 12/87] ACPI: EC: Avoid returning AE_OK on errors in address space handler Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 13/87] tools/power/cpupower: Fix Pstate frequency reporting on AMD Family 1Ah CPUs Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 14/87] wifi: mac80211: mesh: init nonpeer_pm to active by default in mesh sdata Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 15/87] wifi: mac80211: handle tasklet frames before stopping Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 16/87] wifi: iwlwifi: mvm: d3: fix WoWLAN command version lookup Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 17/87] wifi: iwlwifi: mvm: Handle BIGTK cipher in kek_kck cmd Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 18/87] wifi: iwlwifi: mvm: properly set 6 GHz channel direct probe option Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 19/87] wifi: mac80211: fix UBSAN noise in ieee80211_prep_hw_scan() Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 20/87] selftests/openat2: Fix build warnings on ppc64 Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 21/87] Input: silead - Always support 10 fingers Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 22/87] net: ipv6: rpl_iptunnel: block BH in rpl_output() and rpl_input() Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 23/87] ila: block BH in ila_output() Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 24/87] arm64: armv8_deprecated: Fix warning in isndep cpuhp starting process Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 25/87] null_blk: fix validation of block size Greg Kroah-Hartman
2024-07-25 14:36 ` [PATCH 5.15 26/87] kconfig: gconf: give a proper initial state to the Save button Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 27/87] kconfig: remove wrong expr_trans_bool() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 28/87] fs/file: fix the check in find_next_fd() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 29/87] mei: demote client disconnect warning on suspend to debug Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 30/87] nvme: avoid double free special payload Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 31/87] wifi: cfg80211: wext: add extra SIOCSIWSCAN data check Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 32/87] KVM: PPC: Book3S HV: Prevent UAF in kvm_spapr_tce_attach_iommu_group() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 33/87] drm/vmwgfx: Fix missing HYPERVISOR_GUEST dependency Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 34/87] ALSA: hda/realtek: Add more codec ID to no shutup pins list Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 35/87] mips: fix compat_sys_lseek syscall Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 36/87] Input: elantech - fix touchpad state on resume for Lenovo N24 Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 37/87] Input: i8042 - add Ayaneo Kun to i8042 quirk table Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 38/87] bytcr_rt5640 : inverse jack detect for Archos 101 cesium Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 39/87] ALSA: dmaengine: Synchronize dma channel after drop() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 40/87] ASoC: ti: davinci-mcasp: Set min period size using FIFO config Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 41/87] ASoC: ti: omap-hdmi: Fix too long driver name Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 42/87] can: kvaser_usb: fix return value for hif_usb_send_regout Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 43/87] s390/sclp: Fix sclp_init() cleanup on failure Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 44/87] platform/x86: wireless-hotkey: Add support for LG Airplane Button Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 45/87] platform/x86: lg-laptop: Remove LGEX0815 hotkey handling Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 46/87] platform/x86: lg-laptop: Change ACPI device id Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 47/87] platform/x86: lg-laptop: Use ACPI device handle when evaluating WMAB/WMBB Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 48/87] btrfs: qgroup: fix quota root leak after quota disable failure Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 49/87] ALSA: hda/relatek: Enable Mute LED on HP Laptop 15-gw0xxx Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 50/87] ALSA: dmaengine_pcm: terminate dmaengine before synchronize Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 51/87] net: usb: qmi_wwan: add Telit FN912 compositions Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 52/87] net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 53/87] powerpc/pseries: Whitelist dtl slub object for copying to userspace Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 54/87] powerpc/eeh: avoid possible crash when edev->pdev changes Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 55/87] scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 56/87] Bluetooth: hci_core: cancel all works upon hci_unregister_dev() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 57/87] drm/radeon: check bo_va->bo is non-NULL before using it Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 58/87] fs: better handle deep ancestor chains in is_subdir() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 59/87] riscv: stacktrace: fix usage of ftrace_graph_ret_addr() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 60/87] spi: imx: Dont expect DMA for i.MX{25,35,50,51,53} cspi devices Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 61/87] selftests/vDSO: fix clang build errors and warnings Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 62/87] hfsplus: fix uninit-value in copy_name Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 63/87] spi: mux: set ctlr->bits_per_word_mask Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 64/87] tracing: Define the is_signed_type() macro once Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 65/87] minmax: sanity check constant bounds when clamping Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 66/87] minmax: clamp more efficiently by avoiding extra comparison Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 67/87] minmax: fix header inclusions Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 68/87] minmax: allow min()/max()/clamp() if the arguments have the same signedness Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 69/87] minmax: allow comparisons of int against unsigned char/short Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 70/87] minmax: relax check to allow comparison between unsigned arguments and signed constants Greg Kroah-Hartman
2024-07-25 16:58 ` Linus Torvalds
2024-07-26 5:21 ` Greg Kroah-Hartman
2024-07-26 16:05 ` SeongJae Park
2024-07-27 5:08 ` Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 71/87] mm/damon/core: merge regions aggressively when max_nr_regions is unmet Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 72/87] wifi: mac80211: disable softirqs for queued frame handling Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 73/87] drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 74/87] samples: Add fs error monitoring example Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 75/87] samples: Make fs-monitor depend on libc and headers Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 76/87] docs: Fix formatting of literal sections in fanotify docs Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 77/87] Add gitignore file for samples/fanotify/ subdirectory Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 78/87] net: relax socket state check at accept time Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 79/87] ocfs2: add bounds checking to ocfs2_check_dir_entry() Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 80/87] jfs: dont walk off the end of ealist Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 81/87] fs/ntfs3: Validate ff offset Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 82/87] ALSA: hda/realtek: Enable headset mic on Positivo SU C1400 Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 83/87] ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360 Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 84/87] arm64: dts: qcom: msm8996: Disable SS instance in Parkmode for USB Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 85/87] arm64: dts: qcom: sdm630: " Greg Kroah-Hartman
2024-07-25 14:37 ` [PATCH 5.15 86/87] ALSA: pcm_dmaengine: Dont synchronize DMA channel when DMA is paused Greg Kroah-Hartman
2024-07-25 14:38 ` [PATCH 5.15 87/87] filelock: Fix fcntl/close race recovery compat path Greg Kroah-Hartman
2024-07-25 16:48 ` [PATCH 5.15 00/87] 5.15.164-rc1 review Naresh Kamboju
2024-07-26 6:33 ` Greg Kroah-Hartman
2024-07-26 8:15 ` Pavel Machek
2024-07-25 17:37 ` ChromeOS Kernel Stable Merge
2024-07-25 23:19 ` SeongJae Park
2024-07-26 5:23 ` Harshit Mogalapalli
2024-07-26 16:38 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240725142738.631594000@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=andrii@kernel.org \
--cc=billy@starlabs.sg \
--cc=daniel@iogearbox.net \
--cc=dominique.martinet@atmark-techno.com \
--cc=patches@lists.linux.dev \
--cc=ramdhan@starlabs.sg \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox