* [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap [not found] <2026020124-flashbulb-stumble-f24a@gregkh> @ 2026-02-01 11:34 ` Haocheng Yu 2026-02-01 11:49 ` Greg KH 2026-02-01 18:43 ` [PATCH] " kernel test robot 0 siblings, 2 replies; 15+ messages in thread From: Haocheng Yu @ 2026-02-01 11:34 UTC (permalink / raw) To: acme; +Cc: security, linux-kernel, linux-perf-users, gregkh The issue is caused by a race condition between mmap() and event teardown. In perf_mmap(), the ring_buffer (rb) is accessed via map_range() after the mmap_mutex is released. If another thread closes the event or detaches the buffer during this window, the reference count of rb can drop to zero, leading to a UAF or refcount saturation when map_range() or subsequent logic attempts to use it. Fix this by extending the scope of mmap_mutex to cover the entire setup process, including map_range(), ensuring the buffer remains valid until the mapping is complete. Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> --- kernel/events/core.c | 42 +++++++++++++++++++++--------------------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 2c35acc2722b..7c93f7d057cb 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) ret = perf_mmap_aux(vma, event, nr_pages); if (ret) return ret; - } - - /* - * Since pinned accounting is per vm we cannot allow fork() to copy our - * vma. - */ - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); - vma->vm_ops = &perf_mmap_vmops; - mapped = get_mapped(event, event_mapped); - if (mapped) - mapped(event, vma->vm_mm); - - /* - * Try to map it into the page table. On fail, invoke - * perf_mmap_close() to undo the above, as the callsite expects - * full cleanup in this case and therefore does not invoke - * vmops::close(). 
- */ - ret = map_range(event->rb, vma); - if (ret) - perf_mmap_close(vma); + /* + * Since pinned accounting is per vm we cannot allow fork() to copy our + * vma. + */ + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); + vma->vm_ops = &perf_mmap_vmops; + + mapped = get_mapped(event, event_mapped); + if (mapped) + mapped(event, vma->vm_mm); + + /* + * Try to map it into the page table. On fail, invoke + * perf_mmap_close() to undo the above, as the callsite expects + * full cleanup in this case and therefore does not invoke + * vmops::close(). + */ + ret = map_range(event->rb, vma); + if (ret) + perf_mmap_close(vma); + } return ret; } -- 2.51.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap 2026-02-01 11:34 ` [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap Haocheng Yu @ 2026-02-01 11:49 ` Greg KH 2026-02-02 7:44 ` Haocheng Yu 2026-02-01 18:43 ` [PATCH] " kernel test robot 1 sibling, 1 reply; 15+ messages in thread From: Greg KH @ 2026-02-01 11:49 UTC (permalink / raw) To: Haocheng Yu; +Cc: acme, security, linux-kernel, linux-perf-users On Sun, Feb 01, 2026 at 07:34:36PM +0800, Haocheng Yu wrote: > The issue is caused by a race condition between mmap() and event > teardown. In perf_mmap(), the ring_buffer (rb) is accessed via > map_range() after the mmap_mutex is released. If another thread > closes the event or detaches the buffer during this window, the > reference count of rb can drop to zero, leading to a UAF or > refcount saturation when map_range() or subsequent logic attempts > to use it. > > Fix this by extending the scope of mmap_mutex to cover the entire > setup process, including map_range(), ensuring the buffer remains > valid until the mapping is complete. > > Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> > --- > kernel/events/core.c | 42 +++++++++++++++++++++--------------------- > 1 file changed, 21 insertions(+), 21 deletions(-) > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 2c35acc2722b..7c93f7d057cb 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) > ret = perf_mmap_aux(vma, event, nr_pages); > if (ret) > return ret; > - } > - > - /* > - * Since pinned accounting is per vm we cannot allow fork() to copy our > - * vma. > - */ > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > - vma->vm_ops = &perf_mmap_vmops; > > - mapped = get_mapped(event, event_mapped); > - if (mapped) > - mapped(event, vma->vm_mm); > - > - /* > - * Try to map it into the page table. 
On fail, invoke > - * perf_mmap_close() to undo the above, as the callsite expects > - * full cleanup in this case and therefore does not invoke > - * vmops::close(). > - */ > - ret = map_range(event->rb, vma); > - if (ret) > - perf_mmap_close(vma); > + /* > + * Since pinned accounting is per vm we cannot allow fork() to copy our > + * vma. > + */ > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > + vma->vm_ops = &perf_mmap_vmops; > + > + mapped = get_mapped(event, event_mapped); > + if (mapped) > + mapped(event, vma->vm_mm); > + > + /* > + * Try to map it into the page table. On fail, invoke > + * perf_mmap_close() to undo the above, as the callsite expects > + * full cleanup in this case and therefore does not invoke > + * vmops::close(). > + */ > + ret = map_range(event->rb, vma); > + if (ret) > + perf_mmap_close(vma); > + } This indentation looks very odd, are you sure it is correct? thanks, greg k-h ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap 2026-02-01 11:49 ` Greg KH @ 2026-02-02 7:44 ` Haocheng Yu 2026-02-02 13:58 ` Peter Zijlstra 0 siblings, 1 reply; 15+ messages in thread From: Haocheng Yu @ 2026-02-02 7:44 UTC (permalink / raw) To: acme; +Cc: security, linux-kernel, linux-perf-users, gregkh Syzkaller reported a refcount_t: addition on 0; use-after-free warning in perf_mmap. The issue is caused by a race condition between mmap() and event teardown. In perf_mmap(), the ring_buffer (rb) is accessed via map_range() after the mmap_mutex is released. If another thread closes the event or detaches the buffer during this window, the reference count of rb can drop to zero, leading to a UAF or refcount saturation when map_range() or subsequent logic attempts to use it. Fix this by extending the scope of mmap_mutex to cover the entire setup process, including map_range(), ensuring the buffer remains valid until the mapping is complete. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> --- kernel/events/core.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 2c35acc2722b..abefd1213582 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) ret = perf_mmap_aux(vma, event, nr_pages); if (ret) return ret; - } - /* - * Since pinned accounting is per vm we cannot allow fork() to copy our - * vma. - */ - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); - vma->vm_ops = &perf_mmap_vmops; + /* + * Since pinned accounting is per vm we cannot allow fork() to copy our + * vma. 
+ */ + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); + vma->vm_ops = &perf_mmap_vmops; - mapped = get_mapped(event, event_mapped); - if (mapped) - mapped(event, vma->vm_mm); + mapped = get_mapped(event, event_mapped); + if (mapped) + mapped(event, vma->vm_mm); - /* - * Try to map it into the page table. On fail, invoke - * perf_mmap_close() to undo the above, as the callsite expects - * full cleanup in this case and therefore does not invoke - * vmops::close(). - */ - ret = map_range(event->rb, vma); - if (ret) - perf_mmap_close(vma); + /* + * Try to map it into the page table. On fail, invoke + * perf_mmap_close() to undo the above, as the callsite expects + * full cleanup in this case and therefore does not invoke + * vmops::close(). + */ + ret = map_range(event->rb, vma); + if (ret) + perf_mmap_close(vma); + } return ret; } base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 -- 2.51.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap
  2026-02-02  7:44     ` Haocheng Yu
@ 2026-02-02 13:58       ` Peter Zijlstra
  2026-02-02 14:36         ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2026-02-02 13:58 UTC (permalink / raw)
  To: Haocheng Yu; +Cc: acme, security, linux-kernel, linux-perf-users, gregkh

On Mon, Feb 02, 2026 at 03:44:35PM +0800, Haocheng Yu wrote:
> Syzkaller reported a refcount_t: addition on 0; use-after-free warning
> in perf_mmap.
>
> The issue is caused by a race condition between mmap() and event
> teardown. In perf_mmap(), the ring_buffer (rb) is accessed via
> map_range() after the mmap_mutex is released. If another thread
> closes the event or detaches the buffer during this window, the
> reference count of rb can drop to zero, leading to a UAF or
> refcount saturation when map_range() or subsequent logic attempts
> to use it.

So you're saying this is something like:

  Thread-1              Thread-2

  mmap(fd)
                        close(fd) / ioctl(fd, IOC_SET_OUTPUT)


I don't think close() is possible, because mmap() should have a
reference on the struct file from fget(), no?

That leaves the ioctl(), let me go have a peek.

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap
  2026-02-02 13:58       ` Peter Zijlstra
@ 2026-02-02 14:36         ` Peter Zijlstra
  2026-02-02 15:51           ` 余昊铖
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2026-02-02 14:36 UTC (permalink / raw)
  To: Haocheng Yu; +Cc: acme, security, linux-kernel, linux-perf-users, gregkh

On Mon, Feb 02, 2026 at 02:58:59PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 02, 2026 at 03:44:35PM +0800, Haocheng Yu wrote:
> > Syzkaller reported a refcount_t: addition on 0; use-after-free warning
> > in perf_mmap.
> >
> > The issue is caused by a race condition between mmap() and event
> > teardown. In perf_mmap(), the ring_buffer (rb) is accessed via
> > map_range() after the mmap_mutex is released. If another thread
> > closes the event or detaches the buffer during this window, the
> > reference count of rb can drop to zero, leading to a UAF or
> > refcount saturation when map_range() or subsequent logic attempts
> > to use it.
>
> So you're saying this is something like:
>
>   Thread-1              Thread-2
>
>   mmap(fd)
>                         close(fd) / ioctl(fd, IOC_SET_OUTPUT)
>
>
> I don't think close() is possible, because mmap() should have a
> reference on the struct file from fget(), no?
>
> That leaves the ioctl(), let me go have a peek.

I'm not seeing it; once perf_mmap_rb() completes, we should have
event->mmap_count != 0, and thus the IOC_SET_OUTPUT will fail.

Please provide a better explanation.

^ permalink raw reply	[flat|nested] 15+ messages in thread
* Re: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap
  2026-02-02 14:36         ` Peter Zijlstra
@ 2026-02-02 15:51           ` 余昊铖
  2026-02-02 16:20             ` [PATCH v2] " yuhaocheng035
  0 siblings, 1 reply; 15+ messages in thread
From: 余昊铖 @ 2026-02-02 15:51 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: acme, security, linux-kernel, linux-perf-users, gregkh

Hi Peter,

Thanks for the review. You are right, my previous explanation was
inaccurate. The actual race condition occurs between a failing mmap()
on one event and a concurrent mmap() on a second event that shares
the ring buffer (e.g., via output redirection).

A detailed example scenario is as follows:

1. Thread A calls mmap(event_A). It allocates the ring buffer, sets
   event_A->rb, and initializes the refcount to 1. It then drops
   mmap_mutex.

2. Thread A calls map_range(). Suppose this fails. Thread A then
   proceeds to the error path and calls perf_mmap_close().

3. Thread B concurrently calls mmap(event_B), where event_B is
   configured to share event_A's buffer. Thread B acquires
   event_A->mmap_mutex and sees the valid event_A->rb pointer.

4. The race triggers here: if Thread A's perf_mmap_close() logic
   decrements the ring buffer's refcount to 0 (releasing it) but the
   pointer event_A->rb is still visible to Thread B (or was read by
   Thread B before it was cleared), Thread B triggers the
   "refcount_t: addition on 0" warning when it attempts to increment
   the refcount in perf_mmap_rb().

The fix extends the scope of mmap_mutex to cover map_range() and the
potential error handling path. This ensures that event->rb is only
exposed to other threads after it has been successfully mapped, or
that it is cleaned up atomically inside the lock if mapping fails.

I have updated the commit message accordingly.

Thanks,
Haocheng

^ permalink raw reply	[flat|nested] 15+ messages in thread
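The interleaving in steps 1 to 4 above can be replayed in a small userspace sketch. This is only an illustration under stated assumptions: `toy_rb`, `toy_event` and the helper functions are hypothetical stand-ins built on C11 atomics, not the kernel's refcount_t API or the perf code, and the steps run in a fixed order on one thread so that the outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted ring buffer; not the kernel's types. */
struct toy_rb {
	atomic_int refcount;
};

/* Hypothetical event: 'rb' plays the role of event->rb. */
struct toy_event {
	struct toy_rb *rb;
};

/* Drop one reference; returns true when it was the last one. */
static bool toy_rb_put(struct toy_rb *rb)
{
	return atomic_fetch_sub(&rb->refcount, 1) == 1;
}

/*
 * Take a reference only if the count is still non-zero. A false return
 * corresponds to the situation the "addition on 0" warning reports.
 */
static bool toy_rb_get_not_zero(struct toy_rb *rb)
{
	int old = atomic_load(&rb->refcount);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&rb->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Replays steps 1-4 in a fixed order. Returns true if "thread B"
 * obtained a valid reference, false if the buffer was already dead.
 */
bool replay_interleaving(void)
{
	struct toy_rb rb;
	struct toy_event event_a = { .rb = &rb };
	bool freed, got_ref;

	atomic_init(&rb.refcount, 1);		/* step 1: A allocates, count = 1 */
	freed = toy_rb_put(event_a.rb);		/* step 2: A's error path drops it */
	assert(freed);				/* the count is now 0 */

	/* steps 3-4: B still sees the pointer and tries to take a reference */
	got_ref = toy_rb_get_not_zero(event_a.rb);

	event_a.rb = NULL;			/* A clears the pointer only afterwards */
	return got_ref;
}
```

In this sketch the late reference attempt fails because the pointer is still published after the count has reached zero; the proposal in the thread is to close that window by keeping the publish, map and cleanup steps under one lock.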
* [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap 2026-02-02 15:51 ` 余昊铖 @ 2026-02-02 16:20 ` yuhaocheng035 2026-02-06 9:06 ` Peter Zijlstra 2026-03-05 18:56 ` Ian Rogers 0 siblings, 2 replies; 15+ messages in thread From: yuhaocheng035 @ 2026-02-02 16:20 UTC (permalink / raw) To: peterz; +Cc: acme, security, linux-kernel, linux-perf-users, gregkh From: Haocheng Yu <yuhaocheng035@gmail.com> Syzkaller reported a refcount_t: addition on 0; use-after-free warning in perf_mmap. The issue is caused by a race condition between a failing mmap() setup and a concurrent mmap() on a dependent event (e.g., using output redirection). In perf_mmap(), the ring_buffer (rb) is allocated and assigned to event->rb with the mmap_mutex held. The mutex is then released to perform map_range(). If map_range() fails, perf_mmap_close() is called to clean up. However, since the mutex was dropped, another thread attaching to this event (via inherited events or output redirection) can acquire the mutex, observe the valid event->rb pointer, and attempt to increment its reference count. If the cleanup path has already dropped the reference count to zero, this results in a use-after-free or refcount saturation warning. Fix this by extending the scope of mmap_mutex to cover the map_range() call. This ensures that the ring buffer initialization and mapping (or cleanup on failure) happens atomically effectively, preventing other threads from accessing a half-initialized or dying ring buffer. 
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> --- kernel/events/core.c | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 2c35acc2722b..abefd1213582 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) ret = perf_mmap_aux(vma, event, nr_pages); if (ret) return ret; - } - /* - * Since pinned accounting is per vm we cannot allow fork() to copy our - * vma. - */ - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); - vma->vm_ops = &perf_mmap_vmops; + /* + * Since pinned accounting is per vm we cannot allow fork() to copy our + * vma. + */ + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); + vma->vm_ops = &perf_mmap_vmops; - mapped = get_mapped(event, event_mapped); - if (mapped) - mapped(event, vma->vm_mm); + mapped = get_mapped(event, event_mapped); + if (mapped) + mapped(event, vma->vm_mm); - /* - * Try to map it into the page table. On fail, invoke - * perf_mmap_close() to undo the above, as the callsite expects - * full cleanup in this case and therefore does not invoke - * vmops::close(). - */ - ret = map_range(event->rb, vma); - if (ret) - perf_mmap_close(vma); + /* + * Try to map it into the page table. On fail, invoke + * perf_mmap_close() to undo the above, as the callsite expects + * full cleanup in this case and therefore does not invoke + * vmops::close(). + */ + ret = map_range(event->rb, vma); + if (ret) + perf_mmap_close(vma); + } return ret; } base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 -- 2.51.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap 2026-02-02 16:20 ` [PATCH v2] " yuhaocheng035 @ 2026-02-06 9:06 ` Peter Zijlstra 2026-02-09 15:26 ` 余昊铖 2026-03-05 18:56 ` Ian Rogers 1 sibling, 1 reply; 15+ messages in thread From: Peter Zijlstra @ 2026-02-06 9:06 UTC (permalink / raw) To: yuhaocheng035; +Cc: acme, security, linux-kernel, linux-perf-users, gregkh On Tue, Feb 03, 2026 at 12:20:56AM +0800, yuhaocheng035@gmail.com wrote: > From: Haocheng Yu <yuhaocheng035@gmail.com> > > Syzkaller reported a refcount_t: addition on 0; use-after-free warning > in perf_mmap. > > The issue is caused by a race condition between a failing mmap() setup > and a concurrent mmap() on a dependent event (e.g., using output > redirection). > > In perf_mmap(), the ring_buffer (rb) is allocated and assigned to > event->rb with the mmap_mutex held. The mutex is then released to > perform map_range(). > > If map_range() fails, perf_mmap_close() is called to clean up. > However, since the mutex was dropped, another thread attaching to > this event (via inherited events or output redirection) can acquire > the mutex, observe the valid event->rb pointer, and attempt to > increment its reference count. If the cleanup path has already > dropped the reference count to zero, this results in a > use-after-free or refcount saturation warning. > > Fix this by extending the scope of mmap_mutex to cover the > map_range() call. This ensures that the ring buffer initialization > and mapping (or cleanup on failure) happens atomically effectively, > preventing other threads from accessing a half-initialized or > dying ring buffer. And you're sure this time? To me it feels bit like talking to an LLM. I suppose there is nothing wrong with having an LLM process syzkaller output and even have it propose patches, but before you send it out an actual human should get involved and apply critical thinking skills. 
Just throwing stuff at a maintainer and hoping he does the thinking for you is not appreciated. > Reported-by: kernel test robot <lkp@intel.com> > Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ > Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> > --- > kernel/events/core.c | 38 +++++++++++++++++++------------------- > 1 file changed, 19 insertions(+), 19 deletions(-) > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 2c35acc2722b..abefd1213582 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) > ret = perf_mmap_aux(vma, event, nr_pages); > if (ret) > return ret; > - } > > - /* > - * Since pinned accounting is per vm we cannot allow fork() to copy our > - * vma. > - */ > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > - vma->vm_ops = &perf_mmap_vmops; > + /* > + * Since pinned accounting is per vm we cannot allow fork() to copy our > + * vma. > + */ > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > + vma->vm_ops = &perf_mmap_vmops; > > - mapped = get_mapped(event, event_mapped); > - if (mapped) > - mapped(event, vma->vm_mm); > + mapped = get_mapped(event, event_mapped); > + if (mapped) > + mapped(event, vma->vm_mm); > > - /* > - * Try to map it into the page table. On fail, invoke > - * perf_mmap_close() to undo the above, as the callsite expects > - * full cleanup in this case and therefore does not invoke > - * vmops::close(). > - */ > - ret = map_range(event->rb, vma); > - if (ret) > - perf_mmap_close(vma); > + /* > + * Try to map it into the page table. On fail, invoke > + * perf_mmap_close() to undo the above, as the callsite expects > + * full cleanup in this case and therefore does not invoke > + * vmops::close(). 
> + */ > + ret = map_range(event->rb, vma); > + if (ret) > + perf_mmap_close(vma); > + } > > return ret; > } > > base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 > -- > 2.51.0 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap
  2026-02-06  9:06   ` Peter Zijlstra
@ 2026-02-09 15:26     ` 余昊铖
  0 siblings, 0 replies; 15+ messages in thread
From: 余昊铖 @ 2026-02-09 15:26 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: acme, security, linux-kernel, linux-perf-users, gregkh

These explanations and patches were indeed generated by an LLM, but I
considered and reviewed them before sending the email, along with the
kernel code and the C reproducer, and modified anything I deemed
unreasonable. I believe this patch is meaningful and that there are
indeed some issues with the kernel code, which is why I sent it out.

Below is my own thinking:

In the C reproducer, these four system calls are the core of this
problem. Specifically, the third syscall takes r0 as an argument
(group), establishing a shared ring buffer. The fourth syscall uses an
unusual flag combination, which is likely the reason this bug can be
triggered.

  res = syscall(__NR_perf_event_open, /*attr=*/0x200000000000ul,
                /*pid=*/0, /*cpu=*/1ul, /*group=*/(intptr_t)-1,
                /*flags=PERF_FLAG_FD_CLOEXEC*/ 8ul);

  syscall(__NR_mmap, /*addr=*/0x200000002000ul, /*len=*/0x1000ul,
          /*prot=*/0ul, /*flags=MAP_FIXED|MAP_SHARED*/ 0x11ul,
          /*fd=*/r[0], /*offset=*/0ul);

  res = syscall(__NR_perf_event_open, /*attr=*/0x200000000000ul,
                /*pid=*/0, /*cpu=*/1ul, /*group=*/r[0],
                /*flags=PERF_FLAG_FD_OUTPUT*/ 2ul);

  syscall(__NR_mmap, /*addr=*/0x200000186000ul, /*len=*/0x1000ul,
          /*prot=PROT_GROWSDOWN|PROT_SEM|PROT_WRITE|PROT_READ*/ 0x100000bul,
          /*flags=MAP_SHARED_VALIDATE|MAP_FIXED*/ 0x13ul,
          /*fd=*/r[1], /*offset=*/0ul);

The sequence is as follows:

r0 enters perf_mmap first. It acquires the mutex, executes
perf_mmap_rb, releases the mutex, and then calls map_range. If
map_range fails, the function enters perf_mmap_close, which calls
ring_buffer_put and drops the refcount to 0. At this moment, r1 also
enters perf_mmap and attempts to attach to r0's ring buffer.
Because the mutex is released while r0 is executing map_range, the
second mmap can acquire the mutex and access the rb pointer, which is
shared with r0, before it is cleared. It then attempts to increment the
refcount on a buffer that is already being destroyed.

So I think simply extending the scope of the mutex to cover
perf_mmap_close could solve this problem.

> On Tue, Feb 03, 2026 at 12:20:56AM +0800, yuhaocheng035@gmail.com wrote:
> > From: Haocheng Yu <yuhaocheng035@gmail.com>
> >
> > Syzkaller reported a refcount_t: addition on 0; use-after-free warning
> > in perf_mmap.
> >
> > The issue is caused by a race condition between a failing mmap() setup
> > and a concurrent mmap() on a dependent event (e.g., using output
> > redirection).
> >
> > In perf_mmap(), the ring_buffer (rb) is allocated and assigned to
> > event->rb with the mmap_mutex held. The mutex is then released to
> > perform map_range().
> >
> > If map_range() fails, perf_mmap_close() is called to clean up.
> > However, since the mutex was dropped, another thread attaching to
> > this event (via inherited events or output redirection) can acquire
> > the mutex, observe the valid event->rb pointer, and attempt to
> > increment its reference count. If the cleanup path has already
> > dropped the reference count to zero, this results in a
> > use-after-free or refcount saturation warning.
> >
> > Fix this by extending the scope of mmap_mutex to cover the
> > map_range() call. This ensures that the ring buffer initialization
> > and mapping (or cleanup on failure) happens atomically effectively,
> > preventing other threads from accessing a half-initialized or
> > dying ring buffer.
>
> And you're sure this time? To me it feels bit like talking to an LLM.
>
> I suppose there is nothing wrong with having an LLM process syzkaller
> output and even have it propose patches, but before you send it out an
> actual human should get involved and apply critical thinking skills.
> > Just throwing stuff at a maintainer and hoping he does the thinking for > you is not appreciated. > > > Reported-by: kernel test robot <lkp@intel.com> > > Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ > > Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> > > --- > > kernel/events/core.c | 38 +++++++++++++++++++------------------- > > 1 file changed, 19 insertions(+), 19 deletions(-) > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c > > index 2c35acc2722b..abefd1213582 100644 > > --- a/kernel/events/core.c > > +++ b/kernel/events/core.c > > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) > > ret = perf_mmap_aux(vma, event, nr_pages); > > if (ret) > > return ret; > > - } > > > > - /* > > - * Since pinned accounting is per vm we cannot allow fork() to copy our > > - * vma. > > - */ > > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > - vma->vm_ops = &perf_mmap_vmops; > > + /* > > + * Since pinned accounting is per vm we cannot allow fork() to copy our > > + * vma. > > + */ > > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > + vma->vm_ops = &perf_mmap_vmops; > > > > - mapped = get_mapped(event, event_mapped); > > - if (mapped) > > - mapped(event, vma->vm_mm); > > + mapped = get_mapped(event, event_mapped); > > + if (mapped) > > + mapped(event, vma->vm_mm); > > > > - /* > > - * Try to map it into the page table. On fail, invoke > > - * perf_mmap_close() to undo the above, as the callsite expects > > - * full cleanup in this case and therefore does not invoke > > - * vmops::close(). > > - */ > > - ret = map_range(event->rb, vma); > > - if (ret) > > - perf_mmap_close(vma); > > + /* > > + * Try to map it into the page table. On fail, invoke > > + * perf_mmap_close() to undo the above, as the callsite expects > > + * full cleanup in this case and therefore does not invoke > > + * vmops::close(). 
> > + */ > > + ret = map_range(event->rb, vma); > > + if (ret) > > + perf_mmap_close(vma); > > + } > > > > return ret; > > } > > > > base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 > > -- > > 2.51.0 > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap 2026-02-02 16:20 ` [PATCH v2] " yuhaocheng035 2026-02-06 9:06 ` Peter Zijlstra @ 2026-03-05 18:56 ` Ian Rogers 2026-03-06 9:35 ` yuhaocheng035 2026-03-06 9:36 ` Haocheng Yu 1 sibling, 2 replies; 15+ messages in thread From: Ian Rogers @ 2026-03-05 18:56 UTC (permalink / raw) To: yuhaocheng035 Cc: peterz, acme, security, linux-kernel, linux-perf-users, gregkh On Mon, Feb 2, 2026 at 8:30 AM <yuhaocheng035@gmail.com> wrote: > > From: Haocheng Yu <yuhaocheng035@gmail.com> > > Syzkaller reported a refcount_t: addition on 0; use-after-free warning > in perf_mmap. > > The issue is caused by a race condition between a failing mmap() setup > and a concurrent mmap() on a dependent event (e.g., using output > redirection). > > In perf_mmap(), the ring_buffer (rb) is allocated and assigned to > event->rb with the mmap_mutex held. The mutex is then released to > perform map_range(). > > If map_range() fails, perf_mmap_close() is called to clean up. > However, since the mutex was dropped, another thread attaching to > this event (via inherited events or output redirection) can acquire > the mutex, observe the valid event->rb pointer, and attempt to > increment its reference count. If the cleanup path has already > dropped the reference count to zero, this results in a > use-after-free or refcount saturation warning. > > Fix this by extending the scope of mmap_mutex to cover the > map_range() call. This ensures that the ring buffer initialization > and mapping (or cleanup on failure) happens atomically effectively, > preventing other threads from accessing a half-initialized or > dying ring buffer. As perf_mmap_close is now called inside the guarded region, is there potential for self deadlock? In perf_mmap it is now calling perf_mmap_close holding the event->mmap_mutex: ``` scoped_guard (mutex, &event->mmap_mutex) { [...] 
ret = map_range(event->rb, vma); if (ret) perf_mmap_close(vma); } ``` and in perf_mmap_close the mutex will be taken again: ``` static void perf_mmap_close(struct vm_area_struct *vma) { struct perf_event *event = vma->vm_file->private_data; [...] if (!refcount_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex)) goto out_put; ``` Thanks, Ian > Reported-by: kernel test robot <lkp@intel.com> > Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ > Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> > --- > kernel/events/core.c | 38 +++++++++++++++++++------------------- > 1 file changed, 19 insertions(+), 19 deletions(-) > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index 2c35acc2722b..abefd1213582 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) > ret = perf_mmap_aux(vma, event, nr_pages); > if (ret) > return ret; > - } > > - /* > - * Since pinned accounting is per vm we cannot allow fork() to copy our > - * vma. > - */ > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > - vma->vm_ops = &perf_mmap_vmops; > + /* > + * Since pinned accounting is per vm we cannot allow fork() to copy our > + * vma. > + */ > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > + vma->vm_ops = &perf_mmap_vmops; > > - mapped = get_mapped(event, event_mapped); > - if (mapped) > - mapped(event, vma->vm_mm); > + mapped = get_mapped(event, event_mapped); > + if (mapped) > + mapped(event, vma->vm_mm); > > - /* > - * Try to map it into the page table. On fail, invoke > - * perf_mmap_close() to undo the above, as the callsite expects > - * full cleanup in this case and therefore does not invoke > - * vmops::close(). > - */ > - ret = map_range(event->rb, vma); > - if (ret) > - perf_mmap_close(vma); > + /* > + * Try to map it into the page table. 
On fail, invoke > + * perf_mmap_close() to undo the above, as the callsite expects > + * full cleanup in this case and therefore does not invoke > + * vmops::close(). > + */ > + ret = map_range(event->rb, vma); > + if (ret) > + perf_mmap_close(vma); > + } > > return ret; > } > > base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 > -- > 2.51.0 > > ^ permalink raw reply [flat|nested] 15+ messages in thread
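The self-deadlock concern raised above, and the usual "_locked variant" way around it, can be sketched in userspace with pthreads. This is a minimal illustration under stated assumptions: `toy_mutex`, `toy_mmap_count`, `toy_close()`, `toy_close_locked()` and `toy_mmap_error_path()` are hypothetical names, not kernel functions, and pthread_mutex_trylock() stands in for the blocking lock so that the sketch returns EBUSY instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for event->mmap_mutex and the state it protects. */
static pthread_mutex_t toy_mutex = PTHREAD_MUTEX_INITIALIZER;
static int toy_mmap_count = 1;

/* Variant for callers that already hold toy_mutex: takes no lock itself. */
static void toy_close_locked(void)
{
	toy_mmap_count--;
}

/*
 * Variant for callers that do NOT hold toy_mutex. trylock replaces the
 * blocking lock so the sketch reports an error instead of hanging.
 */
static int toy_close(void)
{
	int ret = pthread_mutex_trylock(&toy_mutex);

	if (ret)
		return ret;	/* EBUSY: the mutex is already held */
	toy_close_locked();
	pthread_mutex_unlock(&toy_mutex);
	return 0;
}

/* Mimics an error path that runs inside the guarded region. */
int toy_mmap_error_path(void)
{
	int ret;

	pthread_mutex_lock(&toy_mutex);		/* guarded region begins */
	ret = toy_close();			/* re-acquire attempt fails */
	if (ret == EBUSY)
		toy_close_locked();		/* safe: the mutex is held here */
	pthread_mutex_unlock(&toy_mutex);
	return ret;
}
```

With a blocking lock in place of the trylock, the call to toy_close() from inside the guarded region would never return, which is the hazard described for calling perf_mmap_close() under event->mmap_mutex.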
* [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap
  2026-03-05 18:56   ` Ian Rogers
@ 2026-03-06  9:35     ` yuhaocheng035
  2026-03-06  9:36     ` Haocheng Yu
  1 sibling, 0 replies; 15+ messages in thread
From: yuhaocheng035 @ 2026-03-06 9:35 UTC (permalink / raw)
  To: irogers; +Cc: peterz, acme, security, linux-kernel, linux-perf-users, gregkh

From: Haocheng Yu <yuhaocheng035@gmail.com>

Syzkaller reported a refcount_t: addition on 0; use-after-free warning
in perf_mmap.

The issue is caused by a race condition between a failing mmap() setup
and a concurrent mmap() on a dependent event (e.g., using output
redirection).

In perf_mmap(), the ring_buffer (rb) is allocated and assigned to
event->rb with the mmap_mutex held. The mutex is then released to
perform map_range().

If map_range() fails, perf_mmap_close() is called to clean up.
However, since the mutex was dropped, another thread attaching to
this event (via inherited events or output redirection) can acquire
the mutex, observe the valid event->rb pointer, and attempt to
increment its reference count. If the cleanup path has already
dropped the reference count to zero, this results in a
use-after-free or refcount saturation warning.

Fix this by extending the scope of mmap_mutex to cover the
map_range() call. This ensures that the ring buffer initialization
and mapping (or cleanup on failure) happen effectively atomically,
preventing other threads from accessing a half-initialized or
dying ring buffer.

v2:
Expanding the guarded region would cause perf_mmap_close() to acquire
event->mmap_mutex again while it is already held, potentially leading
to a self-deadlock. The original perf_mmap_close() is therefore
retained unchanged, and a perf_mmap_close_locked() function is derived
from it with the mutex-acquisition logic modified.
Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> --- kernel/events/core.c | 152 +++++++++++++++++++++++++++++++++++++------ 1 file changed, 133 insertions(+), 19 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 2c35acc2722b..6c161761db38 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6842,6 +6842,120 @@ static void perf_mmap_close(struct vm_area_struct *vma) ring_buffer_put(rb); /* could be last */ } +static void perf_mmap_close_locked(struct vm_area_struct *vma) +{ + struct perf_event *event = vma->vm_file->private_data; + struct perf_event *iter_event; + mapped_f unmapped = get_mapped(event, event_unmapped); + struct perf_buffer *rb = ring_buffer_get(event); + struct user_struct *mmap_user = rb->mmap_user; + int mmap_locked = rb->mmap_locked; + unsigned long size = perf_data_size(rb); + bool detach_rest = false; + + /* FIXIES vs perf_pmu_unregister() */ + if (unmapped) + unmapped(event, vma->vm_mm); + + /* + * The AUX buffer is strictly a sub-buffer, serialize using aux_mutex + * to avoid complications. + */ + if (rb_has_aux(rb) && vma->vm_pgoff == rb->aux_pgoff && + refcount_dec_and_mutex_lock(&rb->aux_mmap_count, &rb->aux_mutex)) { + /* + * Stop all AUX events that are writing to this buffer, + * so that we can free its AUX pages and corresponding PMU + * data. Note that after rb::aux_mmap_count dropped to zero, + * they won't start any more (see perf_aux_output_begin()). 
+ */ + perf_pmu_output_stop(event); + + /* now it's safe to free the pages */ + atomic_long_sub(rb->aux_nr_pages - rb->aux_mmap_locked, &mmap_user->locked_vm); + atomic64_sub(rb->aux_mmap_locked, &vma->vm_mm->pinned_vm); + + /* this has to be the last one */ + rb_free_aux(rb); + WARN_ON_ONCE(refcount_read(&rb->aux_refcount)); + + mutex_unlock(&rb->aux_mutex); + } + + if (refcount_dec_and_test(&rb->mmap_count)) + detach_rest = true; + + if (!refcount_dec_and_test(&event->mmap_count)) + goto out_put; + + ring_buffer_attach(event, NULL); + + /* If there's still other mmap()s of this buffer, we're done. */ + if (!detach_rest) + goto out_put; + + /* + * No other mmap()s, detach from all other events that might redirect + * into the now unreachable buffer. Somewhat complicated by the + * fact that rb::event_lock otherwise nests inside mmap_mutex. + */ +again: + rcu_read_lock(); + list_for_each_entry_rcu(iter_event, &rb->event_list, rb_entry) { + if (!atomic_long_inc_not_zero(&iter_event->refcount)) { + /* + * This event is en-route to free_event() which will + * detach it and remove it from the list. + */ + continue; + } + rcu_read_unlock(); + + if (iter_event != event) { + mutex_lock(&iter_event->mmap_mutex); + /* + * Check we didn't race with perf_event_set_output() which can + * swizzle the rb from under us while we were waiting to + * acquire mmap_mutex. + * + * If we find a different rb; ignore this event, a next + * iteration will no longer find it on the list. We have to + * still restart the iteration to make sure we're not now + * iterating the wrong list. + */ + if (iter_event->rb == rb) + ring_buffer_attach(iter_event, NULL); + + mutex_unlock(&iter_event->mmap_mutex); + } + put_event(iter_event); + + /* + * Restart the iteration; either we're on the wrong list or + * destroyed its integrity by doing a deletion. 
+ */ + goto again; + } + rcu_read_unlock(); + + /* + * It could be there's still a few 0-ref events on the list; they'll + * get cleaned up by free_event() -- they'll also still have their + * ref on the rb and will free it whenever they are done with it. + * + * Aside from that, this buffer is 'fully' detached and unmapped, + * undo the VM accounting. + */ + + atomic_long_sub((size >> PAGE_SHIFT) + 1 - mmap_locked, + &mmap_user->locked_vm); + atomic64_sub(mmap_locked, &vma->vm_mm->pinned_vm); + free_uid(mmap_user); + +out_put: + ring_buffer_put(rb); /* could be last */ +} + static vm_fault_t perf_mmap_pfn_mkwrite(struct vm_fault *vmf) { /* The first page is the user control page, others are read-only. */ @@ -7167,28 +7281,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) ret = perf_mmap_aux(vma, event, nr_pages); if (ret) return ret; - } - /* - * Since pinned accounting is per vm we cannot allow fork() to copy our - * vma. - */ - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); - vma->vm_ops = &perf_mmap_vmops; + /* + * Since pinned accounting is per vm we cannot allow fork() to copy our + * vma. + */ + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); + vma->vm_ops = &perf_mmap_vmops; - mapped = get_mapped(event, event_mapped); - if (mapped) - mapped(event, vma->vm_mm); + mapped = get_mapped(event, event_mapped); + if (mapped) + mapped(event, vma->vm_mm); - /* - * Try to map it into the page table. On fail, invoke - * perf_mmap_close() to undo the above, as the callsite expects - * full cleanup in this case and therefore does not invoke - * vmops::close(). - */ - ret = map_range(event->rb, vma); - if (ret) - perf_mmap_close(vma); + /* + * Try to map it into the page table. On fail, invoke + * perf_mmap_close() to undo the above, as the callsite expects + * full cleanup in this case and therefore does not invoke + * vmops::close(). 
+ */ + ret = map_range(event->rb, vma); + if (ret) + perf_mmap_close_locked(vma); + } return ret; } base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 -- 2.51.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
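Editorial sketch, not part of the posted patch: the v2 changelog describes a close path that must run with event->mmap_mutex already held. A common way to structure this is a thin locking wrapper around a `_locked` helper, so that both callers share one body instead of two copies. The userspace C below illustrates that shape only. `struct demo_event`, `demo_mmap()`, `demo_close()` and `demo_close_locked()` are hypothetical names, the `map_ret` parameter merely simulates the result of map_range(), and none of this is the kernel's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a perf event; not struct perf_event. */
struct demo_event {
	pthread_mutex_t mmap_mutex;
	int mmap_count;
	bool rb_attached;
};

static void demo_event_init(struct demo_event *ev)
{
	pthread_mutex_init(&ev->mmap_mutex, NULL);
	ev->mmap_count = 0;
	ev->rb_attached = false;
}

/* Caller must already hold ev->mmap_mutex. */
static void demo_close_locked(struct demo_event *ev)
{
	if (--ev->mmap_count == 0)
		ev->rb_attached = false;	/* last mapping: detach the buffer */
}

/* Entry point for callers that do not hold the mutex. */
static void demo_close(struct demo_event *ev)
{
	pthread_mutex_lock(&ev->mmap_mutex);
	demo_close_locked(ev);
	pthread_mutex_unlock(&ev->mmap_mutex);
}

/*
 * Setup path: attach and "map" under one critical section. map_ret
 * simulates the result of the mapping step; on failure the cleanup
 * runs before the mutex is dropped, via the _locked helper.
 */
static int demo_mmap(struct demo_event *ev, int map_ret)
{
	pthread_mutex_lock(&ev->mmap_mutex);
	ev->mmap_count++;
	ev->rb_attached = true;
	if (map_ret)
		demo_close_locked(ev);
	pthread_mutex_unlock(&ev->mmap_mutex);
	return map_ret;
}
```

Whether such a split is practical for perf_mmap_close() is an open question in this thread, since the real function also takes other events' mmap_mutex while detaching them.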
* Re: [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap
  2026-03-05 18:56 ` Ian Rogers
  2026-03-06 9:35 ` yuhaocheng035
@ 2026-03-06 9:36 ` Haocheng Yu
  2026-03-06 19:04 ` Ian Rogers
  1 sibling, 1 reply; 15+ messages in thread
From: Haocheng Yu @ 2026-03-06 9:36 UTC (permalink / raw)
  To: Ian Rogers; +Cc: peterz, acme, security, linux-kernel, linux-perf-users, gregkh

That makes a lot of sense; a self-deadlock is indeed possible.

I updated the patch by deriving a `perf_mmap_close_locked` function from
`perf_mmap_close` that handles the case where event->mmap_mutex is already
held on entry. This approach is not very concise, though, and I am not sure
whether I unintentionally changed the original logic.
On the other hand, releasing the mutex before perf_mmap_close finishes
executing could reintroduce the original race condition, which puts me in
a dilemma.

Do you have any suggestions?

Thanks,
Haocheng

> On Mon, Feb 2, 2026 at 8:30 AM <yuhaocheng035@gmail.com> wrote:
> >
> > From: Haocheng Yu <yuhaocheng035@gmail.com>
> >
> > Syzkaller reported a refcount_t: addition on 0; use-after-free warning
> > in perf_mmap.
> >
> > The issue is caused by a race condition between a failing mmap() setup
> > and a concurrent mmap() on a dependent event (e.g., using output
> > redirection).
> >
> > In perf_mmap(), the ring_buffer (rb) is allocated and assigned to
> > event->rb with the mmap_mutex held. The mutex is then released to
> > perform map_range().
> >
> > If map_range() fails, perf_mmap_close() is called to clean up.
> > However, since the mutex was dropped, another thread attaching to
> > this event (via inherited events or output redirection) can acquire
> > the mutex, observe the valid event->rb pointer, and attempt to
> > increment its reference count. If the cleanup path has already
> > dropped the reference count to zero, this results in a
> > use-after-free or refcount saturation warning.
> > > > Fix this by extending the scope of mmap_mutex to cover the > > map_range() call. This ensures that the ring buffer initialization > > and mapping (or cleanup on failure) happens atomically effectively, > > preventing other threads from accessing a half-initialized or > > dying ring buffer. > > As perf_mmap_close is now called inside the guarded region, is there > potential for self deadlock? > > In perf_mmap it is now calling perf_mmap_close holding the event->mmap_mutex: > ``` > scoped_guard (mutex, &event->mmap_mutex) { > [...] > ret = map_range(event->rb, vma); > if (ret) > perf_mmap_close(vma); > } > ``` > and in perf_mmap_close the mutex will be taken again: > ``` > static void perf_mmap_close(struct vm_area_struct *vma) > { > struct perf_event *event = vma->vm_file->private_data; > [...] > if (!refcount_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex)) > goto out_put; > ``` > > Thanks, > Ian > > > Reported-by: kernel test robot <lkp@intel.com> > > Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ > > Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> > > --- > > kernel/events/core.c | 38 +++++++++++++++++++------------------- > > 1 file changed, 19 insertions(+), 19 deletions(-) > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c > > index 2c35acc2722b..abefd1213582 100644 > > --- a/kernel/events/core.c > > +++ b/kernel/events/core.c > > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) > > ret = perf_mmap_aux(vma, event, nr_pages); > > if (ret) > > return ret; > > - } > > > > - /* > > - * Since pinned accounting is per vm we cannot allow fork() to copy our > > - * vma. > > - */ > > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > - vma->vm_ops = &perf_mmap_vmops; > > + /* > > + * Since pinned accounting is per vm we cannot allow fork() to copy our > > + * vma. 
> > + */ > > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > + vma->vm_ops = &perf_mmap_vmops; > > > > - mapped = get_mapped(event, event_mapped); > > - if (mapped) > > - mapped(event, vma->vm_mm); > > + mapped = get_mapped(event, event_mapped); > > + if (mapped) > > + mapped(event, vma->vm_mm); > > > > - /* > > - * Try to map it into the page table. On fail, invoke > > - * perf_mmap_close() to undo the above, as the callsite expects > > - * full cleanup in this case and therefore does not invoke > > - * vmops::close(). > > - */ > > - ret = map_range(event->rb, vma); > > - if (ret) > > - perf_mmap_close(vma); > > + /* > > + * Try to map it into the page table. On fail, invoke > > + * perf_mmap_close() to undo the above, as the callsite expects > > + * full cleanup in this case and therefore does not invoke > > + * vmops::close(). > > + */ > > + ret = map_range(event->rb, vma); > > + if (ret) > > + perf_mmap_close(vma); > > + } > > > > return ret; > > } > > > > base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 > > -- > > 2.51.0 > > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap 2026-03-06 9:36 ` Haocheng Yu @ 2026-03-06 19:04 ` Ian Rogers 2026-03-07 5:57 ` Haocheng Yu 0 siblings, 1 reply; 15+ messages in thread From: Ian Rogers @ 2026-03-06 19:04 UTC (permalink / raw) To: Haocheng Yu Cc: peterz, acme, security, linux-kernel, linux-perf-users, gregkh On Fri, Mar 6, 2026 at 1:37 AM Haocheng Yu <yuhaocheng035@gmail.com> wrote: > > That makes a lot of sense. It's indeed possible for a self deadlock to occur. > > I tried updating my patch by modifying `perf_mmap_close` to get a > `perf_mmap_close_locked` > function that handles the case where event->mutex is held from start. > But this approach > isn't very concise, and I'm not so sure if I changed the original > logic for some unexpected reasons. > Nevertheless, releasing the mutex before perf_mmap_close finishes > executing might cause the > original race condition issue again, which puts me in a dilemma. > > Do you have any suggestions? With the: ``` + if (ret) + perf_mmap_close_locked(vma); ``` Wouldn't moving it outside the "scoped_guard(mutex, &event->mmap_mutex)" be a fix? Thanks, Ian > Thanks, > Haocheng > > > > > On Mon, Feb 2, 2026 at 8:30 AM <yuhaocheng035@gmail.com> wrote: > > > > > > From: Haocheng Yu <yuhaocheng035@gmail.com> > > > > > > Syzkaller reported a refcount_t: addition on 0; use-after-free warning > > > in perf_mmap. > > > > > > The issue is caused by a race condition between a failing mmap() setup > > > and a concurrent mmap() on a dependent event (e.g., using output > > > redirection). > > > > > > In perf_mmap(), the ring_buffer (rb) is allocated and assigned to > > > event->rb with the mmap_mutex held. The mutex is then released to > > > perform map_range(). > > > > > > If map_range() fails, perf_mmap_close() is called to clean up. 
> > > However, since the mutex was dropped, another thread attaching to > > > this event (via inherited events or output redirection) can acquire > > > the mutex, observe the valid event->rb pointer, and attempt to > > > increment its reference count. If the cleanup path has already > > > dropped the reference count to zero, this results in a > > > use-after-free or refcount saturation warning. > > > > > > Fix this by extending the scope of mmap_mutex to cover the > > > map_range() call. This ensures that the ring buffer initialization > > > and mapping (or cleanup on failure) happens atomically effectively, > > > preventing other threads from accessing a half-initialized or > > > dying ring buffer. > > > > As perf_mmap_close is now called inside the guarded region, is there > > potential for self deadlock? > > > > In perf_mmap it is now calling perf_mmap_close holding the event->mmap_mutex: > > ``` > > scoped_guard (mutex, &event->mmap_mutex) { > > [...] > > ret = map_range(event->rb, vma); > > if (ret) > > perf_mmap_close(vma); > > } > > ``` > > and in perf_mmap_close the mutex will be taken again: > > ``` > > static void perf_mmap_close(struct vm_area_struct *vma) > > { > > struct perf_event *event = vma->vm_file->private_data; > > [...] 
> > if (!refcount_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex)) > > goto out_put; > > ``` > > > > Thanks, > > Ian > > > > > Reported-by: kernel test robot <lkp@intel.com> > > > Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ > > > Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> > > > --- > > > kernel/events/core.c | 38 +++++++++++++++++++------------------- > > > 1 file changed, 19 insertions(+), 19 deletions(-) > > > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c > > > index 2c35acc2722b..abefd1213582 100644 > > > --- a/kernel/events/core.c > > > +++ b/kernel/events/core.c > > > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) > > > ret = perf_mmap_aux(vma, event, nr_pages); > > > if (ret) > > > return ret; > > > - } > > > > > > - /* > > > - * Since pinned accounting is per vm we cannot allow fork() to copy our > > > - * vma. > > > - */ > > > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > > - vma->vm_ops = &perf_mmap_vmops; > > > + /* > > > + * Since pinned accounting is per vm we cannot allow fork() to copy our > > > + * vma. > > > + */ > > > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > > + vma->vm_ops = &perf_mmap_vmops; > > > > > > - mapped = get_mapped(event, event_mapped); > > > - if (mapped) > > > - mapped(event, vma->vm_mm); > > > + mapped = get_mapped(event, event_mapped); > > > + if (mapped) > > > + mapped(event, vma->vm_mm); > > > > > > - /* > > > - * Try to map it into the page table. On fail, invoke > > > - * perf_mmap_close() to undo the above, as the callsite expects > > > - * full cleanup in this case and therefore does not invoke > > > - * vmops::close(). > > > - */ > > > - ret = map_range(event->rb, vma); > > > - if (ret) > > > - perf_mmap_close(vma); > > > + /* > > > + * Try to map it into the page table. 
On fail, invoke > > > + * perf_mmap_close() to undo the above, as the callsite expects > > > + * full cleanup in this case and therefore does not invoke > > > + * vmops::close(). > > > + */ > > > + ret = map_range(event->rb, vma); > > > + if (ret) > > > + perf_mmap_close(vma); > > > + } > > > > > > return ret; > > > } > > > > > > base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 > > > -- > > > 2.51.0 > > > > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2] perf/core: Fix refcount bug and potential UAF in perf_mmap
  2026-03-06 19:04 ` Ian Rogers
@ 2026-03-07 5:57 ` Haocheng Yu
  0 siblings, 0 replies; 15+ messages in thread
From: Haocheng Yu @ 2026-03-07 5:57 UTC (permalink / raw)
  To: Ian Rogers; +Cc: peterz, acme, security, linux-kernel, linux-perf-users, gregkh

But if it is moved out, another event could take the mutex after the
current event has decremented the refcount but before it has released
the rb, which reintroduces the race condition I described in the patch.

The following is a more detailed analysis:

In the C reproducer, these four system calls are central to the problem.
Specifically, the third syscall passes r0 as the group argument, which
establishes a shared ring buffer. The fourth syscall uses an unusual flag
combination, which is likely the reason this bug can be triggered.

  res = syscall(__NR_perf_event_open, /*attr=*/0x200000000000ul, /*pid=*/0,
                /*cpu=*/1ul, /*group=*/(intptr_t)-1,
                /*flags=PERF_FLAG_FD_CLOEXEC*/ 8ul);
  syscall(__NR_mmap, /*addr=*/0x200000002000ul, /*len=*/0x1000ul,
          /*prot=*/0ul, /*flags=MAP_FIXED|MAP_SHARED*/ 0x11ul,
          /*fd=*/r[0], /*offset=*/0ul);
  res = syscall(__NR_perf_event_open, /*attr=*/0x200000000000ul, /*pid=*/0,
                /*cpu=*/1ul, /*group=*/r[0],
                /*flags=PERF_FLAG_FD_OUTPUT*/ 2ul);
  syscall(__NR_mmap, /*addr=*/0x200000186000ul, /*len=*/0x1000ul,
          /*prot=PROT_GROWSDOWN|PROT_SEM|PROT_WRITE|PROT_READ*/ 0x100000bul,
          /*flags=MAP_SHARED_VALIDATE|MAP_FIXED*/ 0x13ul,
          /*fd=*/r[1], /*offset=*/0ul);

Here is what happens: r0 enters perf_mmap() first. It acquires the mutex,
executes perf_mmap_rb(), releases the mutex, and then calls map_range().
If map_range() fails, the function enters perf_mmap_close(), which drops
the refcount to 0 and finally releases the rb. At exactly that moment
(after the refcount has been decremented but before the rb is released),
r1 also enters perf_mmap() and attempts to attach to r0's ring buffer.
Because the mutex is released, the second mmap can acquire the mutex and access the rb pointer which is shared with r0 before it is cleared, attempting to increment the refcount on a buffer that is already being destroyed. > On Fri, Mar 6, 2026 at 1:37 AM Haocheng Yu <yuhaocheng035@gmail.com> wrote: > > > > That makes a lot of sense. It's indeed possible for a self deadlock to occur. > > > > I tried updating my patch by modifying `perf_mmap_close` to get a > > `perf_mmap_close_locked` > > function that handles the case where event->mutex is held from start. > > But this approach > > isn't very concise, and I'm not so sure if I changed the original > > logic for some unexpected reasons. > > Nevertheless, releasing the mutex before perf_mmap_close finishes > > executing might cause the > > original race condition issue again, which puts me in a dilemma. > > > > Do you have any suggestions? > > With the: > ``` > + if (ret) > + perf_mmap_close_locked(vma); > ``` > Wouldn't moving it outside the "scoped_guard(mutex, > &event->mmap_mutex)" be a fix? > > Thanks, > Ian > > > Thanks, > > Haocheng > > > > > > > > > On Mon, Feb 2, 2026 at 8:30 AM <yuhaocheng035@gmail.com> wrote: > > > > > > > > From: Haocheng Yu <yuhaocheng035@gmail.com> > > > > > > > > Syzkaller reported a refcount_t: addition on 0; use-after-free warning > > > > in perf_mmap. > > > > > > > > The issue is caused by a race condition between a failing mmap() setup > > > > and a concurrent mmap() on a dependent event (e.g., using output > > > > redirection). > > > > > > > > In perf_mmap(), the ring_buffer (rb) is allocated and assigned to > > > > event->rb with the mmap_mutex held. The mutex is then released to > > > > perform map_range(). > > > > > > > > If map_range() fails, perf_mmap_close() is called to clean up. 
> > > > However, since the mutex was dropped, another thread attaching to > > > > this event (via inherited events or output redirection) can acquire > > > > the mutex, observe the valid event->rb pointer, and attempt to > > > > increment its reference count. If the cleanup path has already > > > > dropped the reference count to zero, this results in a > > > > use-after-free or refcount saturation warning. > > > > > > > > Fix this by extending the scope of mmap_mutex to cover the > > > > map_range() call. This ensures that the ring buffer initialization > > > > and mapping (or cleanup on failure) happens atomically effectively, > > > > preventing other threads from accessing a half-initialized or > > > > dying ring buffer. > > > > > > As perf_mmap_close is now called inside the guarded region, is there > > > potential for self deadlock? > > > > > > In perf_mmap it is now calling perf_mmap_close holding the event->mmap_mutex: > > > ``` > > > scoped_guard (mutex, &event->mmap_mutex) { > > > [...] > > > ret = map_range(event->rb, vma); > > > if (ret) > > > perf_mmap_close(vma); > > > } > > > ``` > > > and in perf_mmap_close the mutex will be taken again: > > > ``` > > > static void perf_mmap_close(struct vm_area_struct *vma) > > > { > > > struct perf_event *event = vma->vm_file->private_data; > > > [...] 
> > > if (!refcount_dec_and_mutex_lock(&event->mmap_count, &event->mmap_mutex)) > > > goto out_put; > > > ``` > > > > > > Thanks, > > > Ian > > > > > > > Reported-by: kernel test robot <lkp@intel.com> > > > > Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ > > > > Signed-off-by: Haocheng Yu <yuhaocheng035@gmail.com> > > > > --- > > > > kernel/events/core.c | 38 +++++++++++++++++++------------------- > > > > 1 file changed, 19 insertions(+), 19 deletions(-) > > > > > > > > diff --git a/kernel/events/core.c b/kernel/events/core.c > > > > index 2c35acc2722b..abefd1213582 100644 > > > > --- a/kernel/events/core.c > > > > +++ b/kernel/events/core.c > > > > @@ -7167,28 +7167,28 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma) > > > > ret = perf_mmap_aux(vma, event, nr_pages); > > > > if (ret) > > > > return ret; > > > > - } > > > > > > > > - /* > > > > - * Since pinned accounting is per vm we cannot allow fork() to copy our > > > > - * vma. > > > > - */ > > > > - vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > > > - vma->vm_ops = &perf_mmap_vmops; > > > > + /* > > > > + * Since pinned accounting is per vm we cannot allow fork() to copy our > > > > + * vma. > > > > + */ > > > > + vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); > > > > + vma->vm_ops = &perf_mmap_vmops; > > > > > > > > - mapped = get_mapped(event, event_mapped); > > > > - if (mapped) > > > > - mapped(event, vma->vm_mm); > > > > + mapped = get_mapped(event, event_mapped); > > > > + if (mapped) > > > > + mapped(event, vma->vm_mm); > > > > > > > > - /* > > > > - * Try to map it into the page table. On fail, invoke > > > > - * perf_mmap_close() to undo the above, as the callsite expects > > > > - * full cleanup in this case and therefore does not invoke > > > > - * vmops::close(). 
> > > > - */ > > > > - ret = map_range(event->rb, vma); > > > > - if (ret) > > > > - perf_mmap_close(vma); > > > > + /* > > > > + * Try to map it into the page table. On fail, invoke > > > > + * perf_mmap_close() to undo the above, as the callsite expects > > > > + * full cleanup in this case and therefore does not invoke > > > > + * vmops::close(). > > > > + */ > > > > + ret = map_range(event->rb, vma); > > > > + if (ret) > > > > + perf_mmap_close(vma); > > > > + } > > > > > > > > return ret; > > > > } > > > > > > > > base-commit: 7d0a66e4bb9081d75c82ec4957c50034cb0ea449 > > > > -- > > > > 2.51.0 > > > > > > > > ^ permalink raw reply [flat|nested] 15+ messages in thread
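Editorial sketch, not from the thread: the window described in the analysis above is one in which the buffer's count has already reached zero while the pointer to it is still visible. The userspace C below illustrates, with C11 atomics, why a late attacher must not blindly increment such a counter: an acquire in the style of the kernel's inc-not-zero helpers refuses a count of zero instead of reviving it. `struct demo_rb`, `demo_rb_get_unless_zero()` and `demo_rb_put()` are hypothetical names, and this is an illustration of the failure mode, not a proposed fix for perf_mmap().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted buffer; not the kernel's struct perf_buffer. */
struct demo_rb {
	atomic_int refcount;
};

/* Take a reference only if the object is still live (count > 0). */
static bool demo_rb_get_unless_zero(struct demo_rb *rb)
{
	int old = atomic_load(&rb->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&rb->refcount, &old, old + 1))
			return true;
	}
	return false;	/* already dying: caller must not touch the object */
}

/* Drop a reference; returns true if this was the last one. */
static bool demo_rb_put(struct demo_rb *rb)
{
	return atomic_fetch_sub(&rb->refcount, 1) == 1;
}
```

An unconditional increment in the same position would take the count from 0 to 1 on an object that is about to be freed, which matches the condition the "addition on 0" warning reports.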
* Re: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap 2026-02-01 11:34 ` [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap Haocheng Yu 2026-02-01 11:49 ` Greg KH @ 2026-02-01 18:43 ` kernel test robot 1 sibling, 0 replies; 15+ messages in thread From: kernel test robot @ 2026-02-01 18:43 UTC (permalink / raw) To: Haocheng Yu, acme; +Cc: oe-kbuild-all, linux-kernel, linux-perf-users, gregkh Hi Haocheng, kernel test robot noticed the following build warnings: [auto build test WARNING on perf-tools-next/perf-tools-next] [also build test WARNING on tip/perf/core perf-tools/perf-tools linus/master v6.19-rc7 next-20260130] [cannot apply to acme/perf/core] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Haocheng-Yu/perf-core-Fix-refcount-bug-and-potential-UAF-in-perf_mmap/20260201-193746 base: https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git perf-tools-next patch link: https://lore.kernel.org/r/20260201113446.4328-1-yuhaocheng035%40gmail.com patch subject: [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap config: mips-randconfig-r072-20260201 (https://download.01.org/0day-ci/archive/20260202/202602020208.m7KIjdzW-lkp@intel.com/config) compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 9b8addffa70cee5b2acc5454712d9cf78ce45710) smatch version: v0.5.0-8994-gd50c5a4c If you fix the issue in a separate patch/commit (i.e. 
not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202602020208.m7KIjdzW-lkp@intel.com/ smatch warnings: kernel/events/core.c:7183 perf_mmap() warn: inconsistent indenting vim +7183 kernel/events/core.c 7b732a75047738 kernel/perf_counter.c Peter Zijlstra 2009-03-23 7131 37d81828385f8f kernel/perf_counter.c Paul Mackerras 2009-03-23 7132 static int perf_mmap(struct file *file, struct vm_area_struct *vma) 37d81828385f8f kernel/perf_counter.c Paul Mackerras 2009-03-23 7133 { cdd6c482c9ff9c kernel/perf_event.c Ingo Molnar 2009-09-21 7134 struct perf_event *event = file->private_data; 81e026ca47b386 kernel/events/core.c Thomas Gleixner 2025-08-12 7135 unsigned long vma_size, nr_pages; da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7136 mapped_f mapped; 5d299897f1e360 kernel/events/core.c Peter Zijlstra 2025-08-12 7137 int ret; d57e34fdd60be7 kernel/perf_event.c Peter Zijlstra 2010-05-28 7138 c7920614cebbf2 kernel/perf_event.c Peter Zijlstra 2010-05-18 7139 /* c7920614cebbf2 kernel/perf_event.c Peter Zijlstra 2010-05-18 7140 * Don't allow mmap() of inherited per-task counters. This would c7920614cebbf2 kernel/perf_event.c Peter Zijlstra 2010-05-18 7141 * create a performance issue due to all children writing to the 76369139ceb955 kernel/events/core.c Frederic Weisbecker 2011-05-19 7142 * same rb. 
c7920614cebbf2 kernel/perf_event.c Peter Zijlstra 2010-05-18 7143 */ c7920614cebbf2 kernel/perf_event.c Peter Zijlstra 2010-05-18 7144 if (event->cpu == -1 && event->attr.inherit) c7920614cebbf2 kernel/perf_event.c Peter Zijlstra 2010-05-18 7145 return -EINVAL; 4ec8363dfc1451 kernel/events/core.c Vince Weaver 2011-06-01 7146 43a21ea81a2400 kernel/perf_counter.c Peter Zijlstra 2009-03-25 7147 if (!(vma->vm_flags & VM_SHARED)) 37d81828385f8f kernel/perf_counter.c Paul Mackerras 2009-03-23 7148 return -EINVAL; 26cb63ad11e040 kernel/events/core.c Peter Zijlstra 2013-05-28 7149 da97e18458fb42 kernel/events/core.c Joel Fernandes (Google 2019-10-14 7150) ret = security_perf_event_read(event); da97e18458fb42 kernel/events/core.c Joel Fernandes (Google 2019-10-14 7151) if (ret) da97e18458fb42 kernel/events/core.c Joel Fernandes (Google 2019-10-14 7152) return ret; 26cb63ad11e040 kernel/events/core.c Peter Zijlstra 2013-05-28 7153 7b732a75047738 kernel/perf_counter.c Peter Zijlstra 2009-03-23 7154 vma_size = vma->vm_end - vma->vm_start; 0c8a4e4139adf0 kernel/events/core.c Peter Zijlstra 2024-11-04 7155 nr_pages = vma_size / PAGE_SIZE; ac9721f3f54b27 kernel/perf_event.c Peter Zijlstra 2010-05-27 7156 0c8a4e4139adf0 kernel/events/core.c Peter Zijlstra 2024-11-04 7157 if (nr_pages > INT_MAX) 0c8a4e4139adf0 kernel/events/core.c Peter Zijlstra 2024-11-04 7158 return -ENOMEM; 9a0f05cb368885 kernel/events/core.c Peter Zijlstra 2011-11-21 7159 0c8a4e4139adf0 kernel/events/core.c Peter Zijlstra 2024-11-04 7160 if (vma_size != PAGE_SIZE * nr_pages) 0c8a4e4139adf0 kernel/events/core.c Peter Zijlstra 2024-11-04 7161 return -EINVAL; 45bfb2e50471ab kernel/events/core.c Peter Zijlstra 2015-01-14 7162 d23a6dbc0a7174 kernel/events/core.c Peter Zijlstra 2025-08-12 7163 scoped_guard (mutex, &event->mmap_mutex) { da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7164 /* da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7165 * This relies on __pmu_detach_event() taking 
mmap_mutex after marking da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7166 * the event REVOKED. Either we observe the state, or __pmu_detach_event() da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7167 * will detach the rb created here. da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7168 */ d23a6dbc0a7174 kernel/events/core.c Peter Zijlstra 2025-08-12 7169 if (event->state <= PERF_EVENT_STATE_REVOKED) d23a6dbc0a7174 kernel/events/core.c Peter Zijlstra 2025-08-12 7170 return -ENODEV; 37d81828385f8f kernel/perf_counter.c Paul Mackerras 2009-03-23 7171 5d299897f1e360 kernel/events/core.c Peter Zijlstra 2025-08-12 7172 if (vma->vm_pgoff == 0) 5d299897f1e360 kernel/events/core.c Peter Zijlstra 2025-08-12 7173 ret = perf_mmap_rb(vma, event, nr_pages); 5d299897f1e360 kernel/events/core.c Peter Zijlstra 2025-08-12 7174 else 2aee3768239133 kernel/events/core.c Peter Zijlstra 2025-08-12 7175 ret = perf_mmap_aux(vma, event, nr_pages); 07091aade394f6 kernel/events/core.c Thomas Gleixner 2025-08-02 7176 if (ret) 07091aade394f6 kernel/events/core.c Thomas Gleixner 2025-08-02 7177 return ret; 07091aade394f6 kernel/events/core.c Thomas Gleixner 2025-08-02 7178 9bb5d40cd93c9d kernel/events/core.c Peter Zijlstra 2013-06-04 7179 /* 9bb5d40cd93c9d kernel/events/core.c Peter Zijlstra 2013-06-04 7180 * Since pinned accounting is per vm we cannot allow fork() to copy our 9bb5d40cd93c9d kernel/events/core.c Peter Zijlstra 2013-06-04 7181 * vma. 
9bb5d40cd93c9d kernel/events/core.c Peter Zijlstra 2013-06-04 7182 */ 1c71222e5f2393 kernel/events/core.c Suren Baghdasaryan 2023-01-26 @7183 vm_flags_set(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); 37d81828385f8f kernel/perf_counter.c Paul Mackerras 2009-03-23 7184 vma->vm_ops = &perf_mmap_vmops; 7b732a75047738 kernel/perf_counter.c Peter Zijlstra 2009-03-23 7185 da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7186 mapped = get_mapped(event, event_mapped); da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7187 if (mapped) da916e96e2dedc kernel/events/core.c Peter Zijlstra 2024-10-25 7188 mapped(event, vma->vm_mm); 1e0fb9ec679c92 kernel/events/core.c Andy Lutomirski 2014-10-24 7189 f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7190 /* f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7191 * Try to map it into the page table. On fail, invoke f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7192 * perf_mmap_close() to undo the above, as the callsite expects f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7193 * full cleanup in this case and therefore does not invoke f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7194 * vmops::close(). 
f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7195 */ 191759e5ea9f69 kernel/events/core.c Peter Zijlstra 2025-08-12 7196 ret = map_range(event->rb, vma); f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7197 if (ret) f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7198 perf_mmap_close(vma); 8f75f689bf8133 kernel/events/core.c Haocheng Yu 2026-02-01 7199 } f74b9f4ba63ffd kernel/events/core.c Thomas Gleixner 2025-08-02 7200 7b732a75047738 kernel/perf_counter.c Peter Zijlstra 2009-03-23 7201 return ret; 37d81828385f8f kernel/perf_counter.c Paul Mackerras 2009-03-23 7202 } 37d81828385f8f kernel/perf_counter.c Paul Mackerras 2009-03-23 7203 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 15+ messages in thread
Thread overview: 15+ messages
[not found] <2026020124-flashbulb-stumble-f24a@gregkh>
2026-02-01 11:34 ` [PATCH] perf/core: Fix refcount bug and potential UAF in perf_mmap Haocheng Yu
2026-02-01 11:49 ` Greg KH
2026-02-02 7:44 ` Haocheng Yu
2026-02-02 13:58 ` Peter Zijlstra
2026-02-02 14:36 ` Peter Zijlstra
2026-02-02 15:51 ` 余昊铖
2026-02-02 16:20 ` [PATCH v2] " yuhaocheng035
2026-02-06 9:06 ` Peter Zijlstra
2026-02-09 15:26 ` 余昊铖
2026-03-05 18:56 ` Ian Rogers
2026-03-06 9:35 ` yuhaocheng035
2026-03-06 9:36 ` Haocheng Yu
2026-03-06 19:04 ` Ian Rogers
2026-03-07 5:57 ` Haocheng Yu
2026-02-01 18:43 ` [PATCH] " kernel test robot