* Question: BPF stack build-id lookup while holding mmap_lock
@ 2026-06-18  3:31 Runyu Xiao
  2026-06-18  5:05 ` Ihor Solodrai
  0 siblings, 1 reply; 3+ messages in thread
From: Runyu Xiao @ 2026-06-18  3:31 UTC (permalink / raw)
  To: Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Jiri Olsa, Martin KaFai Lau, Eduard Zingerman, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, bpf,
	linux-kernel, jianhao.xu, runyu.xiao

Hi,

While auditing lock ordering around faultable build-id lookups, our
static analysis tool flagged the BPF stackmap user-build-id path, and we
manually reviewed it against the current tree.

The path we are concerned about is the sleepable helper path:

  bpf_get_stack_sleepable() / bpf_get_task_stack_sleepable()
    -> __bpf_get_stack(..., may_fault = true)
       -> stack_map_get_build_id_offset()
          -> mmap_read_trylock(current->mm)
          -> build_id_parse(vma, ...)
          -> __kernel_read()

`build_id_parse()` can read from the backing file while mmap_lock is
held.  That can form an ABBA dependency with file read paths that take
an inode-side lock first and then fault in
copy_to_user()/copy_page_to_iter(), at which point they need mmap_lock.
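
To make the ordering problem concrete, here is a toy userspace model of
the two paths.  It is only a sketch, not kernel code: the lock classes,
record_acquire() and both path functions are hypothetical names, and it
assumes the file read under mmap_lock ends up needing the same
inode-side lock that the reverse path holds when it faults.

```c
#include <stdbool.h>

/* Toy lock classes named after the thread; purely illustrative. */
enum lock_class { MMAP_LOCK, INODE_LOCK, NR_CLASSES };

/* edge[a][b] is true once some path acquired b while holding a. */
static bool edge[NR_CLASSES][NR_CLASSES];

/*
 * Record the ordering "held -> wanted".  Returns true if the reverse
 * ordering was already recorded, i.e. the two paths form an ABBA cycle.
 */
bool record_acquire(enum lock_class held, enum lock_class wanted)
{
	edge[held][wanted] = true;
	return edge[wanted][held];
}

/* Path 1: build-id lookup reads the file while mmap_lock is held. */
bool stackmap_path(void)
{
	return record_acquire(MMAP_LOCK, INODE_LOCK);
}

/* Path 2: a file read holds the inode side, then faults on the user copy. */
bool file_read_path(void)
{
	return record_acquire(INODE_LOCK, MMAP_LOCK);
}
```

Calling stackmap_path() first records mmap_lock -> inode and reports no
cycle; a later file_read_path() records the opposite edge and reports
the inversion.  Neither path has to block for the dependency to exist,
which is why a lock-order checker can flag it without a real hang.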

A minimal lockdep reproducer that preserves this BPF stackmap path and
the reverse file-read edge reports:

  WARNING: possible circular locking dependency detected
  __kernel_read
  stack_map_get_build_id_offset
  __bpf_get_stack
  *** DEADLOCK ***

The local fix I am considering is only for the faultable build-id path.
It would snapshot the VMA file reference and offset metadata under
mmap_lock, drop mmap_lock, and then parse the build-id from the file
reference with build_id_parse_file().  The existing no-fault path would
remain unchanged.

Roughly:

  1. Under mmap_lock, find the VMA for each user IP.
  2. Take a file reference and snapshot vm_start/vm_pgoff.
  3. Drop mmap_lock.
  4. Parse build IDs from the files.
  5. Fall back to reporting IPs if the faultable path cannot safely
     release mmap_lock or allocate the temporary snapshot array.

The tradeoff is that build-id parsing would happen after releasing
mmap_lock, so the VMA/file relationship is represented by the file
reference and copied metadata rather than by holding the VMA lock context
through the file read.  That avoids file I/O under mmap_lock, but may
change edge-case behavior if the mapping changes concurrently.

Does this direction sound acceptable for sleepable BPF stack helpers, or
would you prefer a stricter fallback-to-IP behavior whenever build-id
parsing would require faultable file I/O?  Another option would be to
avoid build-id parsing entirely in the may_fault=true stackmap path unless
there is an existing BPF/MM helper pattern I should reuse.

The local draft subject is:

  bpf: avoid faultable build-id lookup under mmap_lock

Thanks,
Runyu


* Re: Question: BPF stack build-id lookup while holding mmap_lock
  2026-06-18  3:31 Question: BPF stack build-id lookup while holding mmap_lock Runyu Xiao
@ 2026-06-18  5:05 ` Ihor Solodrai
  2026-06-18  5:10   ` Runyu Xiao
  0 siblings, 1 reply; 3+ messages in thread
From: Ihor Solodrai @ 2026-06-18  5:05 UTC (permalink / raw)
  To: Runyu Xiao, Song Liu, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko
  Cc: Jiri Olsa, Martin KaFai Lau, Eduard Zingerman, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, bpf,
	linux-kernel, jianhao.xu



On 6/17/26 8:31 PM, Runyu Xiao wrote:
> Hi,
> 
> While auditing lock ordering around faultable build-id lookups, our
> static analysis tool flagged the BPF stackmap user-build-id path, and we
> manually reviewed it against the current tree.
> 
> The path we are concerned about is the sleepable helper path:
> 
>   bpf_get_stack_sleepable() / bpf_get_task_stack_sleepable()
>     -> __bpf_get_stack(..., may_fault = true)
>        -> stack_map_get_build_id_offset()
>           -> mmap_read_trylock(current->mm)
>           -> build_id_parse(vma, ...)
>           -> __kernel_read()
> 
> `build_id_parse()` can read from the backing file while mmap_lock is
> held.  That can form an ABBA dependency with file read paths where the
> inode side is held first and copy_to_user/copy_page_to_iter can fault
> and then need mmap_lock.
> 
> A minimal Lockdep reproducer preserving this BPF stackmap carrier and
> the reverse file-read edge reports:
> 
>   WARNING: possible circular locking dependency detected
>   __kernel_read
>   stack_map_get_build_id_offset
>   __bpf_get_stack
>   *** DEADLOCK ***
> 
> The local fix I am considering is only for the faultable build-id path.
> It would snapshot the VMA file reference and offset metadata under
> mmap_lock, drop mmap_lock, and then parse the build-id from the file
> reference with build_id_parse_file().  The existing no-fault path would
> remain unchanged.
> 
> Roughly:
> 
>   1. Under mmap_lock, find the VMA for each user IP.
>   2. Take a file reference and snapshot vm_start/vm_pgoff.
>   3. Drop mmap_lock.
>   4. Parse build IDs from the files.
>   5. Fall back to reporting IPs if the faultable path cannot safely
>      release mmap_lock or allocate the temporary snapshot array.


Hi Runyu,

A patch implementing more or less this algorithm has recently landed:
https://lore.kernel.org/bpf/20260525223948.1920986-1-ihor.solodrai@linux.dev/

I recommend searching lore.kernel.org or another mailing list mirror
in advance, to avoid unnecessary or duplicate work.


> 
> The tradeoff is that build-id parsing would happen after releasing
> mmap_lock, so the VMA/file relationship is represented by the file
> reference and copied metadata rather than by holding the VMA lock context
> through the file read.  That avoids file I/O under mmap_lock, but may
> change edge-case behavior if the mapping changes concurrently.
> 
> Does this direction sound acceptable for sleepable BPF stack helpers, or
> would you prefer a stricter fallback-to-IP behavior whenever build-id
> parsing would require faultable file I/O?  Another option would be to
> avoid build-id parsing entirely in the may_fault=true stackmap path unless
> there is an existing BPF/MM helper pattern I should reuse.
> 
> The local draft subject is:
> 
>   bpf: avoid faultable build-id lookup under mmap_lock
> 
> Thanks,
> Runyu



* Re:Re: Question: BPF stack build-id lookup while holding mmap_lock
  2026-06-18  5:05 ` Ihor Solodrai
@ 2026-06-18  5:10   ` Runyu Xiao
  0 siblings, 0 replies; 3+ messages in thread
From: Runyu Xiao @ 2026-06-18  5:10 UTC (permalink / raw)
  To: Ihor Solodrai
  Cc: Song Liu, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Jiri Olsa, Martin KaFai Lau, Eduard Zingerman, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, bpf,
	linux-kernel, jianhao.xu

Hi,
Thanks for pointing this out, and sorry for missing the existing series.
I checked the linked patch and it covers the same lock-ordering issue I
was concerned about in the sleepable stackmap build-id path. I will drop
my local draft and avoid sending a duplicate patch.
Thanks,
Runyu



