* [PATCH] buildid: validate page-backed file before parsing build ID
@ 2025-12-23 10:32 Jinchao Wang
2025-12-23 17:29 ` Andrew Morton
2025-12-23 19:05 ` Shakeel Butt
0 siblings, 2 replies; 9+ messages in thread
From: Jinchao Wang @ 2025-12-23 10:32 UTC (permalink / raw)
To: Andrew Morton, Song Liu, Jiri Olsa, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, linux-kernel, bpf
Cc: Jinchao Wang, syzbot+e008db2ac01e282550ee
__build_id_parse() only works on page-backed storage. Its helper paths
eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
that do not map a regular file or lack valid address_space operations.
Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
---
lib/buildid.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/buildid.c b/lib/buildid.c
index aaf61dfc0919..7131594cb071 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
int ret;
/* only works for page backed storage */
- if (!vma->vm_file)
+ if (!vma->vm_file ||
+ !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
+ !vma->vm_file->f_mapping->a_ops ||
+ !vma->vm_file->f_mapping->a_ops->read_folio)
return -EINVAL;
freader_init_from_file(&r, buf, sizeof(buf), vma->vm_file, may_fault);
--
2.43.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2025-12-23 10:32 [PATCH] buildid: validate page-backed file before parsing build ID Jinchao Wang
@ 2025-12-23 17:29 ` Andrew Morton
2025-12-30 22:11 ` David Hildenbrand (Red Hat)
2025-12-23 19:05 ` Shakeel Butt
1 sibling, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2025-12-23 17:29 UTC (permalink / raw)
To: Jinchao Wang
Cc: Song Liu, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
Hao Luo, linux-kernel, bpf, syzbot+e008db2ac01e282550ee,
Axel Rasmussen, David Hildenbrand, Johannes Weiner,
Lorenzo Stoakes, Michal Hocko, Qi Zheng, Shakeel Butt, Wei Xu,
Yuanchu Xie, Andrii Nakryiko, Eduard Zingerman, Omar Sandoval,
Deepanshu Kartikey, Alexei Starovoitov, Daniel Borkman, Hao Luo,
Jiri Olsa, John Fastabend, KP Singh, Martin KaFai Lau, Song Liu,
Stanislav Fomichev, Yonghong Song
On Tue, 23 Dec 2025 18:32:07 +0800 Jinchao Wang <wangjinchao600@gmail.com> wrote:
> __build_id_parse() only works on page-backed storage. Its helper paths
> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
> that do not map a regular file or lack valid address_space operations.
>
> Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
> Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
>
> ...
>
> --- a/lib/buildid.c
> +++ b/lib/buildid.c
> @@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
> int ret;
>
> /* only works for page backed storage */
> - if (!vma->vm_file)
> + if (!vma->vm_file ||
> + !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
> + !vma->vm_file->f_mapping->a_ops ||
> + !vma->vm_file->f_mapping->a_ops->read_folio)
> return -EINVAL;
>
> freader_init_from_file(&r, buf, sizeof(buf), vma->vm_file, may_fault);
Thanks. Seems this one needs additional paperwork.
I added the below:
Fixes: ad41251c290d ("lib/buildid: implement sleepable build_id_parse() API")
Tested-by: <syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com>
Link: https://lkml.kernel.org/r/694a67ab.050a0220.19928e.001c.GAE@google.com
Closes: https://lkml.kernel.org/r/693540fe.a70a0220.38f243.004c.GAE@google.com
Cc: <stable@vger.kernel.org>
and a large number of cc's which I scraped together from various
emails.
Could people please eyeball all of this and verify that everything is
good?
From: Jinchao Wang <wangjinchao600@gmail.com>
Subject: buildid: validate page-backed file before parsing build ID
Date: Tue, 23 Dec 2025 18:32:07 +0800
__build_id_parse() only works on page-backed storage. Its helper paths
eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
that do not map a regular file or lack valid address_space operations.
Link: https://lkml.kernel.org/r/20251223103214.2412446-1-wangjinchao600@gmail.com
Fixes: ad41251c290d ("lib/buildid: implement sleepable build_id_parse() API")
Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
Reported-by: <syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com>
Tested-by: <syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com>
Link: https://lkml.kernel.org/r/694a67ab.050a0220.19928e.001c.GAE@google.com
Closes: https://lkml.kernel.org/r/693540fe.a70a0220.38f243.004c.GAE@google.com
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Wei Xu <weixugc@google.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Deepanshu Kartikey <kartikey406@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkman <daniel@iogearbox.net>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
lib/buildid.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/lib/buildid.c~buildid-validate-page-backed-file-before-parsing-build-id
+++ a/lib/buildid.c
@@ -288,7 +288,10 @@ static int __build_id_parse(struct vm_ar
int ret;
/* only works for page backed storage */
- if (!vma->vm_file)
+ if (!vma->vm_file ||
+ !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
+ !vma->vm_file->f_mapping->a_ops ||
+ !vma->vm_file->f_mapping->a_ops->read_folio)
return -EINVAL;
freader_init_from_file(&r, buf, sizeof(buf), vma->vm_file, may_fault);
_
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2025-12-23 10:32 [PATCH] buildid: validate page-backed file before parsing build ID Jinchao Wang
2025-12-23 17:29 ` Andrew Morton
@ 2025-12-23 19:05 ` Shakeel Butt
2025-12-24 3:29 ` Jinchao Wang
1 sibling, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2025-12-23 19:05 UTC (permalink / raw)
To: Jinchao Wang
Cc: Andrew Morton, Song Liu, Jiri Olsa, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, linux-kernel, bpf,
syzbot+e008db2ac01e282550ee
Hi Jinchao,
On Tue, Dec 23, 2025 at 06:32:07PM +0800, Jinchao Wang wrote:
> __build_id_parse() only works on page-backed storage. Its helper paths
> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
> that do not map a regular file or lack valid address_space operations.
>
> Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
> Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
Check the previous discussion on this at
https://lore.kernel.org/all/20251114193729.251892-1-ssranevjti@gmail.com/
The preferred solution was to use kernel_read() call instead of adding
more such checks. Please check and test the patch at
https://lore.kernel.org/20251222205859.3968077-1-shakeel.butt@linux.dev/
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2025-12-23 19:05 ` Shakeel Butt
@ 2025-12-24 3:29 ` Jinchao Wang
0 siblings, 0 replies; 9+ messages in thread
From: Jinchao Wang @ 2025-12-24 3:29 UTC (permalink / raw)
To: Shakeel Butt
Cc: Andrew Morton, Song Liu, Jiri Olsa, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, linux-kernel, bpf,
syzbot+e008db2ac01e282550ee
On Tue, Dec 23, 2025 at 11:05:49AM -0800, Shakeel Butt wrote:
> Hi Jinchao,
>
> On Tue, Dec 23, 2025 at 06:32:07PM +0800, Jinchao Wang wrote:
> > __build_id_parse() only works on page-backed storage. Its helper paths
> > eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
> > that do not map a regular file or lack valid address_space operations.
> >
> > Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
> > Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
>
> Check the previous discussion on this at
> https://lore.kernel.org/all/20251114193729.251892-1-ssranevjti@gmail.com/
>
> The preferred solution was to use kernel_read() call instead of adding
> more such checks. Please check and test the patch at
> https://lore.kernel.org/20251222205859.3968077-1-shakeel.butt@linux.dev/
>
Thanks for the pointer.
After reading the discussion and the patch, I agree with you.
I also tested your patch, it fixes:
https://syzkaller.appspot.com/bug?extid=e008db2ac01e282550ee
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2025-12-23 17:29 ` Andrew Morton
@ 2025-12-30 22:11 ` David Hildenbrand (Red Hat)
2026-01-05 22:52 ` Andrii Nakryiko
0 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-30 22:11 UTC (permalink / raw)
To: Andrew Morton, Jinchao Wang
Cc: Song Liu, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman,
Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
Hao Luo, linux-kernel, bpf, syzbot+e008db2ac01e282550ee,
Axel Rasmussen, Johannes Weiner, Lorenzo Stoakes, Michal Hocko,
Qi Zheng, Shakeel Butt, Wei Xu, Yuanchu Xie, Omar Sandoval,
Deepanshu Kartikey
On 12/23/25 18:29, Andrew Morton wrote:
> On Tue, 23 Dec 2025 18:32:07 +0800 Jinchao Wang <wangjinchao600@gmail.com> wrote:
>
>> __build_id_parse() only works on page-backed storage. Its helper paths
>> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
>> that do not map a regular file or lack valid address_space operations.
>>
>> Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
>> Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
>>
>> ...
>>
>> --- a/lib/buildid.c
>> +++ b/lib/buildid.c
>> @@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
>> int ret;
>>
>> /* only works for page backed storage */
>> - if (!vma->vm_file)
>> + if (!vma->vm_file ||
>> + !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
>> + !vma->vm_file->f_mapping->a_ops ||
>> + !vma->vm_file->f_mapping->a_ops->read_folio)
>> return -EINVAL;
Just wondering, we are fine with MAP_PRIVATE files, right? I guess it's
not about the actual content in the VMA (which might be different for a
MAP_PRIVATE VMA), but only about the content of the mapped file.
LGTM, although I wonder whether some of these these checks should be
exposed as part of the read_cache_folio()/do_read_cache_folio() API.
Like, having a helper function that tells us whether we can use
do_read_cache_folio() against a given mapping+file.
--
Cheers
David
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2025-12-30 22:11 ` David Hildenbrand (Red Hat)
@ 2026-01-05 22:52 ` Andrii Nakryiko
2026-01-06 19:16 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 9+ messages in thread
From: Andrii Nakryiko @ 2026-01-05 22:52 UTC (permalink / raw)
To: David Hildenbrand (Red Hat)
Cc: Andrew Morton, Jinchao Wang, Song Liu, Jiri Olsa,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, linux-kernel, bpf,
syzbot+e008db2ac01e282550ee, Axel Rasmussen, Johannes Weiner,
Lorenzo Stoakes, Michal Hocko, Qi Zheng, Shakeel Butt, Wei Xu,
Yuanchu Xie, Omar Sandoval, Deepanshu Kartikey
On Tue, Dec 30, 2025 at 2:11 PM David Hildenbrand (Red Hat)
<david@kernel.org> wrote:
>
> On 12/23/25 18:29, Andrew Morton wrote:
> > On Tue, 23 Dec 2025 18:32:07 +0800 Jinchao Wang <wangjinchao600@gmail.com> wrote:
> >
> >> __build_id_parse() only works on page-backed storage. Its helper paths
> >> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
> >> that do not map a regular file or lack valid address_space operations.
> >>
> >> Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
> >> Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
> >>
> >> ...
> >>
> >> --- a/lib/buildid.c
> >> +++ b/lib/buildid.c
> >> @@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
> >> int ret;
> >>
> >> /* only works for page backed storage */
> >> - if (!vma->vm_file)
> >> + if (!vma->vm_file ||
> >> + !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
> >> + !vma->vm_file->f_mapping->a_ops ||
> >> + !vma->vm_file->f_mapping->a_ops->read_folio)
> >> return -EINVAL;
>
> Just wondering, we are fine with MAP_PRIVATE files, right? I guess it's
> not about the actual content in the VMA (which might be different for a
> MAP_PRIVATE VMA), but only about the content of the mapped file.
Yep, this code is fetching contents of a file that backs given VMA.
>
>
> LGTM, although I wonder whether some of these these checks should be
> exposed as part of the read_cache_folio()/do_read_cache_folio() API.
>
> Like, having a helper function that tells us whether we can use
> do_read_cache_folio() against a given mapping+file.
I agree, this seems to be leaking a lot of internal mm details into
higher-level caller (__build_id_parse). Right now we try to fetch
folio with filemap_get_folio() and if that succeeds, then we do
read_cache_folio. Would it be possible for filemap_get_folio() to
return error if the folio cannot be read using read_cache_folio()? Or
maybe have a variant of filemap_get_folio() that would have this
semantic?
>
> --
> Cheers
>
> David
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2026-01-05 22:52 ` Andrii Nakryiko
@ 2026-01-06 19:16 ` David Hildenbrand (Red Hat)
2026-01-09 23:43 ` Andrii Nakryiko
0 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand (Red Hat) @ 2026-01-06 19:16 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Andrew Morton, Jinchao Wang, Song Liu, Jiri Olsa,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, linux-kernel, bpf,
syzbot+e008db2ac01e282550ee, Axel Rasmussen, Johannes Weiner,
Lorenzo Stoakes, Michal Hocko, Qi Zheng, Shakeel Butt, Wei Xu,
Yuanchu Xie, Omar Sandoval, Deepanshu Kartikey
On 1/5/26 23:52, Andrii Nakryiko wrote:
> On Tue, Dec 30, 2025 at 2:11 PM David Hildenbrand (Red Hat)
> <david@kernel.org> wrote:
>>
>> On 12/23/25 18:29, Andrew Morton wrote:
>>> On Tue, 23 Dec 2025 18:32:07 +0800 Jinchao Wang <wangjinchao600@gmail.com> wrote:
>>>
>>>> __build_id_parse() only works on page-backed storage. Its helper paths
>>>> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
>>>> that do not map a regular file or lack valid address_space operations.
>>>>
>>>> Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
>>>> Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
>>>>
>>>> ...
>>>>
>>>> --- a/lib/buildid.c
>>>> +++ b/lib/buildid.c
>>>> @@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
>>>> int ret;
>>>>
>>>> /* only works for page backed storage */
>>>> - if (!vma->vm_file)
>>>> + if (!vma->vm_file ||
>>>> + !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
>>>> + !vma->vm_file->f_mapping->a_ops ||
>>>> + !vma->vm_file->f_mapping->a_ops->read_folio)
>>>> return -EINVAL;
>>
>> Just wondering, we are fine with MAP_PRIVATE files, right? I guess it's
>> not about the actual content in the VMA (which might be different for a
>> MAP_PRIVATE VMA), but only about the content of the mapped file.
>
> Yep, this code is fetching contents of a file that backs given VMA.
Good!
>
>>
>>
>> LGTM, although I wonder whether some of these these checks should be
>> exposed as part of the read_cache_folio()/do_read_cache_folio() API.
>>
>> Like, having a helper function that tells us whether we can use
>> do_read_cache_folio() against a given mapping+file.
>
> I agree, this seems to be leaking a lot of internal mm details into
> higher-level caller (__build_id_parse). Right now we try to fetch
> folio with filemap_get_folio() and if that succeeds, then we do
> read_cache_folio. Would it be possible for filemap_get_folio() to
> return error if the folio cannot be read using read_cache_folio()? Or
> maybe have a variant of filemap_get_folio() that would have this
> semantic?
Good question. But really, for files that always have everything in the pagecache,
there would not be a problem, right? I'm thinking about hugetlb, for example.
There, we never expect to fallback to do_read_cache_folio().
So maybe we could just teach do_read_cache_folio() to fail properly?
diff --git a/mm/filemap.c b/mm/filemap.c
index ebd75684cb0a7..3f81b8481af4c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -4051,8 +4051,11 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
struct folio *folio;
int err;
- if (!filler)
+ if (!filler) {
+ if (!mapping->a_ops || !mapping->a_ops->read_folio)
+ return ERR_PTR(-EINVAL);
filler = mapping->a_ops->read_folio;
+ }
repeat:
folio = filemap_get_folio(mapping, index);
if (IS_ERR(folio)) {
Then __build_id_parse() would only check for the existence of vma->vm_file and maybe
the !S_ISREG(file_inode(vma->vm_file)->i_mode).
--
Cheers
David
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2026-01-06 19:16 ` David Hildenbrand (Red Hat)
@ 2026-01-09 23:43 ` Andrii Nakryiko
2026-01-11 11:32 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 9+ messages in thread
From: Andrii Nakryiko @ 2026-01-09 23:43 UTC (permalink / raw)
To: David Hildenbrand (Red Hat)
Cc: Andrew Morton, Jinchao Wang, Song Liu, Jiri Olsa,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, linux-kernel, bpf,
syzbot+e008db2ac01e282550ee, Axel Rasmussen, Johannes Weiner,
Lorenzo Stoakes, Michal Hocko, Qi Zheng, Shakeel Butt, Wei Xu,
Yuanchu Xie, Omar Sandoval, Deepanshu Kartikey
On Tue, Jan 6, 2026 at 11:16 AM David Hildenbrand (Red Hat)
<david@kernel.org> wrote:
>
> On 1/5/26 23:52, Andrii Nakryiko wrote:
> > On Tue, Dec 30, 2025 at 2:11 PM David Hildenbrand (Red Hat)
> > <david@kernel.org> wrote:
> >>
> >> On 12/23/25 18:29, Andrew Morton wrote:
> >>> On Tue, 23 Dec 2025 18:32:07 +0800 Jinchao Wang <wangjinchao600@gmail.com> wrote:
> >>>
> >>>> __build_id_parse() only works on page-backed storage. Its helper paths
> >>>> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
> >>>> that do not map a regular file or lack valid address_space operations.
> >>>>
> >>>> Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
> >>>> Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
> >>>>
> >>>> ...
> >>>>
> >>>> --- a/lib/buildid.c
> >>>> +++ b/lib/buildid.c
> >>>> @@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
> >>>> int ret;
> >>>>
> >>>> /* only works for page backed storage */
> >>>> - if (!vma->vm_file)
> >>>> + if (!vma->vm_file ||
> >>>> + !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
> >>>> + !vma->vm_file->f_mapping->a_ops ||
> >>>> + !vma->vm_file->f_mapping->a_ops->read_folio)
> >>>> return -EINVAL;
> >>
> >> Just wondering, we are fine with MAP_PRIVATE files, right? I guess it's
> >> not about the actual content in the VMA (which might be different for a
> >> MAP_PRIVATE VMA), but only about the content of the mapped file.
> >
> > Yep, this code is fetching contents of a file that backs given VMA.
>
> Good!
>
> >
> >>
> >>
> >> LGTM, although I wonder whether some of these these checks should be
> >> exposed as part of the read_cache_folio()/do_read_cache_folio() API.
> >>
> >> Like, having a helper function that tells us whether we can use
> >> do_read_cache_folio() against a given mapping+file.
> >
> > I agree, this seems to be leaking a lot of internal mm details into
> > higher-level caller (__build_id_parse). Right now we try to fetch
> > folio with filemap_get_folio() and if that succeeds, then we do
> > read_cache_folio. Would it be possible for filemap_get_folio() to
> > return error if the folio cannot be read using read_cache_folio()? Or
> > maybe have a variant of filemap_get_folio() that would have this
> > semantic?
>
> Good question. But really, for files that always have everything in the pagecache,
> there would not be a problem, right? I'm thinking about hugetlb, for example.
>
> There, we never expect to fallback to do_read_cache_folio().
>
> So maybe we could just teach do_read_cache_folio() to fail properly?
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index ebd75684cb0a7..3f81b8481af4c 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -4051,8 +4051,11 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
> struct folio *folio;
> int err;
>
> - if (!filler)
> + if (!filler) {
> + if (!mapping->a_ops || !mapping->a_ops->read_folio)
> + return ERR_PTR(-EINVAL);
> filler = mapping->a_ops->read_folio;
> + }
> repeat:
> folio = filemap_get_folio(mapping, index);
> if (IS_ERR(folio)) {
>
> Then __build_id_parse() would only check for the existence of vma->vm_file and maybe
> the !S_ISREG(file_inode(vma->vm_file)->i_mode).
>
That would be great. But something like this was proposed earlier and
Matthew didn't particularly like this approach ([0]).
[0] https://lore.kernel.org/all/aReUv1kVACh3UKv-@casper.infradead.org/
>
> --
> Cheers
>
> David
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] buildid: validate page-backed file before parsing build ID
2026-01-09 23:43 ` Andrii Nakryiko
@ 2026-01-11 11:32 ` David Hildenbrand (Red Hat)
0 siblings, 0 replies; 9+ messages in thread
From: David Hildenbrand (Red Hat) @ 2026-01-11 11:32 UTC (permalink / raw)
To: Andrii Nakryiko
Cc: Andrew Morton, Jinchao Wang, Song Liu, Jiri Olsa,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Martin KaFai Lau, Eduard Zingerman, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, linux-kernel, bpf,
syzbot+e008db2ac01e282550ee, Axel Rasmussen, Johannes Weiner,
Lorenzo Stoakes, Michal Hocko, Qi Zheng, Shakeel Butt, Wei Xu,
Yuanchu Xie, Omar Sandoval, Deepanshu Kartikey
On 1/10/26 00:43, Andrii Nakryiko wrote:
> On Tue, Jan 6, 2026 at 11:16 AM David Hildenbrand (Red Hat)
> <david@kernel.org> wrote:
>>
>> On 1/5/26 23:52, Andrii Nakryiko wrote:
>>> On Tue, Dec 30, 2025 at 2:11 PM David Hildenbrand (Red Hat)
>>> <david@kernel.org> wrote:
>>>>
>>>> On 12/23/25 18:29, Andrew Morton wrote:
>>>>> On Tue, 23 Dec 2025 18:32:07 +0800 Jinchao Wang <wangjinchao600@gmail.com> wrote:
>>>>>
>>>>>> __build_id_parse() only works on page-backed storage. Its helper paths
>>>>>> eventually call mapping->a_ops->read_folio(), so explicitly reject VMAs
>>>>>> that do not map a regular file or lack valid address_space operations.
>>>>>>
>>>>>> Reported-by: syzbot+e008db2ac01e282550ee@syzkaller.appspotmail.com
>>>>>> Signed-off-by: Jinchao Wang <wangjinchao600@gmail.com>
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> --- a/lib/buildid.c
>>>>>> +++ b/lib/buildid.c
>>>>>> @@ -280,7 +280,10 @@ static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
>>>>>> int ret;
>>>>>>
>>>>>> /* only works for page backed storage */
>>>>>> - if (!vma->vm_file)
>>>>>> + if (!vma->vm_file ||
>>>>>> + !S_ISREG(file_inode(vma->vm_file)->i_mode) ||
>>>>>> + !vma->vm_file->f_mapping->a_ops ||
>>>>>> + !vma->vm_file->f_mapping->a_ops->read_folio)
>>>>>> return -EINVAL;
>>>>
>>>> Just wondering, we are fine with MAP_PRIVATE files, right? I guess it's
>>>> not about the actual content in the VMA (which might be different for a
>>>> MAP_PRIVATE VMA), but only about the content of the mapped file.
>>>
>>> Yep, this code is fetching contents of a file that backs given VMA.
>>
>> Good!
>>
>>>
>>>>
>>>>
>>>> LGTM, although I wonder whether some of these these checks should be
>>>> exposed as part of the read_cache_folio()/do_read_cache_folio() API.
>>>>
>>>> Like, having a helper function that tells us whether we can use
>>>> do_read_cache_folio() against a given mapping+file.
>>>
>>> I agree, this seems to be leaking a lot of internal mm details into
>>> higher-level caller (__build_id_parse). Right now we try to fetch
>>> folio with filemap_get_folio() and if that succeeds, then we do
>>> read_cache_folio. Would it be possible for filemap_get_folio() to
>>> return error if the folio cannot be read using read_cache_folio()? Or
>>> maybe have a variant of filemap_get_folio() that would have this
>>> semantic?
>>
>> Good question. But really, for files that always have everything in the pagecache,
>> there would not be a problem, right? I'm thinking about hugetlb, for example.
>>
>> There, we never expect to fallback to do_read_cache_folio().
>>
>> So maybe we could just teach do_read_cache_folio() to fail properly?
>>
>> diff --git a/mm/filemap.c b/mm/filemap.c
>> index ebd75684cb0a7..3f81b8481af4c 100644
>> --- a/mm/filemap.c
>> +++ b/mm/filemap.c
>> @@ -4051,8 +4051,11 @@ static struct folio *do_read_cache_folio(struct address_space *mapping,
>> struct folio *folio;
>> int err;
>>
>> - if (!filler)
>> + if (!filler) {
>> + if (!mapping->a_ops || !mapping->a_ops->read_folio)
>> + return ERR_PTR(-EINVAL);
>> filler = mapping->a_ops->read_folio;
>> + }
>> repeat:
>> folio = filemap_get_folio(mapping, index);
>> if (IS_ERR(folio)) {
>>
>> Then __build_id_parse() would only check for the existence of vma->vm_file and maybe
>> the !S_ISREG(file_inode(vma->vm_file)->i_mode).
>>
>
> That would be great. But something like this was proposed earlier and
> Matthew didn't particularly like this approach ([0]).
>
> [0] https://lore.kernel.org/all/aReUv1kVACh3UKv-@casper.infradead.org/
Well, but on the higher level we don't know whether we have to even call
into read_cache_folio().
Again, hugetlb. Might be worth reproducing with hugetlb/shmem, and
making sure it keeps working even with your changes.
--
Cheers
David
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2026-01-11 11:32 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-23 10:32 [PATCH] buildid: validate page-backed file before parsing build ID Jinchao Wang
2025-12-23 17:29 ` Andrew Morton
2025-12-30 22:11 ` David Hildenbrand (Red Hat)
2026-01-05 22:52 ` Andrii Nakryiko
2026-01-06 19:16 ` David Hildenbrand (Red Hat)
2026-01-09 23:43 ` Andrii Nakryiko
2026-01-11 11:32 ` David Hildenbrand (Red Hat)
2025-12-23 19:05 ` Shakeel Butt
2025-12-24 3:29 ` Jinchao Wang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox