* [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page()
@ 2025-06-30 11:28 Yu Kuai
2025-06-30 15:24 ` Jens Axboe
2025-07-01 14:14 ` Jens Axboe
0 siblings, 2 replies; 8+ messages in thread
From: Yu Kuai @ 2025-06-30 11:28 UTC (permalink / raw)
To: hch, axboe, yukuai3
Cc: penguin-kernel, linux-block, linux-kernel, yukuai1, yi.zhang,
yangerkun, johnny.chenyi
From: Yu Kuai <yukuai3@huawei.com>
__xa_cmpxchg() is called with rcu_read_lock(), and it will allocate
memory if necessary.
Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile,
it still should be held before xa_unlock(), prevent returned page to be
freed by concurrent discard.
Fixes: bbcacab2e8ee ("brd: avoid extra xarray lookups on first write")
Reported-by: syzbot+ea4c8fd177a47338881a@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/685ec4c9.a00a0220.129264.000c.GAE@google.com/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
Changes in v2:
- fix typo in the subject
- add review tag.
drivers/block/brd.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index b1be6c510372..0c2eabe14af3 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -64,13 +64,15 @@ static struct page *brd_insert_page(struct brd_device *brd, sector_t sector,
rcu_read_unlock();
page = alloc_page(gfp | __GFP_ZERO | __GFP_HIGHMEM);
- rcu_read_lock();
- if (!page)
+ if (!page) {
+ rcu_read_lock();
return ERR_PTR(-ENOMEM);
+ }
xa_lock(&brd->brd_pages);
ret = __xa_cmpxchg(&brd->brd_pages, sector >> PAGE_SECTORS_SHIFT, NULL,
page, gfp);
+ rcu_read_lock();
if (ret) {
xa_unlock(&brd->brd_pages);
__free_page(page);
--
2.39.2
^ permalink raw reply related [flat|nested] 8+ messages in thread* Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() 2025-06-30 11:28 [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() Yu Kuai @ 2025-06-30 15:24 ` Jens Axboe 2025-06-30 15:28 ` Jens Axboe 2025-07-01 14:14 ` Jens Axboe 1 sibling, 1 reply; 8+ messages in thread From: Jens Axboe @ 2025-06-30 15:24 UTC (permalink / raw) To: Yu Kuai, hch, yukuai3 Cc: penguin-kernel, linux-block, linux-kernel, yi.zhang, yangerkun, johnny.chenyi On 6/30/25 5:28 AM, Yu Kuai wrote: > From: Yu Kuai <yukuai3@huawei.com> > > __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate > memory if necessary. > > Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile, > it still should be held before xa_unlock(), prevent returned page to be > freed by concurrent discard. The rcu locking in there is a bit of a mess, imho. What _exactly_ is the rcu read side locking protecting? Is it only needed around the lookup and insert? We even hold it over the kmap and copy, which seems very heavy handed. -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() 2025-06-30 15:24 ` Jens Axboe @ 2025-06-30 15:28 ` Jens Axboe 2025-07-01 1:28 ` Yu Kuai 0 siblings, 1 reply; 8+ messages in thread From: Jens Axboe @ 2025-06-30 15:28 UTC (permalink / raw) To: Yu Kuai, hch, yukuai3 Cc: penguin-kernel, linux-block, linux-kernel, yi.zhang, yangerkun, johnny.chenyi On 6/30/25 9:24 AM, Jens Axboe wrote: > On 6/30/25 5:28 AM, Yu Kuai wrote: >> From: Yu Kuai <yukuai3@huawei.com> >> >> __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate >> memory if necessary. >> >> Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile, >> it still should be held before xa_unlock(), prevent returned page to be >> freed by concurrent discard. > > The rcu locking in there is a bit of a mess, imho. What _exactly_ is the > rcu read side locking protecting? Is it only needed around the lookup > and insert? We even hold it over the kmap and copy, which seems very > heavy handed. Gah it's holding the page alive too. Can't we just grab a ref to the page when inserting it, and drop that at free time? It would be a lot better to have only the lookup be RCU protected, having the full copies under it seems kind of crazy. IOW, I think there's room for some good cleanups here. -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() 2025-06-30 15:28 ` Jens Axboe @ 2025-07-01 1:28 ` Yu Kuai 2025-07-01 3:00 ` Jens Axboe 0 siblings, 1 reply; 8+ messages in thread From: Yu Kuai @ 2025-07-01 1:28 UTC (permalink / raw) To: Jens Axboe, Yu Kuai, hch Cc: penguin-kernel, linux-block, linux-kernel, yi.zhang, yangerkun, johnny.chenyi, yukuai (C) Hi, 在 2025/06/30 23:28, Jens Axboe 写道: > On 6/30/25 9:24 AM, Jens Axboe wrote: >> On 6/30/25 5:28 AM, Yu Kuai wrote: >>> From: Yu Kuai <yukuai3@huawei.com> >>> >>> __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate >>> memory if necessary. >>> >>> Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile, >>> it still should be held before xa_unlock(), prevent returned page to be >>> freed by concurrent discard. >> >> The rcu locking in there is a bit of a mess, imho. What _exactly_ is the >> rcu read side locking protecting? Is it only needed around the lookup >> and insert? We even hold it over the kmap and copy, which seems very >> heavy handed. > > Gah it's holding the page alive too. Can't we just grab a ref to the > page when inserting it, and drop that at free time? It would be a lot > better to have only the lookup be RCU protected, having the full > copies under it seems kind of crazy. In this case, we must grab a ref to the page for each read/write as well, I choose RCU because I think it has less performance overhead than page ref, which is atomic. BTW, I thought copy at most one page is lightweight, if this is not true, I agree page ref is better. Thanks, Kuai > > IOW, I think there's room for some good cleanups here. > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() 2025-07-01 1:28 ` Yu Kuai @ 2025-07-01 3:00 ` Jens Axboe 2025-07-01 7:38 ` Yu Kuai 0 siblings, 1 reply; 8+ messages in thread From: Jens Axboe @ 2025-07-01 3:00 UTC (permalink / raw) To: Yu Kuai, hch Cc: penguin-kernel, linux-block, linux-kernel, yi.zhang, yangerkun, johnny.chenyi, yukuai (C) On 6/30/25 7:28 PM, Yu Kuai wrote: > Hi, > > ? 2025/06/30 23:28, Jens Axboe ??: >> On 6/30/25 9:24 AM, Jens Axboe wrote: >>> On 6/30/25 5:28 AM, Yu Kuai wrote: >>>> From: Yu Kuai <yukuai3@huawei.com> >>>> >>>> __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate >>>> memory if necessary. >>>> >>>> Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile, >>>> it still should be held before xa_unlock(), prevent returned page to be >>>> freed by concurrent discard. >>> >>> The rcu locking in there is a bit of a mess, imho. What _exactly_ is the >>> rcu read side locking protecting? Is it only needed around the lookup >>> and insert? We even hold it over the kmap and copy, which seems very >>> heavy handed. >> >> Gah it's holding the page alive too. Can't we just grab a ref to the >> page when inserting it, and drop that at free time? It would be a lot >> better to have only the lookup be RCU protected, having the full >> copies under it seems kind of crazy. > > In this case, we must grab a ref to the page for each read/write as > well, I choose RCU because I think it has less performance overhead than > page ref, which is atomic. BTW, I thought copy at most one page is > lightweight, if this is not true, I agree page ref is better. Right, you'd need to grab a ref. I do think that is (by far) the better solution. Yes if you microbenchmark I'm sure the current approach will look fine, but it's a heavy section inside an rcu read lock and will hold off the grace period. So yeah, I do think it'd be a lot better to do proper page references on lookup+free, and have just the lookup be behind rcu. -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() 2025-07-01 3:00 ` Jens Axboe @ 2025-07-01 7:38 ` Yu Kuai 2025-07-01 14:01 ` Jens Axboe 0 siblings, 1 reply; 8+ messages in thread From: Yu Kuai @ 2025-07-01 7:38 UTC (permalink / raw) To: Jens Axboe, Yu Kuai, hch Cc: penguin-kernel, linux-block, linux-kernel, yi.zhang, yangerkun, johnny.chenyi, yukuai (C) Hi, 在 2025/07/01 11:00, Jens Axboe 写道: > On 6/30/25 7:28 PM, Yu Kuai wrote: >> Hi, >> >> ? 2025/06/30 23:28, Jens Axboe ??: >>> On 6/30/25 9:24 AM, Jens Axboe wrote: >>>> On 6/30/25 5:28 AM, Yu Kuai wrote: >>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>> >>>>> __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate >>>>> memory if necessary. >>>>> >>>>> Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile, >>>>> it still should be held before xa_unlock(), prevent returned page to be >>>>> freed by concurrent discard. >>>> >>>> The rcu locking in there is a bit of a mess, imho. What _exactly_ is the >>>> rcu read side locking protecting? Is it only needed around the lookup >>>> and insert? We even hold it over the kmap and copy, which seems very >>>> heavy handed. >>> >>> Gah it's holding the page alive too. Can't we just grab a ref to the >>> page when inserting it, and drop that at free time? It would be a lot >>> better to have only the lookup be RCU protected, having the full >>> copies under it seems kind of crazy. >> >> In this case, we must grab a ref to the page for each read/write as >> well, I choose RCU because I think it has less performance overhead than >> page ref, which is atomic. BTW, I thought copy at most one page is >> lightweight, if this is not true, I agree page ref is better. > > Right, you'd need to grab a ref. I do think that is (by far) the better > solution. Yes if you microbenchmark I'm sure the current approach will > look fine, but it's a heavy section inside an rcu read lock and will > hold off the grace period. > > So yeah, I do think it'd be a lot better to do proper page references on > lookup+free, and have just the lookup be behind rcu. > Ok, and just to be sure, since the rcu is introduced before the fixed tag, do you think it's better to do cleanups after this patch, I prefer this way, or fix this problem directly by page ref? Thanks, Kuai ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() 2025-07-01 7:38 ` Yu Kuai @ 2025-07-01 14:01 ` Jens Axboe 0 siblings, 0 replies; 8+ messages in thread From: Jens Axboe @ 2025-07-01 14:01 UTC (permalink / raw) To: Yu Kuai, hch Cc: penguin-kernel, linux-block, linux-kernel, yi.zhang, yangerkun, johnny.chenyi, yukuai (C) On 7/1/25 1:38 AM, Yu Kuai wrote: > Hi, > > 在 2025/07/01 11:00, Jens Axboe 写道: >> On 6/30/25 7:28 PM, Yu Kuai wrote: >>> Hi, >>> >>> ? 2025/06/30 23:28, Jens Axboe ??: >>>> On 6/30/25 9:24 AM, Jens Axboe wrote: >>>>> On 6/30/25 5:28 AM, Yu Kuai wrote: >>>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>>> >>>>>> __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate >>>>>> memory if necessary. >>>>>> >>>>>> Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile, >>>>>> it still should be held before xa_unlock(), prevent returned page to be >>>>>> freed by concurrent discard. >>>>> >>>>> The rcu locking in there is a bit of a mess, imho. What _exactly_ is the >>>>> rcu read side locking protecting? Is it only needed around the lookup >>>>> and insert? We even hold it over the kmap and copy, which seems very >>>>> heavy handed. >>>> >>>> Gah it's holding the page alive too. Can't we just grab a ref to the >>>> page when inserting it, and drop that at free time? It would be a lot >>>> better to have only the lookup be RCU protected, having the full >>>> copies under it seems kind of crazy. >>> >>> In this case, we must grab a ref to the page for each read/write as >>> well, I choose RCU because I think it has less performance overhead than >>> page ref, which is atomic. BTW, I thought copy at most one page is >>> lightweight, if this is not true, I agree page ref is better. >> >> Right, you'd need to grab a ref. I do think that is (by far) the better >> solution. Yes if you microbenchmark I'm sure the current approach will >> look fine, but it's a heavy section inside an rcu read lock and will >> hold off the grace period. >> >> So yeah, I do think it'd be a lot better to do proper page references on >> lookup+free, and have just the lookup be behind rcu. >> > > Ok, and just to be sure, since the rcu is introduced before the fixed > tag, do you think it's better to do cleanups after this patch, I prefer > this way, or fix this problem directly by page ref? Yeah probably best to do the simple fix, and then base the further work on that. -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() 2025-06-30 11:28 [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() Yu Kuai 2025-06-30 15:24 ` Jens Axboe @ 2025-07-01 14:14 ` Jens Axboe 1 sibling, 0 replies; 8+ messages in thread From: Jens Axboe @ 2025-07-01 14:14 UTC (permalink / raw) To: hch, yukuai3, Yu Kuai Cc: penguin-kernel, linux-block, linux-kernel, yi.zhang, yangerkun, johnny.chenyi On Mon, 30 Jun 2025 19:28:28 +0800, Yu Kuai wrote: > __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate > memory if necessary. > > Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile, > it still should be held before xa_unlock(), prevent returned page to be > freed by concurrent discard. > > [...] Applied, thanks! [1/1] brd: fix sleeping function called from invalid context in brd_insert_page() commit: 0d519bb0de3bf0ac9e6f401d4910fc119062d7be Best regards, -- Jens Axboe ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-07-01 14:14 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-06-30 11:28 [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page() Yu Kuai 2025-06-30 15:24 ` Jens Axboe 2025-06-30 15:28 ` Jens Axboe 2025-07-01 1:28 ` Yu Kuai 2025-07-01 3:00 ` Jens Axboe 2025-07-01 7:38 ` Yu Kuai 2025-07-01 14:01 ` Jens Axboe 2025-07-01 14:14 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox