public inbox for linux-block@vger.kernel.org
* [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
@ 2026-03-02  2:51 Jiucheng Xu via B4 Relay
  2026-03-02 13:50 ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Jiucheng Xu via B4 Relay @ 2026-03-02  2:51 UTC
  To: Jens Axboe
  Cc: linux-block, linux-kernel, jianxin.pan, tuan.zhang, Jiucheng Xu

From: Jiucheng Xu <jiucheng.xu@amlogic.com>

When current->bio_list is non-NULL in submit_bio_wait(),
submit_bio_noacct_nocheck appends bio to bio_list but skips IO
submission, causing submit_bio_wait() to hang indefinitely.

Fix this by temporarily backing up bio_list, setting it to
NULL before calling submit_bio(), and restoring it after
submit_bio() returns.

I've trimmed down the call stack, as follows:

f2fs_submit_read_io
  submit_bio
    mmc_blk_mq_recovery
      z_erofs_endio
        vm_map_ram
          __pte_alloc_kernel
            __alloc_pages_direct_reclaim
              shrink_folio_list
                __swap_writepage
                  submit_bio_wait  hang!!!
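
The deferral described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every name here (model_submit_bio, cur_bio_list, do_io and so on) is hypothetical, the "driver" completes each bio inline, and instead of actually sleeping the waiter just reports whether the bio had completed by the time submission returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; the names only mirror the kernel's. */
struct bio {
	struct bio *next;
	bool done;                    /* set once the modelled I/O completes */
	void (*end_io)(struct bio *); /* completion callback, may be NULL */
};

struct bio_list { struct bio *head, *tail; };

/* Stand-in for current->bio_list: non-NULL while a submission is active. */
static struct bio_list *cur_bio_list;

static void bio_list_add(struct bio_list *bl, struct bio *bio)
{
	bio->next = NULL;
	if (bl->tail)
		bl->tail->next = bio;
	else
		bl->head = bio;
	bl->tail = bio;
}

static struct bio *bio_list_pop(struct bio_list *bl)
{
	struct bio *bio = bl->head;

	if (bio) {
		bl->head = bio->next;
		if (!bl->head)
			bl->tail = NULL;
		bio->next = NULL;
	}
	return bio;
}

/* Modelled driver: completes the bio inline and runs its callback. */
static void do_io(struct bio *bio)
{
	bio->done = true;
	if (bio->end_io)
		bio->end_io(bio);
}

/* Nested calls only queue the bio; the outermost call drains the list. */
static void model_submit_bio(struct bio *bio)
{
	struct bio_list on_stack = { NULL, NULL };

	assert(bio != NULL);
	if (cur_bio_list) {
		bio_list_add(cur_bio_list, bio);
		return; /* no I/O has been issued yet */
	}
	cur_bio_list = &on_stack;
	do {
		do_io(bio);
	} while ((bio = bio_list_pop(&on_stack)) != NULL);
	cur_bio_list = NULL;
}

/* Returns true if a waiter sleeping right after submission would wake up. */
static bool model_submit_would_complete(struct bio *bio)
{
	bio->done = false;
	bio->end_io = NULL;
	model_submit_bio(bio);
	return bio->done;
}

static struct bio inner;
static bool nested_completed;

/* Completion callback that itself submits a bio and wants to wait on it. */
static void outer_end_io(struct bio *outer)
{
	(void)outer;
	nested_completed = model_submit_would_complete(&inner);
}

/* Returns true when the nested wait would never be satisfied. */
static bool nested_wait_would_hang(void)
{
	struct bio outer = { .end_io = outer_end_io };

	nested_completed = true;
	model_submit_bio(&outer);
	return !nested_completed;
}
```

In this model the inner bio is only issued after the outer submission unwinds, which is exactly the point a real waiter inside the callback would never reach.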

Signed-off-by: Jiucheng Xu <jiucheng.xu@amlogic.com>
---
 block/bio.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index d80d5d26804e32944bcfe4506ca190033308844f..22c8769722cc89620c239310a0f3d4924de68cf9 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1505,8 +1505,17 @@ int submit_bio_wait(struct bio *bio)
 	bio->bi_private = &done;
 	bio->bi_end_io = submit_bio_wait_endio;
 	bio->bi_opf |= REQ_SYNC;
-	submit_bio(bio);
-	blk_wait_io(&done);
+	if (!current->bio_list) {
+		submit_bio(bio);
+		blk_wait_io(&done);
+	} else {
+		struct bio_list *tmp = current->bio_list;
+
+		current->bio_list = NULL;
+		submit_bio(bio);
+		blk_wait_io(&done);
+		current->bio_list = tmp;
+	}
 
 	return blk_status_to_errno(bio->bi_status);
 }

---
base-commit: 8c5f40a3ba43ae9a26991f0e4a01a3a06e8958fc
change-id: 20260224-for-next-df6f02c3694d

Best regards,
-- 
Jiucheng Xu <jiucheng.xu@amlogic.com>




* Re: [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
  2026-03-02  2:51 [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait() Jiucheng Xu via B4 Relay
@ 2026-03-02 13:50 ` Christoph Hellwig
  2026-03-02 14:23   ` Gao Xiang
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2026-03-02 13:50 UTC
  To: jiucheng.xu
  Cc: Jens Axboe, linux-block, linux-kernel, jianxin.pan, tuan.zhang,
	Gao Xiang, linux-erofs

On Mon, Mar 02, 2026 at 10:51:03AM +0800, Jiucheng Xu via B4 Relay wrote:
> From: Jiucheng Xu <jiucheng.xu@amlogic.com>
> 
> When current->bio_list is non-NULL in submit_bio_wait(),
> submit_bio_noacct_nocheck appends bio to bio_list but skips IO
> submission, causing submit_bio_wait() to hang indefinitely.
> 
> Fix this by temporarily backup bio_list, setting bio_list to
> NULL before calling submit_bio(), then restoring bio_list
> after submit_bio() returns.

No.  Fix this by not doing something that is a bad idea.

> I've trimmed down the call stack, as follows:
> 
> f2fs_submit_read_io
>   submit_bio
>     mmc_blk_mq_recovery
>       z_erofs_endio
>         vm_map_ram

->bi_end_io code really should not have random in_atomic()
checks that make it behave completely differently, but even if it
does, it needs to use GFP_NOIO.



* Re: [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
  2026-03-02 13:50 ` Christoph Hellwig
@ 2026-03-02 14:23   ` Gao Xiang
  2026-03-02 14:29     ` Christoph Hellwig
  2026-03-03  2:03     ` Jiucheng Xu
  0 siblings, 2 replies; 8+ messages in thread
From: Gao Xiang @ 2026-03-02 14:23 UTC
  To: Christoph Hellwig, jiucheng.xu
  Cc: Jens Axboe, linux-block, linux-kernel, jianxin.pan, tuan.zhang,
	Gao Xiang, linux-erofs

Hi Christoph,

On 2026/3/2 21:50, Christoph Hellwig wrote:
> On Mon, Mar 02, 2026 at 10:51:03AM +0800, Jiucheng Xu via B4 Relay wrote:
>> From: Jiucheng Xu <jiucheng.xu@amlogic.com>
>>
>> When current->bio_list is non-NULL in submit_bio_wait(),
>> submit_bio_noacct_nocheck appends bio to bio_list but skips IO
>> submission, causing submit_bio_wait() to hang indefinitely.
>>
>> Fix this by temporarily backup bio_list, setting bio_list to
>> NULL before calling submit_bio(), then restoring bio_list
>> after submit_bio() returns.
> 
> No.  Fix this by not doing something that is a bad idea.
> 
>> I've trimmed down the call stack, as follows:
>>
>> f2fs_submit_read_io
>>    submit_bio
>>      mmc_blk_mq_recovery
>>        z_erofs_endio
>>          vm_map_ram
> 
> ->bi_end_io code really should not be having random in_atomic()
> checks that make it completely different, but even if they have

Thanks for the heads-up.

For this part, I'm pretty sure we need this particular check;
otherwise the scheduling latency is unacceptable for all
Android phone users.

> that need to use GFP_NOIO.

Yes, we should make vm_map_ram() in the end_io path use
GFP_NOIO instead.

Jiucheng, could you add memalloc_noio_{save,restore}() to
wrap up this path?
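
As a rough illustration of why a scoped NOIO section addresses the hang, here is a userspace model of the idea. It is a sketch, not the kernel implementation: the MODEL_* constants, task_flags and all function names are hypothetical stand-ins, and the flag values are made up. The assumption being modelled is that, inside the scope, allocations lose their I/O and FS permission bits, so reclaim on their behalf does not start new I/O.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for this model; the kernel's constants differ. */
#define MODEL_GFP_IO     0x1u
#define MODEL_GFP_FS     0x2u
#define MODEL_GFP_KERNEL (MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_PF_NOIO    0x1u

/* Stand-in for current->flags. */
static unsigned int task_flags;

/* Enter a NOIO scope and return the previous state of the NOIO bit. */
static unsigned int model_noio_save(void)
{
	unsigned int old = task_flags & MODEL_PF_NOIO;

	task_flags |= MODEL_PF_NOIO;
	return old;
}

/* Leave the scope by putting the NOIO bit back to its saved state. */
static void model_noio_restore(unsigned int old)
{
	assert((old & ~MODEL_PF_NOIO) == 0);
	task_flags = (task_flags & ~MODEL_PF_NOIO) | old;
}

/* Allocation mask actually honoured, given the caller's task flags. */
static unsigned int model_effective_gfp(unsigned int gfp)
{
	if (task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
	return gfp;
}

/* Whether reclaim for this allocation would be allowed to issue I/O. */
static bool model_reclaim_may_do_io(unsigned int gfp)
{
	return (model_effective_gfp(gfp) & MODEL_GFP_IO) != 0;
}
```

With the scope active, even a request made with the model's GFP_KERNEL equivalent is treated as NOIO, so the caller of vm_map_ram() would not need to pass a different mask itself.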

Thanks,
Gao Xiang


* Re: [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
  2026-03-02 14:23   ` Gao Xiang
@ 2026-03-02 14:29     ` Christoph Hellwig
  2026-03-02 14:31       ` Gao Xiang
  2026-03-03  2:03     ` Jiucheng Xu
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2026-03-02 14:29 UTC
  To: Gao Xiang
  Cc: Christoph Hellwig, jiucheng.xu, Jens Axboe, linux-block,
	linux-kernel, jianxin.pan, tuan.zhang, Gao Xiang, linux-erofs

On Mon, Mar 02, 2026 at 10:23:04PM +0800, Gao Xiang wrote:
> > > I've trimmed down the call stack, as follows:
> > > 
> > > f2fs_submit_read_io
> > >    submit_bio
> > >      mmc_blk_mq_recovery
> > >        z_erofs_endio
> > >          vm_map_ram
> > 
> > ->bi_end_io code really should not be having random in_atomic()
> > checks that make it completely different, but even if they have
> 
> Thanks for the head-up.
> 
> For this part, I'm pretty sure we need this particular one
> otherwise the scheduling performance (latency sensitive)
> is unacceptable for all Android phone users.

Where do you regularly get user context calls to ->bi_end_io?



* Re: [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
  2026-03-02 14:29     ` Christoph Hellwig
@ 2026-03-02 14:31       ` Gao Xiang
  0 siblings, 0 replies; 8+ messages in thread
From: Gao Xiang @ 2026-03-02 14:31 UTC
  To: Christoph Hellwig
  Cc: jiucheng.xu, Jens Axboe, linux-block, linux-kernel, jianxin.pan,
	tuan.zhang, Gao Xiang, linux-erofs



On 2026/3/2 22:29, Christoph Hellwig wrote:
> On Mon, Mar 02, 2026 at 10:23:04PM +0800, Gao Xiang wrote:
>>>> I've trimmed down the call stack, as follows:
>>>>
>>>> f2fs_submit_read_io
>>>>     submit_bio
>>>>       mmc_blk_mq_recovery
>>>>         z_erofs_endio
>>>>           vm_map_ram
>>>
>>> ->bi_end_io code really should not be having random in_atomic()
>>> checks that make it completely different, but even if they have
>>
>> Thanks for the head-up.
>>
>> For this part, I'm pretty sure we need this particular one
>> otherwise the scheduling performance (latency sensitive)
>> is unacceptable for all Android phone users.
> 
> Where do you regularly get user context calls to ->bi_end_io?

The obvious one is dm-verity: there it actually runs in
workqueue context.

Thanks,
Gao Xiang




* Re: [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
  2026-03-02 14:23   ` Gao Xiang
  2026-03-02 14:29     ` Christoph Hellwig
@ 2026-03-03  2:03     ` Jiucheng Xu
  2026-03-03  2:11       ` Gao Xiang
  1 sibling, 1 reply; 8+ messages in thread
From: Jiucheng Xu @ 2026-03-03  2:03 UTC
  To: Gao Xiang, Christoph Hellwig
  Cc: Jens Axboe, linux-block, linux-kernel, jianxin.pan, tuan.zhang,
	Gao Xiang, linux-erofs



On 3/2/2026 10:23 PM, Gao Xiang wrote:
> Hi Christoph,
> 
> On 2026/3/2 21:50, Christoph Hellwig wrote:
>> On Mon, Mar 02, 2026 at 10:51:03AM +0800, Jiucheng Xu via B4 Relay wrote:
>>> From: Jiucheng Xu <jiucheng.xu@amlogic.com>
>>>
>>> When current->bio_list is non-NULL in submit_bio_wait(),
>>> submit_bio_noacct_nocheck appends bio to bio_list but skips IO
>>> submission, causing submit_bio_wait() to hang indefinitely.
>>>
>>> Fix this by temporarily backup bio_list, setting bio_list to
>>> NULL before calling submit_bio(), then restoring bio_list
>>> after submit_bio() returns.
>>
>> No.  Fix this by not doing something that is a bad idea.
>>
>>> I've trimmed down the call stack, as follows:
>>>
>>> f2fs_submit_read_io
>>>    submit_bio
>>>      mmc_blk_mq_recovery
>>>        z_erofs_endio
>>>          vm_map_ram
>>
>> ->bi_end_io code really should not be having random in_atomic()
>> checks that make it completely different, but even if they have
> 
> Thanks for the head-up.
> 
> For this part, I'm pretty sure we need this particular one
> otherwise the scheduling performance (latency sensitive)
> is unacceptable for all Android phone users.
> 
>> that need to use GFP_NOIO.
> 
> Yes, it should make vm_map_ram() in the end_io path use
> GFP_NOIO instead.
> 
> Jiucheng, could you add memalloc_noio_{save,restore}() to
> wrap up this path?

Thanks to Christoph and Xiang for the comments. I will try it. Thanks!

Best Regards,
Jiucheng


* Re: [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
  2026-03-03  2:03     ` Jiucheng Xu
@ 2026-03-03  2:11       ` Gao Xiang
  2026-03-03  2:17         ` Jiucheng Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Gao Xiang @ 2026-03-03  2:11 UTC
  To: Jiucheng Xu
  Cc: Jens Axboe, linux-block, linux-kernel, jianxin.pan, tuan.zhang,
	Gao Xiang, linux-erofs, Christoph Hellwig



On 2026/3/3 10:03, Jiucheng Xu wrote:
> 
> 

...

>>
>>> that need to use GFP_NOIO.
>>
>> Yes, it should make vm_map_ram() in the end_io path use
>> GFP_NOIO instead.
>>
>> Jiucheng, could you add memalloc_noio_{save,restore}() to
>> wrap up this path?
> 
> Thanks for Christoph's and Xiang's comments, I will try it. Thanks!

One more note: wrapping the z_erofs_decompressqueue_work() call in
z_erofs_decompress_kickoff() with memalloc_noio_{save,restore}() is
enough.

  ...
  memalloc_noio_save()
  z_erofs_decompressqueue_work()
  memalloc_noio_restore()
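
The wrapping pattern above can be modelled in userspace as follows. This is a hedged sketch only: model_kickoff_inline() and model_decompress_work() are hypothetical stand-ins for the inline branch of the kickoff path and the work function, and task_flags stands in for current->flags. It shows that the callee runs inside the NOIO scope and that the caller's previous state, including an enclosing NOIO scope, comes back afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value for this model; the kernel's constant differs. */
#define MODEL_PF_NOIO 0x1u

static unsigned int task_flags; /* stand-in for current->flags */
static bool work_saw_noio;      /* what the work function observed */

/* Enter a NOIO scope and return the previous state of the NOIO bit. */
static unsigned int model_noio_save(void)
{
	unsigned int old = task_flags & MODEL_PF_NOIO;

	task_flags |= MODEL_PF_NOIO;
	return old;
}

/* Leave the scope by putting the NOIO bit back to its saved state. */
static void model_noio_restore(unsigned int old)
{
	assert((old & ~MODEL_PF_NOIO) == 0);
	task_flags = (task_flags & ~MODEL_PF_NOIO) | old;
}

/* Stand-in for the work function: records whether it ran under NOIO. */
static void model_decompress_work(void)
{
	work_saw_noio = (task_flags & MODEL_PF_NOIO) != 0;
}

/* Stand-in for the inline call site: the whole call sits in the scope. */
static void model_kickoff_inline(void)
{
	unsigned int noio = model_noio_save();

	model_decompress_work();
	model_noio_restore(noio);
}
```

Because the saved value is restored rather than the bit being cleared unconditionally, the pattern stays correct when the caller is already inside a NOIO scope.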

> 
> Best Regards,
> Jiucheng



* Re: [PATCH] block: avoild hang when bio_list is non-NULL in submit_bio_wait()
  2026-03-03  2:11       ` Gao Xiang
@ 2026-03-03  2:17         ` Jiucheng Xu
  0 siblings, 0 replies; 8+ messages in thread
From: Jiucheng Xu @ 2026-03-03  2:17 UTC
  To: Gao Xiang
  Cc: Jens Axboe, linux-block, linux-kernel, jianxin.pan, tuan.zhang,
	Gao Xiang, linux-erofs, Christoph Hellwig



On 3/3/2026 10:11 AM, Gao Xiang wrote:
> On 2026/3/3 10:03, Jiucheng Xu wrote:
>>
>>
> 
> ...
> 
>>>
>>>> that need to use GFP_NOIO.
>>>
>>> Yes, it should make vm_map_ram() in the end_io path use
>>> GFP_NOIO instead.
>>>
>>> Jiucheng, could you add memalloc_noio_{save,restore}() to
>>> wrap up this path?
>>
>> Thanks for Christoph's and Xiang's comments, I will try it. Thanks!
> 
> Just one more note: just wrap up z_erofs_decompressqueue_work() in
> z_erofs_decompress_kickoff() with memalloc_noio_{save,restore}() is
> enough.
> 
>   ...
>   memalloc_noio_save()
>   z_erofs_decompressqueue_work()
>   memalloc_noio_restore()

Got it, thanks for the details!



