linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Zhihao Cheng <chengzhihao1@huawei.com>,
	hch@lst.de, torvalds@linux-foundation.org, mingo@redhat.com,
	viro@zeniv.linux.org.uk
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	yukuai3@huawei.com
Subject: Re: [PATCH v3 1/1] fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages
Date: Thu, 19 May 2022 06:30:20 -0600	[thread overview]
Message-ID: <d97c1e72-5359-79e1-3af3-56d7911702f3@kernel.dk> (raw)
In-Reply-To: <20220510133805.1988292-1-chengzhihao1@huawei.com>

On 5/10/22 7:38 AM, Zhihao Cheng wrote:
> Commit 505a666ee3fc ("writeback: plug writeback in wb_writeback() and
> writeback_inodes_wb()") has us holding a plug during wb_writeback, which
> may cause a potential ABBA dead lock:
> 
>     wb_writeback		fat_file_fsync
> blk_start_plug(&plug)
> for (;;) {
>   iter i-1: some reqs have been added into plug->mq_list  // LOCK A
>   iter i:
>     progress = __writeback_inodes_wb(wb, work)
>     . writeback_sb_inodes // fat's bdev
>     .   __writeback_single_inode
>     .   . generic_writepages
>     .   .   __block_write_full_page
>     .   .   . . 	    __generic_file_fsync
>     .   .   . . 	      sync_inode_metadata
>     .   .   . . 	        writeback_single_inode
>     .   .   . . 		  __writeback_single_inode
>     .   .   . . 		    fat_write_inode
>     .   .   . . 		      __fat_write_inode
>     .   .   . . 		        sync_dirty_buffer	// fat's bdev
>     .   .   . . 			  lock_buffer(bh)	// LOCK B
>     .   .   . . 			    submit_bh
>     .   .   . . 			      blk_mq_get_tag	// LOCK A
>     .   .   . trylock_buffer(bh)  // LOCK B
>     .   .   .   redirty_page_for_writepage
>     .   .   .     wbc->pages_skipped++
>     .   .   --wbc->nr_to_write
>     .   wrote += write_chunk - wbc.nr_to_write  // wrote > 0
>     .   requeue_inode
>     .     redirty_tail_locked
>     if (progress)    // progress > 0
>       continue;
>   iter i+1:
>       queue_io
>       // similar process with iter i, infinite for-loop !
> }
> blk_finish_plug(&plug)   // flush plug won't be called
> 
> Above process triggers a hungtask like:
> [  399.044861] INFO: task bb:2607 blocked for more than 30 seconds.
> [  399.046824]       Not tainted 5.18.0-rc1-00005-gefae4d9eb6a2-dirty
> [  399.051539] task:bb              state:D stack:    0 pid: 2607 ppid:
> 2426 flags:0x00004000
> [  399.051556] Call Trace:
> [  399.051570]  __schedule+0x480/0x1050
> [  399.051592]  schedule+0x92/0x1a0
> [  399.051602]  io_schedule+0x22/0x50
> [  399.051613]  blk_mq_get_tag+0x1d3/0x3c0
> [  399.051640]  __blk_mq_alloc_requests+0x21d/0x3f0
> [  399.051657]  blk_mq_submit_bio+0x68d/0xca0
> [  399.051674]  __submit_bio+0x1b5/0x2d0
> [  399.051708]  submit_bio_noacct+0x34e/0x720
> [  399.051718]  submit_bio+0x3b/0x150
> [  399.051725]  submit_bh_wbc+0x161/0x230
> [  399.051734]  __sync_dirty_buffer+0xd1/0x420
> [  399.051744]  sync_dirty_buffer+0x17/0x20
> [  399.051750]  __fat_write_inode+0x289/0x310
> [  399.051766]  fat_write_inode+0x2a/0xa0
> [  399.051783]  __writeback_single_inode+0x53c/0x6f0
> [  399.051795]  writeback_single_inode+0x145/0x200
> [  399.051803]  sync_inode_metadata+0x45/0x70
> [  399.051856]  __generic_file_fsync+0xa3/0x150
> [  399.051880]  fat_file_fsync+0x1d/0x80
> [  399.051895]  vfs_fsync_range+0x40/0xb0
> [  399.051929]  __x64_sys_fsync+0x18/0x30
> 
> In my test, 'need_resched()' (which is imported by 590dca3a71 "fs-writeback:
> unplug before cond_resched in writeback_sb_inodes") in function
> 'writeback_sb_inodes()' seldom comes true, unless cond_resched() is deleted
> from write_cache_pages().
> 
> Fix it by correcting wrote number according number of skipped pages
> in writeback_sb_inodes().
> 
> Goto Link to find a reproducer.

I can take this one for 5.19, thanks.

-- 
Jens Axboe


      parent reply	other threads:[~2022-05-19 12:30 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-10 13:38 [PATCH v3 1/1] fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages Zhihao Cheng
2022-05-12  6:18 ` Christoph Hellwig
2022-05-19 12:26 ` Jan Kara
2022-05-19 12:30 ` Jens Axboe [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d97c1e72-5359-79e1-3af3-56d7911702f3@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=chengzhihao1@huawei.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).