From: Wengang Wang <wen.gang.wang@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: don't change to infinate lock to avoid dead lock
Date: Thu, 23 Apr 2020 16:19:52 -0700
Message-ID: <bca65738-3deb-ef43-6dde-1c2402942032@oracle.com>
In-Reply-To: <ed040889-5f79-e4f5-a203-b7ad8aa701d4@oracle.com>
On 4/23/20 4:14 PM, Wengang Wang wrote:
>
> On 4/23/20 4:05 PM, Dave Chinner wrote:
>> On Thu, Apr 23, 2020 at 10:23:25AM -0700, Wengang Wang wrote:
>>> xfs_reclaim_inodes_ag() does infinite (blocking) locking on
>>> pag_ici_reclaim_lock in the second round of walking all AGs when
>>> SYNC_TRYLOCK is set (conditionally).
>>> That causes a deadlock in a special situation:
>>>
>>> 1) In a heavy memory load environment, process A is doing direct memory
>>> reclaim, waiting for xfs_inode.i_pincount to be cleared while holding
>>> the mutex pag_ici_reclaim_lock.
>>>
>>> 2) i_pincount is increased by adding the xfs_inode to a journal
>>> transaction, and it is expected to be decreased when the
>>> transaction-related IO is done. Step 1) happens after i_pincount is
>>> increased and before the transaction IO is issued.
>>>
>>> 3) Now the transaction IO is issued by process B. In the IO path (IO
>>> could be more complex than you think), memory allocation and direct
>>> memory reclaim happen too.
>> Sure, but IO path allocations are done under GFP_NOIO context, which
>> means IO path allocations can't recurse back into filesystem reclaim
>> via direct reclaim. Hence there should be no way for an IO path
>> allocation to block on XFS inode reclaim and hence there's no
>> possible deadlock here...
>>
>> IOWs, I don't think this is the deadlock you are looking for. Do you
>> have a lockdep report or some other set of stack traces that lead
>> you to this point?
>
> As I mentioned, the IO path can be more complex than you think.
>
> The real case I hit is that process A is waiting for the inode unpin
> on XFS A, which is a loop-device-backed mount.
And actually, there is a dm-thin on top of the loop device.
thanks,
wengang
>
> And the backing file is from a different (X)FS B mount. So the IO
> goes through the loop device as (direct) writes to (X)FS B.
>
> The (direct) writes to (X)FS B do memory allocations and then direct
> memory reclaim...
>
> thanks,
> wengang
>
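
The locking behaviour described in the patch text quoted above can be
sketched as a toy user-space model. This is not kernel code and not the
patch itself: reclaim_walk(), the three-lock setup and the 0.05 second
timeout are all assumptions made for illustration. The timeout only
stands in for a blocking lock so the example can finish; a real blocking
mutex acquisition has no timeout and would simply wait.

```python
import threading


def reclaim_walk(ag_locks, block_on_second_pass, wait=0.05):
    """Toy model of a two-pass walk over per-AG reclaim locks.

    Returns (reclaimed, stuck): AG indexes that were processed, and AG
    indexes where a blocking second pass could not get the lock.
    """
    reclaimed, stuck = [], []
    for pass_no in (1, 2):
        for ag, lock in enumerate(ag_locks):
            if ag in reclaimed:
                continue  # already handled in the first pass
            if pass_no == 2 and block_on_second_pass:
                # Stand-in for a blocking lock: the real call has no
                # timeout, so a failure here models "waits forever".
                if not lock.acquire(timeout=wait):
                    stuck.append(ag)
                    continue
            elif not lock.acquire(blocking=False):
                continue  # trylock failed: skip this AG
            try:
                reclaimed.append(ag)  # placeholder for the reclaim work
            finally:
                lock.release()
    return reclaimed, stuck


ag_locks = [threading.Lock() for _ in range(3)]
ag_locks[1].acquire()  # "process A" holds AG 1 while waiting for the unpin

blocking_result = reclaim_walk(ag_locks, block_on_second_pass=True)
trylock_result = reclaim_walk(ag_locks, block_on_second_pass=False)
ag_locks[1].release()

print(blocking_result)  # ([0, 2], [1])
print(trylock_result)   # ([0, 2], [])
```

With a blocking second pass the walker gets stuck on the AG whose lock
is held by the waiter; with trylock on both passes it skips that AG and
returns. Whether the kernel can actually reach this state is the point
under discussion in the thread (GFP_NOIO versus stacked loop/dm-thin
devices); the model only shows the lock semantics.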