Re: __sb_start_write() && force_trylock hack

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oleg Nesterov <oleg@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: __sb_start_write() && force_trylock hack
Date: Wed, 19 Aug 2015 17:00:26 +0200	[thread overview]
Message-ID: <20150819150026.GA19317@redhat.com> (raw)
In-Reply-To: <20150819031158.GO714@dastard>

On 08/19, Dave Chinner wrote:
>
> On Tue, Aug 18, 2015 at 04:49:00PM +0200, Oleg Nesterov wrote:
> > Jan, Dave, perhaps you can take a look...
> >
> > On 08/14, Oleg Nesterov wrote:
> > >
> > > Plus another patch which removes the "trylock"
> > > hack in __sb_start_write().
> >
> > I meant the patch we already discussed (attached at the end). And yes,
> > previously I reported it passed the tests. However, I only ran the same
> > 'grep -il freeze tests/*/???' tests. When I tried to run all tests, I
> > got the new reports from lockdep.
> >
> > 	[ 2098.281171]  May be due to missing lock nesting notation
>
> <groan>
>
> > 	[ 2098.288744] 4 locks held by fsstress/50971:
> > 	[ 2098.293408]  #0:  (sb_writers#13){++++++}, at: [<ffffffff81248d32>] __sb_start_write+0x32/0x40
> > 	[ 2098.303085]  #1:  (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [<ffffffff8125685f>] filename_create+0x7f/0x170
> > 	[ 2098.314038]  #2:  (sb_internal#2){++++++}, at: [<ffffffff81248d32>] __sb_start_write+0x32/0x40
> > 	[ 2098.323711]  #3:  (&type->s_umount_key#54){++++++}, at: [<ffffffffa05a638c>] xfs_flush_inodes+0x1c/0x40 [xfs]
> > 	[ 2098.334898]
> > 		       stack backtrace:
> > 	[ 2098.339762] CPU: 3 PID: 50971 Comm: fsstress Not tainted 4.2.0-rc6+ #27
> > 	[ 2098.347143] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
> > 	[ 2098.358303]  0000000000000000 00000000e70ee864 ffff880c05a2b9c8 ffffffff817ee692
> > 	[ 2098.366603]  0000000000000000 ffffffff826f8030 ffff880c05a2bab8 ffffffff810f45be
> > 	[ 2098.374900]  0000000000000000 ffff880c05a2bb20 0000000000000000 0000000000000000
> > 	[ 2098.383197] Call Trace:
> > 	[ 2098.385930]  [<ffffffff817ee692>] dump_stack+0x45/0x57
> > 	[ 2098.391667]  [<ffffffff810f45be>] __lock_acquire+0x1e9e/0x2040
> > 	[ 2098.398177]  [<ffffffff810f310d>] ? __lock_acquire+0x9ed/0x2040
> > 	[ 2098.404787]  [<ffffffff811d4702>] ? pagevec_lookup_entries+0x22/0x30
> > 	[ 2098.411879]  [<ffffffff811d5142>] ? truncate_inode_pages_range+0x2b2/0x7e0
> > 	[ 2098.419551]  [<ffffffff810f542e>] lock_acquire+0xbe/0x150
> > 	[ 2098.425566]  [<ffffffff81248d32>] ? __sb_start_write+0x32/0x40
> > 	[ 2098.432079]  [<ffffffff810ede91>] percpu_down_read+0x51/0xa0
> > 	[ 2098.438396]  [<ffffffff81248d32>] ? __sb_start_write+0x32/0x40
> > 	[ 2098.444905]  [<ffffffff81248d32>] __sb_start_write+0x32/0x40
> > 	[ 2098.451244]  [<ffffffffa05a7f84>] xfs_trans_alloc+0x24/0x40 [xfs]
> > 	[ 2098.458076]  [<ffffffffa059e872>] xfs_inactive_truncate+0x32/0x110 [xfs]
> > 	[ 2098.465580]  [<ffffffffa059f662>] xfs_inactive+0x102/0x120 [xfs]
> > 	[ 2098.472308]  [<ffffffffa05a475b>] xfs_fs_evict_inode+0x7b/0xc0 [xfs]
> > 	[ 2098.479401]  [<ffffffff812642ab>] evict+0xab/0x170
> > 	[ 2098.484748]  [<ffffffff81264568>] iput+0x1a8/0x230
> > 	[ 2098.490100]  [<ffffffff8127701c>] sync_inodes_sb+0x14c/0x1d0
> > 	[ 2098.496439]  [<ffffffffa05a6398>] xfs_flush_inodes+0x28/0x40 [xfs]
> > 	[ 2098.503361]  [<ffffffffa059e213>] xfs_create+0x5c3/0x6d0 [xfs]
>
> Another false positive.

Confused...

Dave, I removed your explanation which I can't understand anyway, let
me remind that I know absolutely nothing about filesystems.

> But there is no deadlock here

Yes, we hold SB_FREEZE_WRITE lock, so recursive SB_FREEZE_FS is safe.

But, this means that the comment in __sb_start_write() is still correct,
"XFS for example gets freeze protection on internal level twice" is true,
and we can not remove this force_trylock hack.

(Plus we have the jbd2_handle/sb_internal lock inversion but this is
 the same thing, safe because of the held lock on higher level).

Or I misunderstood?

Oleg.

next prev parent reply	other threads:[~2015-08-19 15:00 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-18 14:49 __sb_start_write() && force_trylock hack Oleg Nesterov
2015-08-18 15:18 ` Oleg Nesterov
2015-08-19  3:25   ` Dave Chinner
2015-08-19  3:11 ` Dave Chinner
2015-08-19 15:00   ` Oleg Nesterov [this message]
2015-08-19 23:26     ` Dave Chinner
2015-08-21 18:30       ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150819150026.GA19317@redhat.com \
    --to=oleg@redhat.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.