linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: __sb_start_write() && force_trylock hack
Date: Wed, 19 Aug 2015 17:00:26 +0200	[thread overview]
Message-ID: <20150819150026.GA19317@redhat.com> (raw)
In-Reply-To: <20150819031158.GO714@dastard>

On 08/19, Dave Chinner wrote:
>
> On Tue, Aug 18, 2015 at 04:49:00PM +0200, Oleg Nesterov wrote:
> > Jan, Dave, perhaps you can take a look...
> >
> > On 08/14, Oleg Nesterov wrote:
> > >
> > > Plus another patch which removes the "trylock"
> > > hack in __sb_start_write().
> >
> > I meant the patch we already discussed (attached at the end). And yes,
> > previously I reported it passed the tests. However, I only ran the same
> > 'grep -il freeze tests/*/???' tests. When I tried to run all tests, I
> > got the new reports from lockdep.
> >
> > 	[ 2098.281171]  May be due to missing lock nesting notation
>
> <groan>
>
> > 	[ 2098.288744] 4 locks held by fsstress/50971:
> > 	[ 2098.293408]  #0:  (sb_writers#13){++++++}, at: [<ffffffff81248d32>] __sb_start_write+0x32/0x40
> > 	[ 2098.303085]  #1:  (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [<ffffffff8125685f>] filename_create+0x7f/0x170
> > 	[ 2098.314038]  #2:  (sb_internal#2){++++++}, at: [<ffffffff81248d32>] __sb_start_write+0x32/0x40
> > 	[ 2098.323711]  #3:  (&type->s_umount_key#54){++++++}, at: [<ffffffffa05a638c>] xfs_flush_inodes+0x1c/0x40 [xfs]
> > 	[ 2098.334898]
> > 		       stack backtrace:
> > 	[ 2098.339762] CPU: 3 PID: 50971 Comm: fsstress Not tainted 4.2.0-rc6+ #27
> > 	[ 2098.347143] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.R3.27.D685.1305151734 05/15/2013
> > 	[ 2098.358303]  0000000000000000 00000000e70ee864 ffff880c05a2b9c8 ffffffff817ee692
> > 	[ 2098.366603]  0000000000000000 ffffffff826f8030 ffff880c05a2bab8 ffffffff810f45be
> > 	[ 2098.374900]  0000000000000000 ffff880c05a2bb20 0000000000000000 0000000000000000
> > 	[ 2098.383197] Call Trace:
> > 	[ 2098.385930]  [<ffffffff817ee692>] dump_stack+0x45/0x57
> > 	[ 2098.391667]  [<ffffffff810f45be>] __lock_acquire+0x1e9e/0x2040
> > 	[ 2098.398177]  [<ffffffff810f310d>] ? __lock_acquire+0x9ed/0x2040
> > 	[ 2098.404787]  [<ffffffff811d4702>] ? pagevec_lookup_entries+0x22/0x30
> > 	[ 2098.411879]  [<ffffffff811d5142>] ? truncate_inode_pages_range+0x2b2/0x7e0
> > 	[ 2098.419551]  [<ffffffff810f542e>] lock_acquire+0xbe/0x150
> > 	[ 2098.425566]  [<ffffffff81248d32>] ? __sb_start_write+0x32/0x40
> > 	[ 2098.432079]  [<ffffffff810ede91>] percpu_down_read+0x51/0xa0
> > 	[ 2098.438396]  [<ffffffff81248d32>] ? __sb_start_write+0x32/0x40
> > 	[ 2098.444905]  [<ffffffff81248d32>] __sb_start_write+0x32/0x40
> > 	[ 2098.451244]  [<ffffffffa05a7f84>] xfs_trans_alloc+0x24/0x40 [xfs]
> > 	[ 2098.458076]  [<ffffffffa059e872>] xfs_inactive_truncate+0x32/0x110 [xfs]
> > 	[ 2098.465580]  [<ffffffffa059f662>] xfs_inactive+0x102/0x120 [xfs]
> > 	[ 2098.472308]  [<ffffffffa05a475b>] xfs_fs_evict_inode+0x7b/0xc0 [xfs]
> > 	[ 2098.479401]  [<ffffffff812642ab>] evict+0xab/0x170
> > 	[ 2098.484748]  [<ffffffff81264568>] iput+0x1a8/0x230
> > 	[ 2098.490100]  [<ffffffff8127701c>] sync_inodes_sb+0x14c/0x1d0
> > 	[ 2098.496439]  [<ffffffffa05a6398>] xfs_flush_inodes+0x28/0x40 [xfs]
> > 	[ 2098.503361]  [<ffffffffa059e213>] xfs_create+0x5c3/0x6d0 [xfs]
>
> Another false positive.

Confused...

Dave, I removed your explanation which I can't understand anyway, let
me remind that I know absolutely nothing about filesystems.

> But there is no deadlock here

Yes, we hold SB_FREEZE_WRITE lock, so recursive SB_FREEZE_FS is safe.

But, this means that the comment in __sb_start_write() is still correct,
"XFS for example gets freeze protection on internal level twice" is true,
and we can not remove this force_trylock hack.

(Plus we have the jbd2_handle/sb_internal lock inversion but this is
 the same thing, safe because of the held lock on higher level).

Or I misunderstood?

Oleg.

  reply	other threads:[~2015-08-19 15:00 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-08-18 14:49 __sb_start_write() && force_trylock hack Oleg Nesterov
2015-08-18 15:18 ` Oleg Nesterov
2015-08-19  3:25   ` Dave Chinner
2015-08-19  3:11 ` Dave Chinner
2015-08-19 15:00   ` Oleg Nesterov [this message]
2015-08-19 23:26     ` Dave Chinner
2015-08-21 18:30       ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150819150026.GA19317@redhat.com \
    --to=oleg@redhat.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).