linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Dmitry Monakhov <dmonakhovopenvz@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Dmitriy Monakhov <dmonakhov@openvz.org>,
	linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com, tytso@mit.edu,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH] ext4: fix race aio-dio vs freeze_fs
Date: Wed, 25 Nov 2015 10:19:16 +0100
Message-ID: <20151125091916.GL25232@quack.suse.cz>
In-Reply-To: <CAF5pi0HpbN=kKQQUA5V1QaBzZMGQDcwuDs4bvuRYifnOHiJ8Bw@mail.gmail.com>

On Tue 24-11-15 20:55:40, Dmitry Monakhov wrote:
> On Nov 24, 2015 16:25, "Jan Kara" <jack@suse.cz> wrote:
> > On Mon 23-11-15 20:02:48, Dmitry Monakhov wrote:
> > > After freeze_fs was revoked (from Jan Kara) pages' write-back completion
> > > is deferred before unwritten conversion, so explicit flush_unwritten_io()
> > > was removed here: c724585b62411
> > > But we still may face deferred conversion for aio-dio case
> > > # Trivial testcase
> > > for ((i=0;i<60;i++));do fsfreeze -f /mnt ;sleep 1;fsfreeze -u /mnt;done &
> > > fio --bs=4k --ioengine=libaio --iodepth=128 --size=1g --direct=1 \
> > >     --runtime=60 --filename=/mnt/file --name=rand-write --rw=randwrite
> > > NOTE: Sane testcase should be integrated to xfstests, but it requires
> > > changes in common/* code, so let's use this test at the moment.
> > >
> > > In order to fix this race we have to guard journal transaction with explicit
> > > sb_{start,end}_intwrite()  as we do with ext4_evict_inode here:8e8ad8a5
> >
> > Well, this problem seems to suggest that we have the freeze protection for
> > AIO writes wrong. We should call file_end_write() from aio_complete() and
> > not from aio_run_iocb()...
> Yep. It was my first attempt to fix that issue, but unfortunately this
> trick will break lockdep. Caller will do file_start_write and exit to
> userspace. Lockdep treats such behaviour as bug (return to userspace with a
> lock held)
>
> There are two ways to fix that
> 1) add specific 'long' lock primitive to lockdep

The way we tell lockdep about a transfer of context is that we just lie to
lockdep: we tell it that the lock got unlocked at the appropriate place and
then tell it we locked it again at another place. It is somewhat ugly but
not that hard to do... Generally, lockdep is a tool that should help, but it
should by no means be a reason for poor locking decisions just because
lockdep cannot handle them.
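
To make the idea concrete, here is a minimal userspace sketch of the
hand-off. It is a toy model, not kernel code: every name in it (toy_sb,
toy_ctx, toy_start_write, toy_annotate_release and so on) is hypothetical
and only stands in for the freeze protection counter and for lockdep's
per-context bookkeeping. The point it illustrates is that the annotation
changes only what the checker believes, while the real protection stays
held from submission until completion.

```c
#include <assert.h>

/* Toy stand-in for the superblock freeze protection: counts active writers. */
struct toy_sb { int writers; };

/* Toy stand-in for lockdep: how many locks a context is believed to hold. */
struct toy_ctx { int held; };

/* Take freeze protection for real and record it in the checker. */
static void toy_start_write(struct toy_sb *sb, struct toy_ctx *c)
{
	sb->writers++;
	c->held++;
}

/* Drop freeze protection for real and record it in the checker. */
static void toy_end_write(struct toy_sb *sb, struct toy_ctx *c)
{
	c->held--;
	sb->writers--;
}

/* Annotation only: the checker forgets the lock, the real count is untouched. */
static void toy_annotate_release(struct toy_ctx *c) { c->held--; }

/* Annotation only: the checker is told this context now owns the lock. */
static void toy_annotate_acquire(struct toy_ctx *c) { c->held++; }

/* The checker complains if a context returns to userspace holding a lock. */
static int toy_may_return_to_user(const struct toy_ctx *c) { return c->held == 0; }

/* A freeze can only proceed once no writer holds protection. */
static int toy_can_freeze(const struct toy_sb *sb) { return sb->writers == 0; }

/* Walk through submission and completion; returns 0 if every step holds. */
int run_demo(void)
{
	struct toy_sb sb = { 0 };
	struct toy_ctx submitter = { 0 };
	struct toy_ctx completion = { 0 };

	/* Submission path: take protection, queue the I/O, hand the lock off. */
	toy_start_write(&sb, &submitter);
	toy_annotate_release(&submitter);
	if (!toy_may_return_to_user(&submitter))
		return 1;	/* checker would complain on return to userspace */
	if (toy_can_freeze(&sb))
		return 2;	/* freeze must stay blocked while I/O is in flight */

	/* Completion path: claim ownership in the checker, then really unlock. */
	toy_annotate_acquire(&completion);
	toy_end_write(&sb, &completion);
	if (completion.held != 0)
		return 3;	/* checker bookkeeping must balance */
	if (!toy_can_freeze(&sb))
		return 4;	/* freeze may proceed once the I/O has completed */
	return 0;
}
```

Calling run_demo() from a small main() should return 0 when each step
behaves as described: the submitter can leave with nothing recorded
against it, and the freeze stays blocked until the completion side drops
the protection.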

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR


Thread overview: 5+ messages
     [not found] <1448294568-20892-1-git-send-email-dmonakhov@openvz.org>
2015-11-24 13:24 ` [PATCH] ext4: fix race aio-dio vs freeze_fs Jan Kara
2015-11-24 16:07   ` Christoph Hellwig
2015-11-25 10:25     ` Jan Kara
2015-11-24 16:55   ` Dmitry Monakhov
2015-11-25  9:19     ` Jan Kara [this message]
