Re: [Bug 50981] generic_file_aio_read ?: No locking means DATA CORRUPTION read and write on same 4096 page range

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Dave Chinner <david@fromorbit.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Christoph Hellwig <hch@infradead.org>,
	Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	bugzilla-daemon@bugzilla.kernel.org, meetmehiro@gmail.com,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [Bug 50981] generic_file_aio_read ?: No locking means DATA CORRUPTION read and write on same 4096 page  range
Date: Tue, 27 Nov 2012 15:27:11 +1100	[thread overview]
Message-ID: <20121127042711.GL6434@dastard> (raw)
In-Reply-To: <20121127013254.GA25222@thunk.org>

On Mon, Nov 26, 2012 at 08:32:54PM -0500, Theodore Ts'o wrote:
> On Mon, Nov 26, 2012 at 05:09:08PM -0500, Christoph Hellwig wrote:
> > On Mon, Nov 26, 2012 at 04:49:37PM -0500, Theodore Ts'o wrote:
> > > Christoph, can you give some kind of estimate for the overhead that
> > > adding this locking in XFS actually costs in practice?
> > 
> > I don't know any real life measurements, but in terms of implementation
> > the over head is:
> > 
> >  a) taking a the rw_semaphore in shared mode for every buffered read
> >  b) taking the slightly slower exclusive rw_semaphore for buffered writes
> >     instead of the plain mutex
> > 
> > On the other hand it significantly simplifies the locking for direct
> > I/O and allows parallel direct I/O writers.
> 
> I should probably just look at the XFS code, but.... if you're taking
> an exclusve lock for buffered writes, won't this impact the
> performance of buffered writes happening in parallel on different
> CPU's?

Indeed it does - see my previous email. But it's no worse than
generic_file_aio_write() that takes i_mutex across buffered writes,
which is what most filesystems currently do. And FWIW, we also take
the i_mutex outside the i_iolock for the buffered write case because
generic_file_buffered_write() is documented to require it held.
See xfs_rw_ilock() and friends for locking order semantics...

FWIW, this buffered write exclusion is why we have been considering
replacing the rwsem with a shared/exclusive range lock - so we can
do concurrent non-overlapping reads and writes (for both direct IO and
buffered IO) without compromising the POSIX atomic write guarantee
(i.e. that a read will see the entire write or none of it). Range
locking will allow us to do that for both buffered and direct IO...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

WARNING: multiple messages have this Message-ID (diff)

From: Dave Chinner <david@fromorbit.com>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Christoph Hellwig <hch@infradead.org>,
	Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	bugzilla-daemon@bugzilla.kernel.org, meetmehiro@gmail.com,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [Bug 50981] generic_file_aio_read ?: No locking means DATA CORRUPTION read and write on same 4096 page  range
Date: Tue, 27 Nov 2012 15:27:11 +1100	[thread overview]
Message-ID: <20121127042711.GL6434@dastard> (raw)
In-Reply-To: <20121127013254.GA25222@thunk.org>

On Mon, Nov 26, 2012 at 08:32:54PM -0500, Theodore Ts'o wrote:
> On Mon, Nov 26, 2012 at 05:09:08PM -0500, Christoph Hellwig wrote:
> > On Mon, Nov 26, 2012 at 04:49:37PM -0500, Theodore Ts'o wrote:
> > > Christoph, can you give some kind of estimate for the overhead that
> > > adding this locking in XFS actually costs in practice?
> > 
> > I don't know any real life measurements, but in terms of implementation
> > the over head is:
> > 
> >  a) taking a the rw_semaphore in shared mode for every buffered read
> >  b) taking the slightly slower exclusive rw_semaphore for buffered writes
> >     instead of the plain mutex
> > 
> > On the other hand it significantly simplifies the locking for direct
> > I/O and allows parallel direct I/O writers.
> 
> I should probably just look at the XFS code, but.... if you're taking
> an exclusve lock for buffered writes, won't this impact the
> performance of buffered writes happening in parallel on different
> CPU's?

Indeed it does - see my previous email. But it's no worse than
generic_file_aio_write() that takes i_mutex across buffered writes,
which is what most filesystems currently do. And FWIW, we also take
the i_mutex outside the i_iolock for the buffered write case because
generic_file_buffered_write() is documented to require it held.
See xfs_rw_ilock() and friends for locking order semantics...

FWIW, this buffered write exclusion is why we have been considering
replacing the rwsem with a shared/exclusive range lock - so we can
do concurrent non-overlapping reads and writes (for both direct IO and
buffered IO) without compromising the POSIX atomic write guarantee
(i.e. that a read will see the entire write or none of it). Range
locking will allow us to do that for both buffered and direct IO...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2012-11-27  4:27 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-50981-5823@https.bugzilla.kernel.org/>
     [not found] ` <20121126163328.ACEB011FE9C@bugzilla.kernel.org>
2012-11-26 16:45   ` [Bug 50981] generic_file_aio_read ?: No locking means DATA CORRUPTION read and write on same 4096 page range Theodore Ts'o
2012-11-26 16:45     ` Theodore Ts'o
2012-11-26 18:59     ` Hiro Lalwani
2012-11-26 20:05     ` Hugh Dickins
2012-11-26 20:05       ` Hugh Dickins
2012-11-26 20:13       ` Christoph Hellwig
2012-11-26 21:28         ` Dave Chinner
2012-11-26 21:39           ` Christoph Hellwig
2012-11-26 21:39             ` Christoph Hellwig
2012-11-26 21:49         ` Theodore Ts'o
2012-11-26 22:09           ` Christoph Hellwig
2012-11-26 22:09             ` Christoph Hellwig
2012-11-27  1:32             ` Theodore Ts'o
2012-11-27  4:27               ` Dave Chinner [this message]
2012-11-27  4:27                 ` Dave Chinner
2012-11-26 22:17           ` Dave Chinner
2012-11-26 22:17             ` Dave Chinner
2012-11-26 20:15       ` Zach Brown
2012-11-25 16:31 [Bug 50981] New: ext4 : " bugzilla-daemon
2012-11-26 12:26 ` [Bug 50981] generic_file_aio_read ?: No locking means " bugzilla-daemon
2012-11-26 13:21   ` Theodore Ts'o

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121127042711.GL6434@dastard \
    --to=david@fromorbit.com \
    --cc=akpm@linux-foundation.org \
    --cc=bugzilla-daemon@bugzilla.kernel.org \
    --cc=hch@infradead.org \
    --cc=hughd@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=meetmehiro@gmail.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.