From: Waiman Long <waiman.long@hpe.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Matthew Wilcox <willy@linux.intel.com>,
	<linux-ext4@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>,
	Toshimitsu Kani <toshi.kani@hpe.com>
Subject: Re: [PATCH v5 0/2] ext4: Improve parallel I/O performance on NVDIMM
Date: Mon, 2 May 2016 13:45:08 -0400
Message-ID: <57279224.7030702@hpe.com>
In-Reply-To: <20160501172854.GA19601@infradead.org>

On 05/01/2016 01:28 PM, Christoph Hellwig wrote:
> On Fri, Apr 29, 2016 at 12:38:20PM -0400, Waiman Long wrote:
>>  From my testing, it looked like that parallel overwrites to the same file in
>> an ext4 filesystem on DAX can happen in parallel even if their range
>> overlaps. It was mainly because the code will drop the i_mutex before the
>> write. That means the overlapped blocks can get garbage. I think this is a
>> problem, but I am not expert in the ext4 filesystem to say for sure. I would
>> like to know your thought on that.
> That's another issue with dax I/O pretending to be direct I/O..  Because
> it isn't we'll need to synchronize it like buffered I/O and not like
> direct I/O in all file systems.

From what I saw in the code, I think filemap_write_and_wait_range()
should have prevented concurrent overwrites from stepping on each
other for non-DAX I/O. However, it is essentially a no-op for DAX
I/O, so that protection is gone.
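
A minimal user-space sketch of how the overlap problem could be checked,
assuming a test file on the filesystem under test. Two child processes
each issue one pwrite() over ranges that share their middle half, and the
shared region is then read back to see whether it holds a mix of both
writers' data. The helper names, the 1 MiB write size and the 'A'/'B' fill
bytes are illustrative choices, not taken from any existing test suite.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define WRITE_LEN   ((size_t)1 << 20)  /* bytes written by each writer (assumed) */
#define OVERLAP_OFF (WRITE_LEN / 2)    /* writer B starts halfway into writer A */

/* Return 1 if all len bytes of buf hold the same value, else 0. */
int region_is_uniform(const unsigned char *buf, size_t len)
{
    for (size_t i = 1; i < len; i++)
        if (buf[i] != buf[0])
            return 0;
    return 1;
}

/* Fork a child that issues one pwrite() of WRITE_LEN bytes set to fill. */
pid_t spawn_writer(int fd, off_t off, unsigned char fill)
{
    pid_t pid = fork();

    if (pid == 0) {
        unsigned char *buf = malloc(WRITE_LEN);
        int ok = 0;

        if (buf) {
            memset(buf, fill, WRITE_LEN);
            ok = pwrite(fd, buf, WRITE_LEN, off) == (ssize_t)WRITE_LEN;
        }
        _exit(ok ? 0 : 1);
    }
    return pid;
}

/* Reap a writer; return 1 only if it exited cleanly after a full write. */
int writer_succeeded(pid_t pid)
{
    int status;

    return pid > 0 && waitpid(pid, &status, 0) == pid &&
           WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/*
 * Returns 1 if the overlapped region holds one writer's data only,
 * 0 if it is a mix of both, -1 on setup or I/O error.
 */
int check_overlapping_overwrites(const char *path)
{
    size_t total = OVERLAP_OFF + WRITE_LEN;
    size_t olen = WRITE_LEN - OVERLAP_OFF;
    unsigned char *buf = calloc(1, total);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    int ok_a, ok_b, ret = -1;
    pid_t pa, pb;

    if (fd < 0 || !buf)
        goto out;
    /* Allocate the blocks first so the racing writes are pure overwrites. */
    if (pwrite(fd, buf, total, 0) != (ssize_t)total || fsync(fd) != 0)
        goto out;

    pa = spawn_writer(fd, 0, 'A');
    pb = spawn_writer(fd, (off_t)OVERLAP_OFF, 'B');
    ok_a = writer_succeeded(pa);   /* always reap both children */
    ok_b = writer_succeeded(pb);
    if (!ok_a || !ok_b)
        goto out;

    if (pread(fd, buf, olen, (off_t)OVERLAP_OFF) != (ssize_t)olen)
        goto out;
    ret = region_is_uniform(buf, olen);
out:
    free(buf);
    if (fd >= 0)
        close(fd);
    return ret;
}
```

Calling check_overlapping_overwrites() in a loop on a file in the mount
under test and watching for a 0 return is one way to look for the
problem. A single pass may well miss the race, so a clean result does not
show that the writes are serialized.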

I am planning to send out a patch to disable mutex dropping for DAX
overwrites. There is still an issue on the read side. If the journal is
disabled and the dioread_nolock mount option is used, reads will be done
without locking. Again, the filemap_write_and_wait_range() check on
the read side will not protect against concurrent writes.
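
The locking conditions discussed above can be summarized as a small
user-space model. This is only a sketch of the statements in this thread,
not kernel code: struct io_request and the three predicates are
hypothetical names, and the real ext4 checks involve more conditions.

```c
#include <stdbool.h>

/* Hypothetical user-space summary of one I/O request; not a kernel type. */
struct io_request {
    bool is_dax;          /* file lives on a DAX mount */
    bool is_overwrite;    /* write touches only already-allocated blocks */
    bool has_journal;     /* filesystem has a journal */
    bool dioread_nolock;  /* mounted with the dioread_nolock option */
};

/* Behaviour described in the thread: i_mutex is dropped before an overwrite. */
bool write_drops_mutex_current(const struct io_request *r)
{
    return r->is_overwrite;
}

/* Planned change: keep holding i_mutex when the overwrite is on DAX. */
bool write_drops_mutex_planned(const struct io_request *r)
{
    return r->is_overwrite && !r->is_dax;
}

/* Read side, as stated above: no journal plus dioread_nolock skips locking. */
bool read_skips_locking(const struct io_request *r)
{
    return !r->has_journal && r->dioread_nolock;
}
```

In this model the planned patch changes the answer only for DAX
overwrites, and read_skips_locking() stays true for the no-journal,
dioread_nolock case, which is the remaining read-side gap.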

Cheers,
Longman

