From: Boaz Harrosh <boaz@plexistor.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	linux-block@vger.kernel.org, Jan Kara <jack@suse.cz>,
	Matthew Wilcox <matthew@wil.cx>,
	Dave Chinner <david@fromorbit.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	XFS Developers <xfs@oss.sgi.com>, Jens Axboe <axboe@fb.com>,
	Linux MM <linux-mm@kvack.org>, Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
Date: Mon, 02 May 2016 19:22:50 +0300
Message-ID: <57277EDA.9000803@plexistor.com>
In-Reply-To: <CAPcyv4jWPTDbbw6uMFEEt2Kazgw+wb5Pfwroej--uQPE+AtUbA@mail.gmail.com>

On 05/02/2016 07:01 PM, Dan Williams wrote:
> On Mon, May 2, 2016 at 8:41 AM, Boaz Harrosh <boaz@plexistor.com> wrote:
>> On 04/29/2016 12:16 AM, Vishal Verma wrote:
>>> All IO in a dax filesystem used to go through dax_do_io, which cannot
>>> handle media errors, and thus cannot provide a recovery path that can
>>> send a write through the driver to clear errors.
>>>
>>> Add a new iocb flag for DAX, and set it only for DAX mounts. In the IO
>>> path for DAX filesystems, use the same direct_IO path for both DAX and
>>> direct_io iocbs, but use the flags to identify when we are in O_DIRECT
>>> mode vs non O_DIRECT with DAX, and for O_DIRECT, use the conventional
>>> direct_IO path instead of DAX.
>>>
>>
>> Really? What is your thinking here?
>>
>> What about all the current users of O_DIRECT? You have just made them
>> 4 times slower and "less concurrent*" than "buffered IO" users, since
>> the direct_IO path will queue an IO request and all.
>> (And if it is not so slow then why do we need dax_do_io at all? [Rhetorical])
>>
>> I hate it that you overload the semantics of a known and expected
>> O_DIRECT flag, for special pmem quirks. This is an incompatible
>> and unrelated overload of the semantics of O_DIRECT.
> 
> I think it is the opposite situation, it is undoing the premature
> overloading of O_DIRECT that went in without performance numbers.

We have tons of measurements. It is not hard to imagine the results, though,
especially in the 1000-thread case.

> This implementation clarifies that dax_do_io() handles the lack of a
> page cache for buffered I/O and O_DIRECT behaves as it nominally would
> by sending an I/O to the driver.  

> It has the benefit of matching the
> error semantics of a typical block device where a buffered write could
> hit an error filling the page cache, but an O_DIRECT write potentially
> triggers the drive to remap the block.
> 

I fail to see how, for writes, the device error semantics regarding remapping
of blocks differ at all between buffered and direct IO. As far as the block
device is concerned, it is the exact same code path. The big difference is all
higher up, in the VFS.

And so you are willing to sacrifice the 99% hot path for the sake of the
1% error path, piggybacking on poor O_DIRECT to do it?

Again, there are tons of O_DIRECT apps out there. Why are you forcing them to
change if they want true pmem performance?

I still believe dax_do_io() can be made more resilient to errors, and can clear
errors on writes. I am going to go digging through old patches ...

Cheers
Boaz
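
For readers following the thread, the routing under debate can be sketched
as a small userspace model. This is only an illustration of the behaviour
described in the quoted patch text, not the patch itself: the flag values,
the enum, and the helper names pick_path_before() and pick_path_after() are
hypothetical and do not come from the kernel source.

```c
#include <stdbool.h>

/* Illustrative flag bits only; the kernel's real values may differ.
 * MODEL_IOCB_DAX stands in for the new iocb flag the patch proposes. */
#define MODEL_IOCB_DIRECT (1u << 0)  /* file was opened with O_DIRECT */
#define MODEL_IOCB_DAX    (1u << 1)  /* filesystem is mounted with DAX */

enum io_path {
    PATH_PAGE_CACHE,  /* ordinary buffered IO through the page cache */
    PATH_DAX_IO,      /* dax_do_io(): direct copy to/from pmem */
    PATH_DIRECT_IO    /* conventional direct_IO: IO sent to the driver */
};

/* Before the patch, as described in the thread: all IO on a DAX
 * filesystem goes through dax_do_io(), whether or not O_DIRECT is set. */
static enum io_path pick_path_before(bool dax_mount, unsigned int flags)
{
    if (dax_mount)
        return PATH_DAX_IO;
    return (flags & MODEL_IOCB_DIRECT) ? PATH_DIRECT_IO : PATH_PAGE_CACHE;
}

/* With the patch, as described in the thread: O_DIRECT is checked first
 * and takes the conventional direct_IO path even on a DAX mount; only
 * non-O_DIRECT IO on a DAX mount uses dax_do_io(). */
static enum io_path pick_path_after(unsigned int flags)
{
    if (flags & MODEL_IOCB_DIRECT)
        return PATH_DIRECT_IO;
    if (flags & MODEL_IOCB_DAX)
        return PATH_DAX_IO;
    return PATH_PAGE_CACHE;
}
```

In this model the only case whose result changes is O_DIRECT on a DAX mount,
which moves from PATH_DAX_IO to PATH_DIRECT_IO. That is the case the
objection above is about.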


Thread overview: 33+ messages
2016-04-28 21:16 [PATCH v4 0/7] dax: handling media errors Vishal Verma
2016-04-28 21:16 ` [PATCH v4 1/7] block, dax: pass blk_dax_ctl through to drivers Vishal Verma
2016-04-28 21:16 ` [PATCH v4 2/7] dax: fallback from pmd to pte on error Vishal Verma
2016-04-28 21:16 ` [PATCH v4 3/7] dax: enable dax in the presence of known media errors (badblocks) Vishal Verma
2016-04-28 21:16 ` [PATCH v4 4/7] dax: use sb_issue_zerout instead of calling dax_clear_sectors Vishal Verma
2016-04-28 21:16 ` [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io Vishal Verma
2016-05-02 14:56   ` Christoph Hellwig
2016-05-02 15:45     ` Vishal Verma
2016-05-02 15:41   ` Boaz Harrosh
2016-05-02 15:51     ` Vishal Verma
2016-05-02 16:03       ` Boaz Harrosh
2016-05-02 18:52         ` Verma, Vishal L
2016-05-02 16:01     ` Dan Williams
2016-05-02 16:22       ` Boaz Harrosh [this message]
2016-05-02 16:49         ` Dan Williams
2016-05-02 17:44           ` Boaz Harrosh
2016-05-02 18:10             ` Dan Williams
2016-05-02 18:32               ` Boaz Harrosh
2016-05-02 18:48                 ` Dan Williams
2016-05-02 19:22                   ` Boaz Harrosh
2016-05-05 14:24     ` Christoph Hellwig
2016-05-05 15:15       ` Dan Williams
2016-05-05 15:22         ` Christoph Hellwig
2016-05-05 16:24           ` Dan Williams
2016-05-05 21:45           ` Verma, Vishal L
2016-05-08  9:01             ` hch
2016-05-08 18:42               ` Verma, Vishal L
2016-05-05 21:42         ` Verma, Vishal L
2016-05-05 21:39       ` Verma, Vishal L
2016-05-08  9:01         ` hch
2016-04-28 21:16 ` [PATCH v4 6/7] dax: for truncate/hole-punch, do zeroing through the driver if possible Vishal Verma
2016-04-28 21:16 ` [PATCH v4 7/7] dax: fix a comment in dax_zero_page_range and dax_truncate_page Vishal Verma
2016-04-29 21:55 ` [PATCH v4 8/7] Documentation: add error handling information to dax.txt Vishal Verma
