From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-block-owner@vger.kernel.org>
Received: from mail.kernel.org ([198.145.29.136]:33364 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752853AbcEBPvf (ORCPT <rfc822;linux-block@vger.kernel.org>);
	Mon, 2 May 2016 11:51:35 -0400
Message-ID: <1462204291.11211.20.camel@kernel.org>
Subject: Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
From: Vishal Verma <vishal@kernel.org>
To: Boaz Harrosh <boaz@plexistor.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	linux-nvdimm@lists.01.org
Cc: linux-block@vger.kernel.org, Jan Kara <jack@suse.cz>,
	Matthew Wilcox <matthew@wil.cx>,
	Dave Chinner <david@fromorbit.com>,
	linux-kernel@vger.kernel.org, xfs@oss.sgi.com,
	Jens Axboe <axboe@fb.com>, linux-mm@kvack.org,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-ext4@vger.kernel.org
Date: Mon, 02 May 2016 09:51:31 -0600
In-Reply-To: <5727753F.6090104@plexistor.com>
References: <1461878218-3844-1-git-send-email-vishal.l.verma@intel.com>
	 <1461878218-3844-6-git-send-email-vishal.l.verma@intel.com>
	 <5727753F.6090104@plexistor.com>
Content-Type: text/plain; charset="UTF-8"
Mime-Version: 1.0
Sender: linux-block-owner@vger.kernel.org
List-Id: linux-block@vger.kernel.org

On Mon, 2016-05-02 at 18:41 +0300, Boaz Harrosh wrote:
> On 04/29/2016 12:16 AM, Vishal Verma wrote:
> > 
> > All IO in a dax filesystem used to go through dax_do_io, which cannot
> > handle media errors, and thus cannot provide a recovery path that can
> > send a write through the driver to clear errors.
> >
> > Add a new iocb flag for DAX, and set it only for DAX mounts. In the IO
> > path for DAX filesystems, use the same direct_IO path for both DAX and
> > direct_io iocbs, but use the flags to identify when we are in O_DIRECT
> > mode vs non O_DIRECT with DAX, and for O_DIRECT, use the conventional
> > direct_IO path instead of DAX.
> > 
> Really? What is your thinking here?
>
> What about all the current users of O_DIRECT? You have just made them
> 4 times slower and "less concurrent*" than "buffered io" users, since
> the direct_IO path will queue an IO request and all.
> (And if it is not so slow, then why do we need dax_do_io at all?
> [Rhetorical])
>
> I hate it that you overload the semantics of a known and expected
> O_DIRECT flag for special pmem quirks. This is an incompatible
> and unrelated overload of the semantics of O_DIRECT.

We overloaded O_DIRECT a long time ago when we made DAX piggyback on
the same path:

static inline bool io_is_direct(struct file *filp)
{
	return (filp->f_flags & O_DIRECT) || IS_DAX(filp->f_mapping->host);
}
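
As a rough illustration of what changes (a userspace model, not kernel
code; the enum, the function names and the two booleans are made up for
this sketch and do not appear in the series), the routing before and
after the patch could be written as:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three ways an IO can be serviced. */
enum model_path {
    MODEL_PATH_BUFFERED,    /* through the page cache */
    MODEL_PATH_DIRECT,      /* conventional direct_IO via the block layer */
    MODEL_PATH_DAX,         /* dax_do_io: copy straight to/from pmem */
};

/*
 * Before: io_is_direct() is true for O_DIRECT or DAX, and per the patch
 * description all IO on a DAX filesystem ends up in dax_do_io.
 */
enum model_path model_route_old(bool o_direct, bool is_dax)
{
    if (is_dax)
        return MODEL_PATH_DAX;
    return o_direct ? MODEL_PATH_DIRECT : MODEL_PATH_BUFFERED;
}

/*
 * After, as the patch description reads: O_DIRECT takes the conventional
 * direct_IO path even on a DAX mount; DAX is used only without O_DIRECT.
 */
enum model_path model_route_new(bool o_direct, bool is_dax)
{
    if (o_direct)
        return MODEL_PATH_DIRECT;
    if (is_dax)
        return MODEL_PATH_DAX;
    return MODEL_PATH_BUFFERED;
}

/* Usage: the only combination that changes is O_DIRECT on a DAX inode. */
void model_demo(void)
{
    assert(model_route_old(true, true) == MODEL_PATH_DAX);
    assert(model_route_new(true, true) == MODEL_PATH_DIRECT);
    assert(model_route_old(false, true) == model_route_new(false, true));
    assert(model_route_old(true, false) == model_route_new(true, false));
    assert(model_route_old(false, false) == model_route_new(false, false));
}
```
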

Yes, O_DIRECT on a DAX-mounted file system will now be slower, but -

> 
> > 
> > This allows us a recovery path in the form of opening the file with
> > O_DIRECT and writing to it with the usual O_DIRECT semantics (sector
> > alignment restrictions).
> > 
> I understand that you want sector-aligned IO for clearing the errors,
> right? But I hate it that you forced all O_DIRECT IO to be slow for
> this.
> Can you not make dax_do_io handle media errors? At least for the
> parts of the IO that are aligned.
> (And your recovery path application above can use only aligned
> IO to make sure.)
>
> Please look for another solution. Even a special IOCTL_DAX_CLEAR_ERROR

- see all the versions of this series prior to this one, where we try
to do a fallback...
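
For reference, handling only "the parts of the IO that are aligned", as
suggested in the quoted text, would mean splitting each request roughly
as below. This is a userspace sketch and not code from any version of
the series; the 512-byte sector size and all names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SECTOR_SIZE 512u  /* assumed sector size for this sketch */

/* Hypothetical result type: one byte range cut into three pieces. */
struct model_split {
    uint64_t head_len;  /* bytes before the first fully covered sector */
    uint64_t body_len;  /* fully covered sectors, a multiple of the sector size */
    uint64_t tail_len;  /* bytes after the last fully covered sector */
};

/* Split [off, off + len) at sector boundaries. */
struct model_split model_split_io(uint64_t off, uint64_t len)
{
    struct model_split s = { 0, 0, 0 };
    uint64_t end, body_start, body_end;

    assert(off + len >= off);   /* the sketch does not handle wraparound */

    end = off + len;
    /* Round the start up and the end down to a sector boundary. */
    body_start = (off + MODEL_SECTOR_SIZE - 1) / MODEL_SECTOR_SIZE
                 * MODEL_SECTOR_SIZE;
    body_end = end / MODEL_SECTOR_SIZE * MODEL_SECTOR_SIZE;

    if (body_start >= body_end) {
        /* No full sector is covered: everything is "unaligned". */
        s.head_len = len;
        return s;
    }

    s.head_len = body_start - off;
    s.body_len = body_end - body_start;
    s.tail_len = end - body_end;
    return s;
}
```

Only body_len bytes could go through a sector-granular error-clearing
path; the head and tail would still need some other handling.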

> 
> [*"less concurrent" because of the queuing done in bdev. Note how
>   pmem is not even multi-queue, and even if it were, it would be much
>   slower than DAX because of the code depth and all the locks and task
>   switches done in the block layer. In DAX the final memcpy is done
>   directly on the user-mode thread.]
> 
> Thanks
> Boaz
> 
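
From userspace, the recovery path in the patch description (open with
O_DIRECT, then do an aligned write over the bad range) might look
something like the sketch below. The 4096-byte alignment, the helper
names, and the idea that the caller supplies the replacement data are
assumptions for illustration; whether such a write actually clears a
media error depends on the kernel and driver.

```c
#define _GNU_SOURCE             /* for O_DIRECT on Linux */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define MODEL_ALIGN 4096        /* assumed alignment for offset, length, buffer */

/* Nonzero if the range meets the alignment this sketch assumes. */
int model_is_aligned(off_t off, size_t len)
{
    return off >= 0 && len != 0 &&
           off % MODEL_ALIGN == 0 && len % MODEL_ALIGN == 0;
}

/*
 * Overwrite [off, off + len) of path with data using O_DIRECT.
 * Returns 0 on success, -1 with errno set on failure.
 */
int model_rewrite_range(const char *path, off_t off,
                        const void *data, size_t len)
{
    void *buf = NULL;
    ssize_t n;
    int fd, err;

    assert(path != NULL && data != NULL);

    if (!model_is_aligned(off, len)) {
        errno = EINVAL;
        return -1;
    }

    /* O_DIRECT also wants the memory buffer itself aligned. */
    err = posix_memalign(&buf, MODEL_ALIGN, len);
    if (err) {
        errno = err;
        return -1;
    }
    memcpy(buf, data, len);

    fd = open(path, O_WRONLY | O_DIRECT);
    if (fd < 0) {
        err = errno;
        free(buf);
        errno = err;
        return -1;
    }

    n = pwrite(fd, buf, len, off);
    if (n == (ssize_t)len)
        err = 0;
    else
        err = (n < 0) ? errno : EIO;    /* treat a short write as failure */

    if (close(fd) < 0 && !err)
        err = errno;
    free(buf);

    if (err) {
        errno = err;
        return -1;
    }
    return 0;
}
```

A caller would pass the file on the DAX mount, the aligned offset of the
bad range, and a buffer of known-good data of aligned length.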

