From: "Verma, Vishal L"
To: "hch@infradead.org", "boaz@plexistor.com"
CC: "linux-kernel@vger.kernel.org", "linux-block@vger.kernel.org", "linux-nvdimm@ml01.01.org", "xfs@oss.sgi.com", "linux-mm@kvack.org", "viro@zeniv.linux.org.uk", "akpm@linux-foundation.org", "axboe@fb.com", "linux-fsdevel@vger.kernel.org", "linux-ext4@vger.kernel.org", "david@fromorbit.com", "jack@suse.cz", "matthew@wil.cx"
Subject: Re: [PATCH v4 5/7] fs: prioritize and separate direct_io from dax_io
Date: Thu, 5 May 2016 21:39:14 +0000
Message-ID: <1462484343.29294.1.camel@intel.com>
In-Reply-To: <20160505142433.GA4557@infradead.org>

On Thu, 2016-05-05 at 07:24 -0700, Christoph Hellwig wrote:
> On Mon, May 02, 2016 at 06:41:51PM +0300, Boaz Harrosh wrote:
> > >
> > > All IO in a dax filesystem used to go through dax_do_io, which
> > > cannot handle media errors, and thus cannot provide a recovery
> > > path that can send a write through the driver to clear errors.
> > >
> > > Add a new iocb flag for DAX, and set it only for DAX mounts. In
> > > the IO path for DAX filesystems, use the same direct_IO path for
> > > both DAX and direct_io iocbs, but use the flags to identify when
> > > we are in O_DIRECT mode vs non O_DIRECT with DAX, and for
> > > O_DIRECT, use the conventional direct_IO path instead of DAX.
> > >
> > Really? What is your thinking here?
> >
> > What about all the current users of O_DIRECT, you have just made
> > them 4 times slower and "less concurrent*" than "buffered io"
> > users. Since direct_IO path will queue an IO request and all.
> > (And if it is not so slow then why do we need dax_do_io at all?
> > [Rhetorical])
> >
> > I hate it that you overload the semantics of a known and expected
> > O_DIRECT flag, for special pmem quirks. This is an incompatible
> > and unrelated overload of the semantics of O_DIRECT.
> Agreed - making O_DIRECT less direct than not having it is plain
> stupid, and I somehow missed this initially.

How is it any 'less direct'? All it does now is follow the blockdev
O_DIRECT path. There still isn't any page cache involved..

>
> This whole DAX story turns into a major nightmare, and I fear all our
> hodge podge tweaks to the semantics aren't helping it.
>
> It seems like we simply need an explicit O_DAX for the read/write
> bypass if we can't sort out the semantics (error, writer
> synchronization) just as we need a special flag for MMAP..