From mboxrd@z Thu Jan 1 00:00:00 1970
From: Matthew Wilcox
Subject: Re: [PATCH 1/2] block: Add support for atomic writes
Date: Tue, 12 Nov 2013 10:11:51 -0500
Message-ID: <20131112151151.GI6900@linux.intel.com>
References: <20131101212704.10239.73920@localhost.localdomain>
 <20131101212854.10239.19830@localhost.localdomain>
 <20131107135220.3802.91392@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Jeff Moyer, Linux FS Devel, Jens Axboe
To: Chris Mason
Return-path:
Received: from mga09.intel.com ([134.134.136.24]:53174 "EHLO
 mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1755114Ab3KLPLx (ORCPT );
 Tue, 12 Nov 2013 10:11:53 -0500
Content-Disposition: inline
In-Reply-To: <20131107135220.3802.91392@localhost.localdomain>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Thu, Nov 07, 2013 at 08:52:20AM -0500, Chris Mason wrote:
> Unfortunately, it's hard to say. I think the fusionio cards are the
> only shipping devices that support this, but I've definitely heard that
> others plan to support it as well. mariadb/percona already support the
> atomics via fusionio specific ioctls, and turning that into a real
> O_ATOMIC is a priority so other hardware can just hop on the train.
>
> This feature in general is pretty natural for the log structured squirrels
> they stuff inside flash, so I'd expect everyone to support it. Matthew,
> how do you feel about all of this?

NVMe doesn't have support for this functionality. I know what stories I've
heard from our internal device teams about what they can and can't support
in the way of this kind of thing, but I obviously can't repeat them here!

I took a look at the SCSI Block Command spec. If I understand it
correctly, SCSI would implement this with the WRITE USING TOKEN command.
I don't see why it couldn't implement this API, though it seems like
SCSI would prefer a separate setup step before the write comes in.
I'm not sure that's a reasonable request to make of the application (nor
am I sure I understand SBC correctly).

I like the API, but I'm a little confused not to see a patch saying "Oh,
and here's how we implemented it in btrfs without any hardware support"
;-) It seems to me that the concept is just as good a match for an
advanced filesystem that supports snapshots as it is for the FTL inside
a drive.

> With the fusionio drivers, we've recently increased the max atomic size.
> It's basically 1MB, disjoint or contig doesn't matter. We're powercut
> safe at 1MB.