From mboxrd@z Thu Jan 1 00:00:00 1970
From: Howard Chu
Subject: Re: [PATCH 1/2] block: Add support for atomic writes
Date: Wed, 13 Nov 2013 12:53:48 -0800
Message-ID: <5283E6DC.8000801@symas.com>
References: <20131101212704.10239.73920@localhost.localdomain>
 <20131101212854.10239.19830@localhost.localdomain>
 <20131107135220.3802.91392@localhost.localdomain>
 <20131112151151.GI6900@linux.intel.com>
 <20131113204438.3802.80855@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Jeff Moyer , Linux FS Devel , Jens Axboe
To: Chris Mason , Matthew Wilcox
Return-path:
Received: from zill.ext.symas.net ([69.43.206.106]:47582 "EHLO
 zill.ext.symas.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750896Ab3KMUxx (ORCPT );
 Wed, 13 Nov 2013 15:53:53 -0500
In-Reply-To: <20131113204438.3802.80855@localhost.localdomain>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Chris Mason wrote:
> Quoting Matthew Wilcox (2013-11-12 10:11:51)
>> I took a look at the SCSI Block Command spec. If I understand it
>> correctly, SCSI would implement this with the WRITE USING TOKEN command.
>> I don't see why it couldn't implement this API, though it seems like
>> SCSI would prefer a separate setup step before the write comes in. I'm
>> not sure that's a reasonable request to make of the application (nor
>> am I sure I understand SBC correctly).
>
> What kind of setup would we have to do? We have all the IO in hand, so
> it can be organized in just about any way needed.
>
>>
>> I like the API, but I'm a little confused not to see a patch saying "Oh,
>> and here's how we implemented it in btrfs without any hardware support"
>> ;-) It seems to me that the concept is just as good a match for an
>> advanced filesystem that supports snapshots as it is for the FTL inside
>> a drive.
>
> Grin, almost Btrfs already does this...COW means that btrfs needs to
> update metadata to point to new locations.
> To avoid an ugly
> flush-all-the-io-every-commit mess, we track pending writes and update
> the metadata when the write is fully on media.
>
> We're missing a firm line that makes sure all the metadata updates for a
> single write happen in the same transaction, but that part isn't hard.
>
> We're missing good performance in database workloads, which is a
> slightly bigger trick.

This is precisely why this is needed:

http://www.spinics.net/lists/linux-fsdevel/msg70047.html

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/