From mboxrd@z Thu Jan  1 00:00:00 1970
From: Howard Chu <hyc@symas.com>
Subject: Re: [LSF/MM TOPIC] atomic block device
Date: Sat, 15 Feb 2014 10:29:18 -0800
Message-ID: <52FFB1FE.9040300@symas.com>
References: <CAA9_cmf7Y1TL8XqR7dYUn=Pv-En2e0X0FM0zdpkiBkUuNBGKfQ@mail.gmail.com> <CABBL8ELycRzfyDGtKWk1nFySh9-a5Rh5uZXdgGEwMYHxCQzO3Q@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: lsf-pc <lsf-pc@lists.linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	jmoyer <jmoyer@redhat.com>, david <david@fromorbit.com>,
	Chris Mason <clm@fb.com>, Jens Axboe <axboe@kernel.dk>,
	Bryan E Veal <bryan.e.veal@intel.com>,
	Annie Foong <annie.foong@intel.com>
To: Andy Rudoff <andy@rudoff.com>,
	Dan Williams <dan.j.williams@intel.com>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from zill.ext.symas.net ([69.43.206.106]:48939 "EHLO
	zill.ext.symas.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753522AbaBOS3a (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Sat, 15 Feb 2014 13:29:30 -0500
In-Reply-To: <CABBL8ELycRzfyDGtKWk1nFySh9-a5Rh5uZXdgGEwMYHxCQzO3Q@mail.gmail.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

Andy Rudoff wrote:
> On the other side of the coin, I remember Dave talking about this
> during our NVM discussion at LSF last year and I got the impression
> the size and number of writes he'd need supported before he could
> really stop using his journaling code was potentially large.  Dave:
> perhaps you can re-state the number of writes and their total size
> that would have to be supported by block level atomics in order for
> them to be worth using by XFS?

If you're dealing with a typical update-in-place database then there's no 
upper bound on this: a DB transaction can be arbitrarily large, and any 
partial write will result in corrupted data structures.

On the other hand, with a multi-version copy-on-write DB (like mine, 
http://symas.com/mdb/ ) all you need is a guarantee that all data writes 
complete before any metadata is updated.
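
To illustrate that ordering, here is a minimal Python sketch. It is not MDB's 
actual code: the page layout, commit() and read_meta() are made up for this 
example, and it assumes a POSIX system where fsync() really does flush to 
stable storage.

```python
import os
import struct

PAGE = 4096
META_FMT = "<QQ"  # (transaction id, root page number); hypothetical layout


def commit(fd, txn_id, root, new_pages):
    """Write new_pages to fresh page slots, then publish them via page 0.

    new_pages maps page number -> bytes. Page 0 is reserved for metadata,
    and the caller must only pass page numbers that no committed version
    still references (that is the copy-on-write part).
    """
    # 1. Data writes: nothing a reader can currently see is overwritten.
    for pgno, payload in new_pages.items():
        assert pgno > 0 and len(payload) <= PAGE
        os.pwrite(fd, payload.ljust(PAGE, b"\0"), pgno * PAGE)

    # 2. Barrier: all data writes must complete before metadata is touched.
    os.fsync(fd)

    # 3. Metadata update: one small write switches readers to the new root.
    os.pwrite(fd, struct.pack(META_FMT, txn_id, root), 0)
    os.fsync(fd)


def read_meta(fd):
    """Return (transaction id, root page number) from the meta page."""
    return struct.unpack(META_FMT, os.pread(fd, struct.calcsize(META_FMT), 0))
```

If the system dies before step 3, the meta page still points at the old, 
untouched pages, so the previous version stays intact no matter how large the 
transaction was. The sketch does not address making the metadata write itself 
atomic; a real design needs something for that as well.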

IMO, catering to the update-in-place approach is an exercise in futility, 
since it will require significant memory resources on every link in the 
storage chain, and whatever amount you have available will never be sufficient.

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/