public inbox for linux-btrfs@vger.kernel.org
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Sage Weil <sage@newdream.net>
Cc: Andrey Kuzmin <andrey.v.kuzmin@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: [RFC] big fat transaction ioctl
Date: Tue, 10 Nov 2009 16:49:24 -0800
Message-ID: <4AFA0A14.7070701@goop.org>
In-Reply-To: <Pine.LNX.4.64.0911101400270.27554@cobra.newdream.net>

On 11/10/09 14:13, Sage Weil wrote:
> On Tue, 10 Nov 2009, Andrey Kuzmin wrote:
>
>   
>> On Tue, Nov 10, 2009 at 11:12 PM, Sage Weil <sage@newdream.net> wrote:
>>     
>>> Hi all,
>>>
>>> This is an alternative approach to atomic user transactions for btrfs.
>>> The old start/end ioctls suffer from some basic limitations, namely
>>>
>>>  - We can't properly reserve space ahead of time to avoid ENOSPC part
>>> way through the transaction, and
>>>  - The process may die (seg fault, SIGKILL) part way through the
>>> transaction.  Currently when that happens the partial transaction will
>>> commit.
>>>
>>> This patch implements an ioctl that lets the application completely
>>> specify the entire transaction in a single syscall.  If the process gets
>>> killed or seg faults part way through, the entire transaction will still
>>> complete.
>>>
>>> The goal is to atomically commit updates to multiple files, xattrs,
>>> directories.  But this is still a file system: we don't get rollback if
>>> things go wrong.  Instead, do what we can up front to make sure things
>>> will work out.  And if things do go wrong, optionally prevent a partial
>>> result from reaching the disk.
>>>       
>> Why not snapshot respective root (doesn't work if transaction spans
>> multiple file-systems, but this doesn't look like a real-world
>> limitation), run txn against that snapshot and rollback on failure
>> instead? Snapshots are writable, cheap, and this looks like a real
>> transaction abort mechanism.
>>     
> Good question.  :)
>
> I hadn't looked into this before, but I think the snapshots could be used 
> to achieve both atomicity and rollback.  If userspace uses an rw mutex to 
> quiesce writes, it can make sure all transactions complete before creating 
> a snapshot (commit).  The problem with this currently is the create 
> snapshot ioctl is relatively slow... it calls commit_transaction, which 
> blocks until everything reaches disk.  I think to perform well this 
> approach would need a hook to start a commit and then return as soon as it 
> can guarantee that any subsequent operation's start_transaction can't join 
> in that commit.
>
> This may be a better way to go about this, though.  Does that sound 
> reasonable, Chris?
>   

If snapshots only capture what's currently physically on disk, then
transactions become fairly heavyweight, since each one requires
everything to be physically synced.  That may be what some apps want
anyway, but I can certainly imagine apps wanting transaction semantics
without having fsync-level durability requirements.
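
To make the quiesce-then-snapshot scheme from the quoted text concrete,
here is a rough userspace sketch, not btrfs code.  QuiesceLock and
ToyStore are hypothetical names; a Python dict stands in for the
filesystem, and copying it stands in for the create-snapshot ioctl.  It
models only atomicity among cooperating writers and says nothing about
when data reaches disk.

```python
import threading


class QuiesceLock:
    """Minimal reader-writer lock: many transactions, one exclusive snapshot."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0            # transactions currently in flight
        self._snapshotting = False  # True while a snapshot holds the lock

    def begin_txn(self):
        with self._cond:
            # New transactions wait while a snapshot is pending or running.
            while self._snapshotting:
                self._cond.wait()
            self._active += 1

    def end_txn(self):
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    def begin_snapshot(self):
        with self._cond:
            while self._snapshotting:
                self._cond.wait()
            self._snapshotting = True   # block new transactions first
            while self._active:         # then drain the ones in flight
                self._cond.wait()

    def end_snapshot(self):
        with self._cond:
            self._snapshotting = False
            self._cond.notify_all()


class ToyStore:
    """Hypothetical stand-in for a filesystem root plus its snapshots."""

    def __init__(self):
        self.lock = QuiesceLock()
        self.live = {}
        self.snapshots = []

    def transaction(self, updates):
        # All updates of one transaction happen under the shared lock,
        # so a snapshot sees either all of them or none of them.
        self.lock.begin_txn()
        try:
            for key, value in updates.items():
                self.live[key] = value
        finally:
            self.lock.end_txn()

    def snapshot(self):
        # Assumption: copying the dict stands in for the snapshot ioctl.
        self.lock.begin_snapshot()
        try:
            snap = dict(self.live)
            self.snapshots.append(snap)
        finally:
            self.lock.end_snapshot()
        return snap
```

In this sketch a snapshot never observes half of a transaction, yet
nothing is synced anywhere.  That is the distinction I mean: atomicity
can be had without also paying for durability on every commit.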

    J
