From: Oren Laadan <orenl@cs.columbia.edu>
To: Matt Helsley <matthltc@us.ibm.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>,
	Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>,
	Serge Hallyn <serge@hallyn.com>, Dan Smith <danms@us.ibm.com>,
	John Stultz <johnstul@us.ibm.com>,
	Matthew Wilcox <matthew@wil.cx>,
	Jamie Lokier <jamie@shareable.org>,
	linux-fsdevel@vger.kernel.org,
	Containers <containers@lists.linux-foundation.org>
Subject: Re: [PATCH 00/16][cr][v3]: C/R file owner, locks, leases
Date: Wed, 04 Aug 2010 14:03:50 -0400
Message-ID: <4C59AB86.6080202@cs.columbia.edu>
In-Reply-To: <20100804172649.GM2927@count0.beaverton.ibm.com>



On 08/04/2010 01:26 PM, Matt Helsley wrote:
> On Wed, Aug 04, 2010 at 11:45:20AM +0100, Steven Whitehouse wrote:
>> Hi,
>>
>> On Tue, 2010-08-03 at 16:11 -0700, Sukadev Bhattiprolu wrote:
>>> Checkpoint/restart file owner, file-locks and file-lease information.
>>>
>> Can you explain roughly how this is intended to work, or point me at a
>> document explaining it?
>>
>> I'm trying to figure out how the file lock checkpoint will work with
>> cluster filesystems, or if there needs to be a mechanism to turn this
>> feature off for those filesystems. What prevents the lock state changing
>> in an incompatible way between the checkpoint and the restore?
>

Hi Steve,

In addition to Matt's reply -

Checkpoint/restart _assumes_ that there exists a mechanism to keep
the filesystem state _unchanged_ between checkpoint and restart.

For example, one can kill the application after checkpoint and keep
the filesystem from being touched.
A more likely scenario is to use a filesystem's snapshot/backup
solution during checkpoint to ensure a pristine copy for restart.
In particular, a cluster filesystem needs to provide a mechanism to
accomplish this, or one must rely on dedicated userspace tools.

So at restart, the filesystem is assumed to be visible and in the
same state as before. That state also includes locks etc.

Also, c/r has a mechanism to detect cases where a file in use by
the checkpointed application(s) is shared with a task that is not
being checkpointed. In this case, checkpoint will fail, to prevent
inconsistencies.
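
To illustrate the idea, here is a minimal userspace sketch. It assumes
the detection works by comparing a file's total reference count with the
number of references found while walking the checkpointed tasks. The
struct and function names are hypothetical and only model the check;
they are not the actual kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct model_file {
    int total_refs;   /* every reference to the open file, system-wide */
    int ckpt_refs;    /* references found while walking checkpointed tasks */
};

/* A file is shared outside the checkpoint set when someone else holds it too. */
static bool file_is_shared_outside(const struct model_file *f)
{
    return f->total_refs > f->ckpt_refs;
}

/* Returns 0 when every file is private to the checkpoint set, -1 otherwise. */
static int check_files_for_leaks(const struct model_file *files, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (file_is_shared_outside(&files[i]))
            return -1;   /* refuse to checkpoint */
    return 0;
}
```

With this model, a file opened twice inside the checkpointed set
(total_refs == ckpt_refs == 2) passes, while a file with one extra
outside reference makes the whole checkpoint fail.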

(I also imagine that a cluster filesystem is often used by parallel
applications - which in turn require some support to be checkpointed
in a consistent manner).

Oren.


> Hi Steve,
>
> [ I'm just going to address your cluster filesystem question and let
>    Suka answer your questions on these patches. ]
>
> 	Open files whose file operations structs are missing the
> .checkpoint operation cause checkpoint to fail. We haven't added a
> .checkpoint operation to cluster filesystems because of the kinds of
> issues you're referring to.
>
> 	I don't think there are any file locks/leases which do not
> require opening the file(s) in question. That means file locks
> and leases in cluster filesystems should also cause checkpoint
> to fail.
>
> 	Each cluster filesystem probably needs some special care when
> considering the use of the generic_file_checkpoint operation.
>
> 	Using generic_file_checkpoint is appropriate when we have some
> way to get a consistent image of the filesystem at the time checkpoint
> takes place. How that happens is largely up to the userspace tools
> called user-cr. Device-mapper snapshots, fsfreezer + rsync, and
> filesystem snapshots will all work. Of course those tools usually don't
> save more volatile state information like locks.
>
> 	It's quite possible cluster filesystems will need their own
> .checkpoint file operations. generic_file_checkpoint is composed of a few
> smaller functions which could make writing such ops easier. For example,
> we've already reused the smaller functions in .checkpoint operations for
> anon_inode-based interfaces, pipes, fifos, and more.
>
> 	What it may come down to is this: How do you backup a cluster
> filesystem? If there's already a backup method that works then we can
> write the .checkpoint operation to rely on it. Often that means we
> can use generic_file_checkpoint. The "backup method" should be
> something which can be invoked by the userspace checkpoint/restart tools
> (user-cr). If the backup method is too slow we can work on
> improving it or we can try something else.
>
> 	So perhaps the best thing we can do to help you is learn how
> folks backup their cluster filesystems. Got any pointers to basic info
> on that?
>
> Cheers,
> 	-Matt Helsley
>
>
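
As a rough illustration of the rule Matt describes above (an open file
whose file operations lack a .checkpoint operation causes checkpoint to
fail), here is a small userspace model. All names in it are hypothetical
stand-ins, the error value is arbitrary, and the "generic" hook does
nothing; it is a sketch of the dispatch logic only, not the kernel
implementation.

```c
#include <stddef.h>

struct model_file;

/* Hypothetical stand-in for a file operations table with a checkpoint hook. */
struct model_file_ops {
    int (*checkpoint)(struct model_file *f);   /* NULL when unsupported */
};

struct model_file {
    const struct model_file_ops *ops;
};

/* Stand-in for a generic hook; a real one would save the file's state. */
static int model_generic_checkpoint(struct model_file *f)
{
    (void)f;
    return 0;
}

/* Checkpoint each open file; any file without a hook fails the whole run. */
static int checkpoint_open_files(struct model_file *files, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct model_file_ops *ops = files[i].ops;

        if (ops == NULL || ops->checkpoint == NULL)
            return -1;   /* filesystem did not opt in */

        int ret = ops->checkpoint(&files[i]);
        if (ret < 0)
            return ret;
    }
    return 0;
}
```

In this model a filesystem opts in by pointing its hook at something
like model_generic_checkpoint; one that leaves the hook NULL makes
checkpoint_open_files() fail for any task holding one of its files open.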
