From: "Serge E. Hallyn" <serue@us.ibm.com>
To: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>,
Christoph Hellwig <hch@infradead.org>,
containers <containers@lists.linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH 00/11] track files for checkpointability
Date: Fri, 6 Mar 2009 12:24:51 -0600
Message-ID: <20090306182451.GA6307@us.ibm.com>
In-Reply-To: <1236357965.10626.51.camel@nimitz>

Quoting Dave Hansen (dave@linux.vnet.ibm.com):
> On Fri, 2009-03-06 at 10:23 -0600, Serge E. Hallyn wrote:
> > Which imo is fine, but my question is whether that leaves any actual
> > value in the persistent per-resource uncheckpointable flag.
>
> OK, let's take a look back at this discussion a little bit and how we
> got here.
>
> Ingo quotes:
> > Yeah, per resource it should be. That's per task in the normal
> > case - except for threaded workloads where it's shared by
> > threads.
>
> > Uncheckpointable should be a one-way flag anyway. We want this
> > to become usable, so uncheckpointable functionality should be as
> > painful as possible, to make sure it's getting fixed ...
>
> > Is there any automated test that could discover C/R breakage via
> > brute force? All that matters in such cases is to get the "you
> > broke stuff" information as soon as possible. If it comes at an
> > early stage developers can generally just fix stuff.
>
> You add these things together and you get what I posted. My patch is:
> 1. per resource
> 2. has a one way flag
> 3. Gives messages to developers at an early stage (dmesg) and lets them
> explore it more thoroughly (/proc)
>
> But, these "early stage" messages are completely opposed to an approach
> that uses sys_checkpoint() in some form (like with a -1 fd as an
> argument).

Well, I disagree with that. The 'early stage' messages could be seen as
either:

  1. a short-term way to prioritize which resources to support, or
  2. a long-term way to catch new resources introduced
     without checkpoint/restart support.

I don't believe 2. would work. I think 1. would work, but we would
risk imposing permanent code changes to support a temporary goal.

In contrast, the sys_checkpoint() check will always be needed to
check whether a particular application is checkpointable. For
instance, a task will never be checkpointable if it shares an mm_struct
with a task that is not being checkpointed.
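
To make the mm sharing case concrete, here is a minimal userspace sketch in
C. All names in it (mm_sketch, task_sketch, mm_sharing_ok) are hypothetical
and made up for illustration; this is not the kernel's task_struct or
mm_struct, and it is not code from the posted patches. It only shows that
the answer depends on which tasks are in the set being checkpointed, so it
has to be computed at checkpoint time rather than stored as a persistent
flag on the resource.

```c
/* Userspace illustration only; every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mm_sketch {
	int id;			/* stand-in for an address space */
};

struct task_sketch {
	int pid;
	struct mm_sketch *mm;	/* may be shared between tasks */
	bool in_checkpoint;	/* is this task in the set being checkpointed? */
};

/*
 * Return true if no task in the checkpoint set shares its mm with a
 * task outside the set.  O(n^2) on purpose, to keep the sketch short.
 */
static bool mm_sharing_ok(const struct task_sketch *tasks, size_t n)
{
	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].in_checkpoint)
			continue;
		for (size_t j = 0; j < n; j++) {
			/* Same mm, but the other user is left behind. */
			if (tasks[j].mm == tasks[i].mm &&
			    !tasks[j].in_checkpoint)
				return false;
		}
	}
	return true;
}
```

The same two tasks sharing one mm pass the check when both are selected and
fail it when only one is, which is why a one-way flag on the mm itself could
not carry this information.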

> Think of it like lockdep. We *could* have designed lockdep to simply
> give us a nice message whenever we do an a/b b/a deadlock. That would
> be helpful. Or, we could design it to record all lock acquisitions that
> didn't deadlock to see if they ever possibly deadlock. (We did the
> second one, btw). That gave an early, useful, warning that developers
> could fix before we encounter an actual problem. I'm advocating such a
> mechanism for c/r.

If you can convince me that it'll do that, you'll have me on board :)

-serge