From: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
"Serge E. Hallyn" <serge-A9i7LUbDfNHQT0dZR+AlfA@public.gmane.org>,
"Serge E. Hallyn"
<serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>,
Nathan Lynch <nathanl-V7BBcbaFuwjMbYB6QlFGEg@public.gmane.org>
Subject: Re: C/R: File substitution at restart
Date: Thu, 9 Sep 2010 04:02:20 -0700
Message-ID: <20100909110220.GF8957@count0.beaverton.ibm.com>
In-Reply-To: <20100909103720.GF4812-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
On Thu, Sep 09, 2010 at 12:37:20PM +0200, Louis Rilling wrote:
> On 08/09/10 21:06 -0700, Matt Helsley wrote:
> > On Wed, Sep 08, 2010 at 08:03:52PM -0500, Serge E. Hallyn wrote:
> > > Quoting Matt Helsley (matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org):
> > > > On Wed, Sep 08, 2010 at 08:09:31AM -0500, Serge E. Hallyn wrote:
> > > > I think it can be split into two composable pieces which may also be
> > > > useful independently.
> > > >
> > > > The first uses the fcntl() interface to add a flag like
> > > > O_CLOEXEC. Unlike O_CLOEXEC it marks an fd for preservation during
> > > > restart. That way we don't have to specify an fd number and a "source"
> > > > to the kernel. Just tell the kernel to keep the fd. The source can
> > > > be opened and dup2'd via userspace. This is useful without the
> > > > second piece if we want to simply add rather than replace an fd.
> > >
> > > Can you think of any other use for this flag other than restart?
> >
> > <joking>
> > I can't think of any other uses for O_CLOEXEC.
> > </joking>
> >
> > Seriously though, restart will be used _much_ less often than exec so yes
> > it does seem like a waste of a valuable bit and something that wouldn't
> > quite belong in an fcntl interface.
> >
> > However we can try to be a tad clever -- we could (ab|re)use O_CLOEXEC.
> > Right now restart closes all file descriptors and pays absolutely
> > no attention to O_CLOEXEC. We could reuse O_CLOEXEC to mean O_CLOREST
> > too. Have user-cr's restart tool mark all unwanted fds O_CLOEXEC. Any we
> > want to keep we do not mark with O_CLOEXEC.
>
> This would also be useful at checkpoint, to tell sys_checkpoint() which fds
> should be ignored, either because they are not supported or because the
> application has a better way to deal with them.
True. Though unlike restart, I don't think we can just (ab|re)use O_CLOEXEC
for that purpose.
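
For the restart side, the userspace half of the idea quoted above might look
roughly like the sketch below. It is only an illustration under stated
assumptions: substitute_fd() and mark_fds_for_restart() are hypothetical
helper names, not part of user-cr, and treating FD_CLOEXEC as "close at
restart" is the proposal under discussion, not something the kernel does
today. The only calls used are open(), dup2(), close() and
fcntl(F_GETFD/F_SETFD). Note that dup2() leaves the close-on-exec flag off on
the new descriptor, so a freshly substituted fd is already "kept".

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open `path` and install it at fd number `target_fd`. */
static int substitute_fd(const char *path, int flags, int target_fd)
{
	int fd = open(path, flags);

	if (fd < 0)
		return -1;
	if (fd == target_fd)
		return 0;		/* already landed on the right number */
	if (dup2(fd, target_fd) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}
	close(fd);			/* only the copy at target_fd is needed */
	return 0;
}

/*
 * Hypothetical helper: for every open fd in [0, max_fd), clear FD_CLOEXEC
 * if the fd is listed in keep[], otherwise set it. Under the proposal,
 * restart would close exactly the fds that end up with FD_CLOEXEC set.
 */
static int mark_fds_for_restart(const int *keep, int nkeep, int max_fd)
{
	for (int fd = 0; fd < max_fd; fd++) {
		int flags = fcntl(fd, F_GETFD);
		int kept = 0;

		if (flags < 0)
			continue;	/* fd is not open */
		for (int i = 0; i < nkeep; i++)
			if (keep[i] == fd)
				kept = 1;
		if (kept)
			flags &= ~FD_CLOEXEC;
		else
			flags |= FD_CLOEXEC;
		if (fcntl(fd, F_SETFD, flags) < 0)
			return -1;
	}
	return 0;
}
```

A restart tool would call substitute_fd() once per fd it wants to add or
replace, then mark_fds_for_restart() with the list of fds to preserve, just
before invoking restart.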
>
> >
> >
> > Here's another idea which I haven't fully thought out yet.
> >
> > We could introduce the concept of object id substitutions in the image.
> > So the image would look like (going from file pos 0 at the top..):
> >
> >     0 +-------------------------------+
> >       |                               |
> >                     .....
> >       +-------------------------------+
> >       |      <substitute object>      | <--- object with id == <substitute id>
> >                     .....
> >       +---------------+---------------+
> >       | <object id>   |<substitute id>|
> >       +---------------+---------------+
> >                     .....
> >       +---------------+---------------+
> >       |       <object to ignore>      | <-- object with id == <object id>
> >                     .....
> >
> > (The above is ignoring the ckpt_hdr fields..)
> >
> > When we read the image during restart we use the substitute ids to
> > create indirect objhash entries. When we encounter an obj id and
> > it refers to an indirect entry we first parse the object (ignoring
> > errors and dropping references on new objhash insertions), flip
> > a bit on the indirect entry (indicating the object has been parsed),
> > and then lookup the substitute id and return whatever that resolved to.
> >
> > We can ignore the new objhash objects by making the objhash have its
> > own operation struct. When we're parsing an object that's been
> > substituted we just temporarily set the objhash add/lookup operations
> > to something suitable for properly dropping references to the new
> > object(s). This way we don't have to add checks for this peculiar
> > need all over the checkpoint/restart code. Sure it'll be slower...
>
> If at checkpoint we can take care to ignore files that we know will be
> substituted, this should not be that much slower.
So, would you say typically it's the application developer who knows
what to ignore? Are we expecting distros/packagers to be able to set
that up? Admins? These specific optimizations seem like they would be a
bit fragile unless the application developer is involved.
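
To make the indirect objhash entries from the quoted proposal a bit more
concrete, here is a small userspace model of the lookup path. It is a sketch
under assumptions, not the checkpoint/restart objhash code: struct objhash,
objhash_add_direct(), objhash_add_indirect() and objhash_resolve() are all
hypothetical names, the table is a flat array, and the "parse the original
object and drop its references" step is reduced to flipping the parsed bit.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJS 64

enum entry_kind { ENTRY_DIRECT, ENTRY_INDIRECT };

struct entry {
	enum entry_kind kind;
	int id;
	void *obj;	/* direct: the restored object */
	int subst_id;	/* indirect: id to resolve instead */
	bool parsed;	/* indirect: original object was seen and discarded */
};

struct objhash {
	struct entry slots[MAX_OBJS];
	size_t count;
};

static struct entry *objhash_find(struct objhash *h, int id)
{
	for (size_t i = 0; i < h->count; i++)
		if (h->slots[i].id == id)
			return &h->slots[i];
	return NULL;
}

static struct entry *objhash_new(struct objhash *h, int id)
{
	if (h->count == MAX_OBJS || objhash_find(h, id))
		return NULL;	/* table full or duplicate id */
	h->slots[h->count].id = id;
	return &h->slots[h->count++];
}

static int objhash_add_direct(struct objhash *h, int id, void *obj)
{
	struct entry *e = objhash_new(h, id);

	if (!e)
		return -1;
	e->kind = ENTRY_DIRECT;
	e->obj = obj;
	return 0;
}

/* Records an <object id>/<substitute id> pair read from the image. */
static int objhash_add_indirect(struct objhash *h, int id, int subst_id)
{
	struct entry *e = objhash_new(h, id);

	if (!e)
		return -1;
	e->kind = ENTRY_INDIRECT;
	e->subst_id = subst_id;
	e->parsed = false;
	return 0;
}

/*
 * Resolve an id, following indirect entries to their substitutes. The hop
 * limit guards against substitution cycles in a malformed image.
 */
static void *objhash_resolve(struct objhash *h, int id)
{
	for (int hops = 0; hops < MAX_OBJS; hops++) {
		struct entry *e = objhash_find(h, id);

		if (!e)
			return NULL;
		if (e->kind == ENTRY_DIRECT)
			return e->obj;
		/* The real thing would parse and drop the original here. */
		e->parsed = true;
		id = e->subst_id;
	}
	return NULL;
}
```

With a direct entry for id 7 and an indirect entry mapping id 3 to 7,
objhash_resolve(&h, 3) and objhash_resolve(&h, 7) return the same object, and
the entry for id 3 ends up with its parsed bit set.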
Cheers,
-Matt Helsley