From: Oren Laadan <orenl@cs.columbia.edu>
To: Matt Helsley <matthltc@us.ibm.com>
Cc: Tejun Heo <tj@kernel.org>, Gene Cooperman <gene@ccs.neu.edu>,
	Kapil Arya <kapil@ccs.neu.edu>,
	ksummit-2010-discuss@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, hch@lst.de
Subject: Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch
Date: Sat, 06 Nov 2010 11:01:11 -0400
Message-ID: <4CD56DB7.109@cs.columbia.edu>
In-Reply-To: <20101106053204.GB12449@count0.beaverton.ibm.com>



On 11/06/2010 01:32 AM, Matt Helsley wrote:
> On Fri, Nov 05, 2010 at 10:28:09AM +0100, Tejun Heo wrote:
>> Hello,
>>
>> On 11/04/2010 05:44 PM, Gene Cooperman wrote:
>>>>> In our personal view, a key difference between in-kernel and userland
>>>>> approaches is the issue of security.
>>>>
>>>> That's an interesting point but I don't think it's a dealbreaker.
>>>> ... but it's not like CR is gonna be deployed on
>>>> majority of desktops and servers (if so, let's talk about it then).
>>>
>>> This is a good point to clarify some issues.  C/R has several good
>>> targets.  For example, BLCR has targeted HPC batch facilities, and
>>> does it well.
>>>
>>> DMTCP started life on the desktop, and it's still a primary focus of
>>> DMTCP.  We worked to support screen on this release precisely so
>>> that advanced desktop users have the option of putting their whole
>>> screen session under checkpoint control.  It complements the core
>>> goal of screen: If you walk away from a terminal, you can get back
>>> the session elsewhere.  If your session crashes, you can get back
>>> the session elsewhere (depending on where you save the checkpoint
>>> files, of course :-) ).
>>
>> Call me skeptical but I still don't see, yet, it being a mainstream
>> thing (for average sysadmin John and proverbial aunt Tilly).  It
>> definitely is useful for many different use cases tho.  Hey, but let's
>> see.
> 
> Rightly so. It hasn't been widely proven as something that distros
> would be willing to integrate into a normal desktop session. We've got
> some demos of it working with VNC, twm, and vim. Oren has his own VNC,
> twm, etc demos too. We haven't looked very closely at more advanced
> desktop sessions like (in no particular order) KDE or Gnome. Nor have
> we yet looked at working with any portions of X that were meant to provide
> this but were never popular enough to do so (XSMP iirc).

Actually, I do have a demo of Zap (linux-cr predecessor) with a _full_
gnome desktop running under VNC with:
* a movie player,
* firefox,
* thunderbird,
* openoffice,
* kernel make,
* gdb debugging something,
* WINE with microsoft office (oops)
all of these checkpointed with < 25ms of downtime and resumed an
arbitrary time later, successfully.

I even have witnesses that saw it ;)

> 
> Does DMTCP handle KDE/Gnome sessions? X too?
> 
> On the kernel side of things for the desktop, right now we think our
> biggest obstacle is inotify. I've been working on kernel patches for
> kernel-cr to do that and it seems fairly do-able. Does DMTCP handle
> restarting inotify watches without dropping events that were present
> during checkpoint?
> 

At the very least userspace would need to interpose on all
inotify related syscalls to track (log) what the user did to
be able to redo it at restart. (And I'm sure there will be
crazy to impossible races and corner cases there).

Does it make sense to replicate in userspace everything already done
in the kernel?
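
To make the bookkeeping concrete, here is a minimal sketch of the kind
of log a userspace wrapper would have to keep. It is an illustration
only, not code from DMTCP or linux-cr: WatchLog and its methods are
hypothetical names, and it assumes the wrapper sees every
inotify_add_watch()/inotify_rm_watch() call. It keys watches on the
path string, whereas the kernel keys them on the inode, and it does
nothing about events already queued at checkpoint time, which is
exactly where the races come in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class WatchLog:
    """Hypothetical userspace record of the inotify watches a process set up."""

    next_wd: int = 1
    # wd -> (path, mask) for every watch that is still active
    watches: Dict[int, Tuple[str, int]] = field(default_factory=dict)

    def add_watch(self, path: str, mask: int) -> int:
        # Re-adding a watched path returns the existing descriptor and
        # replaces its mask (simplification: matched by path, not inode).
        existing = next(
            (wd for wd, (p, _) in self.watches.items() if p == path), None
        )
        if existing is not None:
            self.watches[existing] = (path, mask)
            return existing
        wd = self.next_wd
        self.next_wd += 1
        self.watches[wd] = (path, mask)
        return wd

    def rm_watch(self, wd: int) -> None:
        # A removed watch must not be re-created at restart.
        self.watches.pop(wd, None)

    def replay(self, real_add_watch: Callable[[str, int], int]) -> Dict[int, int]:
        # At restart, re-issue each surviving watch through the supplied
        # callback and return an old-wd -> new-wd translation table.
        return {
            wd: real_add_watch(path, mask)
            for wd, (path, mask) in sorted(self.watches.items())
        }


# Usage with a stand-in for the real syscall wrapper:
log = WatchLog()
wd_a = log.add_watch("/tmp/a", 0x2)
wd_b = log.add_watch("/tmp/b", 0x100)
log.rm_watch(wd_b)
translation = log.replay(lambda path, mask: 7)  # pretend the kernel returned wd 7
```

The translation table is needed because the application still holds
the old descriptors, so every event read after restart would have to
be rewritten on the way back to it.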

> The other problem for kernel c/r of X is likely to be DRM. Since the
> different graphics chipsets vary so widely there's nothing we can do
> to migrate DRM state of an NVIDIA chipset to DRM state of an ATI chipset
> as far as I know. Perhaps if that would help hybrid graphics systems
> then it's something that could be common between DRM and
> checkpoint/restart but it's very much pie-in-the-sky at the moment.

DRM is hardware, and is complex for both userspace and kernel. Let's
assume it isn't supported until it's properly virtualized.

(In the long-long run, I'd envision hardware manufacturers providing
c/r support within their drivers - e.g. checkpoint() and restart()
kernel methods. But that's only if they care about it, and in any
event, pretty far down the road...)

> kernel c/r of input devices might be a lot easier. We just simulate
> hot [un]plug of the devices and rely on X responding. We can even
> checkpoint the events X would have missed and deliver them prior to hot
> unplug.
> 

[snip]

Oren.
