* RE: Trivial patch to fix logging level output by XendCheckpoint.py
@ 2006-11-21  1:23 Graham, Simon
  2006-11-21  9:20 ` Ewan Mellor
  0 siblings, 1 reply; 13+ messages in thread
From: Graham, Simon @ 2006-11-21  1:23 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: xen-devel

Well, I didn't expect this much discussion!!! I must admit I found it
hard to grok this code initially but I think the idea of exec'ing a
separate program is a good one to avoid the chance that xend can die
(especially since you don't currently seem to be able to restart xend
without rebooting Dom0).

Anyway - even if XendCheckpoint.py called xc_linux_save/restore
directly, you'd still have to deal with getting the output to the right
place since these APIs write their debug output directly to stderr so
I've got to believe you'd still fork a child that made the call and have
the parent slurp the contents of stderr and log them so the same issue
would exist...
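
A rough sketch of the fork-and-slurp arrangement described above, in modern Python rather than the xend code of the time. The names `run_and_log` and `checkpoint-sketch` are hypothetical and are not taken from XendCheckpoint.py; the child command is whatever the caller passes in. It only illustrates that the parent still has to pick a log level for each line it reads from the child's stderr.

```python
import logging
import re
import subprocess
import sys

# Hypothetical logger name, used only for this sketch.
log = logging.getLogger("checkpoint-sketch")


def run_and_log(argv, logger=log):
    """Run argv as a child process and log each line it writes to stderr.

    Lines that start with "ERROR: " are logged at error level with the
    prefix stripped; every other line is logged at info level.
    Returns the child's exit status.
    """
    child = subprocess.Popen(argv, stderr=subprocess.PIPE, text=True)
    for raw in child.stderr:          # reads until the child closes stderr
        line = raw.strip()
        m = re.match(r"ERROR: (.*)", line)
        if m is None:
            logger.info("%s", line)
        else:
            logger.error("%s", m.group(1))
    return child.wait()
```

Calling `run_and_log([sys.executable, "-c", "..."])` with any small script that writes to stderr is enough to try it out; a crash in the child then shows up as a non-zero return value rather than taking the parent down.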

Simon

> -----Original Message-----
> From: Daniel P. Berrange [mailto:berrange@redhat.com]
> Sent: Monday, November 20, 2006 4:56 PM
> To: Graham, Simon
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] Trivial patch to fix logging level output by
> XendCheckpoint.py
> 
> On Mon, Nov 20, 2006 at 04:34:25PM -0500, Graham, Simon wrote:
> > Signed-off-by: Simon Graham <Simon.Graham@stratus.com>
> >
> > There is a somewhat trivial issue with XendCheckpoint.py right now in
> > that it logs everything written to stderr by xc_save and xc_restore as
> > errors whereas in fact the vast majority of this output is
> > information/debug (and all actual errors are marked by the string ERROR:
> > at the start of the message) -- this is confusing to folks looking at
> > the logs and makes automated log analysis tricky.
> >
> > Fix is to scan for the ERROR: string and log anything without it using
> > log.info instead.
> 
> This bit of XenD looks rather dubious to me. XenD spawns a background
> thread, which in turn forks the xc_save or xc_restore program, reading
> the stdout from this child process.
> 
> Aside from command line argument handling, the xc_save and xc_restore
> programs each only make *one* single function call to xc_linux_save or
> xc_linux_restore APIs in libxc.
> 
> But XenD already has libxc loaded & calls it directly for all other jobs
> it needs to do, including actually creating the domain in the first
> place.  So why is it going to all the trouble of fork'ing a child process
> rather than just calling the xc_linux_save/restore APIs directly?  If it
> did this, we wouldn't need this loop to read, grok & log STDOUT from the
> child process at all - it would just end up in regular XenD logs.
> 
> And even more importantly, once we get the error reporting patches
> integrated into libxc, having the xc_linux_save/restore calls made by
> XenD directly would ensure we get reliable error handling for
> save/restore operations.
> 
> So, is there any good reason for xc_save & xc_restore to exist as separate
> processes - it just seems like a huge complication to me, increasing
> fragility of the system & reducing the quality of error reporting we can
> get?
> 
> Regards,
> Dan.
> --
> |=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
> |=-           Perl modules: http://search.cpan.org/~danberr/              -=|
> |=-               Projects: http://freshmeat.net/~danielpb/               -=|
> |=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|

* RE: Trivial patch to fix logging level output by XendCheckpoint.py
@ 2006-11-21 14:58 Graham, Simon
  2006-11-21 15:04 ` Daniel P. Berrange
  2006-11-21 15:05 ` Ewan Mellor
  0 siblings, 2 replies; 13+ messages in thread
From: Graham, Simon @ 2006-11-21 14:58 UTC (permalink / raw)
  To: Daniel P. Berrange, Ewan Mellor; +Cc: xen-devel

> > > Well, I didn't expect this much discussion!!! I must admit I found it
> > > hard to grok this code initially but I think the idea of exec'ing a
> > > separate program is a good one to avoid the chance that xend can die
> > > (especially since you don't currently seem to be able to restart xend
> > > without rebooting Dom0).
> >
> > That's not true -- Xend ought to be restarting itself (it forks itself so
> > that it has a monitor process) and even then "xend restart" ought to work.
> > Are you having trouble with that?
> >
> > The one process that we can't restart at the moment is xenstored.
> > Everything else should be fine.
>

Yes we have had problems with this (in 3.0.2) -- however, we test this
with /etc/init.d/xend stop/start which will, I think, restart xenstored
as well, so I guess this still does not work. I will look into modifying
the test so it kills the main xend process instead to more accurately
test fault insertion in xend.

> While XenD itself can easily be restarted it does cause a few problems
> if you have any HVM domains, or paravirt + blktap domains running. XenD
> is the parent process for both qemu-dm and tapdisk helper processes. It
> appears that these two are incapable of detecting shutdown of the guest
> they are associated with themselves, and instead rely on XenD to tell
> them when to exit / kill them. So the problem is, that if you restart
> XenD these processes get re-parented to init, and then when you shutdown
> the guest domain, the qemu-dm/tapdisk helpers stay around causing the
> domain to linger in zombie state forever.
> 
> If we could figure out how to sort this, then XenD would be trivially
> restartable without any ill effects.
> 

Ah - OK, well, we aren't testing HVM guests yet but we will be;
hopefully one of us can figure out how to solve this issue.

Simon

* Trivial patch to fix logging level output by XendCheckpoint.py
@ 2006-11-20 21:34 Graham, Simon
  2006-11-20 21:56 ` Daniel P. Berrange
  2006-11-21 10:19 ` Ewan Mellor
  0 siblings, 2 replies; 13+ messages in thread
From: Graham, Simon @ 2006-11-20 21:34 UTC (permalink / raw)
  To: xen-devel

[-- Attachment #1: Type: text/plain, Size: 571 bytes --]

Signed-off-by: Simon Graham <Simon.Graham@stratus.com>

There is a somewhat trivial issue with XendCheckpoint.py right now in
that it logs everything written to stderr by xc_save and xc_restore as
errors whereas in fact the vast majority of this output is
information/debug (and all actual errors are marked by the string ERROR:
at the start of the message) -- this is confusing to folks looking at
the logs and makes automated log analysis tricky.

Fix is to scan for the ERROR: string and log anything without it using
log.info instead.

Regards,
Simon


[-- Attachment #2: xendcheckpoint.patch --]
[-- Type: application/octet-stream, Size: 584 bytes --]

Index: trunk/tools/python/xen/xend/XendCheckpoint.py
===================================================================
--- trunk/tools/python/xen/xend/XendCheckpoint.py	(revision 6921)
+++ trunk/tools/python/xen/xend/XendCheckpoint.py	(working copy)
@@ -234,4 +234,9 @@
         if line == "":
             break
         else:
-            log.error('%s', line.strip())
+            line = line.strip()
+            m = re.match(r"^ERROR: (.*)", line)
+            if m is None:
+                log.info('%s', line)
+            else:
+                log.error('%s', m.group(1))
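
The rule in the hunk above can be tried on its own, outside xend. The following is a minimal sketch, assuming only the behaviour visible in the patch: strip the line, and treat it as an error only if it begins with `ERROR: `. The function name `classify` and the sample messages are made up for illustration and are not real xc_save output.

```python
import logging
import re

# Same pattern as the patch; the leading "^" is dropped because
# re.match already anchors at the start of the string.
ERROR_PREFIX = re.compile(r"ERROR: (.*)")


def classify(line):
    """Return (logging level, message) for one line of helper output."""
    line = line.strip()
    m = ERROR_PREFIX.match(line)
    if m is None:
        return logging.INFO, line
    return logging.ERROR, m.group(1)


if __name__ == "__main__":
    for sample in ("ERROR: something failed\n",
                   "  progress message\n",
                   "ERROR:no space after the colon\n"):
        print(classify(sample))
```

Note that the pattern requires a space after the colon and only matches at the start of the stripped line, so `ERROR:` without a following space, or `ERROR: ` appearing mid-line, is logged at info level.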



Thread overview: 13+ messages
2006-11-21  1:23 Trivial patch to fix logging level output by XendCheckpoint.py Graham, Simon
2006-11-21  9:20 ` Ewan Mellor
2006-11-21 12:29   ` Daniel P. Berrange
  -- strict thread matches above, loose matches on Subject: below --
2006-11-21 14:58 Graham, Simon
2006-11-21 15:04 ` Daniel P. Berrange
2006-11-21 15:05 ` Ewan Mellor
2006-11-20 21:34 Graham, Simon
2006-11-20 21:56 ` Daniel P. Berrange
2006-11-20 22:34   ` Anthony Liguori
2006-11-20 22:43     ` Daniel P. Berrange
2006-11-20 22:40   ` Ewan Mellor
2006-11-20 22:53     ` John Levon
2006-11-21 10:19 ` Ewan Mellor
