* Live migration: "netbuf race" messages can cause significant performance impact
@ 2006-12-21  3:00 John Byrne
  2006-12-21 18:18 ` Ian Pratt
  2006-12-29 13:07 ` Steven Hand
  0 siblings, 2 replies; 3+ messages in thread
From: John Byrne @ 2006-12-21  3:00 UTC (permalink / raw)
  To: xen-devel


Hi,

Someone found that doing a live migration of a domain that had ballooned 
down took far longer to migrate. (Ballooned down from 3000M to 1000M, 31 
seconds vs 89 seconds, real time) I came up with a complex theory and 
asked him to look in the xend.log to confirm it. He didn't, but he 
mentioned there were a lot of "netbuf race" messages in the log. In this
particular case, live migration generated approximately 512000 "netbuf 
race" messages. Deleting the DPRINTF reduced the migration time to 11 
seconds.

While it is simple enough to submit a patch to delete this DPRINTF, 
perhaps something more subtle is called for such as modifying the 
migrate/save command paths to accept a debug argument and pass it to
xc_save?
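
As a rough illustration of the debug-argument idea, the sketch below gates a DPRINTF-style helper behind a run-time flag, so the per-page message costs almost nothing unless debugging was requested. The names debug_enabled, suppressed_msgs and debug_printf are hypothetical and are not taken from the xc_save source; how the flag would be passed down the migrate/save command paths is likewise an assumption.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical flag: assumed to be set from a debug argument handed to
 * the save path. Not an existing xc_save option. */
static int debug_enabled = 0;

/* Hypothetical counter of messages skipped while debugging is off. */
static unsigned long suppressed_msgs = 0;

/* Print a debug message to stderr only when debug_enabled is set.
 * Returns the number of characters written, or 0 if suppressed. */
static int debug_printf(const char *fmt, ...)
{
    va_list ap;
    int n;

    if (!debug_enabled) {
        suppressed_msgs++;      /* cheap: no formatting, no I/O */
        return 0;
    }

    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

With the flag clear, each call only increments a counter, which could be reported once at the end of the save instead of logging every page.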

Thanks,

John Byrne


* RE: Live migration: "netbuf race" messages can cause significant performance impact
  2006-12-21  3:00 Live migration: "netbuf race" messages can cause significant performance impact John Byrne
@ 2006-12-21 18:18 ` Ian Pratt
  2006-12-29 13:07 ` Steven Hand
  1 sibling, 0 replies; 3+ messages in thread
From: Ian Pratt @ 2006-12-21 18:18 UTC (permalink / raw)
  To: John Byrne, xen-devel

> Someone found that doing a live migration of a domain that had ballooned
> down took far longer to migrate. (Ballooned down from 3000M to 1000M, 31
> seconds vs 89 seconds, real time) I came up with a complex theory and
> asked him to look in the xend.log to confirm it. He didn't, but he
> mentioned there were a lot of "netbuf race" messages in the log. In this
> particular case, live migration generated approximately 512000 "netbuf
> race" messages. Deleting the DPRINTF reduced the migration time to 11
> seconds.
>
> While it is simple enough to submit a patch to delete this DPRINTF,
> perhaps something more subtle is called for such as modifying the
> migrate/save command paths to accept a debug argument and pass it to
> xc_save?

Ideally, we'd do more than suppress the printf.  We're needlessly
re-scanning the bitmap for the pages that are ballooned out because
we're not distinguishing them from other pages like network buffers that
are temporarily not part of the p2m map. I'm pretty sure my original
implementation got this right and it's since been broken :)

This scanning probably isn't very expensive (sans the printf), but it's
worth cleaning up.

Ian

 
> Thanks,
> 
> John Byrne


* Re: Live migration: "netbuf race" messages can cause significant performance impact
  2006-12-21  3:00 Live migration: "netbuf race" messages can cause significant performance impact John Byrne
  2006-12-21 18:18 ` Ian Pratt
@ 2006-12-29 13:07 ` Steven Hand
  1 sibling, 0 replies; 3+ messages in thread
From: Steven Hand @ 2006-12-29 13:07 UTC (permalink / raw)
  To: John Byrne; +Cc: xen-devel, Steven.Hand


>Someone found that doing a live migration of a domain that had ballooned 
>down took far longer to migrate. (Ballooned down from 3000M to 1000M, 31 
>seconds vs 89 seconds, real time) I came up with a complex theory and 
>asked him to look in the xend.log to confirm it. He didn't, but he 
>mentioned there were a lot of "netbuf race" messages in the log. In this
>particular case, live migration generated approximately 512000 "netbuf 
>race" messages. Deleting the DPRINTF reduced the migration time to 11 
>seconds.
>
>While it is simple enough to submit a patch to delete this DPRINTF, 
>perhaps something more subtle is called for such as modifying the 
>migrate/save command paths to accept a debug argument and pass it to
>xc_save?

There's nothing much we can do here - there's no easy way for us to 
distinguish between pages which are 'ballooned out' and pages which
are temporarily being used for network buffers. I've checked in a fix
to unstable (cset 13185:62ef527eb19f) which simply removes this particular
debug output.


thanks for spotting this! 

cheers,

S.

