* [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
@ 2005-09-30 16:54 Dan Smith
2005-10-01 10:54 ` Ewan Mellor
2005-10-04 7:36 ` Ewan Mellor
0 siblings, 2 replies; 8+ messages in thread
From: Dan Smith @ 2005-09-30 16:54 UTC
To: Xen Developers
[-- Attachment #1: Type: text/plain, Size: 350 bytes --]
This is a resend of my stale state fix, which is yet unapplied. If
there are issues, please let me know.
Note that this fixes the issue poked by xm-test, as shown in the
following snippet of David's latest FC3pae.report:
> FAIL: 01_shutdown_basic_pos
> I had to run an xm list to update xend state!
Signed-off-by: Dan Smith <danms@us.ibm.com>
[-- Attachment #2: domid_fix.patch --]
[-- Type: text/x-patch, Size: 2852 bytes --]
diff -r 9d047fb99e38 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py Fri Sep 30 16:37:52 2005
+++ b/tools/python/xen/xend/XendDomainInfo.py Fri Sep 30 09:47:51 2005
@@ -795,7 +795,7 @@
if not info:
info = dom_get(self.domid)
if not info:
- return
+ return False
self.info.update(info)
self.validateInfo()
@@ -803,6 +803,8 @@
log.debug("XendDomainInfo.update done on domain %d: %s", self.domid,
self.info)
+
+ return True
## private:
diff -r 9d047fb99e38 tools/python/xen/xend/server/SrvDomain.py
--- a/tools/python/xen/xend/server/SrvDomain.py Fri Sep 30 16:37:52 2005
+++ b/tools/python/xen/xend/server/SrvDomain.py Fri Sep 30 09:47:51 2005
@@ -21,6 +21,8 @@
from xen.xend import XendDomain
from xen.xend import PrettyPrint
from xen.xend.Args import FormFn
+from xen.xend.XendError import XendError
+from xen.xend.XendLogging import log
from xen.web.SrvDir import SrvDir
@@ -210,7 +212,9 @@
#
# if op and op[0] in ['vifs', 'vif', 'vbds', 'vbd', 'mem_target_set']:
# return self.perform(req)
- self.dom.update()
+ if not self.dom.update():
+ raise XendError("Domain %s no longer exists" % self.dom.getName())
+
if self.use_sxp(req):
req.setHeader("Content-Type", sxp.mime_type)
sxp.show(self.dom.sxpr(), out=req)
diff -r 9d047fb99e38 tools/python/xen/xm/main.py
--- a/tools/python/xen/xm/main.py Fri Sep 30 16:37:52 2005
+++ b/tools/python/xen/xm/main.py Fri Sep 30 09:47:51 2005
@@ -32,6 +32,7 @@
warnings.filterwarnings('ignore', category=FutureWarning)
from xen.xend import PrettyPrint
from xen.xend import sxp
+from xen.xend.XendClient import XendError
from xen.xm.opts import *
shorthelp = """Usage: xm <subcommand> [args]
Control, list, and manipulate Xen guest instances
@@ -385,14 +386,24 @@
name = args[0]
from xen.xend.XendClient import server
- dom = server.xend_domain(name)
+ try:
+ dom = server.xend_domain(name)
+ except XendError, e:
+ err("Unable to get info for domain %s" % name)
+ sys.exit(1)
+
print sxp.child_value(dom, 'domid')
def xm_domname(args):
name = args[0]
from xen.xend.XendClient import server
- dom = server.xend_domain(name)
+ try:
+ dom = server.xend_domain(name)
+ except XendError, e:
+ err("Unable to get info for domain %s" % name)
+ sys.exit(1)
+
print sxp.child_value(dom, 'name')
def xm_sched_bvt(args):
@@ -687,7 +698,6 @@
args = argv[2:]
if cmd:
try:
- from xen.xend.XendClient import XendError
rc = cmd(args)
if rc:
usage()
[-- Attachment #3: Type: text/plain, Size: 87 bytes --]
--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
[-- Attachment #4: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
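The pattern the patch introduces (update() reports whether the domain still exists, and the caller raises instead of rendering stale data) can be sketched in isolation. This is a minimal illustration in modern Python, not the xend code: `XendError` is redefined locally, and `LIVE_DOMAINS`, `DomainInfo` and `show_domain` are hypothetical stand-ins; only `dom_get`, `update` and `getName` echo names from the diff.

```python
class XendError(Exception):
    """Local stand-in for xend's XendError so the sketch runs on its own."""


# Hypothetical stand-in for the hypervisor's view: domid -> info dict.
LIVE_DOMAINS = {1: {"name": "guest1", "state": "running"}}


def dom_get(domid):
    """Stand-in for the lookup the patch calls dom_get(); None if the domain is gone."""
    return LIVE_DOMAINS.get(domid)


class DomainInfo:
    """Hypothetical cut-down cache of one domain's info."""

    def __init__(self, domid, name):
        self.domid = domid
        self.info = {"name": name}

    def getName(self):
        return self.info["name"]

    def update(self, info=None):
        """Refresh cached info; return False rather than returning silently."""
        if not info:
            info = dom_get(self.domid)
            if not info:
                return False
        self.info.update(info)
        return True


def show_domain(dom):
    """Mirrors the SrvDomain change: refuse to render a domain that has vanished."""
    if not dom.update():
        raise XendError("Domain %s no longer exists" % dom.getName())
    return dict(dom.info)
```

With this shape, a client such as 'xm domid' can catch the error and exit non-zero instead of printing information for a domain that no longer exists.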
* Re: [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
2005-09-30 16:54 [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}' Dan Smith
@ 2005-10-01 10:54 ` Ewan Mellor
2005-10-01 15:33 ` Dan Smith
2005-10-04 7:36 ` Ewan Mellor
1 sibling, 1 reply; 8+ messages in thread
From: Ewan Mellor @ 2005-10-01 10:54 UTC
To: xen-devel
On Fri, Sep 30, 2005 at 09:54:55AM -0700, Dan Smith wrote:
>
> This is a resend of my stale state fix, which is yet unapplied. If
> there are issues, please let me know.
>
> Note that this fixes the issue poked by xm-test, as shown in the
> following snippet of David's latest FC3pae.report:
>
> > FAIL: 01_shutdown_basic_pos
> > I had to run an xm list to update xend state!
Sorry Dan, I didn't mean to sit on this patch. The thing is, it solves the
problem by making sure that SrvDomain can cope with stale domains being returned
by XendDomain, but I _really_ don't want XendDomain to be returning stale
information in the first place. I've been trying to decide how easy it would
be to fix the underlying problem -- if it's going to take a long time, then I'll
apply your patch as a workaround, but I hope to solve the problem more
definitively.
If I haven't fixed it by Monday, I'll apply your patch.
Thanks,
Ewan.
* Re: [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
2005-10-01 10:54 ` Ewan Mellor
@ 2005-10-01 15:33 ` Dan Smith
2005-10-01 19:40 ` Anthony Liguori
0 siblings, 1 reply; 8+ messages in thread
From: Dan Smith @ 2005-10-01 15:33 UTC
To: Ewan Mellor; +Cc: xen-devel
EM> The thing is, it solves the problem by making sure that SrvDomain
EM> can cope with stale domains being returned by XendDomain, but I
EM> _really_ don't want XendDomain to be returning stale information
EM> in the first place.
I agree. I recently submitted a patch that would cause XendDomainInfo
to destroy itself whenever it realized that it was stale. Christian
didn't like the idea of random places modifying the Xend state. So,
in this patch, I just handled the stale state instead of returning the
bogus information, and without modifying the state itself.
EM> I've been trying to decide how easy it would be to fix the
EM> underlying problem -- if it's going to take a long time, then I'll
EM> apply your patch as a workaround, but I hope to solve the problem
EM> more definitively.
I think the solution (or best fix) is to standardize on the fact that
we always update domain information right before we return it, and
purge that information if needed. I think that "xm list" triggers
this somewhere deep inside xend, as I can always purge the stale data
by running an "xm list". This tells me that some async signals aren't
always being sent to clean up, which means "xm list" has to trigger
it. I *think* Anthony had a comment about polling being necessary in
this case for some reason, so perhaps he can chime in and explain.
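The "update right before returning, purge if needed" idea can be sketched as a small registry. This is only an illustration under stated assumptions: `DomainRegistry`, `probe`, `add` and `lookup` are hypothetical names, not xend's API, and the probe callable stands in for whatever asks the hypervisor about a domain.

```python
class DomainRegistry:
    """Hypothetical cache that refreshes an entry on every lookup."""

    def __init__(self, probe):
        # probe: callable taking a domid and returning an info dict,
        # or None when the domain no longer exists.
        self._probe = probe
        self._domains = {}

    def add(self, domid, info):
        self._domains[domid] = dict(info)

    def lookup(self, domid):
        """Return fresh info for domid, purging the entry if it has gone stale."""
        if domid not in self._domains:
            return None
        fresh = self._probe(domid)
        if fresh is None:
            # The domain disappeared behind our back: drop the cached copy.
            del self._domains[domid]
            return None
        self._domains[domid].update(fresh)
        return self._domains[domid]
```

The trade-off is one probe per lookup, in exchange for never handing a caller information about a domain that has already gone away.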
Thanks Ewan!
--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
* Re: [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
2005-10-01 15:33 ` Dan Smith
@ 2005-10-01 19:40 ` Anthony Liguori
0 siblings, 0 replies; 8+ messages in thread
From: Anthony Liguori @ 2005-10-01 19:40 UTC
To: Dan Smith; +Cc: xen-devel, Ewan Mellor
Dan Smith wrote:
>EM> I've been trying to decide how easy it would be to fix the
>EM> underlying problem -- if it's going to take a long time, then I'll
>EM> apply your patch as a workaround, but I hope to solve the problem
>EM> more definitively.
>
>I think the solution (or best fix) is to standardize on the fact that
>we always update domain information right before we return it, and
>purge that information if needed. I think that "xm list" triggers
>this somewhere deep inside xend, as I can always purge the stale data
>by running an "xm list". This tells me that some async signals aren't
>always being sent to clean up, which means "xm list" has to trigger
>it. I *think* Anthony had a comment about polling being necessary in
>this case for some reason, so perhaps he can chime in and explain.
>
>
Actually, Keir made a recent change that will cause the @releaseDomain
notification to go out when the domain disappears which was the thing
that necessitated polling before.
As long as Xend updates its state on every @introduceDomain and
@releaseDomain watch, it should always be up-to-date (barring the
obvious scheduling race between Xend and XenStore--but that only allows
a stale domain state window of a few 10s of milliseconds at worst).
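A rough sketch of that watch-driven approach, assuming a callback that receives the path of each fired watch. `WatchedDomainState`, `on_watch`, `resync` and `list_live_domids` are hypothetical names for illustration; no real XenStore binding is used here, only the two special watch paths named above.

```python
class WatchedDomainState:
    """Hypothetical in-memory domain set kept current by watch callbacks."""

    SPECIAL_PATHS = ("@introduceDomain", "@releaseDomain")

    def __init__(self, list_live_domids):
        # list_live_domids: callable returning an iterable of live domids
        # (a stand-in for asking the hypervisor which domains exist).
        self._list_live = list_live_domids
        self.known = set()

    def on_watch(self, path):
        """Resync only when one of the two special watches fires."""
        if path in self.SPECIAL_PATHS:
            return self.resync()
        return None

    def resync(self):
        """Replace the known set with the live one; report what changed."""
        live = set(self._list_live())
        added = live - self.known
        removed = self.known - live
        self.known = live
        return added, removed
```

Between a domain disappearing and the watch being delivered, `known` is still briefly stale, which is the short scheduling window mentioned above.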
What would be really ideal is to do away completely with the Xend
internal state and just always pull things from the store. This is
probably too big of a change for 3.0 though.
Regards,
Anthony Liguori
>Thanks Ewan!
>
>
>
* Re: [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
2005-09-30 16:54 [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}' Dan Smith
2005-10-01 10:54 ` Ewan Mellor
@ 2005-10-04 7:36 ` Ewan Mellor
2005-10-04 13:41 ` Dan Smith
2005-10-04 17:35 ` Dan Smith
1 sibling, 2 replies; 8+ messages in thread
From: Ewan Mellor @ 2005-10-04 7:36 UTC
To: xen-devel
On Fri, Sep 30, 2005 at 09:54:55AM -0700, Dan Smith wrote:
>
> This is a resend of my stale state fix, which is yet unapplied. If
> there are issues, please let me know.
>
> Note that this fixes the issue poked by xm-test, as shown in the
> following snippet of David's latest FC3pae.report:
>
> > FAIL: 01_shutdown_basic_pos
> > I had to run an xm list to update xend state!
Hi Dan,
I made a big change yesterday to XendDomain to make it thread-safe. As far as
I can tell, most of the problems that you've been seeing were caused by
watches firing and modifying XendDomain internal state at the same time as
each other and as the xm commands. This meant that it was pretty easy to
confuse Xend into thinking that domains existed when they didn't and vice
versa.
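The kind of race described here, with watch callbacks and command handlers touching the same table concurrently, is usually closed by serializing access with a lock. The following is a generic sketch of that technique, not the actual XendDomain change; `LockedDomainTable` and its methods are hypothetical names.

```python
import threading


class LockedDomainTable:
    """Hypothetical domain table whose every access holds one lock."""

    def __init__(self):
        # RLock so a method holding the lock can call another locked method.
        self._lock = threading.RLock()
        self._domains = {}

    def on_introduce(self, domid, name):
        """Called from a watch thread when a domain appears."""
        with self._lock:
            self._domains[domid] = name

    def on_release(self, domid):
        """Called from a watch thread when a domain disappears."""
        with self._lock:
            self._domains.pop(domid, None)

    def snapshot(self):
        """Called from a command handler; returns a copy taken under the lock."""
        with self._lock:
            return dict(self._domains)
```

Returning a copy from `snapshot` means a command handler never iterates over the table while a watch thread is modifying it.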
I would be grateful if you could re-run xm-test and let me know how it looks.
There might still be some bugs to iron out, but hopefully you will find that
the behaviour under xm-test is much improved.
We've got someone working right now on integrating xm-test with our automated
test/build infrastructure here, so I expect to be able to run all your tests
myself soon, but I would also appreciate your feedback on this.
Thanks,
Ewan.
* Re: [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
2005-10-04 7:36 ` Ewan Mellor
@ 2005-10-04 13:41 ` Dan Smith
2005-10-04 17:35 ` Dan Smith
1 sibling, 0 replies; 8+ messages in thread
From: Dan Smith @ 2005-10-04 13:41 UTC
To: Ewan Mellor; +Cc: Xen Developers
EM> I made a big change yesterday to XendDomain to make it
EM> thread-safe. As far as I can tell, most of the problems that
EM> you've been seeing were caused by watches firing and modifying
EM> XendDomain internal state at the same time as each other and as
EM> the xm commands. This meant that it was pretty easy to confuse
EM> Xend into thinking that domains existed when they didn't and vice
EM> versa.
Sounds about right :)
EM> I would be grateful if you could re-run xm-test and let me know
EM> how it looks. There might still be some bugs to iron out, but
EM> hopefully you will find that the behaviour under xm-test is much
EM> improved.
Absolutely. I will run it today and post my findings. It would be
great if others "out there" could do the same to help verify that the
problem is better or fixed. Since threads are involved, many tests
across varying platforms will be more convincing :)
EM> We've got someone working right now on integrating xm-test with
EM> our automated test/build infrastructure here, so I expect to be
EM> able to run all your tests myself soon, but I would also
EM> appreciate your feedback on this.
That is excellent news. The latest version generates an XML file of
results that it automatically submits to us for review. It would be
great if you could use that as your data source for merging the data
into your own test infrastructure.
--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
* Re: [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
2005-10-04 7:36 ` Ewan Mellor
2005-10-04 13:41 ` Dan Smith
@ 2005-10-04 17:35 ` Dan Smith
2005-10-04 23:14 ` Ewan Mellor
1 sibling, 1 reply; 8+ messages in thread
From: Dan Smith @ 2005-10-04 17:35 UTC
To: Ewan Mellor; +Cc: xen-devel
EM> I would be grateful if you could re-run xm-test and let me know
EM> how it looks. There might still be some bugs to iron out, but
EM> hopefully you will find that the behaviour under xm-test is much
EM> improved.
After two runs of xm-test, I'm not seeing failures on either of the
tests that poke specific stale-state issues. That's good news :)
I'll see about writing some more tests targeted at stale-state
detection, just to make sure ;)
--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
* Re: [PATCH][RESEND] Fix stale-state issue with 'xm dom{id, name}'
2005-10-04 17:35 ` Dan Smith
@ 2005-10-04 23:14 ` Ewan Mellor
0 siblings, 0 replies; 8+ messages in thread
From: Ewan Mellor @ 2005-10-04 23:14 UTC
To: xen-devel
On Tue, Oct 04, 2005 at 10:35:04AM -0700, Dan Smith wrote:
> EM> I would be grateful if you could re-run xm-test and let me know
> EM> how it looks. There might still be some bugs to iron out, but
> EM> hopefully you will find that the behaviour under xm-test is much
> EM> improved.
>
> After two runs of xm-test, I'm not seeing failures on either of the
> tests that poke specific stale-state issues. That's good news :)
>
> I'll see about writing some more tests targeted at stale-state
> detection, just to make sure ;)
Great, thanks a lot Dan, I appreciate it.
Ewan.