* xenbus stress testing
@ 2011-02-17 9:01 James Harper
2011-02-17 19:06 ` Ian Jackson
0 siblings, 1 reply; 12+ messages in thread
From: James Harper @ 2011-02-17 9:01 UTC (permalink / raw)
To: xen-devel
Is there a simple way to stress test xenbus from DomU? In particular,
the sending of partial messages.
Thanks
James
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: xenbus stress testing
2011-02-17 9:01 xenbus stress testing James Harper
@ 2011-02-17 19:06 ` Ian Jackson
2011-02-17 22:17 ` James Harper
0 siblings, 1 reply; 12+ messages in thread
From: Ian Jackson @ 2011-02-17 19:06 UTC (permalink / raw)
To: James Harper; +Cc: xen-devel
James Harper writes ("[Xen-devel] xenbus stress testing"):
> Is there a simple way to stress test xenbus from DomU? In particular,
> the sending of partial messages.
Not without messing with the domU kernel. The xenbus driver in the
domU kernel is responsible for actually formatting the messages to
and from the xenstore shared ring; domU userland, if it talks to
xenstore at all, just talks to its kernel.
Ian.
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: xenbus stress testing
2011-02-17 19:06 ` Ian Jackson
@ 2011-02-17 22:17 ` James Harper
2011-02-18 11:05 ` Olaf Hering
2011-02-18 12:25 ` Ian Jackson
0 siblings, 2 replies; 12+ messages in thread
From: James Harper @ 2011-02-17 22:17 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel
>
> James Harper writes ("[Xen-devel] xenbus stress testing"):
> > Is there a simple way to stress test xenbus from DomU? In particular,
> > the sending of partial messages.
>
> Not without messing with the domU kernel. The xenbus driver in the
> domU kernel is responsible for actually formatting the messages to
> and from the xenstore shared ring; domU userland, if it talks to
> xenstore at all, just talks to its kernel.
>
I think I have found the error, and it was probably a one-in-a-million
race, so stress testing might not have helped anyway. My code went:
    len = min(ring->rsp_prod - ring->rsp_cons, msg_size);
and the ASSERT was hit because len was > msg_size. The only possible
way I can see that happening is if ring->rsp_prod changed between
the comparison in the min() and the assignment. I'm now snapshotting
rsp_prod to a local variable at the start. Kind of embarrassing really,
as plenty of example code exists.
For vif and vbd that sort of race would be hit fairly often, but
xenbus is used much less; I've only ever had one bug report from it.
James
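
The double read described in this message can be reproduced deterministically. What follows is a minimal sketch, assuming min() is a macro that expands its arguments textually, as the classic `((a) < (b) ? (a) : (b))` definition does. The names `MIN`, `fake_prod`, `read_prod()`, `racy_len()` and `safe_len()` are hypothetical and exist only for illustration; `read_prod()` stands in for a load of ring->rsp_prod while the other domain keeps advancing it.

```c
#include <assert.h>
#include <stdint.h>

/* Classic macro form: whichever argument "wins" is evaluated a second time. */
#define MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Hypothetical stand-in for ring->rsp_prod as seen from the consumer. */
static uint32_t fake_prod;

/* Each call models one load of the shared index; the "other domain"
 * produces 100 more bytes between any two loads. */
static uint32_t read_prod(void)
{
    uint32_t seen = fake_prod;
    fake_prod += 100;
    return seen;
}

/* Buggy: the macro loads the producer index once for the comparison and
 * again for the result, so the result can exceed msg_size. */
static uint32_t racy_len(uint32_t cons, uint32_t msg_size)
{
    return MIN(read_prod() - cons, msg_size);
}

/* Fixed: load the producer index once, then work only on the local copy. */
static uint32_t safe_len(uint32_t cons, uint32_t msg_size)
{
    uint32_t avail = read_prod() - cons;
    uint32_t len = MIN(avail, msg_size);

    assert(len <= msg_size);
    return len;
}
```

In the racy variant the comparison sees the old index and the result uses the new one, which is exactly the window in which len can end up larger than msg_size.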
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: xenbus stress testing
2011-02-17 22:17 ` James Harper
@ 2011-02-18 11:05 ` Olaf Hering
2011-02-18 11:43 ` Paul Durrant
2011-02-18 12:25 ` Ian Jackson
1 sibling, 1 reply; 12+ messages in thread
From: Olaf Hering @ 2011-02-18 11:05 UTC (permalink / raw)
To: James Harper; +Cc: xen-devel, Ian Jackson
On Fri, Feb 18, James Harper wrote:
> I think I have found the error and it was probably a 1 in a million race
> so stress testing might not have helped anyway. My code went:
>
> len = min(ring->rsp_prod - ring->rsp_cons, msg_size)
>
> and the ASSERT was hit because len was > msg_size, and the only possible
> way I can ever see that happening is if ring->rsp_prod changed between
> the if in the min() and the assignment. I'm now snapshotting rsp_prod to
> a local variable at the start. Kind of embarrassing really as plenty of
> example code exists.
Why is there no lock to protect the ring accesses?
Olaf
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: xenbus stress testing
2011-02-18 11:05 ` Olaf Hering
@ 2011-02-18 11:43 ` Paul Durrant
2011-02-18 12:01 ` Olaf Hering
0 siblings, 1 reply; 12+ messages in thread
From: Paul Durrant @ 2011-02-18 11:43 UTC (permalink / raw)
To: Olaf Hering, James Harper; +Cc: Ian Jackson, xen-devel@lists.xensource.com
How do you propose to lock against another VM updating a counter in shared memory?
Paul
> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-
> bounces@lists.xensource.com] On Behalf Of Olaf Hering
> Sent: 18 February 2011 11:05
> To: James Harper
> Cc: xen-devel@lists.xensource.com; Ian Jackson
> Subject: Re: [Xen-devel] xenbus stress testing
>
> On Fri, Feb 18, James Harper wrote:
>
> > I think I have found the error and it was probably a 1 in a million
> > race so stress testing might not have helped anyway. My code went:
> >
> > len = min(ring->rsp_prod - ring->rsp_cons, msg_size)
> >
> > and the ASSERT was hit because len was > msg_size, and the only
> > possible way I can ever see that happening is if ring->rsp_prod
> > changed between the if in the min() and the assignment. I'm now
> > snapshotting rsp_prod to a local variable at the start. Kind of
> > embarrassing really as plenty of example code exists.
>
> Why is there no lock to protect the ring accesses?
>
> Olaf
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: xenbus stress testing
2011-02-18 11:43 ` Paul Durrant
@ 2011-02-18 12:01 ` Olaf Hering
2011-02-18 12:43 ` James Harper
0 siblings, 1 reply; 12+ messages in thread
From: Olaf Hering @ 2011-02-18 12:01 UTC (permalink / raw)
To: Paul Durrant; +Cc: James Harper, Ian Jackson, xen-devel@lists.xensource.com
On Fri, Feb 18, Paul Durrant wrote:
> How do you propose to lock against another VM updating a counter in shared memory?
I thought xenstore uses the same ringbuffer as defined by
DEFINE_RING_TYPES(). But reading the code in
include/xen/interface/io/xs_wire.h shows it has its own logic.
And reading further in xenpaging code, which I had in mind while I wrote
the mail, shows there is a separate spinlock.
I can imagine that even if a lock exists, a VM crashing while holding
the lock can stall all other VMs. So it's probably not the best idea.
Olaf
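
The "own logic" mentioned in this message is, roughly, a pair of byte rings with free-running producer and consumer indices that are masked only when used as buffer offsets. The following is a minimal sketch modelled on that idea, not a copy of xs_wire.h: the `demo_`/`DEMO_` names and the 1024-byte size are assumptions for illustration, and the real header should be consulted for the actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring size; it must be a power of two so masking works. */
#define DEMO_XS_RING_SIZE 1024u
static_assert((DEMO_XS_RING_SIZE & (DEMO_XS_RING_SIZE - 1u)) == 0,
              "ring size must be a power of two");

typedef uint32_t demo_ring_idx;

/* Indices run freely and are reduced to a buffer offset only on use. */
#define DEMO_MASK_IDX(idx) ((idx) & (DEMO_XS_RING_SIZE - 1u))

/* Hypothetical layout of the page shared between the two domains. */
struct demo_xs_interface {
    char req[DEMO_XS_RING_SIZE];      /* requests to the xenstore daemon */
    char rsp[DEMO_XS_RING_SIZE];      /* replies coming back */
    demo_ring_idx req_cons, req_prod;
    demo_ring_idx rsp_cons, rsp_prod;
};

/* Bytes available to the consumer; unsigned subtraction handles the
 * case where the indices have wrapped past 2^32. */
static uint32_t demo_avail(demo_ring_idx prod, demo_ring_idx cons)
{
    return prod - cons;
}

/* The other domain is untrusted, so index sanity is checked, not assumed. */
static int demo_indexes_ok(demo_ring_idx prod, demo_ring_idx cons)
{
    return demo_avail(prod, cons) <= DEMO_XS_RING_SIZE;
}
```

Each index has exactly one writer (the producer or the consumer), which is what allows the scheme to work without a lock shared between domains.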
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: xenbus stress testing
2011-02-18 12:01 ` Olaf Hering
@ 2011-02-18 12:43 ` James Harper
0 siblings, 0 replies; 12+ messages in thread
From: James Harper @ 2011-02-18 12:43 UTC (permalink / raw)
To: Olaf Hering, Paul Durrant; +Cc: xen-devel, Ian Jackson
>
> On Fri, Feb 18, Paul Durrant wrote:
>
> > How do you propose to lock against another VM updating a counter in shared memory?
>
> I thought xenstore uses the same ringbuffer as defined by
> DEFINE_RING_TYPES(). But reading the code in
> include/xen/interface/io/xs_wire.h shows it has its own logic.
>
> And reading further in xenpaging code, which I had in mind while I wrote
> the mail, shows there is a separate spinlock.
>
> I can imagine that even if a lock exists, a VM crashing while holding
> the lock can stall all other VMs. So it's probably not the best idea.
>
It's also inefficient when a perfectly good lock-free method exists.
James
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: xenbus stress testing
2011-02-17 22:17 ` James Harper
2011-02-18 11:05 ` Olaf Hering
@ 2011-02-18 12:25 ` Ian Jackson
2011-02-18 12:42 ` James Harper
1 sibling, 1 reply; 12+ messages in thread
From: Ian Jackson @ 2011-02-18 12:25 UTC (permalink / raw)
To: James Harper; +Cc: xen-devel@lists.xensource.com
James Harper writes ("RE: [Xen-devel] xenbus stress testing"):
> I think I have found the error and it was probably a 1 in a million race
> so stress testing might not have helped anyway. My code went:
>
> len = min(ring->rsp_prod - ring->rsp_cons, msg_size)
>
> and the ASSERT was hit because len was > msg_size, and the only possible
> way I can ever see that happening is if ring->rsp_prod changed between
> the if in the min() and the assignment. I'm now snapshotting rsp_prod to
> a local variable at the start. Kind of embarrassing really as plenty of
> example code exists.
You need to think about memory barriers and/or volatile. Simply
"snapshotting" with an ordinary assignment doesn't work.
I don't know how this is done in Windows but the Linux kernel has a
clear explanation of the problem and how it's solved in Linux. Look
in the kernel source tree in Documentation/memory-barriers.txt.
Ian.
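
The reason an ordinary assignment may not be enough is that a C compiler is free to discard the local copy and load the shared field again wherever the local is used, which reintroduces the double read. One common remedy, in the spirit of the Linux kernel's ACCESS_ONCE() of that era, is to perform the load through a volatile-qualified pointer. The following is a minimal sketch under that assumption; `read_once_u32()` and `clamp_len()` are hypothetical names. A volatile load only guarantees a single access: it does not order later reads of the ring data, so a barrier is still needed for that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Force exactly one load from a location another party may modify. */
static inline uint32_t read_once_u32(const uint32_t *shared)
{
    assert(shared != NULL);
    return *(const volatile uint32_t *)shared;
}

/* Compute the copy length from single snapshots of both indices. */
static uint32_t clamp_len(const uint32_t *rsp_prod, const uint32_t *rsp_cons,
                          uint32_t msg_size)
{
    uint32_t prod = read_once_u32(rsp_prod);
    uint32_t cons = read_once_u32(rsp_cons);
    uint32_t avail = prod - cons;   /* unsigned wrap-around is intended */

    return avail < msg_size ? avail : msg_size;
}
```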
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: xenbus stress testing
2011-02-18 12:25 ` Ian Jackson
@ 2011-02-18 12:42 ` James Harper
2011-02-18 13:08 ` Keir Fraser
2011-02-18 15:30 ` Ian Jackson
0 siblings, 2 replies; 12+ messages in thread
From: James Harper @ 2011-02-18 12:42 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel
>
> James Harper writes ("RE: [Xen-devel] xenbus stress testing"):
> > I think I have found the error and it was probably a 1 in a million race
> > so stress testing might not have helped anyway. My code went:
> >
> > len = min(ring->rsp_prod - ring->rsp_cons, msg_size)
> >
> > and the ASSERT was hit because len was > msg_size, and the only possible
> > way I can ever see that happening is if ring->rsp_prod changed between
> > the if in the min() and the assignment. I'm now snapshotting rsp_prod to
> > a local variable at the start. Kind of embarrassing really as plenty of
> > example code exists.
>
> You need to think about memory barriers and/or volatile. Simply
> "snapshotting" with an ordinary assignment doesn't work.
>
> I don't know how this is done in Windows but the Linux kernel has a
> clear explanation of the problem and how it's solved in Linux. Look
> in the kernel source tree in Documentation/memory-barriers.txt.
>
I issue a barrier (KeMemoryBarrier(), which is both a compiler and a
memory barrier) after copying rsp_prod, e.g.:
    rsp_prod = ring->rsp_prod;
    KeMemoryBarrier();
    /* ... access the actual ring buffer ... */
Is there anything else required?
James
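
For readers without the Windows DDK, the snapshot-then-barrier pattern in this message can be approximated in portable C11. This is a sketch, not the Windows code: the acquire fence plays the role the message assigns to KeMemoryBarrier(), and `struct demo_ring`, `DEMO_RING_SIZE`, `demo_ring_init()`, `demo_produce()` and `demo_consume()` are hypothetical names. A real xenstore client would also have to validate the indices supplied by the other domain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 16u   /* hypothetical; must be a power of two */

struct demo_ring {
    char rsp[DEMO_RING_SIZE];
    _Atomic uint32_t rsp_cons;   /* written only by the consumer */
    _Atomic uint32_t rsp_prod;   /* written only by the producer */
};

static void demo_ring_init(struct demo_ring *r)
{
    atomic_init(&r->rsp_cons, 0);
    atomic_init(&r->rsp_prod, 0);
}

/* Producer: write the data first, then publish the new index. */
static uint32_t demo_produce(struct demo_ring *r, const char *src, uint32_t len)
{
    uint32_t prod = atomic_load_explicit(&r->rsp_prod, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->rsp_cons, memory_order_acquire);
    uint32_t space = DEMO_RING_SIZE - (prod - cons);

    if (len > space)
        len = space;
    for (uint32_t i = 0; i < len; i++)
        r->rsp[(prod + i) & (DEMO_RING_SIZE - 1u)] = src[i];
    atomic_store_explicit(&r->rsp_prod, prod + len, memory_order_release);
    return len;
}

/* Consumer: snapshot the index once, fence, then read the data. */
static uint32_t demo_consume(struct demo_ring *r, char *dst, uint32_t max)
{
    uint32_t cons = atomic_load_explicit(&r->rsp_cons, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->rsp_prod, memory_order_relaxed);

    /* Data reads below must not be reordered before the index load. */
    atomic_thread_fence(memory_order_acquire);

    uint32_t avail = prod - cons;
    uint32_t len = avail < max ? avail : max;

    assert(len <= max);
    for (uint32_t i = 0; i < len; i++)
        dst[i] = r->rsp[(cons + i) & (DEMO_RING_SIZE - 1u)];
    /* Release: the copy is finished before the space is handed back. */
    atomic_store_explicit(&r->rsp_cons, cons + len, memory_order_release);
    return len;
}
```

Only the local snapshot `prod` is used after the fence, so a concurrent update of rsp_prod can make the consumer see less data than is available, but never more than it has accounted for.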
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: xenbus stress testing
2011-02-18 12:42 ` James Harper
@ 2011-02-18 13:08 ` Keir Fraser
2011-02-18 15:30 ` Ian Jackson
1 sibling, 0 replies; 12+ messages in thread
From: Keir Fraser @ 2011-02-18 13:08 UTC (permalink / raw)
To: James Harper, Ian Jackson; +Cc: xen-devel
On 18/02/2011 12:42, "James Harper" <james.harper@bendigoit.com.au> wrote:
>> You need to think about memory barriers and/or volatile. Simply
>> "snapshotting" with an ordinary assignment doesn't work.
>>
>> I don't know how this is done in Windows but the Linux kernel has a
>> clear explanation of the problem and how it's solved in Linux. Look
>> in the kernel source tree in Documentation/memory-barriers.txt.
>>
>
> I issue a barrier (KeMemoryBarrier() which is a compiler and a memory
> barrier) after copying rsp_prod, eg:
>
> rsp_prod = ring->rsp_prod;
> KeMemoryBarrier();
> Access the actual ring buffer
>
> Is there anything else required?
Should be okay. That's basically what all other xenstore clients are doing.
-- Keir
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: xenbus stress testing
2011-02-18 12:42 ` James Harper
2011-02-18 13:08 ` Keir Fraser
@ 2011-02-18 15:30 ` Ian Jackson
2011-02-18 22:36 ` James Harper
1 sibling, 1 reply; 12+ messages in thread
From: Ian Jackson @ 2011-02-18 15:30 UTC (permalink / raw)
To: James Harper; +Cc: xen-devel
James Harper writes ("RE: [Xen-devel] xenbus stress testing"):
> I issue a barrier (KeMemoryBarrier() which is a compiler and a memory
> barrier) after copying rsp_prod, eg:
>
> rsp_prod = ring->rsp_prod;
> KeMemoryBarrier();
> Access the actual ring buffer
This is probably correct, but it might depend on exactly what
"KeMemoryBarrier" is.
Ian.
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: xenbus stress testing
2011-02-18 15:30 ` Ian Jackson
@ 2011-02-18 22:36 ` James Harper
0 siblings, 0 replies; 12+ messages in thread
From: James Harper @ 2011-02-18 22:36 UTC (permalink / raw)
To: Ian Jackson; +Cc: xen-devel
>
> James Harper writes ("RE: [Xen-devel] xenbus stress testing"):
> > I issue a barrier (KeMemoryBarrier() which is a compiler and a memory
> > barrier) after copying rsp_prod, eg:
> >
> > rsp_prod = ring->rsp_prod;
> > KeMemoryBarrier();
> > Access the actual ring buffer
>
> This is probably correct, but it might depend on exactly what
> "KeMemoryBarrier" is.
>
As above, it's a compiler and a memory barrier.
James
^ permalink raw reply [flat|nested] 12+ messages in thread