* credit scheduler svc->flags access race?
@ 2011-12-29 7:50 Liu,Yi
0 siblings, 0 replies; 7+ messages in thread
From: Liu,Yi @ 2011-12-29 7:50 UTC (permalink / raw)
To: xen-devel
hi, all
I run 8 VMs on an HP DL380G7 server. These VMs run fine if their scheduler
cap value is 0 (the default), but most of them stop running about one minute
after I set the cap to 42 or any other non-zero value.
After a few weeks of tracking this down, I found that the VMs run fine again
when I modify sched_credit.c as follows:
1. in the csched_acct function:
- svc->flags |= CSCHED_FLAG_VCPU_PARKED;
+ set_bit(0, &svc->flags); /* set CSCHED_FLAG_VCPU_PARKED */
2. in the csched_vcpu_yield function:
- sv->flags |= CSCHED_FLAG_VCPU_YIELD;
+ set_bit(1, &sv->flags); /* set CSCHED_FLAG_VCPU_YIELD */
3. in the csched_vcpu struct:
- uint16_t flags;
+ uint32_t flags; /* required by set_bit */
The VMs also keep running if I set sched_credit_default_yield=1 on the Xen
command line in GRUB's menu.lst.
So, are my modifications correct? Is there an access race when Xen accesses
svc->flags from both csched_acct and csched_vcpu_yield?
Another interesting thing is that the VMs keep running on other servers even
with a non-zero cap value; this happens only on this HP DL380G7 server. I
don't know why.
My Xen version is 4.1.0; the dom0 kernel is 2.6.32.41, pulled from Jeremy's
git tree a few months ago.
--
View this message in context: http://xen.1045712.n5.nabble.com/credit-scheduler-svc-flags-access-race-tp5107044p5107044.html
Sent from the Xen - Dev mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 7+ messages in thread
* credit scheduler svc->flags access race?
@ 2011-12-31 2:55 Liu.yi
2012-01-03 7:33 ` Keir Fraser
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Liu.yi @ 2011-12-31 2:55 UTC (permalink / raw)
To: xen-devel
hi, all
On my HP DL380G7 server, csched_vcpu_yield and csched_acct may access
svc->flags at the same time. When this happens, VMs stop running, because
csched_vcpu_yield's read-modify-write overwrites the CSCHED_FLAG_VCPU_PARKED
bit that csched_acct has just set (my VMs' scheduler cap values are
non-zero).
The VMs run fine if I modify sched_credit.c as follows:
--- xen/common/sched_credit.c 2010-12-10 10:19:45.000000000 +0800
+++ ../../xen-4.1.0/xen/common/sched_credit.c 2010-12-31 10:47:39.000000000 +0800
@@ -135,7 +135,7 @@ struct csched_vcpu {
struct vcpu *vcpu;
atomic_t credit;
s_time_t start_time; /* When we were scheduled (used for credit) */
- uint16_t flags;
+ uint32_t flags;
int16_t pri;
#ifdef CSCHED_STATS
struct {
@@ -787,7 +787,7 @@ csched_vcpu_yield(const struct scheduler
if ( !sched_credit_default_yield )
{
/* Let the scheduler know that this vcpu is trying to yield */
- sv->flags |= CSCHED_FLAG_VCPU_YIELD;
+ set_bit(1, &sv->flags);
}
}
@@ -1086,7 +1086,7 @@ csched_acct(void* dummy)
{
CSCHED_STAT_CRANK(vcpu_park);
vcpu_pause_nosync(svc->vcpu);
- svc->flags |= CSCHED_FLAG_VCPU_PARKED;
+ set_bit(0, &svc->flags);
}
/* Lower bound on credits */
@@ -1111,7 +1111,7 @@ csched_acct(void* dummy)
*/
CSCHED_STAT_CRANK(vcpu_unpark);
vcpu_unpause(svc->vcpu);
- svc->flags &= ~CSCHED_FLAG_VCPU_PARKED;
+ clear_bit(0, &svc->flags);
}
/* Upper bound on credits means VCPU stops earning */
@@ -1337,7 +1337,7 @@ csched_schedule(
* Clear YIELD flag before scheduling out
*/
if ( scurr->flags & CSCHED_FLAG_VCPU_YIELD )
- scurr->flags &= ~(CSCHED_FLAG_VCPU_YIELD);
+ clear_bit(1, &scurr->flags);
Are these modifications correct? Another interesting thing is that the VMs
run fine on other servers even when their credit cap value is non-zero. I
don't know why.
My Xen version is 4.1.0; the dom0 kernel is 2.6.32.41, pulled from Jeremy's
git tree a few months ago.
Thanks
liuyi
--
View this message in context: http://xen.1045712.n5.nabble.com/credit-scheduler-svc-flags-access-race-tp5111504p5111504.html
Sent from the Xen - Dev mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: credit scheduler svc->flags access race?
2011-12-31 2:55 Liu.yi
@ 2012-01-03 7:33 ` Keir Fraser
2012-01-04 23:02 ` George Dunlap
2012-01-05 2:36 ` Liu.yi
2 siblings, 0 replies; 7+ messages in thread
From: Keir Fraser @ 2012-01-03 7:33 UTC (permalink / raw)
To: Liu.yi, xen-devel; +Cc: George Dunlap
On 31/12/2011 02:55, "Liu.yi" <liu.yi24@zte.com.cn> wrote:
> hi, all
> On my HP DL380G7 server, csched_vcpu_yield and csched_acct may access
> svc->flags at the same time. When this happens, VMs stop running, because
> csched_vcpu_yield's read-modify-write overwrites the CSCHED_FLAG_VCPU_PARKED
> bit that csched_acct has just set (my VMs' scheduler cap values are
> non-zero).
> The VMs run fine if I modify sched_credit.c as follows:
Cc'ing George Dunlap. This is probably a good bug fix.
-- Keir
> --- xen/common/sched_credit.c 2010-12-10 10:19:45.000000000 +0800
> +++ ../../xen-4.1.0/xen/common/sched_credit.c 2010-12-31 10:47:39.000000000 +0800
> @@ -135,7 +135,7 @@ struct csched_vcpu {
> struct vcpu *vcpu;
> atomic_t credit;
> s_time_t start_time; /* When we were scheduled (used for credit) */
> - uint16_t flags;
> + uint32_t flags;
> int16_t pri;
> #ifdef CSCHED_STATS
> struct {
> @@ -787,7 +787,7 @@ csched_vcpu_yield(const struct scheduler
> if ( !sched_credit_default_yield )
> {
> /* Let the scheduler know that this vcpu is trying to yield */
> - sv->flags |= CSCHED_FLAG_VCPU_YIELD;
> + set_bit(1, &sv->flags);
> }
> }
>
> @@ -1086,7 +1086,7 @@ csched_acct(void* dummy)
> {
> CSCHED_STAT_CRANK(vcpu_park);
> vcpu_pause_nosync(svc->vcpu);
> - svc->flags |= CSCHED_FLAG_VCPU_PARKED;
> + set_bit(0, &svc->flags);
> }
>
> /* Lower bound on credits */
> @@ -1111,7 +1111,7 @@ csched_acct(void* dummy)
> */
> CSCHED_STAT_CRANK(vcpu_unpark);
> vcpu_unpause(svc->vcpu);
> - svc->flags &= ~CSCHED_FLAG_VCPU_PARKED;
> + clear_bit(0, &svc->flags);
> }
>
> /* Upper bound on credits means VCPU stops earning */
> @@ -1337,7 +1337,7 @@ csched_schedule(
> * Clear YIELD flag before scheduling out
> */
> if ( scurr->flags & CSCHED_FLAG_VCPU_YIELD )
> - scurr->flags &= ~(CSCHED_FLAG_VCPU_YIELD);
> + clear_bit(1, &scurr->flags);
>
> Are these modifications correct? Another interesting thing is that the VMs
> run fine on other servers even when their credit cap value is non-zero. I
> don't know why.
> My Xen version is 4.1.0; the dom0 kernel is 2.6.32.41, pulled from Jeremy's
> git tree a few months ago.
>
> Thanks
>
> liuyi
>
>
> --
> View this message in context:
> http://xen.1045712.n5.nabble.com/credit-scheduler-svc-flags-access-race-tp5111
> 504p5111504.html
> Sent from the Xen - Dev mailing list archive at Nabble.com.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: credit scheduler svc->flags access race?
2011-12-31 2:55 Liu.yi
2012-01-03 7:33 ` Keir Fraser
@ 2012-01-04 23:02 ` George Dunlap
2012-01-05 2:36 ` Liu.yi
2 siblings, 0 replies; 7+ messages in thread
From: George Dunlap @ 2012-01-04 23:02 UTC (permalink / raw)
To: Liu.yi; +Cc: xen-devel
On Fri, Dec 30, 2011 at 9:55 PM, Liu.yi <liu.yi24@zte.com.cn> wrote:
> hi, all
> On my HP DL380G7 server, csched_vcpu_yield and csched_acct may access
> svc->flags at the same time. When this happens, VMs stop running, because
> csched_vcpu_yield's read-modify-write overwrites the CSCHED_FLAG_VCPU_PARKED
> bit that csched_acct has just set (my VMs' scheduler cap values are
> non-zero).
> The VMs run fine if I modify sched_credit.c as follows:
>
> --- xen/common/sched_credit.c 2010-12-10 10:19:45.000000000 +0800
> +++ ../../xen-4.1.0/xen/common/sched_credit.c 2010-12-31
Liu,
Thanks for the patch. Unfortunately it doesn't apply for me. It
needs to be able to apply using either "patch -p1 < patch_file" or "hg
qimport patch_file". There are instructions on how to make such a
patch here:
http://wiki.xen.org/wiki/SubmittingXenPatches
Just from looking at it: You're right, the svc->flags variable is
modified while holding the vcpu scheduling lock in the case of
vcpu_yield, but the scheduler private lock in the case of
csched_acct(). It looks like your solution makes the update atomic --
let me see if that's the best thing. I think the updates shouldn't be
too frequent, so it might be easier than trying to sort out locking.
I'll take a look at changing the locking and get back to you.
-George
> 10:47:39.000000000 +0800
> @@ -135,7 +135,7 @@ struct csched_vcpu {
> struct vcpu *vcpu;
> atomic_t credit;
> s_time_t start_time; /* When we were scheduled (used for credit) */
> - uint16_t flags;
> + uint32_t flags;
> int16_t pri;
> #ifdef CSCHED_STATS
> struct {
> @@ -787,7 +787,7 @@ csched_vcpu_yield(const struct scheduler
> if ( !sched_credit_default_yield )
> {
> /* Let the scheduler know that this vcpu is trying to yield */
> - sv->flags |= CSCHED_FLAG_VCPU_YIELD;
> + set_bit(1, &sv->flags);
> }
> }
>
> @@ -1086,7 +1086,7 @@ csched_acct(void* dummy)
> {
> CSCHED_STAT_CRANK(vcpu_park);
> vcpu_pause_nosync(svc->vcpu);
> - svc->flags |= CSCHED_FLAG_VCPU_PARKED;
> + set_bit(0, &svc->flags);
> }
>
> /* Lower bound on credits */
> @@ -1111,7 +1111,7 @@ csched_acct(void* dummy)
> */
> CSCHED_STAT_CRANK(vcpu_unpark);
> vcpu_unpause(svc->vcpu);
> - svc->flags &= ~CSCHED_FLAG_VCPU_PARKED;
> + clear_bit(0, &svc->flags);
> }
>
> /* Upper bound on credits means VCPU stops earning */
> @@ -1337,7 +1337,7 @@ csched_schedule(
> * Clear YIELD flag before scheduling out
> */
> if ( scurr->flags & CSCHED_FLAG_VCPU_YIELD )
> - scurr->flags &= ~(CSCHED_FLAG_VCPU_YIELD);
> + clear_bit(1, &scurr->flags);
>
> Are these modifications correct? Another interesting thing is that the VMs
> run fine on other servers even when their credit cap value is non-zero. I
> don't know why.
> My Xen version is 4.1.0; the dom0 kernel is 2.6.32.41, pulled from Jeremy's
> git tree a few months ago.
>
> Thanks
>
> liuyi
>
>
> --
> View this message in context: http://xen.1045712.n5.nabble.com/credit-scheduler-svc-flags-access-race-tp5111504p5111504.html
> Sent from the Xen - Dev mailing list archive at Nabble.com.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: credit scheduler svc->flags access race?
2011-12-31 2:55 Liu.yi
2012-01-03 7:33 ` Keir Fraser
2012-01-04 23:02 ` George Dunlap
@ 2012-01-05 2:36 ` Liu.yi
2012-01-05 7:58 ` Jan Beulich
2 siblings, 1 reply; 7+ messages in thread
From: Liu.yi @ 2012-01-05 2:36 UTC (permalink / raw)
To: xen-devel
Hi Keir and George, thanks for looking into this.
An access race on svc->flags seems unavoidable when csched_acct calls
vcpu_pause_nosync while, at the same time, the guest executes a PAUSE
instruction and the resulting hypercall path calls csched_vcpu_yield.
I thought about taking csched_private->lock in csched_vcpu_yield, but
csched_acct holds that lock for a long time, so csched_vcpu_yield could be
blocked for a long time too; that is why I chose set_bit and clear_bit.
Atomic operations using the LOCK prefix (like set_bit and clear_bit) block
all physical CPUs that execute memory accesses. Does this imply that spin
locks are more efficient when their granularity is small?
Sorry for my poor English.
Liuyi
--
View this message in context: http://xen.1045712.n5.nabble.com/credit-scheduler-svc-flags-access-race-tp5111504p5121613.html
Sent from the Xen - Dev mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: credit scheduler svc->flags access race?
2012-01-05 2:36 ` Liu.yi
@ 2012-01-05 7:58 ` Jan Beulich
2012-01-05 8:43 ` Liu.yi
0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2012-01-05 7:58 UTC (permalink / raw)
To: Liu.yi; +Cc: xen-devel
>>> On 05.01.12 at 03:36, "Liu.yi" <liu.yi24@zte.com.cn> wrote:
> Atomic operations using the LOCK prefix (like set_bit and clear_bit) block
> all physical CPUs that execute memory accesses.
... to that same cache line.
> Does this imply that spin
> locks are more efficient when their granularity is small?
No - after all, acquiring a spin lock unavoidably implies using a LOCKed
instruction.
Jan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: credit scheduler svc->flags access race?
2012-01-05 7:58 ` Jan Beulich
@ 2012-01-05 8:43 ` Liu.yi
0 siblings, 0 replies; 7+ messages in thread
From: Liu.yi @ 2012-01-05 8:43 UTC (permalink / raw)
To: xen-devel
Understood, thanks.
--
View this message in context: http://xen.1045712.n5.nabble.com/credit-scheduler-svc-flags-access-race-tp5111504p5122127.html
Sent from the Xen - Dev mailing list archive at Nabble.com.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2012-01-05 8:43 UTC | newest]
Thread overview: 7+ messages
-- links below jump to the message on this page --
2011-12-29 7:50 credit scheduler svc->flags access race? Liu,Yi
-- strict thread matches above, loose matches on Subject: below --
2011-12-31 2:55 Liu.yi
2012-01-03 7:33 ` Keir Fraser
2012-01-04 23:02 ` George Dunlap
2012-01-05 2:36 ` Liu.yi
2012-01-05 7:58 ` Jan Beulich
2012-01-05 8:43 ` Liu.yi