* Problems with spin_lock_irqsave() and for_each_online_cpu()
@ 2013-10-17 19:34 Andrew Cooper
2013-10-18 4:57 ` Keir Fraser
2013-10-18 8:19 ` Jan Beulich
0 siblings, 2 replies; 5+ messages in thread
From: Andrew Cooper @ 2013-10-17 19:34 UTC (permalink / raw)
To: Xen-devel List, Keir Fraser, Jan Beulich, Tim Deegan
Hello,
I was triaging a Coverity issue (1055455) which was complaining about
spinlock inbalance in common/trace.c:424.
First of all, there is a latent bug here using "int flags" in a
irqsave/irqrestore. It will be safe until bit 31 of RFLAGS is defined,
but will introduce subtle corruption thereafter.
This bug was not caught by the compiler because of the
spin_lock_irqsave() macro which has slightly non-function-like
semantics. Would it be acceptable to change spin_lock_irqsave() into a
static inline so can be properly typed? (This would come with a huge
amount of code churn as the function would have to take flags by pointer)
As for the spinlocks, while I as the programmer am fairly sure that
"&per_cpu(t_lock, i)" is unlikely to change, coverity takes into account
that updates to __per_cpu_offset[i] have no protection. Because of
RELOC_HIDE(), gcc can't hoist the per_cpu() calculation, but will run
from register cached values of per_cpu__t_lock and __per_cpu_offset[i],
so is guaranteed to find the same lock both times even if
__per_cpu_offset[i] changes.
However, it occurs to me that there is no protection between
for_each_online_cpu() finding a bit set in the cpu_online_map and safely
using per_cpu() to poke at another cpus data. One solution seems to be
to make use of {get,put}_cpu_maps() but this seems overkill especially
as the innards of a for_each_online_cpu() loop only care about one
particular pcpu not disappearing under its feet.
Would it be sensible to have RW locks for each pcpu (specifically not a
per_cpu lock - array of size nr_cpu_ids probably) specifically for the
purpose of playing with another pcpus data? Hopefully this shouldn't
cause too much overhead, but there are a large number of
for_each_online_cpu() calls which would need protecting.
~Andrew
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Problems with spin_lock_irqsave() and for_each_online_cpu()
2013-10-17 19:34 Problems with spin_lock_irqsave() and for_each_online_cpu() Andrew Cooper
@ 2013-10-18 4:57 ` Keir Fraser
2013-10-18 8:19 ` Jan Beulich
1 sibling, 0 replies; 5+ messages in thread
From: Keir Fraser @ 2013-10-18 4:57 UTC (permalink / raw)
To: Andrew Cooper, Xen-devel List, Jan Beulich, Tim Deegan
On 17/10/2013 20:34, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:
> As for the spinlocks, while I as the programmer am fairly sure that
> "&per_cpu(t_lock, i)" is unlikely to change, coverity takes into account
> that updates to __per_cpu_offset[i] have no protection. Because of
> RELOC_HIDE(), gcc can't hoist the per_cpu() calculation, but will run
> from register cached values of per_cpu__t_lock and __per_cpu_offset[i],
> so is guaranteed to find the same lock both times even if
> __per_cpu_offset[i] changes.
>
> However, it occurs to me that there is no protection between
> for_each_online_cpu() finding a bit set in the cpu_online_map and safely
> using per_cpu() to poke at another cpus data. One solution seems to be
> to make use of {get,put}_cpu_maps() but this seems overkill especially
> as the innards of a for_each_online_cpu() loop only care about one
> particular pcpu not disappearing under its feet.
CPUs are taken offline in a stop_machine_run environment, *and* the
__per_cpu_offset[i] change is done via RCU. We're safe here I'm sure.
-- Keir
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Problems with spin_lock_irqsave() and for_each_online_cpu()
2013-10-17 19:34 Problems with spin_lock_irqsave() and for_each_online_cpu() Andrew Cooper
2013-10-18 4:57 ` Keir Fraser
@ 2013-10-18 8:19 ` Jan Beulich
2013-10-18 14:55 ` Andrew Cooper
1 sibling, 1 reply; 5+ messages in thread
From: Jan Beulich @ 2013-10-18 8:19 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan
>>> On 17.10.13 at 21:34, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> I was triaging a Coverity issue (1055455) which was complaining about
> spinlock inbalance in common/trace.c:424.
>
> First of all, there is a latent bug here using "int flags" in a
> irqsave/irqrestore. It will be safe until bit 31 of RFLAGS is defined,
> but will introduce subtle corruption thereafter.
>
> This bug was not caught by the compiler because of the
> spin_lock_irqsave() macro which has slightly non-function-like
> semantics. Would it be acceptable to change spin_lock_irqsave() into a
> static inline so can be properly typed? (This would come with a huge
> amount of code churn as the function would have to take flags by pointer)
Why not simply add
BUILD_BUG_ON(sizeof(f) != sizeof(_spin_lock_irqsave(l)))
(or equivalent in case of header dependency issues) to the
macro?
But then again I don't see the corruption to occur when RFLAGS
beyond bit 31 would become defined: the flags get passed to
local_irq_restore() only, and that one's effect is "defined" to set
IF to the intended state - what it does with the other flags isn't
really defined (and in fact I wonder whether it really is correct
to use a plain POPF there - imagine code fiddling with e.g. DF
at the same time as using the proper abstractions to control IF).
Jan
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Problems with spin_lock_irqsave() and for_each_online_cpu()
2013-10-18 8:19 ` Jan Beulich
@ 2013-10-18 14:55 ` Andrew Cooper
2013-10-18 15:00 ` Jan Beulich
0 siblings, 1 reply; 5+ messages in thread
From: Andrew Cooper @ 2013-10-18 14:55 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Keir Fraser, Tim Deegan
On 18/10/13 09:19, Jan Beulich wrote:
>>>> On 17.10.13 at 21:34, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> I was triaging a Coverity issue (1055455) which was complaining about
>> spinlock inbalance in common/trace.c:424.
>>
>> First of all, there is a latent bug here using "int flags" in a
>> irqsave/irqrestore. It will be safe until bit 31 of RFLAGS is defined,
>> but will introduce subtle corruption thereafter.
>>
>> This bug was not caught by the compiler because of the
>> spin_lock_irqsave() macro which has slightly non-function-like
>> semantics. Would it be acceptable to change spin_lock_irqsave() into a
>> static inline so can be properly typed? (This would come with a huge
>> amount of code churn as the function would have to take flags by pointer)
> Why not simply add
>
> BUILD_BUG_ON(sizeof(f) != sizeof(_spin_lock_irqsave(l)))
>
> (or equivalent in case of header dependency issues) to the
> macro?
>
> But then again I don't see the corruption to occur when RFLAGS
> beyond bit 31 would become defined: the flags get passed to
> local_irq_restore() only, and that one's effect is "defined" to set
> IF to the intended state - what it does with the other flags isn't
> really defined (and in fact I wonder whether it really is correct
> to use a plain POPF there - imagine code fiddling with e.g. DF
> at the same time as using the proper abstractions to control IF).
>
> Jan
>
The BUILD_BUG_ON() is a rather more simple solution to the problem.
Also changing local_irq_restore() to a conditional sti will also be a
good improvement. I shall see about spinning a patch for these issues.
~Andrew
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: Problems with spin_lock_irqsave() and for_each_online_cpu()
2013-10-18 14:55 ` Andrew Cooper
@ 2013-10-18 15:00 ` Jan Beulich
0 siblings, 0 replies; 5+ messages in thread
From: Jan Beulich @ 2013-10-18 15:00 UTC (permalink / raw)
To: Andrew Cooper; +Cc: xen-devel, Keir Fraser, Tim Deegan
>>> On 18.10.13 at 16:55, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Also changing local_irq_restore() to a conditional sti will also be a
> good improvement. I shall see about spinning a patch for these issues.
A conditional sti or cli you meant.
Jan
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2013-10-18 15:00 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-17 19:34 Problems with spin_lock_irqsave() and for_each_online_cpu() Andrew Cooper
2013-10-18 4:57 ` Keir Fraser
2013-10-18 8:19 ` Jan Beulich
2013-10-18 14:55 ` Andrew Cooper
2013-10-18 15:00 ` Jan Beulich
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.