* [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
@ 2019-05-24 16:53 Waiman Long
  2019-05-24 17:19 ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2019-05-24 16:53 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin
  Cc: linux-kernel, x86, Davidlohr Bueso, Linus Torvalds, Tim Chen,
	huang ying, Waiman Long

The kernel test robot has reported that the use of __this_cpu_add()
causes bug messages like:

  BUG: using __this_cpu_add() in preemptible [00000000] code: ...

This is only an issue on a preempt kernel, where preemption can happen in
the middle of a percpu operation. We still use __this_cpu_*() on a
!preempt kernel to avoid additional overhead in case CONFIG_PREEMPT_COUNT
is set.

 v2: Simplify the condition to just preempt or !preempt.

Fixes: a8654596f0371 ("locking/rwsem: Enable lock event counting")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lock_events.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/lock_events.h b/kernel/locking/lock_events.h
index feb1acc54611..05f34068ec06 100644
--- a/kernel/locking/lock_events.h
+++ b/kernel/locking/lock_events.h
@@ -30,13 +30,32 @@ enum lock_events {
  */
 DECLARE_PER_CPU(unsigned long, lockevents[lockevent_num]);
 
+/*
+ * The purpose of the lock event counting subsystem is to provide a low
+ * overhead way to record the number of specific locking events by using
+ * percpu counters. It is the percpu sum that matters, not specifically
+ * how many of them happen on each CPU.
+ *
+ * In a !preempt kernel, we can just use __this_cpu_*() as preemption
+ * won't happen in the middle of the percpu operation. In a preempt kernel,
+ * preemption in the middle of the percpu operation may produce an
+ * incorrect result.
+ */
+#ifdef CONFIG_PREEMPT
+#define lockevent_percpu_inc(x)		this_cpu_inc(x)
+#define lockevent_percpu_add(x, v)	this_cpu_add(x, v)
+#else
+#define lockevent_percpu_inc(x)		__this_cpu_inc(x)
+#define lockevent_percpu_add(x, v)	__this_cpu_add(x, v)
+#endif
+
 /*
  * Increment the PV qspinlock statistical counters
  */
 static inline void __lockevent_inc(enum lock_events event, bool cond)
 {
 	if (cond)
-		__this_cpu_inc(lockevents[event]);
+		lockevent_percpu_inc(lockevents[event]);
 }
 
 #define lockevent_inc(ev)	  __lockevent_inc(LOCKEVENT_ ##ev, true)
@@ -44,7 +63,7 @@ static inline void __lockevent_inc(enum lock_events event, bool cond)
 
 static inline void __lockevent_add(enum lock_events event, int inc)
 {
-	__this_cpu_add(lockevents[event], inc);
+	lockevent_percpu_add(lockevents[event], inc);
 }
 
 #define lockevent_add(ev, c)	__lockevent_add(LOCKEVENT_ ##ev, c)
-- 
2.18.1



* Re: [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
  2019-05-24 16:53 [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary Waiman Long
@ 2019-05-24 17:19 ` Will Deacon
  2019-05-24 17:27   ` Linus Torvalds
  2019-05-24 17:28   ` Waiman Long
  0 siblings, 2 replies; 8+ messages in thread
From: Will Deacon @ 2019-05-24 17:19 UTC (permalink / raw)
  To: Waiman Long
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, linux-kernel, x86, Davidlohr Bueso,
	Linus Torvalds, Tim Chen, huang ying

On Fri, May 24, 2019 at 12:53:46PM -0400, Waiman Long wrote:
> The kernel test robot has reported that the use of __this_cpu_add()
> causes bug messages like:
> 
>   BUG: using __this_cpu_add() in preemptible [00000000] code: ...
> 
> This is only an issue on a preempt kernel, where preemption can happen in
> the middle of a percpu operation. We still use __this_cpu_*() on a
> !preempt kernel to avoid additional overhead in case CONFIG_PREEMPT_COUNT
> is set.
> 
>  v2: Simplify the condition to just preempt or !preempt.
> 
> Fixes: a8654596f0371 ("locking/rwsem: Enable lock event counting")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  kernel/locking/lock_events.h | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/locking/lock_events.h b/kernel/locking/lock_events.h
> index feb1acc54611..05f34068ec06 100644
> --- a/kernel/locking/lock_events.h
> +++ b/kernel/locking/lock_events.h
> @@ -30,13 +30,32 @@ enum lock_events {
>   */
>  DECLARE_PER_CPU(unsigned long, lockevents[lockevent_num]);
>  
> +/*
> + * The purpose of the lock event counting subsystem is to provide a low
> + * overhead way to record the number of specific locking events by using
> + * percpu counters. It is the percpu sum that matters, not specifically
> + * how many of them happen on each CPU.
> + *
> + * In a !preempt kernel, we can just use __this_cpu_*() as preemption
> + * won't happen in the middle of the percpu operation. In a preempt kernel,
> + * preemption in the middle of the percpu operation may produce an
> + * incorrect result.
> + */
> +#ifdef CONFIG_PREEMPT
> +#define lockevent_percpu_inc(x)		this_cpu_inc(x)
> +#define lockevent_percpu_add(x, v)	this_cpu_add(x, v)
> +#else
> +#define lockevent_percpu_inc(x)		__this_cpu_inc(x)
> +#define lockevent_percpu_add(x, v)	__this_cpu_add(x, v)

Are you sure this works wrt IRQs? For example, if I take an interrupt when
trying to update the counter, and then the irq handler takes a qspinlock
which in turn tries to update the counter. Would I lose an update in that
scenario?
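
The lost update described above can be illustrated with a small userspace
model. This is only a sketch, not kernel code: `struct counter_model`,
`model_add()` and `model_interrupted_add()` are hypothetical names, and the
"interrupt" is simulated by calling the nested add between the load and the
store of a non-atomic read-modify-write.

```c
#include <assert.h>

/* Hypothetical stand-in for one CPU's slot of a percpu counter. */
struct counter_model {
	unsigned long val;
};

/* An add that runs to completion without being interrupted. */
static void model_add(struct counter_model *c, unsigned long n)
{
	assert(c);
	c->val += n;
}

/*
 * An add built from separate load and store steps, with an interrupt
 * arriving in between. The handler's own add (irq_n) runs to completion
 * in the gap, then the interrupted store overwrites it.
 */
static void model_interrupted_add(struct counter_model *c, unsigned long n,
				  unsigned long irq_n)
{
	unsigned long tmp;

	assert(c);
	tmp = c->val;		/* load */
	model_add(c, irq_n);	/* IRQ handler updates the same slot */
	c->val = tmp + n;	/* store: the handler's update is lost */
}
```

Starting from zero, `model_interrupted_add(&c, 1, 1)` leaves the counter at 1
although two events were recorded. When the add is a single instruction, there
is no gap between load and store for the handler to land in.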

Will


* Re: [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
  2019-05-24 17:19 ` Will Deacon
@ 2019-05-24 17:27   ` Linus Torvalds
  2019-05-24 17:35     ` Waiman Long
  2019-05-24 17:28   ` Waiman Long
  1 sibling, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2019-05-24 17:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: Waiman Long, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Linux List Kernel Mailing,
	the arch/x86 maintainers, Davidlohr Bueso, Tim Chen, huang ying

On Fri, May 24, 2019 at 10:19 AM Will Deacon <will.deacon@arm.com> wrote:
>
> Are you sure this works wrt IRQs? For example, if I take an interrupt when
> trying to update the counter, and then the irq handler takes a qspinlock
> which in turn tries to update the counter. Would I lose an update in that
> scenario?

Sounds about right.

We might decide that the lock event counters are not necessarily
precise, but just rough guide-line statistics ("close enough in
practice")

But that would imply that it shouldn't be dependent on CONFIG_PREEMPT
at all, and we should always use the double-underscore version, except
without the debug checking.

Maybe the #ifdef should just be CONFIG_PREEMPT_DEBUG, with a comment
saying "we're not exact, but debugging complains, so if you enable
debugging it will be slower and precise". Because I don't think we
have a "do this unsafely and without any debugging" option.

And the whole "not precise" thing should be documented, of course.

I can't imagine that people would rely on _exact_ lock statistics, but
hey, there are a lot of things people do that I can't fathom, so
that's not necessarily a strong argument.

Comments?

                  Linus


* Re: [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
  2019-05-24 17:19 ` Will Deacon
  2019-05-24 17:27   ` Linus Torvalds
@ 2019-05-24 17:28   ` Waiman Long
  1 sibling, 0 replies; 8+ messages in thread
From: Waiman Long @ 2019-05-24 17:28 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, linux-kernel, x86, Davidlohr Bueso,
	Linus Torvalds, Tim Chen, huang ying

On 5/24/19 1:19 PM, Will Deacon wrote:
> On Fri, May 24, 2019 at 12:53:46PM -0400, Waiman Long wrote:
>> The kernel test robot has reported that the use of __this_cpu_add()
>> causes bug messages like:
>>
>>   BUG: using __this_cpu_add() in preemptible [00000000] code: ...
>>
>> This is only an issue on a preempt kernel, where preemption can happen in
>> the middle of a percpu operation. We still use __this_cpu_*() on a
>> !preempt kernel to avoid additional overhead in case CONFIG_PREEMPT_COUNT
>> is set.
>>
>>  v2: Simplify the condition to just preempt or !preempt.
>>
>> Fixes: a8654596f0371 ("locking/rwsem: Enable lock event counting")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>  kernel/locking/lock_events.h | 23 +++++++++++++++++++++--
>>  1 file changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/locking/lock_events.h b/kernel/locking/lock_events.h
>> index feb1acc54611..05f34068ec06 100644
>> --- a/kernel/locking/lock_events.h
>> +++ b/kernel/locking/lock_events.h
>> @@ -30,13 +30,32 @@ enum lock_events {
>>   */
>>  DECLARE_PER_CPU(unsigned long, lockevents[lockevent_num]);
>>  
>> +/*
>> + * The purpose of the lock event counting subsystem is to provide a low
>> + * overhead way to record the number of specific locking events by using
>> + * percpu counters. It is the percpu sum that matters, not specifically
>> + * how many of them happen on each CPU.
>> + *
>> + * In a !preempt kernel, we can just use __this_cpu_*() as preemption
>> + * won't happen in the middle of the percpu operation. In a preempt kernel,
>> + * preemption in the middle of the percpu operation may produce an
>> + * incorrect result.
>> + */
>> +#ifdef CONFIG_PREEMPT
>> +#define lockevent_percpu_inc(x)		this_cpu_inc(x)
>> +#define lockevent_percpu_add(x, v)	this_cpu_add(x, v)
>> +#else
>> +#define lockevent_percpu_inc(x)		__this_cpu_inc(x)
>> +#define lockevent_percpu_add(x, v)	__this_cpu_add(x, v)
> Are you sure this works wrt IRQs? For example, if I take an interrupt when
> trying to update the counter, and then the irq handler takes a qspinlock
> which in turn tries to update the counter. Would I lose an update in that
> scenario?
>
> Will

Good point! But this will be an issue even if we use the non-underscore
version, as I don't think it disables interrupts. Also, it is only a
problem if the percpu operation takes more than 1 instruction. It is a
single instruction on x86. Other architectures may require more than 1
instruction. In those cases, we may lose counts, but that is still better
than taking the count from one CPU and putting it into another CPU.

Cheers,
Longman



* Re: [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
  2019-05-24 17:27   ` Linus Torvalds
@ 2019-05-24 17:35     ` Waiman Long
  2019-05-24 17:39       ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Waiman Long @ 2019-05-24 17:35 UTC (permalink / raw)
  To: Linus Torvalds, Will Deacon
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	H. Peter Anvin, Linux List Kernel Mailing,
	the arch/x86 maintainers, Davidlohr Bueso, Tim Chen, huang ying

On 5/24/19 1:27 PM, Linus Torvalds wrote:
> On Fri, May 24, 2019 at 10:19 AM Will Deacon <will.deacon@arm.com> wrote:
>> Are you sure this works wrt IRQs? For example, if I take an interrupt when
>> trying to update the counter, and then the irq handler takes a qspinlock
>> which in turn tries to update the counter. Would I lose an update in that
>> scenario?
> Sounds about right.
>
> We might decide that the lock event counters are not necessarily
> precise, but just rough guide-line statistics ("close enough in
> practice")
>
> But that would imply that it shouldn't be dependent on CONFIG_PREEMPT
> at all, and we should always use the double-underscore version, except
> without the debug checking.
>
> Maybe the #ifdef should just be CONFIG_PREEMPT_DEBUG, with a comment
> saying "we're not exact, but debugging complains, so if you enable
> debugging it will be slower and precise". Because I don't think we
> have a "do this unsafely and without any debugging" option.

I am not too worried about losing a count here and there once in a while
because of interrupts, but the possibility of the count from one CPU
being put into another CPU in a preempt kernel may distort the total
count significantly. This is what I want to avoid.
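
One way to read the migration concern is sketched below as a sequential
userspace model. It assumes a non-atomic add that resolves the slot address
and loads the old value on one CPU, is then preempted and migrated, and only
later performs its store. All names (`model_events`, `model_total()`,
`model_migrated_add()`) are hypothetical, and the real race would be
concurrent rather than sequential.

```c
#include <assert.h>

#define MODEL_NR_CPUS 2

/* Hypothetical stand-in for a single percpu event counter. */
static unsigned long model_events[MODEL_NR_CPUS];

/* The value that is reported in the end: the sum over all CPUs. */
static unsigned long model_total(void)
{
	unsigned long sum = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += model_events[cpu];
	return sum;
}

/*
 * A task starts a non-atomic add on old_cpu, is preempted after the load
 * and migrated away. While it is off the CPU, old_cpu records owner_adds
 * further events. The migrated task then finishes with a stale store.
 */
static void model_migrated_add(int old_cpu, unsigned long n,
			       unsigned long owner_adds)
{
	unsigned long *slot;
	unsigned long tmp;

	assert(old_cpu >= 0 && old_cpu < MODEL_NR_CPUS);
	slot = &model_events[old_cpu];	/* address resolved before migration */
	tmp = *slot;			/* load, then preempted */
	*slot += owner_adds;		/* events recorded by old_cpu meanwhile */
	*slot = tmp + n;		/* stale store wipes those events out */
}
```

In this model, `model_migrated_add(0, 1, 100)` leaves a total of 1 where 101
events occurred, so a single badly timed preemption can discard an arbitrary
number of counts, unlike the one-count loss of the interrupt case.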


>
> And the whole "not precise" thing should be documented, of course.

Yes, I will update the patch to document the fact that the count may
not be precise. Anyway, even if we have a 1-2% error, it is not a big
deal in terms of presenting a global picture of what operations are being
done.

Cheers,
Longman



* Re: [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
  2019-05-24 17:35     ` Waiman Long
@ 2019-05-24 17:39       ` Will Deacon
       [not found]         ` <ed19cb78-3c00-4788-5369-73bcd8199e15@redhat.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2019-05-24 17:39 UTC (permalink / raw)
  To: Waiman Long
  Cc: Linus Torvalds, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Linux List Kernel Mailing,
	the arch/x86 maintainers, Davidlohr Bueso, Tim Chen, huang ying

On Fri, May 24, 2019 at 01:35:39PM -0400, Waiman Long wrote:
> On 5/24/19 1:27 PM, Linus Torvalds wrote:
> > On Fri, May 24, 2019 at 10:19 AM Will Deacon <will.deacon@arm.com> wrote:
> >> Are you sure this works wrt IRQs? For example, if I take an interrupt when
> >> trying to update the counter, and then the irq handler takes a qspinlock
> >> which in turn tries to update the counter. Would I lose an update in that
> >> scenario?
> > Sounds about right.
> >
> > We might decide that the lock event counters are not necessarily
> > precise, but just rough guide-line statistics ("close enough in
> > practice")
> >
> > But that would imply that it shouldn't be dependent on CONFIG_PREEMPT
> > at all, and we should always use the double-underscore version, except
> > without the debug checking.
> >
> > Maybe the #ifdef should just be CONFIG_PREEMPT_DEBUG, with a comment
> > saying "we're not exact, but debugging complains, so if you enable
> > debugging it will be slower and precise". Because I don't think we
> > have a "do this unsafely and without any debugging" option.
> 
> I am not too worried about losing a count here and there once in a while
> because of interrupts, but the possibility of the count from one CPU
> being put into another CPU in a preempt kernel may distort the total
> count significantly. This is what I want to avoid.
> 
> 
> >
> > And the whole "not precise" thing should be documented, of course.
> 
> Yes, I will update the patch to document the fact that the count may
> not be precise. Anyway, even if we have a 1-2% error, it is not a big
> deal in terms of presenting a global picture of what operations are being
> done.

I suppose one alternative would be to have a per-cpu local_t variable,
and do the increments on that. However, that's probably worse than the
current approach for x86.

Will


* Re: [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
       [not found]         ` <ed19cb78-3c00-4788-5369-73bcd8199e15@redhat.com>
@ 2019-05-24 18:32           ` Will Deacon
  2019-05-24 18:50             ` Waiman Long
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2019-05-24 18:32 UTC (permalink / raw)
  To: Waiman Long
  Cc: Linus Torvalds, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Linux List Kernel Mailing,
	the arch/x86 maintainers, Davidlohr Bueso, Tim Chen, huang ying

On Fri, May 24, 2019 at 02:11:23PM -0400, Waiman Long wrote:
> On 5/24/19 1:39 PM, Will Deacon wrote:
> 
> > > > And the whole "not precise" thing should be documented, of course.
> > >
> > > Yes, I will update the patch to document the fact that the count may
> > > not be precise. Anyway, even if we have a 1-2% error, it is not a big
> > > deal in terms of presenting a global picture of what operations are being
> > > done.
> >
> > I suppose one alternative would be to have a per-cpu local_t variable,
> > and do the increments on that. However, that's probably worse than the
> > current approach for x86.
>
> I don't quite understand what you mean by a per-cpu local_t variable. A per-cpu
> variable is either statically allocated or dynamically allocated. Even with
> dynamic allocation, the same problem exists, I think, unless you differentiate
> between irq context and process context. That will make it a lot messier,
> I think.

So I haven't actually tried this to see if it works, but all I meant was
that you could replace the current:

DECLARE_PER_CPU(unsigned long, lockevents[lockevent_num]);

with:

DECLARE_PER_CPU(local_t, lockevents[lockevent_num]);

and then rework the inc/add macros to use a combination of raw_cpu_ptr
and local_inc().

I think that would allow you to get rid of the #ifdeffery, but it may
introduce a small overhead for x86.
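
A rough userspace model of that idea follows. It is a sketch under stated
assumptions, not the proposed kernel change: C11 relaxed atomics stand in for
local_t (which in the kernel only needs to be atomic with respect to the
owning CPU, so this stand-in is stronger than required), an explicit `cpu`
argument stands in for raw_cpu_ptr(), and the `lt_*` names and array sizes
are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define LT_NR_CPUS	2
#define LT_NR_EVENTS	4

/* One counter per (cpu, event) pair, zero-initialized. */
static atomic_ulong lt_events[LT_NR_CPUS][LT_NR_EVENTS];

/* The add is one indivisible update, so a nested add cannot be lost. */
static void lt_event_add(int cpu, int event, unsigned long n)
{
	assert(cpu >= 0 && cpu < LT_NR_CPUS);
	assert(event >= 0 && event < LT_NR_EVENTS);
	atomic_fetch_add_explicit(&lt_events[cpu][event], n,
				  memory_order_relaxed);
}

static void lt_event_inc(int cpu, int event)
{
	lt_event_add(cpu, event, 1);
}

/* Readers only care about the sum across all CPUs. */
static unsigned long lt_event_sum(int event)
{
	unsigned long sum = 0;
	int cpu;

	assert(event >= 0 && event < LT_NR_EVENTS);
	for (cpu = 0; cpu < LT_NR_CPUS; cpu++)
		sum += atomic_load_explicit(&lt_events[cpu][event],
					    memory_order_relaxed);
	return sum;
}
```

Because each update is indivisible, the same inc/add helpers can be used
whether or not the kernel is preemptible, which is what would remove the
#ifdef; the cost is whatever the architecture needs to make the update
atomic.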

Will


* Re: [PATCH v2] locking/lock_events: Use this_cpu_add() when necessary
  2019-05-24 18:32           ` Will Deacon
@ 2019-05-24 18:50             ` Waiman Long
  0 siblings, 0 replies; 8+ messages in thread
From: Waiman Long @ 2019-05-24 18:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: Linus Torvalds, Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Borislav Petkov, H. Peter Anvin, Linux List Kernel Mailing,
	the arch/x86 maintainers, Davidlohr Bueso, Tim Chen, huang ying

On 5/24/19 2:32 PM, Will Deacon wrote:
> On Fri, May 24, 2019 at 02:11:23PM -0400, Waiman Long wrote:
>> On 5/24/19 1:39 PM, Will Deacon wrote:
>>
>>>>> And the whole "not precise" thing should be documented, of course.
>>>>
>>>> Yes, I will update the patch to document the fact that the count may
>>>> not be precise. Anyway, even if we have a 1-2% error, it is not a big
>>>> deal in terms of presenting a global picture of what operations are being
>>>> done.
>>>
>>> I suppose one alternative would be to have a per-cpu local_t variable,
>>> and do the increments on that. However, that's probably worse than the
>>> current approach for x86.
>>
>> I don't quite understand what you mean by a per-cpu local_t variable. A per-cpu
>> variable is either statically allocated or dynamically allocated. Even with
>> dynamic allocation, the same problem exists, I think, unless you differentiate
>> between irq context and process context. That will make it a lot messier,
>> I think.
> So I haven't actually tried this to see if it works, but all I meant was
> that you could replace the current:
>
> DECLARE_PER_CPU(unsigned long, lockevents[lockevent_num]);
>
> with:
>
> DECLARE_PER_CPU(local_t, lockevents[lockevent_num]);
>
> and then rework the inc/add macros to use a combination of raw_cpu_ptr
> and local_inc().
>
> I think that would allow you to get rid of the #ifdeffery, but it may
> introduce a small overhead for x86.

OK, I was not aware of the local_t type. Anyway, the x86 local_t type
performs a similar single-instruction update. On other architectures that
can't do that, it will be a real atomic operation, which will be more costly.

-Longman


