From: Vikram Mulukutla <markivx@codeaurora.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>, linux-kernel@vger.kernel.org
Subject: Re: Additional compiler barrier required in sched_preempt_enable_no_resched?
Date: Sat, 14 May 2016 11:28:46 -0700 [thread overview]
Message-ID: <57376E5E.50207@codeaurora.org> (raw)
In-Reply-To: <alpine.DEB.2.11.1605141730290.4044@nanos>
On 5/14/2016 8:39 AM, Thomas Gleixner wrote:
> On Fri, 13 May 2016, Vikram Mulukutla wrote:
>> On 5/13/2016 7:58 AM, Peter Zijlstra wrote:
>>> diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
>>> index 5d8ffa3e6f8c..c1cde3577551 100644
>>> --- a/include/asm-generic/preempt.h
>>> +++ b/include/asm-generic/preempt.h
>>> @@ -7,10 +7,10 @@
>>>
>>> static __always_inline int preempt_count(void)
>>> {
>>> - return current_thread_info()->preempt_count;
>>> + return READ_ONCE(current_thread_info()->preempt_count);
>>> }
>>>
>>> -static __always_inline int *preempt_count_ptr(void)
>>> +static __always_inline volatile int *preempt_count_ptr(void)
>>> {
>>> return ¤t_thread_info()->preempt_count;
>>> }
>>>
>>
>> Thanks Peter, this patch worked for me. The compiler no longer optimizes out
>> the increment/decrement of the preempt_count.
>
> I have a hard time to understand why the compiler optimizes out stuff w/o that
> patch.
>
> We already have:
>
> #define preempt_disable() \
> do { \
> preempt_count_inc(); \
> barrier(); \
> } while (0)
>
> #define sched_preempt_enable_no_resched() \
> do { \
> barrier(); \
> preempt_count_dec(); \
> } while (0)
>
> #define preempt_enable() \
> do { \
> barrier(); \
> if (unlikely(preempt_count_dec_and_test())) \
> __preempt_schedule(); \
> } while (0)
>
> So the barriers already forbid that the compiler reorders code. How on earth
> is the compiler supposed to optimize the dec/inc out?
>
While I cannot claim more than a very rudimentary knowledge of
compilers, this was the initial reaction that I had as well. But then
the barriers are in the wrong spots for the way the code was used in the
driver in question. preempt_disable has the barrier() _after_ the
increment, and sched_preempt_enable_no_resched has it _before_ the
decrement. Therefore, if one were to use preempt_enable_no_resched
followed by preempt_disable, there is no compiler barrier between the
decrement and the increment statements. Whether this should translate to
such a seemingly aggressive optimization is something I'm not completely
sure of.
> There is more code than the preempt stuff depending on barrier() and expecting
> that the compiler does not optimize out things at will. Are we supposed to
> hunt all occurences and amend them with READ_ONCE or whatever one by one?
>
I think the barrier is working as intended for the sequence of
preempt_disable followed by preempt_enable_no_resched.
> Thanks,
>
> tglx
>
>
>
As a simple experiment I used gcc on x86 on the following C program.
This was really to convince myself rather than you and Peter!
#include <stdio.h>
#define barrier() __asm__ __volatile__("": : :"memory")
struct thread_info {
int pc;
};
#define preempt_count() \
ti_ptr->pc
#define preempt_disable() \
do \
{ \
preempt_count() += 1; \
barrier(); \
} \
while (0)
#define preempt_enable() \
do \
{ \
barrier(); \
preempt_count() -= 1; \
} \
while (0)
struct thread_info *ti_ptr;
int main(void)
{
struct thread_info ti;
ti_ptr = &ti;
int b;
preempt_enable();
b = b + 500;
preempt_disable();
printf("b = %d\n", b);
return 0;
}
With gcc -O2 I get this (the ARM compiler behaves similarly):
0000000000400470 <main>:
400470: 48 83 ec 18 sub $0x18,%rsp
400474: 48 89 25 cd 0b 20 00 mov %rsp,0x200bcd(%rip)
40047b: ba f4 01 00 00 mov $0x1f4,%edx
400480: be 14 06 40 00 mov $0x400614,%esi
400485: bf 01 00 00 00 mov $0x1,%edi
40048a: 31 c0 xor %eax,%eax
40048c: e8 cf ff ff ff callq 400460 <__printf_chk@plt>
400491: 31 c0 xor %eax,%eax
400493: 48 83 c4 18 add $0x18,%rsp
400497: c3 retq
If I swap preempt_enable and preempt_disable I get this:
0000000000400470 <main>:
400470: 48 83 ec 18 sub $0x18,%rsp
400474: 48 89 25 cd 0b 20 00 mov %rsp,0x200bcd(%rip)
40047b: 83 04 24 01 addl $0x1,(%rsp)
40047f: 48 8b 05 c2 0b 20 00 mov 0x200bc2(%rip),%rax
400486: ba f4 01 00 00 mov $0x1f4,%edx
40048b: be 14 06 40 00 mov $0x400614,%esi
400490: bf 01 00 00 00 mov $0x1,%edi
400495: 83 28 01 subl $0x1,(%rax)
400498: 31 c0 xor %eax,%eax
40049a: e8 c1 ff ff ff callq 400460 <__printf_chk@plt>
40049f: 31 c0 xor %eax,%eax
4004a1: 48 83 c4 18 add $0x18,%rsp
4004a5: c3 retq
Note: If I place ti_ptr on the stack, the ordering/barriers no longer
matter, the inc/dec is always optimized out. I suppose the compiler does
treat current_thread_info as coming from a different memory location
rather than the current stack frame. In any case, IMHO the volatile
preempt_count patch is necessary.
Thanks,
Vikram
--
next prev parent reply other threads:[~2016-05-14 18:28 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-13 6:39 Additional compiler barrier required in sched_preempt_enable_no_resched? Vikram Mulukutla
2016-05-13 14:21 ` Thomas Gleixner
2016-05-13 14:58 ` Peter Zijlstra
2016-05-13 22:44 ` Vikram Mulukutla
2016-05-14 15:39 ` Thomas Gleixner
2016-05-14 18:28 ` Vikram Mulukutla [this message]
2016-05-16 10:55 ` Peter Zijlstra
2016-05-16 11:00 ` Peter Zijlstra
2016-05-16 13:17 ` Peter Zijlstra
2016-05-17 14:21 ` [tip:sched/urgent] sched/preempt: Fix preempt_count manipulations tip-bot for Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=57376E5E.50207@codeaurora.org \
--to=markivx@codeaurora.org \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.