linux-input.vger.kernel.org archive mirror
From: Dmitry Torokhov <dmitry.torokhov@gmail.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: "Pali Rohár" <pali.rohar@gmail.com>,
	"Andrey Konovalov" <andreyknvl@google.com>,
	"Hans de Goede" <hdegoede@redhat.com>,
	"Mathias Gottschlag" <mgottschlag@gmail.com>,
	"Shailendra Verma" <shailendra.capricorn@gmail.com>,
	"Rusty Russell" <rusty@rustcorp.com.au>,
	"Luis R. Rodriguez" <mcgrof@suse.com>,
	"Thomas Hellstrom" <thellstrom@vmware.com>,
	"linux-input@vger.kernel.org" <linux-input@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Kostya Serebryany" <kcc@google.com>,
	"Alexander Potapenko" <glider@google.com>,
	ktsan@googlegroups.com
Subject: Re: Potential data race in psmouse_interrupt
Date: Fri, 4 Sep 2015 13:27:39 -0700
Message-ID: <CAKdAkRTq9EH9Egu4z6e3BrqDnfZ9MEMkEyu4deRJUMaqzuD9EA@mail.gmail.com>
In-Reply-To: <CACT4Y+asObD22HSEQHCJ8=pRpyf6aHAZJN2CwQKxq+-jbUh+uQ@mail.gmail.com>

On Fri, Sep 4, 2015 at 12:32 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Fri, Sep 4, 2015 at 6:56 PM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
>> On Tue, Sep 1, 2015 at 11:46 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Fri, Aug 28, 2015 at 8:32 PM, Dmitry Torokhov
>>> <dmitry.torokhov@gmail.com> wrote:
>>>> On Fri, Aug 28, 2015 at 11:08 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>>> On Fri, Aug 28, 2015 at 7:51 PM, Dmitry Torokhov
>>>>> <dmitry.torokhov@gmail.com> wrote:
>>>>>> On Fri, Aug 28, 2015 at 10:34 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am looking at this code in __ps2_command again:
>>>>>>>
>>>>>>> /*
>>>>>>> * The reset command takes a long time to execute.
>>>>>>> */
>>>>>>> timeout = msecs_to_jiffies(command == PS2_CMD_RESET_BAT ? 4000 : 500);
>>>>>>>
>>>>>>> timeout = wait_event_timeout(ps2dev->wait,
>>>>>>>     !(READ_ONCE(ps2dev->flags) & PS2_FLAG_CMD1), timeout);
>>>>>>>
>>>>>>> if (smp_load_acquire(&ps2dev->cmdcnt) &&
>>>>>>>     !(smp_load_acquire(&ps2dev->flags) & PS2_FLAG_CMD1)) {
>>>>>>>               timeout = ps2_adjust_timeout(ps2dev, command, timeout);
>>>>>>>               wait_event_timeout(ps2dev->wait,
>>>>>>>                         !(smp_load_acquire(&ps2dev->flags) & PS2_FLAG_CMD),
>>>>>>>                         timeout);
>>>>>>> }
>>>>>>>
>>>>>>> if (param)
>>>>>>>     for (i = 0; i < receive; i++)
>>>>>>>               param[i] = ps2dev->cmdbuf[(receive - 1) - i];
>>>>>>>
>>>>>>>
>>>>>>> Here are two moments I don't understand:
>>>>>>> 1. The last parameter of ps2_adjust_timeout is timeout in jiffies (it
>>>>>>> is compared against 100ms). However, timeout is assigned to result of
>>>>>>> wait_event_timeout, which returns 0 or 1. This does not make sense to
>>>>>>> me. What am I missing?
>>>>>>
>>>>>> The fact that wait_event_timeout can return a value greater than one:
>>>>>>
>>>>>>  * Returns:
>>>>>>  * 0 if the @condition evaluated to %false after the @timeout elapsed,
>>>>>>  * 1 if the @condition evaluated to %true after the @timeout elapsed,
>>>>>>  * or the remaining jiffies (at least 1) if the @condition evaluated
>>>>>>               ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>
>>>>>
>>>>> OK, makes sense now!
>>>>>
>>>>>>> 2. This code pays great attention to timeouts, but in the end I don't
>>>>>>> see how it handles timeouts. That is, if a timeout has happened, we
>>>>>>> still copy out (garbage) from cmdbuf. What am I missing here?
>>>>>>
>>>>>> Once upon a time wait_event() did not return a positive value when the
>>>>>> timeout expired and then the condition was satisfied. So we just examine the
>>>>>> final state (ps2dev->cmdcnt should be 0 if the command actually
>>>>>> succeeded) and even if we copy in garbage nobody should care since
>>>>>> we'll return an error in this case.
>>>>>
>>>>>
>>>>> I see.
>>>>> But the cmdcnt is re-read after copying out response. So it is
>>>>> possible that we read garbage response, but then read cmdcnt==0 and
>>>>> return OK to caller.
>>>>
>>>> That assumes that we actually timed out, and while we were copying the
>>>> data the response finally came.
>>>
>>> Right.
>>>
>>>>>
>>>>> So far I have something along the following lines to fix data races in libps2.c
>>>>
>>>> I don't know, maybe we should simply move call to
>>>> serio_pause_rx(ps2dev->serio) higher, before we check ps2dev->cmdcnt,
>>>> and move copying of the buffer down, after checking cmdcnt.
>>>
>>> I don't know about serio_pause_rx, but copying of response should be
>>> done after checking cmdcnt.
>>
>> It will stop the interrupt handler from running while we are examining
>> the cmdcnt and copy out the data, thus removing the race.
>>
>>> Also you need to use smp_store_release/smp_load_acquire cmdcnt and
>>> flags when they have dependent data. And READ_ONCE/WRITE_ONCE on
>>> shared state otherwise is highly desirable.
>>>
>>>>> diff --git a/drivers/input/serio/libps2.c b/drivers/input/serio/libps2.c
>>>>> index 7551699..51c747f 100644
>>>>> --- a/drivers/input/serio/libps2.c
>>>>> +++ b/drivers/input/serio/libps2.c
>>>>> @@ -43,7 +43,7 @@ int ps2_sendbyte(struct ps2dev *ps2dev, unsigned
>>>>> char byte, int timeout)
>>>>>
>>>>>          if (serio_write(ps2dev->serio, byte) == 0)
>>>>>                  wait_event_timeout(ps2dev->wait,
>>>>> -                                   !(ps2dev->flags & PS2_FLAG_ACK),
>>>>> +                                   !(READ_ONCE(ps2dev->flags) & PS2_FLAG_ACK),
>>>>>                                     msecs_to_jiffies(timeout));
>>>>>
>>>>>          serio_pause_rx(ps2dev->serio);
>>>>> @@ -187,6 +187,7 @@ int __ps2_command(struct ps2dev *ps2dev, unsigned
>>>>> char *param, int command)
>>>>>          int receive = (command >> 8) & 0xf;
>>>>>          int rc = -1;
>>>>>          int i;
>>>>> +        unsigned char cmdcnt;
>>>>>
>>>>>          if (receive > sizeof(ps2dev->cmdbuf)) {
>>>>>                  WARN_ON(1);
>>>>> @@ -225,23 +226,22 @@ int __ps2_command(struct ps2dev *ps2dev,
>>>>> unsigned char *param, int command)
>>>>>          timeout = msecs_to_jiffies(command == PS2_CMD_RESET_BAT ? 4000 : 500);
>>>>>
>>>>>          timeout = wait_event_timeout(ps2dev->wait,
>>>>> -                                     !(ps2dev->flags & PS2_FLAG_CMD1), timeout);
>>>>> -
>>>>> -        if (ps2dev->cmdcnt && !(ps2dev->flags & PS2_FLAG_CMD1)) {
>>>>> +                !(READ_ONCE(ps2dev->flags) & PS2_FLAG_CMD1), timeout);
>>>>>
>>>>> +        if (READ_ONCE(ps2dev->cmdcnt) &&
>>>>> +                        !(READ_ONCE(ps2dev->flags) & PS2_FLAG_CMD1)) {
>>>>>                  timeout = ps2_adjust_timeout(ps2dev, command, timeout);
>>>>>                  wait_event_timeout(ps2dev->wait,
>>>>> -                                   !(ps2dev->flags & PS2_FLAG_CMD), timeout);
>>>>> +                        !(READ_ONCE(ps2dev->flags) & PS2_FLAG_CMD), timeout);
>>>>
>>>> What do all these READ_ONCE()s give us?
>>>
>>> I've written up the response here:
>>> https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
>>
>> I read it and I still do not understand what READ_ONCE() in
>> wait_event* conditions will buy us.
>>
>> Also if the following is true:
>>
>>> As a consequence, C compilers stopped guaranteeing that "word accesses are atomic".
>>
>> a lot of stuff will break in the kernel. Maybe compilers should stop
>> moving towards the lala land?
>
> It buys us:
> 1. More readable code by highlighting important aspects. Inter-thread
> synchronization is important and complex, explicit is better than
> implicit in such contexts.

*Every* condition in wait_event* is modified by a separate thread;
there is no need to highlight anything.
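
For reference, the annotation under discussion can be approximated in
userspace as a volatile access to an lvalue. This is only a rough sketch:
the MODEL_* names, the struct and the flag bit value are hypothetical, the
kernel's real READ_ONCE()/WRITE_ONCE() are more elaborate, and __typeof__
is a GNU extension.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace approximation only: force exactly one load/store of an lvalue. */
#define MODEL_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define MODEL_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define MODEL_FLAG_CMD1 0x4UL	/* hypothetical bit value, for illustration */

/* Hypothetical stand-in for the flags word in struct ps2dev. */
struct model_flags {
	unsigned long flags;
};

/* Returns 1 while the modelled "first response byte pending" bit is set. */
int model_cmd1_pending(struct model_flags *dev)
{
	assert(dev != NULL);
	/* The macro takes the lvalue itself, not its address. */
	return (MODEL_READ_ONCE(dev->flags) & MODEL_FLAG_CMD1) != 0;
}
```

Note that the macro argument is dev->flags, not &dev->flags; passing the
address would test a pointer that is never zero.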

> 2. Conformance to relevant standards, which relieves you, me and everybody
> else reading this code from spending time on proving that it cannot
> break (which is not actually possible to do; "I don't see how it can
> break" is not quite proof).

I expect the wait_event() API to ensure that the condition is re-evaluated
properly instead of sprinkling these annotations throughout the entire
kernel. As far as I know, prepare_to_wait* does provide the necessary
barriers.
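
To make the expectation concrete, here is a minimal single-threaded model
of a timed wait loop that re-evaluates its condition on every pass and
reports the three outcomes listed in the wait_event_timeout() comment
quoted earlier. Everything in it (struct model_dev, the simulated jiffy
counter, the function names) is made up for illustration; it models
neither real sleeping and wakeups nor memory barriers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model: the flag clears once 'now' reaches a set time. */
struct model_dev {
	long now;		/* simulated jiffies elapsed so far */
	long flag_clears_at;	/* simulated time at which the command completes */
};

static bool model_cmd_done(const struct model_dev *dev)
{
	return dev->now >= dev->flag_clears_at;
}

/*
 * Returns 0 if the condition is still false once the timeout has elapsed,
 * 1 if it turned true exactly as the timeout elapsed, otherwise the
 * remaining simulated jiffies (at least 1).
 */
long model_wait_event_timeout(struct model_dev *dev, long timeout)
{
	long remaining = timeout;

	assert(timeout >= 0);
	while (!model_cmd_done(dev)) {	/* condition re-checked on every pass */
		if (remaining == 0)
			return 0;
		remaining--;		/* "sleep" for one simulated jiffy */
		dev->now++;
	}
	return remaining ? remaining : 1;
}
```

A caller can feed a positive return value into a follow-up wait as the
remaining budget, which is how __ps2_command uses it.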

> 3. Allow tooling that finds undoubtedly harmful bugs, like this one.

You already found this bug without annotations. Once it is fixed (by
expanding the critical section) there is no longer a reason for using
slower accesses, as there is no concurrency anymore.
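
A rough userspace sketch of what expanding the critical section means:
take the lock first, check cmdcnt under it, and only then copy the
response out. This is not the actual libps2 change. The struct, the
model_* helpers and the spin flag standing in for
serio_pause_rx()/serio_continue_rx() are all assumptions made for the
example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ps2dev; not the kernel type. */
struct ps2dev_model {
	atomic_flag rx_lock;		/* models pausing the interrupt handler */
	unsigned char cmdbuf[8];	/* response bytes, stored last-first */
	unsigned char cmdcnt;		/* bytes still expected from the device */
};

static void model_pause_rx(struct ps2dev_model *dev)
{
	while (atomic_flag_test_and_set_explicit(&dev->rx_lock,
						 memory_order_acquire))
		;	/* spin until the "handler" is out of the way */
}

static void model_continue_rx(struct ps2dev_model *dev)
{
	atomic_flag_clear_explicit(&dev->rx_lock, memory_order_release);
}

/* Returns 0 and fills param only if the whole response has arrived. */
int model_finish_command(struct ps2dev_model *dev, unsigned char *param,
			 size_t receive)
{
	int rc = -1;
	size_t i;

	assert(dev != NULL);
	if (receive > sizeof(dev->cmdbuf))
		return -1;

	model_pause_rx(dev);
	if (dev->cmdcnt == 0) {		/* checked before the copy, under the lock */
		if (param)
			for (i = 0; i < receive; i++)
				param[i] = dev->cmdbuf[(receive - 1) - i];
		rc = 0;
	}
	model_continue_rx(dev);

	return rc;
}
```

With the check and the copy inside one locked region, a late response
cannot slip in between them, so plain accesses are enough there.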

Thanks.

-- 
Dmitry

Thread overview: 16+ messages
     [not found] <CAAeHK+wZ=fXCRDUvkQdSmFWxU1fqN5suF=P1kL_CT1L8p3ybDg@mail.gmail.com>
2015-07-22 13:14 ` Fwd: Potential data race in psmouse_interrupt Andrey Konovalov
2015-07-22 16:52   ` Dmitry Torokhov
2015-07-23 13:08   ` Pali Rohár
2015-07-23 13:31     ` Andrey Konovalov
2015-07-29 11:53       ` Pali Rohár
2015-08-28 17:34         ` Dmitry Vyukov
2015-08-28 17:51           ` Dmitry Torokhov
2015-08-28 18:08             ` Dmitry Vyukov
2015-08-28 18:32               ` Dmitry Torokhov
2015-08-28 18:34                 ` Dmitry Torokhov
2015-09-01 18:46                 ` Dmitry Vyukov
2015-09-04 16:56                   ` Dmitry Torokhov
2015-09-04 19:32                     ` Dmitry Vyukov
2015-09-04 20:27                       ` Dmitry Torokhov [this message]
2015-09-05 13:21                         ` Dmitry Vyukov
2015-09-07 14:06                           ` Dmitry Vyukov
