From: Petr Tesarik <ptesarik@suse.cz>
To: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Tony Luck <tony.luck@intel.com>,
	Hedi Berriche <hedi@sgi.com>
Subject: Serious problem with ticket spinlocks on ia64
Date: Fri, 27 Aug 2010 13:37:33 +0000
Message-ID: <201008271537.35709.ptesarik@suse.cz>

Hi everybody,

SGI has recently experienced failures with the new ticket spinlock
implementation. Hedi Berriche sent me a simple test case that can
trigger the failure on the siglock. To debug the issue, I wrote a
small module that watches writes to current->sighand->siglock and
records the values.

I observed that the __ticket_spin_lock() primitive fails when the
tail wraps around to zero. I reconstructed the following:

CPU 7 holds the spinlock
CPU 5 wants to acquire the spinlock
Spinlock value is 0xfffcffff
	 (now serving 0x7ffe, next ticket 0x7fff)

CPU 7 executes st2.rel to release the spinlock.
At the same time CPU 5 executes a fetchadd4.acq.
The resulting lock value is 0xfffe0000 (correct), and CPU 5 has
recorded its ticket number (0x7fff).

Consequently, the first spinlock loop iteration succeeds, and CPU 5
now holds the spinlock.

Next, CPU 5 releases the spinlock with st2.rel, changing the lock
value to 0x0 (correct).

SO FAR SO GOOD.

Now, CPU 4, CPU 5 and CPU 7 all want to acquire the lock again.
Interestingly, CPU 5 and CPU 7 are both granted the same ticket,
and the spinlock value (as seen from the debug fault handler) is
0x0 after single-stepping over the fetchadd4.acq, in both cases.
CPU 4 correctly sets the spinlock value to 0x1.

I don't know if the simultaneous acquire attempt and release are
necessary to trigger the bug, but I noted it here.

I've only seen this happen when the spinlock wraps around to zero,
but I don't know whether it can also happen otherwise.

In any case, there seems to be a serious problem with memory
ordering, and I'm not enough of an expert to tell exactly what it is.

Any ideas?

Petr Tesarik
L3 International
Novell, Inc.
