public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Arcangeli <andrea@suse.de>
To: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: linux-kernel@vger.kernel.org,
	Linus Torvalds <torvalds@transmeta.com>,
	Robert Macaulay <robert_macaulay@dell.com>
Subject: Re: x86 smp_call_function recent breakage
Date: Tue, 23 Oct 2001 19:23:07 +0200	[thread overview]
Message-ID: <20011023192307.Q26029@athlon.random> (raw)
In-Reply-To: <20011023182954.O26029@athlon.random> <E15w4kB-0006Vf-00@the-village.bc.nu>
In-Reply-To: <E15w4kB-0006Vf-00@the-village.bc.nu>; from alan@lxorguk.ukuu.org.uk on Tue, Oct 23, 2001 at 05:49:35PM +0100

On Tue, Oct 23, 2001 at 05:49:35PM +0100, Alan Cox wrote:
> > This isn't the right fix:
> 
> The cache thing you may be right on. The core fix is not about caching
> its about fixing races on IPI replay. We have to handle an IPI reoccuring
> potentially with a small time delay due to a retransmit of a message
> lost by one party on the bus. Furthermore it seems other messages can
> pass the message that then gets retransmitted.
> 
> > Robert, this explains the missed IPI during drain_cpu_caches, it isn't
> > ram fault or IPI missed by the hardware, so I suggest to just backout
> > the second diff above and try again. Will be fixed also in next -aa of
> > course.
> 
> I'm not sure I follow why it explains a missed IPI. Please explain in
> more detail.

The bug is very simple and it have nothing to do with hardware details,
it is a plain software bug:

-	if (wait) {
-		mb();
-		atomic_inc(&call_data->finished);
-	}
+	if (wait)
+		set_bit(smp_processor_id(), &call_data->finished);

since 2.4.10pre12 (and in all previous -ac also) you still use the above
"call_data" for notifying "finished", but you also allowed "call_data"
to be changed under the IPI handler by another incoming IPI (the
scalability optimization), so the first IPI handler will notify the
completion of the second IPI (instead of the first one) generating the
"missed IPI" generating kernel instability (deadlock for Roboert in
drain_cpu_caches that waits all cpus to "finish").

So you wonder about reply IPI or whatever, and you add further hackery
plus you left the NR_CPUS array that now is useless since there's
nothing to scale and we hold the spinlock for the whole duration of the
IPI cycle, while the only fix in your ac5<->ac6 diff is that you now
spin_unlock after touching ->finished like the old code correctly did,
so you notify completion of the right IPI and not of the wrong one, so
it won't deadlock anymore.

I don't think it's necessary to read ->func and friends before checking
->started, also there's no need of testing ->started. we have the
guarantee that only 1 IPI cycle can run at once because of the spinlock
held for the whole duration of the IPI cycle, so if we are running in
smp_call_function_interrupt we know all ->func,->started == 0 fields are
just coherent and freely usable in memory, we only need to care to mb()
between reading them and increasing ->started like the old code
correctly did.  So I think the old code was just perfect.

Andrea

      reply	other threads:[~2001-10-23 17:23 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-10-23 16:29 x86 smp_call_function recent breakage Andrea Arcangeli
2001-10-23 16:49 ` Alan Cox
2001-10-23 17:23   ` Andrea Arcangeli [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20011023192307.Q26029@athlon.random \
    --to=andrea@suse.de \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robert_macaulay@dell.com \
    --cc=torvalds@transmeta.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox