All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chuck Ebbert <cebbert@redhat.com>
To: Jordan Russell <jr-list-2007@quo.to>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@elte.hu>,
	john stultz <johnstul@us.ibm.com>
Subject: Re: 2.6.23.1: Random hangs during boot with "tsc" clocksource
Date: Wed, 21 Nov 2007 12:10:40 -0500	[thread overview]
Message-ID: <47446690.5090808@redhat.com> (raw)
In-Reply-To: <47436BA0.5050809@quo.to>

On 11/20/2007 06:20 PM, Jordan Russell wrote:
>>> On Fri, 02 Nov 2007 12:10:00 -0500 Jordan Russell <jr-list-2007@quo.to> wrote:
>>> Hi,
>>>
>>> With 2.6.23.1 (stock and Fedora), roughly 50% of the time my system
>>> hangs indefinitely during the kernel boot process. The hangs occur in
>>> places where normally a brief delay is seen, such as when detecting
>>> serial ports, ATA devices, and USB hubs. SysRq+W, when it works, shows
>>> tasks stuck inside schedule_timeout and lock_timer_base.
> 
> Same problem with 2.6.23.8.
> 
> Are there any specific (TSC related?) patches I should try reverting?
> 
> Would it help if I captured the dmesg/SysRq output from one of the
> hanging boots?
> 
> Any other information that might be useful in getting to the bottom of this?
> 

Did you try this one? You are seeing problems with preemption disabled,
but it's at least worth trying.



From: Marin Mitov <mitov@issp.bas.bg>
To: linux-kernel@vger.kernel.org
Subject: [PATCH]new_TSC_based_delay_tsc()
Cc: Ingo Molnar <mingo@elte.hu>
Date: 	Tue, 20 Nov 2007 21:32:27 +0200

Hi all,

Please ignore the previous patch with the same subject.
It has a bug that can manifest itself in the very exotic case
when each do {} while() iteration executes on different cpu
leading to potentially infinite loop.

This is a patch based on the Ingo's idea/patch to track 
delay_tsc() migration to another cpu by comparing 
smp_processor_id(). It is against kernel-2.6.24-rc3.

What is different:
1. Using unsigned (instead of long) to unify for i386/x86_64.
2. Minimal preempt_disable/enable() critical sections
   (more room for preemption)
3. some statements have been rearranged, to account for
    possible under/overflow of left/TSC

Tested on both: 32/64 bit SMP PREEMPT kernel-2.6.24-rc3

Comments, please.

Signed-off-by: Marin Mitov <mitov@issp.bas.bg>
=========================================
--- a/arch/x86/lib/delay_32.c	2007-11-18 08:14:05.000000000 +0200
+++ b/arch/x86/lib/delay_32.c	2007-11-20 19:03:52.000000000 +0200
@@ -38,18 +38,42 @@
 		:"0" (loops));
 }
 
-/* TSC based delay: */
+/* TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
 static void delay_tsc(unsigned long loops)
 {
-	unsigned long bclock, now;
+	unsigned prev, prev_1, now;
+	unsigned left = loops;
+	unsigned prev_cpu, cpu;
+
+	preempt_disable();
+	rdtscl(prev);
+	prev_cpu = smp_processor_id();
+	preempt_enable();
+	now = prev;
 
-	preempt_disable();		/* TSC's are per-cpu */
-	rdtscl(bclock);
 	do {
 		rep_nop();
+
+		left -= now - prev;
+		prev = now;
+
+		preempt_disable();
+		rdtscl(prev_1);
+		cpu = smp_processor_id();
 		rdtscl(now);
-	} while ((now-bclock) < loops);
-	preempt_enable();
+		preempt_enable();
+
+		if (prev_cpu != cpu){
+			/*
+			 * We have migrated, forget prev_cpu's tsc reading
+			 */
+			prev = prev_1;
+			prev_cpu = cpu;
+		}
+	} while ((now-prev) < left);
 }
 
 /*
--- a/arch/x86/lib/delay_64.c	2007-11-18 08:14:40.000000000 +0200
+++ b/arch/x86/lib/delay_64.c	2007-11-20 19:47:29.000000000 +0200
@@ -26,18 +26,42 @@
 	return 0;
 }
 
+/* TSC based delay:
+ *
+ * We are careful about preemption as TSC's are per-CPU.
+ */
 void __delay(unsigned long loops)
 {
-	unsigned bclock, now;
+	unsigned prev, prev_1, now;
+	unsigned left = loops;
+	unsigned prev_cpu, cpu;
+
+	preempt_disable();
+	rdtscl(prev);
+	prev_cpu = smp_processor_id();
+	preempt_enable();
+	now = prev;
 
-	preempt_disable();		/* TSC's are pre-cpu */
-	rdtscl(bclock);
 	do {
-		rep_nop(); 
+		rep_nop();
+
+		left -= now - prev;
+		prev = now;
+
+		preempt_disable();
+		rdtscl(prev_1);
+		cpu = smp_processor_id();
 		rdtscl(now);
-	}
-	while ((now-bclock) < loops);
-	preempt_enable();
+		preempt_enable();
+
+		if (prev_cpu != cpu){
+			/*
+			 * We have migrated, forget prev_cpu's tsc reading
+			 */
+			 prev = prev_1;
+			 prev_cpu = cpu;
+		}
+	} while ((now-prev) < left);
 }
 EXPORT_SYMBOL(__delay);
 

  reply	other threads:[~2007-11-21 17:12 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-11-02 17:10 2.6.23.1: Random hangs during boot with "tsc" clocksource Jordan Russell
2007-11-07  6:15 ` Andrew Morton
2007-11-20 23:20   ` Jordan Russell
2007-11-21 17:10     ` Chuck Ebbert [this message]
2007-11-23 18:29       ` Jordan Russell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=47446690.5090808@redhat.com \
    --to=cebbert@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=johnstul@us.ibm.com \
    --cc=jr-list-2007@quo.to \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.