linux-scsi.vger.kernel.org archive mirror
* [PATCH] SCSI: Replace semaphores with wait_even
@ 2004-10-20 19:29 Thomas Gleixner
  2004-10-24 19:57 ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2004-10-20 19:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, LKML, SCSI, James Bottomley


Use wait_event instead of semaphores. Semaphores are slower
and trigger owner conflicts during semaphore debugging.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
---

 2.6.9-bk-041020-thomas/drivers/scsi/hosts.c      |    2 +-
 2.6.9-bk-041020-thomas/drivers/scsi/scsi_error.c |   19 ++++++++-----------
 2.6.9-bk-041020-thomas/include/scsi/scsi_host.h  |    2 +-
 3 files changed, 10 insertions(+), 13 deletions(-)

diff -puN drivers/scsi/scsi_error.c~scsihost drivers/scsi/scsi_error.c
--- 2.6.9-bk-041020/drivers/scsi/scsi_error.c~scsihost	2004-10-20 15:58:57.000000000 +0200
+++ 2.6.9-bk-041020-thomas/drivers/scsi/scsi_error.c	2004-10-20 16:02:57.000000000 +0200
@@ -49,7 +49,7 @@
 void scsi_eh_wakeup(struct Scsi_Host *shost)
 {
 	if (shost->host_busy == shost->host_failed) {
-		up(shost->eh_wait);
+		wake_up(shost->eh_wait);
 		SCSI_LOG_ERROR_RECOVERY(5,
 				printk("Waking error handler thread\n"));
 	}
@@ -1598,7 +1598,7 @@ int scsi_error_handler(void *data)
 {
 	struct Scsi_Host *shost = (struct Scsi_Host *) data;
 	int rtn;
-	DECLARE_MUTEX_LOCKED(sem);
+ 	DECLARE_WAIT_QUEUE_HEAD(eh_wait);
 
 	/*
 	 *    Flush resources
@@ -1608,7 +1608,7 @@ int scsi_error_handler(void *data)
 
 	current->flags |= PF_NOFREEZE;
 
-	shost->eh_wait = &sem;
+	shost->eh_wait = &eh_wait;
 	shost->ehandler = current;
 
 	/*
@@ -1630,15 +1630,12 @@ int scsi_error_handler(void *data)
 						  " sleeping\n",shost->host_no));
 
 		/*
-		 * Note - we always use down_interruptible with the semaphore
-		 * even if the module was loaded as part of the kernel.  The
-		 * reason is that down() will cause this thread to be counted
-		 * in the load average as a running process, and down
-		 * interruptible doesn't.  Given that we need to allow this
-		 * thread to die if the driver was loaded as a module, using
-		 * semaphores isn't unreasonable.
+		 * Wait, until somebody decides to wake us due to an error
+		 * or because we should be killed
 		 */
-		down_interruptible(&sem);
+		wait_event_interruptible(eh_wait, shost->eh_kill ||
+				(shost->host_busy == shost->host_failed));
+
 		if (shost->eh_kill)
 			break;
 
diff -puN drivers/scsi/hosts.c~scsihost drivers/scsi/hosts.c
--- 2.6.9-bk-041020/drivers/scsi/hosts.c~scsihost	2004-10-20 16:00:04.000000000 +0200
+++ 2.6.9-bk-041020-thomas/drivers/scsi/hosts.c	2004-10-20 16:00:35.000000000 +0200
@@ -157,7 +157,7 @@ static void scsi_host_dev_release(struct
 		DECLARE_COMPLETION(sem);
 		shost->eh_notify = &sem;
 		shost->eh_kill = 1;
-		up(shost->eh_wait);
+		wake_up(shost->eh_wait);
 		wait_for_completion(&sem);
 		shost->eh_notify = NULL;
 	}
diff -puN include/scsi/scsi_host.h~scsihost include/scsi/scsi_host.h
--- 2.6.9-bk-041020/include/scsi/scsi_host.h~scsihost	2004-10-20 16:00:12.000000000 +0200
+++ 2.6.9-bk-041020-thomas/include/scsi/scsi_host.h	2004-10-20 16:00:29.000000000 +0200
@@ -396,7 +396,7 @@ struct Scsi_Host {
 
 	struct list_head	eh_cmd_q;
 	struct task_struct    * ehandler;  /* Error recovery thread. */
-	struct semaphore      * eh_wait;   /* The error recovery thread waits
+	wait_queue_head_t     * eh_wait;   /* The error recovery thread waits
 					      on this. */
 	struct completion     * eh_notify; /* wait for eh to begin or end */
 	struct semaphore      * eh_action; /* Wait for specific actions on the
_


* Re: [PATCH] SCSI: Replace semaphores with wait_even
  2004-10-20 19:29 [PATCH] SCSI: Replace semaphores with wait_even Thomas Gleixner
@ 2004-10-24 19:57 ` James Bottomley
  2004-10-24 20:06   ` Thomas Gleixner
  0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2004-10-24 19:57 UTC (permalink / raw)
  To: tglx; +Cc: Andrew Morton, Ingo Molnar, LKML, SCSI Mailing List

On Wed, 2004-10-20 at 15:29, Thomas Gleixner wrote:
> 
> Use wait_event instead of semaphores. Semaphores are slower
> and trigger owner conflicts during semaphore debugging.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Ingo Molnar <mingo@elte.hu>
> ---

There's something deeply wrong with this.  It causes a boot hang in my
scsi test systems.

James




* Re: [PATCH] SCSI: Replace semaphores with wait_even
  2004-10-24 19:57 ` James Bottomley
@ 2004-10-24 20:06   ` Thomas Gleixner
  2004-10-24 23:01     ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Gleixner @ 2004-10-24 20:06 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Morton, Ingo Molnar, LKML, SCSI

On Sun, 2004-10-24 at 15:57 -0400, James Bottomley wrote:
> On Wed, 2004-10-20 at 15:29, Thomas Gleixner wrote:
> > 
> > Use wait_event instead of semaphores. Semaphores are slower
> > and trigger owner conflicts during semaphore debugging.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Acked-by: Ingo Molnar <mingo@elte.hu>
> > ---
> 
> There's something deeply wrong with this.  It causes a boot hang in my
> scsi test systems.

Hmm, strange. It works on two systems here and others using this
modification had no problem either. 
I will check again.

tglx






* Re: [PATCH] SCSI: Replace semaphores with wait_even
  2004-10-24 20:06   ` Thomas Gleixner
@ 2004-10-24 23:01     ` James Bottomley
  2004-10-24 23:06       ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: James Bottomley @ 2004-10-24 23:01 UTC (permalink / raw)
  To: tglx; +Cc: Andrew Morton, Ingo Molnar, LKML, SCSI Mailing List

On Sun, 2004-10-24 at 16:06, Thomas Gleixner wrote:
> Hmm, strange. It works on two systems here and others using this
> modification had no problem either. 
> I will check again.

Yes, very strange given what the mistake is:

-               down_interruptible(&sem);
+               wait_event_interruptible(eh_wait, shost->eh_kill ||
+                               (shost->host_busy == shost->host_failed));

This condition is always true when the eh thread first starts because
the default quiescent state of a scsi host is

shost->host_busy = shost->host_failed = 0

so your change makes the eh_thread spin forever locking everything else
off the CPU.  On a UP system, this is a complete hang.
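To make the failure mode concrete, here is a minimal userspace sketch, not kernel code. The names `struct host_model`, `patch_wait_condition()`, `stricter_wait_condition()` and `passes_without_sleeping()` are hypothetical and exist only for illustration. The stricter condition is an assumption about one way to avoid the problem, not the fix from any tree mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the three Scsi_Host fields the wait looks at. */
struct host_model {
	int host_busy;
	int host_failed;
	bool eh_kill;
};

/* The condition the patch hands to wait_event_interruptible(). */
static bool patch_wait_condition(const struct host_model *h)
{
	return h->eh_kill || (h->host_busy == h->host_failed);
}

/* Assumed alternative: only wake when there is a failed command to handle. */
static bool stricter_wait_condition(const struct host_model *h)
{
	return h->eh_kill ||
	       (h->host_failed != 0 && h->host_busy == h->host_failed);
}

/*
 * Model of the error handler loop: a wait whose condition is already
 * true returns at once, the handler finds nothing to do, and the loop
 * comes back to the same wait with the host state unchanged.  Returns
 * how many times that happens before the thread would sleep, capped at
 * max_passes.
 */
static int passes_without_sleeping(const struct host_model *h,
				   bool (*cond)(const struct host_model *),
				   int max_passes)
{
	int passes = 0;

	assert(h != NULL && cond != NULL);
	while (passes < max_passes && cond(h)) {
		if (h->eh_kill)
			break;	/* the real loop exits when asked to die */
		passes++;
	}
	return passes;
}
```

For an idle host (`host_busy == host_failed == 0`), the patch condition never lets the model sleep, so the pass count always reaches the cap. The stricter condition yields zero passes until a command has actually failed.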

James




* Re: [PATCH] SCSI: Replace semaphores with wait_even
  2004-10-24 23:01     ` James Bottomley
@ 2004-10-24 23:06       ` Ingo Molnar
  2004-10-25  0:02         ` James Bottomley
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2004-10-24 23:06 UTC (permalink / raw)
  To: James Bottomley; +Cc: tglx, Andrew Morton, LKML, SCSI Mailing List


* James Bottomley <James.Bottomley@SteelEye.com> wrote:

> On Sun, 2004-10-24 at 16:06, Thomas Gleixner wrote:
> > Hmm, strange. It works on two systems here and others using this
> > modification had no problem either. 
> > I will check again.
> 
> Yes, very strange given what the mistake is:
> 
> -               down_interruptible(&sem);
> +               wait_event_interruptible(eh_wait, shost->eh_kill ||
> +                               (shost->host_busy == shost->host_failed));
> 
> This condition is always true when the eh thread first starts because
> the default quiescent state of a scsi host is
> 
> shost->host_busy = shost->host_failed = 0
> 
> so your change makes the eh_thread spin forever locking everything
> else off the CPU.  On a UP system, this is a complete hang.

i think i fixed this in my PREEMPT_REALTIME tree (having seen spinning
eh_threads) - maybe Thomas forgot to merge those fixes back?

(in a PREEMPT_REALTIME kernel a spinning thread is just a thread eating
up CPU power, it doesn't cause a hang.)

	Ingo


* Re: [PATCH] SCSI: Replace semaphores with wait_even
  2004-10-24 23:06       ` Ingo Molnar
@ 2004-10-25  0:02         ` James Bottomley
  0 siblings, 0 replies; 6+ messages in thread
From: James Bottomley @ 2004-10-25  0:02 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: tglx, Andrew Morton, LKML, SCSI Mailing List

On Sun, 2004-10-24 at 19:06, Ingo Molnar wrote:
> i think i fixed this in my PREEMPT_REALTIME tree (having seen spinning
> eh_threads) - maybe Thomas forgot to merge those fixes back?
> 
> (in a PREEMPT_REALTIME kernel a spinning thread is just a thread eating
> up CPU power, it doesn't cause a hang.)

I've really got to say, I don't like what you're doing.  This program
seems to replace

if (condition)
	up(sem);
[...]
down(sem);


with

if (condition)
	wake_up(event_queue);
[...]
wait_event(event_queue, condition);

That can be wrong on three counts:

1) The condition is a local one that will fluctuate between the wake_up
and the actual thread being scheduled to run
2) The actual condition you need to check for might not be the same as
the one that triggered the wake_up.  This is what the SCSI problem was.
3) There might genuinely be n triggers of the event.  With a semaphore,
if we get three up()'s before the waiting thread is scheduled, it will
process three times.  With wait_event, the other two will be lost.

Thus, to effect this replacement, you need a thorough audit of what is
usually pretty non-trivial code.
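
Point 3 can be sketched with a small single-threaded userspace model. Everything here is hypothetical illustration: `struct sem_model` and `struct event_model` are simplified stand-ins, not the kernel's semaphore or wait-queue implementations, and the event model assumes the consumer clears a single boolean condition after each pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a counting semaphore: every up() is remembered in the count. */
struct sem_model {
	int count;
};

static void model_up(struct sem_model *s)
{
	s->count++;
}

/* Non-blocking down(): true if a token was available and consumed. */
static bool model_try_down(struct sem_model *s)
{
	if (s->count == 0)
		return false;
	s->count--;
	return true;
}

/* Model of a wait queue guarded by one boolean condition. */
struct event_model {
	bool condition;
};

/* A wake-up leaves no record beyond the condition being true. */
static void model_wake_up(struct event_model *e)
{
	e->condition = true;
}

/* Non-blocking wait: true if the condition held; the consumer clears it. */
static bool model_try_wait_event(struct event_model *e)
{
	if (!e->condition)
		return false;
	e->condition = false;
	return true;
}

/* Fire `triggers` up()s before the waiter runs, then count its passes. */
static int passes_with_semaphore(int triggers)
{
	struct sem_model s = { 0 };
	int passes = 0;

	assert(triggers >= 0);
	for (int i = 0; i < triggers; i++)
		model_up(&s);
	while (model_try_down(&s))
		passes++;
	return passes;
}

/* Same experiment with wake-ups on the event model. */
static int passes_with_wait_event(int triggers)
{
	struct event_model e = { false };
	int passes = 0;

	assert(triggers >= 0);
	for (int i = 0; i < triggers; i++)
		model_wake_up(&e);
	while (model_try_wait_event(&e))
		passes++;
	return passes;
}
```

With three triggers delivered before the waiter runs, the semaphore model processes three passes and the event model processes one. Whether that difference matters depends on whether the consumer handles all pending work in a single pass, which is part of what the audit has to establish.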

What's the overriding reason for doing this?  The pain doesn't look to
be worth the gain (since I don't see any gain).

James



