public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Andrew Patterson <andrew.patterson@hp.com>
Cc: linux-kernel@vger.kernel.org, linux-scsi@vger.kernel.org,
	mike.miller@hp.com, jens.axboe@oracle.com
Subject: Re: [PATCH 2/3] cciss: use only one scan thread
Date: Wed, 15 Jul 2009 15:06:37 -0700	[thread overview]
Message-ID: <20090715150637.6f171759.akpm@linux-foundation.org> (raw)
In-Reply-To: <20090714220250.22282.69385.stgit@bob.kio>

On Tue, 14 Jul 2009 16:02:50 -0600
Andrew Patterson <andrew.patterson@hp.com> wrote:

> cciss: use only one scan thread
> 
> Replace the use of one scan kthread per controller with one per driver.
> Use a queue to hold a list of controllers that need to be rescanned with
> routines to add and remove controllers from the queue.
> Fix locking and completion handling to prevent a hang during rmmod.
> 
>
> ...
>
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -39,6 +39,7 @@
>  #include <linux/hdreg.h>
>  #include <linux/spinlock.h>
>  #include <linux/compat.h>
> +#include <linux/mutex.h>
>  #include <asm/uaccess.h>
>  #include <asm/io.h>
>  
> @@ -155,6 +156,10 @@ static struct board_type products[] = {
>  
>  static ctlr_info_t *hba[MAX_CTLR];
>  
> +static struct task_struct *cciss_scan_thread;
> +static struct mutex scan_mutex;
> +static struct list_head scan_q;

Both of the above can be initialised at compile-time.  This is a bit
neater and prevents reviewers from having to run around checking that
they got initialised early enough.

>  static void do_cciss_request(struct request_queue *q);
>  static irqreturn_t do_cciss_intr(int irq, void *dev_id);
>  static int cciss_open(struct block_device *bdev, fmode_t mode);
> @@ -189,6 +194,8 @@ static int sendcmd_withirq_core(ctlr_info_t *h, CommandList_struct *c,
>  static int process_sendcmd_error(ctlr_info_t *h, CommandList_struct *c);
>  
>  static void fail_all_cmds(unsigned long ctlr);
> +static int add_to_scan_list(struct ctlr_info *h);
> +static void remove_from_scan_list(struct ctlr_info *h);

These forward declarations are unneeded.

>  static int scan_thread(void *data);
>  static int check_for_unit_attention(ctlr_info_t *h, CommandList_struct *c);
>  
> @@ -3232,20 +3239,121 @@ static irqreturn_t do_cciss_intr(int irq, void *dev_id)
>  	return IRQ_HANDLED;
>  }
>  
> -static int scan_thread(void *data)
> +/**
> + * add_to_scan_list() - add controller to rescan queue
> + * @h:		      Pointer to the controller.
> + *
> + * Adds the controller to the rescan queue if not already on the queue.
> + *
> + * returns 1 if added to the queue, 0 if skipped (could be on the
> + * queue already, or the controller could be initializing or shutting
> + * down).
> + **/
> +static int add_to_scan_list(struct ctlr_info *h)
>  {
> -	ctlr_info_t *h = data;
> -	int rc;
> -	DECLARE_COMPLETION_ONSTACK(wait);
> -	h->rescan_wait = &wait;
> +	struct ctlr_info *test_h;
> +	int found = 0;
> +	int ret = 0;
> +
> +	if (h->busy_initializing)
> +		return 0;

This looks pretty random and racy.

For example, what stops busy_initializing from becoming true one
picosecond after the test?

> +	if (!mutex_trylock(&h->busy_shutting_down))
> +		return 0;
>  
> -	for (;;) {
> -		rc = wait_for_completion_interruptible(&wait);
> -		if (kthread_should_stop())
> +	mutex_lock(&scan_mutex);
> +	list_for_each_entry(test_h, &scan_q, scan_list) {
> +		if (test_h == h) {
> +			found = 1;
>  			break;
> -		if (!rc)
> +		}
> +	}
> +	if (!found && !h->busy_scanning) {
> +		INIT_COMPLETION(h->scan_wait);
> +		list_add_tail(&h->scan_list, &scan_q);
> +		ret = 1;
> +	}
> +	mutex_unlock(&scan_mutex);
> +	mutex_unlock(&h->busy_shutting_down);
> +
> +	return ret;
> +}

We already did init_completion(h->scan_wait) at startup time?  Doing
the INIT_COMPLETION() here looks unusual and is hopefully unnecessary.

> +/**
> + * scan_thread() - kernel thread used to rescan controllers
> + * @data:	 Ignored.
> + *
> + * A kernel thread used scan for drive topology changes on
> + * controllers. The thread processes only one controller at a time
> + * using a queue.  Controllers are added to the queue using
> + * add_to_scan_list() and removed from the queue either after done
> + * processing or using remove_from_scan_list().
> + *
> + * returns 0.
> + **/
> +static int scan_thread(void *data)
> +{
> +	struct ctlr_info *h;
> +
> +	set_current_state(TASK_INTERRUPTIBLE);
> +	while (!kthread_should_stop()) {
> +		mutex_lock(&scan_mutex);
> +		if (list_empty(&scan_q)) {
> +			h = NULL;
> +		} else {
> +			h = list_entry(scan_q.next,
> +				       struct ctlr_info,
> +				       scan_list);
> +			list_del(&h->scan_list);
> +			h->busy_scanning = 1;
> +		}
> +		mutex_unlock(&scan_mutex);
> +
> +		if (h) {
> +			__set_current_state(TASK_RUNNING);
>  			rebuild_lun_table(h, 0);
> +			complete_all(&h->scan_wait);
> +			mutex_lock(&scan_mutex);
> +			h->busy_scanning = 0;
> +			mutex_unlock(&scan_mutex);
> +			set_current_state(TASK_INTERRUPTIBLE);
> +		} else {
> +			schedule();
> +			set_current_state(TASK_INTERRUPTIBLE);
> +			continue;
> +		}
>  	}
> +
> +	__set_current_state(TASK_RUNNING);
>  	return 0;
>  }

This code is somewhat wrong.  Look:

	set_current_state(TASK_INTERRUPTIBLE);
	...
		mutex_lock(&scan_mutex);

now, we don't know what state the task is in.  If the mutex_lock()
slept the we're in state TASK_RUNNING.  If the mutex_lock() didn't
sleep then we're in state <mumble>.  Where
<mumble>==TASK_INTERRUPTIBLE, but that's an implementation fluke.

So I'd say that some rethinking/restructuring is needed here.



  reply	other threads:[~2009-07-15 22:06 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-14 22:02 [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Patterson
2009-07-14 22:02 ` [PATCH 1/3] cciss: remove logical drive sysfs entries during driver cleanup Andrew Patterson
2009-07-14 22:02 ` [PATCH 2/3] cciss: use only one scan thread Andrew Patterson
2009-07-15 22:06   ` Andrew Morton [this message]
2009-07-16  4:15     ` Andrew Patterson
2009-07-16  4:33       ` Andrew Morton
2009-07-14 22:02 ` [PATCH 3/3] cciss: kick off logical drive topology rescan through sysfs Andrew Patterson
2009-07-15 22:08 ` [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Morton
2009-07-16  5:45   ` Andrew Patterson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090715150637.6f171759.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=andrew.patterson@hp.com \
    --cc=jens.axboe@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=mike.miller@hp.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox