* [PATCH 0/3] cciss rmmod/scan-thread fixes
@ 2009-07-14 22:02 Andrew Patterson
2009-07-14 22:02 ` [PATCH 1/3] cciss: remove logical drive sysfs entries during driver cleanup Andrew Patterson
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Andrew Patterson @ 2009-07-14 22:02 UTC (permalink / raw)
To: linux-kernel; +Cc: akpm, linux-scsi, mike.miller, jens.axboe
The following series fixes several problems causing hangs while
rmmoding the cciss driver. In the process of fixing these hangs, I
also reworked the the logical drive scanning kernel thread to use one
thread per driver rather than one thread per controller. I also added
a sysfs attribute to kick off a controller scan.
.../ABI/testing/sysfs-bus-pci-devices-cciss | 7 +
drivers/block/cciss.c | 198 ++++++++++++++++++--
drivers/block/cciss.h | 7 +
3 files changed, 188 insertions(+), 24 deletions(-)
Andrew Patterson (3):
cciss: remove logical drive sysfs entries during driver cleanup.
cciss: use only one scan thread
cciss: kick off logical drive topology rescan through sysfs
--
Andrew Patterson
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH 1/3] cciss: remove logical drive sysfs entries during driver cleanup. 2009-07-14 22:02 [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Patterson @ 2009-07-14 22:02 ` Andrew Patterson 2009-07-14 22:02 ` [PATCH 2/3] cciss: use only one scan thread Andrew Patterson ` (2 subsequent siblings) 3 siblings, 0 replies; 9+ messages in thread From: Andrew Patterson @ 2009-07-14 22:02 UTC (permalink / raw) To: linux-kernel; +Cc: akpm, linux-scsi, mike.miller, jens.axboe cciss: remove logical drive sysfs entries during driver cleanup. Sysfs entries for logical drives need to be removed when a drive is deleted during driver cleanup. Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> --- diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c index a52cc7f..970c896 100644 --- a/drivers/block/cciss.c +++ b/drivers/block/cciss.c @@ -1977,7 +1977,6 @@ static int rebuild_lun_table(ctlr_info_t *h, int first_time) h->drv[i].busy_configuring = 1; spin_unlock_irqrestore(CCISS_LOCK(h->ctlr), flags); return_code = deregister_disk(h, i, 1); - cciss_destroy_ld_sysfs_entry(&h->drv[i]); h->drv[i].busy_configuring = 0; } } @@ -2118,6 +2117,7 @@ static int deregister_disk(ctlr_info_t *h, int drv_index, * indicate that this element of the drive * array is free. */ + cciss_destroy_ld_sysfs_entry(drv); if (clear_all) { /* check to see if it was the last disk */ @@ -4141,6 +4141,9 @@ static void __devexit cciss_remove_one(struct pci_dev *pdev) if (q) blk_cleanup_queue(q); } + if (hba[i]->drv[j].raid_level != -1) + cciss_destroy_ld_sysfs_entry(&hba[i]->drv[j]); + } #ifdef CONFIG_CISS_SCSI_TAPE ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] cciss: use only one scan thread 2009-07-14 22:02 [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Patterson 2009-07-14 22:02 ` [PATCH 1/3] cciss: remove logical drive sysfs entries during driver cleanup Andrew Patterson @ 2009-07-14 22:02 ` Andrew Patterson 2009-07-15 22:06 ` Andrew Morton 2009-07-14 22:02 ` [PATCH 3/3] cciss: kick off logical drive topology rescan through sysfs Andrew Patterson 2009-07-15 22:08 ` [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Morton 3 siblings, 1 reply; 9+ messages in thread From: Andrew Patterson @ 2009-07-14 22:02 UTC (permalink / raw) To: linux-kernel; +Cc: akpm, linux-scsi, mike.miller, jens.axboe cciss: use only one scan thread Replace the use of one scan kthread per controller with one per driver. Use a queue to hold a list of controllers that need to be rescanned with routines to add and remove controllers from the queue. Fix locking and completion handling to prevent a hang during rmmod. Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> --- diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c index 970c896..c6bba77 100644 --- a/drivers/block/cciss.c +++ b/drivers/block/cciss.c @@ -39,6 +39,7 @@ #include <linux/hdreg.h> #include <linux/spinlock.h> #include <linux/compat.h> +#include <linux/mutex.h> #include <asm/uaccess.h> #include <asm/io.h> @@ -155,6 +156,10 @@ static struct board_type products[] = { static ctlr_info_t *hba[MAX_CTLR]; +static struct task_struct *cciss_scan_thread; +static struct mutex scan_mutex; +static struct list_head scan_q; + static void do_cciss_request(struct request_queue *q); static irqreturn_t do_cciss_intr(int irq, void *dev_id); static int cciss_open(struct block_device *bdev, fmode_t mode); @@ -189,6 +194,8 @@ static int sendcmd_withirq_core(ctlr_info_t *h, CommandList_struct *c, static int process_sendcmd_error(ctlr_info_t *h, CommandList_struct *c); static void fail_all_cmds(unsigned long ctlr); +static int add_to_scan_list(struct ctlr_info *h); +static void remove_from_scan_list(struct ctlr_info *h); static int scan_thread(void *data); static int check_for_unit_attention(ctlr_info_t *h, CommandList_struct *c); @@ -3232,20 +3239,121 @@ static irqreturn_t do_cciss_intr(int irq, void *dev_id) return IRQ_HANDLED; } -static int scan_thread(void *data) +/** + * add_to_scan_list() - add controller to rescan queue + * @h: Pointer to the controller. + * + * Adds the controller to the rescan queue if not already on the queue. + * + * returns 1 if added to the queue, 0 if skipped (could be on the + * queue already, or the controller could be initializing or shutting + * down). + **/ +static int add_to_scan_list(struct ctlr_info *h) { - ctlr_info_t *h = data; - int rc; - DECLARE_COMPLETION_ONSTACK(wait); - h->rescan_wait = &wait; + struct ctlr_info *test_h; + int found = 0; + int ret = 0; + + if (h->busy_initializing) + return 0; + + if (!mutex_trylock(&h->busy_shutting_down)) + return 0; - for (;;) { - rc = wait_for_completion_interruptible(&wait); - if (kthread_should_stop()) + mutex_lock(&scan_mutex); + list_for_each_entry(test_h, &scan_q, scan_list) { + if (test_h == h) { + found = 1; break; - if (!rc) + } + } + if (!found && !h->busy_scanning) { + INIT_COMPLETION(h->scan_wait); + list_add_tail(&h->scan_list, &scan_q); + ret = 1; + } + mutex_unlock(&scan_mutex); + mutex_unlock(&h->busy_shutting_down); + + return ret; +} + +/** + * remove_from_scan_list() - remove controller from rescan queue + * @h: Pointer to the controller. + * + * Removes the controller from the rescan queue if present. Blocks if + * the controller is currently conducting a rescan. + **/ +static void remove_from_scan_list(struct ctlr_info *h) +{ + struct ctlr_info *test_h, *tmp_h; + int scanning = 0; + + mutex_lock(&scan_mutex); + list_for_each_entry_safe(test_h, tmp_h, &scan_q, scan_list) { + if (test_h == h) { + list_del(&h->scan_list); + complete_all(&h->scan_wait); + mutex_unlock(&scan_mutex); + return; + } + } + if (&h->busy_scanning) + scanning = 0; + mutex_unlock(&scan_mutex); + + if (scanning) + wait_for_completion(&h->scan_wait); +} + +/** + * scan_thread() - kernel thread used to rescan controllers + * @data: Ignored. + * + * A kernel thread used scan for drive topology changes on + * controllers. The thread processes only one controller at a time + * using a queue. Controllers are added to the queue using + * add_to_scan_list() and removed from the queue either after done + * processing or using remove_from_scan_list(). + * + * returns 0. + **/ +static int scan_thread(void *data) +{ + struct ctlr_info *h; + + set_current_state(TASK_INTERRUPTIBLE); + while (!kthread_should_stop()) { + mutex_lock(&scan_mutex); + if (list_empty(&scan_q)) { + h = NULL; + } else { + h = list_entry(scan_q.next, + struct ctlr_info, + scan_list); + list_del(&h->scan_list); + h->busy_scanning = 1; + } + mutex_unlock(&scan_mutex); + + if (h) { + __set_current_state(TASK_RUNNING); rebuild_lun_table(h, 0); + complete_all(&h->scan_wait); + mutex_lock(&scan_mutex); + h->busy_scanning = 0; + mutex_unlock(&scan_mutex); + set_current_state(TASK_INTERRUPTIBLE); + } else { + schedule(); + set_current_state(TASK_INTERRUPTIBLE); + continue; + } } + + __set_current_state(TASK_RUNNING); return 0; } @@ -3268,8 +3376,8 @@ static int check_for_unit_attention(ctlr_info_t *h, CommandList_struct *c) case REPORT_LUNS_CHANGED: printk(KERN_WARNING "cciss%d: report LUN data " "changed\n", h->ctlr); - if (h->rescan_wait) - complete(h->rescan_wait); + add_to_scan_list(h); + wake_up_process(cciss_scan_thread); return 1; break; case POWER_OR_RESET: @@ -3918,6 +4026,7 @@ static int __devinit cciss_init_one(struct pci_dev *pdev, hba[i]->busy_initializing = 1; INIT_HLIST_HEAD(&hba[i]->cmpQ); INIT_HLIST_HEAD(&hba[i]->reqQ); + mutex_init(&hba[i]->busy_shutting_down); if (cciss_pci_init(hba[i], pdev) != 0) goto clean0; @@ -3926,6 +4035,8 @@ static int __devinit cciss_init_one(struct pci_dev *pdev, hba[i]->ctlr = i; hba[i]->pdev = pdev; + init_completion(&hba[i]->scan_wait); + if (cciss_create_hba_sysfs_entry(hba[i])) goto clean0; @@ -4037,10 +4148,6 @@ static int __devinit cciss_init_one(struct pci_dev *pdev, hba[i]->busy_initializing = 0; rebuild_lun_table(hba[i], 1); - hba[i]->cciss_scan_thread = kthread_run(scan_thread, hba[i], - "cciss_scan%02d", i); - if (IS_ERR(hba[i]->cciss_scan_thread)) - return PTR_ERR(hba[i]->cciss_scan_thread); return 1; @@ -4125,8 +4232,9 @@ static void __devexit cciss_remove_one(struct pci_dev *pdev) return; } - kthread_stop(hba[i]->cciss_scan_thread); + mutex_lock(&hba[i]->busy_shutting_down); + remove_from_scan_list(hba[i]); remove_proc_entry(hba[i]->devname, proc_cciss); unregister_blkdev(hba[i]->major, hba[i]->devname); @@ -4173,6 +4281,7 @@ static void __devexit cciss_remove_one(struct pci_dev *pdev) pci_release_regions(pdev); pci_set_drvdata(pdev, NULL); cciss_destroy_hba_sysfs_entry(hba[i]); + mutex_unlock(&hba[i]->busy_shutting_down); free_hba(i); } @@ -4205,15 +4314,27 @@ static int __init cciss_init(void) if (err) return err; + /* Start the scan thread */ + mutex_init(&scan_mutex); + INIT_LIST_HEAD(&scan_q); + cciss_scan_thread = kthread_run(scan_thread, NULL, "cciss_scan"); + if (IS_ERR(cciss_scan_thread)) { + err = PTR_ERR(cciss_scan_thread); + goto err_bus_unregister; + } + /* Register for our PCI devices */ err = pci_register_driver(&cciss_pci_driver); if (err) - goto err_bus_register; + goto err_thread_stop; return 0; -err_bus_register: +err_thread_stop: + kthread_stop(cciss_scan_thread); +err_bus_unregister: bus_unregister(&cciss_bus_type); + return err; } @@ -4230,6 +4351,7 @@ static void __exit cciss_cleanup(void) cciss_remove_one(hba[i]->pdev); } } + kthread_stop(cciss_scan_thread); remove_proc_entry("driver/cciss", NULL); bus_unregister(&cciss_bus_type); } diff --git a/drivers/block/cciss.h b/drivers/block/cciss.h index 06a5db2..859e0e0 100644 --- a/drivers/block/cciss.h +++ b/drivers/block/cciss.h @@ -2,6 +2,7 @@ #define CCISS_H #include <linux/genhd.h> +#include <linux/mutex.h> #include "cciss_cmd.h" @@ -108,6 +109,8 @@ struct ctlr_info int nr_frees; int busy_configuring; int busy_initializing; + int busy_scanning; + struct mutex busy_shutting_down; /* This element holds the zero based queue number of the last * queue to be started. It is used for fairness. @@ -122,8 +125,8 @@ struct ctlr_info /* and saved for later processing */ #endif unsigned char alive; - struct completion *rescan_wait; - struct task_struct *cciss_scan_thread; + struct list_head scan_list; + struct completion scan_wait; struct device dev; }; ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] cciss: use only one scan thread 2009-07-14 22:02 ` [PATCH 2/3] cciss: use only one scan thread Andrew Patterson @ 2009-07-15 22:06 ` Andrew Morton 2009-07-16 4:15 ` Andrew Patterson 0 siblings, 1 reply; 9+ messages in thread From: Andrew Morton @ 2009-07-15 22:06 UTC (permalink / raw) To: Andrew Patterson; +Cc: linux-kernel, linux-scsi, mike.miller, jens.axboe On Tue, 14 Jul 2009 16:02:50 -0600 Andrew Patterson <andrew.patterson@hp.com> wrote: > cciss: use only one scan thread > > Replace the use of one scan kthread per controller with one per driver. > Use a queue to hold a list of controllers that need to be rescanned with > routines to add and remove controllers from the queue. > Fix locking and completion handling to prevent a hang during rmmod. > > > ... > > --- a/drivers/block/cciss.c > +++ b/drivers/block/cciss.c > @@ -39,6 +39,7 @@ > #include <linux/hdreg.h> > #include <linux/spinlock.h> > #include <linux/compat.h> > +#include <linux/mutex.h> > #include <asm/uaccess.h> > #include <asm/io.h> > > @@ -155,6 +156,10 @@ static struct board_type products[] = { > > static ctlr_info_t *hba[MAX_CTLR]; > > +static struct task_struct *cciss_scan_thread; > +static struct mutex scan_mutex; > +static struct list_head scan_q; Both of the above can be initialised at compile-time. This is a bit neater and prevents reviewers from having to run around checking that they got initialised early enough. > static void do_cciss_request(struct request_queue *q); > static irqreturn_t do_cciss_intr(int irq, void *dev_id); > static int cciss_open(struct block_device *bdev, fmode_t mode); > @@ -189,6 +194,8 @@ static int sendcmd_withirq_core(ctlr_info_t *h, CommandList_struct *c, > static int process_sendcmd_error(ctlr_info_t *h, CommandList_struct *c); > > static void fail_all_cmds(unsigned long ctlr); > +static int add_to_scan_list(struct ctlr_info *h); > +static void remove_from_scan_list(struct ctlr_info *h); These forward declarations are unneeded. > static int scan_thread(void *data); > static int check_for_unit_attention(ctlr_info_t *h, CommandList_struct *c); > > @@ -3232,20 +3239,121 @@ static irqreturn_t do_cciss_intr(int irq, void *dev_id) > return IRQ_HANDLED; > } > > -static int scan_thread(void *data) > +/** > + * add_to_scan_list() - add controller to rescan queue > + * @h: Pointer to the controller. > + * > + * Adds the controller to the rescan queue if not already on the queue. > + * > + * returns 1 if added to the queue, 0 if skipped (could be on the > + * queue already, or the controller could be initializing or shutting > + * down). > + **/ > +static int add_to_scan_list(struct ctlr_info *h) > { > - ctlr_info_t *h = data; > - int rc; > - DECLARE_COMPLETION_ONSTACK(wait); > - h->rescan_wait = &wait; > + struct ctlr_info *test_h; > + int found = 0; > + int ret = 0; > + > + if (h->busy_initializing) > + return 0; This looks pretty random and racy. For example, what stops busy_initializing from becoming true one picosecond after the test? > + if (!mutex_trylock(&h->busy_shutting_down)) > + return 0; > > - for (;;) { > - rc = wait_for_completion_interruptible(&wait); > - if (kthread_should_stop()) > + mutex_lock(&scan_mutex); > + list_for_each_entry(test_h, &scan_q, scan_list) { > + if (test_h == h) { > + found = 1; > break; > - if (!rc) > + } > + } > + if (!found && !h->busy_scanning) { > + INIT_COMPLETION(h->scan_wait); > + list_add_tail(&h->scan_list, &scan_q); > + ret = 1; > + } > + mutex_unlock(&scan_mutex); > + mutex_unlock(&h->busy_shutting_down); > + > + return ret; > +} We already did init_completion(h->scan_wait) at startup time? Doing the INIT_COMPLETION() here looks unusual and is hopefully unnecessary. > +/** > + * scan_thread() - kernel thread used to rescan controllers > + * @data: Ignored. > + * > + * A kernel thread used scan for drive topology changes on > + * controllers. The thread processes only one controller at a time > + * using a queue. Controllers are added to the queue using > + * add_to_scan_list() and removed from the queue either after done > + * processing or using remove_from_scan_list(). > + * > + * returns 0. > + **/ > +static int scan_thread(void *data) > +{ > + struct ctlr_info *h; > + > + set_current_state(TASK_INTERRUPTIBLE); > + while (!kthread_should_stop()) { > + mutex_lock(&scan_mutex); > + if (list_empty(&scan_q)) { > + h = NULL; > + } else { > + h = list_entry(scan_q.next, > + struct ctlr_info, > + scan_list); > + list_del(&h->scan_list); > + h->busy_scanning = 1; > + } > + mutex_unlock(&scan_mutex); > + > + if (h) { > + __set_current_state(TASK_RUNNING); > rebuild_lun_table(h, 0); > + complete_all(&h->scan_wait); > + mutex_lock(&scan_mutex); > + h->busy_scanning = 0; > + mutex_unlock(&scan_mutex); > + set_current_state(TASK_INTERRUPTIBLE); > + } else { > + schedule(); > + set_current_state(TASK_INTERRUPTIBLE); > + continue; > + } > } > + > + __set_current_state(TASK_RUNNING); > return 0; > } This code is somewhat wrong. Look: set_current_state(TASK_INTERRUPTIBLE); ... mutex_lock(&scan_mutex); now, we don't know what state the task is in. If the mutex_lock() slept the we're in state TASK_RUNNING. If the mutex_lock() didn't sleep then we're in state <mumble>. Where <mumble>==TASK_INTERRUPTIBLE, but that's an implementation fluke. So I'd say that some rethinking/restructuring is needed here. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] cciss: use only one scan thread 2009-07-15 22:06 ` Andrew Morton @ 2009-07-16 4:15 ` Andrew Patterson 2009-07-16 4:33 ` Andrew Morton 0 siblings, 1 reply; 9+ messages in thread From: Andrew Patterson @ 2009-07-16 4:15 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, linux-scsi, mike.miller, jens.axboe On Wed, 2009-07-15 at 15:06 -0700, Andrew Morton wrote: > On Tue, 14 Jul 2009 16:02:50 -0600 > Andrew Patterson <andrew.patterson@hp.com> wrote: > > > cciss: use only one scan thread > > > > Replace the use of one scan kthread per controller with one per driver. > > Use a queue to hold a list of controllers that need to be rescanned with > > routines to add and remove controllers from the queue. > > Fix locking and completion handling to prevent a hang during rmmod. > > > > > > ... > > > > --- a/drivers/block/cciss.c > > +++ b/drivers/block/cciss.c > > @@ -39,6 +39,7 @@ > > #include <linux/hdreg.h> > > #include <linux/spinlock.h> > > #include <linux/compat.h> > > +#include <linux/mutex.h> > > #include <asm/uaccess.h> > > #include <asm/io.h> > > > > @@ -155,6 +156,10 @@ static struct board_type products[] = { > > > > static ctlr_info_t *hba[MAX_CTLR]; > > > > +static struct task_struct *cciss_scan_thread; > > +static struct mutex scan_mutex; > > +static struct list_head scan_q; > > Both of the above can be initialised at compile-time. This is a bit > neater and prevents reviewers from having to run around checking that > they got initialised early enough. Fixed. > > > static void do_cciss_request(struct request_queue *q); > > static irqreturn_t do_cciss_intr(int irq, void *dev_id); > > static int cciss_open(struct block_device *bdev, fmode_t mode); > > @@ -189,6 +194,8 @@ static int sendcmd_withirq_core(ctlr_info_t *h, CommandList_struct *c, > > static int process_sendcmd_error(ctlr_info_t *h, CommandList_struct *c); > > > > static void fail_all_cmds(unsigned long ctlr); > > +static int add_to_scan_list(struct ctlr_info *h); > > +static void remove_from_scan_list(struct ctlr_info *h); > > These forward declarations are unneeded. > Fixed. > > static int scan_thread(void *data); > > static int check_for_unit_attention(ctlr_info_t *h, CommandList_struct *c); > > > > @@ -3232,20 +3239,121 @@ static irqreturn_t do_cciss_intr(int irq, void *dev_id) > > return IRQ_HANDLED; > > } > > > > -static int scan_thread(void *data) > > +/** > > + * add_to_scan_list() - add controller to rescan queue > > + * @h: Pointer to the controller. > > + * > > + * Adds the controller to the rescan queue if not already on the queue. > > + * > > + * returns 1 if added to the queue, 0 if skipped (could be on the > > + * queue already, or the controller could be initializing or shutting > > + * down). > > + **/ > > +static int add_to_scan_list(struct ctlr_info *h) > > { > > - ctlr_info_t *h = data; > > - int rc; > > - DECLARE_COMPLETION_ONSTACK(wait); > > - h->rescan_wait = &wait; > > + struct ctlr_info *test_h; > > + int found = 0; > > + int ret = 0; > > + > > + if (h->busy_initializing) > > + return 0; > > This looks pretty random and racy. Fixed. Converted busy_initializer to a mutex and will use: if (mutex_is_locked(&h->busy_initializing)) return 0; > > For example, what stops busy_initializing from becoming true one > picosecond after the test? > > > + if (!mutex_trylock(&h->busy_shutting_down)) > > + return 0; > > > > - for (;;) { > > - rc = wait_for_completion_interruptible(&wait); > > - if (kthread_should_stop()) > > + mutex_lock(&scan_mutex); > > + list_for_each_entry(test_h, &scan_q, scan_list) { > > + if (test_h == h) { > > + found = 1; > > break; > > - if (!rc) > > + } > > + } > > + if (!found && !h->busy_scanning) { > > + INIT_COMPLETION(h->scan_wait); > > + list_add_tail(&h->scan_list, &scan_q); > > + ret = 1; > > + } > > + mutex_unlock(&scan_mutex); > > + mutex_unlock(&h->busy_shutting_down); > > + > > + return ret; > > +} > > We already did init_completion(h->scan_wait) at startup time? Doing > the INIT_COMPLETION() here looks unusual and is hopefully unnecessary. > I am following the text in the "Linux Device Drivers 3rd Edition section 5.4: "A completion is normally a one-shot device; it is used once then discarded. It is possible, however, to reuse completion structures if proper care is taken. If complete_all is not used, a completion structure can be reused without any problems as long as there is no ambiguity about what event is being signalled. If you use complete_all, however, you must reinitialize the completion structure before reusing it. The macro: INIT_COMPLETION(struct completion c); can be used to quickly perform this reinitialization." It there are better way to do this? INIT_COMPLETION() currently just sets done = 0. > > +/** > > + * scan_thread() - kernel thread used to rescan controllers > > + * @data: Ignored. > > + * > > + * A kernel thread used scan for drive topology changes on > > + * controllers. The thread processes only one controller at a time > > + * using a queue. Controllers are added to the queue using > > + * add_to_scan_list() and removed from the queue either after done > > + * processing or using remove_from_scan_list(). > > + * > > + * returns 0. > > + **/ > > +static int scan_thread(void *data) > > +{ > > + struct ctlr_info *h; > > + > > + set_current_state(TASK_INTERRUPTIBLE); > > + while (!kthread_should_stop()) { > > + mutex_lock(&scan_mutex); > > + if (list_empty(&scan_q)) { > > + h = NULL; > > + } else { > > + h = list_entry(scan_q.next, > > + struct ctlr_info, > > + scan_list); > > + list_del(&h->scan_list); > > + h->busy_scanning = 1; > > + } > > + mutex_unlock(&scan_mutex); > > + > > + if (h) { > > + __set_current_state(TASK_RUNNING); > > rebuild_lun_table(h, 0); > > + complete_all(&h->scan_wait); > > + mutex_lock(&scan_mutex); > > + h->busy_scanning = 0; > > + mutex_unlock(&scan_mutex); > > + set_current_state(TASK_INTERRUPTIBLE); > > + } else { > > + schedule(); > > + set_current_state(TASK_INTERRUPTIBLE); > > + continue; > > + } > > } > > + > > + __set_current_state(TASK_RUNNING); > > return 0; > > } > > This code is somewhat wrong. Look: > > set_current_state(TASK_INTERRUPTIBLE); > ... > mutex_lock(&scan_mutex); > > now, we don't know what state the task is in. If the mutex_lock() > slept the we're in state TASK_RUNNING. If the mutex_lock() didn't > sleep then we're in state <mumble>. Where > <mumble>==TASK_INTERRUPTIBLE, but that's an implementation fluke. > > So I'd say that some rethinking/restructuring is needed here. Will look into this. I have had a hard time removing this possible race (even the mutex_lock() does not remove it completely, but at most you get multiple completions) without using some sort of nested locks. > > -- Andrew Patterson Hewlett-Packard ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/3] cciss: use only one scan thread 2009-07-16 4:15 ` Andrew Patterson @ 2009-07-16 4:33 ` Andrew Morton 0 siblings, 0 replies; 9+ messages in thread From: Andrew Morton @ 2009-07-16 4:33 UTC (permalink / raw) To: Andrew Patterson; +Cc: linux-kernel, linux-scsi, mike.miller, jens.axboe On Thu, 16 Jul 2009 04:15:59 +0000 Andrew Patterson <andrew.patterson@hp.com> wrote: > > > +} > > > > We already did init_completion(h->scan_wait) at startup time? Doing > > the INIT_COMPLETION() here looks unusual and is hopefully unnecessary. > > > > I am following the text in the "Linux Device Drivers 3rd Edition section > 5.4: > > "A completion is normally a one-shot device; it is used once then > discarded. It is possible, however, to reuse completion structures if > proper care is taken. If complete_all is not used, a completion > structure can be reused without any problems as long as there is no > ambiguity about what event is being signalled. If you use complete_all, > however, you must reinitialize the completion structure before reusing > it. The macro: > > INIT_COMPLETION(struct completion c); > > can be used to quickly perform this reinitialization." > > It there are better way to do this? INIT_COMPLETION() currently just > sets done = 0. OK, I'd assumed that a full wait_for_completion()/complete() operation would leave the completion in ready-to-use-again state. But wait_for_completion_interruptible() can indeed leave completion.done in a non-zero state if it was interrupted by a signal. hm. Perhaps so the caller can run wait_for_completion*() again. ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 3/3] cciss: kick off logical drive topology rescan through sysfs 2009-07-14 22:02 [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Patterson 2009-07-14 22:02 ` [PATCH 1/3] cciss: remove logical drive sysfs entries during driver cleanup Andrew Patterson 2009-07-14 22:02 ` [PATCH 2/3] cciss: use only one scan thread Andrew Patterson @ 2009-07-14 22:02 ` Andrew Patterson 2009-07-15 22:08 ` [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Morton 3 siblings, 0 replies; 9+ messages in thread From: Andrew Patterson @ 2009-07-14 22:02 UTC (permalink / raw) To: linux-kernel; +Cc: akpm, linux-scsi, mike.miller, jens.axboe cciss: kick off logical drive topology rescan through sysfs Added /sys/bus/pci/devices/<dev>/ccissX/rescan sysfs entry used to kick off a rescan that discovers logical drive topology changes. Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Andrew Patterson <andrew.patterson@hp.com> --- diff --git a/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss b/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss index 0a92a7c..4050606 100644 --- a/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss +++ b/Documentation/ABI/testing/sysfs-bus-pci-devices-cciss @@ -31,3 +31,10 @@ Date: March 2009 Kernel Version: 2.6.30 Contact: iss_storagedev@hp.com Description: A symbolic link to /sys/block/cciss!cXdY + +Where: /sys/bus/pci/devices/<dev>/ccissX/rescan +Date: July 2009 +Kernel Version: 2.6.31 +Contact: iss_storagedev@hp.com +Description: Kicks of a rescan of the controller to discovery logical + drive topology changes. diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c index c6bba77..a409e26 100644 --- a/drivers/block/cciss.c +++ b/drivers/block/cciss.c @@ -461,9 +461,19 @@ static void __devinit cciss_procinit(int i) #define to_hba(n) container_of(n, struct ctlr_info, dev) #define to_drv(n) container_of(n, drive_info_struct, dev) -static struct device_type cciss_host_type = { - .name = "cciss_host", -}; +static ssize_t host_store_rescan(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct ctlr_info *h = to_hba(dev); + + add_to_scan_list(h); + wake_up_process(cciss_scan_thread); + wait_for_completion_interruptible(&h->scan_wait); + + return count; +} +DEVICE_ATTR(rescan, S_IWUSR, NULL, host_store_rescan); static ssize_t dev_show_unique_id(struct device *dev, struct device_attribute *attr, @@ -567,6 +577,25 @@ static ssize_t dev_show_rev(struct device *dev, } DEVICE_ATTR(rev, S_IRUGO, dev_show_rev, NULL); +static struct attribute *cciss_host_attrs[] = { + &dev_attr_rescan.attr, + NULL +}; + +static struct attribute_group cciss_host_attr_group = { + .attrs = cciss_host_attrs, +}; + +static struct attribute_group *cciss_host_attr_groups[] = { + &cciss_host_attr_group, + NULL +}; + +static struct device_type cciss_host_type = { + .name = "cciss_host", + .groups = cciss_host_attr_groups, +}; + static struct attribute *cciss_dev_attrs[] = { &dev_attr_unique_id.attr, &dev_attr_model.attr, ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 0/3] cciss rmmod/scan-thread fixes 2009-07-14 22:02 [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Patterson ` (2 preceding siblings ...) 2009-07-14 22:02 ` [PATCH 3/3] cciss: kick off logical drive topology rescan through sysfs Andrew Patterson @ 2009-07-15 22:08 ` Andrew Morton 2009-07-16 5:45 ` Andrew Patterson 3 siblings, 1 reply; 9+ messages in thread From: Andrew Morton @ 2009-07-15 22:08 UTC (permalink / raw) To: Andrew Patterson; +Cc: linux-kernel, linux-scsi, mike.miller, jens.axboe On Tue, 14 Jul 2009 16:02:40 -0600 Andrew Patterson <andrew.patterson@hp.com> wrote: > The following series fixes several problems causing hangs while > rmmoding the cciss driver. In the process of fixing these hangs, I > also reworked the the logical drive scanning kernel thread to use one > thread per driver rather than one thread per controller. I also added > a sysfs attribute to kick off a controller scan. > > .../ABI/testing/sysfs-bus-pci-devices-cciss | 7 + > drivers/block/cciss.c | 198 ++++++++++++++++++-- > drivers/block/cciss.h | 7 + > 3 files changed, 188 insertions(+), 24 deletions(-) > > Andrew Patterson (3): > cciss: remove logical drive sysfs entries during driver cleanup. > cciss: use only one scan thread > cciss: kick off logical drive topology rescan through sysfs > So I assume that these are intended to fix the problem which cciss-fix-the-termination-of-the-scanning-thread.patch didn't fix? Once we get this all sorted out, which kernel version(s) are we targetting and why? Thanks. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/3] cciss rmmod/scan-thread fixes 2009-07-15 22:08 ` [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Morton @ 2009-07-16 5:45 ` Andrew Patterson 0 siblings, 0 replies; 9+ messages in thread From: Andrew Patterson @ 2009-07-16 5:45 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel, linux-scsi, mike.miller, jens.axboe On Wed, 2009-07-15 at 15:08 -0700, Andrew Morton wrote: > On Tue, 14 Jul 2009 16:02:40 -0600 > Andrew Patterson <andrew.patterson@hp.com> wrote: > > > The following series fixes several problems causing hangs while > > rmmoding the cciss driver. In the process of fixing these hangs, I > > also reworked the the logical drive scanning kernel thread to use one > > thread per driver rather than one thread per controller. I also added > > a sysfs attribute to kick off a controller scan. > > > > .../ABI/testing/sysfs-bus-pci-devices-cciss | 7 + > > drivers/block/cciss.c | 198 ++++++++++++++++++-- > > drivers/block/cciss.h | 7 + > > 3 files changed, 188 insertions(+), 24 deletions(-) > > > > Andrew Patterson (3): > > cciss: remove logical drive sysfs entries during driver cleanup. > > cciss: use only one scan thread > > cciss: kick off logical drive topology rescan through sysfs > > > > So I assume that these are intended to fix the problem which > cciss-fix-the-termination-of-the-scanning-thread.patch didn't fix? Correct. As well as a lot of the race conditions during cleanup. > > Once we get this all sorted out, which kernel version(s) are we > targetting and why? The current problems this patchset tries to solve is that rmmod of the module hangs along with logical hot remove. I was hoping to get these in as soon as possible. It looks like the sysfs additions and the scan_thread patches initially went into 2.6.30 (if my limited git-fu is telling me correctly), so I can't call this a regression. I guess 2.6.32 then. > > Thanks. > -- Andrew Patterson Hewlett-Packard ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-07-16 5:45 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-07-14 22:02 [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Patterson 2009-07-14 22:02 ` [PATCH 1/3] cciss: remove logical drive sysfs entries during driver cleanup Andrew Patterson 2009-07-14 22:02 ` [PATCH 2/3] cciss: use only one scan thread Andrew Patterson 2009-07-15 22:06 ` Andrew Morton 2009-07-16 4:15 ` Andrew Patterson 2009-07-16 4:33 ` Andrew Morton 2009-07-14 22:02 ` [PATCH 3/3] cciss: kick off logical drive topology rescan through sysfs Andrew Patterson 2009-07-15 22:08 ` [PATCH 0/3] cciss rmmod/scan-thread fixes Andrew Morton 2009-07-16 5:45 ` Andrew Patterson
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox