* Re: [SCSI] fix scsi_reap_target() device_del from atomic context [not found] <200512212359.jBLNxluV016971@hera.kernel.org> @ 2005-12-23 6:13 ` Andrew Morton 2005-12-23 12:15 ` Matthew Wilcox ` (2 more replies) 0 siblings, 3 replies; 12+ messages in thread From: Andrew Morton @ 2005-12-23 6:13 UTC (permalink / raw) To: James Bottomley; +Cc: linux-scsi Linux Kernel Mailing List <linux-kernel@vger.kernel.org> wrote: > > tree d2f74a0351a09e184e124fd6ecf16e02ab768a0b > parent 42e33148df38c60b99d984b76b302c64397ebe4c > author James Bottomley <James.Bottomley@steeleye.com> Fri, 16 Dec 2005 12:01:43 -0800 > committer James Bottomley <jejb@mulgrave.(none)> Sat, 17 Dec 2005 22:48:08 -0600 > > [SCSI] fix scsi_reap_target() device_del from atomic context > > scsi_reap_target() was desgined to be called from any context. > However it must do a device_del() of the target device, which may only > be called from user context. Thus we have to reimplement > scsi_reap_target() via a workqueue. > > Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> > > drivers/scsi/scsi_scan.c | 48 +++++++++++++++++++++++++++++++++++++---------- > 1 files changed, 38 insertions(+), 10 deletions(-) > > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > index 94e5167..e36c21e 100644 > --- a/drivers/scsi/scsi_scan.c > +++ b/drivers/scsi/scsi_scan.c > @@ -400,6 +400,35 @@ static struct scsi_target *scsi_alloc_ta > return found_target; > } > > +struct work_queue_wrapper { > + struct work_struct work; > + struct scsi_target *starget; > +}; > + > +static void scsi_target_reap_work(void *data) { Coding style? > + struct work_queue_wrapper *wqw = (struct work_queue_wrapper *)data; > + struct scsi_target *starget = wqw->starget; > + struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); > + unsigned long flags; > + > + kfree(wqw); > + > + spin_lock_irqsave(shost->host_lock, flags); > + > + if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { > + list_del_init(&starget->siblings); > + spin_unlock_irqrestore(shost->host_lock, flags); > + device_del(&starget->dev); > + transport_unregister_device(&starget->dev); > + put_device(&starget->dev); > + return; > + > + } > + spin_unlock_irqrestore(shost->host_lock, flags); > + > + return; > +} Given that this can run an arbitrary amount of time later on, how do we know that *shost is still live? > void scsi_target_reap(struct scsi_target *starget) > { > - struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); > - unsigned long flags; > - spin_lock_irqsave(shost->host_lock, flags); > + struct work_queue_wrapper *wqw = > + kzalloc(sizeof(struct work_queue_wrapper), GFP_ATOMIC); kmalloc() would suffice. > - if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { > - list_del_init(&starget->siblings); > - spin_unlock_irqrestore(shost->host_lock, flags); > - device_del(&starget->dev); > - transport_unregister_device(&starget->dev); > - put_device(&starget->dev); > + if (!wqw) { > + starget_printk(KERN_ERR, starget, > + "Failed to allocate memory in scsi_reap_target()\n"); > return; > } > - spin_unlock_irqrestore(shost->host_lock, flags); > + > + INIT_WORK(&wqw->work, scsi_target_reap_work, wqw); > + wqw->starget = starget; > + schedule_work(&wqw->work); > } ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 6:13 ` [SCSI] fix scsi_reap_target() device_del from atomic context Andrew Morton @ 2005-12-23 12:15 ` Matthew Wilcox 2005-12-23 15:27 ` James Bottomley 2005-12-23 20:05 ` Andrew Vasquez 2005-12-23 20:43 ` Sergey Vlasov 2 siblings, 1 reply; 12+ messages in thread From: Matthew Wilcox @ 2005-12-23 12:15 UTC (permalink / raw) To: Andrew Morton; +Cc: James Bottomley, linux-scsi On Thu, Dec 22, 2005 at 10:13:29PM -0800, Andrew Morton wrote: > > + struct work_queue_wrapper *wqw = (struct work_queue_wrapper *)data; > > + struct scsi_target *starget = wqw->starget; > > + struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); > > + unsigned long flags; > > + > > + kfree(wqw); > > + > > + spin_lock_irqsave(shost->host_lock, flags); > > + > > + if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { > > + list_del_init(&starget->siblings); > > + spin_unlock_irqrestore(shost->host_lock, flags); > > + device_del(&starget->dev); > > + transport_unregister_device(&starget->dev); > > + put_device(&starget->dev); > > + return; > > + > > + } > > + spin_unlock_irqrestore(shost->host_lock, flags); > > + > > + return; > > +} > > Given that this can run an arbitrary amount of time later on, how do we > know that *shost is still live? If there's still an starget, its parent shost must still be around, no? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 12:15 ` Matthew Wilcox @ 2005-12-23 15:27 ` James Bottomley 2005-12-23 15:38 ` Andrew Morton 0 siblings, 1 reply; 12+ messages in thread From: James Bottomley @ 2005-12-23 15:27 UTC (permalink / raw) To: Matthew Wilcox; +Cc: Andrew Morton, linux-scsi On Fri, 2005-12-23 at 05:15 -0700, Matthew Wilcox wrote: > On Thu, Dec 22, 2005 at 10:13:29PM -0800, Andrew Morton wrote: > > Given that this can run an arbitrary amount of time later on, how do we > > know that *shost is still live? > > If there's still an starget, its parent shost must still be around, no? Precisely, starget holds a reference to shost. Even if this put_device(&starget->dev) is the last put, triggering a final put of the shost, we've stopped referring to it by that point. There is a potential improvement, in that could be done which is only to use the workqueue if we're in atomic context. However, I elected to leave playing with that cleanup until after 2.6.15 There is also the point that I now have two of these allocations of structures containing a workqueue and a pointer in separate instances. It does look like this might be an improvement to the API (i.e. a workqueue use that manages the allocation of the actual work_struct). James ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 15:27 ` James Bottomley @ 2005-12-23 15:38 ` Andrew Morton 2005-12-23 15:58 ` James Bottomley 0 siblings, 1 reply; 12+ messages in thread From: Andrew Morton @ 2005-12-23 15:38 UTC (permalink / raw) To: James Bottomley; +Cc: matthew, linux-scsi James Bottomley <James.Bottomley@SteelEye.com> wrote: > > There is a potential improvement, in that could be done which is only to > use the workqueue if we're in atomic context. However, I elected to > leave playing with that cleanup until after 2.6.15 We don't have a way of determining whether we're in atomic context (in_atomic() only works with CONFIG_PREEMPT). If scsi internally knows what context things are in then that'll work OK. > There is also the point that I now have two of these allocations of > structures containing a workqueue and a pointer in separate instances. > It does look like this might be an improvement to the API (i.e. a > workqueue use that manages the allocation of the actual work_struct). Perhaps you could use work_struct.data for the scsi_target* and get back to the work_struct via container_of(). ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 15:38 ` Andrew Morton @ 2005-12-23 15:58 ` James Bottomley 2005-12-24 3:54 ` Andrew Morton 0 siblings, 1 reply; 12+ messages in thread From: James Bottomley @ 2005-12-23 15:58 UTC (permalink / raw) To: Andrew Morton; +Cc: matthew, linux-scsi On Fri, 2005-12-23 at 07:38 -0800, Andrew Morton wrote: > James Bottomley <James.Bottomley@SteelEye.com> wrote: > > There is a potential improvement, in that could be done which is only to > > use the workqueue if we're in atomic context. However, I elected to > > leave playing with that cleanup until after 2.6.15 > > We don't have a way of determining whether we're in atomic context > (in_atomic() only works with CONFIG_PREEMPT). If scsi internally knows what > context things are in then that'll work OK. Yes, that's what I thought ... and we don't really have a good way of identifying this from the reap_target invocations (because of the way most in-context paths could come back to us via a softirq). > > There is also the point that I now have two of these allocations of > > structures containing a workqueue and a pointer in separate instances. > > It does look like this might be an improvement to the API (i.e. a > > workqueue use that manages the allocation of the actual work_struct). > > Perhaps you could use work_struct.data for the scsi_target* and get back to > the work_struct via container_of(). Could you elaborate some more on this? If I simply use the starget pointer as my work_struct.data, how do I get back to the actual work_struct for me to free it? It's not passed in to the function as far as I can tell. But even if I do this, I still have to manage the allocation and deallocation of the work_struct. James ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 15:58 ` James Bottomley @ 2005-12-24 3:54 ` Andrew Morton 0 siblings, 0 replies; 12+ messages in thread From: Andrew Morton @ 2005-12-24 3:54 UTC (permalink / raw) To: James Bottomley; +Cc: matthew, linux-scsi James Bottomley <James.Bottomley@SteelEye.com> wrote: > > > Perhaps you could use work_struct.data for the scsi_target* and get back to > > the work_struct via container_of(). > > Could you elaborate some more on this? If I simply use the starget > pointer as my work_struct.data, how do I get back to the actual > work_struct for me to free it? It's not passed in to the function as > far as I can tell. But even if I do this, I still have to manage the > allocation and deallocation of the work_struct. > err, no, I'm full of it. container_of() does pointer arithmetic to convert a pointer to foo.bar into a pointer to foo. But in this case the callback would have the value of work_struct.data, not the address of it, so it cannot be done. Most usage in drivers would be something like: struct driver_thing { ... struct work_struct work; }; and driver_thing.work.data gets a driver_thing* put into it. So the need to separately kmalloc the work_struct+data wrapper is relatively unusual. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 6:13 ` [SCSI] fix scsi_reap_target() device_del from atomic context Andrew Morton 2005-12-23 12:15 ` Matthew Wilcox @ 2005-12-23 20:05 ` Andrew Vasquez 2005-12-23 20:30 ` James Bottomley 2005-12-23 20:43 ` Sergey Vlasov 2 siblings, 1 reply; 12+ messages in thread From: Andrew Vasquez @ 2005-12-23 20:05 UTC (permalink / raw) To: Andrew Morton; +Cc: James Bottomley, linux-scsi On Thu, 22 Dec 2005, Andrew Morton wrote: > > + struct work_queue_wrapper *wqw = (struct work_queue_wrapper *)data; > > + struct scsi_target *starget = wqw->starget; > > + struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); > > + unsigned long flags; > > + > > + kfree(wqw); > > + > > + spin_lock_irqsave(shost->host_lock, flags); > > + > > + if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { > > + list_del_init(&starget->siblings); > > + spin_unlock_irqrestore(shost->host_lock, flags); > > + device_del(&starget->dev); > > + transport_unregister_device(&starget->dev); > > + put_device(&starget->dev); > > + return; > > + > > + } > > + spin_unlock_irqrestore(shost->host_lock, flags); > > + > > + return; > > +} > > Given that this can run an arbitrary amount of time later on, how do we > know that *shost is still live? Hmm, I'm getting the following OOPS (consistently): Unable to handle kernel paging request at virtual address 6b6b6beb printing eip: c01900c2 *pde = 00000000 Oops: 0002 [#1] SMP Modules linked in: qla2xxx scsi_transport_fc CPU: 2 EIP: 0060:[<c01900c2>] Not tainted VLI EFLAGS: 00010282 (2.6.15-rc6) EIP is at sysfs_hash_and_remove+0x31/0x11f eax: 00000000 ebx: 6b6b6b6b ecx: 00000000 edx: ee6aac68 esi: 6b6b6b6b edi: f8839d44 ebp: f8839cc0 esp: c1af7e98 ds: 007b es: 007b ss: 0068 Process events/2 (pid: 16, threadinfo=c1af7000 task=c1aef030) Stack: c03998d2 00000063 eb6a0e2c eb6a0e24 f8839d44 f8839cc0 c025877b e640aac0 ee000f5c ee000f5c 00000000 f8839cc0 f332e178 eb6a0e24 c025ab08 c025ab39 eb6a0e24 c0370606 eb6a0e00 f332e178 c1af7f00 e5a872dc c025a722 f332e178 Call Trace: [<c025877b>] class_device_del+0x138/0x163 [<c025ab08>] transport_remove_classdev+0x0/0x68 [<c025ab39>] transport_remove_classdev+0x31/0x68 [<c0370606>] klist_next+0x42/0x5c [<c025a722>] attribute_container_device_trigger+0xcb/0xe2 [<c020524a>] kobject_put+0x1e/0x22 [<c025ab87>] transport_remove_device+0x17/0x1b [<c025ab08>] transport_remove_classdev+0x0/0x68 [<c0294874>] scsi_target_reap_work+0xa4/0xbb [<c012bcca>] worker_thread+0x1bf/0x24a [<c02947d0>] scsi_target_reap_work+0x0/0xbb [<c0117960>] default_wake_function+0x0/0x12 [<c0117960>] default_wake_function+0x0/0x12 [<c012bb0b>] worker_thread+0x0/0x24a [<c0130021>] kthread+0xa3/0xcd [<c012ff7e>] kthread+0x0/0xcd [<c0101119>] kernel_thread_helper+0x5/0xb Code: 08 8b 44 24 1c 8b 58 18 8b 70 60 85 db 75 08 83 c4 08 5b 5e 5f 5d c3 c7 44 24 04 63 00 00 00 c7 04 24 d2 98 39 c0 e8 e7 93 <6>ACPI: PCI interrupt for device 0000:03:01.0 disabled SysRq : Emergency Sync Emergency Sync complete with this commit applied. See previous post: http://marc.theaimsgroup.com/?l=linux-scsi&m=113536796621377&w=2 must be the sixth day... -- av ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 20:05 ` Andrew Vasquez @ 2005-12-23 20:30 ` James Bottomley 2005-12-23 20:46 ` Andrew Vasquez 0 siblings, 1 reply; 12+ messages in thread From: James Bottomley @ 2005-12-23 20:30 UTC (permalink / raw) To: Andrew Vasquez; +Cc: Andrew Morton, linux-scsi On Fri, 2005-12-23 at 12:05 -0800, Andrew Vasquez wrote: > EIP is at sysfs_hash_and_remove+0x31/0x11f > eax: 00000000 ebx: 6b6b6b6b ecx: 00000000 edx: ee6aac68 > esi: 6b6b6b6b edi: f8839d44 ebp: f8839cc0 esp: c1af7e98 > ds: 007b es: 007b ss: 0068 > Process events/2 (pid: 16, threadinfo=c1af7000 task=c1aef030) > Stack: c03998d2 00000063 eb6a0e2c eb6a0e24 f8839d44 f8839cc0 c025877b e640aac0 > ee000f5c ee000f5c 00000000 f8839cc0 f332e178 eb6a0e24 c025ab08 c025ab39 > eb6a0e24 c0370606 eb6a0e00 f332e178 c1af7f00 e5a872dc c025a722 f332e178 > Call Trace: > [<c025877b>] class_device_del+0x138/0x163 > [<c025ab08>] transport_remove_classdev+0x0/0x68 This one is the signal for class devices being deleted *after* the underlying device has been deleted, so the class device delete oopses trying to remove the symlink from the real device to the class device. Strange it should show up now, since that particular piece of code hasn't been altered, just moved into a work queue. See if the attached fixes it (if it does, I'm mystified as to why you haven't always seen this). James diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -418,8 +418,9 @@ static void scsi_target_reap_work(void * if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { list_del_init(&starget->siblings); spin_unlock_irqrestore(shost->host_lock, flags); + transport_remove_device(&starget->dev); device_del(&starget->dev); - transport_unregister_device(&starget->dev); + transport_destroy_device(&starget->dev); put_device(&starget->dev); return; ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 20:30 ` James Bottomley @ 2005-12-23 20:46 ` Andrew Vasquez 0 siblings, 0 replies; 12+ messages in thread From: Andrew Vasquez @ 2005-12-23 20:46 UTC (permalink / raw) To: James Bottomley; +Cc: Andrew Morton, linux-scsi On Fri, 23 Dec 2005, James Bottomley wrote: > On Fri, 2005-12-23 at 12:05 -0800, Andrew Vasquez wrote: > > EIP is at sysfs_hash_and_remove+0x31/0x11f > > eax: 00000000 ebx: 6b6b6b6b ecx: 00000000 edx: ee6aac68 > > esi: 6b6b6b6b edi: f8839d44 ebp: f8839cc0 esp: c1af7e98 > > ds: 007b es: 007b ss: 0068 > > Process events/2 (pid: 16, threadinfo=c1af7000 task=c1aef030) > > Stack: c03998d2 00000063 eb6a0e2c eb6a0e24 f8839d44 f8839cc0 c025877b e640aac0 > > ee000f5c ee000f5c 00000000 f8839cc0 f332e178 eb6a0e24 c025ab08 c025ab39 > > eb6a0e24 c0370606 eb6a0e00 f332e178 c1af7f00 e5a872dc c025a722 f332e178 > > Call Trace: > > [<c025877b>] class_device_del+0x138/0x163 > > [<c025ab08>] transport_remove_classdev+0x0/0x68 > > This one is the signal for class devices being deleted *after* the > underlying device has been deleted, so the class device delete oopses > trying to remove the symlink from the real device to the class device. > > Strange it should show up now, since that particular piece of code > hasn't been altered, just moved into a work queue. Yes, that does seem odd.. > See if the attached fixes it (if it does, I'm mystified as to why you > haven't always seen this). > > James > > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > --- a/drivers/scsi/scsi_scan.c > +++ b/drivers/scsi/scsi_scan.c > @@ -418,8 +418,9 @@ static void scsi_target_reap_work(void * > if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { > list_del_init(&starget->siblings); > spin_unlock_irqrestore(shost->host_lock, flags); > + transport_remove_device(&starget->dev); > device_del(&starget->dev); > - transport_unregister_device(&starget->dev); > + transport_destroy_device(&starget->dev); > put_device(&starget->dev); > return; This fixes the OOPS during unload problem... Now the interesting question... why... -- av ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 6:13 ` [SCSI] fix scsi_reap_target() device_del from atomic context Andrew Morton 2005-12-23 12:15 ` Matthew Wilcox 2005-12-23 20:05 ` Andrew Vasquez @ 2005-12-23 20:43 ` Sergey Vlasov 2005-12-23 20:53 ` James Bottomley 2 siblings, 1 reply; 12+ messages in thread From: Sergey Vlasov @ 2005-12-23 20:43 UTC (permalink / raw) To: Andrew Morton; +Cc: James Bottomley, linux-scsi [-- Attachment #1: Type: text/plain, Size: 5399 bytes --] On Thu, 22 Dec 2005 22:13:29 -0800 Andrew Morton wrote: > Linux Kernel Mailing List <linux-kernel@vger.kernel.org> wrote: > > tree d2f74a0351a09e184e124fd6ecf16e02ab768a0b > > parent 42e33148df38c60b99d984b76b302c64397ebe4c > > author James Bottomley <James.Bottomley@steeleye.com> Fri, 16 Dec 2005 12:01:43 -0800 > > committer James Bottomley <jejb@mulgrave.(none)> Sat, 17 Dec 2005 22:48:08 -0600 > > > > [SCSI] fix scsi_reap_target() device_del from atomic context > > > > scsi_reap_target() was desgined to be called from any context. > > However it must do a device_del() of the target device, which may only > > be called from user context. Thus we have to reimplement > > scsi_reap_target() via a workqueue. > > > > Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> > > > > drivers/scsi/scsi_scan.c | 48 +++++++++++++++++++++++++++++++++++++---------- > > 1 files changed, 38 insertions(+), 10 deletions(-) > > > > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c > > index 94e5167..e36c21e 100644 > > --- a/drivers/scsi/scsi_scan.c > > +++ b/drivers/scsi/scsi_scan.c [skip] > > void scsi_target_reap(struct scsi_target *starget) > > { > > - struct Scsi_Host *shost = dev_to_shost(starget->dev.parent); > > - unsigned long flags; > > - spin_lock_irqsave(shost->host_lock, flags); > > + struct work_queue_wrapper *wqw = > > + kzalloc(sizeof(struct work_queue_wrapper), GFP_ATOMIC); > > kmalloc() would suffice. > > > - if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { > > - list_del_init(&starget->siblings); > > - spin_unlock_irqrestore(shost->host_lock, flags); > > - device_del(&starget->dev); > > - transport_unregister_device(&starget->dev); > > - put_device(&starget->dev); > > + if (!wqw) { > > + starget_printk(KERN_ERR, starget, > > + "Failed to allocate memory in scsi_reap_target()\n"); > > return; So if that GFP_ATOMIC allocation ever fails, the target is leaked forever - does not look nice. Does anything depend on starget->siblings being empty after we had removed it from the shost->__targets list it was on? Seems that all other uses of this field are: drivers/scsi/scsi_scan.c:313: list_for_each_entry(starget, &shost->__targets, siblings) { drivers/scsi/scsi_scan.c:365: INIT_LIST_HEAD(&starget->siblings); drivers/scsi/scsi_scan.c:373: list_add_tail(&starget->siblings, &shost->__targets); So probably we can reuse this field and do deferred reaping without any memory allocation at all. The following patch should be applied _instead_ of the James' patch, not on top of it (I can make a combined patch if it is desired). The difference with the previous patch is that scsi_target_reap() still removes the target from shost->__targets immediately - only device_del() and subsequent actions are deferred to a workqueue. Patch is only compile tested. ------------------------------------------------------------------------ [SCSI] fix scsi_target_reap() device_del from atomic context scsi_target_reap() may be called from any context, however, it needs to call device_del(), which requires a process context. Thus we have to perform device_del() via a workqueue. Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> --- drivers/scsi/scsi_scan.c | 33 +++++++++++++++++++++++++++++---- 1 files changed, 29 insertions(+), 4 deletions(-) 28763f31e602e7265b61e676dcc1b536b5442fbf diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 94e5167..ac4c75f 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -400,6 +400,30 @@ static struct scsi_target *scsi_alloc_ta return found_target; } +static LIST_HEAD(scsi_target_reap_list); +static DEFINE_SPINLOCK(scsi_target_reap_list_lock); + +static void scsi_target_reap_work_fn(void *data) +{ + struct scsi_target *starget; + unsigned long flags; + + spin_lock_irqsave(&scsi_target_reap_list_lock, flags); + while (!list_empty(&scsi_target_reap_list)) { + starget = list_entry(scsi_target_reap_list.next, + struct scsi_target, siblings); + list_del_init(&starget->siblings); + spin_unlock_irqrestore(&scsi_target_reap_list_lock, flags); + device_del(&starget->dev); + transport_unregister_device(&starget->dev); + put_device(&starget->dev); + spin_lock_irqsave(&scsi_target_reap_list_lock, flags); + } + spin_unlock_irqrestore(&scsi_target_reap_list_lock, flags); +} + +static DECLARE_WORK(scsi_target_reap_work, scsi_target_reap_work_fn, NULL); + /** * scsi_target_reap - check to see if target is in use and destroy if not * @@ -416,11 +440,12 @@ void scsi_target_reap(struct scsi_target spin_lock_irqsave(shost->host_lock, flags); if (--starget->reap_ref == 0 && list_empty(&starget->devices)) { - list_del_init(&starget->siblings); + list_del(&starget->siblings); spin_unlock_irqrestore(shost->host_lock, flags); - device_del(&starget->dev); - transport_unregister_device(&starget->dev); - put_device(&starget->dev); + spin_lock_irqsave(&scsi_target_reap_list_lock, flags); + list_add_tail(&starget->siblings, &scsi_target_reap_list); + spin_unlock_irqrestore(&scsi_target_reap_list_lock, flags); + schedule_work(&scsi_target_reap_work); return; } spin_unlock_irqrestore(shost->host_lock, flags); -- 1.0.GIT [-- Attachment #2: Type: application/pgp-signature, Size: 190 bytes --] ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 20:43 ` Sergey Vlasov @ 2005-12-23 20:53 ` James Bottomley 2005-12-23 21:26 ` Sergey Vlasov 0 siblings, 1 reply; 12+ messages in thread From: James Bottomley @ 2005-12-23 20:53 UTC (permalink / raw) To: Sergey Vlasov; +Cc: Andrew Morton, linux-scsi On Fri, 2005-12-23 at 23:43 +0300, Sergey Vlasov wrote: > So if that GFP_ATOMIC allocation ever fails, the target is leaked > forever - does not look nice. Yes, but it will print a warning. > Does anything depend on starget->siblings being empty after we had > removed it from the shost->__targets list it was on? Seems that all > other uses of this field are: > > drivers/scsi/scsi_scan.c:313: list_for_each_entry(starget, &shost->__targets, siblings) { > drivers/scsi/scsi_scan.c:365: INIT_LIST_HEAD(&starget->siblings); > drivers/scsi/scsi_scan.c:373: list_add_tail(&starget->siblings, &shost->__targets); > > So probably we can reuse this field and do deferred reaping without any > memory allocation at all. The following patch should be applied > _instead_ of the James' patch, not on top of it (I can make a combined > patch if it is desired). The difference with the previous patch is that > scsi_target_reap() still removes the target from shost->__targets > immediately - only device_del() and subsequent actions are deferred to a > workqueue. No, you can't. If you do this, the target namespace will potentially be in use in sysfs after the system thinks the target is gone. Thus, any reallocation fails because you can't add a new target with the same name as an existing one. James ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [SCSI] fix scsi_reap_target() device_del from atomic context 2005-12-23 20:53 ` James Bottomley @ 2005-12-23 21:26 ` Sergey Vlasov 0 siblings, 0 replies; 12+ messages in thread From: Sergey Vlasov @ 2005-12-23 21:26 UTC (permalink / raw) To: James Bottomley; +Cc: Andrew Morton, linux-scsi [-- Attachment #1: Type: text/plain, Size: 1978 bytes --] On Fri, Dec 23, 2005 at 02:53:39PM -0600, James Bottomley wrote: > On Fri, 2005-12-23 at 23:43 +0300, Sergey Vlasov wrote: > > So if that GFP_ATOMIC allocation ever fails, the target is leaked > > forever - does not look nice. > > Yes, but it will print a warning. > > > Does anything depend on starget->siblings being empty after we had > > removed it from the shost->__targets list it was on? Seems that all > > other uses of this field are: > > > > drivers/scsi/scsi_scan.c:313: list_for_each_entry(starget, &shost->__targets, siblings) { > > drivers/scsi/scsi_scan.c:365: INIT_LIST_HEAD(&starget->siblings); > > drivers/scsi/scsi_scan.c:373: list_add_tail(&starget->siblings, &shost->__targets); > > > > So probably we can reuse this field and do deferred reaping without any > > memory allocation at all. The following patch should be applied > > _instead_ of the James' patch, not on top of it (I can make a combined > > patch if it is desired). The difference with the previous patch is that > > scsi_target_reap() still removes the target from shost->__targets > > immediately - only device_del() and subsequent actions are deferred to a > > workqueue. > > No, you can't. > > If you do this, the target namespace will potentially be in use in sysfs > after the system thinks the target is gone. Thus, any reallocation > fails because you can't add a new target with the same name as an > existing one. Ah, I see now... scsi_alloc_target() checks if a target with the same channel/id combination is on the list and adds new target only if there was no such one before. However, what prevents a race between scsi_target_reap_work() and scsi_alloc_target()? If the worker thread is interrupted/preempted just after releasing host_lock (when it has already removed the target from the list, but before it has called device_del()), scsi_alloc_target() might consider the target as new and get to device_add() faster. [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2005-12-24 3:55 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200512212359.jBLNxluV016971@hera.kernel.org>
2005-12-23 6:13 ` [SCSI] fix scsi_reap_target() device_del from atomic context Andrew Morton
2005-12-23 12:15 ` Matthew Wilcox
2005-12-23 15:27 ` James Bottomley
2005-12-23 15:38 ` Andrew Morton
2005-12-23 15:58 ` James Bottomley
2005-12-24 3:54 ` Andrew Morton
2005-12-23 20:05 ` Andrew Vasquez
2005-12-23 20:30 ` James Bottomley
2005-12-23 20:46 ` Andrew Vasquez
2005-12-23 20:43 ` Sergey Vlasov
2005-12-23 20:53 ` James Bottomley
2005-12-23 21:26 ` Sergey Vlasov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox