* Re: PM/hibernate swapfile regression [not found] <20090714135453.GA26976@osiris.boeblingen.de.ibm.com> @ 2009-07-14 16:21 ` Rafael J. Wysocki 2009-07-15 21:24 ` Alan Jenkins ` (2 more replies) 0 siblings, 3 replies; 11+ messages in thread From: Rafael J. Wysocki @ 2009-07-14 16:21 UTC (permalink / raw) To: Heiko Carstens, Alan Jenkins; +Cc: Hans-Joachim Picht, pm list, linux-kernel On Tuesday 14 July 2009, Heiko Carstens wrote: > We've seen this bug: > > Jul 8 13:16:02 h05lp03 kernel: BUG: sleeping function called from invalid context at /home/autobuild/BUILD/linux-2.6.30-20090707/include/linux/writeback.h:87 > Jul 8 13:16:02 h05lp03 kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 24377, name: bash > Jul 8 13:16:02 h05lp03 kernel: 3 locks held by bash/24377: > Jul 8 13:16:02 h05lp03 kernel: #0: (&buffer->mutex){+.+.+.}, at: [<0000000000276e74>] sysfs_write_file+0x4c/0x1ac > Jul 8 13:16:02 h05lp03 kernel: #1: (pm_mutex#2){+.+.+.}, at: [<000000000018f128>] hibernate+0x34/0x200 > Jul 8 13:16:02 h05lp03 kernel: #2: (swap_lock){+.+.-.}, at: [<00000000001f371c>] swap_type_of+0x44/0x158 > Jul 8 13:16:02 h05lp03 kernel: CPU: 8 Not tainted 2.6.30-39.x.20090707-s390xdefault #1 > Jul 8 13:16:02 h05lp03 kernel: Process bash (pid: 24377, task: 000000012ce84240, ksp: 00000000c262bb00) > Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 00000000c262ba88 0000000000000002 0000000000000000 > Jul 8 13:16:02 h05lp03 kernel: 00000000c262bb28 00000000c262baa0 00000000c262baa0 00000000005448c4 > Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 000000012ce84718 000000013d5bf1a8 0000000000000000 > Jul 8 13:16:02 h05lp03 kernel: 000000000000000d 0000000000000000 00000000c262baf8 000000000000000e > Jul 8 13:16:02 h05lp03 kernel: 0000000000553da8 0000000000105600 00000000c262ba88 00000000c262bad0 > Jul 8 13:16:02 h05lp03 kernel: Call Trace: > Jul 8 13:16:02 h05lp03 kernel: ([<00000000001054fc>] show_trace+0xf0/0x148) > Jul 8 13:16:02 h05lp03 kernel: [<00000000001391ba>] __might_sleep+0x172/0x188 > Jul 8 13:16:02 h05lp03 kernel: [<000000000021f738>] ifind+0x88/0xe4 > Jul 8 13:16:02 h05lp03 kernel: [<0000000000220b0e>] iget5_locked+0x66/0x1d8 > Jul 8 13:16:02 h05lp03 kernel: [<000000000023b676>] bdget+0x5e/0x150 > Jul 8 13:16:02 h05lp03 kernel: [<00000000001f37b2>] swap_type_of+0xda/0x158 > Jul 8 13:16:02 h05lp03 kernel: [<0000000000192342>] swsusp_write+0x4e/0x458 > Jul 8 13:16:02 h05lp03 kernel: [<000000000018f254>] hibernate+0x160/0x200 > Jul 8 13:16:02 h05lp03 kernel: [<000000000018d8da>] state_store+0x82/0xa8 > Jul 8 13:16:02 h05lp03 kernel: [<0000000000276f20>] sysfs_write_file+0xf8/0x1ac > Jul 8 13:16:02 h05lp03 kernel: [<000000000020663a>] vfs_write+0xae/0x15c > Jul 8 13:16:02 h05lp03 kernel: [<00000000002067e0>] SyS_write+0x54/0xac > Jul 8 13:16:02 h05lp03 kernel: [<0000000000117a96>] sysc_noemu+0x10/0x16 > Jul 8 13:16:02 h05lp03 kernel: [<00000047083e36b4>] 0x47083e36b4 > > Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap > breaks after hibernation failures"". > Calling bdget while holding a spinlock doesn't seem to be a good idea... Agreed, sorry for missing that. Alan, can you please prepare a fix? Rafael ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: PM/hibernate swapfile regression 2009-07-14 16:21 ` PM/hibernate swapfile regression Rafael J. Wysocki @ 2009-07-15 21:24 ` Alan Jenkins 2009-07-17 13:08 ` Alan Jenkins [not found] ` <4A6077DE.6060202@tuffmail.co.uk> 2 siblings, 0 replies; 11+ messages in thread From: Alan Jenkins @ 2009-07-15 21:24 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Hans-Joachim Picht, pm list, Heiko Carstens, linux-kernel Rafael J. Wysocki wrote: > On Tuesday 14 July 2009, Heiko Carstens wrote: > >> We've seen this bug: >> >> Jul 8 13:16:02 h05lp03 kernel: BUG: sleeping function called from invalid context at /home/autobuild/BUILD/linux-2.6.30-20090707/include/linux/writeback.h:87 >> Jul 8 13:16:02 h05lp03 kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 24377, name: bash >> Jul 8 13:16:02 h05lp03 kernel: 3 locks held by bash/24377: >> Jul 8 13:16:02 h05lp03 kernel: #0: (&buffer->mutex){+.+.+.}, at: [<0000000000276e74>] sysfs_write_file+0x4c/0x1ac >> Jul 8 13:16:02 h05lp03 kernel: #1: (pm_mutex#2){+.+.+.}, at: [<000000000018f128>] hibernate+0x34/0x200 >> Jul 8 13:16:02 h05lp03 kernel: #2: (swap_lock){+.+.-.}, at: [<00000000001f371c>] swap_type_of+0x44/0x158 >> Jul 8 13:16:02 h05lp03 kernel: CPU: 8 Not tainted 2.6.30-39.x.20090707-s390xdefault #1 >> Jul 8 13:16:02 h05lp03 kernel: Process bash (pid: 24377, task: 000000012ce84240, ksp: 00000000c262bb00) >> Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 00000000c262ba88 0000000000000002 0000000000000000 >> Jul 8 13:16:02 h05lp03 kernel: 00000000c262bb28 00000000c262baa0 00000000c262baa0 00000000005448c4 >> Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 000000012ce84718 000000013d5bf1a8 0000000000000000 >> Jul 8 13:16:02 h05lp03 kernel: 000000000000000d 0000000000000000 00000000c262baf8 000000000000000e >> Jul 8 13:16:02 h05lp03 kernel: 0000000000553da8 0000000000105600 00000000c262ba88 00000000c262bad0 >> Jul 8 13:16:02 h05lp03 kernel: Call Trace: >> Jul 8 13:16:02 h05lp03 kernel: ([<00000000001054fc>] show_trace+0xf0/0x148) >> Jul 8 13:16:02 h05lp03 kernel: [<00000000001391ba>] __might_sleep+0x172/0x188 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000021f738>] ifind+0x88/0xe4 >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000220b0e>] iget5_locked+0x66/0x1d8 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000023b676>] bdget+0x5e/0x150 >> Jul 8 13:16:02 h05lp03 kernel: [<00000000001f37b2>] swap_type_of+0xda/0x158 >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000192342>] swsusp_write+0x4e/0x458 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000018f254>] hibernate+0x160/0x200 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000018d8da>] state_store+0x82/0xa8 >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000276f20>] sysfs_write_file+0xf8/0x1ac >> Jul 8 13:16:02 h05lp03 kernel: [<000000000020663a>] vfs_write+0xae/0x15c >> Jul 8 13:16:02 h05lp03 kernel: [<00000000002067e0>] SyS_write+0x54/0xac >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000117a96>] sysc_noemu+0x10/0x16 >> Jul 8 13:16:02 h05lp03 kernel: [<00000047083e36b4>] 0x47083e36b4 >> >> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap >> breaks after hibernation failures"". >> Calling bdget while holding a spinlock doesn't seem to be a good idea... >> > > Agreed, sorry for missing that. > > Alan, can you please prepare a fix? > > Rafael > I'll be back from graduation on the 17th. I'll see what I can do then. Thanks Alan ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: PM/hibernate swapfile regression 2009-07-14 16:21 ` PM/hibernate swapfile regression Rafael J. Wysocki 2009-07-15 21:24 ` Alan Jenkins @ 2009-07-17 13:08 ` Alan Jenkins [not found] ` <4A6077DE.6060202@tuffmail.co.uk> 2 siblings, 0 replies; 11+ messages in thread From: Alan Jenkins @ 2009-07-17 13:08 UTC (permalink / raw) To: Heiko Carstens; +Cc: Hans-Joachim Picht, pm list, linux-kernel Rafael J. Wysocki wrote: > On Tuesday 14 July 2009, Heiko Carstens wrote: > >> We've seen this bug: >> >> Jul 8 13:16:02 h05lp03 kernel: BUG: sleeping function called from invalid context at /home/autobuild/BUILD/linux-2.6.30-20090707/include/linux/writeback.h:87 >> Jul 8 13:16:02 h05lp03 kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 24377, name: bash >> Jul 8 13:16:02 h05lp03 kernel: 3 locks held by bash/24377: >> Jul 8 13:16:02 h05lp03 kernel: #0: (&buffer->mutex){+.+.+.}, at: [<0000000000276e74>] sysfs_write_file+0x4c/0x1ac >> Jul 8 13:16:02 h05lp03 kernel: #1: (pm_mutex#2){+.+.+.}, at: [<000000000018f128>] hibernate+0x34/0x200 >> Jul 8 13:16:02 h05lp03 kernel: #2: (swap_lock){+.+.-.}, at: [<00000000001f371c>] swap_type_of+0x44/0x158 >> Jul 8 13:16:02 h05lp03 kernel: CPU: 8 Not tainted 2.6.30-39.x.20090707-s390xdefault #1 >> Jul 8 13:16:02 h05lp03 kernel: Process bash (pid: 24377, task: 000000012ce84240, ksp: 00000000c262bb00) >> Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 00000000c262ba88 0000000000000002 0000000000000000 >> Jul 8 13:16:02 h05lp03 kernel: 00000000c262bb28 00000000c262baa0 00000000c262baa0 00000000005448c4 >> Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 000000012ce84718 000000013d5bf1a8 0000000000000000 >> Jul 8 13:16:02 h05lp03 kernel: 000000000000000d 0000000000000000 00000000c262baf8 000000000000000e >> Jul 8 13:16:02 h05lp03 kernel: 0000000000553da8 0000000000105600 00000000c262ba88 00000000c262bad0 >> Jul 8 13:16:02 h05lp03 kernel: Call Trace: >> Jul 8 13:16:02 h05lp03 kernel: ([<00000000001054fc>] show_trace+0xf0/0x148) >> Jul 8 13:16:02 h05lp03 kernel: [<00000000001391ba>] __might_sleep+0x172/0x188 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000021f738>] ifind+0x88/0xe4 >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000220b0e>] iget5_locked+0x66/0x1d8 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000023b676>] bdget+0x5e/0x150 >> Jul 8 13:16:02 h05lp03 kernel: [<00000000001f37b2>] swap_type_of+0xda/0x158 >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000192342>] swsusp_write+0x4e/0x458 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000018f254>] hibernate+0x160/0x200 >> Jul 8 13:16:02 h05lp03 kernel: [<000000000018d8da>] state_store+0x82/0xa8 >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000276f20>] sysfs_write_file+0xf8/0x1ac >> Jul 8 13:16:02 h05lp03 kernel: [<000000000020663a>] vfs_write+0xae/0x15c >> Jul 8 13:16:02 h05lp03 kernel: [<00000000002067e0>] SyS_write+0x54/0xac >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000117a96>] sysc_noemu+0x10/0x16 >> Jul 8 13:16:02 h05lp03 kernel: [<00000047083e36b4>] 0x47083e36b4 >> >> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap >> breaks after hibernation failures"". >> Calling bdget while holding a spinlock doesn't seem to be a good idea... >> > > Agreed, sorry for missing that. > > Alan, can you please prepare a fix? > > Rafael > I'm not sure how to reproduce. I tried pm-hibernate with CONFIG_DEBUG_SPINLOCK_SLEEP, but nothing showed up in dmesg. Here's a quick & dirty patch. Please test (or explain how I can test it myself, whichever is easier :-). swap_unplug_sem is used to avoid holding swap_lock when calling the block device unplug function. I think it can also be used for this bdget call. Thanks Alan diff --git a/mm/swapfile.c b/mm/swapfile.c index d1ade1a..9176464 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -744,6 +744,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) if (device) bdev = bdget(device); + down_read(&swap_unplug_sem); spin_lock(&swap_lock); for (i = 0; i < nr_swapfiles; i++) { struct swap_info_struct *sis = swap_info + i; @@ -752,10 +753,11 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) continue; if (!bdev) { + spin_unlock(&swap_lock); if (bdev_p) *bdev_p = bdget(sis->bdev->bd_dev); + up_read(&swap_unplug_sem); - spin_unlock(&swap_lock); return i; } if (bdev == sis->bdev) { @@ -764,16 +766,18 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) se = list_entry(sis->extent_list.next, struct swap_extent, list); if (se->start_block == offset) { + spin_unlock(&swap_lock); if (bdev_p) *bdev_p = bdget(sis->bdev->bd_dev); + up_read(&swap_unplug_sem); - spin_unlock(&swap_lock); bdput(bdev); return i; } } } spin_unlock(&swap_lock); + up_read(&swap_unplug_sem); if (bdev) bdput(bdev); ^ permalink raw reply related [flat|nested] 11+ messages in thread
[parent not found: <4A6077DE.6060202@tuffmail.co.uk>]
* Re: PM/hibernate swapfile regression [not found] ` <4A6077DE.6060202@tuffmail.co.uk> @ 2009-07-20 13:24 ` Heiko Carstens [not found] ` <20090720132456.GA29023@osiris.boeblingen.de.ibm.com> 1 sibling, 0 replies; 11+ messages in thread From: Heiko Carstens @ 2009-07-20 13:24 UTC (permalink / raw) To: Alan Jenkins; +Cc: Hans-Joachim Picht, pm list, linux-kernel, Arnd Schneider On Fri, Jul 17, 2009 at 02:08:46PM +0100, Alan Jenkins wrote: > Rafael J. Wysocki wrote: > > On Tuesday 14 July 2009, Heiko Carstens wrote: > > > >> We've seen this bug: > >> > >> Jul 8 13:16:02 h05lp03 kernel: BUG: sleeping function called from invalid context at /home/autobuild/BUILD/linux-2.6.30-20090707/include/linux/writeback.h:87 > >> Jul 8 13:16:02 h05lp03 kernel: in_atomic(): 1, irqs_disabled(): 0, pid: 24377, name: bash > >> Jul 8 13:16:02 h05lp03 kernel: 3 locks held by bash/24377: > >> Jul 8 13:16:02 h05lp03 kernel: #0: (&buffer->mutex){+.+.+.}, at: [<0000000000276e74>] sysfs_write_file+0x4c/0x1ac > >> Jul 8 13:16:02 h05lp03 kernel: #1: (pm_mutex#2){+.+.+.}, at: [<000000000018f128>] hibernate+0x34/0x200 > >> Jul 8 13:16:02 h05lp03 kernel: #2: (swap_lock){+.+.-.}, at: [<00000000001f371c>] swap_type_of+0x44/0x158 > >> Jul 8 13:16:02 h05lp03 kernel: CPU: 8 Not tainted 2.6.30-39.x.20090707-s390xdefault #1 > >> Jul 8 13:16:02 h05lp03 kernel: Process bash (pid: 24377, task: 000000012ce84240, ksp: 00000000c262bb00) > >> Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 00000000c262ba88 0000000000000002 0000000000000000 > >> Jul 8 13:16:02 h05lp03 kernel: 00000000c262bb28 00000000c262baa0 00000000c262baa0 00000000005448c4 > >> Jul 8 13:16:02 h05lp03 kernel: 0000000000000000 000000012ce84718 000000013d5bf1a8 0000000000000000 > >> Jul 8 13:16:02 h05lp03 kernel: 000000000000000d 0000000000000000 00000000c262baf8 000000000000000e > >> Jul 8 13:16:02 h05lp03 kernel: 0000000000553da8 0000000000105600 00000000c262ba88 00000000c262bad0 > >> Jul 8 13:16:02 h05lp03 kernel: Call Trace: > >> Jul 8 13:16:02 h05lp03 kernel: ([<00000000001054fc>] show_trace+0xf0/0x148) > >> Jul 8 13:16:02 h05lp03 kernel: [<00000000001391ba>] __might_sleep+0x172/0x188 > >> Jul 8 13:16:02 h05lp03 kernel: [<000000000021f738>] ifind+0x88/0xe4 > >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000220b0e>] iget5_locked+0x66/0x1d8 > >> Jul 8 13:16:02 h05lp03 kernel: [<000000000023b676>] bdget+0x5e/0x150 > >> Jul 8 13:16:02 h05lp03 kernel: [<00000000001f37b2>] swap_type_of+0xda/0x158 > >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000192342>] swsusp_write+0x4e/0x458 > >> Jul 8 13:16:02 h05lp03 kernel: [<000000000018f254>] hibernate+0x160/0x200 > >> Jul 8 13:16:02 h05lp03 kernel: [<000000000018d8da>] state_store+0x82/0xa8 > >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000276f20>] sysfs_write_file+0xf8/0x1ac > >> Jul 8 13:16:02 h05lp03 kernel: [<000000000020663a>] vfs_write+0xae/0x15c > >> Jul 8 13:16:02 h05lp03 kernel: [<00000000002067e0>] SyS_write+0x54/0xac > >> Jul 8 13:16:02 h05lp03 kernel: [<0000000000117a96>] sysc_noemu+0x10/0x16 > >> Jul 8 13:16:02 h05lp03 kernel: [<00000047083e36b4>] 0x47083e36b4 > >> > >> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap > >> breaks after hibernation failures"". > >> Calling bdget while holding a spinlock doesn't seem to be a good idea... > >> > > > > Agreed, sorry for missing that. > > > > Alan, can you please prepare a fix? > > I'm not sure how to reproduce. I tried pm-hibernate with > CONFIG_DEBUG_SPINLOCK_SLEEP, but nothing showed up in dmesg. > > Here's a quick & dirty patch. Please test (or explain how I can test it > myself, whichever is easier :-). swap_unplug_sem is used to avoid > holding swap_lock when calling the block device unplug function. I > think it can also be used for this bdget call. Thanks for the patch. Unfortunately Arnd was unable to reproduce the original behaviour. But your patch makes sense anyway. I also tested it and nothing broke. So should this go upstream? > diff --git a/mm/swapfile.c b/mm/swapfile.c > index d1ade1a..9176464 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -744,6 +744,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) > if (device) > bdev = bdget(device); > > + down_read(&swap_unplug_sem); > spin_lock(&swap_lock); > for (i = 0; i < nr_swapfiles; i++) { > struct swap_info_struct *sis = swap_info + i; > @@ -752,10 +753,11 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) > continue; > > if (!bdev) { > + spin_unlock(&swap_lock); > if (bdev_p) > *bdev_p = bdget(sis->bdev->bd_dev); > + up_read(&swap_unplug_sem); > > - spin_unlock(&swap_lock); > return i; > } > if (bdev == sis->bdev) { > @@ -764,16 +766,18 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) > se = list_entry(sis->extent_list.next, > struct swap_extent, list); > if (se->start_block == offset) { > + spin_unlock(&swap_lock); > if (bdev_p) > *bdev_p = bdget(sis->bdev->bd_dev); > + up_read(&swap_unplug_sem); > > - spin_unlock(&swap_lock); > bdput(bdev); > return i; > } > } > } > spin_unlock(&swap_lock); > + up_read(&swap_unplug_sem); > if (bdev) > bdput(bdev); ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <20090720132456.GA29023@osiris.boeblingen.de.ibm.com>]
* Re: PM/hibernate swapfile regression [not found] ` <20090720132456.GA29023@osiris.boeblingen.de.ibm.com> @ 2009-07-21 10:50 ` Alan Jenkins [not found] ` <4A659D75.3050004@tuffmail.co.uk> 1 sibling, 0 replies; 11+ messages in thread From: Alan Jenkins @ 2009-07-21 10:50 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Arnd Schneider, Heiko Carstens, linux-kernel, linux-fsdevel, Hans-Joachim Picht, pm list Heiko Carstens wrote: > On Fri, Jul 17, 2009 at 02:08:46PM +0100, Alan Jenkins wrote: > >> Rafael J. Wysocki wrote: >> >>> On Tuesday 14 July 2009, Heiko Carstens wrote: >>> >>> >>>> We've seen this bug [...] >>>> >>>> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap >>>> breaks after hibernation failures"". >>>> Calling bdget while holding a spinlock doesn't seem to be a good idea... >>>> >>>> >>> Agreed, sorry for missing that. >>> >>> Alan, can you please prepare a fix? >>> >> Here's a quick & dirty patch [...] >> > > Thanks for the patch. Unfortunately Arnd was unable to reproduce the original > behaviour. But your patch makes sense anyway. > I also tested it and nothing broke. So should this go upstream? I do want to fix it, but I think there's a better way. It doesn't really need the sleeping bdget(). All we want is atomic_inc(&bdev->bd_inode->i_count). I think we should call it bdcopy(). ----------> >From 643014ec079610a8b01dfd78c6949c1e8727195b Mon Sep 17 00:00:00 2001 From: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Date: Tue, 21 Jul 2009 10:17:30 +0100 Subject: [PATCH] PM/hibernate: replace bdget call with bdcopy (simple atomic_inc of i_count) Create bdcopy(). This function copies an existing reference to a block_device. It is safe to call from any context. Hibernation code wishes to copy a reference to the active swap device. Right now it calls bdget() under a spinlock, but this is wrong because bdget() can sleep. It doesn't need a full bdget() because we already hold a reference to active swap devices (and the spinlock protects against swapoff). Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> CC: linux-fsdevel@vger.kernel.org --- fs/block_dev.c | 8 ++++++++ include/linux/fs.h | 1 + mm/swapfile.c | 4 ++-- 3 files changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/block_dev.c b/fs/block_dev.c index 3a6d4fb..0b04974 100644 --- a/fs/block_dev.c +++ b/fs/block_dev.c @@ -564,6 +564,14 @@ struct block_device *bdget(dev_t dev) EXPORT_SYMBOL(bdget); +struct block_device *bdcopy(struct block_device *bdev) +{ + atomic_inc(&bdev->bd_inode->i_count); + return bdev; +} + +EXPORT_SYMBOL(bdcopy); + long nr_blockdev_pages(void) { struct block_device *bdev; diff --git a/include/linux/fs.h b/include/linux/fs.h index 0872372..eeb1091 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1946,6 +1946,7 @@ extern void putname(const char *name); extern int register_blkdev(unsigned int, const char *); extern void unregister_blkdev(unsigned int, const char *); extern struct block_device *bdget(dev_t); +extern struct block_device *bdcopy(struct block_device *bdev); extern void bd_set_size(struct block_device *, loff_t size); extern void bd_forget(struct inode *inode); extern void bdput(struct block_device *); diff --git a/mm/swapfile.c b/mm/swapfile.c index d1ade1a..272ea8e 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -753,7 +753,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) if (!bdev) { if (bdev_p) - *bdev_p = bdget(sis->bdev->bd_dev); + *bdev_p = bdcopy(sis->bdev); spin_unlock(&swap_lock); return i; @@ -765,7 +765,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) struct swap_extent, list); if (se->start_block == offset) { if (bdev_p) - *bdev_p = bdget(sis->bdev->bd_dev); + *bdev_p = bdcopy(sis->bdev); spin_unlock(&swap_lock); bdput(bdev); -- 1.6.3.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
[parent not found: <4A659D75.3050004@tuffmail.co.uk>]
* Re: PM/hibernate swapfile regression [not found] ` <4A659D75.3050004@tuffmail.co.uk> @ 2009-07-21 13:41 ` Rafael J. Wysocki 2009-07-21 20:57 ` Rafael J. Wysocki ` (3 subsequent siblings) 4 siblings, 0 replies; 11+ messages in thread From: Rafael J. Wysocki @ 2009-07-21 13:41 UTC (permalink / raw) To: Alan Jenkins, Jens Axboe Cc: Arnd Schneider, Heiko Carstens, linux-kernel, linux-fsdevel, Hans-Joachim Picht, pm list On Tuesday 21 July 2009, Alan Jenkins wrote: > Heiko Carstens wrote: > > On Fri, Jul 17, 2009 at 02:08:46PM +0100, Alan Jenkins wrote: > > > >> Rafael J. Wysocki wrote: > >> > >>> On Tuesday 14 July 2009, Heiko Carstens wrote: > >>> > >>> > >>>> We've seen this bug [...] > >>>> > >>>> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap > >>>> breaks after hibernation failures"". > >>>> Calling bdget while holding a spinlock doesn't seem to be a good idea... > >>>> > >>>> > >>> Agreed, sorry for missing that. > >>> > >>> Alan, can you please prepare a fix? > >>> > >> Here's a quick & dirty patch [...] > >> > > > > Thanks for the patch. Unfortunately Arnd was unable to reproduce the original > > behaviour. But your patch makes sense anyway. > > I also tested it and nothing broke. So should this go upstream? > > I do want to fix it, but I think there's a better way. > > It doesn't really need the sleeping bdget(). All we want is > atomic_inc(&bdev->bd_inode->i_count). I think we should call it bdcopy(). Makes sense to me, but IMO that requres an ACK from Jens. Jens, is the patch below fine with you? Rafael > ----------> > From 643014ec079610a8b01dfd78c6949c1e8727195b Mon Sep 17 00:00:00 2001 > From: Alan Jenkins <alan-jenkins@tuffmail.co.uk> > Date: Tue, 21 Jul 2009 10:17:30 +0100 > Subject: [PATCH] PM/hibernate: replace bdget call with bdcopy (simple atomic_inc of i_count) > > Create bdcopy(). This function copies an existing reference to a > block_device. It is safe to call from any context. > > Hibernation code wishes to copy a reference to the active swap device. > Right now it calls bdget() under a spinlock, but this is wrong because > bdget() can sleep. It doesn't need a full bdget() because we already > hold a reference to active swap devices (and the spinlock protects > against swapoff). > > Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> > CC: linux-fsdevel@vger.kernel.org > --- > fs/block_dev.c | 8 ++++++++ > include/linux/fs.h | 1 + > mm/swapfile.c | 4 ++-- > 3 files changed, 11 insertions(+), 2 deletions(-) > > diff --git a/fs/block_dev.c b/fs/block_dev.c > index 3a6d4fb..0b04974 100644 > --- a/fs/block_dev.c > +++ b/fs/block_dev.c > @@ -564,6 +564,14 @@ struct block_device *bdget(dev_t dev) > > EXPORT_SYMBOL(bdget); > > +struct block_device *bdcopy(struct block_device *bdev) > +{ > + atomic_inc(&bdev->bd_inode->i_count); > + return bdev; > +} > + > +EXPORT_SYMBOL(bdcopy); > + > long nr_blockdev_pages(void) > { > struct block_device *bdev; > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 0872372..eeb1091 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1946,6 +1946,7 @@ extern void putname(const char *name); > extern int register_blkdev(unsigned int, const char *); > extern void unregister_blkdev(unsigned int, const char *); > extern struct block_device *bdget(dev_t); > +extern struct block_device *bdcopy(struct block_device *bdev); > extern void bd_set_size(struct block_device *, loff_t size); > extern void bd_forget(struct inode *inode); > extern void bdput(struct block_device *); > diff --git a/mm/swapfile.c b/mm/swapfile.c > index d1ade1a..272ea8e 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -753,7 +753,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) > > if (!bdev) { > if (bdev_p) > - *bdev_p = bdget(sis->bdev->bd_dev); > + *bdev_p = bdcopy(sis->bdev); > > spin_unlock(&swap_lock); > return i; > @@ -765,7 +765,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) > struct swap_extent, list); > if (se->start_block == offset) { > if (bdev_p) > - *bdev_p = bdget(sis->bdev->bd_dev); > + *bdev_p = bdcopy(sis->bdev); > > spin_unlock(&swap_lock); > bdput(bdev); ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: PM/hibernate swapfile regression [not found] ` <4A659D75.3050004@tuffmail.co.uk> 2009-07-21 13:41 ` Rafael J. Wysocki @ 2009-07-21 20:57 ` Rafael J. Wysocki [not found] ` <200907212257.21878.rjw@sisk.pl> ` (2 subsequent siblings) 4 siblings, 0 replies; 11+ messages in thread From: Rafael J. Wysocki @ 2009-07-21 20:57 UTC (permalink / raw) To: Alan Jenkins Cc: Arnd Schneider, Heiko Carstens, linux-kernel, linux-fsdevel, Hans-Joachim Picht, pm list On Tuesday 21 July 2009, Alan Jenkins wrote: > Heiko Carstens wrote: > > On Fri, Jul 17, 2009 at 02:08:46PM +0100, Alan Jenkins wrote: > > > >> Rafael J. Wysocki wrote: > >> > >>> On Tuesday 14 July 2009, Heiko Carstens wrote: > >>> > >>> > >>>> We've seen this bug [...] > >>>> > >>>> Looks like this was introduced with git commit a1bb7d61 "PM/hibernate: fix "swap > >>>> breaks after hibernation failures"". > >>>> Calling bdget while holding a spinlock doesn't seem to be a good idea... > >>>> > >>>> > >>> Agreed, sorry for missing that. > >>> > >>> Alan, can you please prepare a fix? > >>> > >> Here's a quick & dirty patch [...] > >> > > > > Thanks for the patch. Unfortunately Arnd was unable to reproduce the original > > behaviour. But your patch makes sense anyway. > > I also tested it and nothing broke. So should this go upstream? > > I do want to fix it, but I think there's a better way. > > It doesn't really need the sleeping bdget(). All we want is > atomic_inc(&bdev->bd_inode->i_count). I think we should call it bdcopy(). > > ----------> > From 643014ec079610a8b01dfd78c6949c1e8727195b Mon Sep 17 00:00:00 2001 > From: Alan Jenkins <alan-jenkins@tuffmail.co.uk> > Date: Tue, 21 Jul 2009 10:17:30 +0100 > Subject: [PATCH] PM/hibernate: replace bdget call with bdcopy (simple atomic_inc of i_count) > > Create bdcopy(). This function copies an existing reference to a > block_device. It is safe to call from any context. > > Hibernation code wishes to copy a reference to the active swap device. > Right now it calls bdget() under a spinlock, but this is wrong because > bdget() can sleep. It doesn't need a full bdget() because we already > hold a reference to active swap devices (and the spinlock protects > against swapoff). > > Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> > CC: linux-fsdevel@vger.kernel.org > --- > fs/block_dev.c | 8 ++++++++ > include/linux/fs.h | 1 + > mm/swapfile.c | 4 ++-- > 3 files changed, 11 insertions(+), 2 deletions(-) > > diff --git a/fs/block_dev.c b/fs/block_dev.c > index 3a6d4fb..0b04974 100644 > --- a/fs/block_dev.c > +++ b/fs/block_dev.c > @@ -564,6 +564,14 @@ struct block_device *bdget(dev_t dev) > > EXPORT_SYMBOL(bdget); > > +struct block_device *bdcopy(struct block_device *bdev) > +{ > + atomic_inc(&bdev->bd_inode->i_count); > + return bdev; > +} Hmm. If you defined bdcopy() as static inline directly in mm/swapfile.c, the patch would be slightly simpler. Best, Rafael > + > +EXPORT_SYMBOL(bdcopy); > + > long nr_blockdev_pages(void) > { > struct block_device *bdev; > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 0872372..eeb1091 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1946,6 +1946,7 @@ extern void putname(const char *name); > extern int register_blkdev(unsigned int, const char *); > extern void unregister_blkdev(unsigned int, const char *); > extern struct block_device *bdget(dev_t); > +extern struct block_device *bdcopy(struct block_device *bdev); > extern void bd_set_size(struct block_device *, loff_t size); > extern void bd_forget(struct inode *inode); > extern void bdput(struct block_device *); > diff --git a/mm/swapfile.c b/mm/swapfile.c > index d1ade1a..272ea8e 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -753,7 +753,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) > > if (!bdev) { > if (bdev_p) > - *bdev_p = bdget(sis->bdev->bd_dev); > + *bdev_p = bdcopy(sis->bdev); > > spin_unlock(&swap_lock); > return i; > @@ -765,7 +765,7 @@ int swap_type_of(dev_t device, sector_t offset, struct block_device **bdev_p) > struct swap_extent, list); > if (se->start_block == offset) { > if (bdev_p) > - *bdev_p = bdget(sis->bdev->bd_dev); > + *bdev_p = bdcopy(sis->bdev); > > spin_unlock(&swap_lock); > bdput(bdev); ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <200907212257.21878.rjw@sisk.pl>]
* Re: PM/hibernate swapfile regression [not found] ` <200907212257.21878.rjw@sisk.pl> @ 2009-07-21 21:46 ` Johannes Weiner 0 siblings, 0 replies; 11+ messages in thread From: Johannes Weiner @ 2009-07-21 21:46 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Arnd Schneider, Heiko Carstens, linux-kernel, Alan Jenkins, linux-fsdevel, Hans-Joachim Picht, pm list On Tue, Jul 21, 2009 at 10:57:21PM +0200, Rafael J. Wysocki wrote: > On Tuesday 21 July 2009, Alan Jenkins wrote: > > From 643014ec079610a8b01dfd78c6949c1e8727195b Mon Sep 17 00:00:00 2001 > > From: Alan Jenkins <alan-jenkins@tuffmail.co.uk> > > Date: Tue, 21 Jul 2009 10:17:30 +0100 > > Subject: [PATCH] PM/hibernate: replace bdget call with bdcopy (simple atomic_inc of i_count) > > > > Create bdcopy(). This function copies an existing reference to a > > block_device. It is safe to call from any context. > > > > Hibernation code wishes to copy a reference to the active swap device. > > Right now it calls bdget() under a spinlock, but this is wrong because > > bdget() can sleep. It doesn't need a full bdget() because we already > > hold a reference to active swap devices (and the spinlock protects > > against swapoff). > > > > Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> > > CC: linux-fsdevel@vger.kernel.org > > --- > > fs/block_dev.c | 8 ++++++++ > > include/linux/fs.h | 1 + > > mm/swapfile.c | 4 ++-- > > 3 files changed, 11 insertions(+), 2 deletions(-) > > > > diff --git a/fs/block_dev.c b/fs/block_dev.c > > index 3a6d4fb..0b04974 100644 > > --- a/fs/block_dev.c > > +++ b/fs/block_dev.c > > @@ -564,6 +564,14 @@ struct block_device *bdget(dev_t dev) > > > > EXPORT_SYMBOL(bdget); > > > > +struct block_device *bdcopy(struct block_device *bdev) > > +{ > > + atomic_inc(&bdev->bd_inode->i_count); > > + return bdev; > > +} > > Hmm. If you defined bdcopy() as static inline directly in mm/swapfile.c, > the patch would be slightly simpler. Please don't do that. This helper seems to be generic enough to live next to the rest of the bd interface, don't you think? swapfile.c is a mess already... Hannes ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: PM/hibernate swapfile regression [not found] ` <4A659D75.3050004@tuffmail.co.uk> ` (2 preceding siblings ...) [not found] ` <200907212257.21878.rjw@sisk.pl> @ 2009-07-21 21:55 ` Christoph Hellwig [not found] ` <20090721215536.GA9502@infradead.org> 4 siblings, 0 replies; 11+ messages in thread From: Christoph Hellwig @ 2009-07-21 21:55 UTC (permalink / raw) To: Alan Jenkins Cc: Arnd Schneider, Heiko Carstens, linux-kernel, linux-fsdevel, Hans-Joachim Picht, pm list On Tue, Jul 21, 2009 at 11:50:29AM +0100, Alan Jenkins wrote: > +struct block_device *bdcopy(struct block_device *bdev) > +{ > + atomic_inc(&bdev->bd_inode->i_count); > + return bdev; > +} > + > +EXPORT_SYMBOL(bdcopy); The function name doesn't make any sense. You don't copy anything here, but you grab a reference to it. A better name would be bdgrab, mirroing the names of functions like igrab. A kerneldoc comment documenting it would also be very helpful. Why do you export it? The swapfile code is not actually modular. ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <20090721215536.GA9502@infradead.org>]
* Re: PM/hibernate swapfile regression [not found] ` <20090721215536.GA9502@infradead.org> @ 2009-07-25 21:58 ` Rafael J. Wysocki [not found] ` <200907252358.26434.rjw@sisk.pl> 1 sibling, 0 replies; 11+ messages in thread From: Rafael J. Wysocki @ 2009-07-25 21:58 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Schneider, Heiko Carstens, linux-kernel, Alan Jenkins, linux-fsdevel, Hans-Joachim Picht, pm list On Tuesday 21 July 2009, Christoph Hellwig wrote: > On Tue, Jul 21, 2009 at 11:50:29AM +0100, Alan Jenkins wrote: > > +struct block_device *bdcopy(struct block_device *bdev) > > +{ > > + atomic_inc(&bdev->bd_inode->i_count); > > + return bdev; > > +} > > + > > +EXPORT_SYMBOL(bdcopy); > > The function name doesn't make any sense. You don't copy anything > here, but you grab a reference to it. A better name would be bdgrab, > mirroing the names of functions like igrab. A kerneldoc comment > documenting it would also be very helpful. > > Why do you export it? The swapfile code is not actually modular. Thanks for the comments. Does the one below look better? Rafael --- From: Alan Jenkins <alan-jenkins@tuffmail.co.uk> Subject: PM / Hibernate: Replace bdget call with simple atomic_inc of i_count Create bdgrab(). This function copies an existing reference to a block_device. It is safe to call from any context. Hibernation code wishes to copy a reference to the active swap device. Right now it calls bdget() under a spinlock, but this is wrong because bdget() can sleep. It doesn't need a full bdget() because we already hold a reference to active swap devices (and the spinlock protects against swapoff). Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk> CC: linux-fsdevel@vger.kernel.org Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> --- fs/block_dev.c | 10 ++++++++++ include/linux/fs.h | 1 + mm/swapfile.c | 4 ++-- 3 files changed, 13 insertions(+), 2 deletions(-) Index: linux-2.6/fs/block_dev.c =================================================================== --- linux-2.6.orig/fs/block_dev.c +++ linux-2.6/fs/block_dev.c @@ -564,6 +564,16 @@ struct block_device *bdget(dev_t dev) EXPORT_SYMBOL(bdget); +/** + * bdgrab -- Grab a reference to an already referenced block device + * @bdev: Block device to grab a reference to. + */ +struct block_device *bdgrab(struct block_device *bdev) +{ + atomic_inc(&bdev->bd_inode->i_count); + return bdev; +} + long nr_blockdev_pages(void) { struct block_device *bdev; Index: linux-2.6/include/linux/fs.h =================================================================== --- linux-2.6.orig/include/linux/fs.h +++ linux-2.6/include/linux/fs.h @@ -1946,6 +1946,7 @@ extern void putname(const char *name); extern int register_blkdev(unsigned int, const char *); extern void unregister_blkdev(unsigned int, const char *); extern struct block_device *bdget(dev_t); +extern struct block_device *bdgrab(struct block_device *bdev); extern void bd_set_size(struct block_device *, loff_t size); extern void bd_forget(struct inode *inode); extern void bdput(struct block_device *); Index: linux-2.6/mm/swapfile.c =================================================================== --- linux-2.6.orig/mm/swapfile.c +++ linux-2.6/mm/swapfile.c @@ -753,7 +753,7 @@ int swap_type_of(dev_t device, sector_t if (!bdev) { if (bdev_p) - *bdev_p = bdget(sis->bdev->bd_dev); + *bdev_p = bdgrab(sis->bdev); spin_unlock(&swap_lock); return i; @@ -765,7 +765,7 @@ int swap_type_of(dev_t device, sector_t struct swap_extent, list); if (se->start_block == offset) { if (bdev_p) - *bdev_p = bdget(sis->bdev->bd_dev); + *bdev_p = bdgrab(sis->bdev); spin_unlock(&swap_lock); bdput(bdev); ^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <200907252358.26434.rjw@sisk.pl>]
* Re: PM/hibernate swapfile regression [not found] ` <200907252358.26434.rjw@sisk.pl> @ 2009-07-27 5:12 ` Christoph Hellwig 0 siblings, 0 replies; 11+ messages in thread From: Christoph Hellwig @ 2009-07-27 5:12 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Arnd Schneider, Heiko Carstens, linux-kernel, Christoph Hellwig, Alan Jenkins, linux-fsdevel, Hans-Joachim Picht, pm list On Sat, Jul 25, 2009 at 11:58:25PM +0200, Rafael J. Wysocki wrote: > Thanks for the comments. Does the one below look better? Yes, that looks much better to me. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2009-07-27 5:12 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20090714135453.GA26976@osiris.boeblingen.de.ibm.com>
2009-07-14 16:21 ` PM/hibernate swapfile regression Rafael J. Wysocki
2009-07-15 21:24 ` Alan Jenkins
2009-07-17 13:08 ` Alan Jenkins
[not found] ` <4A6077DE.6060202@tuffmail.co.uk>
2009-07-20 13:24 ` Heiko Carstens
[not found] ` <20090720132456.GA29023@osiris.boeblingen.de.ibm.com>
2009-07-21 10:50 ` Alan Jenkins
[not found] ` <4A659D75.3050004@tuffmail.co.uk>
2009-07-21 13:41 ` Rafael J. Wysocki
2009-07-21 20:57 ` Rafael J. Wysocki
[not found] ` <200907212257.21878.rjw@sisk.pl>
2009-07-21 21:46 ` Johannes Weiner
2009-07-21 21:55 ` Christoph Hellwig
[not found] ` <20090721215536.GA9502@infradead.org>
2009-07-25 21:58 ` Rafael J. Wysocki
[not found] ` <200907252358.26434.rjw@sisk.pl>
2009-07-27 5:12 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox