All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vivek Goyal <vgoyal@redhat.com>
To: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Jens Axboe <jaxboe@fusionio.com>, Ingo Molnar <mingo@elte.hu>,
	Tejun Heo <tj@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Subject: Re: [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re: [GIT PULL] core block bits for  2.6.37-rc1)
Date: Sun, 24 Oct 2010 02:15:10 -0400	[thread overview]
Message-ID: <20101024061510.GB7474@redhat.com> (raw)
In-Reply-To: <1287865993.3322.3.camel@maxim-laptop>

On Sat, Oct 23, 2010 at 10:33:13PM +0200, Maxim Levitsky wrote:
> On Sat, 2010-10-23 at 20:43 +0200, Jens Axboe wrote:
> > On 2010-10-23 20:21, Ingo Molnar wrote:
> > > 
> > > * Jens Axboe <jaxboe@fusionio.com> wrote:
> > > 
> > >>> Looks like a fairly straight forward case of uninitialized memory and 
> > >>> blk_sync_queue() -> throtl_shutdown_timer() -> cancel_delayed_work_sync().
> > >>>
> > >>> Will get that fixed up.
> > >>
> > >> It frees q->td in blk_cleanup_queue(), but doesn't clear q->td. When the final put 
> > >> happens, blk_sync_queue() is called and then ends up doing the 
> > >> cancel_delayed_work_sync() on freed memory.
> > >>
> > >> Two possible fixes:
> > >>
> > >> - Clear ->td when the queue is goin dead. May require other ->td == NULL
> > >>   checks in the code, so I opted for:
> > >>
> > >> - Move the free to when the queue is really going away, post doing the
> > >>   blk_sync_queue() call.
> > >>
> > >> The below should fix it.
> > >>
> > >> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
> > > 
> > > This did the trick, thanks Jens!
> > 
> > Great, thanks for testing/reporting! I added your reported/tested-by.
> > 
> > Linus, please pull this single fix, better get this out the door since
> > I'll be travelling very shortly.
> > 
> > 
> >   git://git.kernel.dk/linux-2.6-block.git for-2.6.37/core
> > 
> > Jens Axboe (1):
> >       block: fix use-after-free bug in blk throttle code
> > 
> >  block/blk-core.c  |    2 --
> >  block/blk-sysfs.c |    2 ++
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> I have here very similar bug.
> Must have been caused by this patch series.
> I pulled that tree, but that didn't affect anything.
> 
> System oopses/panics on removal of any hotplugable device.
> (reproduced with xD, MemoryStick, and USB mass storage).
> 
> Here is backtrace for MemoryStick card:
> 
> <6>[   24.138665] r592: IRQ: card removed
> <1>[   24.228293] BUG: unable to handle kernel NULL pointer dereference at 00000000000001f8
> <1>[   24.228966] IP: [<00000000000001f8>] 0x1f8
> <4>[   24.230739] PGD 0 
> <0>[   24.231182] Oops: 0010 [#1] PREEMPT SMP 
> <0>[   24.231182] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda3/alignment_offset
> <4>[   24.231182] CPU 1 
> <4>[   24.231182] Modules linked in: dm_crypt firewire_net usb_storage usb_libusual cpufreq_powersave cpufreq_conservative cpufreq_userspace uvcvideo videodev v4l2_compat_ioctl32 acpi_cpufreq iwl3945 iwlcore snd_hda_codec_realtek mac80211 mperf r852 iTCO_wdt coretemp uhci_hcd sm_common ir_lirc_codec mspro_block snd_hda_intel ms_block ehci_hcd sdhci_pci lirc_dev joydev sbp2 nand snd_hda_codec cfg80211 firewire_ohci sdhci ir_sony_decoder ieee1394 nand_ids usbcore r592 ir_jvc_decoder snd_hwdep mmc_core nand_ecc ir_rc6_decoder ene_ir snd_pcm tg3 ir_rc5_decoder firewire_core mtd battery memstick ac ir_nec_decoder psmouse snd_page_alloc libphy sunrpc ir_core sg evdev serio_raw dm_mirror dm_region_hash dm_log dm_mod nouveau ttm drm_kms_helper drm i2c_algo_bit thermal video
> <4>[   32.881606] 
> <4>[   32.881606] Pid: 543, comm: kworker/u:4 Not tainted 2.6.36+ #191 Nettiling/Aspire 5720     
> <4>[   32.881606] RIP: 0010:[<00000000000001f8>]  [<00000000000001f8>] 0x1f8
> <4>[   32.881606] RSP: 0018:ffff880037a03ab8  EFLAGS: 00010086
> <4>[   32.881606] RAX: ffff88007c0ebc00 RBX: ffff880037af9470 RCX: 0000000000000000
> <4>[   32.881606] RDX: 0000000000000019 RSI: 0000000000000001 RDI: ffff880037af9470
> <4>[   32.881606] RBP: ffff880037a03ad0 R08: 0000000000000000 R09: 0000000000000001
> <4>[   32.881606] R10: 00000000000002f0 R11: 0000000000000000 R12: ffff880037af9470
> <4>[   32.881606] R13: ffff880075d6a870 R14: ffff880075bfb560 R15: 0000000000000282
> <4>[   32.881606] FS:  0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
> <4>[   32.881606] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> <4>[   32.881606] CR2: 00000000000001f8 CR3: 000000007a046000 CR4: 00000000000006e0
> <4>[   32.881606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> <4>[   32.881606] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> <4>[   32.881606] Process kworker/u:4 (pid: 543, threadinfo ffff880037a02000, task ffff88007c5b0000)
> <0>[   32.881606] Stack:
> <4>[   32.881606]  ffffffff811c42a2 ffff880037a03af0 ffff880037af9470 ffff880037a03af0
> <4>[   32.881606] <0> ffffffff811c525a ffff880077250040 ffff880077250040 ffff880037a03b10
> <4>[   32.881606] <0> ffffffff811cebb2 ffff880075d6a800 ffff880075d6a8a8 ffff880037a03b30
> <0>[   32.881606] Call Trace:
> <4>[   32.881606]  [<ffffffff811c42a2>] ? elv_drain_elevator+0x22/0x70
> <4>[   32.881606]  [<ffffffff811c525a>] elv_quiesce_start+0x3a/0xc0
> <4>[   32.881606]  [<ffffffff811cebb2>] disk_replace_part_tbl+0x42/0x70
> <4>[   32.881606]  [<ffffffff811cec63>] disk_release+0x23/0x50
> <4>[   32.881606]  [<ffffffff81273c42>] device_release+0x22/0x90
> <4>[   32.881606]  [<ffffffff811daced>] kobject_release+0x8d/0x1a0
> <4>[   32.881606]  [<ffffffff811dac60>] ? kobject_release+0x0/0x1a0
> <4>[   32.881606]  [<ffffffff811dc257>] kref_put+0x37/0x70
> <4>[   32.881606]  [<ffffffff811dab67>] kobject_put+0x27/0x60
> <4>[   32.881606]  [<ffffffff811cef42>] put_disk+0x12/0x20
> <4>[   32.881606]  [<ffffffffa0627663>] mspro_block_disk_release+0xa3/0xb0 [mspro_block]
> <4>[   32.881606]  [<ffffffffa062773d>] mspro_block_remove+0xcd/0x140 [mspro_block]
> <4>[   32.881606]  [<ffffffffa01d42b5>] memstick_device_remove+0x35/0x60 [memstick]
> <4>[   32.881606]  [<ffffffff81277630>] __device_release_driver+0x70/0xe0
> <4>[   32.881606]  [<ffffffff8127779a>] device_release_driver+0x2a/0x40
> <4>[   32.881606]  [<ffffffff812769b5>] bus_remove_device+0xb5/0x120
> <4>[   32.881606]  [<ffffffff81274817>] device_del+0x127/0x1d0
> <4>[   32.881606]  [<ffffffff812748dd>] device_unregister+0x1d/0x60
> <4>[   32.881606]  [<ffffffffa01d5071>] memstick_check+0x241/0x360 [memstick]
> <4>[   32.881606]  [<ffffffff8105a740>] process_one_work+0x1c0/0x4d0
> <4>[   32.881606]  [<ffffffff8105a6e2>] ? process_one_work+0x162/0x4d0
> <4>[   32.881606]  [<ffffffffa01d4e30>] ? memstick_check+0x0/0x360 [memstick]
> <4>[   32.881606]  [<ffffffff8105ae36>] worker_thread+0x156/0x410
> <4>[   32.881606]  [<ffffffff8105ace0>] ? worker_thread+0x0/0x410
> <4>[   32.881606]  [<ffffffff8105ed66>] kthread+0xb6/0xc0
> <4>[   32.881606]  [<ffffffff81037fa6>] ? finish_task_switch+0x46/0xe0
> <4>[   32.881606]  [<ffffffff81003c14>] kernel_thread_helper+0x4/0x10
> <4>[   32.881606]  [<ffffffff8105ecb0>] ? kthread+0x0/0xc0
> <4>[   32.881606]  [<ffffffff81003c10>] ? kernel_thread_helper+0x0/0x10
> <0>[   32.881606] Code:  Bad RIP value.
> <1>[   32.881606] RIP  [<00000000000001f8>] 0x1f8
> <4>[   32.881606]  RSP <ffff880037a03ab8>
> <0>[   32.881606] CR2: 00000000000001f8
> <4>[   32.881606] ---[ end trace ca0206dec4457aff ]---
> 

Looking at the backtrace and commit messages, it might be coming from
following commit.

commit 7681bfeeccff5efa9eb29bf09249a3c400b15327
Author: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Date:   Tue Oct 19 09:05:00 2010 +0200

    block: fix accounting bug on cross partition merges


Looks like we have freed the request queue in mspro_block_remove() and
then we are calling mspro_block_disk_release() which ends up accessing
request queue in disk_replace_part_tbl(). So use-after-free case.  
 
CCing Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>.

Thanks
Vivek

  reply	other threads:[~2010-10-24  6:16 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-10-22  7:57 [GIT PULL] core block bits for 2.6.37-rc1 Jens Axboe
2010-10-23 15:29 ` [origin tree boot failure] " Ingo Molnar
2010-10-23 15:42   ` Linus Torvalds
2010-10-23 15:52   ` Ingo Molnar
2010-10-23 16:51   ` Jens Axboe
2010-10-23 17:17     ` Jens Axboe
2010-10-23 18:21       ` Ingo Molnar
2010-10-23 18:43         ` [GIT PULL] Throtl bug (was Re: [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1) Jens Axboe
2010-10-23 20:33           ` Maxim Levitsky
2010-10-24  6:15             ` Vivek Goyal [this message]
2010-10-24  5:48       ` [origin tree boot failure] Re: [GIT PULL] core block bits for 2.6.37-rc1 Vivek Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101024061510.GB7474@redhat.com \
    --to=vgoyal@redhat.com \
    --cc=isimatu.yasuaki@jp.fujitsu.com \
    --cc=jaxboe@fusionio.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maximlevitsky@gmail.com \
    --cc=mingo@elte.hu \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.