Oops in loop_clr_fd => bd_set_size

From: Alan Curry <pacman@kosh.dhis.org>
To: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Jens Axboe <axboe@kernel.dk>, Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Oops in loop_clr_fd => bd_set_size
Date: Sun, 5 Aug 2012 22:02:45 -0400
Message-ID: <20120806020245.GA7130@kosh.dhis.org>

I got an Oops from running "losetup -d /dev/loop2". The trace shows
loop_clr_fd calling bd_set_size, and the parameter bdev appears to have a
NULL in its bd_disk field.

The losetup command was run as part of a script that did this:
  losetup -d /dev/loop5
  losetup -d /dev/loop4
  losetup -d /dev/loop3
  losetup -d /dev/loop2
  losetup -d /dev/loop1
The kernel was processing several loop_clr_fd calls in quick succession. The
first three worked, and the fourth (loop2) Oopsed.

After that I ran the losetup -d /dev/loop1 separately and it worked. The
process that caused the Oops didn't die:
  PID TTY      STAT   TIME COMMAND
 5055 pts/1    D      0:00 [losetup]

I also tried to query the current state of the device with
"losetup /dev/loop2" after the Oops. That gave me a second stuck process:
  PID TTY      STAT   TIME COMMAND
 5059 pts/1    D+     0:00 losetup /dev/loop2

These processes are still alive, in their permanent D state. The rest of
the system is still functional. I'll try to keep it that way for now, in
case anyone wants to suggest some debugging actions that I can take.
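
One possible way to inspect tasks stuck like this, as a sketch only: pick
the D-state PIDs out of ps output and read their kernel stacks from /proc.
d_state_pids is a hypothetical helper name, and /proc/PID/stack is assumed
to be available (it needs root and a kernel built with CONFIG_STACKTRACE).

```shell
#!/bin/sh
# Hypothetical helper: read "ps ax"-style text on stdin and print the PID
# of every task whose STAT column starts with "D" (uninterruptible sleep).
d_state_pids() {
    # Column 1 is PID, column 3 is STAT; the header line has "STAT" in
    # column 3, which does not start with "D", so it is skipped naturally.
    awk '$3 ~ /^D/ { print $1 }'
}

# Example use (assumption: run as root, CONFIG_STACKTRACE enabled):
#   for pid in $(ps ax | d_state_pids); do
#       echo "== $pid"; cat "/proc/$pid/stack"
#   done
```

Writing "w" to /proc/sysrq-trigger (if SysRq is enabled) is another way to
get the blocked tasks' stacks into the kernel log.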

The loop devices were set up to handle an unusual situation: I have a whole
hard drive image contained within a partition on another hard drive. Each
loop device corresponds to a partition of the imaged drive. They were set
up like this, with numbers taken from its partition table:

cyl=516096
losetup -o $((1*$cyl)) --sizelimit $(((2080-1+1)*$cyl)) /dev/loop1 /dev/sda6
losetup -o $((2081*$cyl)) --sizelimit $(((4160-2081+1)*$cyl)) /dev/loop2 /dev/sda6
losetup -o $((4161*$cyl)) --sizelimit $(((24965-4161+1)*$cyl)) /dev/loop3 /dev/sda6
losetup -o $((24966*$cyl)) --sizelimit $(((45770-24966+1)*$cyl)) /dev/loop4 /dev/sda6
losetup -o $((45771*$cyl)) --sizelimit $(((158815-45771+1)*$cyl)) /dev/loop5 /dev/sda6

So all the loop devices were referencing the same backing device, with
different (adjacent but non-overlapping) offsets.
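
The offset and size limit in each command above follow one pattern: a
partition spanning cylinders START through END inclusive begins at
START*cyl bytes and is (END-START+1)*cyl bytes long. A minimal sketch of
that arithmetic, where loop_range is a hypothetical helper name and not
part of losetup:

```shell
#!/bin/sh
# Bytes per cylinder, same value as in the losetup commands above.
cyl=516096

# Hypothetical helper: print "OFFSET SIZELIMIT" in bytes for a partition
# that spans cylinders $1 through $2 inclusive.
loop_range() {
    start=$1
    end=$2
    offset=$((start * cyl))
    sizelimit=$(((end - start + 1) * cyl))
    echo "$offset $sizelimit"
}

# The range used for loop2.
loop_range 2081 4160
```

"Adjacent" here means that each offset plus its size limit equals the next
offset, e.g. 1*cyl + 2080*cyl = 2081*cyl for loop1 and loop2.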

I had also done blockdev --setro on sda6 and all of the loop devices as
soon as they were created. The devices were successfully dm_snapshot'ed,
fscked, mounted, and all data was copied off of them before I did the
losetup -d that caused the oops. The one that failed, loop2, actually
corresponded to the swap partition of the imaged drive so I didn't copy
anything off of it, but I know it was working because I used "strings
/dev/loop2" to figure out what it was.

That's all the information I can think of that might be relevant. I'm
willing to dig deeper if there's anything that can be retrieved from the
two stuck processes, or I could reboot and try to repeat the incident.

Here is the Oops:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000328
IP: [<ffffffff8110ffed>] bd_set_size+0x7/0x5e
PGD 156eb067 PUD 102aa067 PMD 0 
Oops: 0000 [#1] SMP 
CPU 1 
Modules linked in: dm_snapshot ext3 jbd ext2 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc arc4 carl9170 ehci_hcd mac80211 usbcore usb_common led_class ath cfg80211 rfkill sha256_generic aes_x86_64 aes_generic cbc crc32c_intel loop dm_crypt dm_mod sd_mod crc_t10dif ata_piix libata scsi_mod

Pid: 5055, comm: losetup Not tainted 3.5.0 #17 BIOSTAR Group TH61 ITX/TH61 ITX
RIP: 0010:[<ffffffff8110ffed>]  [<ffffffff8110ffed>] bd_set_size+0x7/0x5e
RSP: 0018:ffff88001113ddd0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8800721b4a00 RCX: 0000000180240011
RDX: 0000000180240012 RSI: 0000000000000000 RDI: ffff880100385d40
RBP: ffff880100385d40 R08: ffff880072dcb8c0 R09: 0000000180240011
R10: 0000000080240011 R11: 0000000000000000 R12: ffff880073712300
R13: 0000000000020010 R14: ffff8800721b4b14 R15: 0000000000000000
FS:  00007f98183ba700(0000) GS:ffff880100300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000328 CR3: 000000001adcb000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process losetup (pid: 5055, threadinfo ffff88001113c000, task ffff8800732c8640)
Stack:
 ffffffffa00b251c ffff8800721b4a00 ffff8800739ec9c0 0000000000004c01
 0000000000000000 ffff8800721b4b30 ffffffffa00b315d ffff880072209c20
 ffff88000f9f6900 00007f9817f09430 000000000000001d ffff880031878840
Call Trace:
 [<ffffffffa00b251c>] ? loop_clr_fd+0x154/0x1f5 [loop]
 [<ffffffffa00b315d>] ? lo_ioctl+0x4af/0x657 [loop]
 [<ffffffff811a38f4>] ? blkdev_ioctl+0x632/0x666
 [<ffffffff811100de>] ? block_ioctl+0x32/0x36
 [<ffffffff810f54f0>] ? do_vfs_ioctl+0x44b/0x490
 [<ffffffff810df6b7>] ? virt_to_head_page+0x9/0x2c
 [<ffffffff810e1814>] ? kmem_cache_free+0x12/0x9e
 [<ffffffff810f5571>] ? sys_ioctl+0x3c/0x5f
 [<ffffffff81346422>] ? system_call_fastpath+0x16/0x1b
Code: 20 31 c0 48 85 c9 75 1b 48 39 7f 70 74 13 48 8b 46 50 48 3d ba ff 10 81 74 07 48 85 c0 0f 94 c0 c3 b0 01 c3 48 8b 87 90 00 00 00 <48> 8b 80 28 03 00 00 48 85 c0 74 0e 8b 90 d0 04 00 00 66 85 d2 
RIP  [<ffffffff8110ffed>] bd_set_size+0x7/0x5e
 RSP <ffff88001113ddd0>
CR2: 0000000000000328
---[ end trace 149557d36d01641b ]---
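
The Code: line above is consistent with the NULL bd_disk reading. The byte
in angle brackets marks the start of the trapping instruction. As a sketch,
assuming that instruction is the 7-byte form "48 8b 80" followed by a
32-bit little-endian displacement (a load from disp(%rax) into %rax; this
is a hand decode, not disassembler output), the displacement can be pulled
out and compared with CR2. faulting_bytes is a hypothetical helper name,
and code_line is abbreviated to the bytes around the marker.

```shell
#!/bin/sh
# Bytes around the <..> marker, copied from the Oops "Code:" line.
code_line='Code: 48 8b 87 90 00 00 00 <48> 8b 80 28 03 00 00 48 85 c0 74 0e'

# Hypothetical helper: print the first 7 bytes of the trapping instruction,
# i.e. the byte inside <..> and the six bytes that follow it.
faulting_bytes() {
    printf '%s\n' "$1" |
        sed -n 's/.*<\([0-9a-f][0-9a-f]\)>\(.*\)/\1\2/p' |
        cut -d' ' -f1-7
}

# Bytes 4..7 hold the displacement, least significant byte first.
set -- $(faulting_bytes "$code_line")
disp=$((0x$7$6$5$4))
printf 'displacement: 0x%x\n' "$disp"
```

The displacement matches CR2 (0x328) while RAX is 0, so the load went
through a NULL base pointer. The instruction just before it,
"48 8b 87 90 00 00 00", loads from 0x90(%rdi), a field of the first
argument; that fits bdev->bd_disk only on the assumption that bd_disk sits
at offset 0x90 in this build. The kernel tree's scripts/decodecode can
produce a full disassembly from the Oops text.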

-- 
Alan Curry

Thread overview: 2+ messages
2012-08-06  2:02 Alan Curry [this message]
2013-04-01 12:07 ` Oops in loop_clr_fd => bd_set_size Anatol Pomozov
