From: Jeff Garzik <jeff@garzik.org>
To: tfjellstrom@shaw.ca
Cc: linux-kernel@vger.kernel.org,
linux-scsi <linux-scsi@vger.kernel.org>,
Linux IDE mailing list <linux-ide@vger.kernel.org>
Subject: Re: 2.6.31-rc9 kernel BUG and mvsas
Date: Sat, 12 Sep 2009 20:43:14 -0400 [thread overview]
Message-ID: <4AAC4022.4040805@garzik.org> (raw)
In-Reply-To: <200909101755.51459.tfjellstrom@shaw.ca>
On 09/10/2009 07:55 PM, Thomas Fjellstrom wrote:
> On Wed September 9 2009, Thomas Fjellstrom wrote:
>> On Wed September 9 2009, Thomas Fjellstrom wrote:
>>> On Wed September 9 2009, Jeff Garzik wrote:
>>>> On 09/09/2009 12:30 PM, Thomas Fjellstrom wrote:
>>>>> No errors on that disk. Other than the one above, and its more of a
>>>>> warning. However, I just rebooted to add some extra drives, thinking
>>>>> everything was working a little better now that I've updated to
>>>>> 2.6.31-rc9, I'm treated to the following two messages right after
>>>>> boot (and a system lockup to boot):
>>>>>
>>>>> kernel: [ 971.033138] ------------[ cut here ]------------
>>>>> kernel: [ 971.033211] WARNING: at drivers/ata/libata-core.c:4913
>>>>> __ata_qc_complete+0x5a/0xe1 [libata]()
>>>>> kernel: [ 971.033217] Hardware name: GA-MA790FXT-UD5P
>>>>> kernel: [ 971.033221] Modules linked in: powernow_k8
>>>>> cpufreq_conservative cpufreq_stats cpufreq_userspace
>>>>> cpufreq_powersave kvm_amd kvm nfsd exportfs nfs lockd fscache nfs_acl
>>>>> auth_rpcgss sunrpc bridge stp it87 hwmon_vid adt7473 firewire_sbp2
>>>>> loop md_mod
>>>>> snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep
>>>>> snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi
>>>>> snd_seq_midi_event snd_seq snd_timer snd_seq_device snd
>>>>> amd64_edac_mod edac_core i2c_piix4 soundcore snd_page_alloc i2c_core
>>>>> evdev wmi parport_pc button parport processor ext3 jbd mbcache dm_mod
>>>>> sg sr_mod cdrom sd_mod crc_t10dif usbhid ata_generic ide_pci_generic
>>>>> hid mvsas firewire_ohci libsas firewire_core crc_itu_t
>>>>> scsi_transport_sas r8169 atiixp ide_core floppy ahci mii ohci_hcd
>>>>> libata ehci_hcd scsi_mod thermal fan thermal_sys [last unloaded:
>>>>> scsi_wait_scan]
>>>>> kernel: [ 971.033337] Pid: 0, comm: swapper Not tainted 2.6.31-rc9
>>>>> #2 kernel: [ 971.033342] Call Trace:
>>>>> kernel: [ 971.033346]<IRQ> [<ffffffffa00562ca>] ?
>>>>> __ata_qc_complete+0x5a/0xe1 [libata]
>>>>> kernel: [ 971.033434] [<ffffffffa00562ca>] ?
>>>>> __ata_qc_complete+0x5a/0xe1 [libata]
>>>>> kernel: [ 971.033446] [<ffffffff8104aca0>] ?
>>>>> warn_slowpath_common+0x77/0xa3 kernel: [ 971.033455]
>>>>> [<ffffffff81038d06>] ? enqueue_task+0x5c/0x65 kernel: [ 971.033496]
>>>>> [<ffffffffa00562ca>] ? __ata_qc_complete+0x5a/0xe1 [libata]
>>>>> kernel: [ 971.033519] [<ffffffffa00f7b59>] ?
>>>>> sas_ata_task_done+0x178/0x210 [libsas]
>>>>> kernel: [ 971.033528] [<ffffffff8115ead1>] ?
>>>>> blk_run_queue+0x21/0x35 kernel: [ 971.033548] [<ffffffffa010e2ce>]
>>>>> ?
>>>>> mvs_slot_complete+0x3df/0x41b [mvsas]
>>>>> kernel: [ 971.033565] [<ffffffffa010e39c>] ? mvs_int_rx+0x92/0x101
>>>>> [mvsas] kernel: [ 971.033583] [<ffffffffa01112ba>] ?
>>>>> mvs_int_full+0x25/0x88 [mvsas] kernel: [ 971.033600]
>>>>> [<ffffffffa011134e>] ? mvs_64xx_isr+0x31/0x40 [mvsas] kernel: [
>>>>> 971.033617] [<ffffffffa010d0e5>] ? mvs_interrupt+0x61/0x78 [mvsas]
>>>>> kernel: [ 971.033625] [<ffffffff8108aaac>] ?
>>>>> handle_IRQ_event+0x58/0x135 kernel: [ 971.033633]
>>>>> [<ffffffff8108c1a1>] ? handle_fasteoi_irq+0x7d/0xb5 kernel: [
>>>>> 971.033642]
>>>>> [<ffffffff8101388d>] ? handle_irq+0x17/0x1d
>>>>
>>>> That warning is triggered when an ata_queued_cmd is passed to
>>>> completion without the ATA_QCFLAG_ACTIVE flag being set (which
>>>> indicates the qc was started with some activity).
>>>>
>>>> That possibly indicates the low-level driver (or libsas) was passing an
>>>> already-completed cmd to libata.
>>>>
>>>>> The added hard drives are connected to a Supermicro AOC-SASLP-MV8,
>>>>> which is based on a marvel MV64460/64461/64462 chipset, which uses
>>>>> the sata_mv driver.
>>>>
>>>> Surely you mean 'mvsas' driver?
>>>
>>> Yes, sorry I did mean mvsas.
>>>
>>> I am more concerned about the actual oops/BUG rather than the warning
>>> though. Unless the problem causing the warning is also causing the oops.
>>>
>>>> Jeff
>>
>> Thanks for taking a look so far. But I'm having more and more trouble with
>> this card as the days go by:
>>
>> [ 464.792214] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 464.792222] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 494.816170] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 494.816179] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 494.816192] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 494.816197] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 525.817335] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 525.817343] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 525.817358] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 525.817363] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 556.816148] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 556.816157] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 556.816170] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 556.816175] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 587.816171] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 587.816179] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 587.816193] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 587.816199] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 600.616255] INFO: task mount:4395 blocked for more than 120 seconds.
>> [ 600.616263] "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables
>> this message.
>> [ 600.616270] mount D 0000000000000000 0 4395 4229
>> 0x00000000 [ 600.616281] ffff88012fb6f780 0000000000000082
>> ffff8800b808ddc8 ffff8800b80f3d60
>> [ 600.616290] ffff880128923c90 0000000000014800 000000000000f800
>> ffff88012dd91840
>> [ 600.616299] ffff88012dd91b38 0000000300000008 0000000000000000
>> ffff88010c5bfa88
>> [ 600.616308] Call Trace:
>> [ 600.616324] [<ffffffff81017015>] ? read_tsc+0xa/0x20
>> [ 600.616335] [<ffffffff810adc63>] ? __pagevec_free+0x29/0x3b
>> [ 600.616343] [<ffffffff810661e9>] ? getnstimeofday+0x55/0xaf
>> [ 600.616351] [<ffffffff810a8e69>] ? sync_page+0x0/0x46
>> [ 600.616361] [<ffffffff812dc8cb>] ? io_schedule+0x63/0xa5
>> [ 600.616368] [<ffffffff810a8eaa>] ? sync_page+0x41/0x46
>> [ 600.616376] [<ffffffff812dcade>] ? __wait_on_bit_lock+0x3f/0x84
>> [ 600.616383] [<ffffffff810a8e55>] ? __lock_page+0x5d/0x63
>> [ 600.616391] [<ffffffff8105ef14>] ? wake_bit_function+0x0/0x23
>> [ 600.616401] [<ffffffff810b0ad4>] ? pagevec_lookup+0x17/0x1e
>> [ 600.616408] [<ffffffff810b1bbd>] ?
>> truncate_inode_pages_range+0x288/0x318 [ 600.616418]
>> [<ffffffff810fc8cb>] ? set_blocksize+0xc2/0xd2
>> [ 600.616426] [<ffffffff810fc8f2>] ? sb_set_blocksize+0x17/0x43
>> [ 600.616477] [<ffffffffa04aa8e0>] ? ext4_fill_super+0x1cc/0x2060 [ext4]
>> [ 600.616486] [<ffffffff8117593b>] ? snprintf+0x44/0x4c
>> [ 600.616493] [<ffffffff810fb5b7>] ? check_disk_change+0x22/0x52
>> [ 600.616501] [<ffffffff812dd680>] ? __down_write_nested+0x15/0xab
>> [ 600.616524] [<ffffffff810dbfe6>] ? get_sb_bdev+0x111/0x159
>> [ 600.616571] [<ffffffffa04aa714>] ? ext4_fill_super+0x0/0x2060 [ext4]
>> [ 600.616582] [<ffffffff810dbc44>] ? vfs_kern_mount+0x95/0x111
>> [ 600.616593] [<ffffffff810dbd13>] ? do_kern_mount+0x43/0xe2
>> [ 600.616607] [<ffffffff810ef49d>] ? do_mount+0x767/0x7d6
>> [ 600.616620] [<ffffffff810ef591>] ? sys_mount+0x85/0xc8
>> [ 600.616633] [<ffffffff81010a02>] ? system_call_fastpath+0x16/0x1b
>> [ 618.816175] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 618.816184] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 618.816196] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 618.816201] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 618.816241] sd 1:0:3:0: [sdf] Unhandled error code
>> [ 618.816247] sd 1:0:3:0: [sdf] Result: hostbyte=DID_OK
>> driverbyte=DRIVER_TIMEOUT
>> [ 618.816255] end_request: I/O error, dev sdf, sector 160
>> [ 618.816263] Buffer I/O error on device md2, logical block 20
>> [ 618.816274] Buffer I/O error on device md2, logical block 21
>> [ 618.816280] Buffer I/O error on device md2, logical block 22
>> [ 618.816285] Buffer I/O error on device md2, logical block 23
>> [ 618.816291] Buffer I/O error on device md2, logical block 24
>> [ 618.816298] Buffer I/O error on device md2, logical block 25
>> [ 618.816307] Buffer I/O error on device md2, logical block 26
>> [ 618.816316] Buffer I/O error on device md2, logical block 27
>> [ 618.816324] Buffer I/O error on device md2, logical block 28
>> [ 618.816331] Buffer I/O error on device md2, logical block 29
>> [ 649.780185] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 649.780194] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 649.780208] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 649.780214] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 649.780293] ------------[ cut here ]------------
>> [ 649.780366] WARNING: at drivers/ata/libata-core.c:5129
>> ata_qc_issue+0x10a/0x347 [libata]()
>> [ 649.780373] Hardware name: GA-MA790FXT-UD5P
>> [ 649.780377] Modules linked in: ext4 jbd2 crc16 raid0 powernow_k8
>> cpufreq_conservative cpufreq_stats cpufreq_userspace cpufreq_powersave
>> kvm_amd kvm nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc
>> bridge stp it87 hwmon_vid adt7473 firewire_sbp2 loop md_mod
>> snd_hda_codec_realtek
>> snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm
>> snd_seq_midi snd_rawmidi snd_seq_midi_event amd64_edac_mod snd_seq
>> edac_core snd_timer snd_seq_device snd soundcore i2c_piix4 snd_page_alloc
>> i2c_core evdev parport_pc wmi parport button processor ext3 jbd mbcache
>> dm_mod usbhid hid sg sr_mod cdrom sd_mod crc_t10dif ata_generic
>> ide_pci_generic firewire_ohci firewire_core ohci_hcd crc_itu_t atiixp
>> ide_core mvsas ehci_hcd ahci libsas libata scsi_transport_sas scsi_mod
>> r8169 mii floppy thermal fan thermal_sys [last unloaded: scsi_wait_scan]
>> [ 649.780499] Pid: 3185, comm: hddtemp Not tainted 2.6.31-rc9 #2
>> [ 649.780504] Call Trace:
>> [ 649.780551] [<ffffffffa0074f90>] ? ata_qc_issue+0x10a/0x347 [libata]
>> [ 649.780593] [<ffffffffa0074f90>] ? ata_qc_issue+0x10a/0x347 [libata]
>> [ 649.780607] [<ffffffff8104aca0>] ? warn_slowpath_common+0x77/0xa3
>> [ 649.780649] [<ffffffffa00790ce>] ? ata_scsi_pass_thru+0x0/0x240
>> [libata] [ 649.780690] [<ffffffffa0074f90>] ? ata_qc_issue+0x10a/0x347
>> [libata] [ 649.780733] [<ffffffffa00401d6>] ? scsi_get_command+0x75/0x97
>> [scsi_mod] [ 649.780776] [<ffffffffa00790ce>] ?
>> ata_scsi_pass_thru+0x0/0x240 [libata] [ 649.780815] [<ffffffffa003f7aa>]
>> ? scsi_done+0x0/0xc [scsi_mod] [ 649.780858] [<ffffffffa007a4d5>] ?
>> __ata_scsi_queuecmd+0x185/0x1dc [libata]
>> [ 649.780896] [<ffffffffa003f7aa>] ? scsi_done+0x0/0xc [scsi_mod]
>> [ 649.780918] [<ffffffffa00a3c8e>] ? sas_queuecommand+0x83/0x25d [libsas]
>> [ 649.780956] [<ffffffffa003fa7c>] ? scsi_dispatch_cmd+0x1c0/0x23c
>> [scsi_mod]
>> [ 649.780996] [<ffffffffa0044ff0>] ? scsi_request_fn+0x3a5/0x506
>> [scsi_mod] [ 649.781006] [<ffffffff810546e0>] ? del_timer+0x59/0x62
>> [ 649.781016] [<ffffffff81163b70>] ? blk_execute_rq_nowait+0x65/0x89
>> [ 649.781032] [<ffffffffa014164f>] ? sg_common_write+0x489/0x4ab [sg]
>> [ 649.781042] [<ffffffff8115df56>] ? __freed_request+0x26/0x83
>> [ 649.781056] [<ffffffffa01421da>] ? sg_new_write+0x23e/0x269 [sg]
>> [ 649.781070] [<ffffffffa0142473>] ? sg_ioctl+0x26e/0xb63 [sg]
>> [ 649.781080] [<ffffffff81100f38>] ? inotify_d_instantiate+0x12/0x39
>> [ 649.781088] [<ffffffff8105eee6>] ? autoremove_wake_function+0x0/0x2e
>> [ 649.781098] [<ffffffff810d80bf>] ? fd_install+0x2e/0x5a
>> [ 649.781105] [<ffffffff810e5247>] ? vfs_ioctl+0x56/0x6c
>> [ 649.781111] [<ffffffff810e570a>] ? do_vfs_ioctl+0x437/0x475
>> [ 649.781118] [<ffffffff810e5799>] ? sys_ioctl+0x51/0x70
>> [ 649.781128] [<ffffffff81010a02>] ? system_call_fastpath+0x16/0x1b
>> [ 649.781134] ---[ end trace 9005373b1b9c6eb7 ]---
>> [ 680.781672] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 680.781681] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 711.781672] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 711.781681] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 741.816069] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 741.816078] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 741.816090] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 741.816096] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 772.820160] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 772.820168] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 772.820181] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 772.820186] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 803.816212] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 803.816220] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 803.816233] drivers/scsi/mvsas/mv_sas.c 1669:mvs_abort_task:rc= 5
>> [ 803.816239] drivers/scsi/mvsas/mv_sas.c 1608:mvs_query_task:rc= 5
>> [ 803.816264] sd 1:0:3:0: [sdf] Unhandled error code
>> [ 803.816270] sd 1:0:3:0: [sdf] Result: hostbyte=DID_OK
>> driverbyte=DRIVER_TIMEOUT
>> [ 803.816278] end_request: I/O error, dev sdf, sector 512
>> [ 803.816284] __ratelimit: 70 callbacks suppressed
>> [ 803.816290] Buffer I/O error on device md2, logical block 64
>>
>> That's after bringing up a raid0 array I build a few days ago on 4
>> perfectly good (Seagate Baracuda 1TB 7200.12) disks, without the bad disk
>> plugged in. I try to mount it, and the driver hangs. Anything trying to
>> access any of the 4 disks hangs as well.
>>
>> I know this array worked a few days ago. The most major change I've made
>> was upgrade from -rc8 (or -rc5, not sure if I mounted the array under
>> -rc8) to -rc9.
>>
>
> Hi, just wondering if anyone has had a chance to look at this, or if there's
> some patches I should try out, or if you need me to do some testing, I'd be
> glad to, thanks :)
I was hoping that some VM would jump in. The BUG in question is, from
mm/slab.c:
/*
* The slab was either on partial or free list so
* there must be at least one object available for
* allocation.
*/
BUG_ON(slabp->inuse >= cachep->num);
So I wonder if that is a double-free, indicating a bug in
SCSI/libsas/mvsas, or a VM problem of some sort.
Was free memory low on that machine, at that point, perchance?
Jeff
next prev parent reply other threads:[~2009-09-13 0:43 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-09-09 10:14 2.6.31-rc8 libata WARNING? Thomas Fjellstrom
2009-09-09 16:30 ` 2.6.31-rc9 kernel BUG (was: Re: 2.6.31-rc8 libata WARNING?) Thomas Fjellstrom
2009-09-09 18:52 ` 2.6.31-rc9 kernel BUG and mvsas Jeff Garzik
2009-09-09 19:10 ` Thomas Fjellstrom
2009-09-09 20:07 ` Thomas Fjellstrom
2009-09-10 23:55 ` Thomas Fjellstrom
2009-09-11 16:04 ` Thomas Fjellstrom
2009-09-13 0:43 ` Jeff Garzik [this message]
2009-09-13 4:56 ` Thomas Fjellstrom
2009-09-15 16:20 ` Thomas Fjellstrom
2009-09-17 15:39 ` 2.6.31-git4 mvsas DRIVER_TIMEOUT Thomas Fjellstrom
2009-09-13 4:59 ` 2.6.31-rc9 kernel BUG and mvsas Thomas Fjellstrom
2009-11-04 0:50 ` Andre Tomt
2009-11-04 1:05 ` Andre Tomt
2009-12-03 21:17 ` Mikael Abrahamsson
2009-12-03 21:36 ` Kristleifur Daðason
2009-12-04 19:00 ` Mikael Abrahamsson
2009-12-04 19:08 ` Kristleifur Daðason
2009-12-05 10:54 ` Mikael Abrahamsson
2009-12-10 19:25 ` Mikael Abrahamsson
2009-12-10 21:04 ` Thomas Fjellstrom
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4AAC4022.4040805@garzik.org \
--to=jeff@garzik.org \
--cc=linux-ide@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=tfjellstrom@shaw.ca \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.