From: Alessandro Bono <alessandro.bono@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-xfs <linux-xfs@oss.sgi.com>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: XFS kernel BUG at fs/buffer.c:470! with 2.6.28.4
Date: Mon, 02 Mar 2009 16:04:59 +0100
Message-ID: <1236006299.5744.7.camel@champagne>
In-Reply-To: <20090302133609.GD23880@duck.suse.cz>

On Mon, 2009-03-02 at 14:36 +0100, Jan Kara wrote:
> On Fri 27-02-09 20:44:55, Alessandro Bono wrote:
> > On Thu, 2009-02-26 at 17:58 +0100, Jan Kara wrote:
> > ....
> > > > any suggestion?
> > >   Hmm, are you still able to reproduce the problem? Looking at the
> > > registers in your dump, none of them seems to contain sensible page
> > > flags, so it could be some corruption of the page pointer. If you are still
> > > able to reproduce it, could you please do so with the attached patch
> > > applied? It will dump much more information for us... Thanks.
> > > 
> > > 								Honza
> > > 
> > 
> > This time with plain XFS, no LVM and no dm_crypt.
> > Just to add some info:
> > the XFS filesystem was created with this command:
> > 
> > mkfs.xfs -f -l lazy-count=1,version=2,size=128m -i attr=2 -d agcount=4 \
> >     -L store /dev/sdb1
>   OK, thanks for the data. I've looked at both your traces and I think the
> most likely cause is a bit flip in memory. Look: the page address (which we
> got from bh->b_page) is ffffe20000885604. IMHO it should be ffffe20000885600,
> because if you look e.g. at mapping, it is 000ac69affff8800, which is a
> bogus value, but it would look like a valid pointer if we shifted it by 32
> bits to the left. The same goes for the private pointer. The index also looks
> absurdly high, but its low 32 bits are actually 0 and the previous 4 bytes in
> the page structure (we see them in the upper 4 bytes of the mapping pointer)
> contain 0x000ac69a, so taken together the page index would be 706202, a
> perfectly sensible value.
>   The second report shows the same pattern. Also, the buffer head looks
> perfectly valid in both cases.
>   Such bit flips are usually caused by faulty memory or other HW (I/O
> controller etc.), so I suggest trying to shuffle the hardware somehow:
> swap the memory DIMMs as a start, or run memtest if you don't have spare
> DIMMs. Note, though, that it is not unusual for memtest to run just fine
> for a long time even though the memory really is at fault.
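
The arithmetic in the quoted analysis can be checked with a short Python
sketch. The constants are copied from the debug output quoted further down;
the helper names (flipped_bits, split64) are made up for illustration and are
not part of any kernel tooling. The sketch only confirms the numbers; it says
nothing about where the corruption came from.

```python
# Values copied from the "Some data to debug" lines in the quoted log.
page_seen = 0xffffe20000885604      # bh->b_page as printed
page_expected = 0xffffe20000885600  # address suggested in the analysis
mapping_seen = 0x000ac69affff8800
index_seen = 54276223274057728
private_seen = 0x2489e50ffff8800

MASK32 = 0xffffffff


def flipped_bits(a, b):
    """Return the bit positions in which a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def split64(value):
    """Split a 64-bit value into (upper 32 bits, lower 32 bits)."""
    return (value >> 32) & MASK32, value & MASK32


# A single differing bit moves the pointer by 4 bytes, so every
# 8-byte field read through it is misaligned by 32 bits.
print("differing bits:", flipped_bits(page_seen, page_expected))  # [2]
print("byte offset:", page_seen - page_expected)                  # 4

# The upper half of the printed mapping is the value the analysis
# reads as the page index.
map_hi, map_lo = split64(mapping_seen)
print("mapping upper half:", map_hi)       # 706202
print("mapping lower half:", hex(map_lo))  # 0xffff8800

# The printed index has nothing in its low 32 bits.
idx_hi, idx_lo = split64(index_seen)
print("index lower half:", idx_lo)         # 0

# The printed private value ends in the same ffff8800 pattern.
priv_hi, priv_lo = split64(private_seen)
print("private lower half:", hex(priv_lo))  # 0xffff8800
```

Both mapping and private end in ffff8800, which is how the upper half of the
other kernel addresses in the trace looks, and that matches the 32-bit shift
described above.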

OK, I'll swap the DIMMs, retest, and report back.

Thanks a lot for your support.



> 
> 								Honza
> 
> > Feb 27 19:20:28 champagne kernel: [ 2664.993272] Buffer ffff8800371e7e70 of page ffffe20000885604 not private! Some data to debug:
> > Feb 27 19:20:28 champagne kernel: [ 2664.993282] flags: 240000000, mapping: 000ac69affff8800, index: 54276223274057728, private: 2489e50ffff8800
> > Feb 27 19:20:28 champagne kernel: [ 2664.993288] Buffer: state=125, block=23795642, b_size=4096, b_this_page=ffff8800371e7e70
> > Feb 27 19:20:28 champagne kernel: [ 2664.993293] Other buffers in the page:
> > Feb 27 19:20:28 champagne kernel: [ 2664.993355] ------------[ cut here ]------------
> > Feb 27 19:20:28 champagne kernel: [ 2664.993361] Kernel BUG at ffffffff802b3653 [verbose debug info unavailable]
> > Feb 27 19:20:28 champagne kernel: [ 2664.993366] invalid opcode: 0000 [#1] SMP
> > Feb 27 19:20:28 champagne kernel: [ 2664.993373] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
> > Feb 27 19:20:28 champagne kernel: [ 2664.993378] CPU 1
> > Feb 27 19:20:28 champagne kernel: [ 2664.993382] Modules linked in: xfs nls_iso8859_1 nls_cp437 vfat fat nls_base usb_storage libusual af_packet binfmt_misc rfcomm bridge stp llc bnep sco l2cap kvm_intel kvm ppdev ipv6 acpi_cpufreq cpufreq_powersave cpufreq_stats cpufreq_userspace cpufreq_ondemand freq_table cpufreq_conservative pci_slot sbs sbshc iptable_filter ip_tables x_tables dm_crypt sbp2 lp snd_hda_intel snd_hwdep snd_pcm_oss arc4 snd_pcm snd_page_alloc ecb snd_mixer_oss snd_seq_dummy snd_seq_oss iwlagn iwlcore usbhid snd_seq_midi rfkill btusb snd_rawmidi snd_seq_midi_event snd_seq hid parport_pc snd_timer mac80211 pcmcia bluetooth parport snd_seq_device lis3lv02d leds_hp_disk sdhci_pci sdhci snd video output ricoh_mmc pcspkr tpm_infineon tpm tpm_bios cfg80211 mmc_core led_class yenta_socket rsrc_nonstatic pcmcia_core psmouse container wmi battery ac button soundcore serio_raw iTCO_wdt iTCO_vendor_support evdev dm_multipath ext3 jbd mbcache sr_mod cdrom sg sd_mod crc_t10dif ohci1394 ata_piix ahci ieee1394 l
> > Feb 27 19:20:28 champagne kernel: bata scsi_mod ehci_hcd uhci_hcd usbcore e1000e dm_mirror dm_region_hash dm_log dm_snapshot dm_mod thermal processor fan thermal_sys hwmon fuse
> > Feb 27 19:20:28 champagne kernel: [ 2664.993553] Pid: 8367, comm: xfsdatad/1 Not tainted 2.6.28.7 #1
> > Feb 27 19:20:28 champagne kernel: [ 2664.993558] RIP: 0010:[<ffffffff802b3653>]  [<ffffffff802b3653>] end_buffer_async_write+0x119/0x18d
> > Feb 27 19:20:28 champagne kernel: [ 2664.993573] RSP: 0018:ffff880084555e40  EFLAGS: 00010246
> > Feb 27 19:20:28 champagne kernel: [ 2664.993578] RAX: 0000000240000000 RBX: ffff8800371e7e70 RCX: ffffffff80554d00
> > Feb 27 19:20:28 champagne kernel: [ 2664.993584] RDX: ffff8800a7ae3000 RSI: 0000000000000046 RDI: ffffffff80572f50
> > Feb 27 19:20:28 champagne kernel: [ 2664.993589] RBP: ffff8800371e7e70 R08: 0000000000000000 R09: 0000000000000000
> > Feb 27 19:20:28 champagne kernel: [ 2664.993594] R10: 000000000000000a R11: 0000000000018600 R12: ffff8801159fd288
> > Feb 27 19:20:28 champagne kernel: [ 2664.993599] R13: ffffe20000885604 R14: ffff88013b85ff00 R15: 0000000000000001
> > Feb 27 19:20:28 champagne kernel: [ 2664.993605] FS:  0000000000000000(0000) GS:ffff88013b803a00(0000) knlGS:0000000000000000
> > Feb 27 19:20:28 champagne kernel: [ 2664.993611] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > Feb 27 19:20:28 champagne kernel: [ 2664.993616] CR2: 00007f9deefb8ca8 CR3: 0000000117829000 CR4: 00000000000026e0
> > Feb 27 19:20:28 champagne kernel: [ 2664.993621] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > Feb 27 19:20:28 champagne kernel: [ 2664.993626] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Feb 27 19:20:28 champagne kernel: [ 2664.993632] Process xfsdatad/1 (pid: 8367, threadinfo ffff880084554000, task ffff88008c1222b0)
> > Feb 27 19:20:28 champagne kernel: [ 2664.993636] Stack:
> > Feb 27 19:20:28 champagne kernel: [ 2664.993640]  0000000000000282 0000000000000004 ffff88000248b6c0 ffff8801159fd288
> > Feb 27 19:20:28 champagne kernel: [ 2664.993648]  ffff88013b85fee0 0000000000000000 ffff88010dc356c0 ffff8801159fd288
> > Feb 27 19:20:28 champagne kernel: [ 2664.993657]  ffff88013b85fee0 ffffffffa055a9f9 ffffffffa055ab6b ffff8801159fd280
> > Feb 27 19:20:28 champagne kernel: [ 2664.993667] Call Trace:
> > Feb 27 19:20:28 champagne kernel: [ 2664.993671]  [<ffffffffa055a9f9>] ? xfs_destroy_ioend+0x23/0x71 [xfs]
> > Feb 27 19:20:28 champagne kernel: [ 2664.993723]  [<ffffffffa055ab6b>] ? xfs_end_bio_delalloc+0x0/0x19 [xfs]
> > Feb 27 19:20:28 champagne kernel: [ 2664.993770]  [<ffffffffa055ab6b>] ? xfs_end_bio_delalloc+0x0/0x19 [xfs]
> > Feb 27 19:20:28 champagne kernel: [ 2664.993815]  [<ffffffff8023fdc2>] ? run_workqueue+0x79/0xfe
> > Feb 27 19:20:28 champagne kernel: [ 2664.993826]  [<ffffffff8023ff1f>] ? worker_thread+0xd8/0xe7
> > Feb 27 19:20:28 champagne kernel: [ 2664.993834]  [<ffffffff80243254>] ? autoremove_wake_function+0x0/0x2e
> > Feb 27 19:20:28 champagne kernel: [ 2664.993842]  [<ffffffff8023fe47>] ? worker_thread+0x0/0xe7
> > Feb 27 19:20:28 champagne kernel: [ 2664.993849]  [<ffffffff80242f42>] ? kthread+0x47/0x73
> > Feb 27 19:20:28 champagne kernel: [ 2664.993856]  [<ffffffff8022ec78>] ? schedule_tail+0x27/0x5f
> > Feb 27 19:20:28 champagne kernel: [ 2664.993864]  [<ffffffff8020c199>] ? child_rip+0xa/0x11
> > Feb 27 19:20:28 champagne kernel: [ 2664.993872]  [<ffffffff80242efb>] ? kthread+0x0/0x73
> > Feb 27 19:20:28 champagne kernel: [ 2664.993879]  [<ffffffff8020c18f>] ? child_rip+0x0/0x11
> > Feb 27 19:20:28 champagne kernel: [ 2664.993885] Code: 10 4c 8b 43 20 48 8b 13 48 89 de 48 c7 c7 f8 ef 46 80 31 c0 e8 b2 db 13 00 48 8b 5b 08 48 39 eb 75 d7 49 8b 45 00 f6 c4 08 75 04 <0f> 0b eb fe 49 8b 5d 10 9c 41 5c fa eb 07 f3 90 f6 03 10 75 f9
> > Feb 27 19:20:28 champagne kernel: [ 2664.993950] RIP  [<ffffffff802b3653>] end_buffer_async_write+0x119/0x18d
> > Feb 27 19:20:28 champagne kernel: [ 2664.993959]  RSP <ffff880084555e40>
> > Feb 27 19:20:28 champagne kernel: [ 2664.993985] ---[ end trace 2d34f97811caf921 ]---
> > 
> 
-- 
---
Best regards
Alessandro Bono

