From: Alexander Holler <holler@ahsoftware.de>
To: Jan Kara <jack@suse.cz>
Cc: Dan Carpenter <dan.carpenter@oracle.com>, linux-kernel@vger.kernel.org
Subject: Re: kernel BUG at fs/buffer.c:3205 (stable 3.5.3)
Date: Thu, 27 Sep 2012 20:01:00 +0200
Message-ID: <5064945C.4020403@ahsoftware.de>
In-Reply-To: <50647CCC.7010407@ahsoftware.de>

On 27.09.2012 18:20, Alexander Holler wrote:
> On 27.09.2012 17:46, Alexander Holler wrote:
>> Hello,
>>
>> On 27.09.2012 17:12, Jan Kara wrote:
>>>    Just some thoughts about your oops:
>>> The assertion which fails is:
>>> BUG_ON(!list_empty(&bh->b_assoc_buffers));
>>>
>>> Now b_assoc_buffers isn't used very much. In particular ext4 which you
>>> seem to be using doesn't use this list at all (except when mounted in
>>> nojournal mode but that doesn't seem to be your case). That would point
>>> rather strongly at a memory corruption issue.
>>>
>>> So if you can reproduce the oops, it might be interesting to print
>>> bh->b_assoc_buffers.next and &bh->b_assoc_buffers.next if the list is
>>> found to be non-empty.
>>
>> Hmm, a stray pointer would explain it all, too. Especially the cases where
>> I just saw wrong content in the archive without any oops. I'll try to
>> reproduce it with
>>
>> pr_info("AHO: %p %p\n", bh->b_assoc_buffers.next,
>> &bh->b_assoc_buffers.next);
>>
>> after the BUG_ON().
>>
>> Thanks for the hint. I hadn't gotten far enough yet to know that
>> b_assoc_buffers isn't used that much.
>
> Hmm, that doesn't look very practical because b_assoc_buffers seems to
> be used a lot here. ;)
> Maybe I should have mentioned that I'm mounting the source filesystem
> (root with ext4) with nodelalloc
> (rw,noatime,nodelalloc,errors=remount-ro,data=ordered), and to back it
> up, I'm using a bind mount (mount -o bind / /foo) as the source.
>
> But the debug output starts very early during boot, when no bind mount is
> used:
>
> ---------------------
> Sep 27 18:03:23 krabat udevd[1254]: invalid rule '/etc/udev/rules.d/80-aho.rules:26'
> Sep 27 18:03:23 krabat kernel: [    4.562670] usb usb8: New USB device found, idVendor=1d6b, idProduct=0001
> Sep 27 18:03:23 krabat kernel: [    4.562671] usb usb8: New USB device strings: Mfr=3, Product=2, SerialNumber=1
> Sep 27 18:03:23 krabat systemd-uaccess[1363]: Failed to apply ACL on /dev/kvm: Operation not supported
> Sep 27 18:03:23 krabat kernel: [    4.562673] usb usb8: Product: UHCI Host Controller
> Sep 27 18:03:23 krabat kernel: [    4.562674] usb usb8: Manufacturer: Linux 3.5.4-00009-gfa43f23-dirty uhci_hcd
> Sep 27 18:03:23 krabat kernel: [    4.562676] usb usb8: SerialNumber: 0000:00:1d.0
> Sep 27 18:03:23 krabat systemd-uaccess[1716]: Failed to apply ACL on /dev/kvm: Operation not supported
> Sep 27 18:03:23 krabat kernel: [    4.563285] hub 8-0:1.0: USB hub found
> Sep 27 18:03:23 krabat kernel: [    4.563288] hub 8-0:1.0: 2 ports detected
> Sep 27 18:03:23 krabat systemd-uaccess[2324]: Failed to apply ACL on /dev/snd/timer: Operation not supported
> Sep 27 18:03:23 krabat kernel: [    4.563316] AHO: ffff880212e4b048 ffff880212e4b048
> Sep 27 18:03:23 krabat kernel: [    4.563318] AHO: ffff880212e4b0b0 ffff880212e4b0b0
> Sep 27 18:03:23 krabat kernel: [    4.563319] AHO: ffff880212e4b118 ffff880212e4b118
> ---------------------
>
> And afterwards I see tons of those messages, so it doesn't look usable.
> Anyway, I'll try again to reproduce the problem without that debug line,
> just to see whether I can still reproduce it with F17 as userspace (and
> kernel 3.5.4 instead of 3.5.3).
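
A note on the AHO lines quoted above: each one prints the same address
twice. Since next is the first member of struct list_head,
&bh->b_assoc_buffers.next is the address of the list head itself, and a
list whose next points back at its own head is empty. So these lines
would come from buffers with an empty list, which may be why an
unconditional pr_info() produces so many of them. Below is a minimal
userspace sketch of printing only in the non-empty case. The struct
fake_bh and report_if_nonempty() names are made up for illustration and
stand in for struct buffer_head and the real debug line; list_empty() is
reimplemented here with the same comparison the kernel helper uses.

```c
#include <stdio.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Same test as the kernel's list_empty(): an empty list points at itself. */
static int list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Hypothetical stand-in for struct buffer_head; only the list member matters. */
struct fake_bh {
    struct list_head b_assoc_buffers;
};

/*
 * Print the two pointers only when the list is non-empty.
 * Returns 1 if a line was printed, 0 if the list was empty.
 */
static int report_if_nonempty(struct fake_bh *bh)
{
    if (list_empty(&bh->b_assoc_buffers))
        return 0;
    printf("AHO: %p %p\n", (void *)bh->b_assoc_buffers.next,
           (void *)&bh->b_assoc_buffers.next);
    return 1;
}
```

In the kernel, the equivalent would presumably be to wrap the pr_info()
call in an if (!list_empty(&bh->b_assoc_buffers)) check, so that only
the suspicious buffers show up in the log.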

After two successful tries in a row, the third failed (the tar messages
were in German because of LANG=de; given here in English):

---------------------------------------------------------------
[root@krabat bind]# tar cp . | mbuffer | bzip2smp >/mnt/usb3/Krabat.Fedora17.sdb2.27.09.12.tar.bz2
in @ 33.1 MiB/s, out @ 38.3 MiB/s,  888 MiB total, buffer  95% fulltar: ./tmp/.X11-unix/X0: socket ignored
in @  0.0 KiB/s, out @ 20.8 MiB/s, 24.9 GiB total, buffer  22% full
summary: 24.9 GiByte in 19 min 53.0 sec - average of 21.4 MiB/s
[root@krabat bind]# tar djf /mnt/usb3/Krabat.Fedora17.sdb2.27.09.12.tar.bz2
./var/log/messages: Mod time differs
./var/log/messages: Size differs
./var/tmp/kdecache-aholler/icon-cache.kcache: Mod time differs
./var/tmp/kdecache-aholler/icon-cache.kcache: Contents differ
./var/tmp/kdecache-aholler/plasma_theme_oxygen.kcache: Mod time differs
./var/tmp/kdecache-aholler/plasma_theme_oxygen.kcache: Contents differ
./var/lib/chrony/drift: Mod time differs
./var/lib/chrony/drift: Contents differ
./home/aholler/.kde/share/apps/konqueror/autosave/_1.77: Mod time differs
./home/aholler/.kde/share/apps/konqueror/autosave/_1.77: Size differs
./home/aholler/.kde/share/apps/konqueror/konq_history: Mod time differs
./home/aholler/.kde/share/apps/konqueror/konq_history: Size differs
./home/aholler/.kde/share/apps/kcookiejar/cookies: Mod time differs
./home/aholler/thinstation_src-2.0beta2.tar.bz2: Contents differ

bzip2: Data integrity error when decompressing.
         Input file = (stdin), output file = (stdout)

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

tar: Unexpected EOF in archive
tar: Child returned status 2
tar: Error is not recoverable: exiting now
[root@krabat bind]#
---------------------------------------------------------------
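
Most of the files listed above are ones that change while the system is
running; thinstation_src-2.0beta2.tar.bz2 is presumably static, so a
content difference there is the interesting one. To see what kind of
damage it is (a single flipped bit, a whole block, and so on), it could
help to find where the original and a copy extracted from the archive
first diverge. Below is a rough sketch of that, not something that was
run in this thread; first_diff_offset() is a hypothetical helper, and it
assumes the affected file can still be extracted from the damaged
archive.

```c
#include <stdio.h>

/*
 * Return the offset of the first differing byte between two files,
 * -1 if they are identical, or -2 if either file cannot be opened.
 * A length mismatch counts as a difference at the shorter file's end.
 * Offsets are kept in a long, so very large files may overflow on
 * platforms where long is 32 bits.
 */
static long first_diff_offset(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    long off = 0, result = -1;

    if (!a || !b) {
        if (a)
            fclose(a);
        if (b)
            fclose(b);
        return -2;
    }
    for (;;) {
        int ca = fgetc(a);
        int cb = fgetc(b);

        if (ca != cb) {         /* differing byte, or one file ended early */
            result = off;
            break;
        }
        if (ca == EOF)          /* both ended together: identical */
            break;
        off++;
    }
    fclose(a);
    fclose(b);
    return result;
}
```

Running the comparison twice on the same pair of files would also show
whether the difference is stable on disk or moves around between reads,
which would point more towards memory than towards the archive itself.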

This time there was no oops; dmesg just shows:

---------------------------------------------------------------
[  111.087356] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)
[  672.868948] CPU4: Core temperature above threshold, cpu clock throttled (total events = 1)
[  672.868949] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[  672.869970] CPU0: Core temperature/speed normal
[  672.869971] CPU4: Core temperature/speed normal
[  688.285419] CPU6: Core temperature above threshold, cpu clock throttled (total events = 1)
[  688.285421] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[  688.286442] CPU2: Core temperature/speed normal
[  688.286443] CPU6: Core temperature/speed normal
[  698.822614] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[  698.822615] CPU7: Core temperature above threshold, cpu clock throttled (total events = 1)
[  698.824674] CPU3: Core temperature/speed normal
[  698.824675] CPU7: Core temperature/speed normal
[  706.979633] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[  706.979635] CPU5: Core temperature above threshold, cpu clock throttled (total events = 1)
[  706.980648] CPU1: Core temperature/speed normal
[  706.980649] CPU5: Core temperature/speed normal
[  899.540485] [Hardware Error]: Machine check events logged
---------------------------------------------------------------

Nothing else. The kernel is 3.5.4, and the userland is now F17.

Regards.

