From: bugzilla-daemon@bugzilla.kernel.org
To: linux-ext4@vger.kernel.org
Subject: [Bug 14354] Bad corruption with 2.6.32-rc1 and upwards
Date: Fri, 16 Oct 2009 19:14:31 GMT
Message-ID: <200910161914.n9GJEVbF012071@demeter.kernel.org>
In-Reply-To: <bug-14354-13602@http.bugzilla.kernel.org/>

http://bugzilla.kernel.org/show_bug.cgi?id=14354





--- Comment #64 from Anonymous Emailer <anonymous@kernel-bugs.osdl.org>  2009-10-16 19:14:31 ---
Reply-To: rwheeler@redhat.com

On 10/16/2009 05:15 AM, Theodore Tso wrote:
> On Fri, Oct 16, 2009 at 12:28:18AM -0400, Parag Warudkar wrote:
>    
>> So I have been experimenting with various root file systems on my
>> laptop running latest git. This laptop sometimes has problems waking
>> up from sleep and that results in it needing a hard reset and
>> subsequently an unclean file system.
>>      
> A number of people have reported this, and there is some discussion
> and some suggestions that I've made here:
>
> 	http://bugzilla.kernel.org/show_bug.cgi?id=14354
>
> It's been very frustrating because I have not been able to replicate
> it myself; I've been very much looking for someone who is (a) willing
> to work with me on this, and perhaps willing to risk running fsck
> frequently, perhaps after every single unclean shutdown, and (b) who
> can reliably reproduce this problem.  On my system, which is a T400
> running 9.04 with the latest git kernels, I've not been able to
> reproduce it, despite many efforts to try to reproduce it.  (i.e.,
> suspend the machine and then pull the battery and power; pulling the
> battery and power, "echo c > /proc/sysrq-trigger", etc., while
> doing "make -j4" when the system is being uncleanly shut down)
>    

I wonder if we might have better luck if we tested using an external
(eSATA- or USB-connected) SATA drive.

Instead of pulling the drive's data connection, cut its power: most of
these drives have an external power source that can be turned off, so
the drive firmware won't have a chance to flush the volatile write
cache. Note that some drives automatically write back the cache if they
still have power and see a bus disconnect, so hot-unplugging just the
eSATA or USB cable does not do the trick.

Given the number of cheap external drives, this should be easy to test 
at home....
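
One way to make such a power-cut test checkable is to run a workload that
records which writes were acknowledged before the power went away. The
following is only a minimal sketch under stated assumptions: the function
names, the record layout and both file paths are hypothetical, the data
file is assumed to live on the external drive under test, and the
acknowledgement log is assumed to live on a different device that keeps
power. It checks only whether fsync'ed data survived; it does not replace
running e2fsck on the file system afterwards.

```python
import os
import struct

# One little-endian 64-bit sequence number per record (illustrative layout).
RECORD = struct.Struct("<Q")


def write_acknowledged(data_path, ack_path, count):
    """Append `count` sequence numbers to data_path, fsync'ing each one.

    A sequence number is logged to ack_path only after fsync() on the
    data file has returned, so every logged number was reported durable.
    Returns the last acknowledged sequence number (-1 if count is 0).
    """
    with open(data_path, "ab") as data, open(ack_path, "a") as ack:
        for seq in range(count):
            data.write(RECORD.pack(seq))
            data.flush()
            os.fsync(data.fileno())
            ack.write(f"{seq}\n")
            ack.flush()
            os.fsync(ack.fileno())
    return count - 1


def missing_after_crash(data_path, last_acked):
    """Return acknowledged sequence numbers that are absent from data_path."""
    with open(data_path, "rb") as data:
        raw = data.read()
    # Ignore a trailing partial record left behind by an interrupted write.
    usable = len(raw) - len(raw) % RECORD.size
    seen = {
        RECORD.unpack_from(raw, offset)[0]
        for offset in range(0, usable, RECORD.size)
    }
    return [seq for seq in range(last_acked + 1) if seq not in seen]
```

After cutting power, reconnecting the drive and remounting it, a non-empty
result from `missing_after_crash` would suggest that data reported as
durable never reached the platters, which points at the write cache or at
barrier handling rather than at the file system's own metadata.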

Ric



> So if you can come up with a reliable reproduction case, and don't
> mind doing some experiments and/or exchanging debugging correspondence
> with me, please let me know.  I'd **really** appreciate the help.
>
> Information that would be helpful to me would be:
>
> a) Detailed hardware information (what type of disk/SSD, what type of
> laptop, hardware configuration, etc.)
>
> b) Detailed software information (what version of the kernel are you
> using including any special patches, what distro and version are you
> using, are you using LVM or dm-crypt, what partition or partitions did
> you have mounted, was the failing partition a root partition or some
> other mounted partition, etc.)
>
> c) Detailed reproduction recipe (what programs were you running before
> the crash/failed suspend/resume, etc.)
>
>
> If you do decide to go hunting this problem, one thing I would
> strongly suggest is either to use "tune2fs -c 1 /dev/XXX" to
> force a fsck after every reboot or, if you are using LVM, to use the
> e2croncheck script (found as an attachment in the above bugzilla entry
> or in the e2fsprogs sources in the contrib directory) to take a
> snapshot and then check the snapshot right after you reboot and login
> to your system.  The reported file system corruptions seem to involve
> the block allocation bitmaps getting corrupted, and so you will
> significantly reduce the chances of data loss if you run e2fsck as
> soon as possible after the file system corruption happens.  This helps
> you not lose data, and it also helps us find the bug, since it helps
> pinpoint the earliest possible point where the file system is getting
> corrupted.
>
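
When the check runs after every reboot, it helps to interpret the e2fsck
exit status automatically so that the first sign of corruption is not
missed. The sketch below decodes the status bits listed in the EXIT CODE
section of the e2fsck(8) man page; the function names are hypothetical
and the code is an illustration, not part of e2fsprogs.

```python
# Bit values as listed under EXIT CODE in the e2fsck(8) man page.
E2FSCK_FLAGS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def describe_e2fsck_status(status):
    """Return the human-readable conditions encoded in an e2fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_FLAGS.items() if status & bit]


def found_corruption(status):
    """True if e2fsck reported file system errors, whether fixed or not."""
    return bool(status & (1 | 2 | 4))
```

The status could come, for example, from
`subprocess.run(["e2fsck", "-fn", device]).returncode`, where `device`
is a placeholder for the snapshot or partition being checked and `-fn`
forces a read-only check. Logging the first boot on which
`found_corruption` returns true narrows down when the damage occurred.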
> (I suspect that some bug reporters had their file system get corrupted
> one or more boot sessions earlier, and by the time the corruption was
> painfully obvious, they had lost data.  Mercifully, running fsck
> frequently is much less painful on a freshly created ext4 filesystem,
> and of course if you are using an SSD.)
>
> If you can reliably reproduce the problem, it would be great to get a
> bisection, or at least a confirmation that the problem doesn't exist
> on 2.6.31, but does exist on 2.6.32-rcX kernels.  At this point I'm
> reasonably sure it's a post-2.6.31 regression, but it would be good to
> get a hard confirmation of that fact.
>
> For people with a reliable reproduction case, one possible experiment
> can be found here:
>
>     http://bugzilla.kernel.org/show_bug.cgi?id=14354#c18
>
> Another thing you might try is to try reverting these commits one at a
> time, and see if they make the problem go away: d0646f7, 5534fb5,
> 7178057.  These are three commits that seem most likely, but there are
> only 93 ext4-related commits, so doing a "git bisect start v2.6.32-rc5
> v2.6.31 -- fs/ext4 fs/jbd2" should only take at most seven compile
> tests --- assuming this is indeed a post-2.6.31 regression and the problem
> is an ext4-specific code change, as opposed to some other recent
> change in the writeback code or some device driver which is
> interacting badly with ext4.
>
> If that assumption isn't true and so a git bisect limited to fs/ext4
> and fs/jbd2 doesn't find a bad commit which when reverted makes the
> problem go away, we could try a full bisection search via "git bisect
> start v2.6.32-rc5 v2.6.31", which would take approximately 14 compile
> tests, but hopefully that wouldn't be necessary.
>
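
The compile-test counts quoted above follow from bisection halving the
candidate range on every round. A small worked example, assuming a
roughly linear history (the function name is hypothetical, and real
counts can differ slightly because of merges):

```python
import math


def bisect_steps(commits):
    """Estimate the build-and-test rounds needed to isolate one bad commit.

    Each round halves the remaining candidates, so the estimate is
    ceil(log2(commits)).
    """
    if commits < 1:
        raise ValueError("need at least one candidate commit")
    return math.ceil(math.log2(commits))


print(bisect_steps(93))       # path-limited range of 93 ext4-related commits
print(bisect_steps(2 ** 14))  # largest range that 14 rounds can cover
```

So 93 commits need at most seven rounds, and 14 rounds are enough for a
range of up to 16384 commits. Note that `git bisect start` expects the
bad revision first and the good one after it, for example
`git bisect start v2.6.32-rc5 v2.6.31 -- fs/ext4 fs/jbd2`.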
> I'm going to be at the kernel summit in Tokyo next week, so my e-mail
> latency will be a bit longer than normal, which is one of the reasons
> why I've left a goodly list of potential experiments for people to
> try.  If you can come up with a reliable regression, and are willing
> to work with me or to try some of the above mentioned tests, I'll
> definitely buy you a real (or virtual) beer.
>
> Given that a number of people have reported losing data as a result,
> it would **definitely** be a good thing to get this fixed before
> 2.6.32 is released.
>
> Thanks,
>
> 						- Ted

