From: Eric Sandeen <sandeen@redhat.com>
To: "Juergens Dirk (CM-AI/ECO2)" <Dirk.Juergens@de.bosch.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	"Huang Weller (CM/ESW12-CN)" <Weller.Huang@cn.bosch.com>
Cc: "linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: AW: AW: ext4 filesystem bad extent error review
Date: Fri, 03 Jan 2014 12:48:48 -0600
Message-ID: <52C70610.4010907@redhat.com>
In-Reply-To: <B8A948099C53E0408BDBCE749AAECA9A2A80C78551@SI-MBX10.de.bosch.com>

On 1/3/14, 12:45 PM, Juergens Dirk (CM-AI/ECO2) wrote:
> 
> On Thu, Jan 03, 2014 at 19:24, Eric Sandeen wrote
>>
>> On 1/3/14, 10:29 AM, Juergens Dirk (CM-AI/ECO2) wrote:
>>> So, I think there _might_ be a kernel bug, but it could be also a problem
>>> related to the particular type of eMMC. We did not observe the same issue
>>> in previous tests with another type of eMMC from another supplier, but this
>>> was with an older kernel patch level and with another HW design.
>>>
>>> Regarding a possible kernel bug: Is there any chance that the invalid
>>> ee_len or ee_start are returned by, e.g., the block allocator ?
>>> If so, can we try to instrument the code to get suitable traces ?
>>> Just to see or to exclude that the corrupted inode is really written
>>> to the eMMC ?
>>
>> From your description it does sound possible that it's a kernel bug.
>> Adding testcases to the code to catch it before it hits the journal
>> might be helpful - but then maybe this is something getting overwritten
>> after the fact - hard to say.
>>
>> Can you share more details of the test you are running?  Or maybe even
>> the test itself?
> 
> Yes, for sure, we can. Weller, please provide additional details
> or corrections. 
> 
> In short:
> Basically we use an automated cyclic test writing many small 
> (some kBytes) files with CRC checksums for easy consistency check
> into a separate test partition. Files also contain meta information
> like filename, sequence number and a random number, so that we can tell
> from block device image dumps whether we are looking at a fragment of an
> old deleted file or a still valid one.
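
A minimal sketch of what such a self-describing test file could look like,
for readers trying to reproduce the setup. The actual on-disk layout used in
the test is not given in this thread; the header format, the field sizes and
the names build_record/check_record are all assumptions made for illustration.

```python
import struct
import zlib

# Hypothetical header: 16-byte file name, 32-bit sequence number,
# 64-bit random tag. The real test's layout is not described in the thread.
HEADER = struct.Struct("<16sIQ")


def build_record(name: str, seq: int, tag: int, payload: bytes) -> bytes:
    """Return header + payload followed by a CRC32 over both."""
    body = HEADER.pack(name.encode("ascii"), seq, tag) + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return body + struct.pack("<I", crc)


def check_record(record: bytes):
    """Return (name, seq, tag) if the CRC matches, otherwise None."""
    if len(record) < HEADER.size + 4:
        return None
    body = record[:-4]
    (stored_crc,) = struct.unpack("<I", record[-4:])
    if zlib.crc32(body) & 0xFFFFFFFF != stored_crc:
        return None
    raw_name, seq, tag = HEADER.unpack(body[:HEADER.size])
    return raw_name.rstrip(b"\0").decode("ascii"), seq, tag
```

Because the name, sequence number and random tag sit inside the checksummed
data, a block found in a raw image dump can be matched against the list of
files the current loop is known to have written.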
> 
> Each test loop looks like this:

0) mkfs the filesystem - with what options?  How big?

> 1) Boot the device after power on or reset
> 2) Do fsck -n BEFORE mounting
> 2 a) (optional) binary dump of the journal 
> 3) Mount test partition

Again with what options, if any?

> 4) File content check for all files from prev. loop
> 5) erase all files from previous loop
> 6) start writing hundreds/thousands of test files 
>     in multiple directories with several threads

I guess this is where we might need more details in order to try to
recreate the failure, but perhaps this is not a case where you can
simply share the IO generation utility...?

Thanks,
-Eric

> 7) after random time cut the power or do soft reset
> 
> If 2), 3), 4) or 5) fails, stop test.
> 
> We usually run the test with a kind of transaction-safe handling,
> i.e. using fsync/rename, to avoid zero-length files or file fragments.
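
For reference, the fsync/rename handling mentioned above is commonly written
along the following lines. This is only a sketch of the general pattern, not
the code used in the test: the function name and the ".tmp" suffix are
hypothetical, and the final fsync of the parent directory is an extra step
assumed here that the thread does not mention.

```python
import os


def write_file_atomically(path: str, data: bytes) -> None:
    """Write data to a temporary file, fsync it, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical naming scheme

    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # file contents reach stable storage before the rename
    finally:
        os.close(fd)

    os.rename(tmp_path, path)  # atomic replacement on POSIX filesystems

    # Assumption beyond the thread: also fsync the directory so that the
    # rename itself is durable across a power cut.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With this ordering, a power cut should leave either the previous file or the
complete new one under the final name, which is why zero-length files or
fragments are not expected when the test runs in this mode.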
> 
>>
>> I've used a test framework in the past to simulate resets w/o needing
>> to reset the box, and do many journal replays very quickly.  It'd be
>> interesting to run it using your testcase.
>>
>> Thanks,
>> -Eric
> 
> Mit freundlichen Grüßen / Best regards
> 
> Dirk Juergens
> 
> Robert Bosch Car Multimedia GmbH
> 

