[Bug 15579] New: ext4 -o discard produces incorrect blocks of zeroes in newly created files under heavy read+truncate+append-new-file load

linux-ext4.vger.kernel.org archive mirror

* [Bug 15579] New: ext4 -o discard produces incorrect blocks of zeroes in newly created files under heavy read+truncate+append-new-file load
@ 2010-03-19 10:51 bugzilla-daemon
  2010-03-19 12:41 ` [Bug 15579] " bugzilla-daemon
  2010-03-19 18:13 ` bugzilla-daemon
  0 siblings, 2 replies; 3+ messages in thread
From: bugzilla-daemon @ 2010-03-19 10:51 UTC (permalink / raw)
  To: linux-ext4

http://bugzilla.kernel.org/show_bug.cgi?id=15579

           Summary: ext4 -o discard produces incorrect blocks of zeroes in
                    newly created files under heavy
                    read+truncate+append-new-file load
           Product: File System
           Version: 2.5
    Kernel Version: 2.6.33
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4@kernel-bugs.osdl.org
        ReportedBy: kernel-bugs@abeckmann.de
        Regression: No

I'm testing ext4 -o discard on a Super Talent FTM56GX25H SSD. The speed
increase from using the discard option seems promising.
But I'm experiencing problems under a certain stressful file system load:

(approximate description, the actual sizes/numbers are not exact MB/GB, but
that shouldn't be a problem)
* you have a 252 GB ext4 -m 0 -T largefile filesystem
* you have 250 input files of size 1 GB each and an empty output file
* while the input has not been consumed
  - load 1 MB from the end of each input file
  - truncate the input files to reduce their size by 1 MB
  - do some computation ...
  - append 250 MB to the output file
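
The loop above can be sketched at a reduced scale as follows. This is only an
illustration of the access pattern, not the reporter's actual program: the
function name run_workload, the chunk parameter, and the transform callback
(standing in for "do some computation") are all hypothetical, and by default
the chunks are simply concatenated.

```python
import os


def run_workload(input_paths, output_path, chunk=1 << 20, transform=None):
    """Consume every input file from its end, one chunk per round,
    truncating as we go and appending the result to the output file."""
    if transform is None:
        # Assumption: the real computation is unknown, so just concatenate.
        transform = lambda chunks: b"".join(chunks)
    remaining = {p: os.path.getsize(p) for p in input_paths}
    with open(output_path, "ab") as out:
        while any(remaining.values()):
            chunks = []
            for path in input_paths:
                size = remaining[path]
                if size == 0:
                    continue
                n = min(chunk, size)
                with open(path, "r+b") as f:
                    f.seek(size - n)      # load the tail of the input file
                    chunks.append(f.read(n))
                    f.truncate(size - n)  # shrink the file by what was read
                remaining[path] = size - n
            out.write(transform(chunks))  # append to the output file
    return os.path.getsize(output_path)
```

To approximate the reported load, the input files would have to be around
1 GB each on a file system mounted with -o discard; the sketch itself makes no
attempt to force journal commits or to time the writes.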

Checking the output file after the operation has finished, I find blocks of
0x00 that should not be there. These blocks are usually 1 MB in size (the
amount that was truncated and 'discarded') and always multiples of 16 KB.
16 KB is the minimal discard/TRIM-able unit of the SSD and also its
discard/TRIM alignment, which I found by doing manual experiments using
hdparm --trim-sector-ranges.
In several repetitions I've counted about 10-12 MB of invalid 0x00 bytes in
the output.
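
A check of this kind can be sketched as below. The function name
find_zero_runs and the unit parameter are hypothetical; the 16 KB default is
taken from the report. Note the assumptions: offsets are relative to the file,
not to the device, and output data that is legitimately zero would be flagged
as well.

```python
def find_zero_runs(path, unit=16 * 1024):
    """Return (offset, length) pairs for runs of consecutive all-zero units."""
    zero = bytes(unit)
    runs = []
    start = None   # file offset where the current zero run began, if any
    offset = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(unit)
            if not block:
                break
            if block == zero[:len(block)]:
                if start is None:
                    start = offset
            elif start is not None:
                runs.append((start, offset - start))
                start = None
            offset += len(block)
    if start is not None:
        runs.append((start, offset - start))
    return runs
```

Summing the lengths of the returned runs gives the total number of suspicious
bytes in the file.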

The problem does not occur if I use 250000 input files instead, read a subset
of 250 files and delete them before writing the output. This is significantly
slower.

A possible cause could be some race condition between
* freeing filesystem blocks by truncating a file and queuing them for
DISCARD/TRIM
* allocating free filesystem blocks for a new append/write to a file
* submitting the DISCARD/TRIM request to the disk
* submitting the write request to the disk
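
Why the ordering of these steps would matter can be shown with a toy model.
This is not ext4 or block-layer code and makes no claim about the actual
cause; the replay function and the event tuples are hypothetical, and the
model assumes that a discarded block reads back as zeroes.

```python
def replay(events, nblocks=4):
    """Apply ('write', block, data) and ('discard', block) events, in order,
    to a toy device and return its final contents."""
    dev = [b"\x00"] * nblocks
    for ev in events:
        if ev[0] == "write":
            dev[ev[1]] = ev[2]
        elif ev[0] == "discard":
            # Assumption: a discarded block reads back as zeroes.
            dev[ev[1]] = b"\x00"
    return dev


# Discard reaches the device before the reallocated block is written: data survives.
safe = replay([("discard", 1), ("write", 1, b"A")])
# Discard for the old owner arrives after the new write: data is lost.
lost = replay([("write", 1, b"A"), ("discard", 1)])
```

In the second sequence block 1 ends up zeroed even though it was written with
valid data, which would match the symptom of zero-filled regions in the
output file.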

Is there a possibility to generate debug information from ext4 that would be
helpful for tracking down this problem? The file system on the SSD is the only
ext[2-4] file system in the machine.

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.

* [Bug 15579] ext4 -o discard produces incorrect blocks of zeroes in newly created files under heavy read+truncate+append-new-file load
  2010-03-19 10:51 [Bug 15579] New: ext4 -o discard produces incorrect blocks of zeroes in newly created files under heavy read+truncate+append-new-file load bugzilla-daemon
@ 2010-03-19 12:41 ` bugzilla-daemon
  2010-03-19 18:13 ` bugzilla-daemon
  1 sibling, 0 replies; 3+ messages in thread
From: bugzilla-daemon @ 2010-03-19 12:41 UTC (permalink / raw)
  To: linux-ext4

http://bugzilla.kernel.org/show_bug.cgi?id=15579


Dmitry Monakhov <dmonakhov@openvz.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmonakhov@openvz.org




--- Comment #1 from Dmitry Monakhov <dmonakhov@openvz.org>  2010-03-19 12:40:57 ---
Some time ago I posted compat discard support, which simulates
discard by generating simple zero-filled requests:
http://lkml.org/lkml/2010/2/11/74
Many changes were requested, so I'm still working on a new version (it will
be ready soon).
But it may be useful for debugging, in conjunction with blktrace.

* [Bug 15579] ext4 -o discard produces incorrect blocks of zeroes in newly created files under heavy read+truncate+append-new-file load
  2010-03-19 10:51 [Bug 15579] New: ext4 -o discard produces incorrect blocks of zeroes in newly created files under heavy read+truncate+append-new-file load bugzilla-daemon
  2010-03-19 12:41 ` [Bug 15579] " bugzilla-daemon
@ 2010-03-19 18:13 ` bugzilla-daemon
  1 sibling, 0 replies; 3+ messages in thread
From: bugzilla-daemon @ 2010-03-19 18:13 UTC (permalink / raw)
  To: linux-ext4

http://bugzilla.kernel.org/show_bug.cgi?id=15579





--- Comment #2 from Theodore Tso <tytso@mit.edu>  2010-03-19 18:13:46 ---
Created an attachment (id=25616)
 --> (http://bugzilla.kernel.org/attachment.cgi?id=25616)
Proposed patch for this problem

Oh, sh*t.   If what I think is happening, is happening, this is definitely a
brown paper bag bug.

Does this fix it for you?
