* [Bug 15579] ext4 -o discard produces incorrect blocks of zeroes in newly created files under heavy read+truncate+append-new-file load
From: bugzilla-daemon @ 2010-03-19 12:41 UTC
  To: linux-ext4

http://bugzilla.kernel.org/show_bug.cgi?id=15579


Dmitry Monakhov <dmonakhov@openvz.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dmonakhov@openvz.org




--- Comment #1 from Dmitry Monakhov <dmonakhov@openvz.org>  2010-03-19 12:40:57 ---
Some time ago I posted compat discard support, which simulates
discard by generating a simple zero-filled request:
http://lkml.org/lkml/2010/2/11/74
Many changes were requested, so I'm still working on a new version (it will be
ready soon).
But it may be useful for debugging in conjunction with blktrace.

-- 
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.

--- Comment #2 from Theodore Tso <tytso@mit.edu>  2010-03-19 18:13:46 ---
Created an attachment (id=25616)
 --> (http://bugzilla.kernel.org/attachment.cgi?id=25616)
Proposed patch for this problem

Oh, sh*t.   If what I think is happening, is happening, this is definitely a
brown paper bag bug.

Does this fix it for you?

--- Comment #3 from Andreas Beckmann <kernel-bugs@abeckmann.de>  2010-03-21 09:45:54 ---
(In reply to comment #2)
> Does this fix it for you?

The patch didn't apply cleanly to 2.6.33, I had to "remove the old code block"
manually.
So far after rebuilding the ext4 module I haven't experienced the problem any
more. Please get this into 2.6.33.x. 

Thanks!

Eric Sandeen <sandeen@redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sandeen@redhat.com




--- Comment #4 from Eric Sandeen <sandeen@redhat.com>  2010-03-22 21:40:52 ---
Just for what it's worth, I've had trouble reproducing this on another brand of
SSD... something like this (don't let the xfs_io throw you; it's just a
convenient way to generate the IO).  I did this on a 512M filesystem.

#!/bin/bash

SCRATCH_MNT=/mnt/scratch

rm -f $SCRATCH_MNT/*
touch $SCRATCH_MNT/outputfile

# Create several large-ish files
for I in `seq 1 240`; do
  xfs_io -F -f -c "pwrite 0 2m" $SCRATCH_MNT/file$I &>/dev/null
done

# reread the last bit of each, just for kicks, and truncate off 1m
for I in `seq 1 240`; do
  xfs_io -F -c "pread 1m 2m" $SCRATCH_MNT/file$I &>/dev/null
  xfs_io -F -c "truncate 1m" $SCRATCH_MNT/file$I
done

# Append the outputfile
xfs_io -F -c "pwrite 0 250m" $SCRATCH_MNT/outputfile &>/dev/null

In the end I don't get any corruption.  I was hoping to write a testcase for
this (one that didn't take 250G) :)

Does the above reflect your use case?  Does the above corrupt the outputfile on
your filesystem?  (Note the "rm -f" above; be careful with that.)  You could
substitute dd for xfs_io without much trouble if desired.
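
A minimal sketch of that dd substitution, assuming GNU coreutils (dd with
iflag=fullblock and status=none, plus truncate). The helper names pwrite_mib,
pread_mib and truncate_mib are made up for this illustration. One caveat:
xfs_io's pwrite fills its buffer with a non-zero pattern (0xcd by default), so
the sketch writes 0xcd bytes instead of copying /dev/zero; otherwise a
wrongly zeroed block could not be told apart from good data.

```shell
#!/bin/bash
# Hypothetical dd/truncate stand-ins for the xfs_io commands in the script.
# All offsets and sizes are given in MiB.

# Write COUNT MiB of 0xcd bytes into FILE starting at OFFSET MiB.
# conv=notrunc keeps existing data beyond the written range intact.
pwrite_mib() {
  local file=$1 offset=$2 count=$3
  LC_ALL=C tr '\0' '\315' < /dev/zero |
    dd of="$file" bs=1M seek="$offset" count="$count" \
       iflag=fullblock conv=notrunc status=none
}

# Read COUNT MiB from FILE starting at OFFSET MiB and throw the data away.
pread_mib() {
  local file=$1 offset=$2 count=$3
  dd if="$file" of=/dev/null bs=1M skip="$offset" count="$count" status=none
}

# Cut FILE down (or extend it) to SIZE MiB.
truncate_mib() {
  local file=$1 size=$2
  truncate -s "$(( size * 1024 * 1024 ))" "$file"
}
```

With these, xfs_io -F -f -c "pwrite 0 2m" $SCRATCH_MNT/file$I would become
pwrite_mib $SCRATCH_MNT/file$I 0 2, and xfs_io -F -c "truncate 1m" would
become truncate_mib $SCRATCH_MNT/file$I 1.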

--- Comment #5 from Andreas Beckmann <kernel-bugs@abeckmann.de>  2010-03-23 11:10:15 ---
(In reply to comment #4)
> Just for what it's worth, I've had trouble reproducing this on another brand of
> SSD... something like this (don't let the xfs_io throw you; it's just a
> convenient way to generate the IO).  I did this on a 512M filesystem.

Might be a probability issue. For the 250 GB case I did in total about 200000
truncations on about 250 files and found in the output file 8 and 13 corrupt
blocks (I only kept detailed numbers for two cases). Reducing the block size
might "help" by increasing the number of I/Os.
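
To put rough numbers on that: a sketch, assuming purely for illustration that
every truncation independently carries the same small chance of producing one
corrupt block. It scales the 8 and 13 corrupt blocks seen in about 200000
truncations down to the 240 truncations done by the script in comment #4.
expected_corrupt is a hypothetical helper name.

```shell
#!/bin/bash
# Back-of-the-envelope estimate only. Assumption (not from the report): each
# truncation has the same, independent chance of yielding one corrupt block.
expected_corrupt() {
  local corrupt=$1 truncations=$2 trial=$3
  awk -v c="$corrupt" -v t="$truncations" -v n="$trial" \
    'BEGIN { printf "%.4f\n", c / t * n }'
}

expected_corrupt 8 200000 240     # prints 0.0096
expected_corrupt 13 200000 240    # prints 0.0156
```

Under that assumption a single run of the small script would be expected to
show roughly 0.01 to 0.016 corrupt blocks, so one clean run says little.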

I can't test your script right now, the disks are all busy with some long
running experiments. There should be another one just back from RMA on my desk,
so I can try it tomorrow when I'm back there (was travelling for a week).

What do you do with the remaining space on the SSD? Try putting a file system
there and filling it with something so that the SSD is 99% full and can't
easily remap the blocks you are writing to.

--- Comment #6 from Eric Sandeen <sandeen@redhat.com>  2010-03-23 14:29:35 ---
(In reply to comment #5)

> What do you do with the remaining space on the SSD? Try putting a file system
> there and filling it with something so that the SSD is 99% full and can't
> easily remap the blocks you are writing to.

Hm, I suppose that could be, and it makes it a little harder to write a generic
testcase....

Greg.Freemyer@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Greg.Freemyer@gmail.com




--- Comment #7 from Greg.Freemyer@gmail.com  2010-03-23 21:01:04 ---
If the number of available unmapped blocks has an impact, that seems most
likely to be an SSD firmware bug to me.

I.e., if the Linux kernel is sending control messages in the wrong order, then it
should cause corruption regardless of the number of unmapped blocks.

--- Comment #8 from Andreas Beckmann <kernel-bugs@abeckmann.de>  2010-03-29 08:16:54 ---
(In reply to comment #7)
> If the number of available unmapped blocks has an impact, that seems most
> likely to be an SSD firmware bug to me.
> 
> I.e., if the Linux kernel is sending control messages in the wrong order, then it
> should cause corruption regardless of the number of unmapped blocks.

That's correct except that you may get a timing issue (e.g. writing to free
unmapped blocks is/could be/should be a bit faster than clearing the blocks
first) which could turn this into race condition debugging ...

--- Comment #9 from Andreas Beckmann <kernel-bugs@abeckmann.de>  2010-03-29 08:36:38 ---
(In reply to comment #4)
> Just for what it's worth, I've had trouble reproducing this on another brand of
> SSD... something like this (don't let the xfs_io throw you; it's just a
> convenient way to generate the IO).  I did this on a 512M filesystem.

With some small modifications I can reproduce this every time: I do two
iterations of truncating + writing the output. Seems to happen in the second
write only.
You can skip the reading; it is not necessary.
N=236 is the smallest N where the problem occurs; N=253 is the maximum number of
files fitting on the file system.

./find-zeroes is my tool to check for "0x00 holes"
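
The find-zeroes tool itself is not included in this report. A rough stand-in,
assuming GNU od and stat and an output format modelled on the result lines
further down ("at OFFSET length LENGTH", then "size ... zeroes ..."), might
look like this. The function name find_zeroes and the 4096-byte default block
size are assumptions; the sketch only reports runs of whole all-zero blocks,
which may differ from what the original tool does.

```shell
#!/bin/bash
# Hypothetical stand-in for ./find-zeroes; NOT the tool used in this report.
# Usage: find_zeroes FILE [BLOCKSIZE]
find_zeroes() {
  local file=$1 bs=${2:-4096}
  # od prints one line of hex bytes per block; -v keeps repeated lines.
  od -An -v -tx1 -w"$bs" "$file" |
    awk -v bs="$bs" -v size="$(stat -c %s "$file")" '
      # Print the pending run of zero blocks, if any, and reset it.
      function emit() {
        if (run > 0) printf "at %.0f length %.0f\n", start, run
        run = 0
      }
      {
        off = (NR - 1) * bs
        len = (off + bs > size) ? size - off : bs   # last block may be short
        if ($0 ~ /^[ 0]*$/) {      # only "00" bytes: the block is all zero
          if (run == 0) start = off
          run += len
          zeroes += len
        } else {
          emit()
        }
      }
      END { emit(); printf "size %.0f zeroes %.0f\n", size, zeroes }'
}
```

%.0f is used instead of %d because some awk builds clamp %d to 32 bits.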

mkfs options: -m 0 -T largefile4

#!/bin/bash

SCRATCH_MNT=/mnt/scratch

N=253

#rm -f $SCRATCH_MNT/*
#touch $SCRATCH_MNT/outputfile
#xfs_io -F -c "pwrite 0 ${N}m" $SCRATCH_MNT/outputfile &>/dev/null
#xfs_io -F -c "pwrite ${N}M ${N}m" $SCRATCH_MNT/outputfile &>/dev/null
#./find-zeroes $SCRATCH_MNT/outputfile

rm -f $SCRATCH_MNT/*
touch $SCRATCH_MNT/outputfile

# Create several large-ish files
for I in `seq 1 $N`; do
  xfs_io -F -f -c "pwrite 0 2m" $SCRATCH_MNT/file$I &>/dev/null
done

# reread the last bit of each, just for kicks, and truncate off 1m
for I in `seq 1 $N`; do
  xfs_io -F -c "pread 1m 1m" $SCRATCH_MNT/file$I &>/dev/null
  xfs_io -F -c "truncate 1m" $SCRATCH_MNT/file$I
done

# Append the outputfile
xfs_io -F -c "pwrite 0 ${N}m" $SCRATCH_MNT/outputfile &>/dev/null

# reread the remaining 1m of each, and truncate to 0
for I in `seq 1 $N`; do
  xfs_io -F -c "pread 0m 1m" $SCRATCH_MNT/file$I &>/dev/null
  xfs_io -F -c "truncate 0m" $SCRATCH_MNT/file$I
done

# Append the outputfile
xfs_io -F -c "pwrite ${N}M ${N}m" $SCRATCH_MNT/outputfile &>/dev/null

./find-zeroes $SCRATCH_MNT/outputfile



$ ./trash-ext4-discard
at 246800384 length 18489344
size 511950848 zeroes 18489344
$ ./trash-ext4-discard
at 246808576 length 18481152
size 511950848 zeroes 18481152
$ ./trash-ext4-discard
at 246857728 length 18432000
size 511848448 zeroes 18432000
$ ./trash-ext4-discard
at 246640640 length 18649088
size 512086016 zeroes 18649088
$ ./trash-ext4-discard
at 246800384 length 18489344
size 511959040 zeroes 18489344

actually this is enough:


# Create several large-ish files
for I in `seq 1 $N`; do
  xfs_io -F -f -c "pwrite 0 1m" $SCRATCH_MNT/file$I &>/dev/null
done

# Append the outputfile
xfs_io -F -c "pwrite 0 ${N}m" $SCRATCH_MNT/outputfile &>/dev/null

# truncate all
for I in `seq 1 $N`; do
  xfs_io -F -c "truncate 0m" $SCRATCH_MNT/file$I
done

# Append the outputfile
xfs_io -F -c "pwrite ${N}M ${N}m" $SCRATCH_MNT/outputfile &>/dev/null


$ ./trash-ext4-discard2
at 228061184 length 37228544
size 530579456 zeroes 37228544
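
A small arithmetic check on the numbers reported above, offered as an
observation and not as a diagnosis: adding each "at" offset to its "length"
gives the byte at which the zero run ends. The helper name zero_run_ends is
made up for this sketch; the input lines are copied from the outputs above.

```shell
#!/bin/bash
# For every "at OFFSET length LENGTH" line on stdin, print OFFSET + LENGTH,
# i.e. the byte offset at which that run of zeroes ends.
zero_run_ends() {
  awk '$1 == "at" && $3 == "length" { printf "%.0f\n", $2 + $4 }'
}

zero_run_ends <<'EOF'
at 246800384 length 18489344
at 246808576 length 18481152
at 246857728 length 18432000
at 246640640 length 18649088
at 246800384 length 18489344
at 228061184 length 37228544
EOF
# prints 265289728 six times

echo $(( 253 * 1024 * 1024 ))
# prints 265289728
```

All six runs end at byte 265289728, which is exactly 253 MiB, i.e. N MiB for
N=253: the offset at which the second pwrite (pwrite ${N}M ${N}m) begins.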

--- Comment #10 from Andreas Beckmann <kernel-bugs@abeckmann.de>  2010-03-29 08:43:03 ---
(In reply to comment #9)
> mkfs options: -m 0 -T largefile4
No, only -T largefile; otherwise I can't create enough files.

--- Comment #11 from Eric Sandeen <sandeen@redhat.com>  2010-03-29 14:54:57 ---
Thanks, I'll make sure I can reproduce this and turn it into a testcase.

--- Comment #12 from Andreas Beckmann <kernel-bugs@abeckmann.de>  2010-05-19 10:50:44 ---
I just saw that this patch went into 2.6.34
commit b90f687018e6d6c77d981b09203780f7001407e5

Thanks!

Eric Sandeen <sandeen@redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|fs_ext4@kernel-bugs.osdl.or |sandeen@redhat.com
                   |g                           |




--- Comment #13 from Eric Sandeen <sandeen@redhat.com>  2010-05-19 15:58:25 ---
Taking bug so I can close it :)
