All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-ext4@vger.kernel.org
Subject: Re: [PATCH v5 00/22] Rewrite XIP code and add XIP support to ext4
Date: Thu, 30 Jan 2014 17:42:30 +1100	[thread overview]
Message-ID: <20140130064230.GG13997@dastard> (raw)
In-Reply-To: <cover.1389779961.git.matthew.r.wilcox@intel.com>

On Wed, Jan 15, 2014 at 08:24:18PM -0500, Matthew Wilcox wrote:
> This series of patches add support for XIP to ext4.  Unfortunately,
> it turns out to be necessary to rewrite the existing XIP support code
> first due to races that are unfixable in the current design.
> 
> Since v4 of this patchset, I've improved the documentation, fixed a
> couple of warnings that a newer version of gcc emitted, and fixed a
> bug where we would read/write the wrong address for I/Os that were not
> aligned to PAGE_SIZE.

Looks like there's something fundamentally broken with the patch set
as it stands. I get this same data corruption on both ext4 and XFS
with XIP using fsx. It's as basic as it gets - the first read after
a mmapped write fails to see the data written by mmap:

$ sudo mkfs.xfs -f /dev/ram0
meta-data=/dev/ram0              isize=256    agcount=4, agsize=256000 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0
data     =                       bsize=4096   blocks=1024000, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=12800, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
$ sudo mount -o xip /dev/ram0 /mnt/scr
$ sudo chmod 777 /mnt/scr
$ ltp/fsx -d -N 1000 -S 0 /mnt/scr/fsx
Seed set to 3774
1 mapwrite      0x3db39 thru    0x3ffff (0x24c7 bytes)
2 mapread       0x2e947 thru    0x33163 (0x481d bytes)
3 read  0x2e836 thru    0x3cba1 (0xe36c bytes)
4 punch from 0x2e7 to 0x5c43, (0x595c bytes)
5 mapwrite      0xcaea thru     0x13ba9 (0x70c0 bytes)
6 punch from 0x31645 to 0x38d1d, (0x76d8 bytes)
7 falloc        from 0x24f92 to 0x2f2b7 (0xa325 bytes)
fallocating to largest ever: 0x171ac
8 falloc        from 0xbcf1 to 0x171ac (0xb4bb bytes)
9 read  0x126f thru     0x11136 (0xfec8 bytes)
READ BAD DATA: offset = 0x126f, size = 0xfec8, fname = /mnt/scr/fsx
OFFSET  GOOD    BAD     RANGE
0x caea 0x05f9  0x0000  0x    0
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caeb 0xf905  0x0000  0x    1
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caec 0x0599  0x0000  0x    2
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caed 0x9905  0x0000  0x    3
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caee 0x05e8  0x0000  0x    4
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caef 0xe805  0x0000  0x    5
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf0 0x0580  0x0000  0x    6
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf1 0x8005  0x0000  0x    7
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf2 0x056c  0x0000  0x    8
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf3 0x6c05  0x0000  0x    9
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf4 0x05ad  0x0000  0x    a
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf5 0xad05  0x0000  0x    b
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf6 0x0539  0x0000  0x    c
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf7 0x3905  0x0000  0x    d
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf8 0x05db  0x0000  0x    e
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf9 0xdb05  0x0000  0x    f
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
LOG DUMP (9 total operations):
1(  1 mod 256): MAPWRITE 0x3db39 thru 0x3ffff   (0x24c7 bytes)
2(  2 mod 256): MAPREAD  0x2e947 thru 0x33163   (0x481d bytes)
3(  3 mod 256): READ     0x2e836 thru 0x3cba1   (0xe36c bytes)
4(  4 mod 256): PUNCH    0x2e7 thru 0x5c42      (0x595c bytes)
5(  5 mod 256): MAPWRITE 0xcaea thru 0x13ba9    (0x70c0 bytes)  ******WWWW
6(  6 mod 256): PUNCH    0x31645 thru 0x38d1c   (0x76d8 bytes)
7(  7 mod 256): FALLOC   0x24f92 thru 0x2f2b7   (0xa325 bytes) INTERIOR
8(  8 mod 256): FALLOC   0xbcf1 thru 0x171ac    (0xb4bb bytes) INTERIOR ******FFFF
9(  9 mod 256): READ     0x126f thru 0x11136    (0xfec8 bytes)  ***RRRR***
Correct content saved for comparison
(maybe hexdump "/mnt/scr/fsx" vs "/mnt/scr/fsx.fsxgood")

XFS gives a good indication that we aren't doing something correctly
w.r.t. mapped XIP writes, as trying to fiemap the file ASSERT fails
with a delayed allocation extent somewhere inside the file after a
sync. I shall keep digging.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-ext4@vger.kernel.org
Subject: Re: [PATCH v5 00/22] Rewrite XIP code and add XIP support to ext4
Date: Thu, 30 Jan 2014 17:42:30 +1100	[thread overview]
Message-ID: <20140130064230.GG13997@dastard> (raw)
In-Reply-To: <cover.1389779961.git.matthew.r.wilcox@intel.com>

On Wed, Jan 15, 2014 at 08:24:18PM -0500, Matthew Wilcox wrote:
> This series of patches add support for XIP to ext4.  Unfortunately,
> it turns out to be necessary to rewrite the existing XIP support code
> first due to races that are unfixable in the current design.
> 
> Since v4 of this patchset, I've improved the documentation, fixed a
> couple of warnings that a newer version of gcc emitted, and fixed a
> bug where we would read/write the wrong address for I/Os that were not
> aligned to PAGE_SIZE.

Looks like there's something fundamentally broken with the patch set
as it stands. I get this same data corruption on both ext4 and XFS
with XIP using fsx. It's as basic as it gets - the first read after
a mmapped write fails to see the data written by mmap:

$ sudo mkfs.xfs -f /dev/ram0
meta-data=/dev/ram0              isize=256    agcount=4, agsize=256000 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0
data     =                       bsize=4096   blocks=1024000, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=12800, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
$ sudo mount -o xip /dev/ram0 /mnt/scr
$ sudo chmod 777 /mnt/scr
$ ltp/fsx -d -N 1000 -S 0 /mnt/scr/fsx
Seed set to 3774
1 mapwrite      0x3db39 thru    0x3ffff (0x24c7 bytes)
2 mapread       0x2e947 thru    0x33163 (0x481d bytes)
3 read  0x2e836 thru    0x3cba1 (0xe36c bytes)
4 punch from 0x2e7 to 0x5c43, (0x595c bytes)
5 mapwrite      0xcaea thru     0x13ba9 (0x70c0 bytes)
6 punch from 0x31645 to 0x38d1d, (0x76d8 bytes)
7 falloc        from 0x24f92 to 0x2f2b7 (0xa325 bytes)
fallocating to largest ever: 0x171ac
8 falloc        from 0xbcf1 to 0x171ac (0xb4bb bytes)
9 read  0x126f thru     0x11136 (0xfec8 bytes)
READ BAD DATA: offset = 0x126f, size = 0xfec8, fname = /mnt/scr/fsx
OFFSET  GOOD    BAD     RANGE
0x caea 0x05f9  0x0000  0x    0
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caeb 0xf905  0x0000  0x    1
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caec 0x0599  0x0000  0x    2
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caed 0x9905  0x0000  0x    3
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caee 0x05e8  0x0000  0x    4
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caef 0xe805  0x0000  0x    5
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf0 0x0580  0x0000  0x    6
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf1 0x8005  0x0000  0x    7
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf2 0x056c  0x0000  0x    8
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf3 0x6c05  0x0000  0x    9
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf4 0x05ad  0x0000  0x    a
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf5 0xad05  0x0000  0x    b
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf6 0x0539  0x0000  0x    c
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf7 0x3905  0x0000  0x    d
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf8 0x05db  0x0000  0x    e
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x caf9 0xdb05  0x0000  0x    f
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
LOG DUMP (9 total operations):
1(  1 mod 256): MAPWRITE 0x3db39 thru 0x3ffff   (0x24c7 bytes)
2(  2 mod 256): MAPREAD  0x2e947 thru 0x33163   (0x481d bytes)
3(  3 mod 256): READ     0x2e836 thru 0x3cba1   (0xe36c bytes)
4(  4 mod 256): PUNCH    0x2e7 thru 0x5c42      (0x595c bytes)
5(  5 mod 256): MAPWRITE 0xcaea thru 0x13ba9    (0x70c0 bytes)  ******WWWW
6(  6 mod 256): PUNCH    0x31645 thru 0x38d1c   (0x76d8 bytes)
7(  7 mod 256): FALLOC   0x24f92 thru 0x2f2b7   (0xa325 bytes) INTERIOR
8(  8 mod 256): FALLOC   0xbcf1 thru 0x171ac    (0xb4bb bytes) INTERIOR ******FFFF
9(  9 mod 256): READ     0x126f thru 0x11136    (0xfec8 bytes)  ***RRRR***
Correct content saved for comparison
(maybe hexdump "/mnt/scr/fsx" vs "/mnt/scr/fsx.fsxgood")

XFS gives a good indication that we aren't doing something correctly
w.r.t. mapped XIP writes, as trying to fiemap the file ASSERT fails
with a delayed allocation extent somewhere inside the file after a
sync. I shall keep digging.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply	other threads:[~2014-01-30  6:42 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-16  1:24 [PATCH v5 00/22] Rewrite XIP code and add XIP support to ext4 Matthew Wilcox
2014-01-16  1:24 ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 01/22] Fix XIP fault vs truncate race Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 02/22] Allow page fault handlers to perform the COW Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 03/22] axonram: Fix bug in direct_access Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 04/22] Change direct_access calling convention Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 05/22] Introduce IS_XIP(inode) Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 06/22] Treat XIP like O_DIRECT Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-31 16:59   ` Jan Kara
2014-01-31 16:59     ` Jan Kara
2014-01-16  1:24 ` [PATCH v5 07/22] Rewrite XIP page fault handling Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 08/22] Change xip_truncate_page to take a get_block parameter Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 09/22] Remove mm/filemap_xip.c Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 10/22] Remove get_xip_mem Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:46   ` Randy Dunlap
2014-01-16  1:46     ` Randy Dunlap
2014-01-27 13:26     ` Matthew Wilcox
2014-01-27 13:26       ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 11/22] Replace ext2_clear_xip_target with xip_clear_blocks Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 12/22] ext2: Remove ext2_xip_verify_sb() Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 13/22] ext2: Remove ext2_use_xip Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 14/22] ext2: Remove xip.c and xip.h Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 15/22] Remove CONFIG_EXT2_FS_XIP Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 16/22] ext2: Remove ext2_aops_xip Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 17/22] xip: Add xip_zero_page_range Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 18/22] ext4: Make ext4_block_zero_page_range static Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 19/22] ext4: Add XIP functionality Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 20/22] ext4: Fix typos Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 21/22] xip: Add reporting of major faults Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
2014-01-16  1:24 ` [PATCH v5 22/22] XIP: Add support for unwritten extents Matthew Wilcox
2014-01-16  1:24   ` Matthew Wilcox
     [not found] ` <CEFDA737.22F87%matthew.r.wilcox@intel.com>
2014-01-17  0:00   ` [PATCH v5 19/22] ext4: Add XIP functionality Ross Zwisler
2014-01-17  0:00     ` Ross Zwisler
     [not found] ` <CEFD7DAD.22F65%matthew.r.wilcox@intel.com>
2014-01-22 22:51   ` [PATCH v5 22/22] XIP: Add support for unwritten extents Ross Zwisler
2014-01-22 22:51     ` Ross Zwisler
2014-01-23 12:08     ` Matthew Wilcox
2014-01-23 12:08       ` Matthew Wilcox
2014-01-23 19:13       ` Ross Zwisler
2014-01-23 19:13         ` Ross Zwisler
     [not found]     ` <CF0C370C.235F1%willy@linux.intel.com>
2014-01-27 23:32       ` Ross Zwisler
2014-01-27 23:32         ` Ross Zwisler
2014-01-28  3:49         ` Matthew Wilcox
2014-01-28  3:49           ` Matthew Wilcox
2014-01-23  7:48 ` [PATCH v5 00/22] Rewrite XIP code and add XIP support to ext4 Dave Chinner
2014-01-23  7:48   ` Dave Chinner
2014-01-23  7:53   ` Dave Chinner
2014-01-23  7:53     ` Dave Chinner
2014-01-23  9:01 ` Dave Chinner
2014-01-23  9:01   ` Dave Chinner
2014-01-23 12:12   ` Wilcox, Matthew R
2014-01-23 12:12     ` Wilcox, Matthew R
2014-01-28  6:06     ` Dave Chinner
2014-01-28  6:06       ` Dave Chinner
2014-01-30  6:42 ` Dave Chinner [this message]
2014-01-30  6:42   ` Dave Chinner
2014-01-30  9:25   ` Dave Chinner
2014-01-30  9:25     ` Dave Chinner
2014-01-31  3:06     ` Dave Chinner
2014-01-31  3:06       ` Dave Chinner
2014-01-31  5:45       ` Ross Zwisler
2014-01-31  5:45         ` Ross Zwisler
2014-01-31 13:04         ` Dave Chinner
2014-01-31 13:04           ` Dave Chinner
     [not found] ` <CF1FF3EB.24114%matthew.r.wilcox@intel.com>
2014-02-11 23:12   ` [PATCH v5 19/22] ext4: Add XIP functionality Ross Zwisler
2014-02-11 23:12     ` Ross Zwisler
2014-02-13  0:00     ` Ross Zwisler
2014-02-13  0:00       ` Ross Zwisler
     [not found] ` <CF215477.24281%matthew.r.wilcox@intel.com>
2014-02-12 23:53   ` [PATCH v5 06/22] Treat XIP like O_DIRECT Ross Zwisler
2014-02-12 23:53     ` Ross Zwisler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140130064230.GG13997@dastard \
    --to=david@fromorbit.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.r.wilcox@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.