All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Jan Kara <jack@suse.com>
Cc: linux-ext4@vger.kernel.org,
	Dan Williams <dan.j.williams@intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 0/6] ext4: Punch hole and DAX fixes
Date: Wed, 14 Oct 2015 12:06:04 -0600	[thread overview]
Message-ID: <20151014180604.GA31803@linux.intel.com> (raw)
In-Reply-To: <1444822227-29984-1-git-send-email-jack@suse.com>

On Wed, Oct 14, 2015 at 01:30:21PM +0200, Jan Kara wrote:
> Hello,
> 
> This series fixes a long standing problem of racing punch hole and page fault
> resulting in possible filesystem corruption or stale data exposure. We fix the
> problem by using a new inode-private rw_semaphore i_mmap_sem to synchronize
> page faults with truncate and punch hole operations.
> 
> When having this exclusion, the only remaining problem with DAX implementation
> are races between two page faults zeroing out same block concurrently (where
> the data written after the first fault finishes are possibly overwritten by
> the second fault still doing zeroing). That is dealt with by patches 5 and 6
> in this series where we implement block zeroing directly in ext4_map_blocks().
> 
> The patches survived xfstests run both in dax and non-dax mode.
> 
> 								Honza

I threw these patches in to my own test setup and was able to hit the
following with generic/198:

[   57.453855] run fstests generic/198 at 2015-10-14 11:41:21
[   57.585886] ------------[ cut here ]------------
[   57.585890] kernel BUG at fs/ext4/inode.c:3057!
[   57.585892] invalid opcode: 0000 [#1] SMP 
[   57.585894] Modules linked in: nd_pmem nd_btt nd_e820 libnvdimm
[   57.585898] CPU: 0 PID: 3099 Comm: aiodio_sparse2 Not tainted 4.3.0-rc4_ext4_locking+ #1
[   57.585899] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[   57.585900] task: ffff88040ea3b400 ti: ffff8800ba5e8000 task.ti: ffff8800ba5e8000
[   57.585901] RIP: 0010:[<ffffffff812f7b79>]  [<ffffffff812f7b79>] ext4_dax_mmap_get_block+0x1f9/0x250
[   57.585906] RSP: 0000:ffff8800ba5ebc48  EFLAGS: 00010216
[   57.585907] RAX: 0000000000000200 RBX: ffff8800ba5ebcf8 RCX: 000000000000000c
[   57.585908] RDX: ffff8800ba5ebcf8 RSI: 000000000000004b RDI: ffff8801fbdb13c0
[   57.585909] RBP: ffff8800ba5ebca0 R08: ffffffff812f7980 R09: ffff8801fbdb15e0
[   57.585910] R10: 00007f7921c00000 R11: 000000000000000c R12: ffff8801fbdb13c0
[   57.585911] R13: 0000000000000001 R14: ffff8800b153c870 R15: 0000000000000000
[   57.585912] FS:  00007f7928999700(0000) GS:ffff880411000000(0000) knlGS:0000000000000000
[   57.585913] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   57.585914] CR2: 00007f7921c00000 CR3: 00000000bb1b5000 CR4: 00000000000406f0
[   57.585917] Stack:
[   57.585918]  ffff8800ba351000 ffff8800bafca028 0000000000000000 000002000000004b
[   57.585920]  ffff8800ba5ebd08 000000004edcd27a ffff8800baf47f18 ffff8801fbdb13c0
[   57.585922]  00007f7921c00000 ffff8800b153c870 0000000000000000 ffff8800ba5ebd90
[   57.585924] Call Trace:
[   57.585928]  [<ffffffff812a061c>] __dax_pmd_fault+0x15c/0x590
[   57.585931]  [<ffffffff81224d46>] ? kmem_cache_alloc+0xd6/0x210
[   57.585934]  [<ffffffff812f06ab>] ext4_dax_pmd_fault+0xbb/0x140
[   57.585936]  [<ffffffff811f6d0b>] handle_mm_fault+0x2cb/0x1960
[   57.585938]  [<ffffffff811f6a9a>] ? handle_mm_fault+0x5a/0x1960
[   57.585941]  [<ffffffff81067731>] __do_page_fault+0x191/0x3f0
[   57.585943]  [<ffffffff81067a5f>] trace_do_page_fault+0x4f/0x120
[   57.585944]  [<ffffffff8106217a>] do_async_page_fault+0x1a/0xa0
[   57.585947]  [<ffffffff81a23dc8>] async_page_fault+0x28/0x30
[   57.585948] Code: e8 3d ef ff ff 85 c0 41 89 c5 4c 89 fa be 18 0c 00 00 48 c7 c7 50 0a c3 81 78 31 e8 82 17 03 00 85 c0 44 0f 48 f0 e9 98 fe ff ff <0f> 0b be 1d 0c 00 00 48 c7 c7 f8 6e f2 81 e8 74 b6 da ff c6 05 
[   57.585969] RIP  [<ffffffff812f7b79>] ext4_dax_mmap_get_block+0x1f9/0x250
[   57.585971]  RSP <ffff8800ba5ebc48>
[   57.585973] ---[ end trace 356a2226c61d7104 ]---

I'll spend some time today trying to figure out what's going on, but thought
I'd let you know early in case you were already debugging this.

My test setup is a pair of 4 GiB PMEM partitions in a KVM guest, one for the
TEST_DEV and the other for the SCRATCH_DEV.  I'm setting "-o dax" with
MOUNT_OPTIONS:

# ./check generic/198
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 alara 4.3.0-rc4_ext4_locking+
MKFS_OPTIONS  -- /dev/pmem0p2
MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem0p2 /mnt/xfstests_scratch

generic/198 0s ... [failed, exit status 139] - output mismatch (see /root/xfstests/results//generic/198.out.bad)
    --- tests/generic/198.out	2015-10-02 10:19:36.804795880 -0600
    +++ /root/xfstests/results//generic/198.out.bad	2015-10-14 11:41:21.234972535 -0600
    @@ -1,2 +1,3 @@
     QA output created by 198
     Silence is golden.
    +./tests/generic/198: line 55:  3099 Segmentation fault      $AIO_TEST "$TEST_DIR/aiodio_sparse"
    ...
    (Run 'diff -u tests/generic/198.out /root/xfstests/results//generic/198.out.bad'  to see the entire diff)

  parent reply	other threads:[~2015-10-14 18:06 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-14 11:30 [PATCH 0/6] ext4: Punch hole and DAX fixes Jan Kara
2015-10-14 11:30 ` [PATCH 1/6] ext4: Fix races between page faults and hole punching Jan Kara
2015-10-15  3:00   ` Ross Zwisler
2015-10-15  9:14     ` Jan Kara
2015-10-15 20:22     ` Dave Chinner
2015-10-14 11:30 ` [PATCH 2/6] ext4: Document lock ordering Jan Kara
2015-10-14 11:30 ` [PATCH 3/6] ext4: Get rid of EXT4_GET_BLOCKS_NO_LOCK flag Jan Kara
2015-10-14 11:30 ` [PATCH 4/6] ext4: Provide ext4_issue_zeroout() Jan Kara
2015-10-14 13:18   ` kbuild test robot
2015-10-14 11:30 ` [PATCH 5/6] ext4: Implement allocation of pre-zeroed blocks Jan Kara
2015-10-14 11:30 ` [PATCH 6/6] ext4: Use pre-zeroed blocks for DAX page faults Jan Kara
2015-10-14 18:06 ` Ross Zwisler [this message]
2015-10-14 21:07   ` [PATCH 0/6] ext4: Punch hole and DAX fixes Ross Zwisler
2015-10-15  9:13     ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151014180604.GA31803@linux.intel.com \
    --to=ross.zwisler@linux.intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=matthew.r.wilcox@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.