linux-fsdevel.vger.kernel.org archive mirror
From: Jeff Mahoney <jeffm@jeffm.io>
To: Roman Lebedev <lebedev.ri@gmail.com>,
	David Howells <dhowells@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-btrfs@vger.kernel.org, linux-unionfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, fstests@vger.kernel.org,
	Filipe Manana <fdmanana@suse.com>
Subject: Re: kernel BUG when fsync'ing file in a overlayfs merged dir, located on btrfs
Date: Thu, 5 Nov 2015 21:57:35 -0500
Message-ID: <563C171F.30702@jeffm.io>
In-Reply-To: <1443643065-16460-1-git-send-email-lebedev.ri@gmail.com>


On 9/30/15 3:57 PM, Roman Lebedev wrote:
> Hello.
> 
> My / is btrfs.
> To do some my local stuff more cleanly i wanted to use overlayfs, 
> but it didn't quite work.
> 
> Simple non-automatic sequence to reproduce the issue:
>  mkdir lower upper work merged
>  mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
>  vi merged/file
>  :wq

Filipe and I got a chance to look into this today.  The crash is due to
commit 4bacc9c9234 (overlayfs: Make f_path always point to the overlay
and f_inode to the underlay).  Incidentally, the test case is as simple
as ":> file ; fsync file" after mounting.
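
A minimal userspace sketch of that test case in C, for anyone who
prefers it to the shell form.  The function name truncate_and_fsync is
made up for illustration; the idea is to call it on a path inside the
merged directory.  On an affected kernel it should be expected to hit
the oops quoted below, so only try it on a scratch machine.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: create or truncate `path` (like ":> file"),
 * then fsync it.  Returns 0 on success or an errno value on failure.
 */
int truncate_and_fsync(const char *path)
{
    int fd, ret = 0;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return errno;

    /* This is the call that ends up in the file system's ->fsync. */
    if (fsync(fd) < 0)
        ret = errno;

    if (close(fd) < 0 && ret == 0)
        ret = errno;

    return ret;
}
```

For example, truncate_and_fsync("merged/file") after the mount command
from the original report.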

The short version is that after this commit, we see:

file->f_mapping->host = <actual fs inode>
file->f_inode = <actual fs inode>
file->f_path.dentry->d_inode = <overlayfs inode>

So now file_operations callbacks can't assume that file->f_path.dentry
belongs to the same file system that implements the callback.  More than
that, any code that could ultimately get a dentry that comes from an
open file can't trust that it's from the same file system.
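
To make the mismatch concrete, here is a toy userspace model in C.  It
is not kernel code: the toy_* structs only borrow the field names from
above, and everything else (the helper names, the "btrfs"/"overlay"
labels) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; only the fields discussed. */
struct toy_sb     { const char *fs_name; };
struct toy_inode  { struct toy_sb *i_sb; };
struct toy_dentry { struct toy_inode *d_inode; };
struct toy_path   { struct toy_dentry *dentry; };
struct toy_file   { struct toy_path f_path; struct toy_inode *f_inode; };

/* True if the dentry's inode is the same object as file->f_inode. */
bool toy_dentry_matches_file(const struct toy_file *f)
{
    return f->f_path.dentry->d_inode == f->f_inode;
}

/* A file opened directly on the underlying fs: both views agree. */
bool toy_native_case_matches(void)
{
    struct toy_sb btrfs = { "btrfs" };
    struct toy_inode real = { &btrfs };
    struct toy_dentry d = { &real };
    struct toy_file f = { { &d }, &real };

    return toy_dentry_matches_file(&f);
}

/* A file opened through the overlay after the commit: they differ. */
bool toy_overlay_case_matches(void)
{
    struct toy_sb btrfs = { "btrfs" }, overlay = { "overlay" };
    struct toy_inode real = { &btrfs };      /* <actual fs inode> */
    struct toy_inode ovl = { &overlay };     /* <overlayfs inode> */
    struct toy_dentry d = { &ovl };          /* f_path.dentry     */
    struct toy_file f = { { &d }, &real };   /* f_inode           */

    return toy_dentry_matches_file(&f);
}
```

In this model toy_native_case_matches() is true and
toy_overlay_case_matches() is false, which is the situation a callback
has to cope with.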

That mismatch is what causes this crash.  Unlike xfs and ext2/3/4, btrfs
uses file->f_path.dentry->d_inode to resolve the inode in its fsync path,
so on overlayfs it ends up operating on the overlayfs inode.  Using
file_inode() is an easy enough fix here, but we run into trouble later.
We have logic in the btrfs fsync() call path (check_parent_dirs_for_sync)
that walks back up the dentry chain, examining each inode's last
transaction and last unlink transaction to determine whether a full
transaction commit is required.  This obviously doesn't work if we're
walking the overlayfs path instead.  Regardless of any argument over
whether that's doing the right thing, it's a pretty common pattern to
assume that file->f_path.dentry comes from the same file system when
implementing a file_operations callback.  Is it intended that this
assumption is no longer valid?
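
The parent-walk problem can be sketched the same way.  The following is
a toy userspace model in C, not the btrfs code: it only walks a
d_parent chain and counts dentries owned by a different superblock than
the one the caller expects.  All toy_* names are hypothetical; the one
convention borrowed from the kernel is that a root dentry is its own
parent.

```c
#include <assert.h>
#include <stddef.h>

struct toy_sb { const char *fs_name; };

struct toy_dentry {
    struct toy_sb *d_sb;          /* owning superblock            */
    struct toy_dentry *d_parent;  /* the root points to itself    */
};

/*
 * Walk from `start` up to the root and count the dentries that do
 * not belong to `expected`.  A non-zero result means a per-fs check
 * on each ancestor would be looking at someone else's objects.
 */
int toy_count_foreign_ancestors(const struct toy_dentry *start,
                                const struct toy_sb *expected)
{
    const struct toy_dentry *d = start;
    int foreign = 0;

    for (;;) {
        if (d->d_sb != expected)
            foreign++;
        if (d->d_parent == d)     /* reached the root */
            break;
        d = d->d_parent;
    }
    return foreign;
}

/* Chain taken from the overlay path: root and file are overlay's. */
int toy_walk_overlay_chain(void)
{
    struct toy_sb btrfs = { "btrfs" }, overlay = { "overlay" };
    struct toy_dentry root = { &overlay, NULL };
    struct toy_dentry file = { &overlay, &root };

    root.d_parent = &root;
    return toy_count_foreign_ancestors(&file, &btrfs);
}

/* Chain taken from the underlying fs: everything is btrfs's. */
int toy_walk_native_chain(void)
{
    struct toy_sb btrfs = { "btrfs" };
    struct toy_dentry root = { &btrfs, NULL };
    struct toy_dentry file = { &btrfs, &root };

    root.d_parent = &root;
    return toy_count_foreign_ancestors(&file, &btrfs);
}
```

Here the overlay chain reports two foreign dentries and the native
chain reports none.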

-Jeff

> Results in vi being killed on exit, and the following trace appears in dmesg:
> 
> [34304.047841] BUG: unable to handle kernel paging request at 0000000009618e56
> [34304.047846] IP: [<ffffffffa01667b6>] btrfs_sync_file+0xa6/0x350 [btrfs]
> [34304.047864] PGD 0 
> [34304.047866] Oops: 0002 [#12] SMP 
> [34304.047867] Modules linked in: overlay cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc fglrx(PO) nls_utf8 joydev nls_cp437 vfat fat hid_generic usbhid kvm_amd hid kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi sha256_ssse3 sha256_generic snd_hda_intel snd_hda_codec hmac drbg ansi_cprng aesni_intel snd_hda_core aes_x86_64 mxm_wmi snd_hwdep lrw eeepc_wmi snd_pcm gf128mul asus_wmi sparse_keymap rfkill video snd_timer glue_helper sp5100_tco evdev ablk_helper e1000e ohci_pci pcspkr snd ohci_hcd xhci_pci edac_mce_amd ehci_pci serio_raw xhci_hcd soundcore fam15h_power ehci_hcd cryptd edac_core ptp pps_core usbcore k10temp i2c_piix4
> [34304.047893]  sg usb_common acpi_cpufreq wmi tpm_infineon button processor shpchp tpm_tis tpm thermal_sys tcp_yeah tcp_vegas it87 hwmon_vid loop parport_pc ppdev lp parport autofs4 crc32c_generic btrfs xor raid6_pq sd_mod crc32c_intel ahci libahci libata scsi_mod
> [34304.047905] CPU: 4 PID: 13990 Comm: vi Tainted: P      D    O    4.2.0-1-amd64 #1 Debian 4.2.1-2
> [34304.047906] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015
> [34304.047908] task: ffff8803d5f7f2c0 ti: ffff8806a3ec8000 task.ti: ffff8806a3ec8000
> [34304.047909] RIP: 0010:[<ffffffffa01667b6>]  [<ffffffffa01667b6>] btrfs_sync_file+0xa6/0x350 [btrfs]
> [34304.047920] RSP: 0018:ffff8806a3ecbe88  EFLAGS: 00010246
> [34304.047921] RAX: ffff8803d5f7f2c0 RBX: ffff8807b2d46600 RCX: ffffffff81a6ad00
> [34304.047922] RDX: 0000000080000000 RSI: 0000000000000000 RDI: ffff8807c19f8970
> [34304.047923] RBP: ffff8807c19f8970 R08: 0000000000000000 R09: 0000000000000001
> [34304.047924] R10: 0000000000000000 R11: 0000000000000246 R12: ffff8807c19f88c8
> [34304.047925] R13: 0000000000000000 R14: 0000000009618b22 R15: 000055cb20184a70
> [34304.047926] FS:  00007f31c5492800(0000) GS:ffff88082fd00000(0000) knlGS:0000000000000000
> [34304.047927] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [34304.047928] CR2: 0000000009618e56 CR3: 000000044af44000 CR4: 00000000000406e0
> [34304.047929] Stack:
> [34304.047930]  0000000000000001 7fffffffffffffff ffff880403d5b918 8000000000000000
> [34304.047932]  0000000000000000 0000000000000000 000055cb20186d40 ffff8807b2d46600
> [34304.047933]  0000000000000004 ffff88044b249000 0000000000000020 ffff8807b2d46600
> [34304.047935] Call Trace:
> [34304.047939]  [<ffffffff811e7038>] ? do_fsync+0x38/0x60
> [34304.047940]  [<ffffffff811e72b0>] ? SyS_fsync+0x10/0x20
> [34304.047943]  [<ffffffff8154de72>] ? system_call_fast_compare_end+0xc/0x6b
> [34304.047944] Code: 49 8b 0f 48 85 c9 75 e9 eb b3 48 8b 44 24 08 49 8d ac 24 a8 00 00 00 48 89 ef 4c 29 e8 48 83 c0 01 48 89 44 24 18 e8 3a 59 3e e1 <f0> 41 ff 86 34 03 00 00 49 8b 84 24 70 ff ff ff 48 c1 e8 07 83 
> [34304.047959] RIP  [<ffffffffa01667b6>] btrfs_sync_file+0xa6/0x350 [btrfs]
> [34304.047970]  RSP <ffff8806a3ecbe88>
> [34304.047970] CR2: 0000000009618e56
> [34304.047972] ---[ end trace 414199893a542949 ]---
> 
> I was able to create a new fstests test that reproduces my issue,
> and i'm sending it as follow-up to this message.
> 
> Roman Lebedev (1):
>   fstests: generic: Test that fsync works on file in overlayfs merged
>     directory
> 
>  tests/generic/111     | 80 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/generic/111.out |  5 ++++
>  tests/generic/group   |  1 +
>  3 files changed, 86 insertions(+)
>  create mode 100755 tests/generic/111
>  create mode 100644 tests/generic/111.out
> 


-- 
Jeff Mahoney
SUSE Labs



