All of lore.kernel.org
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Philipp Hahn <hahn@univention.de>
Cc: linux-fsdevel@vger.kernel.org,
	"linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Felix Botner <botner@univention.de>,
	Dirk Wiesenthal <wiesenthal@univention.de>
Subject: Re: OOPS: linux-4.9.33/fs/pnode.c:propagate_one()
Date: Thu, 28 Sep 2017 12:00:02 -0500	[thread overview]
Message-ID: <8760c2u5bh.fsf@xmission.com> (raw)
In-Reply-To: <efbf1b3f-57cb-8ee8-e135-418a1706dbbd@univention.de> (Philipp Hahn's message of "Thu, 28 Sep 2017 15:03:37 +0200")

Philipp Hahn <hahn@univention.de> writes:

> Hello Eric,
>
> we have observed the following OOPS with linux-4.9.33 on several
> occasions when starting a Docker container:
>
>> [  531.801537] Oops: 0000 [#1] SMP
>> [  531.801565] Modules linked in: xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo xt_addrtype br_netfilter bridge stp llc overlay ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_conntrack iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc joydev pcspkr serio_raw virtio_balloon i2c_piix4 evdev quota_v2 quota_tree parport_pc ppdev lp parport autofs4 ext4 crc16 jbd2 fscrypto mbcache dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_mod hid_generic usbhid hid ata_generic cirrus virtio_net virtio_blk ttm ata_piix libata uhci_hcd ehci_hcd psmouseSep 28 13:30:33 master70 kernel: [  531.802171]  drm_kms_helper scsi_mod floppy usbcore virtio_pci button drm virtio_ring virtio usb_common
>> [  531.802247] CPU: 0 PID: 1668 Comm: dockerd Not tainted 4.9.0-ucs104-amd64 #1 Debian 4.9.30-2A~4.2.0.201706171152
>> [  531.802310] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
>> [  531.802347] task: ffff918af7ccaf40 task.stack: ffffb47a40b54000
>> [  531.802385] RIP: 0010:[<ffffffff91234020>]  [<ffffffff91234020>] propagate_one+0x160/0x1e0
>> [  531.802444] RSP: 0018:ffffb47a40b57e08  EFLAGS: 00010282
>> [  531.802479] RAX: 0000000000000001 RBX: ffff918adf43cd80 RCX: 0000000000000000
>> [  531.802524] RDX: ffff918ad69209c0 RSI: ffff918adf43cd80 RDI: ffff918ad69209c0
>> [  531.802569] RBP: ffff918afb3a2480 R08: 0000000000000001 R09: ffff918ac0183e40
>> [  531.802614] R10: 00000000000011b8 R11: 00000000000176b9 R12: ffffb47a40b57e58
>> [  531.802670] R13: 0000000000000000 R14: ffff918adf43cd80 R15: ffff918afbe83600
>> [  531.802716] FS:  00007f6157fff700(0000) GS:ffff918affc00000(0000) knlGS:0000000000000000
>> [  531.802767] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  531.802804] CR2: 0000000000000010 CR3: 0000000039e60000 CR4: 00000000000006f0
>> [  531.802852] Stack:
>> [  531.802868]  ffff918adf43cd80 ffff918afb3a2480 ffffffff912344af ffff918ac0183e40
>> [  531.802923]  ffff918afbe83600 ffff918afb3a2480 0000000000000000 0000000000000000
>> [  531.802978]  ffffffff91226971 ffff918afe084900 ffff918ac01c4c40 3aa437d5d9ae5224
>> [  531.803037] Call Trace:
>> [  531.803062]  [<ffffffff912344af>] ? propagate_mnt+0x11f/0x150
>> [  531.803104]  [<ffffffff91226971>] ? attach_recursive_mnt+0x1d1/0x2f0
>> [  531.803146]  [<ffffffff912279cd>] ? do_mount+0xafd/0xc70
>> [  531.803182]  [<ffffffff91227e24>] ? SyS_mount+0x84/0xc0
>> [  531.803229]  [<ffffffff91611abb>] ? system_call_fast_compare_end+0xc/0x9b
>> [  531.803273] Code: 48 89 de eb 09 f6 42 33 04 75 7d 48 89 d6 48 8b 96 d8 00 00 00 48 39 fa 75 eb 48 8b 0d 52 d6 cc 00 4c 8b 0d 53 d6 cc 00 49 39 c9 <48> 8b 51 10 74 5c 48 3b ba d8 00 00 00 75 3f 45 84 c0 75 55 8b
>> [  531.805256] RIP  [<ffffffff91234020>] propagate_one+0x160/0x1e0
>> [  531.807057]  RSP <ffffb47a40b57e08>
>> [  531.808839] CR2: 0000000000000010
>> [  531.810618] ---[ end trace f817db65f78e36fc ]---
>
> The same has happened in Amazon-EC2 and on out own KVM servers.
>
> There was a similar bug report in 2016 at
> <https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1572316>, but that
> one got fixed with "v4.6-rc7~24^2".
>
> Maybe is has already been fixed by
>> $ git l1 v4.9.33..v4.9.52 -- fs/pnode.c
>> 54fcb2303ef4 mnt: Make propagate_umount less slow for overlapping mount propagation trees
>> bb4fbf094b44 mnt: In propgate_umount handle visiting mounts in any order
>> e260db757676 mnt: In umount propagation reparent in a separate pass
>
> but I'd like to ask anyway if you have seen such a OOPs before?

I have not.

> For your convenience here's the disassembly, where I think "rcx =
> last_source->mnt_parent = NULL":

If your disassembly is lined up properly with the code my guess would
be that rcx simply holds the value of last_source.  And rdx
is being set to last_source->mnt_parent.

Disassembly a recent build on my machine I have a one byte difference
from yours, but that is what I see happening. rcx == last_source.

Why and how last_source is getting set to NULL is something I need to
read the code for a while to figure out.

>> 		do {
>> 			struct mount *parent = last_source->mnt_parent;
>> 			if (last_source == first_source)
>> 000000000000029d <propagate_one+0x15d> cmp    %rcx,%r9
>> 			p = n->mnt_master;
>> 			if (p == dest_master || IS_MNT_MARKED(p))
>> 				break;
>> 		}
>> 		do {
>> 			struct mount *parent = last_source->mnt_parent;
>> 00000000000002a0 <propagate_one+0x160> mov    0x10(%rcx),%rdx
>> 			if (last_source == first_source)
>> 00000000000002a4 <propagate_one+0x164> je     0000000000000302 <propagate_one+0x1c2>
>> 				break;
>> 			done = parent->mnt_master == p;
>> 			if (done && peers(n, parent))
>> 00000000000002a6 <propagate_one+0x166> cmp    0xd8(%rdx),%rdi
>> 00000000000002ad <propagate_one+0x16d> jne    00000000000002ee <propagate_one+0x1ae>
>> 00000000000002af <propagate_one+0x16f> test   %r8b,%r8b
>> 00000000000002b2 <propagate_one+0x172> jne    0000000000000309 <propagate_one+0x1c9>
>> 00000000000002b4 <propagate_one+0x174> mov    0x110(%rsi),%eax
>
> We will try 4.9.52 next and see, if we can still reproduce it.

I suspect you will be able to as nothing as nothing substantial has
changed in that logic in a while.

I will stare at it and see if I can imagine how you are getting
last_source set to NULL.

Eric

  reply	other threads:[~2017-09-28 17:00 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-28 13:03 OOPS: linux-4.9.33/fs/pnode.c:propagate_one() Philipp Hahn
2017-09-28 17:00 ` Eric W. Biederman [this message]
2017-10-03 16:48   ` Eric W. Biederman
2017-10-04 13:18     ` Philipp Hahn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8760c2u5bh.fsf@xmission.com \
    --to=ebiederm@xmission.com \
    --cc=botner@univention.de \
    --cc=hahn@univention.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=wiesenthal@univention.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.