From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Philipp Hahn
Cc: linux-fsdevel@vger.kernel.org, "linux-kernel\@vger.kernel.org", Felix Botner, Dirk Wiesenthal
Date: Thu, 28 Sep 2017 12:00:02 -0500
In-Reply-To: (Philipp Hahn's message of "Thu, 28 Sep 2017 15:03:37 +0200")
Message-ID: <8760c2u5bh.fsf@xmission.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: OOPS: linux-4.9.33/fs/pnode.c:propagate_one()
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Philipp Hahn writes:

> Hello Eric,
>
> we have observed the following OOPS with linux-4.9.33 on several
> occasions when starting a Docker container:
>
>> [ 531.801537] Oops: 0000 [#1] SMP
>> [ 531.801565] Modules linked in: xt_nat veth ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo xt_addrtype br_netfilter bridge stp llc overlay ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_filter ip6_tables xt_conntrack iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
iptable_filter ip_tables x_tables rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc joydev pcspkr serio_raw virtio_balloon i2c_piix4 evdev quota_v2 quota_tree parport_pc ppdev lp parport autofs4 ext4 crc16 jbd2 fscrypto mbcache dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_mod hid_generic usbhid hid ata_generic cirrus virtio_net virtio_blk ttm ata_piix libata uhci_hcd ehci_hcd psmouseSep 28 13:30:33 master70 kernel: [ 531.802171] drm_kms_helper scsi_mod floppy usbcore virtio_pci button drm virtio_ring virtio usb_common
>> [ 531.802247] CPU: 0 PID: 1668 Comm: dockerd Not tainted 4.9.0-ucs104-amd64 #1 Debian 4.9.30-2A~4.2.0.201706171152
>> [ 531.802310] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
>> [ 531.802347] task: ffff918af7ccaf40 task.stack: ffffb47a40b54000
>> [ 531.802385] RIP: 0010:[] [] propagate_one+0x160/0x1e0
>> [ 531.802444] RSP: 0018:ffffb47a40b57e08 EFLAGS: 00010282
>> [ 531.802479] RAX: 0000000000000001 RBX: ffff918adf43cd80 RCX: 0000000000000000
>> [ 531.802524] RDX: ffff918ad69209c0 RSI: ffff918adf43cd80 RDI: ffff918ad69209c0
>> [ 531.802569] RBP: ffff918afb3a2480 R08: 0000000000000001 R09: ffff918ac0183e40
>> [ 531.802614] R10: 00000000000011b8 R11: 00000000000176b9 R12: ffffb47a40b57e58
>> [ 531.802670] R13: 0000000000000000 R14: ffff918adf43cd80 R15: ffff918afbe83600
>> [ 531.802716] FS: 00007f6157fff700(0000) GS:ffff918affc00000(0000) knlGS:0000000000000000
>> [ 531.802767] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 531.802804] CR2: 0000000000000010 CR3: 0000000039e60000 CR4: 00000000000006f0
>> [ 531.802852] Stack:
>> [ 531.802868] ffff918adf43cd80 ffff918afb3a2480 ffffffff912344af ffff918ac0183e40
>> [ 531.802923] ffff918afbe83600 ffff918afb3a2480 0000000000000000 0000000000000000
>> [ 531.802978] ffffffff91226971 ffff918afe084900 ffff918ac01c4c40 3aa437d5d9ae5224
>> [ 531.803037] Call Trace:
>> [ 531.803062] [] ? propagate_mnt+0x11f/0x150
>> [ 531.803104] [] ? attach_recursive_mnt+0x1d1/0x2f0
>> [ 531.803146] [] ? do_mount+0xafd/0xc70
>> [ 531.803182] [] ? SyS_mount+0x84/0xc0
>> [ 531.803229] [] ? system_call_fast_compare_end+0xc/0x9b
>> [ 531.803273] Code: 48 89 de eb 09 f6 42 33 04 75 7d 48 89 d6 48 8b 96 d8 00 00 00 48 39 fa 75 eb 48 8b 0d 52 d6 cc 00 4c 8b 0d 53 d6 cc 00 49 39 c9 <48> 8b 51 10 74 5c 48 3b ba d8 00 00 00 75 3f 45 84 c0 75 55 8b
>> [ 531.805256] RIP [] propagate_one+0x160/0x1e0
>> [ 531.807057] RSP
>> [ 531.808839] CR2: 0000000000000010
>> [ 531.810618] ---[ end trace f817db65f78e36fc ]---
>
> The same has happened in Amazon-EC2 and on our own KVM servers.
>
> There was a similar bug report in 2016 at
> , but that
> one got fixed with "v4.6-rc7~24^2".
>
> Maybe it has already been fixed by
>> $ git l1 v4.9.33..v4.9.52 -- fs/pnode.c
>> 54fcb2303ef4 mnt: Make propagate_umount less slow for overlapping mount propagation trees
>> bb4fbf094b44 mnt: In propgate_umount handle visiting mounts in any order
>> e260db757676 mnt: In umount propagation reparent in a separate pass
>
> but I'd like to ask anyway if you have seen such an OOPS before?

I have not.

> For your convenience here's the disassembly, where I think "rcx =
> last_source->mnt_parent = NULL":

If your disassembly is lined up properly with the code, my guess would
be that rcx simply holds the value of last_source, and rdx is being set
to last_source->mnt_parent.

Disassembling a recent build on my machine, I have a one-byte difference
from yours, but that is what I see happening: rcx == last_source.

Why and how last_source is getting set to NULL is something I need to
read the code for a while to figure out.
>> do {
>> struct mount *parent = last_source->mnt_parent;
>> if (last_source == first_source)
>> 000000000000029d cmp %rcx,%r9
>> p = n->mnt_master;
>> if (p == dest_master || IS_MNT_MARKED(p))
>> break;
>> }
>> do {
>> struct mount *parent = last_source->mnt_parent;
>> 00000000000002a0 mov 0x10(%rcx),%rdx
>> if (last_source == first_source)
>> 00000000000002a4 je 0000000000000302
>> break;
>> done = parent->mnt_master == p;
>> if (done && peers(n, parent))
>> 00000000000002a6 cmp 0xd8(%rdx),%rdi
>> 00000000000002ad jne 00000000000002ee
>> 00000000000002af test %r8b,%r8b
>> 00000000000002b2 jne 0000000000000309
>> 00000000000002b4 mov 0x110(%rsi),%eax
>
> We will try 4.9.52 next and see if we can still reproduce it.

I suspect you will be able to, as nothing substantial has changed in
that logic in a while.

I will stare at it and see if I can imagine how you are getting
last_source set to NULL.

Eric