Date: Mon, 4 Feb 2019 10:19:44 +0100
From: Greg Kroah-Hartman
To: Aaron Lu
Cc: stable@vger.kernel.org, Dave Chinner, Al Viro, Chenggang Qin
Subject: Re: [PATCH for v4.9] fs: don't scan the inode cache before SB_BORN is set
Message-ID: <20190204091944.GA18876@kroah.com>
References: <20190128132045.GA128122@h07e11201.sqa.eu95>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190128132045.GA128122@h07e11201.sqa.eu95>
User-Agent: Mutt/1.11.3 (2019-02-01)
Sender: stable-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org

On Mon, Jan 28, 2019 at 09:20:45PM +0800, Aaron Lu wrote:
> One of our servers recently hit a kernel crash and the callstack is:
>
> [6469391.997662] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
> [6469392.005693] IP: [] shmem_unused_huge_count+0x10/0x20
> [6469392.012412] PGD 1000c21067
> [6469392.015203] PUD ffc306067
> [6469392.018089] PMD 0
> [6469392.018627]
> [6469392.020303] Oops: 0000 [#1] SMP
> [6469392.023621] Modules linked in: kpatch_6iljwh9b(OE) memcg_force_swapin(OE) rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache nfsd auth_rpcgss nfs_acl [last unloaded: memcg_force_swapin]
> [6469392.040177] CPU: 2 PID: 89058 Comm: ilogtail Tainted: G OE K 4.9.93-010.ali3000.alios7.x86_64 #1
> [6469392.049996] Hardware name: Inventec K900-1G /B900G2-1G , BIOS A2.32 10/09/2014
> [6469392.060334] task: ffff8802217b1800 task.stack: ffffc9004ea88000
> [6469392.066418] RIP: 0010:[] [] shmem_unused_huge_count+0x10/0x20
> [6469392.075563] RSP: 0018:ffffc9004ea8b6c0 EFLAGS: 00010282
> [6469392.081041] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000001
> [6469392.088339] RDX: 0000000000000001 RSI: ffffc9004ea8b780 RDI: ffff881749bd2000
> [6469392.095635] RBP: ffffc9004ea8b6c0 R08: 28f5c28f5c28f5c3 R09: ffff88173bf3fce0
> [6469392.102934] R10: ffff88207ffd4000 R11: 0000000000000000 R12: ffff881749bd24c0
> [6469392.110233] R13: ffffc9004ea8b780 R14: 0000000000000000 R15: ffff88207ffd4000
> [6469392.117533] FS: 00007fe260420700(0000) GS:ffff88103fa80000(0000) knlGS:0000000000000000
> [6469392.125792] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [6469392.131703] CR2: 0000000000000070 CR3: 00000005bb46d000 CR4: 00000000001606f0
> [6469392.138999] Stack:
> [6469392.141185] ffffc9004ea8b6f0 ffffffff81247bee 0000000000000020 0000000000000400
> [6469392.148811] ffff881749bd24c0 0000000000000000 ffffc9004ea8b7d0 ffffffff811c431c
> [6469392.156436] 0000000000000020 0000000000000000 ffff88207b82c000 0000000000000001
> [6469392.164063] Call Trace:
> [6469392.166692] [] super_cache_count+0x3e/0xe0
> [6469392.172607] [] shrink_slab.part.38+0x11c/0x420
> [6469392.178875] [] shrink_slab+0x29/0x30
> [6469392.184273] [] shrink_node+0xff/0x300
> [6469392.189756] [] do_try_to_free_pages+0x10d/0x330
> [6469392.196104] [] try_to_free_mem_cgroup_pages+0xd5/0x1b0
> [6469392.203063] [] try_charge+0x14d/0x720
> [6469392.208551] [] ? kmem_cache_alloc+0xd3/0x1a0
> [6469392.214642] [] ? mempool_alloc_slab+0x15/0x20
> [6469392.220825] [] mem_cgroup_try_charge+0x6e/0x1b0
> [6469392.227177] [] __add_to_page_cache_locked+0x64/0x220
> [6469392.233961] [] add_to_page_cache_lru+0x4e/0xe0
> [6469392.240242] [] ext4_mpage_readpages+0x151/0x980 [ext4]
> [6469392.247211] [] ext4_readpages+0x35/0x40 [ext4]
> [6469392.253474] [] __do_page_cache_readahead+0x197/0x240
> [6469392.260260] [] ? pagecache_get_page+0x2c/0x2a0
> [6469392.266523] [] filemap_fault+0x4db/0x590
> [6469392.272282] [] ext4_filemap_fault+0x36/0x50 [ext4]
> [6469392.278896] [] __do_fault+0x80/0x170
> [6469392.284292] [] do_fault+0x4c2/0x720
> [6469392.289603] [] ? futex_wait_queue_me+0x9f/0x120
> [6469392.295954] [] handle_mm_fault+0x512/0xc90
> [6469392.301874] [] __do_page_fault+0x24b/0x4d0
> [6469392.307796] [] ? SyS_futex+0x85/0x170
> [6469392.313280] [] do_page_fault+0x30/0x80
> [6469392.318850] [] ? do_syscall_64+0x74/0x180
> [6469392.324679] [] page_fault+0x28/0x30
> [6469392.329986] Code: 00 48 83 43 38 01 4c 89 e7 c6 <48> 8b 40 70 5d c3 66 2e 0f 1f 84
> [6469392.338183] RIP [] shmem_unused_huge_count+0x10/0x20
> [6469392.344990] RSP
> [6469392.348656] CR2: 0000000000000070
>
> Google showed me Dave Chinner's fix and I think it is the right fix for
> our problem (it is not easy to reproduce in our production environment,
> so I haven't been able to confirm).
>
> Unfortunately, this commit has only been backported to the v4.14 and
> v4.16 stable kernels, not the v4.9 stable kernel, presumably due to the
> rename of MS_BORN to SB_BORN starting from v4.14. To make this patch
> work on v4.9, I have made one minor change to Dave's commit: it keeps
> using MS_BORN. I think this is correct, but since I know very little
> about fs code, please kindly review. Thanks a lot for your time.

Backport looks good to me, thanks for the patch, I'll go queue it up now.

greg k-h
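
For readers following along, here is a rough userspace sketch of the idea behind the backport as described above: the count path reports nothing until the superblock is flagged as born, so it never reaches filesystem-private data that is still NULL during mount. Everything prefixed with mock_ or MOCK_ is a hypothetical stand-in made up for illustration, not the kernel's own definitions, and the sketch ignores locking and memory ordering entirely. It is not the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the v4.9 "born" flag (MS_BORN there, SB_BORN later). */
#define MOCK_MS_BORN (1UL << 29)

/* Hypothetical filesystem-private data hanging off the superblock. */
struct mock_sb_info {
	unsigned long shrinklist_len;
};

/* Hypothetical, heavily reduced superblock. */
struct mock_super_block {
	unsigned long s_flags;
	struct mock_sb_info *s_fs_info; /* stays NULL until setup finishes */
};

/* Models a per-filesystem counter that dereferences s_fs_info unconditionally. */
static unsigned long mock_nr_cached_objects(const struct mock_super_block *sb)
{
	return sb->s_fs_info->shrinklist_len;
}

/* Models the guarded count path: skip superblocks that are not born yet. */
unsigned long mock_super_cache_count(const struct mock_super_block *sb)
{
	assert(sb != NULL);

	if (!(sb->s_flags & MOCK_MS_BORN))
		return 0; /* partially constructed: do not touch s_fs_info */

	return mock_nr_cached_objects(sb);
}
```

With the flag clear the function returns 0 even though s_fs_info is NULL. Without the check, the same call would dereference a NULL pointer, which matches the kind of fault shown in the oops above (RAX is 0 and the faulting address is 0x70).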