Date: Sat, 17 Apr 2021 16:33:37 +0800
From: Wang Yugui
To: Yang Shi
Subject: Re: kernel BUG at mm/huge_memory.c:2736(linux 5.10.29)
Cc: Linux MM , wangyugui@e16-tech.com
References: <20210412180659.B9E3.409509F4@e16-tech.com>
Message-Id: <20210417163337.AA58.409509F4@e16-tech.com>

Hi,

> On Mon, Apr 12, 2021 at 3:07 AM Wang Yugui wrote:
> >
> > Hi,
> >
> > kernel BUG at mm/huge_memory.c:2736(linux 5.10.29) is triggered
> > by some files write test.
> >
> > mm/huge_memory.c:
> >         if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) {
> >                 pr_alert("total_mapcount: %u, page_count(): %u\n",
> >                         mapcount, count);
> >                 if (PageTail(page))
> >                         dump_page(head, NULL);
> >                 dump_page(page, "total_mapcount(head) > 0");
> > L2736:          BUG();
> >         }
>
> We just can tell the mapcount of the page is not zero from the current
> log, it might mean the unmap_page() call is failed. It seems you have
> CONFIG_DEBUG_VM enabled, could you please paste more log? There is
> "VM_BUG_ON_PAGE(!unmap_success, page)" in unmap_page(). It should be
> able to tell us if unmap_page() is failed or not, or something else
> happened.

This is the full dmesg output

[63080.331513] huge_memory: total_mapcount: 511, page_count(): 512
[63080.332167] page:00000000d2e1a982 refcount:512 mapcount:0 mapping:0000000000000000 index:0x7fe260582 pfn:0x676a00
[63080.332167] head:00000000d2e1a982 order:9 compound_mapcount:0 compound_pincount:0
[63080.332167] anon flags: 0x17ffffc009001d(locked|uptodate|dirty|lru|head|swapbacked)
[63080.332167] raw: 0017ffffc009001d ffffc93cda0d0008 ffffc93cd9ab0008 ffff8f21be9f0cb9
[63080.332167] raw: 00000007fe260582 0000000000000000 00000200ffffffff ffff8f1021810000
[63080.332167] page->mem_cgroup:ffff8f1021810000
[63080.332167] page:00000000bc78ac24 refcount:512 mapcount:1 mapping:0000000000000000 index:0x7fe260584 pfn:0x676a02
[63080.332167] head:00000000d2e1a982 order:9 compound_mapcount:0 compound_pincount:0
[63080.332167] anon flags: 0x17ffffc009001d(locked|uptodate|dirty|lru|head|swapbacked)
[63080.332167] raw: 0017ffffc0000000 ffffc93cd9da8001 dead000000000000 ffffc93d428d0098
[63080.332167] raw: ffffa002cd183bf0 0000000000000000 0000000000000000 0000000000000000
[63080.332167] head: 0017ffffc009001d ffffc93cda0d0008 ffffc93cd9ab0008 ffff8f21be9f0cb9
[63080.332167] head: 00000007fe260582 0000000000000000 00000200ffffffff ffff8f1021810000
[63080.332167] page dumped because: total_mapcount(head) > 0
[63080.332167] ------------[ cut here ]------------
[63080.332167] kernel BUG at mm/huge_memory.c:2736!
[63080.332167] invalid opcode: 0000 [#1] SMP NOPTI
[63080.332167] CPU: 8 PID: 376 Comm: kswapd0 Tainted: G S 5.10.31-1.el7.x86_64 #1
[63080.332167] Hardware name: Dell Inc. Precision T7610/0NK70N, BIOS A18 09/11/2019
[63080.332167] RIP: 0010:split_huge_page_to_list.cold.86+0x19/0x1b
[63080.332167] Code: 3a bc e8 8f 86 ff ff b8 f4 ff ff ff e9 43 7f 83 ff 31 f6 4c 89 e7 e8 bd dc 7d ff 48 c7 c6 4f f1 3a bc 48 89 ef e8 ae dc 7d ff <0f> 0b 48 8b 34 24 4c 89 e2 48 c7 c7 28 f5 3a bc e8 57 86 ff ff 31
[63080.332167] RSP: 0018:ffffa002cd183b10 EFLAGS: 00010086
[63080.332167] RAX: 0000000000000000 RBX: ffff8f1021810ae0 RCX: 0000000000000027
[63080.332167] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8f2eefa18a88
[63080.332167] RBP: ffffc93cd9da8080 R08: 0000000000000000 R09: c0000000ffffbfff
[63080.332167] R10: 0000000000000001 R11: ffffa002cd1837e8 R12: ffffc93cd9da8000
[63080.332167] R13: 0000000000000000 R14: ffff8f21be9f0cb8 R15: 00000000000001ff
[63080.332167] FS: 0000000000000000(0000) GS:ffff8f2eefa00000(0000) knlGS:0000000000000000
[63080.332167] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[63080.332167] CR2: 00007f8e24fabd20 CR3: 00000007eaa10005 CR4: 00000000001706e0
[63080.332167] Call Trace:
[63080.332167] ? irq_exit_rcu+0x4f/0xe0
[63080.332167] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[63080.332167] deferred_split_scan+0x1ca/0x320
[63080.332167] do_shrink_slab+0x11f/0x250
[63080.332167] shrink_slab+0x20f/0x2c0
[63080.332167] shrink_node+0x24b/0x6d0
[63080.332167] balance_pgdat+0x2db/0x550
[63080.332167] kswapd+0x201/0x390
[63080.332167] ? finish_wait+0x80/0x80
[63080.332167] ? balance_pgdat+0x550/0x550
[63080.332167] kthread+0x116/0x130
[63080.332167] ? kthread_park+0x80/0x80
[63080.332167] ret_from_fork+0x1f/0x30
[63080.332167] Modules linked in: binfmt_misc rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rfkill rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_umad snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation intel_rapl_msr intel_rapl_common snd_soc_core snd_compress snd_pcm_dmaengine soundwire_cadence snd_hda_codec sb_edac x86_pkg_temp_thermal snd_hda_core intel_powerclamp coretemp ac97_bus kvm_intel snd_hwdep snd_seq iTCO_wdt snd_seq_device dcdbas intel_pmc_bxt mei_wdt mei_hdcp iTCO_vendor_support snd_pcm dell_smm_hwmon kvm irqbypass snd_timer rapl mei_me intel_cstate snd i2c_i801 intel_uncore i2c_smbus lpc_ich mei soundcore nvme_rdma nvme_fabrics rdma_cm iw_cm ib_cm rdmavt rdma_rxe nfsd ib_uverbs ip6_udp_tunnel udp_tunnel ib_core auth_rpcgss nfs_acl lockd grace nfs_ssc ip_tables xfs radeon
[63080.332167] i2c_algo_bit ttm drm_kms_helper cec bnx2x crct10dif_pclmul nvme crc32_pclmul drm crc32c_intel mpt3sas ghash_clmulni_intel e1000e pcspkr mdio nvme_core raid_class scsi_transport_sas wmi dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua btrfs xor raid6_pq sunrpc i2c_dev
[63080.332167] ---[ end trace 35ee9d9fcf3c4757 ]---
[63080.332167] RIP: 0010:split_huge_page_to_list.cold.86+0x19/0x1b
[63080.332167] Code: 3a bc e8 8f 86 ff ff b8 f4 ff ff ff e9 43 7f 83 ff 31 f6 4c 89 e7 e8 bd dc 7d ff 48 c7 c6 4f f1 3a bc 48 89 ef e8 ae dc 7d ff <0f> 0b 48 8b 34 24 4c 89 e2 48 c7 c7 28 f5 3a bc e8 57 86 ff ff 31
[63080.332167] RSP: 0018:ffffa002cd183b10 EFLAGS: 00010086
[63080.332167] RAX: 0000000000000000 RBX: ffff8f1021810ae0 RCX: 0000000000000027
[63080.332167] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8f2eefa18a88
[63080.332167] RBP: ffffc93cd9da8080 R08: 0000000000000000 R09: c0000000ffffbfff
[63080.332167] R10: 0000000000000001 R11: ffffa002cd1837e8 R12: ffffc93cd9da8000
[63080.332167] R13: 0000000000000000 R14: ffff8f21be9f0cb8 R15: 00000000000001ff
[63080.332167] FS: 0000000000000000(0000) GS:ffff8f2eefa00000(0000) knlGS:0000000000000000
[63080.332167] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[63080.332167] CR2: 00007f8e24fabd20 CR3: 00000007eaa10005 CR4: 00000000001706e0
[63080.332167] Kernel panic - not syncing: Fatal exception
[63080.332167] Shutting down cpus with NMI
[63080.332167] Kernel Offset: 0x3a000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[63080.332167] ---[ end Kernel panic - not syncing: Fatal exception ]---

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2021/04/17
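
For anyone triaging a similar report, here is a minimal sketch that pulls the counters printed just before the BUG() out of a dmesg excerpt. It is not part of the original report. The function name `summarize_split_failure` and the `SAMPLE` string are hypothetical; the regular expressions assume only the `pr_alert()` format quoted from mm/huge_memory.c and the `order:` field that appears in the page dump above.

```python
import re

# Matches the pr_alert() line quoted from mm/huge_memory.c.
SUMMARY_RE = re.compile(r"total_mapcount:\s*(\d+),\s*page_count\(\):\s*(\d+)")
# Matches the "order:N" field as it appears in the page dump.
ORDER_RE = re.compile(r"\border:(\d+)")


def summarize_split_failure(dmesg_text):
    """Return the counters logged before the BUG(), or None if absent."""
    summary = SUMMARY_RE.search(dmesg_text)
    if summary is None:
        return None
    result = {
        "total_mapcount": int(summary.group(1)),
        "page_count": int(summary.group(2)),
    }
    order = ORDER_RE.search(dmesg_text)
    if order is not None:
        result["order"] = int(order.group(1))
        # A compound page of order N spans 2**N base pages.
        result["nr_subpages"] = 1 << result["order"]
    return result


# Two lines copied from the log in this thread, used as sample input.
SAMPLE = (
    "[63080.331513] huge_memory: total_mapcount: 511, page_count(): 512\n"
    "[63080.332167] head:00000000d2e1a982 order:9 compound_mapcount:0 compound_pincount:0\n"
)

if __name__ == "__main__":
    print(summarize_split_failure(SAMPLE))
```

On the two sample lines this reports total_mapcount 511 against page_count 512 for an order-9 compound page, i.e. 512 subpages.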