From: Yang Shi
Date: Thu, 28 Apr 2022 10:18:57 -0700
Subject: Re: [PATCH 2/2 RESEND] mm/huge_memory: do not overkill when splitting huge_zero_page
To: David Hildenbrand
Cc: Xu Yu, Linux MM, Andrew Morton, HORIGUCHI NAOYA(堀口 直也)
References: <3af1e2b9-1221-5325-cc30-052b6294b5ce@redhat.com>
In-Reply-To: <3af1e2b9-1221-5325-cc30-052b6294b5ce@redhat.com>
Sender: owner-linux-mm@kvack.org

On Thu, Apr 28, 2022 at 9:04 AM David Hildenbrand wrote:
>
> On 27.04.22 11:44, Xu Yu wrote:
> > Kernel panic when injecting memory_failure for the global
> > huge_zero_page, when CONFIG_DEBUG_VM is enabled, as follows.
> >
> > Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
> > page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
> > head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
> > flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
> > raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
> > raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
> > page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
> > ------------[ cut here ]------------
> > kernel BUG at mm/huge_memory.c:2499!
> > invalid opcode: 0000 [#1] PREEMPT SMP PTI
> > CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
> > Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
> > RIP: 0010:split_huge_page_to_list+0x66a/0x880
> > Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
> > RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
> > RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
> > RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
> > R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
> > R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
> > FS: 00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > try_to_split_thp_page+0x3a/0x130
> > memory_failure+0x128/0x800
> > madvise_inject_error.cold+0x8b/0xa1
> > __x64_sys_madvise+0x54/0x60
> > do_syscall_64+0x35/0x80
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7fc3754f8bf9
> > Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
> > RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
> > RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
> > RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
> > R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
> > R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000
> >
> > We think that raising BUG is overkilling for splitting huge_zero_page,
> > the huge_zero_page can't be met from normal paths other than memory
> > failure, but memory failure is a valid caller. So we tend to replace the
> > BUG to WARN + returning -EBUSY, and thus the panic above won't happen
> > again.
> >
> > Suggested-by: Yang Shi
> > Cc: Naoya Horiguchi
> > Reported-by: kernel test robot
> > Signed-off-by: Xu Yu
> > ---
> >  mm/huge_memory.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index c468fee595ff..910a138e9859 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2495,11 +2495,16 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> >         struct address_space *mapping = NULL;
> >         int extra_pins, ret;
> >         pgoff_t end;
> > +       bool is_hzp;
> >
> > -       VM_BUG_ON_PAGE(is_huge_zero_page(head), head);
> >         VM_BUG_ON_PAGE(!PageLocked(head), head);
> >         VM_BUG_ON_PAGE(!PageCompound(head), head);
> >
> > +       is_hzp = is_huge_zero_page(head);
> > +       VM_WARN_ON_ONCE_PAGE(is_hzp, head);
>
> If this code is valid to be reached, VM_WARN_ON_ONCE_PAGE is most
> probably the wrong choice.

It is only valid to reach this from the memory failure path; any other
path is invalid. The warning is mainly there to catch the invalid
cases. A memory failure on the huge zero page should be rare in real
life.

>
> IIUC, after patch #1 (revert) we can reach this again?
>
> --
> Thanks,
>
> David / dhildenb
>
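
For illustration only, here is a rough user-space sketch of the pattern
discussed above: warn once and return -EBUSY instead of crashing when a
split is requested on the huge zero page. It is not the kernel code.
struct mock_page, mock_warn_on_once() and mock_split_huge_page() are
hypothetical names made up for this sketch, and the one-shot warning only
loosely mimics the idea behind VM_WARN_ON_ONCE_PAGE.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page; NOT the kernel definition. */
struct mock_page {
	bool is_huge_zero;
};

/* Counts how many times the one-shot warning actually printed. */
static int warn_count;

/*
 * Hypothetical one-shot warning helper. It prints at most once for the
 * lifetime of the program and always returns the condition, so the
 * caller can still branch on it after the warning has been used up.
 */
static bool mock_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Hypothetical split entry point. A huge zero page is refused with
 * -EBUSY (plus a one-time warning) instead of aborting the program,
 * so a caller such as an error-injection path can handle the failure.
 */
static int mock_split_huge_page(struct mock_page *head)
{
	assert(head != NULL);

	if (mock_warn_on_once(head->is_huge_zero,
			      "split requested on huge zero page"))
		return -EBUSY;

	/* A real implementation would split the page here. */
	return 0;
}
```

In this sketch the caller sees -EBUSY on every attempt against the huge
zero page, while the warning itself is printed only for the first one.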