From: "Ritesh Harjani (IBM)"
To: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
    Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
    "Ritesh Harjani (IBM)"
Subject: [PATCH v2 01/10] powerpc/pgtable-frag: Fix bad page state in pte_frag_destroy
Date: Mon, 9 Mar 2026 23:44:24 +0530
X-Mailer: git-send-email 2.50.1
X-Mailing-List: linuxppc-dev@lists.ozlabs.org

powerpc uses pt_frag_refcount as a reference counter for tracking its
PTE and PMD page table fragments. For PTE tables, in the case of Hash
with 64K page size, we have 16 fragments of 4K size in one 64K page.

Patch series [1] "mm: free retracted page table by RCU" added
pte_free_defer() to defer the freeing of PTE tables when
retract_page_tables() is called for madvise MADV_COLLAPSE on a shmem
range.

[1]: https://lore.kernel.org/all/7cd843a9-aa80-14f-5eb2-33427363c20@google.com/

pte_free_defer() sets the active flag on the corresponding fragment's
folio and calls pte_fragment_free(), which decrements pt_frag_refcount.
When pt_frag_refcount reaches 0 (no active fragment using the folio),
it checks whether the folio active flag is set. If the flag is set, it
calls call_rcu() to free the folio; if the flag is unset, it calls
pte_free_now().

Now, this can lead to the following problem in a corner case...

[ 265.351553][ T183] BUG: Bad page state in process a.out pfn:20d62
[ 265.353555][ T183] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x20d62
[ 265.355457][ T183] flags: 0x3ffff800000100(active|node=0|zone=0|lastcpupid=0x7ffff)
[ 265.358719][ T183] raw: 003ffff800000100 0000000000000000 5deadbeef0000122 0000000000000000
[ 265.360177][ T183] raw: 0000000000000000 c0000000119caf58 00000000ffffffff 0000000000000000
[ 265.361438][ T183] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 265.362572][ T183] Modules linked in:
[ 265.364622][ T183] CPU: 0 UID: 0 PID: 183 Comm: a.out Not tainted 6.18.0-rc3-00141-g1ddeaaace7ff-dirty #53 VOLUNTARY
[ 265.364785][ T183] Hardware name: IBM pSeries (emulated by qemu) POWER10 (architected) 0x801200 0xf000006 of:SLOF,git-ee03ae pSeries
[ 265.364908][ T183] Call Trace:
[ 265.364955][ T183] [c000000011e6f7c0] [c000000001cfaa18] dump_stack_lvl+0x130/0x148 (unreliable)
[ 265.365202][ T183] [c000000011e6f7f0] [c000000000794758] bad_page+0xb4/0x1c8
[ 265.365384][ T183] [c000000011e6f890] [c00000000079c020] __free_frozen_pages+0x838/0xd08
[ 265.365554][ T183] [c000000011e6f980] [c0000000000a70ac] pte_frag_destroy+0x298/0x310
[ 265.365729][ T183] [c000000011e6fa30] [c0000000000aa764] arch_exit_mmap+0x34/0x218
[ 265.365912][ T183] [c000000011e6fa80] [c000000000751698] exit_mmap+0xb8/0x820
[ 265.366080][ T183] [c000000011e6fc30] [c0000000001b1258] __mmput+0x98/0x300
[ 265.366244][ T183] [c000000011e6fc80] [c0000000001c81f8] do_exit+0x470/0x1508
[ 265.366421][ T183] [c000000011e6fd70] [c0000000001c95e4] do_group_exit+0x88/0x148
[ 265.366602][ T183] [c000000011e6fdc0] [c0000000001c96ec] pid_child_should_wake+0x0/0x178
[ 265.366780][ T183] [c000000011e6fdf0] [c00000000003a270] system_call_exception+0x1b0/0x4e0
[ 265.366958][ T183] [c000000011e6fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec

The bad page state error occurs when such a folio gets freed (with the
active flag set) from the do_exit() path in parallel.

... this can happen when PTE fragments were allocated from this folio
and all the allocated fragments got freed, but pt_frag_refcount still
accounts for some unused fragments. Now, if this process exits with
such a folio as its cached pte_frag in mm->context, then during
pte_frag_destroy() we simply call pagetable_dtor() and
pagetable_free(), meaning the active flag is never cleared. This can
lead to the above bug.

Since we are in the do_exit() path anyway, if the refcount is 0, I
guess it should be OK to simply clear the folio active flag before
calling pagetable_dtor() and pagetable_free().

Fixes: 32cc0b7c9d50 ("powerpc: add pte_free_defer() for pgtables sharing page")
Reviewed-by: Christophe Leroy (CS GROUP)
Signed-off-by: Ritesh Harjani (IBM)
---
 arch/powerpc/mm/pgtable-frag.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
index 77e55eac16e4..ae742564a3d5 100644
--- a/arch/powerpc/mm/pgtable-frag.c
+++ b/arch/powerpc/mm/pgtable-frag.c
@@ -25,6 +25,7 @@ void pte_frag_destroy(void *pte_frag)
 	count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT;
 	/* We allow PTE_FRAG_NR fragments from a PTE page */
 	if (atomic_sub_and_test(PTE_FRAG_NR - count, &ptdesc->pt_frag_refcount)) {
+		folio_clear_active(ptdesc_folio(ptdesc));
 		pagetable_dtor(ptdesc);
 		pagetable_free(ptdesc);
 	}
-- 
2.50.1 (Apple Git-155)
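
For reference, the corner case described in the commit message can be
walked through with a small userspace model. This is only a sketch, not
kernel code: struct model_page and every model_* function are
hypothetical stand-ins for the ptdesc/folio state, the RCU deferral is
left out, and the model assumes a fresh page starts with a refcount of
PTE_FRAG_NR (16, per the 64K/4K split above) and that exactly one
fragment was handed out before exit.

```c
#include <assert.h>
#include <stdbool.h>

#define PTE_FRAG_NR 16 /* 64K page / 4K fragments, as in the commit message */

/* Hypothetical stand-in for the ptdesc/folio state tracked by the kernel. */
struct model_page {
	int frag_refcount;   /* models ptdesc->pt_frag_refcount */
	bool active;         /* models the folio active flag */
	bool bad_page_state; /* set when the page is freed with 'active' set */
};

/* Models the free-time check behind "PAGE_FLAGS_CHECK_AT_FREE flag(s) set". */
static void model_free_page(struct model_page *pg)
{
	if (pg->active)
		pg->bad_page_state = true;
}

/* Models pte_free_defer(): mark the folio active, then drop one reference. */
static void model_pte_free_defer(struct model_page *pg)
{
	pg->active = true;
	if (--pg->frag_refcount == 0) {
		/* Last reference: the flag is consumed before the page is freed. */
		pg->active = false;
		model_free_page(pg);
	}
}

/* Models pte_frag_destroy(): drop the references of fragments never handed out. */
static void model_pte_frag_destroy(struct model_page *pg, int handed_out,
				   bool clear_active)
{
	pg->frag_refcount -= PTE_FRAG_NR - handed_out;
	if (pg->frag_refcount == 0) {
		if (clear_active)
			pg->active = false; /* what the one-line fix adds */
		model_free_page(pg);
	}
}

/* Returns true if process exit would free the page with 'active' still set. */
bool model_exit_hits_bad_page(bool clear_active)
{
	struct model_page pg = { .frag_refcount = PTE_FRAG_NR };
	int handed_out = 1; /* one fragment allocated, 15 still unused */

	model_pte_free_defer(&pg);  /* refcount 16 -> 15, flag stays set */
	model_pte_frag_destroy(&pg, handed_out, clear_active); /* 15 -> 0 */
	return pg.bad_page_state;
}
```

In this model, model_exit_hits_bad_page(false) reports the bad page
state, because the deferred free never sees the refcount reach 0 and so
leaves the flag set. model_exit_hits_bad_page(true) does not, which
corresponds to clearing the flag in pte_frag_destroy() before the page
is freed.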