From: Ritesh Harjani (IBM)
To: Barry Song
Cc: linux-mm@kvack.org, Madhavan Srinivasan, Michael Ellerman, Nicholas Piggin, Christophe Leroy, Andrew Morton, Chris Li, Kairui Song, Kemeng Shi, Nhat Pham, Baoquan He, Youngjun Park, David Hildenbrand, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Sayali Patil
Subject: Re: [PATCH v4 2/3] mm, swap: allow archs to override SWAP_NR_ORDERS via ARCH_MAX_PMD_ORDER
Date: Tue, 23 Jun 2026 12:07:01 +0530
Message-ID: <33ydyd82.ritesh.list@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Barry Song writes:

> On Fri, Jun 19, 2026 at 12:41 PM Ritesh Harjani (IBM)
> wrote:
>>
>> SWAP_NR_ORDERS sizes a few small bounded arrays inside THP swap
>> allocator code (nofull/frag cluster lists, percpu_swap_cluster's
>> si/offset arrays, next array for rotational device). This currently
>> expands to PMD_ORDER+1, which only works when PMD_ORDER is a compile
>> time constant.
>>
>> However on architecture like PowerPC Book3S64, PMD_ORDER is a runtime
>> variable which depends upon which MMU is selected (Radix / Hash), so in
>> that case, PMD_ORDER cannot be used to size the static arrays.
>>
>> This patch provides an optional ARCH_MAX_PMD_ORDER (upper-bound)
>> override for such architectures. The memory overhead on enabling this
>> override is negligible. Even if we make SWAP_NR_ORDERS runtime alloc,
>> default slab padding could cause some memory waste. Also we lose the
>> per-cpu cacheline benefits (for percpu_swap_cluster) because it might
>> cost an extra cacheline indirection overhead in swap_alloc_fast() for
>> fetching si[order]/offset[order]. Note that a fully runtime
>> SWAP_NR_ORDERS was considered in previous version but was dropped for
>> this reason [1]
>
> Do we know the maximum PMD size?

ARCH_MAX_PMD_ORDER will be 8 on PowerPC book3s64 with 64K pagesize.
PowerPC Hash MMU with 64K default pagesize supports PMD size of 16MB.

> On arm64 with a 64 KB base page,
> a PMD can be as large as 512 MB:
> https://docs.kernel.org/arch/arm64/hugetlbpage.html
>
> One concern we have is that performing I/O on such a large folio could
> incur significant latency before reclaiming any memory. For this
> reason, on arm64 we initially enabled THP_SWAPOUT only for 4 KB base
> pages:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0637c505f
>

That's not the case on PowerPC. Max PMD size for Hash will be 16MB.
Also we still need this patch since we can at runtime choose Hash or
Radix MMU. So, the main problem this patch is trying to solve on PowerPC
Book3s64 is enabling this feature w/o impacting any other
architecture. W/O this patch series, we can't enable it, since it gives
build errors.

>>
>> [1]: https://lore.kernel.org/linuxppc-dev/pl1zdksc.ritesh.list@gmail.com/
>>
>
> Best Regards
> Barry

Thanks for the review!

-ritesh
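
For anyone checking the numbers in this thread, here is a minimal C
sketch of the order arithmetic and of sizing a static array from a
compile-time upper bound. It is only an illustration: pmd_order_for(),
example_order_fits(), struct example_percpu_cluster and every EXAMPLE_*
name are hypothetical and are not taken from the patch. The only values
assumed are the ones quoted above (64K base pages, 16MB Hash PMD on
PowerPC, 512MB PMD on arm64).

```c
#include <assert.h>

/* Hypothetical stand-in for the arch-provided upper bound discussed above. */
#define EXAMPLE_ARCH_MAX_PMD_ORDER 8

/* One slot per order 0..max, mirroring the "PMD_ORDER+1" sizing rule. */
#define EXAMPLE_SWAP_NR_ORDERS (EXAMPLE_ARCH_MAX_PMD_ORDER + 1)

/* A static array sized by the bound; this needs a compile-time constant. */
struct example_percpu_cluster {
	unsigned long offset[EXAMPLE_SWAP_NR_ORDERS];
};

static_assert(EXAMPLE_SWAP_NR_ORDERS == 9, "orders 0..8 need 9 slots");

/*
 * Order of a PMD-sized folio: log2(pmd_bytes / page_bytes).
 * Assumes pmd_bytes is a power of two and at least one page.
 */
static inline unsigned int pmd_order_for(unsigned long long pmd_bytes,
					 unsigned int page_shift)
{
	unsigned long long pages = pmd_bytes >> page_shift;
	unsigned int order = 0;

	while (pages > 1) {
		pages >>= 1;
		order++;
	}
	return order;
}

/* A runtime PMD order is safe only if it indexes within the static array. */
static inline int example_order_fits(unsigned int runtime_pmd_order)
{
	return runtime_pmd_order < EXAMPLE_SWAP_NR_ORDERS;
}
```

With a page shift of 16 (64K pages), pmd_order_for(16ULL << 20, 16)
gives 8, which matches the ARCH_MAX_PMD_ORDER value mentioned above,
while a 512MB PMD works out to order 13. A runtime MMU choice with a
smaller PMD order simply leaves the upper array slots unused.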