Date: Wed, 1 Jul 2026 17:31:11 -0700
From: Shakeel Butt
To: Johannes Weiner
Cc: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
 Brendan Jackman, Zi Yan, David Hildenbrand, Lorenzo Stoakes,
 "Liam R. Howlett", Mike Rapoport, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/4] mm: fix reclaim storms in defrag_mode
References: <20260626182215.1107966-1-hannes@cmpxchg.org>
In-Reply-To: <20260626182215.1107966-1-hannes@cmpxchg.org>

On Fri, Jun 26, 2026 at 02:21:16PM -0400, Johannes Weiner wrote:
> As we deployed vm.defrag_mode=1 into Meta production, some workloads
> regressed with recurring pressure spikes and swap storms (which in turn
> triggered userspace OOM rules on pressure and swap utilization levels).
>
> Tracing pinned this to non-movable

allocation?

> requests spinning and reclaiming
> unproductively when kswapd/kcompactd are overwhelmed. Direct reclaim
> predominantly frees up pages in movable blocks, but those requests
> cannot use that space under defrag_mode rules;

Do we have these rules documented somewhere?

> and it is unlikely to
> free up whole blocks incidentally for __rmqueue_claim() to work.
>
> This series fixes it by making non-movable requests participate in
> pageblock production in the allocator slowpath.

Sorry, after reading the above sentence I didn't get what those allocators
will do differently after the series (I still have to go through the
series).

>
> That requires some small-ish adjustments up front in the allocator and
> the compaction code: three prep patches and the fix last.
>
> The series has been in production against one of the affected workloads
> for two weeks and restores the OOM kill rate to !defrag_mode baseline.
>
> Based on mm-new (2026-06-22).
>
>  include/linux/compaction.h |  3 +-
>  mm/compaction.c            | 68 ++++++++++++++++++++++++--------------------
>  mm/internal.h              |  7 +++++
>  mm/page_alloc.c            | 59 ++++++++++++++++++++++++++++++------
>  4 files changed, 98 insertions(+), 39 deletions(-)
>