Date: Fri, 26 Jun 2026 14:43:29 -0400
From: Johannes Weiner
To: Zi Yan
Cc: Andrew Morton, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
    Brendan Jackman, David Hildenbrand, Lorenzo Stoakes,
    "Liam R. Howlett", Mike Rapoport, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/4] mm: page_alloc: fix non-movable reclaim storm in defrag_mode
References: <20260626182215.1107966-1-hannes@cmpxchg.org> <20260626182215.1107966-5-hannes@cmpxchg.org>

On Fri, Jun 26, 2026 at 02:29:24PM -0400, Zi Yan wrote:
> On Fri Jun 26, 2026 at 2:21 PM EDT, Johannes Weiner wrote:
> > As we deployed defrag_mode into Meta production, pressure spikes and
> > excessive swapping were observed on some workloads. Tracing confirmed
> > that this is unmovable/reclaimable requests spinning in the allocator
> > and direct reclaim, causing excessive amounts of swap.
> >
> > The initial plan for defrag_mode was to rely on kswapd/kcompactd to
> > produce blocks, and if those are overwhelmed under high pressure, let
> > the allocator fall back (__rmqueue_steal()) after its retry loops.
> > However, that retrying results in more reclaim on some of these
> > workloads than we'd hoped, sometimes excessively so, spurred on by the
> > !costly order conditions in should_reclaim_retry().
> >
> > The storms are dependent on the request type.
> > Reclaim will inevitably
> > make room in existing movable blocks, since that's where the LRU pages
> > live. So if movable requests retry on reclaim, they make progress.
> >
> > When non-movable requests spin in reclaim that isn't productive. They
> > cannot use the individually freed pages, and the process is unlikely
> > to accidentally free whole blocks to meet the ALLOC_NOFRAGMENT bar.
> > They spin and overreclaim excessively, which tanks performance and
> > triggers userspace guards like swap exhaustion or pressure based OOM.
> >
> > To fix this, send non-movable requests, regardless of order, into
> > pageblock reclaim/compaction. This way, they help move things along to
> > meet the ALLOC_NOFRAGMENT bar. After this patch, the reclaim storms
> > and excess OOM rates are no longer observed in production.
> >
> > The longer-term plan is still to have all requests, including the
> > movable ones, help make blocks to spread the cost of defragmenting
> > more evenly and fairly; combined with proper watermarking to reduce
> > allocation latencies in the common case. However, doing this naively
> > unearths scaling and concurrency limitations in compaction that need
> > to be addressed first. Promoting just non-movables for now is the
> > minimally viable bug fix for the above issue.
> >
> > Fixes: f38356df6474 ("mm: page_alloc: introduce defrag_mode")
>
> Should be
> Fixes: e3aa7df331bc ("mm: page_alloc: defrag_mode").
> Since I cannot find f38356df6474 in the tree.

Oops, indeed. I managed to pull that commit from the old development
branch I still had locally.
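
For readers skimming the thread, the policy described in the quoted
changelog can be sketched roughly as below. This is a standalone
illustration, not the patch itself: every identifier (the sketch_*
enums and sketch_pick_retry_path()) is hypothetical and does not exist
in the kernel, and the non-defrag_mode branch is only a placeholder,
since the changelog does not describe that case.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; not the kernel's definitions. */
enum sketch_migratetype {
	SKETCH_UNMOVABLE,
	SKETCH_MOVABLE,
	SKETCH_RECLAIMABLE,
};

enum sketch_retry_path {
	SKETCH_PAGE_RECLAIM,      /* ordinary reclaim retries: frees individual pages */
	SKETCH_PAGEBLOCK_RECLAIM, /* reclaim/compaction aimed at freeing whole blocks */
};

/*
 * Pick where a failing allocation goes to make progress.
 *
 * Per the changelog: in defrag_mode, non-movable (unmovable or
 * reclaimable) requests cannot use individually freed pages in movable
 * blocks, so they are sent to pageblock reclaim/compaction regardless
 * of order. Movable requests keep retrying on ordinary reclaim, which
 * frees pages in movable blocks they can use.
 */
static enum sketch_retry_path
sketch_pick_retry_path(bool defrag_mode, enum sketch_migratetype mt,
		       unsigned int order)
{
	(void)order; /* deliberately ignored: "regardless of order" */

	if (defrag_mode && mt != SKETCH_MOVABLE)
		return SKETCH_PAGEBLOCK_RECLAIM;

	/* Assumption: placeholder for every case the changelog leaves as-is. */
	return SKETCH_PAGE_RECLAIM;
}
```

Under these assumptions, an order-0 unmovable request in defrag_mode
takes the pageblock path, while a movable request of any order stays on
ordinary reclaim.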