Date: Tue, 24 Oct 2023 12:09:04 -0400
From: Johannes Weiner <hannes@cmpxchg.org>
To: Nhat Pham <nphamcs@gmail.com>
Cc: akpm@linux-foundation.org, cerasuolodomenico@gmail.com,
	yosryahmed@google.com, sjenning@redhat.com, ddstreet@ieee.org,
	vitaly.wool@konsulko.com, mhocko@kernel.org,
	roman.gushchin@linux.dev, shakeelb@google.com,
	muchun.song@linux.dev, linux-mm@kvack.org, kernel-team@meta.com,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [PATCH] zswap: export more zswap store failure stats
Message-ID: <20231024160904.GA1971738@cmpxchg.org>
References: <20231024000702.1387130-1-nphamcs@gmail.com>
In-Reply-To: <20231024000702.1387130-1-nphamcs@gmail.com>

On Mon, Oct 23, 2023 at 05:07:02PM -0700, Nhat Pham wrote:
> Since:
>
> "42c06a0e8ebe mm: kill frontswap"
>
> we no longer have a counter that tracks the number of zswap store
> failures. This makes it hard to investigate and monitor for zswap
> issues.
>
> This patch adds a global and a per-cgroup zswap store failure
> counter, as well as a dedicated debugfs counter for compression
> algorithm failures (which can happen e.g. when random data are
> passed to zswap).
>
> Signed-off-by: Nhat Pham <nphamcs@gmail.com>

I agree this is an issue.

> ---
>  include/linux/vm_event_item.h |  1 +
>  mm/memcontrol.c               |  1 +
>  mm/vmstat.c                   |  1 +
>  mm/zswap.c                    | 18 ++++++++++++++----
>  4 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> index 8abfa1240040..7b2b117b193d 100644
> --- a/include/linux/vm_event_item.h
> +++ b/include/linux/vm_event_item.h
> @@ -145,6 +145,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
>  #ifdef CONFIG_ZSWAP
>  	ZSWPIN,
>  	ZSWPOUT,
> +	ZSWPOUT_FAIL,

Would the writeback stat be sufficient to determine this? Hear me out.

We already have pswpout that shows when we're hitting disk swap. Right
now we can't tell if this is because of a rejection or because of
writeback. With a writeback counter we could.

And I think we want the writeback counter anyway going forward, in
order to monitor and understand the dynamic shrinker's performance.

Either way we go, one of the metrics needs to be derived from the
other(s). But I think subtle and not so subtle shrinker issues are
more concerning than outright configuration problems where zswap
doesn't work at all. The latter is easier to catch before or during
early deployment with simple functionality tests.
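To make the derivation concrete: assuming the writeback counter ends
up in /proc/vmstat next to pswpout as, say, "zswpwb" (name
hypothetical), rejections could be computed from userspace with
something as simple as this untested sketch:

	/*
	 * Untested userspace sketch: on a zswap-enabled system, every
	 * disk swapout is either a zswap writeback or a rejected
	 * store that bypassed to swap, so
	 * rejections = pswpout - zswpwb.
	 */
	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char name[64];
		unsigned long long val, pswpout = 0, zswpwb = 0;
		FILE *f = fopen("/proc/vmstat", "r");

		if (!f)
			return 1;
		/* /proc/vmstat is "name value" per line */
		while (fscanf(f, "%63s %llu", name, &val) == 2) {
			if (!strcmp(name, "pswpout"))
				pswpout = val;
			else if (!strcmp(name, "zswpwb"))
				zswpwb = val;
		}
		fclose(f);
		printf("derived zswap rejections: %llu\n",
		       pswpout - zswpwb);
		return 0;
	}

Modulo the race window between reading the two counters, but for
monitoring purposes that's in the noise.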
Plus, rejections should be rare. They are now, and they should become
even more rare or cease to exist going forward. Because every time
they happen at scale, they represent problematic LRU inversions. We
have patched, have pending patches, or discussed changes to reduce
every single one of them:

	/* Store failed due to a reclaim failure after pool limit was reached */
	static u64 zswap_reject_reclaim_fail;

With the shrinker this becomes less relevant. There was also the
proposal to disable the bypass to swap and just keep the page.

	/* Compressed page was too big for the allocator to (optimally) store */
	static u64 zswap_reject_compress_poor;

You were working on eradicating this (with zsmalloc at least).

	/* Store failed because underlying allocator could not get memory */
	static u64 zswap_reject_alloc_fail;

	/* Store failed because the entry metadata could not be allocated (rare) */
	static u64 zswap_reject_kmemcache_fail;

These shouldn't happen at all due to PF_MEMALLOC. IOW, the fail
counter is expected to stay zero in healthy, well-configured systems.
Rather than an MM event that needs counting, this strikes me as
something that could be a WARN down the line...

I agree with adding the debugfs counter though.
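To illustrate the WARN idea, something in this direction; an untested
sketch against zswap_store(), with the helper and label names from my
(possibly stale) memory of the code:

	entry = zswap_entry_cache_alloc(GFP_KERNEL);
	if (unlikely(!entry)) {
		/*
		 * Shouldn't happen due to PF_MEMALLOC: an allocation
		 * failure here points at a misconfigured system, not
		 * a normal MM event worth counting.
		 */
		WARN_ONCE(1, "zswap: entry cache allocation failed\n");
		goto reject;
	}

For the compress-failure side, a reject_compress_fail u64 bumped at
the compression callsite and registered with debugfs_create_u64()
next to the existing reject_* entries in zswap_debugfs_init() would
match the current pattern.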