From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-qk1-f180.google.com (mail-qk1-f180.google.com [209.85.222.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CDEAA7082D; Thu, 28 Aug 2025 23:28:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.180 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756423727; cv=none; b=uPu7EdFndz8OLhChS7zh4g+TBAcnWuC6nnZeZvGSHX35VJIMkIrxvXrsz9XgpORXCe51PZgXHh7HZYv3CwKoUrYnAu8aStOxJL9L6IIRIZeNGhNVcvN28pX462gdwEL/fIqmxa491YLAYS8xDjXu92iqM4n6d+Kj/Uun/BQBQ2w= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756423727; c=relaxed/simple; bh=ddB5ltz052A0NgyWpdONDu99T4ypYNf+rbcFEGWBC6U=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=af6YKkqKpPmMHZMlUnvEcI/DcBail3CkXyxTjO2R+CJo/mAMGxfvpVqI9TmNTP2paZeCmBtmq2wWuJyqDCYj4Pt9lXzZU2yAOmkNa0EaERlYJlZxv1SHnHMl1LNMmqQiOwcEpuZpwjbmtkzOBx+oFBeTSs/d8xHOyEpsd3DwEiE= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=JbkJ0SQF; arc=none smtp.client-ip=209.85.222.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JbkJ0SQF" Received: by mail-qk1-f180.google.com with SMTP id af79cd13be357-7f8ea864d68so146265185a.0; Thu, 28 Aug 2025 16:28:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1756423725; x=1757028525; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=g5xAEZzGkKzOekVqIS4rKLRu5l7Wsy0HGBRxwvDh6Sc=; b=JbkJ0SQF58bxxeS7rfzfQlOSKp3WlVr+iBzUPWi9pCnIjdA0PQTNkCnP9L8KGmNheV 6V5oEJTJR+OlghzmcFkN98QVX3BdA9FmY59DDjTC5YQ+7Siv6x6erwrgqyJOwk/w6UwH KM0EVRvZ78aKt7T1Tl/wA9TC4BPvD0iDS0mJYV39dQpft4LWzDrfP49Uv5XwY0kM0y/h C3kJffQ8x/efceAgXNBRWeWOERwlmHbp5eR0IkhFVLT94H+R5TQm5cNHEWUaJoFKE38N uh+IZls1dNhMKg0tEVvr07BIgEK8GkjJhAs5Af2P5HD9VeaGAhkqKd0fwN7r+XtlzySp yJxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1756423725; x=1757028525; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=g5xAEZzGkKzOekVqIS4rKLRu5l7Wsy0HGBRxwvDh6Sc=; b=KyaeJ9zHju0mDLS97qH3a4ayvjM6J0EIuXuUZvZ7cO/cr8ZjiaFWPaly/A/sbo0Cq1 ohejGs/Yf1+Tf6QywZux8oo8MzT5VJ6lFr36zv4OC6Xwt0vh3liBWju+TVNgfpNvrKk+ YJyPJu60le2isknP9q+WtbvM4sEn3oSnJuw2iO0NOf9nHpQiRkec2dk2GgOA/tzJivvU Q5YxBTZd2UTKCXgD3sI6mtaDlzHKo2yfK2XiGy4oeSGHEOSIsd7Y3EVXAmszS5IF2HAC 2Is6HF07YLzWeiOpkQpE8eAspNmmHPZs2jiEoaiVbK2YaG1i29gL9O70v7nL0phYIudY jSXg== X-Forwarded-Encrypted: i=1; AJvYcCVjTmVOzIZ+l5+N58YY1jNw4RFRiRo/Gn2jg/KApHp7G7aO4osLc9Ag9MNE9iqQ+e53POl9y4qRk+qoRUk=@vger.kernel.org X-Gm-Message-State: AOJu0YwneOC6JhMQ3J+BLahr6OZvNhkXSh4/UY4HiEXdp/zp4G69Onr5 jkOOCSsloaeFhI1ErMBiIKtC6v39OaSFPTavWVHxNrL2XUXyActGEHxqlVxjEAx5GDK0GX6lu4S pB8UqK10vhlzq8Z37aQuLUytw1m6/6rT8AMrw X-Gm-Gg: ASbGncuJENieg+hCSwhl4b5CuQ4bWHelCP1nvt/KxEseoYVUEfsiwDlQltM2AVyeITz iiCdt6eEji+jVhbVB1AptDx3o+oAaBSijktYQndu+blju0Qi7ZTXDeW8N8WS29GNz1cPpwZccUx x4UJd/wtXTIwkOIq+C2DJ8yI8w9WfvyuVw28Bvu3dZm7hnio7kVRIEMYw8QPr5XIbSMO+uL/F14 Gmx1QjJ+hPm/nEvCg== X-Google-Smtp-Source: AGHT+IG9dqf0v9B4cNprbTA7zUMYBwOa7Uq6IGQrFQsG5rT3a7SmZdhcZHPObNG9JdyxtkduL8KD+Nlyyk3kPIqkw2Q= X-Received: by 2002:a05:620a:3913:b0:7fb:d5c5:a4df with SMTP id af79cd13be357-7fbd5c5a6b0mr253670785a.34.1756423724471; Thu, 28 Aug 2025 16:28:44 -0700 (PDT) Precedence: bulk X-Mailing-List: linux-crypto@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20250801043642.8103-1-kanchana.p.sridhar@intel.com> <20250801043642.8103-23-kanchana.p.sridhar@intel.com> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Fri, 29 Aug 2025 11:28:33 +1200 X-Gm-Features: Ac12FXymL-oJOFhtfchGYmwxY8_PFVTDmAseMPSDCZcsDDSfx-7l2g2_r4lHbbs Message-ID: Subject: Re: [PATCH v11 22/24] mm: zswap: Allocate pool batching resources if the compressor supports batching. To: "Sridhar, Kanchana P" Cc: "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , "hannes@cmpxchg.org" , "yosry.ahmed@linux.dev" , "nphamcs@gmail.com" , "chengming.zhou@linux.dev" , "usamaarif642@gmail.com" , "ryan.roberts@arm.com" , "ying.huang@linux.alibaba.com" , "akpm@linux-foundation.org" , "senozhatsky@chromium.org" , "linux-crypto@vger.kernel.org" , "herbert@gondor.apana.org.au" , "davem@davemloft.net" , "clabbe@baylibre.com" , "ardb@kernel.org" , "ebiggers@google.com" , "surenb@google.com" , "Accardi, Kristen C" , "Gomes, Vinicius" , "Feghali, Wajdi K" , "Gopal, Vinodh" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable > > > > If ZSWAP_MAX_BATCH_SIZE is set to 8 and there is no hardware batching, > > compression is done with a step size of 1. If the hardware step size is= 4, > > compression occurs in two steps. If the hardware step size is 6, the fi= rst > > compression uses a step size of 6, and the second uses a step size of 2= . > > Do you think this will work? > > Hi Barry, > > This would be non-optimal from code simplicity and latency perspectives. > One of the benefits of using the hardware accelerator's "batch parallelis= m" > is cost amortization across the batch. We might lose this benefit if we m= ake > multiple calls to zswap_compress() to ask the hardware accelerator to > batch compress in smaller batches. Compression throughput would also > be sub-optimal. I guess it wouldn=E2=80=99t be an issue if both ZSWAP_MAX_BATCH_SIZE and pool->compr_batch_size are powers of two. As you mentioned, we still gain improvement with ZSWAP_MAX_BATCH_SIZE batching even when pool->compr_batch_size =3D=3D 1, by compressing pages one by one but batching other work such as zswap_entries_cache_alloc_batch() ? > > In my patch-series, the rule is simple: if an algorithm has specified a > batch-size, carve out the folio by that "batch-size" # of pages to be > compressed as a batch in zswap_compress(). This custom batch-size > is capped at ZSWAP_MAX_BATCH_SIZE. > > If an algorithm has not specified a batch-size, the default batch-size > is 1. In this case, carve out the folio by ZSWAP_MAX_BATCH_SIZE > # of pages to be compressed as a batch in zswap_compress(). Yes, I understand your rule. However, having two global variables is still somewhat confusing. It might be clearer to use a single variable with a comment, since one variable can clearly determine the value of the other. Can we get the batch_size at runtime based on pool->compr_batch_size? /* * If hardware compression supports batching, we use the hardware step size= . * Otherwise, we use ZSWAP_MAX_BATCH_SIZE for batching, but still compress * one page at a time. */ batch_size =3D pool->compr_batch_size > 1 ? pool->compr_batch_size : ZSWAP_MAX_BATCH_SIZE; We probably don=E2=80=99t need this if both pool->compr_batch_size and ZSWAP_MAX_BATCH_SIZE are powers of two? > > > > > I don=E2=80=99t quite understand why you want to save > > ZSWAP_MAX_BATCH_SIZE - X resources, since even without hardware > > batching > > you are still allocating all ZSWAP_MAX_BATCH_SIZE resources. This is th= e > > case for all platforms except yours. > > Not sure I understand.. Just to clarify, this is not done to save on reso= urces, > rather for the reasons stated above. > > We are already saving on resources by only allocating only > "pool->compr_batch_size" number of resources > (*not* ZSWAP_MAX_BATCH_SIZE resources): > > pool->compr_batch_size =3D min(ZSWAP_MAX_BATCH_SIZE, > crypto_acomp_batch_size(acomp_ctx->a= comp)); > > For non-Intel platforms, this means only 1 dst buffer is allocated, as > explained in the commit log for this patch. you are right. I misunderstood your code :-) > > " A "u8 compr_batch_size" member is added to "struct zswap_pool", as per > Yosry's suggestion. pool->compr_batch_size is set as the minimum of the > compressor's max batch-size and ZSWAP_MAX_BATCH_SIZE. Accordingly, it > proceeds to allocate the necessary compression dst buffers in the > per-CPU acomp_ctx." This is fine, but it still doesn=E2=80=99t provide a strong reason for havi= ng two global variables when one can fully determine the value of the other. Thanks Barry