From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 46BFD2153D9 for ; Thu, 9 Jan 2025 13:16:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736428591; cv=none; b=rjvCxhDDmc7WUFp5X7oFKS8vYvR9h3U3qLoG1MXHl+rzdCmbmlvH8w221tQnkIXyyw/BQvoC3N3QC2SDnY+dSOwKNMJlvtRJQbf0zkEiYQPRLPBsV7UvD+GR/zQNdCoPc+ESb3GSvtSGuujvQJOxktOmmSpbkqqmCAUJg5ZGxiQ= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736428591; c=relaxed/simple; bh=O+HYHamx1p3+ILoEPBjBK7OiUlhjeWsSzJqP7OSIwPs=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=BZeZDhdo4X5PxcnYLXfnvuj0rgdKGgBTFcqhR+q7Xs5XqhXjy/ezBYOcni+C6xLDd2tnSAW2fRA5dwivhMTjHOnr/Kq4adpJ7kTWPOiYxvOgIBDA4YM4uyVYD/NFm4fOh/e9BXXDzPRGR9Jx17ernsFfytmQYXrUvKNKykpInKw= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=WFvdpRPD; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="WFvdpRPD" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1736428588; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=H0ZpSa6ZeYEhoz4+5kQn2kH6v7v1MM7x+pBBhTzOknk=; b=WFvdpRPDw/YDgI1gsKv17mhHz9RfTCIbDDx0gMuvUZmmf2dlgGk11H1b2Q8VG7ip8qMrub HH/9O/9mgE+4NJocOj9aUm+n5JOkeO+QovlfhXgNmMQBNGiP+RkrQH/ljIlwqkr7hTk58a Zdli7y68c92v5DuCfm93jG+97sslNvU= Received: from mail-qv1-f70.google.com (mail-qv1-f70.google.com [209.85.219.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-662-a5im1p8COOWD2zF4KSfHcg-1; Thu, 09 Jan 2025 08:16:27 -0500 X-MC-Unique: a5im1p8COOWD2zF4KSfHcg-1 X-Mimecast-MFC-AGG-ID: a5im1p8COOWD2zF4KSfHcg Received: by mail-qv1-f70.google.com with SMTP id 6a1803df08f44-6d87efed6c4so14803056d6.1 for ; Thu, 09 Jan 2025 05:16:27 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1736428586; x=1737033386; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=H0ZpSa6ZeYEhoz4+5kQn2kH6v7v1MM7x+pBBhTzOknk=; b=YTWtvNDJcLdma82ljkf++2dBPig7PHcekBKBPXCLHrMTr55LlO2GkR7YzxhdhobZ+C ieVc7sk5RRxp98X5MZNskFH7W9ouyx5kBiQqWYFhbhu5WuBVbKtNb9rSynYETQn7bUXP mBrQ/xyRatIPjBezuGUAH89bYK3lHKu4MzlfSyOYe5BRLklElCTmvu/P57l/yWK9W/+0 UMfKTKHjX7A76w2HWznM4vrolie5cCeYqgvUK+jIor5l27v3t4t0dWhh5H/jtcJqvT0v Lrvk+IXqiJO+qzzEply9ccH2RjuSKxqTUUH8CdyZqZj9LYM5EXovr+JJ3VF83xR3tDfA SxOA== X-Forwarded-Encrypted: i=1; AJvYcCWOT+wOqtNXG96em9UVMJvN1P7FyvMumtozzy3lNQomBBiRfcRjN1nQRVpuoMxUly/w9c4=@vger.kernel.org X-Gm-Message-State: AOJu0YwAhFhiwoQkNd5blBJUklrB9yBM8Fsj2n2iFG90l9UF0BkmPjbx vYuLYu6cT/XC2gA9CqbQA2H5bnlqxwsKhC56I7G8Wwxi2TJi+NcN6hUda+2l85sSYM9KUrmFSZt iZ7oQDfFmntPSPf2F8NRfBVrdF3DpiqcBKknZbwSqgf7qCBrGwQ== X-Gm-Gg: ASbGncv2a5oOtuPXcjUjve50J0/7Luyt+ZFyJVr2Nsevgdqgr/LNQdzUPek4/Pp/475 WK0GfHjlZSJTTgLGVSbxVBzQ+ENJKQ0CHSlsQNeAYVnV281M7aDC5GN+TNwRpMAe/jmzVne5p95 NLpYreNQ8RDVRHfsFaeZ3bqg9DQuqG3naIBVJPTJ0FDgNPvgRlAR4Avt+QbT95E5+qapQdpJPg5 eYXRnrWjFdbqJ+wsEX1nTnMjapDe33rYBYZo7Uydc3pnmWMPDjiR+qnmFaSXBiQ6nilaQxr17sg iGobps80 X-Received: by 2002:a05:6214:19c8:b0:6df:97a3:5e5a with SMTP id 6a1803df08f44-6df9b255d62mr124444776d6.28.1736428586680; Thu, 09 Jan 2025 05:16:26 -0800 (PST) X-Google-Smtp-Source: AGHT+IEH1xjlf7wVC26WFyHvsKl4IwBJwUPL5AZsOoFJQiN794kJePTIbyPTvBRQQ4UaX2vXG51oZA== X-Received: by 2002:a05:6214:19c8:b0:6df:97a3:5e5a with SMTP id 6a1803df08f44-6df9b255d62mr124444316d6.28.1736428586323; Thu, 09 Jan 2025 05:16:26 -0800 (PST) Received: from [192.168.88.253] (146-241-2-244.dyn.eolo.it. [146.241.2.244]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6dd180eaa0csm201636846d6.5.2025.01.09.05.16.23 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 09 Jan 2025 05:16:25 -0800 (PST) Message-ID: Date: Thu, 9 Jan 2025 14:16:22 +0100 Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH net-next v2 5/8] net: skbuff: introduce napi_skb_cache_get_bulk() To: Alexander Lobakin , Andrew Lunn , "David S. Miller" , Eric Dumazet , Jakub Kicinski Cc: Lorenzo Bianconi , Daniel Xu , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , John Fastabend , =?UTF-8?Q?Toke_H=C3=B8iland-J=C3=B8rgensen?= , Jesper Dangaard Brouer , Martin KaFai Lau , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org References: <20250107152940.26530-1-aleksander.lobakin@intel.com> <20250107152940.26530-6-aleksander.lobakin@intel.com> Content-Language: en-US From: Paolo Abeni In-Reply-To: <20250107152940.26530-6-aleksander.lobakin@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 1/7/25 4:29 PM, Alexander Lobakin wrote: > Add a function to get an array of skbs from the NAPI percpu cache. > It's supposed to be a drop-in replacement for > kmem_cache_alloc_bulk(skbuff_head_cache, GFP_ATOMIC) and > xdp_alloc_skb_bulk(GFP_ATOMIC). The difference (apart from the > requirement to call it only from the BH) is that it tries to use > as many NAPI cache entries for skbs as possible, and allocate new > ones only if needed. > > The logic is as follows: > > * there is enough skbs in the cache: decache them and return to the > caller; > * not enough: try refilling the cache first. If there is now enough > skbs, return; > * still not enough: try allocating skbs directly to the output array > with %GFP_ZERO, maybe we'll be able to get some. If there's now > enough, return; > * still not enough: return as many as we were able to obtain. > > Most of times, if called from the NAPI polling loop, the first one will > be true, sometimes (rarely) the second one. The third and the fourth -- > only under heavy memory pressure. > It can save significant amounts of CPU cycles if there are GRO cycles > and/or Tx completion cycles (anything that descends to > napi_skb_cache_put()) happening on this CPU. > > Signed-off-by: Alexander Lobakin > Tested-by: Daniel Xu > --- > include/linux/skbuff.h | 1 + > net/core/skbuff.c | 62 ++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 63 insertions(+) > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > index bb2b751d274a..1c089c7c14e1 100644 > --- a/include/linux/skbuff.h > +++ b/include/linux/skbuff.h > @@ -1315,6 +1315,7 @@ struct sk_buff *build_skb_around(struct sk_buff *skb, > void *data, unsigned int frag_size); > void skb_attempt_defer_free(struct sk_buff *skb); > > +u32 napi_skb_cache_get_bulk(void **skbs, u32 n); > struct sk_buff *napi_build_skb(void *data, unsigned int frag_size); > struct sk_buff *slab_build_skb(void *data); > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c > index a441613a1e6c..42eb31dcc9ce 100644 > --- a/net/core/skbuff.c > +++ b/net/core/skbuff.c > @@ -367,6 +367,68 @@ static struct sk_buff *napi_skb_cache_get(void) > return skb; > } > > +/** > + * napi_skb_cache_get_bulk - obtain a number of zeroed skb heads from the cache > + * @skbs: pointer to an at least @n-sized array to fill with skb pointers > + * @n: number of entries to provide > + * > + * Tries to obtain @n &sk_buff entries from the NAPI percpu cache and writes > + * the pointers into the provided array @skbs. If there are less entries > + * available, tries to replenish the cache and bulk-allocates the diff from > + * the MM layer if needed. > + * The heads are being zeroed with either memset() or %__GFP_ZERO, so they are > + * ready for {,__}build_skb_around() and don't have any data buffers attached. > + * Must be called *only* from the BH context. > + * > + * Return: number of successfully allocated skbs (@n if no actual allocation > + * needed or kmem_cache_alloc_bulk() didn't fail). > + */ > +u32 napi_skb_cache_get_bulk(void **skbs, u32 n) > +{ > + struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); > + u32 bulk, total = n; > + > + local_lock_nested_bh(&napi_alloc_cache.bh_lock); > + > + if (nc->skb_count >= n) > + goto get; I (mis?)understood from the commit message this condition should be likely, too?!? > + /* No enough cached skbs. Try refilling the cache first */ > + bulk = min(NAPI_SKB_CACHE_SIZE - nc->skb_count, NAPI_SKB_CACHE_BULK); > + nc->skb_count += kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, > + GFP_ATOMIC | __GFP_NOWARN, bulk, > + &nc->skb_cache[nc->skb_count]); > + if (likely(nc->skb_count >= n)) > + goto get; > + > + /* Still not enough. Bulk-allocate the missing part directly, zeroed */ > + n -= kmem_cache_alloc_bulk(net_hotdata.skbuff_cache, > + GFP_ATOMIC | __GFP_ZERO | __GFP_NOWARN, > + n - nc->skb_count, &skbs[nc->skb_count]); You should probably cap 'n' to NAPI_SKB_CACHE_SIZE. Also what about latency spikes when n == 48 (should be the maximum possible with such limit) here? /P