netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: Eric Dumazet <edumazet@google.com>,
	 "David S . Miller" <davem@davemloft.net>,
	 Jakub Kicinski <kuba@kernel.org>,
	 Paolo Abeni <pabeni@redhat.com>
Cc: Simon Horman <horms@kernel.org>,
	 Kuniyuki Iwashima <kuniyu@google.com>,
	 netdev@vger.kernel.org,  eric.dumazet@gmail.com,
	 Eric Dumazet <edumazet@google.com>,
	 Willem de Bruijn <willemb@google.com>
Subject: Re: [PATCH net-next] net: optimize enqueue_to_backlog() for the fast path
Date: Fri, 24 Oct 2025 10:03:44 -0400	[thread overview]
Message-ID: <willemdebruijn.kernel.249e3b8331c2c@gmail.com> (raw)
In-Reply-To: <20251024090517.3289181-1-edumazet@google.com>

Eric Dumazet wrote:
> Add likely() and unlikely() clauses for the common cases:
> 
> Device is running.
> Queue is not full.
> Queue is less than half capacity.
> 
> Add max_backlog parameter to skb_flow_limit() to avoid
> a second READ_ONCE(net_hotdata.max_backlog).
> 
> skb_flow_limit() does not need the backlog_lock protection,
> and can be called before we acquire the lock, for even better
> resistance to attacks.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Willem de Bruijn <willemb@google.com>
> ---
>  net/core/dev.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 378c2d010faf251ffd874ebf0cc3dd6968eee447..d32f0b0c03bbd069d3651f5a6b772c8029baf96c 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5249,14 +5249,15 @@ void kick_defer_list_purge(unsigned int cpu)
>  int netdev_flow_limit_table_len __read_mostly = (1 << 12);
>  #endif
>  
> -static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen)
> +static bool skb_flow_limit(struct sk_buff *skb, unsigned int qlen,
> +			   int max_backlog)
>  {
>  #ifdef CONFIG_NET_FLOW_LIMIT
> -	struct sd_flow_limit *fl;
> -	struct softnet_data *sd;
>  	unsigned int old_flow, new_flow;
> +	const struct softnet_data *sd;
> +	struct sd_flow_limit *fl;
>  
> -	if (qlen < (READ_ONCE(net_hotdata.max_backlog) >> 1))
> +	if (likely(qlen < (max_backlog >> 1)))
>  		return false;
>  
>  	sd = this_cpu_ptr(&softnet_data);

I assume sd is warm here. Else we could even move skb_flow_limit
behind a static_branch seeing how rarely it is likely used.

> @@ -5301,19 +5302,19 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
>  	u32 tail;
>  
>  	reason = SKB_DROP_REASON_DEV_READY;
> -	if (!netif_running(skb->dev))
> +	if (unlikely(!netif_running(skb->dev)))
>  		goto bad_dev;

Isn't unlikely usually predicted for branches without an else?

And that is ignoring both FDO and actual branch prediction hardware
improving on the simple compiler heuristic.

No immediately concerns. Just want to avoid precedence for others
to sprinkle code with likely/unlikely with abandon. As is sometimes
seen.

>  
> -	reason = SKB_DROP_REASON_CPU_BACKLOG;
>  	sd = &per_cpu(softnet_data, cpu);
>  
>  	qlen = skb_queue_len_lockless(&sd->input_pkt_queue);
>  	max_backlog = READ_ONCE(net_hotdata.max_backlog);
> -	if (unlikely(qlen > max_backlog))
> +	if (unlikely(qlen > max_backlog) ||
> +	    skb_flow_limit(skb, qlen, max_backlog))
>  		goto cpu_backlog_drop;
>  	backlog_lock_irq_save(sd, &flags);
>  	qlen = skb_queue_len(&sd->input_pkt_queue);
> -	if (qlen <= max_backlog && !skb_flow_limit(skb, qlen)) {
> +	if (likely(qlen <= max_backlog)) {
>  		if (!qlen) {
>  			/* Schedule NAPI for backlog device. We can use
>  			 * non atomic operation as we own the queue lock.
> @@ -5334,6 +5335,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
>  	backlog_unlock_irq_restore(sd, &flags);
>  
>  cpu_backlog_drop:
> +	reason = SKB_DROP_REASON_CPU_BACKLOG;
>  	numa_drop_add(&sd->drop_counters, 1);
>  bad_dev:
>  	dev_core_stats_rx_dropped_inc(skb->dev);
> -- 
> 2.51.1.821.gb6fe4d2222-goog
> 



  reply	other threads:[~2025-10-24 14:03 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-24  9:05 [PATCH net-next] net: optimize enqueue_to_backlog() for the fast path Eric Dumazet
2025-10-24 14:03 ` Willem de Bruijn [this message]
2025-10-24 14:30   ` Eric Dumazet
2025-10-24 14:48     ` Willem de Bruijn
2025-10-27 17:07 ` Kuniyuki Iwashima
2025-10-29  0:50 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=willemdebruijn.kernel.249e3b8331c2c@gmail.com \
    --to=willemdebruijn.kernel@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=eric.dumazet@gmail.com \
    --cc=horms@kernel.org \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).