[PATCH net] ptr_ring: drop duplicated tail zeroing code

* [PATCH net] ptr_ring: drop duplicated tail zeroing code
@ 2025-09-24  5:27 Michael S. Tsirkin
  2025-09-24  5:27 ` Michael S. Tsirkin
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Michael S. Tsirkin @ 2025-09-24  5:27 UTC (permalink / raw)
  To: linux-kernel, netdev, Jason Wang

We have some rather subtle code around zeroing tail entries, minimizing
cache bouncing.  Let's put it all in one place.

Doing this also reduces the text size slightly, e.g. for
drivers/vhost/net.o
  Before: text: 15,114 bytes
  After: text: 15,082 bytes

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Lightly tested.
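
For readers less familiar with ptr_ring, here is a minimal user-space
sketch of how the consumer side is meant to behave with the helper in
place.  It is an illustration only: struct ring_model and the model_*
names are hypothetical stand-ins, not kernel API.  Locking, the
likely()/unlikely() hints and any memory-ordering annotations are left
out, and the final store to consumer_head is assumed, since it is not
visible in the hunks of this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the consumer-side fields of struct ptr_ring. */
struct ring_model {
	void **queue;		/* ring slots; NULL means "free for the producer" */
	int size;		/* number of slots */
	int batch;		/* how many consumed slots to accumulate before zeroing */
	int consumer_head;	/* next slot to consume */
	int consumer_tail;	/* oldest consumed slot not yet zeroed */
};

/* Models __ptr_ring_zero_tail(): clear slots [consumer_tail, consumer_head)
 * in reverse order, so the oldest slot (the one the producer may be
 * watching) is written last, then advance the tail.
 * The caller still has to wrap the tail if consumer_head reached size.
 */
static inline void model_zero_tail(struct ring_model *r, int consumer_head)
{
	int head = consumer_head - 1;

	assert(consumer_head >= 0 && consumer_head <= r->size);

	while (head >= r->consumer_tail)
		r->queue[head--] = NULL;

	r->consumer_tail = consumer_head;
}

/* Models the patched __ptr_ring_discard_one(): consume one slot, and flush
 * the accumulated tail once a full batch is pending or the end of the
 * ring is reached.
 */
static inline void model_discard_one(struct ring_model *r)
{
	int consumer_head = r->consumer_head + 1;

	if (consumer_head - r->consumer_tail >= r->batch ||
	    consumer_head >= r->size)
		model_zero_tail(r, consumer_head);

	/* Wrap-around fixup that the helper leaves to its caller. */
	if (consumer_head >= r->size) {
		consumer_head = 0;
		r->consumer_tail = 0;
	}

	/* Assumed final store; plain assignment in this model. */
	r->consumer_head = consumer_head;
}
```

With a 4-entry ring and batch = 2, four calls to model_discard_one()
clear slots 0-1 on the second call and slots 2-3 on the fourth, at which
point both head and tail wrap back to 0.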

 include/linux/ptr_ring.h | 42 +++++++++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 551329220e4f..a736b16859a6 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -243,6 +243,24 @@ static inline bool ptr_ring_empty_bh(struct ptr_ring *r)
 	return ret;
 }
 
+/* Zero entries from tail to specified head.
+ * NB: if consumer_head can be >= r->size need to fixup tail later.
+ */
+static inline void __ptr_ring_zero_tail(struct ptr_ring *r, int consumer_head)
+{
+	int head = consumer_head - 1;
+
+	/* Zero out entries in the reverse order: this way we touch the
+	 * cache line that producer might currently be reading the last;
+	 * producer won't make progress and touch other cache lines
+	 * besides the first one until we write out all entries.
+	 */
+	while (likely(head >= r->consumer_tail))
+		r->queue[head--] = NULL;
+
+	r->consumer_tail = consumer_head;
+}
+
 /* Must only be called after __ptr_ring_peek returned !NULL */
 static inline void __ptr_ring_discard_one(struct ptr_ring *r)
 {
@@ -261,8 +279,7 @@ static inline void __ptr_ring_discard_one(struct ptr_ring *r)
 	/* Note: we must keep consumer_head valid at all times for __ptr_ring_empty
 	 * to work correctly.
 	 */
-	int consumer_head = r->consumer_head;
-	int head = consumer_head++;
+	int consumer_head = r->consumer_head + 1;
 
 	/* Once we have processed enough entries invalidate them in
 	 * the ring all at once so producer can reuse their space in the ring.
@@ -270,16 +287,9 @@ static inline void __ptr_ring_discard_one(struct ptr_ring *r)
 	 * but helps keep the implementation simple.
 	 */
 	if (unlikely(consumer_head - r->consumer_tail >= r->batch ||
-		     consumer_head >= r->size)) {
-		/* Zero out entries in the reverse order: this way we touch the
-		 * cache line that producer might currently be reading the last;
-		 * producer won't make progress and touch other cache lines
-		 * besides the first one until we write out all entries.
-		 */
-		while (likely(head >= r->consumer_tail))
-			r->queue[head--] = NULL;
-		r->consumer_tail = consumer_head;
-	}
+		     consumer_head >= r->size))
+		__ptr_ring_zero_tail(r, consumer_head);
+
 	if (unlikely(consumer_head >= r->size)) {
 		consumer_head = 0;
 		r->consumer_tail = 0;
@@ -513,7 +523,6 @@ static inline void ptr_ring_unconsume(struct ptr_ring *r, void **batch, int n,
 				      void (*destroy)(void *))
 {
 	unsigned long flags;
-	int head;
 
 	spin_lock_irqsave(&r->consumer_lock, flags);
 	spin_lock(&r->producer_lock);
@@ -525,17 +534,14 @@ static inline void ptr_ring_unconsume(struct ptr_ring *r, void **batch, int n,
 	 * Clean out buffered entries (for simplicity). This way following code
 	 * can test entries for NULL and if not assume they are valid.
 	 */
-	head = r->consumer_head - 1;
-	while (likely(head >= r->consumer_tail))
-		r->queue[head--] = NULL;
-	r->consumer_tail = r->consumer_head;
+	__ptr_ring_zero_tail(r, r->consumer_head);
 
 	/*
 	 * Go over entries in batch, start moving head back and copy entries.
 	 * Stop when we run into previously unconsumed entries.
 	 */
 	while (n) {
-		head = r->consumer_head - 1;
+		int head = r->consumer_head - 1;
 		if (head < 0)
 			head = r->size - 1;
 		if (r->queue[head]) {
-- 
MST



* Re: [PATCH net] ptr_ring: drop duplicated tail zeroing code
  2025-09-24  5:27 [PATCH net] ptr_ring: drop duplicated tail zeroing code Michael S. Tsirkin
@ 2025-09-24  5:27 ` Michael S. Tsirkin
  2025-09-24  7:29 ` Jason Wang
  2025-09-26 22:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 5+ messages in thread
From: Michael S. Tsirkin @ 2025-09-24  5:27 UTC (permalink / raw)
  To: linux-kernel, netdev, Jason Wang

On Wed, Sep 24, 2025 at 01:27:09AM -0400, Michael S. Tsirkin wrote:
> We have some rather subtle code around zeroing tail entries, minimizing
> cache bouncing.  Let's put it all in one place.
> 
> Doing this also reduces the text size slightly, e.g. for
> drivers/vhost/net.o
>   Before: text: 15,114 bytes
>   After: text: 15,082 bytes
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>


Ugh, this should have been tagged net-next, obviously. Sorry.





* Re: [PATCH net] ptr_ring: drop duplicated tail zeroing code
  2025-09-24  5:27 [PATCH net] ptr_ring: drop duplicated tail zeroing code Michael S. Tsirkin
  2025-09-24  5:27 ` Michael S. Tsirkin
@ 2025-09-24  7:29 ` Jason Wang
  2025-09-26 22:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 5+ messages in thread
From: Jason Wang @ 2025-09-24  7:29 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-kernel, netdev

On Wed, Sep 24, 2025 at 1:27 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> We have some rather subtle code around zeroing tail entries, minimizing
> cache bouncing.  Let's put it all in one place.
>
> Doing this also reduces the text size slightly, e.g. for
> drivers/vhost/net.o
>   Before: text: 15,114 bytes
>   After: text: 15,082 bytes
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>

Acked-by: Jason Wang <jasowang@redhat.com>

Thanks



* Re: [PATCH net] ptr_ring: drop duplicated tail zeroing code
  2025-09-24  5:27 [PATCH net] ptr_ring: drop duplicated tail zeroing code Michael S. Tsirkin
  2025-09-24  5:27 ` Michael S. Tsirkin
  2025-09-24  7:29 ` Jason Wang
@ 2025-09-26 22:30 ` patchwork-bot+netdevbpf
  2025-09-28 15:19   ` Lei Yang
  2 siblings, 1 reply; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-09-26 22:30 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-kernel, netdev, jasowang

Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 24 Sep 2025 01:27:07 -0400 you wrote:
> We have some rather subtle code around zeroing tail entries, minimizing
> cache bouncing.  Let's put it all in one place.
> 
> Doing this also reduces the text size slightly, e.g. for
> drivers/vhost/net.o
>   Before: text: 15,114 bytes
>   After: text: 15,082 bytes
> 
> [...]

Here is the summary with links:
  - [net] ptr_ring: drop duplicated tail zeroing code
    https://git.kernel.org/netdev/net-next/c/4e9510f16218

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net] ptr_ring: drop duplicated tail zeroing code
  2025-09-26 22:30 ` patchwork-bot+netdevbpf
@ 2025-09-28 15:19   ` Lei Yang
  0 siblings, 0 replies; 5+ messages in thread
From: Lei Yang @ 2025-09-28 15:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, netdev, jasowang, patchwork-bot+netdevbpf

I tested this patch with the virtio-net regression tests, and everything works fine.

Tested-by: Lei Yang <leiyang@redhat.com>



