From: Greg KH <greg@kroah.com>
To: Leesoo Ahn <lsahn@ooseel.net>
Cc: Oliver Neukum <oneukum@suse.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org, linux-usb@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] usbnet: jump to rx_cleanup case instead of calling skb_queue_tail
Date: Mon, 19 Dec 2022 09:55:45 +0100
Message-ID: <Y6AnEWWd7DQg0b6o@kroah.com>
In-Reply-To: <403f3ea8-eeec-2a78-640e-c11c3fe28f45@ooseel.net>

On Mon, Dec 19, 2022 at 05:09:21PM +0900, Leesoo Ahn wrote:
> 
> On 22. 12. 19. 16:50, Greg KH wrote:
> > On Mon, Dec 19, 2022 at 04:41:16PM +0900, Leesoo Ahn wrote:
> > > On 22. 12. 18. 17:55, Greg KH wrote:
> > > > On Sun, Dec 18, 2022 at 01:18:51AM +0900, Leesoo Ahn wrote:
> > > > > The current code pushes the skb onto the dev->done queue by calling
> > > > > skb_queue_tail(), then calls skb_dequeue() in usbnet_bh() to pop it in
> > > > > the rx_cleanup state and free the urb and skb.
> > > > > This wastes CPU cycles on extra instructions. Instead, use return values
> > > > > to jump directly to the rx_cleanup case and free them there, so the
> > > > > skb_queue_tail() and skb_dequeue() calls are no longer necessary.
> > > > > 
> > > > > The following perf output shows the difference, in Arm64 instructions,
> > > > > between calling skb_queue_tail() and using return values to jump
> > > > > directly to the rx_cleanup state in usbnet_bh().
> > > > > 
> > > > > ----------- calling skb_queue_tail() -----------
> > > > >          │     if (!(dev->driver_info->flags & FLAG_RX_ASSEMBLE))
> > > > >     7.58 │248:   ldr     x0, [x20, #16]
> > > > >     2.46 │24c:   ldr     w0, [x0, #8]
> > > > >     1.64 │250: ↑ tbnz    w0, #14, 16c
> > > > >          │     dev->net->stats.rx_errors++;
> > > > >     0.57 │254:   ldr     x1, [x20, #184]
> > > > >     1.64 │258:   ldr     x0, [x1, #336]
> > > > >     2.65 │25c:   add     x0, x0, #0x1
> > > > >          │260:   str     x0, [x1, #336]
> > > > >          │     skb_queue_tail(&dev->done, skb);
> > > > >     0.38 │264:   mov     x1, x19
> > > > >          │268:   mov     x0, x21
> > > > >     2.27 │26c: → bl      skb_queue_tail
> > > > >     0.57 │270: ↑ b       44    // branch to call skb_dequeue()
> > > > > 
> > > > > ----------- jumping to rx_cleanup state -----------
> > > > >          │     if (!(dev->driver_info->flags & FLAG_RX_ASSEMBLE))
> > > > >     1.69 │25c:   ldr     x0, [x21, #16]
> > > > >     4.78 │260:   ldr     w0, [x0, #8]
> > > > >     3.28 │264: ↑ tbnz    w0, #14, e4    // jump to 'rx_cleanup' state
> > > > >          │     dev->net->stats.rx_errors++;
> > > > >     0.09 │268:   ldr     x1, [x21, #184]
> > > > >     2.72 │26c:   ldr     x0, [x1, #336]
> > > > >     3.37 │270:   add     x0, x0, #0x1
> > > > >     0.09 │274:   str     x0, [x1, #336]
> > > > >     0.66 │278: ↑ b       e4    // branch to 'rx_cleanup' state
> > > > Interesting, but does this even really matter given the slow speed of
> > > > the USB hardware?
> > > It doesn't matter if the USB hardware is slow, but from a software point
> > > of view it's still worth avoiding skb_queue_tail() and skb_dequeue(),
> > > which take a spinlock, where possible.
> > But can you actually measure that in either CPU load or in increased
> > transfer speeds?
> > 
> > thanks,
> > 
> > greg k-h
> 
> I think the following may be what you are interested in. I tested both
> cases with perf on the same machine and environment, and also modified the
> driver code slightly so that a specific packet goes to the rx_cleanup case
> instead of to the network stack.
> 
> ----- calling skb_queue_tail() -----
> -   11.58%     0.26%  swapper          [k] usbnet_bh
>    - 11.32% usbnet_bh
>       - 6.43% skb_dequeue
>            6.34% _raw_spin_unlock_irqrestore
>       - 2.21% skb_queue_tail
>            2.19% _raw_spin_unlock_irqrestore
>       - 1.68% consume_skb
>          - 0.97% kfree_skbmem
>               0.80% kmem_cache_free
>            0.53% skb_release_data
> 
> ----- jump to rx_cleanup directly -----
> -    7.62%     0.18%  swapper          [k] usbnet_bh
>    - 7.44% usbnet_bh
>       - 4.63% skb_dequeue
>            4.57% _raw_spin_unlock_irqrestore
>       - 1.76% consume_skb
>          - 1.03% kfree_skbmem
>               0.86% kmem_cache_free
>            0.56% skb_release_data
>         0.54% smsc95xx_rx_fixup
> 
> According to these results, the first case uses noticeably more CPU.

Ok, great!  Fix up the patch based on the review comments and add this
information to the changelog as well.

thanks,

greg k-h
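
For readers following the thread, the two flows being compared can be sketched in userspace C. This is only an illustration under stated assumptions, not the usbnet code or the posted patch: struct buf, struct queue, queue_tail(), dequeue(), new_buf() and bh() are hypothetical stand-ins, and the lock_ops counter merely models the lock/unlock pair that skb_queue_tail() and skb_dequeue() each perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel sk_buff API. */
enum buf_state { RX_DONE, RX_CLEANUP };

struct buf {
	struct buf *next;
	enum buf_state state;
	int bad;                  /* nonzero: packet must be dropped */
};

struct queue {
	struct buf *head, *tail;
	unsigned long lock_ops;   /* counts simulated lock/unlock pairs */
};

/* Append to the tail; models one locked queue operation. */
static void queue_tail(struct queue *q, struct buf *b)
{
	assert(b != NULL);
	q->lock_ops++;
	b->next = NULL;
	if (q->tail)
		q->tail->next = b;
	else
		q->head = b;
	q->tail = b;
}

/* Pop from the head; models one locked queue operation. */
static struct buf *dequeue(struct queue *q)
{
	struct buf *b;

	q->lock_ops++;
	b = q->head;
	if (b) {
		q->head = b->next;
		if (!q->head)
			q->tail = NULL;
		b->next = NULL;
	}
	return b;
}

static struct buf *new_buf(int bad)
{
	struct buf *b = calloc(1, sizeof(*b));

	if (b) {
		b->state = RX_DONE;
		b->bad = bad;
	}
	return b;
}

/*
 * Drain the done queue and return how many buffers were cleaned up.
 * direct == 0: a bad buffer is re-queued and freed on a later pass.
 * direct != 0: a bad buffer falls through to the cleanup case at once.
 */
static unsigned int bh(struct queue *done, int direct)
{
	struct buf *b;
	unsigned int cleaned = 0;

	while ((b = dequeue(done)) != NULL) {
		switch (b->state) {
		case RX_DONE:
			if (!b->bad) {
				free(b);  /* stands in for passing it up the stack */
				break;
			}
			if (!direct) {
				b->state = RX_CLEANUP;
				queue_tail(done, b);
				break;
			}
			/* fall through */
		case RX_CLEANUP:
			free(b);
			cleaned++;
			break;
		}
	}
	return cleaned;
}
```

In this model, each dropped buffer costs one extra queue_tail() and one extra dequeue() in the re-queue flow, which corresponds to the skb_queue_tail() entry that disappears from the second perf profile above. The sketch says nothing about real cycle counts; those come only from the perf measurements quoted in the thread.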

Thread overview: 7+ messages
2022-12-17 16:18 [PATCH] usbnet: jump to rx_cleanup case instead of calling skb_queue_tail Leesoo Ahn
2022-12-18  8:55 ` Greg KH
2022-12-18 10:01   ` Ladislav Michl
2022-12-19  7:41   ` Leesoo Ahn
2022-12-19  7:50     ` Greg KH
2022-12-19  8:09       ` Leesoo Ahn
2022-12-19  8:55         ` Greg KH [this message]
