From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: "Paweł Staszewski" <pstaszewski@itcare.pl>,
	"Linux Kernel Network Developers" <netdev@vger.kernel.org>,
	"Alexander Duyck" <alexander.duyck@gmail.com>,
	"Eric Dumazet" <eric.dumazet@gmail.com>,
	brouer@redhat.com
Subject: Re: Kernel 4.13.0-rc4-next-20170811 - IP Routing / Forwarding performance vs Core/RSS number / HT on
Date: Tue, 15 Aug 2017 11:35:42 +0200
Message-ID: <20170815113542.62175adf@redhat.com>
In-Reply-To: <1502729870.8411.63.camel@redhat.com>

On Mon, 14 Aug 2017 18:57:50 +0200
Paolo Abeni <pabeni@redhat.com> wrote:

> On Mon, 2017-08-14 at 18:19 +0200, Jesper Dangaard Brouer wrote:
> > The output (extracted below) didn't show who called 'do_raw_spin_lock',
> > BUT it showed another interesting thing.  The kernel code
> > __dev_queue_xmit() might create a route dst-cache problem for itself(?),
> > as it will first call skb_dst_force() and then skb_dst_drop() when the
> > packet is transmitted on a VLAN.
> > 
> >  static int __dev_queue_xmit(struct sk_buff *skb, void *accel_priv)
> >  {
> >  [...]
> > 	/* If device/qdisc don't need skb->dst, release it right now while
> > 	 * its hot in this cpu cache.
> > 	 */
> > 	if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
> > 		skb_dst_drop(skb);
> > 	else
> > 		skb_dst_force(skb);  
> 
> I think that the high impact of the above code in this specific test is
> mostly due to the following:
> 
> - ingress packets with different RSS rx hashes land on different CPUs
> - but they use the same dst entry, since the destination IPs belong to
> the same subnet
> - the dst refcnt cacheline is contended between all the CPUs

Good point and explanation, Paolo :-)
I changed my pktgen setup to be closer to Pawel's to provoke this
situation some more, and I can now partly reproduce it, although not as
clearly as Pawel does.
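
A rough userspace sketch of the contention Paolo describes, in case it
helps to reason about it.  This is not kernel code: the struct and
function names (shared_dst, percpu_ref, forward_packets, run_model), the
NCPUS value and the 64-byte cache line size are all assumptions made for
illustration.  Each thread stands in for one CPU forwarding packets, and
every packet takes and drops one reference, either on a single shared
counter (the shared dst case) or on a per-thread, cache-line-padded
counter.

```c
/* Userspace model only; every name below is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPUS     4	/* assumed number of forwarding CPUs */
#define CACHELINE 64	/* assumed cache line size in bytes */

/* One refcount shared by every "CPU", like a dst entry shared by all
 * flows routed towards the same subnet.
 */
struct shared_dst {
	atomic_long refcnt;
};

/* One refcount per "CPU", each padded to its own cache line. */
struct percpu_ref {
	_Alignas(CACHELINE) atomic_long refcnt;
};

struct worker_arg {
	atomic_long *ref;
	long packets;
};

/* Per packet: take a reference, then release it again. */
void *forward_packets(void *p)
{
	struct worker_arg *arg = p;
	long i;

	for (i = 0; i < arg->packets; i++) {
		atomic_fetch_add(arg->ref, 1);
		atomic_fetch_sub(arg->ref, 1);
	}
	return NULL;
}

/* Run NCPUS threads.  With shared != 0 they all hit one counter,
 * otherwise each thread uses its own padded counter.  Returns the sum
 * of the final counts (0 when every hold was released), or -1 if a
 * thread could not be started.
 */
long run_model(int shared, long packets)
{
	struct shared_dst dst;
	struct percpu_ref percpu[NCPUS];
	struct worker_arg args[NCPUS];
	pthread_t tid[NCPUS];
	int i, started = 0, failed = 0;
	long sum;

	assert(packets >= 0);
	atomic_init(&dst.refcnt, 0);
	for (i = 0; i < NCPUS; i++) {
		atomic_init(&percpu[i].refcnt, 0);
		args[i].ref = shared ? &dst.refcnt : &percpu[i].refcnt;
		args[i].packets = packets;
	}
	for (i = 0; i < NCPUS; i++) {
		if (pthread_create(&tid[i], NULL, forward_packets, &args[i])) {
			failed = 1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	if (failed)
		return -1;

	sum = atomic_load(&dst.refcnt);
	for (i = 0; i < NCPUS; i++)
		sum += atomic_load(&percpu[i].refcnt);
	return sum;
}
```

Calling run_model(1, n) and run_model(0, n) from a small main() and
timing each call (for example with clock_gettime()) would be one way to
compare the two layouts.  No numbers are claimed here; the sketch only
shows where the shared cache line comes from.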

A perf diff does show that the overhead in the VLAN case originates
from the routing "dst_release" code.  The diff baseline is the non-VLAN case.

[jbrouer@canyon ~]$ sudo ~/perf diff
# Event 'cycles'
#
# Baseline  Delta Abs  Shared Object     Symbol                                   
# ........  .........  ................  .........................................
#
     3.23%     +4.32%  [kernel.vmlinux]  [k] __dev_queue_xmit
               +3.43%  [kernel.vmlinux]  [k] dst_release
    13.54%     -3.17%  [kernel.vmlinux]  [k] fib_table_lookup
     9.33%     -2.73%  [kernel.vmlinux]  [k] _raw_spin_lock
     7.91%     -1.75%  [ixgbe]           [k] ixgbe_poll
               +1.64%  [8021q]           [k] vlan_dev_hard_start_xmit
     7.23%     -1.26%  [ixgbe]           [k] ixgbe_xmit_frame_ring
     3.34%     -1.10%  [kernel.vmlinux]  [k] eth_type_trans
     5.20%     +0.97%  [kernel.vmlinux]  [k] ip_route_input_rcu
     1.13%     +0.95%  [kernel.vmlinux]  [k] ip_rcv_finish
     2.49%     -0.82%  [kernel.vmlinux]  [k] ip_forward
     3.05%     -0.80%  [kernel.vmlinux]  [k] __build_skb
     0.44%     +0.74%  [kernel.vmlinux]  [k] __netif_receive_skb
               +0.71%  [kernel.vmlinux]  [k] neigh_connected_output
     1.70%     +0.68%  [kernel.vmlinux]  [k] validate_xmit_skb
     1.42%     +0.67%  [kernel.vmlinux]  [k] dev_hard_start_xmit
     0.49%     +0.66%  [kernel.vmlinux]  [k] netif_receive_skb_internal
               +0.62%  [kernel.vmlinux]  [k] eth_header
               +0.57%  [ixgbe]           [k] ixgbe_tx_ctxtdesc
     1.19%     -0.55%  [kernel.vmlinux]  [k] __netdev_pick_tx
     2.54%     -0.48%  [kernel.vmlinux]  [k] fib_validate_source
     2.83%     +0.46%  [kernel.vmlinux]  [k] ip_finish_output2
     1.45%     +0.45%  [kernel.vmlinux]  [k] netif_skb_features
     1.66%     -0.45%  [kernel.vmlinux]  [k] napi_gro_receive
     0.90%     -0.40%  [kernel.vmlinux]  [k] validate_xmit_skb_list
     1.45%     -0.39%  [kernel.vmlinux]  [k] ip_finish_output
               +0.36%  [8021q]           [k] vlan_passthru_hard_header
     1.28%     -0.33%  [kernel.vmlinux]  [k] netdev_pick_tx
 

> Perhaps we can improve the situation by setting the IFF_XMIT_DST_RELEASE
> flag for vlan if the underlying device does not have a (relevant)
> classifier attached? (and clearing it as needed)
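
To make the proposal above concrete, here is a small userspace model of
it.  It is only a sketch under assumptions: struct model_dev, the
needs_dst field, MODEL_XMIT_DST_RELEASE (with an arbitrary value, not
the kernel's) and both helpers are hypothetical names, and real code
would also have to hook into whatever events attach or detach a
classifier.

```c
/* Userspace model only; names and the flag value are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_XMIT_DST_RELEASE 0x1u

struct model_dev {
	unsigned int priv_flags;
	bool needs_dst;			/* e.g. a classifier reading skb->dst */
	struct model_dev *lower;	/* underlying device, NULL if none */
};

/* Recompute the VLAN device's flag from its own state and from the
 * state of the device underneath it: set the flag when nobody needs
 * the dst, clear it otherwise.
 */
void model_vlan_update_dst_release(struct model_dev *vlan)
{
	assert(vlan != NULL && vlan->lower != NULL);

	if (vlan->needs_dst || vlan->lower->needs_dst)
		vlan->priv_flags &= ~MODEL_XMIT_DST_RELEASE;
	else
		vlan->priv_flags |= MODEL_XMIT_DST_RELEASE;
}

/* Mirrors the branch quoted from __dev_queue_xmit(): true means the
 * dst would be dropped early instead of being forced (refcounted).
 */
bool model_xmit_drops_dst(const struct model_dev *dev)
{
	return (dev->priv_flags & MODEL_XMIT_DST_RELEASE) != 0;
}
```

In this model the update helper would be called again whenever needs_dst
changes on either device, which corresponds to the "clearing it as
needed" part of the suggestion.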

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
