From: Flavio Leitner <fbl@sysclose.org>
To: Mark Gray <mark.d.gray@redhat.com>
Cc: dev@openvswitch.org, netdev@vger.kernel.org,
	dan.carpenter@oracle.com, pravin.ovn@gmail.com
Subject: Re: [PATCH net-next] openvswitch: Introduce per-cpu upcall dispatch
Date: Mon, 5 Jul 2021 20:54:17 -0300
Message-ID: <YOObqVRRRHUNmA9o@p50>
In-Reply-To: <20210630095350.817785-1-mark.d.gray@redhat.com>

On Wed, Jun 30, 2021 at 05:53:49AM -0400, Mark Gray wrote:
> The Open vSwitch kernel module uses the upcall mechanism to send
> packets from kernel space to user space when it misses in the kernel
> space flow table. The upcall sends packets via a Netlink socket.
> Currently, a Netlink socket is created for every vport. In this way,
> there is a 1:1 mapping between a vport and a Netlink socket.
> When a packet is received by a vport, if it needs to be sent to
> user space, it is sent via the corresponding Netlink socket.
> 
> This mechanism, with various iterations of the corresponding user
> space code, has seen some limitations and issues:
> 
> * On systems with a large number of vports, there is a correspondingly
> large number of Netlink sockets which can limit scaling.
> (https://bugzilla.redhat.com/show_bug.cgi?id=1526306)
> * Packet reordering on upcalls.
> (https://bugzilla.redhat.com/show_bug.cgi?id=1844576)
> * A thundering herd issue.
> (https://bugzilla.redhat.com/show_bug.cgi?id=1834444)
> 
> This patch introduces an alternative, feature-negotiated, upcall
> mode using a per-cpu dispatch rather than a per-vport dispatch.
> 
> In this mode, the Netlink socket to be used for the upcall is
> selected based on the CPU of the thread that is executing the upcall.
> In this way, it resolves the issues above as:
> 
> a) The number of Netlink sockets scales with the number of CPUs
> rather than the number of vports.
> b) Ordering per-flow is maintained as packets are distributed to
> CPUs based on mechanisms such as RSS and flows are distributed
> to a single user space thread.
> c) Packets from a flow can only wake up one user space thread.
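
For readers following the thread, the selection rule described above can
be sketched in plain user-space C. This is only an illustration, not the
code from the patch: the struct, the function name, the "0 means no
socket" convention, and the modulo fallback for CPU ids beyond the table
are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-datapath table: one Netlink port ID per handler
 * socket, indexed by CPU id. Names are illustrative only. */
struct upcall_pid_table {
    size_t n_pids;          /* number of sockets registered by user space */
    const uint32_t *pids;   /* pids[i] is the socket intended for CPU i */
};

/* Select the Netlink port ID from the CPU that is executing the upcall.
 * Returns 0 when no sockets are registered (assumed to mean "drop").
 * CPU ids beyond the table wrap around with a modulo; that fallback is
 * an assumption of this sketch. */
uint32_t upcall_pid_for_cpu(const struct upcall_pid_table *tbl,
                            unsigned int cpu)
{
    assert(tbl != NULL);

    if (tbl->n_pids == 0)
        return 0;

    return tbl->pids[cpu % tbl->n_pids];
}
```

Because the lookup depends only on the CPU id, the table grows with the
number of CPUs rather than the number of vports, and packets of a flow
that RSS steers to one CPU always resolve to the same socket.
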
> 
> The corresponding user space code can be found at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2021-April/382618.html
> 
> Bugzilla: https://bugzilla.redhat.com/1844576
> Signed-off-by: Mark Gray <mark.d.gray@redhat.com>
> ---

It looks good and works for me.
Acked-by: Flavio Leitner <fbl@sysclose.org>

Thanks Mark!
fbl

Thread overview: 9+ messages
2021-06-30  9:53 [PATCH net-next] openvswitch: Introduce per-cpu upcall dispatch Mark Gray
2021-06-30 16:29 ` kernel test robot
2021-06-30 16:29   ` kernel test robot
2021-07-01 20:59 ` Flavio Leitner
2021-07-05 23:54 ` Flavio Leitner [this message]
2021-07-08 14:40 ` Flavio Leitner
2021-07-12 18:07   ` [ovs-dev] " Flavio Leitner
2021-07-15  4:45 ` Pravin Shelar
2021-07-15 11:55   ` Mark Gray
