netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: Alexander Duyck <alexander.duyck@gmail.com>,
	"Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
	Jiri Pirko <jiri@resnulli.us>, KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	David Miller <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	Stephen Hemminger <sthemmin@microsoft.com>
Subject: Re: [PATCH net] failover: eliminate callback hell
Date: Thu, 7 Jun 2018 20:22:15 +0300	[thread overview]
Message-ID: <20180607201850-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20180607091742.461dff83@xeon-e3>

On Thu, Jun 07, 2018 at 09:17:42AM -0700, Stephen Hemminger wrote:
> On Thu, 7 Jun 2018 18:41:31 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > > > Why would DPDK care what we do in the kernel? Isn't it just slapping
> > > > vfio-pci on the netdevs it sees?  
> > > 
> > > Alex, you are correct for Intel devices; but DPDK on Azure is not Intel based.,.
> > > The DPDK support uses:
> > >  * Mellanox MLX5 which uses the Infinband hooks to do DMA directly to
> > >    userspace. This means VF netdev device must exist and be visible.
> > >  * Slow path using kernel netvsc device, TAP and BPF to get exception
> > >    path packets to userspace.
> > >  * A autodiscovery mechanism that to set all this up that relies on
> > >    2 device model and sysfs.  
> > 
> > Could you describe what does it look for exactly? What will break if
> > instead of MLX5 being a child of the PV, it's a child of the failover
> > device?
> 
> So in DPDK there is an internal four device model:
> 	1. failsafe is like failover in your model
> 	2. TAP is used like netvsc in kernel
> 	3. MLX5 is the VF
> 	4. vdev_netvsc is a pseudo device whose only reason to exist
> 	   is to glue everything together.
> 
> Digging deeper inside...
> 
> Vdev_netvsc does:
>    * driver is started in a convuluted way off device arguments
>    * probe routine for driver runs
>       - scans list of kernel interfaces in sysfs
>       - matches those using VMBUS 

Could you tell a bit more what does this step entail?

>       - skip netvsc devices that have an IPV4 route
>    * scan for PCI devices that have same MAC address as kernel netvsc
>      devices discovered in previous step
>    * add these interfaces to arguments to failsafe
> 
> Then failsafe configures based on arguments on device
> 
> The code works but is specific to the Azure hardware model, and exposes lots
> of things to application that it should not have to care about.
> 
> If you  try and walk through this code in DPDK, you will see why I have developed
> a dislike for high levels of indirection.
> 
> 
> 	   

Thanks that was helpful!  I'll try to poke at it next week.  Just from
the description it seems the kernel is merely used to locate the MAC
address through sysfs and that for this DPDK code to keep working the
hidden device must be hidden from it in sysfs - is that a fair summary?

-- 
MST

  reply	other threads:[~2018-06-07 17:22 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-05  3:42 [PATCH net] failover: eliminate callback hell Stephen Hemminger
2018-06-05 17:22 ` Samudrala, Sridhar
2018-06-05 17:45   ` Stephen Hemminger
2018-06-05 18:14     ` David Miller
2018-06-05 18:35 ` Michael S. Tsirkin
2018-06-05 18:53   ` Stephen Hemminger
2018-06-05 19:38     ` Michael S. Tsirkin
2018-06-05 21:52       ` Stephen Hemminger
2018-06-05 23:52         ` Samudrala, Sridhar
2018-06-06  3:51           ` Stephen Hemminger
2018-06-06  5:39             ` Samudrala, Sridhar
2018-06-06  6:00               ` Stephen Hemminger
2018-06-06  6:11                 ` Samudrala, Sridhar
2018-06-06 21:16                   ` Stephen Hemminger
2018-06-06 21:30                     ` Michael S. Tsirkin
2018-06-06 22:21                       ` Stephen Hemminger
2018-06-11 18:07                         ` Michael S. Tsirkin
2018-06-06 12:19             ` Michael S. Tsirkin
2018-06-06 21:17               ` Stephen Hemminger
2018-06-06  7:25 ` Jiri Pirko
2018-06-06 12:30   ` Michael S. Tsirkin
2018-06-06 21:24     ` Stephen Hemminger
2018-06-06 21:47       ` Michael S. Tsirkin
2018-06-06 22:24         ` Stephen Hemminger
2018-06-07 14:57           ` Michael S. Tsirkin
2018-06-07 15:23             ` Stephen Hemminger
2018-06-06 21:54       ` Samudrala, Sridhar
2018-06-06 22:25         ` Stephen Hemminger
2018-06-07 14:17           ` Alexander Duyck
2018-06-07 14:51             ` Stephen Hemminger
2018-06-07 15:41               ` Michael S. Tsirkin
2018-06-07 16:17                 ` Stephen Hemminger
2018-06-07 17:22                   ` Michael S. Tsirkin [this message]
2018-06-08 18:30                     ` Stephen Hemminger
2018-06-08 19:04                       ` Michael S. Tsirkin
2018-06-08 22:54         ` Siwei Liu
2018-06-11 15:17           ` Stephen Hemminger
2018-06-08 22:25       ` Siwei Liu
2018-06-08 23:18         ` Stephen Hemminger
2018-06-08 23:44           ` Siwei Liu
2018-06-09  0:02             ` Stephen Hemminger
2018-06-09  0:42               ` Siwei Liu
2018-06-11 15:22                 ` Stephen Hemminger
2018-06-11 19:23                   ` Siwei Liu
2018-06-11 14:01               ` Michael S. Tsirkin
2018-06-09  1:29             ` Jakub Kicinski
2018-06-11 18:56               ` Siwei Liu
2018-06-12  2:14                 ` Michael S. Tsirkin
2018-06-06 21:26   ` Stephen Hemminger
2018-06-11 18:10 ` Michael S. Tsirkin
2018-06-11 19:34   ` Samudrala, Sridhar
2018-06-12  0:08     ` Samudrala, Sridhar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180607201850-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=davem@davemloft.net \
    --cc=haiyangz@microsoft.com \
    --cc=jiri@resnulli.us \
    --cc=kys@microsoft.com \
    --cc=netdev@vger.kernel.org \
    --cc=sridhar.samudrala@intel.com \
    --cc=stephen@networkplumber.org \
    --cc=sthemmin@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).