netdev.vger.kernel.org archive mirror
From: Stephen Hemminger <stephen@networkplumber.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Long Li <longli@microsoft.com>,
	"longli@linuxonhyperv.com" <longli@linuxonhyperv.com>,
	KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next v4] hv_netvsc: Mark VF as slave before exposing it to user-mode
Date: Wed, 15 Nov 2023 08:14:06 -0800
Message-ID: <20231115081406.1bd9a4ed@hermes.local>
In-Reply-To: <20231110120513.45ed505c@kernel.org>

On Fri, 10 Nov 2023 12:05:13 -0800
Jakub Kicinski <kuba@kernel.org> wrote:

> On Fri, 10 Nov 2023 00:43:55 +0000 Long Li wrote:
> > The code above needs to work with and without netvsc (the possible
> > master device) present.  
> 
> I don't think that's a reasonable requirement for the kernel code.
> 
> The auto-bonding already puts the kernel into business of guessing
> policy, which frankly we shouldn't be in.
> 
> Having the kernel guess even harder that there will be a master,
> but it's not there yet, is not reasonable.
> 

I wrote the netvsc automatic VF code almost six years ago.
So let me give a little history. The original support of VFs was
done by using a bonding device and a script. Haiyang worked hard
to get it to work, but it could not work on many distros and had
lots of races and problems.

Jakub is right that in an ideal world, this could all be managed by
userspace. But the management of network devices in Linux is a
dumpster fire! Every distro invents its own solution; last time
I counted there were six different tools claiming to be the
"one network device manager to rule them all". And that doesn't
include all the custom scripts and vendor appliances.

The users' requirements were:
 - VF networking should work out of the box
 - VF networking should require no userspace changes
 - It must work with legacy enterprise distros
 - The first network device must show up as eth0 and it must work.

The Linux ecosystem of userspace is fragmented, but the kernel is a
common base. It was much easier for Microsoft to tell partners to
"use these upstream kernel components" and it will work.
Windows and the BSD OSes have a tight binding between the kernel and
userspace management, therefore it is possible to handle things in userspace.

There are still problems (as Long indicated in the patch) because
the VF device does appear in the list of network devices. And
getting the transparent VF support to work in the face of all
the trash of userspace scripts is hard. Part of the problem is
that the state model of Linux network devices is fractured and
poorly documented.

The IFF_SLAVE flag is already used to indicate that a device is managed
by another driver. It keeps IPv6 from doing local address assignment
and existing userspace should be looking at it. The problem was
that userspace must not see a non-flagged VF device, or it will
get confused.
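
As an illustration of what "looking at it" could mean for a userspace
tool, here is a minimal sketch in Python. It assumes the interface flags
are read from the sysfs file /sys/class/net/<ifname>/flags, which holds
the flags as a hex string, and that IFF_SLAVE is bit 0x800 as defined in
<linux/if.h>. The function names is_slave() and slave_interfaces() are
made up for this example and are not taken from any existing tool.

```python
import os

# IFF_SLAVE as defined in <linux/if.h> (1 << 11).
IFF_SLAVE = 0x800


def is_slave(flags_text):
    """Return True if a sysfs 'flags' string (e.g. '0x1903') has IFF_SLAVE set."""
    return bool(int(flags_text.strip(), 16) & IFF_SLAVE)


def slave_interfaces(sysfs_root="/sys/class/net"):
    """List interfaces under sysfs_root whose flags include IFF_SLAVE.

    A tool that wants to leave managed devices alone could skip every
    name returned here.
    """
    result = []
    if not os.path.isdir(sysfs_root):
        return result
    for name in sorted(os.listdir(sysfs_root)):
        path = os.path.join(sysfs_root, name, "flags")
        try:
            with open(path) as f:
                if is_slave(f.read()):
                    result.append(name)
        except (OSError, ValueError):
            # Not a device directory, or the device went away while scanning.
            continue
    return result


if __name__ == "__main__":
    for ifname in slave_interfaces():
        print(ifname, "is flagged IFF_SLAVE; leave it to its master device")
```

Note that a check like this only helps if the flag is already set by the
time the device becomes visible to userspace, which is the ordering
problem described above.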

Microsoft should have exposed only one device in hardware.
Other vendors only expose the VF device and hairpin any
pre-processed packets. Part of the problem here is that
VF firmware needs to be updated (too often) and it is a requirement
that VMs do not lose connectivity.

Ideally, several things should happen:
   - Linux should support hiding devices managed by another device
   - the naming of device roles needs to not be master/slave.

   

