public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber.org>
To: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Greg KH <gregkh@linuxfoundation.org>,
	KY Srinivasan <kys@microsoft.com>,
	"olaf@aepfle.de" <olaf@aepfle.de>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"bjorn.helgaas@gmail.com" <bjorn.helgaas@gmail.com>,
	"apw@canonical.com" <apw@canonical.com>,
	"devel@linuxdriverproject.org" <devel@linuxdriverproject.org>,
	"leann.ogasawara@canonical.com" <leann.ogasawara@canonical.com>,
	"jasowang@redhat.com" <jasowang@redhat.com>
Subject: Re: [PATCH 3/3] hv_netvsc: Implement VF matching based on serial numbers
Date: Fri, 9 Dec 2016 14:05:09 -0800	[thread overview]
Message-ID: <20161209140509.1dd7cad2@xeon-e3> (raw)
In-Reply-To: <BLUPR03MB141254EC05D3BAF06A64B928CA870@BLUPR03MB1412.namprd03.prod.outlook.com>

On Fri, 9 Dec 2016 21:53:49 +0000
Haiyang Zhang <haiyangz@microsoft.com> wrote:

> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Friday, December 9, 2016 4:45 PM
> > To: Haiyang Zhang <haiyangz@microsoft.com>
> > Cc: Greg KH <gregkh@linuxfoundation.org>; KY Srinivasan
> > <kys@microsoft.com>; olaf@aepfle.de; linux-kernel@vger.kernel.org;
> > bjorn.helgaas@gmail.com; apw@canonical.com; devel@linuxdriverproject.org;
> > leann.ogasawara@canonical.com; jasowang@redhat.com
> > Subject: Re: [PATCH 3/3] hv_netvsc: Implement VF matching based on
> > serial numbers
> > 
> > On Fri, 9 Dec 2016 21:31:25 +0000
> > Haiyang Zhang <haiyangz@microsoft.com> wrote:
> >   
> > > > -----Original Message-----
> > > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > > Sent: Friday, December 9, 2016 3:30 PM
> > > > To: Haiyang Zhang <haiyangz@microsoft.com>
> > > > Cc: Greg KH <gregkh@linuxfoundation.org>; KY Srinivasan
> > > > <kys@microsoft.com>; olaf@aepfle.de; linux-kernel@vger.kernel.org;
> > > > bjorn.helgaas@gmail.com; apw@canonical.com;  
> > devel@linuxdriverproject.org;  
> > > > leann.ogasawara@canonical.com; jasowang@redhat.com
> > > > Subject: Re: [PATCH 3/3] hv_netvsc: Implement VF matching based on
> > > > serial numbers
> > > >
> > > > On Fri, 9 Dec 2016 20:09:49 +0000
> > > > Haiyang Zhang <haiyangz@microsoft.com> wrote:
> > > >  
> > > > > > -----Original Message-----
> > > > > > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > > > > > Sent: Friday, December 9, 2016 1:21 PM
> > > > > > To: Greg KH <gregkh@linuxfoundation.org>
> > > > > > Cc: KY Srinivasan <kys@microsoft.com>; olaf@aepfle.de; Haiyang  
> > Zhang  
> > > > > > <haiyangz@microsoft.com>; linux-kernel@vger.kernel.org;
> > > > > > bjorn.helgaas@gmail.com; apw@canonical.com;  
> > > > devel@linuxdriverproject.org;  
> > > > > > leann.ogasawara@canonical.com; jasowang@redhat.com
> > > > > > Subject: Re: [PATCH 3/3] hv_netvsc: Implement VF matching based  
> > on  
> > > > > > serial numbers
> > > > > >
> > > > > > On Fri, 9 Dec 2016 08:31:22 +0100
> > > > > > Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >  
> > > > > > > On Fri, Dec 09, 2016 at 12:05:53AM +0000, KY Srinivasan wrote:  
> > > > > > > >
> > > > > > > >  
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > > > > > > > Sent: Thursday, December 8, 2016 7:56 AM
> > > > > > > > > To: KY Srinivasan <kys@microsoft.com>
> > > > > > > > > Cc: linux-kernel@vger.kernel.org;  
> > devel@linuxdriverproject.org;  
> > > > > > > > > olaf@aepfle.de; apw@canonical.com; vkuznets@redhat.com;
> > > > > > > > > jasowang@redhat.com; leann.ogasawara@canonical.com;
> > > > > > > > > bjorn.helgaas@gmail.com; Haiyang Zhang  
> > <haiyangz@microsoft.com>  
> > > > > > > > > Subject: Re: [PATCH 3/3] hv_netvsc: Implement VF matching  
> > > > based on  
> > > > > > serial  
> > > > > > > > > numbers
> > > > > > > > >
> > > > > > > > > On Thu, Dec 08, 2016 at 12:33:43AM -0800,  
> > > > > > kys@exchange.microsoft.com  
> > > > > > > > > wrote:  
> > > > > > > > > > From: Haiyang Zhang <haiyangz@microsoft.com>
> > > > > > > > > >
> > > > > > > > > > We currently use MAC address to match VF and synthetic  
> > NICs.  
> > > > > > Hyper-V  
> > > > > > > > > > provides a serial number to both devices for this  
> > purpose.  
> > > > This  
> > > > > > patch  
> > > > > > > > > > implements the matching based on VF serial numbers. This  
> > is  
> > > > the  
> > > > > > way  
> > > > > > > > > > specified by the protocol and more reliable.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> > > > > > > > > > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> > > > > > > > > > ---
> > > > > > > > > >  drivers/net/hyperv/netvsc_drv.c |   55  
> > > > > > > > > ++++++++++++++++++++++++++++++++++++---  
> > > > > > > > > >  1 files changed, 51 insertions(+), 4 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c  
> > > > > > > > > b/drivers/net/hyperv/netvsc_drv.c  
> > > > > > > > > > index 9522763..c5778cf 100644
> > > > > > > > > > --- a/drivers/net/hyperv/netvsc_drv.c
> > > > > > > > > > +++ b/drivers/net/hyperv/netvsc_drv.c
> > > > > > > > > > @@ -1165,9 +1165,10 @@ static void  
> > netvsc_free_netdev(struct  
> > > > > > > > > net_device *netdev)  
> > > > > > > > > >  	free_netdev(netdev);
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > -static struct net_device *get_netvsc_bymac(const u8  
> > *mac)  
> > > > > > > > > > +static struct net_device *get_netvsc_byvfser(u32 vfser)
> > > > > > > > > >  {
> > > > > > > > > >  	struct net_device *dev;
> > > > > > > > > > +	struct net_device_context *ndev_ctx;
> > > > > > > > > >
> > > > > > > > > >  	ASSERT_RTNL();
> > > > > > > > > >
> > > > > > > > > > @@ -1175,7 +1176,8 @@ static void  
> > netvsc_free_netdev(struct  
> > > > > > net_device  
> > > > > > > > > *netdev)  
> > > > > > > > > >  		if (dev->netdev_ops != &device_ops)
> > > > > > > > > >  			continue;	/* not a netvsc device */
> > > > > > > > > >
> > > > > > > > > > -		if (ether_addr_equal(mac, dev->perm_addr))
> > > > > > > > > > +		ndev_ctx = netdev_priv(dev);
> > > > > > > > > > +		if (ndev_ctx->vf_serial == vfser)
> > > > > > > > > >  			return dev;
> > > > > > > > > >  	}
> > > > > > > > > >
> > > > > > > > > > @@ -1205,21 +1207,66 @@ static void  
> > > > netvsc_free_netdev(struct  
> > > > > > > > > net_device *netdev)  
> > > > > > > > > >  	return NULL;
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > +static u32 netvsc_get_vfser(struct net_device  
> > *vf_netdev)  
> > > > > > > > > > +{
> > > > > > > > > > +	struct device *dev;
> > > > > > > > > > +	struct hv_device *hdev;
> > > > > > > > > > +	struct hv_pcibus_device *hbus = NULL;
> > > > > > > > > > +	struct list_head *iter;
> > > > > > > > > > +	struct hv_pci_dev *hpdev;
> > > > > > > > > > +	unsigned long flags;
> > > > > > > > > > +	u32 vfser = 0;
> > > > > > > > > > +	u32 count = 0;
> > > > > > > > > > +
> > > > > > > > > > +	for (dev = &vf_netdev->dev; dev; dev = dev->parent)  
> > {  
> > > > > > > > >
> > > > > > > > > You are going to walk the whole device tree backwards?  
> > That's  
> > > > > > crazy.  
> > > > > > > > > And foolish.  And racy and broken (what happens if the  
> > tree  
> > > > > > changes  
> > > > > > > > > while you do this?)  Where is the lock being grabbed while  
> > > > this  
> > > > > > happens?  
> > > > > > > > > What about reference counts?  Do you see other drivers  
> > ever  
> > > > doing  
> > > > > > this  
> > > > > > > > > (if you do, point them out and I'll go yell at them too...)  
> > > > > > > >
> > > > > > > > Greg,
> > > > > > > >
> > > > > > > > We are registering for netdev events. Coming into this  
> > function,  
> > > > the  
> > > > > > caller  
> > > > > > > > guarantees that the list of netdevs does not change - we  
> > assert  
> > > > this  
> > > > > > on entry:  
> > > > > > > > ASSERT_RTNL(). We are only walking up the device tree for  
> > the  
> > > > > > netdevs whose  
> > > > > > > > state change is being notified to us - the device tree being  
> > > > walked  
> > > > > > here is limited to  
> > > > > > > > netdevs under question.  
> > > > > > >
> > > > > > > But a netdev is a child of some type of "real" device, and you  
> > are  
> > > > now  
> > > > > > > walking the tree of all devices up to the "root" parent device,  
> > > > which  
> > > > > > > means you will hit PCI bridges, USB controllers, and all sorts  
> > of  
> > > > fun  
> > > > > > > things if you are a child of those types of devices.
> > > > > > >
> > > > > > > And can't you tell if the netdev for this event, really is  
> > "your"  
> > > > > > > netdev?  Or are you getting called this for "all" netdevs?  
> > Sorry,  
> > > > I  
> > > > > > > don't know this api, any pointers to it would be appreciated.
> > > > > > >  
> > > > > > > > We have a reference to the device and we know the device is  
> > not  
> > > > > > going away. Is it not  
> > > > > > > > safe to dereference the parent pointer - after all the child  
> > has  
> > > > > > taken a reference on  
> > > > > > > > the parent as part of  device_add() call.  
> > > > > > >
> > > > > > > It might be, and might not be.  There's a reason you don't see  
> > > > this  
> > > > > > > pattern anywhere in the kernel because of this...
> > > > > > >  
> > > > > > > > > > +		if (!dev_is_vmbus(dev))
> > > > > > > > > > +			continue;  
> > > > > > > > >
> > > > > > > > > Ick.
> > > > > > > > >
> > > > > > > > > Why isn't your parent pointer a vmbus device all the time?  
> > > > How  
> > > > > > could  
> > > > > > > > > you get burried down in the device hierarchy when you are  
> > the  
> > > > > > driver for  
> > > > > > > > > a specific bus type in the first place?  How could this  
> > > > function  
> > > > > > ever be  
> > > > > > > > > called for a device that is NOT of this type?  
> > > > > > > >
> > > > > > > > We get notified when state changes on any of the netdev  
> > devices  
> > > > in  
> > > > > > the system.  
> > > > > > > > Not all netdevs in the system belong to vmbus. Consider for  
> > > > instance  
> > > > > > the  
> > > > > > > > emulated NIC that can be configured. This is an emulated PCI  
> > NIC.  
> > > > We  
> > > > > > are only  
> > > > > > > > interested in netdevs that correspond to the VF instance  
> > that we  
> > > > are  
> > > > > > interested in.  
> > > > > > >
> > > > > > > Can you "know" this is your netdev by some other way than  
> > having  
> > > > to  
> > > > > > walk  
> > > > > > > the device tree?  Name?  local device type?  Something else?  
> > This  
> > > > > > seems  
> > > > > > > like an odd api in that everyone would have to do gyrations  
> > like  
> > > > this  
> > > > > > in  
> > > > > > > order to determine if the netdev is "theirs" or not...  
> > > > > >
> > > > > > The scenario is SR-IOV on Hyper-V. In the case of VF device, the  
> > > > host  
> > > > > > hands the
> > > > > > guest OS a PCI device for the virtual function device. The VF  
> > device  
> > > > is  
> > > > > > placed
> > > > > > on a special synthetic PCI bus (ie not part of the other buses  
> > on  
> > > > the  
> > > > > > system).
> > > > > > The VF device also has a synthetic network interface (netvsc)  
> > which  
> > > > > > lives
> > > > > > on VMBUS.  This code is about managing the interaction between  
> > the  
> > > > two.  
> > > > > >
> > > > > > The association between VF and synthetic NIC is done in response  
> > to  
> > > > the  
> > > > > > VF network device being registered. Initial version was based on  
> > MAC  
> > > > > > address
> > > > > > which is the same.  Later refinement used permanent MAC address  
> > to  
> > > > > > avoid bugs if MAC address changed.  This version is to use  
> > serial  
> > > > number  
> > > > > > instead which is safer than MAC address.
> > > > > >
> > > > > > The code to walk up/down maybe not be needed to find serial  
> > number.  
> > > > > > Perhaps a more direct single set of conditions is possible?
> > > > > >
> > > > > > Something like:
> > > > > >
> > > > > > In pci-hyperv.c
> > > > > >
> > > > > > u32 hv_pcifront_get_serial(struct pci_bus *bus, unsigned int  
> > devfn)  
> > > > > > {
> > > > > > 	struct hv_pcibus_device *hbus
> > > > > > 		= container_of(bus->sysdata,
> > > > > > 			       struct hv_pcibus_device, sysdata);
> > > > > > 	struct hf_pci_dev *hpdev;
> > > > > > 	u32 serial;
> > > > > >
> > > > > > 	hpdev = get_pcichild_wslot(hbus, devfn_to_wslot(pdev->devfn));
> > > > > > 	if (!hpdev)
> > > > > > 		return 0;
> > > > > >
> > > > > > 	serial = hpdev->devs.ser;
> > > > > > 	put_pcichild(hpdev, hv_pcidev_ref_by_slot);
> > > > > > 	return serial;
> > > > > > }
> > > > > >
> > > > > > In netvsc_drv.c
> > > > > >
> > > > > > static u32 netvsc_get_vfser(struct net_device *vf_netdev)
> > > > > > {
> > > > > > 	struct device *dev = vf_netdev->dev.parent;
> > > > > > 	struct pci_dev *pdev;
> > > > > > 	u32 wslot;
> > > > > >
> > > > > > 	if (!dev || !dev_is_pci(dev))
> > > > > > 		return 0;
> > > > > >
> > > > > > 	pdev = container_of(dev, struct pci_device, dev);
> > > > > >
> > > > > > 	return hv_pcifront_get_serial(pdev->bus, pdev->devfn);
> > > > > > }
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > P.S: it would be good to be able to get win_slot out through  
> > sysfs  
> > > > as  
> > > > > > well for systemd/udev.  
> > > > >
> > > > > Stephen,
> > > > >
> > > > > Thanks for suggestion. Actually, in my earlier implementation of  
> > this  
> > > > > feature (VF serial based matching), I thought about export a  
> > function  
> > > > > from vPCI driver, then calling it from netvsc. So I don't need to
> > > > > move structs between headers... But, it creates a dependency of  
> > netvsc  
> > > > > on vPCI driver's symbol. So, even if on a VM without SRIOV, we  
> > have to  
> > > > > load vPCI driver, which we don't want.
> > > > >
> > > > > Also, hv_vpci device is 3 parent layers above the vf_netdevice:
> > > > > Here is the VF drv hierarchy --
> > > > > Should we assume it's always 3 parents above vf_netdevice, or  
> > search  
> > > > for it?  
> > > > >
> > > > > [  368.185259] HZINFO:NETDEV_REGISTER:
> > > > > [  368.185261] HZINFO: dev:ffff88007c10d518, bus:          (null),  
> > > > busName:(null), drvName:(null)  
> > > > > [  368.185262] HZINFO: dev:ffff88007c10c0a0, bus:ffffffff81ce4b60,  
> > > > busName:pci, drvName:ixgbevf  
> > > > > [  368.185263] HZINFO: dev:ffff8800355c0000, bus:          (null),  
> > > > busName:(null), drvName:(null)  
> > > > > [  368.185264] HZINFO: dev:ffff8800355c5428, bus:ffffffffa0008160,  
> > > > busName:vmbus, drvName:hv_pci  
> > > > > [  368.185264] HZINFO: dev:ffff88007c49e268, bus:ffffffff81ce9800,  
> > > > busName:acpi, drvName:vmbus  
> > > > > [  368.185265] HZINFO: dev:ffff88007c48ea68, bus:ffffffff81ce9800,  
> > > > busName:acpi, drvName:(null)  
> > > > > [  368.185266] HZINFO: dev:ffff88007c48aa68, bus:ffffffff81ce9800,  
> > > > busName:acpi, drvName:(null)  
> > > > > [  368.185266] HZINFO: dev:ffff88007c48a268, bus:ffffffff81ce9800,  
> > > > busName:acpi, drvName:(null)  
> > > > > [  368.185267] HZINFO: dev:ffff88007c489a68, bus:ffffffff81ce9800,  
> > > > busName:acpi, drvName:(null)  
> > > > >
> > > > > Thanks,
> > > > > - Haiyang  
> > > >
> > > > Since this is a synthetic bus, the topology should not change unless
> > > > host side
> > > > software changes. The vf_netdev device has to be PCI device, so that  
> > is  
> > > > going to
> > > > be certain.  After that there maybe intermediate up to hv_pci. The  
> > code  
> > > > in hyperv-pci
> > > > already has similar stuff (ie for read_config).  
> > >
> > > Other netdevice, like emulated NIC can also trigger this notification.
> > > They are not vPCI.
> > >
> > > Thanks,
> > > - Haiyang  
> > 
> > Emulated NIC is already excluded in start of netvc notifier handler.
> > 
> > static int netvsc_netdev_event(struct notifier_block *this,
> > 			       unsigned long event, void *ptr)
> > {
> > 	struct net_device *event_dev = netdev_notifier_info_to_dev(ptr);
> > 
> > 	/* Skip our own events */
> > 	if (event_dev->netdev_ops == &device_ops)
> > 		return NOTIFY_DONE;
> >   
> 
> Emulated device is not based on netvsc. It's the native Linux (dec100M?)
> Driver. So this line doesn't exclude it. And how about other NIC type 
> may be added in the future?

Sorry, forgot about that haven't used emulated device in years.
The emulated device should appear to be on a PCI bus, but the serial
would not match??

  reply	other threads:[~2016-12-09 22:05 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-08  8:33 [PATCH 0/3] Drivers: hv: Implement VF association based on serial number kys
2016-12-08  8:33 ` [PATCH 1/3] hyperv: Move hv_pci_dev and related structs to hyperv.h kys
2016-12-08  8:33   ` [PATCH 2/3] hyperv: Add a function to detect if the device is a vmbus dev kys
2016-12-08  8:33   ` [PATCH 3/3] hv_netvsc: Implement VF matching based on serial numbers kys
2016-12-08 15:56     ` Greg KH
2016-12-09  0:05       ` KY Srinivasan
2016-12-09  7:31         ` Greg KH
2016-12-09 18:20           ` Stephen Hemminger
2016-12-09 20:09             ` Haiyang Zhang
2016-12-09 20:29               ` Stephen Hemminger
2016-12-09 21:31                 ` Haiyang Zhang
2016-12-09 21:45                   ` Stephen Hemminger
2016-12-09 21:53                     ` Haiyang Zhang
2016-12-09 22:05                       ` Stephen Hemminger [this message]
2016-12-09 22:35                         ` Haiyang Zhang
2016-12-10  0:21                           ` Stephen Hemminger
2016-12-10 12:20                             ` Greg KH
2016-12-14 23:18                               ` Haiyang Zhang
2016-12-14 23:27                                 ` Greg KH
2016-12-14 23:51                                   ` Stephen Hemminger
2016-12-16  1:11                                     ` KY Srinivasan
2016-12-16 15:20                                       ` Haiyang Zhang
2016-12-16 18:39                                         ` KY Srinivasan
2016-12-16 16:45                                     ` Greg KH
2016-12-08  9:49   ` [PATCH 1/3] hyperv: Move hv_pci_dev and related structs to hyperv.h kbuild test robot
2016-12-08 15:26     ` KY Srinivasan
2016-12-08 15:56       ` gregkh
2016-12-08 16:54         ` KY Srinivasan
2016-12-08 11:15   ` kbuild test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161209140509.1dd7cad2@xeon-e3 \
    --to=stephen@networkplumber.org \
    --cc=apw@canonical.com \
    --cc=bjorn.helgaas@gmail.com \
    --cc=devel@linuxdriverproject.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=haiyangz@microsoft.com \
    --cc=jasowang@redhat.com \
    --cc=kys@microsoft.com \
    --cc=leann.ogasawara@canonical.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=olaf@aepfle.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox