* [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing
@ 2019-12-30 20:13 Haiyang Zhang
2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw)
To: sashal, linux-hyperv, netdev
Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel
Add dev_num for vmbus device based on channel offer sequence.
Use this number for nic naming, and enable async probing.
Haiyang Zhang (3):
Drivers: hv: vmbus: Add a dev_num variable based on channel offer
sequence
Drivers: hv: vmbus: Add dev_num to sysfs
hv_netvsc: Name NICs based on vmbus offer sequence and use async probe
Documentation/ABI/stable/sysfs-bus-vmbus | 8 +++++++
drivers/hv/channel_mgmt.c | 38 ++++++++++++++++++++++++++++++--
drivers/hv/vmbus_drv.c | 13 +++++++++++
drivers/net/hyperv/netvsc_drv.c | 18 ++++++++++++---
include/linux/hyperv.h | 6 +++++
5 files changed, 78 insertions(+), 5 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 14+ messages in thread* [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence 2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang @ 2019-12-30 20:13 ` Haiyang Zhang 2019-12-31 1:34 ` Michael Kelley 2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang 2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang 2 siblings, 1 reply; 14+ messages in thread From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw) To: sashal, linux-hyperv, netdev Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel This number is set to the first available number, starting from zero, when a vmbus device’s primary channel is offered. It will be used for stable naming when Async probing is used. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> --- Changes V2: Use nest loops in hv_set_devnum, instead of goto. drivers/hv/channel_mgmt.c | 38 ++++++++++++++++++++++++++++++++++++-- include/linux/hyperv.h | 6 ++++++ 2 files changed, 42 insertions(+), 2 deletions(-) diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 8eb1675..00fa2db 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -315,6 +315,8 @@ static struct vmbus_channel *alloc_channel(void) if (!channel) return NULL; + channel->dev_num = HV_DEV_NUM_INVALID; + spin_lock_init(&channel->lock); init_completion(&channel->rescind_event); @@ -541,6 +543,36 @@ static void vmbus_add_channel_work(struct work_struct *work) } /* + * Get the first available device number of its type, then + * record it in the channel structure. + */ +static void hv_set_devnum(struct vmbus_channel *newchannel) +{ + struct vmbus_channel *channel; + int i = -1; + bool found; + + BUG_ON(!mutex_is_locked(&vmbus_connection.channel_mutex)); + + do { + i++; + found = false; + + list_for_each_entry(channel, &vmbus_connection.chn_list, + listentry) { + if (i == channel->dev_num && + guid_equal(&channel->offermsg.offer.if_type, + &newchannel->offermsg.offer.if_type)) { + found = true; + break; + } + } + } while (found); + + newchannel->dev_num = i; +} + +/* * vmbus_process_offer - Process the offer by creating a channel/device * associated with this offer */ @@ -573,10 +605,12 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) } } - if (fnew) + if (fnew) { + hv_set_devnum(newchannel); + list_add_tail(&newchannel->listentry, &vmbus_connection.chn_list); - else { + } else { /* * Check to see if this is a valid sub-channel. */ diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index 26f3aee..4f110c5 100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -718,6 +718,8 @@ struct vmbus_device { bool perf_device; }; +#define HV_DEV_NUM_INVALID (-1) + struct vmbus_channel { struct list_head listentry; @@ -849,6 +851,10 @@ struct vmbus_channel { */ struct vmbus_channel *primary_channel; /* + * Used for device naming based on channel offer sequence. + */ + int dev_num; + /* * Support per-channel state for use by vmbus drivers. */ void *per_channel_state; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence 2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang @ 2019-12-31 1:34 ` Michael Kelley 2019-12-31 15:48 ` Haiyang Zhang 0 siblings, 1 reply; 14+ messages in thread From: Michael Kelley @ 2019-12-31 1:34 UTC (permalink / raw) To: Haiyang Zhang, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, 2019 12:14 PM > > This number is set to the first available number, starting from zero, > when a vmbus device's primary channel is offered. Let's use "VMBus" as the capitalization in text. > It will be used for stable naming when Async probing is used. > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > --- > Changes > V2: > Use nest loops in hv_set_devnum, instead of goto. > > drivers/hv/channel_mgmt.c | 38 ++++++++++++++++++++++++++++++++++++-- > include/linux/hyperv.h | 6 ++++++ > 2 files changed, 42 insertions(+), 2 deletions(-) > > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c > index 8eb1675..00fa2db 100644 > --- a/drivers/hv/channel_mgmt.c > +++ b/drivers/hv/channel_mgmt.c > @@ -315,6 +315,8 @@ static struct vmbus_channel *alloc_channel(void) > if (!channel) > return NULL; > > + channel->dev_num = HV_DEV_NUM_INVALID; > + > spin_lock_init(&channel->lock); > init_completion(&channel->rescind_event); > > @@ -541,6 +543,36 @@ static void vmbus_add_channel_work(struct work_struct *work) > } > > /* > + * Get the first available device number of its type, then > + * record it in the channel structure. > + */ > +static void hv_set_devnum(struct vmbus_channel *newchannel) > +{ > + struct vmbus_channel *channel; > + int i = -1; > + bool found; > + > + BUG_ON(!mutex_is_locked(&vmbus_connection.channel_mutex)); > + > + do { > + i++; > + found = false; > + > + list_for_each_entry(channel, &vmbus_connection.chn_list, > + listentry) { > + if (i == channel->dev_num && > + guid_equal(&channel->offermsg.offer.if_type, > + &newchannel->offermsg.offer.if_type)) { > + found = true; > + break; > + } > + } > + } while (found); > + > + newchannel->dev_num = i; > +} > + It took me a little while to figure out what the above algorithm is doing. Perhaps it would help to rename the "found" variable to "in_use", and add this comment before the start of the "do" loop: Iterate through each possible device number starting at zero. If the device number is already in use for a device of this type, try the next device number until finding one that is not in use. This approach selects the smallest device number that is not in use, and so reuses any numbers that are freed by devices that have been removed. > +/* > * vmbus_process_offer - Process the offer by creating a channel/device > * associated with this offer > */ > @@ -573,10 +605,12 @@ static void vmbus_process_offer(struct vmbus_channel > *newchannel) > } > } > > - if (fnew) > + if (fnew) { > + hv_set_devnum(newchannel); > + > list_add_tail(&newchannel->listentry, > &vmbus_connection.chn_list); > - else { > + } else { > /* > * Check to see if this is a valid sub-channel. > */ > diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h > index 26f3aee..4f110c5 100644 > --- a/include/linux/hyperv.h > +++ b/include/linux/hyperv.h > @@ -718,6 +718,8 @@ struct vmbus_device { > bool perf_device; > }; > > +#define HV_DEV_NUM_INVALID (-1) > + > struct vmbus_channel { > struct list_head listentry; > > @@ -849,6 +851,10 @@ struct vmbus_channel { > */ > struct vmbus_channel *primary_channel; > /* > + * Used for device naming based on channel offer sequence. > + */ > + int dev_num; > + /* > * Support per-channel state for use by vmbus drivers. > */ > void *per_channel_state; > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence 2019-12-31 1:34 ` Michael Kelley @ 2019-12-31 15:48 ` Haiyang Zhang 0 siblings, 0 replies; 14+ messages in thread From: Haiyang Zhang @ 2019-12-31 15:48 UTC (permalink / raw) To: Michael Kelley, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org Cc: KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org > -----Original Message----- > From: Michael Kelley <mikelley@microsoft.com> > Sent: Monday, December 30, 2019 8:35 PM > To: Haiyang Zhang <haiyangz@microsoft.com>; sashal@kernel.org; linux- > hyperv@vger.kernel.org; netdev@vger.kernel.org > Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; > olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net; > linux-kernel@vger.kernel.org > Subject: RE: [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num > variable based on channel offer sequence > > From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, > 2019 12:14 PM > > > > This number is set to the first available number, starting from zero, > > when a vmbus device's primary channel is offered. > > Let's use "VMBus" as the capitalization in text. Sure I will. > > > It will be used for stable naming when Async probing is used. > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > --- > > Changes > > V2: > > Use nest loops in hv_set_devnum, instead of goto. > > > > drivers/hv/channel_mgmt.c | 38 > ++++++++++++++++++++++++++++++++++++-- > > include/linux/hyperv.h | 6 ++++++ > > 2 files changed, 42 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c > > index 8eb1675..00fa2db 100644 > > --- a/drivers/hv/channel_mgmt.c > > +++ b/drivers/hv/channel_mgmt.c > > @@ -315,6 +315,8 @@ static struct vmbus_channel *alloc_channel(void) > > if (!channel) > > return NULL; > > > > + channel->dev_num = HV_DEV_NUM_INVALID; > > + > > spin_lock_init(&channel->lock); > > init_completion(&channel->rescind_event); > > > > @@ -541,6 +543,36 @@ static void vmbus_add_channel_work(struct > work_struct *work) > > } > > > > /* > > + * Get the first available device number of its type, then > > + * record it in the channel structure. > > + */ > > +static void hv_set_devnum(struct vmbus_channel *newchannel) > > +{ > > + struct vmbus_channel *channel; > > + int i = -1; > > + bool found; > > + > > + BUG_ON(!mutex_is_locked(&vmbus_connection.channel_mutex)); > > + > > + do { > > + i++; > > + found = false; > > + > > + list_for_each_entry(channel, &vmbus_connection.chn_list, > > + listentry) { > > + if (i == channel->dev_num && > > + guid_equal(&channel->offermsg.offer.if_type, > > + &newchannel->offermsg.offer.if_type)) { > > + found = true; > > + break; > > + } > > + } > > + } while (found); > > + > > + newchannel->dev_num = i; > > +} > > + > > It took me a little while to figure out what the above algorithm is doing. > Perhaps it would help to rename the "found" variable to "in_use", and add > this comment before the start of the "do" loop: > > Iterate through each possible device number starting at zero. If the device > number is already in use for a device of this type, try the next device number > until finding one that is not in use. This approach selects the smallest > device number that is not in use, and so reuses any numbers that are freed > by devices that have been removed. I will rename the variable, and add the code comments. Thanks, - Haiyang ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs 2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang 2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang @ 2019-12-30 20:13 ` Haiyang Zhang 2019-12-31 1:34 ` Michael Kelley 2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang 2 siblings, 1 reply; 14+ messages in thread From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw) To: sashal, linux-hyperv, netdev Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel It's a number based on the vmbus device offer sequence. Useful for device naming when using Async probing. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> --- Documentation/ABI/stable/sysfs-bus-vmbus | 8 ++++++++ drivers/hv/vmbus_drv.c | 13 +++++++++++++ 2 files changed, 21 insertions(+) diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus index 8e8d167..a42225d 100644 --- a/Documentation/ABI/stable/sysfs-bus-vmbus +++ b/Documentation/ABI/stable/sysfs-bus-vmbus @@ -49,6 +49,14 @@ Contact: Stephen Hemminger <sthemmin@microsoft.com> Description: This NUMA node to which the VMBUS device is attached, or -1 if the node is unknown. +What: /sys/bus/vmbus/devices/<UUID>/dev_num +Date: Dec 2019 +KernelVersion: 5.5 +Contact: Haiyang Zhang <haiyangz@microsoft.com> +Description: A number based on the vmbus device offer sequence. + Useful for device naming when using Async probing. +Users: Debugging tools and userspace drivers + What: /sys/bus/vmbus/devices/<UUID>/channels/<N> Date: September. 2017 KernelVersion: 4.14 diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 4ef5a66..fe7aefa 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -214,6 +214,18 @@ static ssize_t numa_node_show(struct device *dev, static DEVICE_ATTR_RO(numa_node); #endif +static ssize_t dev_num_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct hv_device *hv_dev = device_to_hv_device(dev); + + if (!hv_dev->channel) + return -ENODEV; + + return sprintf(buf, "%d\n", hv_dev->channel->dev_num); +} +static DEVICE_ATTR_RO(dev_num); + static ssize_t server_monitor_pending_show(struct device *dev, struct device_attribute *dev_attr, char *buf) @@ -598,6 +610,7 @@ static ssize_t driver_override_show(struct device *dev, #ifdef CONFIG_NUMA &dev_attr_numa_node.attr, #endif + &dev_attr_dev_num.attr, &dev_attr_server_monitor_pending.attr, &dev_attr_client_monitor_pending.attr, &dev_attr_server_monitor_latency.attr, -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs 2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang @ 2019-12-31 1:34 ` Michael Kelley 2019-12-31 15:50 ` Haiyang Zhang 0 siblings, 1 reply; 14+ messages in thread From: Michael Kelley @ 2019-12-31 1:34 UTC (permalink / raw) To: Haiyang Zhang, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, 2019 12:14 PM > > It's a number based on the vmbus device offer sequence. Let's use "VMBus" as the capitalization. > Useful for device naming when using Async probing. > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > --- > Documentation/ABI/stable/sysfs-bus-vmbus | 8 ++++++++ > drivers/hv/vmbus_drv.c | 13 +++++++++++++ > 2 files changed, 21 insertions(+) > > diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus > b/Documentation/ABI/stable/sysfs-bus-vmbus > index 8e8d167..a42225d 100644 > --- a/Documentation/ABI/stable/sysfs-bus-vmbus > +++ b/Documentation/ABI/stable/sysfs-bus-vmbus > @@ -49,6 +49,14 @@ Contact: Stephen Hemminger <sthemmin@microsoft.com> > Description: This NUMA node to which the VMBUS device is > attached, or -1 if the node is unknown. > > +What: /sys/bus/vmbus/devices/<UUID>/dev_num > +Date: Dec 2019 > +KernelVersion: 5.5 > +Contact: Haiyang Zhang <haiyangz@microsoft.com> > +Description: A number based on the vmbus device offer sequence. The capitalization of "VMBus" is already inconsistent in this file in that we have "VMBus" and "VMBUS" in the text. Let's not introduce another capitalization -- use "VMBus". > + Useful for device naming when using Async probing. > +Users: Debugging tools and userspace drivers > + > What: /sys/bus/vmbus/devices/<UUID>/channels/<N> > Date: September. 2017 > KernelVersion: 4.14 > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c > index 4ef5a66..fe7aefa 100644 > --- a/drivers/hv/vmbus_drv.c > +++ b/drivers/hv/vmbus_drv.c > @@ -214,6 +214,18 @@ static ssize_t numa_node_show(struct device *dev, > static DEVICE_ATTR_RO(numa_node); > #endif > > +static ssize_t dev_num_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct hv_device *hv_dev = device_to_hv_device(dev); > + > + if (!hv_dev->channel) > + return -ENODEV; > + > + return sprintf(buf, "%d\n", hv_dev->channel->dev_num); > +} > +static DEVICE_ATTR_RO(dev_num); > + > static ssize_t server_monitor_pending_show(struct device *dev, > struct device_attribute *dev_attr, > char *buf) > @@ -598,6 +610,7 @@ static ssize_t driver_override_show(struct device *dev, > #ifdef CONFIG_NUMA > &dev_attr_numa_node.attr, > #endif > + &dev_attr_dev_num.attr, > &dev_attr_server_monitor_pending.attr, > &dev_attr_client_monitor_pending.attr, > &dev_attr_server_monitor_latency.attr, > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs 2019-12-31 1:34 ` Michael Kelley @ 2019-12-31 15:50 ` Haiyang Zhang 0 siblings, 0 replies; 14+ messages in thread From: Haiyang Zhang @ 2019-12-31 15:50 UTC (permalink / raw) To: Michael Kelley, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org Cc: KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org > -----Original Message----- > From: Michael Kelley <mikelley@microsoft.com> > Sent: Monday, December 30, 2019 8:35 PM > To: Haiyang Zhang <haiyangz@microsoft.com>; sashal@kernel.org; linux- > hyperv@vger.kernel.org; netdev@vger.kernel.org > Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; > olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net; > linux-kernel@vger.kernel.org > Subject: RE: [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs > > From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, > 2019 12:14 PM > > > > It's a number based on the vmbus device offer sequence. > > Let's use "VMBus" as the capitalization. I will. > > > Useful for device naming when using Async probing. > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > --- > > Documentation/ABI/stable/sysfs-bus-vmbus | 8 ++++++++ > > drivers/hv/vmbus_drv.c | 13 +++++++++++++ > > 2 files changed, 21 insertions(+) > > > > diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus > > b/Documentation/ABI/stable/sysfs-bus-vmbus > > index 8e8d167..a42225d 100644 > > --- a/Documentation/ABI/stable/sysfs-bus-vmbus > > +++ b/Documentation/ABI/stable/sysfs-bus-vmbus > > @@ -49,6 +49,14 @@ Contact: Stephen Hemminger > <sthemmin@microsoft.com> > > Description: This NUMA node to which the VMBUS device is > > attached, or -1 if the node is unknown. > > > > +What: /sys/bus/vmbus/devices/<UUID>/dev_num > > +Date: Dec 2019 > > +KernelVersion: 5.5 > > +Contact: Haiyang Zhang <haiyangz@microsoft.com> > > +Description: A number based on the vmbus device offer sequence. > > The capitalization of "VMBus" is already inconsistent in this file in that > we have "VMBus" and "VMBUS" in the text. Let's not introduce another > capitalization -- use "VMBus". I will use "VMBus". Thanks, - Haiyang ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe 2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang 2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang 2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang @ 2019-12-30 20:13 ` Haiyang Zhang 2019-12-31 1:35 ` Michael Kelley 2019-12-31 11:34 ` Roman Kagan 2 siblings, 2 replies; 14+ messages in thread From: Haiyang Zhang @ 2019-12-30 20:13 UTC (permalink / raw) To: sashal, linux-hyperv, netdev Cc: haiyangz, kys, sthemmin, olaf, vkuznets, davem, linux-kernel The dev_num field in vmbus channel structure is assigned to the first available number when the channel is offered. So netvsc driver uses it for NIC naming based on channel offer sequence. Now re-enable the async probing mode for faster probing. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> --- drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, struct net_device_context *net_device_ctx; struct netvsc_device_info *device_info = NULL; struct netvsc_device *nvdev; + char name[IFNAMSIZ]; int ret = -ENOMEM; - net = alloc_etherdev_mq(sizeof(struct net_device_context), - VRSS_CHANNEL_MAX); + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, + NET_NAME_ENUM, ether_setup, + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); + if (!net) goto no_net; @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, net->max_mtu = ETH_DATA_LEN; ret = register_netdevice(net); + + if (ret == -EEXIST) { + pr_info("NIC name %s exists, request another name.\n", + net->name); + strlcpy(net->name, "eth%d", IFNAMSIZ); + ret = register_netdevice(net); + } + if (ret != 0) { pr_err("Unable to register netdev.\n"); goto register_failed; @@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev) .suspend = netvsc_suspend, .resume = netvsc_resume, .driver = { - .probe_type = PROBE_FORCE_SYNCHRONOUS, + .probe_type = PROBE_PREFER_ASYNCHRONOUS, }, }; -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe 2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang @ 2019-12-31 1:35 ` Michael Kelley 2019-12-31 16:01 ` Haiyang Zhang 2019-12-31 11:34 ` Roman Kagan 1 sibling, 1 reply; 14+ messages in thread From: Michael Kelley @ 2019-12-31 1:35 UTC (permalink / raw) To: Haiyang Zhang, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org Cc: Haiyang Zhang, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, 2019 12:14 PM > > The dev_num field in vmbus channel structure is assigned to the first Let's use "VMBus" in text. > available number when the channel is offered. So netvsc driver uses it > for NIC naming based on channel offer sequence. Now re-enable the async > probing mode for faster probing. > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > --- > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c > index f3f9eb8..39c412f 100644 > --- a/drivers/net/hyperv/netvsc_drv.c > +++ b/drivers/net/hyperv/netvsc_drv.c > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > struct net_device_context *net_device_ctx; > struct netvsc_device_info *device_info = NULL; > struct netvsc_device *nvdev; > + char name[IFNAMSIZ]; > int ret = -ENOMEM; > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > - VRSS_CHANNEL_MAX); > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > + NET_NAME_ENUM, ether_setup, > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > + > if (!net) > goto no_net; > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > net->max_mtu = ETH_DATA_LEN; > > ret = register_netdevice(net); > + > + if (ret == -EEXIST) { > + pr_info("NIC name %s exists, request another name.\n", > + net->name); > + strlcpy(net->name, "eth%d", IFNAMSIZ); > + ret = register_netdevice(net); The message above could be confusing. "request another name" sounds like a directive to the user/sysadmin, which I don't think it is. Perhaps better would be "requesting another name", which says the system is handling the collision automatically. Also will this second call to register_netdevice() actually assign a numeric name? If so, it would be really nice for the message to be output *after* the second call to register_netdevice() and to show both the originally selected name that collided as well as the new name. We sometimes get into NIC naming issues with customers in Azure, and when a new name has to be selected it will help debugging if we can show both the original selection and the new selection. With both pieces of data, there's less guessing about who did what regarding NIC naming. Finally, having to choose a different name because of a collision does *not* update the channel->dev_num value. Subsequent calls to hv_set_devnum() will detect "in use" based on the originally assigned dev_num value. I don't think that fundamentally breaks anything, but I wondered if you had thought about that case. Maybe a comment here to that effect would help a future reader of this code. Michael > + } > + > if (ret != 0) { > pr_err("Unable to register netdev.\n"); > goto register_failed; > @@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev) > .suspend = netvsc_suspend, > .resume = netvsc_resume, > .driver = { > - .probe_type = PROBE_FORCE_SYNCHRONOUS, > + .probe_type = PROBE_PREFER_ASYNCHRONOUS, > }, > }; > > -- > 1.8.3.1 ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe 2019-12-31 1:35 ` Michael Kelley @ 2019-12-31 16:01 ` Haiyang Zhang 0 siblings, 0 replies; 14+ messages in thread From: Haiyang Zhang @ 2019-12-31 16:01 UTC (permalink / raw) To: Michael Kelley, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org Cc: KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org > -----Original Message----- > From: Michael Kelley <mikelley@microsoft.com> > Sent: Monday, December 30, 2019 8:35 PM > To: Haiyang Zhang <haiyangz@microsoft.com>; sashal@kernel.org; linux- > hyperv@vger.kernel.org; netdev@vger.kernel.org > Cc: Haiyang Zhang <haiyangz@microsoft.com>; KY Srinivasan > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; > olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net; > linux-kernel@vger.kernel.org > Subject: RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus > offer sequence and use async probe > > From: Haiyang Zhang <haiyangz@microsoft.com> Sent: Monday, December 30, > 2019 12:14 PM > > > > The dev_num field in vmbus channel structure is assigned to the first > > Let's use "VMBus" in text. I will. > > > available number when the channel is offered. So netvsc driver uses it > > for NIC naming based on channel offer sequence. Now re-enable the async > > probing mode for faster probing. > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > --- > > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c > > index f3f9eb8..39c412f 100644 > > --- a/drivers/net/hyperv/netvsc_drv.c > > +++ b/drivers/net/hyperv/netvsc_drv.c > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > > struct net_device_context *net_device_ctx; > > struct netvsc_device_info *device_info = NULL; > > struct netvsc_device *nvdev; > > + char name[IFNAMSIZ]; > > int ret = -ENOMEM; > > > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > > - VRSS_CHANNEL_MAX); > > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); > > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > > + NET_NAME_ENUM, ether_setup, > > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > > + > > if (!net) > > goto no_net; > > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > > net->max_mtu = ETH_DATA_LEN; > > > > ret = register_netdevice(net); > > + > > + if (ret == -EEXIST) { > > + pr_info("NIC name %s exists, request another name.\n", > > + net->name); > > + strlcpy(net->name, "eth%d", IFNAMSIZ); > > + ret = register_netdevice(net); > > The message above could be confusing. "request another name" sounds > like a directive to the user/sysadmin, which I don't think it is. Perhaps > better would be "requesting another name", which says the system is > handling the collision automatically. > > Also will this second call to register_netdevice() actually assign a numeric > name? If so, it would be really nice for the message to be output *after* > the second call to register_netdevice() and to show both the originally > selected name that collided as well as the new name. We sometimes get > into NIC naming issues with customers in Azure, and when a new name > has to be selected it will help debugging if we can show both the original > selection and the new selection. With both pieces of data, there's less > guessing about who did what regarding NIC naming. Good idea! I will include both new and old names in the log messages. > > Finally, having to choose a different name because of a collision does > *not* update the channel->dev_num value. Subsequent calls to > hv_set_devnum() will detect "in use" based on the originally assigned > dev_num value. I don't think that fundamentally breaks anything, but > I wondered if you had thought about that case. Maybe a comment here > to that effect would help a future reader of this code. That's correct. And I'm aware of this situation. But, the dev_num shouldn't be changed, because we want it to keep track of the sequence of device offering. Avoid single or multiple collisions, we should use udev or other daemons to set name of VF or other types of NICs to different formats, like "vf*", or "enP*", etc. For distros have not done so already, we need to work with the distro vendors to ensure udev or other settings are correctly included in Distros, and Azure images. Thanks, - Haiyang ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe 2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang 2019-12-31 1:35 ` Michael Kelley @ 2019-12-31 11:34 ` Roman Kagan 2019-12-31 16:12 ` Haiyang Zhang 1 sibling, 1 reply; 14+ messages in thread From: Roman Kagan @ 2019-12-31 11:34 UTC (permalink / raw) To: Haiyang Zhang Cc: sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org, kys@microsoft.com, sthemmin@microsoft.com, olaf@aepfle.de, vkuznets@redhat.com, davem@davemloft.net, linux-kernel@vger.kernel.org On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote: > The dev_num field in vmbus channel structure is assigned to the first > available number when the channel is offered. So netvsc driver uses it > for NIC naming based on channel offer sequence. Now re-enable the async > probing mode for faster probing. > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > --- > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c > index f3f9eb8..39c412f 100644 > --- a/drivers/net/hyperv/netvsc_drv.c > +++ b/drivers/net/hyperv/netvsc_drv.c > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > struct net_device_context *net_device_ctx; > struct netvsc_device_info *device_info = NULL; > struct netvsc_device *nvdev; > + char name[IFNAMSIZ]; > int ret = -ENOMEM; > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > - VRSS_CHANNEL_MAX); > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); How is this supposed to work when there are other ethernet device types on the system, which may claim the same device names? > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > + NET_NAME_ENUM, ether_setup, > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > + > if (!net) > goto no_net; > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > net->max_mtu = ETH_DATA_LEN; > > ret = register_netdevice(net); > + > + if (ret == -EEXIST) { > + pr_info("NIC name %s exists, request another name.\n", > + net->name); > + strlcpy(net->name, "eth%d", IFNAMSIZ); > + ret = register_netdevice(net); > + } IOW you want the device naming to be predictable, but don't guarantee this? I think the problem this patchset is trying to solve is much better solved with a udev rule, similar to how it's done for PCI net devices. And IMO the primary channel number, being a device's "hardware" property, is more suited to be used in the device name, than this completely ephemeral device number. Thanks, Roman. > + > if (ret != 0) { > pr_err("Unable to register netdev.\n"); > goto register_failed; > @@ -2496,7 +2508,7 @@ static int netvsc_resume(struct hv_device *dev) > .suspend = netvsc_suspend, > .resume = netvsc_resume, > .driver = { > - .probe_type = PROBE_FORCE_SYNCHRONOUS, > + .probe_type = PROBE_PREFER_ASYNCHRONOUS, > }, > }; > > -- > 1.8.3.1 > ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe 2019-12-31 11:34 ` Roman Kagan @ 2019-12-31 16:12 ` Haiyang Zhang 2019-12-31 17:36 ` Stephen Hemminger 0 siblings, 1 reply; 14+ messages in thread From: Haiyang Zhang @ 2019-12-31 16:12 UTC (permalink / raw) To: Roman Kagan Cc: sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net, linux-kernel@vger.kernel.org > -----Original Message----- > From: Roman Kagan <rkagan@virtuozzo.com> > Sent: Tuesday, December 31, 2019 6:35 AM > To: Haiyang Zhang <haiyangz@microsoft.com> > Cc: sashal@kernel.org; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; > KY Srinivasan <kys@microsoft.com>; Stephen Hemminger > <sthemmin@microsoft.com>; olaf@aepfle.de; vkuznets > <vkuznets@redhat.com>; davem@davemloft.net; linux-kernel@vger.kernel.org > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus > offer sequence and use async probe > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote: > > The dev_num field in vmbus channel structure is assigned to the first > > available number when the channel is offered. So netvsc driver uses it > > for NIC naming based on channel offer sequence. Now re-enable the > > async probing mode for faster probing. > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > --- > > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644 > > --- a/drivers/net/hyperv/netvsc_drv.c > > +++ b/drivers/net/hyperv/netvsc_drv.c > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > > struct net_device_context *net_device_ctx; > > struct netvsc_device_info *device_info = NULL; > > struct netvsc_device *nvdev; > > + char name[IFNAMSIZ]; > > int ret = -ENOMEM; > > > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > > - VRSS_CHANNEL_MAX); > > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); > > How is this supposed to work when there are other ethernet device types on the > system, which may claim the same device names? > > > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > > + NET_NAME_ENUM, ether_setup, > > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > > + > > if (!net) > > goto no_net; > > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > > net->max_mtu = ETH_DATA_LEN; > > > > ret = register_netdevice(net); > > + > > + if (ret == -EEXIST) { > > + pr_info("NIC name %s exists, request another name.\n", > > + net->name); > > + strlcpy(net->name, "eth%d", IFNAMSIZ); > > + ret = register_netdevice(net); > > + } > > IOW you want the device naming to be predictable, but don't guarantee this? > > I think the problem this patchset is trying to solve is much better solved with a > udev rule, similar to how it's done for PCI net devices. > And IMO the primary channel number, being a device's "hardware" > property, is more suited to be used in the device name, than this completely > ephemeral device number. The vmbus number can be affected by other types of devices and/or subchannel offerings. They are not stable either. That's why this patch set keeps track of the offering sequence within the same device type in a new variable "dev_num". As in my earlier email, to avoid impact by other types of NICs, we should put them into different naming formats, like "vf*", "enP*", etc. And yes, these can be done in udev. But for netvsc (synthetic) NICs, we still want the default naming format "eth*". And the variable "dev_num" gives them the basis for stable naming with Async probing. Thanks, - Haiyang ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe 2019-12-31 16:12 ` Haiyang Zhang @ 2019-12-31 17:36 ` Stephen Hemminger 2019-12-31 17:51 ` Haiyang Zhang 0 siblings, 1 reply; 14+ messages in thread From: Stephen Hemminger @ 2019-12-31 17:36 UTC (permalink / raw) To: Haiyang Zhang Cc: Roman Kagan, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net On Tue, 31 Dec 2019 16:12:36 +0000 Haiyang Zhang <haiyangz@microsoft.com> wrote: > > -----Original Message----- > > From: Roman Kagan <rkagan@virtuozzo.com> > > Sent: Tuesday, December 31, 2019 6:35 AM > > To: Haiyang Zhang <haiyangz@microsoft.com> > > Cc: sashal@kernel.org; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; > > KY Srinivasan <kys@microsoft.com>; Stephen Hemminger > > <sthemmin@microsoft.com>; olaf@aepfle.de; vkuznets > > <vkuznets@redhat.com>; davem@davemloft.net; linux-kernel@vger.kernel.org > > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus > > offer sequence and use async probe > > > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote: > > > The dev_num field in vmbus channel structure is assigned to the first > > > available number when the channel is offered. So netvsc driver uses it > > > for NIC naming based on channel offer sequence. Now re-enable the > > > async probing mode for faster probing. > > > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > > --- > > > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c > > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644 > > > --- a/drivers/net/hyperv/netvsc_drv.c > > > +++ b/drivers/net/hyperv/netvsc_drv.c > > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > > > struct net_device_context *net_device_ctx; > > > struct netvsc_device_info *device_info = NULL; > > > struct netvsc_device *nvdev; > > > + char name[IFNAMSIZ]; > > > int ret = -ENOMEM; > > > > > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > > > - VRSS_CHANNEL_MAX); > > > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); > > > > How is this supposed to work when there are other ethernet device types on the > > system, which may claim the same device names? > > > > > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > > > + NET_NAME_ENUM, ether_setup, > > > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > > > + > > > if (!net) > > > goto no_net; > > > > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > > > net->max_mtu = ETH_DATA_LEN; > > > > > > ret = register_netdevice(net); > > > + > > > + if (ret == -EEXIST) { > > > + pr_info("NIC name %s exists, request another name.\n", > > > + net->name); > > > + strlcpy(net->name, "eth%d", IFNAMSIZ); > > > + ret = register_netdevice(net); > > > + } > > > > IOW you want the device naming to be predictable, but don't guarantee this? > > > > I think the problem this patchset is trying to solve is much better solved with a > > udev rule, similar to how it's done for PCI net devices. > > And IMO the primary channel number, being a device's "hardware" > > property, is more suited to be used in the device name, than this completely > > ephemeral device number. > > The vmbus number can be affected by other types of devices and/or subchannel > offerings. They are not stable either. That's why this patch set keeps track of the > offering sequence within the same device type in a new variable "dev_num". > > As in my earlier email, to avoid impact by other types of NICs, we should put them > into different naming formats, like "vf*", "enP*", etc. And yes, these can be done in > udev. > > But for netvsc (synthetic) NICs, we still want the default naming format "eth*". And > the variable "dev_num" gives them the basis for stable naming with Async probing. > > Thanks, > - Haiyang > The primary requirements for network naming are: 1. Network names must be repeatable on each boot. This was the original problem that PCI devices discovered back years ago when parallel probing was enabled. 2. Network names must be predictable. If new VM is created, the names should match a similar VM config. 3. Names must be persistent. If a NIC is added or deleted, the existing names must not change. The other things which are important (and this proposal breaks): 1. Don't break it principle: an existing VM should not suddenly get interfaces renamed if kernel is upgraded. A corrallary is that a lot of current userspace code expects eth0. It doesn't look like first interface would be guaranteed to be eth0. 2. No snowflakes principle: a device driver should follow the current practice of other devices. For netvsc, this means VMBus should act like PCI as much as possible. Is there another driver doing this already? 3. Userspace policy principle: Every distribution has its own policy by now. The solution must make netvsc work reliably on Redhat (udev), Ubuntu (netplan), SuSE (Yast) doing something in the kernel violates #2. My recommendation would be to take a multi-phase approach: 1. Expose persistent value in sysfs now. 2. Work with udev/netplan/... to use that value. 3. Make parallel VMBus probing an option. So that when distributions have picked up the udev changes they can enable parallel probe. Some will be quick to adopt and the enterprise laggards can get to it when they feel the heat. Long term wish list (requires host side changes): 1. The interface index could be a host side property; the host networking already has the virtual device table and it is persistent. 2. The Azure NIC name should be visible as a property in guest. Then userspace could do rename based on that property. Having multiple disconnected names is leads to confusion. ^ permalink raw reply [flat|nested] 14+ messages in thread
* RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe 2019-12-31 17:36 ` Stephen Hemminger @ 2019-12-31 17:51 ` Haiyang Zhang 0 siblings, 0 replies; 14+ messages in thread From: Haiyang Zhang @ 2019-12-31 17:51 UTC (permalink / raw) To: Stephen Hemminger Cc: Roman Kagan, sashal@kernel.org, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org, KY Srinivasan, Stephen Hemminger, olaf@aepfle.de, vkuznets, davem@davemloft.net > -----Original Message----- > From: linux-hyperv-owner@vger.kernel.org <linux-hyperv- > owner@vger.kernel.org> On Behalf Of Stephen Hemminger > Sent: Tuesday, December 31, 2019 12:36 PM > To: Haiyang Zhang <haiyangz@microsoft.com> > Cc: Roman Kagan <rkagan@virtuozzo.com>; sashal@kernel.org; linux- > hyperv@vger.kernel.org; netdev@vger.kernel.org; KY Srinivasan > <kys@microsoft.com>; Stephen Hemminger <sthemmin@microsoft.com>; > olaf@aepfle.de; vkuznets <vkuznets@redhat.com>; davem@davemloft.net > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus > offer sequence and use async probe > > On Tue, 31 Dec 2019 16:12:36 +0000 > Haiyang Zhang <haiyangz@microsoft.com> wrote: > > > > -----Original Message----- > > > From: Roman Kagan <rkagan@virtuozzo.com> > > > Sent: Tuesday, December 31, 2019 6:35 AM > > > To: Haiyang Zhang <haiyangz@microsoft.com> > > > Cc: sashal@kernel.org; linux-hyperv@vger.kernel.org; > netdev@vger.kernel.org; > > > KY Srinivasan <kys@microsoft.com>; Stephen Hemminger > > > <sthemmin@microsoft.com>; olaf@aepfle.de; vkuznets > > > <vkuznets@redhat.com>; davem@davemloft.net; linux- > kernel@vger.kernel.org > > > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on > vmbus > > > offer sequence and use async probe > > > > > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote: > > > > The dev_num field in vmbus channel structure is assigned to the first > > > > available number when the channel is offered. So netvsc driver uses it > > > > for NIC naming based on channel offer sequence. Now re-enable the > > > > async probing mode for faster probing. > > > > > > > > Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> > > > > --- > > > > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++--- > > > > 1 file changed, 15 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/net/hyperv/netvsc_drv.c > > > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644 > > > > --- a/drivers/net/hyperv/netvsc_drv.c > > > > +++ b/drivers/net/hyperv/netvsc_drv.c > > > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev, > > > > struct net_device_context *net_device_ctx; > > > > struct netvsc_device_info *device_info = NULL; > > > > struct netvsc_device *nvdev; > > > > + char name[IFNAMSIZ]; > > > > int ret = -ENOMEM; > > > > > > > > - net = alloc_etherdev_mq(sizeof(struct net_device_context), > > > > - VRSS_CHANNEL_MAX); > > > > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num); > > > > > > How is this supposed to work when there are other ethernet device types on > the > > > system, which may claim the same device names? > > > > > > > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name, > > > > + NET_NAME_ENUM, ether_setup, > > > > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX); > > > > + > > > > if (!net) > > > > goto no_net; > > > > > > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev, > > > > net->max_mtu = ETH_DATA_LEN; > > > > > > > > ret = register_netdevice(net); > > > > + > > > > + if (ret == -EEXIST) { > > > > + pr_info("NIC name %s exists, request another name.\n", > > > > + net->name); > > > > + strlcpy(net->name, "eth%d", IFNAMSIZ); > > > > + ret = register_netdevice(net); > > > > + } > > > > > > IOW you want the device naming to be predictable, but don't guarantee this? > > > > > > I think the problem this patchset is trying to solve is much better solved with > a > > > udev rule, similar to how it's done for PCI net devices. > > > And IMO the primary channel number, being a device's "hardware" > > > property, is more suited to be used in the device name, than this completely > > > ephemeral device number. > > > > The vmbus number can be affected by other types of devices and/or > subchannel > > offerings. They are not stable either. That's why this patch set keeps track of > the > > offering sequence within the same device type in a new variable "dev_num". > > > > As in my earlier email, to avoid impact by other types of NICs, we should put > them > > into different naming formats, like "vf*", "enP*", etc. And yes, these can be > done in > > udev. > > > > But for netvsc (synthetic) NICs, we still want the default naming format "eth*". > And > > the variable "dev_num" gives them the basis for stable naming with Async > probing. > > > > Thanks, > > - Haiyang > > > > The primary requirements for network naming are: > 1. Network names must be repeatable on each boot. This was the original > problem > that PCI devices discovered back years ago when parallel probing was > enabled. > 2. Network names must be predictable. If new VM is created, the names > should > match a similar VM config. > 3. Names must be persistent. If a NIC is added or deleted, the existing names > must not change. > > The other things which are important (and this proposal breaks): > 1. Don't break it principle: an existing VM should not suddenly get interfaces > renamed if kernel is upgraded. A corrallary is that a lot of current userspace > code expects eth0. It doesn't look like first interface would be guaranteed > to be eth0. > > 2. No snowflakes principle: a device driver should follow the current practice > of other devices. For netvsc, this means VMBus should act like PCI as much > as possible. Is there another driver doing this already? > > 3. Userspace policy principle: Every distribution has its own policy by now. > The solution must make netvsc work reliably on Redhat (udev), Ubuntu > (netplan), SuSE (Yast) > doing something in the kernel violates #2. > > My recommendation would be to take a multi-phase approach: > 1. Expose persistent value in sysfs now. > 2. Work with udev/netplan/... to use that value. > 3. Make parallel VMBus probing an option. So that when distributions have > picked up > the udev changes they can enable parallel probe. Some will be quick to > adopt > and the enterprise laggards can get to it when they feel the heat. > > Long term wish list (requires host side changes): > 1. The interface index could be a host side property; the host networking > already has the virtual device table and it is persistent. > 2. The Azure NIC name should be visible as a property in guest. > Then userspace could do rename based on that property. > Having multiple disconnected names is leads to confusion. Thank you for the detailed recommendations! I will do the step 1. Expose persistent value in sysfs now. And work on other changes in the future. Thanks, - Haiyang ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2019-12-31 17:51 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-12-30 20:13 [PATCH V2,net-next, 0/3] Add vmbus dev_num and enable netvsc async probing Haiyang Zhang 2019-12-30 20:13 ` [PATCH V2,net-next, 1/3] Drivers: hv: vmbus: Add a dev_num variable based on channel offer sequence Haiyang Zhang 2019-12-31 1:34 ` Michael Kelley 2019-12-31 15:48 ` Haiyang Zhang 2019-12-30 20:13 ` [PATCH V2,net-next, 2/3] Drivers: hv: vmbus: Add dev_num to sysfs Haiyang Zhang 2019-12-31 1:34 ` Michael Kelley 2019-12-31 15:50 ` Haiyang Zhang 2019-12-30 20:13 ` [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus offer sequence and use async probe Haiyang Zhang 2019-12-31 1:35 ` Michael Kelley 2019-12-31 16:01 ` Haiyang Zhang 2019-12-31 11:34 ` Roman Kagan 2019-12-31 16:12 ` Haiyang Zhang 2019-12-31 17:36 ` Stephen Hemminger 2019-12-31 17:51 ` Haiyang Zhang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).