netdev.vger.kernel.org archive mirror
From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
To: Harald Hoyer <harald@redhat.com>, Siwei Liu <loseweigh@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>,
	initramfs@vger.kernel.org, "Michael S. Tsirkin" <mst@redhat.com>,
	Netdev <netdev@vger.kernel.org>,
	vijay.balakrishna@oracle.com, si-wei liu <si-wei.liu@oracle.com>,
	liran.alon@oracle.com
Subject: Re: virtio_net failover and initramfs
Date: Fri, 17 Aug 2018 12:09:17 -0700
Message-ID: <132a4610-a59f-19e4-a602-ead91325fb47@intel.com>
In-Reply-To: <914c05dc-4eaa-4b1b-69f1-d06676c75fd2@redhat.com>

On 8/17/2018 2:56 AM, Harald Hoyer wrote:
> On 17.08.2018 11:51, Harald Hoyer wrote:
>> On 16.08.2018 00:17, Siwei Liu wrote:
>>> On Wed, Aug 15, 2018 at 12:05 PM, Samudrala, Sridhar
>>> <sridhar.samudrala@intel.com> wrote:
>>>> On 8/14/2018 5:03 PM, Siwei Liu wrote:
>>>>> Are we sure all userspace apps skip and ignore slave interfaces by
>>>>> just looking at "IFLA_MASTER" attribute?
>>>>>
>>>>> When STANDBY is enabled on virtio-net, a failover master interface
>>>>> will appear, which automatically enslaves the virtio device. But it is
>>>>> found out that iSCSI (or any network boot) cannot bootstrap over the
>>>>> new failover interface together with a standby virtio (without any VF
>>>>> or PT device in place).
>>>>>
>>>>> Dracut (initramfs) ends up with timeout and dropping into emergency shell:
>>>>>
>>>>> [  228.170425] dracut-initqueue[377]: Warning: dracut-initqueue
>>>>> timeout - starting timeout scripts
>>>>> [  228.171788] dracut-initqueue[377]: Warning: Could not boot.
>>>>>            Starting Dracut Emergency Shell...
>>>>> Generating "/run/initramfs/rdsosreport.txt"
>>>>> Entering emergency mode. Exit the shell to continue.
>>>>> Type "journalctl" to view system logs.
>>>>> You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or
>>>>> /boot
>>>>> after mounting them and attach it to a bug report.
>>>>> dracut:/# ip l sh
>>>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>>>> mode DEFAULT group default qlen 1000
>>>>>       link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>>> 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
>>>>> state UP mode DEFAULT group default qlen 1000
>>>>>       link/ether 9a:46:22:ae:33:54 brd ff:ff:ff:ff:ff:ff
>>>>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
>>>>> master eth0 state UP mode DEFAULT group default qlen 1000
>>>>>       link/ether 9a:46:22:ae:33:54 brd ff:ff:ff:ff:ff:ff
>>>>> dracut:/#
>>>>>
>>>>> If changing dracut code to ignore eth1 (with IFLA_MASTER attr),
>>>>> network boot starts to work.
>>>>
>>>> Does dracut by default try to use all the interfaces that are UP?
>>>>
>>> Yes. The specific dracut cmdline in our case is "ip=dhcp
>>> netroot=iscsi:... ", but it's not specific to iSCSI boot. Because the
>>> failover and standby interfaces share the same MAC address, and dracut
>>> runs DHCP on all interfaces that are up, it eventually gets the same
>>> route for each interface. Those conflicting route entries kill off the
>>> network connection.
>>>
>>>>> The reason is that dracut has its own means to differentiate virtual
>>>>> interfaces for network boot: it does not look at IFLA_MASTER to
>>>>> ignore slave interfaces. Instead, users have to provide an explicit
>>>>> option, e.g. bond=eth0,eth1, in the boot line; then dracut knows
>>>>> the config and ignores the slave interfaces.
>>>>
>>>> Isn't it possible to specify the interface that should be used for network
>>>> boot?
>>> As I understand it, one can only specify interface name for running
>>> DHCP but not select interface for network boot.  We want DHCP to run
>>> on every NIC that is up (excluding the enslaved interfaces), and only
>>> one of them can get a route entry to the network boot server (e.g.
>>> iSCSI target).
>>>
>>>>
>>>>> However, with automatic creation of failover interface that assumption
>>>>> is no longer true. Can we change dracut to ignore all slave interfaces
>>>>> by checking IFLA_MASTER? I don't think so. It has a large impact on
>>>>> existing configs.
>>>>
>>>> What is the issue with checking for IFLA_MASTER? I guess this is used with
>>>> team/bonding setups.
>>> That should be discussed within and determined by the dracut
>>> community. But the current dracut code doesn't check IFLA_MASTER for
>>> team or bonding specifically. I guess this change might have broader
>>> impact to existing userspace that might be already relying on the
>>> current behaviour.
>>>
>>> Thanks,
>>> -Siwei
>> Is there a sysfs flag for IFF_SLAVE? Or any "ip" output I can use to detect that it is an IFF_SLAVE?
>>
> Oh, it's the other way around.. dracut should ignore "master" (eth1).
In the above example, eth0 is the net_failover device and eth1 is the
lower virtio_net device.
The "ip" output for eth1 shows "master eth0", which indicates that eth0 is
its upper/master device.
This information can also be obtained via sysfs:
/sys/class/net/eth1/upper_eth0
>
> Can the master enslave the "eth0", if it is already "UP" and busy later on?
eth0 is the master/failover device, and eth1 gets registered as its slave
via the NETDEV_REGISTER event.
dracut should ignore eth1 in this setup.
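
A minimal sketch of that sysfs check, in Python for illustration only
(dracut itself is shell, and this is not dracut code). It assumes the
upper_<name> entries described above; the function names and the
sysfs_net parameter are hypothetical. Note that upper_<name> entries
can also show up for other stacked devices (for example a VLAN on top
of a NIC), so a real implementation may need a narrower test.

```python
import os


def upper_devices(iface, sysfs_net="/sys/class/net"):
    """Return the upper devices of iface, read from its upper_<name> entries."""
    prefix = "upper_"
    try:
        entries = os.listdir(os.path.join(sysfs_net, iface))
    except OSError:
        # Interface is missing, or the entry is not a directory.
        return []
    return sorted(e[len(prefix):] for e in entries if e.startswith(prefix))


def boot_candidates(sysfs_net="/sys/class/net"):
    """Return interfaces that have no upper device and so are not enslaved."""
    candidates = []
    for name in os.listdir(sysfs_net):
        # Skip plain files that live next to the interface directories.
        if not os.path.isdir(os.path.join(sysfs_net, name)):
            continue
        if not upper_devices(name, sysfs_net):
            candidates.append(name)
    return sorted(candidates)
```

With the layout from the example above, upper_devices("eth1") would
report eth0, and boot_candidates() would leave eth1 out.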

