Date: Wed, 20 Mar 2019 10:09:40 -0400
From: "Michael S. Tsirkin"
To: Liran Alon
Cc: Stephen Hemminger, Si-Wei Liu, Sridhar Samudrala, Alexander Duyck,
 Jakub Kicinski, Jiri Pirko, David Miller, Netdev,
 virtualization@lists.linux-foundation.org, boris.ostrovsky@oracle.com,
 vijay.balakrishna@oracle.com, jfreimann@redhat.com, ogerlitz@mellanox.com,
 vuhuong@mellanox.com
Subject: Re: [summary] virtio network device failover writeup
Message-ID: <20190320100747-mutt-send-email-mst@kernel.org>
References: <20190317095052-mutt-send-email-mst@kernel.org>
 <54E7C3AF-C3C5-4AF2-86C9-AA50389F855F@oracle.com>
 <20190319084647.727f8dcf@shemminger-XPS-13-9360>
 <20190319171638-mutt-send-email-mst@kernel.org>
 <79F5D7C0-BBAA-4F78-9039-27A444970002@oracle.com>
 <20190320061632-mutt-send-email-mst@kernel.org>
X-Mailing-List: netdev@vger.kernel.org

On Wed, Mar 20, 2019 at 02:23:36PM +0200, Liran Alon wrote:
>
>
> > On 20 Mar 2019, at 12:25, Michael S. Tsirkin wrote:
> >
> > On Wed, Mar 20, 2019 at 01:25:58AM +0200, Liran Alon wrote:
> >>
> >>
> >>> On 19 Mar 2019, at 23:19, Michael S. Tsirkin wrote:
> >>>
> >>> On Tue, Mar 19, 2019 at 08:46:47AM -0700, Stephen Hemminger wrote:
> >>>> On Tue, 19 Mar 2019 14:38:06 +0200
> >>>> Liran Alon wrote:
> >>>>
> >>>>> b.3) cloud-init: If configured to perform network-configuration, it attempts to configure all available netdevs. It should avoid however doing so on net-failover slaves.
> >>>>> (Microsoft has handled this by adding a mechanism in cloud-init to blacklist a netdev from being configured in case it is owned by a specific PCI driver. Specifically, they blacklist Mellanox VF driver. However, this technique doesn’t work for the net-failover mechanism because both the net-failover netdev and the virtio-net netdev are owned by the virtio-net PCI driver).
> >>>>
> >>>> Cloud-init should really just ignore all devices that have a master device.
> >>>> That would have been more general, and safer for other use cases.
> >>>
> >>> Given lots of userspace doesn't do this, I wonder whether it would be
> >>> safer to just somehow pretend to userspace that the slave links are
> >>> down? And add a special attribute for the actual link state.
> >>
> >> I think this may be problematic as it would also break legit use case
> >> of userspace attempt to set various config on VF slave.
> >> In general, lying to userspace usually leads to problems.
> >
> > I hear you on this. So how about instead of lying,
> > we basically just fail some accesses to slaves
> > unless a flag is set e.g. in ethtool.
> >
> > Some userspace will need to change to set it but in a minor way.
> > Arguably/hopefully failure to set config would generally be a safer
> > failure.
>
> Once userspace will set this new flag by ethtool, all operations done by other userspace components will still work.

Sorry about being unclear, the idea would be to require the flag on each ethtool operation.

> E.g. Running dhclient without parameters, after this flag was set, will still attempt to perform DHCP on it and will now succeed.

I think sending/receiving should probably just fail unconditionally.

> Therefore, this proposal just effectively delays when the net-failover slave can be operated on by userspace.
> But what we actually want is to never allow a net-failover slave to be operated by userspace unless it is explicitly stated
> by userspace that it wishes to perform a set of actions on the net-failover slave.
>
> Something that was achieved if, for example, the net-failover slaves were in a different netns than default netns.
> This also aligns with expected customer experience that most customers just want to see a 1:1 mapping between a vNIC and a visible netdev.
> But of course maybe there are other ideas that can achieve similar behaviour.
>
> -Liran
>
> >
> > Which things to fail? Probably sending/receiving packets? Getting MAC?
> > More?
> >
> >> If we reach
> >> to a scenario where we try to avoid userspace issues generically and
> >> not on a userspace component basis, I believe the right path should be
> >> to hide the net-failover slaves such that explicit action is required
> >> to actually manipulate them (As described in blog-post). E.g.
> >> Automatically move net-failover slaves by kernel to a different netns.
> >>
> >> -Liran
> >>
> >>>
> >>> --
> >>> MST
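
For reference, Stephen's suggestion quoted above (have userspace ignore every
device that has a master device) could look roughly like the sketch below.
This is not cloud-init code. It assumes that net-failover slaves show up the
way other enslaved netdevs do on Linux, with a "master" symlink under
/sys/class/net/<ifname>/. The helper names has_master() and
configurable_netdevs() are made up for illustration.

```python
import os

# Default sysfs location for network devices on Linux.
SYSFS_NET = "/sys/class/net"


def has_master(ifname, sysfs_net=SYSFS_NET):
    """Return True if the netdev appears to be enslaved to a master device.

    Assumption: an enslaved device has a `master` symlink in its sysfs
    directory. lexists() is used so a dangling symlink still counts.
    """
    return os.path.lexists(os.path.join(sysfs_net, ifname, "master"))


def configurable_netdevs(sysfs_net=SYSFS_NET):
    """List netdevs that have no master, i.e. candidates for configuration.

    Hypothetical helper: returns a sorted list of interface names, or an
    empty list if the sysfs directory is not present.
    """
    if not os.path.isdir(sysfs_net):
        return []
    return sorted(
        name
        for name in os.listdir(sysfs_net)
        # Skip non-directory entries such as the bonding_masters file.
        if os.path.isdir(os.path.join(sysfs_net, name))
        and not has_master(name, sysfs_net)
    )


if __name__ == "__main__":
    for dev in configurable_netdevs():
        print(dev)
```

A check like this only helps components that adopt it, which is the concern
raised in the thread: plenty of existing userspace does no such filtering.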