Date: Thu, 28 Feb 2019 17:05:20 -0800
From: Jakub Kicinski
To: Siwei Liu
Cc: "Michael S. Tsirkin", si-wei liu, "Samudrala, Sridhar", Jiri Pirko,
 Stephen Hemminger, David Miller, Netdev,
 virtualization@lists.linux-foundation.org, "Brandeburg, Jesse",
 Alexander Duyck, Jason Wang, liran.alon@oracle.com
Subject: Re: [virtio-dev] Re: net_failover slave udev renaming (was Re:
 [RFC PATCH net-next v6 4/4] netvsc: refactor notifier/event handling
 code to use the bypass framework)
Message-ID: <20190228170520.527ed6df@cakuba.netronome.com>
X-Mailing-List: netdev@vger.kernel.org

On Thu, 28 Feb 2019 16:20:28 -0800, Siwei Liu wrote:
> On Thu, Feb 28, 2019 at 11:56 AM Jakub Kicinski wrote:
> > On Thu, 28 Feb 2019 14:36:56 -0500, Michael S. Tsirkin wrote:
> > > > It is a bit of a the chicken or the egg situation ;) But users can
> > > > just blacklist, too. Anyway, I think this is far better than module
> > > > parameters
> > >
> > > Sorry I'm a bit confused. What is better than what?
> >
> > I mean that blacklist net_failover or module param to disable
> > net_failover and handle in user space are better than trying to solve
> > the renaming at kernel level (either by adding module params that make
> > the kernel rename devices or letting user space change names of running
> > devices if they are slaves).
>
> Before I was aksed to revive this old mail thread, I knew the
> discussion could end up with something like this. Yes, theoretically
> there's a point - basically you don't believe kernel should take risk
> in fixing the issue, so you push back the hope to something in
> hypothesis that actually wasn't done and hard to get done in reality.
> It's not too different than saying "hey, what you're asking for is
> simply wrong, don't do it! Go back to modify userspace to create a
> bond or team instead!" FWIW I want to emphasize that the debate for
> what should be the right place to implement this failover facility:
> userspace versus kernel, had been around for almost a decade, and no
> real work ever happened in userspace to "standardize" this in the
> Linux world.

Let me offer you my very subjective opinion of why "no real work ever
happened in user space".  The actors who have a primary interest in
getting auto-bonding working are HW vendors, who are either trying to
convince customers to use SR-IOV or being pressured by customers to
make SR-IOV easier to consume.  HW vendors hire driver developers, not
user space developers.  So the solution we arrive at is in the kernel
for a non-technical reason (Conway's law, sort of).

$ cd NetworkManager/
$ git log --pretty=format:"%ae" | \
	grep '\(mellanox\|intel\|broadcom\|netronome\)' | sort | uniq -c
     81 andrew.zaborowski@intel.com
      2 David.Woodhouse@intel.com
      2 ismo.puustinen@intel.com
      1 michael.i.doherty@intel.com

Andrew works on WiFi.
I asked the NetworkManager folks to implement this feature last year,
when net_failover got dangerously close to getting merged, and they
said they had never been approached with this request before, much
less offered code that solves it.  Unfortunately, net_failover was
merged before they got around to it, and they didn't proceed.

So to my knowledge nobody ever tried to solve this in user space.
I don't think net_failover is particularly terrible, or that renaming
of the primary in the kernel is the end of the world, but I'd
appreciate it if you could point me to efforts to solve it upstream in
user space components, or acknowledge that nobody actually tried that.