Linux virtualization list

Linux virtualization list
 help / color / mirror / Atom feed

* Re: [PATCH] x86/paravirt: Guard against invalid cpu # in pv_vcpu_is_preempted()
From: Juergen Gross @ 2019-04-01  6:38 UTC (permalink / raw)
  To: Waiman Long, Alok Kataria, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Peter Zijlstra, Will Deacon
  Cc: Paolo Bonzini, x86, linux-kernel, virtualization
In-Reply-To: <5a48f5fb-799e-84f8-1fa7-eda851878d50@redhat.com>

On 25/03/2019 19:03, Waiman Long wrote:
> On 03/25/2019 12:40 PM, Juergen Gross wrote:
>> On 25/03/2019 16:57, Waiman Long wrote:
>>> It was found that passing an invalid cpu number to pv_vcpu_is_preempted()
>>> might panic the kernel in a VM guest. For example,
>>>
>>> [    2.531077] Oops: 0000 [#1] SMP PTI
>>>   :
>>> [    2.532545] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> [    2.533321] RIP: 0010:__raw_callee_save___kvm_vcpu_is_preempted+0x0/0x20
>>>
>>> To guard against this kind of kernel panic, check is added to
>>> pv_vcpu_is_preempted() to make sure that no invalid cpu number will
>>> be used.
>>>
>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>> ---
>>>  arch/x86/include/asm/paravirt.h | 6 ++++++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
>>> index c25c38a05c1c..4cfb465dcde4 100644
>>> --- a/arch/x86/include/asm/paravirt.h
>>> +++ b/arch/x86/include/asm/paravirt.h
>>> @@ -671,6 +671,12 @@ static __always_inline void pv_kick(int cpu)
>>>  
>>>  static __always_inline bool pv_vcpu_is_preempted(long cpu)
>>>  {
>>> +	/*
>>> +	 * Guard against invalid cpu number or the kernel might panic.
>>> +	 */
>>> +	if (WARN_ON_ONCE((unsigned long)cpu >= nr_cpu_ids))
>>> +		return false;
>>> +
>>>  	return PVOP_CALLEE1(bool, lock.vcpu_is_preempted, cpu);
>>>  }
>> Can this really happen without being a programming error?
> 
> This shouldn't happen without a programming error, I think. In my case,
> it was caused by a race condition leading to use-after-free of the cpu
> number. However, my point is that error like that shouldn't cause the
> kernel to panic.
> 
>> Basically you'd need to guard all percpu area accesses to foreign cpus
>> this way. Why is this one special?
> 
> It depends. If out-of-bound access can only happen with obvious
> programming error, I don't think we need to guard against them. In this
> case, I am not totally sure if the race condition that I found may
> happen with existing code or not. To be prudent, I decide to send this
> patch out.
> 
> The race condition that I am looking at is as follows:
> 
>   CPU 0                         CPU 1
>   -----                         -----
> up_write:
>   owner = NULL;
>   <release-barrier>
>   count = 0;
> 
> <rcu-free task structure>
>  
>                           rwsem_can_spin_on_owner:
>                             rcu_read_lock();
>                             read owner;
>                               :
>                             vcpu_is_preempted(owner->cpu);
>                               :
>                             rcu_read_unlock()
> 
> When I tried to merge the owner into the count (clear the owner after
> the barrier), I can reproduce the crash 100% when booting up the kernel
> in a VM guest. However, I am not sure if the configuration above is safe
> and is just very hard to reproduce.
> 
> Alternatively, I can also do the cpu check before calling
> vcpu_is_preempted().

I think I'd prefer that.


Juergen

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH 2/5] x86: Convert some slow-path static_cpu_has() callers to boot_cpu_has()
From: Juergen Gross @ 2019-03-30 12:50 UTC (permalink / raw)
  To: Borislav Petkov, LKML
  Cc: Rafael J. Wysocki, Peter Zijlstra, Sebastian Andrzej Siewior,
	Dominik Brodowski, virtualization, Dave Hansen, H. Peter Anvin,
	Nadav Amit, x86, Ingo Molnar, Rik van Riel, Thomas Lendacky,
	Joerg Roedel, Jann Horn, Aubrey Li, Thomas Gleixner, linux-edac,
	Tony Luck, Konrad Rzeszutek Wilk, Andy Lutomirski,
	Masami Hiramatsu, Kirill A. Shutemov
In-Reply-To: <20190330112022.28888-3-bp@alien8.de>

On 30/03/2019 12:20, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> Using static_cpu_has() is pointless on those paths, convert them to the
> boot_cpu_has() variant.
> 
> No functional changes.
> 
> Reported-by: Nadav Amit <nadav.amit@gmail.com>
> Signed-off-by: Borislav Petkov <bp@suse.de>

For the paravirt part: Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen

^ permalink raw reply

* [PATCH 2/5] x86: Convert some slow-path static_cpu_has() callers to boot_cpu_has()
From: Borislav Petkov @ 2019-03-30 11:20 UTC (permalink / raw)
  To: LKML
  Cc: x86, Peter Zijlstra, Sebastian Andrzej Siewior, Dominik Brodowski,
	virtualization, Dave Hansen, H. Peter Anvin, Nadav Amit,
	Rafael J. Wysocki, Ingo Molnar, Rik van Riel, Thomas Lendacky,
	Joerg Roedel, Jann Horn, Aubrey Li, Thomas Gleixner, linux-edac,
	Juergen Gross, Tony Luck, Konrad Rzeszutek Wilk, Andy Lutomirski,
	Masami Hiramatsu, Kirill A. Shutemov
In-Reply-To: <20190330112022.28888-1-bp@alien8.de>

From: Borislav Petkov <bp@suse.de>

Using static_cpu_has() is pointless on those paths, convert them to the
boot_cpu_has() variant.

No functional changes.

Reported-by: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: linux-edac@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: virtualization@lists.linux-foundation.org
Cc: x86@kernel.org
---
 arch/x86/include/asm/fpu/internal.h  |  7 +++----
 arch/x86/kernel/apic/apic_numachip.c |  2 +-
 arch/x86/kernel/cpu/aperfmperf.c     |  6 +++---
 arch/x86/kernel/cpu/common.c         |  2 +-
 arch/x86/kernel/cpu/mce/inject.c     |  2 +-
 arch/x86/kernel/cpu/proc.c           | 10 +++++-----
 arch/x86/kernel/ldt.c                | 14 +++++++-------
 arch/x86/kernel/paravirt.c           |  2 +-
 arch/x86/kernel/process.c            |  4 ++--
 arch/x86/kernel/reboot.c             |  2 +-
 arch/x86/kernel/vm86_32.c            |  2 +-
 11 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index fb04a3ded7dd..745a19d34f23 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -253,7 +253,7 @@ static inline void copy_xregs_to_kernel_booting(struct xregs_state *xstate)
 
 	WARN_ON(system_state != SYSTEM_BOOTING);
 
-	if (static_cpu_has(X86_FEATURE_XSAVES))
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
 		XSTATE_OP(XSAVES, xstate, lmask, hmask, err);
 	else
 		XSTATE_OP(XSAVE, xstate, lmask, hmask, err);
@@ -275,7 +275,7 @@ static inline void copy_kernel_to_xregs_booting(struct xregs_state *xstate)
 
 	WARN_ON(system_state != SYSTEM_BOOTING);
 
-	if (static_cpu_has(X86_FEATURE_XSAVES))
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
 		XSTATE_OP(XRSTORS, xstate, lmask, hmask, err);
 	else
 		XSTATE_OP(XRSTOR, xstate, lmask, hmask, err);
@@ -497,8 +497,7 @@ static inline void fpregs_activate(struct fpu *fpu)
  *  - switch_fpu_finish() restores the new state as
  *    necessary.
  */
-static inline void
-switch_fpu_prepare(struct fpu *old_fpu, int cpu)
+static inline void switch_fpu_prepare(struct fpu *old_fpu, int cpu)
 {
 	if (static_cpu_has(X86_FEATURE_FPU) && old_fpu->initialized) {
 		if (!copy_fpregs_to_fpstate(old_fpu))
diff --git a/arch/x86/kernel/apic/apic_numachip.c b/arch/x86/kernel/apic/apic_numachip.c
index 78778b54f904..a5464b8b6c46 100644
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -175,7 +175,7 @@ static void fixup_cpu_id(struct cpuinfo_x86 *c, int node)
 	this_cpu_write(cpu_llc_id, node);
 
 	/* Account for nodes per socket in multi-core-module processors */
-	if (static_cpu_has(X86_FEATURE_NODEID_MSR)) {
+	if (boot_cpu_has(X86_FEATURE_NODEID_MSR)) {
 		rdmsrl(MSR_FAM10H_NODE_ID, val);
 		nodes = ((val >> 3) & 7) + 1;
 	}
diff --git a/arch/x86/kernel/cpu/aperfmperf.c b/arch/x86/kernel/cpu/aperfmperf.c
index 804c49493938..64d5aec24203 100644
--- a/arch/x86/kernel/cpu/aperfmperf.c
+++ b/arch/x86/kernel/cpu/aperfmperf.c
@@ -83,7 +83,7 @@ unsigned int aperfmperf_get_khz(int cpu)
 	if (!cpu_khz)
 		return 0;
 
-	if (!static_cpu_has(X86_FEATURE_APERFMPERF))
+	if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
 		return 0;
 
 	aperfmperf_snapshot_cpu(cpu, ktime_get(), true);
@@ -99,7 +99,7 @@ void arch_freq_prepare_all(void)
 	if (!cpu_khz)
 		return;
 
-	if (!static_cpu_has(X86_FEATURE_APERFMPERF))
+	if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
 		return;
 
 	for_each_online_cpu(cpu)
@@ -115,7 +115,7 @@ unsigned int arch_freq_get_on_cpu(int cpu)
 	if (!cpu_khz)
 		return 0;
 
-	if (!static_cpu_has(X86_FEATURE_APERFMPERF))
+	if (!boot_cpu_has(X86_FEATURE_APERFMPERF))
 		return 0;
 
 	if (aperfmperf_snapshot_cpu(cpu, ktime_get(), true))
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cb28e98a0659..95a5faf3a6a0 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1668,7 +1668,7 @@ static void setup_getcpu(int cpu)
 	unsigned long cpudata = vdso_encode_cpunode(cpu, early_cpu_to_node(cpu));
 	struct desc_struct d = { };
 
-	if (static_cpu_has(X86_FEATURE_RDTSCP))
+	if (boot_cpu_has(X86_FEATURE_RDTSCP))
 		write_rdtscp_aux(cpudata);
 
 	/* Store CPU and node number in limit. */
diff --git a/arch/x86/kernel/cpu/mce/inject.c b/arch/x86/kernel/cpu/mce/inject.c
index 3f82afd0f46f..a6026170af92 100644
--- a/arch/x86/kernel/cpu/mce/inject.c
+++ b/arch/x86/kernel/cpu/mce/inject.c
@@ -526,7 +526,7 @@ static void do_inject(void)
 	 * only on the node base core. Refer to D18F3x44[NbMcaToMstCpuEn] for
 	 * Fam10h and later BKDGs.
 	 */
-	if (static_cpu_has(X86_FEATURE_AMD_DCM) &&
+	if (boot_cpu_has(X86_FEATURE_AMD_DCM) &&
 	    b == 4 &&
 	    boot_cpu_data.x86 < 0x17) {
 		toggle_nb_mca_mst_cpu(amd_get_nb_id(cpu));
diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
index 2c8522a39ed5..cb2e49810d68 100644
--- a/arch/x86/kernel/cpu/proc.c
+++ b/arch/x86/kernel/cpu/proc.c
@@ -35,11 +35,11 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
 		   "fpu_exception\t: %s\n"
 		   "cpuid level\t: %d\n"
 		   "wp\t\t: yes\n",
-		   static_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no",
-		   static_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no",
-		   static_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no",
-		   static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
-		   static_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
+		   boot_cpu_has_bug(X86_BUG_FDIV) ? "yes" : "no",
+		   boot_cpu_has_bug(X86_BUG_F00F) ? "yes" : "no",
+		   boot_cpu_has_bug(X86_BUG_COMA) ? "yes" : "no",
+		   boot_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
+		   boot_cpu_has(X86_FEATURE_FPU) ? "yes" : "no",
 		   c->cpuid_level);
 }
 #else
diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index 6135ae8ce036..b2463fcb20a8 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -113,7 +113,7 @@ static void do_sanity_check(struct mm_struct *mm,
 		 * tables.
 		 */
 		WARN_ON(!had_kernel_mapping);
-		if (static_cpu_has(X86_FEATURE_PTI))
+		if (boot_cpu_has(X86_FEATURE_PTI))
 			WARN_ON(!had_user_mapping);
 	} else {
 		/*
@@ -121,7 +121,7 @@ static void do_sanity_check(struct mm_struct *mm,
 		 * Sync the pgd to the usermode tables.
 		 */
 		WARN_ON(had_kernel_mapping);
-		if (static_cpu_has(X86_FEATURE_PTI))
+		if (boot_cpu_has(X86_FEATURE_PTI))
 			WARN_ON(had_user_mapping);
 	}
 }
@@ -156,7 +156,7 @@ static void map_ldt_struct_to_user(struct mm_struct *mm)
 	k_pmd = pgd_to_pmd_walk(k_pgd, LDT_BASE_ADDR);
 	u_pmd = pgd_to_pmd_walk(u_pgd, LDT_BASE_ADDR);
 
-	if (static_cpu_has(X86_FEATURE_PTI) && !mm->context.ldt)
+	if (boot_cpu_has(X86_FEATURE_PTI) && !mm->context.ldt)
 		set_pmd(u_pmd, *k_pmd);
 }
 
@@ -181,7 +181,7 @@ static void map_ldt_struct_to_user(struct mm_struct *mm)
 {
 	pgd_t *pgd = pgd_offset(mm, LDT_BASE_ADDR);
 
-	if (static_cpu_has(X86_FEATURE_PTI) && !mm->context.ldt)
+	if (boot_cpu_has(X86_FEATURE_PTI) && !mm->context.ldt)
 		set_pgd(kernel_to_user_pgdp(pgd), *pgd);
 }
 
@@ -208,7 +208,7 @@ map_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt, int slot)
 	spinlock_t *ptl;
 	int i, nr_pages;
 
-	if (!static_cpu_has(X86_FEATURE_PTI))
+	if (!boot_cpu_has(X86_FEATURE_PTI))
 		return 0;
 
 	/*
@@ -271,7 +271,7 @@ static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
 		return;
 
 	/* LDT map/unmap is only required for PTI */
-	if (!static_cpu_has(X86_FEATURE_PTI))
+	if (!boot_cpu_has(X86_FEATURE_PTI))
 		return;
 
 	nr_pages = DIV_ROUND_UP(ldt->nr_entries * LDT_ENTRY_SIZE, PAGE_SIZE);
@@ -311,7 +311,7 @@ static void free_ldt_pgtables(struct mm_struct *mm)
 	unsigned long start = LDT_BASE_ADDR;
 	unsigned long end = LDT_END_ADDR;
 
-	if (!static_cpu_has(X86_FEATURE_PTI))
+	if (!boot_cpu_has(X86_FEATURE_PTI))
 		return;
 
 	tlb_gather_mmu(&tlb, mm, start, end);
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index c0e0101133f3..7bbaa6baf37f 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -121,7 +121,7 @@ DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
 
 void __init native_pv_lock_init(void)
 {
-	if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
+	if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
 		static_branch_disable(&virt_spin_lock_key);
 }
 
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 58ac7be52c7a..16a7113e91c5 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -236,7 +236,7 @@ static int get_cpuid_mode(void)
 
 static int set_cpuid_mode(struct task_struct *task, unsigned long cpuid_enabled)
 {
-	if (!static_cpu_has(X86_FEATURE_CPUID_FAULT))
+	if (!boot_cpu_has(X86_FEATURE_CPUID_FAULT))
 		return -ENODEV;
 
 	if (cpuid_enabled)
@@ -666,7 +666,7 @@ static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
 	if (c->x86_vendor != X86_VENDOR_INTEL)
 		return 0;
 
-	if (!cpu_has(c, X86_FEATURE_MWAIT) || static_cpu_has_bug(X86_BUG_MONITOR))
+	if (!cpu_has(c, X86_FEATURE_MWAIT) || boot_cpu_has_bug(X86_BUG_MONITOR))
 		return 0;
 
 	return 1;
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index 725624b6c0c0..d62ebbc5ec78 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -108,7 +108,7 @@ void __noreturn machine_real_restart(unsigned int type)
 	write_cr3(real_mode_header->trampoline_pgd);
 
 	/* Exiting long mode will fail if CR4.PCIDE is set. */
-	if (static_cpu_has(X86_FEATURE_PCID))
+	if (boot_cpu_has(X86_FEATURE_PCID))
 		cr4_clear_bits(X86_CR4_PCIDE);
 #endif
 
diff --git a/arch/x86/kernel/vm86_32.c b/arch/x86/kernel/vm86_32.c
index a092b6b40c6b..6a38717d179c 100644
--- a/arch/x86/kernel/vm86_32.c
+++ b/arch/x86/kernel/vm86_32.c
@@ -369,7 +369,7 @@ static long do_sys_vm86(struct vm86plus_struct __user *user_vm86, bool plus)
 	preempt_disable();
 	tsk->thread.sp0 += 16;
 
-	if (static_cpu_has(X86_FEATURE_SEP)) {
+	if (boot_cpu_has(X86_FEATURE_SEP)) {
 		tsk->thread.sysenter_cs = 0;
 		refresh_sysenter_cs(&tsk->thread);
 	}
-- 
2.21.0

^ permalink raw reply related

* Re: [PATCH net v4] failover: allow name change on IFF_UP slave interfaces
From: Michael S. Tsirkin @ 2019-03-29 22:31 UTC (permalink / raw)
  To: si-wei liu
  Cc: jiri, kubakici, Samudrala, Sridhar, alexander.duyck,
	virtualization, liran.alon, netdev, boris.ostrovsky, davem
In-Reply-To: <bb2239d2-9900-07b3-f969-c188899eb4e6@oracle.com>

On Fri, Mar 29, 2019 at 12:50:25PM -0700, si-wei liu wrote:
> 
> 
> On 3/28/2019 10:55 PM, Samudrala, Sridhar wrote:
> > 
> > 
> > On 3/28/2019 4:47 PM, Si-Wei Liu wrote:
> > > When a netdev appears through hot plug then gets enslaved by a failover
> > > master that is already up and running, the slave will be opened
> > > right away after getting enslaved. Today there's a race that userspace
> > > (udev) may fail to rename the slave if the kernel (net_failover)
> > > opens the slave earlier than when the userspace rename happens.
> > > Unlike bond or team, the primary slave of failover can't be renamed by
> > > userspace ahead of time, since the kernel initiated auto-enslavement is
> > > unable to, or rather, is never meant to be synchronized with the rename
> > > request from userspace.
> > > 
> > > As the failover slave interfaces are not designed to be operated
> > > directly by userspace apps: IP configuration, filter rules with
> > > regard to network traffic passing and etc., should all be done on master
> > > interface. In general, userspace apps only care about the
> > > name of master interface, while slave names are less important as long
> > > as admin users can see reliable names that may carry
> > > other information describing the netdev. For e.g., they can infer that
> > > "ens3nsby" is a standby slave of "ens3", while for a
> > > name like "eth0" they can't tell which master it belongs to.
> > > 
> > > Historically the name of IFF_UP interface can't be changed because
> > > there might be admin script or management software that is already
> > > relying on such behavior and assumes that the slave name can't be
> > > changed once UP. But failover is special: with the in-kernel
> > > auto-enslavement mechanism, the userspace expectation for device
> > > enumeration and bring-up order is already broken. Previously initramfs
> > > and various userspace config tools were modified to bypass failover
> > > slaves because of auto-enslavement and duplicate MAC address. Similarly,
> > > in case that users care about seeing reliable slave name, the new type
> > > of failover slaves needs to be taken care of specifically in userspace
> > > anyway.
> > > 
> > > It's less risky to lift up the rename restriction on failover slave
> > > which is already UP. Although it's possible this change may potentially
> > > break userspace component (most likely configuration scripts or
> > > management software) that assumes slave name can't be changed while
> > > UP, it's relatively a limited and controllable set among all userspace
> > > components, which can be fixed specifically to listen for the rename
> > > and/or link down/up events on failover slaves. Userspace component
> > > interacting with slaves is expected to be changed to operate on failover
> > > master interface instead, as the failover slave is dynamic in nature
> > > which may come and go at any point.  The goal is to make the role of
> > > failover slaves less relevant, and userspace components should only
> > > deal with failover master in the long run.
> > > 
> > > Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> > > Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> > > Reviewed-by: Liran Alon <liran.alon@oracle.com>
> > > 
> > > -- 
> > > v1 -> v2:
> > > - Drop configurable module parameter (Sridhar)
> > > 
> > > v2 -> v3:
> > > - Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
> > > - Send down and up events around rename (Michael S. Tsirkin)
> > > 
> > > v3 -> v4:
> > > - Simplify notification to be sent (Stephen Hemminger)
> > > ---
> > >   net/core/dev.c | 24 +++++++++++++++++++++++-
> > >   1 file changed, 23 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > index 722d50d..6ae5874 100644
> > > --- a/net/core/dev.c
> > > +++ b/net/core/dev.c
> > > @@ -1180,7 +1180,21 @@ int dev_change_name(struct net_device *dev,
> > > const char *newname)
> > >       BUG_ON(!dev_net(dev));
> > >         net = dev_net(dev);
> > > -    if (dev->flags & IFF_UP)
> > > +
> > > +    /* Allow failover slave to rename even when
> > > +     * it is up and running.
> > > +     *
> > > +     * Failover slaves are special, since userspace
> > > +     * might rename the slave after the interface
> > > +     * has been brought up and running due to
> > > +     * auto-enslavement.
> > > +     *
> > > +     * Failover users don't actually care about slave
> > > +     * name change, as they are only expected to operate
> > > +     * on master interface directly.
> > > +     */
> > > +    if (dev->flags & IFF_UP &&
> > > +        likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
> > >           return -EBUSY;
> > >         write_seqcount_begin(&devnet_rename_seq);
> > > @@ -1227,6 +1241,14 @@ int dev_change_name(struct net_device *dev,
> > > const char *newname)
> > >       hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net,
> > > dev->name));
> > >       write_unlock_bh(&dev_base_lock);
> > >   +    if (unlikely(dev->flags & IFF_UP)) {
> > > +        struct netdev_notifier_change_info change_info;
> > > +
> > > +        change_info.flags_changed = 0;
> > > +        call_netdevice_notifiers_info(NETDEV_CHANGE, dev,
> > > +                          &change_info.info);
> > 
> > This function no longer takes the dev parameter in the net-next kernel.
> OK. Will get my patch updated with the latest net-next tree.
> 
> > 
> > Did you consider calling netdev_state_change() although it does send a
> > RTM_NEWLINK message too. May be just fixing the notifier call should be
> > fine.
> I did look at it and the extra RTM_NEWLINK message was indeed the reason I
> got it rewritten. You see, how the way of initialization is inherited.
> 
> Thanks,
> -Siwei

More messages just increase the chance existing scripts will work.
Don't see what the harm is.

> > 
> > > +    }
> > > +
> > >       ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
> > >       ret = notifier_to_errno(ret);
> > > 

^ permalink raw reply

* Re: [PATCH net v4] failover: allow name change on IFF_UP slave interfaces
From: Jiri Pirko @ 2019-03-29 15:34 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: mst, kubakici, sridhar.samudrala, alexander.duyck, virtualization,
	liran.alon, netdev, Si-Wei Liu, boris.ostrovsky, davem
In-Reply-To: <20190329081512.2f231508@shemminger-XPS-13-9360>

Fri, Mar 29, 2019 at 04:15:12PM CET, stephen@networkplumber.org wrote:
>On Thu, 28 Mar 2019 19:47:27 -0400
>Si-Wei Liu <si-wei.liu@oracle.com> wrote:
>
>> +	if (unlikely(dev->flags & IFF_UP)) {
>> +		struct netdev_notifier_change_info change_info;
>> +
>> +		change_info.flags_changed = 0;
>
>Simpler to use structure initialization, which also avoid any chance
>of unititialized fields.
>
>		struct netdev_notifier_change_info change_info
>			= { .flags_changed =  0 };
 
In fact, you can do just:
	struct netdev_notifier_change_info change_info = {};
to achieve the same.

^ permalink raw reply

* Re: [PATCH net v4] failover: allow name change on IFF_UP slave interfaces
From: Stephen Hemminger @ 2019-03-29 15:15 UTC (permalink / raw)
  To: Si-Wei Liu
  Cc: jiri, mst, kubakici, sridhar.samudrala, alexander.duyck,
	virtualization, liran.alon, netdev, boris.ostrovsky, davem
In-Reply-To: <1553816847-28121-1-git-send-email-si-wei.liu@oracle.com>

On Thu, 28 Mar 2019 19:47:27 -0400
Si-Wei Liu <si-wei.liu@oracle.com> wrote:

> +	if (unlikely(dev->flags & IFF_UP)) {
> +		struct netdev_notifier_change_info change_info;
> +
> +		change_info.flags_changed = 0;

Simpler to use structure initialization, which also avoid any chance
of unititialized fields.

		struct netdev_notifier_change_info change_info
			= { .flags_changed =  0 };

^ permalink raw reply

* Re: [PATCH net v4] failover: allow name change on IFF_UP slave interfaces
From: kbuild test robot @ 2019-03-29 13:45 UTC (permalink / raw)
  Cc: jiri, mst, kubakici, sridhar.samudrala, alexander.duyck,
	virtualization, liran.alon, kbuild-all, netdev, si-wei liu,
	boris.ostrovsky, davem
In-Reply-To: <1553816847-28121-1-git-send-email-si-wei.liu@oracle.com>

[-- Attachment #1: Type: text/plain, Size: 5112 bytes --]

Hi Si-Wei,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]

url:    https://github.com/0day-ci/linux/commits/Si-Wei-Liu/failover-allow-name-change-on-IFF_UP-slave-interfaces/20190329-195445
config: x86_64-lkp (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   net/core/dev.c: In function 'dev_change_name':
>> net/core/dev.c:1252:48: error: passing argument 2 of 'call_netdevice_notifiers_info' from incompatible pointer type [-Werror=incompatible-pointer-types]
      call_netdevice_notifiers_info(NETDEV_CHANGE, dev,
                                                   ^~~
   net/core/dev.c:164:12: note: expected 'struct netdev_notifier_info *' but argument is of type 'struct net_device *'
    static int call_netdevice_notifiers_info(unsigned long val,
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> net/core/dev.c:1252:3: error: too many arguments to function 'call_netdevice_notifiers_info'
      call_netdevice_notifiers_info(NETDEV_CHANGE, dev,
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/dev.c:164:12: note: declared here
    static int call_netdevice_notifiers_info(unsigned long val,
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/call_netdevice_notifiers_info +1252 net/core/dev.c

  1166	
  1167	/**
  1168	 *	dev_change_name - change name of a device
  1169	 *	@dev: device
  1170	 *	@newname: name (or format string) must be at least IFNAMSIZ
  1171	 *
  1172	 *	Change name of a device, can pass format strings "eth%d".
  1173	 *	for wildcarding.
  1174	 */
  1175	int dev_change_name(struct net_device *dev, const char *newname)
  1176	{
  1177		unsigned char old_assign_type;
  1178		char oldname[IFNAMSIZ];
  1179		int err = 0;
  1180		int ret;
  1181		struct net *net;
  1182	
  1183		ASSERT_RTNL();
  1184		BUG_ON(!dev_net(dev));
  1185	
  1186		net = dev_net(dev);
  1187	
  1188		/* Allow failover slave to rename even when
  1189		 * it is up and running.
  1190		 *
  1191		 * Failover slaves are special, since userspace
  1192		 * might rename the slave after the interface
  1193		 * has been brought up and running due to
  1194		 * auto-enslavement.
  1195		 *
  1196		 * Failover users don't actually care about slave
  1197		 * name change, as they are only expected to operate
  1198		 * on master interface directly.
  1199		 */
  1200		if (dev->flags & IFF_UP &&
  1201		    likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
  1202			return -EBUSY;
  1203	
  1204		write_seqcount_begin(&devnet_rename_seq);
  1205	
  1206		if (strncmp(newname, dev->name, IFNAMSIZ) == 0) {
  1207			write_seqcount_end(&devnet_rename_seq);
  1208			return 0;
  1209		}
  1210	
  1211		memcpy(oldname, dev->name, IFNAMSIZ);
  1212	
  1213		err = dev_get_valid_name(net, dev, newname);
  1214		if (err < 0) {
  1215			write_seqcount_end(&devnet_rename_seq);
  1216			return err;
  1217		}
  1218	
  1219		if (oldname[0] && !strchr(oldname, '%'))
  1220			netdev_info(dev, "renamed from %s\n", oldname);
  1221	
  1222		old_assign_type = dev->name_assign_type;
  1223		dev->name_assign_type = NET_NAME_RENAMED;
  1224	
  1225	rollback:
  1226		ret = device_rename(&dev->dev, dev->name);
  1227		if (ret) {
  1228			memcpy(dev->name, oldname, IFNAMSIZ);
  1229			dev->name_assign_type = old_assign_type;
  1230			write_seqcount_end(&devnet_rename_seq);
  1231			return ret;
  1232		}
  1233	
  1234		write_seqcount_end(&devnet_rename_seq);
  1235	
  1236		netdev_adjacent_rename_links(dev, oldname);
  1237	
  1238		write_lock_bh(&dev_base_lock);
  1239		hlist_del_rcu(&dev->name_hlist);
  1240		write_unlock_bh(&dev_base_lock);
  1241	
  1242		synchronize_rcu();
  1243	
  1244		write_lock_bh(&dev_base_lock);
  1245		hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net, dev->name));
  1246		write_unlock_bh(&dev_base_lock);
  1247	
  1248		if (unlikely(dev->flags & IFF_UP)) {
  1249			struct netdev_notifier_change_info change_info;
  1250	
  1251			change_info.flags_changed = 0;
> 1252			call_netdevice_notifiers_info(NETDEV_CHANGE, dev,
  1253						      &change_info.info);
  1254		}
  1255	
  1256		ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
  1257		ret = notifier_to_errno(ret);
  1258	
  1259		if (ret) {
  1260			/* err >= 0 after dev_alloc_name() or stores the first errno */
  1261			if (err >= 0) {
  1262				err = ret;
  1263				write_seqcount_begin(&devnet_rename_seq);
  1264				memcpy(dev->name, oldname, IFNAMSIZ);
  1265				memcpy(oldname, newname, IFNAMSIZ);
  1266				dev->name_assign_type = old_assign_type;
  1267				old_assign_type = NET_NAME_RENAMED;
  1268				goto rollback;
  1269			} else {
  1270				pr_err("%s: name change rollback failed: %d\n",
  1271				       dev->name, ret);
  1272			}
  1273		}
  1274	
  1275		return err;
  1276	}
  1277	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26574 bytes --]

[-- Attachment #3: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net v4] failover: allow name change on IFF_UP slave interfaces
From: Samudrala, Sridhar @ 2019-03-29  5:55 UTC (permalink / raw)
  To: Si-Wei Liu, mst, stephen, davem, kubakici, alexander.duyck, jiri,
	netdev, virtualization
  Cc: boris.ostrovsky, liran.alon
In-Reply-To: <1553816847-28121-1-git-send-email-si-wei.liu@oracle.com>



On 3/28/2019 4:47 PM, Si-Wei Liu wrote:
> When a netdev appears through hot plug then gets enslaved by a failover
> master that is already up and running, the slave will be opened
> right away after getting enslaved. Today there's a race that userspace
> (udev) may fail to rename the slave if the kernel (net_failover)
> opens the slave earlier than when the userspace rename happens.
> Unlike bond or team, the primary slave of failover can't be renamed by
> userspace ahead of time, since the kernel initiated auto-enslavement is
> unable to, or rather, is never meant to be synchronized with the rename
> request from userspace.
> 
> As the failover slave interfaces are not designed to be operated
> directly by userspace apps: IP configuration, filter rules with
> regard to network traffic passing and etc., should all be done on master
> interface. In general, userspace apps only care about the
> name of master interface, while slave names are less important as long
> as admin users can see reliable names that may carry
> other information describing the netdev. For e.g., they can infer that
> "ens3nsby" is a standby slave of "ens3", while for a
> name like "eth0" they can't tell which master it belongs to.
> 
> Historically the name of IFF_UP interface can't be changed because
> there might be admin script or management software that is already
> relying on such behavior and assumes that the slave name can't be
> changed once UP. But failover is special: with the in-kernel
> auto-enslavement mechanism, the userspace expectation for device
> enumeration and bring-up order is already broken. Previously initramfs
> and various userspace config tools were modified to bypass failover
> slaves because of auto-enslavement and duplicate MAC address. Similarly,
> in case that users care about seeing reliable slave name, the new type
> of failover slaves needs to be taken care of specifically in userspace
> anyway.
> 
> It's less risky to lift up the rename restriction on failover slave
> which is already UP. Although it's possible this change may potentially
> break userspace component (most likely configuration scripts or
> management software) that assumes slave name can't be changed while
> UP, it's relatively a limited and controllable set among all userspace
> components, which can be fixed specifically to listen for the rename
> and/or link down/up events on failover slaves. Userspace component
> interacting with slaves is expected to be changed to operate on failover
> master interface instead, as the failover slave is dynamic in nature
> which may come and go at any point.  The goal is to make the role of
> failover slaves less relevant, and userspace components should only
> deal with failover master in the long run.
> 
> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> Reviewed-by: Liran Alon <liran.alon@oracle.com>
> 
> --
> v1 -> v2:
> - Drop configurable module parameter (Sridhar)
> 
> v2 -> v3:
> - Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
> - Send down and up events around rename (Michael S. Tsirkin)
> 
> v3 -> v4:
> - Simplify notification to be sent (Stephen Hemminger)
> ---
>   net/core/dev.c | 24 +++++++++++++++++++++++-
>   1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 722d50d..6ae5874 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1180,7 +1180,21 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   	BUG_ON(!dev_net(dev));
>   
>   	net = dev_net(dev);
> -	if (dev->flags & IFF_UP)
> +
> +	/* Allow failover slave to rename even when
> +	 * it is up and running.
> +	 *
> +	 * Failover slaves are special, since userspace
> +	 * might rename the slave after the interface
> +	 * has been brought up and running due to
> +	 * auto-enslavement.
> +	 *
> +	 * Failover users don't actually care about slave
> +	 * name change, as they are only expected to operate
> +	 * on master interface directly.
> +	 */
> +	if (dev->flags & IFF_UP &&
> +	    likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
>   		return -EBUSY;
>   
>   	write_seqcount_begin(&devnet_rename_seq);
> @@ -1227,6 +1241,14 @@ int dev_change_name(struct net_device *dev, const char *newname)
>   	hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net, dev->name));
>   	write_unlock_bh(&dev_base_lock);
>   
> +	if (unlikely(dev->flags & IFF_UP)) {
> +		struct netdev_notifier_change_info change_info;
> +
> +		change_info.flags_changed = 0;
> +		call_netdevice_notifiers_info(NETDEV_CHANGE, dev,
> +					      &change_info.info);

This function no longer takes the dev parameter in the net-next kernel.

Did you consider calling netdev_state_change() although it does send a 
RTM_NEWLINK message too. May be just fixing the notifier call should be 
fine.

> +	}
> +
>   	ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
>   	ret = notifier_to_errno(ret);
>   
> 

^ permalink raw reply

* Re: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces
From: kbuild test robot @ 2019-03-28 19:49 UTC (permalink / raw)
  Cc: jiri, mst, kubakici, sridhar.samudrala, alexander.duyck,
	virtualization, liran.alon, kbuild-all, netdev, si-wei liu,
	boris.ostrovsky, davem
In-Reply-To: <1553644093-10917-1-git-send-email-si-wei.liu@oracle.com>

[-- Attachment #1: Type: text/plain, Size: 4836 bytes --]

Hi Si-Wei,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]

url:    https://github.com/0day-ci/linux/commits/Si-Wei-Liu/failover-allow-name-change-on-IFF_UP-slave-interfaces/20190329-020744
config: openrisc-or1ksim_defconfig (attached as .config)
compiler: or1k-linux-gcc (GCC) 6.0.0 20160327 (experimental)
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=openrisc 

All errors (new ones prefixed by >>):

   net/core/dev.c: In function 'dev_change_name':
>> net/core/dev.c:1277:9: error: too few arguments to function 'dev_open'
      ret = dev_open(dev);
            ^~~~~~~~
   In file included from net/core/dev.c:93:0:
   include/linux/netdevice.h:2636:5: note: declared here
    int dev_open(struct net_device *dev, struct netlink_ext_ack *extack);
        ^~~~~~~~

vim +/dev_open +1277 net/core/dev.c

  1166	
  1167	/**
  1168	 *	dev_change_name - change name of a device
  1169	 *	@dev: device
  1170	 *	@newname: name (or format string) must be at least IFNAMSIZ
  1171	 *
  1172	 *	Change name of a device, can pass format strings "eth%d".
  1173	 *	for wildcarding.
  1174	 */
  1175	int dev_change_name(struct net_device *dev, const char *newname)
  1176	{
  1177		unsigned char old_assign_type;
  1178		bool reopen_needed = false;
  1179		char oldname[IFNAMSIZ];
  1180		int err = 0;
  1181		int ret;
  1182		struct net *net;
  1183	
  1184		ASSERT_RTNL();
  1185		BUG_ON(!dev_net(dev));
  1186	
  1187		net = dev_net(dev);
  1188	
  1189		/* Allow failover slave to rename even when
  1190		 * it is up and running.
  1191		 *
  1192		 * Failover slaves are special, since userspace
  1193		 * might rename the slave after the interface
  1194		 * has been brought up and running due to
  1195		 * auto-enslavement.
  1196		 *
  1197		 * Failover users don't actually care about slave
  1198		 * name change, as they are only expected to operate
  1199		 * on master interface directly.
  1200		 */
  1201		if (dev->flags & IFF_UP) {
  1202			if (likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
  1203				return -EBUSY;
  1204			reopen_needed = true;
  1205		}
  1206	
  1207		write_seqcount_begin(&devnet_rename_seq);
  1208	
  1209		if (strncmp(newname, dev->name, IFNAMSIZ) == 0) {
  1210			write_seqcount_end(&devnet_rename_seq);
  1211			return 0;
  1212		}
  1213	
  1214		memcpy(oldname, dev->name, IFNAMSIZ);
  1215	
  1216		err = dev_get_valid_name(net, dev, newname);
  1217		if (err < 0) {
  1218			write_seqcount_end(&devnet_rename_seq);
  1219			return err;
  1220		}
  1221	
  1222		if (reopen_needed)
  1223			dev_close(dev);
  1224	
  1225		if (oldname[0] && !strchr(oldname, '%'))
  1226			netdev_info(dev, "renamed from %s\n", oldname);
  1227	
  1228		old_assign_type = dev->name_assign_type;
  1229		dev->name_assign_type = NET_NAME_RENAMED;
  1230	
  1231	rollback:
  1232		ret = device_rename(&dev->dev, dev->name);
  1233		if (ret) {
  1234			memcpy(dev->name, oldname, IFNAMSIZ);
  1235			dev->name_assign_type = old_assign_type;
  1236			write_seqcount_end(&devnet_rename_seq);
  1237			if (err >= 0)
  1238				err = ret;
  1239			goto reopen;
  1240		}
  1241	
  1242		write_seqcount_end(&devnet_rename_seq);
  1243	
  1244		netdev_adjacent_rename_links(dev, oldname);
  1245	
  1246		write_lock_bh(&dev_base_lock);
  1247		hlist_del_rcu(&dev->name_hlist);
  1248		write_unlock_bh(&dev_base_lock);
  1249	
  1250		synchronize_rcu();
  1251	
  1252		write_lock_bh(&dev_base_lock);
  1253		hlist_add_head_rcu(&dev->name_hlist, dev_name_hash(net, dev->name));
  1254		write_unlock_bh(&dev_base_lock);
  1255	
  1256		ret = call_netdevice_notifiers(NETDEV_CHANGENAME, dev);
  1257		ret = notifier_to_errno(ret);
  1258	
  1259		if (ret) {
  1260			/* err >= 0 after dev_alloc_name() or stores the first errno */
  1261			if (err >= 0) {
  1262				err = ret;
  1263				write_seqcount_begin(&devnet_rename_seq);
  1264				memcpy(dev->name, oldname, IFNAMSIZ);
  1265				memcpy(oldname, newname, IFNAMSIZ);
  1266				dev->name_assign_type = old_assign_type;
  1267				old_assign_type = NET_NAME_RENAMED;
  1268				goto rollback;
  1269			} else {
  1270				pr_err("%s: name change rollback failed: %d\n",
  1271				       dev->name, ret);
  1272			}
  1273		}
  1274	
  1275	reopen:
  1276		if (reopen_needed) {
> 1277			ret = dev_open(dev);
  1278			if (ret) {
  1279				pr_err("%s: reopen device failed: %d\n",
  1280				       dev->name, ret);
  1281			}
  1282		}
  1283	
  1284		return err;
  1285	}
  1286	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 8104 bytes --]

[-- Attachment #3: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces
From: Stephen Hemminger @ 2019-03-28 17:14 UTC (permalink / raw)
  To: si-wei liu
  Cc: Jiri Pirko, mst, kubakici, sridhar.samudrala, alexander.duyck,
	virtualization, liran.alon, netdev, boris.ostrovsky, davem
In-Reply-To: <06fa1aec-d9a6-3ca7-8849-a1656349ab83@oracle.com>

On Wed, 27 Mar 2019 16:44:19 -0700
si-wei liu <si-wei.liu@oracle.com> wrote:

> On 3/27/2019 4:11 AM, Jiri Pirko wrote:
> > Wed, Mar 27, 2019 at 12:48:13AM CET, si-wei.liu@oracle.com wrote:  
> >> When a netdev appears through hot plug then gets enslaved by a failover
> >> master that is already up and running, the slave will be opened
> >> right away after getting enslaved. Today there's a race that userspace
> >> (udev) may fail to rename the slave if the kernel (net_failover)
> >> opens the slave earlier than when the userspace rename happens.
> >> Unlike bond or team, the primary slave of failover can't be renamed by
> >> userspace ahead of time, since the kernel initiated auto-enslavement is
> >> unable to, or rather, is never meant to be synchronized with the rename
> >> request from userspace.
> >>
> >> As the failover slave interfaces are not designed to be operated
> >> directly by userspace apps: IP configuration, filter rules with
> >> regard to network traffic passing and etc., should all be done on master
> >> interface. In general, userspace apps only care about the
> >> name of master interface, while slave names are less important as long
> >> as admin users can see reliable names that may carry
> >> other information describing the netdev. For e.g., they can infer that
> >> "ens3nsby" is a standby slave of "ens3", while for a
> >> name like "eth0" they can't tell which master it belongs to.
> >>
> >> Historically the name of IFF_UP interface can't be changed because
> >> there might be admin script or management software that is already
> >> relying on such behavior and assumes that the slave name can't be
> >> changed once UP. But failover is special: with the in-kernel
> >> auto-enslavement mechanism, the userspace expectation for device
> >> enumeration and bring-up order is already broken. Previously initramfs
> >> and various userspace config tools were modified to bypass failover
> >> slaves because of auto-enslavement and duplicate MAC address. Similarly,
> >> in case that users care about seeing reliable slave name, the new type
> >> of failover slaves needs to be taken care of specifically in userspace
> >> anyway.
> >>
> >> It's less risky to lift up the rename restriction on failover slave
> >> which is already UP. Although it's possible this change may potentially
> >> break userspace component (most likely configuration scripts or
> >> management software) that assumes slave name can't be changed while
> >> UP, it's relatively a limited and controllable set among all userspace
> >> components, which can be fixed specifically to listen for the rename
> >> and/or link down/up events on failover slaves. Userspace component
> >> interacting with slaves is expected to be changed to operate on failover
> >> master interface instead, as the failover slave is dynamic in nature
> >> which may come and go at any point.  The goal is to make the role of
> >> failover slaves less relevant, and userspace components should only
> >> deal with failover master in the long run.
> >>
> >> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
> >> Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
> >> Reviewed-by: Liran Alon <liran.alon@oracle.com>
> >>
> >> --
> >> v1 -> v2:
> >> - Drop configurable module parameter (Sridhar)
> >>
> >> v2 -> v3:
> >> - Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
> >> - Send down and up events around rename (Michael S. Tsirkin)
> >> ---
> >> net/core/dev.c | 37 ++++++++++++++++++++++++++++++++++---
> >> 1 file changed, 34 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/net/core/dev.c b/net/core/dev.c
> >> index 722d50d..3e0cd80 100644
> >> --- a/net/core/dev.c
> >> +++ b/net/core/dev.c
> >> @@ -1171,6 +1171,7 @@ int dev_get_valid_name(struct net *net, struct net_device *dev,
> >> int dev_change_name(struct net_device *dev, const char *newname)
> >> {
> >> 	unsigned char old_assign_type;
> >> +	bool reopen_needed = false;
> >> 	char oldname[IFNAMSIZ];
> >> 	int err = 0;
> >> 	int ret;
> >> @@ -1180,8 +1181,24 @@ int dev_change_name(struct net_device *dev, const char *newname)
> >> 	BUG_ON(!dev_net(dev));
> >>
> >> 	net = dev_net(dev);
> >> -	if (dev->flags & IFF_UP)
> >> -		return -EBUSY;
> >> +
> >> +	/* Allow failover slave to rename even when
> >> +	 * it is up and running.
> >> +	 *
> >> +	 * Failover slaves are special, since userspace
> >> +	 * might rename the slave after the interface
> >> +	 * has been brought up and running due to
> >> +	 * auto-enslavement.
> >> +	 *
> >> +	 * Failover users don't actually care about slave
> >> +	 * name change, as they are only expected to operate
> >> +	 * on master interface directly.
> >> +	 */
> >> +	if (dev->flags & IFF_UP) {
> >> +		if (likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
> >> +			return -EBUSY;
> >> +		reopen_needed = true;
> >> +	}
> >>
> >> 	write_seqcount_begin(&devnet_rename_seq);
> >>
> >> @@ -1198,6 +1215,9 @@ int dev_change_name(struct net_device *dev, const char *newname)
> >> 		return err;
> >> 	}
> >>
> >> +	if (reopen_needed)
> >> +		dev_close(dev);  
> > Ugh. Don't dev_close/dev_open on name change.  
> See my response to Michael and Stephen. What's your suggestion then?

To a DEV_CHANGE notification instead?


My opinion is that allowing name change is not worth the doing.
Also, the kernel should never do the name change, it is up to userspace.

^ permalink raw reply

* Re: [PATCH] MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE
From: Juergen Gross @ 2019-03-28 14:03 UTC (permalink / raw)
  To: Thomas Hellstrom, linux-kernel@vger.kernel.org
  Cc: Pv-drivers, x86@kernel.org,
	virtualization@lists.linux-foundation.org, mingo@redhat.com,
	hpa@zytor.com, Alok Kataria, tglx@linutronix.de
In-Reply-To: <20190328120558.29897-1-thellstrom@vmware.com>

On 28/03/2019 13:06, Thomas Hellstrom wrote:
> Alok Kataria will be handing over VMware's maintainership of these
> interfaces to Thomas Hellström, with pv-drivers as backup contact.
> 
> Cc: Juergen Gross <jgross@suse.com>
> Cc: virtualization@lists.linux-foundation.org
> Cc: x86@kernel.org
> Cc: tglx@linutronix.de
> Cc: mingo@redhat.com
> Cc: hpa@zytor.com
> Cc: linux-kernel@vger.kernel.org
> Cc: Alok Kataria <akataria@vmware.com>
> Cc: <pv-drivers@vmware.com>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
> Acked-by: Alok Kataria <akataria@vmware.com>

Acked-by: Juergen Gross <jgross@suse.com>


Juergen
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* [PATCH] MAINTAINERS: Update PARAVIRT_OPS_INTERFACE and VMWARE_HYPERVISOR_INTERFACE
From: Thomas Hellstrom via Virtualization @ 2019-03-28 12:06 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org
  Cc: Juergen Gross, Thomas Hellstrom, Pv-drivers, x86@kernel.org,
	virtualization@lists.linux-foundation.org, mingo@redhat.com,
	hpa@zytor.com, tglx@linutronix.de, Alok Kataria

Alok Kataria will be handing over VMware's maintainership of these
interfaces to Thomas Hellström, with pv-drivers as backup contact.

Cc: Juergen Gross <jgross@suse.com>
Cc: virtualization@lists.linux-foundation.org
Cc: x86@kernel.org
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: linux-kernel@vger.kernel.org
Cc: Alok Kataria <akataria@vmware.com>
Cc: <pv-drivers@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Alok Kataria <akataria@vmware.com>
---
 MAINTAINERS | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3e5a5d263f29..160b208b93d6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11711,7 +11711,8 @@ F:	Documentation/parport*.txt
 
 PARAVIRT_OPS INTERFACE
 M:	Juergen Gross <jgross@suse.com>
-M:	Alok Kataria <akataria@vmware.com>
+M:	Thomas Hellstrom <thellstrom@vmware.com>
+M:	"VMware, Inc." <pv-drivers@vmware.com>
 L:	virtualization@lists.linux-foundation.org
 S:	Supported
 F:	Documentation/virtual/paravirt_ops.txt
@@ -16620,7 +16621,8 @@ S:	Maintained
 F:	drivers/misc/vmw_balloon.c
 
 VMWARE HYPERVISOR INTERFACE
-M:	Alok Kataria <akataria@vmware.com>
+M:	Thomas Hellstrom <thellstrom@vmware.com>
+M:	"VMware, Inc." <pv-drivers@vmware.com>
 L:	virtualization@lists.linux-foundation.org
 S:	Supported
 F:	arch/x86/kernel/cpu/vmware.c
-- 
2.20.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related

* Re: [PATCH v2 -next] drm/virtio: remove set but not used variable 'vgdev'
From: Gerd Hoffmann @ 2019-03-28 10:32 UTC (permalink / raw)
  To: YueHaibing
  Cc: David Airlie, kernel-janitors, linux-kernel, dri-devel,
	virtualization, Daniel Vetter
In-Reply-To: <20190325092631.152060-1-yuehaibing@huawei.com>

On Mon, Mar 25, 2019 at 09:26:31AM +0000, YueHaibing wrote:
> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> drivers/gpu/drm/virtio/virtgpu_ttm.c: In function 'virtio_gpu_init_mem_type':
> drivers/gpu/drm/virtio/virtgpu_ttm.c:117:28: warning:
>  variable 'vgdev' set but not used [-Wunused-but-set-variable]
> 
> drivers/gpu/drm/virtio/virtgpu_ttm.c: In function 'virtio_gpu_bo_swap_notify':
> drivers/gpu/drm/virtio/virtgpu_ttm.c:300:28: warning:
>  variable 'vgdev' set but not used [-Wunused-but-set-variable]
> 
> It is never used since introduction in dc5698e80cf7 ("Add virtio gpu driver.")

Patch queued up.

thanks,
  Gerd

^ permalink raw reply

* [RFC PATCH 18/68] vfs: Convert virtio_balloon to use the new mount API
From: David Howells @ 2019-03-27 23:42 UTC (permalink / raw)
  To: viro
  Cc: Michael S. Tsirkin, linux-kernel, virtualization, dhowells,
	linux-fsdevel
In-Reply-To: <155372999953.7602.13784796495137723805.stgit@warthog.procyon.org.uk>

Convert the virtio_balloon filesystem to the new internal mount API as the old
one will be obsoleted and removed.  This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "Michael S. Tsirkin" <mst@redhat.com>
cc: Jason Wang <jasowang@redhat.com>
cc: virtualization@lists.linux-foundation.org
---

 drivers/virtio/virtio_balloon.c |   19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index f19061b585a4..89d67c8aa719 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -757,21 +757,22 @@ static int virtballoon_migratepage(struct balloon_dev_info *vb_dev_info,
 
 	return MIGRATEPAGE_SUCCESS;
 }
+#include <linux/fs_context.h>
 
-static struct dentry *balloon_mount(struct file_system_type *fs_type,
-		int flags, const char *dev_name, void *data)
-{
-	static const struct dentry_operations ops = {
-		.d_dname = simple_dname,
-	};
+static const struct dentry_operations balloon_dops = {
+	.d_dname = simple_dname,
+};
 
-	return mount_pseudo(fs_type, "balloon-kvm:", NULL, &ops,
-				BALLOON_KVM_MAGIC);
+static int balloon_init_fs_context(struct fs_context *fc)
+{
+	return vfs_init_pseudo_fs_context(fc, "balloon-kvm:",
+					  NULL, NULL,
+					  &balloon_dops, BALLOON_KVM_MAGIC);
 }
 
 static struct file_system_type balloon_fs = {
 	.name           = "balloon-kvm",
-	.mount          = balloon_mount,
+	.init_fs_context = balloon_init_fs_context,
 	.kill_sb        = kill_anon_super,
 };

^ permalink raw reply related

* [RFC PATCH 00/68] VFS: Convert a bunch of filesystems to the new mount API
From: David Howells @ 2019-03-27 23:40 UTC (permalink / raw)
  To: viro
  Cc: Sergey Senozhatsky, Rafael J. Wysocki, dri-devel, J. Bruce Fields,
	ceph-devel, Mike Marshall, xen-devel, devel, Felipe Balbi,
	Uma Krishnan, Manoj N. Kumar, Joel Becker, Trond Myklebust,
	Mike Marciniszyn, Greg Kroah-Hartman, linux-usb, linux-kernel,
	Arve Hjønnevåg, Frederic Barrat, Martin Brandenburg,
	linux-ia64, David Airlie, virtualization


Hi Al,

Here's a set of patches that converts a bunch (but not yet all!) to the new
mount API.  To this end, it makes the following changes:

 (1) Provides a convenience member in struct fs_context that is OR'd into
     sb->s_iflags by sget_fc().

 (2) Provides a convenience helper function, vfs_init_pseudo_fs_context(),
     for doing most of the work in mounting a pseudo filesystem.

 (3) Provides a convenience helper function, vfs_get_block_super(), for
     doing the work in setting up a block-based superblock.

 (4) Improves the handling of fd-type parameters.

 (5) Moves some of the subtype handling int fuse.

 (6) Provides a convenience helper function, vfs_get_mtd_super(), for
     doing the work in setting up an MTD device-based superblock.

 (7) Kills off mount_pseudo(), mount_pseudo_xattr(), mount_ns(),
     sget_userns(), mount_mtd(), mount_single().

 (8) Converts a slew of filesystems to use the mount API.

 (9) Fixes a bug in hypfs.

The patches can be found here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git

on branch:

	mount-api-viro

David
---
Andrew Price (1):
      gfs2: Convert gfs2 to fs_context

David Howells (66):
      vfs: Update mount API docs
      vfs: Fix refcounting of filenames in fs_parser
      vfs: Provide sb->s_iflags settings in fs_context struct
      vfs: Provide a mount_pseudo-replacement for the new mount API
      vfs: Convert aio to use the new mount API
      vfs: Convert anon_inodes to use the new mount API
      vfs: Convert bdev to use the new mount API
      vfs: Convert nsfs to use the new mount API
      vfs: Convert pipe to use the new mount API
      vfs: Convert zsmalloc to use the new mount API
      vfs: Convert sockfs to use the new mount API
      vfs: Convert dax to use the new mount API
      vfs: Convert drm to use the new mount API
      vfs: Convert ia64 perfmon to use the new mount API
      vfs: Convert cxl to use the new mount API
      vfs: Convert ocxlflash to use the new mount API
      vfs: Convert virtio_balloon to use the new mount API
      vfs: Convert btrfs_test to use the new mount API
      vfs: Kill off mount_pseudo() and mount_pseudo_xattr()
      vfs: Use sget_fc() for pseudo-filesystems
      vfs: Convert binderfs to use the new mount API
      vfs: Convert nfsctl to use the new mount API
      vfs: Convert rpc_pipefs to use the new mount API
      vfs: Kill mount_ns()
      vfs: Kill sget_userns()
      vfs: Convert binfmt_misc to use the new mount API
      vfs: Convert configfs to use the new mount API
      vfs: Convert efivarfs to use the new mount API
      vfs: Convert fusectl to use the new mount API
      vfs: Convert qib_fs/ipathfs to use the new mount API
      vfs: Convert ibmasmfs to use the new mount API
      vfs: Convert oprofilefs to use the new mount API
      vfs: Convert gadgetfs to use the new mount API
      vfs: Convert xenfs to use the new mount API
      vfs: Convert openpromfs to use the new mount API
      vfs: Convert apparmorfs to use the new mount API
      vfs: Convert securityfs to use the new mount API
      vfs: Convert selinuxfs to use the new mount API
      vfs: Convert smackfs to use the new mount API
      vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API
      vfs: Create fs_context-aware mount_bdev() replacement
      vfs: Make fs_parse() handle fs_param_is_fd-type params better
      vfs: Convert fuse to use the new mount API
      vfs: Move the subtype parameter into fuse
      mtd: Provide fs_context-aware mount_mtd() replacement
      vfs: Convert romfs to use the new mount API
      vfs: Convert cramfs to use the new mount API
      vfs: Convert jffs2 to use the new mount API
      mtd: Kill mount_mtd()
      vfs: Convert squashfs to use the new mount API
      vfs: Convert ceph to use the new mount API
      vfs: Convert functionfs to use the new mount API
      vfs: Add a single-or-reconfig keying to vfs_get_super()
      vfs: Convert debugfs to use the new mount API
      vfs: Convert tracefs to use the new mount API
      vfs: Convert pstore to use the new mount API
      hypfs: Fix error number left in struct pointer member
      vfs: Convert hypfs to use the new mount API
      vfs: Convert spufs to use the new mount API
      vfs: Kill mount_single()
      vfs: Convert coda to use the new mount API
      vfs: Convert autofs to use the new mount API
      vfs: Convert devpts to use the new mount API
      vfs: Convert bpf to use the new mount API
      vfs: Convert ubifs to use the new mount API
      vfs: Convert orangefs to use the new mount API

Masahiro Yamada (1):
      kbuild: skip sub-make for in-tree build with GNU Make 4.x


 Documentation/filesystems/mount_api.txt   |  367 ++++++++-------
 Documentation/filesystems/vfs.txt         |    4 
 Makefile                                  |   31 +
 arch/ia64/kernel/perfmon.c                |   14 -
 arch/powerpc/platforms/cell/spufs/inode.c |  207 +++++----
 arch/s390/hypfs/inode.c                   |  137 +++---
 drivers/android/binderfs.c                |  173 ++++---
 drivers/base/devtmpfs.c                   |   16 -
 drivers/block/rbd.c                       |  363 ++++++++-------
 drivers/dax/super.c                       |   13 -
 drivers/gpu/drm/drm_drv.c                 |   14 -
 drivers/infiniband/hw/qib/qib_fs.c        |   26 +
 drivers/misc/cxl/api.c                    |   10 
 drivers/misc/ibmasm/ibmasmfs.c            |   21 +
 drivers/mtd/mtdcore.h                     |    1 
 drivers/mtd/mtdsuper.c                    |  181 ++++----
 drivers/oprofile/oprofilefs.c             |   20 +
 drivers/scsi/cxlflash/ocxl_hw.c           |   21 -
 drivers/usb/gadget/function/f_fs.c        |  233 +++++-----
 drivers/usb/gadget/legacy/inode.c         |   21 +
 drivers/virtio/virtio_balloon.c           |   19 -
 drivers/xen/xenfs/super.c                 |   21 +
 fs/aio.c                                  |   15 -
 fs/anon_inodes.c                          |   12 
 fs/autofs/autofs_i.h                      |   13 -
 fs/autofs/init.c                          |    9 
 fs/autofs/inode.c                         |  429 ++++++++++--------
 fs/binfmt_misc.c                          |   20 +
 fs/block_dev.c                            |   14 -
 fs/btrfs/tests/btrfs-tests.c              |   13 -
 fs/ceph/cache.c                           |    9 
 fs/ceph/cache.h                           |    5 
 fs/ceph/super.c                           |  697 ++++++++++++++---------------
 fs/ceph/super.h                           |    1 
 fs/coda/inode.c                           |  171 +++++--
 fs/configfs/mount.c                       |   20 +
 fs/cramfs/inode.c                         |   69 ++-
 fs/debugfs/inode.c                        |  186 ++++----
 fs/devpts/inode.c                         |  265 +++++------
 fs/efivarfs/super.c                       |   20 +
 fs/fs_context.c                           |   16 -
 fs/fs_parser.c                            |   18 +
 fs/fuse/control.c                         |   20 +
 fs/fuse/inode.c                           |  291 +++++++-----
 fs/gfs2/incore.h                          |    8 
 fs/gfs2/ops_fstype.c                      |  495 ++++++++++++++++-----
 fs/gfs2/super.c                           |  335 --------------
 fs/gfs2/super.h                           |    3 
 fs/jffs2/fs.c                             |   21 -
 fs/jffs2/os-linux.h                       |    4 
 fs/jffs2/super.c                          |  172 +++----
 fs/libfs.c                                |   91 +++-
 fs/nfsd/nfsctl.c                          |   33 +
 fs/nsfs.c                                 |   13 -
 fs/openpromfs/inode.c                     |   20 +
 fs/orangefs/orangefs-kernel.h             |    8 
 fs/orangefs/orangefs-mod.c                |    3 
 fs/orangefs/super.c                       |  186 ++++----
 fs/pipe.c                                 |   12 
 fs/pstore/inode.c                         |  109 +++--
 fs/ramfs/inode.c                          |  104 +++-
 fs/romfs/super.c                          |   46 +-
 fs/squashfs/super.c                       |  100 ++--
 fs/super.c                                |  301 ++++++-------
 fs/tracefs/inode.c                        |  180 +++----
 fs/ubifs/super.c                          |  447 ++++++++-----------
 include/linux/ceph/ceph_debug.h           |    1 
 include/linux/ceph/libceph.h              |   17 +
 include/linux/fs.h                        |   24 -
 include/linux/fs_context.h                |   20 +
 include/linux/mtd/super.h                 |    6 
 include/linux/ramfs.h                     |    6 
 include/linux/shmem_fs.h                  |    4 
 init/do_mounts.c                          |   12 
 kernel/bpf/inode.c                        |   92 ++--
 mm/shmem.c                                |  396 +++++++++++-----
 mm/zsmalloc.c                             |   19 -
 net/ceph/ceph_common.c                    |  410 +++++++----------
 net/socket.c                              |   14 -
 net/sunrpc/rpc_pipe.c                     |   34 +
 security/apparmor/apparmorfs.c            |   20 +
 security/inode.c                          |   21 +
 security/selinux/selinuxfs.c              |   20 +
 security/smack/smackfs.c                  |   34 +
 84 files changed, 4257 insertions(+), 3810 deletions(-)

^ permalink raw reply

* Re: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces
From: Michael S. Tsirkin @ 2019-03-27 22:16 UTC (permalink / raw)
  To: si-wei liu
  Cc: jiri, kubakici, sridhar.samudrala, alexander.duyck,
	virtualization, liran.alon, netdev, boris.ostrovsky, davem
In-Reply-To: <f90d9dfa-f2f1-ff04-f3fe-88fa0deffdf7@oracle.com>

On Wed, Mar 27, 2019 at 01:10:10PM -0700, si-wei liu wrote:
> Another less safer option is that we just notify userspace anyway without
> sending down/up event around, as I don't see *any real application* cares
> about the link state or whatsoever when it attempts to detect rename.

How do you write a race ree handler then? ATM just detecting link up is
sufficient and covers 100% of cases. Seems like a good idea to keep it
that way.

-- 
MST

^ permalink raw reply

* Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi
From: Michael S. Tsirkin @ 2019-03-27 17:01 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: pbonzini, virtualization
In-Reply-To: <20190327165754.GF27283@stefanha-x1.localdomain>

On Wed, Mar 27, 2019 at 04:57:54PM +0000, Stefan Hajnoczi wrote:
> On Wed, Mar 27, 2019 at 10:33:57AM -0400, Michael S. Tsirkin wrote:
> > Jason doesn't really have the time to review blk/scsi
> > patches. Paolo and Setfan agreed to help out.
> > 
> > Thanks guys!
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > ---
> 
> There is relatively little activity in this area so I'd like to reply
> with Reviewed-by/Acked-by on the mailing list and have patches merged
> via your virtio tree.  That way I do not maintain a sub-tree and send
> you pull requests.  Does this sound good?
> 
> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>


v2 does that. pls ack that one then.

^ permalink raw reply

* Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi
From: Paolo Bonzini @ 2019-03-27 17:01 UTC (permalink / raw)
  To: Stefan Hajnoczi, Michael S. Tsirkin; +Cc: virtualization
In-Reply-To: <20190327165754.GF27283@stefanha-x1.localdomain>


[-- Attachment #1.1.1: Type: text/plain, Size: 811 bytes --]

On 27/03/19 17:57, Stefan Hajnoczi wrote:
> On Wed, Mar 27, 2019 at 10:33:57AM -0400, Michael S. Tsirkin wrote:
>> Jason doesn't really have the time to review blk/scsi
>> patches. Paolo and Setfan agreed to help out.
>>
>> Thanks guys!
>>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>
>> ---
> 
> There is relatively little activity in this area so I'd like to reply
> with Reviewed-by/Acked-by on the mailing list and have patches merged
> via your virtio tree.  That way I do not maintain a sub-tree and send
> you pull requests.  Does this sound good?

FWIW me too, that's why I suggested that Michael add us as reviewers
rather than maintainers. So,

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

too.

Paolo

> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
> 



[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi
From: Stefan Hajnoczi @ 2019-03-27 16:57 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: pbonzini, virtualization
In-Reply-To: <20190327103254-mutt-send-email-mst@kernel.org>

[-- Attachment #1.1: Type: text/plain, Size: 573 bytes --]

On Wed, Mar 27, 2019 at 10:33:57AM -0400, Michael S. Tsirkin wrote:
> Jason doesn't really have the time to review blk/scsi
> patches. Paolo and Setfan agreed to help out.
> 
> Thanks guys!
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---

There is relatively little activity in this area so I'd like to reply
with Reviewed-by/Acked-by on the mailing list and have patches merged
via your virtio tree.  That way I do not maintain a sub-tree and send
you pull requests.  Does this sound good?

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 183 bytes --]

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi
From: Michael S. Tsirkin @ 2019-03-27 15:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: stefanha, virtualization
In-Reply-To: <1a5d4b76-e11d-3a12-4971-64581bffe636@redhat.com>

On Wed, Mar 27, 2019 at 04:08:05PM +0100, Paolo Bonzini wrote:
> On 27/03/19 15:33, Michael S. Tsirkin wrote:
> > Jason doesn't really have the time to review blk/scsi
> > patches. Paolo and Setfan agreed to help out.
> > 
> > Thanks guys!
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > ---
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 95a5ebecd04f..8326d19c1681 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -16247,7 +16247,7 @@ F:	drivers/char/virtio_console.c
> >  F:	include/linux/virtio_console.h
> >  F:	include/uapi/linux/virtio_console.h
> >  
> > -VIRTIO CORE, NET AND BLOCK DRIVERS
> > +VIRTIO CORE AND NET DRIVERS
> >  M:	"Michael S. Tsirkin" <mst@redhat.com>
> >  M:	Jason Wang <jasowang@redhat.com>
> >  L:	virtualization@lists.linux-foundation.org
> > @@ -16262,6 +16262,18 @@ F:	include/uapi/linux/virtio_*.h
> >  F:	drivers/crypto/virtio/
> >  F:	mm/balloon_compaction.c
> >  
> > +VIRTIO BLOCK AND SCSI DRIVERS
> > +M:	"Michael S. Tsirkin" <mst@redhat.com>
> > +M:	Paolo Bonzini <pbonzini@redhat.com>
> > +M:	Stefan Hajnoczi <stefanha@redhat.com>
> 
> Please make this R at least for me, so that it's clear that patches
> still flow through your and Jason's tree.  Not sure if you want to keep
> Jason as M.
> 
> Paolo

Not for block I think. He never reviews these.

> > +L:	virtualization@lists.linux-foundation.org
> > +S:	Maintained
> > +F:	drivers/block/virtio_blk.c
> > +F:	drivers/scsi/virtio_scsi.c
> > +F:	include/uapi/linux/virtio_blk.h
> > +F:	include/uapi/linux/virtio_scsi.h
> > +F:	drivers/vhost/scsi.c
> > +
> >  VIRTIO CRYPTO DRIVER
> >  M:	Gonglei <arei.gonglei@huawei.com>
> >  L:	virtualization@lists.linux-foundation.org
> > 

^ permalink raw reply

* Re: [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi
From: Paolo Bonzini @ 2019-03-27 15:08 UTC (permalink / raw)
  To: Michael S. Tsirkin, stefanha, virtualization
In-Reply-To: <20190327103254-mutt-send-email-mst@kernel.org>

On 27/03/19 15:33, Michael S. Tsirkin wrote:
> Jason doesn't really have the time to review blk/scsi
> patches. Paolo and Setfan agreed to help out.
> 
> Thanks guys!
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 95a5ebecd04f..8326d19c1681 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16247,7 +16247,7 @@ F:	drivers/char/virtio_console.c
>  F:	include/linux/virtio_console.h
>  F:	include/uapi/linux/virtio_console.h
>  
> -VIRTIO CORE, NET AND BLOCK DRIVERS
> +VIRTIO CORE AND NET DRIVERS
>  M:	"Michael S. Tsirkin" <mst@redhat.com>
>  M:	Jason Wang <jasowang@redhat.com>
>  L:	virtualization@lists.linux-foundation.org
> @@ -16262,6 +16262,18 @@ F:	include/uapi/linux/virtio_*.h
>  F:	drivers/crypto/virtio/
>  F:	mm/balloon_compaction.c
>  
> +VIRTIO BLOCK AND SCSI DRIVERS
> +M:	"Michael S. Tsirkin" <mst@redhat.com>
> +M:	Paolo Bonzini <pbonzini@redhat.com>
> +M:	Stefan Hajnoczi <stefanha@redhat.com>

Please make this R at least for me, so that it's clear that patches
still flow through your and Jason's tree.  Not sure if you want to keep
Jason as M.

Paolo

> +L:	virtualization@lists.linux-foundation.org
> +S:	Maintained
> +F:	drivers/block/virtio_blk.c
> +F:	drivers/scsi/virtio_scsi.c
> +F:	include/uapi/linux/virtio_blk.h
> +F:	include/uapi/linux/virtio_scsi.h
> +F:	drivers/vhost/scsi.c
> +
>  VIRTIO CRYPTO DRIVER
>  M:	Gonglei <arei.gonglei@huawei.com>
>  L:	virtualization@lists.linux-foundation.org
> 

^ permalink raw reply

* Re: [PATCH v2 -next] drm/virtio: remove set but not used variable 'vgdev'
From: Mukesh Ojha @ 2019-03-27 14:47 UTC (permalink / raw)
  To: YueHaibing, David Airlie, Gerd Hoffmann, Daniel Vetter
  Cc: kernel-janitors, linux-kernel, dri-devel, virtualization
In-Reply-To: <20190325092631.152060-1-yuehaibing@huawei.com>


On 3/25/2019 2:56 PM, YueHaibing wrote:
> Fixes gcc '-Wunused-but-set-variable' warning:
>
> drivers/gpu/drm/virtio/virtgpu_ttm.c: In function 'virtio_gpu_init_mem_type':
> drivers/gpu/drm/virtio/virtgpu_ttm.c:117:28: warning:
>   variable 'vgdev' set but not used [-Wunused-but-set-variable]
>
> drivers/gpu/drm/virtio/virtgpu_ttm.c: In function 'virtio_gpu_bo_swap_notify':
> drivers/gpu/drm/virtio/virtgpu_ttm.c:300:28: warning:
>   variable 'vgdev' set but not used [-Wunused-but-set-variable]
>
> It is never used since introduction in dc5698e80cf7 ("Add virtio gpu driver.")
>
> Signed-off-by: YueHaibing <yuehaibing@huawei.com>


Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>

-Mukesh

> ---
> v2: fix patch prefix
> ---
>   drivers/gpu/drm/virtio/virtgpu_ttm.c | 6 ------
>   1 file changed, 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ttm.c b/drivers/gpu/drm/virtio/virtgpu_ttm.c
> index d6225ba20b30..eb007c2569d8 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ttm.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ttm.c
> @@ -114,10 +114,6 @@ static const struct ttm_mem_type_manager_func virtio_gpu_bo_manager_func = {
>   static int virtio_gpu_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
>   				    struct ttm_mem_type_manager *man)
>   {
> -	struct virtio_gpu_device *vgdev;
> -
> -	vgdev = virtio_gpu_get_vgdev(bdev);
> -
>   	switch (type) {
>   	case TTM_PL_SYSTEM:
>   		/* System memory */
> @@ -297,10 +293,8 @@ static void virtio_gpu_bo_move_notify(struct ttm_buffer_object *tbo,
>   static void virtio_gpu_bo_swap_notify(struct ttm_buffer_object *tbo)
>   {
>   	struct virtio_gpu_object *bo;
> -	struct virtio_gpu_device *vgdev;
>   
>   	bo = container_of(tbo, struct virtio_gpu_object, tbo);
> -	vgdev = (struct virtio_gpu_device *)bo->gem_base.dev->dev_private;
>   
>   	if (bo->pages)
>   		virtio_gpu_object_free_sg_table(bo);
>
>
>

^ permalink raw reply

* Re: [PATCH v3 5/5] drm/virtio: rework resource creation workflow.
From: Noralf Trønnes @ 2019-03-27 14:41 UTC (permalink / raw)
  To: Gerd Hoffmann, dri-devel
  Cc: David Airlie, open list:VIRTIO GPU DRIVER, open list,
	Daniel Vetter, Gurchetan Singh
In-Reply-To: <20190318113332.10900-6-kraxel@redhat.com>



Den 18.03.2019 12.33, skrev Gerd Hoffmann:
> This patch moves the virtio_gpu_cmd_create_resource() call (which
> notifies the host about the new resource created) into the
> virtio_gpu_object_create() function.  That way we can call
> virtio_gpu_cmd_create_resource() before ttm_bo_init(), so the host
> already knows about the object when ttm initializes the object and calls
> our driver callbacks.
> 
> Specifically the object is already created when the
> virtio_gpu_ttm_tt_bind() callback invokes virtio_gpu_object_attach(),
> so the extra virtio_gpu_object_attach() calls done after
> virtio_gpu_object_create() are not needed any more.
> 
> The fence support for the create ioctl becomes a bit more tricky though.
> The code moved into virtio_gpu_object_create() too.  We first submit the
> (fenced) virtio_gpu_cmd_create_resource() command, then initialize the
> ttm object, and finally attach just created object to the fence for the
> command in case it didn't finish yet.
> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---

Acked-by: Noralf Trønnes <noralf@tronnes.org>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* Re: [PATCH v3 2/5] drm/virtio: use struct to pass params to virtio_gpu_object_create()
From: Noralf Trønnes @ 2019-03-27 14:35 UTC (permalink / raw)
  To: Gerd Hoffmann, dri-devel
  Cc: David Airlie, open list:VIRTIO GPU DRIVER, open list,
	Daniel Vetter, Gurchetan Singh
In-Reply-To: <20190318113332.10900-3-kraxel@redhat.com>



Den 18.03.2019 12.33, skrev Gerd Hoffmann:
> Create virtio_gpu_object_params, use that to pass object parameters to
> virtio_gpu_object_create.  This is just the first step, followup patches
> will add more parameters to the struct.  The plan is to use the struct
> for all object parameters.
> 
> Drop unused "kernel" parameter for virtio_gpu_alloc_object(), it is
> unused and always false.
> 
> Also drop "pinned" parameter.  virtio-gpu doesn't shuffle around
> objects, so effecively they all are pinned anyway.  Hardcode
> TTM_PL_FLAG_NO_EVICT so ttm knows.  Doesn't change much for the moment
> as virtio-gpu supports TTM_PL_FLAG_TT only so there is no opportunity to
> move around objects.  That'll probably change in the future though.
> 
> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> ---

Acked-by: Noralf Trønnes <noralf@tronnes.org>
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply

* [PATCH] MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi
From: Michael S. Tsirkin @ 2019-03-27 14:33 UTC (permalink / raw)
  To: pbonzini, stefanha, virtualization

Jason doesn't really have the time to review blk/scsi
patches. Paolo and Setfan agreed to help out.

Thanks guys!

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

diff --git a/MAINTAINERS b/MAINTAINERS
index 95a5ebecd04f..8326d19c1681 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16247,7 +16247,7 @@ F:	drivers/char/virtio_console.c
 F:	include/linux/virtio_console.h
 F:	include/uapi/linux/virtio_console.h
 
-VIRTIO CORE, NET AND BLOCK DRIVERS
+VIRTIO CORE AND NET DRIVERS
 M:	"Michael S. Tsirkin" <mst@redhat.com>
 M:	Jason Wang <jasowang@redhat.com>
 L:	virtualization@lists.linux-foundation.org
@@ -16262,6 +16262,18 @@ F:	include/uapi/linux/virtio_*.h
 F:	drivers/crypto/virtio/
 F:	mm/balloon_compaction.c
 
+VIRTIO BLOCK AND SCSI DRIVERS
+M:	"Michael S. Tsirkin" <mst@redhat.com>
+M:	Paolo Bonzini <pbonzini@redhat.com>
+M:	Stefan Hajnoczi <stefanha@redhat.com>
+L:	virtualization@lists.linux-foundation.org
+S:	Maintained
+F:	drivers/block/virtio_blk.c
+F:	drivers/scsi/virtio_scsi.c
+F:	include/uapi/linux/virtio_blk.h
+F:	include/uapi/linux/virtio_scsi.h
+F:	drivers/vhost/scsi.c
+
 VIRTIO CRYPTO DRIVER
 M:	Gonglei <arei.gonglei@huawei.com>
 L:	virtualization@lists.linux-foundation.org

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox