Linux virtualization list
* Re: [PATCH 2/3] x86/vmware: Add basic paravirt ops support
From: Tim Mann @ 2016-10-26 20:47 UTC (permalink / raw)
  To: Alexey Makhalov
  Cc: time-lords, corbet, pv-drivers, x86, linux-doc, linux-kernel,
	monitor-list, virtualization, mingo, hpa, akataria, tglx
In-Reply-To: <20161026052600.77535-1-amakhalov@vmware.com>

I believe our trademark guidelines say we aren't supposed to use VMware as a
noun to mean a product, only to mean the company.  So we can say "running on
VMware ESXi" or "running in a VMware virtual machine", but "running on VMware"
is wrong.  There is supposedly some good legal reason for this related to
keeping our trademark.

On Tue, 25 Oct 2016 22:26:00 -0700, Alexey Makhalov <amakhalov@vmware.com>
wrote:
> Add basic paravirt support:
>  1. set pv_info.name to "VMware" to have proper boot log message
> 	Booting paravirtualized kernel on VMware
>     instead of "... on bare hardware"
>  2. set pv_cpu_ops.io_delay() to empty function - paravirt_nop() to
>     avoid vm-exits on IO delays.
> 
> Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
> Acked-by: Alok N Kataria <akataria@vmware.com>
> ---
>  arch/x86/kernel/cpu/vmware.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
> index 480790f..e3fb320 100644
> --- a/arch/x86/kernel/cpu/vmware.c
> +++ b/arch/x86/kernel/cpu/vmware.c
> @@ -61,6 +61,16 @@ static unsigned long vmware_get_tsc_khz(void)
>  	return vmware_tsc_khz;
>  }
>  
> +#ifdef CONFIG_PARAVIRT
> +static void __init vmware_paravirt_ops_setup(void)
> +{
> +	pv_info.name = "VMware";
> +	pv_cpu_ops.io_delay = paravirt_nop;
> +}
> +#else
> +#define vmware_paravirt_ops_setup() do {} while (0)
> +#endif
> +
>  static void __init vmware_platform_setup(void)
>  {
>  	uint32_t eax, ebx, ecx, edx;
> @@ -94,6 +104,8 @@ static void __init vmware_platform_setup(void)
>  	} else {
>  		pr_warn("Failed to get TSC freq from the hypervisor\n");
>  	}
> +
> +	vmware_paravirt_ops_setup();
>  }
>  
>  /*



-- 
Tim Mann                  | work: mann@vmware.com  home: tim@tim-mann.org
VMware Sr. Staff Engineer | http://www.vmware.com  http://tim-mann.org


* Re: [RESEND PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
From: Dave Hansen @ 2016-10-26 18:15 UTC (permalink / raw)
  To: Li, Liang Z, mst@redhat.com
  Cc: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org,
	amit.shah@redhat.com, qemu-devel@nongnu.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	pbonzini@redhat.com, virtualization@lists.linux-foundation.org,
	dgilbert@redhat.com
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E3A0FD05E@shsmsx102.ccr.corp.intel.com>

On 10/26/2016 03:13 AM, Li, Liang Z wrote:
> 3 times memory required is not accurate, please ignore this. sorry ...
> The complexity is the point. 

What is making it so complex?  Can you describe the problems?


* Re: [RESEND PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
From: Dave Hansen @ 2016-10-26 18:11 UTC (permalink / raw)
  To: Li, Liang Z, mst@redhat.com
  Cc: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org,
	amit.shah@redhat.com, qemu-devel@nongnu.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	pbonzini@redhat.com, virtualization@lists.linux-foundation.org,
	dgilbert@redhat.com
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E3A0FD034@shsmsx102.ccr.corp.intel.com>

On 10/26/2016 03:06 AM, Li, Liang Z wrote:
> I am working on Dave's new bitmap schema, I have finished the part of
> getting the 'hybrid scheme bitmap' and found the complexity was more
> than I expected. The main issue is more memory is required to save
> the 'hybrid scheme bitmap' beside that used to save the raw page
> bitmap, for the worst case, the memory required is 3 times than that
> in the previous implementation.

Really?  Could you please describe the scenario where this occurs?

> I am wondering if I should continue, as an alternative solution, how
> about using PFNs array when inflating/deflating only a few pages?
> Things will be much more simple.

Yes, pfn lists are more efficient than bitmaps when the set of pages is
sparse.  Yes, there will be cases where it is preferable to just use
pfn lists vs. any kind of bitmap.

But, what does it matter?  At least with your current scheme where we go
out and collect get_unused_pages(), we do the allocation up front.  The
space efficiency doesn't matter at all for small sizes since we do the
constant-size allocation *anyway*.

I'm also pretty sure you can pack the pfn and page order into a single
64-bit word and have no bitmap for a given record.  That would make it
pack just as well as the old pfns alone.  Right?
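
To illustrate the packing idea, here is a minimal userspace C sketch.
The 6-bit order field, the macro names and the helper names are
assumptions made for this example; nothing in this thread fixes a layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low 6 bits hold the page order, the rest the pfn. */
#define ORDER_BITS 6u
#define ORDER_MASK ((UINT64_C(1) << ORDER_BITS) - 1)

/* Pack a pfn and a page order into one 64-bit record. */
static uint64_t pack_pfn_order(uint64_t pfn, unsigned int order)
{
	assert(order <= ORDER_MASK);
	assert(pfn <= (UINT64_MAX >> ORDER_BITS));
	return (pfn << ORDER_BITS) | order;
}

/* Recover the pfn from a packed record. */
static uint64_t record_pfn(uint64_t rec)
{
	return rec >> ORDER_BITS;
}

/* Recover the page order from a packed record. */
static unsigned int record_order(uint64_t rec)
{
	return (unsigned int)(rec & ORDER_MASK);
}
```

With this layout a record such as pack_pfn_order(0x12345, 9) describes
an order-9 block starting at pfn 0x12345 and still occupies a single
64-bit word, the same size as one entry of a plain 64-bit pfn list.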


* [PATCH v2 3/3] x86/vmware: Add paravirt sched clock
From: Alexey Makhalov @ 2016-10-26 16:51 UTC (permalink / raw)
  To: corbet, akataria, tglx, mingo, hpa, x86
  Cc: pv-drivers, virtualization, Alexey Makhalov, linux-kernel,
	linux-doc
In-Reply-To: <alpine.DEB.2.20.1610261213000.4983@nanos>

Set pv_time_ops.sched_clock to vmware_sched_clock(). It is a simplified
version of native_sched_clock() without the ring buffer of
mult/shift/offset triplets and preempt toggling.
Since the VMware hypervisor provides a constant TSC, we can use a constant
mult/shift/offset triplet calculated at boot time.

The no-vmw-sched-clock kernel parameter is added to disable the paravirt
sched clock.

Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Acked-by: Alok N Kataria <akataria@vmware.com>
---
 Documentation/kernel-parameters.txt |  4 ++++
 arch/x86/kernel/cpu/vmware.c        | 41 +++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 37babf9..b3b2ec0 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2754,6 +2754,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	no-kvmapf	[X86,KVM] Disable paravirtualized asynchronous page
 			fault handling.
 
+	no-vmw-sched-clock
+			[X86,PV_OPS] Disable paravirtualized VMware scheduler
+			clock and use the default one.
+
 	no-steal-acc    [X86,KVM] Disable paravirtualized steal time accounting.
 			steal time is computed, but won't influence scheduler
 			behaviour
diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index e3fb320..e284ebf 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -24,10 +24,15 @@
 #include <linux/dmi.h>
 #include <linux/init.h>
 #include <linux/export.h>
+#include <linux/clocksource.h>
 #include <asm/div64.h>
 #include <asm/x86_init.h>
 #include <asm/hypervisor.h>
 #include <asm/apic.h>
+#include <asm/timer.h>
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"vmware: " fmt
 
 #define CPUID_VMWARE_INFO_LEAF	0x40000000
 #define VMWARE_HYPERVISOR_MAGIC	0x564D5868
@@ -62,10 +67,46 @@ static unsigned long vmware_get_tsc_khz(void)
 }
 
 #ifdef CONFIG_PARAVIRT
+static struct cyc2ns_data vmware_cyc2ns __ro_after_init;
+static int vmw_sched_clock __initdata = 1;
+
+static __init int setup_vmw_sched_clock(char *s)
+{
+	vmw_sched_clock = 0;
+	return 0;
+}
+early_param("no-vmw-sched-clock", setup_vmw_sched_clock);
+
+static unsigned long long vmware_sched_clock(void)
+{
+	unsigned long long ns;
+
+	ns = mul_u64_u32_shr(rdtsc(), vmware_cyc2ns.cyc2ns_mul,
+			     vmware_cyc2ns.cyc2ns_shift);
+	ns -= vmware_cyc2ns.cyc2ns_offset;
+	return ns;
+}
+
 static void __init vmware_paravirt_ops_setup(void)
 {
 	pv_info.name = "VMware";
 	pv_cpu_ops.io_delay = paravirt_nop;
+
+	if (vmware_tsc_khz && vmw_sched_clock) {
+		unsigned long long tsc_now = rdtsc();
+
+		clocks_calc_mult_shift(&vmware_cyc2ns.cyc2ns_mul,
+				       &vmware_cyc2ns.cyc2ns_shift,
+				       vmware_tsc_khz,
+				       NSEC_PER_MSEC, 0);
+		vmware_cyc2ns.cyc2ns_offset =
+			mul_u64_u32_shr(tsc_now, vmware_cyc2ns.cyc2ns_mul,
+					vmware_cyc2ns.cyc2ns_shift);
+
+		pv_time_ops.sched_clock = vmware_sched_clock;
+		pr_info("using sched offset of %llu ns\n",
+			vmware_cyc2ns.cyc2ns_offset);
+	}
 }
 #else
 #define vmware_paravirt_ops_setup() do {} while (0)
-- 
2.10.1
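
For readers who want to follow the arithmetic, below is a minimal
userspace C sketch of the constant mult/shift/offset conversion this
patch describes. It is a model, not the kernel code: struct cyc2ns,
mul_shr(), calc_mult_shift(), cyc2ns_setup() and cyc2ns_read() are
names invented for this example, and calc_mult_shift() only loosely
follows what clocks_calc_mult_shift() does when its last argument is 0.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000u

/* Hypothetical stand-in for the mult/shift/offset triplet. */
struct cyc2ns {
	uint32_t mul;
	uint32_t shift;
	uint64_t offset;
};

/* (a * mul) >> shift using only 64-bit arithmetic; needs shift <= 32. */
static uint64_t mul_shr(uint64_t a, uint32_t mul, unsigned int shift)
{
	uint32_t hi = (uint32_t)(a >> 32), lo = (uint32_t)a;
	uint64_t ret;

	assert(shift <= 32);
	ret = ((uint64_t)lo * mul) >> shift;
	if (hi)
		ret += ((uint64_t)hi * mul) << (32 - shift);
	return ret;
}

/* Pick the largest shift whose rounded multiplier still fits in 32 bits. */
static void calc_mult_shift(struct cyc2ns *c, uint32_t from, uint32_t to)
{
	uint64_t tmp = 0;
	uint32_t sft;

	assert(from != 0);
	for (sft = 32; sft > 0; sft--) {
		tmp = (((uint64_t)to << sft) + from / 2) / from;
		if ((tmp >> 32) == 0)
			break;
	}
	c->mul = (uint32_t)tmp;
	c->shift = sft;
}

/* Boot-time setup: fix the triplet once, so the clock starts near zero. */
static void cyc2ns_setup(struct cyc2ns *c, uint32_t tsc_khz, uint64_t tsc_now)
{
	calc_mult_shift(c, tsc_khz, NSEC_PER_MSEC);
	c->offset = mul_shr(tsc_now, c->mul, c->shift);
}

/* Run-time read: scale the raw TSC value and subtract the boot offset. */
static uint64_t cyc2ns_read(const struct cyc2ns *c, uint64_t tsc)
{
	return mul_shr(tsc, c->mul, c->shift) - c->offset;
}
```

As a worked case, with tsc_khz = 2000000 (2 GHz) the sketch settles on
shift = 32 and mul = 2^31, i.e. 0.5 ns per cycle. If the TSC reads
4000000000 at setup and 6000000000 later, cyc2ns_read() returns
1000000000 ns, one second after setup.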


* Re: [RESEND PATCH v3 kernel 2/7] virtio-balloon: define new feature bit and page bitmap head
From: Michael S. Tsirkin @ 2016-10-26 15:35 UTC (permalink / raw)
  To: Liang Li
  Cc: virtio-dev, kvm, amit.shah, dave.hansen, qemu-devel, linux-kernel,
	linux-mm, pbonzini, virtualization, dgilbert
In-Reply-To: <1477031080-12616-3-git-send-email-liang.z.li@intel.com>

On Fri, Oct 21, 2016 at 02:24:35PM +0800, Liang Li wrote:
> Add a new feature which supports sending the page information with
> a bitmap. The current implementation uses PFNs array, which is not
> very efficient. Using bitmap can improve the performance of
> inflating/deflating significantly
> 
> The page bitmap header will used to tell the host some information
> about the page bitmap. e.g. the page size, page bitmap length and
> start pfn.
> 
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
> Cc: Amit Shah <amit.shah@redhat.com>
> ---
>  include/uapi/linux/virtio_balloon.h | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h
> index 343d7dd..d3b182a 100644
> --- a/include/uapi/linux/virtio_balloon.h
> +++ b/include/uapi/linux/virtio_balloon.h
> @@ -34,6 +34,7 @@
>  #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
>  #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
>  #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
> +#define VIRTIO_BALLOON_F_PAGE_BITMAP	3 /* Send page info with bitmap */
>  
>  /* Size of a PFN in the balloon interface. */
>  #define VIRTIO_BALLOON_PFN_SHIFT 12
> @@ -82,4 +83,22 @@ struct virtio_balloon_stat {
>  	__virtio64 val;
>  } __attribute__((packed));
>  
> +/* Page bitmap header structure */
> +struct balloon_bmap_hdr {

Should be virtio_balloon.

> +	/* Used to distinguish different request */

different requests? what are the legal values?

> +	__virtio16 cmd;
> +	/* Shift width of page in the bitmap */

In which units?

> +	__virtio16 page_shift;
> +	/* flag used to identify different status */

this comment does not seem to add any value.

> +	__virtio16 flag;
> +	/* Reserved */

this too

> +	__virtio16 reserved;
> +	/* ID of the request */
> +	__virtio64 req_id;
> +	/* The pfn of 0 bit in the bitmap */
> +	__virtio64 start_pfn;
> +	/* The length of the bitmap, in bytes */

Why not in bits?

> +	__virtio64 bmap_len;
> +};
> +
>  #endif /* _LINUX_VIRTIO_BALLOON_H */
> -- 
> 1.8.3.1


* Re: [PATCH 3/3] x86/vmware: Add paravirt sched clock
From: Thomas Gleixner @ 2016-10-26 10:18 UTC (permalink / raw)
  To: Alexey Makhalov
  Cc: corbet, time-lords, linux-doc, pv-drivers, x86, linux-kernel,
	monitor-list, virtualization, mingo, hpa, akataria
In-Reply-To: <20161026052640.77585-1-amakhalov@vmware.com>

On Tue, 25 Oct 2016, Alexey Makhalov wrote:
> no-vmw-sched-clock kernel parameter is added to switch back to the
> native_sched_clock() implementation.

You are not switching back. The parameter is used to disable the paravirt
sched clock.

>  #ifdef CONFIG_PARAVIRT
> +static struct cyc2ns_data vmware_cyc2ns __ro_after_init;
> +
> +static int vmw_sched_clock __initdata = 1;
> +static __init int setup_vmw_sched_clock(char *s)

Please stop glueing a variable to a function w/o a new line between
them. It's just stopping the reading flow.

> +{
> +	vmw_sched_clock = 0;
> +	return 0;
> +}
> +early_param("no-vmw-sched-clock", setup_vmw_sched_clock);
> +
> +static unsigned long long vmware_sched_clock(void)
> +{
> +	unsigned long long ns;
> +
> +	ns = mul_u64_u32_shr(rdtsc(), vmware_cyc2ns.cyc2ns_mul,
> +			     vmware_cyc2ns.cyc2ns_shift);
> +	ns -= vmware_cyc2ns.cyc2ns_offset;
> +	return ns;
> +}
> +
>  static void __init vmware_paravirt_ops_setup(void)
>  {
>  	pv_info.name = "VMware";
>  	pv_cpu_ops.io_delay = paravirt_nop;
> +
> +	if (vmware_tsc_khz && vmw_sched_clock) {
> +		unsigned long long tsc_now = rdtsc();
> +
> +		clocks_calc_mult_shift(&vmware_cyc2ns.cyc2ns_mul,
> +				       &vmware_cyc2ns.cyc2ns_shift,
> +				       vmware_tsc_khz,
> +				       NSEC_PER_MSEC, 0);
> +		vmware_cyc2ns.cyc2ns_offset =
> +			mul_u64_u32_shr(tsc_now, vmware_cyc2ns.cyc2ns_mul,
> +					vmware_cyc2ns.cyc2ns_shift);
> +
> +		pv_time_ops.sched_clock = vmware_sched_clock;
> +		pr_info("vmware: using sched offset of %llu ns\n",

Please use pr_fmt instead of adding the prefix to every print.

Thanks,

	tglx


* RE: [RESEND PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
From: Li, Liang Z @ 2016-10-26 10:13 UTC (permalink / raw)
  To: Li, Liang Z, Hansen, Dave, mst@redhat.com
  Cc: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org,
	amit.shah@redhat.com, qemu-devel@nongnu.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	pbonzini@redhat.com, virtualization@lists.linux-foundation.org,
	dgilbert@redhat.com
In-Reply-To: <F2CBF3009FA73547804AE4C663CAB28E3A0FD034@shsmsx102.ccr.corp.intel.com>

> > On 10/20/2016 11:24 PM, Liang Li wrote:
> > > Dave Hansen suggested a new scheme to encode the data structure,
> > > because of additional complexity, it's not implemented in v3.
> >
> > So, what do you want done with this patch set?  Do you want it applied
> > as-is so that we can introduce a new host/guest ABI that we must
> > support until the end of time?  Then, we go back in a year or two and
> > add the newer format that addresses the deficiencies that this ABI has with
> a third version?
> >
> 
> Hi Dave & Michael,
> 
> I am working on Dave's new bitmap schema, I have finished the part of
> getting the 'hybrid scheme bitmap'
> and found the complexity was more than I expected.  The main issue is more
> memory is required to  save the 'hybrid scheme bitmap' beside that used to
> save the raw page bitmap, for the worst case, the memory required is 3
> times than that in the previous implementation.
> 

The "3 times the memory" figure is not accurate; please ignore it. Sorry.
The complexity is the real issue.

> I am wondering if I should continue, as an alternative solution, how about
> using PFNs array when inflating/deflating only a few pages? Things will be
> much more simple.
> 
> 
> Thanks!
> Liang


* RE: [RESEND PATCH v3 kernel 0/7] Extend virtio-balloon for fast (de)inflating & fast live migration
From: Li, Liang Z @ 2016-10-26 10:06 UTC (permalink / raw)
  To: Hansen, Dave, mst@redhat.com
  Cc: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org,
	amit.shah@redhat.com, qemu-devel@nongnu.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	pbonzini@redhat.com, virtualization@lists.linux-foundation.org,
	dgilbert@redhat.com
In-Reply-To: <580A4F81.60201@intel.com>

> On 10/20/2016 11:24 PM, Liang Li wrote:
> > Dave Hansen suggested a new scheme to encode the data structure,
> > because of additional complexity, it's not implemented in v3.
> 
> So, what do you want done with this patch set?  Do you want it applied as-is
> so that we can introduce a new host/guest ABI that we must support until
> the end of time?  Then, we go back in a year or two and add the newer
> format that addresses the deficiencies that this ABI has with a third version?
> 

Hi Dave & Michael,

I am working on Dave's new bitmap scheme. I have finished the part that
builds the 'hybrid scheme bitmap' and found the complexity was greater
than I expected. The main issue is that more memory is required to save
the 'hybrid scheme bitmap' in addition to that used to save the raw page
bitmap; in the worst case, the memory required is 3 times that of the
previous implementation.

I am wondering whether I should continue. As an alternative solution, how
about using a PFN array when inflating/deflating only a few pages? Things
would be much simpler.


Thanks!
Liang 


* [PATCH 3/3] x86/vmware: Add paravirt sched clock
From: Alexey Makhalov @ 2016-10-26  5:26 UTC (permalink / raw)
  To: corbet, akataria, tglx, mingo, hpa, x86
  Cc: time-lords, monitor-list, pv-drivers, linux-doc, linux-kernel,
	virtualization, Alexey Makhalov
In-Reply-To: <20161026052038.77042-1-amakhalov@vmware.com>

Set pv_time_ops.sched_clock to vmware_sched_clock(). It is a simplified
version of native_sched_clock() without the ring buffer of
mult/shift/offset triplets and preempt toggling.
Since the VMware hypervisor provides a constant TSC, we can use a constant
mult/shift/offset triplet calculated at boot time.

The no-vmw-sched-clock kernel parameter is added to switch back to the
native_sched_clock() implementation.

Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Acked-by: Alok N Kataria <akataria@vmware.com>
---
 Documentation/kernel-parameters.txt |  4 ++++
 arch/x86/kernel/cpu/vmware.c        | 38 +++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 37babf9..b3b2ec0 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2754,6 +2754,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	no-kvmapf	[X86,KVM] Disable paravirtualized asynchronous page
 			fault handling.
 
+	no-vmw-sched-clock
+			[X86,PV_OPS] Disable paravirtualized VMware scheduler
+			clock and use the default one.
+
 	no-steal-acc    [X86,KVM] Disable paravirtualized steal time accounting.
 			steal time is computed, but won't influence scheduler
 			behaviour
diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index e3fb320..6ef22c1 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -24,10 +24,12 @@
 #include <linux/dmi.h>
 #include <linux/init.h>
 #include <linux/export.h>
+#include <linux/clocksource.h>
 #include <asm/div64.h>
 #include <asm/x86_init.h>
 #include <asm/hypervisor.h>
 #include <asm/apic.h>
+#include <asm/timer.h>
 
 #define CPUID_VMWARE_INFO_LEAF	0x40000000
 #define VMWARE_HYPERVISOR_MAGIC	0x564D5868
@@ -62,10 +64,46 @@ static unsigned long vmware_get_tsc_khz(void)
 }
 
 #ifdef CONFIG_PARAVIRT
+static struct cyc2ns_data vmware_cyc2ns __ro_after_init;
+
+static int vmw_sched_clock __initdata = 1;
+static __init int setup_vmw_sched_clock(char *s)
+{
+	vmw_sched_clock = 0;
+	return 0;
+}
+early_param("no-vmw-sched-clock", setup_vmw_sched_clock);
+
+static unsigned long long vmware_sched_clock(void)
+{
+	unsigned long long ns;
+
+	ns = mul_u64_u32_shr(rdtsc(), vmware_cyc2ns.cyc2ns_mul,
+			     vmware_cyc2ns.cyc2ns_shift);
+	ns -= vmware_cyc2ns.cyc2ns_offset;
+	return ns;
+}
+
 static void __init vmware_paravirt_ops_setup(void)
 {
 	pv_info.name = "VMware";
 	pv_cpu_ops.io_delay = paravirt_nop;
+
+	if (vmware_tsc_khz && vmw_sched_clock) {
+		unsigned long long tsc_now = rdtsc();
+
+		clocks_calc_mult_shift(&vmware_cyc2ns.cyc2ns_mul,
+				       &vmware_cyc2ns.cyc2ns_shift,
+				       vmware_tsc_khz,
+				       NSEC_PER_MSEC, 0);
+		vmware_cyc2ns.cyc2ns_offset =
+			mul_u64_u32_shr(tsc_now, vmware_cyc2ns.cyc2ns_mul,
+					vmware_cyc2ns.cyc2ns_shift);
+
+		pv_time_ops.sched_clock = vmware_sched_clock;
+		pr_info("vmware: using sched offset of %llu ns\n",
+			vmware_cyc2ns.cyc2ns_offset);
+	}
 }
 #else
 #define vmware_paravirt_ops_setup() do {} while (0)
-- 
2.10.1


* [PATCH 2/3] x86/vmware: Add basic paravirt ops support
From: Alexey Makhalov @ 2016-10-26  5:26 UTC (permalink / raw)
  To: corbet, akataria, tglx, mingo, hpa, x86
  Cc: time-lords, monitor-list, pv-drivers, linux-doc, linux-kernel,
	virtualization, Alexey Makhalov
In-Reply-To: <20161026052038.77042-1-amakhalov@vmware.com>

Add basic paravirt support:
 1. set pv_info.name to "VMware" to have proper boot log message
	Booting paravirtualized kernel on VMware
    instead of "... on bare hardware"
 2. set pv_cpu_ops.io_delay() to empty function - paravirt_nop() to
    avoid vm-exits on IO delays.

Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Acked-by: Alok N Kataria <akataria@vmware.com>
---
 arch/x86/kernel/cpu/vmware.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index 480790f..e3fb320 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -61,6 +61,16 @@ static unsigned long vmware_get_tsc_khz(void)
 	return vmware_tsc_khz;
 }
 
+#ifdef CONFIG_PARAVIRT
+static void __init vmware_paravirt_ops_setup(void)
+{
+	pv_info.name = "VMware";
+	pv_cpu_ops.io_delay = paravirt_nop;
+}
+#else
+#define vmware_paravirt_ops_setup() do {} while (0)
+#endif
+
 static void __init vmware_platform_setup(void)
 {
 	uint32_t eax, ebx, ecx, edx;
@@ -94,6 +104,8 @@ static void __init vmware_platform_setup(void)
 	} else {
 		pr_warn("Failed to get TSC freq from the hypervisor\n");
 	}
+
+	vmware_paravirt_ops_setup();
 }
 
 /*
-- 
2.10.1


* [PATCH 1/3] x86/vmware: Use tsc_khz value for calibrate_cpu()
From: Alexey Makhalov @ 2016-10-26  5:20 UTC (permalink / raw)
  To: corbet, akataria, tglx, mingo, hpa, x86
  Cc: time-lords, monitor-list, pv-drivers, linux-doc, linux-kernel,
	virtualization, Alexey Makhalov
In-Reply-To: <20161026052038.77042-1-amakhalov@vmware.com>

After aa297292d708, there are separate native calibrations for cpu_khz and
tsc_khz. The code sets x86_platform.calibrate_cpu to native_calibrate_cpu(),
which looks in CPUID leaf 0x16 or MSRs for the CPU frequency. Since we keep
tsc_khz constant (even after vMotion), cpu_khz and tsc_khz may start
diverging.

tsc_init() now does

	cpu_khz = x86_platform.calibrate_cpu();
	tsc_khz = x86_platform.calibrate_tsc();
	if (tsc_khz == 0)
		tsc_khz = cpu_khz;
	else if (abs(cpu_khz - tsc_khz) * 10 > tsc_khz)
		cpu_khz = tsc_khz;

We want cpu_khz and tsc_khz to stay in sync even if they diverge by less
than 10%.
This patch resolves the issue by setting x86_platform.calibrate_cpu to
vmware_get_tsc_khz().

Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
Acked-by: Alok N Kataria <akataria@vmware.com>
---
 arch/x86/kernel/cpu/vmware.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/cpu/vmware.c b/arch/x86/kernel/cpu/vmware.c
index 4e34da4b..480790f 100644
--- a/arch/x86/kernel/cpu/vmware.c
+++ b/arch/x86/kernel/cpu/vmware.c
@@ -83,6 +83,7 @@ static void __init vmware_platform_setup(void)
 
 		vmware_tsc_khz = tsc_khz;
 		x86_platform.calibrate_tsc = vmware_get_tsc_khz;
+		x86_platform.calibrate_cpu = vmware_get_tsc_khz;
 
 #ifdef CONFIG_X86_LOCAL_APIC
 		/* Skip lapic calibration since we know the bus frequency. */
-- 
2.10.1
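
The effect of the tsc_init() logic quoted in the commit message can be
seen with a small userspace C model. This is a sketch only: struct
khz_pair and reconcile_khz() are names invented here, and the absolute
difference is computed explicitly rather than with abs().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair holding the two calibrated frequencies, in kHz. */
struct khz_pair {
	uint64_t cpu_khz;
	uint64_t tsc_khz;
};

static uint64_t abs_diff(uint64_t a, uint64_t b)
{
	return a > b ? a - b : b - a;
}

/* Model of the reconciliation step quoted from tsc_init(). */
static struct khz_pair reconcile_khz(uint64_t cpu_khz, uint64_t tsc_khz)
{
	struct khz_pair r;

	if (tsc_khz == 0)
		tsc_khz = cpu_khz;
	else if (abs_diff(cpu_khz, tsc_khz) * 10 > tsc_khz)
		cpu_khz = tsc_khz;

	r.cpu_khz = cpu_khz;
	r.tsc_khz = tsc_khz;

	/* Afterwards the two values are equal or differ by at most 10%. */
	assert(abs_diff(r.cpu_khz, r.tsc_khz) * 10 <= r.tsc_khz);
	return r;
}
```

With tsc_khz = 2000000, a cpu_khz of 2300000 (15% off) is snapped to
2000000, but a cpu_khz of 2100000 (5% off) is left as it is. That
second case is the residual divergence this patch removes by making
both calibration hooks return the same value.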


* [PATCH 0/3] x86/vmware guest improvements
From: Alexey Makhalov @ 2016-10-26  5:20 UTC (permalink / raw)
  To: corbet, akataria, tglx, mingo, hpa, x86
  Cc: time-lords, monitor-list, pv-drivers, linux-doc, linux-kernel,
	virtualization, Alexey Makhalov

This patchset includes several VMware guest improvements:

Alexey Makhalov (3):
  x86/vmware: Use tsc_khz value for calibrate_cpu()
  x86/vmware: Add basic paravirt ops support
  x86/vmware: Add paravirt sched clock

 Documentation/kernel-parameters.txt |  4 +++
 arch/x86/kernel/cpu/vmware.c        | 51 +++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)

-- 
2.10.1


* Re: [PATCH] virtio-net: Update the mtu code to match virtio spec
From: Aaron Conole @ 2016-10-25 20:14 UTC (permalink / raw)
  To: netdev; +Cc: Jarod Wilson, Michael S. Tsirkin, linux-kernel, virtualization
In-Reply-To: <f7ta8ds9ory.fsf@redhat.com>

Aaron Conole <aconole@redhat.com> writes:

>> From: Aaron Conole <aconole@bytheb.org>
>>
>> The virtio committee recently ratified a change, VIRTIO-152, which
>> defines the mtu field to be 'max' MTU, not simply desired MTU.
>>
>> This commit brings the virtio-net device in compliance with VIRTIO-152.
>>
>> Additionally, drop the max_mtu branch - it cannot be taken since the u16
>> returned by virtio_cread16 will never exceed the initial value of
>> max_mtu.
>>
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Jarod Wilson <jarod@redhat.com>
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>
> Sorry about the subject line, David.  This is targeted at net-next, and
> it appears my from was mangled.  Would you like me to resubmit with
> these details corrected?

I answered my own question.  Sorry for the noise.


* [PATCH v2 net-next] virtio-net: Update the mtu code to match virtio spec
From: Aaron Conole @ 2016-10-25 20:12 UTC (permalink / raw)
  To: netdev; +Cc: Jarod Wilson, Michael S. Tsirkin, linux-kernel, virtualization

The virtio committee recently ratified a change, VIRTIO-152, which
defines the mtu field to be 'max' MTU, not simply desired MTU.

This commit brings the virtio-net device in compliance with VIRTIO-152.

Additionally, drop the max_mtu branch - it cannot be taken since the u16
returned by virtio_cread16 will never exceed the initial value of
max_mtu.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: "Michael S. Tsirkin" <mst@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
---
Nothing code-wise has changed, but I've included the ACKs and fixed up the
subject line.

 drivers/net/virtio_net.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 720809f..2cafd12 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1870,10 +1870,12 @@ static int virtnet_probe(struct virtio_device *vdev)
 		mtu = virtio_cread16(vdev,
 				     offsetof(struct virtio_net_config,
 					      mtu));
-		if (mtu < dev->min_mtu || mtu > dev->max_mtu)
+		if (mtu < dev->min_mtu) {
 			__virtio_clear_bit(vdev, VIRTIO_NET_F_MTU);
-		else
+		} else {
 			dev->mtu = mtu;
+			dev->max_mtu = mtu;
+		}
 	}
 
 	if (vi->any_header_sg)
-- 
2.7.4
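
The behaviour after this hunk can be modelled in a few lines of
userspace C. This is a sketch, not the driver code: struct netdev_model
and apply_device_mtu() are names invented for this example, and the
bounds used in the usage note (68 and 65535) are illustrative values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three net_device fields involved. */
struct netdev_model {
	unsigned int mtu;
	unsigned int min_mtu;
	unsigned int max_mtu;
};

/*
 * Apply the MTU read from the device config space.
 * Returns false when the value is below min_mtu; the caller would then
 * clear VIRTIO_NET_F_MTU. Otherwise the value becomes both the current
 * and the maximum MTU, matching the "max MTU" reading of the field.
 */
static bool apply_device_mtu(struct netdev_model *dev, uint16_t mtu)
{
	assert(dev->min_mtu <= dev->max_mtu);

	if (mtu < dev->min_mtu)
		return false;

	dev->mtu = mtu;
	dev->max_mtu = mtu;
	return true;
}
```

Starting from { mtu = 1500, min_mtu = 68, max_mtu = 65535 }, a device
value of 9000 leaves mtu and max_mtu both at 9000, so a later attempt
to raise the MTU above 9000 would be out of range. A device value of 40
is rejected and the fields are left untouched.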


* Re: [PATCH] virtio-net: Update the mtu code to match virtio spec
From: Aaron Conole @ 2016-10-25 20:06 UTC (permalink / raw)
  To: netdev; +Cc: Jarod Wilson, Michael S. Tsirkin, linux-kernel, virtualization
In-Reply-To: <1477413335-17296-1-git-send-email-aconole@redhat.com>

> From: Aaron Conole <aconole@bytheb.org>
>
> The virtio committee recently ratified a change, VIRTIO-152, which
> defines the mtu field to be 'max' MTU, not simply desired MTU.
>
> This commit brings the virtio-net device in compliance with VIRTIO-152.
>
> Additionally, drop the max_mtu branch - it cannot be taken since the u16
> returned by virtio_cread16 will never exceed the initial value of
> max_mtu.
>
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jarod Wilson <jarod@redhat.com>
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---

Sorry about the subject line, David.  This is targeted at net-next, and
it appears my From: line was mangled.  Would you like me to resubmit with
these details corrected?

-Aaron


* Re: [PATCH] virtio-net: Update the mtu code to match virtio spec
From: Jarod Wilson @ 2016-10-25 17:09 UTC (permalink / raw)
  To: Aaron Conole; +Cc: netdev, Michael S. Tsirkin, linux-kernel, virtualization
In-Reply-To: <1477413335-17296-1-git-send-email-aconole@redhat.com>

On Tue, Oct 25, 2016 at 12:35:35PM -0400, Aaron Conole wrote:
> From: Aaron Conole <aconole@bytheb.org>
> 
> The virtio committee recently ratified a change, VIRTIO-152, which
> defines the mtu field to be 'max' MTU, not simply desired MTU.
> 
> This commit brings the virtio-net device in compliance with VIRTIO-152.
> 
> Additionally, drop the max_mtu branch - it cannot be taken since the u16
> returned by virtio_cread16 will never exceed the initial value of
> max_mtu.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jarod Wilson <jarod@redhat.com>
> Signed-off-by: Aaron Conole <aconole@redhat.com>

Worksforme.

Acked-by: Jarod Wilson <jarod@redhat.com>

-- 
Jarod Wilson
jarod@redhat.com


* Re: [PATCH] virtio-net: Update the mtu code to match virtio spec
From: Michael S. Tsirkin @ 2016-10-25 16:41 UTC (permalink / raw)
  To: Aaron Conole; +Cc: netdev, Jarod Wilson, linux-kernel, virtualization
In-Reply-To: <1477413335-17296-1-git-send-email-aconole@redhat.com>

On Tue, Oct 25, 2016 at 12:35:35PM -0400, Aaron Conole wrote:
> From: Aaron Conole <aconole@bytheb.org>
> 
> The virtio committee recently ratified a change, VIRTIO-152, which
> defines the mtu field to be 'max' MTU, not simply desired MTU.
> 
> This commit brings the virtio-net device in compliance with VIRTIO-152.
> 
> Additionally, drop the max_mtu branch - it cannot be taken since the u16
> returned by virtio_cread16 will never exceed the initial value of
> max_mtu.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Jarod Wilson <jarod@redhat.com>
> Signed-off-by: Aaron Conole <aconole@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/virtio_net.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 720809f..2cafd12 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1870,10 +1870,12 @@ static int virtnet_probe(struct virtio_device *vdev)
>  		mtu = virtio_cread16(vdev,
>  				     offsetof(struct virtio_net_config,
>  					      mtu));
> -		if (mtu < dev->min_mtu || mtu > dev->max_mtu)
> +		if (mtu < dev->min_mtu) {
>  			__virtio_clear_bit(vdev, VIRTIO_NET_F_MTU);
> -		else
> +		} else {
>  			dev->mtu = mtu;
> +			dev->max_mtu = mtu;
> +		}
>  	}
>  
>  	if (vi->any_header_sg)
> -- 
> 2.7.4


* [PATCH] virtio-net: Update the mtu code to match virtio spec
From: Aaron Conole @ 2016-10-25 16:35 UTC (permalink / raw)
  To: netdev; +Cc: Jarod Wilson, Michael S. Tsirkin, linux-kernel, virtualization

From: Aaron Conole <aconole@bytheb.org>

The virtio committee recently ratified a change, VIRTIO-152, which
defines the mtu field to be 'max' MTU, not simply desired MTU.

This commit brings the virtio-net device in compliance with VIRTIO-152.

Additionally, drop the max_mtu branch - it cannot be taken since the u16
returned by virtio_cread16 will never exceed the initial value of
max_mtu.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 drivers/net/virtio_net.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 720809f..2cafd12 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1870,10 +1870,12 @@ static int virtnet_probe(struct virtio_device *vdev)
 		mtu = virtio_cread16(vdev,
 				     offsetof(struct virtio_net_config,
 					      mtu));
-		if (mtu < dev->min_mtu || mtu > dev->max_mtu)
+		if (mtu < dev->min_mtu) {
 			__virtio_clear_bit(vdev, VIRTIO_NET_F_MTU);
-		else
+		} else {
 			dev->mtu = mtu;
+			dev->max_mtu = mtu;
+		}
 	}
 
 	if (vi->any_header_sg)
-- 
2.7.4


* Re: [PATCH] virtio: console: Unlock vqs while freeing buffers
From: Michael S. Tsirkin @ 2016-10-25 13:19 UTC (permalink / raw)
  To: Amit Shah
  Cc: Arnd Bergmann, Greg Kroah-Hartman, linux-kernel, stable,
	virtualization, Matt Redfearn
In-Reply-To: <20161025071803.GG2138@amit-lp.rh>

On Tue, Oct 25, 2016 at 12:48:03PM +0530, Amit Shah wrote:
> On (Tue) 11 Oct 2016 [12:05:15], Matt Redfearn wrote:
> > Commit c6017e793b93 ("virtio: console: add locks around buffer removal
> > in port unplug path") added locking around the freeing of buffers in the
> > vq. However, when free_buf() is called with can_sleep = true and rproc
> > is enabled, it calls dma_free_coherent() directly, requiring interrupts
> > to be enabled. Currently a WARNING is triggered due to the spin locking
> > around free_buf, with a call stack like this:
> > 
> > WARNING: CPU: 3 PID: 121 at ./include/linux/dma-mapping.h:433
> > free_buf+0x1a8/0x288
> > Call Trace:
> > [<8040c538>] show_stack+0x74/0xc0
> > [<80757240>] dump_stack+0xd0/0x110
> > [<80430d98>] __warn+0xfc/0x130
> > [<80430ee0>] warn_slowpath_null+0x2c/0x3c
> > [<807e7c6c>] free_buf+0x1a8/0x288
> > [<807ea590>] remove_port_data+0x50/0xac
> > [<807ea6a0>] unplug_port+0xb4/0x1bc
> > [<807ea858>] virtcons_remove+0xb0/0xfc
> > [<807b6734>] virtio_dev_remove+0x58/0xc0
> > [<807f918c>] __device_release_driver+0xac/0x134
> > [<807f924c>] device_release_driver+0x38/0x50
> > [<807f7edc>] bus_remove_device+0xfc/0x130
> > [<807f4b74>] device_del+0x17c/0x21c
> > [<807f4c38>] device_unregister+0x24/0x38
> > [<807b6b50>] unregister_virtio_device+0x28/0x44
> > 
> > Fix this by restructuring the loops to allow the locks to only be taken
> > where it is necessary to protect the vqs, and release it while the
> > buffer is being freed.
> > 
> > Fixes: c6017e793b93 ("virtio: console: add locks around buffer removal in port unplug path")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
> 
> Reviewed-by: Amit Shah <amit.shah@redhat.com>
> 
> Michael, can you pick this up?
> 
> Thanks,
> 
> 		Amit

Sure.

^ permalink raw reply

* RE: [RESEND PATCH v3 kernel 4/7] virtio-balloon: speed up inflate/deflate process
From: Li, Liang Z @ 2016-10-25  9:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev@lists.oasis-open.org, kvm@vger.kernel.org,
	amit.shah@redhat.com, Hansen, Dave, qemu-devel@nongnu.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	pbonzini@redhat.com, virtualization@lists.linux-foundation.org,
	dgilbert@redhat.com
In-Reply-To: <20161025091821-mutt-send-email-mst@kernel.org>

> > +static inline void init_pfn_range(struct virtio_balloon *vb) {
> > +	vb->min_pfn = ULONG_MAX;
> > +	vb->max_pfn = 0;
> > +}
> > +
> > +static inline void update_pfn_range(struct virtio_balloon *vb,
> > +				 struct page *page)
> > +{
> > +	unsigned long balloon_pfn = page_to_balloon_pfn(page);
> > +
> > +	if (balloon_pfn < vb->min_pfn)
> > +		vb->min_pfn = balloon_pfn;
> > +	if (balloon_pfn > vb->max_pfn)
> > +		vb->max_pfn = balloon_pfn;
> > +}
> > +
> 
> rename to hint these are all bitmap related.

Will change in v4.
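
For illustration, the range tracking quoted above amounts to a running
min/max. A standalone userspace sketch follows; struct pfn_range and both
helper names are made up for this example, and only the ULONG_MAX/0 initial
values are taken from the quoted patch.

```c
#include <limits.h>

/* Hypothetical container; the patch keeps these fields in struct virtio_balloon. */
struct pfn_range {
	unsigned long min_pfn;
	unsigned long max_pfn;
};

/* Start with an empty range: any real PFN will lower min and raise max. */
static void pfn_range_init(struct pfn_range *r)
{
	r->min_pfn = ULONG_MAX;
	r->max_pfn = 0;
}

/* Widen the range so that it covers pfn. */
static void pfn_range_update(struct pfn_range *r, unsigned long pfn)
{
	if (pfn < r->min_pfn)
		r->min_pfn = pfn;
	if (pfn > r->max_pfn)
		r->max_pfn = pfn;
}
```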

> 
> 
> >  static void tell_host(struct virtio_balloon *vb, struct virtqueue *vq)
> >  {
> > -	struct scatterlist sg;
> > -	unsigned int len;
> > +	struct scatterlist sg, sg2[BALLOON_BMAP_COUNT + 1];
> > +	unsigned int len, i;
> > +
> > +	if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP)) {
> > +		struct balloon_bmap_hdr *hdr = vb->bmap_hdr;
> > +		unsigned long bmap_len;
> > +		int nr_pfn, nr_used_bmap, nr_buf;
> > +
> > +		nr_pfn = vb->end_pfn - vb->start_pfn + 1;
> > +		nr_pfn = roundup(nr_pfn, BITS_PER_LONG);
> > +		nr_used_bmap = nr_pfn / PFNS_PER_BMAP;
> > +		bmap_len = nr_pfn / BITS_PER_BYTE;
> > +		nr_buf = nr_used_bmap + 1;
> > +
> > +		/* cmd, reserved and req_id are init to 0, unused here */
> > +		hdr->page_shift = cpu_to_virtio16(vb->vdev, PAGE_SHIFT);
> > +		hdr->start_pfn = cpu_to_virtio64(vb->vdev, vb->start_pfn);
> > +		hdr->bmap_len = cpu_to_virtio64(vb->vdev, bmap_len);
> > +		sg_init_table(sg2, nr_buf);
> > +		sg_set_buf(&sg2[0], hdr, sizeof(struct balloon_bmap_hdr));
> > +		for (i = 0; i < nr_used_bmap; i++) {
> > +			unsigned int  buf_len = BALLOON_BMAP_SIZE;
> > +
> > +			if (i + 1 == nr_used_bmap)
> > +				buf_len = bmap_len - BALLOON_BMAP_SIZE * i;
> > +			sg_set_buf(&sg2[i + 1], vb->page_bitmap[i], buf_len);
> > +		}
> >
> > -	sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> > +		while (vq->num_free < nr_buf)
> > +			msleep(2);
> 
> 
> What's going on here? Who is expected to update num_free?
> 

I just want to wait until the vq has enough space to write the bitmap. I thought
the QEMU side would update vq->num_free; is that wrong?

> 
> 
> > +		if (virtqueue_add_outbuf(vq, sg2, nr_buf, vb, GFP_KERNEL) == 0)
> > +			virtqueue_kick(vq);
> >
> > -	/* We should always be able to add one buffer to an empty queue. */
> > -	virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > -	virtqueue_kick(vq);
> > +	} else {
> > +		sg_init_one(&sg, vb->pfns, sizeof(vb->pfns[0]) * vb->num_pfns);
> > +
> > +		/* We should always be able to add one buffer to an empty
> > +		 * queue. */
> 
> Please use a multi-line comment style consistent with the kernel coding style.

Will change in next version.

> 
> > +		virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
> > +		virtqueue_kick(vq);
> > +	}
> >
> >  	/* When host has read buffer, this completes via balloon_ack */
> >  	wait_event(vb->acked, virtqueue_get_buf(vq, &len));
> > @@ -138,13 +199,93 @@ static void set_page_pfns(struct virtio_balloon *vb,
> >  					  page_to_balloon_pfn(page) + i);
> >  }
> >
> > -static unsigned fill_balloon(struct virtio_balloon *vb, size_t num)
> > +static void extend_page_bitmap(struct virtio_balloon *vb) {
> > +	int i;
> > +	unsigned long bmap_len, bmap_count;
> > +
> > +	bmap_len = ALIGN(get_max_pfn(), BITS_PER_LONG) / BITS_PER_BYTE;
> > +	bmap_count = bmap_len / BALLOON_BMAP_SIZE;
> > +	if (bmap_len % BALLOON_BMAP_SIZE)
> > +		bmap_count++;
> > +	if (bmap_count > BALLOON_BMAP_COUNT)
> > +		bmap_count = BALLOON_BMAP_COUNT;
> > +
> 
> This is doing simple things in tricky ways.
> Please use macros such as ALIGN and max instead of if.
> 

Will change.
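
One way to read the suggestion is that the round-up and the clamp can each be
a single expression. The userspace sketch below shows that form. The macros
are local definitions that only mimic the kernel helpers, bmap_count_for() is
a hypothetical name, and the values given to BALLOON_BMAP_SIZE and
BALLOON_BMAP_COUNT are placeholders for the example, not the values used in
the patch.

```c
#include <limits.h>

/* Local stand-ins for the kernel helpers of similar name. */
#define BITS_PER_BYTE		CHAR_BIT
#define BITS_PER_LONG		(sizeof(unsigned long) * CHAR_BIT)
#define ALIGN_UP(x, a)		((((x) + (a) - 1) / (a)) * (a))
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define MIN(a, b)		((a) < (b) ? (a) : (b))

/* Placeholder sizes, assumed for this example only. */
#define BALLOON_BMAP_SIZE	(8UL * 4096)
#define BALLOON_BMAP_COUNT	32UL

/*
 * Number of bitmap chunks needed to cover PFNs up to max_pfn, capped at
 * BALLOON_BMAP_COUNT. Equivalent to the divide, remainder check and
 * upper-bound check in the quoted hunk.
 */
static unsigned long bmap_count_for(unsigned long max_pfn)
{
	unsigned long bmap_len;

	bmap_len = ALIGN_UP(max_pfn, BITS_PER_LONG) / BITS_PER_BYTE;
	return MIN(DIV_ROUND_UP(bmap_len, BALLOON_BMAP_SIZE),
		   BALLOON_BMAP_COUNT);
}
```

The cap is an upper bound, so it is written here with a MIN-style macro.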

> 
> > +	for (i = 1; i < bmap_count; i++) {
> 
> why 1?

One bitmap is already allocated in the probe stage.

> 
> > +		vb->page_bitmap[i] = kmalloc(BALLOON_BMAP_SIZE, GFP_ATOMIC);
> 
> why GFP_ATOMIC?

Yes, GFP_ATOMIC is not necessary.

> and what will free the previous buffer?

The previous buffer will not be freed.

> 
> 
> > +		if (vb->page_bitmap[i])
> > +			vb->nr_page_bmap++;
> > +		else
> > +			break;
> 
> and what will happen then?

I plan to use the previously allocated buffer to save the bitmap. Does the kmalloc failure case need more code?

> > -static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
> > +static unsigned int leak_balloon(struct virtio_balloon *vb, size_t num,
> > +				bool use_bmap)
> 
> this is just a feature bit - why not get it internally?

Indeed.

> > @@ -218,8 +374,14 @@ static unsigned leak_balloon(struct virtio_balloon *vb, size_t num)
> >  	 * virtio_has_feature(vdev, VIRTIO_BALLOON_F_MUST_TELL_HOST);
> >  	 * is true, we *have* to do it in this order
> >  	 */
> > -	if (vb->num_pfns != 0)
> > -		tell_host(vb, vb->deflate_vq);
> > +	if (vb->num_pfns != 0) {
> > +		if (use_bmap)
> > +			set_page_bitmap(vb, &pages, vb->deflate_vq);
> > +		else
> > +			tell_host(vb, vb->deflate_vq);
> > +
> > +		release_pages_balloon(vb, &pages);
> > +	}
> >  	release_pages_balloon(vb, &pages);
> >  	mutex_unlock(&vb->balloon_lock);
> >  	return num_freed_pages;
> > @@ -354,13 +516,15 @@ static int virtballoon_oom_notify(struct notifier_block *self,
> >  	struct virtio_balloon *vb;
> >  	unsigned long *freed;
> >  	unsigned num_freed_pages;
> > +	bool use_bmap;
> >
> >  	vb = container_of(self, struct virtio_balloon, nb);
> >  	if (!virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM))
> >  		return NOTIFY_OK;
> >
> >  	freed = parm;
> > -	num_freed_pages = leak_balloon(vb, oom_pages);
> > +	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
> > +	num_freed_pages = leak_balloon(vb, oom_pages, use_bmap);
> >  	update_balloon_size(vb);
> >  	*freed += num_freed_pages;
> >
> > @@ -380,15 +544,19 @@ static void update_balloon_size_func(struct work_struct *work)
> >  {
> >  	struct virtio_balloon *vb;
> >  	s64 diff;
> > +	bool use_bmap;
> >
> >  	vb = container_of(work, struct virtio_balloon,
> >  			  update_balloon_size_work);
> >  	diff = towards_target(vb);
> > +	use_bmap = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_PAGE_BITMAP);
> > +	if (use_bmap && diff && vb->nr_page_bmap == 1)
> > +		extend_page_bitmap(vb);
> 
> So you allocate it on first use, then keep it around until device remove?
> Seems ugly.

Yes, this version behaves like this.

> Needs comments explaining the motivation for this.
> Can't we free it immediately when it becomes unused?
> 

Yes, it can be freed immediately, will change in v4.

Thanks for your time and your valuable comments! I will send out v4 soon.

Liang

^ permalink raw reply

* [GIT PULL v2 5/5] processor.h: remove cpu_relax_lowlatency
From: Christian Borntraeger @ 2016-10-25  9:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-s390, kvm, Will Deacon, x86, Heiko Carstens,
	linux-kernel, Nicholas Piggin, Russell King, sparclinux,
	Noam Camus, Catalin Marinas, Martin Schwidefsky, xen-devel,
	virtualization, linuxppc-dev, Ingo Molnar
In-Reply-To: <1477386195-32736-1-git-send-email-borntraeger@de.ibm.com>

As there are no users left, we can remove cpu_relax_lowlatency.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/alpha/include/asm/processor.h      | 1 -
 arch/arc/include/asm/processor.h        | 2 --
 arch/arm/include/asm/processor.h        | 1 -
 arch/arm64/include/asm/processor.h      | 1 -
 arch/avr32/include/asm/processor.h      | 1 -
 arch/blackfin/include/asm/processor.h   | 1 -
 arch/c6x/include/asm/processor.h        | 1 -
 arch/cris/include/asm/processor.h       | 1 -
 arch/frv/include/asm/processor.h        | 1 -
 arch/h8300/include/asm/processor.h      | 1 -
 arch/hexagon/include/asm/processor.h    | 1 -
 arch/ia64/include/asm/processor.h       | 1 -
 arch/m32r/include/asm/processor.h       | 1 -
 arch/m68k/include/asm/processor.h       | 1 -
 arch/metag/include/asm/processor.h      | 1 -
 arch/microblaze/include/asm/processor.h | 1 -
 arch/mips/include/asm/processor.h       | 1 -
 arch/mn10300/include/asm/processor.h    | 1 -
 arch/nios2/include/asm/processor.h      | 1 -
 arch/openrisc/include/asm/processor.h   | 1 -
 arch/parisc/include/asm/processor.h     | 1 -
 arch/powerpc/include/asm/processor.h    | 1 -
 arch/s390/include/asm/processor.h       | 1 -
 arch/score/include/asm/processor.h      | 1 -
 arch/sh/include/asm/processor.h         | 1 -
 arch/sparc/include/asm/processor_32.h   | 1 -
 arch/sparc/include/asm/processor_64.h   | 1 -
 arch/tile/include/asm/processor.h       | 1 -
 arch/unicore32/include/asm/processor.h  | 1 -
 arch/x86/include/asm/processor.h        | 1 -
 arch/x86/um/asm/processor.h             | 1 -
 arch/xtensa/include/asm/processor.h     | 1 -
 32 files changed, 33 deletions(-)

diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 0556fda..31e8dbe 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -59,7 +59,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 #define ARCH_HAS_PREFETCH
 #define ARCH_HAS_PREFETCHW
diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index 6c158d5..d102a49 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -61,7 +61,6 @@ struct task_struct;
 
 #define cpu_relax()		barrier()
 #define cpu_relax_yield()	cpu_relax()
-#define cpu_relax_lowlatency()	cpu_relax()
 
 #else
 
@@ -69,7 +68,6 @@ struct task_struct;
 	__asm__ __volatile__ (".word %0" : : "i"(CTOP_INST_SCHD_RW) : "memory")
 
 #define cpu_relax_yield()	cpu_relax()
-#define cpu_relax_lowlatency()	barrier()
 
 #endif
 
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index db660e0..9e71c58b 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -83,7 +83,6 @@ unsigned long get_wchan(struct task_struct *p);
 #endif
 
 #define cpu_relax_yield()  	              cpu_relax()
-#define cpu_relax_lowlatency()                cpu_relax()
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 3f9b0e5..6132f64 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -150,7 +150,6 @@ static inline void cpu_relax(void)
 }
 
 #define cpu_relax_yield()                     cpu_relax()
-#define cpu_relax_lowlatency()                cpu_relax()
 
 /* Thread switching */
 extern struct task_struct *cpu_switch_to(struct task_struct *prev,
diff --git a/arch/avr32/include/asm/processor.h b/arch/avr32/include/asm/processor.h
index e412e8b..ee62365 100644
--- a/arch/avr32/include/asm/processor.h
+++ b/arch/avr32/include/asm/processor.h
@@ -93,7 +93,6 @@ extern struct avr32_cpuinfo boot_cpu_data;
 
 #define cpu_relax()		barrier()
 #define cpu_relax_yield()	cpu_relax()
-#define cpu_relax_lowlatency()        cpu_relax()
 #define cpu_sync_pipeline()	asm volatile("sub pc, -2" : : : "memory")
 
 struct cpu_context {
diff --git a/arch/blackfin/include/asm/processor.h b/arch/blackfin/include/asm/processor.h
index 8b8704a..57acfb1 100644
--- a/arch/blackfin/include/asm/processor.h
+++ b/arch/blackfin/include/asm/processor.h
@@ -93,7 +93,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()    	smp_mb()
 #define cpu_relax_yield()      cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* Get the Silicon Revision of the chip */
 static inline uint32_t __pure bfin_revid(void)
diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
index 914d730..1fd22e7 100644
--- a/arch/c6x/include/asm/processor.h
+++ b/arch/c6x/include/asm/processor.h
@@ -122,7 +122,6 @@ extern unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()		do { } while (0)
 #define cpu_relax_yield()             cpu_relax()
-#define cpu_relax_lowlatency()        cpu_relax()
 
 extern const struct seq_operations cpuinfo_op;
 
diff --git a/arch/cris/include/asm/processor.h b/arch/cris/include/asm/processor.h
index 01dd52e..1a57841 100644
--- a/arch/cris/include/asm/processor.h
+++ b/arch/cris/include/asm/processor.h
@@ -64,7 +64,6 @@ static inline void release_thread(struct task_struct *dead_task)
 
 #define cpu_relax()     barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 void default_idle(void);
 
diff --git a/arch/frv/include/asm/processor.h b/arch/frv/include/asm/processor.h
index 4d00d65..c1e5f2a 100644
--- a/arch/frv/include/asm/processor.h
+++ b/arch/frv/include/asm/processor.h
@@ -108,7 +108,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax() barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* data cache prefetch */
 #define ARCH_HAS_PREFETCH
diff --git a/arch/h8300/include/asm/processor.h b/arch/h8300/include/asm/processor.h
index 683a061..42d6053 100644
--- a/arch/h8300/include/asm/processor.h
+++ b/arch/h8300/include/asm/processor.h
@@ -128,7 +128,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()    barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency()	cpu_relax()
 
 #define HARD_RESET_NOW() ({		\
 	local_irq_disable();		\
diff --git a/arch/hexagon/include/asm/processor.h b/arch/hexagon/include/asm/processor.h
index 1558ddb..5d694cc 100644
--- a/arch/hexagon/include/asm/processor.h
+++ b/arch/hexagon/include/asm/processor.h
@@ -57,7 +57,6 @@ struct thread_struct {
 
 #define cpu_relax() __vmyield()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /*
  * Decides where the kernel will search for a free chunk of vm space during
diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index 4654b71..0c2c3b2 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -548,7 +548,6 @@ ia64_eoi (void)
 
 #define cpu_relax()	ia64_hint(ia64_hint_pause)
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 static inline int
 ia64_get_irr(unsigned int vector)
diff --git a/arch/m32r/include/asm/processor.h b/arch/m32r/include/asm/processor.h
index b262037..9b83a13 100644
--- a/arch/m32r/include/asm/processor.h
+++ b/arch/m32r/include/asm/processor.h
@@ -134,6 +134,5 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 #endif /* _ASM_M32R_PROCESSOR_H */
diff --git a/arch/m68k/include/asm/processor.h b/arch/m68k/include/asm/processor.h
index 13e07ae..b0d0442 100644
--- a/arch/m68k/include/asm/processor.h
+++ b/arch/m68k/include/asm/processor.h
@@ -157,6 +157,5 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 #endif
diff --git a/arch/metag/include/asm/processor.h b/arch/metag/include/asm/processor.h
index 61d6e27..ee302a6 100644
--- a/arch/metag/include/asm/processor.h
+++ b/arch/metag/include/asm/processor.h
@@ -153,7 +153,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()     barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency()  cpu_relax()
 
 extern void setup_priv(void);
 
diff --git a/arch/microblaze/include/asm/processor.h b/arch/microblaze/include/asm/processor.h
index fd7dd11..08ec1f7 100644
--- a/arch/microblaze/include/asm/processor.h
+++ b/arch/microblaze/include/asm/processor.h
@@ -23,7 +23,6 @@ extern const struct seq_operations cpuinfo_op;
 
 # define cpu_relax()		barrier()
 # define cpu_relax_yield() cpu_relax()
-# define cpu_relax_lowlatency()	cpu_relax()
 
 #define task_pt_regs(tsk) \
 		(((struct pt_regs *)(THREAD_SIZE + task_stack_page(tsk))) - 1)
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index 9a656f6..8ea95e7 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -390,7 +390,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /*
  * Return_address is a replacement for __builtin_return_address(count)
diff --git a/arch/mn10300/include/asm/processor.h b/arch/mn10300/include/asm/processor.h
index 89f63d1..d11397b 100644
--- a/arch/mn10300/include/asm/processor.h
+++ b/arch/mn10300/include/asm/processor.h
@@ -70,7 +70,6 @@ extern void dodgy_tsc(void);
 
 #define cpu_relax() barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /*
  * User space process size: 1.75GB (default).
diff --git a/arch/nios2/include/asm/processor.h b/arch/nios2/include/asm/processor.h
index 303e593..d32c176 100644
--- a/arch/nios2/include/asm/processor.h
+++ b/arch/nios2/include/asm/processor.h
@@ -89,7 +89,6 @@ extern unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency()  cpu_relax()
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/openrisc/include/asm/processor.h b/arch/openrisc/include/asm/processor.h
index 6ecfc2a..7f47fc7 100644
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -93,7 +93,6 @@ extern unsigned long thread_saved_pc(struct task_struct *t);
 
 #define cpu_relax()     barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 #endif /* __ASSEMBLY__ */
 #endif /* __ASM_OPENRISC_PROCESSOR_H */
diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
index ea2ff9f..a4a07f4 100644
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -310,7 +310,6 @@ extern unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /*
  * parisc_requires_coherency() is used to identify the combined VIPT/PIPT
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 908fa7c..5684e68 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -405,7 +405,6 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
 #endif
 
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* Check that a certain kernel stack pointer is valid in task_struct p */
 int validate_sp(unsigned long sp, struct task_struct *p,
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 5d262cf..17c001a 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -237,7 +237,6 @@ static inline unsigned short stap(void)
 void cpu_relax_yield(void);
 
 #define cpu_relax() barrier()
-#define cpu_relax_lowlatency()  barrier()
 
 #define ECAG_CACHE_ATTRIBUTE	0
 #define ECAG_CPU_ATTRIBUTE	1
diff --git a/arch/score/include/asm/processor.h b/arch/score/include/asm/processor.h
index e8e87b4..a1e97c0 100644
--- a/arch/score/include/asm/processor.h
+++ b/arch/score/include/asm/processor.h
@@ -25,7 +25,6 @@ extern unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()		barrier()
 #define cpu_relax_yield()	cpu_relax()
-#define cpu_relax_lowlatency()        cpu_relax()
 #define release_thread(thread)	do {} while (0)
 
 /*
diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h
index 099a991..9454ff1 100644
--- a/arch/sh/include/asm/processor.h
+++ b/arch/sh/include/asm/processor.h
@@ -98,7 +98,6 @@ extern struct sh_cpuinfo cpu_data[];
 #define cpu_sleep()	__asm__ __volatile__ ("sleep" : : : "memory")
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 void default_idle(void);
 void stop_this_cpu(void *);
diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
index 50e908a3c..fc32b73 100644
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -120,7 +120,6 @@ int do_mathemu(struct pt_regs *regs, struct task_struct *fpt);
 
 #define cpu_relax()	barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 extern void (*sparc_idle)(void);
 
diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
index 3e8fac7..12787df 100644
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -217,7 +217,6 @@ unsigned long get_wchan(struct task_struct *task);
 				     ".previous"			\
 				     ::: "memory")
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* Prefetch support.  This is tuned for UltraSPARC-III and later.
  * UltraSPARC-I will treat these as nops, and UltraSPARC-II has
diff --git a/arch/tile/include/asm/processor.h b/arch/tile/include/asm/processor.h
index 91a39a5..c1c228b 100644
--- a/arch/tile/include/asm/processor.h
+++ b/arch/tile/include/asm/processor.h
@@ -265,7 +265,6 @@ static inline void cpu_relax(void)
 }
 
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* Info on this processor (see fs/proc/cpuinfo.c) */
 struct seq_operations;
diff --git a/arch/unicore32/include/asm/processor.h b/arch/unicore32/include/asm/processor.h
index fc54d5d..eeefe7c 100644
--- a/arch/unicore32/include/asm/processor.h
+++ b/arch/unicore32/include/asm/processor.h
@@ -72,7 +72,6 @@ unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
 #define cpu_relax_yield()		cpu_relax()
-#define cpu_relax_lowlatency()                cpu_relax()
 
 #define task_pt_regs(p) \
 	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 44adada..7513c99 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -589,7 +589,6 @@ static __always_inline void cpu_relax(void)
 }
 
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* Stop speculative execution and prefetching of modified code. */
 static inline void sync_core(void)
diff --git a/arch/x86/um/asm/processor.h b/arch/x86/um/asm/processor.h
index 01597af..b4bd63b 100644
--- a/arch/x86/um/asm/processor.h
+++ b/arch/x86/um/asm/processor.h
@@ -27,7 +27,6 @@ static inline void rep_nop(void)
 
 #define cpu_relax()		rep_nop()
 #define cpu_relax_yield()	cpu_relax()
-#define cpu_relax_lowlatency()	cpu_relax()
 
 #define task_pt_regs(t) (&(t)->thread.regs)
 
diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
index fe14dc2..7d8d6be 100644
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -207,7 +207,6 @@ extern unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()  barrier()
 #define cpu_relax_yield() cpu_relax()
-#define cpu_relax_lowlatency() cpu_relax()
 
 /* Special register access. */
 
-- 
2.5.5

^ permalink raw reply related

* [GIT PULL v2 4/5] processor.h: Remove cpu_relax_lowlatency users
From: Christian Borntraeger @ 2016-10-25  9:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-s390, kvm, Will Deacon, x86, Heiko Carstens,
	linux-kernel, Nicholas Piggin, Russell King, sparclinux,
	Noam Camus, Catalin Marinas, Martin Schwidefsky, xen-devel,
	virtualization, linuxppc-dev, Ingo Molnar
In-Reply-To: <1477386195-32736-1-git-send-email-borntraeger@de.ibm.com>

With the s390 special case of a yielding cpu_relax implementation gone,
we can now remove all users of cpu_relax_lowlatency and replace them
with cpu_relax.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 drivers/gpu/drm/i915/i915_gem_request.c | 2 +-
 drivers/vhost/net.c                     | 4 ++--
 kernel/locking/mcs_spinlock.h           | 4 ++--
 kernel/locking/mutex.c                  | 4 ++--
 kernel/locking/osq_lock.c               | 6 +++---
 kernel/locking/qrwlock.c                | 6 +++---
 kernel/locking/rwsem-xadd.c             | 4 ++--
 lib/lockref.c                           | 2 +-
 8 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 8832f8e..383d134 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -723,7 +723,7 @@ bool __i915_spin_request(const struct drm_i915_gem_request *req,
 		if (busywait_stop(timeout_us, cpu))
 			break;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	} while (!need_resched());
 
 	return false;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 5dc128a..5dc3465 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -342,7 +342,7 @@ static int vhost_net_tx_get_vq_desc(struct vhost_net *net,
 		endtime = busy_clock() + vq->busyloop_timeout;
 		while (vhost_can_busy_poll(vq->dev, endtime) &&
 		       vhost_vq_avail_empty(vq->dev, vq))
-			cpu_relax_lowlatency();
+			cpu_relax();
 		preempt_enable();
 		r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
 				      out_num, in_num, NULL, NULL);
@@ -533,7 +533,7 @@ static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk)
 		while (vhost_can_busy_poll(&net->dev, endtime) &&
 		       !sk_has_rx_data(sk) &&
 		       vhost_vq_avail_empty(&net->dev, vq))
-			cpu_relax_lowlatency();
+			cpu_relax();
 
 		preempt_enable();
 
diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
index c835270..6a385aa 100644
--- a/kernel/locking/mcs_spinlock.h
+++ b/kernel/locking/mcs_spinlock.h
@@ -28,7 +28,7 @@ struct mcs_spinlock {
 #define arch_mcs_spin_lock_contended(l)					\
 do {									\
 	while (!(smp_load_acquire(l)))					\
-		cpu_relax_lowlatency();					\
+		cpu_relax();						\
 } while (0)
 #endif
 
@@ -108,7 +108,7 @@ void mcs_spin_unlock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
 			return;
 		/* Wait until the next pointer is set */
 		while (!(next = READ_ONCE(node->next)))
-			cpu_relax_lowlatency();
+			cpu_relax();
 	}
 
 	/* Pass lock to next waiter. */
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90d..4463405 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -241,7 +241,7 @@ bool mutex_spin_on_owner(struct mutex *lock, struct task_struct *owner)
 			break;
 		}
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 	rcu_read_unlock();
 
@@ -377,7 +377,7 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 		 * memory barriers as we'll eventually observe the right
 		 * values at the cost of a few extra spins.
 		 */
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 
 	osq_unlock(&lock->osq);
diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 05a3785..4ea2710 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -75,7 +75,7 @@ osq_wait_next(struct optimistic_spin_queue *lock,
 				break;
 		}
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 
 	return next;
@@ -122,7 +122,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 		if (need_resched())
 			goto unqueue;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 	return true;
 
@@ -148,7 +148,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 		if (smp_load_acquire(&node->locked))
 			return true;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 
 		/*
 		 * Or we race against a concurrent unqueue()'s step-B, in which
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 19248dd..cc3ed0c 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -54,7 +54,7 @@ static __always_inline void
 rspin_until_writer_unlock(struct qrwlock *lock, u32 cnts)
 {
 	while ((cnts & _QW_WMASK) == _QW_LOCKED) {
-		cpu_relax_lowlatency();
+		cpu_relax();
 		cnts = atomic_read_acquire(&lock->cnts);
 	}
 }
@@ -130,7 +130,7 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 		   (cmpxchg_relaxed(&l->wmode, 0, _QW_WAITING) == 0))
 			break;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 
 	/* When no more readers, set the locked flag */
@@ -141,7 +141,7 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 					    _QW_LOCKED) == _QW_WAITING))
 			break;
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 2337b4b..2fa2e2e6 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -368,7 +368,7 @@ static noinline bool rwsem_spin_on_owner(struct rw_semaphore *sem)
 			return false;
 		}
 
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 	rcu_read_unlock();
 out:
@@ -423,7 +423,7 @@ static bool rwsem_optimistic_spin(struct rw_semaphore *sem)
 		 * memory barriers as we'll eventually observe the right
 		 * values at the cost of a few extra spins.
 		 */
-		cpu_relax_lowlatency();
+		cpu_relax();
 	}
 	osq_unlock(&sem->osq);
 done:
diff --git a/lib/lockref.c b/lib/lockref.c
index 5a92189..c4bfcb8 100644
--- a/lib/lockref.c
+++ b/lib/lockref.c
@@ -20,7 +20,7 @@
 		if (likely(old.lock_count == prev.lock_count)) {		\
 			SUCCESS;						\
 		}								\
-		cpu_relax_lowlatency();						\
+		cpu_relax();							\
 	}									\
 } while (0)
 
-- 
2.5.5

^ permalink raw reply related

* [GIT PULL v2 3/5] s390: make cpu_relax a barrier again
From: Christian Borntraeger @ 2016-10-25  9:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-s390, kvm, Will Deacon, x86, Heiko Carstens,
	linux-kernel, Nicholas Piggin, Russell King, sparclinux,
	Noam Camus, Catalin Marinas, Martin Schwidefsky, xen-devel,
	virtualization, linuxppc-dev, Ingo Molnar
In-Reply-To: <1477386195-32736-1-git-send-email-borntraeger@de.ibm.com>

stop_machine seemed to be the only important place where cpu_relax
needed to yield. That case is now handled by cpu_relax_yield. Therefore,
we can redefine cpu_relax on s390 to be a plain barrier, bringing s390
in line with most other architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index d05965b..5d262cf 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -236,7 +236,7 @@ static inline unsigned short stap(void)
  */
 void cpu_relax_yield(void);
 
-#define cpu_relax() cpu_relax_yield()
+#define cpu_relax() barrier()
 #define cpu_relax_lowlatency()  barrier()
 
 #define ECAG_CACHE_ATTRIBUTE	0
-- 
2.5.5

* [GIT PULL v2 2/5] stop_machine: yield CPU during stop machine
From: Christian Borntraeger @ 2016-10-25  9:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-s390, kvm, Will Deacon, x86, Heiko Carstens,
	linux-kernel, Nicholas Piggin, Russell King, sparclinux,
	Noam Camus, Catalin Marinas, Martin Schwidefsky, xen-devel,
	virtualization, linuxppc-dev, Ingo Molnar
In-Reply-To: <1477386195-32736-1-git-send-email-borntraeger@de.ibm.com>

Some time ago commit 57f2ffe14fd125c2 ("s390: remove diag 44 calls
from cpu_relax()") stopped cpu_relax on s390 from yielding to the
hypervisor.

As it turns out, this made stop_machine run really slowly on virtualized
overcommitted systems. For example, with large guests the kprobes test
during bootup took several seconds instead of just running unnoticed.

Therefore, the yielding was reintroduced with commit 4d92f50249eb
("s390: reintroduce diag 44 calls for cpu_relax()"), but in fact the
stop machine code seems to be the only place where this yielding
was really necessary. This place is probably the most important one,
as it makes all but one of the guest CPUs wait for one guest CPU.

As we now have cpu_relax_yield, we can use it in multi_cpu_stop.
For now, let's only add it here. We can add it in other places later
when necessary.
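
To illustrate why yielding matters here, below is a minimal single-threaded
model, not kernel code. It assumes one physical CPU shared by two virtual
CPUs: the waiter polls a state word (like msdata->state), and the other
vCPU only makes progress when the waiter yields or its time slice runs
out. All names and the SLICE_SPINS value are hypothetical.

```c
#include <assert.h>

/* Made-up number of polls that fit into one hypervisor time slice. */
enum { SLICE_SPINS = 1000 };

struct model {
	int state;          /* shared state the waiter polls */
	int work_left;      /* slices the other vCPU needs before it sets state */
	long wasted_spins;  /* polls by the waiter that saw no change */
};

/* Let the other vCPU run for one time slice. */
static void run_other_vcpu(struct model *m)
{
	if (m->work_left > 0 && --m->work_left == 0)
		m->state = 1;
}

/*
 * Waiter loop. With yield != 0 every poll hands the physical CPU over
 * (the cpu_relax_yield case). With yield == 0 the waiter keeps spinning
 * until its slice expires (the barrier-only cpu_relax case).
 */
static long wait_for_state(struct model *m, int yield)
{
	long spins_in_slice = 0;

	assert(m->work_left > 0);	/* otherwise the loop would never end */
	while (m->state == 0) {
		m->wasted_spins++;
		if (yield || ++spins_in_slice == SLICE_SPINS) {
			spins_in_slice = 0;
			run_other_vcpu(m);
		}
	}
	return m->wasted_spins;
}
```

In this model a waiter that yields polls once per slice the other vCPU
needs, while a waiter that only executes a barrier burns a full slice of
polls each time. Real hypervisor scheduling is of course more involved.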

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 kernel/stop_machine.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index ec9ab2f..1eb8266 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -194,7 +194,7 @@ static int multi_cpu_stop(void *data)
 	/* Simple state machine */
 	do {
 		/* Chill out and ensure we re-read multi_stop_state. */
-		cpu_relax();
+		cpu_relax_yield();
 		if (msdata->state != curstate) {
 			curstate = msdata->state;
 			switch (curstate) {
-- 
2.5.5

* [GIT PULL v2 1/5] processor.h: introduce cpu_relax_yield
From: Christian Borntraeger @ 2016-10-25  9:03 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-arch, linux-s390, kvm, Will Deacon, x86, Heiko Carstens,
	linux-kernel, Nicholas Piggin, Russell King, sparclinux,
	Noam Camus, Catalin Marinas, Martin Schwidefsky, xen-devel,
	virtualization, linuxppc-dev, Ingo Molnar
In-Reply-To: <1477386195-32736-1-git-send-email-borntraeger@de.ibm.com>

For spinning loops people often use barrier() or cpu_relax().
For most architectures cpu_relax and barrier are the same, but on
some architectures cpu_relax can add some latency.
For example on power, sparc64 and arc, cpu_relax can shift the CPU
towards other hardware threads in an SMT environment.
On s390 cpu_relax does even more: it uses a hypercall to the
hypervisor to give up the timeslice.
In contrast to the SMT yielding this can result in larger latencies.
In some places this latency is unwanted, so another variant
"cpu_relax_lowlatency" was introduced. Before this is used in more
and more places, let's reverse the logic and provide a cpu_relax_yield
that can be called in places where yielding is more important than
latency. By default this is the same as cpu_relax on all architectures.
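
The default-plus-override pattern described above can be sketched in
userspace as follows. This is only an illustration under assumptions: the
sketch_* names and SKETCH_ARCH_CAN_YIELD are hypothetical, a C11
atomic_signal_fence() stands in for the kernel's barrier(), and
sched_yield() stands in for a hypervisor yield such as the s390 hypercall.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Stand-in for barrier(): a compiler-only fence, no CPU instruction. */
static inline void sketch_barrier(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Default relax: nothing more than a barrier. */
static inline void sketch_cpu_relax(void)
{
	sketch_barrier();
}

#ifdef SKETCH_ARCH_CAN_YIELD
/* An "architecture" that can give up its time slice overrides the default. */
static inline void sketch_cpu_relax_yield(void)
{
	sched_yield();
	sketch_barrier();
}
#else
/* Everyone else: yielding variant is the same as the plain relax. */
static inline void sketch_cpu_relax_yield(void)
{
	sketch_cpu_relax();
}
#endif

/* Poll flag at most max_spins times; return 1 if it was seen set, else 0. */
static int sketch_spin_until_set(atomic_int *flag, long max_spins)
{
	long i;

	assert(max_spins >= 0);
	for (i = 0; i < max_spins; i++) {
		if (atomic_load_explicit(flag, memory_order_acquire))
			return 1;
		sketch_cpu_relax_yield();
	}
	return 0;
}
```

Callers that care more about yielding than latency use the _yield
variant; on builds without SKETCH_ARCH_CAN_YIELD it costs exactly what
the plain relax costs.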

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/alpha/include/asm/processor.h      | 1 +
 arch/arc/include/asm/processor.h        | 2 ++
 arch/arm/include/asm/processor.h        | 1 +
 arch/arm64/include/asm/processor.h      | 1 +
 arch/avr32/include/asm/processor.h      | 1 +
 arch/blackfin/include/asm/processor.h   | 1 +
 arch/c6x/include/asm/processor.h        | 1 +
 arch/cris/include/asm/processor.h       | 1 +
 arch/frv/include/asm/processor.h        | 1 +
 arch/h8300/include/asm/processor.h      | 1 +
 arch/hexagon/include/asm/processor.h    | 1 +
 arch/ia64/include/asm/processor.h       | 1 +
 arch/m32r/include/asm/processor.h       | 1 +
 arch/m68k/include/asm/processor.h       | 1 +
 arch/metag/include/asm/processor.h      | 1 +
 arch/microblaze/include/asm/processor.h | 1 +
 arch/mips/include/asm/processor.h       | 1 +
 arch/mn10300/include/asm/processor.h    | 1 +
 arch/nios2/include/asm/processor.h      | 1 +
 arch/openrisc/include/asm/processor.h   | 1 +
 arch/parisc/include/asm/processor.h     | 1 +
 arch/powerpc/include/asm/processor.h    | 1 +
 arch/s390/include/asm/processor.h       | 3 ++-
 arch/s390/kernel/processor.c            | 4 ++--
 arch/score/include/asm/processor.h      | 1 +
 arch/sh/include/asm/processor.h         | 1 +
 arch/sparc/include/asm/processor_32.h   | 1 +
 arch/sparc/include/asm/processor_64.h   | 1 +
 arch/tile/include/asm/processor.h       | 1 +
 arch/unicore32/include/asm/processor.h  | 1 +
 arch/x86/include/asm/processor.h        | 1 +
 arch/x86/um/asm/processor.h             | 1 +
 arch/xtensa/include/asm/processor.h     | 1 +
 33 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
index 43a7559..0556fda 100644
--- a/arch/alpha/include/asm/processor.h
+++ b/arch/alpha/include/asm/processor.h
@@ -58,6 +58,7 @@ unsigned long get_wchan(struct task_struct *p);
   ((tsk) == current ? rdusp() : task_thread_info(tsk)->pcb.usp)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 #define ARCH_HAS_PREFETCH
diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
index 16b630f..6c158d5 100644
--- a/arch/arc/include/asm/processor.h
+++ b/arch/arc/include/asm/processor.h
@@ -60,6 +60,7 @@ struct task_struct;
 #ifndef CONFIG_EZNPS_MTM_EXT
 
 #define cpu_relax()		barrier()
+#define cpu_relax_yield()	cpu_relax()
 #define cpu_relax_lowlatency()	cpu_relax()
 
 #else
@@ -67,6 +68,7 @@ struct task_struct;
 #define cpu_relax()     \
 	__asm__ __volatile__ (".word %0" : : "i"(CTOP_INST_SCHD_RW) : "memory")
 
+#define cpu_relax_yield()	cpu_relax()
 #define cpu_relax_lowlatency()	barrier()
 
 #endif
diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
index 8a1e8e9..db660e0 100644
--- a/arch/arm/include/asm/processor.h
+++ b/arch/arm/include/asm/processor.h
@@ -82,6 +82,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define cpu_relax()			barrier()
 #endif
 
+#define cpu_relax_yield()  	              cpu_relax()
 #define cpu_relax_lowlatency()                cpu_relax()
 
 #define task_pt_regs(p) \
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 60e3482..3f9b0e5 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -149,6 +149,7 @@ static inline void cpu_relax(void)
 	asm volatile("yield" ::: "memory");
 }
 
+#define cpu_relax_yield()                     cpu_relax()
 #define cpu_relax_lowlatency()                cpu_relax()
 
 /* Thread switching */
diff --git a/arch/avr32/include/asm/processor.h b/arch/avr32/include/asm/processor.h
index 941593c..e412e8b 100644
--- a/arch/avr32/include/asm/processor.h
+++ b/arch/avr32/include/asm/processor.h
@@ -92,6 +92,7 @@ extern struct avr32_cpuinfo boot_cpu_data;
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 3))
 
 #define cpu_relax()		barrier()
+#define cpu_relax_yield()	cpu_relax()
 #define cpu_relax_lowlatency()        cpu_relax()
 #define cpu_sync_pipeline()	asm volatile("sub pc, -2" : : : "memory")
 
diff --git a/arch/blackfin/include/asm/processor.h b/arch/blackfin/include/asm/processor.h
index 0c265ab..8b8704a 100644
--- a/arch/blackfin/include/asm/processor.h
+++ b/arch/blackfin/include/asm/processor.h
@@ -92,6 +92,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define	KSTK_ESP(tsk)	((tsk) == current ? rdusp() : (tsk)->thread.usp)
 
 #define cpu_relax()    	smp_mb()
+#define cpu_relax_yield()      cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /* Get the Silicon Revision of the chip */
diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
index f2ef31b..914d730 100644
--- a/arch/c6x/include/asm/processor.h
+++ b/arch/c6x/include/asm/processor.h
@@ -121,6 +121,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(task)	(task_pt_regs(task)->sp)
 
 #define cpu_relax()		do { } while (0)
+#define cpu_relax_yield()             cpu_relax()
 #define cpu_relax_lowlatency()        cpu_relax()
 
 extern const struct seq_operations cpuinfo_op;
diff --git a/arch/cris/include/asm/processor.h b/arch/cris/include/asm/processor.h
index 862126b..01dd52e 100644
--- a/arch/cris/include/asm/processor.h
+++ b/arch/cris/include/asm/processor.h
@@ -63,6 +63,7 @@ static inline void release_thread(struct task_struct *dead_task)
 #define init_stack      (init_thread_union.stack)
 
 #define cpu_relax()     barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 void default_idle(void);
diff --git a/arch/frv/include/asm/processor.h b/arch/frv/include/asm/processor.h
index 73f0a79..4d00d65 100644
--- a/arch/frv/include/asm/processor.h
+++ b/arch/frv/include/asm/processor.h
@@ -107,6 +107,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define	KSTK_ESP(tsk)	((tsk)->thread.frame0->sp)
 
 #define cpu_relax() barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /* data cache prefetch */
diff --git a/arch/h8300/include/asm/processor.h b/arch/h8300/include/asm/processor.h
index 111df73..683a061 100644
--- a/arch/h8300/include/asm/processor.h
+++ b/arch/h8300/include/asm/processor.h
@@ -127,6 +127,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define	KSTK_ESP(tsk)	((tsk) == current ? rdusp() : (tsk)->thread.usp)
 
 #define cpu_relax()    barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency()	cpu_relax()
 
 #define HARD_RESET_NOW() ({		\
diff --git a/arch/hexagon/include/asm/processor.h b/arch/hexagon/include/asm/processor.h
index d850113..1558ddb 100644
--- a/arch/hexagon/include/asm/processor.h
+++ b/arch/hexagon/include/asm/processor.h
@@ -56,6 +56,7 @@ struct thread_struct {
 }
 
 #define cpu_relax() __vmyield()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /*
diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index ce53c50..4654b71 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -547,6 +547,7 @@ ia64_eoi (void)
 }
 
 #define cpu_relax()	ia64_hint(ia64_hint_pause)
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 static inline int
diff --git a/arch/m32r/include/asm/processor.h b/arch/m32r/include/asm/processor.h
index 9f8fd9b..b262037 100644
--- a/arch/m32r/include/asm/processor.h
+++ b/arch/m32r/include/asm/processor.h
@@ -133,6 +133,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)  ((tsk)->thread.sp)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 #endif /* _ASM_M32R_PROCESSOR_H */
diff --git a/arch/m68k/include/asm/processor.h b/arch/m68k/include/asm/processor.h
index c84a218..13e07ae 100644
--- a/arch/m68k/include/asm/processor.h
+++ b/arch/m68k/include/asm/processor.h
@@ -156,6 +156,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define task_pt_regs(tsk)	((struct pt_regs *) ((tsk)->thread.esp0))
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 #endif
diff --git a/arch/metag/include/asm/processor.h b/arch/metag/include/asm/processor.h
index a0333eb..61d6e27 100644
--- a/arch/metag/include/asm/processor.h
+++ b/arch/metag/include/asm/processor.h
@@ -152,6 +152,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define user_stack_pointer(regs)        ((regs)->ctx.AX[0].U0)
 
 #define cpu_relax()     barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency()  cpu_relax()
 
 extern void setup_priv(void);
diff --git a/arch/microblaze/include/asm/processor.h b/arch/microblaze/include/asm/processor.h
index c38d0dd..fd7dd11 100644
--- a/arch/microblaze/include/asm/processor.h
+++ b/arch/microblaze/include/asm/processor.h
@@ -22,6 +22,7 @@
 extern const struct seq_operations cpuinfo_op;
 
 # define cpu_relax()		barrier()
+# define cpu_relax_yield() cpu_relax()
 # define cpu_relax_lowlatency()	cpu_relax()
 
 #define task_pt_regs(tsk) \
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index 0d36c87..9a656f6 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -389,6 +389,7 @@ unsigned long get_wchan(struct task_struct *p);
 #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /*
diff --git a/arch/mn10300/include/asm/processor.h b/arch/mn10300/include/asm/processor.h
index b10ba12..89f63d1 100644
--- a/arch/mn10300/include/asm/processor.h
+++ b/arch/mn10300/include/asm/processor.h
@@ -69,6 +69,7 @@ extern void print_cpu_info(struct mn10300_cpuinfo *);
 extern void dodgy_tsc(void);
 
 #define cpu_relax() barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /*
diff --git a/arch/nios2/include/asm/processor.h b/arch/nios2/include/asm/processor.h
index 1c953f0..303e593 100644
--- a/arch/nios2/include/asm/processor.h
+++ b/arch/nios2/include/asm/processor.h
@@ -88,6 +88,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)	((tsk)->thread.kregs->sp)
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency()  cpu_relax()
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/openrisc/include/asm/processor.h b/arch/openrisc/include/asm/processor.h
index 70334c9..6ecfc2a 100644
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -92,6 +92,7 @@ extern unsigned long thread_saved_pc(struct task_struct *t);
 #define init_stack      (init_thread_union.stack)
 
 #define cpu_relax()     barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
index 2e674e1..ea2ff9f 100644
--- a/arch/parisc/include/asm/processor.h
+++ b/arch/parisc/include/asm/processor.h
@@ -309,6 +309,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)	((tsk)->thread.regs.gr[30])
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /*
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index c07c31b..908fa7c 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -404,6 +404,7 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
 #define cpu_relax()	barrier()
 #endif
 
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /* Check that a certain kernel stack pointer is valid in task_struct p */
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 0332317..d05965b 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -234,8 +234,9 @@ static inline unsigned short stap(void)
 /*
  * Give up the time slice of the virtual PU.
  */
-void cpu_relax(void);
+void cpu_relax_yield(void);
 
+#define cpu_relax() cpu_relax_yield()
 #define cpu_relax_lowlatency()  barrier()
 
 #define ECAG_CACHE_ATTRIBUTE	0
diff --git a/arch/s390/kernel/processor.c b/arch/s390/kernel/processor.c
index 81d0808..9e60ef1 100644
--- a/arch/s390/kernel/processor.c
+++ b/arch/s390/kernel/processor.c
@@ -53,7 +53,7 @@ void s390_update_cpu_mhz(void)
 		on_each_cpu(update_cpu_mhz, NULL, 0);
 }
 
-void notrace cpu_relax(void)
+void notrace cpu_relax_yield(void)
 {
 	if (!smp_cpu_mtid && MACHINE_HAS_DIAG44) {
 		diag_stat_inc(DIAG_STAT_X044);
@@ -61,7 +61,7 @@ void notrace cpu_relax(void)
 	}
 	barrier();
 }
-EXPORT_SYMBOL(cpu_relax);
+EXPORT_SYMBOL(cpu_relax_yield);
 
 /*
  * cpu_init - initializes state that is per-CPU.
diff --git a/arch/score/include/asm/processor.h b/arch/score/include/asm/processor.h
index 851f441..e8e87b4 100644
--- a/arch/score/include/asm/processor.h
+++ b/arch/score/include/asm/processor.h
@@ -24,6 +24,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define current_text_addr() ({ __label__ _l; _l: &&_l; })
 
 #define cpu_relax()		barrier()
+#define cpu_relax_yield()	cpu_relax()
 #define cpu_relax_lowlatency()        cpu_relax()
 #define release_thread(thread)	do {} while (0)
 
diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h
index f9a0994..099a991 100644
--- a/arch/sh/include/asm/processor.h
+++ b/arch/sh/include/asm/processor.h
@@ -97,6 +97,7 @@ extern struct sh_cpuinfo cpu_data[];
 
 #define cpu_sleep()	__asm__ __volatile__ ("sleep" : : : "memory")
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 void default_idle(void);
diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
index 812fd08..50e908a3c 100644
--- a/arch/sparc/include/asm/processor_32.h
+++ b/arch/sparc/include/asm/processor_32.h
@@ -119,6 +119,7 @@ extern struct task_struct *last_task_used_math;
 int do_mathemu(struct pt_regs *regs, struct task_struct *fpt);
 
 #define cpu_relax()	barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 extern void (*sparc_idle)(void);
diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
index ce2595c..3e8fac7 100644
--- a/arch/sparc/include/asm/processor_64.h
+++ b/arch/sparc/include/asm/processor_64.h
@@ -216,6 +216,7 @@ unsigned long get_wchan(struct task_struct *task);
 				     "nop\n\t"				\
 				     ".previous"			\
 				     ::: "memory")
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /* Prefetch support.  This is tuned for UltraSPARC-III and later.
diff --git a/arch/tile/include/asm/processor.h b/arch/tile/include/asm/processor.h
index 0684e88..91a39a5 100644
--- a/arch/tile/include/asm/processor.h
+++ b/arch/tile/include/asm/processor.h
@@ -264,6 +264,7 @@ static inline void cpu_relax(void)
 	barrier();
 }
 
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /* Info on this processor (see fs/proc/cpuinfo.c) */
diff --git a/arch/unicore32/include/asm/processor.h b/arch/unicore32/include/asm/processor.h
index 8d21b7a..fc54d5d 100644
--- a/arch/unicore32/include/asm/processor.h
+++ b/arch/unicore32/include/asm/processor.h
@@ -71,6 +71,7 @@ extern void release_thread(struct task_struct *);
 unsigned long get_wchan(struct task_struct *p);
 
 #define cpu_relax()			barrier()
+#define cpu_relax_yield()		cpu_relax()
 #define cpu_relax_lowlatency()                cpu_relax()
 
 #define task_pt_regs(p) \
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf..44adada 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -588,6 +588,7 @@ static __always_inline void cpu_relax(void)
 	rep_nop();
 }
 
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /* Stop speculative execution and prefetching of modified code. */
diff --git a/arch/x86/um/asm/processor.h b/arch/x86/um/asm/processor.h
index 233ee09..01597af 100644
--- a/arch/x86/um/asm/processor.h
+++ b/arch/x86/um/asm/processor.h
@@ -26,6 +26,7 @@ static inline void rep_nop(void)
 }
 
 #define cpu_relax()		rep_nop()
+#define cpu_relax_yield()	cpu_relax()
 #define cpu_relax_lowlatency()	cpu_relax()
 
 #define task_pt_regs(t) (&(t)->thread.regs)
diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
index b42d68b..fe14dc2 100644
--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -206,6 +206,7 @@ extern unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk)		(task_pt_regs(tsk)->areg[1])
 
 #define cpu_relax()  barrier()
+#define cpu_relax_yield() cpu_relax()
 #define cpu_relax_lowlatency() cpu_relax()
 
 /* Special register access. */
-- 
2.5.5
