public inbox for linux-kernel@vger.kernel.org
* Re: [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0
  2015-05-29 18:09 ` [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0 K. Y. Srinivasan
@ 2015-05-29 16:59   ` Dan Carpenter
  2015-05-29 17:41     ` KY Srinivasan
  2015-05-29 18:09   ` [PATCH 2/3] Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels K. Y. Srinivasan
  2015-05-29 18:09   ` [PATCH 3/3] Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion K. Y. Srinivasan
  2 siblings, 1 reply; 7+ messages in thread
From: Dan Carpenter @ 2015-05-29 16:59 UTC (permalink / raw)
  To: K. Y. Srinivasan
  Cc: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang

On Fri, May 29, 2015 at 11:09:02AM -0700, K. Y. Srinivasan wrote:
> Map target_cpu to target_vcpu using the mapping table.
> 

It's really hard to tell from this changelog what the user-visible
effects of this patch are.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0
  2015-05-29 16:59   ` Dan Carpenter
@ 2015-05-29 17:41     ` KY Srinivasan
  2015-05-31  3:23       ` gregkh
  0 siblings, 1 reply; 7+ messages in thread
From: KY Srinivasan @ 2015-05-29 17:41 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org,
	devel@linuxdriverproject.org, olaf@aepfle.de, apw@canonical.com,
	vkuznets@redhat.com, jasowang@redhat.com



> -----Original Message-----
> From: Dan Carpenter [mailto:dan.carpenter@oracle.com]
> Sent: Friday, May 29, 2015 10:00 AM
> To: KY Srinivasan
> Cc: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> devel@linuxdriverproject.org; olaf@aepfle.de; apw@canonical.com;
> vkuznets@redhat.com; jasowang@redhat.com
> Subject: Re: [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for
> channels bound to CPU 0
> 
> On Fri, May 29, 2015 at 11:09:02AM -0700, K. Y. Srinivasan wrote:
> > Map target_cpu to target_vcpu using the mapping table.
> >
> 
> It's really hard to tell from this changelog what the user-visible
> effects of this patch are.

We should use the map to transform the guest CPU ID to a VP index, as is
done for the non-performance-critical channels. While CPU 0 is special and
will map to VP index 0, it is good to be consistent.
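The point can be illustrated with a small user-space sketch (the map values
and the helper name are hypothetical stand-ins for hv_context.vp_index[] and
the driver's lookup, not the actual API):

```c
#include <assert.h>

/* Hypothetical guest-CPU -> VP-index map playing the role of
 * hv_context.vp_index[].  The identity values below are made up for
 * illustration; the point is that even CPU 0 goes through the map
 * rather than being hard-coded to VP 0. */
static const unsigned int vp_index[] = { 0, 1, 2, 3 };

/* Look up the VP index for a guest CPU instead of assuming identity. */
static unsigned int cpu_to_vp(unsigned int cpu)
{
	return vp_index[cpu];
}
```

With an identity map the result is unchanged for CPU 0, which is exactly why
going through the table costs nothing while keeping the code path uniform.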

Regards,

K. Y
> 
> regards,
> dan carpenter



* [PATCH 0/3] Drivers: hv: vmbus: Make VMBUS driver NUMA aware
@ 2015-05-29 18:08 K. Y. Srinivasan
  2015-05-29 18:09 ` [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0 K. Y. Srinivasan
  0 siblings, 1 reply; 7+ messages in thread
From: K. Y. Srinivasan @ 2015-05-29 18:08 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Implement CPU affinity for channels based on NUMA topology. Also, allocate all
channel-specific memory from the appropriate NUMA node.

K. Y. Srinivasan (3):
  Drivers: hv: vmbus: Use the vp_index map even for channels bound to
    CPU 0
  Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels
  Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion

 drivers/hv/channel.c      |   14 +++++++-
 drivers/hv/channel_mgmt.c |   74 ++++++++++++++++++++++++++------------------
 include/linux/hyperv.h    |    5 +++
 3 files changed, 61 insertions(+), 32 deletions(-)

-- 
1.7.4.1



* [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0
  2015-05-29 18:08 [PATCH 0/3] Drivers: hv: vmbus: Make VMBUS driver NUMA aware K. Y. Srinivasan
@ 2015-05-29 18:09 ` K. Y. Srinivasan
  2015-05-29 16:59   ` Dan Carpenter
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: K. Y. Srinivasan @ 2015-05-29 18:09 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Map target_cpu to target_vcpu using the mapping table.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/channel_mgmt.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 1f1417d..c3eba37 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -406,7 +406,7 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui
 		 * channel, bind it to cpu 0.
 		 */
 		channel->target_cpu = 0;
-		channel->target_vp = 0;
+		channel->target_vp = hv_context.vp_index[0];
 		return;
 	}
 
-- 
1.7.4.1



* [PATCH 2/3] Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels
  2015-05-29 18:09 ` [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0 K. Y. Srinivasan
  2015-05-29 16:59   ` Dan Carpenter
@ 2015-05-29 18:09   ` K. Y. Srinivasan
  2015-05-29 18:09   ` [PATCH 3/3] Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion K. Y. Srinivasan
  2 siblings, 0 replies; 7+ messages in thread
From: K. Y. Srinivasan @ 2015-05-29 18:09 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan

Channels and sub-channels can be affinitized to VCPUs in the guest. Implement
this affinity in a NUMA-aware way. The current scheme distributes primary
channels uniformly across all available CPUs. The new scheme is NUMA aware:
primary channels are distributed across the available NUMA nodes, while the
sub-channels of a primary channel are distributed among the CPUs in the
NUMA node assigned to the primary channel.
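The hierarchical assignment described above can be sketched in plain C as a
toy model (made-up topology constants; pick_node() and pick_cpu() are
hypothetical stand-ins for the node and CPU selection done in
init_vp_index(), not the driver code):

```c
#define NR_NODES	2
#define CPUS_PER_NODE	4

/* next_node plays the role of next_numa_node_id; alloced[] plays the
 * role of each primary channel's alloced_cpus_in_node cpumask. */
static int next_node;
static unsigned int alloced[NR_NODES];	/* per-node bitmap of used CPUs */

/* Assign a primary channel to a node: simple round robin across nodes. */
static int pick_node(void)
{
	int node = next_node;

	next_node = (next_node + 1) % NR_NODES;
	return node;
}

/* Assign a channel or sub-channel to the next unused CPU in its node;
 * once every CPU in the node has been used, reset the bitmap, like the
 * cpumask_clear() in the patch. */
static int pick_cpu(int node)
{
	unsigned int full = (1u << CPUS_PER_NODE) - 1;
	int cpu;

	if (alloced[node] == full)
		alloced[node] = 0;
	for (cpu = 0; cpu < CPUS_PER_NODE; cpu++) {
		if (!(alloced[node] & (1u << cpu))) {
			alloced[node] |= 1u << cpu;
			return node * CPUS_PER_NODE + cpu; /* global CPU id */
		}
	}
	return -1;	/* unreachable: bitmap was reset above */
}
```

Primary channels thus spread across nodes, and each primary's sub-channels
cycle through the CPUs of that node before any CPU is reused.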

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/channel_mgmt.c |   72 +++++++++++++++++++++++++++------------------
 include/linux/hyperv.h    |    5 +++
 2 files changed, 48 insertions(+), 29 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index c3eba37..4506a66 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -370,25 +370,27 @@ static const struct hv_vmbus_device_id hp_devs[] = {
 /*
  * We use this state to statically distribute the channel interrupt load.
  */
-static u32  next_vp;
+static int next_numa_node_id;
 
 /*
  * Starting with Win8, we can statically distribute the incoming
- * channel interrupt load by binding a channel to VCPU. We
- * implement here a simple round robin scheme for distributing
- * the interrupt load.
- * We will bind channels that are not performance critical to cpu 0 and
- * performance critical channels (IDE, SCSI and Network) will be uniformly
- * distributed across all available CPUs.
+ * channel interrupt load by binding a channel to VCPU.
+ * We do this in a hierarchical fashion:
+ * First distribute the primary channels across available NUMA nodes
+ * and then distribute the subchannels amongst the CPUs in the NUMA
+ * node assigned to the primary channel.
+ *
+ * For pre-win8 hosts or non-performance critical channels we assign the
+ * first CPU in the first NUMA node.
  */
 static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_guid)
 {
 	u32 cur_cpu;
 	int i;
 	bool perf_chn = false;
-	u32 max_cpus = num_online_cpus();
-	struct vmbus_channel *primary = channel->primary_channel, *prev;
-	unsigned long flags;
+	struct vmbus_channel *primary = channel->primary_channel;
+	int next_node;
+	struct cpumask available_mask;
 
 	for (i = IDE; i < MAX_PERF_CHN; i++) {
 		if (!memcmp(type_guid->b, hp_devs[i].guid,
@@ -405,36 +407,48 @@ static void init_vp_index(struct vmbus_channel *channel, const uuid_le *type_gui
 		 * Also if the channel is not a performance critical
 		 * channel, bind it to cpu 0.
 		 */
+		channel->numa_node = 0;
+		cpumask_set_cpu(0, &channel->alloced_cpus_in_node);
 		channel->target_cpu = 0;
 		channel->target_vp = hv_context.vp_index[0];
 		return;
 	}
 
 	/*
-	 * Primary channels are distributed evenly across all vcpus we have.
-	 * When the host asks us to create subchannels it usually makes us
-	 * num_cpus-1 offers and we are supposed to distribute the work evenly
-	 * among the channel itself and all its subchannels. Make sure they are
-	 * all assigned to different vcpus.
+	 * We distribute primary channels evenly across all the available
+	 * NUMA nodes and within the assigned NUMA node we will assign the
+	 * first available CPU to the primary channel.
+	 * The sub-channels will be assigned to the CPUs available in the
+	 * NUMA node evenly.
 	 */
-	if (!primary)
-		cur_cpu = (++next_vp % max_cpus);
-	else {
+	if (!primary) {
+		while (true) {
+			next_node = next_numa_node_id++;
+			if (next_node == nr_node_ids)
+				next_node = next_numa_node_id = 0;
+			if (cpumask_empty(cpumask_of_node(next_node)))
+				continue;
+			break;
+		}
+		channel->numa_node = next_node;
+		primary = channel;
+	}
+
+	if (cpumask_weight(&primary->alloced_cpus_in_node) ==
+	    cpumask_weight(cpumask_of_node(primary->numa_node))) {
 		/*
-		 * Let's assign the first subchannel of a channel to the
-		 * primary->target_cpu+1 and all the subsequent channels to
-		 * the prev->target_cpu+1.
+		 * We have cycled through all the CPUs in the node;
+		 * reset the alloced map.
 		 */
-		spin_lock_irqsave(&primary->lock, flags);
-		if (primary->num_sc == 1)
-			cur_cpu = (primary->target_cpu + 1) % max_cpus;
-		else {
-			prev = list_prev_entry(channel, sc_list);
-			cur_cpu = (prev->target_cpu + 1) % max_cpus;
-		}
-		spin_unlock_irqrestore(&primary->lock, flags);
+		cpumask_clear(&primary->alloced_cpus_in_node);
 	}
 
+	cpumask_xor(&available_mask, &primary->alloced_cpus_in_node,
+		    cpumask_of_node(primary->numa_node));
+
+	cur_cpu = cpumask_next(-1, &available_mask);
+	cpumask_set_cpu(cur_cpu, &primary->alloced_cpus_in_node);
+
 	channel->target_cpu = cur_cpu;
 	channel->target_vp = hv_context.vp_index[cur_cpu];
 }
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 4317cd1..30d3a1f 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -697,6 +697,11 @@ struct vmbus_channel {
 	/* The corresponding CPUID in the guest */
 	u32 target_cpu;
 	/*
+	 * State to manage the CPU affiliation of channels.
+	 */
+	struct cpumask alloced_cpus_in_node;
+	int numa_node;
+	/*
 	 * Support for sub-channels. For high performance devices,
 	 * it will be useful to have multiple sub-channels to support
 	 * a scalable communication infrastructure with the host.
-- 
1.7.4.1



* [PATCH 3/3] Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion
  2015-05-29 18:09 ` [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0 K. Y. Srinivasan
  2015-05-29 16:59   ` Dan Carpenter
  2015-05-29 18:09   ` [PATCH 2/3] Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels K. Y. Srinivasan
@ 2015-05-29 18:09   ` K. Y. Srinivasan
  2 siblings, 0 replies; 7+ messages in thread
From: K. Y. Srinivasan @ 2015-05-29 18:09 UTC (permalink / raw)
  To: gregkh, linux-kernel, devel, olaf, apw, vkuznets, jasowang
  Cc: K. Y. Srinivasan


Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
---
 drivers/hv/channel.c |   14 ++++++++++++--
 1 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 7a1c2db..603ce97 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -73,6 +73,7 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
 	unsigned long flags;
 	int ret, err = 0;
 	unsigned long t;
+	struct page *page;
 
 	spin_lock_irqsave(&newchannel->lock, flags);
 	if (newchannel->state == CHANNEL_OPEN_STATE) {
@@ -87,8 +88,17 @@ int vmbus_open(struct vmbus_channel *newchannel, u32 send_ringbuffer_size,
 	newchannel->channel_callback_context = context;
 
 	/* Allocate the ring buffer */
-	out = (void *)__get_free_pages(GFP_KERNEL|__GFP_ZERO,
-		get_order(send_ringbuffer_size + recv_ringbuffer_size));
+	page = alloc_pages_node(cpu_to_node(newchannel->target_cpu),
+				GFP_KERNEL|__GFP_ZERO,
+				get_order(send_ringbuffer_size +
+				recv_ringbuffer_size));
+
+	if (!page)
+		out = (void *)__get_free_pages(GFP_KERNEL|__GFP_ZERO,
+					       get_order(send_ringbuffer_size +
+					       recv_ringbuffer_size));
+	else
+		out = (void *)page_address(page);
 
 	if (!out) {
 		err = -ENOMEM;
-- 
1.7.4.1


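The allocation strategy in patch 3 — prefer pages on the target CPU's NUMA
node, but fall back to any node rather than failing — can be sketched with a
stub allocator (all names here are hypothetical stand-ins: alloc_on_node()
for alloc_pages_node(), alloc_anywhere() for __get_free_pages()):

```c
#include <stdlib.h>

/* Pretend node 0 is out of memory so the fallback path is exercised. */
static int node_has_memory[2] = { 0, 1 };

/* Stand-in for alloc_pages_node(): node-local allocation that can fail. */
static void *alloc_on_node(int node, size_t size)
{
	return node_has_memory[node] ? malloc(size) : NULL;
}

/* Stand-in for __get_free_pages(): allocation from any node. */
static void *alloc_anywhere(size_t size)
{
	return malloc(size);
}

/* Mirror the patch's control flow: NUMA-local attempt first, then a
 * graceful fallback so channel setup still succeeds. */
static void *alloc_ring_buffer(int node, size_t size)
{
	void *buf = alloc_on_node(node, size);

	if (!buf)
		buf = alloc_anywhere(size);
	return buf;
}
```

The design choice is that node locality is a performance optimization, not a
correctness requirement, so exhausting the preferred node must not fail
vmbus_open().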

* Re: [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0
  2015-05-29 17:41     ` KY Srinivasan
@ 2015-05-31  3:23       ` gregkh
  0 siblings, 0 replies; 7+ messages in thread
From: gregkh @ 2015-05-31  3:23 UTC (permalink / raw)
  To: KY Srinivasan
  Cc: Dan Carpenter, olaf@aepfle.de, jasowang@redhat.com,
	linux-kernel@vger.kernel.org, apw@canonical.com,
	devel@linuxdriverproject.org

On Fri, May 29, 2015 at 05:41:34PM +0000, KY Srinivasan wrote:
> 
> 
> > -----Original Message-----
> > From: Dan Carpenter [mailto:dan.carpenter@oracle.com]
> > Sent: Friday, May 29, 2015 10:00 AM
> > To: KY Srinivasan
> > Cc: gregkh@linuxfoundation.org; linux-kernel@vger.kernel.org;
> > devel@linuxdriverproject.org; olaf@aepfle.de; apw@canonical.com;
> > vkuznets@redhat.com; jasowang@redhat.com
> > Subject: Re: [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for
> > channels bound to CPU 0
> > 
> > On Fri, May 29, 2015 at 11:09:02AM -0700, K. Y. Srinivasan wrote:
> > > Map target_cpu to target_vcpu using the mapping table.
> > >
> > 
> > It's really hard to tell from this changelog what the user-visible
> > effects of this patch are.
> 
> We should use the map to transform the guest CPU ID to a VP index, as is
> done for the non-performance-critical channels. While CPU 0 is special and
> will map to VP index 0, it is good to be consistent.

Then put that in the changelog!

Please fix up and resend the series.

thanks,

greg k-h


end of thread, other threads:[~2015-05-31  3:24 UTC | newest]

Thread overview: 7+ messages
-- links below jump to the message on this page --
2015-05-29 18:08 [PATCH 0/3] Drivers: hv: vmbus: Make VMBUS driver NUMA aware K. Y. Srinivasan
2015-05-29 18:09 ` [PATCH 1/3] Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0 K. Y. Srinivasan
2015-05-29 16:59   ` Dan Carpenter
2015-05-29 17:41     ` KY Srinivasan
2015-05-31  3:23       ` gregkh
2015-05-29 18:09   ` [PATCH 2/3] Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels K. Y. Srinivasan
2015-05-29 18:09   ` [PATCH 3/3] Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion K. Y. Srinivasan
