From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ben Serebrin <serebrin@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
	netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
	willemb@google.com, venkateshs@google.com, jonolson@google.com,
	willemdebruijn.kernel@gmail.com, rick.jones2@hpe.com,
	jmattson@google.com, linux-s390 <linux-s390@vger.kernel.org>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [PATCH v2 net-next] virtio: Fix affinity for #VCPUs != #queue pairs
Date: Wed, 8 Feb 2017 21:37:41 +0200	[thread overview]
Message-ID: <20170208205353-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20170207181506.36668-1-serebrin@google.com>

On Tue, Feb 07, 2017 at 10:15:06AM -0800, Ben Serebrin wrote:
> From: Benjamin Serebrin <serebrin@google.com>
> 
> If the number of virtio queue pairs is not equal to the
> number of VCPUs, the virtio guest driver doesn't assign
> any CPU affinity to the queue interrupts, nor does it set
> any XPS transmit affinity.  (In contrast, the driver does
> assign both when the VCPU and queue counts are equal, which
> is a good default behavior.)
> 
> Google Compute Engine currently provides 1 queue pair for
> every VCPU, but limits that to a maximum of 32 queue pairs.
> 
> This patch extends the driver's default interrupt-affinity
> and transmit-affinity settings to the case where the queue
> and VCPU counts do not match.  Userspace affinity adjustment
> may still be needed to tune for a given workload.

IIRC irqbalance bails out and avoids touching affinity for an IRQ
once a driver has set an affinity hint on it.  Breaking that isn't
nice.  Please correct me if I'm wrong.
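
For reference -- paraphrasing from memory, not the literal source -- on
virtio-pci with MSI-X, virtqueue_set_affinity() ends up doing roughly:

	cpumask_clear(mask);
	cpumask_set_cpu(cpu, mask);
	irq_set_affinity_hint(irq, mask);

and it's that hint that irqbalance keys off, depending on its
hintpolicy setting (exact/subset/ignore).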

Generally, I wonder: we aren't the only device with a limited number
of queues, so this seems like a problem other drivers share.

> Tested:
> 
> (on a 64-VCPU VM with Debian 8, jessie-backports 4.9.2)
> 
> Without the fix we see all queues affinitized to all CPUs:
> 
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
> 0-63
> [...]
> 0-63
> 
> and we see all TX queues' xps_cpus set to no cores:
> 
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
> 00000000,00000000
> [...]
> 00000000,00000000
> 
> With the fix, we see each queue's interrupt assigned to a single core,
> and XPS affinity set to one unique CPU per TX queue.  (xps_cpus is a
> comma-separated list of 32-bit hex words, rightmost word = CPUs 0-31,
> so 00000001,00000001 below means CPUs 0 and 32.)
> 
> 64 VCPU:
> 
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
> 
> 0-63
> 0
> 0
> 1
> 1
> 2
> 2
> 3
> 3
> 4
> 4
> 5
> 5
> 6
> 6
> 7
> 7
> 8
> 8
> 9
> 9
> 10
> 10
> 11
> 11
> 12
> 12
> 13
> 13
> 14
> 14
> 15
> 15
> 16
> 16
> 17
> 17
> 18
> 18
> 19
> 19
> 20
> 20
> 21
> 21
> 22
> 22
> 23
> 23
> 24
> 24
> 25
> 25
> 26
> 26
> 27
> 27
> 28
> 28
> 29
> 29
> 30
> 30
> 31
> 31
> 0-63
> 0-63
> 0-63
> 0-63
> 
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus;  done
> 
> 00000001,00000001
> 00000002,00000002
> 00000004,00000004
> 00000008,00000008
> 00000010,00000010
> 00000020,00000020
> 00000040,00000040
> 00000080,00000080
> 00000100,00000100
> 00000200,00000200
> 00000400,00000400
> 00000800,00000800
> 00001000,00001000
> 00002000,00002000
> 00004000,00004000
> 00008000,00008000
> 00010000,00010000
> 00020000,00020000
> 00040000,00040000
> 00080000,00080000
> 00100000,00100000
> 00200000,00200000
> 00400000,00400000
> 00800000,00800000
> 01000000,01000000
> 02000000,02000000
> 04000000,04000000
> 08000000,08000000
> 10000000,10000000
> 20000000,20000000
> 40000000,40000000
> 80000000,80000000
> 
> 48 VCPU:
> 
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list;  done
> 0-47
> 0
> 0
> 1
> 1
> 2
> 2
> 3
> 3
> 4
> 4
> 5
> 5
> 6
> 6
> 7
> 7
> 8
> 8
> 9
> 9
> 10
> 10
> 11
> 11
> 12
> 12
> 13
> 13
> 14
> 14
> 15
> 15
> 16
> 16
> 17
> 17
> 18
> 18
> 19
> 19
> 20
> 20
> 21
> 21
> 22
> 22
> 23
> 23
> 24
> 24
> 25
> 25
> 26
> 26
> 27
> 27
> 28
> 28
> 29
> 29
> 30
> 30
> 31
> 31
> 0-47
> 0-47
> 0-47
> 0-47
> 
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus;  done
> 
> 0001,00000001
> 0002,00000002
> 0004,00000004
> 0008,00000008
> 0010,00000010
> 0020,00000020
> 0040,00000040
> 0080,00000080
> 0100,00000100
> 0200,00000200
> 0400,00000400
> 0800,00000800
> 1000,00001000
> 2000,00002000
> 4000,00004000
> 8000,00008000
> 0000,00010000
> 0000,00020000
> 0000,00040000
> 0000,00080000
> 0000,00100000
> 0000,00200000
> 0000,00400000
> 0000,00800000
> 0000,01000000
> 0000,02000000
> 0000,04000000
> 0000,08000000
> 0000,10000000
> 0000,20000000
> 0000,40000000
> 0000,80000000
> 
> Acked-by: Willem de Bruijn <willemb@google.com>
> Acked-by: Jim Mattson <jmattson@google.com>
> Acked-by: Venkatesh Srinivas <venkateshs@google.com>
> 
> Signed-off-by: Ben Serebrin <serebrin@google.com>


What happens if you have more than one virtio net device?  With this
scheme they will each pin queue 0 to CPU 0, queue 1 to CPU 1, and so
on, piling their interrupts onto the same low-numbered cores.

> ---
>  drivers/net/virtio_net.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 765c2d6358da..0dc3a102bfc4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1502,20 +1502,44 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
>  	 * queue pairs, we let the queue pairs to be private to one cpu by
>  	 * setting the affinity hint to eliminate the contention.
>  	 */
> -	if (vi->curr_queue_pairs == 1 ||
> -	    vi->max_queue_pairs != num_online_cpus()) {
> +	if (vi->curr_queue_pairs == 1) {
>  		virtnet_clean_affinity(vi, -1);
>  		return;
>  	}
>  
> +	/* If there are more cpus than queues, then assign the queues'
> +	 * interrupts to the first cpus until we run out.
> +	 */
>  	i = 0;
>  	for_each_online_cpu(cpu) {
> +		if (i == vi->max_queue_pairs)
> +			break;
>  		virtqueue_set_affinity(vi->rq[i].vq, cpu);
>  		virtqueue_set_affinity(vi->sq[i].vq, cpu);
> -		netif_set_xps_queue(vi->dev, cpumask_of(cpu), i);
>  		i++;
>  	}
>  
> +	/* Stripe the XPS affinities across the online CPUs.
> +	 * Hyperthread pairs are typically assigned such that Linux's
> +	 * CPU X and X + (numcpus / 2) are hyperthread twins, so we cause
> +	 * hyperthread twins to share TX queues, in the case where there are
> +	 * more cpus than queues.

Couldn't you add some kind of API so that we don't need to make
assumptions like this?  E.g. "give me a new CPU core to use for an
interrupt".  That would address the multiple-device issue too.
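
Something along these lines, say (names made up, no such API exists
today -- just to illustrate the idea of a core-owned allocator that
hands out CPUs round-robin across all requesters):

	/* hypothetical core API, not in the tree */
	int irq_affinity_alloc_cpu(void);	/* next least-loaded CPU */
	void irq_affinity_free_cpu(int cpu);	/* give it back */

	for (i = 0; i < vi->max_queue_pairs; i++) {
		int cpu = irq_affinity_alloc_cpu();

		virtqueue_set_affinity(vi->rq[i].vq, cpu);
		virtqueue_set_affinity(vi->sq[i].vq, cpu);
	}

Then a second virtio-net device would naturally land on different
cores instead of starting from CPU 0 again.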


> +	 */
> +	for (i = 0; i < vi->max_queue_pairs; i++) {
> +		struct cpumask mask;
> +		int skip = i;
> +
> +		cpumask_clear(&mask);
> +		for_each_online_cpu(cpu) {
> +			while (skip--)
> +				cpu = cpumask_next(cpu, cpu_online_mask);
> +			if (cpu < num_possible_cpus())
> +				cpumask_set_cpu(cpu, &mask);
> +			skip = vi->max_queue_pairs - 1;
> +		}
> +		netif_set_xps_queue(vi->dev, &mask, i);
> +	}
> +

Doesn't look like this will handle the case of num cpus < num queues
well.  E.g. with 2 online CPUs and 4 queue pairs, queues 2 and 3 never
get an interrupt affinity from the loop above, and here the skip logic
walks off the end of the online mask so their XPS masks come out empty.
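
Something like the following untested sketch would cover both
directions (I think it keeps your 64-VCPU twin pairing, since online
index j then serves queues j and j + 32), but feel free to come up
with something better:

	for (i = 0; i < vi->max_queue_pairs; i++) {
		struct cpumask mask;
		int j = 0;

		cpumask_clear(&mask);
		for_each_online_cpu(cpu) {
			/* CPU with online index j serves queue j % nq;
			 * queue i wraps around the online CPUs, so no
			 * queue is left with an empty mask.
			 */
			if (j % vi->max_queue_pairs == i % num_online_cpus())
				cpumask_set_cpu(cpu, &mask);
			j++;
		}
		netif_set_xps_queue(vi->dev, &mask, i);
	}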

>  	vi->affinity_hint_set = true;
>  }
>  

Thread overview: 13+ messages
2017-02-07 18:15 [PATCH v2 net-next] virtio: Fix affinity for #VCPUs != #queue pairs Ben Serebrin
2017-02-08 18:37 ` David Miller
2017-02-08 19:37 ` Michael S. Tsirkin [this message]
2017-02-14 19:17   ` Benjamin Serebrin
2017-02-14 21:05     ` Michael S. Tsirkin
2017-02-15 16:50       ` Willem de Bruijn
2017-02-15 17:42         ` Michael S. Tsirkin
2017-02-15 18:27           ` Benjamin Serebrin
2017-02-15 19:17             ` Michael S. Tsirkin
2017-02-15 21:38               ` Benjamin Serebrin
2017-02-15 21:49                 ` Michael S. Tsirkin
2017-02-15 22:13                   ` Benjamin Serebrin
2017-02-15 15:23     ` Michael S. Tsirkin
