From: "Michael S. Tsirkin" <mst@redhat.com>
To: Ben Serebrin <serebrin@google.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>,
netdev@vger.kernel.org, jasowang@redhat.com, davem@davemloft.net,
willemb@google.com, venkateshs@google.com, jonolson@google.com,
willemdebruijn.kernel@gmail.com, rick.jones2@hpe.com,
jmattson@google.com, linux-s390 <linux-s390@vger.kernel.org>,
"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Subject: Re: [PATCH v2 net-next] virtio: Fix affinity for #VCPUs != #queue pairs
Date: Wed, 8 Feb 2017 21:37:41 +0200
Message-ID: <20170208205353-mutt-send-email-mst@kernel.org>
In-Reply-To: <20170207181506.36668-1-serebrin@google.com>
On Tue, Feb 07, 2017 at 10:15:06AM -0800, Ben Serebrin wrote:
> From: Benjamin Serebrin <serebrin@google.com>
>
> If the number of virtio queue pairs is not equal to the
> number of VCPUs, the virtio guest driver doesn't assign
> any CPU affinity for the queue interrupts or the xps
> aggregation interrupt. (In contrast, the driver does assign
> both if the counts of VCPUs and queues are equal, which is a good
> default behavior.)
>
> Google Compute Engine currently provides 1 queue pair for
> every VCPU, but caps that at a maximum of 32 queue pairs.
>
> This code extends the driver's default interrupt affinity
> and transmit affinity settings for the case where there
> are mismatching queue and VCPU counts. Userspace affinity
> adjustment may always be needed to tune for a given workload.
IIRC irqbalance will bail out and avoid touching affinity
if you set affinity from the driver. Breaking that is not nice.
Please correct me if I'm wrong.
Generally, I wonder: we aren't the only device with a limited
number of queues.
> Tested:
>
> (on a 64-VCPU VM with debian 8, jessie-backports 4.9.2)
>
> Without the fix we see all queues affinitized to all CPUs:
>
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list; done
> 0-63
> [...]
> 0-63
>
> and we see all TX queues' xps_cpus affinitized to no cores:
>
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
> 00000000,00000000
> [...]
> 00000000,00000000
>
> With the fix, we see each queue assigned to a single core,
> and xps affinity set to 1 unique cpu per TX queue.
>
> 64 VCPU:
>
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list; done
>
> 0-63
> 0
> 0
> 1
> 1
> 2
> 2
> 3
> 3
> 4
> 4
> 5
> 5
> 6
> 6
> 7
> 7
> 8
> 8
> 9
> 9
> 10
> 10
> 11
> 11
> 12
> 12
> 13
> 13
> 14
> 14
> 15
> 15
> 16
> 16
> 17
> 17
> 18
> 18
> 19
> 19
> 20
> 20
> 21
> 21
> 22
> 22
> 23
> 23
> 24
> 24
> 25
> 25
> 26
> 26
> 27
> 27
> 28
> 28
> 29
> 29
> 30
> 30
> 31
> 31
> 0-63
> 0-63
> 0-63
> 0-63
>
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
>
> 00000001,00000001
> 00000002,00000002
> 00000004,00000004
> 00000008,00000008
> 00000010,00000010
> 00000020,00000020
> 00000040,00000040
> 00000080,00000080
> 00000100,00000100
> 00000200,00000200
> 00000400,00000400
> 00000800,00000800
> 00001000,00001000
> 00002000,00002000
> 00004000,00004000
> 00008000,00008000
> 00010000,00010000
> 00020000,00020000
> 00040000,00040000
> 00080000,00080000
> 00100000,00100000
> 00200000,00200000
> 00400000,00400000
> 00800000,00800000
> 01000000,01000000
> 02000000,02000000
> 04000000,04000000
> 08000000,08000000
> 10000000,10000000
> 20000000,20000000
> 40000000,40000000
> 80000000,80000000
>
> 48 VCPU:
>
> cd /proc/irq
> for i in `seq 24 92` ; do sudo grep ".*" $i/smp_affinity_list; done
> 0-47
> 0
> 0
> 1
> 1
> 2
> 2
> 3
> 3
> 4
> 4
> 5
> 5
> 6
> 6
> 7
> 7
> 8
> 8
> 9
> 9
> 10
> 10
> 11
> 11
> 12
> 12
> 13
> 13
> 14
> 14
> 15
> 15
> 16
> 16
> 17
> 17
> 18
> 18
> 19
> 19
> 20
> 20
> 21
> 21
> 22
> 22
> 23
> 23
> 24
> 24
> 25
> 25
> 26
> 26
> 27
> 27
> 28
> 28
> 29
> 29
> 30
> 30
> 31
> 31
> 0-47
> 0-47
> 0-47
> 0-47
>
> cd /sys/class/net/eth0/queues
> for i in `seq 0 31` ; do sudo grep ".*" tx-$i/xps_cpus; done
>
> 0001,00000001
> 0002,00000002
> 0004,00000004
> 0008,00000008
> 0010,00000010
> 0020,00000020
> 0040,00000040
> 0080,00000080
> 0100,00000100
> 0200,00000200
> 0400,00000400
> 0800,00000800
> 1000,00001000
> 2000,00002000
> 4000,00004000
> 8000,00008000
> 0000,00010000
> 0000,00020000
> 0000,00040000
> 0000,00080000
> 0000,00100000
> 0000,00200000
> 0000,00400000
> 0000,00800000
> 0000,01000000
> 0000,02000000
> 0000,04000000
> 0000,08000000
> 0000,10000000
> 0000,20000000
> 0000,40000000
> 0000,80000000
>
> Acked-by: Willem de Bruijn <willemb@google.com>
> Acked-by: Jim Mattson <jmattson@google.com>
> Acked-by: Venkatesh Srinivas <venkateshs@google.com>
>
> Signed-off-by: Ben Serebrin <serebrin@google.com>
What happens if you have more than 1 virtio net device?
> ---
> drivers/net/virtio_net.c | 30 +++++++++++++++++++++++++++---
> 1 file changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 765c2d6358da..0dc3a102bfc4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1502,20 +1502,44 @@ static void virtnet_set_affinity(struct virtnet_info *vi)
> * queue pairs, we let the queue pairs to be private to one cpu by
> * setting the affinity hint to eliminate the contention.
> */
> - if (vi->curr_queue_pairs == 1 ||
> - vi->max_queue_pairs != num_online_cpus()) {
> + if (vi->curr_queue_pairs == 1) {
> virtnet_clean_affinity(vi, -1);
> return;
> }
>
> + /* If there are more cpus than queues, then assign the queues'
> + * interrupts to the first cpus until we run out.
> + */
> i = 0;
> for_each_online_cpu(cpu) {
> + if (i == vi->max_queue_pairs)
> + break;
> virtqueue_set_affinity(vi->rq[i].vq, cpu);
> virtqueue_set_affinity(vi->sq[i].vq, cpu);
> - netif_set_xps_queue(vi->dev, cpumask_of(cpu), i);
> i++;
> }
>
> + /* Stripe the XPS affinities across the online CPUs.
> + * Hyperthread pairs are typically assigned such that Linux's
> + * CPU X and X + (numcpus / 2) are hyperthread twins, so we cause
> + * hyperthread twins to share TX queues, in the case where there are
> + * more cpus than queues.
Couldn't you add some kind of API so that we don't need to make
assumptions like this? E.g. "give me a new CPU core to use for an
interrupt"?
That would address the multiple-device issue too.
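Very roughly, something like the sketch below is what I have in mind.
The helper name and the plain round-robin policy are made up for
illustration only; a real version would have to consider NUMA, CPU
hotplug and IRQ unbinding.

#include <linux/atomic.h>
#include <linux/cpumask.h>
#include <linux/numa.h>

/* Hypothetical sketch only: a shared core-allocation helper, so that
 * drivers don't each open-code their own striping policy.
 */
static atomic_t spread_counter = ATOMIC_INIT(-1);

/* Hand out online CPUs round-robin across all callers/devices. */
static int irq_pick_spread_cpu(void)
{
	unsigned int n = atomic_inc_return(&spread_counter);

	/* cpumask_local_spread() walks the online mask, preferring
	 * CPUs local to the given NUMA node.
	 */
	return cpumask_local_spread(n % num_online_cpus(), NUMA_NO_NODE);
}

A driver with a limited number of queues would then call this once per
queue interrupt, and two devices would naturally land on different
cores instead of both piling onto CPU 0 upwards.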
> + */
> + for (i = 0; i < vi->max_queue_pairs; i++) {
> + struct cpumask mask;
> + int skip = i;
> +
> + cpumask_clear(&mask);
> + for_each_online_cpu(cpu) {
> + while (skip--)
> + cpu = cpumask_next(cpu, cpu_online_mask);
> + if (cpu < num_possible_cpus())
> + cpumask_set_cpu(cpu, &mask);
> + skip = vi->max_queue_pairs - 1;
> + }
> + netif_set_xps_queue(vi->dev, &mask, i);
> + }
> +
Doesn't look like this will handle the case of num cpus < num queues well.
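To make that concrete, here is a quick userspace approximation of the
striping loop above with 2 online CPUs and 4 queue pairs (my own
hand-rolled model of the cpumask iteration, so please double-check):
queues 2 and 3 end up with an empty XPS mask.

#include <stdio.h>

/* Stand-ins: 2 CPUs online and possible, 4 queue pairs. */
#define NR_ONLINE	2
#define NR_POSSIBLE	2
#define QUEUES		4

/* Crude model of cpumask_next() over the online mask: returns the
 * next online CPU, or NR_POSSIBLE as the "end" marker.
 */
static int next_online(int cpu)
{
	return cpu + 1 < NR_ONLINE ? cpu + 1 : NR_POSSIBLE;
}

int main(void)
{
	int i, cpu, skip;

	for (i = 0; i < QUEUES; i++) {
		unsigned long mask = 0;

		skip = i;
		/* mirrors for_each_online_cpu(cpu) { ... } in the patch */
		for (cpu = next_online(-1); cpu < NR_POSSIBLE;
		     cpu = next_online(cpu)) {
			while (skip--)
				cpu = next_online(cpu);
			if (cpu < NR_POSSIBLE)
				mask |= 1UL << cpu;
			skip = QUEUES - 1;
		}
		printf("queue %d: xps mask %#lx\n", i, mask);
	}
	return 0;
}

So for the queues past the last CPU we end up calling
netif_set_xps_queue() with an empty mask, which doesn't look
intentional.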
> vi->affinity_hint_set = true;
> }
>