From: "Michael S. Tsirkin" <mst@redhat.com>
To: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: David Miller <davem@davemloft.net>,
caleb.raitto@gmail.com, Jason Wang <jasowang@redhat.com>,
Network Development <netdev@vger.kernel.org>,
Caleb Raitto <caraitto@google.com>
Subject: Re: [PATCH net-next] virtio_net: force_napi_tx module param.
Date: Thu, 2 Aug 2018 01:25:28 +0300
Message-ID: <20180802012405-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAF=yD-JVj9SLNBftB=ut2-duhi4HQdLBByOHJaZDdavLabD+OA@mail.gmail.com>

On Wed, Aug 01, 2018 at 11:46:14AM -0400, Willem de Bruijn wrote:
> On Tue, Jul 31, 2018 at 8:34 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Sun, Jul 29, 2018 at 05:32:56PM -0400, Willem de Bruijn wrote:
> > > On Sun, Jul 29, 2018 at 12:01 PM David Miller <davem@davemloft.net> wrote:
> > > >
> > > > From: Caleb Raitto <caleb.raitto@gmail.com>
> > > > Date: Mon, 23 Jul 2018 16:11:19 -0700
> > > >
> > > > > From: Caleb Raitto <caraitto@google.com>
> > > > >
> > > > > The driver disables tx napi if it's not certain that completions will
> > > > > be processed affine with tx service.
> > > > >
> > > > > Its heuristic doesn't account for some scenarios where it is, such as
> > > > > when the queue pair count matches the core but not hyperthread count.
> > > > >
> > > > > Allow userspace to override the heuristic. This is an alternative
> > > > > solution to that in the linked patch. That added more logic in the
> > > > > kernel for these cases, but the agreement was that this was better left
> > > > > to user control.
> > > > >
> > > > > Do not expand the existing napi_tx variable to a ternary value,
> > > > > because doing so can break user applications that expect
> > > > > boolean ('Y'/'N') instead of integer output. Add a new param instead.
> > > > >
> > > > > Link: https://patchwork.ozlabs.org/patch/725249/
> > > > > Acked-by: Willem de Bruijn <willemb@google.com>
> > > > > Acked-by: Jon Olson <jonolson@google.com>
> > > > > Signed-off-by: Caleb Raitto <caraitto@google.com>
> > > >
> > > > So I looked into the history surrounding these issues.
> > > >
> > > > First of all, it always ends up turning out crummy when drivers start
> > > > to set affinities themselves. The worst possible case is to do it
> > > > _conditionally_, and that is exactly what virtio_net is doing.
> > > >
> > > > From the user's perspective, this provides a really bad experience.
> > > >
> > > > So if I have a 32-queue device and there are 32 cpus, you'll do all
> > > > the affinity settings, stopping irqbalance from doing anything,
> > > > right?
> > > >
> > > > So if I add one more cpu, you'll say "oops, no idea what to do in
> > > > this situation" and not touch the affinities at all?
> > > >
> > > > That makes no sense at all.
> > > >
> > > > If the driver is going to set affinities at all, OWN that decision
> > > > and set it all the time to something reasonable.
> > > >
> > > > Or accept that you shouldn't be touching this stuff in the first place
> > > > and leave the affinities alone.
> > > >
> > > > Right now we're kinda in a situation where the driver has been setting
> > > > affinities in the ncpus==nqueues cases for some time, so we can't stop
> > > > doing it.
> > > >
> > > > Which means we have to set them in all cases to make the user
> > > > experience sane again.
> > > >
> > > > I looked at the linked to patch again:
> > > >
> > > > https://patchwork.ozlabs.org/patch/725249/
> > > >
> > > > And I think the strategy should be made more generic, to get rid of
> > > > the hyperthreading assumptions. I also agree that the "assign
> > > > to first N cpus" logic doesn't make much sense either.
> > > >
> > > > Just distribute across the available cpus evenly, and be done with it.
> > >
> > > Sounds good to me.
> >
> > So e.g. we could set an affinity hint to a group of CPUs that
> > might transmit to this queue.
>
> We also want to set the xps mask for all cpus in the group to this queue.
>
> Is there a benefit over explicitly choosing one cpu from the set, btw?

If only one CPU actually transmits on this queue then probably yes.
And virtio doesn't know whether that's the case.

> I assumed striping. Something along the lines of
>
> int stripe = max_t(int, num_online_cpus() / vi->curr_queue_pairs, 1);
> int vq = 0;
> int i = 0;
> int cpu;
>
> cpumask_clear(xps_mask);
>
> for_each_online_cpu(cpu) {
>     cpumask_set_cpu(cpu, xps_mask);
>
>     /* last cpu of a stripe: bind the queue pair and its xps mask */
>     if ((i + 1) % stripe == 0) {
>         virtqueue_set_affinity(vi->rq[vq].vq, cpu);
>         virtqueue_set_affinity(vi->sq[vq].vq, cpu);
>         netif_set_xps_queue(vi->dev, xps_mask, vq);
>         cpumask_clear(xps_mask);
>         vq++;
>     }
>     i++;
> }
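
For illustration only, the striping sketched above can be modeled in
plain userspace C. This is a sketch, not driver code: struct queue_map,
stripe_queues() and MAX_CPUS are hypothetical names, CPUs are assumed to
be numbered 0..ncpus-1 and all online, and a 64-bit integer stands in
for a cpumask. The quoted loop does not say what happens to CPUs left
over when the CPU count is not a multiple of the queue pair count;
folding them into the last queue's mask is an assumption made here.

```c
#include <assert.h>

#define MAX_CPUS 64 /* hypothetical bound: one bit per CPU in a 64-bit mask */

/* Hypothetical result record for this model; not a kernel structure. */
struct queue_map {
    unsigned long long xps_mask; /* bit n set: CPU n transmits on this queue */
    int irq_cpu;                 /* CPU chosen for the affinity hint, -1 if none */
};

/*
 * Walk CPUs 0..ncpus-1 in order, group them into stripes of
 * max(ncpus / nqueues, 1) CPUs, and hand each stripe to one queue pair.
 * The last CPU of a stripe is the one used for the affinity hint,
 * mirroring the quoted loop. map must have room for nqueues entries.
 */
static void stripe_queues(int ncpus, int nqueues, struct queue_map *map)
{
    unsigned long long mask = 0;
    int stripe, cpu, vq;

    assert(ncpus > 0 && ncpus <= MAX_CPUS);
    assert(nqueues > 0);

    stripe = ncpus / nqueues;
    if (stripe < 1)
        stripe = 1;

    /* Start with every queue unassigned. */
    for (vq = 0; vq < nqueues; vq++) {
        map[vq].xps_mask = 0;
        map[vq].irq_cpu = -1;
    }

    vq = 0;
    for (cpu = 0; cpu < ncpus; cpu++) {
        mask |= 1ULL << cpu;

        /* End of a stripe and a queue is still free: bind them. */
        if ((cpu + 1) % stripe == 0 && vq < nqueues) {
            map[vq].xps_mask = mask;
            map[vq].irq_cpu = cpu;
            mask = 0;
            vq++;
        }
    }

    /* Assumption: leftover CPUs join the last queue's transmit mask. */
    if (mask && vq > 0)
        map[vq - 1].xps_mask |= mask;
}
```

With 8 CPUs and 4 queue pairs this gives queue 0 the CPUs 0-1 with the
hint on CPU 1, queue 1 the CPUs 2-3 with the hint on CPU 3, and so on.
With fewer CPUs than queues, the trailing queues stay unassigned.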