Re: [RFC 0/3] virtio: NUMA-aware memory allocation

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: kvm@vger.kernel.org, Stefan Hajnoczi <stefanha@redhat.com>,
	virtualization@lists.linux-foundation.org
Subject: Re: [RFC 0/3] virtio: NUMA-aware memory allocation
Date: Mon, 29 Jun 2020 11:28:41 -0400	[thread overview]
Message-ID: <20200629112212-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20200629092646.GC31392@stefanha-x1.localdomain>

On Mon, Jun 29, 2020 at 10:26:46AM +0100, Stefan Hajnoczi wrote:
> On Sun, Jun 28, 2020 at 02:34:37PM +0800, Jason Wang wrote:
> > 
> > On 2020/6/25 下午9:57, Stefan Hajnoczi wrote:
> > > These patches are not ready to be merged because I was unable to measure a
> > > performance improvement. I'm publishing them so they are archived in case
> > > someone picks up this work again in the future.
> > > 
> > > The goal of these patches is to allocate virtqueues and driver state from the
> > > device's NUMA node for optimal memory access latency. Only guests with a vNUMA
> > > topology and virtio devices spread across vNUMA nodes benefit from this.  In
> > > other cases the memory placement is fine and we don't need to take NUMA into
> > > account inside the guest.
> > > 
> > > These patches could be extended to virtio_net.ko and other devices in the
> > > future. I only tested virtio_blk.ko.
> > > 
> > > The benchmark configuration was designed to trigger worst-case NUMA placement:
> > >   * Physical NVMe storage controller on host NUMA node 0

It's possible that numa is not such a big deal for NVMe.
And it's possible that bios misconfigures ACPI reporting NUMA placement
incorrectly.
I think that the best thing to try is to use a ramdisk
on a specific numa node.

> > >   * IOThread pinned to host NUMA node 0
> > >   * virtio-blk-pci device in vNUMA node 1
> > >   * vCPU 0 on host NUMA node 1 and vCPU 1 on host NUMA node 0
> > >   * vCPU 0 in vNUMA node 0 and vCPU 1 in vNUMA node 1
> > > 
> > > The intent is to have .probe() code run on vCPU 0 in vNUMA node 0 (host NUMA
> > > node 1) so that memory is in the wrong NUMA node for the virtio-blk-pci devic=
> > > e.
> > > Applying these patches fixes memory placement so that virtqueues and driver
> > > state is allocated in vNUMA node 1 where the virtio-blk-pci device is located.
> > > 
> > > The fio 4KB randread benchmark results do not show a significant improvement:
> > > 
> > > Name                  IOPS   Error
> > > virtio-blk        42373.79 =C2=B1 0.54%
> > > virtio-blk-numa   42517.07 =C2=B1 0.79%
> > 
> > 
> > I remember I did something similar in vhost by using page_to_nid() for
> > descriptor ring. And I get little improvement as shown here.
> > 
> > Michael reminds that it was probably because all data were cached. So I
> > doubt if the test lacks sufficient stress on the cache ...
> 
> Yes, that sounds likely. If there's no real-world performance
> improvement then I'm happy to leave these patches unmerged.
> 
> Stefan

Well that was for vhost though. This is virtio, which is different.
Doesn't some benchmark put pressure on the CPU cache?

I kind of feel there should be a difference, and the fact there isn't
means there's some other bottleneck somewhere. Might be worth
figuring out.

-- 
MST

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization