From: Venkatesh Srinivas <venkateshs@google.com>
To: "Xie, Huawei" <huawei.xie@intel.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	"virtualization@lists.linux-foundation.org"
	<virtualization@lists.linux-foundation.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	David Matlack <dmatlack@google.com>,
	KVM list <kvm@vger.kernel.org>,
	"luto@kernel.org" <luto@kernel.org>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Venkatesh Srinivas <vsrinivas@ops101.org>
Subject: Re: [PATCH] virtio_ring: Shadow available ring flags & index
Date: Fri, 20 Nov 2015 10:30:11 -0800	[thread overview]
Message-ID: <20151120183011.GA24228@google.com> (raw)
In-Reply-To: <C37D651A908B024F974696C65296B57B4B19D2A4@SHSMSX101.ccr.corp.intel.com>

On Thu, Nov 19, 2015 at 04:15:48PM +0000, Xie, Huawei wrote:
> On 11/18/2015 12:28 PM, Venkatesh Srinivas wrote:
> > On Tue, Nov 17, 2015 at 08:08:18PM -0800, Venkatesh Srinivas wrote:
> >> On Mon, Nov 16, 2015 at 7:46 PM, Xie, Huawei <huawei.xie@intel.com> wrote:
> >>
> >>> On 11/14/2015 7:41 AM, Venkatesh Srinivas wrote:
> >>>> On Wed, Nov 11, 2015 at 02:34:33PM +0200, Michael S. Tsirkin wrote:
> >>>>> On Tue, Nov 10, 2015 at 04:21:07PM -0800, Venkatesh Srinivas wrote:
> >>>>>> Improves cacheline transfer flow of available ring header.
> >>>>>>
> >>>>>> Virtqueues are implemented as a pair of rings, one producer->consumer
> >>>>>> avail ring and one consumer->producer used ring; preceding the
> >>>>>> avail ring in memory are two contiguous u16 fields -- avail->flags
> >>>>>> and avail->idx. A producer posts work by writing to avail->idx and
> >>>>>> a consumer reads avail->idx.
> >>>>>>
> >>>>>> The flags and idx fields only need to be written by a producer CPU
> >>>>>> and only read by a consumer CPU; when the producer and consumer are
> >>>>>> running on different CPUs and the virtio_ring code is structured to
> >>>>>> only have source writes/sink reads, the avail header cacheline can
> >>>>>> transfer core -> core while staying in the 'M' state. This flow
> >>>>>> optimizes core -> core bandwidth on certain CPUs.
> >>>>>>
> >>>>>> (see: "Software Optimization Guide for AMD Family 15h Processors",
> >>>>>> Section 11.6; similar language appears in the 10h guide and should
> >>>>>> apply to CPUs w/ exclusive caches, using LLC as a transfer cache)
> >>>>>>
> >>>>>> Unfortunately the existing virtio_ring code issued reads to the
> >>>>>> avail->idx and read-modify-writes to avail->flags on the producer.
> >>>>>>
> >>>>>> This change shadows the flags and index fields in producer memory;
> >>>>>> the vring code now reads from the shadows and only ever writes to
> >>>>>> avail->flags and avail->idx, allowing the cacheline to transfer
> >>>>>> core -> core optimally.
> >>>>> Sounds logical, I'll apply this after a  bit of testing
> >>>>> of my own, thanks!
> >>>> Thanks!
> >>> Venkatesh:
> >>> Is it that your patch only applies to CPUs w/ exclusive caches?
> >> No --- it applies when the inter-cache coherence flow is optimized by
> >> 'M' -> 'M' transfers and when producer reads might interfere w/
> >> consumer prefetchw/reads. The AMD Optimization guides have specific
> >> language on this subject, but other platforms may benefit.
> >> (see Intel #'s below)
> For the core-to-core case (not an HT pair), after the consumer reads that
> 'M' cache line for avail->idx, is that line still in the producer core's
> L1 data cache, with its state changing from M -> O?

Textbook MOESI would not allow that state combination -- when the consumer
gets the line in 'M' state, the producer cannot hold it in 'O' state.

On the AMD Piledriver, per the Optimization guide, I use PREFETCHW/Load to
get the line in 'M' state on the consumer (invalidating it in the producer's
cache):

"* Use PREFETCHW on the consumer side, even if the consumer will not modify
   the data"

That, plus the "Optimizing Inter-Core Data Transfer" section, implies that
PREFETCHW + MOV will cause the consumer to load the line into 'M' state.
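For illustration only -- a minimal consumer-side sketch of that pattern
(names invented, not from the patch), assuming GCC/Clang's
__builtin_prefetch, which can emit PREFETCHW on CPUs that support it
(e.g. with -mprfchw on x86):

```c
#include <assert.h>
#include <stdint.h>

/* Shared avail ring header: two contiguous u16 fields, as in virtio. */
struct avail_hdr {
	uint16_t flags;
	uint16_t idx;
};

/*
 * Consumer-side read of avail->idx preceded by a write-intent prefetch.
 * rw=1 requests write intent (PREFETCHW where available); locality=3
 * asks to keep the line in all cache levels. The prefetch is a hint,
 * so this compiles and runs correctly even where PREFETCHW is absent.
 */
static uint16_t consumer_read_idx(const volatile struct avail_hdr *a)
{
	__builtin_prefetch((const void *)a, 1, 3);
	return a->idx;
}
```

The prefetch is purely a performance hint; the returned value is whatever
the plain load of a->idx observes.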

PREFETCHW was not available on Intel CPUs pre-Broadwell; from the public
documentation alone, I don't think we can tell what transition the producer's
cacheline undergoes on these cores. For that matter, the latest documentation
I can find (for Nehalem) indicates there was no 'O' state -- Nehalem
implemented MESIF, not MOESI.
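For concreteness, here is a hypothetical C sketch of the producer-side
shadowing idea the patch describes (structure and names invented for
illustration; the real code is in drivers/virtio/virtio_ring.c): the
producer reads only its private copies and issues plain stores to the
shared header, so it never pulls the cacheline back for a read or RMW.

```c
#include <assert.h>
#include <stdint.h>

/* Shared avail ring header: two contiguous u16 fields, as in virtio. */
struct avail_hdr {
	uint16_t flags;
	uint16_t idx;
};

struct producer {
	struct avail_hdr *avail;	/* shared with the consumer */
	uint16_t shadow_flags;		/* producer-private copies */
	uint16_t shadow_idx;
};

/* Post one buffer: bump the shadow, then store -- never a read or RMW
 * of the shared field, so the cacheline can stay 'M' on the consumer
 * until the store. */
static void producer_post(struct producer *p)
{
	p->shadow_idx++;
	p->avail->idx = p->shadow_idx;
}

/* Suppress notifications: test the shadow, write the shared flags only
 * if the bit actually changes (bit 0 stands in for a
 * VRING_AVAIL_F_NO_INTERRUPT-style flag). */
static void producer_disable_cb(struct producer *p)
{
	if (!(p->shadow_flags & 1)) {
		p->shadow_flags |= 1;
		p->avail->flags = p->shadow_flags;
	}
}
```

The real patch additionally publishes idx with the proper memory barriers
and endian conversion; the sketch elides those to show only the
read-from-shadow / write-to-shared flow.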

HTH,
-- vs;


Thread overview: 9+ messages
2015-11-11  0:21 [PATCH] virtio_ring: Shadow available ring flags & index Venkatesh Srinivas
2015-11-11 12:34 ` Michael S. Tsirkin
2015-11-13 23:41   ` Venkatesh Srinivas
2015-11-17  3:46     ` Xie, Huawei
2015-11-18  4:08       ` Venkatesh Srinivas via Virtualization
2015-11-18  4:28         ` Venkatesh Srinivas
2015-11-19 16:15           ` Xie, Huawei
2015-11-20 18:30             ` Venkatesh Srinivas [this message]
2015-11-23 16:46               ` Xie, Huawei
