All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>,
	Paolo Bonzini <pbonzini@redhat.com>,
	qemu-devel@nongnu.org,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 2/3] exec: simplify address_space_get_iotlb_entry
Date: Wed, 14 Jun 2017 21:34:52 +0300	[thread overview]
Message-ID: <20170614213418-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <20170612040458.GB25059@pxdev.xzpeter.org>

On Mon, Jun 12, 2017 at 12:04:58PM +0800, Peter Xu wrote:
> On Mon, Jun 12, 2017 at 06:07:04AM +0300, Michael S. Tsirkin wrote:
> > On Mon, Jun 12, 2017 at 10:34:43AM +0800, Peter Xu wrote:
> > > On Sun, Jun 11, 2017 at 08:10:15PM +0800, David Gibson wrote:
> > > > On Sun, Jun 11, 2017 at 01:09:26PM +0300, Michael S. Tsirkin wrote:
> > > > > On Fri, Jun 09, 2017 at 09:58:47AM +0800, Peter Xu wrote:
> > > > > > > > The problem is that when I was fixing the problem that vhost had with
> > > > > > > > PT (a764040, "exec: abstract address_space_do_translate()"), I did
> > > > > > > > broke the IOTLB translation a bit (it was using page masks). IMHO we
> > > > > > > > need to fix it first for correctness (patch 1/2).
> > > > > > > > 
> > > > > > > > For patch 3, if we can have Jason's patch to allow dynamic
> > > > > > > > iommu_platform switching, that'll be the best, then I can rewrite
> > > > > > > > patch 3 with the switching logic rather than caching anything. But
> > > > > > > > IMHO that can be separated from patch 1/2 if you like.
> > > > > > > > 
> > > > > > > > Or do you have better suggestion on how should we fix it?
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > 
> > > > > > > Can we drop masks completely and replace with length? I think we
> > > > > > > should do that instead of trying to fix masks.
> > > > > > 
> > > > > > Do you mean to modify IOMMUTLBEntry.addr_mask into length?
> > > > > 
> > > > > I think it's better than alternatives.
> > > > > 
> > > > > > Again, I am not sure this is good... At least we need to get ack from
> > > > > > David since spapr should be the initial user of it, and possibly also
> > > > > > Alex since vfio should be assuming that (IIUC both in QEMU and kernel)
> > > > > > addr_mask is page masks rather than arbirary length.
> > > > > > 
> > > > > > (CC Alex)
> > > > > > 
> > > > > > Thanks,
> > > > > 
> > > > > Callbacks that need powers of two can easily split up the range.
> > > > 
> > > > I think I missed part of the thread.  What's the original use case for
> > > > non-power-of-two IOTLB entries?  It certainly won't happen on Power.
> > > 
> > > Currently address_space_get_iotlb_entry() didn't really follow the
> > > rule, addr_mask can be arbitary length. This series tried to fix it,
> > > while Michael was questioning about whether we should really fix that
> > > at all.
> > > 
> > > Michael,
> > > 
> > > Even if for performance's sake, I should still think we should fix it.
> > > Let's consider a most simple worst case: we have a single page mapped
> > > with IOVA range (2M page):
> > > 
> > >  [0x0, 0x200000)
> > > 
> > > And if guest access IOVA using the following patern:
> > > 
> > >  0x1fffff, 0x1ffffe, 0x1ffffd, ...
> > > 
> > > Then now we'll get this:
> > > 
> > > - request 0x1fffff, cache miss, will get iotlb [0x1fffff, 0x200000)
> > > - request 0x1ffffe, cache miss, will get iotlb [0x1ffffe, 0x200000)
> > > - request 0x1ffffd, cache miss, will get iotlb [0x1ffffd, 0x200000)
> > > - ...
> > 
> > We pass an offset too, do we not? So callee can figure out
> > the region starts at 0x0 and avoid 2nd and 3rd misses.
> 
> Here when you say "offset", do you mean the offset in
> MemoryRegionSection?
> 
> In address_space_get_iotlb_entry() we have this:
> 
>     section = address_space_do_translate(as, addr, &xlat, &plen,
>                                          is_write, false);
> 
> One thing to mention is that, imho we cannot really assume the xlat is
> valid on the whole "section" range - the section can be a huge GPA
> range, while the xlat may only be valid on a single 4K page. The only
> safe region we can use here is (xlat, xlat+plen). Outside that, we
> should know nothing valid.
> 
> Please correct me if I didn't really catch the point..

IIUC section is the translation result. If so all of it is valid,
not just one page.

> > 
> > 
> > > We'll all cache miss along the way until we access 0x0. While if we
> > > are with page mask, we'll get:
> > > 
> > > - request 0x1fffff, cache miss, will get iotlb [0x0, 0x200000)
> > > - request 0x1ffffe, cache hit
> > > - request 0x1ffffd, cache hit
> > > - ...
> > > 
> > > We'll only miss at the first IO.
> > 
> > I think we should send as much info as we can.
> > There should be a way to find full region start and length.
> 
> Thanks,
> 
> -- 
> Peter Xu

  reply	other threads:[~2017-06-14 18:35 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-02 11:50 [Qemu-devel] [PATCH 0/3] exec: further refine address_space_get_iotlb_entry() Peter Xu
2017-06-02 11:50 ` [Qemu-devel] [PATCH 1/3] exec: add page_mask for address_space_do_translate Peter Xu
2017-06-02 16:45   ` Michael S. Tsirkin
2017-06-05  2:52     ` Peter Xu
2017-06-02 11:50 ` [Qemu-devel] [PATCH 2/3] exec: simplify address_space_get_iotlb_entry Peter Xu
2017-06-02 16:49   ` Michael S. Tsirkin
2017-06-05  3:07     ` Peter Xu
2017-06-06 14:34       ` Paolo Bonzini
2017-06-06 23:47         ` David Gibson
2017-06-07  3:44           ` Peter Xu
2017-06-07 13:07             ` Michael S. Tsirkin
2017-06-08  6:11               ` Peter Xu
2017-06-08 18:59                 ` Michael S. Tsirkin
2017-06-09  1:58                   ` Peter Xu
2017-06-09  2:37                     ` David Gibson
2017-06-11 10:09                     ` Michael S. Tsirkin
2017-06-11 12:10                       ` David Gibson
2017-06-12  2:34                         ` Peter Xu
2017-06-12  3:07                           ` Michael S. Tsirkin
2017-06-12  4:04                             ` Peter Xu
2017-06-14 18:34                               ` Michael S. Tsirkin [this message]
2017-06-15  2:31                                 ` Peter Xu
2017-06-15  2:57                                   ` Peter Xu
2017-06-16 15:33                                   ` Michael S. Tsirkin
2017-06-07 13:01           ` Paolo Bonzini
2017-06-02 11:50 ` [Qemu-devel] [PATCH 3/3] vhost: iommu: cache static mapping if there is Peter Xu
2017-06-02 15:45   ` Michael S. Tsirkin
2017-06-05  3:15     ` Peter Xu
2017-06-05  4:07       ` Jason Wang
2017-06-05 15:05       ` Michael S. Tsirkin
2017-06-02 16:51   ` Michael S. Tsirkin
2017-06-02 14:51 ` [Qemu-devel] [PATCH 0/3] exec: further refine address_space_get_iotlb_entry() Michael S. Tsirkin
2017-06-05  3:20   ` Peter Xu
2017-06-06 15:29     ` Michael S. Tsirkin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170614213418-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=david@gibson.dropbear.id.au \
    --cc=jasowang@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.