From: "Michael S. Tsirkin" <mst@redhat.com>
To: Sheng Yang <sheng@linux.intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org
Subject: Re: qemu/hw/device-assignment: questions about msix_table_page
Date: Wed, 6 May 2009 10:31:18 +0300
Message-ID: <20090506073118.GA3791@redhat.com>
In-Reply-To: <200905061035.28133.sheng@linux.intel.com>

On Wed, May 06, 2009 at 10:35:27AM +0800, Sheng Yang wrote:
> On Tuesday 05 May 2009 20:46:04 Michael S. Tsirkin wrote:
> > On Tue, May 05, 2009 at 07:49:10AM -0300, Marcelo Tosatti wrote:
> > > On Tue, May 05, 2009 at 01:34:50PM +0300, Michael S. Tsirkin wrote:
> > > > On Tue, May 05, 2009 at 07:19:45AM -0300, Marcelo Tosatti wrote:
> > > > > On Tue, May 05, 2009 at 12:51:36PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Apr 27, 2009 at 10:30:17PM +0800, Sheng Yang wrote:
> > > > > > > > > > > > If the guest can write to the real device MSI-X table
> > > > > > > > > > > > directly, it would cause chaos in interrupt delivery,
> > > > > > > > > > > > because what the guest sees is totally different from
> > > > > > > > > > > > what the host sees...
> > > > > > > > > >
> > > > > > > > > > Obviously.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > >
> > > > > > What's the reason that this page is unmapped from the qemu memory
> > > > > > space? Specifically what do these lines do:
> > > > > > >     int offset = r_dev->msix_table_addr - real_region->base_addr;
> > > > > > >     ret = munmap(region->u.r_virtbase + offset, TARGET_PAGE_SIZE);
> > > > >
> > > > > I believe this allows accesses to this page (the MSI-X table), which
> > > > > is part of the guest address space (through kvm memory slots), to be
> > > > > trapped by qemu.
> > > > >
> > > > > Since there is no actual page in this guest address, KVM treats
> > > > > accesses as MMIO and forwards them to QEMU.
> > > >
> > > > I thought about this too.
> > > > But why is this necessary for assigned MSI-X but not for emulated
> > > > devices such as e.g. e1000? All e1000 does seems to be
> > > > cpu_register_physical_memory ...
> > >
> > > Because there is no registered (kvm) memory slot for the range which
> > > e1000 registers its MMIO? Not sure about the address of the MSI-X table
> > > page, but you could achieve the same effect by splitting the slot which
> > > it lives in two, with a 1 page hole between them.
> >
> > You could also move the emulated MSI-X table, sticking it on top of the
> > existing BAR. Since PCI config includes the pointer to the table,
> > a driver that reads this pointer will continue to work.
> 
> One BAR can contain more than just an MSI-X table... The PCI spec only says
> the other information should be page aligned and can't be in the same page as
> the MSI-X table (except the PBA). I think this method makes things more
> complicated; we don't want to, and can't, trap other information in the same BAR...

The trick I was suggesting is to increase the BAR size.
Let's assume we have a real BAR of size 1 MByte with the MSI-X table at offset 0.
We report to the guest a BAR of size 2 MByte with the MSI-X table at offset 1 MByte.
Then we trap all accesses in the range 1 MByte to 2 MByte and copy them to the
MSI-X table.
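
To make the address split concrete, here is a minimal sketch in C. It is
not code from qemu: the macro names, struct bar_decode and
decode_guest_access() are all hypothetical, and the sizes are simply the
ones from the example above. It only shows how an offset into the
enlarged guest BAR would be classified as "trap and emulate" or "pass
through".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants, taken from the 1 MByte / 2 MByte example. */
#define REAL_BAR_SIZE   (1u << 20)            /* size of the real BAR */
#define GUEST_BAR_SIZE  (2u * REAL_BAR_SIZE)  /* size reported to the guest */
#define GUEST_MSIX_OFF  REAL_BAR_SIZE         /* table offset reported to the guest */

/* Result of decoding one guest access to the enlarged BAR. */
struct bar_decode {
    bool     trapped;  /* true: access must be emulated as an MSI-X table access */
    uint32_t offset;   /* offset into the emulated table, or into the real BAR */
};

/* Classify a guest offset; the caller must ensure guest_off < GUEST_BAR_SIZE. */
static struct bar_decode decode_guest_access(uint32_t guest_off)
{
    struct bar_decode d;

    if (guest_off >= GUEST_MSIX_OFF) {
        /* Upper half: trapped, offset is relative to the emulated table. */
        d.trapped = true;
        d.offset = guest_off - GUEST_MSIX_OFF;
    } else {
        /* Lower half: goes straight to the real BAR at the same offset. */
        d.trapped = false;
        d.offset = guest_off;
    }
    return d;
}
```

Note that in this sketch the lower half, including the real table at
offset 0, is still reachable by a guest that ignores the reported
offset.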


> > Of course, there's no guarantee that guest drivers don't just hard-code
> > this offset.
> 
> I think this mostly won't happen.
> >
> > > BTW this is why you can't map the MSI-X table page directly, you want
> > > accesses to be trapped.
> >
> > BTW current design won't work if the base page size is > 4K, will it?
> > The hole covers a page, so you'll get faults outside the MSI-X table.
> 
> Yes. One MSI-X entry is 16 bytes, so one page can contain 256 entries. Well, I
> haven't seen a device with more than 100 entries, but given this limitation,
> maybe we should temporarily limit MSI-X max entries to 256 (rather than 512
> entries now)...

Drivers might not have a clean fallback path if the number of entries
becomes smaller.

Another problem is if TARGET_PAGE_SIZE is > 4K.
The PCI spec only asks devices to reserve 4K of space for the table,
so you will accidentally trap accesses not related to MSI-X.
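
The arithmetic behind both points can be sketched as follows. The helper
names are hypothetical, and the sketch assumes the one-page hole starts
exactly at the start of the table and that 4K is the space reserved for
it, as stated above.

```c
/* 16 bytes per MSI-X table entry, 4K assumed reserved for the table. */
#define MSIX_ENTRY_SIZE     16u
#define MSIX_TABLE_RESERVE  4096u

/* Number of table entries covered by a hole of one page. */
static unsigned entries_per_hole(unsigned page_size)
{
    return page_size / MSIX_ENTRY_SIZE;
}

/* Bytes inside the hole that lie beyond the 4K reserved for the table. */
static unsigned overtrapped_bytes(unsigned page_size)
{
    return page_size > MSIX_TABLE_RESERVE ? page_size - MSIX_TABLE_RESERVE : 0u;
}
```

With a 4K page the hole covers 256 entries and nothing else; with a
larger page, everything past the first 4K of the hole is trapped even
though it has nothing to do with MSI-X.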

-- 
MST
