qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: xuyandong <xuyandong2@huawei.com>
Cc: "marcel@redhat.com" <marcel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Zhanghailiang <zhang.zhanghailiang@huawei.com>,
	"wangxin (U)" <wangxinxin.wang@huawei.com>,
	"Huangweidong (C)" <weidong.huang@huawei.com>,
	"Zhoujian (jay)" <jianjay.zhou@huawei.com>
Subject: Re: [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug
Date: Mon, 7 Jan 2019 11:24:51 -0500	[thread overview]
Message-ID: <20190107112301-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <7CECC2DFC21538489F72729DF5EFB4D908AE91FB@dggemm521-mbs.china.huawei.com>

On Mon, Jan 07, 2019 at 03:28:36PM +0000, xuyandong wrote:
> 
> 
> > -----Original Message-----
> > From: Michael S. Tsirkin [mailto:mst@redhat.com]
> > Sent: Monday, January 07, 2019 11:06 PM
> > To: xuyandong <xuyandong2@huawei.com>
> > Cc: marcel@redhat.com; Paolo Bonzini <pbonzini@redhat.com>; qemu-
> > devel@nongnu.org; Zhanghailiang <zhang.zhanghailiang@huawei.com>;
> > wangxin (U) <wangxinxin.wang@huawei.com>; Huangweidong (C)
> > <weidong.huang@huawei.com>
> > Subject: Re: [BUG]Unassigned mem write during pci device hot-plug
> > 
> > On Mon, Jan 07, 2019 at 02:37:17PM +0000, xuyandong wrote:
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > In our test, we configured VM with several
> > > > > > > > > > > > > pci-bridges and a virtio-net nic been attached
> > > > > > > > > > > > > with bus 4,
> > > > > > > > > > > > >
> > > > > > > > > > > > > After VM is startup, We ping this nic from host to
> > > > > > > > > > > > > judge if it is working normally. Then, we hot add
> > > > > > > > > > > > > pci devices to this VM with bus
> > > > > > > > 0.
> > > > > > > > > > > > >
> > > > > > > > > > > > > We  found the virtio-net NIC in bus 4 is not
> > > > > > > > > > > > > working (can not
> > > > > > > > > > > > > connect) occasionally, as it kick virtio backend
> > > > > > > > > > > > > failure with error
> > >
> > > > > > > But I have another question, if we only fix this problem in
> > > > > > > the kernel, the Linux version that has been released does not
> > > > > > > work well on the
> > > > > > virtualization platform.
> > > > > > > Is there a way to fix this problem in the backend?
> > >
> > > Hi Michael,
> > >
> > > If we want to fix this problem on the backend, it is not enough to
> > > consider only PCI device hot plugging, because I found that if we use
> > > a command like "echo 1 > /sys/bus/pci/rescan" in guest, this problem is very
> > easy to reproduce.
> > >
> > > From the perspective of device emulation, when guest writes 0xffffffff
> > > to the BAR, guest just want to get the size of the region but not really
> > updating the address space.
> > > So I made the following patch to avoid  update pci mapping.
> > >
> > > Do you think this make sense?
> > >
> > > [PATCH] pci: avoid update pci mapping when writing 0xFFFF FFFF to BAR
> > >
> > > When guest writes 0xffffffff to the BAR, guest just want to get the
> > > size of the region but not really updating the address space.
> > > So when guest writes 0xffffffff to BAR, we need avoid
> > > pci_update_mappings or pci_bridge_update_mappings.
> > >
> > > Signed-off-by: xuyandong <xuyandong2@huawei.com>
> > 
> > I see how that will address the common case however there are a bunch of
> > issues here.  First of all it's easy to trigger the update by some other action like
> > VM migration.  More importantly it's just possible that guest actually does want
> > to set the low 32 bit of the address to all ones.  For example, that is clearly
> > listed as a way to disable all devices behind the bridge in the pci to pci bridge
> > spec.
> 
> Ok, I see. If I only skip upate when guest writing 0xFFFFFFFF to Prefetcable Base Upper 32 Bits
> to meet the kernel double check problem.
> Do you think there is still risk?

Well it's non zero since spec says such a write should disable all
accesses. Just an idea: why not add an option to disable upper 32 bit?
That is ugly and limits space but spec compliant.

> > 
> > Given upstream is dragging it's feet I'm open to adding a flag that will help
> > keep guests going as a temporary measure.
> > We will need to think about ways to restrict this as much as we can.
> > 
> > 
> > > ---
> > >  hw/pci/pci.c        | 6 ++++--
> > >  hw/pci/pci_bridge.c | 8 +++++---
> > >  2 files changed, 9 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 56b13b3..ef368e1 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -1361,6 +1361,7 @@ void pci_default_write_config(PCIDevice *d,
> > > uint32_t addr, uint32_t val_in, int  {
> > >      int i, was_irq_disabled = pci_irq_disabled(d);
> > >      uint32_t val = val_in;
> > > +    uint64_t barmask = (1 << l*8) - 1;
> > >
> > >      for (i = 0; i < l; val >>= 8, ++i) {
> > >          uint8_t wmask = d->wmask[addr + i]; @@ -1369,9 +1370,10 @@
> > > void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val_in,
> > int
> > >          d->config[addr + i] = (d->config[addr + i] & ~wmask) | (val & wmask);
> > >          d->config[addr + i] &= ~(val & w1cmask); /* W1C: Write 1 to Clear */
> > >      }
> > > -    if (ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> > > +    if ((val_in != barmask &&
> > > +	(ranges_overlap(addr, l, PCI_BASE_ADDRESS_0, 24) ||
> > >          ranges_overlap(addr, l, PCI_ROM_ADDRESS, 4) ||
> > > -        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4) ||
> > > +        ranges_overlap(addr, l, PCI_ROM_ADDRESS1, 4))) ||
> > >          range_covers_byte(addr, l, PCI_COMMAND))
> > >          pci_update_mappings(d);
> > >
> > > diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c index
> > > ee9dff2..f2bad79 100644
> > > --- a/hw/pci/pci_bridge.c
> > > +++ b/hw/pci/pci_bridge.c
> > > @@ -253,17 +253,19 @@ void pci_bridge_write_config(PCIDevice *d,
> > >      PCIBridge *s = PCI_BRIDGE(d);
> > >      uint16_t oldctl = pci_get_word(d->config + PCI_BRIDGE_CONTROL);
> > >      uint16_t newctl;
> > > +    uint64_t barmask = (1 << len * 8) - 1;
> > >
> > >      pci_default_write_config(d, address, val, len);
> > >
> > >      if (ranges_overlap(address, len, PCI_COMMAND, 2) ||
> > >
> > > -        /* io base/limit */
> > > -        ranges_overlap(address, len, PCI_IO_BASE, 2) ||
> > > +        (val != barmask &&
> > > +	/* io base/limit */
> > > +        (ranges_overlap(address, len, PCI_IO_BASE, 2) ||
> > >
> > >          /* memory base/limit, prefetchable base/limit and
> > >             io base/limit upper 16 */
> > > -        ranges_overlap(address, len, PCI_MEMORY_BASE, 20) ||
> > > +        ranges_overlap(address, len, PCI_MEMORY_BASE, 20))) ||
> > >
> > >          /* vga enable */
> > >          ranges_overlap(address, len, PCI_BRIDGE_CONTROL, 2)) {
> > > --
> > > 1.8.3.1
> > >
> > >
> > >

  reply	other threads:[~2019-01-07 16:25 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-08 11:58 [Qemu-devel] [BUG]Unassigned mem write during pci device hot-plug xuyandong
2018-12-09 14:26 ` Michael S. Tsirkin
2018-12-10  1:59   ` xuyandong
2018-12-10  2:22 ` Michael S. Tsirkin
2018-12-10  3:12   ` xuyandong
2018-12-10 18:14     ` Michael S. Tsirkin
2018-12-11  1:47       ` xuyandong
2018-12-11  2:17         ` Michael S. Tsirkin
2018-12-11  2:55           ` xuyandong
2018-12-11  3:38             ` Michael S. Tsirkin
2018-12-11  3:51               ` xuyandong
2018-12-11  4:04                 ` Michael S. Tsirkin
2018-12-11  4:25                 ` Michael S. Tsirkin
2019-01-07 14:37                   ` xuyandong
2019-01-07 15:06                     ` Michael S. Tsirkin
2019-01-07 15:28                       ` xuyandong
2019-01-07 16:24                         ` Michael S. Tsirkin [this message]
2019-01-07 14:51                   ` xuyandong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190107112301-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=jianjay.zhou@huawei.com \
    --cc=marcel@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=wangxinxin.wang@huawei.com \
    --cc=weidong.huang@huawei.com \
    --cc=xuyandong2@huawei.com \
    --cc=zhang.zhanghailiang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).