From: Peter Xu <peterx@redhat.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
Auger Eric <eric.auger@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] memory: do not do out of bound notification
Date: Mon, 24 Jun 2019 14:14:59 +0800
Message-ID: <20190624061459.GD6279@xz-x1>
In-Reply-To: <20190624052255.GA27894@joy-OptiPlex-7040>
On Mon, Jun 24, 2019 at 01:22:55AM -0400, Yan Zhao wrote:
> On Thu, Jun 20, 2019 at 09:04:43PM +0800, Peter Xu wrote:
> > On Thu, Jun 20, 2019 at 08:59:55PM +0800, Peter Xu wrote:
> > > On Thu, Jun 20, 2019 at 10:35:29AM +0200, Paolo Bonzini wrote:
> > > > On 20/06/19 06:02, Peter Xu wrote:
> > > > > Seems workable, to be explicit - we can even cut it into chunks with
> > > > > different size to be efficient.
> > > >
> > > > Yes, this is not hard (completely untested):
> > > >
> > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > > index 44b1231157..541538bc6c 100644
> > > > --- a/hw/i386/intel_iommu.c
> > > > +++ b/hw/i386/intel_iommu.c
> > > > @@ -3388,39 +3388,34 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> > > > }
> > > >
> > > > assert(start <= end);
> > > > - size = end - start;
> > > > + while (end > start) {
> > > > + size = end - start;
> > > > + /* Only keep the lowest bit of either size or start. */
> > > > + size = MIN(size & -size, start & -start);
> > >
> > > I feel like this can be problematic. I'm imagining:
> > >
> > > start=0x1000_0000, size=0x1000_1000
> > >
> > > This will get size=0x1000, but we could actually use
> > > size=0x1000_0000 for the first chunk.
> > >
> > > > + /* Should not happen, but limit to address width too just in case */
> > > > + size = MIN(size, 1ULL << s->aw_bits);
> > > >
> > > > - if (ctpop64(size) != 1) {
> > > > - /*
> > > > - * This size cannot format a correct mask. Let's enlarge it to
> > > > - * suite the minimum available mask.
> > > > - */
> > > > - int n = 64 - clz64(size);
> > > > - if (n > s->aw_bits) {
> > > > - /* should not happen, but in case it happens, limit it */
> > > > - n = s->aw_bits;
> > > > - }
> > > > - size = 1ULL << n;
> > > > - }
> > > > + assert((start & (size - 1)) == 0);
> > > >
> > > > - entry.target_as = &address_space_memory;
> > > > - /* Adjust iova for the size */
> > > > - entry.iova = n->start & ~(size - 1);
> > > > - /* This field is meaningless for unmap */
> > > > - entry.translated_addr = 0;
> > > > - entry.perm = IOMMU_NONE;
> > > > - entry.addr_mask = size - 1;
> > > > + entry.target_as = &address_space_memory;
> > > > + entry.iova = start;
> > > > + /* This field is meaningless for unmap */
> > > > + entry.translated_addr = 0;
> > > > + entry.perm = IOMMU_NONE;
> > > > + entry.addr_mask = size - 1;
> > >
> > > (some of the fields can be moved out of the loop because they are
> > > constant)
> > >
> > > >
> > > > - trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > > > - VTD_PCI_SLOT(as->devfn),
> > > > - VTD_PCI_FUNC(as->devfn),
> > > > - entry.iova, size);
> > > > + trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > > > + VTD_PCI_SLOT(as->devfn),
> > > > + VTD_PCI_FUNC(as->devfn),
> > > > + entry.iova, size);
> > >
> > > Can we move this out of the loop? It is only a trace, so the mask
> > > restriction does not apply.
> > >
> > > >
> > > > - map.iova = entry.iova;
> > > > - map.size = entry.addr_mask;
> > > > - iova_tree_remove(as->iova_tree, &map);
> > > > + map.iova = entry.iova;
> > > > + map.size = entry.addr_mask;
> > > > + iova_tree_remove(as->iova_tree, &map);
> > >
> > > Same here?
> > >
> > > >
> > > > - memory_region_notify_one(n, &entry);
> > > > + memory_region_notify_one(n, &entry);
> > > > + start += size;
> > > > + }
> > > > }
> > > >
> > > > static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > > >
> > > >
> > > > Yan,
> > > >
> > > > if something like this works for you, let me know and I will submit it
> > > > as a proper patch.
> > > >
> > > > Paolo
> > >
> > > While reviewing, I was thinking about how to generate a correct
> > > sequence of these masks, so here's my attempt below with the above
> > > issues fixed. :)
> > >
> > > I've compiled it but not tested it. Yan can test it, or I can do it
> > > tomorrow once I find some machines.
> > >
> > > Thanks,
> > >
> > > ------------------------------------------------------------
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 44b1231157..cfbd225f0a 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -3363,11 +3363,32 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
> > > return vtd_dev_as;
> > > }
> > >
> > > +static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
> > > +{
> > > + /* Tries to find smallest mask from start first */
> > > + uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
> > > +
> > > + assert(size && gaw > 0 && gaw < 64);
> > > +
> > > + /* Zero start, or too big */
> > > + if (!rmask || rmask > max_mask) {
> > > + rmask = max_mask;
> > > + }
> > > +
> > > + /* If the start mask worked, then use it */
> > > + if (rmask <= size) {
> > > + return rmask;
> > > + }
> > > +
> > > + /* Find the largest page mask from size */
> > > + return 1ULL << (63 - clz64(size));
> > > +}
> > > +
> > > /* Unmap the whole range in the notifier's scope. */
> > > static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> > > {
> > > IOMMUTLBEntry entry;
> > > - hwaddr size;
> > > + hwaddr size, remain;
> > > hwaddr start = n->start;
> > > hwaddr end = n->end;
> > > IntelIOMMUState *s = as->iommu_state;
> > > @@ -3388,39 +3409,28 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> > > }
> > >
> > > assert(start <= end);
> > > - size = end - start;
> > > -
> > > - if (ctpop64(size) != 1) {
> > > - /*
> > > - * This size cannot format a correct mask. Let's enlarge it to
> > > - * suite the minimum available mask.
> > > - */
> > > - int n = 64 - clz64(size);
> > > - if (n > s->aw_bits) {
> > > - /* should not happen, but in case it happens, limit it */
> > > - n = s->aw_bits;
> > > - }
> > > - size = 1ULL << n;
> > > - }
> > > -
> > > + size = remain = end - start;
> > > entry.target_as = &address_space_memory;
> > > - /* Adjust iova for the size */
> > > - entry.iova = n->start & ~(size - 1);
> > > + entry.perm = IOMMU_NONE;
> > > /* This field is meaningless for unmap */
> > > entry.translated_addr = 0;
> > > - entry.perm = IOMMU_NONE;
> > > - entry.addr_mask = size - 1;
> > > +
> > > + while (remain) {
> > > + uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
> > > +
> > > + entry.iova = start;
> > > + entry.addr_mask = mask - 1;
> > > + memory_region_notify_one(n, &entry);
> >
> > Sorry, I missed at least these lines:
> >
> > start += mask;
> > remain -= mask;
> >
> > > + }
> > >
> > > trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> > > VTD_PCI_SLOT(as->devfn),
> > > VTD_PCI_FUNC(as->devfn),
> > > - entry.iova, size);
> > > + n->start, size);
> > >
> > > - map.iova = entry.iova;
> > > - map.size = entry.addr_mask;
> > > + map.iova = n->start;
> > > + map.size = size;
> > > iova_tree_remove(as->iova_tree, &map);
> > > -
> > > - memory_region_notify_one(n, &entry);
> > > }
> > >
> > > static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> > > ------------------------------------------------------------
> > >
> > > Regards,
> > >
> > > --
> > > Peter Xu
> >
> > Regards,
> >
> > --
> > Peter Xu
>
> Hi Peter and Paolo,
> I tested with the code below and it works fine on my side.
> It's based on your version with some minor modifications, e.g. size is
> now (end - start + 1).
> Thanks
> Yan
Hi, Yan,
Thanks for testing the patches. I think change [1] below is unrelated
to the problem, so I'd prefer to split it out. For [2], I'll change it
to an assertion unless you disagree. I'll reorganize the patches and
post a formal version with proper authorship soon.
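
For anyone who wants to try the chunking logic outside QEMU, here is a
rough standalone sketch. It is not the formal patch: next_chunk() and
split_range() are names made up for this illustration, and clz64() is
replaced by a stand-in built on the GCC/Clang __builtin_clzll builtin,
which is an assumption about the compiler. next_chunk() is meant to
follow the same logic as vtd_get_next_mask() in the patch quoted below.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for QEMU's clz64(); uses a GCC/Clang builtin (assumption). */
static int clz64(uint64_t v)
{
    return v ? __builtin_clzll(v) : 64;
}

/* Hypothetical name; same logic as vtd_get_next_mask() quoted below. */
static uint64_t next_chunk(uint64_t start, uint64_t size, int gaw)
{
    uint64_t rmask = start & -start;    /* lowest set bit of start */
    uint64_t max_mask = 1ULL << gaw;

    assert(size && gaw > 0 && gaw < 64);

    /* start == 0, or alignment wider than the address width */
    if (!rmask || rmask > max_mask) {
        rmask = max_mask;
    }
    /* The alignment of start fits in what remains: use it */
    if (rmask <= size) {
        return rmask;
    }
    /* Otherwise use the largest power of two not exceeding size */
    return 1ULL << (63 - clz64(size));
}

/* Hypothetical driver: split [start, start + size), return chunk count. */
static int split_range(uint64_t start, uint64_t size, int gaw)
{
    int chunks = 0;

    while (size) {
        uint64_t len = next_chunk(start, size, gaw);

        assert((len & (len - 1)) == 0);     /* power of two */
        assert((start & (len - 1)) == 0);   /* naturally aligned */
        assert(len <= size);                /* never out of bounds */
        printf("iova=0x%" PRIx64 " addr_mask=0x%" PRIx64 "\n",
               start, len - 1);
        start += len;
        size -= len;
        chunks++;
    }
    return chunks;
}
```

Called as split_range(0x10000000, 0x10001000, 48), this should yield two
chunks, 0x10000000 bytes followed by 0x1000 bytes, which is the case I
raised against the MIN()-based version earlier in the thread.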
Thanks,
>
> +static uint64_t vtd_get_next_mask(uint64_t start, uint64_t size, int gaw)
> +{
> + /* Tries to find smallest mask from start first */
> + uint64_t rmask = start & -start, max_mask = 1ULL << gaw;
> + assert(size && gaw > 0 && gaw < 64);
> + /* Zero start, or too big */
> + if (!rmask || rmask > max_mask) {
> + rmask = max_mask;
> + }
> + /* If the start mask worked, then use it */
> + if (rmask <= size) {
> + return rmask;
> + }
> +
> + /* Find the largest page mask from size */
> + return 1ULL << (63 - clz64(size));
> +}
> +
> /* Unmap the whole range in the notifier's scope. */
> static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> {
> IOMMUTLBEntry entry;
> - hwaddr size;
> + hwaddr size, remain;
> hwaddr start = n->start;
> hwaddr end = n->end;
> IntelIOMMUState *s = as->iommu_state;
> @@ -3380,48 +3398,46 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
> * VT-d spec), otherwise we need to consider overflow of 64 bits.
> */
>
> - if (end > VTD_ADDRESS_SIZE(s->aw_bits)) {
> + if (end > VTD_ADDRESS_SIZE(s->aw_bits) - 1) {
> /*
> * Don't need to unmap regions that is bigger than the whole
> * VT-d supported address space size
> */
> - end = VTD_ADDRESS_SIZE(s->aw_bits);
> + end = VTD_ADDRESS_SIZE(s->aw_bits) - 1;
[1]
> }
>
> assert(start <= end);
> - size = end - start;
>
> - if (ctpop64(size) != 1) {
> - /*
> - * This size cannot format a correct mask. Let's enlarge it to
> - * suite the minimum available mask.
> - */
> - int n = 64 - clz64(size);
> - if (n > s->aw_bits) {
> - /* should not happen, but in case it happens, limit it */
> - n = s->aw_bits;
> - }
> - size = 1ULL << n;
> - }
> + size = remain = end - start + 1;
>
> entry.target_as = &address_space_memory;
> - /* Adjust iova for the size */
> - entry.iova = n->start & ~(size - 1);
> +
> + entry.perm = IOMMU_NONE;
> /* This field is meaningless for unmap */
> entry.translated_addr = 0;
> - entry.perm = IOMMU_NONE;
> - entry.addr_mask = size - 1;
> +
> + while (remain >= VTD_PAGE_SIZE) {
> + uint64_t mask = vtd_get_next_mask(start, remain, s->aw_bits);
> +
> + entry.iova = start;
> + entry.addr_mask = mask - 1;
> + memory_region_notify_one(n, &entry);
> + start += mask;
> + remain -= mask;
> + }
> +
> + if (remain) {
> + warn_report("Unmapping unaligned range %lx-%lx", start, end);
[2]
> + }
>
> trace_vtd_as_unmap_whole(pci_bus_num(as->bus),
> VTD_PCI_SLOT(as->devfn),
> VTD_PCI_FUNC(as->devfn),
> - entry.iova, size);
> -
> - map.iova = entry.iova;
> - map.size = entry.addr_mask;
> + n->start, size);
> + map.iova = n->start;
> + map.size = size;
> iova_tree_remove(as->iova_tree, &map);
>
> - memory_region_notify_one(n, &entry);
> }
>
> static void vtd_address_space_unmap_all(IntelIOMMUState *s)
> --
> 2.7.4
>
>
>
Regards,
--
Peter Xu
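
P.S. For context on why the old code could notify out of bounds, here
is a small sketch of the logic that the quoted patch removes: the size
is rounded up to a power of two and the iova is aligned down to it.
old_unmap_range() and struct range are hypothetical names used only for
this illustration, and the clz64()/ctpop64() stand-ins assume GCC/Clang
builtins.

```c
#include <stdint.h>

/* Stand-ins for QEMU's helpers; GCC/Clang builtins (assumption). */
static int clz64(uint64_t v)
{
    return v ? __builtin_clzll(v) : 64;
}

static int ctpop64(uint64_t v)
{
    return __builtin_popcountll(v);
}

/* Hypothetical container for the single range the old code notified. */
struct range {
    uint64_t iova;
    uint64_t size;
};

/* Mirrors the removed lines: enlarge size, then align iova down. */
static struct range old_unmap_range(uint64_t start, uint64_t end,
                                    int aw_bits)
{
    uint64_t size = end - start;
    struct range r;

    if (ctpop64(size) != 1) {
        /* Not a power of two: round up to the next one */
        int n = 64 - clz64(size);
        if (n > aw_bits) {
            n = aw_bits;
        }
        size = 1ULL << n;
    }

    r.iova = start & ~(size - 1);   /* may fall below start */
    r.size = size;
    return r;
}
```

With start=0x1000 and end=0x4000, this gives iova=0 and size=0x4000, so
the notification also covers [0, 0x1000), which the notifier never
asked about.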