* [Qemu-devel] [PATCH v2 0/2] exec: further refine address_space_get_iotlb_entry()
@ 2017-10-06 11:46 Maxime Coquelin
  2017-10-06 11:46 ` [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate Maxime Coquelin
  2017-10-06 11:46 ` [Qemu-devel] [PATCH v2 2/2] exec: simplify address_space_get_iotlb_entry Maxime Coquelin
  0 siblings, 2 replies; 10+ messages in thread
From: Maxime Coquelin @ 2017-10-06 11:46 UTC
To: peterx, pbonzini, mst, jasowang, qemu-devel; +Cc: qemu-stable, Maxime Coquelin

This series is a rebase of the first two patches of Peter's series
improving address_space_get_iotlb_entry():
Message-Id: <1496404254-17429-1-git-send-email-peterx@redhat.com>

This second revision of the rebase fixes the page_mask initial value,
which was mistakenly set to its complement. This bug could only be seen
when the guest kernel doesn't enable IOMMU support; otherwise this
initial value is overwritten.

The series is not only an improvement: it fixes a regression in the way
IOTLB updates sent to the backends are generated.
The regression was introduced by patch:
a764040cc8 ("exec: abstract address_space_do_translate()")

Prior to that patch, IOTLB entries sent to the backend were aligned on
the guest page boundaries (both addresses and size).
For example, with the guest using 2MB pages:
 * Backend sends IOTLB miss request for iova = 0x112378fb4
 * QEMU replies with an IOTLB update with iova = 0x112200000,
   size = 0x200000
 * Backend inserts the above entry in its cache and computes the
   translation
In this case, if the backend later needs to translate 0x112378004, it
results in a cache hit and there is no need to send another IOTLB miss.

With that patch, the addr of the IOTLB entry is the address requested
via the IOTLB miss, and the size is computed to cover the remainder of
the guest page.
The same example gives:
 * Backend sends IOTLB miss request for iova = 0x112378fb4
 * QEMU replies with an IOTLB update with iova = 0x112378fb4,
   size = 0x8704c
 * Backend inserts the above entry in its cache and computes the
   translation
In this case, if the backend later needs to translate 0x112378004, it
results in another cache miss:
 * Backend sends IOTLB miss request for iova = 0x112378004
 * QEMU replies with an IOTLB update with iova = 0x112378004,
   size = 0x87ffc
 * Backend inserts the above entry in its cache and computes the
   translation
This results in many more IOTLB misses and, more importantly, it
pollutes the device IOTLB cache by multiplying the number of entries,
which moreover overlap.

Note that the current kernel and user backend implementations do not
merge contiguous or overlapping IOTLB entries at device IOTLB cache
insertion.

This series fixes this regression, so that IOTLB updates are aligned on
the guest's page boundaries.

Changes since rebase:
=====================
- Fix page_mask initial value
- Apply Michael's Acked-by on second patch

Peter Xu (2):
  exec: add page_mask for flatview_do_translate
  exec: simplify address_space_get_iotlb_entry

 exec.c | 75 +++++++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 49 insertions(+), 26 deletions(-)

-- 
2.13.6

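The two reply styles in the cover letter can be reproduced with a few lines of arithmetic. The following is only a minimal sketch, assuming 2MB guest pages as in the example; `struct iotlb_update`, `reply_aligned()`, `reply_unaligned()` and `iotlb_hit()` are hypothetical names made up for illustration and are not QEMU or vhost APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an IOTLB update: covers [iova, iova + size). */
struct iotlb_update {
    uint64_t iova;
    uint64_t size;
};

/* Page-aligned reply: the entry covers the whole guest page. */
static struct iotlb_update reply_aligned(uint64_t iova, uint64_t page_size)
{
    /* page_size is assumed to be a non-zero power of two */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uint64_t mask = page_size - 1;

    return (struct iotlb_update){ .iova = iova & ~mask, .size = page_size };
}

/*
 * Regressed reply: the entry starts at the requested address and only
 * covers the remainder of the guest page.
 */
static struct iotlb_update reply_unaligned(uint64_t iova, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    uint64_t mask = page_size - 1;

    return (struct iotlb_update){
        .iova = iova,
        .size = (iova | mask) - iova + 1,
    };
}

/* Would a backend cache holding only entry 'e' hit for 'iova'? */
static bool iotlb_hit(struct iotlb_update e, uint64_t iova)
{
    return iova >= e.iova && iova - e.iova < e.size;
}
```

With iova = 0x112378fb4 and a 2MB page, the aligned reply also covers a later lookup of 0x112378004, while the unaligned reply does not, which is why the backend has to send a second miss.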
* [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-06 11:46 [Qemu-devel] [PATCH v2 0/2] exec: further refine address_space_get_iotlb_entry() Maxime Coquelin
@ 2017-10-06 11:46 ` Maxime Coquelin
  2017-10-06 12:31   ` Paolo Bonzini
  2017-10-06 11:46 ` [Qemu-devel] [PATCH v2 2/2] exec: simplify address_space_get_iotlb_entry Maxime Coquelin
  1 sibling, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2017-10-06 11:46 UTC
To: peterx, pbonzini, mst, jasowang, qemu-devel; +Cc: qemu-stable, Maxime Coquelin

From: Peter Xu <peterx@redhat.com>

The function is originally used for flatview_space_translate() and what
we care about most is the (xlat, plen) range. However, for iotlb
requests we don't really care about "plen", but about the size of the
page that "xlat" is located on, and plen cannot carry this information.

A simple example to show why "plen" is not good for IOTLB translations:

E.g., for huge pages, it is possible that the guest mapped a 1G huge
page on the device side that used this GPA range:

  0x100000000 - 0x13fffffff

Then let's say we want to translate one IOVA that finally maps to GPA
0x13ffffe00 (which is located on this 1G huge page). Then here we'll
get:

  (xlat, plen) = (0x13ffffe00, 0x200)

So the IOTLB would only cover a very small range, since from "plen"
(which is 0x200 bytes) we cannot tell the size of the page.

Actually we do know that this is a huge page - we just throw the
information away in flatview_do_translate().

This patch introduces a "page_mask" optional parameter to capture that
page mask info. Also, I made "plen" an optional parameter as well, with
some comments for the whole function.

No functional change yet.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 exec.c | 46 ++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 40 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index 7a80460725..a5f3828445 100644
--- a/exec.c
+++ b/exec.c
@@ -467,11 +467,29 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x
     return section;
 }
 
-/* Called from RCU critical section */
+/**
+ * flatview_do_translate - translate an address in FlatView
+ *
+ * @fv: the flat view that we want to translate on
+ * @addr: the address to be translated in above address space
+ * @xlat: the translated address offset within memory region. It
+ *        cannot be @NULL.
+ * @plen_out: valid read/write length of the translated address. It
+ *            can be @NULL when we don't care about it.
+ * @page_mask_out: page mask for the translated address. This
+ *            should only be meaningful for IOMMU translated
+ *            addresses, since there may be huge pages that this bit
+ *            would tell. It can be @NULL if we don't care about it.
+ * @is_write: whether the translation operation is for write
+ * @is_mmio: whether this can be MMIO, set true if it can
+ *
+ * This function is called from RCU critical section
+ */
 static MemoryRegionSection flatview_do_translate(FlatView *fv,
                                                  hwaddr addr,
                                                  hwaddr *xlat,
-                                                 hwaddr *plen,
+                                                 hwaddr *plen_out,
+                                                 hwaddr *page_mask_out,
                                                  bool is_write,
                                                  bool is_mmio,
                                                  AddressSpace **target_as)
@@ -480,11 +498,17 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
     MemoryRegionSection *section;
     IOMMUMemoryRegion *iommu_mr;
     IOMMUMemoryRegionClass *imrc;
+    hwaddr page_mask = ~TARGET_PAGE_MASK;
+    hwaddr plen = (hwaddr)(-1);
+
+    if (plen_out) {
+        plen = *plen_out;
+    }
 
     for (;;) {
         section = address_space_translate_internal(
                 flatview_to_dispatch(fv), addr, &addr,
-                plen, is_mmio);
+                &plen, is_mmio);
 
         iommu_mr = memory_region_get_iommu(section->mr);
         if (!iommu_mr) {
@@ -496,7 +520,8 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
                                 IOMMU_WO : IOMMU_RO);
         addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
                 | (addr & iotlb.addr_mask));
-        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
+        page_mask = iotlb.addr_mask;
+        plen = MIN(plen, (addr | iotlb.addr_mask) - addr + 1);
         if (!(iotlb.perm & (1 << is_write))) {
             goto translate_fail;
         }
@@ -507,6 +532,14 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
 
     *xlat = addr;
 
+    if (page_mask_out) {
+        *page_mask_out = page_mask;
+    }
+
+    if (plen_out) {
+        *plen_out = plen;
+    }
+
     return *section;
 
 translate_fail:
@@ -525,7 +558,7 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr,
 
     /* This can never be MMIO. */
     section = flatview_do_translate(address_space_to_flatview(as), addr,
-                                    &xlat, &plen, is_write, false, &as);
+                                    &xlat, &plen, NULL, is_write, false, &as);
 
     /* Illegal translation */
     if (section.mr == &io_mem_unassigned) {
@@ -569,7 +602,8 @@ MemoryRegion *flatview_translate(FlatView *fv, hwaddr addr, hwaddr *xlat,
     AddressSpace *as = NULL;
 
     /* This can be MMIO, so setup MMIO bit. */
-    section = flatview_do_translate(fv, addr, xlat, plen, is_write, true, &as);
+    section = flatview_do_translate(fv, addr, xlat, plen, NULL,
+                                    is_write, true, &as);
     mr = section.mr;
 
     if (xen_enabled() && memory_access_is_direct(mr, is_write)) {
-- 
2.13.6

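The 1G huge page example from the commit message can be checked with the same expression the patch uses for plen. This is a minimal sketch under stated assumptions: `hwaddr` is a local `uint64_t` typedef standing in for QEMU's type, and `bytes_to_page_end()` and `page_size_from_mask()` are hypothetical helpers, not functions from exec.c.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hwaddr; /* local stand-in for QEMU's hwaddr */

/*
 * Number of bytes from addr to the end of the page described by
 * addr_mask. This mirrors the "(addr | iotlb.addr_mask) - addr + 1"
 * expression used for plen in the patch.
 */
static hwaddr bytes_to_page_end(hwaddr addr, hwaddr addr_mask)
{
    return (addr | addr_mask) - addr + 1;
}

/* Page size implied by an address mask of the form 2^n - 1. */
static hwaddr page_size_from_mask(hwaddr addr_mask)
{
    /* the mask is assumed to be contiguous low bits, e.g. 0x3fffffff */
    assert((addr_mask & (addr_mask + 1)) == 0);
    return addr_mask + 1;
}
```

For GPA 0x13ffffe00 on a 1G page (mask 0x3fffffff), the first helper yields 0x200, which says nothing about the page size, while the mask itself yields 0x40000000 and a page base of 0x100000000.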
* Re: [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-06 11:46 ` [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate Maxime Coquelin
@ 2017-10-06 12:31   ` Paolo Bonzini
  2017-10-06 12:46     ` Maxime Coquelin
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2017-10-06 12:31 UTC
To: Maxime Coquelin, peterx, mst, jasowang, qemu-devel; +Cc: qemu-stable

On 06/10/2017 13:46, Maxime Coquelin wrote:
> +    hwaddr page_mask = ~TARGET_PAGE_MASK;
> +    hwaddr plen = (hwaddr)(-1);
> +
> +    if (plen_out) {
> +        plen = *plen_out;
> +    }
>  
>      for (;;) {
>          section = address_space_translate_internal(
>                  flatview_to_dispatch(fv), addr, &addr,
> -                plen, is_mmio);
> +                &plen, is_mmio);
>  
>          iommu_mr = memory_region_get_iommu(section->mr);
>          if (!iommu_mr) {
> @@ -496,7 +520,8 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
>                                  IOMMU_WO : IOMMU_RO);
>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>                  | (addr & iotlb.addr_mask));
> -        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
> +        page_mask = iotlb.addr_mask;

Should this be "page_mask &= iotlb.addr_mask"?

If you have multiple IOMMUs on top of each other (yeah, I know...) I
think the smallest size should win. This is also consistent with the
MIN in the line below.

Otherwise looks good.

Paolo

> +        plen = MIN(plen, (addr | iotlb.addr_mask) - addr + 1);
>          if (!(iotlb.perm & (1 << is_write))) {
>              goto translate_fail;

* Re: [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-06 12:31   ` Paolo Bonzini
@ 2017-10-06 12:46     ` Maxime Coquelin
  2017-10-06 12:48       ` Paolo Bonzini
  0 siblings, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2017-10-06 12:46 UTC
To: Paolo Bonzini, peterx, mst, jasowang, qemu-devel; +Cc: qemu-stable

On 10/06/2017 02:31 PM, Paolo Bonzini wrote:
> On 06/10/2017 13:46, Maxime Coquelin wrote:
>> +    hwaddr page_mask = ~TARGET_PAGE_MASK;
>> +    hwaddr plen = (hwaddr)(-1);
>> +
>> +    if (plen_out) {
>> +        plen = *plen_out;
>> +    }
>>  
>>      for (;;) {
>>          section = address_space_translate_internal(
>>                  flatview_to_dispatch(fv), addr, &addr,
>> -                plen, is_mmio);
>> +                &plen, is_mmio);
>>  
>>          iommu_mr = memory_region_get_iommu(section->mr);
>>          if (!iommu_mr) {
>> @@ -496,7 +520,8 @@ static MemoryRegionSection flatview_do_translate(FlatView *fv,
>>                                  IOMMU_WO : IOMMU_RO);
>>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>>                  | (addr & iotlb.addr_mask));
>> -        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
>> +        page_mask = iotlb.addr_mask;
> 
> Should this be "page_mask &= iotlb.addr_mask"?
> 
> If you have multiple IOMMUs on top of each other (yeah, I know...) I
> think the smallest size should win. This is also consistent with the
> MIN in the line below.

I agree, but changing to "page_mask &= iotlb.addr_mask" will not be
enough; we also have to change the init value. Otherwise we will always
end up with 0xfff.

Maybe we could do as plen was handled before, i.e. setting the page_mask
init value to (hwaddr)(-1), and after the loop set it to
~TARGET_PAGE_MASK if it hasn't been changed.

Does that sound reasonable?

Thanks,
Maxime

> 
> Otherwise looks good.
> 
> Paolo
> 
>> +        plen = MIN(plen, (addr | iotlb.addr_mask) - addr + 1);
>>          if (!(iotlb.perm & (1 << is_write))) {
>>              goto translate_fail;
> 

* Re: [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-06 12:46     ` Maxime Coquelin
@ 2017-10-06 12:48       ` Paolo Bonzini
  2017-10-06 13:03         ` Maxime Coquelin
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2017-10-06 12:48 UTC
To: Maxime Coquelin, peterx, mst, jasowang, qemu-devel; +Cc: qemu-stable

On 06/10/2017 14:46, Maxime Coquelin wrote:
>>>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>>>                  | (addr & iotlb.addr_mask));
>>> -        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
>>> +        page_mask = iotlb.addr_mask;
>>
>> Should this be "page_mask &= iotlb.addr_mask"?
>>
>> If you have multiple IOMMUs on top of each other (yeah, I know...) I
>> think the smallest size should win. This is also consistent with the
>> MIN in the line below.
> 
> I agree, but changing to "page_mask &= iotlb.addr_mask" will not be
> enough; we also have to change the init value. Otherwise we will always
> end up with 0xfff.
> 
> Maybe we could do as plen was handled before, i.e. setting the page_mask
> init value to (hwaddr)(-1), and after the loop set it to
> ~TARGET_PAGE_MASK if it hasn't been changed.
> 
> Does that sound reasonable?

True that, in fact it makes sense for the "IOTLB entry" to represent all
of memory if there's no IOMMU at all.

Thanks,

Paolo

* Re: [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-06 12:48       ` Paolo Bonzini
@ 2017-10-06 13:03         ` Maxime Coquelin
  2017-10-09  5:17           ` Peter Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2017-10-06 13:03 UTC
To: Paolo Bonzini, peterx, mst, jasowang, qemu-devel; +Cc: qemu-stable

On 10/06/2017 02:48 PM, Paolo Bonzini wrote:
> On 06/10/2017 14:46, Maxime Coquelin wrote:
>>>>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>>>>                  | (addr & iotlb.addr_mask));
>>>> -        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
>>>> +        page_mask = iotlb.addr_mask;
>>>
>>> Should this be "page_mask &= iotlb.addr_mask"?
>>>
>>> If you have multiple IOMMUs on top of each other (yeah, I know...) I
>>> think the smallest size should win. This is also consistent with the
>>> MIN in the line below.
>>
>> I agree, but changing to "page_mask &= iotlb.addr_mask" will not be
>> enough; we also have to change the init value. Otherwise we will always
>> end up with 0xfff.
>>
>> Maybe we could do as plen was handled before, i.e. setting the page_mask
>> init value to (hwaddr)(-1), and after the loop set it to
>> ~TARGET_PAGE_MASK if it hasn't been changed.
>>
>> Does that sound reasonable?
> 
> True that, in fact it makes sense for the "IOTLB entry" to represent all
> of memory if there's no IOMMU at all.

Indeed, that makes sense as no iommu means identity mapping. It would
moreover improve performance, as the vhost backend will only have a
single IOTLB entry in its cache.

Maybe it is better to wait for Peter to understand the reason he limited
it to the target page size?

Thanks,
Maxime
> Thanks,
> 
> Paolo
> 

* Re: [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-06 13:03         ` Maxime Coquelin
@ 2017-10-09  5:17           ` Peter Xu
  2017-10-09  8:30             ` Maxime Coquelin
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Xu @ 2017-10-09 5:17 UTC
To: Maxime Coquelin; +Cc: Paolo Bonzini, mst, jasowang, qemu-devel, qemu-stable

On Fri, Oct 06, 2017 at 03:03:50PM +0200, Maxime Coquelin wrote:
> 
> 
> On 10/06/2017 02:48 PM, Paolo Bonzini wrote:
> >On 06/10/2017 14:46, Maxime Coquelin wrote:
> >>>>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
> >>>>                  | (addr & iotlb.addr_mask));
> >>>>-        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
> >>>>+        page_mask = iotlb.addr_mask;
> >>>
> >>>Should this be "page_mask &= iotlb.addr_mask"?
> >>>
> >>>If you have multiple IOMMUs on top of each other (yeah, I know...) I
> >>>think the smallest size should win. This is also consistent with the
> >>>MIN in the line below.
> >>
> >>I agree, but changing to "page_mask &= iotlb.addr_mask" will not be
> >>enough; we also have to change the init value. Otherwise we will always
> >>end up with 0xfff.
> >>
> >>Maybe we could do as plen was handled before, i.e. setting the page_mask
> >>init value to (hwaddr)(-1), and after the loop set it to
> >>~TARGET_PAGE_MASK if it hasn't been changed.
> >>
> >>Does that sound reasonable?
> >
> >True that, in fact it makes sense for the "IOTLB entry" to represent all
> >of memory if there's no IOMMU at all.
> 
> Indeed, that makes sense as no iommu means identity mapping. It would
> moreover improve performance, as the vhost backend will only have a
> single IOTLB entry in its cache.
> 
> Maybe it is better to wait for Peter to understand the reason he limited
> it to the target page size?

Sorry, just came back from a long holiday.

I was trying to use 4K as default to be safe (but yes the mask was not
correct, thanks for fixing that!), to make sure the translated range
covered by the IOMMUTLBEntry will always be safe to access (I thought
that was how IOTLB was defined, but I may be wrong). Using (-1) is
good especially from performance POV as long as the caller knows the
real memory boundary, but I'm not sure whether it'll break the IOTLB
semantics somehow.

If we want to make it -1 for transparent mappings, maybe worth
commenting it in definition of IOMMUTLBEntry.page_mask?

(Btw, thanks again for moving these patches forward; I tried to, but I
 failed :)

-- 
Peter Xu

* Re: [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-09  5:17           ` Peter Xu
@ 2017-10-09  8:30             ` Maxime Coquelin
  2017-10-09  8:47               ` Peter Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Maxime Coquelin @ 2017-10-09 8:30 UTC
To: Peter Xu; +Cc: Paolo Bonzini, mst, jasowang, qemu-devel, qemu-stable

Hi Peter,

On 10/09/2017 07:17 AM, Peter Xu wrote:
> On Fri, Oct 06, 2017 at 03:03:50PM +0200, Maxime Coquelin wrote:
>>
>>
>> On 10/06/2017 02:48 PM, Paolo Bonzini wrote:
>>> On 06/10/2017 14:46, Maxime Coquelin wrote:
>>>>>>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
>>>>>>                  | (addr & iotlb.addr_mask));
>>>>>> -        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
>>>>>> +        page_mask = iotlb.addr_mask;
>>>>>
>>>>> Should this be "page_mask &= iotlb.addr_mask"?
>>>>>
>>>>> If you have multiple IOMMUs on top of each other (yeah, I know...) I
>>>>> think the smallest size should win. This is also consistent with the
>>>>> MIN in the line below.
>>>>
>>>> I agree, but changing to "page_mask &= iotlb.addr_mask" will not be
>>>> enough; we also have to change the init value. Otherwise we will always
>>>> end up with 0xfff.
>>>>
>>>> Maybe we could do as plen was handled before, i.e. setting the page_mask
>>>> init value to (hwaddr)(-1), and after the loop set it to
>>>> ~TARGET_PAGE_MASK if it hasn't been changed.
>>>>
>>>> Does that sound reasonable?
>>>
>>> True that, in fact it makes sense for the "IOTLB entry" to represent all
>>> of memory if there's no IOMMU at all.
>>
>> Indeed, that makes sense as no iommu means identity mapping. It would
>> moreover improve performance, as the vhost backend will only have a
>> single IOTLB entry in its cache.
>>
>> Maybe it is better to wait for Peter to understand the reason he limited
>> it to the target page size?
> 
> Sorry, just came back from a long holiday.

No problem.

> I was trying to use 4K as default to be safe (but yes the mask was not
> correct, thanks for fixing that!), to make sure the translated range
> covered by the IOMMUTLBEntry will always be safe to access (I thought
> that was how IOTLB was defined, but I may be wrong). Using (-1) is
> good especially from performance POV as long as the caller knows the
> real memory boundary, but I'm not sure whether it'll break the IOTLB
> semantics somehow.

Good point.
Maybe it would be safer to wrap the IOTLB entry to the memory region?

> If we want to make it -1 for transparent mappings, maybe worth
> commenting it in definition of IOMMUTLBEntry.page_mask?

Yes, that makes sense.

> (Btw, thanks again for moving these patches forward; I tried to, but I
>  failed :)

I'm partly at fault for not having reviewed/tested it in the first
place ;)

Thanks,
Maxime

* Re: [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate
  2017-10-09  8:30             ` Maxime Coquelin
@ 2017-10-09  8:47               ` Peter Xu
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Xu @ 2017-10-09 8:47 UTC
To: Maxime Coquelin; +Cc: Paolo Bonzini, mst, jasowang, qemu-devel, qemu-stable

On Mon, Oct 09, 2017 at 10:30:07AM +0200, Maxime Coquelin wrote:
> Hi Peter,
> 
> On 10/09/2017 07:17 AM, Peter Xu wrote:
> >On Fri, Oct 06, 2017 at 03:03:50PM +0200, Maxime Coquelin wrote:
> >>
> >>
> >>On 10/06/2017 02:48 PM, Paolo Bonzini wrote:
> >>>On 06/10/2017 14:46, Maxime Coquelin wrote:
> >>>>>>          addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
> >>>>>>                  | (addr & iotlb.addr_mask));
> >>>>>>-        *plen = MIN(*plen, (addr | iotlb.addr_mask) - addr + 1);
> >>>>>>+        page_mask = iotlb.addr_mask;
> >>>>>
> >>>>>Should this be "page_mask &= iotlb.addr_mask"?
> >>>>>
> >>>>>If you have multiple IOMMUs on top of each other (yeah, I know...) I
> >>>>>think the smallest size should win. This is also consistent with the
> >>>>>MIN in the line below.
> >>>>
> >>>>I agree, but changing to "page_mask &= iotlb.addr_mask" will not be
> >>>>enough; we also have to change the init value. Otherwise we will always
> >>>>end up with 0xfff.
> >>>>
> >>>>Maybe we could do as plen was handled before, i.e. setting the page_mask
> >>>>init value to (hwaddr)(-1), and after the loop set it to
> >>>>~TARGET_PAGE_MASK if it hasn't been changed.
> >>>>
> >>>>Does that sound reasonable?
> >>>
> >>>True that, in fact it makes sense for the "IOTLB entry" to represent all
> >>>of memory if there's no IOMMU at all.
> >>
> >>Indeed, that makes sense as no iommu means identity mapping. It would
> >>moreover improve performance, as the vhost backend will only have a
> >>single IOTLB entry in its cache.
> >>
> >>Maybe it is better to wait for Peter to understand the reason he limited
> >>it to the target page size?
> >
> >Sorry, just came back from a long holiday.
> 
> No problem.
> 
> >I was trying to use 4K as default to be safe (but yes the mask was not
> >correct, thanks for fixing that!), to make sure the translated range
> >covered by the IOMMUTLBEntry will always be safe to access (I thought
> >that was how IOTLB was defined, but I may be wrong). Using (-1) is
> >good especially from performance POV as long as the caller knows the
> >real memory boundary, but I'm not sure whether it'll break the IOTLB
> >semantics somehow.
> 
> Good point.
> Maybe it would be safer to wrap the IOTLB entry to the memory region?

The problem is that the MR size may not be aligned with address masks.
I find it less meaningful if we have to further craft a smaller mask by
hand.

And wait, since you mentioned MR... I think using -1 here may be wrong.
Although the current MR (the MR that covers the address to be
translated) is transparently mapped, it does not mean the whole address
space is transparently mapped. SPAPR should be a good example: some
ranges of the address space are mapped by the IOMMU but some are not.

> 
> >If we want to make it -1 for transparent mappings, maybe worth
> >commenting it in definition of IOMMUTLBEntry.page_mask?
> 
> Yes, that makes sense.

Given the above, I would vote for your previous solution: first use -1
to get the minimum mask, then switch to PAGE_MASK before returning when
needed.

> 
> >(Btw, thanks again for moving these patches forward; I tried to, but I
> >  failed :)
> 
> I'm partly at fault for not having reviewed/tested it in the first
> place ;)

:-)

Thanks!

> 
> Thanks,
> Maxime

-- 
Peter Xu

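The approach the thread converges on (start from an all-ones mask, AND in each IOMMU's addr_mask so the smallest page wins, and fall back to the target page mask if no IOMMU was crossed) can be sketched outside of QEMU as follows. This is an illustration only, not the code that was eventually applied: `hwaddr` is a local typedef, `SKETCH_DEFAULT_PAGE_MASK` assumes 4K target pages in place of `~TARGET_PAGE_MASK`, and `accumulate_page_mask()` is a hypothetical helper that takes the per-level masks as an array instead of walking memory regions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t hwaddr; /* local stand-in for QEMU's hwaddr */

/* Assumption: 4K target pages, standing in for ~TARGET_PAGE_MASK. */
#define SKETCH_DEFAULT_PAGE_MASK ((hwaddr)0xfff)

/*
 * Combine the addr_mask of each stacked IOMMU level.
 * Starting from all ones and using "&=" keeps the smallest page;
 * starting from 0xfff instead would always yield 0xfff.
 */
static hwaddr accumulate_page_mask(const hwaddr *addr_masks, size_t n)
{
    hwaddr page_mask = (hwaddr)-1;

    for (size_t i = 0; i < n; i++) {
        /* each mask is assumed to be of the form 2^k - 1 */
        assert((addr_masks[i] & (addr_masks[i] + 1)) == 0);
        page_mask &= addr_masks[i];
    }

    /* No IOMMU narrowed the mask: fall back to the default page mask. */
    if (page_mask == (hwaddr)-1) {
        page_mask = SKETCH_DEFAULT_PAGE_MASK;
    }

    return page_mask;
}
```

One limitation of this sketch is that an IOMMU reporting an all-ones addr_mask is indistinguishable from "no IOMMU" and also gets the default mask; whether that matters depends on the IOTLB semantics discussed above.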
* [Qemu-devel] [PATCH v2 2/2] exec: simplify address_space_get_iotlb_entry
  2017-10-06 11:46 [Qemu-devel] [PATCH v2 0/2] exec: further refine address_space_get_iotlb_entry() Maxime Coquelin
  2017-10-06 11:46 ` [Qemu-devel] [PATCH v2 1/2] exec: add page_mask for flatview_do_translate Maxime Coquelin
@ 2017-10-06 11:46 ` Maxime Coquelin
  1 sibling, 0 replies; 10+ messages in thread
From: Maxime Coquelin @ 2017-10-06 11:46 UTC
To: peterx, pbonzini, mst, jasowang, qemu-devel; +Cc: qemu-stable, Maxime Coquelin

From: Peter Xu <peterx@redhat.com>

This patch lets address_space_get_iotlb_entry() use the newly
introduced page_mask parameter in flatview_do_translate(). Then we can
be sure the IOTLB is aligned to the page mask. We should also now
support huge pages properly, which was lost when a764040 was introduced.

Fixes: a764040 ("exec: abstract address_space_do_translate()")
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
---
 exec.c | 31 ++++++++++---------------------
 1 file changed, 10 insertions(+), 21 deletions(-)

diff --git a/exec.c b/exec.c
index a5f3828445..f6baa4711e 100644
--- a/exec.c
+++ b/exec.c
@@ -551,14 +551,14 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr,
                                             bool is_write)
 {
     MemoryRegionSection section;
-    hwaddr xlat, plen;
+    hwaddr xlat, page_mask;
 
-    /* Try to get maximum page mask during translation. */
-    plen = (hwaddr)-1;
-
-    /* This can never be MMIO. */
-    section = flatview_do_translate(address_space_to_flatview(as), addr,
-                                    &xlat, &plen, NULL, is_write, false, &as);
+    /*
+     * This can never be MMIO, and we don't really care about plen,
+     * but page mask.
+     */
+    section = flatview_do_translate(address_space_to_flatview(as), addr, &xlat,
+                                    NULL, &page_mask, is_write, false, &as);
 
     /* Illegal translation */
     if (section.mr == &io_mem_unassigned) {
@@ -569,22 +569,11 @@ IOMMUTLBEntry address_space_get_iotlb_entry(AddressSpace *as, hwaddr addr,
     xlat += section.offset_within_address_space -
         section.offset_within_region;
 
-    if (plen == (hwaddr)-1) {
-        /*
-         * We use default page size here. Logically it only happens
-         * for identity mappings.
-         */
-        plen = TARGET_PAGE_SIZE;
-    }
-
-    /* Convert to address mask */
-    plen -= 1;
-
     return (IOMMUTLBEntry) {
         .target_as = as,
-        .iova = addr & ~plen,
-        .translated_addr = xlat & ~plen,
-        .addr_mask = plen,
+        .iova = addr & ~page_mask,
+        .translated_addr = xlat & ~page_mask,
+        .addr_mask = page_mask,
         /* IOTLBs are for DMAs, and DMA only allows on RAMs. */
         .perm = IOMMU_RW,
     };
-- 
2.13.6