* [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access
@ 2018-06-12 19:05 Eric Auger
2018-06-13 3:15 ` Peter Xu
2018-06-13 9:56 ` Paolo Bonzini
0 siblings, 2 replies; 6+ messages in thread
From: Eric Auger @ 2018-06-12 19:05 UTC (permalink / raw)
To: eric.auger.pro, eric.auger, qemu-devel, pbonzini; +Cc: peterx
When an IOMMUMemoryRegion is in front of a virtio device,
address_space_cache_init() does not set cache->ptr, as the memory
region is not RAM. However, when the device performs an access,
we end up in glue(), which performs the translation and then uses
MAP_RAM. The latter uses the unset ptr and returns a wrong value,
which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
for instance. Let's test whether cache->ptr is set and, if it is
not, use the old macro definition. This fixes the use cases
featuring a vIOMMU (Intel and ARM SMMU), which currently lead to
a SIGSEGV.
Fixes: 48564041a73a (exec: reintroduce MemoryRegion caching)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
I am not sure whether this breaks any targeted optimization,
but at least it removes the SIGSEGV.
---
exec.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/exec.c b/exec.c
index f6645ed..46fbd25 100644
--- a/exec.c
+++ b/exec.c
@@ -3800,7 +3800,9 @@ address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr,
#define SUFFIX _cached_slow
#define TRANSLATE(...) address_space_translate_cached(cache, __VA_ARGS__)
#define IS_DIRECT(mr, is_write) memory_access_is_direct(mr, is_write)
-#define MAP_RAM(mr, ofs) (cache->ptr + (ofs - cache->xlat))
+#define MAP_RAM(mr, ofs) (cache->ptr ? \
+ (cache->ptr + (ofs - cache->xlat)) : \
+ qemu_map_ram_ptr((mr)->ram_block, ofs))
#define INVALIDATE(mr, ofs, len) invalidate_and_set_dirty(mr, ofs, len)
#define RCU_READ_LOCK() ((void)0)
#define RCU_READ_UNLOCK() ((void)0)
--
2.5.5
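
As a rough illustration of what the patched MAP_RAM does, here is a minimal standalone sketch. It is not QEMU code: ToyCache, ToyRamBlock, toy_map_ram_ptr() and toy_map_ram() are hypothetical stand-ins for MemoryRegionCache, RAMBlock, qemu_map_ram_ptr() and the macro, reduced to the fields the patch touches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for QEMU types; every name here is hypothetical. */
typedef uint64_t hwaddr;

typedef struct ToyCache {
    uint8_t *ptr;   /* host pointer, NULL when the cached region is not RAM */
    hwaddr xlat;    /* offset of the cached window inside the RAM block */
} ToyCache;

typedef struct ToyRamBlock {
    uint8_t *host;  /* start of the block in host memory */
} ToyRamBlock;

/* Plays the role of qemu_map_ram_ptr(): block base plus offset. */
static uint8_t *toy_map_ram_ptr(ToyRamBlock *block, hwaddr ofs)
{
    return block->host + ofs;
}

/*
 * Plays the role of the patched MAP_RAM: use the cached pointer when it
 * is set, otherwise fall back to mapping through the RAM block.
 */
static uint8_t *toy_map_ram(ToyCache *cache, ToyRamBlock *block, hwaddr ofs)
{
    return cache->ptr ? cache->ptr + (ofs - cache->xlat)
                      : toy_map_ram_ptr(block, ofs);
}
```

With a NULL ptr (the vIOMMU case) the sketch resolves the offset through the block, where the old definition would have done arithmetic on a NULL pointer.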
* Re: [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access
2018-06-12 19:05 [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access Eric Auger
@ 2018-06-13 3:15 ` Peter Xu
2018-06-13 6:31 ` Auger Eric
2018-06-13 9:56 ` Paolo Bonzini
1 sibling, 1 reply; 6+ messages in thread
From: Peter Xu @ 2018-06-13 3:15 UTC (permalink / raw)
To: Eric Auger; +Cc: eric.auger.pro, qemu-devel, pbonzini
On Tue, Jun 12, 2018 at 09:05:25PM +0200, Eric Auger wrote:
> When an IOMMUMemoryRegion is in front of a virtio device,
> address_space_cache_init does not set cache->ptr as the memory
> region is not RAM. However when the device performs an access,
> we end up in glue() which performs the translation and then uses
> MAP_RAM. This latter uses the unset ptr and returns a wrong value
> which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
> for instance. Let's test whether the cache->ptr is set, and in
> the negative use the old macro definition. This fixes the
> use cases featuring vIOMMU (Intel and ARM SMMU) which lead to
> a SIGSEGV.
>
> Fixes: 48564041a73a (exec: reintroduce MemoryRegion caching)
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> I am not sure whether it doesn't break any targeted optimization
> but at least it removes the SIGSEGV.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
> exec.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/exec.c b/exec.c
> index f6645ed..46fbd25 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -3800,7 +3800,9 @@ address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr,
> #define SUFFIX _cached_slow
> #define TRANSLATE(...) address_space_translate_cached(cache, __VA_ARGS__)
> #define IS_DIRECT(mr, is_write) memory_access_is_direct(mr, is_write)
> -#define MAP_RAM(mr, ofs) (cache->ptr + (ofs - cache->xlat))
> +#define MAP_RAM(mr, ofs) (cache->ptr ? \
> + (cache->ptr + (ofs - cache->xlat)) : \
> + qemu_map_ram_ptr((mr)->ram_block, ofs))
A pure question: if the MR is not RAM (I think the only such case for
virtio should be an IOMMU MR), then why do we call MAP_RAM() at all?
A glue() example:
void glue(address_space_stb, SUFFIX)(ARG1_DECL,
hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
{
uint8_t *ptr;
MemoryRegion *mr;
hwaddr l = 1;
hwaddr addr1;
MemTxResult r;
bool release_lock = false;
RCU_READ_LOCK();
mr = TRANSLATE(addr, &addr1, &l, true, attrs);
if (!IS_DIRECT(mr, true)) { <----------------- [1]
release_lock |= prepare_mmio_access(mr);
r = memory_region_dispatch_write(mr, addr1, val, 1, attrs);
} else {
/* RAM case */
ptr = MAP_RAM(mr, addr1);
stb_p(ptr, val);
INVALIDATE(mr, addr1, 1);
r = MEMTX_OK;
}
if (result) {
*result = r;
}
if (release_lock) {
qemu_mutex_unlock_iothread();
}
RCU_READ_UNLOCK();
}
At [1] we should first check whether the access is direct.
AFAIU an IOMMU MR should not be direct, so it should take the slow path
rather than call MAP_RAM()?
While at it, I have another (pure) question about the address space
cache. I don't think it's urgent, since I think it's never a problem
for virtio, but I'm asking anyway...
Still taking the stb example:
static inline void address_space_stb_cached(MemoryRegionCache *cache,
hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
{
assert(addr < cache->len); <----------------------------- [2]
if (likely(cache->ptr)) {
stb_p(cache->ptr + addr, val);
} else {
address_space_stb_cached_slow(cache, addr, val, attrs, result);
}
}
Here at [2], what if the cached region is smaller than the length
passed to address_space_cache_init()? AFAIU the "len" provided to
address_space_cache_init() can actually shrink (though for virtio it
never should) when we do:
l = len;
...
cache->mrs = *address_space_translate_internal(d, addr, &cache->xlat, &l, true);
...
cache->len = l;
So I am not sure whether we should assert here. Instead, we could run
the fast path only if the address falls within the cached region, say:
static inline void address_space_stb_cached(MemoryRegionCache *cache,
hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
{
if (likely(cache->ptr && addr < cache->len)) {
stb_p(cache->ptr + addr, val);
} else {
address_space_stb_cached_slow(cache, addr, val, attrs, result);
}
}
Or we could add a check in address_space_cache_init() to make sure
the region does not shrink.
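
The second option could look roughly like the sketch below. It is a toy model, not the real address_space_cache_init(): ToyCache, toy_cache_init() and toy_cache_init_checked() are hypothetical names, and the translation step is reduced to clamping the requested length to an assumed region length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t hwaddr;

/* Hypothetical, stripped-down cache: only the length matters here. */
typedef struct ToyCache {
    hwaddr len;   /* length actually covered by the cache */
} ToyCache;

/*
 * Stand-in for the init step: the translation may cover less than the
 * caller asked for, so the cached length is clamped to region_len.
 */
static hwaddr toy_cache_init(ToyCache *cache, hwaddr region_len,
                             hwaddr requested_len)
{
    hwaddr l = requested_len;

    if (l > region_len) {
        l = region_len;
    }
    cache->len = l;
    return l;
}

/* Reports whether the cache covers everything the caller requested. */
static bool toy_cache_init_checked(ToyCache *cache, hwaddr region_len,
                                   hwaddr requested_len)
{
    return toy_cache_init(cache, region_len, requested_len) == requested_len;
}
```

A caller that gets false back knows the region shrank and can refuse to use the cached accessors for the missing tail.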
Regards,
--
Peter Xu
* Re: [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access
2018-06-13 3:15 ` Peter Xu
@ 2018-06-13 6:31 ` Auger Eric
2018-06-13 6:53 ` Peter Xu
0 siblings, 1 reply; 6+ messages in thread
From: Auger Eric @ 2018-06-13 6:31 UTC (permalink / raw)
To: Peter Xu; +Cc: pbonzini, qemu-devel, eric.auger.pro
Hi Peter,
On 06/13/2018 05:15 AM, Peter Xu wrote:
> On Tue, Jun 12, 2018 at 09:05:25PM +0200, Eric Auger wrote:
>> When an IOMMUMemoryRegion is in front of a virtio device,
>> address_space_cache_init does not set cache->ptr as the memory
>> region is not RAM. However when the device performs an access,
>> we end up in glue() which performs the translation and then uses
>> MAP_RAM. This latter uses the unset ptr and returns a wrong value
>> which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
>> for instance. Let's test whether the cache->ptr is set, and in
>> the negative use the old macro definition. This fixes the
>> use cases featuring vIOMMU (Intel and ARM SMMU) which lead to
>> a SIGSEGV.
>>
>> Fixes: 48564041a73a (exec: reintroduce MemoryRegion caching)
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> I am not sure whether it doesn't break any targeted optimization
>> but at least it removes the SIGSEGV.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>> exec.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/exec.c b/exec.c
>> index f6645ed..46fbd25 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -3800,7 +3800,9 @@ address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr,
>> #define SUFFIX _cached_slow
>> #define TRANSLATE(...) address_space_translate_cached(cache, __VA_ARGS__)
>> #define IS_DIRECT(mr, is_write) memory_access_is_direct(mr, is_write)
>> -#define MAP_RAM(mr, ofs) (cache->ptr + (ofs - cache->xlat))
>> +#define MAP_RAM(mr, ofs) (cache->ptr ? \
>> + (cache->ptr + (ofs - cache->xlat)) : \
>> + qemu_map_ram_ptr((mr)->ram_block, ofs))
>
> A pure question: if the MR is not a RAM (I think the only case for
> virtio case should be an IOMMU MR), then why we'll call MAP_RAM()
> after all? An glue() example:
>
> void glue(address_space_stb, SUFFIX)(ARG1_DECL,
> hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
> {
> uint8_t *ptr;
> MemoryRegion *mr;
> hwaddr l = 1;
> hwaddr addr1;
> MemTxResult r;
> bool release_lock = false;
>
> RCU_READ_LOCK();
> mr = TRANSLATE(addr, &addr1, &l, true, attrs);
> if (!IS_DIRECT(mr, true)) { <----------------- [1]
After the translation, mr points to the actual RAM region, downstream of
the IOMMU MR, and this one is direct. addr1 is the offset within the RAM
region, if I am not wrong.
Am I missing something?
Thanks
Eric
> release_lock |= prepare_mmio_access(mr);
> r = memory_region_dispatch_write(mr, addr1, val, 1, attrs);
> } else {
> /* RAM case */
> ptr = MAP_RAM(mr, addr1);
> stb_p(ptr, val);
> INVALIDATE(mr, addr1, 1);
> r = MEMTX_OK;
> }
> if (result) {
> *result = r;
> }
> if (release_lock) {
> qemu_mutex_unlock_iothread();
> }
> RCU_READ_UNLOCK();
> }
>
> At [1] we should check first against whether it's direct after all.
> AFAIU IOMMU MR should not be direct then it'll go the slow path rather
> than calling MAP_RAM()?
>
> Since at it, I have another (pure) question about the address space
> cache. I don't think it's urgent since I think it's never a problem
> for virtio, but I'm still asking anyways...
>
> Still taking the stb example:
>
> static inline void address_space_stb_cached(MemoryRegionCache *cache,
> hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
> {
> assert(addr < cache->len); <----------------------------- [2]
> if (likely(cache->ptr)) {
> stb_p(cache->ptr + addr, val);
> } else {
> address_space_stb_cached_slow(cache, addr, val, attrs, result);
> }
> }
>
> Here at [2] what if the region cached is smaller than provided when
> doing address_space_cache_init()? AFAIU the "len" provided to
> address_space_cache_init() can actually shrink (though for virtio it
> should never) when do:
>
> l = len;
> ...
> cache->mrs = *address_space_translate_internal(d, addr, &cache->xlat, &l, true);
> ...
> cache->len = l;
>
> And here not sure whether we should not assert, instead we only run
> the fast path if the address falls into the cache region, say:
>
> static inline void address_space_stb_cached(MemoryRegionCache *cache,
> hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
> {
> if (likely(cache->ptr && addr < cache->len)) {
> stb_p(cache->ptr + addr, val);
> } else {
> address_space_stb_cached_slow(cache, addr, val, attrs, result);
> }
> }
>
> Or we should add a check in address_space_cache_init() to make sure
> the region won't shrink.
>
> Regards,
>
* Re: [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access
2018-06-13 6:31 ` Auger Eric
@ 2018-06-13 6:53 ` Peter Xu
0 siblings, 0 replies; 6+ messages in thread
From: Peter Xu @ 2018-06-13 6:53 UTC (permalink / raw)
To: Auger Eric; +Cc: pbonzini, qemu-devel, eric.auger.pro
On Wed, Jun 13, 2018 at 08:31:31AM +0200, Auger Eric wrote:
> Hi Peter,
>
> On 06/13/2018 05:15 AM, Peter Xu wrote:
> > On Tue, Jun 12, 2018 at 09:05:25PM +0200, Eric Auger wrote:
> >> When an IOMMUMemoryRegion is in front of a virtio device,
> >> address_space_cache_init does not set cache->ptr as the memory
> >> region is not RAM. However when the device performs an access,
> >> we end up in glue() which performs the translation and then uses
> >> MAP_RAM. This latter uses the unset ptr and returns a wrong value
> >> which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
> >> for instance. Let's test whether the cache->ptr is set, and in
> >> the negative use the old macro definition. This fixes the
> >> use cases featuring vIOMMU (Intel and ARM SMMU) which lead to
> >> a SIGSEGV.
> >>
> >> Fixes: 48564041a73a (exec: reintroduce MemoryRegion caching)
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >>
> >> I am not sure whether it doesn't break any targeted optimization
> >> but at least it removes the SIGSEGV.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> ---
> >> exec.c | 4 +++-
> >> 1 file changed, 3 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/exec.c b/exec.c
> >> index f6645ed..46fbd25 100644
> >> --- a/exec.c
> >> +++ b/exec.c
> >> @@ -3800,7 +3800,9 @@ address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr,
> >> #define SUFFIX _cached_slow
> >> #define TRANSLATE(...) address_space_translate_cached(cache, __VA_ARGS__)
> >> #define IS_DIRECT(mr, is_write) memory_access_is_direct(mr, is_write)
> >> -#define MAP_RAM(mr, ofs) (cache->ptr + (ofs - cache->xlat))
> >> +#define MAP_RAM(mr, ofs) (cache->ptr ? \
> >> + (cache->ptr + (ofs - cache->xlat)) : \
> >> + qemu_map_ram_ptr((mr)->ram_block, ofs))
> >
> > A pure question: if the MR is not a RAM (I think the only case for
> > virtio case should be an IOMMU MR), then why we'll call MAP_RAM()
> > after all? An glue() example:
> >
> > void glue(address_space_stb, SUFFIX)(ARG1_DECL,
> > hwaddr addr, uint32_t val, MemTxAttrs attrs, MemTxResult *result)
> > {
> > uint8_t *ptr;
> > MemoryRegion *mr;
> > hwaddr l = 1;
> > hwaddr addr1;
> > MemTxResult r;
> > bool release_lock = false;
> >
> > RCU_READ_LOCK();
> > mr = TRANSLATE(addr, &addr1, &l, true, attrs);
> > if (!IS_DIRECT(mr, true)) { <----------------- [1]
> after the translate, mr points to the actual RAM region, downstream to
> the IOMMU MR. And this one is direct. addr1 is the offset within the RAM
> region if I am not wrong.
>
> Am i missing something?
I think you are right. Then the change seems reasonable to me.
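
To spell out the point for later readers, here is a toy model of why the check at [1] sees a direct region. None of this is QEMU code: ToyRegion, toy_translate() and toy_is_direct() are hypothetical, and the IOMMU mapping is assumed to be a simple fixed offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t hwaddr;

typedef enum { TOY_RAM, TOY_MMIO, TOY_IOMMU } ToyKind;

/* Hypothetical region: an IOMMU region forwards to a target region. */
typedef struct ToyRegion {
    ToyKind kind;
    struct ToyRegion *target; /* TOY_IOMMU only: where the mapping lands */
    hwaddr offset;            /* TOY_IOMMU only: toy linear translation */
} ToyRegion;

/*
 * Walks through any IOMMU regions and returns the region the access
 * finally lands in, with *xlat set to the offset inside that region.
 */
static ToyRegion *toy_translate(ToyRegion *mr, hwaddr addr, hwaddr *xlat)
{
    while (mr->kind == TOY_IOMMU) {
        addr += mr->offset;
        mr = mr->target;
    }
    *xlat = addr;
    return mr;
}

/* In this toy model only RAM can be accessed through a host pointer. */
static bool toy_is_direct(const ToyRegion *mr)
{
    return mr->kind == TOY_RAM;
}
```

The region the cache was initialised on is the IOMMU one (not direct, so no cached pointer), while the region returned by the translation is RAM (direct), which is how the RAM branch ends up running with an unset cache->ptr.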
Thanks,
--
Peter Xu
* Re: [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access
2018-06-12 19:05 [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access Eric Auger
2018-06-13 3:15 ` Peter Xu
@ 2018-06-13 9:56 ` Paolo Bonzini
2018-06-13 13:20 ` Auger Eric
1 sibling, 1 reply; 6+ messages in thread
From: Paolo Bonzini @ 2018-06-13 9:56 UTC (permalink / raw)
To: Eric Auger, eric.auger.pro, qemu-devel; +Cc: peterx
On 12/06/2018 21:05, Eric Auger wrote:
> When an IOMMUMemoryRegion is in front of a virtio device,
> address_space_cache_init does not set cache->ptr as the memory
> region is not RAM. However when the device performs an access,
> we end up in glue() which performs the translation and then uses
> MAP_RAM. This latter uses the unset ptr and returns a wrong value
> which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
> for instance. Let's test whether the cache->ptr is set, and in
> the negative use the old macro definition. This fixes the
> use cases featuring vIOMMU (Intel and ARM SMMU) which lead to
> a SIGSEGV.
>
> Fixes: 48564041a73a (exec: reintroduce MemoryRegion caching)
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>
> ---
>
> I am not sure whether it doesn't break any targeted optimization
> but at least it removes the SIGSEGV.
Actually, cache->ptr is always NULL here, since this is the slow path
(there is even an assertion in address_space_translate_cached); so
MAP_RAM can be even simpler and, apart from the bugfix, I think we
should remove all of IS_DIRECT, MAP_RAM and INVALIDATE as a follow-up.
They were needed in the original implementation of MemoryRegionCache,
which only worked with RAM regions, but they no longer are now that the
RAM case is open-coded in include/exec/memory_ldst_cached.inc.h.
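
A minimal sketch of that split, under stated assumptions: ToyCache, ToyRamBlock, toy_ldub_cached() and toy_ldub_cached_slow() are hypothetical, and the slow path's translation is reduced to adding the cached xlat. The inline accessor handles the RAM case itself, so the slow path can assert that the pointer is NULL and never needs the cached-pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t hwaddr;

/* Toy stand-ins; every name here is hypothetical. */
typedef struct ToyCache {
    uint8_t *ptr;   /* set only when the cached region is RAM */
    hwaddr xlat;    /* offset of the cached window in the RAM block */
} ToyCache;

typedef struct ToyRamBlock {
    uint8_t *host;  /* start of the block in host memory */
} ToyRamBlock;

/* Slow path: only reached without a cached pointer, so assert that. */
static uint8_t toy_ldub_cached_slow(const ToyCache *cache,
                                    ToyRamBlock *block, hwaddr addr)
{
    assert(!cache->ptr);
    /* Toy translation: cached offset plus the address in the window. */
    return block->host[cache->xlat + addr];
}

/* Inline accessor: the RAM case is handled here, open-coded. */
static uint8_t toy_ldub_cached(const ToyCache *cache,
                               ToyRamBlock *block, hwaddr addr)
{
    if (cache->ptr) {
        return cache->ptr[addr];
    }
    return toy_ldub_cached_slow(cache, block, addr);
}
```

Both paths return the same byte for the same window; the only difference is whether the host pointer was cached up front.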
Thanks,
Paolo
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
> exec.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/exec.c b/exec.c
> index f6645ed..46fbd25 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -3800,7 +3800,9 @@ address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr,
> #define SUFFIX _cached_slow
> #define TRANSLATE(...) address_space_translate_cached(cache, __VA_ARGS__)
> #define IS_DIRECT(mr, is_write) memory_access_is_direct(mr, is_write)
> -#define MAP_RAM(mr, ofs) (cache->ptr + (ofs - cache->xlat))
> +#define MAP_RAM(mr, ofs) (cache->ptr ? \
> + (cache->ptr + (ofs - cache->xlat)) : \
> + qemu_map_ram_ptr((mr)->ram_block, ofs))
> #define INVALIDATE(mr, ofs, len) invalidate_and_set_dirty(mr, ofs, len)
> #define RCU_READ_LOCK() ((void)0)
> #define RCU_READ_UNLOCK() ((void)0)
>
* Re: [Qemu-devel] [PATCH] exec: Fix MAP_RAM for cached access
2018-06-13 9:56 ` Paolo Bonzini
@ 2018-06-13 13:20 ` Auger Eric
0 siblings, 0 replies; 6+ messages in thread
From: Auger Eric @ 2018-06-13 13:20 UTC (permalink / raw)
To: Paolo Bonzini, eric.auger.pro, qemu-devel; +Cc: peterx
Hi Paolo,
On 06/13/2018 11:56 AM, Paolo Bonzini wrote:
> On 12/06/2018 21:05, Eric Auger wrote:
>> When an IOMMUMemoryRegion is in front of a virtio device,
>> address_space_cache_init does not set cache->ptr as the memory
>> region is not RAM. However when the device performs an access,
>> we end up in glue() which performs the translation and then uses
>> MAP_RAM. This latter uses the unset ptr and returns a wrong value
>> which leads to a SIGSEGV in address_space_lduw_internal_cached_slow,
>> for instance. Let's test whether the cache->ptr is set, and in
>> the negative use the old macro definition. This fixes the
>> use cases featuring vIOMMU (Intel and ARM SMMU) which lead to
>> a SIGSEGV.
>>
>> Fixes: 48564041a73a (exec: reintroduce MemoryRegion caching)
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> ---
>>
>> I am not sure whether it doesn't break any targeted optimization
>> but at least it removes the SIGSEGV.
>
> Actually cache->ptr is always NULL here, since this is the slow path
> (there is even an assertion in address_space_translate_cached); so
> MAP_RAM can be even simpler and, apart from the bugfix, I think we
> should remove all of IS_DIRECT, MAP_RAM and INVALIDATE as a follow-up.
> They were needed in the original implementation of MemoryRegionCache,
> which only worked with RAM regions but not anymore now that the RAM case
> is open-coded in include/exec/memory_ldst_cached.inc.h.
OK, I have respun with your suggestions. I hope it matches your expectations.
Thanks
Eric
>
> Thanks,
>
> Paolo
>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>> exec.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/exec.c b/exec.c
>> index f6645ed..46fbd25 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -3800,7 +3800,9 @@ address_space_write_cached_slow(MemoryRegionCache *cache, hwaddr addr,
>> #define SUFFIX _cached_slow
>> #define TRANSLATE(...) address_space_translate_cached(cache, __VA_ARGS__)
>> #define IS_DIRECT(mr, is_write) memory_access_is_direct(mr, is_write)
>> -#define MAP_RAM(mr, ofs) (cache->ptr + (ofs - cache->xlat))
>> +#define MAP_RAM(mr, ofs) (cache->ptr ? \
>> + (cache->ptr + (ofs - cache->xlat)) : \
>> + qemu_map_ram_ptr((mr)->ram_block, ofs))
>> #define INVALIDATE(mr, ofs, len) invalidate_and_set_dirty(mr, ofs, len)
>> #define RCU_READ_LOCK() ((void)0)
>> #define RCU_READ_UNLOCK() ((void)0)
>>
>
>