* [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
@ 2015-06-11 20:30 Hervé Poussineau
2015-06-11 23:30 ` Aurelien Jarno
0 siblings, 1 reply; 8+ messages in thread
From: Hervé Poussineau @ 2015-06-11 20:30 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Leon Alrae, Aurelien Jarno, Hervé Poussineau
This works around a bug in memory management.
To reproduce the problem, try to start the Windows NT 4.0/MIPS installer.
After loading some files, you should see a screen saying
"To set up Windows NT now, press ENTER."
Instead, you are greeted with an IRQL_NOT_LESS_OR_EQUAL bugcheck or an
Unknown Hard Error c0000221.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
---
hw/dma/rc4030.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/hw/dma/rc4030.c b/hw/dma/rc4030.c
index 3efa6de..d265d6c 100644
--- a/hw/dma/rc4030.c
+++ b/hw/dma/rc4030.c
@@ -681,6 +681,7 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
rc4030State *s = opaque;
hwaddr dma_addr;
int dev_to_mem;
+ int i;
s->dma_regs[n][DMA_REG_ENABLE] &= ~(DMA_FLAG_TC_INTR | DMA_FLAG_MEM_INTR | DMA_FLAG_ADDR_INTR);
@@ -699,8 +700,22 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
dma_addr = s->dma_regs[n][DMA_REG_ADDRESS];
/* Read/write data at right place */
+#if 1 /* workaround for a bug in memory management */
+ for (i = 0; i < len; ) {
+ int ncpy = DMA_PAGESIZE - (dma_addr & (DMA_PAGESIZE - 1));
+ if (ncpy > len - i) {
+ ncpy = len - i;
+ }
+ address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
+ buf + i, ncpy, is_write);
+
+ dma_addr += ncpy;
+ i += ncpy;
+ }
+#else
address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
buf, len, is_write);
+#endif
s->dma_regs[n][DMA_REG_ENABLE] |= DMA_FLAG_TC_INTR;
s->dma_regs[n][DMA_REG_COUNT] -= len;
--
2.1.4
* Re: [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
2015-06-11 20:30 [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers Hervé Poussineau
@ 2015-06-11 23:30 ` Aurelien Jarno
2015-06-15 20:44 ` Hervé Poussineau
0 siblings, 1 reply; 8+ messages in thread
From: Aurelien Jarno @ 2015-06-11 23:30 UTC (permalink / raw)
To: Hervé Poussineau; +Cc: Peter Maydell, Leon Alrae, qemu-devel
On 2015-06-11 22:30, Hervé Poussineau wrote:
> This workarounds a bug in memory management.
>
> To reproduce the problem, try to start the Windows NT 4.0/MIPS installer.
> After loading some files, you should see a screen saying
> "To set up Windows NT now, press ENTER."
> However, you're welcomed with an IRQL_NOT_LESS_OR_EQUAL bugcheck or an
> Unknown Hard Error c0000221.
>
> Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
> ---
> hw/dma/rc4030.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/hw/dma/rc4030.c b/hw/dma/rc4030.c
> index 3efa6de..d265d6c 100644
> --- a/hw/dma/rc4030.c
> +++ b/hw/dma/rc4030.c
> @@ -681,6 +681,7 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
> rc4030State *s = opaque;
> hwaddr dma_addr;
> int dev_to_mem;
> + int i;
>
> s->dma_regs[n][DMA_REG_ENABLE] &= ~(DMA_FLAG_TC_INTR | DMA_FLAG_MEM_INTR | DMA_FLAG_ADDR_INTR);
>
> @@ -699,8 +700,22 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
> dma_addr = s->dma_regs[n][DMA_REG_ADDRESS];
>
> /* Read/write data at right place */
> +#if 1 /* workaround for a bug in memory management */
> + for (i = 0; i < len; ) {
> + int ncpy = DMA_PAGESIZE - (dma_addr & (DMA_PAGESIZE - 1));
> + if (ncpy > len - i) {
> + ncpy = len - i;
> + }
> + address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
> + buf + i, ncpy, is_write);
> +
> + dma_addr += ncpy;
> + i += ncpy;
> + }
> +#else
> address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
> buf, len, is_write);
> +#endif
Hmm, basically your code splits the transfers so that they don't cross
DMA page boundaries. It seems that your DMA memory region is actually
made of small subregions of size DMA_PAGESIZE aliased to the RAM.
Now looking at the address_space_rw function, it seems it optimizes the
write to RAM case by calling address_space_translate() and then doing a
memcpy() of the whole region. It doesn't work given the memory region is
not linear.
That said, address_space_translate is supposed to adjust the length if
needed, but does so only if iommu_ops is defined. I therefore wonder
whether you shouldn't model these DMA translation tables using IOMMU
ops instead of subregions.
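
A minimal standalone sketch of the page-boundary arithmetic the workaround relies on may help here. It is not QEMU code: SKETCH_PAGESIZE, chunk_len() and count_chunks() are hypothetical names, and the 4096-byte page size is an assumption standing in for DMA_PAGESIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size, standing in for DMA_PAGESIZE (must be a power of two). */
#define SKETCH_PAGESIZE 4096u

/* Number of bytes that can be accessed starting at addr without crossing
 * a page boundary, capped at the number of bytes still to transfer. */
static size_t chunk_len(uint64_t addr, size_t remaining)
{
    size_t to_boundary =
        SKETCH_PAGESIZE - (size_t)(addr & (SKETCH_PAGESIZE - 1));
    return to_boundary < remaining ? to_boundary : remaining;
}

/* Number of separate accesses needed for a transfer of len bytes at addr
 * when no single access may cross a page boundary. */
static unsigned count_chunks(uint64_t addr, size_t len)
{
    unsigned n = 0;
    size_t done = 0;

    while (done < len) {
        size_t ncpy = chunk_len(addr, len - done);
        assert(ncpy > 0); /* guarantees forward progress */
        addr += ncpy;
        done += ncpy;
        n++;
    }
    return n;
}
```

With these assumptions, a 0x100-byte transfer starting at 0x1ff0 is split into a 0x10-byte access up to the boundary and a 0xf0-byte access after it, so each access stays within one aliased subregion.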
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
2015-06-11 23:30 ` Aurelien Jarno
@ 2015-06-15 20:44 ` Hervé Poussineau
2015-06-16 17:48 ` Aurelien Jarno
0 siblings, 1 reply; 8+ messages in thread
From: Hervé Poussineau @ 2015-06-15 20:44 UTC (permalink / raw)
To: qemu-devel, Leon Alrae, Peter Maydell, Paolo Bonzini
Hi Aurelien,
On 12/06/2015 01:30, Aurelien Jarno wrote:
> On 2015-06-11 22:30, Hervé Poussineau wrote:
>> This workarounds a bug in memory management.
>>
>> To reproduce the problem, try to start the Windows NT 4.0/MIPS installer.
>> After loading some files, you should see a screen saying
>> "To set up Windows NT now, press ENTER."
>> However, you're welcomed with an IRQL_NOT_LESS_OR_EQUAL bugcheck or an
>> Unknown Hard Error c0000221.
>>
>> Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
>> ---
>> hw/dma/rc4030.c | 15 +++++++++++++++
>> 1 file changed, 15 insertions(+)
>>
>> diff --git a/hw/dma/rc4030.c b/hw/dma/rc4030.c
>> index 3efa6de..d265d6c 100644
>> --- a/hw/dma/rc4030.c
>> +++ b/hw/dma/rc4030.c
>> @@ -681,6 +681,7 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
>> rc4030State *s = opaque;
>> hwaddr dma_addr;
>> int dev_to_mem;
>> + int i;
>>
>> s->dma_regs[n][DMA_REG_ENABLE] &= ~(DMA_FLAG_TC_INTR | DMA_FLAG_MEM_INTR | DMA_FLAG_ADDR_INTR);
>>
>> @@ -699,8 +700,22 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
>> dma_addr = s->dma_regs[n][DMA_REG_ADDRESS];
>>
>> /* Read/write data at right place */
>> +#if 1 /* workaround for a bug in memory management */
>> + for (i = 0; i < len; ) {
>> + int ncpy = DMA_PAGESIZE - (dma_addr & (DMA_PAGESIZE - 1));
>> + if (ncpy > len - i) {
>> + ncpy = len - i;
>> + }
>> + address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
>> + buf + i, ncpy, is_write);
>> +
>> + dma_addr += ncpy;
>> + i += ncpy;
>> + }
>> +#else
>> address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
>> buf, len, is_write);
>> +#endif
>
> Hmm, basically your code splits the transfers so that they don't cross
> DMA page boundaries. It seems that your DMA memory region is actually
> made of small subregions of size DMA_PAGESIZE aliased to the RAM.
Yes, that's the case. I have lots of DMA_PAGESIZE memory region aliases in the DMA memory region.
> Now looking at the address_space_rw function, it seems it optimizes the
> write to RAM case by calling address_space_translate() and then doing a
> memcpy() of the whole region. It doesn't work given the memory region is
> not linear.
>
> That said address_space_translate is supposed to adjust the length if
> needed, but does so only if iommu_ops is defined.
Then, the problem lies here.
If address_space_rw can only be used on an address range that is linear in the underlying memory region, or when the underlying memory region is an IOMMU, then you have a big problem. As you can't query whether
that's the case, your only bet is to use address_space_rw with 1-byte quantities only...
Adding Paolo, as he may have an idea.
> I therefore wonder if
> you therefore shouldn't model this DMA translation tables by using IOMMU
> ops instead of subregions.
>
No, in my opinion, that's an implementation detail. Paolo said that it was OK:
"Both are okay. The IOMMU makes address space changes faster; your
scheme is basically a form of caching, it trades update performance for
improved translation performance."
http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg05486.html
Regards,
Hervé
* Re: [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
2015-06-15 20:44 ` Hervé Poussineau
@ 2015-06-16 17:48 ` Aurelien Jarno
2015-06-17 8:33 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Aurelien Jarno @ 2015-06-16 17:48 UTC (permalink / raw)
To: Hervé Poussineau
Cc: Peter Maydell, Leon Alrae, qemu-devel, Paolo Bonzini
On 2015-06-15 22:44, Hervé Poussineau wrote:
> Hi Aurelien,
>
> On 12/06/2015 01:30, Aurelien Jarno wrote:
> >On 2015-06-11 22:30, Hervé Poussineau wrote:
> >>This workarounds a bug in memory management.
> >>
> >>To reproduce the problem, try to start the Windows NT 4.0/MIPS installer.
> >>After loading some files, you should see a screen saying
> >>"To set up Windows NT now, press ENTER."
> >>However, you're welcomed with an IRQL_NOT_LESS_OR_EQUAL bugcheck or an
> >>Unknown Hard Error c0000221.
> >>
> >>Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
> >>---
> >> hw/dma/rc4030.c | 15 +++++++++++++++
> >> 1 file changed, 15 insertions(+)
> >>
> >>diff --git a/hw/dma/rc4030.c b/hw/dma/rc4030.c
> >>index 3efa6de..d265d6c 100644
> >>--- a/hw/dma/rc4030.c
> >>+++ b/hw/dma/rc4030.c
> >>@@ -681,6 +681,7 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
> >> rc4030State *s = opaque;
> >> hwaddr dma_addr;
> >> int dev_to_mem;
> >>+ int i;
> >>
> >> s->dma_regs[n][DMA_REG_ENABLE] &= ~(DMA_FLAG_TC_INTR | DMA_FLAG_MEM_INTR | DMA_FLAG_ADDR_INTR);
> >>
> >>@@ -699,8 +700,22 @@ static void rc4030_do_dma(void *opaque, int n, uint8_t *buf, int len, int is_wri
> >> dma_addr = s->dma_regs[n][DMA_REG_ADDRESS];
> >>
> >> /* Read/write data at right place */
> >>+#if 1 /* workaround for a bug in memory management */
> >>+ for (i = 0; i < len; ) {
> >>+ int ncpy = DMA_PAGESIZE - (dma_addr & (DMA_PAGESIZE - 1));
> >>+ if (ncpy > len - i) {
> >>+ ncpy = len - i;
> >>+ }
> >>+ address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
> >>+ buf + i, ncpy, is_write);
> >>+
> >>+ dma_addr += ncpy;
> >>+ i += ncpy;
> >>+ }
> >>+#else
> >> address_space_rw(&s->dma_as, dma_addr, MEMTXATTRS_UNSPECIFIED,
> >> buf, len, is_write);
> >>+#endif
> >
> >Hmm, basically your code splits the transfers so that they don't cross
> >DMA page boundaries. It seems that your DMA memory region is actually
> >made of small subregions of size DMA_PAGESIZE aliased to the RAM.
>
> Yes, that's the case. I have lots of DMA_PAGESIZE memory region aliases in the DMA memory region.
>
> >Now looking at the address_space_rw function, it seems it optimizes the
> >write to RAM case by calling address_space_translate() and then doing a
> >memcpy() of the whole region. It doesn't work given the memory region is
> >not linear.
> >
> >That said address_space_translate is supposed to adjust the length if
> >needed, but does so only if iommu_ops is defined.
>
> Then, the problem lies here.
> If you can use address_space_rw only on an address range which is linear in
> underlying memory region, or if underlying memory region is a iommu, then
> you have a big problem. As you can't query if that's the case, your only bet
> is to use address_space_rw with only 1 byte quantities...
> Adding Paolo, as he may have an idea.
>
The code assumes that if you don't have an IOMMU, the address range in
the underlying memory region is linear. One fix would be to adjust the
length even without IOMMU. That would have some performance impact
though, so maybe we want to make this assumption clear and always use an
IOMMU in that case.
> > I therefore wonder if
> >you therefore shouldn't model this DMA translation tables by using IOMMU
> >ops instead of subregions.
> >
> No, in my opinion, that's an implementation detail. Paolo said that it was OK:
> "Both are okay. The IOMMU makes address space changes faster; your
> scheme is basically a form of caching, it trades update performance for
> improved translation performance."
> http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg05486.html
It seems wrong with the current code. And if we fix the bug by adjusting
the length, the above statement about performance might become
wrong.
Aurelien.
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
aurelien@aurel32.net http://www.aurel32.net
* Re: [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
2015-06-16 17:48 ` Aurelien Jarno
@ 2015-06-17 8:33 ` Paolo Bonzini
2015-06-17 17:09 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2015-06-17 8:33 UTC (permalink / raw)
To: Aurelien Jarno, Hervé Poussineau
Cc: Peter Maydell, Leon Alrae, qemu-devel
On 16/06/2015 19:48, Aurelien Jarno wrote:
> The code assumes that if you don't have an IOMMU, the address range in
> the underlying memory region is linear.
I think this is exactly what Peter Crosthwaite's infamous :) "exec:
Respect as_translate_internal length clamp" patch was trying to fix.
However, address_space_translate_internal uses section->mr->size instead
of section->size. I'll post a patch once I'm through the email deluge
from 1 week of absence.
If I read correctly the patch that introduced address_space_translate,
the bug has always been there.
> One fix would be to adjust the
> length even without IOMMU. That would have some performance impact
> though, so maybe we want to make this assumption clear and always use an
> IOMMU in that case.
I don't think there would be a performance impact, except in buggy cases
such as the one Hervé is fixing.
Paolo
>>> I therefore wonder if
>>> you therefore shouldn't model this DMA translation tables by using IOMMU
>>> ops instead of subregions.
>>>
>> No, in my opinion, that's an implementation detail. Paolo said that it was OK:
>> "Both are okay. The IOMMU makes address space changes faster; your
>> scheme is basically a form of caching, it trades update performance for
>> improved translation performance."
>> http://lists.gnu.org/archive/html/qemu-devel/2015-03/msg05486.html
>
> It seems wrong with the current code. And if we fix the bug by adjusting
> the length, the above sentence about the performances might becomes
> wrong
>
> Aurelien.
>
* Re: [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
2015-06-17 8:33 ` Paolo Bonzini
@ 2015-06-17 17:09 ` Paolo Bonzini
2015-06-17 18:31 ` Hervé Poussineau
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2015-06-17 17:09 UTC (permalink / raw)
To: Aurelien Jarno, Hervé Poussineau
Cc: Peter Maydell, Leon Alrae, qemu-devel
On 17/06/2015 10:33, Paolo Bonzini wrote:
> On 16/06/2015 19:48, Aurelien Jarno wrote:
>> The code assumes that if you don't have an IOMMU, the address range in
>> the underlying memory region is linear.
>
> I think this is exactly what Peter Crosthwaite's infamous :) "exec:
> Respect as_translate_internal length clamp" patch was trying to fix.
> However, address_space_translate_internal uses section->mr->size instead
> of section->size. I'll post a patch once I'm through the email deluge
> from 1 week of absence.
Can you test this?
diff --git a/exec.c b/exec.c
index 76bfc4a..fabb8bb 100644
--- a/exec.c
+++ b/exec.c
@@ -350,7 +350,7 @@ address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr, hwaddr *x
/* Compute offset within MemoryRegion */
*xlat = addr + section->offset_within_region;
- diff = int128_sub(section->mr->size, int128_make64(addr));
+ diff = int128_sub(section->size, int128_make64(addr));
*plen = int128_get64(int128_min(diff, int128_make64(*plen)));
return section;
}
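
The effect of the one-line change can be sketched in isolation, under stated assumptions. struct sketch_section and clamp_len() are hypothetical stand-ins, plain 64-bit integers replace Int128, and addr is assumed to be already relative to the start of the section.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory section: only its size matters here. */
struct sketch_section {
    uint64_t size; /* bytes covered by this section */
};

/* Clamp a requested length so that an access starting at offset addr
 * within the section does not run past the end of the section. */
static uint64_t clamp_len(const struct sketch_section *s,
                          uint64_t addr, uint64_t plen)
{
    assert(addr < s->size); /* addr must lie inside the section */
    uint64_t diff = s->size - addr;
    return diff < plen ? diff : plen;
}
```

If the size passed in is that of a whole 1 MiB region, a 0x100-byte request at offset 0xff0 is left untouched. If it is that of a 4 KiB section, the same request is clamped to 0x10 bytes, which is what a caller needs in order to issue the remainder as a separate access.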
Paolo
* Re: [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
2015-06-17 17:09 ` Paolo Bonzini
@ 2015-06-17 18:31 ` Hervé Poussineau
2015-06-17 19:15 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Hervé Poussineau @ 2015-06-17 18:31 UTC (permalink / raw)
To: Paolo Bonzini, Aurelien Jarno; +Cc: Peter Maydell, Leon Alrae, qemu-devel
On 17/06/2015 19:09, Paolo Bonzini wrote:
>
>
> On 17/06/2015 10:33, Paolo Bonzini wrote:
>> On 16/06/2015 19:48, Aurelien Jarno wrote:
>>> The code assumes that if you don't have an IOMMU, the address range in
>>> the underlying memory region is linear.
>>
>> I think this is exactly what Peter Crosthwaite's infamous :) "exec:
>> Respect as_translate_internal length clamp" patch was trying to fix.
>> However, address_space_translate_internal uses section->mr->size instead
>> of section->size. I'll post a patch once I'm through the email deluge
>> from 1 week of absence.
>
> Can you test this?
Sure. It works well for my test case. Thanks Paolo!
However, it breaks PC machines.
mtree gives:
0000000000000cf8-0000000000000cfb (prio 0, RW): pci-conf-idx
0000000000000cf9-0000000000000cf9 (prio 1, RW): piix3-reset-control
"make check" wants to write 4 bytes to 0xcf8. Your patch makes it write only 1 byte, and bad things happen.
>
> diff --git a/exec.c b/exec.c
> index 76bfc4a..fabb8bb 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -350,7 +350,7 @@
> address_space_translate_internal(AddressSpaceDispatch *d, hwaddr addr,
> hwaddr *x
> /* Compute offset within MemoryRegion */
> *xlat = addr + section->offset_within_region;
>
> - diff = int128_sub(section->mr->size, int128_make64(addr));
> + diff = int128_sub(section->size, int128_make64(addr));
> *plen = int128_get64(int128_min(diff, int128_make64(*plen)));
> return section;
> }
>
> Paolo
>
* Re: [Qemu-devel] [PATCH] dma/rc4030: do multiple calls to address_space_rw when doing DMA transfers
2015-06-17 18:31 ` Hervé Poussineau
@ 2015-06-17 19:15 ` Paolo Bonzini
0 siblings, 0 replies; 8+ messages in thread
From: Paolo Bonzini @ 2015-06-17 19:15 UTC (permalink / raw)
To: Hervé Poussineau, Aurelien Jarno
Cc: Peter Maydell, Leon Alrae, qemu-devel
On 17/06/2015 20:31, Hervé Poussineau wrote:
>>
>
> Sure. It works well for my test case. Thanks Paolo!
>
> However, it breaks PC machines.
> mtree gives:
> 0000000000000cf8-0000000000000cfb (prio 0, RW): pci-conf-idx
> 0000000000000cf9-0000000000000cf9 (prio 1, RW): piix3-reset-control
> "make check" wants to write 4 bytes to 0xcf8. Your patch makes it write
> only 1 byte, and bad things happen.
Yup, that's the same as the Windows XP breakage. Thanks for the test!
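
The 0xcf8 case quoted above can be reproduced with a small sketch. It assumes that the priority-1 byte at 0xcf9 splits the 4-byte pci-conf-idx range into three flattened pieces. struct sketch_range, the flat[] table and clamped_access_len() are hypothetical, not QEMU data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened range: start address and size in bytes. */
struct sketch_range {
    uint64_t start;
    uint64_t size;
};

/* Assumed flattened layout of 0xcf8..0xcfb once the priority-1
 * one-byte region at 0xcf9 is overlaid on the 4-byte region. */
static const struct sketch_range flat[] = {
    { 0xcf8, 1 }, /* pci-conf-idx, first byte */
    { 0xcf9, 1 }, /* piix3-reset-control */
    { 0xcfa, 2 }, /* pci-conf-idx, remaining bytes */
};

/* Length an access of len bytes at addr is clamped to when the clamp
 * uses the size of the flattened piece; returns 0 if addr is unmapped. */
static uint64_t clamped_access_len(uint64_t addr, uint64_t len)
{
    assert(len > 0);
    for (size_t i = 0; i < sizeof(flat) / sizeof(flat[0]); i++) {
        if (addr >= flat[i].start && addr - flat[i].start < flat[i].size) {
            uint64_t diff = flat[i].size - (addr - flat[i].start);
            return diff < len ? diff : len;
        }
    }
    return 0;
}
```

Under that assumed layout, a 4-byte write to 0xcf8 is clamped to a single byte, matching the behaviour reported for "make check".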
Paolo