* [Qemu-devel] [PATCH] AHCI: Fix port reset race
2012-01-30 22:29 [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop Alexander Graf
@ 2012-01-30 22:29 ` Alexander Graf
2012-02-09 14:42 ` Kevin Wolf
2012-01-30 22:29 ` [Qemu-devel] [PATCH] AHCI: Masking of IRQs actually masks them Alexander Graf
` (2 subsequent siblings)
3 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2012-01-30 22:29 UTC (permalink / raw)
To: kwolf; +Cc: Jan Kiszka, qemu-devel Developers
bdrv_aio_cancel() can trigger bdrv_aio_flush() which makes all aio
that is currently in flight finish. So what we do is:
port reset
detect ncq in flight
cancel ncq
delete ncq sg list
at which point we have double freed the sg list. Instead, with this
patch we do:
port reset
detect ncq in flight
cancel ncq
check if we are really still in flight
delete ncq sg list
which makes things work and gets rid of the race.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
hw/ide/ahci.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 8869fd6..c2c168d 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -560,6 +560,11 @@ static void ahci_reset_port(AHCIState *s, int port)
ncq_tfs->aiocb = NULL;
}
+ /* Maybe we just finished the request thanks to bdrv_aio_cancel() */
+ if (!ncq_tfs->used) {
+ continue;
+ }
+
qemu_sglist_destroy(&ncq_tfs->sglist);
ncq_tfs->used = 0;
}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] AHCI: Fix port reset race
2012-01-30 22:29 ` [Qemu-devel] [PATCH] AHCI: Fix port reset race Alexander Graf
@ 2012-02-09 14:42 ` Kevin Wolf
0 siblings, 0 replies; 13+ messages in thread
From: Kevin Wolf @ 2012-02-09 14:42 UTC (permalink / raw)
To: Alexander Graf; +Cc: Jan Kiszka, qemu-devel Developers
Am 30.01.2012 23:29, schrieb Alexander Graf:
> bdrv_aio_cancel() can trigger bdrv_aio_flush() which makes all aio
> that is currently in flight finish. So what we do is:
>
> port reset
> detect ncq in flight
> cancel ncq
> delete ncq sg list
>
> at which point we have double freed the sg list. Instead, with this
> patch we do:
>
> port reset
> detect ncq in flight
> cancel ncq
> check if we are really still in flight
> delete ncq sg list
>
> which makes things work and gets rid of the race.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
Thanks, applied to the block branch.
Kevin
^ permalink raw reply [flat|nested] 13+ messages in thread
* [Qemu-devel] [PATCH] AHCI: Masking of IRQs actually masks them
2012-01-30 22:29 [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop Alexander Graf
2012-01-30 22:29 ` [Qemu-devel] [PATCH] AHCI: Fix port reset race Alexander Graf
@ 2012-01-30 22:29 ` Alexander Graf
2012-02-09 14:41 ` Kevin Wolf
2012-02-07 14:57 ` [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop Andreas Färber
2012-02-09 14:28 ` Kevin Wolf
3 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2012-01-30 22:29 UTC (permalink / raw)
To: kwolf; +Cc: Jan Kiszka, qemu-devel Developers
When masking IRQ lines, we should actually mask them out and not declare
them active anymore. Once we mask them in again, they are allowed to trigger
again.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
hw/ide/ahci.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index c2c168d..f8e9eb4 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -146,6 +146,7 @@ static void ahci_check_irq(AHCIState *s)
DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus);
+ s->control_regs.irqstatus = 0;
for (i = 0; i < s->ports; i++) {
AHCIPortRegs *pr = &s->dev[i].port_regs;
if (pr->irq_stat & pr->irq_mask) {
@@ -216,6 +217,7 @@ static void ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
break;
case PORT_IRQ_STAT:
pr->irq_stat &= ~val;
+ ahci_check_irq(s);
break;
case PORT_IRQ_MASK:
pr->irq_mask = val & 0xfdc000ff;
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] AHCI: Masking of IRQs actually masks them
2012-01-30 22:29 ` [Qemu-devel] [PATCH] AHCI: Masking of IRQs actually masks them Alexander Graf
@ 2012-02-09 14:41 ` Kevin Wolf
2012-02-09 14:52 ` Alexander Graf
0 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2012-02-09 14:41 UTC (permalink / raw)
To: Alexander Graf; +Cc: Jan Kiszka, qemu-devel Developers
Am 30.01.2012 23:29, schrieb Alexander Graf:
> When masking IRQ lines, we should actually mask them out and not declare
> them active anymore. Once we mask them in again, they are allowed to trigger
> again.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
> hw/ide/ahci.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index c2c168d..f8e9eb4 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -146,6 +146,7 @@ static void ahci_check_irq(AHCIState *s)
>
> DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus);
>
> + s->control_regs.irqstatus = 0;
> for (i = 0; i < s->ports; i++) {
> AHCIPortRegs *pr = &s->dev[i].port_regs;
> if (pr->irq_stat & pr->irq_mask) {
Is this an independent bug fix?
> @@ -216,6 +217,7 @@ static void ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
> break;
> case PORT_IRQ_STAT:
> pr->irq_stat &= ~val;
> + ahci_check_irq(s);
> break;
> case PORT_IRQ_MASK:
> pr->irq_mask = val & 0xfdc000ff;
Makes some sense, but isn't really about masking interrupts either?
(From the commit message I would have expected that you touch PORT_IRQ_MASK)
Kevin
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] AHCI: Masking of IRQs actually masks them
2012-02-09 14:41 ` Kevin Wolf
@ 2012-02-09 14:52 ` Alexander Graf
0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2012-02-09 14:52 UTC (permalink / raw)
To: Kevin Wolf; +Cc: Jan Kiszka, qemu-devel Developers
On 09.02.2012, at 15:41, Kevin Wolf wrote:
> Am 30.01.2012 23:29, schrieb Alexander Graf:
>> When masking IRQ lines, we should actually mask them out and not declare
>> them active anymore. Once we mask them in again, they are allowed to trigger
>> again.
>>
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> hw/ide/ahci.c | 2 ++
>> 1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>> index c2c168d..f8e9eb4 100644
>> --- a/hw/ide/ahci.c
>> +++ b/hw/ide/ahci.c
>> @@ -146,6 +146,7 @@ static void ahci_check_irq(AHCIState *s)
>>
>> DPRINTF(-1, "check irq %#x\n", s->control_regs.irqstatus);
>>
>> + s->control_regs.irqstatus = 0;
>> for (i = 0; i < s->ports; i++) {
>> AHCIPortRegs *pr = &s->dev[i].port_regs;
>> if (pr->irq_stat & pr->irq_mask) {
>
> Is this an independent bug fix?
No, it's the same thing. Without this we only OR new interrupts into irqstatus. This way we also allow clearing of them when the producer vanishes / gets masked.
>
>> @@ -216,6 +217,7 @@ static void ahci_port_write(AHCIState *s, int port, int offset, uint32_t val)
>> break;
>> case PORT_IRQ_STAT:
>> pr->irq_stat &= ~val;
>> + ahci_check_irq(s);
>> break;
>> case PORT_IRQ_MASK:
>> pr->irq_mask = val & 0xfdc000ff;
>
> Makes some sense, but isn't really about masking interrupts either?
> (From the commit message I would have expected that you touch PORT_IRQ_MASK)
There are multiple layers of IRQ lines, wired to each other, with each layer being able to mask specific bits. We didn't catch the part where irq_stat was losing a bit, either by masking or by explicit unsetting.
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop
2012-01-30 22:29 [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop Alexander Graf
2012-01-30 22:29 ` [Qemu-devel] [PATCH] AHCI: Fix port reset race Alexander Graf
2012-01-30 22:29 ` [Qemu-devel] [PATCH] AHCI: Masking of IRQs actually masks them Alexander Graf
@ 2012-02-07 14:57 ` Andreas Färber
2012-02-07 15:01 ` Alexander Graf
2012-02-09 14:28 ` Kevin Wolf
3 siblings, 1 reply; 13+ messages in thread
From: Andreas Färber @ 2012-02-07 14:57 UTC (permalink / raw)
To: Alexander Graf; +Cc: kwolf, Jan Kiszka, qemu-devel Developers, Bruce Rogers
Hi Alex,
Am 30.01.2012 23:29, schrieb Alexander Graf:
> When using Windows 8 with an AHCI disk drive, it issues a blue screen.
> The reason is that WIN_SECURITY_FREEZE_LOCK / CFA_WEAR_LEVEL is not
> supported by our ATA implementation, but Windows expects it to be there.
>
> Since without security stuff implemented, the lock would be a nop anyway
> and CFA_WEAR_LEVEL already is treated as a nop, let's just allow the cmd
> for HD drives as well. That way Windows is happy.
I tested this with Windows 2008 R2 and it does not resolve the blue
screen I'm getting there during installation. Unfortunately it reboots
so quickly that I cannot read what it says.
Could you share how you debugged your Windows 8 issue?
Andreas
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
> hw/ide/core.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 56b219b..2c129f4 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -969,7 +969,7 @@ static const uint8_t ide_cmd_table[0x100] = {
> [WIN_IDENTIFY] = ALL_OK,
> [WIN_SETFEATURES] = ALL_OK,
> [IBM_SENSE_CONDITION] = CFA_OK,
> - [CFA_WEAR_LEVEL] = CFA_OK,
> + [CFA_WEAR_LEVEL] = HD_CFA_OK,
> [WIN_READ_NATIVE_MAX] = ALL_OK,
> };
>
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop
2012-02-07 14:57 ` [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop Andreas Färber
@ 2012-02-07 15:01 ` Alexander Graf
0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2012-02-07 15:01 UTC (permalink / raw)
To: Andreas Färber
Cc: kwolf, Jan Kiszka, qemu-devel Developers, Bruce Rogers
On 07.02.2012, at 15:57, Andreas Färber wrote:
> Hi Alex,
>
> Am 30.01.2012 23:29, schrieb Alexander Graf:
>> When using Windows 8 with an AHCI disk drive, it issues a blue screen.
>> The reason is that WIN_SECURITY_FREEZE_LOCK / CFA_WEAR_LEVEL is not
>> supported by our ATA implementation, but Windows expects it to be there.
>>
>> Since without security stuff implemented, the lock would be a nop anyway
>> and CFA_WEAR_LEVEL already is treated as a nop, let's just allow the cmd
>> for HD drives as well. That way Windows is happy.
>
> I tested this with Windows 2008 R2 and it does not resolve the blue
> screen I'm getting there during installation. Unfortunately it reboots
> so quickly that I cannot read what it says.
Phew, that's probably yet another issue then.
> Could you share how you debugged your Windows 8 issue?
I enabled printf debugging in hw/ide/core.c and looked through the commands close to the blue screen :)
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop
2012-01-30 22:29 [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop Alexander Graf
` (2 preceding siblings ...)
2012-02-07 14:57 ` [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop Andreas Färber
@ 2012-02-09 14:28 ` Kevin Wolf
2012-02-09 14:49 ` Alexander Graf
3 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2012-02-09 14:28 UTC (permalink / raw)
To: Alexander Graf; +Cc: Jan Kiszka, qemu-devel Developers
Am 30.01.2012 23:29, schrieb Alexander Graf:
> When using Windows 8 with an AHCI disk drive, it issues a blue screen.
> The reason is that WIN_SECURITY_FREEZE_LOCK / CFA_WEAR_LEVEL is not
> supported by our ATA implementation, but Windows expects it to be there.
Is there anything that makes Windows believe that we support it? The
spec says bits in IDENTIFY word 82 and 128 must be set to indicate
support for the security feature set, and we don't set those.
Might be just a Windows bug, of course...
> Since without security stuff implemented, the lock would be a nop anyway
> and CFA_WEAR_LEVEL already is treated as a nop, let's just allow the cmd
> for HD drives as well. That way Windows is happy.
It sets the sector count register to 0, which isn't exactly nop. In any
case, the code would at the very least need a comment that it's used for
two separate commands, so that we still remember this when some time in
the future someone writes a real implementation.
Kevin
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop
2012-02-09 14:28 ` Kevin Wolf
@ 2012-02-09 14:49 ` Alexander Graf
2012-02-09 14:59 ` Kevin Wolf
0 siblings, 1 reply; 13+ messages in thread
From: Alexander Graf @ 2012-02-09 14:49 UTC (permalink / raw)
To: Kevin Wolf; +Cc: Jan Kiszka, qemu-devel Developers
On 09.02.2012, at 15:28, Kevin Wolf wrote:
> Am 30.01.2012 23:29, schrieb Alexander Graf:
>> When using Windows 8 with an AHCI disk drive, it issues a blue screen.
>> The reason is that WIN_SECURITY_FREEZE_LOCK / CFA_WEAR_LEVEL is not
>> supported by our ATA implementation, but Windows expects it to be there.
>
> Is there anything that makes Windows believe that we support it? The
> spec says bits in IDENTIFY word 82 and 128 must be set to indicate
> support for the security feature set, and we don't set those.
>
> Might be just a Windows bug, of course...
IIUC it's mandatory in more recent ATA versions, so that's probably why it assumes it's there.
>
>> Since without security stuff implemented, the lock would be a nop anyway
>> and CFA_WEAR_LEVEL already is treated as a nop, let's just allow the cmd
>> for HD drives as well. That way Windows is happy.
>
> It sets the sector count register to 0, which isn't exactly nop. In any
> case, the code would at the very least need a comment that it's used for
> two separate commands, so that we still remember this when some time in
> the future someone writes a real implementation.
Hrm. Good point.
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop
2012-02-09 14:49 ` Alexander Graf
@ 2012-02-09 14:59 ` Kevin Wolf
2012-02-09 15:01 ` Alexander Graf
0 siblings, 1 reply; 13+ messages in thread
From: Kevin Wolf @ 2012-02-09 14:59 UTC (permalink / raw)
To: Alexander Graf; +Cc: Jan Kiszka, qemu-devel Developers
Am 09.02.2012 15:49, schrieb Alexander Graf:
>
> On 09.02.2012, at 15:28, Kevin Wolf wrote:
>
>> Am 30.01.2012 23:29, schrieb Alexander Graf:
>>> When using Windows 8 with an AHCI disk drive, it issues a blue screen.
>>> The reason is that WIN_SECURITY_FREEZE_LOCK / CFA_WEAR_LEVEL is not
>>> supported by our ATA implementation, but Windows expects it to be there.
>>
>> Is there anything that makes Windows believe that we support it? The
>> spec says bits in IDENTIFY word 82 and 128 must be set to indicate
>> support for the security feature set, and we don't set those.
>>
>> Might be just a Windows bug, of course...
>
> IIUC it's mandatory in more recent ATA versions, so that's probably why it assumes it's there.
ACS-2 says it's optional for both ATA and ATAPI devices.
Kevin
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [Qemu-devel] [PATCH] ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop
2012-02-09 14:59 ` Kevin Wolf
@ 2012-02-09 15:01 ` Alexander Graf
0 siblings, 0 replies; 13+ messages in thread
From: Alexander Graf @ 2012-02-09 15:01 UTC (permalink / raw)
To: Kevin Wolf; +Cc: Jan Kiszka, qemu-devel Developers
On 09.02.2012, at 15:59, Kevin Wolf wrote:
> Am 09.02.2012 15:49, schrieb Alexander Graf:
>>
>> On 09.02.2012, at 15:28, Kevin Wolf wrote:
>>
>>> Am 30.01.2012 23:29, schrieb Alexander Graf:
>>>> When using Windows 8 with an AHCI disk drive, it issues a blue screen.
>>>> The reason is that WIN_SECURITY_FREEZE_LOCK / CFA_WEAR_LEVEL is not
>>>> supported by our ATA implementation, but Windows expects it to be there.
>>>
>>> Is there anything that makes Windows believe that we support it? The
>>> spec says bits in IDENTIFY word 82 and 128 must be set to indicate
>>> support for the security feature set, and we don't set those.
>>>
>>> Might be just a Windows bug, of course...
>>
>> IIUC it's mandatory in more recent ATA versions, so that's probably why it assumes it's there.
>
> ACS-2 says it's optional for both ATA and ATAPI devices.
Ah, right. I was looking at the "Historical Command Assignments" table and figured C means it's required, but that of course is wrong.
Then I seriously have no idea why Windows is erroring out here. I mean, it does work just fine for IDE devices. And we don't tell the guest anything special for AHCI.
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread