* [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
@ 2013-05-28 8:18 Andreas Färber
2013-05-28 8:27 ` Kevin Wolf
0 siblings, 1 reply; 8+ messages in thread
From: Andreas Färber @ 2013-05-28 8:18 UTC (permalink / raw)
To: qemu-devel
Cc: Kevin Wolf, stefano.stabellini, stefanha, Heiko Rommel,
Bruce Rogers, arei.gonglei, pbonzini, Andreas Färber
The implementation of the ATA FLUSH command invokes a flush at the block
layer, which may on raw files on POSIX entail a synchronous fdatasync().
This may in some cases take so long that the SLES 11 SP1 guest driver
reports I/O errors and filesystems get corrupted or remounted read-only.
Avoid this by setting BUSY_STAT, so that the guest is made aware we are
in the middle of an operation and no ATA commands are attempted to be
processed concurrently.
Addresses BNC#637297.
Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
---
hw/ide/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/ide/core.c b/hw/ide/core.c
index c7a8041..bf1ff18 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
 {
     IDEState *s = opaque;
 
+    s->status &= ~BUSY_STAT;
+
     if (ret < 0) {
         /* XXX: What sector number to set here? */
         if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
@@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
         return;
     }
 
+    s->status |= BUSY_STAT;
     bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
     bdrv_aio_flush(s->bs, ide_flush_cb, s);
 }
--
1.8.1.4
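The idea in the patch above can be illustrated with a small standalone model. This is a sketch under stated assumptions, not QEMU code: ToyIDE, toy_flush_start(), toy_flush_done() and toy_guest_may_issue() are hypothetical names, and only the BUSY_STAT (0x80) and READY_STAT (0x40) values follow the standard ATA status register layout. It shows a guest that checks BSY before issuing a command holding off for as long as a flush is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUSY_STAT  0x80  /* ATA status register bit 7 (BSY) */
#define READY_STAT 0x40  /* ATA status register bit 6 (DRDY) */

/* Hypothetical stand-in for the device state; not QEMU's IDEState. */
typedef struct ToyIDE {
    uint8_t status;
    bool flush_pending;
} ToyIDE;

/* Start an asynchronous flush: raise BSY before handing off the request. */
static void toy_flush_start(ToyIDE *s)
{
    s->status |= BUSY_STAT;
    s->flush_pending = true;
}

/* Completion callback of the toy model: drop BSY and report ready. */
static void toy_flush_done(ToyIDE *s)
{
    assert(s->flush_pending);
    s->flush_pending = false;
    s->status = READY_STAT;
}

/* A well-behaved guest driver only issues a new command when BSY is clear. */
static bool toy_guest_may_issue(const ToyIDE *s)
{
    return !(s->status & BUSY_STAT);
}
```

In this model, toy_guest_may_issue() returns false between toy_flush_start() and toy_flush_done(), however long the flush takes, and true again afterwards.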
* Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
2013-05-28 8:18 [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH Andreas Färber
@ 2013-05-28 8:27 ` Kevin Wolf
2013-05-28 8:46 ` Andreas Färber
0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2013-05-28 8:27 UTC (permalink / raw)
To: Andreas Färber
Cc: stefano.stabellini, stefanha, Heiko Rommel, qemu-devel,
Bruce Rogers, arei.gonglei, pbonzini
On 28.05.2013 at 10:18, Andreas Färber wrote:
> The implementation of the ATA FLUSH command invokes a flush at the block
> layer, which may on raw files on POSIX entail a synchronous fdatasync().
> This may in some cases take so long that the SLES 11 SP1 guest driver
> reports I/O errors and filesystems get corrupted or remounted read-only.
>
> Avoid this by setting BUSY_STAT, so that the guest is made aware we are
> in the middle of an operation and no ATA commands are attempted to be
> processed concurrently.
>
> Addresses BNC#637297.
>
> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
> Signed-off-by: Andreas Färber <afaerber@suse.de>
> ---
> hw/ide/core.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index c7a8041..bf1ff18 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
> {
> IDEState *s = opaque;
>
> + s->status &= ~BUSY_STAT;
> +
This part is unnecessary, the status is already reset.
> if (ret < 0) {
> /* XXX: What sector number to set here? */
> if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
> @@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
> return;
> }
>
> + s->status |= BUSY_STAT;
> bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
> bdrv_aio_flush(s->bs, ide_flush_cb, s);
> }
This should fix the bug, but only in a one-off way. I was planning to
fix it by setting BSY for all commands and having an explicit command
completion everywhere. This part of IDE is currently a mess.
The other reason I haven't sent a fix yet is that I don't have a test
case for it. I guess I need to extend blkdebug first before this can be
reliably tested by qtest.
Kevin
* Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
2013-05-28 8:27 ` Kevin Wolf
@ 2013-05-28 8:46 ` Andreas Färber
2013-05-28 9:18 ` Kevin Wolf
0 siblings, 1 reply; 8+ messages in thread
From: Andreas Färber @ 2013-05-28 8:46 UTC (permalink / raw)
To: Kevin Wolf
Cc: stefano.stabellini, Stefan Hajnoczi, Heiko Rommel, qemu-devel,
Bruce Rogers, Gonglei (Arei), Paolo Bonzini, qemu-stable
On 28.05.2013 10:27, Kevin Wolf wrote:
> On 28.05.2013 at 10:18, Andreas Färber wrote:
>> The implementation of the ATA FLUSH command invokes a flush at the block
>> layer, which may on raw files on POSIX entail a synchronous fdatasync().
>> This may in some cases take so long that the SLES 11 SP1 guest driver
>> reports I/O errors and filesystems get corrupted or remounted read-only.
>>
>> Avoid this by setting BUSY_STAT, so that the guest is made aware we are
>> in the middle of an operation and no ATA commands are attempted to be
>> processed concurrently.
>>
>> Addresses BNC#637297.
>>
>> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
>> Signed-off-by: Andreas Färber <afaerber@suse.de>
>> ---
>> hw/ide/core.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/ide/core.c b/hw/ide/core.c
>> index c7a8041..bf1ff18 100644
>> --- a/hw/ide/core.c
>> +++ b/hw/ide/core.c
>> @@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
>> {
>> IDEState *s = opaque;
>>
>> + s->status &= ~BUSY_STAT;
>> +
>
> This part is unnecessary, the status is already reset.
Only in the ret >= 0 case though AFAICS?
>> if (ret < 0) {
>> /* XXX: What sector number to set here? */
>> if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
>> @@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
>> return;
>> }
>>
>> + s->status |= BUSY_STAT;
>> bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
>> bdrv_aio_flush(s->bs, ide_flush_cb, s);
>> }
>
> This should fix the bug, however in an one-off way. I was planning to
> fix it by setting BSY for all commands and having an explicit command
> completion everywhere. This part is a mess currently in IDE.
That's a valid idea, but I had backporting to 0.15 in mind. ;)
And doh, I forgot qemu-stable.
> The other part why I haven't sent a fix yet is that I don't have a test
> case for it.
Temporarily add a sleep(31) in qemu_fdatasync()?
I was lazy and tested with -snapshot so as not to corrupt my disk image,
which would not trigger the same issue since it is qcow2-backed AFAIU.
> I guess I need to extend blkdebug first before this can be
> reliably tested by qtest.
It can't, since it's not a pure device emulation issue but depends on
the relative timing of filesystem operations and subsequent commands.
Andreas
--
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
* Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
2013-05-28 8:46 ` Andreas Färber
@ 2013-05-28 9:18 ` Kevin Wolf
2013-05-28 9:24 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2013-05-28 9:18 UTC (permalink / raw)
To: Andreas Färber
Cc: stefano.stabellini, Stefan Hajnoczi, Heiko Rommel, qemu-devel,
Bruce Rogers, Gonglei (Arei), Paolo Bonzini, qemu-stable
On 28.05.2013 at 10:46, Andreas Färber wrote:
> On 28.05.2013 10:27, Kevin Wolf wrote:
> > On 28.05.2013 at 10:18, Andreas Färber wrote:
> >> The implementation of the ATA FLUSH command invokes a flush at the block
> >> layer, which may on raw files on POSIX entail a synchronous fdatasync().
> >> This may in some cases take so long that the SLES 11 SP1 guest driver
> >> reports I/O errors and filesystems get corrupted or remounted read-only.
> >>
> >> Avoid this by setting BUSY_STAT, so that the guest is made aware we are
> >> in the middle of an operation and no ATA commands are attempted to be
> >> processed concurrently.
> >>
> >> Addresses BNC#637297.
> >>
> >> Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
> >> Signed-off-by: Andreas Färber <afaerber@suse.de>
> >> ---
> >> hw/ide/core.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/hw/ide/core.c b/hw/ide/core.c
> >> index c7a8041..bf1ff18 100644
> >> --- a/hw/ide/core.c
> >> +++ b/hw/ide/core.c
> >> @@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
> >> {
> >> IDEState *s = opaque;
> >>
> >> + s->status &= ~BUSY_STAT;
> >> +
> >
> > This part is unnecessary, the status is already reset.
>
> Only in the ret >= 0 case though AFAICS?
ide_handle_rw_error() takes care of resetting the status as well, except
when the VM is stopped. But then it will be immediately set again when
the VM is continued and the request is restarted. So the semantic
difference is just whether BSY would be set or not when you somehow
inspect the state while the VM is stopped after an I/O error.
> >> if (ret < 0) {
> >> /* XXX: What sector number to set here? */
> >> if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
> >> @@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
> >> return;
> >> }
> >>
> >> + s->status |= BUSY_STAT;
> >> bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
> >> bdrv_aio_flush(s->bs, ide_flush_cb, s);
> >> }
> >
> > This should fix the bug, however in an one-off way. I was planning to
> > fix it by setting BSY for all commands and having an explicit command
> > completion everywhere. This part is a mess currently in IDE.
>
> That's a valid idea, but I had backporting to 0.15 in mind. ;)
> And doh, I forgot qemu-stable.
Fair enough, we can merge something like this first and do the real
thing on top. Though nobody will be interested in the real thing any
more, as usual... :-/
> > The other part why I haven't sent a fix yet is that I don't have a test
> > case for it.
>
> Temporarily add a sleep(31) in qemu_fdatasync()?
>
> I was lazy in testing with -snapshot to not corrupt my disk image, which
> would not trigger the same issue since qcow2-backed AFAIU.
>
> > I guess I need to extend blkdebug first before this can be
> > reliably tested by qtest.
>
> It can't, since it's not a pure device emulation issue but depends on
> the relative timing of filesystem operations and subsequent commands.
That's why you need to influence the timing. It's no excuse for
merging without a test case. If we only ever tested devices that have no
relation to the outside world, our testing would be pretty useless and
would always stay as bad as it is today in many areas.
Kevin
* Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
2013-05-28 9:18 ` Kevin Wolf
@ 2013-05-28 9:24 ` Paolo Bonzini
2013-05-28 9:36 ` Kevin Wolf
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2013-05-28 9:24 UTC (permalink / raw)
To: Kevin Wolf
Cc: stefano.stabellini, Stefan Hajnoczi, Heiko Rommel, qemu-devel,
Bruce Rogers, Gonglei (Arei), qemu-stable, Andreas Färber
On 28/05/2013 11:18, Kevin Wolf wrote:
>>> The other part why I haven't sent a fix yet is that I don't have a test
>>> case for it.
>>
>> Temporarily add a sleep(31) in qemu_fdatasync()?
>>
>> I was lazy in testing with -snapshot to not corrupt my disk image, which
>> would not trigger the same issue since qcow2-backed AFAIU.
>>
>>> I guess I need to extend blkdebug first before this can be
>>> reliably tested by qtest.
>>
>> It can't, since it's not a pure device emulation issue but depends on
>> the relative timing of filesystem operations and subsequent commands.
>
> That's why you need to take influence on the timing. It's no excuse for
> merging without a test case. If we only ever tested devices that have no
> relation to the outside world, our testing would be pretty useless and
> always stay as bad as it is today in many areas.
I don't think the qtest would be timing dependent. The Linux testcase
is timing dependent, but for the qtest all you need to check is "is BUSY
set during a flush?". This can be done with blkdebug suspend/resume,
except that there is no way to call bdrv_debug_resume from QEMU.
Paolo
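The check described above ("is BUSY set during a flush?") can be sketched as a timing-independent test against a toy model. Everything here is hypothetical: ToyDisk, the suspend_flush flag (a stand-in for a blkdebug-style suspend), and the toy_* functions are illustration only and do not correspond to existing qtest or blkdebug APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BUSY_STAT  0x80  /* ATA status register bit 7 (BSY) */
#define READY_STAT 0x40  /* ATA status register bit 6 (DRDY) */

/* Hypothetical device model with a hook that holds a flush in flight. */
typedef struct ToyDisk {
    uint8_t status;
    bool suspend_flush;    /* stand-in for a "suspend" debug rule */
    bool flush_in_flight;
} ToyDisk;

/* Finish the flush that is currently in flight and clear BSY. */
static void toy_complete_flush(ToyDisk *d)
{
    assert(d->flush_in_flight);
    d->flush_in_flight = false;
    d->status = READY_STAT;
}

/* Issue a flush: BSY goes up first; completion is deferred if suspended. */
static void toy_issue_flush(ToyDisk *d)
{
    d->status |= BUSY_STAT;
    d->flush_in_flight = true;
    if (!d->suspend_flush) {
        toy_complete_flush(d);
    }
}

/* Release the held request, as an explicit resume would. */
static void toy_resume(ToyDisk *d)
{
    d->suspend_flush = false;
    if (d->flush_in_flight) {
        toy_complete_flush(d);
    }
}

/* True if BSY is seen while the flush is held and gone after resume. */
static bool toy_test_bsy_during_flush(void)
{
    ToyDisk d = { .status = READY_STAT, .suspend_flush = true,
                  .flush_in_flight = false };

    toy_issue_flush(&d);
    bool busy_while_suspended = (d.status & BUSY_STAT) != 0;

    toy_resume(&d);
    bool idle_after_resume = (d.status & BUSY_STAT) == 0;

    return busy_while_suspended && idle_after_resume;
}
```

Because the test itself decides when the request resumes, the result does not depend on how long a real fdatasync() would take.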
* Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
2013-05-28 9:24 ` Paolo Bonzini
@ 2013-05-28 9:36 ` Kevin Wolf
2013-05-28 9:48 ` Paolo Bonzini
0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2013-05-28 9:36 UTC (permalink / raw)
To: Paolo Bonzini
Cc: stefano.stabellini, Stefan Hajnoczi, Heiko Rommel, qemu-devel,
Bruce Rogers, Gonglei (Arei), qemu-stable, Andreas Färber
On 28.05.2013 at 11:24, Paolo Bonzini wrote:
> On 28/05/2013 11:18, Kevin Wolf wrote:
> >>> The other part why I haven't sent a fix yet is that I don't have a test
> >>> case for it.
> >>
> >> Temporarily add a sleep(31) in qemu_fdatasync()?
> >>
> >> I was lazy in testing with -snapshot to not corrupt my disk image, which
> >> would not trigger the same issue since qcow2-backed AFAIU.
> >>
> >>> I guess I need to extend blkdebug first before this can be
> >>> reliably tested by qtest.
> >>
> >> It can't, since it's not a pure device emulation issue but depends on
> >> the relative timing of filesystem operations and subsequent commands.
> >
> > That's why you need to take influence on the timing. It's no excuse for
> > merging without a test case. If we only ever tested devices that have no
> > relation to the outside world, our testing would be pretty useless and
> > always stay as bad as it is today in many areas.
>
> I don't think the qtest would be timing dependent. The Linux testcase
> is timing dependent, but for the qtest all you need to check is "is BUSY
> set during a flush?". This can be done with blkdebug suspend/resume,
> except that there is no way to call bdrv_debug_resume from QEMU.
That's exactly what I was talking about: suspending a request is
influencing its timing. I'm looking into this right now. (And it's not
just resume; bdrv_debug_suspend can't be called from QEMU either.)
In fact, I'm checking whether we can have a monitor command to issue
qemu-io commands, which will be more generally useful for test cases. We
just need to make it obvious that it doesn't become an ABI. Maybe prefix
it with "__org.qemu.debug-" or something like that.
Kevin
* Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
2013-05-28 9:36 ` Kevin Wolf
@ 2013-05-28 9:48 ` Paolo Bonzini
2013-05-28 9:59 ` Kevin Wolf
0 siblings, 1 reply; 8+ messages in thread
From: Paolo Bonzini @ 2013-05-28 9:48 UTC (permalink / raw)
To: Kevin Wolf
Cc: stefano.stabellini, Stefan Hajnoczi, Heiko Rommel, qemu-devel,
Bruce Rogers, Gonglei (Arei), qemu-stable, Andreas Färber
On 28/05/2013 11:36, Kevin Wolf wrote:
> On 28.05.2013 at 11:24, Paolo Bonzini wrote:
>> On 28/05/2013 11:18, Kevin Wolf wrote:
>>>>> The other part why I haven't sent a fix yet is that I don't have a test
>>>>> case for it.
>>>>
>>>> Temporarily add a sleep(31) in qemu_fdatasync()?
>>>>
>>>> I was lazy in testing with -snapshot to not corrupt my disk image, which
>>>> would not trigger the same issue since qcow2-backed AFAIU.
>>>>
>>>>> I guess I need to extend blkdebug first before this can be
>>>>> reliably tested by qtest.
>>>>
>>>> It can't, since it's not a pure device emulation issue but depends on
>>>> the relative timing of filesystem operations and subsequent commands.
>>>
>>> That's why you need to take influence on the timing. It's no excuse for
>>> merging without a test case. If we only ever tested devices that have no
>>> relation to the outside world, our testing would be pretty useless and
>>> always stay as bad as it is today in many areas.
>>
>> I don't think the qtest would be timing dependent. The Linux testcase
>> is timing dependent, but for the qtest all you need to check is "is BUSY
>> set during a flush?". This can be done with blkdebug suspend/resume,
>> except that there is no way to call bdrv_debug_resume from QEMU.
>
> That's exactly what I was talking about, suspending a request is taking
> influence on its timing. I'm looking into this right now. (And it's not
> just resume, bdrv_debug_suspend can't be called from QEMU either)
It can be called from the rules file though, can't it?
> In fact, I'm checking whether we can have a monitor command to issue
> qemu-io commands, which will be more generally useful for test cases. We
> just need to make obvious that it doesn't become an ABI. Maybe prefix it
> with "__org.qemu.debug-" or something like that.
Makes sense. I'm not sure why you'd want to read or write from
testcases, but bdrv_drain(_all) can also be useful from testcases.
Paolo
* Re: [Qemu-devel] [PATCH] ide: Set BSY bit during FLUSH
2013-05-28 9:48 ` Paolo Bonzini
@ 2013-05-28 9:59 ` Kevin Wolf
0 siblings, 0 replies; 8+ messages in thread
From: Kevin Wolf @ 2013-05-28 9:59 UTC (permalink / raw)
To: Paolo Bonzini
Cc: stefano.stabellini, Stefan Hajnoczi, Heiko Rommel, qemu-devel,
Bruce Rogers, Gonglei (Arei), qemu-stable, Andreas Färber
On 28.05.2013 at 11:48, Paolo Bonzini wrote:
> On 28/05/2013 11:36, Kevin Wolf wrote:
> > On 28.05.2013 at 11:24, Paolo Bonzini wrote:
> >> On 28/05/2013 11:18, Kevin Wolf wrote:
> >>>>> The other part why I haven't sent a fix yet is that I don't have a test
> >>>>> case for it.
> >>>>
> >>>> Temporarily add a sleep(31) in qemu_fdatasync()?
> >>>>
> >>>> I was lazy in testing with -snapshot to not corrupt my disk image, which
> >>>> would not trigger the same issue since qcow2-backed AFAIU.
> >>>>
> >>>>> I guess I need to extend blkdebug first before this can be
> >>>>> reliably tested by qtest.
> >>>>
> >>>> It can't, since it's not a pure device emulation issue but depends on
> >>>> the relative timing of filesystem operations and subsequent commands.
> >>>
> >>> That's why you need to take influence on the timing. It's no excuse for
> >>> merging without a test case. If we only ever tested devices that have no
> >>> relation to the outside world, our testing would be pretty useless and
> >>> always stay as bad as it is today in many areas.
> >>
> >> I don't think the qtest would be timing dependent. The Linux testcase
> >> is timing dependent, but for the qtest all you need to check is "is BUSY
> >> set during a flush?". This can be done with blkdebug suspend/resume,
> >> except that there is no way to call bdrv_debug_resume from QEMU.
> >
> > That's exactly what I was talking about, suspending a request is taking
> > influence on its timing. I'm looking into this right now. (And it's not
> > just resume, bdrv_debug_suspend can't be called from QEMU either)
>
> It can be called from the rules file though, can't it?
No, you can only define ACTION_INJECT_ERROR and ACTION_SET_STATE from
the config file, but not ACTION_SUSPEND. Maybe we should add it, but it
would still require a manual resume.
So far all test cases suspend requests with explicit qemu-io commands.
> > In fact, I'm checking whether we can have a monitor command to issue
> > qemu-io commands, which will be more generally useful for test cases. We
> > just need to make obvious that it doesn't become an ABI. Maybe prefix it
> > with "__org.qemu.debug-" or something like that.
>
> Makes sense. I'm not sure why you'd want to read or write from
> testcases, but bdrv_drain(_all) can also be useful from testcases.
I imagine writing could be very useful for block job test cases.
Kevin