From: Peter Maydell <peter.maydell@linaro.org>
To: David Hubbard <dmamfmgm@gmail.com>
Cc: qemu-devel@nongnu.org, Philippe Mathieu-Daude <philmd@linaro.org>,
Michael Tokarev <mjt@tls.msk.ru>,
Gerd Hoffmann <kraxel@redhat.com>
Subject: Re: [PATCH] hw/usb/hcd-ohci: Fix ohci_service_td: accept valid TDs
Date: Fri, 31 May 2024 15:03:33 +0100 [thread overview]
Message-ID: <CAFEAcA8JFMid+SeHWuoaK9meVWwgch9wwzKmT4SFUXz4U_iacw@mail.gmail.com> (raw)
In-Reply-To: <20240520232634.317988-1-dmamfmgm@gmail.com>
On Tue, 21 May 2024 at 00:26, David Hubbard <dmamfmgm@gmail.com> wrote:
>
> From: Cord Amfmgm <dmamfmgm@gmail.com>
>
> This changes the way the ohci emulation handles a Transfer Descriptor with
> "Current Buffer Pointer" set to "Buffer End" + 1.
>
> The OHCI spec 4.3.1.2 Table 4-2 allows td.cbp to be one byte more than td.be
> to signal the buffer has zero length. Currently qemu only accepts zero-length
> Transfer Descriptors if the td.cbp is equal to 0, while actual OHCI hardware
> accepts both cases.
>
> The qemu ohci emulation has a regression in ohci_service_td. Version 4.2
> and earlier matched the spec. (I haven't taken the time to bisect exactly
> where the logic was changed.)
>
> With a tiny OS[1] that boots and executes a test, the issue can be seen:
>
> * OS that sends USB requests to a USB mass storage device
> but sends td.cbp = td.be + 1
> * qemu 4.2
> * qemu HEAD (4e66a0854)
> * Actual OHCI controller (hardware)
>
> Command line:
> qemu-system-x86_64 -m 20 \
> -device pci-ohci,id=ohci \
> -drive if=none,format=raw,id=d,file=testmbr.raw \
> -device usb-storage,bus=ohci.0,drive=d \
> --trace "usb_*" --trace "ohci_*" -D qemu.log
>
> Results are:
>
> qemu 4.2 | qemu HEAD | actual HW
> ------------+------------+------------
> works fine | ohci_die() | works fine
>
> Tip: if the flags "-serial pty -serial stdio" are added to the command line
> the test will output USB requests like this:
>
> Testing qemu HEAD:
>
> > Free mem 2M ohci port2 conn FS
> > setup { 80 6 0 1 0 0 8 0 }
> > ED info=80000 { mps=8 en=0 d=0 } tail=c20920
> > td0 c20880 nxt=c20960 f2000000 setup cbp=c20900 be=c20907
> > td1 c20960 nxt=c20980 f3140000 in cbp=c20908 be=c2090f
> > td2 c20980 nxt=c20920 f3080000 out cbp=c20910 be=c2090f ohci20 host err
> > usb stopped
>
> And in qemu.log:
>
> usb_ohci_iso_td_bad_cc_overrun ISO_TD start_offset=0x00c20910 > next_offset=0x00c2090f
>
> Testing qemu 4.2:
>
> > Free mem 2M ohci port2 conn FS
> > setup { 80 6 0 1 0 0 8 0 }
> > ED info=80000 { mps=8 en=0 d=0 } tail=620920
> > td0 620880 nxt=620960 f2000000 setup cbp=620900 be=620907 cbp=0 be=620907
> > td1 620960 nxt=620980 f3140000 in cbp=620908 be=62090f cbp=0 be=62090f
> > td2 620980 nxt=620920 f3080000 out cbp=620910 be=62090f cbp=0 be=62090f
> > rx { 12 1 0 2 0 0 0 8 }
> > setup { 0 5 1 0 0 0 0 0 } tx {}
> > ED info=80000 { mps=8 en=0 d=0 } tail=620880
> > td0 620920 nxt=620960 f2000000 setup cbp=620900 be=620907 cbp=0 be=620907
> > td1 620960 nxt=620880 f3100000 in cbp=620908 be=620907 cbp=0 be=620907
> > setup { 80 6 0 1 0 0 12 0 }
> > ED info=80001 { mps=8 en=0 d=1 } tail=620960
> > td0 620880 nxt=6209c0 f2000000 setup cbp=620920 be=620927 cbp=0 be=620927
> > td1 6209c0 nxt=6209e0 f3140000 in cbp=620928 be=620939 cbp=0 be=620939
> > td2 6209e0 nxt=620960 f3080000 out cbp=62093a be=620939 cbp=0 be=620939
> > rx { 12 1 0 2 0 0 0 8 f4 46 1 0 0 0 1 2 3 1 }
> > setup { 80 6 0 2 0 0 0 1 }
> > ED info=80001 { mps=8 en=0 d=1 } tail=620880
> > td0 620960 nxt=6209a0 f2000000 setup cbp=620a20 be=620a27 cbp=0 be=620a27
> > td1 6209a0 nxt=6209c0 f3140004 in cbp=620a28 be=620b27 cbp=620a48 be=620b27
> > td2 6209c0 nxt=620880 f3080000 out cbp=620b28 be=620b27 cbp=0 be=620b27
> > rx { 9 2 20 0 1 1 4 c0 0 9 4 0 0 2 8 6 50 0 7 5 81 2 40 0 0 7 5 2 2 40 0 0 }
> > setup { 0 9 1 0 0 0 0 0 } tx {}
> > ED info=80001 { mps=8 en=0 d=1 } tail=620900
> > td0 620880 nxt=620940 f2000000 setup cbp=620a00 be=620a07 cbp=0 be=620a07
> > td1 620940 nxt=620900 f3100000 in cbp=620a08 be=620a07 cbp=0 be=620a07
>
> [1] The OS disk image has been emailed to philmd@linaro.org, mjt@tls.msk.ru,
> and kraxel@redhat.com:
>
> * testCbpOffBy1.img.xz
> * sha256: f87baddcb86de845de12f002c698670a426affb40946025cc32694f9daa3abed
>
> Signed-off-by: Cord Amfmgm <dmamfmgm@gmail.com>
> ---
> hw/usb/hcd-ohci.c | 4 ++--
> hw/usb/trace-events | 1 +
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
> index acd6016980..71b54914d3 100644
> --- a/hw/usb/hcd-ohci.c
> +++ b/hw/usb/hcd-ohci.c
> @@ -941,8 +941,8 @@ static int ohci_service_td(OHCIState *ohci, struct ohci_ed *ed)
> if ((td.cbp & 0xfffff000) != (td.be & 0xfffff000)) {
> len = (td.be & 0xfff) + 0x1001 - (td.cbp & 0xfff);
> } else {
> - if (td.cbp > td.be) {
> - trace_usb_ohci_iso_td_bad_cc_overrun(td.cbp, td.be);
> + if (td.cbp - 1 > td.be) { /* rely on td.cbp != 0 */
> + trace_usb_ohci_td_bad_buf(td.cbp, td.be);
> ohci_die(ohci);
> return 1;
> }
This patch seems to me to do what the commit message sets out to
do, and it looks unlikely to have any unintended side effects
because it turns a "USB controller flags an error" case into
a "treat as zero length packet" case, and I have trouble
imagining that any guest could be relying on looking for the
controller error. On that basis I'm inclined to accept it.
What I would like to see is what we could classify under
"rationale", which is to say "what prompted us to make this
change?". In my experience it's important to record this
(including in the commit message). There are of course
many cases in QEMU's git history where we failed to do that,
but in general I think it's a good standard to meet. (I
am also erring on the side of caution in reviewing this
particular patch, because I don't know the relevant standards
or this bit of the code very well.)
Why do we care about the motivation for a patch?
(1) In the present: it helps to raise confidence that the
proposed new behaviour is right. This is good because QEMU's
test suite is far from comprehensive, so in some sense any
change to the codebase is a risk.
For instance, if a change is being made because the QNX demo
floppy doesn't run, then the fact that the change fixes that
failure-to-run indicates that our interpretation of the
meaning of the standard, or of what should happen in the
grey areas that the documentation doesn't clearly describe,
matches what the QNX driver author (an unrelated third party)
thought and probably also what a lot of in-the-field hardware
does (since QNX was no doubt tested on a lot of different PCs
back in the day).
On the other hand, if a change is proposed because it fixes
booting with older Linux kernels prior to commit XYZ, and
kernel commit XYZ turns out to be "make this device driver
program the hardware according to the specification rather
than relying on an accident of timing", then we might want
to look at where we want to be in the tradeoff of "run older
kernels" versus "put workaround for a guest software issue
into QEMU". (Workarounds for guest software bugs are something
I'm very reluctant to put into QEMU, because my experience
is that once they're in the codebase we can essentially never
remove them, because we don't know what guest code might
be relying on them. But sometimes they're a necessary evil.)
(2) In the future: if in a year's time or more, somebody
reports that a particular commit has regressed some specific
guest workload they have, knowing why we made the change in
the first place is really useful in investigating the
regression.
If we need to change code that was initially added to solve
a problem when running FreeBSD, we know we need to re-test
with FreeBSD.
If the change went in to fix a buffer overrun, we know we
need to be careful and cross-check that we don't reintroduce
the overrun in the course of fixing a regression.
If a change is one that we made on the grounds of "reading
the spec and the code, this looked like it was clearly wrong,
but we don't have a definite repro case of it breaking a guest"
then that might put "revert the change, we were mistaken" on
the table as a response to a future regression report.
And so on.
thanks
-- PMM
next prev parent reply other threads:[~2024-05-31 14:04 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-20 23:26 [PATCH] hw/usb/hcd-ohci: Fix ohci_service_td: accept valid TDs David Hubbard
2024-05-30 15:05 ` Peter Maydell
2024-05-30 19:14 ` Alex Bennée
2024-05-30 21:14 ` Cord Amfmgm
2024-05-31 13:59 ` Alex Bennée
2024-06-12 17:17 ` Cord Amfmgm
2024-06-12 19:36 ` Alex Bennée
2024-06-12 19:48 ` Cord Amfmgm
2024-06-21 15:20 ` Peter Maydell
2024-06-21 16:24 ` Cord Amfmgm
2024-06-21 17:16 ` Peter Maydell
2024-05-31 14:03 ` Peter Maydell [this message]
2024-05-31 18:16 ` Cord Amfmgm
2024-06-07 13:23 ` Peter Maydell
2024-06-08 15:19 ` Cord Amfmgm
-- strict thread matches above, loose matches on Subject: below --
2024-05-20 23:24 David Hubbard
2024-05-20 23:26 ` Cord Amfmgm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAFEAcA8JFMid+SeHWuoaK9meVWwgch9wwzKmT4SFUXz4U_iacw@mail.gmail.com \
--to=peter.maydell@linaro.org \
--cc=dmamfmgm@gmail.com \
--cc=kraxel@redhat.com \
--cc=mjt@tls.msk.ru \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).