From: Kevin Wolf <kwolf@redhat.com>
To: Niklas Cassel <nks@flawful.org>
Cc: John Snow <jsnow@redhat.com>,
qemu-block@nongnu.org, qemu-devel@nongnu.org,
Damien Le Moal <dlemoal@kernel.org>,
Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>,
Michael Tokarev <mjt@tls.msk.ru>,
Niklas Cassel <niklas.cassel@wdc.com>
Subject: Re: [PATCH v2] hw/ide/ahci: fix legacy software reset
Date: Wed, 8 Nov 2023 14:10:59 +0100 [thread overview]
Message-ID: <ZUuI41b5CnU8GlID@redhat.com> (raw)
In-Reply-To: <ZUrO/tluTBjnPXHV@x1-carbon>
Am 08.11.2023 um 00:57 hat Niklas Cassel geschrieben:
> On Tue, Nov 07, 2023 at 07:14:07PM +0100, Kevin Wolf wrote:
> > Am 05.10.2023 um 12:04 hat Niklas Cassel geschrieben:
> > > From: Niklas Cassel <niklas.cassel@wdc.com>
> > >
> > > Legacy software contains a standard mechanism for generating a reset to a
> > > Serial ATA device - setting the SRST (software reset) bit in the Device
> > > Control register.
> > >
> > > Serial ATA has a more robust mechanism called COMRESET, also referred to
> > > as port reset. A port reset is the preferred mechanism for error
> > > recovery and should be used in place of software reset.
> > >
> > > Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> > > improved the handling of PxCI, such that PxCI gets cleared after handling
> > > a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
> > > receiving an arbitrary FIS).
> > >
> > > However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
> > > enough, we also need to clear PxCI when receiving a SRST in the Device
> > > Control register.
> > >
> > > This fixes an issue for FreeBSD where the device would fail to reset.
> > > The problem was not noticed in Linux, because Linux uses a COMRESET
> > > instead of a legacy software reset by default.
> > >
> > > Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> > > Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
> > > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> > > ---
> > > Changes since v1: write the D2H FIS before clearing PxCI.
> > >
> > > hw/ide/ahci.c | 16 ++++++++++++++++
> > > 1 file changed, 16 insertions(+)
> > >
> > > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> > > index babdd7b458..7269dabbdb 100644
> > > --- a/hw/ide/ahci.c
> > > +++ b/hw/ide/ahci.c
> > > @@ -1254,10 +1254,26 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
> > > case STATE_RUN:
> > > if (cmd_fis[15] & ATA_SRST) {
> > > s->dev[port].port_state = STATE_RESET;
> > > + /*
> > > + * When setting SRST in the first H2D FIS in the reset sequence,
> > > + * the device does not send a D2H FIS. Host software thus has to
> > > + * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
> > > + * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
> > > + */
> > > + if (opts & AHCI_CMD_CLR_BUSY) {
> > > + ahci_clear_cmd_issue(ad, slot);
> > > + }
> >
> > I suspect that AHCI_CMD_CLR_BUSY really should be checked in a more
> > generic place, but this will do for fixing software reset.
>
> This weirdo option "Clear Busy upon R_OK" is for the HBA itself to
> clear the bit for the command in PxCI, when it gets the R_OK that the
> Host to Device (H2D) FIS was transmitted successfully to the device.
> The normal way is that the device sends back a Device to Host (D2H) FIS
> which then causes the bit in PxCI to be cleared in the HBA.
>
> Yes, theoretically you could probably build a FIS that has the "Clear
> Busy upon R_OK" even for e.g. a regular non-NCQ I/O command (where
> PxCI/BUSY is first supposed to be cleared once the command is
> done...), however this would surely cause problems as PxCI would then
> no longer shadow the state of the device.
>
> If you search for "Clear Busy upon R_OK", the only usage seems
> to be for legacy software reset.
> [...]
Yes, looks like ignoring it otherwise is harmless.
> > ahci_reset_port() already calls ahci_init_d2h() -> ahci_write_fis_d2h().
> > So I think this new ahci_write_fis_d2h() only sets some state that will
> > immediately be overwritten again. Which is good, because we didn't set
> > the signature as described in the SATA software reset protocol yet, that
> > is only done in ahci_reset_port().
> >
> > Am I missing something? Why do we need this ahci_write_fis_d2h() call
> > here?
> >
> > As for the ahci_clear_cmd_issue(), I'm surprised that ahci_reset_port()
> > doesn't already clear the register. Wouldn't it make more sense there
> > than just in this one caller?
>
> A start/stop of the engine (PxCMD.ST) does clear PxCI, as I implemented in:
> https://github.com/qemu/qemu/commit/d73b84d0b664e60fffb66f46e84d0db4a8e1c713
>
> But according to AHCI 1.3.1, section:
> 3.3.14 Offset 38h: PxCI – Port x Command Issue
>
> PxCI does also have a reset value of 0h.
>
> So I would assume that it is okay to clear PxCI to zero also in
> ahci_reset_port().
Yes, that was what I thought.
> By doing so, we could avoid both the ahci_clear_cmd_issue(),
> and the ahci_write_fis_d2h(ad, false).
>
> Perhaps you can drop the "bonus" part of my patch and simply
> add a pr->cmd_issue to ahci_reset_port()?
I agree that this is probably the best change to make.
If you also want to add that comment that there already is a D2H FIS
sent during reset, and maybe change the commit message to explain why
we're touching ahci_reset_port(), I think I would prefer if you sent a
v3. I'm usually happy to make small changes while applying, but it feels
like this would be a larger change than I would be comfortable making
without having it on the mailing list again.
Kevin
prev parent reply other threads:[~2023-11-08 13:12 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-05 10:04 [PATCH v2] hw/ide/ahci: fix legacy software reset Niklas Cassel
2023-10-10 16:57 ` Michael Tokarev
2023-10-26 6:15 ` Michael Tokarev
2023-11-07 18:14 ` Kevin Wolf
2023-11-07 18:32 ` Kevin Wolf
2023-11-07 23:57 ` Niklas Cassel
2023-11-08 13:10 ` Kevin Wolf [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZUuI41b5CnU8GlID@redhat.com \
--to=kwolf@redhat.com \
--cc=dlemoal@kernel.org \
--cc=jsnow@redhat.com \
--cc=marcin.juszkiewicz@linaro.org \
--cc=mjt@tls.msk.ru \
--cc=niklas.cassel@wdc.com \
--cc=nks@flawful.org \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.