From: Ondrej Zary <linux@rainbow-software.org>
To: Finn Thain <fthain@telegraphics.com.au>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
Michael Schmitz <schmitzmic@gmail.com>
Subject: Re: [PATCH v6 0/6] g_NCR5380: PDMA fixes and cleanup
Date: Sun, 2 Jul 2017 16:51:36 +0200
Message-ID: <201707021651.37016.linux@rainbow-software.org>
In-Reply-To: <alpine.LNX.2.00.1707021008110.2389@nippy.intranet>
On Sunday 02 July 2017 05:11:27 Finn Thain wrote:
> On Sat, 1 Jul 2017, Ondrej Zary wrote:
> > The write corruption is still present - "start" must be rolled back in
> > both IRQ and timeout cases.
>
> Your original algorithm aborts the transfer for a timeout. Same with mine.
I do "start -= 2 * 128" even after timeout.
> The bug must be elsewhere.
>
> > And 128 B is not enough, 256 is OK (why did it work last time?).
>
> When I get contradictory results it usually means I booted the wrong build
> or built the wrong branch.
I've just retested PATCH v5: it really does miss 128 bytes, and it works
if I add "residual += 128;".
> Actually, I think that adding 128 to the residual is correct in some
> situations, and 256 is correct in other situations.
>
> > We just wrote a buffer to the chip but the chip is writing the previous
> > one to the drive - so if a problem arises, both buffers are lost.
>
> I see. I guess we have to take buffer swaps into account.
>
> > This fixes the corruption (although the "start > 0" check seems wrong
> > now):
> >
> > --- a/drivers/scsi/g_NCR5380.c
> > +++ b/drivers/scsi/g_NCR5380.c
> > @@ -598,23 +598,17 @@ static inline int generic_NCR5380_psend(struct NCR5380_hostdata *hostdata,
> > CSR_HOST_BUF_NOT_RDY, 0,
> > hostdata->c400_ctl_status,
> > CSR_GATED_53C80_IRQ,
> > - CSR_GATED_53C80_IRQ, HZ / 64) < 0)
> > - break;
> > -
> > - if (NCR5380_read(hostdata->c400_ctl_status) &
> > - CSR_HOST_BUF_NOT_RDY) {
> > + CSR_GATED_53C80_IRQ, HZ / 64) < 0 ||
> > + (NCR5380_read(hostdata->c400_ctl_status) &
> > + (CSR_HOST_BUF_NOT_RDY | CSR_GATED_53C80_IRQ))) {
>
> You could add a printk to the timeout branch. If it executes, something is
> seriously wrong. E.g.
>
> - break;
> + { pr_err("send timeout %02x, %d/%d\n",
> NCR5380_read(hostdata->c400_ctl_status), start, len); break; }
Yes, timeouts do happen:
[ 9671.909223] send timeout 14, 3840/4096
[ 9672.978079] send timeout 14, 2816/4096
[ 9675.323751] send timeout 14, 1280/4096
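For anyone comparing logs, here is a minimal C sketch that pulls the three fields back out of such lines. It assumes only the pr_err() format string suggested above ("send timeout %02x, %d/%d"); parse_send_timeout() and struct send_timeout are made-up names, not part of the driver. For the three lines above, len - start comes out as 256, 1280 and 2816 bytes, before any rollback of "start".

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "send timeout" log line. */
struct send_timeout {
	unsigned int status;	/* raw c400_ctl_status value (hex in the log) */
	int start;		/* offset reached in the source buffer */
	int len;		/* total length of the transfer */
};

/*
 * Parse a line such as "[ 9671.909223] send timeout 14, 3840/4096".
 * The dmesg timestamp prefix is skipped by searching for the message text.
 * Returns 0 on success, -1 if the line does not match.
 */
static int parse_send_timeout(const char *line, struct send_timeout *t)
{
	const char *p = strstr(line, "send timeout ");

	if (!p)
		return -1;
	if (sscanf(p, "send timeout %x, %d/%d",
		   &t->status, &t->start, &t->len) != 3)
		return -1;
	return 0;
}
```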
> > /* The chip has done a 128 B buffer swap but the first
> > * buffer still has not reached the SCSI bus.
> > */
> > if (start > 0)
> > - start -= 128;
> > + start -= 256;
> > break;
> > }
>
> BTW, that change carries the risk of 'start' going negative and the
> residual exceeding the length of the original transfer.
>
> But I agree with you that there's a problem with the residual.
>
> If I understand correctly, the 53c400 can't do a buffer swap until the
> disk acknowledges each of the 128 bytes from the buffer. But I guess the
> first buffer is special because the disk will not see the first byte of
> the transfer until after the first buffer swap.
>
> And it appears that the last buffer is also special: we have to wait for
> CSR_HOST_BUF_NOT_RDY even after start == len otherwise we may not detect a
> failure and fix the residual. So I think the datasheet is right; we have
> to iterate until the block counter goes to zero.
>
> I think it is safe to say that when CSR_HOST_BUF_NOT_RDY, 'start' is
> between 128 and 256 B ahead of the disk. Otherwise, the host buffer is
> empty and 'start' is no more than 128 B ahead of the disk.
>
> > - if (NCR5380_read(hostdata->c400_ctl_status) &
> > - CSR_GATED_53C80_IRQ)
> > - break;
> > -
> > if (hostdata->io_port && hostdata->io_width == 2)
> > outsw(hostdata->io_port + hostdata->c400_host_buf,
> > src + start, 64);
> >
> >
> > DTC seems to work too.
>
> OK. Thanks for testing. Please try the patch below on top of v6.
It misses 256 B blocks. That is caused by the timeouts; this patch fixes it:
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -598,11 +598,9 @@ static inline int generic_NCR5380_psend(struct NCR5380_hostdata *hostdata,
CSR_HOST_BUF_NOT_RDY, 0,
hostdata->c400_ctl_status,
CSR_GATED_53C80_IRQ,
- CSR_GATED_53C80_IRQ, HZ / 64) < 0)
- break;
-
- if (NCR5380_read(hostdata->c400_ctl_status) &
- CSR_HOST_BUF_NOT_RDY) {
+ CSR_GATED_53C80_IRQ, HZ / 64) < 0 ||
+ (NCR5380_read(hostdata->c400_ctl_status) &
+ CSR_HOST_BUF_NOT_RDY)) {
/* Both 128 B buffers are in use */
if (start >= 128)
start -= 128;
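
To make the residual arithmetic easier to discuss, here is a standalone C sketch of the bound quoted earlier: with CSR_HOST_BUF_NOT_RDY set, "start" may be up to 256 B ahead of the disk, otherwise up to 128 B. It also clamps the rollback so that "start" cannot go negative and the residual cannot exceed the transfer length. This is an illustration only, not what either patch does: it always assumes the worst case, and pdma_send_worst_case_residual() and PDMA_BUF_SIZE are hypothetical names.

```c
#include <stdbool.h>

#define PDMA_BUF_SIZE 128	/* size of one 53c400 buffer, per the thread */

/*
 * Worst-case residual for an interrupted PDMA send.
 *
 * start:            bytes written to the chip so far (0 <= start <= len)
 * len:              total transfer length in bytes
 * host_buf_not_rdy: true if CSR_HOST_BUF_NOT_RDY was set when we stopped
 *
 * Assumption: both buffers may be lost when the host buffer is still
 * full, and one buffer otherwise.
 */
static int pdma_send_worst_case_residual(int start, int len,
					  bool host_buf_not_rdy)
{
	int ahead = host_buf_not_rdy ? 2 * PDMA_BUF_SIZE : PDMA_BUF_SIZE;

	/* Never roll back past the beginning of the transfer. */
	if (ahead > start)
		ahead = start;

	return len - (start - ahead);
}
```

With start = 3840, len = 4096 and the host buffer not ready, this gives a residual of 512; with start = 128 it gives the full 4096 rather than something larger.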
--
Ondrej Zary