All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ondrej Zary <linux@rainbow-software.org>
To: Finn Thain <fthain@telegraphics.com.au>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org,
	Michael Schmitz <schmitzmic@gmail.com>
Subject: Re: [PATCH v3 0/4] g_NCR5380: PDMA fixes and cleanup
Date: Tue, 27 Jun 2017 18:06:04 +0200	[thread overview]
Message-ID: <201706271806.05004.linux@rainbow-software.org> (raw)
In-Reply-To: <alpine.LNX.2.00.1706272205340.25239@nippy.intranet>

On Tuesday 27 June 2017 14:42:29 Finn Thain wrote:
> On Tue, 27 Jun 2017, Ondrej Zary wrote:
> > BTW. I've probably found the DTC write corruption. Added the following
> > check (13 is host buffer index register) -
>
> That register is not mentioned in my 53c400 datasheet.

Yes, it's not there. But we don't have 53C400A and DTC436 datasheets (this 
register works on both).

> > and it triggers sometimes: the value is 1 instead of 0. As we use only
> > 16-bit writes, I don't see how the value could ever be odd. Looks like a
> > bug in the chip. The index register corrupts during the transfer, not
> > after IRQ or timeout. The same check at beginning of pwrite() did not
> > trigger.
>
> Are you reading this register at the right moment? Have you tried waiting
> for it to reach zero, as in,
>
> 	if (NCR5380_poll_politely(hostdata, 13, 0xff, 0, HZ / 64) < 0)
> 		/* printk, reset etc */;

I have not but will try (expecting that it will not change by itself).

> Even if this is a reliable way to detect a short transfer, it would be
> nice to know the root cause. But I'm being unrealistic: the DTC436 vendor
> never responded to my requests for technical documentation.

According to the data corruption observed, it's not a short transfer. The 
corruption is always the same: one byte missing at the beginning of a 128 B 
block. It happens only with slow Quantum LPS 240 drive, not with faster IBM 
DORS-32160.

> > The index register is not writable so we must(?) reset the PDMA engine
> > to recover. However, this quick attempt to fix does not work, maybe we
> > should reload the block count and continue?
>
> I don't know if it is possible to recover. If the last byte never reached
> the scsi bus, then once you reset the 53c400 core, you need the driver to
> perform a single-byte PIO transfer after the short PDMA transfer. This
> would require that you set the residual appropriately (though in my
> experience that may not be sufficient).
>
> It may be better to simply limit the transfer to 512 bytes instead of
> attempting to recover based on an undocumented (?) register, etc. Seems
> like a bit of a hack.
>
> > --- a/drivers/scsi/g_NCR5380.c
> > +++ b/drivers/scsi/g_NCR5380.c
> > @@ -595,7 +603,13 @@ static inline int generic_NCR5380_pwrite(struct
> > NCR5380_hostdata *hostdata,
> >                                 goto out_wait;
> >                         }
> >                 }
> > -
> > +               idx = NCR5380_read(13);
> > +               if (idx != 0) {
> > +                       printk("host idx=%d, start=%d\n", idx, start);
> > +                       NCR5380_write(hostdata->c400_ctl_status,
> > CSR_RESET); +                      
> > NCR5380_write(hostdata->c400_ctl_status, CSR_BASE); +                    
> >   goto out_wait;
> > +               }
> >                 if (hostdata->io_port && hostdata->io_width == 2)
> >                         outsw(hostdata->io_port +
> > hostdata->c400_host_buf, src + start, 64);
>
> I find it hard to reason about this code. For example, out_wait is to be
> removed. Let's get the preceding patches working and signed-off. Please go
> ahead and use a 512 B transfer for DTC436 testing if that will help get
> this patch series over the line.

OK, I agree. Let's fix the problems first and leave this hack for later.

-- 
Ondrej Zary

  reply	other threads:[~2017-06-27 16:06 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-26  7:30 [PATCH v3 0/4] g_NCR5380: PDMA fixes and cleanup Finn Thain
2017-06-26  7:30 ` Finn Thain
2017-06-26  7:30 ` [PATCH v3 1/4] g_NCR5380: Fix PDMA transfer size Finn Thain
2017-06-26  7:30   ` Finn Thain
2017-06-26  7:30 ` [PATCH v3 2/4] g_NCR5380: End PDMA transfer correctly on target disconnection Finn Thain
2017-06-26  7:30   ` Finn Thain
2017-06-26  7:30 ` [PATCH v3 3/4] g_NCR5380: Cleanup comments and whitespace Finn Thain
2017-06-26  7:30   ` Finn Thain
2017-06-26  7:30 ` [PATCH v3 4/4] g_NCR5380: Re-work PDMA loops Finn Thain
2017-06-26  7:30   ` Finn Thain
2017-06-26 19:27 ` [PATCH v3 0/4] g_NCR5380: PDMA fixes and cleanup Ondrej Zary
2017-06-27  1:49   ` Finn Thain
2017-06-27 18:44     ` Ondrej Zary
2017-06-28  4:09       ` Finn Thain
2017-06-27  6:28   ` Ondrej Zary
2017-06-27  8:30     ` Michael Schmitz
2017-06-27 12:54       ` Finn Thain
2017-06-27 12:42     ` Finn Thain
2017-06-27 16:06       ` Ondrej Zary [this message]
2017-06-28  4:10         ` Finn Thain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201706271806.05004.linux@rainbow-software.org \
    --to=linux@rainbow-software.org \
    --cc=fthain@telegraphics.com.au \
    --cc=jejb@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=schmitzmic@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.