public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 17:57 Justin T. Gibbs
  2004-01-05 19:22 ` David S. Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Justin T. Gibbs @ 2004-01-05 17:57 UTC (permalink / raw)
  To: Kernel Mailinglist, linux-scsi; +Cc: Berkley Shands

Berkley Shands recently tripped over this problem.  The 2.6.X pci_map_sg
code for x86_64 modifies the passed in S/G list to compact it for mapping
by the GART.  This modification is not reversed when pci_unmap_sg is
called.  In the case of a retried SCSI command, this causes any attempt
to map the command a second time to fail with a BUG assertion since the
nseg parameter passed into the second map call is state.  nseg comes from
the "use_sg" field in the SCSI command structure which is never touched
by the HBA drivers invoking pci_map_sg.

DMA-API.txt doesn't seem to cover this issue.  Should the low-level DMA
code restore the S/G list to its original state on unmap or should the
SCSI HBA drivers be changed to update "use_sg" with the segment count
reported by the pci_map_sg() API?  If the latter, this seems to contradict
the mandate in DMA-API that the nseg parameter passed into the unmap call
be the same as that passed into the map call.  Most of the kernel assumes
that an S/G list can be mapped an unmapped multiple times using the same
arguments.  This doesn't seem to me to be an unreasonable expectation.

--
Justin


^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 19:29 Berkley Shands
  2004-01-05 19:28 ` David S. Miller
  0 siblings, 1 reply; 21+ messages in thread
From: Berkley Shands @ 2004-01-05 19:29 UTC (permalink / raw)
  To: davem, gibbs; +Cc: berkley, linux-kernel, linux-scsi

	The pci layer is modifying the sg list, and then placing a zero
in the length field. pci-gart.c at line 453 (2.6.0 sources) checks this length field
after a retry, sees that it is zero, and bughalts.
	As Justin points out, SOMEBODY needs to reset the sg list, or kick the error
back up far enough to recreate it completely. Or just don't bother to
combine the entries. A machine with 8GB or more can stand to lose a few bytes.
How many hardware devices can handle a 1MB+ I/O buffer being passed to them anyway?

berkley

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-05 20:00 Berkley Shands
  0 siblings, 0 replies; 21+ messages in thread
From: Berkley Shands @ 2004-01-05 20:00 UTC (permalink / raw)
  To: berkley, davem; +Cc: gibbs, linux-kernel, linux-scsi

	   		BUG_ON(s->length == 0);

		size += s->length; 

		/* Handle the previous not yet processed entries */

		if (i > start) {
			struct scatterlist *ps = &sg[i-1];
			/* Can only merge when the last chunk ends on a page 
			   boundary. */
			if (1 || !force_iommu || !need || (i-1 > start && ps->offset) ||
			    (ps->offset + ps->length) % PAGE_SIZE) { 
				if (pci_map_cont(sg, start, i, sg+out, pages, 
						 need) < 0)
					goto error;

If I totally disable coalescing, in arch/x86_64/pci-gart.c by adding the if (1 || ...
then the bughalt goes away. So might some performance, but a system that runs is
MUCH better than one that hard faults on a regular basis.

The shipped scsi driver still needs to be updated to NOT "offline" the drives.
the 2.0.5 driver does this quite well.

Thank you.

berkley

^ permalink raw reply	[flat|nested] 21+ messages in thread
[parent not found: <2938942704.1073325455@aslan.btc.adaptec.com.suse.lists.linux.kernel>]
[parent not found: <200401051929.i05JTsM0000014248@mudpuddle.cs.wustl.edu.suse.lists.linux.kernel>]
* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-07 15:35 Berkley Shands
  2004-01-07 19:19 ` badari
  0 siblings, 1 reply; 21+ messages in thread
From: Berkley Shands @ 2004-01-07 15:35 UTC (permalink / raw)
  To: gibbs, linux-kernel, linux-scsi, pbadari

	Running with the force segment merge OFF panics the processor after
about 1000 scsi retries. the error given, also in pci-gart.c, is
pci_map_area overflow 4096 bytes
So a brain dead repair kills the kernel. Someone clearly needs to figure
out where to correct the merge of the sg lists. A bit of doc on the iommu
and the 4096 byte limit would be nice too :-)

I see that is is the aborting of an SCB that causes the sg list halt.

Jan  7 09:18:32 typhoon kernel: DevQ(0:6:0): 0 waiting
Jan  7 09:18:32 typhoon kernel: (scsi0:A:2:0): SCB 0x46 - timed out
Jan  7 09:18:32 typhoon kernel: Recovery SCB completes
Jan  7 09:18:32 typhoon kernel: scsi0: Issued Channel A Bus Reset. 3 SCBs aborted
Jan  7 09:18:46 typhoon kernel: Did it again, boss 0000:01:03.0

Since the sg list merges into one i/o list, simply adding s->length = 4096
back into the list seems to keep the kernel up. a better if slightly less
stupid fix is to add up the remaining sg list lengths, and ajust
the sg[0] entry to sum to the correct value.

/*   		BUG_ON(s->length == 0); */
if (! s->length)
   {
   unsigned long zero = sg[0].length;
   unsigned long remain = 0;
   int t = 0;
   
   BUG_ON(i != 1);		/* some other error here */
   
   for (t = i + 1; t < nents; t++)
      remain += sg[t].length;	/* collect remaining sizes */
   zero -= remain;		/* deduct what is left on the list */
   sg[0].length = zero / 2;
   sg[1].length = zero / 2;	/* allocate uniformly */
   size = zero / 2;		/* reduce oversize first entry */
   printk(KERN_WARNING "Did it again, boss %s\n", dev->slot_name);
   }

The better solution is to have the upper layer fix the sg list, or
have some marker that the list was diddled, and save the old entries
to put it back.

berkley

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps
@ 2004-01-07 16:33 Berkley Shands
  0 siblings, 0 replies; 21+ messages in thread
From: Berkley Shands @ 2004-01-07 16:33 UTC (permalink / raw)
  To: berkley, davem; +Cc: gibbs, linux-kernel, linux-scsi

	Here is a somewhat more robust (and trivial) fix for the sg iommu
problem on the x86_64. Still does not correct the sg list modification logic though :-)

--- /raid0/kernels/linux/arch/x86_64/kernel/pci-gart.c  2004-01-05 10:16:11.000000000 -0600
+++ pci-gart.c  2004-01-07 10:14:47.000000000 -0600
@@ -446,6 +446,16 @@
                return 0;
        out = 0;
        start = 0;
+
+       /* see if we are recovering from an error */
+
+       if (sg[0].oldlength) {
+               printk(KERN_WARNING "Recovering sg list for %s\n", dev->slot_name);
+               for (i = 0; i < nents; i++) {
+                       sg[i].length = sg[i].oldlength;
+                       }
+               }
+
        for (i = 0; i < nents; i++) {
                struct scatterlist *s = &sg[i];
                dma_addr_t addr = page_to_phys(s->page) + s->offset;
@@ -453,6 +463,7 @@
                BUG_ON(s->length == 0); 
 
                size += s->length; 
+               s->oldlength = s->length;
 
                /* Handle the previous not yet processed entries */
                if (i > start) {

--- /raid0/kernels/linux/include/asm-x86_64/scatterlist.h       2003-12-17 20:59:39.000000000 -0600
+++ scatterlist.h       2004-01-07 10:01:06.000000000 -0600
@@ -5,6 +5,7 @@
     struct page                *page;
     unsigned int       offset;
     unsigned int       length;
+    unsigned int       oldlength;
     dma_addr_t         dma_address;
 };


berkley

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2004-01-07 19:22 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-01-05 17:57 [BUG] x86_64 pci_map_sg modifies sg list - fails multiple map/unmaps Justin T. Gibbs
2004-01-05 19:22 ` David S. Miller
2004-01-05 19:41 ` Badari Pulavarty
2004-01-05 19:47 ` Andi Kleen
2004-01-05 20:04   ` Justin T. Gibbs
2004-01-05 20:35   ` David S. Miller
2004-01-05 21:10     ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2004-01-05 19:29 Berkley Shands
2004-01-05 19:28 ` David S. Miller
2004-01-05 20:00 Berkley Shands
     [not found] <2938942704.1073325455@aslan.btc.adaptec.com.suse.lists.linux.kernel>
     [not found] ` <m3brpi41q0.fsf@averell.firstfloor.org.suse.lists.linux.kernel>
     [not found]   ` <2997092704.1073333041@aslan.btc.adaptec.com.suse.lists.linux.kernel>
2004-01-05 20:58     ` Andi Kleen
     [not found] <200401051929.i05JTsM0000014248@mudpuddle.cs.wustl.edu.suse.lists.linux.kernel>
     [not found] ` <20040105112800.7a9f240b.davem@redhat.com.suse.lists.linux.kernel>
2004-01-05 21:02   ` Andi Kleen
2004-01-05 21:01     ` David S. Miller
2004-01-05 21:31       ` Andi Kleen
2004-01-06  0:05         ` James Bottomley
2004-01-06  3:06           ` Andi Kleen
2004-01-06  3:04             ` David S. Miller
2004-01-06  3:14               ` James Bottomley
2004-01-07 15:35 Berkley Shands
2004-01-07 19:19 ` badari
2004-01-07 16:33 Berkley Shands

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox