linux-ide.vger.kernel.org archive mirror
* Bad DMA from Marvell 9230
From: Benjamin Herrenschmidt @ 2014-03-27  6:57 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Bartlomiej Zolnierkiewicz, linux-ide, LKML

Hi Folks !

Does this ring any bell ?

I've been trying a 9230 on a power box here (a 9235 on the same machine
works fine) and it blows up with an IOMMU violation early during init.

From what I can tell, the scenario is:

 - We still haven't issued any command per se; all our DMA command
buffers etc. are all 0's at the point of the error.

 - The core libata calls the AHCI driver's ahci_hardreset() for each
port in a separate thread. They all call sata_link_hardreset().

 - This in turn calls sata_link_resume(), which writes to the SCR_CONTROL
register as follows:

                scontrol = (scontrol & 0x0f0) | 0x300;
                if ((rc = sata_scr_write(link, SCR_CONTROL, scontrol))) {
                        printk(" -> sata_link_resume FAIL 2\n");
                        return rc;
                }

                /*
                 * Some PHYs react badly if SStatus is pounded
                 * immediately after resuming.  Delay 200ms before
                 * debouncing.
                 */
                ata_msleep(link->ap, 200);

I get the interrupt from the IOMMU about 2ms after the write to
SCR_CONTROL.
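
For readers decoding that SCR_CONTROL write: a minimal sketch, assuming the SControl field layout from the SATA specification (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The struct and helper names are hypothetical, for illustration only, and are not libata functions. The expression keeps the SPD speed limit, clears DET, and sets IPM to 3 (transitions to Partial and Slumber disabled).

```c
#include <stdint.h>

/* SControl fields, assuming the SATA spec layout:
 * DET[3:0], SPD[7:4], IPM[11:8]. Hypothetical helper type. */
struct scontrol_fields {
	unsigned det;	/* device detection / interface initialization */
	unsigned spd;	/* highest allowed link speed */
	unsigned ipm;	/* allowed power-management transitions */
};

/* Same expression as in the sata_link_resume() snippet quoted above. */
static uint32_t resume_scontrol(uint32_t scontrol)
{
	return (scontrol & 0x0f0) | 0x300;
}

/* Split a raw SControl value into its three fields. */
static struct scontrol_fields decode_scontrol(uint32_t scontrol)
{
	struct scontrol_fields f = {
		.det = scontrol & 0xf,
		.spd = (scontrol >> 4) & 0xf,
		.ipm = (scontrol >> 8) & 0xf,
	};
	return f;
}
```

For example, decode_scontrol(resume_scontrol(0x321)) yields det = 0, spd = 2, ipm = 3: only the speed limit survives from the old value.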

Now, barring misinterpretation of some bits on my side, it looks like
the bad DMA is a DMA *read* from address 0 (which we never map,
typically to catch driver bugs).

I went through a few theories with this one, but so far none has held. I
don't think it's a D2H FIS issue, since the DMA pointers for that appear
to be set up properly, the memory is mapped, etc...

I thought the chip might incorrectly/inadvertently try to (pre)fetch a
command. At that point all 32 command slots are all 0's, so if it
ignored the size it might try to fetch from command address 0.

So I added a loop to fill all 32 slots with a valid command address
in ahci_hardreset:

+	for (i = 0; i < 32; i++)
+		ahci_fill_cmd_slot(pp, i, 0);
 	rc = sata_link_hardreset(link, timing, deadline, &online,
 				 ahci_check_ready);

But that had basically no effect.
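
The prefetch theory can be modelled in user space. This is a minimal sketch, assuming the 32-byte command header layout from the AHCI specification (flags, byte count, command table base low/high, reserved). The struct and function names are hypothetical and this is not the driver's own code. It counts the slots whose command table address is still 0, i.e. the slots a stray fetch would turn into a DMA read from bus address 0.

```c
#include <stddef.h>
#include <stdint.h>

#define AHCI_MAX_CMDS 32

/* Simplified model of one command-list slot (32 bytes).
 * In the real driver these fields are little-endian in memory. */
struct cmd_hdr_model {
	uint32_t opts;		/* flags and PRDT length */
	uint32_t status;	/* byte count transferred */
	uint32_t tbl_addr;	/* command table base, low 32 bits */
	uint32_t tbl_addr_hi;	/* command table base, high 32 bits */
	uint32_t reserved[4];
};

/* Count slots whose command table address is 0. */
static size_t count_null_slots(const struct cmd_hdr_model *slots, size_t n)
{
	size_t i, nulls = 0;

	for (i = 0; i < n; i++)
		if (slots[i].tbl_addr == 0 && slots[i].tbl_addr_hi == 0)
			nulls++;
	return nulls;
}
```

With a zeroed command list this returns 32; once every slot carries a valid table address, as the loop in the diff above intends, it returns 0. Since the fault persisted with all slots filled, the read from address 0 presumably does not come from a command table fetch.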

I've contacted Marvell, but I was wondering if anybody here had already
experienced something similar or has an idea of what else the chip
might be doing wrong so we can try to find a workaround ?

Cheers,
Ben.


