From: jimmie@sackheads.org (Jimmie Mayfield)
To: kernelnewbies@lists.kernelnewbies.org
Subject: x86 driver help:  ever see DMA and MMIO operations NMI depending on which PCIe slot you're installed in?
Date: Sun, 17 May 2015 06:27:53 -0000
Message-ID: <555834E5.9060309@sackheads.org>

Hi all.  We're in the midst of performing system compatibility tests 
with our device and I'm seeing some odd behavior when testing Lenovo 
x3650 M4 and M5 machines.

1) On the x3650 M4 machine, an attempt to perform a TODEVICE DMA 
operation using consistent memory results in an NMI when the device 
attempts to fetch the memory.  Here's the thing:  this NMI only happens 
when the device is plugged into certain PCIe slots.  Some slots appear 
to work fine.

We obtained PCIe bus analyzer traces for the NMI scenario and sent them 
to Lenovo for analysis.  Their response was a rather terse "memory 
address is invalid".  We later obtained an analyzer trace for a non-NMI 
scenario in a different slot and saw the very same bus address.  So I'm 
very confused:  is it possible for a bus address allocated via 
pci_alloc_consistent() to be valid only for specific PCIe slots? 
Shouldn't the kernel be able to allocate valid memory since it's given 
the pci_dev * as an argument?

2) On the x3650 M5 machine, an attempt to perform MMIO operations 
results in an NMI.  Again, this happens only in certain slots; others 
seem to work fine.  As near as I can tell, the slots are physically the 
same -- same width, same power capabilities, etc.

Again, we obtained PCIe bus analyzer traces for both the NMI and the 
non-NMI scenarios and compared them.  There's a lot of noise in the 
traces because the machine BIOS appears to poll the PCI registers 
frequently, but once the driver enables the device, we don't see 
anything that stands out in either case.  In both traces, we see the 
device receive the MMIO read request and respond about a microsecond 
later.  In the NMI trace, the NMI occurs after the device sends the 
MMIO read response.

So I'm scratching my head.  Has anyone seen such slot-specific behavior? 
  How does one account for this in a device driver?
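
For anyone wanting to double-check the "same width" observation from
software rather than by eye, here is a minimal userspace sketch.  It
is not part of our driver and the helper names (load_config,
pcie_link_info, struct link_info) are made up for illustration.  It
assumes the device's config space has been read from a file such as
/sys/bus/pci/devices/<BDF>/config, walks the standard capability list
to the PCI Express capability, and decodes the negotiated speed and
width from the Link Status register.  It only reports what the link
trained to; it says nothing about why an NMI fires.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PCI_STATUS           0x06  /* Status register offset            */
#define PCI_STATUS_CAP_LIST  0x10  /* Status bit 4: capability list     */
#define PCI_CAPABILITY_LIST  0x34  /* offset of first capability ptr    */
#define PCI_CAP_ID_EXP       0x10  /* PCI Express capability ID         */
#define PCI_EXP_LNKSTA       0x12  /* Link Status, relative to the cap  */

/* Hypothetical result type for this sketch. */
struct link_info {
    unsigned speed;  /* Link Status bits 3:0 (1 = 2.5GT/s, 2 = 5GT/s, 3 = 8GT/s) */
    unsigned width;  /* Link Status bits 9:4 (negotiated lane count)             */
};

/* Config space is little-endian. */
static uint16_t rd16(const uint8_t *cfg, size_t off)
{
    return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Read up to cap bytes of config space from path; returns bytes read. */
static size_t load_config(const char *path, uint8_t *buf, size_t cap)
{
    FILE *f = fopen(path, "rb");
    size_t n;

    if (!f)
        return 0;
    n = fread(buf, 1, cap, f);
    fclose(f);
    return n;
}

/* Returns 0 and fills *out on success, -1 if no PCIe capability is found. */
static int pcie_link_info(const uint8_t *cfg, size_t len, struct link_info *out)
{
    uint8_t pos;
    int guard;

    if (len < 64 || !(rd16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
        return -1;

    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
    /* guard bounds the walk in case the list is malformed or loops */
    for (guard = 0; guard < 48 && pos >= 0x40 &&
                    (size_t)pos + PCI_EXP_LNKSTA + 2 <= len; guard++) {
        if (cfg[pos] == PCI_CAP_ID_EXP) {
            uint16_t lnksta = rd16(cfg, pos + PCI_EXP_LNKSTA);

            out->speed = lnksta & 0xF;
            out->width = (lnksta >> 4) & 0x3F;
            return 0;
        }
        pos = cfg[pos + 1] & 0xFC;  /* next capability pointer */
    }
    return -1;
}
```

To use it, call load_config() on the device's config file with the
card in a "good" slot and again in a "bad" slot, pass each buffer to
pcie_link_info(), and compare the two results.  Reading past the
first 64 bytes of config space through sysfs generally requires root,
so run it as root or the capability walk may come up empty.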
