public inbox for linux-kernel@vger.kernel.org
* Problem with AMD chipsets. Was "Re: problems doing direct dma from a pci device to pci-e device"
@ 2011-09-29 19:41 Mark Hounschell
  2011-09-29 20:30 ` Mark Hounschell
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Hounschell @ 2011-09-29 19:41 UTC (permalink / raw)
  To: linux-pci@vger.kernel.org; +Cc: Linux-kernel, Mark Hounschell

The original thread is a year or so old, but I never actually figured out 
what the problem was. It's still an issue I need to understand, so I 
thought I would try again. As before, I know this is a PCI thing, but 
I've CC'd LKML just in case. Sorry for the noise, LKML.

Briefly, we have a couple of PCI cards that talk to each other over the 
PCI bus. One is a reflective-memory type (rms) card and the other is a 
special I/O (gpiohsd) card. Both are regular PCI cards. The gpiohsd 
writes and reads data to and from the rms card's memory, and also to 
regular application SHM memory, by way of internal page tables in the 
gpiohsd containing bus addresses of the rms card's memory or the 
application SHM memory. We stuff the gpiohsd page tables with bus 
addresses we get from the page_to_phys call; we used to use 
virt_to_bus. The gpiohsd may do a single read or write, or a large DMA 
read or write. Software never directly tells the gpiohsd card to do any 
transfer, DMA or not. I've looked at the kernel DMA documentation and 
don't really see anything there that helps with my problem (which I 
will explain below). Here is a blurb of the code in one of our GPL 
drivers that we use for getting bus addresses for our rms card memory 
and/or our application SHM memory, which we then stuff into the gpiohsd 
page tables. FYI, a device connected to the gpiohsd card can, by 
itself, cause the gpiohsd to initiate these data transfers.

     /*
      * Get a physical/bus address from our virtual address
      */
     down_read(&current->mm->mmap_sem);

     /*
      * Get around the mlock fix/change in get_user_pages that forces
      * the call to fail if the VM_IO bit is set in vma->vm_flags
      *
      * As of 2.6.16 kernels get_user_pages also fails when the
      * new VM_PFNMAP bit in vm->flags is set. It is OK to
      * reset this bit also as long as we return the bit to
      * its original set condition.
      */

     vma = find_vma(current->mm, (unsigned long)pte_info.virt_addr);
     if (!vma) {                  // no mapping covers this address
             up_read(&current->mm->mmap_sem);
             ret = -EFAULT;
             goto out;
     }
     VM_flags = (vma->vm_flags & (VM_IO | VM_PFNMAP));
     vma->vm_flags &= ~(VM_IO | VM_PFNMAP);

     stat = get_user_pages(current, current->mm,
                                      (unsigned long)pte_info.virt_addr,
                                      1,       // one page
                                      1,        // write access
                                      1,        // force
                                      &pages,   // page struct
                                      NULL);    //

     vma->vm_flags |= VM_flags;   // restore vm_flags to the way we found them

     up_read(&current->mm->mmap_sem);

     if (stat != 1) {    // get_user_pages returns the number of pages
             ret = -EFAULT;      // pinned (or -errno); we asked for one
             goto out;
     } else {
             phys_addr = page_to_phys(pages);        // on x86 phys = bus
             page_cache_release(pages);              // drop our reference
     }

     pci_address = phys_addr;
     pte_info.pcimsa = 0;         // We are running 32 bit
     pte_info.pcilsa = pci_address;
     if (PAGE_SIZE < 8192)   // MPX page size
             pte_info.pagesize = PAGE_SIZE;
     else
             pte_info.pagesize = 8192;
     if (copy_to_user((lcrs_pte_struct_t *) arg,
                      &pte_info, sizeof(lcrs_pte_struct_t))) {
             ret = -EFAULT;
             goto out;
     }


We come into the above code with a (page-aligned) virtual address of 
either the rms memory page, obtained via mmap, or our SHM memory page, 
obtained via the standard shm API. This has all worked just fine for 
years. It still does, except when we use a more recent motherboard with 
an AMD chipset and the data to/from the gpiohsd has to cross a PCIe 
bridge to get to the rms memory or to our SHM memory. An example of 
that configuration would be the rms card plugged into the motherboard 
while the gpiohsd sits in a regular PCI expansion rack whose interface 
card is a PCIe card plugged into a PCIe slot. For some reason we appear 
unable to cross PCIe bridges using the bus addresses obtained with the 
method described above.
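
To confirm which bridges actually sit between the two cards, something 
like the following helps (the bus addresses here are made up; 
substitute the ones lspci reports for your rms and gpiohsd cards):

```shell
# Show the PCI bus as a tree; a PCI expansion-rack interface shows up
# as a bridge, and every bridge between the two cards is one the
# peer-to-peer transactions must traverse.
lspci -tv

# The sysfs path of each device spells out its position in the
# hierarchy; compare the two paths to see where they diverge.
readlink /sys/bus/pci/devices/0000:01:00.0
readlink /sys/bus/pci/devices/0000:05:00.0
```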

We have no problem with this same configuration using a MB with an 
nvidia chipset. I suspect it might have something to do with the MB 
that uses the AMD chipset having an IOMMU, but I really don't know for 
sure. I've also read something in the AMD chipset docs about some type 
of restrictions on peer-to-peer transfers, but again I really have no 
idea whether that is related to why I'm having this problem.

Any pointers from anyone (even an AMD guy) out there would again be 
appreciated.

Thanks and regards
Mark

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Problem with AMD chipsets. Was "Re: problems doing direct dma from a pci device to pci-e device"
  2011-09-29 19:41 Problem with AMD chipsets. Was "Re: problems doing direct dma from a pci device to pci-e device" Mark Hounschell
@ 2011-09-29 20:30 ` Mark Hounschell
  2011-09-30  8:46   ` Clemens Ladisch
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Hounschell @ 2011-09-29 20:30 UTC (permalink / raw)
  To: markh; +Cc: linux-pci@vger.kernel.org, Linux-kernel, Mark Hounschell

On 09/29/2011 03:41 PM, Mark Hounschell wrote:
> The original thread is a year or so old but I never actually figured out
> what the problem was. It's still an issue I need to try to understand so
> I thought I would try again. As before, I know this is a PCI thing but
> have CC'd LKML just in case. Sorry for the noise, LKML.
>
> Briefly, we have a couple of PCI cards that talk to each other over the
> pci bus. One is a reflective memory type (rms) card and the other is a
> special I/O (gpiohsd) card. These are both regular PCI cards. The way
> this gpiohsd writes and reads data to and from this rms card memory and
> also to regular application SHM memory is by way of internal page tables
> in the gpiohsd containing bus addresses of the rms cards memory or
> application SHM memory. We stuff the gpiohsd page tables with bus
> addresses we get from the page_to_phys call. We used to use virt_to_bus.
> The gpiohsd may do a single read or write, or a large DMA read or write.
> Software never directly tells this gpiohsd card to do any xfer, DMA or
> not. I've looked at the kernel DMA Documentation and don't really see
> anything there that will help me with my problem (which I will explain
> below). Here is a blurb of the code in one of our GPL drivers we use for
> getting bus addresses for our rms card memory and/or our application SHM
> memory that we stuff into the gpiohsd page tables. FYI, a device
> connected to this gpiohsd card can, by itself, cause the gpiohsd to
> initiate these data transfers.
>
>      /*
>       * Get a physical/bus address from our virtual address
>       */
>      down_read(&current->mm->mmap_sem);
>
>      /*
>       * Get around the mlock fix/change in get_user_pages that forces
>       * the call to fail if the VM_IO bit is set in vma->vm_flags
>       *
>       * As of 2.6.16 kernels get_user_pages also fails when the
>       * new VM_PFNMAP bit in vm->flags is set. It is OK to
>       * reset this bit also as long as we return the bit to
>       * its original set condition.
>       */
>
>      vma = find_vma(current->mm, (unsigned long)pte_info.virt_addr);
>      VM_flags = (vma->vm_flags & (VM_IO | VM_PFNMAP));
>      vma->vm_flags &= ~(VM_IO | VM_PFNMAP);
>
>      stat = get_user_pages(current, current->mm,
>                                       (unsigned long)pte_info.virt_addr,
>                                       1,       // one page
>                                       1,        // write access
>                                       1,        // force
>                                       &pages,   // page struct
>                                       NULL);
>
>      vma->vm_flags |= VM_flags;   // restore vm_flags to the way we found them
>
>      up_read(&current->mm->mmap_sem);
>
>      if (stat != 1) {
>              ret = -EFAULT;
>              goto out;
>      } else {
>              phys_addr = page_to_phys(pages);        // on x86 phys = bus
>              page_cache_release(pages);
>      }
>
>      pci_address = phys_addr;
>      pte_info.pcimsa = 0;         // We are running 32 bit
>      pte_info.pcilsa = pci_address;
>      if (PAGE_SIZE < 8192)   // MPX page size
>              pte_info.pagesize = PAGE_SIZE;
>      else
>              pte_info.pagesize = 8192;
>      if (copy_to_user((lcrs_pte_struct_t *) arg,
>                       &pte_info, sizeof(lcrs_pte_struct_t))) {
>              ret = -EFAULT;
>              goto out;
>      }
>

The above code is only used for the SHM case, sorry. For the rms case we 
get the bus address directly from the rms card's driver.

> We come into the above code with a virtual address (page aligned) of
> either the rms memory page, via mmap, or our SHM memory page, using the
> standard shm API. This has all worked just fine for years. And it still
> does except for when we are using a more recent MB with an AMD chipset
> and the data to/from the gpiohsd has to cross a pci-e bridge to get to
> the rms memory or to our SHM memory. An example of that configuration
> would be when the rms card is plugged into the MB and the gpiohsd is in
> a regular PCI expansion rack and the expansion rack interface card is a
> pci-e card plugged into a pci-e slot. It appears we are unable to cross
> over pci-e bridges for some reason using the bus addresses obtained
> using the method described above.
>
> We have no problem with this same configuration using a MB with an
> nvidia chipset. I suspect it might have something to do with the MB
> that uses the AMD chipset having an IOMMU, but I really don't know for
> sure. I've also read something in the AMD chipset docs about some type
> of restrictions on peer to peer transfers but again I really have no
> idea if this is related to why I'm having this problem.
>
> Any pointers from anyone (even an AMD guy) out there would again be
> appreciated.
>


* Re: Problem with AMD chipsets. Was "Re: problems doing direct dma from a pci device to pci-e device"
  2011-09-29 20:30 ` Mark Hounschell
@ 2011-09-30  8:46   ` Clemens Ladisch
  2011-09-30 10:08     ` Mark Hounschell
  0 siblings, 1 reply; 5+ messages in thread
From: Clemens Ladisch @ 2011-09-30  8:46 UTC (permalink / raw)
  To: markh; +Cc: linux-pci@vger.kernel.org, Linux-kernel, Mark Hounschell

Mark Hounschell wrote:
> We have no problem with this same configuration using a MB with an
> nvidia chipset. I suspect it might have something to do with the MB
> that uses the AMD chipset having an IOMMU, but I really don't know for
> sure. I've also read something in the AMD chipset docs about some type
> of restrictions on peer to peer transfers but again I really have no
> idea if this is related to why I'm having this problem.

According to the published RS780 docs, "P2P traffic could be only memory
writes" (RPR 2.7).  In any case, check the P2P bits (MISC is described
in BDG 2.4).


Regards,
Clemens


* Re: Problem with AMD chipsets. Was "Re: problems doing direct dma from a pci device to pci-e device"
  2011-09-30  8:46   ` Clemens Ladisch
@ 2011-09-30 10:08     ` Mark Hounschell
  2011-09-30 10:55       ` Clemens Ladisch
  0 siblings, 1 reply; 5+ messages in thread
From: Mark Hounschell @ 2011-09-30 10:08 UTC (permalink / raw)
  To: Clemens Ladisch; +Cc: markh, linux-pci@vger.kernel.org, Linux-kernel

On 09/30/2011 04:46 AM, Clemens Ladisch wrote:
> Mark Hounschell wrote:
>> We have no problem with this same configuration using a MB with an
>> nvidia chipset. I suspect it might have something to do with the MB
>> that uses the AMD chipset having an IOMMU, but I really don't know for
>> sure. I've also read something in the AMD chipset docs about some type
>> of restrictions on peer to peer transfers but again I really have no
>> idea if this is related to why I'm having this problem.
>
> According to the published RS780 docs, "P2P traffic could be only memory
> writes" (RPR 2.7).  In any case, check the P2P bits (MISC is described
> in BDG 2.4).
>
>

I wonder what they expect you to get out of "P2P traffic _could_ be only 
memory writes". Does that mean it _can_ be configured as "P2P traffic is 
enabled for only writes"? In any case I can do neither reads nor writes.

As for the MISC bits described in the BDG, it appears that by default all 
the P2PDIS bits are set to 0. Are there tools that would let me look at 
these bits, and even change them if they are set? Would these bits 
normally be set/reset by the BIOS or the OS?

I previously asked about this problem on the AMD developers forum but got 
no response.

Thanks
Mark


* Re: Problem with AMD chipsets. Was "Re: problems doing direct dma from a pci device to pci-e device"
  2011-09-30 10:08     ` Mark Hounschell
@ 2011-09-30 10:55       ` Clemens Ladisch
  0 siblings, 0 replies; 5+ messages in thread
From: Clemens Ladisch @ 2011-09-30 10:55 UTC (permalink / raw)
  To: dmarkh; +Cc: markh, linux-pci@vger.kernel.org, Linux-kernel

Mark Hounschell wrote:
> I wonder what they expect you to get out of "P2P traffic _could_ be only
> memory writes". Does that mean it _can_ be configured as "P2P traffic is
> enabled for only writes"? In any case I can do neither reads nor writes.

AFAICT you won't be able to do P2P reads in any case.

> As for the MISC bits described in the BDG, it appears that by default all
> the P2PDIS bits will be set to 0. Are there tools that would enable me to
> look at these bits and even change them if they are set.

setpci
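
For example, something like the following (the bridge address and the 
register offset 0x4c are placeholders; take the real values for the 
MISC/P2PDIS register from the BDG):

```shell
# Read a 32-bit register from the bridge's config space. Replace
# 0000:00:02.0 and 0x4c with the device and offset from the BDG.
setpci -s 0000:00:02.0 0x4c.l

# Clear a hypothetical P2PDIS bit (here bit 4) while leaving the rest
# of the register alone, using setpci's value:mask syntax -- only the
# bits set in the mask are written.
setpci -s 0000:00:02.0 0x4c.l=0x00000000:0x00000010
```

lspci -xxx -s <dev> dumps the whole config space if you want to eyeball 
the register first. Note that firmware may rewrite these bits across a 
reboot or resume.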


Regards,
Clemens


end of thread, other threads:[~2011-09-30 10:55 UTC | newest]

Thread overview: 5+ messages
2011-09-29 19:41 Problem with AMD chipsets. Was "Re: problems doing direct dma from a pci device to pci-e device" Mark Hounschell
2011-09-29 20:30 ` Mark Hounschell
2011-09-30  8:46   ` Clemens Ladisch
2011-09-30 10:08     ` Mark Hounschell
2011-09-30 10:55       ` Clemens Ladisch
