From: Ralph Campbell <ralphc@pathscale.com>
To: Roland Dreier <rolandd@cisco.com>
Cc: openib-general <openib-general@openib.org>, linux-kernel@vger.kernel.org
Subject: Suggestions for how to remove bus_to_virt()
Date: Wed, 12 Jul 2006 16:29:27 -0700
Message-ID: <1152746967.4572.263.camel@brick.pathscale.com>

I have been looking at how to eliminate the bus_to_virt() and
phys_to_virt() calls used by the ib_ipath driver.
I am looking for suggestions on how to proceed.

The current IB core to IB device driver interface relies
on a kernel module being able to call ib_get_dma_mr() to allocate
a memory region which represents all of device addressable memory.
The kernel module is then expected to call dma_map_single(),
dma_map_sg(), etc. to convert physical or virtual addresses into
device addresses.  If the system has an IOMMU, there may be several
physical pages mapped to a single contiguous device address region.
This device address and length (possibly an array of them) is then
passed to the IB device driver so the IB device can DMA data
to or from memory.

The ib_ipath driver cannot program the HW to DMA data directly using
device (IOMMU) addresses; it must copy the data itself.  This means the
driver needs to reverse the IOMMU mapping and somehow obtain kernel
virtual addresses so it can memcpy() the data to the correct location.
Currently, the ib_ipath driver requires that the mapping be one-to-one
since there is no practical way to reverse IOMMU mappings.
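
To illustrate why the reverse direction is the hard part, here is a
minimal user-space sketch of a toy IOMMU.  Everything in it (struct
toy_iommu, toy_map_page(), toy_unmap_lookup(), the slot layout) is
hypothetical and is not kernel code.  It only models the point that a
device address can be turned back into a CPU pointer by someone who owns
the translation table, and that treating the device address as if it
were one-to-one yields the wrong pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u
#define TOY_SLOTS     8u

/* Toy IOMMU: slot i translates device address i * TOY_PAGE_SIZE
 * to the CPU page stored in cpu_page[i]. */
struct toy_iommu {
	void *cpu_page[TOY_SLOTS];
	unsigned int used;
};

/* Forward direction, the only one a dma_map_*-style interface offers.
 * Returns UINT64_MAX when the table is full. */
uint64_t toy_map_page(struct toy_iommu *mmu, void *cpu_page)
{
	unsigned int slot;

	if (mmu->used >= TOY_SLOTS)
		return UINT64_MAX;
	slot = mmu->used++;
	mmu->cpu_page[slot] = cpu_page;
	return (uint64_t)slot * TOY_PAGE_SIZE;
}

/* Reverse direction: possible only with access to the private table.
 * Returns NULL for a device address that was never mapped. */
void *toy_unmap_lookup(const struct toy_iommu *mmu, uint64_t dev_addr)
{
	uint64_t slot = dev_addr / TOY_PAGE_SIZE;

	if (slot >= mmu->used)
		return NULL;
	return (char *)mmu->cpu_page[slot] + dev_addr % TOY_PAGE_SIZE;
}
```

A caller that holds only the value returned by toy_map_page() has no way
to recover the CPU pointer unless the mapping happens to be one-to-one,
which is the restriction described above.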

I believe it is generally agreed that trying to change the dma_map_*
interface to include functions of this sort is not the right approach
to take.

One solution is to change the IB device driver interface so that
kernel virtual addresses are passed to the IB device driver and
the device driver is responsible for calling dma_map_single(), etc.
I believe this will be unacceptable to the OpenFabrics community
since it would require the driver to allocate large amounts of memory
(#QPs * #MaxWRs * (sizeof(dma_addr_t) + sizeof(length))) to store the
information needed to undo the mapping when the DMA is complete.
The current IB code allocates the storage for dma_unmap_single(), etc.
as extra elements in structures already needed so it isn't a large
overhead and it is based on the actual number of requests posted
instead of the maximums allowed.
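
The worst-case arithmetic above can be sketched in user-space C.  The
type dma_addr_model_t, struct saved_mapping and worst_case_bytes() are
hypothetical stand-ins, and the 64-bit address with a 32-bit length is an
assumption, not something taken from the IB code.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's dma_addr_t; assumed to be 64 bits wide. */
typedef uint64_t dma_addr_model_t;

/* What the driver would have to remember per work request so that it
 * could undo the mapping once the DMA completes. */
struct saved_mapping {
	dma_addr_model_t addr;
	uint32_t len;
};

/* Bytes reserved up front if every QP is sized for its maximum number
 * of work requests, whether or not they are ever posted. */
size_t worst_case_bytes(size_t num_qps, size_t max_wrs)
{
	return num_qps * max_wrs * sizeof(struct saved_mapping);
}
```

As an example with made-up limits: if the record occupies 16 bytes, as is
typical on 64-bit ABIs, then 1024 QPs with 4096 work requests each would
reserve 64 MiB, regardless of how many requests are actually posted.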

Another solution is to change the IB device driver interface to add
a function which tells the caller what type of addresses the device
expects.  Kernel modules would then be required to pass either a
dma_map_xxx() address or a kernel virtual address based on the
driver's preference.
The current IB consumers start with either kmalloc/vmalloc
memory (such as the MAD layer) or a list of physical pages
(such as ISER and SRP).  The current code could therefore be
changed fairly easily.  The exception is ISER/SRP: when a struct
page doesn't have a direct kernel address (high pages), they would
need to call kmap()/kunmap().
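
A rough user-space sketch of what such an interface could look like
follows.  None of these names exist in the IB core: enum
ib_addr_kind_sketch, struct ib_device_sketch, query_addr_kind() and
sge_addr_for() are invented for illustration, and fake_dma_map() is only
a stub standing in for dma_map_single().

```c
#include <stdint.h>

/* Hypothetical: the kind of address a device driver wants to be given. */
enum ib_addr_kind_sketch {
	IB_ADDR_DMA_SKETCH,	/* result of a dma_map_xxx() call */
	IB_ADDR_KVA_SKETCH,	/* kernel virtual address */
};

/* Hypothetical device structure with the proposed query function. */
struct ib_device_sketch {
	enum ib_addr_kind_sketch (*query_addr_kind)(
		const struct ib_device_sketch *dev);
};

/* Stub for dma_map_single(): flips the top bit so the "device address"
 * never equals the CPU address, as with a remapping IOMMU. */
uint64_t fake_dma_map(void *cpu_addr)
{
	return (uint64_t)(uintptr_t)cpu_addr ^ (UINT64_C(1) << 63);
}

/* What a consumer would do: ask the driver, then hand over the kind of
 * address it prefers. */
uint64_t sge_addr_for(const struct ib_device_sketch *dev, void *buf)
{
	if (dev->query_addr_kind(dev) == IB_ADDR_KVA_SKETCH)
		return (uint64_t)(uintptr_t)buf;
	return fake_dma_map(buf);
}

/* Two example drivers: one that can DMA, one that must memcpy(). */
enum ib_addr_kind_sketch wants_dma(const struct ib_device_sketch *dev)
{
	(void)dev;
	return IB_ADDR_DMA_SKETCH;
}

enum ib_addr_kind_sketch wants_kva(const struct ib_device_sketch *dev)
{
	(void)dev;
	return IB_ADDR_KVA_SKETCH;
}
```

In this sketch a driver like ib_ipath would install wants_kva() and
receive pointers it can memcpy() from directly, while a DMA-capable
driver would install wants_dma() and see no change from today.  The
kmap()/kunmap() handling for high pages is not modeled here.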

I plan to implement this last approach unless someone has
a better idea.  I would like to get some "buy-in" before
I spend a lot of time coding only to be rejected when finished.



