public inbox for linux-rdma@vger.kernel.org
* Re: (R)DMA in userspace
       [not found] ` <OF84B00CFA.7F1CDA02-ONC1257A94.00545627-C1257A94.0055881A-Xeyd2O9EBijQT0dZR+AlfA@public.gmane.org>
@ 2012-10-10  7:36   ` Isaac Huang
  2012-10-11 20:44   ` Roland Dreier
  1 sibling, 0 replies; 8+ messages in thread
From: Isaac Huang @ 2012-10-10  7:36 UTC (permalink / raw)
  To: Animesh K Trivedi1; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bernard Metzler

I looked at a similar problem a while back, and it seemed that none of
the IB HW drivers implemented the sync_single_for_[cpu|device] DMA
operations. Maybe I missed some code, but if it's true, then the
dma_sync_* functions would be no-ops on IB devices. Probably cache
coherence was guaranteed somewhere else?

It seemed that the PCIe root complex could manage cache coherence for
PCIe requests, e.g. by reading directly from the host CPU cache.

Thanks,
Isaac

On Thu, Oct 11, 2012 at 05:34:06PM +0200, Animesh K Trivedi1 wrote:
> 
> Hi all,
> 
> It is a curiosity question rather than a bug/issue report.
> 
> The Linux DMA subsystem wants streaming DMA buffers to be synchronized
> before accessing them. This is achieved by calling the dma_sync_* family
> of functions, and I see that these functions are used in kernel clients
> (e.g. xprtrdma and iSER). This is all fine.
> 
> During memory registration, userspace buffers also go through the same
> API calls (dma_map_sg_attrs(...)). What I am confused about is why no
> such synchronization primitives are required in userspace before
> accessing an RDMA data buffer just after an incoming write/recv (DMA
> on it) has finished. Who guarantees data freshness?
> 
> Thanks,
> --
> Animesh
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* (R)DMA in userspace
@ 2012-10-11 15:34 Animesh K Trivedi1
       [not found] ` <OF84B00CFA.7F1CDA02-ONC1257A94.00545627-C1257A94.0055881A-Xeyd2O9EBijQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Animesh K Trivedi1 @ 2012-10-11 15:34 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA; +Cc: Animesh K Trivedi1, Bernard Metzler


Hi all,

It is a curiosity question rather than a bug/issue report.

The Linux DMA subsystem wants streaming DMA buffers to be synchronized
before accessing them. This is achieved by calling the dma_sync_* family
of functions, and I see that these functions are used in kernel clients
(e.g. xprtrdma and iSER). This is all fine.

During memory registration, userspace buffers also go through the same
API calls (dma_map_sg_attrs(...)). What I am confused about is why no
such synchronization primitives are required in userspace before
accessing an RDMA data buffer just after an incoming write/recv (DMA
on it) has finished. Who guarantees data freshness?
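
To make the "data freshness" concern concrete, here is a minimal toy
model in plain C, not kernel code. It assumes a hypothetical machine
with a single cache line and a DMA engine that does not snoop the CPU
cache; the names toy, cpu_read, device_dma_write and sync_for_cpu are
invented for this sketch and only stand in for what a dma_sync_*-style
step has to achieve.

```c
#include <stdbool.h>
#include <string.h>

#define LINE 64  /* assumed cache line size for the toy model */

/* One line of RAM plus one CPU cache line that may shadow it. */
struct toy {
    unsigned char ram[LINE];
    unsigned char cache[LINE];
    bool cached;  /* true when the CPU holds a (possibly stale) copy */
};

/* CPU load: served from the cache whenever the line is present. */
static unsigned char cpu_read(struct toy *t, int i)
{
    if (!t->cached) {
        memcpy(t->cache, t->ram, LINE);  /* cache fill from RAM */
        t->cached = true;
    }
    return t->cache[i];
}

/* Non-coherent device write: RAM changes, the CPU cache is not told. */
static void device_dma_write(struct toy *t, unsigned char v)
{
    memset(t->ram, v, LINE);
}

/* Stand-in for a sync-for-cpu step: drop the stale cached copy. */
static void sync_for_cpu(struct toy *t)
{
    t->cached = false;
}
```

In this model, a cpu_read() issued after device_dma_write() keeps
returning the old value until sync_for_cpu() is called. On a
cache-coherent platform the hardware performs that invalidation itself.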

Thanks,
--
Animesh


* Re: (R)DMA in userspace
       [not found] ` <OF84B00CFA.7F1CDA02-ONC1257A94.00545627-C1257A94.0055881A-Xeyd2O9EBijQT0dZR+AlfA@public.gmane.org>
  2012-10-10  7:36   ` Isaac Huang
@ 2012-10-11 20:44   ` Roland Dreier
       [not found]     ` <CAL1RGDUOMz7Qf8bX7hZpJgARGepLQRwY25f6Q1utYBZ0taMs9A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 8+ messages in thread
From: Roland Dreier @ 2012-10-11 20:44 UTC (permalink / raw)
  To: Animesh K Trivedi1; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bernard Metzler

On Thu, Oct 11, 2012 at 8:34 AM, Animesh K Trivedi1 <ZRLATR-Xeyd2O9EBijQT0dZR+AlfA@public.gmane.org> wrote:
>
> During memory registration, userspace buffers also go through the same
> API calls (dma_map_sg_attrs(...)). What I am confused about is why no
> such synchronization primitives are required in userspace before
> accessing an RDMA data buffer just after an incoming write/recv (DMA
> on it) has finished. Who guarantees data freshness?

No one has really ever tried to deal with the issue of userspace RDMA on
a cache-incoherent architecture.  Basically if you try the current stack, the
in-kernel users (IPoIB etc) should be OK but libibverbs etc. will be completely
broken.

 - R.

* Re: (R)DMA in userspace
       [not found]     ` <CAL1RGDUOMz7Qf8bX7hZpJgARGepLQRwY25f6Q1utYBZ0taMs9A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-11 21:04       ` Or Gerlitz
       [not found]         ` <CAJZOPZJiEj7rMjF1ouukCPAGCXaNBhHoG1-YuDfEikvM-LLrXg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2012-10-12  9:12       ` Yann Droneaud
  1 sibling, 1 reply; 8+ messages in thread
From: Or Gerlitz @ 2012-10-11 21:04 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Animesh K Trivedi1, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Bernard Metzler

On Thu, Oct 11, 2012 at 10:44 PM, Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> wrote:

> No one has really ever tried to deal with the issue of userspace RDMA on
> a cache-incoherent architecture.  Basically if you try the current stack, the
> in-kernel users (IPoIB etc) should be OK but libibverbs etc. will be completely broken.

I think the question might refer even to cache-coherent systems, e.g.
in the kernel IB core and ULPs all buffers are DMA mapped to/from the
device before/after they are touched by the CPU and vice versa, whereas
in user space, after the buffers are registered once, they are
repeatedly touched by the CPUs and provided to the HW for DMA, i.e. all
user space buffers are treated like kernel DMA coherent ones.

Or.

* Re: (R)DMA in userspace
       [not found]         ` <CAJZOPZJiEj7rMjF1ouukCPAGCXaNBhHoG1-YuDfEikvM-LLrXg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2012-10-12  8:54           ` Animesh K Trivedi1
  2012-10-12 23:10           ` Jason Gunthorpe
  1 sibling, 0 replies; 8+ messages in thread
From: Animesh K Trivedi1 @ 2012-10-12  8:54 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Bernard Metzler, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Roland Dreier

On x86, hardware manages cache coherency. As far as I understand, DMA
sync operations are no-ops on x86. But the confusion arose when I
realised that there are no arch-specific userspace libraries. So
essentially, as Roland pointed out, userspace applications will break on
a non-cache-coherent architecture. This will happen for IB as well.

Can the PCIe root complex read dirty data from CPU caches? I am seeing
almost 100% LLC misses (L3 on Nehalem, which is inclusive of L1 and L2)
for memcpy on RDMA buffers after transmission. Depending on your CPU,
the LLC miss penalty could be very high and overshadow the performance
gains.

Has anyone ever tried using RDMA on non-cache-coherent systems? I think
cache line flushing is not a privileged instruction and can be called
without going to the kernel for memory-cache synchronization.
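
As a rough illustration of the unprivileged flush, the sketch below
walks a buffer one cache line at a time and flushes each line with the
x86 SSE2 intrinsic _mm_clflush(), followed by _mm_mfence(). The 64-byte
line size and the helper name flush_range are assumptions made for this
example. On builds without SSE2 the flush is compiled out, so the
function only counts lines there.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence */
#endif

#define CACHE_LINE 64  /* assumed line size; query the CPU in real code */

/* Flush every cache line overlapping [addr, addr + len).
 * Returns the number of lines visited. */
static size_t flush_range(const void *addr, size_t len)
{
    uintptr_t p = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines = 0;

    for (; p < end; p += CACHE_LINE) {
#if defined(__SSE2__)
        _mm_clflush((const void *)p);  /* write back and invalidate */
#endif
        lines++;
    }
#if defined(__SSE2__)
    _mm_mfence();  /* order the flushes against later loads and stores */
#endif
    return lines;
}
```

A buffer that starts in the middle of a cache line touches one more
line than its length suggests, which is why the start address is
rounded down before the loop.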

Thoughts?

Thanks,
--
Animesh

Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote on 10/11/2012 11:04:02 PM:
>
> On Thu, Oct 11, 2012 at 10:44 PM, Roland Dreier
> <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> wrote:
>
> > No one has really ever tried to deal with the issue of userspace RDMA
> > on a cache-incoherent architecture.  Basically if you try the current
> > stack, the in-kernel users (IPoIB etc) should be OK but libibverbs
> > etc. will be completely broken.
>
> I think the question might refer even to cache-coherent systems, e.g.
> in the kernel IB core and ULPs all buffers are DMA mapped to/from the
> device before/after they are touched by the CPU and vice versa, whereas
> in user space, after the buffers are registered once, they are
> repeatedly touched by the CPUs and provided to the HW for DMA, i.e. all
> user space buffers are treated like kernel DMA coherent ones.
>
> Or.
>


* Re: (R)DMA in userspace
       [not found]     ` <CAL1RGDUOMz7Qf8bX7hZpJgARGepLQRwY25f6Q1utYBZ0taMs9A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2012-10-11 21:04       ` Or Gerlitz
@ 2012-10-12  9:12       ` Yann Droneaud
       [not found]         ` <1350033163.2291.22.camel-sQn2kEGNn0pFevvuwOF9vF6hYfS7NtTn@public.gmane.org>
  1 sibling, 1 reply; 8+ messages in thread
From: Yann Droneaud @ 2012-10-12  9:12 UTC (permalink / raw)
  To: Roland Dreier
  Cc: Animesh K Trivedi1, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Bernard Metzler

Hi,

On Thursday, 11 October 2012 at 13:44 -0700, Roland Dreier wrote:
> On Thu, Oct 11, 2012 at 8:34 AM, Animesh K Trivedi1 <ZRLATR-Xeyd2O9EBihhl2p70BpVqQ@public.gmane.org> wrote:
> >
> > During memory registration, userspace buffers also go through the same
> > API calls (dma_map_sg_attrs(...)). What I am confused about is why no
> > such synchronization primitives are required in userspace before
> > accessing an RDMA data buffer just after an incoming write/recv (DMA
> > on it) has finished. Who guarantees data freshness?
> 
> No one has really ever tried to deal with the issue of userspace RDMA on
> a cache-incoherent architecture.  Basically if you try the current stack, the
> in-kernel users (IPoIB etc) should be OK but libibverbs etc. will be completely
> broken.
> 

With the current ARMv7 Cortex-A9 / Cortex-A15 MPCore and the upcoming
64-bit ARM architecture, i.e. ARMv8 aka AArch64, one might want to use
RDMA (InfiniBand/RoCE) with them in the near future to create highly
parallel systems with low power consumption.

But this question and some reading about ARM memory management make me
feel pretty unsure about the ability to use RDMA (InfiniBand) on ARM.

In the article "ARM, DMA, and memory management"
http://lwn.net/Articles/440221/
it is said that a memory page must not be mapped multiple times with
different caching attributes.

This article takes the point of view of Linaro's developers, who want to
upload textures to the GPU without holding them in caches. This behavior
might be applicable to RDMA as well: when writing to a memory region
that will be used for a local IBV_WR_SEND, a local IBV_WR_RDMA_WRITE or
a remote IBV_WR_RDMA_READ, there's probably no need to keep it in cache.

You could also read these other articles: "CMA [contiguous memory
allocator] and ARM" http://lwn.net/Articles/450286/
and "ARM's multiply-mapped memory mess" http://lwn.net/Articles/409689/

After reading this, and not being an ARM expert, I'm wondering whether
RDMA (InfiniBand) can be supported on ARM.

Regards.

-- 
Yann Droneaud
OPTEYA



* Re: (R)DMA in userspace
       [not found]         ` <1350033163.2291.22.camel-sQn2kEGNn0pFevvuwOF9vF6hYfS7NtTn@public.gmane.org>
@ 2012-10-12 14:55           ` Yann Droneaud
  0 siblings, 0 replies; 8+ messages in thread
From: Yann Droneaud @ 2012-10-12 14:55 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA
  Cc: Roland Dreier, Animesh K Trivedi1, Bernard Metzler

On Friday, 12 October 2012 at 11:12 +0200, Yann Droneaud wrote:
> Hi,
> 
> On Thursday, 11 October 2012 at 13:44 -0700, Roland Dreier wrote:
> > On Thu, Oct 11, 2012 at 8:34 AM, Animesh K Trivedi1 <ZRLATR-0BZ7OvD7h6o@public.gmane.org> wrote:
> > >
> > > During memory registration, userspace buffers also go through the same
> > > API calls (dma_map_sg_attrs(...)). What I am confused about is why no
> > > such synchronization primitives are required in userspace before
> > > accessing an RDMA data buffer just after an incoming write/recv (DMA
> > > on it) has finished. Who guarantees data freshness?
> > 
> > No one has really ever tried to deal with the issue of userspace RDMA on
> > a cache-incoherent architecture.  Basically if you try the current stack, the
> > in-kernel users (IPoIB etc) should be OK but libibverbs etc. will be completely
> > broken.
> > 
> 
> With the current ARMv7 Cortex-A9 / Cortex-A15 MPCore and the upcoming
> 64-bit ARM architecture, i.e. ARMv8 aka AArch64, one might want to use
> RDMA (InfiniBand/RoCE) with them in the near future to create highly
> parallel systems with low power consumption.
> 
> But this question and some reading about ARM memory management make me
> feel pretty unsure about the ability to use RDMA (InfiniBand) on ARM.
> 
> In the article "ARM, DMA, and memory management"
> http://lwn.net/Articles/440221/
> it is said that a memory page must not be mapped multiple times with
> different caching attributes.
> 
> This article takes the point of view of Linaro's developers, who want to
> upload textures to the GPU without holding them in caches. This behavior
> might be applicable to RDMA as well: when writing to a memory region
> that will be used for a local IBV_WR_SEND, a local IBV_WR_RDMA_WRITE or
> a remote IBV_WR_RDMA_READ, there's probably no need to keep it in cache.
> 
> You could also read these other articles: "CMA [contiguous memory
> allocator] and ARM" http://lwn.net/Articles/450286/
> and "ARM's multiply-mapped memory mess" http://lwn.net/Articles/409689/
> 
> After reading this, and not being an ARM expert, I'm wondering whether
> RDMA (InfiniBand) can be supported on ARM.
> 

I've found more information in "Implementing DMA on ARM SMP Systems",
Application Note 228
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0228a/index.html

ARMv7 Cortex-A9 seems to be a cache-coherent architecture.

Regards.

-- 
Yann Droneaud
OPTEYA




* Re: (R)DMA in userspace
       [not found]         ` <CAJZOPZJiEj7rMjF1ouukCPAGCXaNBhHoG1-YuDfEikvM-LLrXg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2012-10-12  8:54           ` Animesh K Trivedi1
@ 2012-10-12 23:10           ` Jason Gunthorpe
  1 sibling, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2012-10-12 23:10 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Roland Dreier, Animesh K Trivedi1,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, Bernard Metzler

On Thu, Oct 11, 2012 at 11:04:02PM +0200, Or Gerlitz wrote:
> On Thu, Oct 11, 2012 at 10:44 PM, Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org> wrote:
> 
> > No one has really ever tried to deal with the issue of userspace
> > RDMA on a cache-incoherent architecture.  Basically if you try the
> > current stack, the in-kernel users (IPoIB etc) should be OK but
> > libibverbs etc. will be completely broken.
> 
> I think the question might refer even to cache-coherent systems, e.g.
> in the kernel IB core and ULPs all buffers are DMA mapped to/from the
> device before/after they are touched by the CPU and vice versa, whereas
> in user space, after the buffers are registered once, they are
> repeatedly touched by the CPUs and provided to the HW for DMA, i.e. all
> user space buffers are treated like kernel DMA coherent ones.

The answer is the same: userspace is designed to rely on a DMA
cache-coherent platform where the only requirement is to issue barrier
instructions, which is done in the providers.
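
As a CPU-only analogy for that barrier requirement, the sketch below
uses C11 atomics: the payload is written first and then published with
a release store, and the reader checks the flag with an acquire load
before touching the payload. This is not provider code. The struct
ring_slot and the publish/consume helpers are hypothetical names, and
real providers use device-specific barriers around doorbell writes and
completion queue reads.

```c
#include <stdatomic.h>
#include <stdint.h>

struct ring_slot {
    uint64_t payload;   /* stands in for a work request or received data */
    atomic_uint ready;  /* stands in for a doorbell or completion flag */
};

/* Producer side: fill in the payload, then publish it. The release
 * store keeps the payload write from being reordered after the flag. */
static void publish(struct ring_slot *s, uint64_t value)
{
    s->payload = value;
    atomic_store_explicit(&s->ready, 1, memory_order_release);
}

/* Consumer side: the acquire load keeps the payload read from being
 * reordered before the flag check. Returns 1 if a value was read. */
static int consume(struct ring_slot *s, uint64_t *out)
{
    if (!atomic_load_explicit(&s->ready, memory_order_acquire))
        return 0;
    *out = s->payload;
    return 1;
}
```

On a coherent platform this ordering is all the software has to
provide; no cache maintenance appears anywhere in the path.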

I'm not sure supporting non-DMA-coherent platforms is even possible with
the verbs API. Yes, we could add the cache ops to the providers; the
information is mostly there. However, non-DMA-coherent systems all
require that once you start a DMA WRITE into a cache line *that line
is never dirtied by the CPU* - which requires application support that
is not even contemplated by verbs. Indeed, I wonder if all the kernel
ULPs meet that restriction?
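
One part of that application support would be keeping CPU-written data
out of any cache line that a DMA buffer occupies. A minimal sketch,
assuming a 64-byte cache line and using only standard C11: the helper
names round_up_to_line, alloc_line_exclusive and shares_no_line are
invented here and are not part of verbs or any provider.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64  /* assumed line size */

/* Round a length up to a whole number of cache lines. */
static size_t round_up_to_line(size_t len)
{
    return (len + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
}

/* Allocate a buffer that starts on a line boundary and is padded to
 * end on one, so no unrelated data can share its cache lines. */
static void *alloc_line_exclusive(size_t len)
{
    if (len == 0)
        return NULL;
    return aligned_alloc(CACHE_LINE, round_up_to_line(len));
}

/* Check that [buf, buf + len) covers only whole cache lines. */
static int shares_no_line(const void *buf, size_t len)
{
    return (uintptr_t)buf % CACHE_LINE == 0 && len % CACHE_LINE == 0;
}
```

Alignment alone does not satisfy the restriction: the application would
also have to avoid writing to the buffer itself while a DMA WRITE into
it is outstanding.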

Hopefully the forthcoming server-grade ARMs are fully DMA coherent.

Jason

Thread overview: 8+ messages
2012-10-11 15:34 (R)DMA in userspace Animesh K Trivedi1
     [not found] ` <OF84B00CFA.7F1CDA02-ONC1257A94.00545627-C1257A94.0055881A-Xeyd2O9EBijQT0dZR+AlfA@public.gmane.org>
2012-10-10  7:36   ` Isaac Huang
2012-10-11 20:44   ` Roland Dreier
     [not found]     ` <CAL1RGDUOMz7Qf8bX7hZpJgARGepLQRwY25f6Q1utYBZ0taMs9A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-11 21:04       ` Or Gerlitz
     [not found]         ` <CAJZOPZJiEj7rMjF1ouukCPAGCXaNBhHoG1-YuDfEikvM-LLrXg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2012-10-12  8:54           ` Animesh K Trivedi1
2012-10-12 23:10           ` Jason Gunthorpe
2012-10-12  9:12       ` Yann Droneaud
     [not found]         ` <1350033163.2291.22.camel-sQn2kEGNn0pFevvuwOF9vF6hYfS7NtTn@public.gmane.org>
2012-10-12 14:55           ` Yann Droneaud
