linux-kernel.vger.kernel.org archive mirror
* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-01 20:02 Roland Dreier
@ 2008-04-01 16:55 ` Shirley Ma
  2008-04-02  7:22 ` Shirley Ma
  2008-04-02 12:31 ` Tziporet Koren
  2 siblings, 0 replies; 17+ messages in thread
From: Shirley Ma @ 2008-04-01 16:55 UTC (permalink / raw)
  To: Roland Dreier; +Cc: general, linux-kernel

On Tue, 2008-04-01 at 13:02 -0700, Roland Dreier wrote:
> - Multiple CQ event vector support.  I still haven't seen any
>    discussions about how ULPs or userspace apps should decide which
>    vector to use, and hence no progress has been made since we
>    deferred this during the 2.6.23 merge window. 

I prototyped multiple CQ event vector support for IPoIB, and the
approach improved performance when aggregating multiple links. I also
see customer requirements for this in userspace. I will start the
discussion as soon as possible, but it will most likely miss the 2.6.26
window.

Thanks
Shirley
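
The open question quoted above is how a ULP or userspace app should pick
a completion vector. One possible policy, shown only as an illustration
and not something proposed in this thread, is to hand out vectors
round-robin so that CQs spread evenly across the available vectors. The
names make_vector_picker and num_comp_vectors are hypothetical; this is
plain Python, not the verbs API.

```python
from itertools import count


def make_vector_picker(num_comp_vectors):
    """Return a function that hands out completion vectors round-robin.

    num_comp_vectors is assumed to be the number of completion (event)
    vectors the device exposes; the name is illustrative only.
    """
    if num_comp_vectors < 1:
        raise ValueError("device must expose at least one completion vector")
    counter = count()

    def pick():
        # Each call returns the next vector index, wrapping around.
        return next(counter) % num_comp_vectors

    return pick


# Example: six CQs created on a device with four vectors.
pick = make_vector_picker(4)
assignments = [pick() for _ in range(6)]
print(assignments)  # [0, 1, 2, 3, 0, 1]
```

Round-robin ignores CPU affinity and load, which is part of what would
still need discussing; it is only the simplest policy to state.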



* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-01 20:02 Roland Dreier
  2008-04-01 16:55 ` [ofa-general] " Shirley Ma
@ 2008-04-02  7:22 ` Shirley Ma
  2008-04-02 15:27   ` Roland Dreier
  2008-04-02 12:31 ` Tziporet Koren
  2 siblings, 1 reply; 17+ messages in thread
From: Shirley Ma @ 2008-04-02  7:22 UTC (permalink / raw)
  To: Roland Dreier; +Cc: general, linux-kernel

What's the status of RDS?

Thanks
Shirley



* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-01 20:02 Roland Dreier
  2008-04-01 16:55 ` [ofa-general] " Shirley Ma
  2008-04-02  7:22 ` Shirley Ma
@ 2008-04-02 12:31 ` Tziporet Koren
  2008-04-02 16:19   ` Roland Dreier
  2008-04-04  5:54   ` Or Gerlitz
  2 siblings, 2 replies; 17+ messages in thread
From: Tziporet Koren @ 2008-04-02 12:31 UTC (permalink / raw)
  To: Roland Dreier; +Cc: general, linux-kernel

Roland Dreier wrote:
> Core:
>
>  - I did a bunch of cleanups all over drivers/infiniband and the
>    gcc and sparse warning noise is down to a pretty reasonable level.
>    Further cleanups welcome of course.
>   
We want to add send with invalidate and masked compare and swap.
Eli will be able to send the patches next week, and since they are small
I think they can go in for 2.6.26.
> ULPs:
>
>  - I merged Eli's IPoIB stateless offload changes for checksum
>    offload and LSO changes.  The interrupt moderation changes are
>    next, and should not be a problem to merge.  Please test IPoIB
>    on all sorts of hardware!
>   
What about the split CQ for UD mode? It's improved the IPoIB performance 
for small messages significantly.
>
> HW specific:
>
>   
mlx4: we plan to send patches that touch only the low-level driver, in
order to enable mlx4_en. They should be ready next week, and I hope they
can get in too.
> Here are a few topics that I believe will not be ready in time for the
> 2.6.26 window and will need to wait for 2.6.27 at least:
>
>  - XRC.  I still don't have a good feeling that we have settled on all
>    the nuances of the ABI we want to expose to userspace for this, and
>    ideally I would like to understand how ehca LL QPs fit into the
>    picture as well.
>   
I think we should try to push for XRC in 2.6.26, since there are already
MPI implementations that use it, and this ties them to OFED only.
Also, this feature is stable and is now being defined in the IBTA.
Not taking it causes differences between OFED and the kernel and your
libibverbs, and we wish to avoid such gaps.
Is there anything we can do to help get it into 2.6.26?




* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02  7:22 ` Shirley Ma
@ 2008-04-02 15:27   ` Roland Dreier
  2008-04-02 17:11     ` Richard Frank
  0 siblings, 1 reply; 17+ messages in thread
From: Roland Dreier @ 2008-04-02 15:27 UTC (permalink / raw)
  To: Shirley Ma; +Cc: linux-kernel, general

 > What's the status of RDS?

I've never seen any patches.  I guess ask the RDS guys if/when they want
to start working on getting RDS merged.

 - R.


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 17:11     ` Richard Frank
@ 2008-04-02 16:15       ` Roland Dreier
  2008-04-02 17:18         ` Richard Frank
  2008-04-02 17:24         ` Richard Frank
  0 siblings, 2 replies; 17+ messages in thread
From: Roland Dreier @ 2008-04-02 16:15 UTC (permalink / raw)
  To: Richard Frank; +Cc: Shirley Ma, linux-kernel, general, rds-devel

 > What is the work we need to do here - I was thinking RDS should just work ?

Stuff doesn't get merged into the kernel on its own.  If you want RDS
upstream then the first step is to post patches in a form suitable for
reviewing.  Then respond to the review comments.

The files Documentation/SubmittingPatches and to some extent
Documentation/SubmittingDrivers in the kernel source have more info.

 - R.


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 12:31 ` Tziporet Koren
@ 2008-04-02 16:19   ` Roland Dreier
  2008-04-03 11:40     ` Tziporet Koren
  2008-04-04 20:26     ` Richard Frank
  2008-04-04  5:54   ` Or Gerlitz
  1 sibling, 2 replies; 17+ messages in thread
From: Roland Dreier @ 2008-04-02 16:19 UTC (permalink / raw)
  To: tziporet; +Cc: general, linux-kernel

 > We want to add send with invalidate & mask compare and swap.
 > Eli will be able to send the patches next week and since they are
 > small I think they can be in for 2.6.26

Send with invalidate should be OK.  Let's see about the masked atomics
stuff -- we have a ton of new verbs and I think we might want to slow
down and make sure it all makes sense.

 > What about the split CQ for UD mode? It's improved the IPoIB
 > performance for small messages significantly.

Oh yeah... I'll try to get that in too.

 > mlx4- we plan to send patches for the low level driver only to enable
 > mlx4_en. These only affect our low level driver.

No problem in principle, let's see the actual patches.

 > I think we should try to push for XRC in 2.6.26 since there are
 > already MPI implementation that use it and this ties them to use OFED
 > only.
 > Also this feature is stable and now being defined in IBTA
 > Not taking it causing changes between OFED and the kernel and your
 > libibverbs and we wish to avoid such gaps.
 > Is there any thing we can do to help and make it into 2.6.26?

I don't have a good feeling that the user-kernel interface is well
thought out, so I want to consider XRC + ehca LL stuff + new iWARP verbs
and make sure we have something that makes sense for the future.

 - R.
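
For readers unfamiliar with the masked atomics mentioned above, here is
a minimal sketch of how a masked compare-and-swap is usually described.
The thread does not spell out the semantics, so the rule below is an
assumption: only the bits selected by compare_mask are compared, and
only the bits selected by swap_mask are replaced. The function name and
the dict standing in for remote memory are hypothetical, and the real
operation is performed atomically by the HCA, which this model does not
attempt to show.

```python
MASK64 = (1 << 64) - 1  # the operation is modelled on a 64-bit word


def masked_cmp_swap(memory, addr, compare, compare_mask, swap, swap_mask):
    """Model of a masked compare-and-swap on memory[addr].

    Assumed semantics: if the bits of the current value selected by
    compare_mask equal the same bits of `compare`, replace only the bits
    selected by swap_mask with the corresponding bits of `swap`.
    The original value is always returned.
    """
    old = memory[addr] & MASK64
    if (old & compare_mask) == (compare & compare_mask):
        memory[addr] = ((old & ~swap_mask) | (swap & swap_mask)) & MASK64
    return old


# Treat the low byte as a small tag field and leave the upper bits alone.
memory = {0x1000: 0x1000000AA}
first = masked_cmp_swap(memory, 0x1000, 0xAA, 0xFF, 0x55, 0xFF)
second = masked_cmp_swap(memory, 0x1000, 0xAA, 0xFF, 0x55, 0xFF)
print(hex(first), hex(second), hex(memory[0x1000]))
# 0x1000000aa 0x100000055 0x100000055
```

The second call fails its comparison because the low byte is already
0x55, so memory is left unchanged. With both masks set to all ones this
reduces to the standard compare-and-swap.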


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 17:18         ` Richard Frank
@ 2008-04-02 16:26           ` Roland Dreier
  2008-04-02 17:28             ` Richard Frank
  0 siblings, 1 reply; 17+ messages in thread
From: Roland Dreier @ 2008-04-02 16:26 UTC (permalink / raw)
  To: Richard Frank; +Cc: Shirley Ma, linux-kernel, general, rds-devel

 > Yes, I see this is for pushing RDS upstream - but what about running
 > RDS as is over IWARP NICs - that should just work right ?

No idea.  It depends on whether you took into account the differences
between IB and iWARP.  Anyway that's not really what this thread was about.


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
       [not found] <OF1D776305.7E25BAA6-ON8725741F.005B027C-8825741F.002F5161@us.ibm.com>
@ 2008-04-02 16:37 ` Roland Dreier
  0 siblings, 0 replies; 17+ messages in thread
From: Roland Dreier @ 2008-04-02 16:37 UTC (permalink / raw)
  To: Shirley Ma
  Cc: Richard Frank, general, general-bounces, linux-kernel, rds-devel

 > Can the maintainer submit RDS patch for mainline kernel, in 2.6.26 or
 > 2.6.27 window? It's hard for Distros pick this feature without mainline
 > kernel acceptance.

At least as a first order approximation, there is no chance of RDS being
merged for 2.6.26 even if patches appear right this second...


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 15:27   ` Roland Dreier
@ 2008-04-02 17:11     ` Richard Frank
  2008-04-02 16:15       ` Roland Dreier
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Frank @ 2008-04-02 17:11 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Shirley Ma, linux-kernel, general, rds-devel

What is the work we need to do here - I was thinking RDS should just work ?

Roland Dreier wrote:
>  > What's the status of RDS?
>
> I've never seen any patches.  I guess ask the RDS guys if/when they want
> to start working on getting RDS merged.
>
>  - R.


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 16:15       ` Roland Dreier
@ 2008-04-02 17:18         ` Richard Frank
  2008-04-02 16:26           ` Roland Dreier
  2008-04-02 17:24         ` Richard Frank
  1 sibling, 1 reply; 17+ messages in thread
From: Richard Frank @ 2008-04-02 17:18 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Shirley Ma, linux-kernel, general, rds-devel

Yes, I see this is for pushing RDS upstream, but what about running RDS
as is over iWARP NICs? That should just work, right?

Roland Dreier wrote:
>  > What is the work we need to do here - I was thinking RDS should just work ?
>
> Stuff doesn't get merged into the kernel on its own.  If you want RDS
> upstream then the first step is to post patches in a form suitable for
> reviewing.  Then respond to the review comments.
>
> The files Documentation/SubmittingPatches and to some extent
> Documentation/SubmittingDrivers in the kernel source have more info.
>
>  - R.
>   


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 16:15       ` Roland Dreier
  2008-04-02 17:18         ` Richard Frank
@ 2008-04-02 17:24         ` Richard Frank
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Frank @ 2008-04-02 17:24 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Shirley Ma, linux-kernel, general, rds-devel

WRT merging RDS into the kernel: our current plan is to wait until RDS
is adopted by more than just Oracle before approaching the kernel
community about inclusion of RDS.

Roland Dreier wrote:
>  > What is the work we need to do here - I was thinking RDS should just work ?
>
> Stuff doesn't get merged into the kernel on its own.  If you want RDS
> upstream then the first step is to post patches in a form suitable for
> reviewing.  Then respond to the review comments.
>
> The files Documentation/SubmittingPatches and to some extent
> Documentation/SubmittingDrivers in the kernel source have more info.
>
>  - R.
>   


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 16:26           ` Roland Dreier
@ 2008-04-02 17:28             ` Richard Frank
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Frank @ 2008-04-02 17:28 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Shirley Ma, linux-kernel, general, rds-devel

got it...

Roland Dreier wrote:
>  > Yes, I see this is for pushing RDS upstream - but what about running
>  > RDS as is over IWARP NICs - that should just work right ?
>
> No idea.  It depends on whether you took into account the differences
> between IB and iWARP.  Anyway that's not really what this thread was about.
>   


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 16:19   ` Roland Dreier
@ 2008-04-03 11:40     ` Tziporet Koren
  2008-04-04 20:26     ` Richard Frank
  1 sibling, 0 replies; 17+ messages in thread
From: Tziporet Koren @ 2008-04-03 11:40 UTC (permalink / raw)
  To: Roland Dreier; +Cc: tziporet, general, linux-kernel

Roland Dreier wrote:
> Send with invalidate should be OK.  Let's see about the masked atomics
> stuff -- we have a ton of new verbs and I think we might want to slow
> down and make sure it all makes sense.
>   
OK - will send and then we will see what will come out.

>  > What about the split CQ for UD mode? It's improved the IPoIB
>  > performance for small messages significantly.
>
> Oh yeah... I'll try to get that in too.
>   
thanks
>  > mlx4- we plan to send patches for the low level driver only to enable
>  > mlx4_en. These only affect our low level driver.
>
> No problem in principle, let's see the actual patches.
>   
Sure
>  > I think we should try to push for XRC in 2.6.26 since there are
>  > already MPI implementation that use it and this ties them to use OFED
>  > only.
>  > Also this feature is stable and now being defined in IBTA
>  > Not taking it causing changes between OFED and the kernel and your
>  > libibverbs and we wish to avoid such gaps.
>  > Is there any thing we can do to help and make it into 2.6.26?
>
> I don't have a good feeling that the user-kernel interface is well
> thought out, so I want to consider XRC + ehca LL stuff + new iWARP verbs
> and make sure we have something that makes sense for the future.
>
>   
I see - but can't we figure this all for the 2.6.26 window?

Tziporet



* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 12:31 ` Tziporet Koren
  2008-04-02 16:19   ` Roland Dreier
@ 2008-04-04  5:54   ` Or Gerlitz
  1 sibling, 0 replies; 17+ messages in thread
From: Or Gerlitz @ 2008-04-04  5:54 UTC (permalink / raw)
  To: Tziporet Koren; +Cc: Roland Dreier, linux-kernel, general, Dror Goldenberg

On Wed, Apr 2, 2008 at 3:31 PM, Tziporet Koren
<tziporet@dev.mellanox.co.il> wrote:

>  We want to add send with invalidate
>  Eli will be able to send the patches next week and since they are small I think they can be in for 2.6.26

Does send with invalidate apply to rkeys generated through the
proprietary FMR API?
If not, what usage do you envision for the new verb on today's IB devices?

Or.


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-04 20:26     ` Richard Frank
@ 2008-04-04 19:34       ` Roland Dreier
  2008-04-04 22:21         ` Richard Frank
  0 siblings, 1 reply; 17+ messages in thread
From: Roland Dreier @ 2008-04-04 19:34 UTC (permalink / raw)
  To: Richard Frank; +Cc: tziporet, linux-kernel, general

 > We are very interested in these new operations and are moving in the
 > direction of tightly integrating RDMA along with atomics (if
 > available) into Oracle.  We plan on testing some early prototypes of
 > these in the next few months.

And you need the ConnectX-only masked atomics?  Or do the standard IB
atomic operations work for you?  Of course using atomics at all means
that things don't work on iWARP.

 > Send with invalidate is an exact match for our current RDS V3 rdma
 > driver - and should be more efficient than the current background
 > syncing of the tpt  to ensure keys are invalidated.

How does send with invalidate interact with the current IB FMR stuff?
Seems that you would run into trouble keeping the state of the FMR
straight if the remote side is invalidating them.

Also I would think that send-with-invalidate would be much more
expensive than the current FMR method of batching up the invalidates,
since you don't get to amortize the cost of syncing up all the internal
HCA state.

 - R.
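
The amortization concern above can be put into a toy cost model. Every
number and name here is hypothetical, and the model bakes in the
assumption being questioned: that each send-with-invalidate pays the
full cost of syncing internal HCA state, while the batched FMR path pays
that cost once per batch. Whether real hardware behaves this way is
exactly what would have to be measured.

```python
import math


def batched_cost(n_keys, batch_size, sync_cost, per_key_cost):
    """Cost of invalidating n_keys when syncs are batched.

    One sync is paid per batch, plus a small per-key cost.
    All costs are in arbitrary units.
    """
    flushes = math.ceil(n_keys / batch_size)
    return flushes * sync_cost + n_keys * per_key_cost


def per_send_cost(n_keys, sync_cost, per_key_cost):
    """Cost when every invalidate pays the sync cost on its own."""
    return n_keys * (sync_cost + per_key_cost)


n = 1000
batched = batched_cost(n, batch_size=32, sync_cost=100, per_key_cost=1)
unbatched = per_send_cost(n, sync_cost=100, per_key_cost=1)
print(batched, unbatched)  # 4200 101000
```

If the per-invalidate sync cost turns out to be much smaller than a full
batched sync, the comparison changes, so the sketch only shows where the
trade-off would come from.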


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-02 16:19   ` Roland Dreier
  2008-04-03 11:40     ` Tziporet Koren
@ 2008-04-04 20:26     ` Richard Frank
  2008-04-04 19:34       ` Roland Dreier
  1 sibling, 1 reply; 17+ messages in thread
From: Richard Frank @ 2008-04-04 20:26 UTC (permalink / raw)
  To: Roland Dreier; +Cc: tziporet, linux-kernel, general

 > We want to add send with invalidate & mask compare and swap.
 > Eli will be able to send the patches next week and since they are
 > small I think they can be in for 2.6.26

We are very interested in these new operations and are moving in the
direction of tightly integrating RDMA along with atomics (if available)
into Oracle.  We plan on testing some early prototypes of these in the
next few months.

Send with invalidate is an exact match for our current RDS V3 rdma 
driver - and should be more efficient than the current background 
syncing of the tpt  to ensure keys are invalidated.

We intend to expose the atomics via the RDS driver, along with simple
low-level RDMA operations, to Oracle's internal clients. If Oracle is
running over a transport that exports atomics and RDMA, Oracle will
see a dramatic performance boost for several database operations.

Roland Dreier wrote:
>  > We want to add send with invalidate & mask compare and swap.
>  > Eli will be able to send the patches next week and since they are
>  > small I think they can be in for 2.6.26
>
> Send with invalidate should be OK.  Let's see about the masked atomics
> stuff -- we have a ton of new verbs and I think we might want to slow
> down and make sure it all makes sense.
>
>  > What about the split CQ for UD mode? It's improved the IPoIB
>  > performance for small messages significantly.
>
> Oh yeah... I'll try to get that in too.
>
>  > mlx4- we plan to send patches for the low level driver only to enable
>  > mlx4_en. These only affect our low level driver.
>
> No problem in principle, let's see the actual patches.
>
>  > I think we should try to push for XRC in 2.6.26 since there are
>  > already MPI implementation that use it and this ties them to use OFED
>  > only.
>  > Also this feature is stable and now being defined in IBTA
>  > Not taking it causing changes between OFED and the kernel and your
>  > libibverbs and we wish to avoid such gaps.
>  > Is there any thing we can do to help and make it into 2.6.26?
>
> I don't have a good feeling that the user-kernel interface is well
> thought out, so I want to consider XRC + ehca LL stuff + new iWARP verbs
> and make sure we have something that makes sense for the future.
>
>  - R.


* Re: [ofa-general] InfiniBand/iWARP/RDMA merge plans for 2.6.26 (what's in infiniband.git)
  2008-04-04 19:34       ` Roland Dreier
@ 2008-04-04 22:21         ` Richard Frank
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Frank @ 2008-04-04 22:21 UTC (permalink / raw)
  To: Roland Dreier; +Cc: tziporet, linux-kernel, general

Roland Dreier wrote:
>  > We are very interested in these new operations and are moving in the
>  > direction of tightly integrating RDMA along with atomics (if
>  > available) into Oracle.  We plan on testing some early prototypes of
>  > these in the next few months.
>
> And you need the ConnectX-only masked atomics?  Or do the standard IB
> atomic operations work for you?  Of course using atomics at all means
> that things don't work on iWARP.
>
>   
We specifically asked for the masked operations.

Yes, this means Oracle will not get the performance boost of atomics on
iWARP, but we still get RDMA, and that's a real win for Oracle today,
and more so over the next few months.

>  > Send with invalidate is an exact match for our current RDS V3 rdma
>  > driver - and should be more efficient than the current background
>  > syncing of the tpt  to ensure keys are invalidated.
>
> How does send with invalidate interact with the current IB FMR stuff?
> Seems that you would run into trouble keeping the state of the FMR
> straight if the remote side is invalidating them.
>
>   
The model we implement is based on "use once" keys: we issue the key to
the RDMA server and want to toss it as soon as the RDMA is complete.
Today, we explicitly free the key after the RDMA completes and we get a
message from the RDMA server saying the RDMA is complete. If the key is
auto-invalidated by the receiving HCA, then we do not need to do it in
the driver, which also means we do not need to issue the TPT syncs to
force the HCA to update its cache.

At least this is how I think it works - Olaf is the divine source here.

> Also I would think that send-with-invalidate would be much more
> expensive than the current FMR method of batching up the invalidates,
> since you don't get to amortize the cost of syncing up all the internal
> HCA state.
>
>   
This is the one piece we do not know. Our plan is to test this and
see where the trade-offs are. We will keep the current design and
implementation to run over NICs that do not support send-with-invalidate.
>  - R.
>   
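
The "use once" key model described in this message can be sketched as a
toy state machine. This is only a reading of the description above,
which its author also presents tentatively; the class and method names
are hypothetical and nothing here is the RDS driver or the verbs API.
It contrasts the current flow (the driver frees the key and issues a TPT
sync) with the send-with-invalidate flow (the receiving HCA drops the
key itself).

```python
class Hca:
    """Toy stand-in for the receiving HCA's table of valid keys."""

    def __init__(self):
        self.valid_keys = set()
        self.next_key = 1
        self.tpt_syncs = 0  # how many explicit syncs the driver issued

    def issue_key(self):
        """Register a fresh key to hand to the RDMA server."""
        key = self.next_key
        self.next_key += 1
        self.valid_keys.add(key)
        return key

    def driver_free(self, key):
        """Current flow: driver frees the key, then syncs the TPT."""
        self.valid_keys.discard(key)
        self.tpt_syncs += 1

    def recv_send_with_invalidate(self, key):
        """Assumed flow: the HCA invalidates the key on receipt."""
        self.valid_keys.discard(key)


def explicit_free_flow(hca):
    key = hca.issue_key()
    # The server performs the RDMA, then sends a plain "done" message.
    hca.driver_free(key)
    return key


def send_with_invalidate_flow(hca):
    key = hca.issue_key()
    # The server's "done" message carries the key to invalidate.
    hca.recv_send_with_invalidate(key)
    return key


hca = Hca()
explicit_free_flow(hca)
send_with_invalidate_flow(hca)
print(len(hca.valid_keys), hca.tpt_syncs)  # 0 1
```

Both flows leave no valid keys behind, but only the explicit flow costs
a driver-issued sync. The model says nothing about what the invalidate
costs inside the HCA, which is the part still to be tested.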
