From: "Michael S. Tsirkin" <mst@redhat.com>
To: Or Gerlitz <ogerlitz@mellanox.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
	davem@davemloft.net, roland@kernel.org, netdev@vger.kernel.org,
	ali@mellanox.com, sean.hefty@intel.com,
	Erez Shitrit <erezsh@mellanox.co.il>,
	Doug Ledford <dledford@redhat.com>
Subject: Re: [PATCH V2 09/12] net/eipoib: Add main driver functionality
Date: Sun, 12 Aug 2012 16:41:47 +0300
Message-ID: <20120812134147.GA6003@redhat.com>
In-Reply-To: <5027AB0F.7010102@mellanox.com>

On Sun, Aug 12, 2012 at 04:09:35PM +0300, Or Gerlitz wrote:
> On 12/08/2012 13:22, Michael S. Tsirkin wrote:
> >On Wed, Aug 08, 2012 at 08:23:15AM +0300, Or Gerlitz wrote:
> >>>On Sun, Aug 5, 2012 at 9:50 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >>>
> >>>[...]
> >>>> >So it seems that a sane solution would involve an extra level of
> >>>> >indirection, with guest addresses being translated to host IB addresses.
> >>>> >As long as you do this, maybe using an ethernet frame format makes sense.
> >>>[...]
> >>>
> >>>Yep, that's among the points we're trying to make, the way you've put
> >>>it makes it clearer.
> >>>
> >>>> >So far the things that make sense. Here are some that don't, to me:
> >>>
> >>>> >- Is a pdf presentation all you have in terms of documentation?
> >>>> >   We are talking communication protocols here - I would expect a
> >>>> >   proper spec, and some effort to standardize, otherwise where's the
> >>>> >   guarantee it won't change in an incompatible way?
> >>>
> >>>To be precise, the solution uses 100% IPoIB wire-protocol, so we don't
> >>>see a need for any spec change / standardization effort.
> >Yes, I am guessing this is the real reason you pack LID/QPN
> >in the MAC - to make it all local. But it's a hack really,
> >and if you start storing it all in the SM you will need
> >to document the format so others can inter-operate.
> 
> I'd like to review the way we generate these MAC addresses, maybe it
> can be done differently.
> 
> 
> >
> >
> >>>This might go to the 1st point you've
> >>>brought... improve the documentation, will do that. The pdf you looked
> >>>at was presented at a conference.
> >>
> >>>> >   Other things that I would expect to be addressed in such a spec is
> >>>> >   interaction with other IPoIB features, such as connected
> >>>> >   mode, checksum offloading etc, and IB features such as multipath etc.
> >>>
> >>>For the eipoib interface, it doesn't really matter whether the underlying
> >>>ipoib clones used by it (we call them VIFs) use connected or datagram
> >>>mode; what does matter is the MTU and offload features supported by
> >>>these VIFs, for which the eipoib interface will have the min among all
> >>>these VIFs. Since for a given eipoib nic, all its VIFs must originate
> >>>from the same IPoIB PIF (e.g. ib0), it's an easy admin job to make sure they
> >>>all have the same mtu / features which are needed for that eipoib nic,
> >>>e.g. by using the same mode (connected/datagram) for all of them. Hope
> >>>this is clear.
> >>>
> >Just pointing out all this needs to be documented.
> 
> OK, will do
> 
> >
> >
> >>>> >- The way you encode LID/QPN in the MAC seems questionable. IIRC there's
> >>>> >   more to IB addressing than just the LID.  Since everyone on the subnet
> >>>> >   needs access to this translation, I think it makes sense to store it in
> >>>> >   the SM. I think this would also obviate some IPv4 specific hacks in kernel.
> >>>
> >>>The idea behind the encoding was uniqueness, LID/QPN is unique per IB
> >>>HCA end-node.
> >But then it breaks with VM migration, IB failover, softmac setting in guest, probably more?
> 
> With the current design/code the remote mac of a VM changes when
> that VM migrates or IB LIDs are changed.

Which is exactly the problem with IB and VM migration.
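
To make the contrast concrete, here is a minimal sketch of the extra level
of indirection suggested earlier in this thread: a table that translates a
stable guest MAC to the current host IB address. The IBAddress fields, the
AddressMap class and the sample values are hypothetical and are not taken
from the eipoib patches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IBAddress:
    """Hypothetical host IB address: a LID plus a QPN."""
    lid: int
    qpn: int


class AddressMap:
    """Hypothetical translation table: guest MAC -> current host IB address."""

    def __init__(self):
        self._table = {}

    def register(self, guest_mac, addr):
        """Record where a guest MAC currently lives."""
        self._table[guest_mac] = addr

    def migrate(self, guest_mac, new_addr):
        """Update the mapping after a migration; the guest MAC is unchanged."""
        if guest_mac not in self._table:
            raise KeyError(guest_mac)
        self._table[guest_mac] = new_addr

    def resolve(self, guest_mac):
        """Return the host IB address currently serving this guest MAC."""
        return self._table[guest_mac]


amap = AddressMap()
guest_mac = "52:54:00:12:34:56"  # made-up guest MAC
amap.register(guest_mac, IBAddress(lid=0x0012, qpn=0x000ABC))
amap.migrate(guest_mac, IBAddress(lid=0x0034, qpn=0x000DEF))
```

With such a table only the mapping changes on migration; peers keep using
the same guest MAC.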

> As for softmac setting in the guest, we don't send the guest MAC
> on the wire anyway, since the Ethernet header is removed.
> 
> Or.

Yes, but you generate remote addresses automatically, so the admin cannot
change the local address without risking conflicts.
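
A rough sketch of the concern, assuming a made-up layout in which a 16-bit
LID and a 24-bit QPN are packed behind a locally-administered first octet.
The actual eipoib MAC format is not described in this message, so
mac_from_lid_qpn() and conflicts() are illustrative names only.

```python
LOCAL_ADMIN = 0x02  # first octet: locally administered, unicast


def mac_from_lid_qpn(lid, qpn):
    """Pack a 16-bit LID and a 24-bit QPN into a MAC (hypothetical layout)."""
    if not 0 <= lid < 1 << 16:
        raise ValueError("LID must fit in 16 bits")
    if not 0 <= qpn < 1 << 24:
        raise ValueError("QPN must fit in 24 bits")
    raw = bytes([LOCAL_ADMIN]) + lid.to_bytes(2, "big") + qpn.to_bytes(3, "big")
    return ":".join(f"{b:02x}" for b in raw)


def conflicts(admin_mac, remote_endpoints):
    """True if an admin-chosen MAC equals any auto-generated remote MAC."""
    generated = {mac_from_lid_qpn(lid, qpn) for lid, qpn in remote_endpoints}
    return admin_mac.lower() in generated


# Same QPN, different LID (e.g. after migration): the derived MAC changes.
before = mac_from_lid_qpn(0x0012, 0x000ABC)
after = mac_from_lid_qpn(0x0034, 0x000ABC)
```

Under this layout any address the admin picks has to be checked against
every address that could be derived from a remote LID/QPN pair.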

-- 
MST
