* [PATCH 0/7] More sane neigh infrastructure
@ 2011-07-25 10:01 David Miller
2011-07-25 16:34 ` Roland Dreier
0 siblings, 1 reply; 4+ messages in thread
From: David Miller @ 2011-07-25 10:01 UTC (permalink / raw)
To: roland; +Cc: linux-rdma, netdev
Roland, this is a first pass at the kind of thing I was
talking about with you last week.
ATM and Infiniband both need to do their own kind of
signaling, either in place of (ATM) or in addition to
(IPoIB) the generic ARP negotiation.
ATM wants to push everything to a userspace atmarpd daemon,
and override all of the usual ARP signalling. It replaces
the neigh_table used by ARP completely in order to accomplish
this.
IPoIB triggers its signalling by hooking in at transmit time, and
adding a neigh parms destruction hook to free up and release its
private per-neigh state.
I think both cases can be consolidated into one kind of scheme,
and these patches provide the infrastructure and convert ATM
over as an example.
Devices provide up to three things:
1) netdev->neighpriv_len, length of per-neighbour device private
state, accessible via neighbour_priv(neigh)
2) net_device_ops->ndo_neigh_construct(), invoked right after
neigh_tbl->constructor(), can fail
3) net_device_ops->ndo_neigh_destroy(), invoked right before
we release neigh->parms and kfree_rcu() the neigh object.
It could return errors but I'm not checking for them
currently and I can't think what we could possibly do
in response at this point in the code. Maybe this gets
changed to return "void" eventually.
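
A minimal userspace sketch of the lifecycle described in the three items
above; it is not the kernel code from the patches. Every mock_* and demo_*
name is hypothetical and exists only for illustration. It assumes the
private area is allocated directly behind the generic neighbour object,
that the device constructor runs after the table constructor and may fail,
and that the destroy hook's return value is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct mock_neigh;

/* Hypothetical stand-in for the three things a device provides. */
struct mock_dev {
    size_t neighpriv_len;                          /* 1) private state size  */
    int (*neigh_construct)(struct mock_neigh *n);  /* 2) may fail            */
    int (*neigh_destroy)(struct mock_neigh *n);    /* 3) result is ignored   */
};

/* Generic neighbour object with the device-private area right behind it. */
struct mock_neigh {
    struct mock_dev *dev;
    _Alignas(max_align_t) unsigned char priv[];
};

/* Mock counterpart of neighbour_priv(neigh). */
void *mock_neighbour_priv(struct mock_neigh *n)
{
    assert(n != NULL);
    return n->priv;
}

/* Stand-in for neigh_tbl->constructor(); always succeeds in this sketch. */
int mock_tbl_constructor(struct mock_neigh *n)
{
    (void)n;
    return 0;
}

struct mock_neigh *mock_neigh_create(struct mock_dev *dev)
{
    struct mock_neigh *n = calloc(1, sizeof(*n) + dev->neighpriv_len);

    if (!n)
        return NULL;
    n->dev = dev;

    /* Table constructor first, device constructor right after it. */
    if (mock_tbl_constructor(n) != 0)
        goto fail;
    if (dev->neigh_construct && dev->neigh_construct(n) != 0)
        goto fail;
    return n;

fail:
    free(n);
    return NULL;
}

void mock_neigh_release(struct mock_neigh *n)
{
    /* Device hook runs right before the object is freed. */
    if (n->dev->neigh_destroy)
        (void)n->dev->neigh_destroy(n);
    free(n);
}

/* Example device: keeps one flag in its per-neighbour private state. */
struct demo_priv {
    int link_up;
};

int demo_destroy_calls;

int demo_construct(struct mock_neigh *n)
{
    struct demo_priv *p = mock_neighbour_priv(n);

    p->link_up = 1;
    return 0;
}

int demo_construct_fail(struct mock_neigh *n)
{
    (void)n;
    return -1;
}

int demo_destroy(struct mock_neigh *n)
{
    struct demo_priv *p = mock_neighbour_priv(n);

    p->link_up = 0;
    demo_destroy_calls++;
    return 0;
}
```

In this sketch a failing device constructor frees the object without
calling the destroy hook; whether the real code behaves that way is not
stated above.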
As a result ATM CLIP no longer overrides the IPv4 ARP table, and
I'm convinced IPoIB could behave similarly, override the
neigh_ops in a device neigh constructor, and avoid all of the
hooks at transmit time and instead trigger the key signalling
at neigh->output and friends.
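
To illustrate the idea of swapping the ops in a device neigh constructor,
here is a small userspace sketch. All toy_* names are hypothetical and do
not correspond to kernel symbols. It assumes the device wants to trigger
its own signalling once, on the first packet that goes through the
output hook, and then fall through to the generic output path.

```c
#include <assert.h>
#include <stddef.h>

struct toy_neigh;

/* Hypothetical, reduced stand-in for a neigh_ops table. */
struct toy_neigh_ops {
    int (*output)(struct toy_neigh *n, const char *pkt);
};

struct toy_neigh {
    const struct toy_neigh_ops *ops;
    int signalled;        /* has device signalling been triggered?   */
    int signal_requests;  /* how many signalling requests were sent  */
    int packets_sent;     /* packets that reached the generic path   */
};

/* Generic output path shared by all devices in this sketch. */
int toy_generic_output(struct toy_neigh *n, const char *pkt)
{
    assert(n != NULL && pkt != NULL);
    n->packets_sent++;
    return 0;
}

const struct toy_neigh_ops toy_generic_ops = { toy_generic_output };

/* Device output: trigger signalling once, then use the generic path. */
int toy_dev_output(struct toy_neigh *n, const char *pkt)
{
    if (!n->signalled) {
        n->signal_requests++;
        n->signalled = 1;
    }
    return toy_generic_output(n, pkt);
}

const struct toy_neigh_ops toy_dev_ops = { toy_dev_output };

/* Device neigh constructor: override the ops instead of hooking transmit. */
int toy_dev_neigh_construct(struct toy_neigh *n)
{
    n->ops = &toy_dev_ops;
    return 0;
}

/* The transmit path only ever calls through the ops table. */
int toy_xmit(struct toy_neigh *n, const char *pkt)
{
    return n->ops->output(n, pkt);
}
```

The transmit function stays free of device-specific checks; the only
device-specific step is the ops swap in the constructor.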
If IPoIB can get converted to this new stuff, then we can get
rid of the ->ndo_neigh_setup() netdev op which only exists to
facilitate IPoIB hooking in a destructor for its neigh state.
* Re: [PATCH 0/7] More sane neigh infrastructure
2011-07-25 10:01 [PATCH 0/7] More sane neigh infrastructure David Miller
@ 2011-07-25 16:34 ` Roland Dreier
2011-07-25 21:10 ` David Miller
0 siblings, 1 reply; 4+ messages in thread
From: Roland Dreier @ 2011-07-25 16:34 UTC (permalink / raw)
To: David Miller; +Cc: linux-rdma, netdev
On Mon, Jul 25, 2011 at 3:01 AM, David Miller <davem@davemloft.net> wrote:
> Devices provide up to three things:
>
> 1) netdev->neighpriv_len, length of per-neighbour device private
> state, accessible via neighbour_priv(neigh)
>
> 2) net_device_ops->ndo_neigh_construct(), invoked right after
> neigh_tbl->constructor(), can fail
Hey Dave,
I'll definitely look at converting IPoIB over to using this stuff.
Would love to get rid of all the dicey handling of ipoib_neigh lifetime
that we currently have. However, I have a question about what the
intention for ndo_neigh_construct() is in the IPoIB case.
As we talked about, IPoIB has to trigger a path lookup to the subnet
manager (SM) when it gets a remote port ID. However, the SM is a
remote entity, so this lookup means we send a message and then
asynchronously wait for it to complete (or possibly time out), just
like the ARP itself. But this is done after we get the port ID via
normal RFC 826 ARP (with an address format as specified by RFC 4391).
So I don't think we can use custom neigh_ops with a new solicit method
the way clip does -- we actually want to let the normal stack do ARP
or ND, but then extend the process by another message/response step.
I'm sure this is possible within your scheme but I'm not sure I
understand what the "right" way is.
Thanks!
Roland
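
The two-step resolution described in the message above can be sketched as
a small state machine. This is only an illustration under stated
assumptions: all sim_* names are hypothetical, the ARP/ND reply is assumed
to deliver the remote port ID, and the path lookup to the SM is modelled
as a second asynchronous step that either completes or times out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states: ARP first, then the extra SM path lookup. */
enum sim_state {
    SIM_INCOMPLETE,    /* waiting for the normal ARP/ND reply    */
    SIM_PATH_PENDING,  /* port ID known, path query outstanding  */
    SIM_READY,         /* both steps done, neighbour is usable   */
    SIM_FAILED         /* either step timed out                  */
};

enum sim_event {
    SIM_EV_ARP_REPLY,
    SIM_EV_PATH_REPLY,
    SIM_EV_TIMEOUT
};

struct sim_neigh {
    enum sim_state state;
    int path_queries_sent;
};

void sim_neigh_init(struct sim_neigh *n)
{
    assert(n != NULL);
    n->state = SIM_INCOMPLETE;
    n->path_queries_sent = 0;
}

void sim_neigh_event(struct sim_neigh *n, enum sim_event ev)
{
    assert(n != NULL);

    switch (n->state) {
    case SIM_INCOMPLETE:
        if (ev == SIM_EV_ARP_REPLY) {
            /* ARP finished: now send the path query to the SM. */
            n->path_queries_sent++;
            n->state = SIM_PATH_PENDING;
        } else if (ev == SIM_EV_TIMEOUT) {
            n->state = SIM_FAILED;
        }
        break;
    case SIM_PATH_PENDING:
        if (ev == SIM_EV_PATH_REPLY)
            n->state = SIM_READY;
        else if (ev == SIM_EV_TIMEOUT)
            n->state = SIM_FAILED;
        break;
    case SIM_READY:
    case SIM_FAILED:
        /* Terminal states in this sketch; events are ignored. */
        break;
    }
}

/* Packets may only go out once both steps have completed. */
int sim_neigh_can_transmit(const struct sim_neigh *n)
{
    return n->state == SIM_READY;
}
```

Where such an extra state would live in the real neighbour code is exactly
the open question in the message; the sketch takes no position on that.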
Thread overview: 4+ messages
2011-07-25 10:01 [PATCH 0/7] More sane neigh infrastructure David Miller
2011-07-25 16:34 ` Roland Dreier
2011-07-25 21:10 ` David Miller
2011-07-25 22:49 ` Roland Dreier