[PATCH WIP RFC 0/3] mpls: support for ler

* [PATCH WIP RFC 0/3] mpls: support for ler
@ 2015-06-03 14:21 Roopa Prabhu
  2015-06-05  9:14 ` Thomas Graf
  0 siblings, 1 reply; 11+ messages in thread
From: Roopa Prabhu @ 2015-06-03 14:21 UTC
  To: ebiederm, rshearma, tgraf; +Cc: netdev

From: Roopa Prabhu <roopa@cumulusnetworks.com>

This is still WIP and incomplete.
Posting it here because of the other discussions
happening around mpls ler in the context of Robert's
code and I happened to mention this implementation.

This was in response to an earlier email thread with Eric on
net-next about possibly using an xfrm-style stacked destination
approach.

I introduce a new set of tunnel ops for light weight
tunnels (lwt), but this could be merged with the
other ip_tunnels code if possible.

I had this code for the 3.2 kernel initially, and
as I was pulling out code, I realized I had to separate
out some other mpls code that I have been working on,
and quite likely this will not even compile. Sorry about
that.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>

Roopa Prabhu (3):
  lwtunnels: basic infra for light weight tunnels like mpls
  ipv4 fib: lwtunnel handling
  mpls: register lwtunnel ops

 include/linux/if_lwtunnel.h      |    8 ++
 include/net/dst.h                |    2 +
 include/net/ip_fib.h             |    5 +-
 include/net/lwtunnel.h           |   61 +++++++++++++
 include/uapi/linux/if_lwtunnel.h |   12 +++
 include/uapi/linux/rtnetlink.h   |    8 +-
 net/Makefile                     |    2 +-
 net/ipv4/fib_frontend.c          |    6 ++
 net/ipv4/fib_semantics.c         |   34 +++++++-
 net/ipv4/route.c                 |    5 ++
 net/lwtunnel.c                   |  177 ++++++++++++++++++++++++++++++++++++++
 net/mpls/af_mpls.c               |  143 ++++++++++++++++++++++++++++++
 net/mpls/internal.h              |    5 ++
 13 files changed, 464 insertions(+), 4 deletions(-)
 create mode 100644 include/linux/if_lwtunnel.h
 create mode 100644 include/net/lwtunnel.h
 create mode 100644 include/uapi/linux/if_lwtunnel.h
 create mode 100644 net/lwtunnel.c

-- 
1.7.10.4

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-03 14:21 [PATCH WIP RFC 0/3] mpls: support for ler Roopa Prabhu
@ 2015-06-05  9:14 ` Thomas Graf
  2015-06-05 14:16   ` roopa
  2015-06-05 14:31   ` Robert Shearman
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Graf @ 2015-06-05  9:14 UTC
  To: Roopa Prabhu; +Cc: ebiederm, rshearma, netdev

On 06/03/15 at 07:21am, Roopa Prabhu wrote:
> From: Roopa Prabhu <roopa@cumulusnetworks.com>
> 
> This is still WIP and incomplete.
> Posting it here because of the other discussions
> happening around mpls ler in the context of Roberts
> code and I happened to mention this implementation.
> 
> This was in response to earlier email thread with Eric on
> net-next of possibly using xfrm style stacked destination
> approach.
> 
> I introduce a new set of tunnel ops for light weight
> tunnels (lwt), but this could be merged with the
> other ip_tunnels code if possible.
> 
> I had this code for 3.2 kernel initially, and 
> as I was pulling out code, I realize i had to separate
> out some other mpls code that i have been working on
> and quite likely this will not even compile. Sorry abt
> that.
> 
> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>

Thanks for posting these patches, Roopa!

I see that some of the edges are still a bit rough. In particular
the lack of sanity checking around type before indexing the array
with it ;-) No question that this would make a great optimization
on top of existing IP tunnels though! I think this is where Eric
was heading and given this implementation, I'm perfectly fine
with it as it does not *require* precomputing the headers for all
encap types.
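
The missing check could look roughly like the sketch below. It is an illustration only: demo_encap_type, demo_ops, and demo_get_ops are hypothetical names and are not taken from the posted patches. The point is simply that a type coming from userspace is range-checked before it is used as an array index.

```c
#include <stddef.h>

/* Hypothetical encap type ids (not the values from the patch set). */
enum demo_encap_type {
    DEMO_ENCAP_NONE = 0,
    DEMO_ENCAP_MPLS,
    DEMO_ENCAP_MAX, /* one past the last valid type */
};

/* Hypothetical per-type operations entry. */
struct demo_ops {
    const char *name;
};

static const struct demo_ops demo_mpls_ops = { .name = "mpls" };

/* Indexed by encap type; slots without a registered handler stay NULL. */
static const struct demo_ops *demo_ops_table[DEMO_ENCAP_MAX] = {
    [DEMO_ENCAP_MPLS] = &demo_mpls_ops,
};

/*
 * Return the ops for a type supplied by userspace, or NULL when the
 * type is out of range or has no registered handler.
 */
static const struct demo_ops *demo_get_ops(unsigned int type)
{
    if (type == DEMO_ENCAP_NONE || type >= DEMO_ENCAP_MAX)
        return NULL;
    return demo_ops_table[type];
}
```

A caller would treat a NULL return as an invalid or unsupported encap type and reject the request.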

This can be made compatible with the patches I have posted as well.
A simple flag in what you call rtencap could indicate whether to
perform the encap in the dst->output or merely attach the metadata
and forward it to RTA_OIF for postponed encapsulation.

That way, if desired by the user, the net_device can be omitted
which would suit Eric's architecture while we still also support
the traditional net_device model which provides stats and a shared
set of encapsulation parameters. It will also allow for bridges to
perform the encapsulation decision if needed and we can still get
rid of the OVS encapsulation special handling.

As I mentioned to Robert, the new RTA_ENCAP should be a list of
Netlink attributes from the beginning to make it extendible without
ever breaking user ABI.

The most overlap seems to be with Robert's series. The direction
seems to be very similar. How do you want to proceed? Work on a
series together? I'm happy to rebase my series on top of both your
and Robert's work and make use of a new generic per nexthop
encapsulation API. Let me know how you guys want to proceed.

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-05  9:14 ` Thomas Graf
@ 2015-06-05 14:16   ` roopa
  2015-06-05 15:26     ` Robert Shearman
  2015-06-05 14:31   ` Robert Shearman
  1 sibling, 1 reply; 11+ messages in thread
From: roopa @ 2015-06-05 14:16 UTC
  To: Thomas Graf; +Cc: ebiederm, rshearma, netdev

On 6/5/15, 2:14 AM, Thomas Graf wrote:
> On 06/03/15 at 07:21am, Roopa Prabhu wrote:
>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>
>> This is still WIP and incomplete.
>> Posting it here because of the other discussions
>> happening around mpls ler in the context of Roberts
>> code and I happened to mention this implementation.
>>
>> This was in response to earlier email thread with Eric on
>> net-next of possibly using xfrm style stacked destination
>> approach.
>>
>> I introduce a new set of tunnel ops for light weight
>> tunnels (lwt), but this could be merged with the
>> other ip_tunnels code if possible.
>>
>> I had this code for 3.2 kernel initially, and
>> as I was pulling out code, I realize i had to separate
>> out some other mpls code that i have been working on
>> and quite likely this will not even compile. Sorry abt
>> that.
>>
>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
> Thanks for posting these patches Roopa!
>
> I see that some of the edges are still a bit rough. In particular
> the lack of sanity checking around type before indexing the array
> with it ;-)
Oh..., sorry you had to see that :)
(In my defense, ...I did successfully get some packets into the mpls
tunnel with this though! :) )
> No question that this would make a great optimization
> on top of existing IP tunnels though! I think this is where Eric
> was heading to and given this implementation, I'm perfectly fine
> with it as it does not *require* to precompute the headers for all
> encap types.
>
> This can be made compatible with the patches I have posted as well.
> A simple flag in what you call rtencap could indicate whether to
> perform the encap in the dst->output or merely attach the metadata
> and forward it to RTA_OIF for postponed encapsulation.
>
> That way, if desirable by the user, the net_device can be omitted
> which would suit Eric's architecture while we still also support
> the traditional net_device model which provides stats and a shared
> set of encapsulation parameters. It will also allow for bridges to
> perform the encapsulation decision if needed and we can still get
> rid of the OVS encapsulation special handling.
yeah, that's a great idea.
>
> As I mentioned to Robert, the new RTA_ENCAP should be a list of
> Netlink attributes from the beginning to make it extendible without
> ever breaking user ABI.
agreed.
>
> The most overlap seems to be with Robert's series. The direction
> seems to be very similar. How do you want to proceed? Work on a
> series together? I'm happy to rebase my series on top of both you
> and Robert's work and make use of a new generic per nexthop
> encapsulation API. Let me know how you guys want to proceed.
Robert, please let me know if you have a preference on how you want to
proceed. One option is for me to use your git tree as a way to get my
patches in. But if we agree that we don't want to introduce a tunnel
netdevice for mpls yet (which is our vote as well), then it's probably
better for me to rebase my changes on top of your series and re-submit
(with proper attribution, of course).
(Happy to take Eric's feedback as well here.)

Right now I am working on refining my patches and covering ipv6.
I would be happy to make RTA_ENCAP nested...unless you would prefer to
take that over.
I have also been trying to see if I can reuse any infra from the
existing ip_tunnel world.

Thanks for the feedback, Thomas!

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-05  9:14 ` Thomas Graf
  2015-06-05 14:16   ` roopa
@ 2015-06-05 14:31   ` Robert Shearman
  1 sibling, 0 replies; 11+ messages in thread
From: Robert Shearman @ 2015-06-05 14:31 UTC
  To: Thomas Graf, Roopa Prabhu; +Cc: ebiederm, netdev

On 05/06/15 10:14, Thomas Graf wrote:
> As I mentioned to Robert, the new RTA_ENCAP should be a list of
> Netlink attributes from the beginning to make it extendible without
> ever breaking user ABI.

Just to be clear, in both of our approaches the contents of the
RTA_ENCAP data are interpreted by the encap owner. Therefore, if the mpls
encap doesn't consist of nested attributes, that doesn't preclude
vxlan, for example, from consisting of nested attributes.

I do agree though that the netlink format for specifying mpls encap 
should support nested attributes from day 1 to allow it to be extended 
without breaking the ABI.
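
To illustrate why a nested attribute list can be extended without breaking the ABI, here is a small userspace sketch. The header mirrors the length/type layout and 4-byte alignment of the kernel's struct rtattr, but the demo_* helpers and the DEMO_MPLS_* attribute ids are assumptions made up for this example, not the format proposed in either patch series. The parser picks out the attribute it knows and steps over anything else.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same shape as struct rtattr: len counts the header plus the payload. */
struct demo_attr {
    uint16_t len;
    uint16_t type;
};

#define DEMO_ALIGN(n) (((size_t)(n) + 3u) & ~(size_t)3u)
#define DEMO_HDRLEN   DEMO_ALIGN(sizeof(struct demo_attr))

/* Hypothetical attribute ids for an MPLS encap. */
enum {
    DEMO_MPLS_UNSPEC,
    DEMO_MPLS_DST, /* u32 label value */
    DEMO_MPLS_TTL, /* stands in for an attribute added later */
};

/* Append one attribute at off; return the new offset, or 0 if it does not fit. */
static size_t demo_attr_put(uint8_t *buf, size_t cap, size_t off,
                            uint16_t type, const void *data, size_t dlen)
{
    struct demo_attr hdr;
    size_t total;

    if (dlen > UINT16_MAX - DEMO_HDRLEN || off > cap)
        return 0;
    total = DEMO_HDRLEN + dlen;
    if (DEMO_ALIGN(total) > cap - off)
        return 0;
    hdr.len = (uint16_t)total;
    hdr.type = type;
    memset(buf + off, 0, DEMO_ALIGN(total)); /* zero the padding too */
    memcpy(buf + off, &hdr, sizeof(hdr));
    memcpy(buf + off + DEMO_HDRLEN, data, dlen);
    return off + DEMO_ALIGN(total);
}

/*
 * Walk the nested payload and extract the label from DEMO_MPLS_DST.
 * Unknown attribute types are skipped, which is what lets newer senders
 * add attributes without confusing this parser.
 * Returns 0 on success, -1 if the label is absent or the data is malformed.
 */
static int demo_parse_mpls(const uint8_t *buf, size_t len, uint32_t *label)
{
    size_t off = 0;
    int ret = -1;

    while (off <= len && len - off >= sizeof(struct demo_attr)) {
        struct demo_attr hdr;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.len < DEMO_HDRLEN || hdr.len > len - off)
            return -1;
        if (hdr.type == DEMO_MPLS_DST) {
            if (hdr.len - DEMO_HDRLEN != sizeof(*label))
                return -1;
            memcpy(label, buf + off + DEMO_HDRLEN, sizeof(*label));
            ret = 0;
        }
        off += DEMO_ALIGN(hdr.len);
    }
    return ret;
}
```

In the kernel the equivalent walk would normally be done with the existing netlink attribute helpers rather than open-coded like this.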

Thanks,
Rob

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-05 14:16   ` roopa
@ 2015-06-05 15:26     ` Robert Shearman
  2015-06-06  2:54       ` roopa
  0 siblings, 1 reply; 11+ messages in thread
From: Robert Shearman @ 2015-06-05 15:26 UTC
  To: roopa, Thomas Graf; +Cc: ebiederm, netdev

On 05/06/15 15:16, roopa wrote:
> On 6/5/15, 2:14 AM, Thomas Graf wrote:
>> On 06/03/15 at 07:21am, Roopa Prabhu wrote:
>>> From: Roopa Prabhu <roopa@cumulusnetworks.com>
>>>
>>> This is still WIP and incomplete.
>>> Posting it here because of the other discussions
>>> happening around mpls ler in the context of Roberts
>>> code and I happened to mention this implementation.
>>>
>>> This was in response to earlier email thread with Eric on
>>> net-next of possibly using xfrm style stacked destination
>>> approach.
>>>
>>> I introduce a new set of tunnel ops for light weight
>>> tunnels (lwt), but this could be merged with the
>>> other ip_tunnels code if possible.
>>>
>>> I had this code for 3.2 kernel initially, and
>>> as I was pulling out code, I realize i had to separate
>>> out some other mpls code that i have been working on
>>> and quite likely this will not even compile. Sorry abt
>>> that.
>>>
>>> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
>> Thanks for posting these patches Roopa!

Ditto, thanks Roopa!

>>
>> I see that some of the edges are still a bit rough. In particular
>> the lack of sanity checking around type before indexing the array
>> with it ;-)
> Oh..., sorry you had to see that :)
> (In my defense, ...i did successfully get some packets into the mpls
> tunnel with this though! :) )
>> No question that this would make a great optimization
>> on top of existing IP tunnels though! I think this is where Eric
>> was heading to and given this implementation, I'm perfectly fine
>> with it as it does not *require* to precompute the headers for all
>> encap types.
>>
>> This can be made compatible with the patches I have posted as well.
>> A simple flag in what you call rtencap could indicate whether to
>> perform the encap in the dst->output or merely attach the metadata
>> and forward it to RTA_OIF for postponed encapsulation.
>>
>> That way, if desirable by the user, the net_device can be omitted
>> which would suit Eric's architecture while we still also support
>> the traditional net_device model which provides stats and a shared
>> set of encapsulation parameters. It will also allow for bridges to
>> perform the encapsulation decision if needed and we can still get
>> rid of the OVS encapsulation special handling.
> yeah, that's a great idea.
>>
>> As I mentioned to Robert, the new RTA_ENCAP should be a list of
>> Netlink attributes from the beginning to make it extendible without
>> ever breaking user ABI.
> agreed.
>>
>> The most overlap seems to be with Robert's series. The direction
>> seems to be very similar. How do you want to proceed? Work on a
>> series together? I'm happy to rebase my series on top of both you
>> and Robert's work and make use of a new generic per nexthop
>> encapsulation API. Let me know how you guys want to proceed.
> Robert, pls let me know if you have a preference on how you want to
> proceed. One
> option is for me to use your git tree as a way to get my patches in.
> But, If we agree that we don't want to introduce a tunnel netdevice for
> mpls yet (which is our vote as well),
> then its probably better for me to rebase my changes on top of your
> series and
> re-submit (with proper attribution ofcourse).

It isn't clear to me what the strategy here is for dealing with tunnel 
encaps that aren't bound to an interface.

Thomas, I presume you would prefer not to force the user to keep track 
of changes to the output interface and nexthop corresponding to the 
destination of the outer IP header? And I presume that Eric is opposed 
to the option of using a virtual interface here, i.e. falling back to 
the approach I proposed?

In which case, what will the nexthop output interface be set to? 
Logically, it should have no interface. At the moment, the code assumes 
that a nexthop will have a valid interface and I don't have a feel for 
what the impact would be of changing that.

However, with that resolved I'd be happy to work on a series together.
The remaining issue is whether to optimise for small encaps that reside
in the same memory block as the fib_info, which aren't refcounted but
instead are copied around, or larger encaps that reside in their own
memory block, are refcounted, and have only a pointer passed around. If
the latter, then there really isn't much left in my patch series that
can be reused, other than references to the places in the code that need
to be changed to support multipath and to make fib_info matching work
correctly.

> (Happy to take erics feedback as well here).
>
> Right now I am working on refining my patches and covering ipv6.
> I would be happy to make RTA_ENCAP nested...unless you would prefer to
> take that over.
> I have also been trying to see If i can reuse any infra from the
> existing ip_tunnel world.

Thanks,
Rob

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-05 15:26     ` Robert Shearman
@ 2015-06-06  2:54       ` roopa
  2015-06-08 12:33         ` Thomas Graf
  0 siblings, 1 reply; 11+ messages in thread
From: roopa @ 2015-06-06  2:54 UTC
  To: Robert Shearman; +Cc: Thomas Graf, ebiederm, netdev, Vivek Venkatraman

On 6/5/15, 8:26 AM, Robert Shearman wrote:
>
> It isn't clear to me what the strategy here is for dealing with tunnel 
> encaps that aren't bound to an interface.
>
> Thomas, I presume you would prefer not to force the user to keep track 
> of changes to the output interface and nexthop corresponding to the 
> destination of the outer IP header? And I presume that Eric is opposed 
> to the option of using a virtual interface here, i.e. falling back to 
> the approach I proposed?
>
> In which case, what will the nexthop output interface be set to? 
> Logically, it should have no interface. At the moment, the code 
> assumes that a nexthop will have a valid interface and I don't have a 
> feel for what the impact would be of changing that.

The nexthop interface is the final output interface. Any reason it
should not be?
>
> However, with that resolved I'd be happy to work on a series together. 
> The remaining issue is whether to optimise for small encap that reside 
> in the same memory block as the fib_info, which aren't refcounted but 
> instead are copied around, or larger encaps that reside in their own 
> memory block that are refcounted and only a pointer passed around.
I would prefer the latter (as shown in my incomplete patch) simply 
because it stays separate from fib_info and allows for extending it in 
the future.
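
As a rough illustration of the refcounted variant, the userspace sketch below keeps the encap state in its own allocation and lets nexthops share it by pointer. All demo_* names are hypothetical, and a plain counter is used for brevity; kernel code would need an atomic reference count and proper freeing rules.

```c
#include <stdlib.h>
#include <string.h>

/* Encap state living in its own memory block, separate from the route. */
struct demo_tun_state {
    unsigned int refcnt;
    unsigned int type;    /* encap type, e.g. mpls */
    size_t len;           /* length of data[] */
    unsigned char data[]; /* encap-specific payload */
};

/* A nexthop only carries a pointer to the shared state. */
struct demo_nexthop {
    int oif;
    struct demo_tun_state *tun;
};

/* Allocate state holding a copy of the payload; starts with one reference. */
static struct demo_tun_state *demo_tun_alloc(unsigned int type,
                                             const void *data, size_t len)
{
    struct demo_tun_state *ts = malloc(sizeof(*ts) + len);

    if (!ts)
        return NULL;
    ts->refcnt = 1;
    ts->type = type;
    ts->len = len;
    if (len)
        memcpy(ts->data, data, len);
    return ts;
}

/* Take an extra reference for a new user of the state. */
static struct demo_tun_state *demo_tun_get(struct demo_tun_state *ts)
{
    ts->refcnt++;
    return ts;
}

/* Drop a reference; returns 1 if this was the last one and the state was freed. */
static int demo_tun_put(struct demo_tun_state *ts)
{
    if (--ts->refcnt)
        return 0;
    free(ts);
    return 1;
}
```

Because the state is not embedded in the route structure, its size can grow with new encap types without affecting routes that carry no encap at all.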

> If the latter, then there really isn't much left in my patch series 
> that can be reused, other than references to the places in the code 
> that need to be changed to support multipath and to make fib_info 
> matching work correctly.

Thanks,
Roopa

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-06  2:54       ` roopa
@ 2015-06-08 12:33         ` Thomas Graf
  2015-06-08 15:17           ` roopa
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Graf @ 2015-06-08 12:33 UTC
  To: roopa; +Cc: Robert Shearman, ebiederm, netdev, Vivek Venkatraman

On 06/05/15 at 07:54pm, roopa wrote:
> On 6/5/15, 8:26 AM, Robert Shearman wrote:
> >
> >It isn't clear to me what the strategy here is for dealing with tunnel
> >encaps that aren't bound to an interface.
> >
> >Thomas, I presume you would prefer not to force the user to keep track of
> >changes to the output interface and nexthop corresponding to the
> >destination of the outer IP header? And I presume that Eric is opposed to
> >the option of using a virtual interface here, i.e. falling back to the
> >approach I proposed?
> >
> >In which case, what will the nexthop output interface be set to?
> >Logically, it should have no interface. At the moment, the code assumes
> >that a nexthop will have a valid interface and I don't have a feel for
> >what the impact would be of changing that.
> 
> The nexthop interface is the final output interface. Any reason it should
> not be ?

Yes, the information used to determine the encapsulation and the
route used to select the outgoing interface might be coming from
different components. A simple and typical example is if you are
running quagga for your underlay, which determines which interface
to use for which tunnel endpoints. On top of that, somebody is
maintaining your virtual networks who is only aware of the tunnel
endpoint IP addresses but does not want to manage how to actually
reach them. So you would have:

ip route add 10.1.1.0/24 via tunnel 20.1.1.1 id 100 [dev vxlan0]
ip route add 20.1.1.1/24 dev eth0 

I've put "dev vxlan0" in brackets for now to indicate that it is
optional. I'm also using VXLAN as an example as I think it's
easier to understand this separation of concerns here. The point
is, whoever is adding the route with the encap information may
not know what interface to use to reach 20.1.1.1 and we may want
to rely on existing routes.

I think we want to support three models:

1. nexthop has encap and outgoing interface
   ip route add 10.1.1.0/24 via tunnel 20.1.1.1 dev eth0
   ip route add 20.1.1.1/24 dev eth0 

2. nexthop has endpoint but no dev
   ip route add 10.1.1.0/24 via tunnel 20.1.1.1
   ip route add 20.1.1.1/24 dev eth0 

   This would indicate to the routing subsystem to perform a
   fib lookup on 20.1.1.1 to determine the outgoing interface.

3. virtual tunnel interface to share configuration among routes
   ip route add 10.1.1.0/24 via tunnel 20.1.1.1 dev vxlan0
   ip route add 20.1.1.1/24 dev eth0

I think all of them are intuitive and easy to implement. This will
also allow incorporating the bridge model.
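
The distinction between the three models comes down to two questions about the nexthop: was a device given, and is that device a tunnel device? A minimal sketch of that decision follows; the demo_* names are invented for this illustration and do not correspond to any existing kernel code.

```c
#include <stdbool.h>

/* Where the encapsulation decision ends up for each of the three models. */
enum demo_encap_model {
    DEMO_MODEL_ENCAP_AND_OIF,   /* 1: encap plus a plain outgoing interface */
    DEMO_MODEL_LOOKUP_ENDPOINT, /* 2: no dev, fib lookup on the tunnel endpoint */
    DEMO_MODEL_TUNNEL_DEV,      /* 3: virtual tunnel interface */
};

/* What the route add request told us about the nexthop. */
struct demo_nh_request {
    bool has_dev;       /* a "dev" was given */
    bool dev_is_tunnel; /* only meaningful when has_dev is true */
};

static enum demo_encap_model demo_classify(const struct demo_nh_request *req)
{
    if (!req->has_dev)
        return DEMO_MODEL_LOOKUP_ENDPOINT;
    if (req->dev_is_tunnel)
        return DEMO_MODEL_TUNNEL_DEV;
    return DEMO_MODEL_ENCAP_AND_OIF;
}
```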

> >However, with that resolved I'd be happy to work on a series together. The
> >remaining issue is whether to optimise for small encap that reside in the
> >same memory block as the fib_info, which aren't refcounted but instead are
> >copied around, or larger encaps that reside in their own memory block that
> >are refcounted and only a pointer passed around.
> I would prefer the latter (as shown in my incomplete patch) simply because
> it stays separate from fib_info and allows for extending it in the future.

I'm with Roopa on this one, simply because it allows keeping the RX
and TX paths more symmetric and it allows non-FIB users as well.

> >If the latter, then there really isn't much left in my patch series that
> >can be reused, other than references to the places in the code that need
> >to be changed to support multipath and to make fib_info matching work
> >correctly.

Your nexthop implementation seemed more correct based on the chunks
I went through. Can we combine the two series and make the RTA_OIF
in the nexthop optional if an RTA_ENCAP was provided and provide a
route lookup instead?

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-08 12:33         ` Thomas Graf
@ 2015-06-08 15:17           ` roopa
  2015-06-08 22:58             ` Thomas Graf
  0 siblings, 1 reply; 11+ messages in thread
From: roopa @ 2015-06-08 15:17 UTC
  To: Thomas Graf; +Cc: Robert Shearman, ebiederm, netdev, Vivek Venkatraman

On 6/8/15, 5:33 AM, Thomas Graf wrote:
> Yes, the information used to determine the encapsulation and the
> route used to select the outgoing interface might be coming from
> different components. A simple and typical example is if you are
> running quagga to for your underlay which determines which interface
> to use for which tunnel endpoints. On top of that, somebody is
> maintaining your virtual networks which is only aware of the tunnel
> endpoint IP addresses but does not want to manage how to actually
> reach them. So you would have:
>
> ip route add 10.1.1.0/24 via tunnel 20.1.1.1 id 100 [dev vxlan0]
> ip route add 20.1.1.1/24 dev eth0
>
> I've put "dev vxlan0" in brackets for now to indicate that it is
> optional. I'm also using VXLAN as an examples as I think it's
> easier to understand this separation of concern here. The point
> is, whoever is adding the route with the encap information may
> not know what interface to use to reach 20.1.1.1 and we may want
> to rely on existing routes.
>
> I think we want to support three models:
>
> 1. nexthop has encap and outgoing interface
>     ip route add 10.1.1.0/24 via tunnel 20.1.1.1 dev eth0
>     ip route add 20.1.1.1/24 dev eth0
>
> 2. nexthop has endpoint but no dev
>     ip route add 10.1.1.0/24 via tunnel 20.1.1.1
>     ip route add 20.1.1.1/24 dev eth0
>
>     This would indicate to the routing subsystem to perform a
>     fib lookup on 20.1.1.1 to determine the outgoing interface.
>
> 3. virtual tunnel interface to share configuration among routes
>     ip route add 10.1.1.0/24 via tunnel 20.1.1.1 dev vxlan0
>     ip route add 20.1.1.1/24 dev eth0
>
> I think all of them are intuitive and easy to implement. This will
> also allow to incorporate the bridge model.
>
ack, that sounds intuitive.
With RTA_ENCAP and the mpls examples I was using, it looks something like
the below for (1):
ip route add 10.1.1.0/30 encap mpls 200 via 10.1.1.1 dev eth0

The tunnel dst is parsed and understood by the light weight tunnel
driver, which I think will end up having to do the lookup (needs more
thought)...for (2) and (3).


> Your nexthop implementation seemed more correct based on the chunks
> I went through. Can we combine the two series and make the RTA_OIF
> in the nexthop optional if an RTA_ENCAP was provided and provide a
> route lookup instead?

yes, we can do that.
Robert can correct me if I misunderstood, but both our patches had similar
code to handle RTA_ENCAP.
The only difference was in the way we stored the encap data: mine was a
pointer to tunnel state and his was embedded in fib_nh. His patch today
assumes there is a tunnel device,
and mine assumes the output device is specified in the ipv4 fib route.

I am trying to get my code on github to collaborate better. Stay tuned 
(hopefully end of day today).

While we are on this conversation: though the code already supports
nested attributes (with the example Robert showed), I introduced
explicit nested attributes for mpls in my version,
and it seemed better to introduce two attributes,
RTA_ENCAP_TYPE and RTA_ENCAP, where the
type determines the nested policy for RTA_ENCAP:
RTA_ENCAP_TYPE /* MPLS, VXLAN etc */

RTA_ENCAP {
       MPLS_IPTUNNEL_UNSPEC,
       MPLS_IPTUNNEL_DST,
}

RTA_ENCAP {  /* this is also similar to the example Robert posted for vxlan */
       VXLAN_TUN_UNSPEC,
       VXLAN_TUN_ID,
       VXLAN_TUN_DST,
       VXLAN_TUN_SRC,
       VXLAN_TUN_TTL,
       VXLAN_TUN_TOS,
       VXLAN_TUN_SPORT,
       VXLAN_TUN_DPORT,
       VXLAN_TUN_FLAGS,
       VXLAN_TUN_MAX,
}
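
One way to picture "type determines the nested policy" is a table keyed by the encap type, where each entry says which nested attributes are accepted and how long their payloads must be. The sketch below is a simplified userspace model: the DEMO_* ids are modeled on the names above but cover only a subset, and the payload sizes are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values carried in RTA_ENCAP_TYPE. */
enum { DEMO_TYPE_NONE, DEMO_TYPE_MPLS, DEMO_TYPE_VXLAN, DEMO_TYPE_MAX };

/* Nested attribute ids, one namespace per encap type. */
enum { DEMO_MPLS_UNSPEC, DEMO_MPLS_DST, DEMO_MPLS_MAX };
enum { DEMO_VXLAN_UNSPEC, DEMO_VXLAN_ID, DEMO_VXLAN_DST, DEMO_VXLAN_MAX };

/* Per-type policy: expected payload length per attribute (0 = not accepted). */
struct demo_policy {
    unsigned int nattrs;
    const uint16_t *payload_len;
};

static const uint16_t demo_mpls_len[DEMO_MPLS_MAX] = {
    [DEMO_MPLS_DST] = 4,
};

static const uint16_t demo_vxlan_len[DEMO_VXLAN_MAX] = {
    [DEMO_VXLAN_ID]  = 4,
    [DEMO_VXLAN_DST] = 4,
};

static const struct demo_policy demo_policies[DEMO_TYPE_MAX] = {
    [DEMO_TYPE_MPLS]  = { DEMO_MPLS_MAX,  demo_mpls_len },
    [DEMO_TYPE_VXLAN] = { DEMO_VXLAN_MAX, demo_vxlan_len },
};

/*
 * Check one nested attribute against the policy selected by encap_type.
 * Returns 0 if the attribute id and payload length are acceptable, else -1.
 */
static int demo_validate(unsigned int encap_type, unsigned int attr,
                         uint16_t len)
{
    const struct demo_policy *p;

    if (encap_type >= DEMO_TYPE_MAX)
        return -1;
    p = &demo_policies[encap_type];
    if (!p->payload_len || attr >= p->nattrs)
        return -1;
    if (p->payload_len[attr] == 0)
        return -1;
    return p->payload_len[attr] == len ? 0 : -1;
}
```

The same attribute number can mean different things under different encap types, which is why the type has to be known before the nested payload is parsed.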


Thanks,
Roopa

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-08 15:17           ` roopa
@ 2015-06-08 22:58             ` Thomas Graf
  2015-06-10  7:13               ` roopa
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Graf @ 2015-06-08 22:58 UTC
  To: roopa; +Cc: Robert Shearman, ebiederm, netdev, Vivek Venkatraman

On 06/08/15 at 08:17am, roopa wrote:
> ack, that sounds intuitive.
> With RTA_ENCAP and the mpls examples i was using it looks something like the
> below for (1)
> ip route add 10.1.1.0/30 encap mpls 200 via 10.1.1.1 dev eth0
> 
> The tunnel dst is parsed and understood by the light weight tunnel driver,
> which I think will
> end up having to do the lookup (needs more thought)...for (2) and (3).

I think we only want to perform the nested fib lookup if no dev
is specified. If a tunnel device is specified, that device will
do the fib lookup and can cache the route in the encap socket.

> >Your nexthop implementation seemed more correct based on the chunks
> >I went through. Can we combine the two series and make the RTA_OIF
> >in the nexthop optional if an RTA_ENCAP was provided and provide a
> >route lookup instead?
> 
> yes, we can do that.
>  Robert can correct me if i misunderstood, both our patches had similar code
> to handle RTA_ENCAP.
> Only difference was in the way we stored the encaped data, mine was a
> pointer to tunnel state and his was embedded in fib_nh. His patch today
> assumes there is a tunnel device.
> And mine assumes the output device is specified in the ipv4 fib route.

I'll immediately ACK any series that supports both models and rebase
my patches on top of it. I think we are on the right track overall.

> I am trying to get my code on github to collaborate better. Stay tuned
> (hopefully end of day today).

Cool

> While we are on this conversation, Though the code already supports nested
> attributes (with the example robert showed), I introduced explicit nested
> attributes for mpls in my version,
> and it seemed like it is better to introduce two attributes RTA_ENCAP_TYPE
> and RTA_ENCAP and
> type determines the nested policy for RTA_ENCAP
> RTA_ENCAP_TYPE /* MPLS, VXLAN etc */

+1

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-08 22:58             ` Thomas Graf
@ 2015-06-10  7:13               ` roopa
  2015-06-12 16:15                 ` roopa
  0 siblings, 1 reply; 11+ messages in thread
From: roopa @ 2015-06-10  7:13 UTC
  To: Thomas Graf; +Cc: Robert Shearman, ebiederm, netdev, Vivek Venkatraman

On 6/8/15, 3:58 PM, Thomas Graf wrote:
> I'll immediately ACK any series that supports both models and rebase
> my patches on top of it. I think we are on the right track overall.
>
>> I am trying to get my code on github to collaborate better. Stay tuned
>> (hopefully end of day today).
>
Robert/Thomas, all my changes are in the repos below under the 'mpls' branch.
https://github.com/CumulusNetworks/net-next
https://github.com/CumulusNetworks/iproute2

The last iproute2 commit has a sample usage.

The commits pushed to this tree do not contain support for the following
yet (but I am working on it):
a) tunnel routes to work with tunnel RTA_OIF and a non-tunnel RTA_OIF:
The current commits in the tree assume a non-tunnel RTA_OIF.
If the tunnel driver has registered a dst_output func, dst_output
is set to the tunnel dst output handler in the receive route lookup path,
which in turn does the encap and xmits. Thomas had last suggested using a
flag to skip the dst output handler re-direction for cases where RTA_OIF
is a special tunnel netdev and the tunnel driver xmit function
can do the encap. My current thinking is to pass the oif to the encap
parse handler and the handler can set the flag on the tunnel state. And
this flag can then be used to skip the dst_output re-direction.
This change should be trivial; I will fix it soon.

b) make RTA_OIF optional and do a fib lookup.
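
The flag idea in (a) can be modeled in a few lines. This is a simplified userspace sketch, not the code in the tree: the demo_* structures, the flag name, and the stand-in output functions are all hypothetical. The parse handler records whether the oif is a tunnel netdev, and the route lookup path only redirects the output handler when that flag is clear.

```c
#include <stdbool.h>
#include <stddef.h>

typedef int (*demo_output_fn)(void *pkt);

/* Set by the encap parse handler when RTA_OIF is a tunnel netdev. */
#define DEMO_TUN_F_OIF_IS_TUNNEL 0x1u

struct demo_tun_state {
    unsigned int flags;
    demo_output_fn output; /* encap handler registered by the tunnel driver */
};

struct demo_dst {
    demo_output_fn output;
};

/* Stand-ins for the regular output path and the mpls encap output path. */
static int demo_ip_output(void *pkt)   { (void)pkt; return 0; }
static int demo_mpls_output(void *pkt) { (void)pkt; return 1; }

/* Encap parse handler: remember whether the oif will do the encap itself. */
static void demo_encap_parse(struct demo_tun_state *ts, bool oif_is_tunnel)
{
    if (oif_is_tunnel)
        ts->flags |= DEMO_TUN_F_OIF_IS_TUNNEL;
    else
        ts->flags &= ~DEMO_TUN_F_OIF_IS_TUNNEL;
}

/*
 * Route lookup path: redirect output to the tunnel handler unless the
 * flag says the tunnel netdev's xmit function will encapsulate.
 */
static void demo_set_output(struct demo_dst *dst,
                            const struct demo_tun_state *ts)
{
    dst->output = demo_ip_output;
    if (ts && ts->output && !(ts->flags & DEMO_TUN_F_OIF_IS_TUNNEL))
        dst->output = ts->output;
}
```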

keep your suggestions/feedback coming...

thanks,
Roopa

* Re: [PATCH WIP RFC 0/3] mpls: support for ler
  2015-06-10  7:13               ` roopa
@ 2015-06-12 16:15                 ` roopa
  0 siblings, 0 replies; 11+ messages in thread
From: roopa @ 2015-06-12 16:15 UTC
  To: Thomas Graf; +Cc: Robert Shearman, ebiederm, netdev, Vivek Venkatraman

On 6/10/15, 12:13 AM, roopa wrote:
> Robert/Thomas, All my changes are in the below repo under the 'mpls' 
> branch.
> https://github.com/CumulusNetworks/net-next
> https://github.com/CumulusNetworks/iproute2
>
> The last iproute2 commit has a sample usage.
>
> The commits pushed to this tree do not contain support for the 
> following yet (but working on it):
> a) tunnel routes to work with tunnel RTA_OIF and a non-tunnel RTA_OIF:
> The current commits in the tree assume a non-tunnel RTA_OIF.
> If the tunnel driver has registered a dst_output func,  dst_output
> is set to the tunnel dst output handler in the receive route lookup 
> path which in turn does the encap
> and xmits. Thomas had last suggested using a flag to skip the dst 
> output handler re-direction
> for cases where RTA_OIF is a special tunnel netdev and the tunnel 
> driver xmit function
> can do the encap. My current thinking is to pass the oif to the encap 
> parse handler and the handler can set the flag on the tunnel state. 
> And this flag can then be used to skip the dst_output re-direction.
> This change should be trivial will fix it soon.

I have pushed this change to my github tree.
>
> b) make RTA_OIF optional and do a fib lookup.
>
Thinking about this some more, RTA_OIF is already optional, and
net/ipv4/fib_semantics.c:fib_check_nh will look up the dev if not
specified. Wouldn't that be enough? (Unless I have misunderstood
something here.)
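
For readers unfamiliar with that fallback, the idea of resolving the device from the gateway address can be pictured as a longest-prefix match over existing routes. The toy below is only an analogy written for this note: demo_route and demo_resolve_oif are made-up names, and it does not reproduce what fib_check_nh actually does internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a host-order IPv4 address from dotted-quad parts. */
#define DEMO_IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

struct demo_route {
    uint32_t prefix;   /* host byte order */
    unsigned int plen; /* prefix length, 0..32 */
    int oif;           /* outgoing interface index */
};

static uint32_t demo_mask(unsigned int plen)
{
    return plen ? ~(uint32_t)0 << (32 - plen) : 0;
}

/*
 * Resolve the outgoing interface for a gateway that came without a dev:
 * pick the matching route with the longest prefix. Returns -1 if no
 * route covers the gateway.
 */
static int demo_resolve_oif(const struct demo_route *tbl, size_t n,
                            uint32_t gw)
{
    int best_oif = -1;
    int best_len = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        uint32_t mask;

        if (tbl[i].plen > 32)
            continue; /* ignore malformed entries */
        mask = demo_mask(tbl[i].plen);
        if ((gw & mask) == (tbl[i].prefix & mask) &&
            (int)tbl[i].plen > best_len) {
            best_len = (int)tbl[i].plen;
            best_oif = tbl[i].oif;
        }
    }
    return best_oif;
}
```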

thanks,
Roopa

end of thread, other threads:[~2015-06-12 16:15 UTC | newest]

Thread overview: 11+ messages
2015-06-03 14:21 [PATCH WIP RFC 0/3] mpls: support for ler Roopa Prabhu
2015-06-05  9:14 ` Thomas Graf
2015-06-05 14:16   ` roopa
2015-06-05 15:26     ` Robert Shearman
2015-06-06  2:54       ` roopa
2015-06-08 12:33         ` Thomas Graf
2015-06-08 15:17           ` roopa
2015-06-08 22:58             ` Thomas Graf
2015-06-10  7:13               ` roopa
2015-06-12 16:15                 ` roopa
2015-06-05 14:31   ` Robert Shearman
