From: Robert Shearman <rshearma@brocade.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "davem@davemloft.net" <davem@davemloft.net>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [PATCH net-next v2 5/5] mpls: Allow payload type to be associated with label routes
Date: Mon, 23 Mar 2015 14:02:49 +0000
Message-ID: <55101D09.2000306@brocade.com>
In-Reply-To: <87sicw4up6.fsf@x220.int.ebiederm.org>
On 22/03/15 20:56, Eric W. Biederman wrote:
> Robert Shearman <rshearma@brocade.com> writes:
>
>> RFC 4182 s2 states that if an IPv4 Explicit NULL label is the only
>> label on the stack, then after popping the resulting packet must be
>> treated as an IPv4 packet and forwarded based on the IPv4 header. The
>> same is true for IPv6 Explicit NULL with an IPv6 packet following.
>>
>> Therefore, when installing the IPv4/IPv6 Explicit NULL label routes,
>> add an attribute that specifies the expected payload type for use at
>> forwarding time for determining the type of the encapsulated packet
>> instead of inspecting the first nibble of the packet.
>
> So this patch is not wrong. And at a practical level it is a good
> idea to enforce ipv4 when the ipv4 explicit null label is present
> and similarly with ipv6.
>
> I do have some quibbles.
>
> First I want to point out that RFC3032 section 2.2 talks about using
> a label in combination with the packet's contents to figure out the
> type of packet that is being transmitted. IPv4 and IPv6 do count as a
> set of network layer protocols that can be distinguished by inspection
> of the network layer header.
I'm confused why you feel this is a quibble. This patch allows this case
and even documents that this can be done:
>> + MPT_UNSPEC, /* IPv4 or IPv6 */
I haven't added any warnings or barriers to using this even with it
being orthogonal to the direction all the other known MPLS stacks have
gone in, as we discussed in a previous thread.
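
Purely as an illustration of the lookup order the patch uses (a userspace
sketch, not the kernel code; the sketch_* helpers and their byte-buffer
arguments are made up for this example): the payload type configured on
the route wins, and the first nibble is only inspected when the route
says MPT_UNSPEC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the enum added by the quoted patch. */
enum mpls_payload_type {
	MPT_UNSPEC, /* IPv4 or IPv6 */
	MPT_IPV4,
	MPT_IPV6,
};

/* Hypothetical helper: classify a payload by the version nibble of its
 * first byte, as mpls_pkt_determine_af() does with ip_hdr(skb)->version.
 */
enum mpls_payload_type sketch_type_from_nibble(const uint8_t *payload,
					       size_t len)
{
	if (len == 0)
		return MPT_UNSPEC;

	switch (payload[0] >> 4) {
	case 4:
		return MPT_IPV4;
	case 6:
		return MPT_IPV6;
	}

	return MPT_UNSPEC;
}

/* Hypothetical helper: the route's payload type takes precedence and the
 * packet contents are only consulted when the route leaves it unspecified.
 */
enum mpls_payload_type sketch_resolve_type(enum mpls_payload_type route_type,
					   const uint8_t *payload, size_t len)
{
	if (route_type != MPT_UNSPEC)
		return route_type;

	return sketch_type_from_nibble(payload, len);
}
```

So a route left at MPT_UNSPEC still gets the inspection behaviour, while
the Explicit NULL routes never look at the nibble.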
> Changing mpls_egress to mpls_bos_egress bothers me a little, because it
> seems redundant. But I can see an argument for that name change.
>
> I think it would be cleaner if we set MPT_IPV4 = 4 and MPT_IPV6 = 6,
> which would remove the switch statement in mpls_pkt_determine_af.
Ok.
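
For reference, one way that suggestion might look (a rough userspace
sketch only, not what will necessarily be in the next version; the
sketch_determine_af name and its single-byte argument are assumptions
made for the example). With the enum values equal to the IP version
numbers, the nibble can be returned directly once it has been checked:

```c
#include <assert.h>
#include <stdint.h>

/* Enum values chosen to match the IP version number in the header. */
enum mpls_payload_type {
	MPT_UNSPEC = 0, /* IPv4 or IPv6, decided by inspection */
	MPT_IPV4   = 4,
	MPT_IPV6   = 6,
};

/* Both values still fit in the 3-bit rt_payload_type field (max 7). */
static_assert(MPT_IPV6 < (1 << 3), "payload type must fit in 3 bits");

/* Hypothetical replacement for the switch: take the first byte of the
 * payload and map its version nibble straight onto the enum. Anything
 * other than 4 or 6 is reported as MPT_UNSPEC.
 */
enum mpls_payload_type sketch_determine_af(uint8_t first_byte)
{
	uint8_t version = first_byte >> 4;

	if (version == MPT_IPV4 || version == MPT_IPV6)
		return (enum mpls_payload_type)version;

	return MPT_UNSPEC;
}
```

A check on the nibble is still needed in this sketch so that other
version values don't leak out as bogus payload types.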
> You delete my big fat comment referring people to how packets are
> encoded in mpls. That seems unfortunate, because it can be easy to get
> lost in the MPLS rfcs, and I am certain someone will want to do more
> than support IPv4 and IPv6.
Yes, I deleted the comment because it refers to determining the type of
packet using the first nibble for the pseudo-wire with control-word
case, which as we discussed in a previous thread is contrary to the
intention of the author of the RFC draft that defines it. I can
certainly keep the references to the RFCs around though.
>
> Given the number of pseudo wire types I do believe that 3 bits is going
> to be too small to encode everything going forward.
I can steal another bit from the number of labels if you'd prefer, but
if you're suggesting moving this out to a full 8-bit field then I don't
see the need to over-engineer this and use more memory given that this
can easily be changed going forward.
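
To make the trade-off concrete (a userspace sketch under stated
assumptions: uint8_t stands in for the kernel's u8, the struct and
helper names are invented for the example, and bit-fields of type
uint8_t are a compiler extension that GCC and Clang accept), here are
the split from this patch and the "steal another bit" alternative side
by side:

```c
#include <assert.h>
#include <stdint.h>

/* Split used in the quoted patch: 1 + 3 + 4 bits. */
struct sketch_flags_v2 {
	uint8_t rt_unlabeled    : 1;
	uint8_t rt_payload_type : 3; /* values 0..7 */
	uint8_t rt_labels       : 4; /* values 0..15 */
};

/* Hypothetical alternative: one bit moved from rt_labels to
 * rt_payload_type, still 8 bits in total.
 */
struct sketch_flags_alt {
	uint8_t rt_unlabeled    : 1;
	uint8_t rt_payload_type : 4; /* values 0..15 */
	uint8_t rt_labels       : 3; /* values 0..7 */
};

/* Largest value an unsigned bit-field of the given width can hold. */
unsigned int sketch_field_max(unsigned int bits)
{
	return (1u << bits) - 1u;
}
```

Either split keeps the three fields within 8 bits; whether 3 bits is
enough for rt_labels depends on MAX_NEW_LABELS.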
Thanks,
Rob
>
>> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
>> Signed-off-by: Robert Shearman <rshearma@brocade.com>
>> ---
>> net/mpls/af_mpls.c | 87 ++++++++++++++++++++++++++++++++++--------------------
>> 1 file changed, 55 insertions(+), 32 deletions(-)
>>
>> diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
>> index 14c7e76..653bae1 100644
>> --- a/net/mpls/af_mpls.c
>> +++ b/net/mpls/af_mpls.c
>> @@ -23,13 +23,20 @@
>> /* This maximum ha length copied from the definition of struct neighbour */
>> #define MAX_VIA_ALEN (ALIGN(MAX_ADDR_LEN, sizeof(unsigned long)))
>>
>> +enum mpls_payload_type {
>> + MPT_UNSPEC, /* IPv4 or IPv6 */
>> + MPT_IPV4,
>> + MPT_IPV6,
>> +};
>> +
>> struct mpls_route { /* next hop label forwarding entry */
>> struct net_device __rcu *rt_dev;
>> struct rcu_head rt_rcu;
>> u32 rt_label[MAX_NEW_LABELS];
>> u8 rt_protocol; /* routing protocol that set this entry */
>> u8 rt_unlabeled : 1;
>> - u8 rt_labels : 7;
>> + u8 rt_payload_type : 3;
>> + u8 rt_labels : 4;
>> u8 rt_via_alen;
>> u8 rt_via_table;
>> u8 rt_via[0];
>> @@ -87,19 +94,24 @@ static bool mpls_pkt_too_big(const struct sk_buff *skb, unsigned int mtu)
>> return true;
>> }
>>
>> -static bool mpls_egress(struct mpls_route *rt, struct sk_buff *skb,
>> - struct mpls_entry_decoded dec)
>> +static enum mpls_payload_type mpls_pkt_determine_af(struct sk_buff *skb)
>> {
>> - /* RFC4385 and RFC5586 encode other packets in mpls such that
>> - * they don't conflict with the ip version number, making
>> - * decoding by examining the ip version correct in everything
>> - * except for the strangest cases.
>> - *
>> - * The strange cases if we choose to support them will require
>> - * manual configuration.
>> - */
>> - struct iphdr *hdr4;
>> - bool success = true;
>> + struct iphdr *hdr4 = ip_hdr(skb);
>> +
>> + switch (hdr4->version) {
>> + case 4:
>> + return MPT_IPV4;
>> + case 6:
>> + return MPT_IPV6;
>> + }
>> +
>> + return MPT_UNSPEC;
>> +}
>> +
>> +static bool mpls_bos_egress(struct mpls_route *rt, struct sk_buff *skb,
>> + struct mpls_entry_decoded dec)
>> +{
>> + enum mpls_payload_type payload_type;
>>
>> /* The IPv4 code below accesses through the IPv4 header
>> * checksum, which is 12 bytes into the packet.
>> @@ -114,24 +126,31 @@ static bool mpls_egress(struct mpls_route *rt, struct sk_buff *skb,
>> if (!pskb_may_pull(skb, 12))
>> return false;
>>
>> - /* Use ip_hdr to find the ip protocol version */
>> - hdr4 = ip_hdr(skb);
>> - if (hdr4->version == 4) {
>> + payload_type = rt->rt_payload_type;
>> + if (payload_type == MPT_UNSPEC)
>> + payload_type = mpls_pkt_determine_af(skb);
>> +
>> + switch (payload_type) {
>> + case MPT_IPV4: {
>> + struct iphdr *hdr4 = ip_hdr(skb);
>> skb->protocol = htons(ETH_P_IP);
>> csum_replace2(&hdr4->check,
>> htons(hdr4->ttl << 8),
>> htons(dec.ttl << 8));
>> hdr4->ttl = dec.ttl;
>> + return true;
>> }
>> - else if (hdr4->version == 6) {
>> + case MPT_IPV6: {
>> struct ipv6hdr *hdr6 = ipv6_hdr(skb);
>> skb->protocol = htons(ETH_P_IPV6);
>> hdr6->hop_limit = dec.ttl;
>> + return true;
>> }
>> - else
>> - /* version 0 and version 1 are used by pseudo wires */
>> - success = false;
>> - return success;
>> + case MPT_UNSPEC:
>> + break;
>> + }
>> +
>> + return false;
>> }
>>
>> static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
>> @@ -210,7 +229,7 @@ static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
>> skb->protocol = htons(ETH_P_MPLS_UC);
>>
>> if (unlikely(!new_header_size && dec.bos)) {
>> - if (!mpls_egress(rt, skb, dec))
>> + if (!mpls_bos_egress(rt, skb, dec))
>> goto drop;
>> } else if (rt->rt_unlabeled) {
>> /* Labeled traffic destined to unlabeled peer should
>> @@ -253,16 +272,17 @@ static const struct nla_policy rtm_mpls_policy[RTA_MAX+1] = {
>> };
>>
>> struct mpls_route_config {
>> - u32 rc_protocol;
>> - u32 rc_ifindex;
>> - u16 rc_via_table;
>> - u16 rc_via_alen;
>> - u8 rc_via[MAX_VIA_ALEN];
>> - u32 rc_label;
>> - u32 rc_output_labels;
>> - u32 rc_output_label[MAX_NEW_LABELS];
>> - u32 rc_nlflags;
>> - struct nl_info rc_nlinfo;
>> + u32 rc_protocol;
>> + u32 rc_ifindex;
>> + u16 rc_via_table;
>> + u16 rc_via_alen;
>> + u8 rc_via[MAX_VIA_ALEN];
>> + u32 rc_label;
>> + u32 rc_output_labels;
>> + u32 rc_output_label[MAX_NEW_LABELS];
>> + u32 rc_nlflags;
>> + enum mpls_payload_type rc_payload_type;
>> + struct nl_info rc_nlinfo;
>> };
>>
>> static struct mpls_route *mpls_rt_alloc(size_t alen)
>> @@ -413,6 +433,7 @@ static int mpls_route_add(struct mpls_route_config *cfg)
>> }
>> rt->rt_protocol = cfg->rc_protocol;
>> RCU_INIT_POINTER(rt->rt_dev, dev);
>> + rt->rt_payload_type = cfg->rc_payload_type;
>> rt->rt_via_table = cfg->rc_via_table;
>> memcpy(rt->rt_via, cfg->rc_via, cfg->rc_via_alen);
>>
>> @@ -948,6 +969,7 @@ static int resize_platform_label_table(struct net *net, size_t limit)
>> goto nort0;
>> RCU_INIT_POINTER(rt0->rt_dev, lo);
>> rt0->rt_protocol = RTPROT_KERNEL;
>> + rt0->rt_payload_type = MPT_IPV4;
>> rt0->rt_via_table = NEIGH_LINK_TABLE;
>> memcpy(rt0->rt_via, lo->dev_addr, lo->addr_len);
>> }
>> @@ -958,6 +980,7 @@ static int resize_platform_label_table(struct net *net, size_t limit)
>> goto nort2;
>> RCU_INIT_POINTER(rt2->rt_dev, lo);
>> rt2->rt_protocol = RTPROT_KERNEL;
>> + rt2->rt_payload_type = MPT_IPV6;
>> rt2->rt_via_table = NEIGH_LINK_TABLE;
>> memcpy(rt2->rt_via, lo->dev_addr, lo->addr_len);
>> }