From: Jiri Pirko <jpirko@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: John Fastabend <john.r.fastabend@intel.com>,
David Miller <davem@davemloft.net>,
"jesse@nicira.com" <jesse@nicira.com>,
"hans.schillstrom@ericsson.com" <hans.schillstrom@ericsson.com>,
"mbizon@freebox.fr" <mbizon@freebox.fr>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"fubar@us.ibm.com" <fubar@us.ibm.com>
Subject: Re: [net-next PATCH] net: allow vlan traffic to be received under bond
Date: Sat, 29 Oct 2011 18:16:41 +0200
Message-ID: <20111029161640.GB2053@minipsycho.orion>
In-Reply-To: <1319904026.2586.42.camel@edumazet-laptop>
Sat, Oct 29, 2011 at 06:00:26PM CEST, eric.dumazet@gmail.com wrote:
>On Saturday, 29 October 2011 at 16:59 +0200, Jiri Pirko wrote:
>> Sat, Oct 29, 2011 at 12:22:26PM CEST, eric.dumazet@gmail.com wrote:
>> >On Friday, 28 October 2011 at 19:20 -0700, John Fastabend wrote:
>> >
>> >> Thanks Eric! Thought about this some and I haven't come up
>> >> with anything better yet. Even though this might be a slight
>> >> hack I would prefer this to reverting the patch.
>> >>
>> >> I'll think about this more tomorrow. Would you be against
>> >> submitting this patch?
>> >
>> >I can't submit this patch, because it's a hack and a partial fix.
>> >
>> >For unicast packets, we still do the wrong thing: we set their
>> >pkt_type to PACKET_OTHERHOST before the call to rx_handler.
>> >
>> >In this case, bond_handle_frame() won't handle the packet correctly in
>> >some cases (BOND_MODE_ALB ...). I suppose bridge might be confused as
>> >well. So other problems remain.
>> >
>> >We should delay the PACKET_OTHERHOST setting to the last moment, that is
>> >the last time vlan_do_receive() is called.
>> >
>> >What about the following patch instead?
>> >
>> >[PATCH] vlan: allow nested vlan_do_receive()
>> >
>> >commit 2425717b27eb (net: allow vlan traffic to be received under bond)
>> >broke ARP processing on vlan on top of bonding.
>> >
>> >       +-------+
>> >eth0 --| bond0 |---bond0.103
>> >eth1 --|       |
>> >       +-------+
>> >
>> >52870.115435: skb_gro_reset_offset <-napi_gro_receive
>> >52870.115435: dev_gro_receive <-napi_gro_receive
>> >52870.115435: napi_skb_finish <-napi_gro_receive
>> >52870.115435: netif_receive_skb <-napi_skb_finish
>> >52870.115435: get_rps_cpu <-netif_receive_skb
>> >52870.115435: __netif_receive_skb <-netif_receive_skb
>> >52870.115436: vlan_do_receive <-__netif_receive_skb
>> >52870.115436: bond_handle_frame <-__netif_receive_skb
>> >52870.115436: vlan_do_receive <-__netif_receive_skb
>> >52870.115436: arp_rcv <-__netif_receive_skb
>> >52870.115436: kfree_skb <-arp_rcv
>> >
>> >The packet is dropped in arp_rcv() because its pkt_type was set to
>> >PACKET_OTHERHOST in the first vlan_do_receive() call, since no eth0.103
>> >exists.
>> >
>> >We really need to change pkt_type only if no further rx_handler is about
>> >to be called for the packet.
>> >
>> >Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>> >---
>> > include/linux/if_vlan.h | 8 +++++---
>> > net/8021q/vlan_core.c | 7 +++++--
>> > net/core/dev.c | 4 ++--
>> > 3 files changed, 12 insertions(+), 7 deletions(-)
>> >
>> >diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>> >index 44da482..95874ff 100644
>> >--- a/include/linux/if_vlan.h
>> >+++ b/include/linux/if_vlan.h
>> >@@ -106,7 +106,8 @@ extern struct net_device *__vlan_find_dev_deep(struct net_device *real_dev,
>> > extern struct net_device *vlan_dev_real_dev(const struct net_device *dev);
>> > extern u16 vlan_dev_vlan_id(const struct net_device *dev);
>> >
>> >-extern bool vlan_do_receive(struct sk_buff **skb);
>> >+extern bool vlan_do_receive(struct sk_buff **skb,
>> >+			    rx_handler_func_t *rx_handler);
>> > extern struct sk_buff *vlan_untag(struct sk_buff *skb);
>> >
>> > #else
>> >@@ -128,9 +129,10 @@ static inline u16 vlan_dev_vlan_id(const struct net_device *dev)
>> > 	return 0;
>> > }
>> >
>> >-static inline bool vlan_do_receive(struct sk_buff **skb)
>> >+static inline bool vlan_do_receive(struct sk_buff **skb,
>> >+				   rx_handler_func_t *rx_handler)
>> > {
>> >-	if ((*skb)->vlan_tci & VLAN_VID_MASK)
>> >+	if (((*skb)->vlan_tci & VLAN_VID_MASK) && !rx_handler)
>> > 		(*skb)->pkt_type = PACKET_OTHERHOST;
>> > 	return false;
>> > }
>> >diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> >index f1f2f7b..3ec1ada 100644
>> >--- a/net/8021q/vlan_core.c
>> >+++ b/net/8021q/vlan_core.c
>> >@@ -4,7 +4,7 @@
>> > #include <linux/netpoll.h>
>> > #include "vlan.h"
>> >
>> >-bool vlan_do_receive(struct sk_buff **skbp)
>> >+bool vlan_do_receive(struct sk_buff **skbp, rx_handler_func_t *rx_handler)
>> > {
>> > 	struct sk_buff *skb = *skbp;
>> > 	u16 vlan_id = skb->vlan_tci & VLAN_VID_MASK;
>> >@@ -13,7 +13,10 @@ bool vlan_do_receive(struct sk_buff **skbp)
>> >
>> > 	vlan_dev = vlan_find_dev(skb->dev, vlan_id);
>> > 	if (!vlan_dev) {
>> >-		if (vlan_id)
>> >+		/* Only the last call to vlan_do_receive() should change
>> >+		 * pkt_type to PACKET_OTHERHOST
>> >+		 */
>> >+		if (vlan_id && !rx_handler)
>> > 			skb->pkt_type = PACKET_OTHERHOST;
>> > 		return false;
>> > 	}
>> >diff --git a/net/core/dev.c b/net/core/dev.c
>> >index edcf019..40976b4 100644
>> >--- a/net/core/dev.c
>> >+++ b/net/core/dev.c
>> >@@ -3283,18 +3283,18 @@ another_round:
>> > ncls:
>> > #endif
>> >
>> >+	rx_handler = rcu_dereference(skb->dev->rx_handler);
>> > 	if (vlan_tx_tag_present(skb)) {
>> > 		if (pt_prev) {
>> > 			ret = deliver_skb(skb, pt_prev, orig_dev);
>> > 			pt_prev = NULL;
>> > 		}
>> >-		if (vlan_do_receive(&skb))
>> >+		if (vlan_do_receive(&skb, rx_handler))
>>
>> I must say I do not like passing rx_handler out like this. Apart from
>> not being nice, it might be misleading.
>>
>> How about something like the following instead? I still have to test it,
>> but I believe it should resolve the problem.
>>
>>
>> diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>> index 44da482..165a487 100644
>> --- a/include/linux/if_vlan.h
>> +++ b/include/linux/if_vlan.h
>> @@ -130,8 +130,6 @@ static inline u16 vlan_dev_vlan_id(const struct net_device *dev)
>>
>>  static inline bool vlan_do_receive(struct sk_buff **skb)
>>  {
>> -	if ((*skb)->vlan_tci & VLAN_VID_MASK)
>> -		(*skb)->pkt_type = PACKET_OTHERHOST;
>>  	return false;
>>  }
>>
>> @@ -141,6 +139,14 @@ static inline struct sk_buff *vlan_untag(struct sk_buff *skb)
>>  }
>>  #endif
>>
>> +static inline void vlan_handle_leftover(struct sk_buff *skb)
>> +{
>> +	u16 vlan_id = skb->vlan_tci & VLAN_VID_MASK;
>> +
>> +	if (vlan_id)
>> +		skb->pkt_type = PACKET_OTHERHOST;
>> +}
>> +
>>  /**
>>   * vlan_insert_tag - regular VLAN tag inserting
>>   * @skb: skbuff to tag
>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> index f1f2f7b..540da12 100644
>> --- a/net/8021q/vlan_core.c
>> +++ b/net/8021q/vlan_core.c
>> @@ -12,11 +12,8 @@ bool vlan_do_receive(struct sk_buff **skbp)
>>  	struct vlan_pcpu_stats *rx_stats;
>>
>>  	vlan_dev = vlan_find_dev(skb->dev, vlan_id);
>> -	if (!vlan_dev) {
>> -		if (vlan_id)
>> -			skb->pkt_type = PACKET_OTHERHOST;
>> +	if (!vlan_dev)
>>  		return false;
>> -	}
>>
>>  	skb = *skbp = skb_share_check(skb, GFP_ATOMIC);
>>  	if (unlikely(!skb))
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index b7ba81a..6fdfcc9 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3314,6 +3314,14 @@ ncls:
>>  		}
>>  	}
>>
>> +	if (vlan_tx_tag_present(skb)) {
>> +		/*
>> +		 * Tag is still present here. That means there's no device
>> +		 * set up for this vlan id. So handle these leftovers here.
>> +		 */
>> +		vlan_handle_leftover(skb);
>> +	}
>> +
>>  	/* deliver only exact match when indicated */
>>  	null_or_dev = deliver_exact ? skb->dev : NULL;
>>
>
>Hmm, does it really work? Where is vlan_tci cleared?
Near the end of vlan_do_receive.
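To make the idea concrete, here is a rough user-space model of that flow. It is only a sketch under stated assumptions, not kernel code: struct pkt, model_vlan_do_receive() and model_vlan_handle_leftover() are hypothetical stand-ins, and the vlan device lookup is reduced to a flag. The point is that a matched tag is consumed (vlan_tci is zeroed), so a tag that is still present afterwards means no vlan device claimed the packet.

```c
#include <stdbool.h>
#include <stdint.h>

#define VLAN_VID_MASK 0x0fff

enum pkt_type { PACKET_HOST, PACKET_OTHERHOST };

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct pkt {
	uint16_t vlan_tci;
	enum pkt_type type;
};

/*
 * Model of vlan_do_receive() in the proposal: on a match the tag is
 * consumed, otherwise the packet is left untouched.
 */
static bool model_vlan_do_receive(struct pkt *p, bool vlan_dev_exists)
{
	if (!vlan_dev_exists)
		return false;
	p->vlan_tci = 0;	/* models the clearing near the end of vlan_do_receive() */
	return true;
}

/* Model of the proposed leftover handling: a surviving vlan id marks the packet. */
static void model_vlan_handle_leftover(struct pkt *p)
{
	if (p->vlan_tci & VLAN_VID_MASK)
		p->type = PACKET_OTHERHOST;
}
```

With a packet tagged 103, calling model_vlan_do_receive() with a matching device and then model_vlan_handle_leftover() leaves the type at PACKET_HOST; without a matching device the leftover step sets PACKET_OTHERHOST.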
>
>This is indeed nice, but it adds another test in the fast path, whereas in
>my patch the additional tests are done in the slow path only.
Oh, the vlan_tx_tag_present() check adds some overhead, but I suppose that
can be accepted. I think it's better to be nice here...
>
>I see nothing wrong with passing rx_handler: it's probably cleaner than
>adding rcu_dereference_raw(skb->dev->rx_handler) in the two
>vlan_do_receive() implementations...
Cleaner for sure.
>
>For reference, this was the first patch I had in mind, before I decided
>that the caller had to pass the rx_handler instead. (It could be a boolean
>instead)
>
> include/linux/if_vlan.h | 3 ++-
> net/8021q/vlan_core.c | 5 ++++-
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
>diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
>index 44da482..c6c00b4 100644
>--- a/include/linux/if_vlan.h
>+++ b/include/linux/if_vlan.h
>@@ -130,7 +130,8 @@ static inline u16 vlan_dev_vlan_id(const struct net_device *dev)
>
> static inline bool vlan_do_receive(struct sk_buff **skb)
> {
>-	if ((*skb)->vlan_tci & VLAN_VID_MASK)
>+	if (((*skb)->vlan_tci & VLAN_VID_MASK) &&
>+	    !rcu_dereference_raw((*skb)->dev->rx_handler))
> 		(*skb)->pkt_type = PACKET_OTHERHOST;
> 	return false;
> }
>diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>index f1f2f7b..245efb8 100644
>--- a/net/8021q/vlan_core.c
>+++ b/net/8021q/vlan_core.c
>@@ -13,7 +13,10 @@ bool vlan_do_receive(struct sk_buff **skbp)
>
> 	vlan_dev = vlan_find_dev(skb->dev, vlan_id);
> 	if (!vlan_dev) {
>-		if (vlan_id)
>+		/* Only the last call to vlan_do_receive() should change
>+		 * pkt_type to PACKET_OTHERHOST
>+		 */
>+		if (vlan_id && !rcu_dereference_raw(skb->dev->rx_handler))
> 			skb->pkt_type = PACKET_OTHERHOST;
> 		return false;
> 	}
>
>
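For comparison, the "only the last call marks the packet" rule from the quoted patches can be modelled in user space as below. This is a simplified sketch, not the kernel receive path: struct dev, struct pkt, vlan_pass() and model_receive() are hypothetical names, a device has at most one vlan on top of it, and a non-NULL master pointer stands in for a registered rx_handler. The defer flag switches between the old eager marking and the deferred marking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_VID_MASK 0x0fff

enum pkt_type { PACKET_HOST, PACKET_OTHERHOST };

/* Hypothetical, heavily reduced device: one optional vlan child, one optional master. */
struct dev {
	const char *name;
	uint16_t vlan_child_id;	/* 0 means no vlan device on top */
	struct dev *master;	/* non-NULL models a registered rx_handler */
};

struct pkt {
	struct dev *dev;
	uint16_t vlan_tci;
	enum pkt_type type;
};

/* One vlan pass; 'defer' selects the "only the last call marks" behaviour. */
static bool vlan_pass(struct pkt *p, bool defer)
{
	uint16_t id = p->vlan_tci & VLAN_VID_MASK;

	if (id && p->dev->vlan_child_id == id) {
		p->vlan_tci = 0;	/* tag consumed by the vlan device */
		return true;
	}
	/* Eager mode always marks; deferred mode marks only without an rx_handler. */
	if (id && !(defer && p->dev->master))
		p->type = PACKET_OTHERHOST;
	return false;
}

/* Walk the stack: vlan pass first, then the rx_handler moves the packet up. */
static enum pkt_type model_receive(struct pkt *p, bool defer)
{
	for (;;) {
		if (vlan_pass(p, defer))
			break;
		if (!p->dev->master)
			break;
		p->dev = p->dev->master;	/* models the handler plus another_round */
	}
	return p->type;
}
```

With eth0 enslaved to bond0 and vlan 103 configured only on bond0, a packet tagged 103 arriving on eth0 ends up as PACKET_OTHERHOST in eager mode (the case where arp_rcv() drops it) and stays PACKET_HOST in deferred mode. A packet tagged with an id that is configured nowhere is still marked PACKET_OTHERHOST in deferred mode, on the last pass.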
Thread overview: 29+ messages
2011-10-10 19:16 [net-next PATCH] net: allow vlan traffic to be received under bond John Fastabend
2011-10-10 22:37 ` Jiri Pirko
2011-10-11 2:07 ` John Fastabend
2011-10-11 2:43 ` Jesse Gross
2011-10-11 11:08 ` Hans Schillstrom
2011-10-11 13:13 ` John Fastabend
2011-10-13 13:09 ` Hans Schillström
2011-10-11 13:16 ` John Fastabend
2011-10-11 10:57 ` Jiri Pirko
2011-10-13 15:04 ` Maxime Bizon
2011-10-13 15:38 ` Jiri Pirko
2011-10-13 15:48 ` Maxime Bizon
2011-10-13 15:59 ` Hans Schillström
2011-10-13 17:42 ` John Fastabend
2011-10-13 18:23 ` Hans Schillström
2011-10-14 0:22 ` Jesse Gross
2011-10-19 3:47 ` David Miller
2011-10-28 10:00 ` Eric Dumazet
2011-10-28 11:06 ` Eric Dumazet
2011-10-29 2:20 ` John Fastabend
2011-10-29 10:22 ` Eric Dumazet
2011-10-29 14:59 ` Jiri Pirko
2011-10-29 16:00 ` Eric Dumazet
2011-10-29 16:13 ` [PATCH v2] vlan: allow nested vlan_do_receive() Eric Dumazet
2011-10-29 16:28 ` Jiri Pirko
2011-10-30 8:38 ` Jiri Pirko
2011-10-30 8:44 ` David Miller
2011-10-30 8:44 ` Eric Dumazet
2011-10-29 16:16 ` Jiri Pirko [this message]