* [PATCH] net: deliver skbs on inactive slaves to exact matches
@ 2010-05-06 17:24 John Fastabend
2010-05-06 18:01 ` Jay Vosburgh
0 siblings, 1 reply; 4+ messages in thread
From: John Fastabend @ 2010-05-06 17:24 UTC (permalink / raw)
To: bonding-devel, netdev
Cc: john.r.fastabend, christopher.leech, andy, kaber, fubar
Currently, the accelerated receive path for VLAN's will
drop packets if the real device is an inactive slave and
is not one of the special pkts tested for in
skb_bond_should_drop(). This behavior is different then
the non-accelerated path and for pkts over a bonded vlan.
For example,
vlanx -> bond0 -> ethx
will be dropped in the vlan path and not delivered to any
packet handlers. However,
bond0 -> vlanx -> ethx
will be delivered to handlers that match the exact dev,
because the VLAN path checks the real_dev which is not a
slave and netif_recv_skb() doesn't drop frames but only
delivers them to exact matches.
This patch adds a pkt_type PACKET_DROP which is now used
to identify skbs that would previously been dropped and
allows the skb to continue to skb_netif_recv(). Here we
add logic to check for PACKET_DROP and if so only deliver
to handlers that match exactly. IMHO this is more
consistent and gives pkt handlers a way to identify skbs
that come from inactive slaves.
This allows a third case to function which is important for
doing multipath with FCoE traffic while LAN traffic bonded,
bond0 -> ethx
|
vlanx -> --
Here the vlan is not in bond0 but the FCoE handler can still
receive the skb. Previously these skbs were dropped.
I have tested the following 4 configurations in failover modes
and load balancing modes and have not seen any duplicate packets
or unexpected bahavior.
# bond0 -> ethx
# vlanx -> bond0 -> ethx
# bond0 -> vlanx -> ethx
# bond0 -> ethx
|
vlanx -> --
Also this removes the PACKET_FASTROUTE define which was not being
used.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
include/linux/if_packet.h | 2 +-
net/8021q/vlan_core.c | 4 ++--
net/core/dev.c | 25 ++++++++++++++++++-------
3 files changed, 21 insertions(+), 10 deletions(-)
diff --git a/include/linux/if_packet.h b/include/linux/if_packet.h
index 6ac23ef..9d079fa 100644
--- a/include/linux/if_packet.h
+++ b/include/linux/if_packet.h
@@ -28,7 +28,7 @@ struct sockaddr_ll {
#define PACKET_OUTGOING 4 /* Outgoing of any type */
/* These ones are invisible by user level */
#define PACKET_LOOPBACK 5 /* MC/BRD frame looped back */
-#define PACKET_FASTROUTE 6 /* Fastrouted frame */
+#define PACKET_DROP 6 /* Drop packet */
/* Packet socket options */
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index c584a0a..4510e08 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -12,7 +12,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
return NET_RX_DROP;
if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
- goto drop;
+ skb->pkt_type = PACKET_DROP;
skb->skb_iif = skb->dev->ifindex;
__vlan_hwaccel_put_tag(skb, vlan_tci);
@@ -84,7 +84,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
struct sk_buff *p;
if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
- goto drop;
+ skb->pkt_type = PACKET_DROP;
skb->skb_iif = skb->dev->ifindex;
__vlan_hwaccel_put_tag(skb, vlan_tci);
diff --git a/net/core/dev.c b/net/core/dev.c
index 36d53be..cefac4f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2776,7 +2776,7 @@ static int __netif_receive_skb(struct sk_buff *skb)
struct net_device *orig_dev;
struct net_device *master;
struct net_device *null_or_orig;
- struct net_device *null_or_bond;
+ struct net_device *dev_or_bond;
int ret = NET_RX_DROP;
__be16 type;
@@ -2793,13 +2793,24 @@ static int __netif_receive_skb(struct sk_buff *skb)
if (!skb->skb_iif)
skb->skb_iif = skb->dev->ifindex;
+ /*
+ * bonding note: skbs received on inactive slaves should only
+ * be delivered to pkt handlers that are exact matches. Also
+ * the pkt_type field will be marked PACKET_DROP. If packet
+ * handlers are sensitive to duplicate packets these skbs will
+ * need to be dropped at the handler. The vlan accel path may
+ * have already set PACKET_DROP.
+ */
null_or_orig = NULL;
orig_dev = skb->dev;
master = ACCESS_ONCE(orig_dev->master);
- if (master) {
- if (skb_bond_should_drop(skb, master))
+ if (skb->pkt_type == PACKET_DROP)
+ null_or_orig = orig_dev;
+ else if (master) {
+ if (skb_bond_should_drop(skb, master)) {
+ skb->pkt_type = PACKET_DROP;
null_or_orig = orig_dev; /* deliver only exact match */
- else
+ } else
skb->dev = master;
}
@@ -2849,10 +2860,10 @@ ncls:
* device that may have registered for a specific ptype. The
* handler may have to adjust skb->dev and orig_dev.
*/
- null_or_bond = NULL;
+ dev_or_bond = skb->dev;
if ((skb->dev->priv_flags & IFF_802_1Q_VLAN) &&
(vlan_dev_real_dev(skb->dev)->priv_flags & IFF_BONDING)) {
- null_or_bond = vlan_dev_real_dev(skb->dev);
+ dev_or_bond = vlan_dev_real_dev(skb->dev);
}
type = skb->protocol;
@@ -2860,7 +2871,7 @@ ncls:
&ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
if (ptype->type == type && (ptype->dev == null_or_orig ||
ptype->dev == skb->dev || ptype->dev == orig_dev ||
- ptype->dev == null_or_bond)) {
+ ptype->dev == dev_or_bond)) {
if (pt_prev)
ret = deliver_skb(skb, pt_prev, orig_dev);
pt_prev = ptype;
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] net: deliver skbs on inactive slaves to exact matches
2010-05-06 17:24 [PATCH] net: deliver skbs on inactive slaves to exact matches John Fastabend
@ 2010-05-06 18:01 ` Jay Vosburgh
2010-05-06 18:37 ` John Fastabend
0 siblings, 1 reply; 4+ messages in thread
From: Jay Vosburgh @ 2010-05-06 18:01 UTC (permalink / raw)
To: John Fastabend; +Cc: bonding-devel, netdev, christopher.leech, andy, kaber
John Fastabend <john.r.fastabend@intel.com> wrote:
>Currently, the accelerated receive path for VLAN's will
>drop packets if the real device is an inactive slave and
>is not one of the special pkts tested for in
>skb_bond_should_drop(). This behavior is different then
>the non-accelerated path and for pkts over a bonded vlan.
>
>For example,
>
>vlanx -> bond0 -> ethx
>
>will be dropped in the vlan path and not delivered to any
>packet handlers. However,
>
>bond0 -> vlanx -> ethx
>
>will be delivered to handlers that match the exact dev,
>because the VLAN path checks the real_dev which is not a
>slave and netif_recv_skb() doesn't drop frames but only
>delivers them to exact matches.
>
>This patch adds a pkt_type PACKET_DROP which is now used
>to identify skbs that would previously been dropped and
>allows the skb to continue to skb_netif_recv(). Here we
>add logic to check for PACKET_DROP and if so only deliver
>to handlers that match exactly. IMHO this is more
>consistent and gives pkt handlers a way to identify skbs
>that come from inactive slaves.
I was looking at this some yesterday and this morning, trying to
figure out if a packet going up the IP protocol stack with pkt_type ==
PACKET_DROP would break anything. For example:
net/ipv4/ip_options.c:
int ip_options_rcv_srr(struct sk_buff *skb)
{
[...]
if (!opt->srr)
return 0;
if (skb->pkt_type != PACKET_HOST)
return -EINVAL;
or:
net/ipv4/tcp_ipv4.c:
int tcp_v4_rcv(struct sk_buff *skb)
{
const struct iphdr *iph;
struct tcphdr *th;
struct sock *sk;
int ret;
struct net *net = dev_net(skb->dev);
if (skb->pkt_type != PACKET_HOST)
goto discard_it;
Is pkt_type == PACKET_DROP instead going to break things for
these, or the other similar cases?
I think what you're looking for is really the usual PACKET_HOST,
PACKET_OTHERHOST, et al, upper protocol behavior, with the exception
that at the __netif_receive_skb level, wildcard matches in the ptypes
are not done. Setting the pkt_type to PACKET_DROP loses the _HOST,
_OTHERHOST, etc, information, but sends the packet up the stack anyway,
and I'm not sure that's really the right thing to do.
-J
>This allows a third case to function which is important for
>doing multipath with FCoE traffic while LAN traffic bonded,
>
>bond0 -> ethx
> |
>vlanx -> --
>
>Here the vlan is not in bond0 but the FCoE handler can still
>receive the skb. Previously these skbs were dropped.
>
>I have tested the following 4 configurations in failover modes
>and load balancing modes and have not seen any duplicate packets
>or unexpected bahavior.
>
># bond0 -> ethx
>
># vlanx -> bond0 -> ethx
>
># bond0 -> vlanx -> ethx
>
># bond0 -> ethx
> |
> vlanx -> --
>
>Also this removes the PACKET_FASTROUTE define which was not being
>used.
>
>Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>---
>
> include/linux/if_packet.h | 2 +-
> net/8021q/vlan_core.c | 4 ++--
> net/core/dev.c | 25 ++++++++++++++++++-------
> 3 files changed, 21 insertions(+), 10 deletions(-)
>
>diff --git a/include/linux/if_packet.h b/include/linux/if_packet.h
>index 6ac23ef..9d079fa 100644
>--- a/include/linux/if_packet.h
>+++ b/include/linux/if_packet.h
>@@ -28,7 +28,7 @@ struct sockaddr_ll {
> #define PACKET_OUTGOING 4 /* Outgoing of any type */
> /* These ones are invisible by user level */
> #define PACKET_LOOPBACK 5 /* MC/BRD frame looped back */
>-#define PACKET_FASTROUTE 6 /* Fastrouted frame */
>+#define PACKET_DROP 6 /* Drop packet */
>
> /* Packet socket options */
>
>diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>index c584a0a..4510e08 100644
>--- a/net/8021q/vlan_core.c
>+++ b/net/8021q/vlan_core.c
>@@ -12,7 +12,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
> return NET_RX_DROP;
>
> if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
>- goto drop;
>+ skb->pkt_type = PACKET_DROP;
>
> skb->skb_iif = skb->dev->ifindex;
> __vlan_hwaccel_put_tag(skb, vlan_tci);
>@@ -84,7 +84,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
> struct sk_buff *p;
>
> if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
>- goto drop;
>+ skb->pkt_type = PACKET_DROP;
>
> skb->skb_iif = skb->dev->ifindex;
> __vlan_hwaccel_put_tag(skb, vlan_tci);
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 36d53be..cefac4f 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -2776,7 +2776,7 @@ static int __netif_receive_skb(struct sk_buff *skb)
> struct net_device *orig_dev;
> struct net_device *master;
> struct net_device *null_or_orig;
>- struct net_device *null_or_bond;
>+ struct net_device *dev_or_bond;
> int ret = NET_RX_DROP;
> __be16 type;
>
>@@ -2793,13 +2793,24 @@ static int __netif_receive_skb(struct sk_buff *skb)
> if (!skb->skb_iif)
> skb->skb_iif = skb->dev->ifindex;
>
>+ /*
>+ * bonding note: skbs received on inactive slaves should only
>+ * be delivered to pkt handlers that are exact matches. Also
>+ * the pkt_type field will be marked PACKET_DROP. If packet
>+ * handlers are sensitive to duplicate packets these skbs will
>+ * need to be dropped at the handler. The vlan accel path may
>+ * have already set PACKET_DROP.
>+ */
> null_or_orig = NULL;
> orig_dev = skb->dev;
> master = ACCESS_ONCE(orig_dev->master);
>- if (master) {
>- if (skb_bond_should_drop(skb, master))
>+ if (skb->pkt_type == PACKET_DROP)
>+ null_or_orig = orig_dev;
>+ else if (master) {
>+ if (skb_bond_should_drop(skb, master)) {
>+ skb->pkt_type = PACKET_DROP;
> null_or_orig = orig_dev; /* deliver only exact match */
>- else
>+ } else
> skb->dev = master;
> }
>
>@@ -2849,10 +2860,10 @@ ncls:
> * device that may have registered for a specific ptype. The
> * handler may have to adjust skb->dev and orig_dev.
> */
>- null_or_bond = NULL;
>+ dev_or_bond = skb->dev;
> if ((skb->dev->priv_flags & IFF_802_1Q_VLAN) &&
> (vlan_dev_real_dev(skb->dev)->priv_flags & IFF_BONDING)) {
>- null_or_bond = vlan_dev_real_dev(skb->dev);
>+ dev_or_bond = vlan_dev_real_dev(skb->dev);
> }
>
> type = skb->protocol;
>@@ -2860,7 +2871,7 @@ ncls:
> &ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
> if (ptype->type == type && (ptype->dev == null_or_orig ||
> ptype->dev == skb->dev || ptype->dev == orig_dev ||
>- ptype->dev == null_or_bond)) {
>+ ptype->dev == dev_or_bond)) {
> if (pt_prev)
> ret = deliver_skb(skb, pt_prev, orig_dev);
> pt_prev = ptype;
>
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] net: deliver skbs on inactive slaves to exact matches
2010-05-06 18:01 ` Jay Vosburgh
@ 2010-05-06 18:37 ` John Fastabend
2010-05-08 0:15 ` John Fastabend
0 siblings, 1 reply; 4+ messages in thread
From: John Fastabend @ 2010-05-06 18:37 UTC (permalink / raw)
To: Jay Vosburgh
Cc: bonding-devel@lists.sourceforge.net, netdev@vger.kernel.org,
Leech, Christopher, andy@greyhouse.net, kaber@trash.net
Jay Vosburgh wrote:
> John Fastabend <john.r.fastabend@intel.com> wrote:
>
>> Currently, the accelerated receive path for VLAN's will
>> drop packets if the real device is an inactive slave and
>> is not one of the special pkts tested for in
>> skb_bond_should_drop(). This behavior is different then
>> the non-accelerated path and for pkts over a bonded vlan.
>>
>> For example,
>>
>> vlanx -> bond0 -> ethx
>>
>> will be dropped in the vlan path and not delivered to any
>> packet handlers. However,
>>
>> bond0 -> vlanx -> ethx
>>
>> will be delivered to handlers that match the exact dev,
>> because the VLAN path checks the real_dev which is not a
>> slave and netif_recv_skb() doesn't drop frames but only
>> delivers them to exact matches.
>>
>> This patch adds a pkt_type PACKET_DROP which is now used
>> to identify skbs that would previously been dropped and
>> allows the skb to continue to skb_netif_recv(). Here we
>> add logic to check for PACKET_DROP and if so only deliver
>> to handlers that match exactly. IMHO this is more
>> consistent and gives pkt handlers a way to identify skbs
>> that come from inactive slaves.
>
> I was looking at this some yesterday and this morning, trying to
> figure out if a packet going up the IP protocol stack with pkt_type ==
> PACKET_DROP would break anything. For example:
>
> net/ipv4/ip_options.c:
> int ip_options_rcv_srr(struct sk_buff *skb)
> {
> [...]
> if (!opt->srr)
> return 0;
>
> if (skb->pkt_type != PACKET_HOST)
> return -EINVAL;
>
> or:
>
> net/ipv4/tcp_ipv4.c:
> int tcp_v4_rcv(struct sk_buff *skb)
> {
> const struct iphdr *iph;
> struct tcphdr *th;
> struct sock *sk;
> int ret;
> struct net *net = dev_net(skb->dev);
>
> if (skb->pkt_type != PACKET_HOST)
> goto discard_it;
>
> Is pkt_type == PACKET_DROP instead going to break things for
> these, or the other similar cases?
>
> I think what you're looking for is really the usual PACKET_HOST,
> PACKET_OTHERHOST, et al, upper protocol behavior, with the exception
> that at the __netif_receive_skb level, wildcard matches in the ptypes
> are not done. Setting the pkt_type to PACKET_DROP loses the _HOST,
> _OTHERHOST, etc, information, but sends the packet up the stack anyway,
> and I'm not sure that's really the right thing to do.
Because we wouldn't be doing wildcard matches the skb shouldn't be
passed to the IP protocol stack. Really what we want is the ip_rcv() to
drop the skb in this case,
net/ipv4/ip_input.c
int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct
packet_type *pt, struct net_device *orig_dev)
{
[...]
if (skb->pkt_type == PACKET_OTHERHOST ||
skb->pkt_type == PACKET_DROP)
goto drop;
We do lose some information about the packet _HOST, _OTHERHOST, etc, but
we also gain something namely that the packet was received on an
inactive slave. Currently, we pass these frames up the stack (for non
wildcard matches) without any indication that they were received on an
inactive slave.
What we need is a way for the VLAN path to flag skbs so that wildcard
matches in the ptyes are not done. Also I think it may be valuable for
the packet handler to be able to determine if the skb is coming from an
inactive slave. I think using the pkt_type field might be OK any ideas
for a better field?
Thanks,
John
>
> -J
>
>> This allows a third case to function which is important for
>> doing multipath with FCoE traffic while LAN traffic bonded,
>>
>> bond0 -> ethx
>> |
>> vlanx -> --
>>
>> Here the vlan is not in bond0 but the FCoE handler can still
>> receive the skb. Previously these skbs were dropped.
>>
>> I have tested the following 4 configurations in failover modes
>> and load balancing modes and have not seen any duplicate packets
>> or unexpected bahavior.
>>
>> # bond0 -> ethx
>>
>> # vlanx -> bond0 -> ethx
>>
>> # bond0 -> vlanx -> ethx
>>
>> # bond0 -> ethx
>> |
>> vlanx -> --
>>
>> Also this removes the PACKET_FASTROUTE define which was not being
>> used.
>>
>> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
>> ---
>>
>> include/linux/if_packet.h | 2 +-
>> net/8021q/vlan_core.c | 4 ++--
>> net/core/dev.c | 25 ++++++++++++++++++-------
>> 3 files changed, 21 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/linux/if_packet.h b/include/linux/if_packet.h
>> index 6ac23ef..9d079fa 100644
>> --- a/include/linux/if_packet.h
>> +++ b/include/linux/if_packet.h
>> @@ -28,7 +28,7 @@ struct sockaddr_ll {
>> #define PACKET_OUTGOING 4 /* Outgoing of any type */
>> /* These ones are invisible by user level */
>> #define PACKET_LOOPBACK 5 /* MC/BRD frame looped back */
>> -#define PACKET_FASTROUTE 6 /* Fastrouted frame */
>> +#define PACKET_DROP 6 /* Drop packet */
>>
>> /* Packet socket options */
>>
>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> index c584a0a..4510e08 100644
>> --- a/net/8021q/vlan_core.c
>> +++ b/net/8021q/vlan_core.c
>> @@ -12,7 +12,7 @@ int __vlan_hwaccel_rx(struct sk_buff *skb, struct vlan_group *grp,
>> return NET_RX_DROP;
>>
>> if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
>> - goto drop;
>> + skb->pkt_type = PACKET_DROP;
>>
>> skb->skb_iif = skb->dev->ifindex;
>> __vlan_hwaccel_put_tag(skb, vlan_tci);
>> @@ -84,7 +84,7 @@ vlan_gro_common(struct napi_struct *napi, struct vlan_group *grp,
>> struct sk_buff *p;
>>
>> if (skb_bond_should_drop(skb, ACCESS_ONCE(skb->dev->master)))
>> - goto drop;
>> + skb->pkt_type = PACKET_DROP;
>>
>> skb->skb_iif = skb->dev->ifindex;
>> __vlan_hwaccel_put_tag(skb, vlan_tci);
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 36d53be..cefac4f 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -2776,7 +2776,7 @@ static int __netif_receive_skb(struct sk_buff *skb)
>> struct net_device *orig_dev;
>> struct net_device *master;
>> struct net_device *null_or_orig;
>> - struct net_device *null_or_bond;
>> + struct net_device *dev_or_bond;
>> int ret = NET_RX_DROP;
>> __be16 type;
>>
>> @@ -2793,13 +2793,24 @@ static int __netif_receive_skb(struct sk_buff *skb)
>> if (!skb->skb_iif)
>> skb->skb_iif = skb->dev->ifindex;
>>
>> + /*
>> + * bonding note: skbs received on inactive slaves should only
>> + * be delivered to pkt handlers that are exact matches. Also
>> + * the pkt_type field will be marked PACKET_DROP. If packet
>> + * handlers are sensitive to duplicate packets these skbs will
>> + * need to be dropped at the handler. The vlan accel path may
>> + * have already set PACKET_DROP.
>> + */
>> null_or_orig = NULL;
>> orig_dev = skb->dev;
>> master = ACCESS_ONCE(orig_dev->master);
>> - if (master) {
>> - if (skb_bond_should_drop(skb, master))
>> + if (skb->pkt_type == PACKET_DROP)
>> + null_or_orig = orig_dev;
>> + else if (master) {
>> + if (skb_bond_should_drop(skb, master)) {
>> + skb->pkt_type = PACKET_DROP;
>> null_or_orig = orig_dev; /* deliver only exact match */
>> - else
>> + } else
>> skb->dev = master;
>> }
>>
>> @@ -2849,10 +2860,10 @@ ncls:
>> * device that may have registered for a specific ptype. The
>> * handler may have to adjust skb->dev and orig_dev.
>> */
>> - null_or_bond = NULL;
>> + dev_or_bond = skb->dev;
>> if ((skb->dev->priv_flags & IFF_802_1Q_VLAN) &&
>> (vlan_dev_real_dev(skb->dev)->priv_flags & IFF_BONDING)) {
>> - null_or_bond = vlan_dev_real_dev(skb->dev);
>> + dev_or_bond = vlan_dev_real_dev(skb->dev);
>> }
>>
>> type = skb->protocol;
>> @@ -2860,7 +2871,7 @@ ncls:
>> &ptype_base[ntohs(type) & PTYPE_HASH_MASK], list) {
>> if (ptype->type == type && (ptype->dev == null_or_orig ||
>> ptype->dev == skb->dev || ptype->dev == orig_dev ||
>> - ptype->dev == null_or_bond)) {
>> + ptype->dev == dev_or_bond)) {
>> if (pt_prev)
>> ret = deliver_skb(skb, pt_prev, orig_dev);
>> pt_prev = ptype;
>>
>
> ---
> -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] net: deliver skbs on inactive slaves to exact matches
2010-05-06 18:37 ` John Fastabend
@ 2010-05-08 0:15 ` John Fastabend
0 siblings, 0 replies; 4+ messages in thread
From: John Fastabend @ 2010-05-08 0:15 UTC (permalink / raw)
To: Jay Vosburgh
Cc: bonding-devel@lists.sourceforge.net, netdev@vger.kernel.org,
Leech, Christopher, andy@greyhouse.net, kaber@trash.net
John Fastabend wrote:
> Jay Vosburgh wrote:
>> John Fastabend <john.r.fastabend@intel.com> wrote:
>>
>>> Currently, the accelerated receive path for VLAN's will
>>> drop packets if the real device is an inactive slave and
>>> is not one of the special pkts tested for in
>>> skb_bond_should_drop(). This behavior is different then
>>> the non-accelerated path and for pkts over a bonded vlan.
>>>
>>> For example,
>>>
>>> vlanx -> bond0 -> ethx
>>>
>>> will be dropped in the vlan path and not delivered to any
>>> packet handlers. However,
>>>
>>> bond0 -> vlanx -> ethx
>>>
>>> will be delivered to handlers that match the exact dev,
>>> because the VLAN path checks the real_dev which is not a
>>> slave and netif_recv_skb() doesn't drop frames but only
>>> delivers them to exact matches.
>>>
>>> This patch adds a pkt_type PACKET_DROP which is now used
>>> to identify skbs that would previously been dropped and
>>> allows the skb to continue to skb_netif_recv(). Here we
>>> add logic to check for PACKET_DROP and if so only deliver
>>> to handlers that match exactly. IMHO this is more
>>> consistent and gives pkt handlers a way to identify skbs
>>> that come from inactive slaves.
>> I was looking at this some yesterday and this morning, trying to
>> figure out if a packet going up the IP protocol stack with pkt_type ==
>> PACKET_DROP would break anything. For example:
>>
>> net/ipv4/ip_options.c:
>> int ip_options_rcv_srr(struct sk_buff *skb)
>> {
>> [...]
>> if (!opt->srr)
>> return 0;
>>
>> if (skb->pkt_type != PACKET_HOST)
>> return -EINVAL;
>>
>> or:
>>
>> net/ipv4/tcp_ipv4.c:
>> int tcp_v4_rcv(struct sk_buff *skb)
>> {
>> const struct iphdr *iph;
>> struct tcphdr *th;
>> struct sock *sk;
>> int ret;
>> struct net *net = dev_net(skb->dev);
>>
>> if (skb->pkt_type != PACKET_HOST)
>> goto discard_it;
>>
>> Is pkt_type == PACKET_DROP instead going to break things for
>> these, or the other similar cases?
>>
>> I think what you're looking for is really the usual PACKET_HOST,
>> PACKET_OTHERHOST, et al, upper protocol behavior, with the exception
>> that at the __netif_receive_skb level, wildcard matches in the ptypes
>> are not done. Setting the pkt_type to PACKET_DROP loses the _HOST,
>> _OTHERHOST, etc, information, but sends the packet up the stack anyway,
>> and I'm not sure that's really the right thing to do.
>
> Because we wouldn't be doing wildcard matches the skb shouldn't be
> passed to the IP protocol stack. Really what we want is the ip_rcv() to
> drop the skb in this case,
>
> net/ipv4/ip_input.c
> int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct
> packet_type *pt, struct net_device *orig_dev)
> {
> [...]
>
> if (skb->pkt_type == PACKET_OTHERHOST ||
> skb->pkt_type == PACKET_DROP)
> goto drop;
>
>
> We do lose some information about the packet _HOST, _OTHERHOST, etc, but
> we also gain something namely that the packet was received on an
> inactive slave. Currently, we pass these frames up the stack (for non
> wildcard matches) without any indication that they were received on an
> inactive slave.
>
> What we need is a way for the VLAN path to flag skbs so that wildcard
> matches in the ptyes are not done. Also I think it may be valuable for
> the packet handler to be able to determine if the skb is coming from an
> inactive slave. I think using the pkt_type field might be OK any ideas
> for a better field?
>
> Thanks,
> John
>
Another possibility which would keep the pkt_type info would be to add a
flag to the flags2 field in the sk_buff struct. Seeing as there is
already a u8 there for ndisc_nodetype this should work.
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 82f5116..bb1bc22 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -374,8 +374,13 @@ struct sk_buff {
kmemcheck_bitfield_begin(flags2);
__u16 queue_mapping:16;
-#ifdef CONFIG_IPV6_NDISC_NODETYPE
+#if defined(CONTIF_IPV6_NDISC_NODETYPE) && defined(CONFIG_BONDING)
+ __u8 ndisc_nodetype:2,
+ bond_should_drop:1;
+#elif defined(CONFIG_IPV6_NDISC_NODETYPE)
__u8 ndisc_nodetype:2;
+#elif defined(CONFIG_BONDING)
+ __u8 bond_should_drop:1;
#endif
kmemcheck_bitfield_end(flags2);
Thanks,
John
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2010-05-08 0:15 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-05-06 17:24 [PATCH] net: deliver skbs on inactive slaves to exact matches John Fastabend
2010-05-06 18:01 ` Jay Vosburgh
2010-05-06 18:37 ` John Fastabend
2010-05-08 0:15 ` John Fastabend
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).