netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* skb_checksum_help() vs GSO
@ 2012-01-12 20:42 Ben Hutchings
  2012-01-12 22:03 ` Herbert Xu
  2012-01-12 22:10 ` skb_checksum_help() vs GSO Stephen Hemminger
  0 siblings, 2 replies; 8+ messages in thread
From: Ben Hutchings @ 2012-01-12 20:42 UTC (permalink / raw)
  To: Herbert Xu; +Cc: netdev, Stephen Hemminger, netfilter-devel

skb_checksum_help() does:

	if (unlikely(skb_shinfo(skb)->gso_size)) {
		/* Let GSO fix up the checksum. */
		goto out_set_summed;
	}
...
out_set_summed:
	skb->ip_summed = CHECKSUM_NONE;
out:
	return ret;

but skb_gso_segment() requires that skb->ip_summed == CHECKSUM_PARTIAL
and WARNs if not.  I don't think there's any case where it's valid to
call both.  Shouldn't skb_checksum_help() also WARN and return an error
code instead of muddling on?

Inspecting the callers of skb_checksum_help(), it looks like sch_netem's
'corrupt' option and xt_CHECKSUM might trigger this case.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: skb_checksum_help() vs GSO
  2012-01-12 20:42 skb_checksum_help() vs GSO Ben Hutchings
@ 2012-01-12 22:03 ` Herbert Xu
  2012-01-13  0:57   ` [RFC] netem: de-GSO packets before enqueing Stephen Hemminger
  2012-01-12 22:10 ` skb_checksum_help() vs GSO Stephen Hemminger
  1 sibling, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2012-01-12 22:03 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Stephen Hemminger, netfilter-devel

On Thu, Jan 12, 2012 at 08:42:10PM +0000, Ben Hutchings wrote:
> skb_checksum_help() does:
> 
> 	if (unlikely(skb_shinfo(skb)->gso_size)) {
> 		/* Let GSO fix up the checksum. */
> 		goto out_set_summed;
> 	}
> ...
> out_set_summed:
> 	skb->ip_summed = CHECKSUM_NONE;
> out:
> 	return ret;
> 
> but skb_gso_segment() requires that skb->ip_summed == CHECKSUM_PARTIAL
> and WARNs if not.  I don't think there's any case where it's valid to
> call both.  Shouldn't skb_checksum_help() also WARN and return an error
> code instead of muddling on?

I think so.

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: skb_checksum_help() vs GSO
  2012-01-12 20:42 skb_checksum_help() vs GSO Ben Hutchings
  2012-01-12 22:03 ` Herbert Xu
@ 2012-01-12 22:10 ` Stephen Hemminger
  1 sibling, 0 replies; 8+ messages in thread
From: Stephen Hemminger @ 2012-01-12 22:10 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Herbert Xu, netdev, netfilter-devel

On Thu, 12 Jan 2012 20:42:10 +0000
Ben Hutchings <bhutchings@solarflare.com> wrote:

> skb_checksum_help() does:
> 
> 	if (unlikely(skb_shinfo(skb)->gso_size)) {
> 		/* Let GSO fix up the checksum. */
> 		goto out_set_summed;
> 	}
> ...
> out_set_summed:
> 	skb->ip_summed = CHECKSUM_NONE;
> out:
> 	return ret;
> 
> but skb_gso_segment() requires that skb->ip_summed == CHECKSUM_PARTIAL
> and WARNs if not.  I don't think there's any case where it's valid to
> call both.  Shouldn't skb_checksum_help() also WARN and return an error
> code instead of muddling on?
> 
> Inspecting the callers of skb_checksum_help(), it looks like sch_netem's
> 'corrupt' option and xt_CHECKSUM might trigger this case.
> 
> Ben.

Netem needs to check for GSO manually segment before calling skb_checksum_help.
I'll sort it out.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC] netem:  de-GSO packets before enqueing
  2012-01-12 22:03 ` Herbert Xu
@ 2012-01-13  0:57   ` Stephen Hemminger
  2012-01-13 15:54     ` Eric Dumazet
  2012-01-14 16:06     ` Hagen Paul Pfeifer
  0 siblings, 2 replies; 8+ messages in thread
From: Stephen Hemminger @ 2012-01-13  0:57 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Ben Hutchings, netdev, netfilter-devel

Probably something like this is needed (untested).

This issue was discovered when looking at the skb_checksum path for the
netem corruption operation, but it is a general problem.
Network emulation operations like corruption and drop want to operate
on a per-packet (not per-segment) basis. This patch does GSO in software
if necessary to break up packets.  Code is similar to logic in xfrm_output.

Although it appears that the operation is not work conserving, it is okay
because the higher level qdisc operations account for packets by incrementing
by gso_size.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/net/sched/sch_netem.c	2012-01-12 14:57:59.218796226 -0800
+++ b/net/sched/sch_netem.c	2012-01-12 15:21:07.614162912 -0800
@@ -128,6 +128,8 @@ struct netem_skb_cb {
 	psched_time_t	time_to_send;
 };
 
+static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch);
+
 static inline struct netem_skb_cb *netem_skb_cb(struct sk_buff *skb)
 {
 	BUILD_BUG_ON(sizeof(skb->cb) <
@@ -355,6 +357,41 @@ static int tfifo_enqueue(struct sk_buff
 	return qdisc_reshape_fail(nskb, sch);
 }
 
+static int netem_enqueue_gso(struct sk_buff *skb, struct Qdisc *sch)
+{
+	struct sk_buff *segs;
+	int rc;
+
+	segs = skb_gso_segment(skb, 0);
+	kfree_skb(skb);
+
+	if (IS_ERR(segs)) {
+		sch->qstats.drops++;
+		return NET_XMIT_DROP;
+	}
+
+	do {
+		struct sk_buff *nskb = segs->next;
+		int ret;
+
+		segs->next = NULL;
+		ret = netem_enqueue(segs, sch);
+		if (ret == NET_XMIT_DROP) {
+			while ((segs = nskb)) {
+				nskb = segs->next;
+				segs->next = NULL;
+				kfree_skb(segs);
+			}
+			return ret;
+		}
+
+		segs = nskb;
+	} while (segs);
+
+	return NET_XMIT_SUCCESS;
+}
+
+
 /*
  * Insert one skb into qdisc.
  * Note: parent depends on return value to account for queue length.
@@ -370,6 +407,10 @@ static int netem_enqueue(struct sk_buff
 	int ret;
 	int count = 1;
 
+	/* Want to operate on per-packet basis */
+	if (skb_is_gso(skb))
+		return netem_enqueue_gso(skb, sch);
+
 	/* Random duplication */
 	if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor))
 		++count;

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] netem:  de-GSO packets before enqueing
  2012-01-13  0:57   ` [RFC] netem: de-GSO packets before enqueing Stephen Hemminger
@ 2012-01-13 15:54     ` Eric Dumazet
  2012-01-13 19:20       ` Stephen Hemminger
  2012-01-14 16:06     ` Hagen Paul Pfeifer
  1 sibling, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2012-01-13 15:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Herbert Xu, Ben Hutchings, netdev, netfilter-devel

Le jeudi 12 janvier 2012 à 16:57 -0800, Stephen Hemminger a écrit :
> Probably something like this is needed (untested).
> 
> This issue was discovered when looking at the skb_checksum path for the
> netem corruption operation, but it is a general problem.
> Network emulation operations like corruption and drop want to operate
> on a per-packet (not per-segment) basis. This patch does GSO in software
> if necessary to break up packets.  Code is similar to logic in xfrm_output.
> 
> Although it appears that the operation is not work conserving, it is okay
> because the higher level qdisc operations account for packets by incrementing
> by gso_size.

gso_size ok, but what about qlen ?

We end up splitting one GSO skb in multiple segments, but shouldnt we
instruct upper qdisc(s) that qlen was increased, sort of
qdisc_tree_increase_qlen() call ?



--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] netem:  de-GSO packets before enqueing
  2012-01-13 15:54     ` Eric Dumazet
@ 2012-01-13 19:20       ` Stephen Hemminger
  2012-01-13 23:44         ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Stephen Hemminger @ 2012-01-13 19:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Herbert Xu, Ben Hutchings, netdev, netfilter-devel

On Fri, 13 Jan 2012 16:54:57 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le jeudi 12 janvier 2012 à 16:57 -0800, Stephen Hemminger a écrit :
> > Probably something like this is needed (untested).
> > 
> > This issue was discovered when looking at the skb_checksum path for the
> > netem corruption operation, but it is a general problem.
> > Network emulation operations like corruption and drop want to operate
> > on a per-packet (not per-segment) basis. This patch does GSO in software
> > if necessary to break up packets.  Code is similar to logic in xfrm_output.
> > 
> > Although it appears that the operation is not work conserving, it is okay
> > because the higher level qdisc operations account for packets by incrementing
> > by gso_size.
> 
> gso_size ok, but what about qlen ?
> 
> We end up splitting one GSO skb in multiple segments, but shouldnt we
> instruct upper qdisc(s) that qlen was increased, sort of
> qdisc_tree_increase_qlen() call ?
> 
> 
> 

Also, does pkt_len need to be set in cb after de-gso?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] netem:  de-GSO packets before enqueing
  2012-01-13 19:20       ` Stephen Hemminger
@ 2012-01-13 23:44         ` Eric Dumazet
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2012-01-13 23:44 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Herbert Xu, Ben Hutchings, netdev, netfilter-devel

Le vendredi 13 janvier 2012 à 11:20 -0800, Stephen Hemminger a écrit :

> Also, does pkt_len need to be set in cb after de-gso?

Yes, most probably :)


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] netem:  de-GSO packets before enqueing
  2012-01-13  0:57   ` [RFC] netem: de-GSO packets before enqueing Stephen Hemminger
  2012-01-13 15:54     ` Eric Dumazet
@ 2012-01-14 16:06     ` Hagen Paul Pfeifer
  1 sibling, 0 replies; 8+ messages in thread
From: Hagen Paul Pfeifer @ 2012-01-14 16:06 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Herbert Xu, Ben Hutchings, netdev, netfilter-devel

* Stephen Hemminger | 2012-01-12 16:57:20 [-0800]:

>Probably something like this is needed (untested).
>
>This issue was discovered when looking at the skb_checksum path for the
>netem corruption operation, but it is a general problem.
>Network emulation operations like corruption and drop want to operate
>on a per-packet (not per-segment) basis. This patch does GSO in software
>if necessary to break up packets.  Code is similar to logic in xfrm_output.
>
>Although it appears that the operation is not work conserving, it is okay
>because the higher level qdisc operations account for packets by incrementing
>by gso_size.

Thanks Stephen! Corruption is currently unusable if GSO is enabled. We disable
GSO/TSO on all our test machines therefore.

I have a larger set of patches for the corruption option for this merge
window. We will now test your patch (and the hopefully following qdisc len
updated patch).

Hagen

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-01-14 16:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-01-12 20:42 skb_checksum_help() vs GSO Ben Hutchings
2012-01-12 22:03 ` Herbert Xu
2012-01-13  0:57   ` [RFC] netem: de-GSO packets before enqueing Stephen Hemminger
2012-01-13 15:54     ` Eric Dumazet
2012-01-13 19:20       ` Stephen Hemminger
2012-01-13 23:44         ` Eric Dumazet
2012-01-14 16:06     ` Hagen Paul Pfeifer
2012-01-12 22:10 ` skb_checksum_help() vs GSO Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).