From: Ben Greear <greearb@candelatech.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: netdev <netdev@vger.kernel.org>, Evan Jones <ej@evanjones.ca>,
	Vijay P <vijayp@vijayp.ca>, Cong Wang <cwang@twopensource.com>
Subject: Re: veth regression with "don’t modify ip_summed; doing so treats packets with bad checksums as good."
Date: Thu, 24 Mar 2016 18:11:18 -0700
Message-ID: <56F49036.8050902@candelatech.com>
In-Reply-To: <56F4810A.9060904@candelatech.com>

On 03/24/2016 05:06 PM, Ben Greear wrote:
> On 03/24/2016 04:56 PM, Cong Wang wrote:
>> On Thu, Mar 24, 2016 at 3:01 PM, Ben Greear <greearb@candelatech.com> wrote:
>>> I have an application that creates two pairs of veth devices.
>>>
>>> a <-> b       c <-> d
>>>
>>> b and c have a raw packet socket opened on them and I 'bridge' frames
>>> between b and c to provide network emulation (ie, configurable delay).
>>>
>>
>> IIUC, you create two raw sockets in order to bridge these two veth pairs?
>> That is, to receive packets on one socket and deliver packets on the other?
>
> Yes.
>
>>> I put IP 1.1.1.1/24 on a, 1.1.1.2/24 on d, and then create a UDP connection
>>> (using policy based routing to ensure frames are sent on the appropriate
>>> interfaces).
>>>
>>> This is a user-space-only app, and the kernel in this case is completely
>>> unmodified.
>>>
>>> The commit below breaks this feature:  UDP frames are sniffed on both
>>> a and d ports (in both directions), but the UDP socket does not receive
>>> frames.
>>>
>>> Using normal ethernet ports, this network emulation feature works fine,
>>> so it is specific to VETH.
>>>
>>> A similar test with just sending UDP between a single veth pair:  e <-> f
>>> works fine.  Maybe it has something to do with raw packets?
>>>
>>
>> Yeah, I have the same feeling. Could you trace kfree_skb() to see
>> where these packets are dropped? At UDP layer?
>
> Since reverting the patch fixes this, it almost certainly has to be due to some
> checksum checking logic.  Since UDP sockets (between a single veth pair)
> work, it would appear to be related to my packet bridge, so maybe
> it is specific to raw packets and/or the sendmmsg api.
>
> I'll investigate it better tomorrow.

So, I found time to poke at it this evening:

Sending between two veth pairs, with no packet bridge involved:

UDP:  ip_summed is 3 (CHECKSUM_PARTIAL)   # Works fine.
Raw packet frames, custom ether protocol (type 0x1111):  ip_summed is 0 (CHECKSUM_NONE)   # Works fine.

When I try to send UDP through the veth pairs and the pkt bridge, I see this
(the pkt bridge connects to the 'b' side of each veth pair):

Mar 24 17:59:34 ben-ota-1 kernel: dev: rddVR0  rcv: rddVR0b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:34 ben-ota-1 kernel: dev: rddVR1b  rcv: rddVR1  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:34 ben-ota-1 kernel: dev: rddVR1  rcv: rddVR1b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:34 ben-ota-1 kernel: dev: rddVR0b  rcv: rddVR0  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:34 ben-ota-1 kernel: dev: rddVR0  rcv: rddVR0b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:34 ben-ota-1 kernel: dev: rddVR1b  rcv: rddVR1  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR1  rcv: rddVR1b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR0b  rcv: rddVR0  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR0  rcv: rddVR0b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR1b  rcv: rddVR1  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR1  rcv: rddVR1b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR0b  rcv: rddVR0  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR0  rcv: rddVR0b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR1b  rcv: rddVR1  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR1  rcv: rddVR1b  ip_summed: 3  rcv-features: 0x184074011e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR0b  rcv: rddVR0  ip_summed: 0  rcv-features: 0x184075b59e9
Mar 24 17:59:35 ben-ota-1 kernel: dev: rddVR0  rcv: rddVR0b  ip_summed: 3  rcv-features: 0x184074011e9


My guess is that when my pkt bridge sends a raw frame that is actually a UDP packet,
the frame has ip_summed == 0 (CHECKSUM_NONE) in the kernel, and that causes it to be dropped.


I modified veth.c like this for this test:

static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
{
	struct veth_priv *priv = netdev_priv(dev);
	struct net_device *rcv;
	int length = skb->len;

	rcu_read_lock();
	rcv = rcu_dereference(priv->peer);
	if (unlikely(!rcv)) {
		kfree_skb(skb);
		goto drop;
	}

	pr_err("dev: %s  rcv: %s  ip_summed: %d  rcv-features: 0x%llx\n",
	       dev->name, rcv->name, skb->ip_summed, (unsigned long long)rcv->features);
#if 0
	/* don't change ip_summed == CHECKSUM_PARTIAL, as that
	 * will cause bad checksum on forwarded packets
	 */
	if (skb->ip_summed == CHECKSUM_NONE &&
	    rcv->features & NETIF_F_RXCSUM)
		skb->ip_summed = CHECKSUM_UNNECESSARY;
#endif


Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

