From: Daniel Borkmann <dborkman@redhat.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jiri Pirko <jiri@resnulli.us>,
Paul Pearce <pearce@cs.berkeley.edu>,
netdev@vger.kernel.org, tcpdump-workers@lists.tcpdump.org,
davem@davemloft.net, edumazet@google.com, jpirko@redhat.com,
Ani Sinha <ani@aristanetworks.com>
Subject: Re: PROBLEM: Software injected vlan tagged packets are unable to be identified using recent BPF modifications
Date: Sat, 16 Feb 2013 15:34:53 +0100 [thread overview]
Message-ID: <511F990D.3080503@redhat.com> (raw)
In-Reply-To: <87fw0y6o2f.fsf@xmission.com>
On 02/15/2013 08:20 AM, Eric W. Biederman wrote:
>> Tue, Jan 08, 2013 at 01:05:39AM CET, pearce@cs.berkeley.edu wrote:
>>> Hello folks,
>>>
>>> PROBLEM:
>>>
>>> vlan tagged packets that are injected via software are not picked up
>>> by filters using recent (kernel commit
>>> f3335031b9452baebfe49b8b5e55d3fe0c4677d1)
>>> BPF vlan modifications. I suspect this is a problem with the Linux
>>> kernel.
>>>
>>> linux-netdev and tcpdump-workers are both cc'd.
>>>
>>> BACKGROUND:
>>>
>>> Kernel commit bcc6d47903612c3861201cc3a866fb604f26b8b2 (Jiri
>>> Pirko/David S. Miller) removed vlan headers on rx packets prior to
>>> them reaching the packet filters. This broke BPF/libpcap's ability to
>>> do kernel-level packet filtering based on vlan tag information (the
>>> 'vlan' keyword).
>>>
>>> Kernel commit f3335031b9452baebfe49b8b5e55d3fe0c4677d1 (Eric
>>> Dumazet/David S. Miller, just merged into Linus's tree
>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=f3335031b9452baebfe49b8b5e55d3fe0c4677d1)
>>> added the ability to use BPF to once again filter based on vlan
>>> tags. Related bpf jit commit:
>>> http://www.spinics.net/lists/netdev/msg214759.html
>>>
>>> libpcap (Ani Sinha) recently RFC'd a patch to use Eric/David's BPF
>>> modifications to restore vlan filtering to libpcap.
>>> http://www.mail-archive.com/tcpdump-workers@lists.tcpdump.org/msg06810.html
>>> I'm using this patch and it works.
>>>
>>> DETAILS:
>>>
>>> Under these patches vlan tagged packets received from mediam (actual
>>> packets from the wire) can be identified based on vlan tag information
>>> using the new BPF functionality.This is good.
>>>
>>> However, raw vlan tagged packets that are *injected* into the
>>> interface using libpcap's pcap_inject() (which is just a fancy wrapper
>>> for the send() syscall) are not identified by filters using the recent
>>> BPF modifications.
>>>
>>> The bug manifests itself if you attempt to use the new BPF
>>> modifications to filter vlan tagged packets on a live interface. All
>>> packets from the medium show up, but all injected packets are dropped.
>>>
>>> Prior to commit bcc6d47 both medium and injected packets could both be
>>> identified using BPFs.
>>>
>>> These injected packets can however still be identified using the
>>> previous, now incorrect "offset into the header" technique. Given
>>> this, I suspect what's going on is the kernel code path for these
>>> injected packets is not setting skb->vlan_tci correctly (at all?).
>>> Since the vlan tag is not in the skb data structure the new BPF
>>> modifications don't identify the packets as having a vlan tag,
>>> despite it being in the packet header.
>>
>>
>> You are right. skb->vlan_tci is not set. Looking at packet_snd() function
>> in net/packet/af_packet.c I guess that something like following patch
>> would be needed:
>>
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index e639645..2238559 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -2292,6 +2292,12 @@ static int packet_snd(struct socket *sock,
>> if (unlikely(extra_len == 4))
>> skb->no_fcs = 1;
>>
>> + if (skb->protocol == cpu_to_be16(ETH_P_8021Q)) {
>> + skb = vlan_untag(skb);
>> + if (unlikely(!skb))
>> + goto out_unlock;
>> + }
>> +
>> /*
>> * Now send it
>> */
>>
>> Thoughts?
>
> That sounds about right. I don't know how much NIC drivers care but it
> looks like af_packet is the only path through the code that can generate
> a vlan tagged packet that we will transmit where a vlan tagged packet
> can be generated without setting vlan_tci. So it should make the code
> safer.
>
> Certainly we want this to look to the vlan driver like a proper case of
> nested vlans and not something weird.
>
> But note we need to handle tpacket_snd as well as packet_snd.
Isn't this overkill? vlan_untag() in TX path performs internally a memmove(),
not sure if its worth the effort, and in the worst case, if your driver doesn't
support vlan hw accel, it puts the tag back in via yet another memmove() before
transmission via __vlan_put_tag() in dev_hard_start_xmit().
Besides, it also doesn't solve the issue that was stated here, if I'm not
mistaken. Isn't the problem, that when you send such a packet while a local
packet analyser is running at the same time, that back on the input path
vlan_tci is reset to 0 and you won't be able to use the vlan_tci JIT filter
if your NIC doesn't have hw accel? This change therefore doesn't help in
this situation either.
The better solution might be to generate a different BPF code in userland, no?
prev parent reply other threads:[~2013-02-16 14:35 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-08 0:05 PROBLEM: Software injected vlan tagged packets are unable to be identified using recent BPF modifications Paul Pearce
2013-01-08 1:25 ` Ani Sinha
2013-01-08 3:04 ` Paul Pearce
2013-01-10 20:37 ` Bill Fenner
2013-01-08 10:38 ` Jiri Pirko
2013-02-15 7:20 ` Eric W. Biederman
2013-02-16 14:34 ` Daniel Borkmann [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=511F990D.3080503@redhat.com \
--to=dborkman@redhat.com \
--cc=ani@aristanetworks.com \
--cc=davem@davemloft.net \
--cc=ebiederm@xmission.com \
--cc=edumazet@google.com \
--cc=jiri@resnulli.us \
--cc=jpirko@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=pearce@cs.berkeley.edu \
--cc=tcpdump-workers@lists.tcpdump.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).