All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Borkmann <dborkman@redhat.com>
To: Marcel Holtmann <marcel@holtmann.org>
Cc: Network Development <netdev@vger.kernel.org>
Subject: Re: protocol 0000 is buggy, dev nlmon
Date: Tue, 12 Aug 2014 21:05:23 +0200	[thread overview]
Message-ID: <53EA6573.4060204@redhat.com> (raw)
In-Reply-To: <8CB882A0-240B-4759-A119-8867FCF70957@holtmann.org>

On 08/03/2014 08:05 PM, Marcel Holtmann wrote:
...
> And on that note, I also found that we can not get the multicast group id from nlmon.
 > This is something we should include to be able to distinguish where netlink messages
 > are send to.

Just to follow-up on this thread first for general discussion [ sorry was/am very busy
recently ]. My suggestion was something like the following pseudocode:

--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -212,10 +212,11 @@ static int __netlink_deliver_tap_skb(struct sk_buff *skb,
[...]
  		nskb->protocol = htons((u16) sk->sk_protocol);
  		nskb->pkt_type = netlink_is_kernel(sk) ?
  				 PACKET_KERNEL : PACKET_USER;
+		nskb->sw_hash = 1;
+		nskb->hash = NETLINK_CB(nskb).dst_group;
[...]

The hash itself should currently be 0 actually as it's unused. The upside for
this approach is that it can be used as is, e.g. in BPF socket filters,
TPACKET_V3 exports it to user space already and we could add it to TPACKET_V2
meta data (as we currently have 4 byte padding left). The downside, it doesn't
fit into struct tpacket_auxdata which is for non-mmap operational mode, however
imho this would be an option to go for.

The other options are (as suggested by you) are a split of the 4 byte into
{tp_mac, tp_net} tuple + status flag, or a split into {tp_vlan_tci, tp_vlan_tpid}
tuple plus status flag since both are also exported via tpacket_auxdata. The
first variant is not possible as at least tp_mac contains the start offset
although tp_net would be unused as it carries the same value as tp_mac; however,
another issue with that could be that applications do not check status flags
before accessing tp_{mac,net}, so I don't really like it. Also it makes it
impossible for usage with BPF filters. The same goes for the tp_vlan_{tci,tpid}
variant, it would not be possible to reliably use it with BPF filters attached
as opposed to the hash version. Next, encoding this over tp_vlan_{tci,tpid} would
require special casing in AF_PACKET since we fetch and fill 'em via vlan_tx_tag_get(skb)
and ntohs(skb->vlan_proto). Yet another option would be to introduce something
like struct tpacket_auxdata2 which carries more meta data with it, but that I'd
really don't like -- AF_PACKET has already too many versions of mmap'ed API
which is horrible. Therefore, I am still in favour of going with skb->hash as
it's the cleanest.

  parent reply	other threads:[~2014-08-12 19:05 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-03 18:05 protocol 0000 is buggy, dev nlmon Marcel Holtmann
2014-08-04 13:18 ` Daniel Borkmann
2014-08-12 19:05 ` Daniel Borkmann [this message]
2014-08-12 23:03   ` Marcel Holtmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53EA6573.4060204@redhat.com \
    --to=dborkman@redhat.com \
    --cc=marcel@holtmann.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.