From: Max Krasnyansky <maxk@qualcomm.com>
To: Brian Braunstein <brian@bristyle.com>
Cc: Shaun Jackman <sjackman@gmail.com>,
Brian Braunstein <linuxkernel@bristyle.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
netdev@vger.kernel.org,
virtualization@lists.linux-foundation.org
Subject: Re: Multicast and receive filtering in TUN/TAP
Date: Thu, 10 Jul 2008 20:05:12 -0700
Message-ID: <4876CDE8.2020200@qualcomm.com>
In-Reply-To: <480487c50807101932u7b4020a8qb31227a85f79c187@mail.gmail.com>
Brian Braunstein wrote:
> Sorry that I was confused here and it seems I am still confused.
>
> I was thinking that for any one instance of a TAP interface, there
> should be only 1 MAC address, since there is only 1 network interface,
> since the character device is not a network interface but rather the
> interface for the application to send and receive on that virtual
> network interface.
>
Exactly. Your understanding is perfectly correct.
See my previous reply. It should clear up all the confusion.
> For the MC stuff, I have to admit I haven't looked into it much, but it
> seems like the basic operation of setting the MAC address of the network
> interface should be supported, and it seems like an ioctl called
> SIOCSIFHWADDR should Set the InterFace HardWare ADDRess. Sorry if I was
> wrong about this. It might be good to add a comment to SIOCSIFHWADDR
> that says "This does not actually set the network interface hardware
> address, this is for multicast filtering" or whatever it actually is
> supposed to do. Or perhaps create a new ioctl that has something about
> multicast filtering in the name, and leave SIOCSIFHWADDR doing what it
> is doing now.
Yep. That's what I'm going to do (i.e. a different ioctl). Again, see my
previous email. We're totally on the same page :).
Max
>
> brian
>
>
> On Thu, Jul 10, 2008 at 2:38 PM, Shaun Jackman <sjackman@gmail.com> wrote:
>
> Hi Max,
>
> The original patch implemented receive multicast filtering by
> emulating the implementation used by many physical Ethernet
> interfaces: hashing the multicast address. TUN emulates two network
> cards (and communication via the virtual link between them), the guest
> and the host, or the character device and the network device, so there
> are two receive filters: chr_filter and net_filter. I implemented the
> filtering at the character device using chr_filter in tun_chr_readv,
> and left filtering at the network device for someone else to
> implement.
>
> I'm not sure what you mean by TX filtering. Multicast filtering is
> implemented uniquely at the receiver. There are, however, two
> receivers: the character device and the network device.
>
> I believe Brian's patch was mistaken. Two entirely distinct Ethernet
> addresses are required: one for the character device and one for the
> network device, or put another way, one for the virtual Ethernet
> interface at the guest and one for the virtual Ethernet interface at
> the host. For the same reason, there are two distinct multicast
> filters.
>
>
>
> Looking over the original patch, I believe I see a bug in
> tun_net_mclist:
>     memset(tun->chr_filter, 0, sizeof tun->chr_filter);
> should be
>     memset(tun->net_filter, 0, sizeof tun->net_filter);
>
> Cheers,
> Shaun
>
> On Wed, Jul 9, 2008 at 3:58 PM, Max Krasnyansky <maxk@qualcomm.com> wrote:
> > Yesterday while fixing the xoff stuckiness issue in the TUN/TAP driver I
> > got a chance to look into the multicast filtering code in there, and
> > immediately realized how terribly broken & confusing it is. The patch was
> > originally done by Shaun (CC'ed) and went in without any proper ACK from
> > me, Dave or Jeff.
> > Here is the original ref
> > http://marc.info/?l=linux-netdev&m=110490502102308&w=2
> >
> > I'm not going to dive into too much detail on what's wrong with the
> > current code. The main issues are that it mixes RX and TX filtering, which
> > are orthogonal, and it reuses ioctl names and stuff for manipulating TX
> > filter state as if it was a normal RX multicast state.
> > Later on Brian's patch added insult to injury
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=36226a8ded46b89a94f9de5976f554bb5e02d84c
> > Brian missed the point of the original patch (not his fault, as I said the
> > original patch was not the best) that the separate address introduced by
> > the MC patch was used for filtering _TX_ packets. It had nothing to do
> > with the HW addr of the local network interface.
> >
> > The problem is that the MC stuff is now even more broken, and ioctls that
> > were used originally now mean something different. So my first thought was
> > to just rip the MC stuff out because it's broken and probably nobody uses
> > it (given that we got no complaints after Brian's patch broke it
> > completely). But then I realized that if done properly it might be very
> > useful for virtualization.
> >
> > ---
> >
> > So the first question is: are there any users out there that ever used the
> > original patch? Shaun, any insight? How did you intend to use it?
> >
> > ---
> >
> > The second question is: do you guys think that QEMU/KVM/LGUEST/etc would
> > benefit if receive filtering was done by the host OS? Here is a specific
> > example of what I'm talking about.
> > We can do what qemu/hw/e1000.c:receive_filter() does in the _host_ context
> > (that function currently runs in the guest context). Judging by libvirt,
> > the typical QEMU-based setup is a single bridge with all the TAPs from
> > different VMs hooked up to it. That means that if one VM is getting MC
> > traffic, or the bridge sees a MACADDR that is not in its tables, the
> > packets get delivered to all the VMs, i.e. we have to wake all of them up
> > only so that they can drop that packet. Instead, we could set up filters
> > on the host's side of the TAP device.
> > Does that sound like something useful for QEMU/KVM?
> > If yes we can talk about the API. If not then I'll just nuke it.
> >
> > Thanx
> > Max
> >
>
>
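
The hash-based multicast filtering Shaun describes in the quoted text, and the
memset mix-up he points out in tun_net_mclist, can be illustrated with a small
user-space sketch. This is not the driver code. The struct tun_filters layout,
the helper names (mc_hash, mc_filter_add, mc_filter_test, net_filter_rebuild),
the 64-bucket bitmap size and the particular CRC-32 variant are all assumptions
made for illustration; only the chr_filter and net_filter names come from the
thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN     6
#define FILTER_WORDS 2  /* assumption: 2 x 32 bits = 64 hash buckets */

/* Hypothetical stand-in for the two receive filters named in the thread. */
struct tun_filters {
    uint32_t chr_filter[FILTER_WORDS];  /* character-device side */
    uint32_t net_filter[FILTER_WORDS];  /* network-device side */
};

/* Bitwise CRC-32 (reflected form, polynomial 0xEDB88320). Real NICs and
 * the kernel helpers may use a different bit ordering; this variant is
 * an assumption chosen to keep the sketch self-contained. */
uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Map a MAC address to one of 64 buckets using the top 6 CRC bits. */
unsigned mc_hash(const uint8_t addr[ETH_ALEN])
{
    return crc32_bitwise(addr, ETH_ALEN) >> 26;
}

/* Set the bucket bit for addr in the given filter bitmap. */
void mc_filter_add(uint32_t *filter, const uint8_t addr[ETH_ALEN])
{
    unsigned bit = mc_hash(addr);
    filter[bit >> 5] |= UINT32_C(1) << (bit & 31);
}

/* Return 1 if addr's bucket is set. Hash collisions can let unwanted
 * addresses through, so this is a coarse pre-filter, not an exact match. */
int mc_filter_test(const uint32_t *filter, const uint8_t addr[ETH_ALEN])
{
    unsigned bit = mc_hash(addr);
    return (int)((filter[bit >> 5] >> (bit & 31)) & 1u);
}

/* Rebuild the network-side filter from n packed 6-byte addresses.
 * Note that it clears net_filter, not chr_filter, which is the fix
 * suggested in the thread. */
void net_filter_rebuild(struct tun_filters *f, const uint8_t *addrs, size_t n)
{
    memset(f->net_filter, 0, sizeof f->net_filter);
    for (size_t i = 0; i < n; i++)
        mc_filter_add(f->net_filter, addrs + i * ETH_ALEN);
}
```

With this layout, rebuilding the network-side list leaves whatever the
character-device side has subscribed to untouched, which is the behaviour the
one-line memset fix is meant to restore.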