From: Rusty Russell <rusty@rustcorp.com.au>
To: Alex Williamson <alex.williamson@hp.com>
Cc: netdev@vger.kernel.org, markmc@redhat.com, kvm@vger.kernel.org,
Herbert Xu <herbert@gondor.apana.org.au>
Subject: Re: [PATCH 4/5] virtio_net: Add a MAC filter table
Date: Thu, 29 Jan 2009 10:25:46 +1030
Message-ID: <200901291025.46887.rusty@rustcorp.com.au>
In-Reply-To: <1233164908.7026.195.camel@lappy>
On Thursday 29 January 2009 04:18:28 Alex Williamson wrote:
> Hi Rusty,
Hi Alex,
I've cc'd Herbert: he always has good thoughts about this kind of thing and I want to be sure you're getting a fair hearing.
> Here's what I believe to be the parameters around which I've designed
> the current interface:
>
> A. Receiving unwanted packets is a waste of resources.
> B. Proprietary OSes may have dependencies on hardware filtering and
> may not be able to handle unwanted packets.
> C. Different guests may want different filter table sizes.
> D. The guest can make better decisions than the host about what
> packets are unwanted.
A general OS has to handle the "unwanted packets" case if the NIC's table isn't big enough. A guest may want arbitrary filter table sizes, but you're not giving it to them anyway, so that argument is bogus too.
The host *has* to decide the max table size, the guest *doesn't*. Your implementation is the worst of both worlds. The host doesn't tell the guest what the limit is, but does fail it for exceeding it. Your guest implementation limits itself arbitrarily, but you feel better because you've invoked yet-another party (the admin of the user box) to fix it!
And there's no evidence whatsoever that the guest can do anything rational if it hits the filtering limit that the host can't do.
> From this, I derive that the guest needs to know the size of the filter
> table, needs to be able to specify the size of the filter table, and is
> in the best position to program the filter table. A pseudo-infinite
> table violates the condition of not sending the guest unwanted packets.
I'm not aware of any OS which doesn't filter this anyway, but I'm not familiar with the Windows stack. I'd be very surprised though, since multicast filtering is often implemented as a hash on the card.
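To illustrate why a hash-based filter forces the OS to filter again in software, here is a minimal sketch of an imperfect multicast filter. The 64-bucket size, the `hash_filter` names, and the XOR-fold hash are all assumptions made for illustration; real NICs typically take bits from a CRC of the address, and nothing here is from the patch series under discussion.

```c
#include <stdbool.h>
#include <stdint.h>

#define HASH_BITS 64  /* assumed bucket count; real hardware varies */

struct hash_filter {
    uint64_t bits;    /* one bit per hash bucket */
};

/* Hypothetical hash: XOR-fold the six address bytes down to 6 bits. */
static unsigned mac_hash(const uint8_t mac[6])
{
    unsigned h = 0;
    for (int i = 0; i < 6; i++)
        h ^= mac[i];
    return (h ^ (h >> 6)) & (HASH_BITS - 1);
}

/* Mark the bucket for this address as wanted. */
static void hash_filter_add(struct hash_filter *f, const uint8_t mac[6])
{
    f->bits |= UINT64_C(1) << mac_hash(mac);
}

/* True if the address MAY be wanted. Two addresses that share a bucket
 * are indistinguishable, so false positives are possible and the stack
 * above still has to drop packets it did not ask for. */
static bool hash_filter_match(const struct hash_filter *f, const uint8_t mac[6])
{
    return (f->bits >> mac_hash(mac)) & 1;
}
```

With this hash, adding 01:00:5e:00:00:01 also lets 01:00:5e:00:01:00 through, because both fold to the same bucket. The card never rejects a wanted address, but it cannot promise to reject every unwanted one.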
> We could also go down the path of deciding what a "big enough" table is,
> and making it part of the ABI to avoid the alloc, but then we may have
> to revise the ABI if we underestimate (whereas the current limit is an
> implementation detail).
OK, say the host bonds a NIC directly to the virtio nic for the guest. That NIC uses a hash to filter MAC addresses. In your scheme, you have numerous problems:
(1) we've defined filtering to be perfect, so we need to filter in the host,
(2) the guest will stop giving us filtering information after 16 entries, and
tell us to go into promisc mode, even though we could do better.
> Weaving in your comments from other parts of this series, would it help
> at all to distinguish EINVAL from ENOMEM for the alloc call? Maybe the
> caller could make some kind of intelligent decision based on that (ex.
> unimplemented feature vs try something smaller).
None of the currently defined calls should fail. Feature bits should guarantee the existence of features, and failure should be greeted with horror by the guest, which should then try to limp along.
So, this conversation has convinced me that the host should accept arbitrary filtering entries, and the guest should accept that it is best effort.
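One way to read that conclusion is sketched below, under stated assumptions: the host takes whatever list the guest sends, never reports failure, and silently degrades to passing everything once the list exceeds its private capacity. `HOST_TABLE_MAX`, `struct host_filter`, and both function names are hypothetical and are not taken from the virtio_net patches.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HOST_TABLE_MAX 4  /* hypothetical host-private limit, not part of any ABI */

struct host_filter {
    uint8_t macs[HOST_TABLE_MAX][6];
    unsigned used;
    bool overflow;        /* set when the guest's list did not fit */
};

/* Accept any number of entries from the guest; this call cannot fail.
 * 'macs' points at n consecutive 6-byte addresses. */
static void host_filter_set(struct host_filter *f, const uint8_t *macs, unsigned n)
{
    f->overflow = n > HOST_TABLE_MAX;
    f->used = f->overflow ? 0 : n;
    if (f->used)
        memcpy(f->macs, macs, (size_t)f->used * 6);
}

/* Best effort: exact match while the list fits, pass everything otherwise.
 * The guest must still discard anything it does not want. */
static bool host_filter_pass(const struct host_filter *f, const uint8_t mac[6])
{
    if (f->overflow)
        return true;
    for (unsigned i = 0; i < f->used; i++)
        if (memcmp(f->macs[i], mac, 6) == 0)
            return true;
    return false;
}
```

A host backed by a hash-filtering NIC could swap the overflow branch for a hash lookup and do better than passing everything, which is the point of not making the guest stop at 16 entries: the limit stays an implementation detail of the host.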
Cheers,
Rusty.