* [PATCH 0/5] virtio_net: Add MAC and VLAN filtering
@ 2009-01-16 21:13 Alex Williamson
2009-01-16 21:13 ` [PATCH 1/5] virtio_net: Allow setting the MAC address of the NIC Alex Williamson
` (5 more replies)
0 siblings, 6 replies; 30+ messages in thread
From: Alex Williamson @ 2009-01-16 21:13 UTC (permalink / raw)
To: netdev; +Cc: rusty, markmc, kvm
This series enables setting the virtio-net device MAC address, adds
infrastructure for the new control virtqueue, and makes use of it
to support set_rx_mode, unicast and multicast address lists, and
supporting a hypervisor based VLAN filter. The goal is to make the
virtio-net device support more of the features of a physical NIC and
allow the hypervisor to discard packets we're not interested in.
This version incorporates the review comments from Mark McLoughlin,
specifically, much improved comments and commit logs, verifying the
functionality of host not providing a MAC address, moving communication
structs into virtio_net.h, adding warnings when things don't work, and
making the strings grep'able. I've left the class/cmd split in the
control header rather than consolidating it into a single value, I'm
hoping I've made sufficient arguments for that. Also, I left the
error return rather than a BUG_ON in send_command because I can't
conditionally enable set_rx_mode as it's in a const struct. Instead,
I've changed the caller to avoid the issue. Please comment and/or
apply. Thanks,
Alex
---
Alex Williamson (5):
virtio_net: Add support for VLAN filtering in the hypervisor
virtio_net: Add a MAC filter table
virtio_net: Add a set_rx_mode interface
virtio_net: Add a virtqueue for outbound control commands
virtio_net: Allow setting the MAC address of the NIC
drivers/net/virtio_net.c | 224 ++++++++++++++++++++++++++++++++++++++++++--
include/linux/virtio_net.h | 61 ++++++++++++
2 files changed, 275 insertions(+), 10 deletions(-)
--
Alex Williamson
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 1/5] virtio_net: Allow setting the MAC address of the NIC
2009-01-16 21:13 [PATCH 0/5] virtio_net: Add MAC and VLAN filtering Alex Williamson
@ 2009-01-16 21:13 ` Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
2009-01-16 21:13 ` [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands Alex Williamson
` (4 subsequent siblings)
5 siblings, 1 reply; 30+ messages in thread
From: Alex Williamson @ 2009-01-16 21:13 UTC (permalink / raw)
To: netdev; +Cc: rusty, markmc, kvm
Many physical NICs let the OS re-program the "hardware" MAC
address. Virtual NICs should allow this too.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---
drivers/net/virtio_net.c | 23 +++++++++++++++++++++--
1 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 43f6523..e7700de 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -561,6 +561,22 @@ stop_queue:
goto done;
}
+static int virtnet_set_mac_address(struct net_device *dev, void *p)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+ struct virtio_device *vdev = vi->vdev;
+ int ret;
+
+ ret = eth_mac_addr(dev, p);
+ if (ret)
+ return ret;
+
+ vdev->config->set(vdev, offsetof(struct virtio_net_config, mac),
+ dev->dev_addr, dev->addr_len);
+
+ return 0;
+}
+
#ifdef CONFIG_NET_POLL_CONTROLLER
static void virtnet_netpoll(struct net_device *dev)
{
@@ -629,7 +645,7 @@ static const struct net_device_ops virtnet_netdev = {
.ndo_stop = virtnet_close,
.ndo_start_xmit = start_xmit,
.ndo_validate_addr = eth_validate_addr,
- .ndo_set_mac_address = eth_mac_addr,
+ .ndo_set_mac_address = virtnet_set_mac_address,
.ndo_change_mtu = virtnet_change_mtu,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = virtnet_netpoll,
@@ -677,8 +693,11 @@ static int virtnet_probe(struct virtio_device *vdev)
vdev->config->get(vdev,
offsetof(struct virtio_net_config, mac),
dev->dev_addr, dev->addr_len);
- } else
+ } else {
random_ether_addr(dev->dev_addr);
+ vdev->config->set(vdev, offsetof(struct virtio_net_config, mac),
+ dev->dev_addr, dev->addr_len);
+ }
/* Set up our device-specific information */
vi = netdev_priv(dev);
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands
2009-01-16 21:13 [PATCH 0/5] virtio_net: Add MAC and VLAN filtering Alex Williamson
2009-01-16 21:13 ` [PATCH 1/5] virtio_net: Allow setting the MAC address of the NIC Alex Williamson
@ 2009-01-16 21:13 ` Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
[not found] ` <200901271352.57887.rusty@rustcorp.com.au>
2009-01-16 21:13 ` [PATCH 3/5] virtio_net: Add a set_rx_mode interface Alex Williamson
` (3 subsequent siblings)
5 siblings, 2 replies; 30+ messages in thread
From: Alex Williamson @ 2009-01-16 21:13 UTC (permalink / raw)
To: netdev; +Cc: rusty, markmc, kvm
This will be used for RX mode, MAC filter table, VLAN filtering, etc...
The control transaction consists of one or more "out" sg entries and
one or more "in" sg entries. The first out entry contains a header
defining the class and command. Additional out entries may provide
data for the command. The last in entry provides a status response
back from the command.
Virtqueues typically run asynchronous, running a callback function
when there's data in the channel. We can't readily make use of this
in the command paths where we need to use this. Instead, we kick
the virtqueue and spin. The kick causes an I/O write, triggering an
immediate trap into the hypervisor.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---
drivers/net/virtio_net.c | 49 +++++++++++++++++++++++++++++++++++++++++++-
include/linux/virtio_net.h | 17 +++++++++++++++
2 files changed, 65 insertions(+), 1 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e7700de..d4be0a2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -39,7 +39,7 @@ module_param(gso, bool, 0444);
struct virtnet_info
{
struct virtio_device *vdev;
- struct virtqueue *rvq, *svq;
+ struct virtqueue *rvq, *svq, *cvq;
struct net_device *dev;
struct napi_struct napi;
@@ -603,6 +603,41 @@ static int virtnet_open(struct net_device *dev)
return 0;
}
+static int virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
+ void *data, unsigned int len)
+{
+ struct scatterlist sg[3];
+ struct virtio_net_ctrl_hdr ctrl;
+ virtio_net_ctrl_ack status;
+ unsigned int tmp;
+ int i = 0;
+
+ if (!vi->cvq)
+ return -EIO;
+
+ sg_init_table(sg, len ? 3 : 2);
+
+ sg_set_buf(&sg[i++], &ctrl, sizeof(ctrl));
+ if (len)
+ sg_set_buf(&sg[i++], data, len);
+ sg_set_buf(&sg[i], &status, sizeof(status));
+
+ ctrl.class = class;
+ ctrl.cmd = cmd;
+
+ status = ~0;
+
+ if (vi->cvq->vq_ops->add_buf(vi->cvq, sg, i, 1, vi) != 0)
+ BUG();
+
+ vi->cvq->vq_ops->kick(vi->cvq);
+
+ while (!vi->cvq->vq_ops->get_buf(vi->cvq, &tmp))
+ cpu_relax();
+
+ return status ? -EFAULT : 0;
+}
+
static int virtnet_close(struct net_device *dev)
{
struct virtnet_info *vi = netdev_priv(dev);
@@ -733,6 +768,14 @@ static int virtnet_probe(struct virtio_device *vdev)
goto free_recv;
}
+ /*
+ * Outbound control channel virtqueue. We can live without it,
+ * so don't go fatal if it's not there.
+ */
+ vi->cvq = vdev->config->find_vq(vdev, 2, NULL);
+ if (IS_ERR(vi->cvq))
+ vi->cvq = NULL;
+
/* Initialize our empty receive and send queues. */
skb_queue_head_init(&vi->recv);
skb_queue_head_init(&vi->send);
@@ -763,6 +806,8 @@ static int virtnet_probe(struct virtio_device *vdev)
unregister:
unregister_netdev(dev);
free_send:
+ if (vi->cvq)
+ vdev->config->del_vq(vi->cvq);
vdev->config->del_vq(vi->svq);
free_recv:
vdev->config->del_vq(vi->rvq);
@@ -793,6 +838,8 @@ static void virtnet_remove(struct virtio_device *vdev)
vdev->config->del_vq(vi->svq);
vdev->config->del_vq(vi->rvq);
+ if (vi->cvq)
+ vdev->config->del_vq(vi->cvq);
unregister_netdev(vi->dev);
while (vi->pages)
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 5cdd0aa..e2c1d81 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -53,4 +53,21 @@ struct virtio_net_hdr_mrg_rxbuf {
__u16 num_buffers; /* Number of merged rx buffers */
};
+/*
+ * Control virtqueue data structures
+ *
+ * The control virtqueue expects a header in the first sg entry
+ * and an ack/status response in the last entry. Data for the
+ * command goes in between.
+ */
+struct virtio_net_ctrl_hdr {
+ __u8 class;
+ __u8 cmd;
+};
+
+typedef __u8 virtio_net_ctrl_ack;
+
+#define VIRTIO_NET_OK 0
+#define VIRTIO_NET_ERR 1
+
#endif /* _LINUX_VIRTIO_NET_H */
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 3/5] virtio_net: Add a set_rx_mode interface
2009-01-16 21:13 [PATCH 0/5] virtio_net: Add MAC and VLAN filtering Alex Williamson
2009-01-16 21:13 ` [PATCH 1/5] virtio_net: Allow setting the MAC address of the NIC Alex Williamson
2009-01-16 21:13 ` [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands Alex Williamson
@ 2009-01-16 21:13 ` Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
2009-01-16 21:13 ` [PATCH 4/5] virtio_net: Add a MAC filter table Alex Williamson
` (2 subsequent siblings)
5 siblings, 1 reply; 30+ messages in thread
From: Alex Williamson @ 2009-01-16 21:13 UTC (permalink / raw)
To: netdev; +Cc: rusty, markmc, kvm
Make use of the RX_MODE control virtqueue class to enable the
set_rx_mode netdev interface. This allows us to selectively
enable/disable promiscuous and allmulti mode so we don't see
packets we don't want. We'll automatically enable these as
needed if additional unicast or multicast addresses are
requested.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---
drivers/net/virtio_net.c | 28 +++++++++++++++++++++++++++-
include/linux/virtio_net.h | 9 +++++++++
2 files changed, 36 insertions(+), 1 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d4be0a2..da96368 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -658,6 +658,31 @@ static int virtnet_set_tx_csum(struct net_device *dev, u32 data)
return ethtool_op_set_tx_hw_csum(dev, data);
}
+static void virtnet_set_rx_mode(struct net_device *dev)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+ u8 promisc, allmulti;
+
+ /* we're useless without the control virtqueue */
+ if (!vi->cvq)
+ return;
+
+ promisc = ((dev->flags & IFF_PROMISC) != 0 || dev->uc_count > 0);
+ allmulti = ((dev->flags & IFF_ALLMULTI) != 0 || dev->mc_count > 0);
+
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_RX_MODE,
+ VIRTIO_NET_CTRL_RX_MODE_PROMISC,
+ &promisc, sizeof(promisc)))
+ printk(KERN_WARNING "%s: Failed to %sable promisc mode.\n",
+ dev->name, promisc ? "en" : "dis");
+
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_RX_MODE,
+ VIRTIO_NET_CTRL_RX_MODE_ALLMULTI,
+ &allmulti, sizeof(allmulti)))
+ printk(KERN_WARNING "%s: Failed to %sable allmulti mode.\n",
+ dev->name, allmulti ? "en" : "dis");
+}
+
static struct ethtool_ops virtnet_ethtool_ops = {
.set_tx_csum = virtnet_set_tx_csum,
.set_sg = ethtool_op_set_sg,
@@ -681,6 +706,7 @@ static const struct net_device_ops virtnet_netdev = {
.ndo_start_xmit = start_xmit,
.ndo_validate_addr = eth_validate_addr,
.ndo_set_mac_address = virtnet_set_mac_address,
+ .ndo_set_rx_mode = virtnet_set_rx_mode,
.ndo_change_mtu = virtnet_change_mtu,
#ifdef CONFIG_NET_POLL_CONTROLLER
.ndo_poll_controller = virtnet_netpoll,
@@ -770,7 +796,7 @@ static int virtnet_probe(struct virtio_device *vdev)
/*
* Outbound control channel virtqueue. We can live without it,
- * so don't go fatal if it's not there.
+ * so don't go fatal if it's not there. Required for set_rx_mode.
*/
vi->cvq = vdev->config->find_vq(vdev, 2, NULL);
if (IS_ERR(vi->cvq))
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index e2c1d81..f8afef3 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -70,4 +70,13 @@ typedef __u8 virtio_net_ctrl_ack;
#define VIRTIO_NET_OK 0
#define VIRTIO_NET_ERR 1
+/*
+ * Control the RX mode, ie. promisucous and allmulti. PROMISC and
+ * ALLMULTI commands require an "out" sg entry containing a 1 byte
+ * state value, zero = disable, non-zero = enable.
+ */
+#define VIRTIO_NET_CTRL_RX_MODE 0
+ #define VIRTIO_NET_CTRL_RX_MODE_PROMISC 0
+ #define VIRTIO_NET_CTRL_RX_MODE_ALLMULTI 1
+
#endif /* _LINUX_VIRTIO_NET_H */
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-16 21:13 [PATCH 0/5] virtio_net: Add MAC and VLAN filtering Alex Williamson
` (2 preceding siblings ...)
2009-01-16 21:13 ` [PATCH 3/5] virtio_net: Add a set_rx_mode interface Alex Williamson
@ 2009-01-16 21:13 ` Alex Williamson
2009-01-19 9:33 ` Mark McLoughlin
[not found] ` <200901271300.30330.rusty@rustcorp.com.au>
2009-01-16 21:13 ` [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor Alex Williamson
2009-01-19 6:05 ` [PATCH 0/5] virtio_net: Add MAC and VLAN filtering David Miller
5 siblings, 2 replies; 30+ messages in thread
From: Alex Williamson @ 2009-01-16 21:13 UTC (permalink / raw)
To: netdev; +Cc: rusty, markmc, kvm
Make use of the MAC_TABLE control virtqueue class to support a
MAC filter table. The size of the filter table defaults to 16
entries and can be adjusted via the mac_entries module parameter.
Note, the original hardware address does not count towards this.
As with most real hardware, unicast addresses have priority in
the filter table so we can avoid enabling full promiscuous
until both unicast and multicast address overflow.
Signed-off-by: Alex Williamson <alex.williamson@hp.com
---
drivers/net/virtio_net.c | 82 +++++++++++++++++++++++++++++++++++++++++++-
include/linux/virtio_net.h | 20 +++++++++++
2 files changed, 100 insertions(+), 2 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index da96368..9be0d6a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -32,6 +32,11 @@ static int csum = 1, gso = 1;
module_param(csum, bool, 0444);
module_param(gso, bool, 0444);
+static unsigned int mac_entries = 16;
+module_param(mac_entries, uint, 0444);
+MODULE_PARM_DESC(mac_entries,
+ "Number of entries in the MAC filter table.");
+
/* FIXME: MTU in config. */
#define MAX_PACKET_LEN (ETH_HLEN+ETH_DATA_LEN)
#define GOOD_COPY_LEN 128
@@ -667,9 +672,64 @@ static void virtnet_set_rx_mode(struct net_device *dev)
if (!vi->cvq)
return;
- promisc = ((dev->flags & IFF_PROMISC) != 0 || dev->uc_count > 0);
- allmulti = ((dev->flags & IFF_ALLMULTI) != 0 || dev->mc_count > 0);
+ promisc = ((dev->flags & IFF_PROMISC) != 0);
+ allmulti = ((dev->flags & IFF_ALLMULTI) != 0);
+
+ if (dev->uc_count > mac_entries)
+ promisc = 1;
+ else if (dev->uc_count + dev->mc_count > mac_entries)
+ allmulti = 1;
+
+ if (!promisc && (dev->uc_count || (dev->mc_count && !allmulti))) {
+ u8 *buf, *cur;
+ int count, i;
+ struct dev_addr_list *uc_ptr, *mc_ptr;
+
+ count = dev->uc_count + (allmulti ? 0 : dev->mc_count);
+
+ buf = kzalloc(count * ETH_ALEN, GFP_ATOMIC);
+ if (!buf) {
+ printk(KERN_WARNING "%s: "
+ "Failed to alloc MAC table, using promisc.\n",
+ dev->name);
+ promisc = 1;
+ goto set_mode;
+ }
+
+ cur = buf;
+ uc_ptr = dev->uc_list;
+ mc_ptr = dev->mc_list;
+
+ for (i = 0; i < dev->uc_count; i++) {
+ memcpy(cur, uc_ptr->da_addr, ETH_ALEN);
+ cur += ETH_ALEN;
+ uc_ptr = uc_ptr->next;
+ }
+ if (!allmulti) {
+ for (i = 0; i < dev->mc_count; i++) {
+ memcpy(cur, mc_ptr->da_addr, ETH_ALEN);
+ cur += ETH_ALEN;
+ mc_ptr = mc_ptr->next;
+ }
+ }
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_MAC_TABLE,
+ VIRTIO_NET_CTRL_MAC_TABLE_SET,
+ buf, count * ETH_ALEN)) {
+ printk(KERN_WARNING "%s: "
+ "Failed to set MAC filter, using promisc.\n",
+ dev->name);
+ promisc = 1;
+ }
+ kfree(buf);
+ } else {
+ /* Set an empty MAC table - disabled */
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_MAC_TABLE,
+ VIRTIO_NET_CTRL_MAC_TABLE_SET, NULL, 0))
+ printk(KERN_WARNING "%s: "
+ "Failed to clear MAC filter.\n", dev->name);
+ }
+set_mode:
if (virtnet_send_command(vi, VIRTIO_NET_CTRL_RX_MODE,
VIRTIO_NET_CTRL_RX_MODE_PROMISC,
&promisc, sizeof(promisc)))
@@ -801,6 +861,24 @@ static int virtnet_probe(struct virtio_device *vdev)
vi->cvq = vdev->config->find_vq(vdev, 2, NULL);
if (IS_ERR(vi->cvq))
vi->cvq = NULL;
+ else {
+ unsigned int entries;
+
+ /*
+ * We use a separate stack variable here because the
+ * data segment for the static variable doesn't translate
+ * across the virtqueue.
+ */
+ entries = mac_entries = min(mac_entries,
+ (unsigned int)(PAGE_SIZE / ETH_ALEN));
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_MAC_TABLE,
+ VIRTIO_NET_CTRL_MAC_TABLE_ALLOC,
+ &entries, sizeof(entries))) {
+ printk(KERN_WARNING "virtio-net: "
+ "MAC filter table allocation failed.\n");
+ mac_entries = 0;
+ }
+ }
/* Initialize our empty receive and send queues. */
skb_queue_head_init(&vi->recv);
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index f8afef3..84086a6 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -79,4 +79,24 @@ typedef __u8 virtio_net_ctrl_ack;
#define VIRTIO_NET_CTRL_RX_MODE_PROMISC 0
#define VIRTIO_NET_CTRL_RX_MODE_ALLMULTI 1
+/*
+ * Control the MAC filter table.
+ *
+ * The ALLOC command requires a 4 byte sg entry indicating the size of
+ * the MAC filter table to be allocated in number of entries
+ * (ie. bytes = entries * ETH_ALEN). The MAC filter table may only be
+ * allocated once after a device reset. A device reset frees the MAC
+ * filter table, allowing a new ALLOC. The current implementation limits
+ * the size to a single host page.
+ *
+ * The SET command requires an out sg entry containing a buffer of the
+ * entire MAC filter table. The format is a simple byte stream
+ * concatenating all of the ETH_ALEN MAC adresses to be inserted into
+ * the table. Partial updates are not available. The SET command can
+ * only succeed if there is a table allocated.
+ */
+#define VIRTIO_NET_CTRL_MAC_TABLE 1
+ #define VIRTIO_NET_CTRL_MAC_TABLE_ALLOC 0
+ #define VIRTIO_NET_CTRL_MAC_TABLE_SET 1
+
#endif /* _LINUX_VIRTIO_NET_H */
^ permalink raw reply related [flat|nested] 30+ messages in thread
* [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor
2009-01-16 21:13 [PATCH 0/5] virtio_net: Add MAC and VLAN filtering Alex Williamson
` (3 preceding siblings ...)
2009-01-16 21:13 ` [PATCH 4/5] virtio_net: Add a MAC filter table Alex Williamson
@ 2009-01-16 21:13 ` Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
2009-01-19 6:05 ` [PATCH 0/5] virtio_net: Add MAC and VLAN filtering David Miller
5 siblings, 1 reply; 30+ messages in thread
From: Alex Williamson @ 2009-01-16 21:13 UTC (permalink / raw)
To: netdev; +Cc: rusty, markmc, kvm
VLAN filtering allows the hypervisor to drop packets from VLANs
that we're not a part of, further reducing the number of extraneous
packets recieved. This makes use of the VLAN virtqueue command class.
The ENABLE command is used both to activate the filter and verify the
existence of the functionality on the backend.
Also, this fixes a sizing issue for MAX_PACKET_LEN when using VLANs.
VLANS add 4 bytes of header, resulting in a maximum full packet size
of 1518 bytes (destination MAC + source MAC + VLAN + ethernet type +
MTU). Withouth this, a VLAN over virtio_net is likely to get a
truncation error in the backend.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---
drivers/net/virtio_net.c | 52 ++++++++++++++++++++++++++++++++++++--------
include/linux/virtio_net.h | 15 +++++++++++++
2 files changed, 58 insertions(+), 9 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 9be0d6a..8d4bc83 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -24,6 +24,7 @@
#include <linux/virtio.h>
#include <linux/virtio_net.h>
#include <linux/scatterlist.h>
+#include <linux/if_vlan.h>
static int napi_weight = 128;
module_param(napi_weight, int, 0444);
@@ -38,7 +39,7 @@ MODULE_PARM_DESC(mac_entries,
"Number of entries in the MAC filter table.");
/* FIXME: MTU in config. */
-#define MAX_PACKET_LEN (ETH_HLEN+ETH_DATA_LEN)
+#define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
#define GOOD_COPY_LEN 128
struct virtnet_info
@@ -743,6 +744,28 @@ set_mode:
dev->name, allmulti ? "en" : "dis");
}
+static void virnet_vlan_rx_add_vid(struct net_device *dev, u16 vid)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+ u16 id = vid;
+
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
+ VIRTIO_NET_CTRL_VLAN_ADD, &id, sizeof(id)))
+ printk(KERN_WARNING "%s: Failed to add VLAN ID %d.\n",
+ dev->name, id);
+}
+
+static void virnet_vlan_rx_kill_vid(struct net_device *dev, u16 vid)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+ u16 id = vid;
+
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
+ VIRTIO_NET_CTRL_VLAN_KILL, &id, sizeof(id)))
+ printk(KERN_WARNING "%s: Failed to kill VLAN ID %d.\n",
+ dev->name, id);
+}
+
static struct ethtool_ops virtnet_ethtool_ops = {
.set_tx_csum = virtnet_set_tx_csum,
.set_sg = ethtool_op_set_sg,
@@ -761,15 +784,17 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
}
static const struct net_device_ops virtnet_netdev = {
- .ndo_open = virtnet_open,
- .ndo_stop = virtnet_close,
- .ndo_start_xmit = start_xmit,
- .ndo_validate_addr = eth_validate_addr,
- .ndo_set_mac_address = virtnet_set_mac_address,
- .ndo_set_rx_mode = virtnet_set_rx_mode,
- .ndo_change_mtu = virtnet_change_mtu,
+ .ndo_open = virtnet_open,
+ .ndo_stop = virtnet_close,
+ .ndo_start_xmit = start_xmit,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_set_mac_address = virtnet_set_mac_address,
+ .ndo_set_rx_mode = virtnet_set_rx_mode,
+ .ndo_change_mtu = virtnet_change_mtu,
+ .ndo_vlan_rx_add_vid = virnet_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = virnet_vlan_rx_kill_vid,
#ifdef CONFIG_NET_POLL_CONTROLLER
- .ndo_poll_controller = virtnet_netpoll,
+ .ndo_poll_controller = virtnet_netpoll,
#endif
};
@@ -863,6 +888,7 @@ static int virtnet_probe(struct virtio_device *vdev)
vi->cvq = NULL;
else {
unsigned int entries;
+ u8 vlan_filter = 1;
/*
* We use a separate stack variable here because the
@@ -878,6 +904,14 @@ static int virtnet_probe(struct virtio_device *vdev)
"MAC filter table allocation failed.\n");
mac_entries = 0;
}
+
+ /* Enable VLAN filtering if supported by the backend */
+ if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
+ VIRTIO_NET_CTRL_VLAN_ENABLE,
+ &vlan_filter, sizeof(vlan_filter))) {
+ printk(KERN_INFO "virtio_net: VLAN filter enabled\n");
+ dev->features |= NETIF_F_HW_VLAN_FILTER;
+ }
}
/* Initialize our empty receive and send queues. */
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 84086a6..a95028d 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -99,4 +99,19 @@ typedef __u8 virtio_net_ctrl_ack;
#define VIRTIO_NET_CTRL_MAC_TABLE_ALLOC 0
#define VIRTIO_NET_CTRL_MAC_TABLE_SET 1
+/*
+ * Control VLAN filtering
+ *
+ * The VLAN filter table is controlled via a simple ADD/KILL interface.
+ * VLAN IDs not added will be dropped. Kill is the opposite of add.
+ * Both commands expect an out entry containing a 2 byte VLAN ID.
+ * The ENABLE command expects an out entry containing a single byte,
+ * zero to disable, non-zero to enable. The default state is disabled
+ * for compatibility.
+ */
+#define VIRTIO_NET_CTRL_VLAN 2
+ #define VIRTIO_NET_CTRL_VLAN_ENABLE 0
+ #define VIRTIO_NET_CTRL_VLAN_ADD 1
+ #define VIRTIO_NET_CTRL_VLAN_KILL 2
+
#endif /* _LINUX_VIRTIO_NET_H */
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 0/5] virtio_net: Add MAC and VLAN filtering
2009-01-16 21:13 [PATCH 0/5] virtio_net: Add MAC and VLAN filtering Alex Williamson
` (4 preceding siblings ...)
2009-01-16 21:13 ` [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor Alex Williamson
@ 2009-01-19 6:05 ` David Miller
2009-01-19 8:30 ` Mark McLoughlin
5 siblings, 1 reply; 30+ messages in thread
From: David Miller @ 2009-01-19 6:05 UTC (permalink / raw)
To: alex.williamson; +Cc: netdev, rusty, markmc, kvm
From: Alex Williamson <alex.williamson@hp.com>
Date: Fri, 16 Jan 2009 14:13:12 -0700
> This series enables setting the virtio-net device MAC address, adds
> infrastructure for the new control virtqueue, and makes use of it
> to support set_rx_mode, unicast and multicast address lists, and
> supporting a hypervisor based VLAN filter. The goal is to make the
> virtio-net device support more of the features of a physical NIC and
> allow the hypervisor to discard packets we're not interested in.
>
> This version incorporates the review comments from Mark McLoughlin,
> specifically, much improved comments and commit logs, verifying the
> functionality of host not providing a MAC address, moving communication
> structs into virtio_net.h, adding warnings when things don't work, and
> making the strings grep'able. I've left the class/cmd split in the
> control header rather than consolidating it into a single value, I'm
> hoping I've made sufficient arguments for that. Also, I left the
> error return rather than a BUG_ON in send_command because I can't
> conditionally enable set_rx_mode as it's in a const struct. Instead,
> I've changed the caller to avoid the issue. Please comment and/or
> apply. Thanks,
I've been watching these patches passively, and it seems that
there have been some comments that Rusty hasn't submitted
certain virtio_net patches to me as well.
Please sort this all out and let me know what to queue up for
2.6.30
Thanks.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 0/5] virtio_net: Add MAC and VLAN filtering
2009-01-19 6:05 ` [PATCH 0/5] virtio_net: Add MAC and VLAN filtering David Miller
@ 2009-01-19 8:30 ` Mark McLoughlin
2009-01-20 1:10 ` David Miller
0 siblings, 1 reply; 30+ messages in thread
From: Mark McLoughlin @ 2009-01-19 8:30 UTC (permalink / raw)
To: David Miller; +Cc: alex.williamson, netdev, rusty, kvm
Hi Dave,
On Sun, 2009-01-18 at 22:05 -0800, David Miller wrote:
> From: Alex Williamson <alex.williamson@hp.com>
> Date: Fri, 16 Jan 2009 14:13:12 -0700
>
> > This series enables setting the virtio-net device MAC address, adds
> > infrastructure for the new control virtqueue, and makes use of it
> > to support set_rx_mode, unicast and multicast address lists, and
> > supporting a hypervisor based VLAN filter. The goal is to make the
> > virtio-net device support more of the features of a physical NIC and
> > allow the hypervisor to discard packets we're not interested in.
> >
> > This version incorporates the review comments from Mark McLoughlin,
> > specifically, much improved comments and commit logs, verifying the
> > functionality of host not providing a MAC address, moving communication
> > structs into virtio_net.h, adding warnings when things don't work, and
> > making the strings grep'able. I've left the class/cmd split in the
> > control header rather than consolidating it into a single value, I'm
> > hoping I've made sufficient arguments for that. Also, I left the
> > error return rather than a BUG_ON in send_command because I can't
> > conditionally enable set_rx_mode as it's in a const struct. Instead,
> > I've changed the caller to avoid the issue. Please comment and/or
> > apply. Thanks,
>
> I've been watching these patches passively, and it seems that
> there have been some comments that Rusty hasn't submitted
> certain virtio_net patches to me as well.
>
> Please sort this all out and let me know what to queue up for
> 2.6.30
The missing patch is the link status one which I forwarded from Rusty's
queue (at his request) just before as the merge window closed:
http://patchwork.ozlabs.org/patch/17500/
Here it is again.
Cheers,
Mark.
From: Mark McLoughlin <markmc@redhat.com>
Date: Tue, 9 Dec 2008 10:19:33 +0000
Subject: [PATCH] virtio_net: add link status handling
Allow the host to inform us that the link is down by adding
a VIRTIO_NET_F_STATUS which indicates that device status is
available in virtio_net config.
This is currently useful for simulating link down conditions
(e.g. using proposed qemu 'set_link' monitor command) but
would also be needed if we were to support device assignment
via virtio.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added future masking)
---
drivers/net/virtio_net.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
include/linux/virtio_net.h | 5 +++++
2 files changed, 47 insertions(+), 1 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 43f6523..fb78368 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -42,6 +42,7 @@ struct virtnet_info
struct virtqueue *rvq, *svq;
struct net_device *dev;
struct napi_struct napi;
+ unsigned int status;
/* The skb we couldn't send because buffers were full. */
struct sk_buff *last_xmit_skb;
@@ -611,6 +612,7 @@ static struct ethtool_ops virtnet_ethtool_ops = {
.set_tx_csum = virtnet_set_tx_csum,
.set_sg = ethtool_op_set_sg,
.set_tso = ethtool_op_set_tso,
+ .get_link = ethtool_op_get_link,
};
#define MIN_MTU 68
@@ -636,6 +638,41 @@ static const struct net_device_ops virtnet_netdev = {
#endif
};
+static void virtnet_update_status(struct virtnet_info *vi)
+{
+ u16 v;
+
+ if (!virtio_has_feature(vi->vdev, VIRTIO_NET_F_STATUS))
+ return;
+
+ vi->vdev->config->get(vi->vdev,
+ offsetof(struct virtio_net_config, status),
+ &v, sizeof(v));
+
+ /* Ignore unknown (future) status bits */
+ v &= VIRTIO_NET_S_LINK_UP;
+
+ if (vi->status == v)
+ return;
+
+ vi->status = v;
+
+ if (vi->status & VIRTIO_NET_S_LINK_UP) {
+ netif_carrier_on(vi->dev);
+ netif_wake_queue(vi->dev);
+ } else {
+ netif_carrier_off(vi->dev);
+ netif_stop_queue(vi->dev);
+ }
+}
+
+static void virtnet_config_changed(struct virtio_device *vdev)
+{
+ struct virtnet_info *vi = vdev->priv;
+
+ virtnet_update_status(vi);
+}
+
static int virtnet_probe(struct virtio_device *vdev)
{
int err;
@@ -738,6 +775,9 @@ static int virtnet_probe(struct virtio_device *vdev)
goto unregister;
}
+ vi->status = VIRTIO_NET_S_LINK_UP;
+ virtnet_update_status(vi);
+
pr_debug("virtnet: registered device %s\n", dev->name);
return 0;
@@ -793,7 +833,7 @@ static unsigned int features[] = {
VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_UFO, VIRTIO_NET_F_HOST_TSO6,
VIRTIO_NET_F_HOST_ECN, VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
VIRTIO_NET_F_GUEST_ECN, /* We don't yet handle UFO input. */
- VIRTIO_NET_F_MRG_RXBUF,
+ VIRTIO_NET_F_MRG_RXBUF, VIRTIO_NET_F_STATUS,
VIRTIO_F_NOTIFY_ON_EMPTY,
};
@@ -805,6 +845,7 @@ static struct virtio_driver virtio_net = {
.id_table = id_table,
.probe = virtnet_probe,
.remove = __devexit_p(virtnet_remove),
+ .config_changed = virtnet_config_changed,
};
static int __init init(void)
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index cf335fe..7ffc754 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -23,11 +23,16 @@
#define VIRTIO_NET_F_HOST_ECN 13 /* Host can handle TSO[6] w/ ECN in. */
#define VIRTIO_NET_F_HOST_UFO 14 /* Host can handle UFO in. */
#define VIRTIO_NET_F_MRG_RXBUF 15 /* Host can merge receive buffers. */
+#define VIRTIO_NET_F_STATUS 16 /* virtio_net_config.status available */
+
+#define VIRTIO_NET_S_LINK_UP 1 /* Link is up */
struct virtio_net_config
{
/* The config defining mac address (if VIRTIO_NET_F_MAC) */
__u8 mac[6];
+ /* See VIRTIO_NET_F_STATUS and VIRTIO_NET_S_* above */
+ __u16 status;
} __attribute__((packed));
/* This is the first element of the scatter-gather list. If you don't
--
1.6.0.6
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor
2009-01-16 21:13 ` [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor Alex Williamson
@ 2009-01-19 9:32 ` Mark McLoughlin
2009-01-20 16:36 ` Alex Williamson
0 siblings, 1 reply; 30+ messages in thread
From: Mark McLoughlin @ 2009-01-19 9:32 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, rusty, kvm
Hi Alex,
On Fri, 2009-01-16 at 14:13 -0700, Alex Williamson wrote:
> VLAN filtering allows the hypervisor to drop packets from VLANs
> that we're not a part of, further reducing the number of extraneous
> packets recieved. This makes use of the VLAN virtqueue command class.
> The ENABLE command is used both to activate the filter and verify the
> existence of the functionality on the backend.
>
> Also, this fixes a sizing issue for MAX_PACKET_LEN when using VLANs.
> VLANS add 4 bytes of header, resulting in a maximum full packet size
> of 1518 bytes (destination MAC + source MAC + VLAN + ethernet type +
> MTU). Withouth this, a VLAN over virtio_net is likely to get a
> truncation error in the backend.
Looks good, but could you split the MAX_PACKET_LEN change out to a
separate patch, send to davem for 2.6.29. We should also get this into
stable.
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 9be0d6a..8d4bc83 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -24,6 +24,7 @@
> #include <linux/virtio.h>
> #include <linux/virtio_net.h>
> #include <linux/scatterlist.h>
> +#include <linux/if_vlan.h>
>
> static int napi_weight = 128;
> module_param(napi_weight, int, 0444);
> @@ -38,7 +39,7 @@ MODULE_PARM_DESC(mac_entries,
> "Number of entries in the MAC filter table.");
>
> /* FIXME: MTU in config. */
> -#define MAX_PACKET_LEN (ETH_HLEN+ETH_DATA_LEN)
> +#define MAX_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> #define GOOD_COPY_LEN 128
>
> struct virtnet_info
> @@ -743,6 +744,28 @@ set_mode:
> dev->name, allmulti ? "en" : "dis");
> }
>
> +static void virnet_vlan_rx_add_vid(struct net_device *dev, u16 vid)
> +{
> + struct virtnet_info *vi = netdev_priv(dev);
> + u16 id = vid;
Indentation mixup.
> +
> + if (virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> + VIRTIO_NET_CTRL_VLAN_ADD, &id, sizeof(id)))
> + printk(KERN_WARNING "%s: Failed to add VLAN ID %d.\n",
> + dev->name, id);
> +}
> +
> +static void virnet_vlan_rx_kill_vid(struct net_device *dev, u16 vid)
> +{
> + struct virtnet_info *vi = netdev_priv(dev);
> + u16 id = vid;
Ditto.
> +
> + if (virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> + VIRTIO_NET_CTRL_VLAN_KILL, &id, sizeof(id)))
> + printk(KERN_WARNING "%s: Failed to kill VLAN ID %d.\n",
> + dev->name, id);
> +}
> +
> static struct ethtool_ops virtnet_ethtool_ops = {
> .set_tx_csum = virtnet_set_tx_csum,
> .set_sg = ethtool_op_set_sg,
> @@ -761,15 +784,17 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
> }
>
> static const struct net_device_ops virtnet_netdev = {
> - .ndo_open = virtnet_open,
> - .ndo_stop = virtnet_close,
> - .ndo_start_xmit = start_xmit,
> - .ndo_validate_addr = eth_validate_addr,
> - .ndo_set_mac_address = virtnet_set_mac_address,
> - .ndo_set_rx_mode = virtnet_set_rx_mode,
> - .ndo_change_mtu = virtnet_change_mtu,
> + .ndo_open = virtnet_open,
> + .ndo_stop = virtnet_close,
> + .ndo_start_xmit = start_xmit,
> + .ndo_validate_addr = eth_validate_addr,
> + .ndo_set_mac_address = virtnet_set_mac_address,
> + .ndo_set_rx_mode = virtnet_set_rx_mode,
> + .ndo_change_mtu = virtnet_change_mtu,
> + .ndo_vlan_rx_add_vid = virnet_vlan_rx_add_vid,
> + .ndo_vlan_rx_kill_vid = virnet_vlan_rx_kill_vid,
> #ifdef CONFIG_NET_POLL_CONTROLLER
> - .ndo_poll_controller = virtnet_netpoll,
> + .ndo_poll_controller = virtnet_netpoll,
> #endif
> };
>
> @@ -863,6 +888,7 @@ static int virtnet_probe(struct virtio_device *vdev)
> vi->cvq = NULL;
> else {
> unsigned int entries;
> + u8 vlan_filter = 1;
>
> /*
> * We use a separate stack variable here because the
> @@ -878,6 +904,14 @@ static int virtnet_probe(struct virtio_device *vdev)
> "MAC filter table allocation failed.\n");
> mac_entries = 0;
> }
> +
> + /* Enable VLAN filtering if supported by the backend */
> + if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> + VIRTIO_NET_CTRL_VLAN_ENABLE,
> + &vlan_filter, sizeof(vlan_filter))) {
> + printk(KERN_INFO "virtio_net: VLAN filter enabled\n");
Do we need this KERN_INFO? KERN_DEBUG, maybe?
> + dev->features |= NETIF_F_HW_VLAN_FILTER;
> + }
> }
>
> /* Initialize our empty receive and send queues. */
> diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
> index 84086a6..a95028d 100644
> --- a/include/linux/virtio_net.h
> +++ b/include/linux/virtio_net.h
> @@ -99,4 +99,19 @@ typedef __u8 virtio_net_ctrl_ack;
> #define VIRTIO_NET_CTRL_MAC_TABLE_ALLOC 0
> #define VIRTIO_NET_CTRL_MAC_TABLE_SET 1
>
> +/*
> + * Control VLAN filtering
> + *
> + * The VLAN filter table is controlled via a simple ADD/KILL interface.
> + * VLAN IDs not added will be dropped. Kill is the opposite of add.
> + * Both commands expect an out entry containing a 2 byte VLAN ID.
> + * The ENABLE command expects an out entry containing a single byte,
> + * zero to disable, non-zero to enable. The default state is disabled
> + * for compatibility.
> + */
> +#define VIRTIO_NET_CTRL_VLAN 2
> + #define VIRTIO_NET_CTRL_VLAN_ENABLE 0
> + #define VIRTIO_NET_CTRL_VLAN_ADD 1
> + #define VIRTIO_NET_CTRL_VLAN_KILL 2
Not too keen on "kill" - it matches the linux API, but "remove" or
"delete" is probably more sensible.
Cheers,
Mark.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 1/5] virtio_net: Allow setting the MAC address of the NIC
2009-01-16 21:13 ` [PATCH 1/5] virtio_net: Allow setting the MAC address of the NIC Alex Williamson
@ 2009-01-19 9:32 ` Mark McLoughlin
0 siblings, 0 replies; 30+ messages in thread
From: Mark McLoughlin @ 2009-01-19 9:32 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, rusty, kvm
On Fri, 2009-01-16 at 14:13 -0700, Alex Williamson wrote:
> Many physical NICs let the OS re-program the "hardware" MAC
> address. Virtual NICs should allow this too.
>
> Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Cheers,
Mark.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands
2009-01-16 21:13 ` [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands Alex Williamson
@ 2009-01-19 9:32 ` Mark McLoughlin
[not found] ` <200901271352.57887.rusty@rustcorp.com.au>
1 sibling, 0 replies; 30+ messages in thread
From: Mark McLoughlin @ 2009-01-19 9:32 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, rusty, kvm
On Fri, 2009-01-16 at 14:13 -0700, Alex Williamson wrote:
> This will be used for RX mode, MAC filter table, VLAN filtering, etc...
>
> The control transaction consists of one or more "out" sg entries and
> one or more "in" sg entries. The first out entry contains a header
> defining the class and command. Additional out entries may provide
> data for the command. The last in entry provides a status response
> back from the command.
>
> Virtqueues typically run asynchronous, running a callback function
> when there's data in the channel. We can't readily make use of this
> in the command paths where we need to use this. Instead, we kick
> the virtqueue and spin. The kick causes an I/O write, triggering an
> immediate trap into the hypervisor.
>
> Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Cheers,
Mark.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 3/5] virtio_net: Add a set_rx_mode interface
2009-01-16 21:13 ` [PATCH 3/5] virtio_net: Add a set_rx_mode interface Alex Williamson
@ 2009-01-19 9:32 ` Mark McLoughlin
0 siblings, 0 replies; 30+ messages in thread
From: Mark McLoughlin @ 2009-01-19 9:32 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, rusty, kvm
On Fri, 2009-01-16 at 14:13 -0700, Alex Williamson wrote:
> Make use of the RX_MODE control virtqueue class to enable the
> set_rx_mode netdev interface. This allows us to selectively
> enable/disable promiscuous and allmulti mode so we don't see
> packets we don't want. We'll automatically enable these as
> needed if additional unicast or multicast addresses are
> requested.
>
> Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Cheers,
Mark.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-16 21:13 ` [PATCH 4/5] virtio_net: Add a MAC filter table Alex Williamson
@ 2009-01-19 9:33 ` Mark McLoughlin
[not found] ` <200901271300.30330.rusty@rustcorp.com.au>
1 sibling, 0 replies; 30+ messages in thread
From: Mark McLoughlin @ 2009-01-19 9:33 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, rusty, kvm
On Fri, 2009-01-16 at 14:13 -0700, Alex Williamson wrote:
> Make use of the MAC_TABLE control virtqueue class to support a
> MAC filter table. The size of the filter table defaults to 16
> entries and can be adjusted via the mac_entries module parameter.
> Note, the original hardware address does not count towards this.
>
> As with most real hardware, unicast addresses have priority in
> the filter table so we can avoid enabling full promiscuous
> until both unicast and multicast address overflow.
>
> Signed-off-by: Alex Williamson <alex.williamson@hp.com
Acked-by: Mark McLoughlin <markmc@redhat.com>
Cheers,
Mark.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 0/5] virtio_net: Add MAC and VLAN filtering
2009-01-19 8:30 ` Mark McLoughlin
@ 2009-01-20 1:10 ` David Miller
0 siblings, 0 replies; 30+ messages in thread
From: David Miller @ 2009-01-20 1:10 UTC (permalink / raw)
To: markmc; +Cc: alex.williamson, netdev, rusty, kvm
From: Mark McLoughlin <markmc@redhat.com>
Date: Mon, 19 Jan 2009 08:30:31 +0000
> virtio_net: add link status handling
>
> Allow the host to inform us that the link is down by adding
> a VIRTIO_NET_F_STATUS which indicates that device status is
> available in virtio_net config.
>
> This is currently useful for simulating link down conditions
> (e.g. using proposed qemu 'set_link' monitor command) but
> would also be needed if we were to support device assignment
> via virtio.
>
> Signed-off-by: Mark McLoughlin <markmc@redhat.com>
> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added future masking)
Queued up for 2.6.30, thanks!
^ permalink raw reply [flat|nested] 30+ messages in thread
* [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor
2009-01-19 9:32 ` Mark McLoughlin
@ 2009-01-20 16:36 ` Alex Williamson
2009-01-20 16:44 ` Mark McLoughlin
[not found] ` <200901271422.33369.rusty@rustcorp.com.au>
0 siblings, 2 replies; 30+ messages in thread
From: Alex Williamson @ 2009-01-20 16:36 UTC (permalink / raw)
To: netdev; +Cc: rusty, markmc, kvm
VLAN filtering allows the hypervisor to drop packets from VLANs
that we're not a part of, further reducing the number of extraneous
packets recieved. This makes use of the VLAN virtqueue command class.
The ENABLE command is used both to activate the filter and verify the
existence of the functionality on the backend.
Signed-off-by: Alex Williamson <alex.williamson@hp.com>
---
Updated with suggestions from Mark McLoughlin:
- VLAN packet length fix split into separate patch
- Indenting fix
- Lower priority VLAN enabled printk
- Renamed VLAN_KILL to VLAN_DEL
drivers/net/virtio_net.c | 49 +++++++++++++++++++++++++++++++++++++-------
include/linux/virtio_net.h | 15 +++++++++++++
2 files changed, 56 insertions(+), 8 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f3849d1..e07019c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -744,6 +744,28 @@ set_mode:
dev->name, allmulti ? "en" : "dis");
}
+static void virnet_vlan_rx_add_vid(struct net_device *dev, u16 vid)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+ u16 id = vid;
+
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
+ VIRTIO_NET_CTRL_VLAN_ADD, &id, sizeof(id)))
+ printk(KERN_WARNING "%s: Failed to add VLAN ID %d.\n",
+ dev->name, id);
+}
+
+static void virnet_vlan_rx_kill_vid(struct net_device *dev, u16 vid)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+ u16 id = vid;
+
+ if (virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
+ VIRTIO_NET_CTRL_VLAN_DEL, &id, sizeof(id)))
+ printk(KERN_WARNING "%s: Failed to kill VLAN ID %d.\n",
+ dev->name, id);
+}
+
static struct ethtool_ops virtnet_ethtool_ops = {
.set_tx_csum = virtnet_set_tx_csum,
.set_sg = ethtool_op_set_sg,
@@ -762,15 +784,17 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
}
static const struct net_device_ops virtnet_netdev = {
- .ndo_open = virtnet_open,
- .ndo_stop = virtnet_close,
- .ndo_start_xmit = start_xmit,
- .ndo_validate_addr = eth_validate_addr,
- .ndo_set_mac_address = virtnet_set_mac_address,
- .ndo_set_rx_mode = virtnet_set_rx_mode,
- .ndo_change_mtu = virtnet_change_mtu,
+ .ndo_open = virtnet_open,
+ .ndo_stop = virtnet_close,
+ .ndo_start_xmit = start_xmit,
+ .ndo_validate_addr = eth_validate_addr,
+ .ndo_set_mac_address = virtnet_set_mac_address,
+ .ndo_set_rx_mode = virtnet_set_rx_mode,
+ .ndo_change_mtu = virtnet_change_mtu,
+ .ndo_vlan_rx_add_vid = virnet_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = virnet_vlan_rx_kill_vid,
#ifdef CONFIG_NET_POLL_CONTROLLER
- .ndo_poll_controller = virtnet_netpoll,
+ .ndo_poll_controller = virtnet_netpoll,
#endif
};
@@ -864,6 +888,7 @@ static int virtnet_probe(struct virtio_device *vdev)
vi->cvq = NULL;
else {
unsigned int entries;
+ u8 vlan_filter = 1;
/*
* We use a separate stack variable here because the
@@ -879,6 +904,14 @@ static int virtnet_probe(struct virtio_device *vdev)
"MAC filter table allocation failed.\n");
mac_entries = 0;
}
+
+ /* Enable VLAN filtering if supported by the backend */
+ if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
+ VIRTIO_NET_CTRL_VLAN_ENABLE,
+ &vlan_filter, sizeof(vlan_filter))) {
+ printk(KERN_DEBUG "virtio_net: VLAN filter enabled\n");
+ dev->features |= NETIF_F_HW_VLAN_FILTER;
+ }
}
/* Initialize our empty receive and send queues. */
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 84086a6..1d7171c 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -99,4 +99,19 @@ typedef __u8 virtio_net_ctrl_ack;
#define VIRTIO_NET_CTRL_MAC_TABLE_ALLOC 0
#define VIRTIO_NET_CTRL_MAC_TABLE_SET 1
+/*
+ * Control VLAN filtering
+ *
+ * The VLAN filter table is controlled via a simple ADD/DEL interface.
+ * VLAN IDs not added will be dropped. Del is the opposite of add.
+ * Both commands expect an out entry containing a 2 byte VLAN ID.
+ * The ENABLE command expects an out entry containing a single byte,
+ * zero to disable, non-zero to enable. The default state is disabled
+ * for compatibility.
+ */
+#define VIRTIO_NET_CTRL_VLAN 2
+ #define VIRTIO_NET_CTRL_VLAN_ENABLE 0
+ #define VIRTIO_NET_CTRL_VLAN_ADD 1
+ #define VIRTIO_NET_CTRL_VLAN_DEL 2
+
#endif /* _LINUX_VIRTIO_NET_H */
^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor
2009-01-20 16:36 ` Alex Williamson
@ 2009-01-20 16:44 ` Mark McLoughlin
2009-01-26 2:08 ` David Miller
[not found] ` <200901271422.33369.rusty@rustcorp.com.au>
1 sibling, 1 reply; 30+ messages in thread
From: Mark McLoughlin @ 2009-01-20 16:44 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, rusty, kvm
On Tue, 2009-01-20 at 09:36 -0700, Alex Williamson wrote:
> VLAN filtering allows the hypervisor to drop packets from VLANs
> that we're not a part of, further reducing the number of extraneous
> packets recieved. This makes use of the VLAN virtqueue command class.
> The ENABLE command is used both to activate the filter and verify the
> existence of the functionality on the backend.
>
> Signed-off-by: Alex Williamson <alex.williamson@hp.com>
Cool, looks good.
Acked-by: Mark McLoughlin <markmc@redhat.com>
Cheers,
Mark.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor
2009-01-20 16:44 ` Mark McLoughlin
@ 2009-01-26 2:08 ` David Miller
2009-01-26 17:42 ` Alex Williamson
0 siblings, 1 reply; 30+ messages in thread
From: David Miller @ 2009-01-26 2:08 UTC (permalink / raw)
To: markmc; +Cc: alex.williamson, netdev, rusty, kvm
From: Mark McLoughlin <markmc@redhat.com>
Date: Tue, 20 Jan 2009 16:44:00 +0000
> On Tue, 2009-01-20 at 09:36 -0700, Alex Williamson wrote:
> > VLAN filtering allows the hypervisor to drop packets from VLANs
> > that we're not a part of, further reducing the number of extraneous
> > packets recieved. This makes use of the VLAN virtqueue command class.
> > The ENABLE command is used both to activate the filter and verify the
> > existence of the functionality on the backend.
> >
> > Signed-off-by: Alex Williamson <alex.williamson@hp.com>
>
> Cool, looks good.
>
> Acked-by: Mark McLoughlin <markmc@redhat.com>
I'd like to make some forward progress on this patch set.
Either apply something, or defer to Rusty.
If you think I should apply it to net-next-2.6, please just
resubmit the serious with accumulated ACK'd etc.
Else I'll let these flow Rusty's way.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor
2009-01-26 2:08 ` David Miller
@ 2009-01-26 17:42 ` Alex Williamson
0 siblings, 0 replies; 30+ messages in thread
From: Alex Williamson @ 2009-01-26 17:42 UTC (permalink / raw)
To: David Miller; +Cc: markmc, netdev, rusty, kvm
On Sun, 2009-01-25 at 18:08 -0800, David Miller wrote:
> From: Mark McLoughlin <markmc@redhat.com>
> Date: Tue, 20 Jan 2009 16:44:00 +0000
>
> > On Tue, 2009-01-20 at 09:36 -0700, Alex Williamson wrote:
> > > VLAN filtering allows the hypervisor to drop packets from VLANs
> > > that we're not a part of, further reducing the number of extraneous
> > > packets recieved. This makes use of the VLAN virtqueue command class.
> > > The ENABLE command is used both to activate the filter and verify the
> > > existence of the functionality on the backend.
> > >
> > > Signed-off-by: Alex Williamson <alex.williamson@hp.com>
> >
> > Cool, looks good.
> >
> > Acked-by: Mark McLoughlin <markmc@redhat.com>
>
> I'd like to make some forward progress on this patch set.
> Either apply something, or defer to Rusty.
>
> If you think I should apply it to net-next-2.6, please just
> resubmit the serious with accumulated ACK'd etc.
>
> Else I'll let these flow Rusty's way.
I'd still like to see if Rusty has any comments now that he's hopefully
catching up after LCA. I'll resend them in a couple days if not.
Thanks,
Alex
--
Alex Williamson HP Open Source & Linux Org.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
[not found] ` <200901271300.30330.rusty@rustcorp.com.au>
@ 2009-01-27 3:38 ` Alex Williamson
2009-01-28 10:45 ` Rusty Russell
0 siblings, 1 reply; 30+ messages in thread
From: Alex Williamson @ 2009-01-27 3:38 UTC (permalink / raw)
To: Rusty Russell; +Cc: netdev, markmc, kvm
Hi Rusty,
On Tue, 2009-01-27 at 13:00 +1030, Rusty Russell wrote:
> On Saturday 17 January 2009 07:43:34 Alex Williamson wrote:
> > As with most real hardware, unicast addresses have priority in
>
> > the filter table so we can avoid enabling full promiscuous
>
> > until both unicast and multicast address overflow.
>
> Why not pretend to have infinite, and have the host turn promisc on
> when *it*
>
> decides? Skip the alloc call, and just use a feature bit like
> everything else?
I suppose it's just a matter of where do you want to add the smarts and
the tune-ability. Since we can't actually have an infinite table and an
array implementation seems to make sense from an efficiency standpoint,
it needs to be defined by someone to be a fixed size before we start
using it. I was hoping the guest driver might have a better idea how it
plans to use the filter table and that there'd be some benefit to having
that handshake happen between the driver and the backend. The module
parameter fell out of this and seems rather convenient.
I could pursue this is you like, but I'm not sure of the benefit,
particularly if we want to give the user some control of the size of the
actual table. Thoughts? Thanks for the comments,
Alex
--
Alex Williamson HP Open Source & Linux Org.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands
[not found] ` <200901271352.57887.rusty@rustcorp.com.au>
@ 2009-01-27 4:00 ` Alex Williamson
2009-01-28 13:05 ` Rusty Russell
0 siblings, 1 reply; 30+ messages in thread
From: Alex Williamson @ 2009-01-27 4:00 UTC (permalink / raw)
To: Rusty Russell; +Cc: netdev, markmc, kvm
Hi Rusty,
On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote:
> On Saturday 17 January 2009 07:43:23 Alex Williamson wrote:
> > + return status ? -EFAULT : 0;
>
> This is wrong. Currently this can't happen, right? But you put it in
> so someone in future may want some kind of return from the commands.
> In which case, they'll want to see the value.
>
> If we're sure they never want to see the value, then we don't need to
> be synchronous at all: just spin if add_buf() fails.
In my qemu series of patches it can happen if the command isn't properly
defined, something bad happens, or the command is unrecognized. As I
was hashing this out, I first had more errnos, but I wasn't sure how
extensively to fill out the error returns, and eventually defaulted to
ok/fail. Should I expand on these some? Suggestions for a reasonable
small yet complete set? Should we use Linux errno values and let other
OS virtio-net implementations create a switch table?
I would like to keep this interface synchronous, particularly I'm
wondering if there's anything we might want to do for ethtool like
statistics. In that case, the backend might fill a buffer of data along
with returning a status code. I could imagine other similar uses as
well.
>
> > +struct virtio_net_ctrl_hdr {
>
> > + __u8 class;
>
> > + __u8 cmd;
>
> > +};
>
> This would need to be __attribute__((packed)). On ARM, that struct
> would be 4 bytes long.
Thanks, I'll fix that.
> > +
>
> > +typedef __u8 virtio_net_ctrl_ack;
>
> > +
>
> > +#define VIRTIO_NET_OK 0
>
> > +#define VIRTIO_NET_ERR 1
>
> Hmm, we define it and don't use it. And we never expect it to actually
> error (did your qemu implementation ever actually return non-zero?).
Yup, good point. These are mainly here to stay in sync with the qemu
backend, which does make use of them. Should I remove them here, or
should we make a more worthwhile set of return values? I have tried
manually returning non-zero status from qemu to verify it's reflected in
the response. Thanks for the comments,
Alex
--
Alex Williamson HP Open Source & Linux Org.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor
[not found] ` <200901271422.33369.rusty@rustcorp.com.au>
@ 2009-01-27 4:19 ` Alex Williamson
0 siblings, 0 replies; 30+ messages in thread
From: Alex Williamson @ 2009-01-27 4:19 UTC (permalink / raw)
To: Rusty Russell; +Cc: netdev, markmc, kvm
Hi Rusty,
On Tue, 2009-01-27 at 14:22 +1030, Rusty Russell wrote:
> > + struct virtnet_info *vi = netdev_priv(dev);
> > + u16 id = vid;
> > +
> > + if (virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> > + VIRTIO_NET_CTRL_VLAN_ADD, &id, sizeof(id)))
> > + printk(KERN_WARNING "%s: Failed to add VLAN ID %d.\n",
> > + dev->name, id);
> > +}
>
> The temporary var seems strange here, and in kill_vid.
It does stand out a bit. I can't think of a valid reason for why I had
this, so I'll drop it.
> > - .ndo_poll_controller = virtnet_netpoll,
> > + .ndo_poll_controller = virtnet_netpoll,
>
> Pretty lining up is a pet peeve of mine. Just indent them one more and
> be happy :)
Noted ;^)
>
> > + /* Enable VLAN filtering if supported by the backend */
> > + if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_VLAN,
> > + VIRTIO_NET_CTRL_VLAN_ENABLE,
> > + &vlan_filter, sizeof(vlan_filter))) {
> > + printk(KERN_DEBUG "virtio_net: VLAN filter enabled\n");
> > + dev->features |= NETIF_F_HW_VLAN_FILTER;
> > + }
>
> OK, several things here. Should probably be a feature bit. What is the
> vlan_filter arg for? Finally, using ! for the "success" case is YA API
> nastiness for me; if we were leaving virtnet_send_command as is I'd
> make it return bool, if it really returned a sensible -errno I'd
> compare against 0 here.
The VLAN_ENABLE command can either enable or disable the VLAN filter, so
the vlan_filter arg is setting enable to "on". A feature bit might be
more efficient, but since this happens at device init and the enable
will fail if it's not supported by the backend, it seemed redundant.
I agree the success/failure logic would be more clear w/o the !, I'll
switch this to a == 0, or something better depending on what we decide
to do with send_command. Thanks for the comments,
Alex
--
Alex Williamson HP Open Source & Linux Org.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-27 3:38 ` Alex Williamson
@ 2009-01-28 10:45 ` Rusty Russell
2009-01-28 17:48 ` Alex Williamson
0 siblings, 1 reply; 30+ messages in thread
From: Rusty Russell @ 2009-01-28 10:45 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, markmc, kvm
On Tuesday 27 January 2009 14:08:02 Alex Williamson wrote:
> Hi Rusty,
>
> On Tue, 2009-01-27 at 13:00 +1030, Rusty Russell wrote:
> > On Saturday 17 January 2009 07:43:34 Alex Williamson wrote:
> > > As with most real hardware, unicast addresses have priority in
> >
> > > the filter table so we can avoid enabling full promiscuous
> >
> > > until both unicast and multicast address overflow.
> >
> > Why not pretend to have infinite, and have the host turn promisc on
> > when *it*
> >
> > decides? Skip the alloc call, and just use a feature bit like
> > everything else?
>
> I suppose it's just a matter of where do you want to add the smarts and
> the tune-ability. Since we can't actually have an infinite table and an
> array implementation seems to make sense from an efficiency standpoint,
> it needs to be defined by someone to be a fixed size before we start
> using it. I was hoping the guest driver might have a better idea how it
> plans to use the filter table and that there'd be some benefit to having
> that handshake happen between the driver and the backend. The module
> parameter fell out of this and seems rather convenient.
>
> I could pursue this is you like, but I'm not sure of the benefit,
> particularly if we want to give the user some control of the size of the
> actual table. Thoughts? Thanks for the comments,
I don't think the either-or case is real. Say the user decides they want a table of 1000000 entries. And the backend says "no way, I have a 16 array"? Currently you get nothing.
I guess your qemu code does dynamic allocation. But I'm sure you put a limit in there, right? :)
We don't want some complex negotiation, and I don't think the guest has any more clue than the host, nor can do much about it.
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands
2009-01-27 4:00 ` Alex Williamson
@ 2009-01-28 13:05 ` Rusty Russell
2009-01-28 19:02 ` Alex Williamson
0 siblings, 1 reply; 30+ messages in thread
From: Rusty Russell @ 2009-01-28 13:05 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, markmc, kvm
On Tuesday 27 January 2009 14:30:06 Alex Williamson wrote:
> Hi Rusty,
>
> On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote:
> > On Saturday 17 January 2009 07:43:23 Alex Williamson wrote:
> > > + return status ? -EFAULT : 0;
> >
> > This is wrong. Currently this can't happen, right? But you put it in
> > so someone in future may want some kind of return from the commands.
> > In which case, they'll want to see the value.
> >
> > If we're sure they never want to see the value, then we don't need to
> > be synchronous at all: just spin if add_buf() fails.
>
> In my qemu series of patches it can happen if the command isn't properly
> defined
ie. guest bug.
> , something bad happens
??? You're going to tell me to read the code, aren't you? :)
> , or the command is unrecognized.
If we go for feature bits, this is also a guest bug. And I think we should, since that's what feature bits are for.
> As I
> was hashing this out, I first had more errnos, but I wasn't sure how
> extensively to fill out the error returns, and eventually defaulted to
> ok/fail. Should I expand on these some? Suggestions for a reasonable
> small yet complete set? Should we use Linux errno values and let other
> OS virtio-net implementations create a switch table?
Not error codes. In future we may have a command where return codes are meaningful, but I'm pretty sure they'll be command specific so we don't know how to define them now anyway. AFAICT that's the only real justification for defining this interface as sync.
So I think just defining 0 on success is fine. Defining that to always happen to well-formed commands is even better :)
Thanks,
Rusty.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-28 10:45 ` Rusty Russell
@ 2009-01-28 17:48 ` Alex Williamson
2009-01-28 23:55 ` Rusty Russell
0 siblings, 1 reply; 30+ messages in thread
From: Alex Williamson @ 2009-01-28 17:48 UTC (permalink / raw)
To: Rusty Russell; +Cc: netdev, markmc, kvm
Hi Rusty,
On Wed, 2009-01-28 at 21:15 +1030, Rusty Russell wrote:
> On Tuesday 27 January 2009 14:08:02 Alex Williamson wrote:
> > Hi Rusty,
> >
> > On Tue, 2009-01-27 at 13:00 +1030, Rusty Russell wrote:
> > > On Saturday 17 January 2009 07:43:34 Alex Williamson wrote:
> > > > As with most real hardware, unicast addresses have priority in
> > >
> > > > the filter table so we can avoid enabling full promiscuous
> > >
> > > > until both unicast and multicast address overflow.
> > >
> > > Why not pretend to have infinite, and have the host turn promisc on
> > > when *it*
> > >
> > > decides? Skip the alloc call, and just use a feature bit like
> > > everything else?
> >
> > I suppose it's just a matter of where do you want to add the smarts and
> > the tune-ability. Since we can't actually have an infinite table and an
> > array implementation seems to make sense from an efficiency standpoint,
> > it needs to be defined by someone to be a fixed size before we start
> > using it. I was hoping the guest driver might have a better idea how it
> > plans to use the filter table and that there'd be some benefit to having
> > that handshake happen between the driver and the backend. The module
> > parameter fell out of this and seems rather convenient.
> >
> > I could pursue this is you like, but I'm not sure of the benefit,
> > particularly if we want to give the user some control of the size of the
> > actual table. Thoughts? Thanks for the comments,
>
> I don't think the either-or case is real. Say the user decides they
> want a table of 1000000 entries. And the backend says "no way, I have
> a 16 array"? Currently you get nothing.
Ok, I think I see what you're getting at. Linux can handle that, but
other OSes may not...
> I guess your qemu code does dynamic allocation. But I'm sure you put
> a limit in there, right? :)
Right, it allocates the table only when the alloc call is made. The
limit is currently what will fit in a host page size, which is perhaps a
little arbitrary, but I had some FUD about multi-page sg lists. The
resulting 682 entry (on 4k) limit seemed sufficiently large.
> We don't want some complex negotiation, and I don't think the guest
> has any more clue than the host, nor can do much about it.
For a typical Linux guest, I agree, and I would guess that most users
won't know or care about the module option to specify the size, nor
should qemu ever balk at allocating the default size Linux requests.
However, I also know of a more embedded, proprietary OS that lives in a
fairly rigid network environment and does have a specific number of
entries they're looking for. Other OSes might have different numbers.
Here's what I believe to be the parameters around which I've designed
the current interface:
A. Receiving unwanted packets is a waste of resources.
B. Proprietary OSes may have dependencies on hardware filtering and
may not be able to handle unwanted packets.
C. Different guests may want different filter table sizes.
D. The guest can make better decisions than the host about what
packets are unwanted.
>From this, I derive that the guest needs to know the size of the filter
table, needs to be able to specify the size of the filter table, and is
in the best position to program the filter table. A pseudo-infinite
table violates the condition of not sending the guest unwanted packets.
We could also go down the path of deciding what a "big enough" table is,
and making it part of the ABI to avoid the alloc, but then we may have
to revise the ABI if we underestimate (whereas the current limit is an
implementation detail).
Weaving in your comments from other parts of this series, would it help
at all to distinguish EINVAL from ENOMEM for the alloc call? Maybe the
caller could make some kind of intelligent decision based on that (ex.
unimplemented feature vs try something smaller). Thanks for the
comments,
Alex
--
Alex Williamson HP Open Source & Linux Org.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands
2009-01-28 13:05 ` Rusty Russell
@ 2009-01-28 19:02 ` Alex Williamson
2009-01-29 1:35 ` Rusty Russell
0 siblings, 1 reply; 30+ messages in thread
From: Alex Williamson @ 2009-01-28 19:02 UTC (permalink / raw)
To: Rusty Russell; +Cc: netdev, markmc, kvm
On Wed, 2009-01-28 at 23:35 +1030, Rusty Russell wrote:
> On Tuesday 27 January 2009 14:30:06 Alex Williamson wrote:
> > Hi Rusty,
> >
> > On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote:
> > > On Saturday 17 January 2009 07:43:23 Alex Williamson wrote:
> > > > + return status ? -EFAULT : 0;
> > >
> > > This is wrong. Currently this can't happen, right? But you put it in
> > > so someone in future may want some kind of return from the commands.
> > > In which case, they'll want to see the value.
> >
> > > If we're sure they never want to see the value, then we don't need to
> > > be synchronous at all: just spin if add_buf() fails.
> >
> > In my qemu series of patches it can happen if the command isn't properly
> > defined
>
> ie. guest bug.
>
> > , something bad happens
>
> ??? You're going to tell me to read the code, aren't you? :)
The only one of these that stands out is if the qemu_mallocz() for the
MAC filter table fails for a size we think is reasonable.
> > , or the command is unrecognized.
>
> If we go for feature bits, this is also a guest bug. And I think we
> should, since that's what feature bits are for.
One of the reasons I had avoided using a feature bit is that it's only a
32bit field, and we've already used up to bit 16 (though I'm not sure
what to do about the sparse-ness of it). The virtqueue interface I've
designed supports up to 255 command classes, each with 255 commands. We
can't add a feature bit for every one, or even much of a subset. I'd be
happy to add a feature bit indicating the virtqueue command channel is
supported, but beyond that, I think we need to fall back to enable or
initialization commands failing on various classes. We could also
define that command 0 for each class as a probe if we want to make it
more explicit. Thanks,
Alex
--
Alex Williamson HP Open Source & Linux Org.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-28 17:48 ` Alex Williamson
@ 2009-01-28 23:55 ` Rusty Russell
2009-01-29 0:34 ` Herbert Xu
0 siblings, 1 reply; 30+ messages in thread
From: Rusty Russell @ 2009-01-28 23:55 UTC (permalink / raw)
To: Alex Williamson; +Cc: netdev, markmc, kvm, Herbert Xu
On Thursday 29 January 2009 04:18:28 Alex Williamson wrote:
> Hi Rusty,
Hi Alex,
I've cc'd Herbert: he always has good thoughts about this kind of thing and I want to be sure you're getting a fair hearing.
> Here's what I believe to be the parameters around which I've designed
> the current interface:
>
> A. Receiving unwanted packets is a waste of resources.
> B. Proprietary OSes may have dependencies on hardware filtering and
> may not be able to handle unwanted packets.
> C. Different guests may want different filter table sizes.
> D. The guest can make better decisions than the host about what
> packets are unwanted.
A general OS has to handle the "unwanted packets" case if the NIC's table isn't big enough. A guest may want arbitrary filter table sizes, but you're not giving it to them anyway, so that argument is bogus too.
The host *has* to decide the max table size, the guest *doesn't*. Your implementation is the worst of both worlds. The host doesn't tell the guest what the limit is, but does fail it for exceeding it. Your guest implementation limits itself arbitrarily, but you feel better because you've invoked yet-another party (the admin of the user box) to fix it!
And there's no evidence whatsoever that the guest can do anything rational if it hits the filtering limit that the host can't do.
> >From this, I derive that the guest needs to know the size of the filter
> table, needs to be able to specify the size of the filter table, and is
> in the best position to program the filter table. A pseudo-infinite
> table violates the condition of not sending the guest unwanted packets.
I'm not aware of any OS which doesn't filter this anyway, but I'm not familiar with the Windows stack. I'd be very surprised though, since multicast filtering is often implemented as a hash on the card.
> We could also go down the path of deciding what a "big enough" table is,
> and making it part of the ABI to avoid the alloc, but then we may have
> to revise the ABI if we underestimate (whereas the current limit is an
> implementation detail).
OK, say the host bonds a NIC directly to the virtio nic for the guest. That NIC uses a hash to filter MAC addresses. In your scheme, you have numerous problems:
(1) we've defined filtering to be perfect, so we need to filter in the host,
(2) the guest will stop giving us filtering information after 16 entries, and
tell us to go into promisc mode, even though we could do better.
> Weaving in your comments from other parts of this series, would it help
> at all to distinguish EINVAL from ENOMEM for the alloc call? Maybe the
> caller could make some kind of intelligent decision based on that (ex.
> unimplemented feature vs try something smaller).
None of the currently defined calls should fail. Feature bits should guarantee the existence of features, and failure should be greeted with horror by the guest, which should then try to limp along.
So, this conversation has convinced me that the host should accept arbitrary filtering entries, and the guest should accept that it is best effort.
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-28 23:55 ` Rusty Russell
@ 2009-01-29 0:34 ` Herbert Xu
2009-01-29 6:17 ` David Stevens
0 siblings, 1 reply; 30+ messages in thread
From: Herbert Xu @ 2009-01-29 0:34 UTC (permalink / raw)
To: Rusty Russell; +Cc: Alex Williamson, netdev, markmc, kvm
On Thu, Jan 29, 2009 at 10:25:46AM +1030, Rusty Russell wrote:
>
> So, this conversation has convinced me that the host should
> accept arbitrary filtering entries, and the guest should accept
> that it is best effort.
I agree with this completely.
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands
2009-01-28 19:02 ` Alex Williamson
@ 2009-01-29 1:35 ` Rusty Russell
0 siblings, 0 replies; 30+ messages in thread
From: Rusty Russell @ 2009-01-29 1:35 UTC (permalink / raw)
To: Alex Williamson, Anthony Liguori; +Cc: netdev, markmc, kvm
On Thursday 29 January 2009 05:32:21 Alex Williamson wrote:
> On Wed, 2009-01-28 at 23:35 +1030, Rusty Russell wrote:
> > On Tuesday 27 January 2009 14:30:06 Alex Williamson wrote:
> > > On Tue, 2009-01-27 at 13:52 +1030, Rusty Russell wrote:
> > > > If we're sure they never want to see the value, then we don't need to
> > > > be synchronous at all: just spin if add_buf() fails.
...
> The only one of these that stands out is if the qemu_mallocz() for the
> MAC filter table fails for a size we think is reasonable.
Yes, that seems fair. Of course it's quite probable that the malloc will "succeed" then we'll segv later on if we're really low on mem.
> > > , or the command is unrecognized.
> >
> > If we go for feature bits, this is also a guest bug. And I think we
> > should, since that's what feature bits are for.
>
> One of the reasons I had avoided using a feature bit is that it's only a
> 32bit field, and we've already used up to bit 16 (though I'm not sure
> what to do about the sparse-ness of it). The virtqueue interface I've
> designed supports up to 255 command classes, each with 255 commands. We
> can't add a feature bit for every one, or even much of a subset.
We will need to figure out how to figure out how to expand the feature bits for virtio_pci anyway. I think the plan is to use feature bit 31 to mean "look here for more feature bits". Anthony would know more...
s390 and lguest didn't fall into this trap; we have infinite feature bits.
> I'd be
> happy to add a feature bit indicating the virtqueue command channel is
> supported
We probably should, for good measure. If someone added another virtqueue for some other purpose in future and didn't want to support the command channel it would get icky.
If we have 255 features, the problem is that we have 255 features, not that it's a lot of bits :)
Thanks,
Rusty.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-29 0:34 ` Herbert Xu
@ 2009-01-29 6:17 ` David Stevens
2009-01-30 7:03 ` Rusty Russell
0 siblings, 1 reply; 30+ messages in thread
From: David Stevens @ 2009-01-29 6:17 UTC (permalink / raw)
To: Herbert Xu
Cc: Alex Williamson, kvm, markmc, netdev, netdev-owner, Rusty Russell
I haven't been following this closely, so apologies if the point's been
made, or
if you're talking about unicast addresses here too, but just to be clear:
For multicasting, false positives are ok, false negatives are not
(non-functional),
and if the fixed-size address filter is exceeded, _multicast_promiscuous_
(but not all unicasts, so not promiscuous mode) is the "good" thing to do.
So "best effort" still shouldn't lead to an address you previously joined
not
being passed because a new one is added.
Also, if you can't keep all the MAC multicast addresses (ie,
the limit is memory and not look-up speed), then getting
out of multicast-promiscuous mode correctly isn't easy
since you don't know what groups you "forgot". You could
rebuild from the protocol memberships, if you know when
you've left enough groups to fit, but otherwise the MAC
multicast addresses you didn't keep of course won't work if you
leave multicast-promiscuous mode and the filter doesn't
have them.
So, if you're talking about not being able to fit all
the address (vs. not wanting to search that many), then
I'd suggest either staying in MP mode until ifdown, or
making the join a hard failure at the limit in the first
place.
+-DLS
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [PATCH 4/5] virtio_net: Add a MAC filter table
2009-01-29 6:17 ` David Stevens
@ 2009-01-30 7:03 ` Rusty Russell
0 siblings, 0 replies; 30+ messages in thread
From: Rusty Russell @ 2009-01-30 7:03 UTC (permalink / raw)
To: David Stevens
Cc: Herbert Xu, Alex Williamson, kvm, markmc, netdev, netdev-owner
On Thursday 29 January 2009 16:47:55 David Stevens wrote:
> Also, if you can't keep all the MAC multicast addresses (ie,
> the limit is memory and not look-up speed), then getting
> out of multicast-promiscuous mode correctly isn't easy
> since you don't know what groups you "forgot". You could
> rebuild from the protocol memberships, if you know when
> you've left enough groups to fit, but otherwise the MAC
> multicast addresses you didn't keep of course won't work if you
> leave multicast-promiscuous mode and the filter doesn't
> have them.
The command is simply "set the filter", not "add" and "delete", so the host doesn't have this problem.
VLAN has more of an issue: that has add and del. So the host can do a counting hash, or say remember 16 and then a count of overflows, but both will be suboptimal over time.
Cheers,
Rusty.
^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2009-01-30 7:03 UTC | newest]
Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-16 21:13 [PATCH 0/5] virtio_net: Add MAC and VLAN filtering Alex Williamson
2009-01-16 21:13 ` [PATCH 1/5] virtio_net: Allow setting the MAC address of the NIC Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
2009-01-16 21:13 ` [PATCH 2/5] virtio_net: Add a virtqueue for outbound control commands Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
[not found] ` <200901271352.57887.rusty@rustcorp.com.au>
2009-01-27 4:00 ` Alex Williamson
2009-01-28 13:05 ` Rusty Russell
2009-01-28 19:02 ` Alex Williamson
2009-01-29 1:35 ` Rusty Russell
2009-01-16 21:13 ` [PATCH 3/5] virtio_net: Add a set_rx_mode interface Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
2009-01-16 21:13 ` [PATCH 4/5] virtio_net: Add a MAC filter table Alex Williamson
2009-01-19 9:33 ` Mark McLoughlin
[not found] ` <200901271300.30330.rusty@rustcorp.com.au>
2009-01-27 3:38 ` Alex Williamson
2009-01-28 10:45 ` Rusty Russell
2009-01-28 17:48 ` Alex Williamson
2009-01-28 23:55 ` Rusty Russell
2009-01-29 0:34 ` Herbert Xu
2009-01-29 6:17 ` David Stevens
2009-01-30 7:03 ` Rusty Russell
2009-01-16 21:13 ` [PATCH 5/5] virtio_net: Add support for VLAN filtering in the hypervisor Alex Williamson
2009-01-19 9:32 ` Mark McLoughlin
2009-01-20 16:36 ` Alex Williamson
2009-01-20 16:44 ` Mark McLoughlin
2009-01-26 2:08 ` David Miller
2009-01-26 17:42 ` Alex Williamson
[not found] ` <200901271422.33369.rusty@rustcorp.com.au>
2009-01-27 4:19 ` Alex Williamson
2009-01-19 6:05 ` [PATCH 0/5] virtio_net: Add MAC and VLAN filtering David Miller
2009-01-19 8:30 ` Mark McLoughlin
2009-01-20 1:10 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).