* [PATCH 1/4] usbnet: Use wwan%d interface name for mobile broadband devices
From: Marcel Holtmann @ 2009-10-02 15:15 UTC
To: netdev; +Cc: David Miller, Johannes Berg, Greg KH
In-Reply-To: <cover.1254495724.git.marcel@holtmann.org>
Add support for usbnet-based devices like CDC-Ether to indicate that they
are actually mobile broadband devices. In that case, use wwan%d as the
default interface name.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
---
drivers/net/usb/cdc_ether.c | 20 ++++++++++++++------
drivers/net/usb/usbnet.c | 3 +++
include/linux/usb/usbnet.h | 1 +
3 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
index 4a6aff5..71e65fc 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -420,6 +420,14 @@ static const struct driver_info cdc_info = {
.status = cdc_status,
};
+static const struct driver_info mbm_info = {
+ .description = "Mobile Broadband Network Device",
+ .flags = FLAG_WWAN,
+ .bind = cdc_bind,
+ .unbind = usbnet_cdc_unbind,
+ .status = cdc_status,
+};
+
/*-------------------------------------------------------------------------*/
@@ -532,32 +540,32 @@ static const struct usb_device_id products [] = {
/* Ericsson F3507g */
USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x1900, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
- .driver_info = (unsigned long) &cdc_info,
+ .driver_info = (unsigned long) &mbm_info,
}, {
/* Ericsson F3507g ver. 2 */
USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x1902, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
- .driver_info = (unsigned long) &cdc_info,
+ .driver_info = (unsigned long) &mbm_info,
}, {
/* Ericsson F3607gw */
USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x1904, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
- .driver_info = (unsigned long) &cdc_info,
+ .driver_info = (unsigned long) &mbm_info,
}, {
/* Ericsson F3307 */
USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x1906, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
- .driver_info = (unsigned long) &cdc_info,
+ .driver_info = (unsigned long) &mbm_info,
}, {
/* Toshiba F3507g */
USB_DEVICE_AND_INTERFACE_INFO(0x0930, 0x130b, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
- .driver_info = (unsigned long) &cdc_info,
+ .driver_info = (unsigned long) &mbm_info,
}, {
/* Dell F3507g */
USB_DEVICE_AND_INTERFACE_INFO(0x413c, 0x8147, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
- .driver_info = (unsigned long) &cdc_info,
+ .driver_info = (unsigned long) &mbm_info,
},
{ }, // END
};
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index ca5ca5a..8124cf1 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1295,6 +1295,9 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod)
/* WLAN devices should always be named "wlan%d" */
if ((dev->driver_info->flags & FLAG_WLAN) != 0)
strcpy(net->name, "wlan%d");
+ /* WWAN devices should always be named "wwan%d" */
+ if ((dev->driver_info->flags & FLAG_WWAN) != 0)
+ strcpy(net->name, "wwan%d");
/* maybe the remote can't receive an Ethernet MTU */
if (net->mtu > (dev->hard_mtu - net->hard_header_len))
diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h
index f814730..86c31b7 100644
--- a/include/linux/usb/usbnet.h
+++ b/include/linux/usb/usbnet.h
@@ -90,6 +90,7 @@ struct driver_info {
#define FLAG_WLAN 0x0080 /* use "wlan%d" names */
#define FLAG_AVOID_UNLINK_URBS 0x0100 /* don't unlink urbs at usbnet_stop() */
#define FLAG_SEND_ZLP 0x0200 /* hw requires ZLPs are sent */
+#define FLAG_WWAN 0x0400 /* use "wwan%d" names */
/* init device ... can sleep, or cause probe() failure */
--
1.6.2.5
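
The naming logic in the usbnet_probe() hunk above can be modeled outside the
kernel. What follows is a minimal user-space sketch, not the driver code:
pick_name_template() and NAME_LEN are hypothetical names, and the "usb%d"
starting value is an assumption made for illustration. Only the two flag
values and the order of the two checks are taken from the patch.

```c
#include <string.h>

/* Flag values copied from the usbnet.h hunk in this patch. */
#define FLAG_WLAN 0x0080
#define FLAG_WWAN 0x0400

#define NAME_LEN 16 /* hypothetical buffer size for this sketch */

/* Choose an interface name template in the same order as the probe hunk. */
static void pick_name_template(unsigned int flags, char name[NAME_LEN])
{
	strcpy(name, "usb%d");          /* assumed starting template */

	if ((flags & FLAG_WLAN) != 0)
		strcpy(name, "wlan%d");
	if ((flags & FLAG_WWAN) != 0)
		strcpy(name, "wwan%d"); /* tested last, so it takes precedence */
}
```

Because the FLAG_WWAN test runs after the FLAG_WLAN test, a driver_info that
set both flags would end up with "wwan%d". The "%d" is filled in later, when
the device is registered and a free index is chosen.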
* [PATCH 2/4] usbnet: Set device type for wlan and wwan devices
From: Marcel Holtmann @ 2009-10-02 15:15 UTC
To: netdev; +Cc: David Miller, Johannes Berg, Greg KH
In-Reply-To: <cover.1254495724.git.marcel@holtmann.org>
For usbnet devices with FLAG_WLAN or FLAG_WWAN, set the proper device
type so that the uevent contains the correct value. This allows easy
identification of the actual underlying technology of the Ethernet device.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
---
drivers/net/usb/usbnet.c | 14 ++++++++++++++
1 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 8124cf1..378da8c 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1210,6 +1210,14 @@ static const struct net_device_ops usbnet_netdev_ops = {
// precondition: never called in_interrupt
+static struct device_type wlan_type = {
+ .name = "wlan",
+};
+
+static struct device_type wwan_type = {
+ .name = "wwan",
+};
+
int
usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod)
{
@@ -1325,6 +1333,12 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod)
dev->maxpacket = usb_maxpacket (dev->udev, dev->out, 1);
SET_NETDEV_DEV(net, &udev->dev);
+
+ if ((dev->driver_info->flags & FLAG_WLAN) != 0)
+ SET_NETDEV_DEVTYPE(net, &wlan_type);
+ if ((dev->driver_info->flags & FLAG_WWAN) != 0)
+ SET_NETDEV_DEVTYPE(net, &wwan_type);
+
status = register_netdev (net);
if (status)
goto out3;
--
1.6.2.5
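
To make the effect on the uevent concrete, here is a small user-space sketch
of the selection added before register_netdev(). It is an illustration under
assumptions: struct device_type is reduced to its name field, and
pick_devtype() and format_devtype() are hypothetical helpers. The
"DEVTYPE=<name>" form is how the driver core is understood to report a named
device type in a uevent.

```c
#include <stddef.h>
#include <stdio.h>

/* Flag values as defined by patch 1/4. */
#define FLAG_WLAN 0x0080
#define FLAG_WWAN 0x0400

/* Reduced stand-in for the kernel's struct device_type. */
struct device_type {
	const char *name;
};

static const struct device_type wlan_type = { .name = "wlan" };
static const struct device_type wwan_type = { .name = "wwan" };

/* Mirror the two flag checks from the hunk; NULL means "no type set". */
static const struct device_type *pick_devtype(unsigned int flags)
{
	const struct device_type *type = NULL;

	if ((flags & FLAG_WLAN) != 0)
		type = &wlan_type;
	if ((flags & FLAG_WWAN) != 0)
		type = &wwan_type;
	return type;
}

/* Write the uevent variable into buf; returns 0 when there is no type. */
static int format_devtype(const struct device_type *type,
			  char *buf, size_t len)
{
	if (!type || !type->name)
		return 0;
	return snprintf(buf, len, "DEVTYPE=%s", type->name);
}
```

A userspace consumer such as a udev rule can then match on the DEVTYPE value
instead of guessing the technology from the interface name.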
* [PATCH 3/4] net: introduce NETDEV_POST_INIT notifier
From: Marcel Holtmann @ 2009-10-02 15:15 UTC
To: netdev; +Cc: David Miller, Johannes Berg, Greg KH
In-Reply-To: <cover.1254495724.git.marcel@holtmann.org>
From: Johannes Berg <johannes@sipsolutions.net>
For various purposes, including a wireless extensions
bugfix, we need to hook into the netdev creation
before netdev_register_kobject(). This will also ease
doing the dev type assignment that Marcel was working
on for cfg80211 drivers without touching them all.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
---
include/linux/notifier.h | 1 +
net/core/dev.c | 6 ++++++
2 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index 44428d2..29714b8 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -201,6 +201,7 @@ static inline int notifier_to_errno(int ret)
#define NETDEV_PRE_UP 0x000D
#define NETDEV_BONDING_OLDTYPE 0x000E
#define NETDEV_BONDING_NEWTYPE 0x000F
+#define NETDEV_POST_INIT 0x0010
#define SYS_DOWN 0x0001 /* Notify of system down */
#define SYS_RESTART SYS_DOWN
diff --git a/net/core/dev.c b/net/core/dev.c
index b8f74cf..a74c8fd 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4836,6 +4836,12 @@ int register_netdevice(struct net_device *dev)
dev->features |= NETIF_F_GSO;
netdev_initialize_kobject(dev);
+
+ ret = call_netdevice_notifiers(NETDEV_POST_INIT, dev);
+ ret = notifier_to_errno(ret);
+ if (ret)
+ goto err_uninit;
+
ret = netdev_register_kobject(dev);
if (ret)
goto err_uninit;
--
1.6.2.5
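
The register_netdevice() hunk lets any notifier veto registration before the
kobject is created. The sketch below models that control flow in user space.
It is not kernel code: struct netdev_model, call_chain() and register_model()
are hypothetical, and the two errno-encoding helpers are written from memory
of how include/linux/notifier.h encoded errors at the time, so treat their
exact form as an assumption.

```c
#include <errno.h>
#include <stddef.h>

#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NETDEV_POST_INIT 0x0010 /* value from the notifier.h hunk */

/* Hypothetical stand-in for struct net_device. */
struct netdev_model {
	int kobject_registered;
};

typedef int (*notifier_fn)(unsigned long event, struct netdev_model *dev);

/* Encode a negative errno into a notifier return value (assumed form). */
static int notifier_from_errno(int err)
{
	return NOTIFY_STOP_MASK | (NOTIFY_OK - err);
}

/* Decode it again; plain NOTIFY_DONE/NOTIFY_OK decode to 0. */
static int notifier_to_errno(int ret)
{
	ret &= ~NOTIFY_STOP_MASK;
	return ret > NOTIFY_OK ? NOTIFY_OK - ret : 0;
}

/* Call each callback in order until one sets the stop bit. */
static int call_chain(const notifier_fn *chain, size_t n,
		      unsigned long event, struct netdev_model *dev)
{
	int ret = NOTIFY_DONE;
	size_t i;

	for (i = 0; i < n; i++) {
		ret = chain[i](event, dev);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Same shape as the hunk: notify first, register the kobject only on success. */
static int register_model(const notifier_fn *chain, size_t n,
			  struct netdev_model *dev)
{
	int ret = call_chain(chain, n, NETDEV_POST_INIT, dev);

	ret = notifier_to_errno(ret);
	if (ret)
		return ret;     /* corresponds to "goto err_uninit" */

	dev->kobject_registered = 1;
	return 0;
}

/* Two example callbacks: one accepts, one vetoes with -EBUSY. */
static int accept_cb(unsigned long event, struct netdev_model *dev)
{
	(void)event;
	(void)dev;
	return NOTIFY_OK;
}

static int veto_cb(unsigned long event, struct netdev_model *dev)
{
	(void)event;
	(void)dev;
	return notifier_from_errno(-EBUSY);
}
```

With only accept_cb in the chain, register_model() reaches the kobject step.
Adding veto_cb makes it return -EBUSY and leaves the device unregistered,
which is the property a POST_INIT hook relies on.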
* [PATCH 4/4] cfg80211: assign device type in netdev notifier callback
From: Marcel Holtmann @ 2009-10-02 15:15 UTC
To: netdev; +Cc: David Miller, Johannes Berg, Greg KH
In-Reply-To: <cover.1254495724.git.marcel@holtmann.org>
Instead of having to modify every non-mac80211 driver for device type
assignment, do this inside the netdev notifier callback of cfg80211. That
way, all drivers that integrate with cfg80211 will export a proper device
type.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
---
net/mac80211/iface.c | 5 -----
net/wireless/core.c | 7 +++++++
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b8295cb..f6005ad 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -754,10 +754,6 @@ int ieee80211_if_change_type(struct ieee80211_sub_if_data *sdata,
return 0;
}
-static struct device_type wiphy_type = {
- .name = "wlan",
-};
-
int ieee80211_if_add(struct ieee80211_local *local, const char *name,
struct net_device **new_dev, enum nl80211_iftype type,
struct vif_params *params)
@@ -789,7 +785,6 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name,
memcpy(ndev->dev_addr, local->hw.wiphy->perm_addr, ETH_ALEN);
SET_NETDEV_DEV(ndev, wiphy_dev(local->hw.wiphy));
- SET_NETDEV_DEVTYPE(ndev, &wiphy_type);
/* don't use IEEE80211_DEV_TO_SUB_IF because it checks too much */
sdata = netdev_priv(ndev);
diff --git a/net/wireless/core.c b/net/wireless/core.c
index 45b2be3..e6f02e9 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -625,6 +625,10 @@ static void wdev_cleanup_work(struct work_struct *work)
dev_put(wdev->netdev);
}
+static struct device_type wiphy_type = {
+ .name = "wlan",
+};
+
static int cfg80211_netdev_notifier_call(struct notifier_block * nb,
unsigned long state,
void *ndev)
@@ -641,6 +645,9 @@ static int cfg80211_netdev_notifier_call(struct notifier_block * nb,
WARN_ON(wdev->iftype == NL80211_IFTYPE_UNSPECIFIED);
switch (state) {
+ case NETDEV_POST_INIT:
+ SET_NETDEV_DEVTYPE(dev, &wiphy_type);
+ break;
case NETDEV_REGISTER:
/*
* NB: cannot take rdev->mtx here because this may be
--
1.6.2.5
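
The idea of assigning the type centrally, rather than in each driver, can be
sketched in user space as follows. This is a simplified model, not the
cfg80211 code: struct netdev_model, its wireless_ptr field, and
wireless_notifier_call() are hypothetical, and the early return for devices
without wireless state is an assumption about how the real callback filters
devices.

```c
#include <stddef.h>

#define NOTIFY_DONE      0x0000
#define NETDEV_REGISTER  0x0005 /* model value for this sketch */
#define NETDEV_POST_INIT 0x0010 /* value introduced by patch 3/4 */

/* Reduced stand-in for the kernel's struct device_type. */
struct device_type {
	const char *name;
};

/* Hypothetical stand-in for struct net_device. */
struct netdev_model {
	void *wireless_ptr;             /* non-NULL only for wireless devices */
	const struct device_type *type;
};

static const struct device_type wiphy_type = { .name = "wlan" };

/* One callback serves every driver that attaches wireless state. */
static int wireless_notifier_call(unsigned long state,
				  struct netdev_model *dev)
{
	if (!dev->wireless_ptr)
		return NOTIFY_DONE;     /* not ours: leave the device alone */

	switch (state) {
	case NETDEV_POST_INIT:
		dev->type = &wiphy_type; /* set before the kobject exists */
		break;
	case NETDEV_REGISTER:
		/* the real callback does further setup here */
		break;
	default:
		break;
	}
	return NOTIFY_DONE;
}
```

Since the type is set at POST_INIT, it is already in place when the kobject
is registered, so the first uevent for the device can carry it.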
* [PATCH v3] net: Add vbus_enet driver
From: Gregory Haskins @ 2009-10-02 15:33 UTC
To: netdev; +Cc: linux-kernel, alacrityvm-devel
In-Reply-To: <20090804010915.17855.2660.stgit@dev.haskins.net>
A virtualized 802.x network device based on the VBUS interface. It can be
used with any hypervisor/kernel that supports the virtual-ethernet/vbus
protocol.
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: David S. Miller <davem@davemloft.net>
[ added several new features since last review:
pre-mapped-transmit descriptors,
event-queue,
link-state event,
tx-complete event
]
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
---
MAINTAINERS | 7
drivers/net/Kconfig | 14 +
drivers/net/Makefile | 1
drivers/net/vbus-enet.c | 1203 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/Kbuild | 1
include/linux/venet.h | 123 +++++
6 files changed, 1349 insertions(+), 0 deletions(-)
create mode 100644 drivers/net/vbus-enet.c
create mode 100644 include/linux/venet.h
diff --git a/MAINTAINERS b/MAINTAINERS
index b484756..ade37b5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5456,6 +5456,13 @@ S: Maintained
F: include/linux/vbus*
F: drivers/vbus/*
+VBUS ETHERNET DRIVER
+M: Gregory Haskins <ghaskins@novell.com>
+S: Maintained
+W: http://developer.novell.com/wiki/index.php/AlacrityVM
+F: include/linux/venet.h
+F: drivers/net/vbus-enet.c
+
VFAT/FAT/MSDOS FILESYSTEM
M: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
S: Maintained
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 5ce7cba..722f892 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -3211,4 +3211,18 @@ config VIRTIO_NET
This is the virtual network driver for virtio. It can be used with
lguest or QEMU based VMMs (like KVM or Xen). Say Y or M.
+config VBUS_ENET
+ tristate "VBUS Ethernet Driver"
+ default n
+ select VBUS_PROXY
+ help
+ A virtualized 802.x network device based on the VBUS
+ "virtual-ethernet" interface. It can be used with any
+ hypervisor/kernel that supports the vbus+venet protocol.
+
+config VBUS_ENET_DEBUG
+ bool "Enable Debugging"
+ depends on VBUS_ENET
+ default n
+
endif # NETDEVICES
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index ead8cab..2a3c7a9 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -277,6 +277,7 @@ obj-$(CONFIG_FS_ENET) += fs_enet/
obj-$(CONFIG_NETXEN_NIC) += netxen/
obj-$(CONFIG_NIU) += niu.o
obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
+obj-$(CONFIG_VBUS_ENET) += vbus-enet.o
obj-$(CONFIG_SFC) += sfc/
obj-$(CONFIG_WIMAX) += wimax/
diff --git a/drivers/net/vbus-enet.c b/drivers/net/vbus-enet.c
new file mode 100644
index 0000000..e8a0553
--- /dev/null
+++ b/drivers/net/vbus-enet.c
@@ -0,0 +1,1203 @@
+/*
+ * vbus_enet - A virtualized 802.x network device based on the VBUS interface
+ *
+ * Copyright (C) 2009 Novell, Gregory Haskins <ghaskins@novell.com>
+ *
+ * Derived from the SNULL example from the book "Linux Device Drivers" by
+ * Alessandro Rubini, Jonathan Corbet, and Greg Kroah-Hartman, published
+ * by O'Reilly & Associates.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <linux/slab.h>
+#include <linux/errno.h>
+#include <linux/types.h>
+#include <linux/interrupt.h>
+
+#include <linux/in.h>
+#include <linux/netdevice.h>
+#include <linux/etherdevice.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+#include <linux/skbuff.h>
+#include <linux/ioq.h>
+#include <linux/vbus_driver.h>
+
+#include <linux/in6.h>
+#include <asm/checksum.h>
+
+#include <linux/venet.h>
+
+MODULE_AUTHOR("Gregory Haskins");
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("virtual-ethernet");
+MODULE_VERSION("1");
+
+static int rx_ringlen = 256;
+module_param(rx_ringlen, int, 0444);
+static int tx_ringlen = 256;
+module_param(tx_ringlen, int, 0444);
+static int sg_enabled = 1;
+module_param(sg_enabled, int, 0444);
+
+#define PDEBUG(_dev, fmt, args...) dev_dbg(&(_dev)->dev, fmt, ## args)
+
+struct vbus_enet_queue {
+ struct ioq *queue;
+ struct ioq_notifier notifier;
+ unsigned long count;
+};
+
+struct vbus_enet_priv {
+ spinlock_t lock;
+ struct net_device *dev;
+ struct vbus_device_proxy *vdev;
+ struct napi_struct napi;
+ struct vbus_enet_queue rxq;
+ struct {
+ struct vbus_enet_queue veq;
+ struct tasklet_struct task;
+ struct sk_buff_head outstanding;
+ } tx;
+ bool sg;
+ struct {
+ bool enabled;
+ char *pool;
+ } pmtd; /* pre-mapped transmit descriptors */
+ struct {
+ bool enabled;
+ bool linkstate;
+ bool txc;
+ unsigned long evsize;
+ struct vbus_enet_queue veq;
+ struct tasklet_struct task;
+ char *pool;
+ } evq;
+};
+
+static void vbus_enet_tx_reap(struct vbus_enet_priv *priv);
+
+static struct vbus_enet_priv *
+napi_to_priv(struct napi_struct *napi)
+{
+ return container_of(napi, struct vbus_enet_priv, napi);
+}
+
+static int
+queue_init(struct vbus_enet_priv *priv,
+ struct vbus_enet_queue *q,
+ int qid,
+ size_t ringsize,
+ void (*func)(struct ioq_notifier *))
+{
+ struct vbus_device_proxy *dev = priv->vdev;
+ int ret;
+
+ ret = vbus_driver_ioq_alloc(dev, qid, 0, ringsize, &q->queue);
+ if (ret < 0)
+ panic("ioq_alloc failed: %d\n", ret);
+
+ if (func) {
+ q->notifier.signal = func;
+ q->queue->notifier = &q->notifier;
+ }
+
+ q->count = ringsize;
+
+ return 0;
+}
+
+static int
+devcall(struct vbus_enet_priv *priv, u32 func, void *data, size_t len)
+{
+ struct vbus_device_proxy *dev = priv->vdev;
+
+ return dev->ops->call(dev, func, data, len, 0);
+}
+
+/*
+ * ---------------
+ * rx descriptors
+ * ---------------
+ */
+
+static void
+rxdesc_alloc(struct net_device *dev, struct ioq_ring_desc *desc, size_t len)
+{
+ struct sk_buff *skb;
+
+ len += ETH_HLEN;
+
+ skb = netdev_alloc_skb(dev, len + 2);
+ BUG_ON(!skb);
+
+ skb_reserve(skb, NET_IP_ALIGN); /* align IP on 16B boundary */
+
+ desc->cookie = (u64)skb;
+ desc->ptr = (u64)__pa(skb->data);
+ desc->len = len; /* total length */
+ desc->valid = 1;
+}
+
+static void
+rx_setup(struct vbus_enet_priv *priv)
+{
+ struct ioq *ioq = priv->rxq.queue;
+ struct ioq_iterator iter;
+ int ret;
+
+ /*
+ * We want to iterate on the "valid" index. By default the iterator
+ * will not "autoupdate" which means it will not hypercall the host
+ * with our changes. This is good, because we are really just
+ * initializing stuff here anyway. Note that you can always manually
+ * signal the host with ioq_signal() if the autoupdate feature is not
+ * used.
+ */
+ ret = ioq_iter_init(ioq, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0); /* will never fail unless seriously broken */
+
+ /*
+ * Seek to the tail of the valid index (which should be our first
+ * item, since the queue is brand-new)
+ */
+ ret = ioq_iter_seek(&iter, ioq_seek_tail, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * Now populate each descriptor with an empty SKB and mark it valid
+ */
+ while (!iter.desc->valid) {
+ rxdesc_alloc(priv->dev, iter.desc, priv->dev->mtu);
+
+ /*
+ * This push operation will simultaneously advance the
+ * valid-head index and increment our position in the queue
+ * by one.
+ */
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+}
+
+static void
+rx_teardown(struct vbus_enet_priv *priv)
+{
+ struct ioq *ioq = priv->rxq.queue;
+ struct ioq_iterator iter;
+ int ret;
+
+ ret = ioq_iter_init(ioq, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * free each valid descriptor
+ */
+ while (iter.desc->valid) {
+ struct sk_buff *skb = (struct sk_buff *)iter.desc->cookie;
+
+ iter.desc->valid = 0;
+ wmb();
+
+ iter.desc->ptr = 0;
+ iter.desc->cookie = 0;
+
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+
+ dev_kfree_skb(skb);
+ }
+}
+
+static int
+tx_setup(struct vbus_enet_priv *priv)
+{
+ struct ioq *ioq = priv->tx.veq.queue;
+ size_t iovlen = sizeof(struct venet_iov) * (MAX_SKB_FRAGS-1);
+ size_t len = sizeof(struct venet_sg) + iovlen;
+ struct ioq_iterator iter;
+ int i;
+ int ret;
+
+ if (!priv->sg)
+ /*
+ * There is nothing to do for a ring that is not using
+ * scatter-gather
+ */
+ return 0;
+
+ /* pre-allocate our descriptor pool if pmtd is enabled */
+ if (priv->pmtd.enabled) {
+ struct vbus_device_proxy *dev = priv->vdev;
+ size_t poollen = len * priv->tx.veq.count;
+ char *pool;
+ int shmid;
+
+ /* pmtdquery will return the shm-id to use for the pool */
+ ret = devcall(priv, VENET_FUNC_PMTDQUERY, NULL, 0);
+ BUG_ON(ret < 0);
+
+ shmid = ret;
+
+ pool = kzalloc(poollen, GFP_KERNEL | GFP_DMA);
+ if (!pool)
+ return -ENOMEM;
+
+ priv->pmtd.pool = pool;
+
+ ret = dev->ops->shm(dev, shmid, 0, pool, poollen, 0, NULL, 0);
+ BUG_ON(ret < 0);
+ }
+
+ ret = ioq_iter_init(ioq, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_set, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * Now populate each descriptor with an empty SG descriptor
+ */
+ for (i = 0; i < priv->tx.veq.count; i++) {
+ struct venet_sg *vsg;
+
+ if (priv->pmtd.enabled) {
+ size_t offset = (i * len);
+
+ vsg = (struct venet_sg *)&priv->pmtd.pool[offset];
+ iter.desc->ptr = (u64)offset;
+ } else {
+ vsg = kzalloc(len, GFP_KERNEL);
+ if (!vsg)
+ return -ENOMEM;
+
+ iter.desc->ptr = (u64)__pa(vsg);
+ }
+
+ iter.desc->cookie = (u64)vsg;
+ iter.desc->len = len;
+
+ ret = ioq_iter_seek(&iter, ioq_seek_next, 0, 0);
+ BUG_ON(ret < 0);
+ }
+
+ return 0;
+}
+
+static void
+tx_teardown(struct vbus_enet_priv *priv)
+{
+ struct ioq *ioq = priv->tx.veq.queue;
+ struct ioq_iterator iter;
+ struct sk_buff *skb;
+ int ret;
+
+ /* forcefully free all outstanding transmissions */
+ while ((skb = __skb_dequeue(&priv->tx.outstanding)))
+ dev_kfree_skb(skb);
+
+ if (!priv->sg)
+ /*
+ * There is nothing else to do for a ring that is not using
+ * scatter-gather
+ */
+ return;
+
+ if (priv->pmtd.enabled) {
+ /*
+ * PMTD mode means we only need to free the pool
+ */
+ kfree(priv->pmtd.pool);
+ return;
+ }
+
+ ret = ioq_iter_init(ioq, &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ /* seek to position 0 */
+ ret = ioq_iter_seek(&iter, ioq_seek_set, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * free each valid descriptor
+ */
+ while (iter.desc->cookie) {
+ struct venet_sg *vsg = (struct venet_sg *)iter.desc->cookie;
+
+ iter.desc->valid = 0;
+ wmb();
+
+ iter.desc->ptr = 0;
+ iter.desc->cookie = 0;
+
+ ret = ioq_iter_seek(&iter, ioq_seek_next, 0, 0);
+ BUG_ON(ret < 0);
+
+ kfree(vsg);
+ }
+}
+
+static void
+evq_teardown(struct vbus_enet_priv *priv)
+{
+ if (!priv->evq.enabled)
+ return;
+
+ ioq_put(priv->evq.veq.queue);
+ kfree(priv->evq.pool);
+}
+
+/*
+ * Open and close
+ */
+
+static int
+vbus_enet_open(struct net_device *dev)
+{
+ struct vbus_enet_priv *priv = netdev_priv(dev);
+ int ret;
+
+ ret = devcall(priv, VENET_FUNC_LINKUP, NULL, 0);
+ BUG_ON(ret < 0);
+
+ napi_enable(&priv->napi);
+
+ return 0;
+}
+
+static int
+vbus_enet_stop(struct net_device *dev)
+{
+ struct vbus_enet_priv *priv = netdev_priv(dev);
+ int ret;
+
+ napi_disable(&priv->napi);
+
+ ret = devcall(priv, VENET_FUNC_LINKDOWN, NULL, 0);
+ BUG_ON(ret < 0);
+
+ return 0;
+}
+
+/*
+ * Configuration changes (passed on by ifconfig)
+ */
+static int
+vbus_enet_config(struct net_device *dev, struct ifmap *map)
+{
+ if (dev->flags & IFF_UP) /* can't act on a running interface */
+ return -EBUSY;
+
+ /* Don't allow changing the I/O address */
+ if (map->base_addr != dev->base_addr) {
+ dev_warn(&dev->dev, "Can't change I/O address\n");
+ return -EOPNOTSUPP;
+ }
+
+ /* ignore other fields */
+ return 0;
+}
+
+static void
+vbus_enet_schedule_rx(struct vbus_enet_priv *priv)
+{
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ if (napi_schedule_prep(&priv->napi)) {
+ /* Disable further interrupts */
+ ioq_notify_disable(priv->rxq.queue, 0);
+ __napi_schedule(&priv->napi);
+ }
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+}
+
+static int
+vbus_enet_change_mtu(struct net_device *dev, int new_mtu)
+{
+ struct vbus_enet_priv *priv = netdev_priv(dev);
+ int ret;
+
+ dev->mtu = new_mtu;
+
+ /*
+ * FLUSHRX will cause the device to flush any outstanding
+ * RX buffers. They will appear to come in as 0 length
+ * packets which we can simply discard and replace with new_mtu
+ * buffers for the future.
+ */
+ ret = devcall(priv, VENET_FUNC_FLUSHRX, NULL, 0);
+ BUG_ON(ret < 0);
+
+ vbus_enet_schedule_rx(priv);
+
+ return 0;
+}
+
+/*
+ * The poll implementation.
+ */
+static int
+vbus_enet_poll(struct napi_struct *napi, int budget)
+{
+ struct vbus_enet_priv *priv = napi_to_priv(napi);
+ int npackets = 0;
+ struct ioq_iterator iter;
+ int ret;
+
+ PDEBUG(priv->dev, "polling...\n");
+
+ /* We want to iterate on the head of the in-use index */
+ ret = ioq_iter_init(priv->rxq.queue, &iter, ioq_idxtype_inuse,
+ IOQ_ITER_AUTOUPDATE);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * We stop if we have met the quota or there are no more packets.
+ * The EOM is indicated by finding a packet that is still owned by
+ * the south side
+ */
+ while ((npackets < budget) && (!iter.desc->sown)) {
+ struct sk_buff *skb = (struct sk_buff *)iter.desc->cookie;
+
+ if (iter.desc->len) {
+ skb_put(skb, iter.desc->len);
+
+ /* Maintain stats */
+ npackets++;
+ priv->dev->stats.rx_packets++;
+ priv->dev->stats.rx_bytes += iter.desc->len;
+
+ /* Pass the buffer up to the stack */
+ skb->dev = priv->dev;
+ skb->protocol = eth_type_trans(skb, priv->dev);
+ netif_receive_skb(skb);
+
+ mb();
+ } else
+ /*
+ * the device may send a zero-length packet when it's
+ * flushing references on the ring. We can just drop
+ * these on the floor
+ */
+ dev_kfree_skb(skb);
+
+ /* Grab a new buffer to put in the ring */
+ rxdesc_alloc(priv->dev, iter.desc, priv->dev->mtu);
+
+ /* Advance the in-use tail */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+
+ PDEBUG(priv->dev, "%d packets received\n", npackets);
+
+ /*
+ * If we processed all packets, we're done; tell the kernel and
+ * reenable ints
+ */
+ if (ioq_empty(priv->rxq.queue, ioq_idxtype_inuse)) {
+ napi_complete(napi);
+ ioq_notify_enable(priv->rxq.queue, 0);
+ ret = 0;
+ } else
+ /* We couldn't process everything. */
+ ret = 1;
+
+ return ret;
+}
+
+/*
+ * Transmit a packet (called by the kernel)
+ */
+static int
+vbus_enet_tx_start(struct sk_buff *skb, struct net_device *dev)
+{
+ struct vbus_enet_priv *priv = netdev_priv(dev);
+ struct ioq_iterator iter;
+ int ret;
+ unsigned long flags;
+
+ PDEBUG(priv->dev, "sending %d bytes\n", skb->len);
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ if (ioq_full(priv->tx.veq.queue, ioq_idxtype_valid)) {
+ /*
+ * We must flow-control the kernel by disabling the
+ * queue
+ */
+ spin_unlock_irqrestore(&priv->lock, flags);
+ netif_stop_queue(dev);
+ dev_err(&priv->dev->dev, "tx on full queue bug\n");
+ return 1;
+ }
+
+ /*
+ * We want to iterate on the tail of both the "inuse" and "valid" index
+ * so we specify the "both" index
+ */
+ ret = ioq_iter_init(priv->tx.veq.queue, &iter, ioq_idxtype_both,
+ IOQ_ITER_AUTOUPDATE);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_tail, 0, 0);
+ BUG_ON(ret < 0);
+ BUG_ON(iter.desc->sown);
+
+ if (priv->sg) {
+ struct venet_sg *vsg = (struct venet_sg *)iter.desc->cookie;
+ struct scatterlist sgl[MAX_SKB_FRAGS+1];
+ struct scatterlist *sg;
+ int count, maxcount = ARRAY_SIZE(sgl);
+
+ sg_init_table(sgl, maxcount);
+
+ memset(vsg, 0, sizeof(*vsg));
+
+ vsg->cookie = (u64)skb;
+ vsg->len = skb->len;
+
+ if (skb->ip_summed == CHECKSUM_PARTIAL) {
+ vsg->flags |= VENET_SG_FLAG_NEEDS_CSUM;
+ vsg->csum.start = skb->csum_start - skb_headroom(skb);
+ vsg->csum.offset = skb->csum_offset;
+ }
+
+ if (skb_is_gso(skb)) {
+ struct skb_shared_info *sinfo = skb_shinfo(skb);
+
+ vsg->flags |= VENET_SG_FLAG_GSO;
+
+ vsg->gso.hdrlen = skb_headlen(skb);
+ vsg->gso.size = sinfo->gso_size;
+ if (sinfo->gso_type & SKB_GSO_TCPV4)
+ vsg->gso.type = VENET_GSO_TYPE_TCPV4;
+ else if (sinfo->gso_type & SKB_GSO_TCPV6)
+ vsg->gso.type = VENET_GSO_TYPE_TCPV6;
+ else if (sinfo->gso_type & SKB_GSO_UDP)
+ vsg->gso.type = VENET_GSO_TYPE_UDP;
+ else
+ panic("Virtual-Ethernet: unknown GSO type " \
+ "0x%x\n", sinfo->gso_type);
+
+ if (sinfo->gso_type & SKB_GSO_TCP_ECN)
+ vsg->flags |= VENET_SG_FLAG_ECN;
+ }
+
+ count = skb_to_sgvec(skb, sgl, 0, skb->len);
+
+ BUG_ON(count > maxcount);
+
+ for (sg = &sgl[0]; sg; sg = sg_next(sg)) {
+ struct venet_iov *iov = &vsg->iov[vsg->count++];
+
+ iov->len = sg->length;
+ iov->ptr = (u64)sg_phys(sg);
+ }
+
+ iter.desc->len = (u64)VSG_DESC_SIZE(vsg->count);
+
+ } else {
+ /*
+ * non scatter-gather mode: simply put the skb right onto the
+ * ring.
+ */
+ iter.desc->cookie = (u64)skb;
+ iter.desc->len = (u64)skb->len;
+ iter.desc->ptr = (u64)__pa(skb->data);
+ }
+
+ iter.desc->valid = 1;
+
+ priv->dev->stats.tx_packets++;
+ priv->dev->stats.tx_bytes += skb->len;
+
+ __skb_queue_tail(&priv->tx.outstanding, skb);
+
+ /*
+ * This advances both indexes together implicitly, and then
+ * signals the south side to consume the packet
+ */
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+
+ dev->trans_start = jiffies; /* save the timestamp */
+
+ if (ioq_full(priv->tx.veq.queue, ioq_idxtype_valid)) {
+ /*
+ * If the queue is congested, we must flow-control the kernel
+ */
+ PDEBUG(priv->dev, "backpressure tx queue\n");
+ netif_stop_queue(dev);
+ }
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+
+ return 0;
+}
+
+/* assumes priv->lock held */
+static void
+vbus_enet_skb_complete(struct vbus_enet_priv *priv, struct sk_buff *skb)
+{
+ PDEBUG(priv->dev, "completed sending %d bytes\n",
+ skb->len);
+
+ __skb_unlink(skb, &priv->tx.outstanding);
+ dev_kfree_skb(skb);
+}
+
+/*
+ * reclaim any outstanding completed tx packets
+ *
+ * assumes priv->lock held
+ */
+static void
+vbus_enet_tx_reap(struct vbus_enet_priv *priv)
+{
+ struct ioq_iterator iter;
+ int ret;
+
+ /*
+ * We want to iterate on the head of the valid index, but we
+ * do not want the iter_pop (below) to flip the ownership, so
+ * we set the NOFLIPOWNER option
+ */
+ ret = ioq_iter_init(priv->tx.veq.queue, &iter, ioq_idxtype_valid,
+ IOQ_ITER_NOFLIPOWNER);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * We are done once we find the first packet either invalid or still
+ * owned by the south-side
+ */
+ while (iter.desc->valid && !iter.desc->sown) {
+
+ if (!priv->evq.txc) {
+ struct sk_buff *skb;
+
+ if (priv->sg) {
+ struct venet_sg *vsg;
+
+ vsg = (struct venet_sg *)iter.desc->cookie;
+ skb = (struct sk_buff *)vsg->cookie;
+ } else
+ skb = (struct sk_buff *)iter.desc->cookie;
+
+ /*
+ * If TXC is not enabled, we are required to free
+ * the buffer resources now
+ */
+ vbus_enet_skb_complete(priv, skb);
+ }
+
+ /* Reset the descriptor */
+ iter.desc->valid = 0;
+
+ /* Advance the valid-index head */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+
+ /*
+ * If we were previously stopped due to flow control, restart the
+ * processing
+ */
+ if (netif_queue_stopped(priv->dev)
+ && !ioq_full(priv->tx.veq.queue, ioq_idxtype_valid)) {
+ PDEBUG(priv->dev, "re-enabling tx queue\n");
+ netif_wake_queue(priv->dev);
+ }
+}
+
+static void
+vbus_enet_timeout(struct net_device *dev)
+{
+ struct vbus_enet_priv *priv = netdev_priv(dev);
+ unsigned long flags;
+
+ dev_dbg(&dev->dev, "Transmit timeout\n");
+
+ spin_lock_irqsave(&priv->lock, flags);
+ vbus_enet_tx_reap(priv);
+ spin_unlock_irqrestore(&priv->lock, flags);
+}
+
+static void
+rx_isr(struct ioq_notifier *notifier)
+{
+ struct vbus_enet_priv *priv;
+ struct net_device *dev;
+
+ priv = container_of(notifier, struct vbus_enet_priv, rxq.notifier);
+ dev = priv->dev;
+
+ if (!ioq_empty(priv->rxq.queue, ioq_idxtype_inuse))
+ vbus_enet_schedule_rx(priv);
+}
+
+static void
+deferred_tx_isr(unsigned long data)
+{
+ struct vbus_enet_priv *priv = (struct vbus_enet_priv *)data;
+ unsigned long flags;
+
+ PDEBUG(priv->dev, "deferred_tx_isr\n");
+
+ spin_lock_irqsave(&priv->lock, flags);
+ vbus_enet_tx_reap(priv);
+ spin_unlock_irqrestore(&priv->lock, flags);
+
+ ioq_notify_enable(priv->tx.veq.queue, 0);
+}
+
+static void
+tx_isr(struct ioq_notifier *notifier)
+{
+ struct vbus_enet_priv *priv;
+
+ priv = container_of(notifier, struct vbus_enet_priv, tx.veq.notifier);
+
+ PDEBUG(priv->dev, "tx_isr\n");
+
+ ioq_notify_disable(priv->tx.veq.queue, 0);
+ tasklet_schedule(&priv->tx.task);
+}
+
+static void
+evq_linkstate_event(struct vbus_enet_priv *priv,
+ struct venet_event_header *header)
+{
+ struct venet_event_linkstate *event =
+ (struct venet_event_linkstate *)header;
+
+ switch (event->state) {
+ case 0:
+ netif_carrier_off(priv->dev);
+ break;
+ case 1:
+ netif_carrier_on(priv->dev);
+ break;
+ default:
+ break;
+ }
+}
+
+static void
+evq_txc_event(struct vbus_enet_priv *priv,
+ struct venet_event_header *header)
+{
+ struct venet_event_txc *event =
+ (struct venet_event_txc *)header;
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->lock, flags);
+
+ vbus_enet_tx_reap(priv);
+ vbus_enet_skb_complete(priv, (struct sk_buff *)event->cookie);
+
+ spin_unlock_irqrestore(&priv->lock, flags);
+}
+
+static void
+deferred_evq_isr(unsigned long data)
+{
+ struct vbus_enet_priv *priv = (struct vbus_enet_priv *)data;
+ int nevents = 0;
+ struct ioq_iterator iter;
+ int ret;
+
+ PDEBUG(priv->dev, "evq: polling...\n");
+
+ /* We want to iterate on the head of the in-use index */
+ ret = ioq_iter_init(priv->evq.veq.queue, &iter, ioq_idxtype_inuse,
+ IOQ_ITER_AUTOUPDATE);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_head, 0, 0);
+ BUG_ON(ret < 0);
+
+ /*
+ * The EOM is indicated by finding a packet that is still owned by
+ * the south side
+ */
+ while (!iter.desc->sown) {
+ struct venet_event_header *header;
+
+ header = (struct venet_event_header *)iter.desc->cookie;
+
+ switch (header->id) {
+ case VENET_EVENT_LINKSTATE:
+ evq_linkstate_event(priv, header);
+ break;
+ case VENET_EVENT_TXC:
+ evq_txc_event(priv, header);
+ break;
+ default:
+ panic("venet: unexpected event id:%d of size %d\n",
+ header->id, header->size);
+ break;
+ }
+
+ memset((void *)iter.desc->cookie, 0, priv->evq.evsize);
+
+ /* Advance the in-use tail */
+ ret = ioq_iter_pop(&iter, 0);
+ BUG_ON(ret < 0);
+
+ nevents++;
+ }
+
+ PDEBUG(priv->dev, "%d events received\n", nevents);
+
+ ioq_notify_enable(priv->evq.veq.queue, 0);
+}
+
+static void
+evq_isr(struct ioq_notifier *notifier)
+{
+ struct vbus_enet_priv *priv;
+
+ priv = container_of(notifier, struct vbus_enet_priv, evq.veq.notifier);
+
+ PDEBUG(priv->dev, "evq_isr\n");
+
+ ioq_notify_disable(priv->evq.veq.queue, 0);
+ tasklet_schedule(&priv->evq.task);
+}
+
+static int
+vbus_enet_sg_negcap(struct vbus_enet_priv *priv)
+{
+ struct net_device *dev = priv->dev;
+ struct venet_capabilities caps;
+ int ret;
+
+ memset(&caps, 0, sizeof(caps));
+
+ if (sg_enabled) {
+ caps.gid = VENET_CAP_GROUP_SG;
+ caps.bits |= (VENET_CAP_SG|VENET_CAP_TSO4|VENET_CAP_TSO6
+ |VENET_CAP_ECN|VENET_CAP_PMTD);
+ /* note: exclude UFO for now due to stack bug */
+ }
+
+ ret = devcall(priv, VENET_FUNC_NEGCAP, &caps, sizeof(caps));
+ if (ret < 0)
+ return ret;
+
+ if (caps.bits & VENET_CAP_SG) {
+ priv->sg = true;
+
+ dev->features |= NETIF_F_SG|NETIF_F_HW_CSUM|NETIF_F_FRAGLIST;
+
+ if (caps.bits & VENET_CAP_TSO4)
+ dev->features |= NETIF_F_TSO;
+ if (caps.bits & VENET_CAP_UFO)
+ dev->features |= NETIF_F_UFO;
+ if (caps.bits & VENET_CAP_TSO6)
+ dev->features |= NETIF_F_TSO6;
+ if (caps.bits & VENET_CAP_ECN)
+ dev->features |= NETIF_F_TSO_ECN;
+
+ if (caps.bits & VENET_CAP_PMTD)
+ priv->pmtd.enabled = true;
+ }
+
+ return 0;
+}
+
+static int
+vbus_enet_evq_negcap(struct vbus_enet_priv *priv, unsigned long count)
+{
+ struct venet_capabilities caps;
+ int ret;
+
+ memset(&caps, 0, sizeof(caps));
+
+ caps.gid = VENET_CAP_GROUP_EVENTQ;
+ caps.bits |= VENET_CAP_EVQ_LINKSTATE;
+ caps.bits |= VENET_CAP_EVQ_TXC;
+
+ ret = devcall(priv, VENET_FUNC_NEGCAP, &caps, sizeof(caps));
+ if (ret < 0)
+ return ret;
+
+ if (caps.bits) {
+ struct vbus_device_proxy *dev = priv->vdev;
+ struct venet_eventq_query query;
+ size_t poollen;
+ struct ioq_iterator iter;
+ char *pool;
+ int i;
+
+ priv->evq.enabled = true;
+
+ if (caps.bits & VENET_CAP_EVQ_LINKSTATE) {
+ /*
+ * We will assume there is no carrier until we get
+ * an event telling us otherwise
+ */
+ netif_carrier_off(priv->dev);
+ priv->evq.linkstate = true;
+ }
+
+ if (caps.bits & VENET_CAP_EVQ_TXC)
+ priv->evq.txc = true;
+
+ memset(&query, 0, sizeof(query));
+
+ ret = devcall(priv, VENET_FUNC_EVQQUERY, &query, sizeof(query));
+ if (ret < 0)
+ return ret;
+
+ priv->evq.evsize = query.evsize;
+ poollen = query.evsize * count;
+
+ pool = kzalloc(poollen, GFP_KERNEL | GFP_DMA);
+ if (!pool)
+ return -ENOMEM;
+
+ priv->evq.pool = pool;
+
+ ret = dev->ops->shm(dev, query.dpid, 0,
+ pool, poollen, 0, NULL, 0);
+ if (ret < 0)
+ return ret;
+
+ queue_init(priv, &priv->evq.veq, query.qid, count, evq_isr);
+
+ ret = ioq_iter_init(priv->evq.veq.queue,
+ &iter, ioq_idxtype_valid, 0);
+ BUG_ON(ret < 0);
+
+ ret = ioq_iter_seek(&iter, ioq_seek_set, 0, 0);
+ BUG_ON(ret < 0);
+
+ /* Now populate each descriptor with an empty event */
+ for (i = 0; i < count; i++) {
+ size_t offset = (i * query.evsize);
+ void *addr = &priv->evq.pool[offset];
+
+ iter.desc->ptr = (u64)offset;
+ iter.desc->cookie = (u64)addr;
+ iter.desc->len = query.evsize;
+
+ ret = ioq_iter_push(&iter, 0);
+ BUG_ON(ret < 0);
+ }
+
+ /* Finally, enable interrupts */
+ tasklet_init(&priv->evq.task, deferred_evq_isr,
+ (unsigned long)priv);
+ ioq_notify_enable(priv->evq.veq.queue, 0);
+ }
+
+ return 0;
+}
+
+static int
+vbus_enet_negcap(struct vbus_enet_priv *priv)
+{
+ int ret;
+
+ ret = vbus_enet_sg_negcap(priv);
+ if (ret < 0)
+ return ret;
+
+ return vbus_enet_evq_negcap(priv, tx_ringlen);
+}
+
+static int vbus_enet_set_tx_csum(struct net_device *dev, u32 data)
+{
+ struct vbus_enet_priv *priv = netdev_priv(dev);
+
+ if (data && !priv->sg)
+ return -ENOSYS;
+
+ return ethtool_op_set_tx_hw_csum(dev, data);
+}
+
+static struct ethtool_ops vbus_enet_ethtool_ops = {
+ .set_tx_csum = vbus_enet_set_tx_csum,
+ .set_sg = ethtool_op_set_sg,
+ .set_tso = ethtool_op_set_tso,
+ .get_link = ethtool_op_get_link,
+};
+
+static const struct net_device_ops vbus_enet_netdev_ops = {
+ .ndo_open = vbus_enet_open,
+ .ndo_stop = vbus_enet_stop,
+ .ndo_set_config = vbus_enet_config,
+ .ndo_start_xmit = vbus_enet_tx_start,
+ .ndo_change_mtu = vbus_enet_change_mtu,
+ .ndo_tx_timeout = vbus_enet_timeout,
+ .ndo_set_mac_address = eth_mac_addr,
+ .ndo_validate_addr = eth_validate_addr,
+};
+
+/*
+ * This is called whenever a new vbus_device_proxy is added to the vbus
+ * with the matching VENET_ID
+ */
+static int
+vbus_enet_probe(struct vbus_device_proxy *vdev)
+{
+ struct net_device *dev;
+ struct vbus_enet_priv *priv;
+ int ret;
+
+ printk(KERN_INFO "VENET: Found new device at %lld\n", vdev->id);
+
+ ret = vdev->ops->open(vdev, VENET_VERSION, 0);
+ if (ret < 0)
+ return ret;
+
+ dev = alloc_etherdev(sizeof(struct vbus_enet_priv));
+ if (!dev)
+ return -ENOMEM;
+
+ priv = netdev_priv(dev);
+
+ spin_lock_init(&priv->lock);
+ priv->dev = dev;
+ priv->vdev = vdev;
+
+ ret = vbus_enet_negcap(priv);
+ if (ret < 0) {
+ printk(KERN_INFO "VENET: Error negotiating capabilities for " \
+ "%lld\n",
+ priv->vdev->id);
+ goto out_free;
+ }
+
+ skb_queue_head_init(&priv->tx.outstanding);
+
+ queue_init(priv, &priv->rxq, VENET_QUEUE_RX, rx_ringlen, rx_isr);
+ queue_init(priv, &priv->tx.veq, VENET_QUEUE_TX, tx_ringlen, tx_isr);
+
+ rx_setup(priv);
+ tx_setup(priv);
+
+ ioq_notify_enable(priv->rxq.queue, 0); /* enable rx interrupts */
+
+ if (!priv->evq.txc) {
+ /*
+ * If the TXC feature is present, we will receive our
+ * tx-complete notification via the event-channel. Therefore,
+ * we only enable txq interrupts if the TXC feature is not
+ * present.
+ */
+ tasklet_init(&priv->tx.task, deferred_tx_isr,
+ (unsigned long)priv);
+ ioq_notify_enable(priv->tx.veq.queue, 0);
+ }
+
+ dev->netdev_ops = &vbus_enet_netdev_ops;
+ dev->watchdog_timeo = 5 * HZ;
+ SET_ETHTOOL_OPS(dev, &vbus_enet_ethtool_ops);
+ SET_NETDEV_DEV(dev, &vdev->dev);
+
+ netif_napi_add(dev, &priv->napi, vbus_enet_poll, 128);
+
+ ret = devcall(priv, VENET_FUNC_MACQUERY, priv->dev->dev_addr, ETH_ALEN);
+ if (ret < 0) {
+ printk(KERN_INFO "VENET: Error obtaining MAC address for " \
+ "%lld\n",
+ priv->vdev->id);
+ goto out_free;
+ }
+
+ dev->features |= NETIF_F_HIGHDMA;
+
+ ret = register_netdev(dev);
+ if (ret < 0) {
+ printk(KERN_INFO "VENET: error %i registering device \"%s\"\n",
+ ret, dev->name);
+ goto out_free;
+ }
+
+ vdev->priv = priv;
+
+ return 0;
+
+ out_free:
+ free_netdev(dev);
+
+ return ret;
+}
+
+static int
+vbus_enet_remove(struct vbus_device_proxy *vdev)
+{
+ struct vbus_enet_priv *priv = (struct vbus_enet_priv *)vdev->priv;
+ struct vbus_device_proxy *dev = priv->vdev;
+
+ unregister_netdev(priv->dev);
+ napi_disable(&priv->napi);
+
+ rx_teardown(priv);
+ ioq_put(priv->rxq.queue);
+
+ tx_teardown(priv);
+ ioq_put(priv->tx.veq.queue);
+
+ if (priv->evq.enabled)
+ evq_teardown(priv);
+
+ dev->ops->close(dev, 0);
+
+ free_netdev(priv->dev);
+
+ return 0;
+}
+
+/*
+ * Finally, the module stuff
+ */
+
+static struct vbus_driver_ops vbus_enet_driver_ops = {
+ .probe = vbus_enet_probe,
+ .remove = vbus_enet_remove,
+};
+
+static struct vbus_driver vbus_enet_driver = {
+ .type = VENET_TYPE,
+ .owner = THIS_MODULE,
+ .ops = &vbus_enet_driver_ops,
+};
+
+static __init int
+vbus_enet_init_module(void)
+{
+ printk(KERN_INFO "Virtual Ethernet: Copyright (C) 2009 Novell, Gregory Haskins\n");
+ printk(KERN_DEBUG "VENET: Using %d/%d queue depth\n",
+ rx_ringlen, tx_ringlen);
+ return vbus_driver_register(&vbus_enet_driver);
+}
+
+static __exit void
+vbus_enet_cleanup(void)
+{
+ vbus_driver_unregister(&vbus_enet_driver);
+}
+
+module_init(vbus_enet_init_module);
+module_exit(vbus_enet_cleanup);
diff --git a/include/linux/Kbuild b/include/linux/Kbuild
index fa15bbf..911f7ef 100644
--- a/include/linux/Kbuild
+++ b/include/linux/Kbuild
@@ -359,6 +359,7 @@ unifdef-y += unistd.h
unifdef-y += usbdevice_fs.h
unifdef-y += utsname.h
unifdef-y += vbus_pci.h
+unifdef-y += venet.h
unifdef-y += videodev2.h
unifdef-y += videodev.h
unifdef-y += virtio_config.h
diff --git a/include/linux/venet.h b/include/linux/venet.h
new file mode 100644
index 0000000..b6bfd91
--- /dev/null
+++ b/include/linux/venet.h
@@ -0,0 +1,123 @@
+/*
+ * Copyright 2009 Novell. All Rights Reserved.
+ *
+ * Virtual-Ethernet adapter
+ *
+ * Author:
+ * Gregory Haskins <ghaskins@novell.com>
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA.
+ */
+
+#ifndef _LINUX_VENET_H
+#define _LINUX_VENET_H
+
+#include <linux/types.h>
+
+#define VENET_VERSION 1
+
+#define VENET_TYPE "virtual-ethernet"
+
+#define VENET_QUEUE_RX 0
+#define VENET_QUEUE_TX 1
+
+struct venet_capabilities {
+ __u32 gid;
+ __u32 bits;
+};
+
+#define VENET_CAP_GROUP_SG 0
+#define VENET_CAP_GROUP_EVENTQ 1
+
+/* CAPABILITIES-GROUP SG */
+#define VENET_CAP_SG (1 << 0)
+#define VENET_CAP_TSO4 (1 << 1)
+#define VENET_CAP_TSO6 (1 << 2)
+#define VENET_CAP_ECN (1 << 3)
+#define VENET_CAP_UFO (1 << 4)
+#define VENET_CAP_PMTD (1 << 5) /* pre-mapped tx desc */
+
+/* CAPABILITIES-GROUP EVENTQ */
+#define VENET_CAP_EVQ_LINKSTATE (1 << 0)
+#define VENET_CAP_EVQ_TXC (1 << 1) /* tx-complete */
+
+struct venet_iov {
+ __u32 len;
+ __u64 ptr;
+};
+
+#define VENET_SG_FLAG_NEEDS_CSUM (1 << 0)
+#define VENET_SG_FLAG_GSO (1 << 1)
+#define VENET_SG_FLAG_ECN (1 << 2)
+
+struct venet_sg {
+ __u64 cookie;
+ __u32 flags;
+ __u32 len; /* total length of all iovs */
+ struct {
+ __u16 start; /* csum starting position */
+ __u16 offset; /* offset to place csum */
+ } csum;
+ struct {
+#define VENET_GSO_TYPE_TCPV4 0 /* IPv4 TCP (TSO) */
+#define VENET_GSO_TYPE_UDP 1 /* IPv4 UDP (UFO) */
+#define VENET_GSO_TYPE_TCPV6 2 /* IPv6 TCP */
+ __u8 type;
+ __u16 hdrlen;
+ __u16 size;
+ } gso;
+ __u32 count; /* nr of iovs */
+ struct venet_iov iov[1];
+};
+
+struct venet_eventq_query {
+ __u32 flags;
+ __u32 evsize; /* size of each event */
+ __u32 dpid; /* descriptor pool-id */
+ __u32 qid;
+ __u8 pad[16];
+};
+
+#define VENET_EVENT_LINKSTATE 0
+#define VENET_EVENT_TXC 1
+
+struct venet_event_header {
+ __u32 flags;
+ __u32 size;
+ __u32 id;
+};
+
+struct venet_event_linkstate {
+ struct venet_event_header header;
+ __u8 state; /* 0 = down, 1 = up */
+};
+
+struct venet_event_txc {
+ struct venet_event_header header;
+ __u32 txqid;
+ __u64 cookie;
+};
+
+#define VSG_DESC_SIZE(count) (sizeof(struct venet_sg) + \
+ sizeof(struct venet_iov) * ((count) - 1))
+
+#define VENET_FUNC_LINKUP 0
+#define VENET_FUNC_LINKDOWN 1
+#define VENET_FUNC_MACQUERY 2
+#define VENET_FUNC_NEGCAP 3 /* negotiate capabilities */
+#define VENET_FUNC_FLUSHRX 4
+#define VENET_FUNC_PMTDQUERY 5
+#define VENET_FUNC_EVQQUERY 6
+
+#endif /* _LINUX_VENET_H */
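
For readers unfamiliar with the `iov[1]` idiom used by `struct venet_sg`, the following is a minimal userspace sketch of how `VSG_DESC_SIZE()` sizes a descriptor carrying several iovs. It is an illustration only: the structures mirror the header above with `<stdint.h>` types standing in for `__u8`/`__u16`/`__u32`/`__u64`, and `alloc_sg()` is a hypothetical helper that does not appear in the patch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Userspace mirror of the header's layout (assumption: stdint types
 * stand in for the kernel's __u8/__u16/__u32/__u64). */
struct venet_iov {
	uint32_t len;
	uint64_t ptr;
};

struct venet_sg {
	uint64_t cookie;
	uint32_t flags;
	uint32_t len;			/* total length of all iovs */
	struct { uint16_t start; uint16_t offset; } csum;
	struct { uint8_t type; uint16_t hdrlen; uint16_t size; } gso;
	uint32_t count;			/* nr of iovs */
	struct venet_iov iov[1];	/* first iov lives inside the struct */
};

/* Same macro as in the header: one iov is already counted by sizeof(). */
#define VSG_DESC_SIZE(count) (sizeof(struct venet_sg) + \
			      sizeof(struct venet_iov) * ((count) - 1))

/* Hypothetical helper: allocate a zeroed descriptor with room for nr_iov iovs. */
static struct venet_sg *alloc_sg(uint32_t nr_iov)
{
	struct venet_sg *sg;

	if (nr_iov == 0)
		return NULL;	/* the macro assumes at least one iov */

	sg = calloc(1, VSG_DESC_SIZE(nr_iov));
	if (sg)
		sg->count = nr_iov;
	return sg;
}
```

Because the structure already embeds one `venet_iov`, a descriptor with a single iov is exactly `sizeof(struct venet_sg)` bytes, and each further iov adds `sizeof(struct venet_iov)`.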
^ permalink raw reply related
* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: Philipp Reisner @ 2009-10-02 15:54 UTC (permalink / raw)
To: Greg KH
Cc: linux-fbdev-devel, netdev, linux-kernel, dm-devel,
Evgeniy Polyakov, Andrew Morton, David S. Miller
In-Reply-To: <20091002135859.GA9383@kroah.com>
> On Fri, Oct 02, 2009 at 02:40:03PM +0200, Philipp Reisner wrote:
> > Affected: All code that uses connector, in kernel and out of mainline
> >
> > The connector, as it is today, does not allow the in kernel receiving
> > parts to do any checks on privileges of a message's sender.
>
> So, assume I know nothing about the connector architecture, what does
> this mean in a security context?
>
Think of the connector as a layer on top of netlink that allows more
than a hard coded number of subsystems to use netlink.
Netlink is used e.g. to modify routing tables in the kernel.
As it is today, a subsystem utilising the connector cannot examine
the capabilities of the user/program that sent the netlink message.
If the same were true for netlink, then every unprivileged user
could change the routing tables on your box.
> > I know, there are not many out there that like connector, but as
> > long as it is in the kernel, we have to fix the security issues it has!
>
> And what specifically are the security issues?
>
Unprivileged users can trigger operations that are supposed to be
accessible only to users having CAP_SYS_ADMIN (or some other CAP_XXX).
> > Please either drop connector, or someone who feels a bit responsible
> > and has our beloved dictator's blessing, PLEASE PLEASE PLEASE take
> > this into your tree, and send the pull request to Linus.
> >
> > Patches 1 to 4 are already Acked-by Evgeny, the connector's maintainer.
> > Patches 5 to 7 are the obvious fixes to the connector user's code.
>
> Obvious in what way?
>
They limit processing of connector/netlink messages in these subsystems
to messages sent from root (or some user having CAP_SYS_ADMIN).
That is obvious for dst, because device setup and destruction is done by
connector messages.
This is obvious for pohmelfs because these connector messages are
used there to change some configuration.
This is obvious for uvesafb because the connector messages are used
there to delegate some video bios emulation to userspace.
Last but not least, dm's dirty logging in user space should be immune to
crafted netlink packets sent by an unprivileged user.
Patches 1 to 4 fix the framework and should be merged as soon as possible.
Patches 5 to 8 (not 7) should probably be blessed by the affected
subsystems' maintainers. I think I have put them all on CC.
HTH.
-phil
^ permalink raw reply
* Re: [BUG net-2.6] bluetooth/rfcomm : sleeping function called from invalid context at mm/slub.c:1719
From: Oliver Hartkopp @ 2009-10-02 16:04 UTC (permalink / raw)
To: Dave Young
Cc: Marcel Holtmann, Linux Netdev List,
linux-bluetooth-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <a8e1da0910020401m2fb8493ax95ff55a3b66131a5-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
Dave Young wrote:
> On Fri, Oct 2, 2009 at 2:28 PM, Oliver Hartkopp <oliver-fJ+pQTUTwRTk1uMJSBkQmQ@public.gmane.org> wrote:
>> Hello Marcel,
>>
>> with current net-2.6 tree ...
>>
>> While starting my PPP Bluetooth dialup networking, i got this:
>
> Hi, oliver
>
> please try following patch:
> http://patchwork.kernel.org/patch/51326/
Hi Dave,
that fixed it at ppp startup!
Tested-by: Oliver Hartkopp <oliver-fJ+pQTUTwRTk1uMJSBkQmQ@public.gmane.org>
Btw, when shutting down the PPP connection I still get this:
[ 361.996887] INFO: trying to register non-static key.
[ 361.996897] the code is fine but needs lockdep annotation.
[ 361.996902] turning off the locking correctness validator.
[ 361.996912] Pid: 0, comm: swapper Not tainted 2.6.31-08939-gdb8abec-dirty #22
[ 361.996919] Call Trace:
[ 361.996933] [<c12e4fb2>] ? printk+0xf/0x11
[ 361.996947] [<c1042214>] register_lock_class+0x5a/0x295
[ 361.996957] [<c1043af2>] __lock_acquire+0x9b/0xc03
[ 361.996967] [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 361.996985] [<fa59a168>] ? l2cap_get_chan_by_scid+0x35/0x43 [l2cap]
[ 361.996995] [<c104491f>] ? lock_release_non_nested+0x17b/0x1db
[ 361.997008] [<fa59a168>] ? l2cap_get_chan_by_scid+0x35/0x43 [l2cap]
[ 361.997018] [<c10426fd>] ? trace_hardirqs_off+0xb/0xd
[ 361.997028] [<c10446b6>] lock_acquire+0x5c/0x73
[ 361.997039] [<c124cd14>] ? skb_dequeue+0x12/0x4c
[ 361.997049] [<c12e6e23>] _spin_lock_irqsave+0x24/0x34
[ 361.997058] [<c124cd14>] ? skb_dequeue+0x12/0x4c
[ 361.997066] [<c124cd14>] skb_dequeue+0x12/0x4c
[ 361.997075] [<c124d579>] skb_queue_purge+0x14/0x1b
[ 361.997088] [<fa59ce3f>] l2cap_recv_frame+0xe9e/0x129a [l2cap]
[ 361.997099] [<c10421d1>] ? register_lock_class+0x17/0x295
[ 361.997110] [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 361.997128] [<c104464b>] ? __lock_acquire+0xbf4/0xc03
[ 361.997139] [<c120de74>] ? uhci_giveback_urb+0xf2/0x162
[ 361.997163] [<f8bb4c45>] ? hci_rx_task+0xfe/0x1f8 [bluetooth]
[ 361.997177] [<fa59d2e4>] l2cap_recv_acldata+0xa9/0x1be [l2cap]
[ 361.997190] [<fa59d23b>] ? l2cap_recv_acldata+0x0/0x1be [l2cap]
[ 361.997208] [<f8bb4c77>] hci_rx_task+0x130/0x1f8 [bluetooth]
[ 361.997219] [<c102a098>] tasklet_action+0x6b/0xb2
[ 361.997228] [<c102a46b>] __do_softirq+0x82/0x101
[ 361.997237] [<c102a515>] do_softirq+0x2b/0x43
[ 361.997246] [<c102a619>] irq_exit+0x35/0x68
[ 361.997256] [<c1004513>] do_IRQ+0x80/0x96
[ 361.997265] [<c10030ae>] common_interrupt+0x2e/0x34
[ 361.997275] [<c104007b>] ? tick_device_uses_broadcast+0x71/0x7c
[ 361.997286] [<c11747a8>] ? acpi_idle_enter_simple+0x103/0x12e
[ 361.997296] [<c1174515>] acpi_idle_enter_bm+0xc3/0x253
[ 361.997306] [<c1238b6f>] cpuidle_idle_call+0x60/0x91
[ 361.997315] [<c1001d44>] cpu_idle+0x49/0x65
[ 361.997324] [<c12e2f0e>] start_secondary+0x190/0x195
Thanks,
Oliver
^ permalink raw reply
* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: Greg KH @ 2009-10-02 16:10 UTC (permalink / raw)
To: Philipp Reisner
Cc: linux-fbdev-devel, netdev, linux-kernel, dm-devel,
Evgeniy Polyakov, Andrew Morton, David S. Miller
In-Reply-To: <200910021754.12940.philipp.reisner@linbit.com>
On Fri, Oct 02, 2009 at 05:54:12PM +0200, Philipp Reisner wrote:
> > On Fri, Oct 02, 2009 at 02:40:03PM +0200, Philipp Reisner wrote:
> > > Affected: All code that uses connector, in kernel and out of mainline
> > >
> > > The connector, as it is today, does not allow the in kernel receiving
> > > parts to do any checks on privileges of a message's sender.
> >
> > So, assume I know nothing about the connector architecture, what does
> > this mean in a security context?
> >
>
> Think of the connector as a layer on top of netlink that allows more
> than a hard coded number of subsystems to use netlink.
>
> Netlink is used e.g. to modify routing tables in the kernel.
>
> As it is today, subsystem utilising the connector can not examine
> the capabilities of the user/program that sent the netlink message.
>
> If the same would be true for netlink, than every unprivileged user
> could change the routing tables on your box.
>
> > > I know, there are not many out there that like connector, but as
> > > long as it is in the kernel, we have to fix the security issues it has!
> >
> > And what specifically are the security issues?
> >
>
> unprivileged users can trigger operations that are supposed to be only
> accessible to users having CAP_SYS_ADMIN (or some other CAP_XXX)
Ok, but it doesn't look like there are that many connector operations
right now, right?
Anyway, I have no objection to the patches, and figure they should go
through David's network tree.
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: Lars Ellenberg @ 2009-10-02 16:21 UTC (permalink / raw)
To: Greg KH
Cc: linux-fbdev-devel, netdev, Philipp Reisner, linux-kernel,
dm-devel, Evgeniy Polyakov, Andrew Morton, David S. Miller,
Alasdair G Kergon
In-Reply-To: <20091002135859.GA9383@kroah.com>
On Fri, Oct 02, 2009 at 06:58:59AM -0700, Greg KH wrote:
> On Fri, Oct 02, 2009 at 02:40:03PM +0200, Philipp Reisner wrote:
> > Affected: All code that uses connector, in kernel and out of mainline
> >
> > The connector, as it is today, does not allow the in kernel receiving
> > parts to do any checks on privileges of a message's sender.
>
> So, assume I know nothing about the connector architecture, what does
> this mean in a security context?
Arbitrary unprivileged users may craft a netlink message, which gets delivered
through connector to callbacks (registered in kernel with cn_add_callback).
These callbacks will then act on the message, as if it originated from an
"expected" source. But currently there is no mechanism to verify the origin,
even if the callbacks would try to.
> > I know, there are not many out there that like connector, but as
> > long as it is in the kernel, we have to fix the security issues it has!
>
> And what specifically are the security issues?
For the cn_ulog_callback (dm-log-userspace-transfer.c),
someone would be able to fake completion (with or without error code)
of ulog entries, copying arbitrary data into receiving_pkg entries.
/*
 * This is the connector callback that delivers data
 * that was sent from userspace.
 */
static void cn_ulog_callback(void *data)
{
	struct cn_msg *msg = (struct cn_msg *)data;
	struct dm_ulog_request *tfr = (struct dm_ulog_request *)(msg + 1);

	spin_lock(&receiving_list_lock);
	if (msg->len == 0)
		fill_pkg(msg, NULL);
	else if (msg->len < sizeof(*tfr))
		DMERR("Incomplete message received (expected %u, got %u): [%u]",
		      (unsigned)sizeof(*tfr), msg->len, msg->seq);
	else
		fill_pkg(NULL, tfr);
	spin_unlock(&receiving_list_lock);
}

static int fill_pkg(struct cn_msg *msg, struct dm_ulog_request *tfr)
{
	uint32_t rtn_seq = (msg) ? msg->seq : (tfr) ? tfr->seq : 0;
	...
		} else {
			pkg->error = tfr->error;
			memcpy(pkg->data, tfr->data, tfr->data_size);
			*(pkg->data_size) = tfr->data_size;
		}
		complete(&pkg->complete);
should make that obvious: if an unprivileged user can deliver arbitrary msg to
cn_ulog_callback, that should at least be disruptive to services that use it.
fix: check origin of message for proper credentials (e.g. CAP_SYS_ADMIN).
what or how much damage a crafted message can do in uvesafb_cn_callback,
I'm not sure. But, if I get the msg->seq right, and get by the first
sanity check, again, arbitrary input is copied into some
kernel object, which will likely at least confuse that subsystem,
maybe do damage, or result in some sort of denial of service.
I just don't know what these uvesafb_ktask do, but I doubt that anyone but root
should be able to manipulate them.
In the case of dst and pohmelfs, it is (re|de)configuration of the respective
in-kernel objects, possibly exposing arbitrary data content.
@Evgeniy - is that statement correct? Does something prevent an
unprivileged user from exporting arbitrary things via dst?
At least some sort of denial of service should be possible there.
for DRBD, we have of course similar problems as long as we use the connector
in its current form as our configuration choice.
I'm not sure what actual harm can be done by arbitrarily calling
w1_reset_select_slave(), or w1_process_command_io(),
but allowing unprivileged users to meddle with arbitrary devices is most likely
not the intended behaviour there, either.
The "obvious" way was to first make the credentials and capabilities of the
message origin available to these callbacks, and then test on "CAP_SYS_ADMIN".
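
As a rough illustration of the two steps just described (hand the sender's credentials to the callback, then test for CAP_SYS_ADMIN), here is a small userspace model. It is a sketch under stated assumptions, not the kernel code: `struct model_sender`, `struct model_msg`, `model_cap_raised()` and `model_callback()` are hypothetical stand-ins, and only the capability number 21 is taken from the real CAP_SYS_ADMIN definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric value as CAP_SYS_ADMIN in <linux/capability.h>. */
#define MODEL_CAP_SYS_ADMIN 21

/* Hypothetical stand-in for the sender credentials passed to a callback. */
struct model_sender {
	uint64_t eff_cap;	/* effective capability bitmask of the sender */
};

/* Hypothetical stand-in for a connector message header. */
struct model_msg {
	uint32_t seq;
	uint32_t len;
};

/* Test a single capability bit in the sender's effective set. */
static bool model_cap_raised(uint64_t caps, unsigned int cap)
{
	return (caps >> cap) & 1;
}

static unsigned int processed;	/* counts messages that passed the check */

/* Callback that receives the message together with the sender's credentials. */
static void model_callback(const struct model_msg *msg,
			   const struct model_sender *sender)
{
	if (!model_cap_raised(sender->eff_cap, MODEL_CAP_SYS_ADMIN))
		return;		/* silently drop messages from unprivileged senders */

	(void)msg;		/* a real callback would act on the message here */
	processed++;
}
```

The point of the model is only that the check has to happen inside the callback, since the sender never needs to bind() in order to transmit.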
Note that the suggested usage of the connector for _userspace_ tools
is to bind() to some netlink socket, subscribing to appropriate multicast
groups, which will usually fail for unprivileged users in netlink_bind()
because of
	/* Only superuser is allowed to listen multicasts */
	if (nladdr->nl_groups) {
		if (!netlink_capable(sock, NL_NONROOT_RECV))
			return -EPERM;
		err = netlink_realloc_groups(sk);
		if (err)
			return err;
	}
So typical userspace tools will fail when used as non-root.
But if you leave out the bind, you are perfectly able to _send_ arbitrary
messages on that socket, even if you are not able to receive any replies from
connector kernel space in that case.
Cheers,
Lars
^ permalink raw reply
* Re: [PATCH 5/8] dm/connector: Only process connector packages from privileged processes
From: Jonathan Brassow @ 2009-10-02 16:40 UTC (permalink / raw)
To: device-mapper development
Cc: linux-fbdev-devel, netdev, LKML, Philipp Reisner, Greg KH,
Evgeniy Polyakov, Andrew Morton, David S. Miller, Alasdair Kergon
In-Reply-To: <1254487211-11810-6-git-send-email-philipp.reisner@linbit.com>
This patch (and "[dm-devel] [PATCH 3/8] connector/dm: Fixed a
compilation warning") will likely collide with an earlier patch (which
agk is pushing) to fix the compilation warning
(https://www.redhat.com/archives/dm-devel/2009-September/msg00218.html),
but the fix-up will be trivial.
The dm-log-userspace code checks that incoming messages correspond to
requests that were sent to userspace by way of a sequence number. If
they don't correspond, they are dropped. So, you must be able to
receive the messages from this kernel module (be root) in order to be
able to respond with a message that will be accepted. I can't completely
rule out the ability to guess a sequence number, and be able to beat
the log daemon in responding while the window of that sequence
number's validity is open though... If someone could manage to pull
this off with accuracy, they could disrupt the creation of a device,
mimic a log device failure, or cause mirror resynchronization to occur
to a different area that may simultaneously be performing a write
(potential data corruption of a mirror). It would be an impressive
feat to accomplish this, but I very much welcome the patch rather than
test fate.
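
To make the sequence-number argument concrete, the following is a simplified userspace model of matching replies against outstanding requests. It is an assumption-laden sketch, not the dm-log-userspace implementation: the fixed-size table and the names `model_pending`, `model_send_request()` and `model_receive_reply()` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PENDING 8

/* Hypothetical pending-request table; the real driver keeps a list. */
struct model_pending {
	bool     in_use;
	uint32_t seq;
	bool     completed;
};

static struct model_pending pending[MODEL_MAX_PENDING];

/* Record that a request with this sequence number went to userspace. */
static int model_send_request(uint32_t seq)
{
	for (size_t i = 0; i < MODEL_MAX_PENDING; i++) {
		if (!pending[i].in_use) {
			pending[i] = (struct model_pending){ .in_use = true, .seq = seq };
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Accept a reply only if it matches an outstanding, uncompleted request. */
static int model_receive_reply(uint32_t seq)
{
	for (size_t i = 0; i < MODEL_MAX_PENDING; i++) {
		if (pending[i].in_use && !pending[i].completed &&
		    pending[i].seq == seq) {
			pending[i].completed = true;
			return 0;
		}
	}
	return -1;	/* no matching request: drop the reply */
}
```

In this model a reply with an unknown sequence number is dropped, but a correctly guessed number is accepted regardless of who sent it, which is the residual risk the capability check closes.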
Reviewed-by: Jonathan Brassow <jbrassow@redhat.com>
brassow
On Oct 2, 2009, at 7:40 AM, Philipp Reisner wrote:
> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> ---
> drivers/md/dm-log-userspace-transfer.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/md/dm-log-userspace-transfer.c b/drivers/md/dm-log-userspace-transfer.c
> index 1327e1a..54abf9e 100644
> --- a/drivers/md/dm-log-userspace-transfer.c
> +++ b/drivers/md/dm-log-userspace-transfer.c
> @@ -133,6 +133,9 @@ static void cn_ulog_callback(struct cn_msg *msg, struct netlink_skb_parms *nsp)
> {
> struct dm_ulog_request *tfr = (struct dm_ulog_request *)(msg + 1);
>
> + if (!cap_raised(nsp->eff_cap, CAP_SYS_ADMIN))
> + return;
> +
> spin_lock(&receiving_list_lock);
> if (msg->len == 0)
> fill_pkg(msg, NULL);
> --
> 1.6.0.4
^ permalink raw reply
* Re: SPLICE_F_NONBLOCK semantics...
From: David Miller @ 2009-10-02 16:45 UTC (permalink / raw)
To: jens.axboe
Cc: torvalds, eric.dumazet, jgunthorpe, vl, opurdila, netdev,
linux-kernel
In-Reply-To: <20091002074754.GE14918@kernel.dk>
From: Jens Axboe <jens.axboe@oracle.com>
Date: Fri, 2 Oct 2009 09:47:54 +0200
> The net patch looks fine and correct to me, feel free to add my acked-by
> if you want.
Thanks Jens.
^ permalink raw reply
* Re: [PATCH] net: Fix wrong sizeof
From: David Miller @ 2009-10-02 16:54 UTC (permalink / raw)
To: khali; +Cc: linux-kernel, netdev, linux-doc, rdunlap, stable
In-Reply-To: <20091002113038.1dc3d284@hyperion.delvare>
From: Jean Delvare <khali@linux-fr.org>
Date: Fri, 2 Oct 2009 11:30:38 +0200
> Which is why I have always preferred sizeof(struct foo) over
> sizeof(var).
>
> Signed-off-by: Jean Delvare <khali@linux-fr.org>
> Cc: Randy Dunlap <rdunlap@xenotime.net>
Any time you see "&" in a sizeof() expression, it's almost
certainly a bug. Something for the folks with automated
tools to look for if they haven't already :-)
I'll apply this, thanks.
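
A short sketch of the general class of bug being discussed: taking `sizeof` of an address yields the size of a pointer, not of the object. This illustrates the pattern only and is not the specific line fixed by the patch; `struct foo` and the helper names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct foo {
	char name[64];
	int  id;
};

/* Size of the object itself: what a memset()/memcpy() length usually wants. */
static size_t size_of_object(void)
{
	struct foo f;
	return sizeof(f);
}

/* Size of a pointer to the object: almost never what was intended. */
static size_t size_of_address(void)
{
	struct foo f;
	return sizeof(&f);
}

/* Correct idiom when only a pointer is at hand: dereference inside sizeof. */
static void clear_foo(struct foo *f)
{
	memset(f, 0, sizeof(*f));
}
```

Using `sizeof(&f)` as a length would clear or copy only a pointer's worth of bytes, leaving the rest of the structure untouched.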
^ permalink raw reply
* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: David Miller @ 2009-10-02 16:57 UTC (permalink / raw)
To: philipp.reisner
Cc: linux-fbdev-devel, greg, linux-kernel, dm-devel, netdev, zbr,
akpm
In-Reply-To: <200910021754.12940.philipp.reisner@linbit.com>
From: Philipp Reisner <philipp.reisner@linbit.com>
Date: Fri, 2 Oct 2009 17:54:12 +0200
> Think of the connector as a layer on top of netlink that allows more
> than a hard coded number of subsystems to use netlink.
There are no such limits in netlink, we have 'genetlink' which allows
an arbitrary number of subsystems to use netlink.
What connector provides over netlink/genetlink is something different
altogether.
^ permalink raw reply
* Re: [net-2.6 PATCH] e1000e/igb/ixgbe: Don't report an error if devices don't support AER
From: David Miller @ 2009-10-02 17:04 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, gospo, elendil
In-Reply-To: <20091002071542.5072.23381.stgit@localhost.localdomain>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Fri, 02 Oct 2009 00:15:48 -0700
> From: Frans Pop <elendil@planet.nl>
>
> The only error returned by pci_{en,dis}able_pcie_error_reporting() is
> -EIO which simply means that Advanced Error Reporting is not supported.
> There is no need to report that, so remove the error check from e1000e,
> igb and ixgbe.
>
> Signed-off-by: Frans Pop <elendil@planet.nl>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Applied, thanks.
^ permalink raw reply
* Adding to linux-next?
From: Gregory Haskins @ 2009-10-02 17:08 UTC (permalink / raw)
To: linux-next, Stephen Rothwell
Cc: linux-kernel@vger.kernel.org, netdev, David Miller,
alacrityvm-devel@lists.sourceforge.net
Hello Stephen, linux-next'ers,
I am looking for some guidance on policy/procedure governing inclusion
of a tree to linux-next. For instance: Do I have to be arbitrarily
invited (e.g. by some committee on LKML), or do I explicitly request
consideration? I tried to Google around for answers, and also found the
linux-next wiki, but I was not getting any clear answers.
I have these guest drivers to support IO on top of the AlacrityVM
hypervisor:
http://lkml.org/lkml/2009/8/3/278
The comments have since died down. I realize this can mean anything
from "no objection" to "no interest" ;), but I assume the former unless
someone pipes up.
I believe I addressed the review comments and received an Ack from the
one maintainer of the tree that overlaps with the work (netdev/davem), here:
http://lkml.org/lkml/2009/8/3/505
Since the rest of the work doesn't really fall into any existing
subsystem, and David conceded that the netdev overlap portion should
carry elsewhere, I offer to fill this role myself from within the
AlacrityVM tree itself.
As such, I have taken the driver series and created a new branch here:
git://git.kernel.org/pub/scm/linux/kernel/git/ghaskins/alacrityvm/linux-2.6.git
linux-next
Unlike the original posting, I have excluded the final ethernet patch
since I posted a v3 today (http://lkml.org/lkml/2009/10/2/239) that I
would like to have David re-Ack before including.
Once the driver has been suitably approved by David, and if he still
feels it's OK to carry in a tree other than netdev, I will re-add it to
the linux-next branch.
Because I am not really sure of the policies for linux-next, let me
state my intentions of this branch, since I am an unknown in the
maintainership role:
I will only post patches to this branch that:
*) do not fall into an existing maintained subsystem category, unless
the appropriate maintainer has relinquished the patch to carry in my tree.
*) have previously been posted to LKML for suitable review.
IOW: The purpose is not to sneak something in, or subvert a maintained
subsystem. It is purely to carry pieces that have no other home and are
maintained under the AlacrityVM project. You can find more details of
the project here:
http://developer.novell.com/wiki/index.php/AlacrityVM
If this is not acceptable, or I need to follow some other procedure,
please advise me on the proper steps. Perhaps I will update the wiki
FAQ on what I learn from your responses :)
Thank you, and Kind Regards,
-Greg
^ permalink raw reply
* Re: Splice on blocking TCP sockets again..
From: Jason Gunthorpe @ 2009-10-02 17:10 UTC (permalink / raw)
To: Volker Lendecke; +Cc: Eric Dumazet, netdev, Volker Lendecke
In-Reply-To: <E1Mssmb-004RJz-Hf@intern.SerNet.DE>
On Wed, Sep 30, 2009 at 08:37:13AM +0200, Volker Lendecke wrote:
> On Tue, Sep 29, 2009 at 06:48:20PM -0600, Jason Gunthorpe wrote:
> > FWIW, it looks like samba has a splice code now, but doesn't enable it
> > due to this issue?
>
> Right. What I've learned from the comments is that splice is
> only usable in multi-threaded programs. One thread is
> reading, one is writing from the other end. I deferred using
> splice until we have the proper architecture to do sync
> syscalls in helper threads to make them virtually async. We
> have some code for that now, but it's not a high priority
> for me at this moment.
So, it looks like, thanks to Eric and davem, splice will be changed
so it can be blocking on the TCP and non-blocking on the PIPE.
I'd suggest a construct like the following as a compatibility
solution:
	struct pollfd pfd = {.fd = tcpfd, .events = POLLIN | POLLRDHUP};
	int pipefd[2];	/* pipe ends, set up with pipe() */

	while (..) {
		rc = splice(tcpfd, 0, pipefd[1], 0, count,
			    SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
		if (rc == -1)
			//...
		if (rc == 0) {
			if (pfd.revents & POLLRDHUP)
				// oops, EOF on TCP
			/* Might be an old kernel that nonblocks on TCP, have to check
			   if this is EOF or do blocking. */
			rc = poll(&pfd, 1, -1);
			if (rc == -1)
				//...
		}
		rc = splice(pipefd[0], 0, ofd, 0, ..., SPLICE_F_MOVE);
	}
This should add no overhead in the case where the new splice blocks,
and it falls back gracefully on older kernels.
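For anyone who wants something they can compile, here is a minimal sketch of
the same loop as a standalone helper. The name relay_splice() and its
parameters are made up for illustration. It assumes that a splice() that
refuses to wait on the input shows up as EAGAIN, and that a return of 0 means
EOF; the output fd is assumed to be blocking. Treat it as a sketch under those
assumptions, not as what samba or the kernel actually does.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: relay bytes from in_fd to out_fd through the
 * caller-supplied pipe. Returns the number of bytes relayed once in_fd
 * reaches EOF, or -1 on error with errno set.
 */
static long relay_splice(int in_fd, int pipefd[2], int out_fd, size_t chunk)
{
    struct pollfd pfd = { .fd = in_fd, .events = POLLIN | POLLRDHUP };
    long total = 0;

    for (;;) {
        /* If the kernel blocks on the input here, poll() below never runs. */
        ssize_t in = splice(in_fd, NULL, pipefd[1], NULL, chunk,
                            SPLICE_F_MOVE | SPLICE_F_NONBLOCK);
        if (in == 0)
            return total;               /* assumed to mean EOF on the input */
        if (in == -1) {
            if (errno == EINTR)
                continue;
            if (errno != EAGAIN)
                return -1;
            /* The pipe is drained every round, so EAGAIN is taken to mean
               "no input yet": wait for data or hangup, then retry. */
            if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
                return -1;
            continue;
        }
        /* Drain everything we just queued in the pipe to the output. */
        while (in > 0) {
            ssize_t out = splice(pipefd[0], NULL, out_fd, NULL,
                                 (size_t)in, SPLICE_F_MOVE);
            if (out == -1 && errno == EINTR)
                continue;
            if (out <= 0)
                return -1;
            in -= out;
            total += out;
        }
    }
}
```

The pipe is emptied before each read-side splice, which is why EAGAIN can be
read as "no input" rather than "pipe full".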
Thanks,
Jason
^ permalink raw reply
* Re: [PATCH] Use sk_mark for routing lookup in more places
From: Maciej Żenczykowski @ 2009-10-02 17:25 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, atis, panther, netdev
In-Reply-To: <4AC598D7.9080900@gmail.com>
Cool!
As I've already pointed out in a post 2 or so weeks ago, we need the
exact same treatment in a ton of places throughout the code (tcp,
ipv6, decnet, etc...).
Maybe it would make more sense to create some constructor-like
functions for the flowi struct?
On Thu, Oct 1, 2009 at 23:08, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Eric Dumazet a écrit :
>> Here is a followup on this area, thanks.
>>
>> [RFC] af_packet: fill skb->mark at xmit
>>
>> skb->mark may be used by classifiers, so fill it in case user
>> set a SO_MARK option on socket.
>>
>
> Maybe a more generic way to handle this for various protocols
> would be to fill skb->mark in sock_alloc_send_pskb()
^ permalink raw reply
* Re: 2.6.32-rc1-git2: Reported regressions from 2.6.31
From: Rafael J. Wysocki @ 2009-10-02 17:32 UTC (permalink / raw)
To: Stefan Richter
Cc: Jaswinder Singh Rajput, Linux Kernel Mailing List, Adrian Bunk,
Andrew Morton, Linus Torvalds, Natalie Protasevich,
Kernel Testers List, Network Development, Linux ACPI,
Linux PM List, Linux SCSI List, Linux Wireless List, DRI
In-Reply-To: <4AC5F975.6060505@s5r6.in-berlin.de>
On Friday 02 October 2009, Stefan Richter wrote:
> Jaswinder Singh Rajput wrote:
> > If you add one more entry say "Suspected commit :" then it will be great
> > and will solve regressions much faster.
>
> Will? Might.
In fact I add the "First-Bad-Commit" annotation where there is a bisection
result or it's possible to fix things by reverting a specific commit.
> > You can request submitter to
> > submit 'suspected commit' by git bisect and also specify git bisect
> > links like : (for more information about git bisect check
> > http://kerneltrap.org/node/11753)
>
> I disagree. A reporter should only be asked to bisect (using git or
> other tools) /if/ a developer determined that bisection may speed up the
> debugging process or is the only remaining option to make progress with
> a bug.
>
> It would be wrong to steal a reporter's valuable time by asking for
> bisection before anybody familiar with the matter even had a first look
> at the report.
Agreed.
Thanks,
Rafael
^ permalink raw reply
* Re: [PATCH] TCPCT-1: adding a sysctl
From: William Allen Simpson @ 2009-10-02 17:52 UTC (permalink / raw)
To: netdev
In-Reply-To: <4AC61505.8030701@gmail.com>
William Allen Simpson wrote:
> This is a straightforward re-implementation of an earlier patch, that no
> longer applies cleanly, that was reviewed:
>
> http://thread.gmane.org/gmane.linux.network/102586
>
In that thread, David Miller wrote:
"This looks mostly fine to me. I would even advocate not using a config
option for this."
Dropping the config option would make the code look cleaner, and with
the sysctl alone it would probably be fine. But SYN cookies have both.
Before I go much further, I'd like guidance.
^ permalink raw reply
* Re: [PATCH] make TLLAO option for NA packets configurable
From: Stephen Hemminger @ 2009-10-02 17:53 UTC (permalink / raw)
To: Octavian Purdila; +Cc: David Miller, cratiu, netdev
In-Reply-To: <200910020119.47320.opurdila@ixiacom.com>
On Fri, 2 Oct 2009 01:19:47 +0300
Octavian Purdila <opurdila@ixiacom.com> wrote:
> On Thursday 01 October 2009 22:37:40 you wrote:
> > From: Stephen Hemminger <shemminger@vyatta.com>
> > Date: Thu, 1 Oct 2009 11:56:11 -0700
> >
> > > On Thu, 1 Oct 2009 21:39:32 +0300
> > >
> > > Octavian Purdila <opurdila@ixiacom.com> wrote:
> > >> On Thursday 01 October 2009 21:14:50 you wrote:
> > >> > Probably this should be a per interface property rather than per
> > >> > namespace.
> > >>
> > >> In our case, where we have lots of interfaces active, it would be nice
> > >> to have the per namespace property as well.
> > >
> > > The ipv6 control infrastructure already has that option. If you changed
> > > your patch to use a per-interface control then there would be:
> > >
> > > /proc/sys/net/ipv6/conf/all/force_tllao
> >
> > Right, this would work a lot better.
> >
>
> Here is v3 which also updates Documentation/networking/ip-sysctl.txt.
>
> Thanks,
> tavi
>
>
This is good although I would have shortened the name.
--
^ permalink raw reply
* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: David Miller @ 2009-10-02 17:56 UTC (permalink / raw)
To: philipp.reisner
Cc: linux-kernel, netdev, akpm, greg, dm-devel, zbr,
linux-fbdev-devel
In-Reply-To: <1254487211-11810-1-git-send-email-philipp.reisner@linbit.com>
From: Philipp Reisner <philipp.reisner@linbit.com>
Date: Fri, 2 Oct 2009 14:40:03 +0200
> Affected: All code that uses connector, in kernel and out of mainline
>
> The connector, as it is today, does not allow the in kernel receiving
> parts to do any checks on privileges of a message's sender.
>
> I know, there are not many out there that like connector, but as
> long as it is in the kernel, we have to fix the security issues it has!
>
> Please either drop connector, or someone who feels a bit responsible
> and has our beloved dictator's blessing, PLEASE PLEASE PLEASE take
> this into your tree, and send the pull request to Linus.
>
> Patches 1 to 4 are already Acked-by Evgeny, the connector's maintainer.
> Patches 5 to 7 are the obvious fixes to the connector user's code.
>
> For convenience these patches are also available as git tree:
> git://git.drbd.org/linux-2.6-drbd.git connector-fix
All applied to net-2.6, I'll push this out to Linus later
today.
^ permalink raw reply
* Re: [PATCH] Use sk_mark for routing lookup in more places
From: David Miller @ 2009-10-02 18:00 UTC (permalink / raw)
To: zenczykowski; +Cc: eric.dumazet, atis, panther, netdev
In-Reply-To: <55a4f86e0910021025u7523029av1e4ee917d1fb1ee5@mail.gmail.com>
From: Maciej Żenczykowski <zenczykowski@gmail.com>
Date: Fri, 2 Oct 2009 10:25:13 -0700
> Maybe it would make more sense to create some constructor-like
> functions for the flowi struct?
Maybe just an initializer like "FLOWI_SOCK(sk)" or similar.
So you can say:
struct flowi fl = FLOWI_SOCK(sk);
But the thing is we usually want to initialize all of the
details in one go, so we'd need a very messy macro for this
that would take many arguments.
It's important to use an initializer rather than assignments in some
inline function so that GCC can better coalesce many small members
into single large stores to the stack. It doesn't do this as well with
real assignment statements.
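To make the shape concrete, here is a small user-space sketch of such an
initializer macro. The structs sock_sketch and flowi_sketch are hypothetical,
heavily trimmed stand-ins, not the real struct sock or struct flowi, and the
macro body is only a guess at what FLOWI_SOCK() might expand to. It shows the
designated-initializer form only; it says nothing about what stores GCC ends
up emitting.

```c
#include <stdint.h>

/* Hypothetical stand-in for the socket fields a flow lookup cares about. */
struct sock_sketch {
    int      bound_dev_if;
    uint32_t mark;
};

/* Hypothetical stand-in for a flow key with several small members. */
struct flowi_sketch {
    int      oif;
    uint32_t mark;
    uint8_t  proto;
    uint8_t  flags;
};

/*
 * Expands to a designated initializer, so the whole struct is set up in
 * one initialization. Members that are not named are zero-initialized.
 */
#define FLOWI_SOCK(sk) { .oif = (sk)->bound_dev_if, .mark = (sk)->mark }

/* Example use: build a flow key from a socket in one go. */
static struct flowi_sketch flow_for_sock(const struct sock_sketch *sk)
{
    struct flowi_sketch fl = FLOWI_SOCK(sk);
    return fl;
}
```

Covering proto, flags, addresses and ports in the same initializer would mean
adding one macro argument per member, which is the messy part mentioned above.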
^ permalink raw reply
* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: Greg KH @ 2009-10-02 18:00 UTC (permalink / raw)
To: David Miller
Cc: philipp.reisner, linux-kernel, netdev, akpm, dm-devel, zbr,
linux-fbdev-devel
In-Reply-To: <20091002.105659.102583702.davem@davemloft.net>
On Fri, Oct 02, 2009 at 10:56:59AM -0700, David Miller wrote:
> From: Philipp Reisner <philipp.reisner@linbit.com>
> Date: Fri, 2 Oct 2009 14:40:03 +0200
>
> > Affected: All code that uses connector, in kernel and out of mainline
> >
> > The connector, as it is today, does not allow the in kernel receiving
> > parts to do any checks on privileges of a message's sender.
> >
> > I know, there are not many out there that like connector, but as
> > long as it is in the kernel, we have to fix the security issues it has!
> >
> > Please either drop connector, or someone who feels a bit responsible
> > and has our beloved dictator's blessing, PLEASE PLEASE PLEASE take
> > this into your tree, and send the pull request to Linus.
> >
> > Patches 1 to 4 are already Acked-by Evgeny, the connector's maintainer.
> > Patches 5 to 7 are the obvious fixes to the connector user's code.
> >
> > For convenience these patches are also available as git tree:
> > git://git.drbd.org/linux-2.6-drbd.git connector-fix
>
> All applied to net-2.6, I'll push this out to Linus later
> today.
Should it also go to -stable? If so, I can pick it up once it hits
Linus's tree.
thanks,
greg k-h
^ permalink raw reply
* Re: [PATCH] cnic: Fix NETDEV_UP event processing.
From: David Miller @ 2009-10-02 18:03 UTC (permalink / raw)
To: mchan; +Cc: netdev, michaelc, benli
In-Reply-To: <1254464254-32005-1-git-send-email-mchan@broadcom.com>
From: "Michael Chan" <mchan@broadcom.com>
Date: Thu, 1 Oct 2009 23:17:34 -0700
> This fixes the problem of not handling the NETDEV_UP event properly
> during hot-plug or modprobe of bnx2 after cnic. The handling was
> skipped by mistakenly using "else if" to check for the event.
>
> Also update version to 2.0.1.
>
> Signed-off-by: Michael Chan <mchan@broadcom.com>
> Signed-off-by: Benjamin Li <benli@broadcom.com>
Applied, thanks Michael.
^ permalink raw reply
* Re: [PATCH 0/8] SECURITY ISSUE with connector
From: David Miller @ 2009-10-02 18:05 UTC (permalink / raw)
To: greg
Cc: linux-fbdev-devel, netdev, philipp.reisner, linux-kernel,
dm-devel, zbr, akpm
In-Reply-To: <20091002180022.GA22229@kroah.com>
From: Greg KH <greg@kroah.com>
Date: Fri, 2 Oct 2009 11:00:22 -0700
> On Fri, Oct 02, 2009 at 10:56:59AM -0700, David Miller wrote:
>> All applied to net-2.6, I'll push this out to Linus later
>> today.
>
> Should it also go to -stable? If so, I can pick it up once it hits
> Linus's tree.
Yes, please take it into -stable.
Greg, I'll also send you a batch of other networking bits
for -stable later this afternoon as well, just FYI...
^ permalink raw reply