* Re: 2.6.37 regression: adding main interface to a bridge breaks vlan interface RX
From: Simon Arlott @ 2011-01-17 18:17 UTC
To: Ben Hutchings; +Cc: netdev, Linux Kernel Mailing List, jesse, Herbert Xu
In-Reply-To: <1295280044.6264.5.camel@bwh-desktop>
On 17/01/11 16:00, Ben Hutchings wrote:
> On Sun, 2011-01-16 at 14:09 +0000, Simon Arlott wrote:
>> [ 1.666706] forcedeth 0000:00:08.0: ifname eth0, PHY OUI 0x5043 @ 16, addr 00:e0:81:4d:2b:ec
>> [ 1.666767] forcedeth 0000:00:08.0: highdma csum vlan pwrctl mgmt gbit lnktim msi desc-v3
>>
>> I have eth0 and eth0.3840 which works until I add eth0 to a bridge.
>> While eth0 is in a bridge (the bridge device is up), eth0.3840 is unable
>> to receive packets. Using tcpdump on eth0 shows the packets being
>> received with a VLAN tag but they don't appear on eth0.3840. They appear
>> with the VLAN tag on the bridge interface.
> [...]
>
> This means the behaviour is now consistent, whether or not hardware VLAN
> tag stripping is enabled. (I previously pointed out the inconsistent
> behaviour in <http://thread.gmane.org/gmane.linux.network/149864>.) I
> would consider this an improvement.
Shouldn't the kernel also prevent a device from being both part of a
bridge and having VLANs? Instead everything appears to work except
incoming traffic.
--
Simon Arlott
* [net-next-2.6 PATCH v8 2/2] net_sched: implement a root container qdisc sch_mqprio
From: John Fastabend @ 2011-01-17 18:06 UTC
To: davem
Cc: bhutchings, jarkao2, hadi, eric.dumazet, shemminger, tgraf,
nhorman, netdev
In-Reply-To: <20110117175542.29543.38690.stgit@jf-dev1-dcblab>
This implements a mqprio queueing discipline that by default creates
a pfifo_fast qdisc per tx queue and provides the needed configuration
interface.
Using the mqprio qdisc, the number of traffic classes (tcs) currently in
use, along with the range of queues allotted to each class, can be
configured. By default skbs are mapped to traffic classes using the skb
priority. This mapping is configurable.
Configurable parameters:
struct tc_mqprio_qopt {
	__u8    num_tc;
	__u8    prio_tc_map[TC_QOPT_BITMASK + 1];
	__u8    hw;
	__u16   count[TC_QOPT_MAX_QUEUE];
	__u16   offset[TC_QOPT_MAX_QUEUE];
};
Here the count/offset pairs give the queue alignment and the
prio_tc_map gives the mapping from skb->priority to tc.
The hw bit determines whether the hardware should configure the count
and offset values. If the hw bit is set, the operation
fails if the hardware does not implement the ndo_setup_tc
operation. This avoids indeterminate states where the hardware
may or may not control the queue mapping. Minimal bounds
checking is also done on the count/offset values to verify that a queue
range does not exceed real_num_tx_queues and that queue ranges do not
overlap. Otherwise it is left to user policy or hardware configuration
to create useful mappings.
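For illustration only, the count/offset bounds checking described above can be sketched in userspace C. This is not the code from the patch: the sketch_ names are made up here, and the overlap test is a symmetric interval comparison that does not assume the ranges are listed in ascending order.

```c
#include <stdint.h>

#define SKETCH_MAX_TC 16	/* assumption: mirrors TC_QOPT_MAX_QUEUE */

/* Hypothetical mirror of the count/offset part of tc_mqprio_qopt. */
struct sketch_ranges {
	uint8_t  num_tc;
	uint16_t count[SKETCH_MAX_TC];
	uint16_t offset[SKETCH_MAX_TC];
};

/* Returns 0 if every class owns a non-empty queue range inside
 * [0, num_txq) and no two ranges overlap, -1 otherwise.
 */
static int sketch_ranges_valid(const struct sketch_ranges *r,
			       unsigned int num_txq)
{
	unsigned int i, j;

	if (r->num_tc > SKETCH_MAX_TC)
		return -1;

	for (i = 0; i < r->num_tc; i++) {
		unsigned int first = r->offset[i];
		unsigned int last = first + r->count[i]; /* one past the end */

		if (r->count[i] == 0 || first >= num_txq || last > num_txq)
			return -1;

		/* Two half-open ranges overlap if each starts before
		 * the other one ends.
		 */
		for (j = i + 1; j < r->num_tc; j++) {
			unsigned int ofirst = r->offset[j];
			unsigned int olast = ofirst + r->count[j];

			if (first < olast && ofirst < last)
				return -1;
		}
	}
	return 0;
}
```

With 8 tx queues, two classes at offsets 0 and 4 with 4 queues each pass the check, while moving the second class to offset 3 is rejected as an overlap.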
It is expected that hardware QOS schemes can be implemented by
creating appropriate mappings of queues in ndo_setup_tc().
One expected use case is that drivers will use ndo_setup_tc to map
queue ranges onto 802.1Q traffic classes. This provides a generic
mechanism to map network traffic onto these traffic classes and
removes the need for lower layer drivers to know specifics about
traffic types.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
include/linux/pkt_sched.h | 12 +
net/sched/Kconfig | 12 +
net/sched/Makefile | 1
net/sched/sch_generic.c | 4
net/sched/sch_mqprio.c | 417 +++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 446 insertions(+), 0 deletions(-)
create mode 100644 net/sched/sch_mqprio.c
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 2cfa4bc..776cd93 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -481,4 +481,16 @@ struct tc_drr_stats {
__u32 deficit;
};
+/* MQPRIO */
+#define TC_QOPT_BITMASK 15
+#define TC_QOPT_MAX_QUEUE 16
+
+struct tc_mqprio_qopt {
+ __u8 num_tc;
+ __u8 prio_tc_map[TC_QOPT_BITMASK + 1];
+ __u8 hw;
+ __u16 count[TC_QOPT_MAX_QUEUE];
+ __u16 offset[TC_QOPT_MAX_QUEUE];
+};
+
#endif
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index a36270a..f52f5eb 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -205,6 +205,18 @@ config NET_SCH_DRR
If unsure, say N.
+config NET_SCH_MQPRIO
+ tristate "Multi-queue priority scheduler (MQPRIO)"
+ help
+ Say Y here if you want to use the Multi-queue Priority scheduler.
+ This scheduler allows QOS to be offloaded on NICs that have support
+ for offloading QOS schedulers.
+
+ To compile this driver as a module, choose M here: the module will
+ be called sch_mqprio.
+
+ If unsure, say N.
+
config NET_SCH_INGRESS
tristate "Ingress Qdisc"
depends on NET_CLS_ACT
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 960f5db..26ce681 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -32,6 +32,7 @@ obj-$(CONFIG_NET_SCH_MULTIQ) += sch_multiq.o
obj-$(CONFIG_NET_SCH_ATM) += sch_atm.o
obj-$(CONFIG_NET_SCH_NETEM) += sch_netem.o
obj-$(CONFIG_NET_SCH_DRR) += sch_drr.o
+obj-$(CONFIG_NET_SCH_MQPRIO) += sch_mqprio.o
obj-$(CONFIG_NET_CLS_U32) += cls_u32.o
obj-$(CONFIG_NET_CLS_ROUTE4) += cls_route.o
obj-$(CONFIG_NET_CLS_FW) += cls_fw.o
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 34dc598..723b278 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -540,6 +540,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
.dump = pfifo_fast_dump,
.owner = THIS_MODULE,
};
+EXPORT_SYMBOL(pfifo_fast_ops);
struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
struct Qdisc_ops *ops)
@@ -674,6 +675,7 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
return oqdisc;
}
+EXPORT_SYMBOL(dev_graft_qdisc);
static void attach_one_default_qdisc(struct net_device *dev,
struct netdev_queue *dev_queue,
@@ -761,6 +763,7 @@ void dev_activate(struct net_device *dev)
dev_watchdog_up(dev);
}
}
+EXPORT_SYMBOL(dev_activate);
static void dev_deactivate_queue(struct net_device *dev,
struct netdev_queue *dev_queue,
@@ -840,6 +843,7 @@ void dev_deactivate(struct net_device *dev)
list_add(&dev->unreg_list, &single);
dev_deactivate_many(&single);
}
+EXPORT_SYMBOL(dev_deactivate);
static void dev_init_scheduler_queue(struct net_device *dev,
struct netdev_queue *dev_queue,
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
new file mode 100644
index 0000000..8620c65
--- /dev/null
+++ b/net/sched/sch_mqprio.c
@@ -0,0 +1,417 @@
+/*
+ * net/sched/sch_mqprio.c
+ *
+ * Copyright (c) 2010 John Fastabend <john.r.fastabend@intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ */
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <net/netlink.h>
+#include <net/pkt_sched.h>
+#include <net/sch_generic.h>
+
+struct mqprio_sched {
+ struct Qdisc **qdiscs;
+ int hw_owned;
+};
+
+static void mqprio_destroy(struct Qdisc *sch)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ struct mqprio_sched *priv = qdisc_priv(sch);
+ unsigned int ntx;
+
+ if (!priv->qdiscs)
+ return;
+
+ for (ntx = 0; ntx < dev->num_tx_queues && priv->qdiscs[ntx]; ntx++)
+ qdisc_destroy(priv->qdiscs[ntx]);
+
+ if (priv->hw_owned && dev->netdev_ops->ndo_setup_tc)
+ dev->netdev_ops->ndo_setup_tc(dev, 0);
+ else
+ netdev_set_num_tc(dev, 0);
+
+ kfree(priv->qdiscs);
+}
+
+static int mqprio_parse_opt(struct net_device *dev, struct tc_mqprio_qopt *qopt)
+{
+ int i, j;
+
+ /* Verify num_tc is not out of max range */
+ if (qopt->num_tc > TC_MAX_QUEUE)
+ return -EINVAL;
+
+ /* Verify priority mapping uses valid tcs */
+ for (i = 0; i < TC_BITMASK + 1; i++) {
+ if (qopt->prio_tc_map[i] >= qopt->num_tc)
+ return -EINVAL;
+ }
+
+ /* net_device does not support requested operation */
+ if (qopt->hw && !dev->netdev_ops->ndo_setup_tc)
+ return -EINVAL;
+
+ /* if hw owned qcount and qoffset are taken from LLD so
+ * no reason to verify them here
+ */
+ if (qopt->hw)
+ return 0;
+
+ for (i = 0; i < qopt->num_tc; i++) {
+ unsigned int last = qopt->offset[i] + qopt->count[i];
+
+ /* Verify the queue count is in tx range being equal to the
+ * real_num_tx_queues indicates the last queue is in use.
+ */
+ if (qopt->offset[i] >= dev->real_num_tx_queues ||
+ !qopt->count[i] ||
+ last > dev->real_num_tx_queues)
+ return -EINVAL;
+
+ /* Verify that the offset and counts do not overlap */
+ for (j = i + 1; j < qopt->num_tc; j++) {
+ if (last > qopt->offset[j])
+ return -EINVAL;
+ }
+ }
+
+ return 0;
+}
+
+static int mqprio_init(struct Qdisc *sch, struct nlattr *opt)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ struct mqprio_sched *priv = qdisc_priv(sch);
+ struct netdev_queue *dev_queue;
+ struct Qdisc *qdisc;
+ int i, err = -EOPNOTSUPP;
+ struct tc_mqprio_qopt *qopt = NULL;
+
+ BUILD_BUG_ON(TC_MAX_QUEUE != TC_QOPT_MAX_QUEUE);
+ BUILD_BUG_ON(TC_BITMASK != TC_QOPT_BITMASK);
+
+ if (sch->parent != TC_H_ROOT)
+ return -EOPNOTSUPP;
+
+ if (!netif_is_multiqueue(dev))
+ return -EOPNOTSUPP;
+
+ if (nla_len(opt) < sizeof(*qopt))
+ return -EINVAL;
+
+ qopt = nla_data(opt);
+ if (mqprio_parse_opt(dev, qopt))
+ return -EINVAL;
+
+ /* pre-allocate qdisc, attachment can't fail */
+ priv->qdiscs = kcalloc(dev->num_tx_queues, sizeof(priv->qdiscs[0]),
+ GFP_KERNEL);
+ if (priv->qdiscs == NULL) {
+ err = -ENOMEM;
+ goto err;
+ }
+
+ for (i = 0; i < dev->num_tx_queues; i++) {
+ dev_queue = netdev_get_tx_queue(dev, i);
+ qdisc = qdisc_create_dflt(dev_queue, &pfifo_fast_ops,
+ TC_H_MAKE(TC_H_MAJ(sch->handle),
+ TC_H_MIN(i + 1)));
+ if (qdisc == NULL) {
+ err = -ENOMEM;
+ goto err;
+ }
+ qdisc->flags |= TCQ_F_CAN_BYPASS;
+ priv->qdiscs[i] = qdisc;
+ }
+
+ /* If the mqprio options indicate that hardware should own
+ * the queue mapping then run ndo_setup_tc otherwise use the
+ * supplied and verified mapping
+ */
+ if (qopt->hw) {
+ priv->hw_owned = 1;
+ err = dev->netdev_ops->ndo_setup_tc(dev, qopt->num_tc);
+ if (err)
+ goto err;
+ } else {
+ netdev_set_num_tc(dev, qopt->num_tc);
+ for (i = 0; i < qopt->num_tc; i++)
+ netdev_set_tc_queue(dev, i,
+ qopt->count[i], qopt->offset[i]);
+ }
+
+ /* Always use supplied priority mappings */
+ for (i = 0; i < TC_BITMASK + 1; i++)
+ netdev_set_prio_tc_map(dev, i, qopt->prio_tc_map[i]);
+
+ sch->flags |= TCQ_F_MQROOT;
+ return 0;
+
+err:
+ mqprio_destroy(sch);
+ return err;
+}
+
+static void mqprio_attach(struct Qdisc *sch)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ struct mqprio_sched *priv = qdisc_priv(sch);
+ struct Qdisc *qdisc;
+ unsigned int ntx;
+
+ /* Attach underlying qdisc */
+ for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
+ qdisc = priv->qdiscs[ntx];
+ qdisc = dev_graft_qdisc(qdisc->dev_queue, qdisc);
+ if (qdisc)
+ qdisc_destroy(qdisc);
+ }
+ kfree(priv->qdiscs);
+ priv->qdiscs = NULL;
+}
+
+static struct netdev_queue *mqprio_queue_get(struct Qdisc *sch,
+ unsigned long cl)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ unsigned long ntx = cl - 1 - netdev_get_num_tc(dev);
+
+ if (ntx >= dev->num_tx_queues)
+ return NULL;
+ return netdev_get_tx_queue(dev, ntx);
+}
+
+static int mqprio_graft(struct Qdisc *sch, unsigned long cl, struct Qdisc *new,
+ struct Qdisc **old)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ struct netdev_queue *dev_queue = mqprio_queue_get(sch, cl);
+
+ if (!dev_queue)
+ return -EINVAL;
+
+ if (dev->flags & IFF_UP)
+ dev_deactivate(dev);
+
+ *old = dev_graft_qdisc(dev_queue, new);
+
+ if (dev->flags & IFF_UP)
+ dev_activate(dev);
+
+ return 0;
+}
+
+static int mqprio_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ struct mqprio_sched *priv = qdisc_priv(sch);
+ unsigned char *b = skb_tail_pointer(skb);
+ struct tc_mqprio_qopt opt;
+ struct Qdisc *qdisc;
+ unsigned int i;
+
+ sch->q.qlen = 0;
+ memset(&sch->bstats, 0, sizeof(sch->bstats));
+ memset(&sch->qstats, 0, sizeof(sch->qstats));
+
+ for (i = 0; i < dev->num_tx_queues; i++) {
+ qdisc = netdev_get_tx_queue(dev, i)->qdisc;
+ spin_lock_bh(qdisc_lock(qdisc));
+ sch->q.qlen += qdisc->q.qlen;
+ sch->bstats.bytes += qdisc->bstats.bytes;
+ sch->bstats.packets += qdisc->bstats.packets;
+ sch->qstats.qlen += qdisc->qstats.qlen;
+ sch->qstats.backlog += qdisc->qstats.backlog;
+ sch->qstats.drops += qdisc->qstats.drops;
+ sch->qstats.requeues += qdisc->qstats.requeues;
+ sch->qstats.overlimits += qdisc->qstats.overlimits;
+ spin_unlock_bh(qdisc_lock(qdisc));
+ }
+
+ opt.num_tc = netdev_get_num_tc(dev);
+ memcpy(opt.prio_tc_map, dev->prio_tc_map, sizeof(opt.prio_tc_map));
+ opt.hw = priv->hw_owned;
+
+ for (i = 0; i < netdev_get_num_tc(dev); i++) {
+ opt.count[i] = dev->tc_to_txq[i].count;
+ opt.offset[i] = dev->tc_to_txq[i].offset;
+ }
+
+ NLA_PUT(skb, TCA_OPTIONS, sizeof(opt), &opt);
+
+ return skb->len;
+nla_put_failure:
+ nlmsg_trim(skb, b);
+ return -1;
+}
+
+static struct Qdisc *mqprio_leaf(struct Qdisc *sch, unsigned long cl)
+{
+ struct netdev_queue *dev_queue = mqprio_queue_get(sch, cl);
+
+ if (!dev_queue)
+ return NULL;
+
+ return dev_queue->qdisc_sleeping;
+}
+
+static unsigned long mqprio_get(struct Qdisc *sch, u32 classid)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ unsigned int ntx = TC_H_MIN(classid);
+
+ if (ntx > dev->num_tx_queues + netdev_get_num_tc(dev))
+ return 0;
+ return ntx;
+}
+
+static void mqprio_put(struct Qdisc *sch, unsigned long cl)
+{
+}
+
+static int mqprio_dump_class(struct Qdisc *sch, unsigned long cl,
+ struct sk_buff *skb, struct tcmsg *tcm)
+{
+ struct net_device *dev = qdisc_dev(sch);
+
+ if (cl <= netdev_get_num_tc(dev)) {
+ tcm->tcm_parent = TC_H_ROOT;
+ tcm->tcm_info = 0;
+ } else {
+ int i;
+ struct netdev_queue *dev_queue;
+
+ dev_queue = mqprio_queue_get(sch, cl);
+ tcm->tcm_parent = 0;
+ for (i = 0; i < netdev_get_num_tc(dev); i++) {
+ struct netdev_tc_txq tc = dev->tc_to_txq[i];
+ int q_idx = cl - netdev_get_num_tc(dev);
+
+ if (q_idx > tc.offset &&
+ q_idx <= tc.offset + tc.count) {
+ tcm->tcm_parent =
+ TC_H_MAKE(TC_H_MAJ(sch->handle),
+ TC_H_MIN(i + 1));
+ break;
+ }
+ }
+ tcm->tcm_info = dev_queue->qdisc_sleeping->handle;
+ }
+ tcm->tcm_handle |= TC_H_MIN(cl);
+ return 0;
+}
+
+static int mqprio_dump_class_stats(struct Qdisc *sch, unsigned long cl,
+ struct gnet_dump *d)
+{
+ struct net_device *dev = qdisc_dev(sch);
+
+ if (cl <= netdev_get_num_tc(dev)) {
+ int i;
+ struct Qdisc *qdisc;
+ struct gnet_stats_queue qstats = {0};
+ struct gnet_stats_basic_packed bstats = {0};
+ struct netdev_tc_txq tc = dev->tc_to_txq[cl - 1];
+
+ /* Drop lock here it will be reclaimed before touching
+ * statistics this is required because the d->lock we
+ * hold here is the look on dev_queue->qdisc_sleeping
+ * also acquired below.
+ */
+ spin_unlock_bh(d->lock);
+
+ for (i = tc.offset; i < tc.offset + tc.count; i++) {
+ qdisc = netdev_get_tx_queue(dev, i)->qdisc;
+ spin_lock_bh(qdisc_lock(qdisc));
+ bstats.bytes += qdisc->bstats.bytes;
+ bstats.packets += qdisc->bstats.packets;
+ qstats.qlen += qdisc->qstats.qlen;
+ qstats.backlog += qdisc->qstats.backlog;
+ qstats.drops += qdisc->qstats.drops;
+ qstats.requeues += qdisc->qstats.requeues;
+ qstats.overlimits += qdisc->qstats.overlimits;
+ spin_unlock_bh(qdisc_lock(qdisc));
+ }
+ /* Reclaim root sleeping lock before completing stats */
+ spin_lock_bh(d->lock);
+ if (gnet_stats_copy_basic(d, &bstats) < 0 ||
+ gnet_stats_copy_queue(d, &qstats) < 0)
+ return -1;
+ } else {
+ struct netdev_queue *dev_queue = mqprio_queue_get(sch, cl);
+
+ sch = dev_queue->qdisc_sleeping;
+ sch->qstats.qlen = sch->q.qlen;
+ if (gnet_stats_copy_basic(d, &sch->bstats) < 0 ||
+ gnet_stats_copy_queue(d, &sch->qstats) < 0)
+ return -1;
+ }
+ return 0;
+}
+
+static void mqprio_walk(struct Qdisc *sch, struct qdisc_walker *arg)
+{
+ struct net_device *dev = qdisc_dev(sch);
+ unsigned long ntx;
+
+ if (arg->stop)
+ return;
+
+ /* Walk hierarchy with a virtual class per tc */
+ arg->count = arg->skip;
+ for (ntx = arg->skip;
+ ntx < dev->num_tx_queues + netdev_get_num_tc(dev);
+ ntx++) {
+ if (arg->fn(sch, ntx + 1, arg) < 0) {
+ arg->stop = 1;
+ break;
+ }
+ arg->count++;
+ }
+}
+
+static const struct Qdisc_class_ops mqprio_class_ops = {
+ .graft = mqprio_graft,
+ .leaf = mqprio_leaf,
+ .get = mqprio_get,
+ .put = mqprio_put,
+ .walk = mqprio_walk,
+ .dump = mqprio_dump_class,
+ .dump_stats = mqprio_dump_class_stats,
+};
+
+struct Qdisc_ops mqprio_qdisc_ops __read_mostly = {
+ .cl_ops = &mqprio_class_ops,
+ .id = "mqprio",
+ .priv_size = sizeof(struct mqprio_sched),
+ .init = mqprio_init,
+ .destroy = mqprio_destroy,
+ .attach = mqprio_attach,
+ .dump = mqprio_dump,
+ .owner = THIS_MODULE,
+};
+
+static int __init mqprio_module_init(void)
+{
+ return register_qdisc(&mqprio_qdisc_ops);
+}
+
+static void __exit mqprio_module_exit(void)
+{
+ unregister_qdisc(&mqprio_qdisc_ops);
+}
+
+module_init(mqprio_module_init);
+module_exit(mqprio_module_exit);
+
+MODULE_LICENSE("GPL");
* [net-next-2.6 PATCH v8 1/2] net: implement mechanism for HW based QOS
From: John Fastabend @ 2011-01-17 18:06 UTC
To: davem
Cc: bhutchings, jarkao2, hadi, eric.dumazet, shemminger, tgraf,
nhorman, netdev
In-Reply-To: <20110117175542.29543.38690.stgit@jf-dev1-dcblab>
This patch provides a mechanism for lower layer devices to
steer traffic using skb->priority to tx queues. This allows
hardware based QOS schemes to use the default qdisc without
incurring the penalties related to global state and the qdisc
lock, while reliably receiving skbs on the correct tx ring
to avoid head of line blocking resulting from shuffling in
the LLD. Finally, all the goodness from txq caching and xps/rps
can still be leveraged.
Many drivers and hardware exist with the ability to implement
QOS schemes in the hardware but currently these drivers tend
to rely on firmware to reroute specific traffic, a driver
specific select_queue or the queue_mapping action in the
qdisc.
If select_queue is used for this, drivers need to be updated for
each and every traffic type and we lose the goodness of much
of the upstream work. Firmware solutions are inherently
inflexible. And finally, if admins are expected to build a
qdisc and filter rules to steer traffic, this requires knowledge
of how the hardware is currently configured: the number of tx
queues and the queue offsets may change depending on resources.
This approach also incurs all the overhead of a qdisc with filters.
With the mechanism in this patch users can set skb priority using
expected methods, i.e. setsockopt(), or the stack can set the priority
directly. The skb will then be steered to the correct tx queues
aligned with hardware QOS traffic classes. In the normal case, with
a single traffic class and all queues in this class, everything
works as is until the LLD enables multiple tcs.
To steer the skb we mask the priority down to its lower 4 bits
and allow the hardware to configure up to 16 distinct classes
of traffic. This is expected to be sufficient for most applications;
at any rate it is more than the 802.1Q spec designates and is
equal to the number of prio bands currently implemented in
the default qdisc.
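A rough userspace sketch of the steering described above, for illustration only: the low 4 bits of the priority pick a traffic class, and a 32-bit hash is scaled into that class's queue range. The sketch_ names are hypothetical stand-ins for the fields this patch adds to struct net_device, and the hash is passed in rather than computed.

```c
#include <stdint.h>

#define SKETCH_BITMASK 15	/* assumption: mirrors TC_BITMASK */
#define SKETCH_MAX_TC  16	/* assumption: mirrors TC_MAX_QUEUE */

struct sketch_txq {
	uint16_t count;
	uint16_t offset;
};

/* Hypothetical subset of the per-device tc state. */
struct sketch_dev {
	uint8_t num_tc;
	struct sketch_txq tc_to_txq[SKETCH_MAX_TC];
	uint8_t prio_tc_map[SKETCH_BITMASK + 1];
};

/* Only the low 4 bits of the priority select a traffic class. */
static uint8_t sketch_prio_to_tc(const struct sketch_dev *dev, uint32_t prio)
{
	return dev->prio_tc_map[prio & SKETCH_BITMASK];
}

/* Scale a 32-bit hash into [offset, offset + count) of the class.
 * With no traffic classes configured, all tx queues are used.
 */
static uint16_t sketch_pick_queue(const struct sketch_dev *dev, uint32_t prio,
				  uint32_t hash, uint16_t num_tx_queues)
{
	uint16_t qoffset = 0;
	uint16_t qcount = num_tx_queues;

	if (dev->num_tc) {
		uint8_t tc = sketch_prio_to_tc(dev, prio);

		qoffset = dev->tc_to_txq[tc].offset;
		qcount = dev->tc_to_txq[tc].count;
	}

	return (uint16_t)((((uint64_t)hash * qcount) >> 32) + qoffset);
}
```

For example, with two classes of 4 queues each (offsets 0 and 4) and priority 1 mapped to class 1, any hash value lands on queues 4 through 7 for priority 1, and priority 17 maps to the same class because only the low 4 bits are used.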
This, in conjunction with a userspace application such as
lldpad, can be used to implement 802.1Q transmission selection
algorithms, one of which is the enhanced transmission
selection (ETS) algorithm currently being used for DCB.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
include/linux/netdevice.h | 68 +++++++++++++++++++++++++++++++++++++++++++++
net/core/dev.c | 55 ++++++++++++++++++++++++++++++++++++
2 files changed, 122 insertions(+), 1 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0f6b1c9..c973582 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -646,6 +646,14 @@ struct xps_dev_maps {
(nr_cpu_ids * sizeof(struct xps_map *)))
#endif /* CONFIG_XPS */
+#define TC_MAX_QUEUE 16
+#define TC_BITMASK 15
+/* HW offloaded queuing disciplines txq count and offset maps */
+struct netdev_tc_txq {
+ u16 count;
+ u16 offset;
+};
+
/*
* This structure defines the management hooks for network devices.
* The following hooks can be defined; unless noted otherwise, they are
@@ -756,6 +764,11 @@ struct xps_dev_maps {
* int (*ndo_set_vf_port)(struct net_device *dev, int vf,
* struct nlattr *port[]);
* int (*ndo_get_vf_port)(struct net_device *dev, int vf, struct sk_buff *skb);
+ * int (*ndo_setup_tc)(struct net_device *dev, u8 tc)
+ * Called to setup 'tc' number of traffic classes in the net device. This
+ * is always called from the stack with the rtnl lock held and netif tx
+ * queues stopped. This allows the netdevice to perform queue management
+ * safely.
*/
#define HAVE_NET_DEVICE_OPS
struct net_device_ops {
@@ -814,6 +827,7 @@ struct net_device_ops {
struct nlattr *port[]);
int (*ndo_get_vf_port)(struct net_device *dev,
int vf, struct sk_buff *skb);
+ int (*ndo_setup_tc)(struct net_device *dev, u8 tc);
#if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
int (*ndo_fcoe_enable)(struct net_device *dev);
int (*ndo_fcoe_disable)(struct net_device *dev);
@@ -1146,6 +1160,9 @@ struct net_device {
/* Data Center Bridging netlink ops */
const struct dcbnl_rtnl_ops *dcbnl_ops;
#endif
+ u8 num_tc;
+ struct netdev_tc_txq tc_to_txq[TC_MAX_QUEUE];
+ u8 prio_tc_map[TC_BITMASK + 1];
#if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
/* max exchange id for FCoE LRO by ddp */
@@ -1162,6 +1179,57 @@ struct net_device {
#define NETDEV_ALIGN 32
static inline
+int netdev_get_prio_tc_map(const struct net_device *dev, u32 prio)
+{
+ return dev->prio_tc_map[prio & TC_BITMASK];
+}
+
+static inline
+int netdev_set_prio_tc_map(struct net_device *dev, u8 prio, u8 tc)
+{
+ if (tc >= dev->num_tc)
+ return -EINVAL;
+
+ dev->prio_tc_map[prio & TC_BITMASK] = tc & TC_BITMASK;
+ return 0;
+}
+
+static inline
+void netdev_reset_tc(struct net_device *dev)
+{
+ dev->num_tc = 0;
+ memset(dev->tc_to_txq, 0, sizeof(dev->tc_to_txq));
+ memset(dev->prio_tc_map, 0, sizeof(dev->prio_tc_map));
+}
+
+static inline
+int netdev_set_tc_queue(struct net_device *dev, u8 tc, u16 count, u16 offset)
+{
+ if (tc >= dev->num_tc)
+ return -EINVAL;
+
+ dev->tc_to_txq[tc].count = count;
+ dev->tc_to_txq[tc].offset = offset;
+ return 0;
+}
+
+static inline
+int netdev_set_num_tc(struct net_device *dev, u8 num_tc)
+{
+ if (num_tc > TC_MAX_QUEUE)
+ return -EINVAL;
+
+ dev->num_tc = num_tc;
+ return 0;
+}
+
+static inline
+int netdev_get_num_tc(struct net_device *dev)
+{
+ return dev->num_tc;
+}
+
+static inline
struct netdev_queue *netdev_get_tx_queue(const struct net_device *dev,
unsigned int index)
{
diff --git a/net/core/dev.c b/net/core/dev.c
index a215269..2348b98 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1593,6 +1593,48 @@ static void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
rcu_read_unlock();
}
+/* netif_setup_tc - Handle tc mappings on real_num_tx_queues change
+ * @dev: Network device
+ * @txq: number of queues available
+ *
+ * If real_num_tx_queues is changed the tc mappings may no longer be
+ * valid. To resolve this verify the tc mapping remains valid and if
+ * not NULL the mapping. With no priorities mapping to this
+ * offset/count pair it will no longer be used. In the worst case TC0
+ * is invalid nothing can be done so disable priority mappings. If is
+ * expected that drivers will fix this mapping if they can before
+ * calling netif_set_real_num_tx_queues.
+ */
+void netif_setup_tc(struct net_device *dev, unsigned int txq)
+{
+ int i;
+ struct netdev_tc_txq *tc = &dev->tc_to_txq[0];
+
+ /* If TC0 is invalidated disable TC mapping */
+ if (tc->offset + tc->count > txq) {
+ pr_warning("Number of in use tx queues changed "
+ "invalidating tc mappings. Priority "
+ "traffic classification disabled!\n");
+ dev->num_tc = 0;
+ return;
+ }
+
+ /* Invalidated prio to tc mappings set to TC0 */
+ for (i = 1; i < TC_BITMASK + 1; i++) {
+ int q = netdev_get_prio_tc_map(dev, i);
+
+ tc = &dev->tc_to_txq[q];
+ if (tc->offset + tc->count > txq) {
+ pr_warning("Number of in use tx queues "
+ "changed. Priority %i to tc "
+ "mapping %i is no longer valid "
+ "setting map to 0\n",
+ i, q);
+ netdev_set_prio_tc_map(dev, i, 0);
+ }
+ }
+}
+
/*
* Routine to help set real_num_tx_queues. To avoid skbs mapped to queues
* greater then real_num_tx_queues stale skbs on the qdisc must be flushed.
@@ -1612,6 +1654,9 @@ int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
if (rc)
return rc;
+ if (dev->num_tc)
+ netif_setup_tc(dev, txq);
+
if (txq < dev->real_num_tx_queues)
qdisc_reset_all_tx_gt(dev, txq);
}
@@ -2165,6 +2210,8 @@ u16 __skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb,
unsigned int num_tx_queues)
{
u32 hash;
+ u16 qoffset = 0;
+ u16 qcount = num_tx_queues;
if (skb_rx_queue_recorded(skb)) {
hash = skb_get_rx_queue(skb);
@@ -2173,13 +2220,19 @@ u16 __skb_tx_hash(const struct net_device *dev, const struct sk_buff *skb,
return hash;
}
+ if (dev->num_tc) {
+ u8 tc = netdev_get_prio_tc_map(dev, skb->priority);
+ qoffset = dev->tc_to_txq[tc].offset;
+ qcount = dev->tc_to_txq[tc].count;
+ }
+
if (skb->sk && skb->sk->sk_hash)
hash = skb->sk->sk_hash;
else
hash = (__force u16) skb->protocol ^ skb->rxhash;
hash = jhash_1word(hash, hashrnd);
- return (u16) (((u64) hash * num_tx_queues) >> 32);
+ return (u16) (((u64) hash * qcount) >> 32) + qoffset;
}
EXPORT_SYMBOL(__skb_tx_hash);
* [net-next-2.6 PATCH v8 0/2] Series short description
From: John Fastabend @ 2011-01-17 18:05 UTC
To: davem
Cc: bhutchings, jarkao2, hadi, eric.dumazet, shemminger, tgraf,
nhorman, netdev
Changes from v7: Ben Hutchings wants to actively manage queues
from the ndo_setup_tc() routine, presumably to change the number
of tx queues, creating n queues per traffic class. I agreed, so
this version updates the patch to work correctly in this case.
Previously I was calling ndo_setup_tc from
netif_set_real_num_tx_queues() to verify the queue offset/count,
but this would break any queue management, so it has been removed.
Now the mappings are invalidated if the queue change requires it,
and it is expected that the netdevice has configured a valid
mapping, so calling back into the driver is not needed. Validation
is still required in the case of a netdevice that does not
implement ndo_setup_tc() or to recover in cases where
ndo_open is adjusting the number of queues due to resource
constraints.
Finally added some documentation to netdevice.h regarding
the new ops routine.
---
John Fastabend (2):
net_sched: implement a root container qdisc sch_mqprio
net: implement mechanism for HW based QOS
include/linux/netdevice.h | 68 +++++++
include/linux/pkt_sched.h | 12 +
net/core/dev.c | 55 ++++++
net/sched/Kconfig | 12 +
net/sched/Makefile | 1
net/sched/sch_generic.c | 4
net/sched/sch_mqprio.c | 417 +++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 568 insertions(+), 1 deletions(-)
create mode 100644 net/sched/sch_mqprio.c
--
Signature
* [PATCH] ipv6: Silence privacy extensions initialization
From: Romain Francoise @ 2011-01-17 17:59 UTC
To: David S. Miller
Cc: Alexey Kuznetsov, Pekka Savola (ipv6), James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, netdev
When a network namespace is created (via CLONE_NEWNET), the loopback
interface is automatically added to the new namespace, triggering a
printk in ipv6_add_dev() if CONFIG_IPV6_PRIVACY is set.
This is problematic for applications which use CLONE_NEWNET as
part of a sandbox, like Chromium's suid sandbox or recent versions of
vsftpd. On a busy machine, it can lead to thousands of useless
"lo: Disabled Privacy Extensions" messages appearing in dmesg.
It's easy enough to check the status of privacy extensions via the
use_tempaddr sysctl, so just removing the printk seems like the most
sensible solution.
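As a rough illustration of checking the setting from userspace instead of relying on the printk, the sysctl can be read through procfs. This is only a sketch: sketch_read_int is a made-up helper, and it assumes the sysctl is exposed at /proc/sys/net/ipv6/conf/<ifname>/use_tempaddr.

```c
#include <stdio.h>

/* Hypothetical helper: reads one integer from a sysctl-style file.
 * Returns 0 on success and stores the value in *value, -1 on error.
 */
static int sketch_read_int(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling sketch_read_int("/proc/sys/net/ipv6/conf/lo/use_tempaddr", &v) would be expected to yield -1 for a device on which privacy extensions were disabled by the code path touched in this patch.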
Signed-off-by: Romain Francoise <romain@orebokech.com>
---
net/ipv6/addrconf.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 5b189c9..24a1cf1 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -420,9 +420,6 @@ static struct inet6_dev * ipv6_add_dev(struct net_device *dev)
dev->type == ARPHRD_TUNNEL6 ||
dev->type == ARPHRD_SIT ||
dev->type == ARPHRD_NONE) {
- printk(KERN_INFO
- "%s: Disabled Privacy Extensions\n",
- dev->name);
ndev->cnf.use_tempaddr = -1;
} else {
in6_dev_hold(ndev);
--
1.7.2.3
* Re: [PATCH] CHOKe flow scheduler (0.9)
From: Eric Dumazet @ 2011-01-17 17:54 UTC
To: Stephen Hemminger; +Cc: Patrick McHardy, David Miller, netdev
In-Reply-To: <20110117092532.7d5f5a5b@nehalam>
Le lundi 17 janvier 2011 à 09:25 -0800, Stephen Hemminger a écrit :
> I rolled in your changes. But there is one more change I want to make.
> The existing flow match based on hash is vulnerable to side-channel DoS attack.
> It is possible for a hostile flow to send packets that match the same
> hash value which would effectively kill a targeted flow.
>
> The solution is to match based on full source and destination, not hash value.
> Still coding that up.
I see, but you only want to make this full test if (!q->filter_list) ?
(or precisely only if skb_get_rxhash() was used to get the cookie )
* Re: [PATCH v2] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Eric Dumazet @ 2011-01-17 17:29 UTC
To: Po-Yu Chuang; +Cc: netdev, linux-kernel, ratbert, bhutchings, joe, dilinger
In-Reply-To: <1295256060-2091-1-git-send-email-ratbert.chuang@gmail.com>
Le lundi 17 janvier 2011 à 17:21 +0800, Po-Yu Chuang a écrit :
> +static int ftmac100_rx_packet(struct ftmac100 *priv, int *processed)
> +{
> + struct net_device *netdev = priv->netdev;
> + struct ftmac100_rxdes *rxdes;
> + struct sk_buff *skb;
> + int length;
> + int copied = 0;
> + int done = 0;
> +
> + rxdes = ftmac100_rx_locate_first_segment(priv);
> + if (!rxdes)
> + return 0;
> +
> + length = ftmac100_rxdes_frame_length(rxdes);
> +
> + netdev->stats.rx_packets++;
> + netdev->stats.rx_bytes += length;
> +
> + if (unlikely(ftmac100_rx_packet_error(priv, rxdes))) {
> + ftmac100_rx_drop_packet(priv);
> + return 1;
> + }
> +
> + /* start processing */
> + skb = netdev_alloc_skb_ip_align(netdev, length);
> + if (unlikely(!skb)) {
> + if (net_ratelimit())
> + netdev_err(netdev, "rx skb alloc failed\n");
> +
> + ftmac100_rx_drop_packet(priv);
> + return 1;
> + }
> +
Please don't increase rx_packets/rx_bytes before the
netdev_alloc_skb_ip_align() call.
In case of a memory allocation failure, it would be better not to
pretend we handled a packet.
drivers/net/r8169.c, for example, updates rx_packets/rx_bytes only if
the packet is delivered to the upper stack.
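The ordering can be sketched in plain userspace C (illustration only, not driver code): the success counters are touched only after the buffer allocation has succeeded. The sketch_ names, the alloc_fails switch and the use of an rx_dropped counter on the failure path are assumptions made for this example.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the relevant net_device stats fields. */
struct sketch_stats {
	unsigned long rx_packets;
	unsigned long rx_bytes;
	unsigned long rx_dropped;
};

/* Handles one received frame of 'length' bytes. alloc_fails simulates
 * the skb allocation failing. Returns 1 if the frame was delivered.
 */
static int sketch_rx_one(struct sketch_stats *stats, size_t length,
			 int alloc_fails)
{
	void *buf = alloc_fails ? NULL : malloc(length);

	if (!buf) {
		/* Nothing was delivered, so only the drop is counted. */
		stats->rx_dropped++;
		return 0;
	}

	/* ... copy the frame and hand it to the upper stack here ... */

	/* Count the packet only once it has really been handled. */
	stats->rx_packets++;
	stats->rx_bytes += length;

	free(buf);
	return 1;
}
```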
* Re: [PATCH] CHOKe flow scheduler (0.9)
From: Stephen Hemminger @ 2011-01-17 17:25 UTC
To: Eric Dumazet; +Cc: Patrick McHardy, David Miller, netdev
In-Reply-To: <1295077542.3977.20.camel@edumazet-laptop>
On Sat, 15 Jan 2011 08:45:42 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le vendredi 14 janvier 2011 à 15:45 -0800, Stephen Hemminger a écrit :
> > CHOKe ("CHOose and Kill" or "CHOose and Keep") is an alternative
> > > packet scheduler based on the Random Early Detection (RED) algorithm.
> >
> > The core idea is:
> > For every packet arrival:
> > Calculate Qave
> > if (Qave < minth)
> > Queue the new packet
> > else
> > Select randomly a packet from the queue
> > if (both packets from same flow)
> > then Drop both the packets
> > else if (Qave > maxth)
> > Drop packet
> > else
> > > Admit packet with probability p (same as RED)
> >
> > See also:
> > Rong Pan, Balaji Prabhakar, Konstantinos Psounis, "CHOKe: a stateless active
> > queue management scheme for approximating fair bandwidth allocation",
> > Proceeding of INFOCOM'2000, March 2000.
> >
> > Help from:
> > Eric Dumazet <eric.dumazet@gmail.com>
> > Patrick McHardy <kaber@trash.net>
> >
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> >
> > ---
> > This version is based on net-next, and assumes Eric's patch for
> > corrected bstats is already applied.
> >
> > 0.9 incorporate patches from Patrick/Eric
> > rework the peek_random and drop code to simplify and fix bug where
> > random_N needs to be called with full length (including holes).
>
> Nice catch, I now have more "matched" counts after my test :
>
> qdisc choke 11: parent 1:11 limit 130000b min 10833b max 32500b ewma 13 Plog 21 Scell_log 30
> Sent 93944198 bytes 170889 pkt (dropped 829140, overlimits 436686 requeues 0)
> rate 48bit 0pps backlog 0b 0p requeues 0
> marked 0 early 436686 pdrop 0 other 0 matched 196227
>
> You missed the qdisc_bstats_update() move from enqueue() to dequeue()
>
> And some minor CodingStyle / checkpatch.pl changes, here is my
> latest diff on top of 0.9
>
> I believe you can release v1 :)
>
> Thanks !
I rolled in your changes. But there is one more change I want to make.
The existing flow match based on the hash is vulnerable to a side-channel
DoS attack. It is possible for a hostile flow to send packets that match the
same hash value, which would effectively kill a targeted flow.
The solution is to match on the full source and destination, not the hash value.
Still coding that up.
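A rough user-space sketch of the decision logic from the description, with
the flow comparison done on full addresses rather than a hash. This is not
the qdisc code: struct flow_key, same_flow() and choke_decide() are
hypothetical names, and including the ports in the comparison is an
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow identity: full addresses and ports, no hash. */
struct flow_key {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

enum choke_verdict {
	CHOKE_ENQUEUE,   /* queue the new packet */
	CHOKE_DROP_BOTH, /* drop the new packet and the randomly chosen one */
	CHOKE_DROP,      /* drop the new packet only */
	CHOKE_RED        /* admit with probability p, as RED does */
};

/* Compare every field, so a forged hash collision cannot match. */
static bool same_flow(const struct flow_key *a, const struct flow_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport;
}

/* 'victim' is the randomly selected queued packet, or NULL for a hole. */
static enum choke_verdict choke_decide(unsigned int qave, unsigned int minth,
				       unsigned int maxth,
				       const struct flow_key *incoming,
				       const struct flow_key *victim)
{
	if (qave < minth)
		return CHOKE_ENQUEUE;
	if (victim && same_flow(incoming, victim))
		return CHOKE_DROP_BOTH;
	if (qave > maxth)
		return CHOKE_DROP;
	return CHOKE_RED;
}
```

The cost is a few extra comparisons per packet compared to a single hash
compare.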
--
^ permalink raw reply
* Re: [PATCH v2] net: add Faraday FTMAC100 10/100 Ethernet driver
From: Joe Perches @ 2011-01-17 17:19 UTC (permalink / raw)
To: Po-Yu Chuang
Cc: netdev, linux-kernel, ratbert, bhutchings, eric.dumazet, dilinger
In-Reply-To: <1295256060-2091-1-git-send-email-ratbert.chuang@gmail.com>
On Mon, 2011-01-17 at 17:21 +0800, Po-Yu Chuang wrote:
> From: Po-Yu Chuang <ratbert@faraday-tech.com>
> FTMAC100 Ethernet Media Access Controller supports 10/100 Mbps and
> MII. This driver has been working on some ARM/NDS32 SoC's including
> Faraday A320 and Andes AG101.
Hi again.
> Signed-off-by: Po-Yu Chuang <ratbert@faraday-tech.com>
> ---
> v2:
> always use NAPI
> do not use our own net_device_stats structure
> don't set trans_start and last_rx
> stats.rx_packets and stats.rx_bytes include dropped packets
> add missed netif_napi_del()
> initialize spinlocks in probe function
> remove rx_lock and hw_lock
> use netdev_[err/info/dbg] instead of dev_* ones
> use netdev_alloc_skb_ip_align()
> remove ftmac100_get_stats()
> use is_valid_ether_addr() instead of is_zero_ether_addr()
> add const to ftmac100_ethtool_ops and ftmac100_netdev_ops
> use net_ratelimit() instead of printk_ratelimit()
> no explicit inline
> use %pM to print MAC address
> add comment before wmb
> use napi poll() to handle all interrupts
This looks very clean, thanks for doing the rework.
Now the really trivial...
> + * priveate data
private
> +static void ftmac100_enable_all_int(struct ftmac100 *priv)
> +{
> + unsigned int imr;
> +
> + imr = FTMAC100_INT_RPKT_FINISH | FTMAC100_INT_NORXBUF
> + | FTMAC100_INT_XPKT_OK | FTMAC100_INT_XPKT_LOST
> + | FTMAC100_INT_RPKT_LOST | FTMAC100_INT_AHB_ERR
> + | FTMAC100_INT_PHYSTS_CHG;
This could be a #define.
> + maccr = FTMAC100_MACCR_XMT_EN |
> + FTMAC100_MACCR_RCV_EN |
> + FTMAC100_MACCR_XDMA_EN |
> + FTMAC100_MACCR_RDMA_EN |
> + FTMAC100_MACCR_CRC_APD |
> + FTMAC100_MACCR_FULLDUP |
> + FTMAC100_MACCR_RX_RUNT |
> + FTMAC100_MACCR_RX_BROADPKT;
Here too.
> +static int ftmac100_rx_packet_error(struct ftmac100 *priv,
> + struct ftmac100_rxdes *rxdes)
[]
> + if (unlikely(ftmac100_rxdes_frame_too_long(rxdes))) {
> + if (net_ratelimit())
> + netdev_info(netdev, "rx frame too long\n");
> +
> + netdev->stats.rx_length_errors++;
> + error = 1;
> + }
> +
> + if (unlikely(ftmac100_rxdes_runt(rxdes))) {
else if ?
> +static int ftmac100_rx_packet(struct ftmac100 *priv, int *processed)
> +{
> + struct net_device *netdev = priv->netdev;
> + struct ftmac100_rxdes *rxdes;
> + struct sk_buff *skb;
> + int length;
> + int copied = 0;
> + int done = 0;
You could use bool/true/false here for copied and done
and all the other uses of an int for a logical bool.
> +static void ftmac100_txdes_set_dma_own(struct ftmac100_txdes *txdes)
> +{
> + /*
> + * Make sure dma own bit will not be set before any other
> + * descriptor fiels.
field/fields
> +static int ftmac100_mdio_read(struct net_device *netdev, int phy_id, int reg)
> +{
> + struct ftmac100 *priv = netdev_priv(netdev);
> + int phycr;
> + int i;
> +
> + phycr = FTMAC100_PHYCR_PHYAD(phy_id) |
> + FTMAC100_PHYCR_REGAD(reg) |
> + FTMAC100_PHYCR_MIIRD;
> +
> + iowrite32(phycr, priv->base + FTMAC100_OFFSET_PHYCR);
> + for (i = 0; i < 10; i++) {
> + phycr = ioread32(priv->base + FTMAC100_OFFSET_PHYCR);
> +
> + if ((phycr & FTMAC100_PHYCR_MIIRD) == 0)
> + return phycr & FTMAC100_PHYCR_MIIRDATA;
> +
> + usleep_range(100, 1000);
> + }
> +
> + netdev_err(netdev, "mdio read timed out\n");
> + return 0xffff;
0xffff is a rather odd return, perhaps a #define?
> +/******************************************************************************
> + * initialization / finalization
> + *****************************************************************************/
> +static int __init ftmac100_init(void)
> +{
> + printk(KERN_INFO "Loading " DRV_NAME ": version " DRV_VERSION " ...\n");
You could use
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
before any #include and
pr_info("Loading version " DRV_VERSION " ...\n");
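To show what the pr_fmt() prefix does, here is a small user-space sketch.
It is an illustration only: KBUILD_MODNAME is normally supplied by kbuild
and is hard-coded here, the DRV_VERSION value is made up, and pr_info_buf()
is a hypothetical stand-in for pr_info() that formats into a buffer instead
of the kernel log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Normally provided by kbuild; hard-coded for this example. */
#define KBUILD_MODNAME "ftmac100"
/* Hypothetical version string. */
#define DRV_VERSION "0.2"

/* Every format string gets the module name prepended at compile time. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Stand-in for pr_info(); uses the GNU ##__VA_ARGS__ extension. */
#define pr_info_buf(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)

/* Format the load banner into 'buf'; returns the snprintf() result. */
static int format_banner(char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	return pr_info_buf(buf, size, "Loading version " DRV_VERSION " ...\n");
}
```

The prefix comes from string literal concatenation, so there is no runtime
cost and every message in the file gets it without repeating DRV_NAME.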
One last comment on split long line indentation style
and long function declarations.
There's no required style so you can use what you are
most comfortable doing.
Most of drivers/net uses an alignment to open parenthesis
using maximal tabs and minimal necessary spaces instead of
an extra tabstop.
Like:
static int some_long_function(type var1, type var2...
                              type varN)
and
        some_long_function(var1, var2, ...
                           varN);
not
static int some_long_function(type var1, type var2...
                type varN)
and
        some_long_function(var1, var2, ...
                        varN);
^ permalink raw reply
* [PATCHv2] USB CDC NCM: tx_fixup() race condition fix
From: Alexey Orishko @ 2011-01-17 17:07 UTC (permalink / raw)
To: linux-usb; +Cc: netdev, davem, gregkh, yauheni.kaliuta, Alexey Orishko
- tx_fixup() can be called from either timer callback or from xmit()
in usbnet, so spinlock is added to avoid concurrency-related problem.
- minor correction due to checkpatch warning for some line over 80
chars after previous patch was applied.
Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
---
drivers/net/usb/cdc_ncm.c | 19 ++++++++++++-------
1 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index d776c4a..04e8ce1 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -54,7 +54,7 @@
#include <linux/usb/usbnet.h>
#include <linux/usb/cdc.h>
-#define DRIVER_VERSION "30-Nov-2010"
+#define DRIVER_VERSION "17-Jan-2011"
/* CDC NCM subclass 3.2.1 */
#define USB_CDC_NCM_NDP16_LENGTH_MIN 0x10
@@ -868,15 +868,19 @@ static void cdc_ncm_tx_timeout(unsigned long arg)
if (ctx->tx_timer_pending != 0) {
ctx->tx_timer_pending--;
restart = 1;
- } else
+ } else {
restart = 0;
+ }
spin_unlock(&ctx->mtx);
- if (restart)
+ if (restart) {
+ spin_lock(&ctx->mtx);
cdc_ncm_tx_timeout_start(ctx);
- else if (ctx->netdev != NULL)
+ spin_unlock(&ctx->mtx);
+ } else if (ctx->netdev != NULL) {
usbnet_start_xmit(NULL, ctx->netdev);
+ }
}
static struct sk_buff *
@@ -900,7 +904,6 @@ cdc_ncm_tx_fixup(struct usbnet *dev, struct sk_buff *skb, gfp_t flags)
skb_out = cdc_ncm_fill_tx_frame(ctx, skb);
if (ctx->tx_curr_skb != NULL)
need_timer = 1;
- spin_unlock(&ctx->mtx);
/* Start timer, if there is a remaining skb */
if (need_timer)
@@ -908,6 +911,8 @@ cdc_ncm_tx_fixup(struct usbnet *dev, struct sk_buff *skb, gfp_t flags)
if (skb_out)
dev->net->stats.tx_packets += ctx->tx_curr_frame_num;
+
+ spin_unlock(&ctx->mtx);
return skb_out;
error:
@@ -1020,8 +1025,8 @@ static int cdc_ncm_rx_fixup(struct usbnet *dev, struct sk_buff *skb_in)
if (((offset + temp) > actlen) ||
(temp > CDC_NCM_MAX_DATAGRAM_SIZE) || (temp < ETH_HLEN)) {
pr_debug("invalid frame detected (ignored)"
- "offset[%u]=%u, length=%u, skb=%p\n",
- x, offset, temp, skb_in);
+ "offset[%u]=%u, length=%u, skb=%p\n",
+ x, offset, temp, skb_in);
if (!x)
goto error;
break;
--
1.7.0.4
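For readers following the locking change, a single-threaded mock of the
rule the patch enforces: the timer is only started, and the counter only
updated, while the lock is held. This cannot reproduce the race itself.
struct ncm_ctx, ctx_lock(), ctx_unlock(), timer_start() and tx_fixup() here
are hypothetical simplifications, not the functions in cdc_ncm.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock context; 'locked' models ctx->mtx so the rule can be checked. */
struct ncm_ctx {
	bool locked;
	bool timer_running;
	int pending_frames;       /* frames still waiting to be sent */
	unsigned long tx_packets;
};

static void ctx_lock(struct ncm_ctx *ctx)
{
	assert(!ctx->locked);
	ctx->locked = true;
}

static void ctx_unlock(struct ncm_ctx *ctx)
{
	assert(ctx->locked);
	ctx->locked = false;
}

/* Must be called with the lock held. */
static void timer_start(struct ncm_ctx *ctx)
{
	assert(ctx->locked);
	ctx->timer_running = true;
}

/* Shape of the fixup path after the change: one critical section. */
static void tx_fixup(struct ncm_ctx *ctx, unsigned int sent_frames)
{
	ctx_lock(ctx);
	if (ctx->pending_frames > 0)
		timer_start(ctx);
	ctx->tx_packets += sent_frames;
	ctx_unlock(ctx);
}
```

If the unlock were moved back above the timer start, the assertion in
timer_start() would fire, which is the kind of check lockdep-style
assertions give in the kernel.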
^ permalink raw reply related
* Re: [PATCH] dummy: do not create a link (dummy0) at module init by default
From: Stephen Hemminger @ 2011-01-17 16:56 UTC (permalink / raw)
To: David Ward; +Cc: netdev
In-Reply-To: <1295225393-5779-1-git-send-email-david.ward@ll.mit.edu>
On Sun, 16 Jan 2011 19:49:53 -0500
David Ward <david.ward@ll.mit.edu> wrote:
> When the dummy network driver is initialized with no parameters, a link
> is automatically created (named 'dummy0'). This is inconsistent with
> other virtual network drivers such as veth, macvlan, and macvtap, which
> do not create a link upon initialization.
>
> This also causes confusing behavior when sending an RTM_NEWLINK message
> for a dummy link, because the kernel will load the dummy network driver
> first if it has not already been loaded. When that occurs, the result
> is that two new links are actually created (or if IFLA_IFNAME is set to
> 'dummy0', the error EEXIST is returned). The following iproute command
> demonstrates this behavior:
>
> ip link add [ name dummy0 ] type dummy
>
> With this change, users who still want to have a link created when the
> dummy network driver is loaded (instead of using iproute to create the
> link as shown above) just need to set the 'numdummies' parameter to 1:
>
> modprobe dummy numdummies=1
>
> Signed-off-by: David Ward <david.ward@ll.mit.edu>
I understand what you are trying to do, and it makes sense.
But because of the history behind this it can't change.
We can't change the existing API and break user scripts.
The 'ip link' command support is new (in the last couple of years), and
the module parameter has been around since the early days.
If you want to load the module without any devices just use:
modprobe dummy numdummies=0
--
^ permalink raw reply
* Re: [Patch] Kill off warning: ‘inline’ is not at beginning of declaration
From: Gustavo F. Padovan @ 2011-01-17 16:13 UTC (permalink / raw)
To: Jesper Juhl
Cc: alsa-devel, Mauro Carvalho Chehab, Takashi Iwai,
Frederic Weisbecker, H. Peter Anvin, Jaroslav Kysela, Jens Axboe,
Stephen Hemminger, Andi Kleen, Pekka Savola (ipv6), x86,
James Morris, Ingo Molnar, oprofile-list, Alexey Kuznetsov,
Mark Fasheh, Marcel Holtmann, John W. Linville, David Teigland,
Joel Becker, Thomas Gleixner, linux-edac, trivial,
Hideaki YOSHIFUJI, netdev, Greg
In-Reply-To: <alpine.LNX.2.00.1101170000270.13377@swampdragon.chaosbits.net>
* Jesper Juhl <jj@chaosbits.net> [2011-01-17 00:09:38 +0100]:
> Fix a bunch of
> warning: ‘inline’ is not at beginning of declaration
> messages when building a 'make allyesconfig' kernel with -Wextra.
>
> These warnings are trivial to kill, yet rather annoying when building with
> -Wextra.
> The more we can cut down on pointless crap like this the better (IMHO).
>
> A previous patch to do this for a 'allnoconfig' build has already been
> merged. This just takes the cleanup a little further.
>
> Signed-off-by: Jesper Juhl <jj@chaosbits.net>
> ---
> arch/x86/oprofile/op_model_p4.c | 2 +-
> drivers/bluetooth/btusb.c | 4 ++--
> drivers/cpuidle/sysfs.c | 2 +-
> drivers/edac/i7300_edac.c | 2 +-
> fs/ocfs2/dir.c | 2 +-
> kernel/trace/ring_buffer.c | 2 +-
> net/ipv6/inet6_hashtables.c | 2 +-
> net/mac80211/tx.c | 2 +-
> sound/pci/au88x0/au88x0.h | 4 ++--
> sound/pci/au88x0/au88x0_core.c | 4 ++--
> 10 files changed, 13 insertions(+), 13 deletions(-)
For drivers/bluetooth
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
--
Gustavo F. Padovan
http://profusion.mobi
^ permalink raw reply
* Re: 2.6.37 regression: adding main interface to a bridge breaks vlan interface RX
From: Ben Hutchings @ 2011-01-17 16:00 UTC (permalink / raw)
To: Simon Arlott; +Cc: netdev, Linux Kernel Mailing List, jesse, Herbert Xu
In-Reply-To: <4D32FC1C.3010905@simon.arlott.org.uk>
On Sun, 2011-01-16 at 14:09 +0000, Simon Arlott wrote:
> [ 1.666706] forcedeth 0000:00:08.0: ifname eth0, PHY OUI 0x5043 @ 16, addr 00:e0:81:4d:2b:ec
> [ 1.666767] forcedeth 0000:00:08.0: highdma csum vlan pwrctl mgmt gbit lnktim msi desc-v3
>
> I have eth0 and eth0.3840 which works until I add eth0 to a bridge.
> While eth0 is in a bridge (the bridge device is up), eth0.3840 is unable
> to receive packets. Using tcpdump on eth0 shows the packets being
> received with a VLAN tag but they don't appear on eth0.3840. They appear
> with the VLAN tag on the bridge interface.
[...]
This means the behaviour is now consistent, whether or not hardware VLAN
tag stripping is enabled. (I previously pointed out the inconsistent
behaviour in <http://thread.gmane.org/gmane.linux.network/149864>.) I
would consider this an improvement.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* RE: [PATCH] USB CDC NCM: tx_fixup() race condition fix
From: Alexey ORISHKO @ 2011-01-17 15:04 UTC (permalink / raw)
To: Sergei Shtylyov
Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org,
gregkh-l3A5Bk7waGM@public.gmane.org,
yauheni.kaliuta-xNZwKgViW5gAvxtiuMwx3w@public.gmane.org
In-Reply-To: <4D3458F7.5070209-hkdhdckH98+B+jHODAdFcQ@public.gmane.org>
> > - tx_fixup() call be called from either timer callback or from xmit()
>
> s/call/can/?
Yes :-)
>
> > in usbnet, so spinlock is added to avoid concurrency-related
> problem.
> > - minor correction due checkpatch warning for some line over 80 chars
>
> Due to?
yep
Sorry for typos...
Regards,
alexey
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [PATCH] USB CDC NCM: tx_fixup() race condition fix
From: Sergei Shtylyov @ 2011-01-17 14:57 UTC (permalink / raw)
To: Alexey Orishko
Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
davem-fT/PcQaiUtIeIZ0/mPfg9Q, gregkh-l3A5Bk7waGM,
yauheni.kaliuta-xNZwKgViW5gAvxtiuMwx3w, Alexey Orishko
In-Reply-To: <1295271573-8890-1-git-send-email-alexey.orishko-0IS4wlFg1OjSUeElwK9/Pw@public.gmane.org>
Hello.
Alexey Orishko wrote:
> - tx_fixup() call be called from either timer callback or from xmit()
s/call/can/?
> in usbnet, so spinlock is added to avoid concurrency-related problem.
> - minor correction due checkpatch warning for some line over 80 chars
Due to?
> after previous patch was applied.
> Signed-off-by: Alexey Orishko <alexey.orishko-0IS4wlFg1OjSUeElwK9/Pw@public.gmane.org>
> ---
> drivers/net/usb/cdc_ncm.c | 13 ++++++++-----
> 1 files changed, 8 insertions(+), 5 deletions(-)
> diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
> index d776c4a..bf13fa6 100644
> --- a/drivers/net/usb/cdc_ncm.c
> +++ b/drivers/net/usb/cdc_ncm.c
> @@ -54,7 +54,7 @@
> #include <linux/usb/usbnet.h>
> #include <linux/usb/cdc.h>
>
> -#define DRIVER_VERSION "30-Nov-2010"
> +#define DRIVER_VERSION "17-Jan-2011"
>
> /* CDC NCM subclass 3.2.1 */
> #define USB_CDC_NCM_NDP16_LENGTH_MIN 0x10
> @@ -873,9 +873,11 @@ static void cdc_ncm_tx_timeout(unsigned long arg)
>
> spin_unlock(&ctx->mtx);
>
> - if (restart)
> + if (restart) {
> + spin_lock(&ctx->mtx);
> cdc_ncm_tx_timeout_start(ctx);
> - else if (ctx->netdev != NULL)
> + spin_unlock(&ctx->mtx);
> + } else if (ctx->netdev != NULL)
The 'else' branch should now also have {}, according to
Documentation/CodingStyle.
> usbnet_start_xmit(NULL, ctx->netdev);
> }
>
> @@ -1021,7 +1024,7 @@ static int cdc_ncm_rx_fixup(struct usbnet *dev, struct sk_buff *skb_in)
> (temp > CDC_NCM_MAX_DATAGRAM_SIZE) || (temp < ETH_HLEN)) {
> pr_debug("invalid frame detected (ignored)"
> "offset[%u]=%u, length=%u, skb=%p\n",
> - x, offset, temp, skb_in);
> + x, offset, temp, skb_in);
Would be good to align uniformly with the previous line...
WBR, Sergei
^ permalink raw reply
* Re: Merging SSB and HND/AI support
From: Jonas Gorski @ 2011-01-17 14:01 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Michael Büsch, linux-mips-6z/3iImG2C8G8FEW9MqTrA,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <AANLkTinwGaqg8ahGWd3+_dfhrCNTQNOfO1E-EUepFJ+C-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 17 January 2011 14:54, Geert Uytterhoeven <geert-Td1EMuHUCqxL1ZNQvxDV9g@public.gmane.org> wrote:
> If it's AMBA, can it be integrated with the existing code in drivers/amba/?
Hm, I once had a sentence about it there, I must have accidentally deleted it.
I tried finding similarities between Broadcom's code and ARM's AMBA
specification to better understand the code, but except some tiny ones
I couldn't find anything usable. Unfortunately I couldn't find
anything about Broadcom's AMBA implementation, except that it's "AMBA"
licensed from ARM.
Jonas
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply
* Re: [Patch] Kill off warning: ‘inline’ is not at beginning of declaration
From: John W. Linville @ 2011-01-17 13:55 UTC (permalink / raw)
To: Jesper Juhl
Cc: alsa-devel, Mauro Carvalho Chehab, Takashi Iwai,
Frederic Weisbecker, Gustavo F. Padovan, Jens Axboe,
Stephen Hemminger, Andi Kleen, H. Peter Anvin,
Pekka Savola (ipv6), Robert Richter, x86, James Morris,
Ingo Molnar, oprofile-list, Alexey Kuznetsov, Mark Fasheh,
Marcel Holtmann, David Teigland, Joel Becker, Thomas Gleixner,
linux-edac, trivial, Hideaki YOSHIFUJI, netdev, Greg
In-Reply-To: <alpine.LNX.2.00.1101170000270.13377@swampdragon.chaosbits.net>
On Mon, Jan 17, 2011 at 12:09:38AM +0100, Jesper Juhl wrote:
> Fix a bunch of
> warning: ‘inline’ is not at beginning of declaration
> messages when building a 'make allyesconfig' kernel with -Wextra.
>
> These warnings are trivial to kill, yet rather annoying when building with
> -Wextra.
> The more we can cut down on pointless crap like this the better (IMHO).
>
> A previous patch to do this for a 'allnoconfig' build has already been
> merged. This just takes the cleanup a little further.
>
> Signed-off-by: Jesper Juhl <jj@chaosbits.net>
> ---
> net/mac80211/tx.c | 2 +-
ack
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply
* Re: Merging SSB and HND/AI support
From: Geert Uytterhoeven @ 2011-01-17 13:54 UTC (permalink / raw)
To: Jonas Gorski
Cc: Michael Büsch, linux-mips-6z/3iImG2C8G8FEW9MqTrA,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <AANLkTims0DPfG+u9qynuuj_-0WjUr1nAGLuFz3k706T--JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Mon, Jan 17, 2011 at 14:43, Jonas Gorski <jonas.gorski-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On 17 January 2011 12:57, Michael Büsch <mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org> wrote:
>> Well... I don't really like the idea of running one driver and
>> subsystem implementation on completely distinct types of silicon.
>> We will end up with the same mess that broadcom ended up with in
>> their "SB" code (broadcom's SSB backplane implementation).
>> For example, in their code the driver calls pci_enable_device() and
>> related PCI functions, even if there is no PCI device at all. The calls
>> are magically re-routed to the actual SB backplane.
>> You'd have to do the same mess with SSB. Calling ssb_device_enable()
>> will mean "enable the SSB device", if the backplane is SSB, and will
>> mean "enable the HND/AI" device, if the backplane is HND/AI.
> P.S: Any suggestions for the name? Would be "ai" okay? Technically
> it's "AMBA Interconnect", but "amba" is already taken.
If it's AMBA, can it be integrated with the existing code in drivers/amba/?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply
* Re: Merging SSB and HND/AI support
From: Jonas Gorski @ 2011-01-17 13:43 UTC (permalink / raw)
To: Michael Büsch; +Cc: linux-mips, linux-wireless, netdev
In-Reply-To: <1295265468.24530.23.camel@maggie>
On 17 January 2011 12:57, Michael Büsch <mb@bu3sch.de> wrote:
> Well... I don't really like the idea of running one driver and
> subsystem implementation on completely distinct types of silicon.
> We will end up with the same mess that broadcom ended up with in
> their "SB" code (broadcom's SSB backplane implementation).
> For example, in their code the driver calls pci_enable_device() and
> related PCI functions, even if there is no PCI device at all. The calls
> are magically re-routed to the actual SB backplane.
> You'd have to do the same mess with SSB. Calling ssb_device_enable()
> will mean "enable the SSB device", if the backplane is SSB, and will
> mean "enable the HND/AI" device, if the backplane is HND/AI.
It didn't strike me as that bad, but I also didn't look at any PCI code.
> So I'm still in favor of doing a separate HND/AI bus implementation,
> even if that means duplicating a few lines of code.
Well, it means at least duplicating most of the chipcommon driver and
the mips core driver. But if you are fine with that, I see no problem
with having a separate driver for the AI bus.
> SSB doesn't search for SSB busses in the system, because there's no
> way to do so. The architecture (or the PCI/PCMCIA/SDIO device) registers
> the bus, if it detected an SSB device. So for the embedded case, it's hardcoded
> in the arch code. For the PCI case it simply depends on the PCI IDs.
> I don't see a problem here. Your arch code will already have to know
> what machine it is running on. So it will have to decide whether to
> register a SSB or HND/AI bus.
Okay. This is mostly for the embedded case, where it is possible to
create a single kernel that boots on both. The "detection" could also
be done through the cpu type (74k => register AI bus, else SSB bus)
instead of the chipid register of the common core.
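A sketch of the two detection options, purely as an illustration. All names
are hypothetical, and the bit position used for the chipid check is an
assumption; I have not verified it against the hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

enum bcm_bus { BCM_BUS_SSB, BCM_BUS_AI };
enum cpu_type { CPU_MIPS_4K, CPU_MIPS_74K };  /* hypothetical */

/* Hypothetical bit in the common core chipid register. */
#define CHIPID_BUS_AI_BIT (1u << 28)

/* Option 1: decide from the cpu type (74k => AI bus, else SSB bus). */
static enum bcm_bus bus_from_cpu(enum cpu_type cpu)
{
	return cpu == CPU_MIPS_74K ? BCM_BUS_AI : BCM_BUS_SSB;
}

/* Option 2: decide from the chipid register of the common core. */
static enum bcm_bus bus_from_chipid(uint32_t chipid)
{
	return (chipid & CHIPID_BUS_AI_BIT) ? BCM_BUS_AI : BCM_BUS_SSB;
}
```

Either way the arch code makes the decision once and registers exactly one
bus.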
>> Also I don't know
>> if it is a good idea to let arch-specific code depend on code in
>> staging.
>
> Sure. The code needs to be cleaned up and moved to the mainline kernel
> _anyway_. You don't get around this.
Yes, you are right.
So I guess the proposed course of action would be:
1. Make the HND/AI-Bus code from brcm80211 its own independent driver,
2. Re-add the non-wifi related code (chipcommon, mips, etc),
3. Clean up the code until it meets Linux' code style/quality,
4. Move it out of staging,
and finally
5. Add the required arch specific code to bcm47xx for the newer SoCs.
Jonas
P.S: Any suggestions for the name? Would be "ai" okay? Technically
it's "AMBA Interconnect", but "amba" is already taken.
^ permalink raw reply
* [PATCH] USB CDC NCM: tx_fixup() race condition fix
From: Alexey Orishko @ 2011-01-17 13:39 UTC (permalink / raw)
To: linux-usb-u79uwXL29TY76Z2rM5mHXA
Cc: netdev-u79uwXL29TY76Z2rM5mHXA, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
gregkh-l3A5Bk7waGM, yauheni.kaliuta-xNZwKgViW5gAvxtiuMwx3w,
Alexey Orishko
- tx_fixup() call be called from either timer callback or from xmit()
in usbnet, so spinlock is added to avoid concurrency-related problem.
- minor correction due checkpatch warning for some line over 80 chars
after previous patch was applied.
Signed-off-by: Alexey Orishko <alexey.orishko-0IS4wlFg1OjSUeElwK9/Pw@public.gmane.org>
---
drivers/net/usb/cdc_ncm.c | 13 ++++++++-----
1 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index d776c4a..bf13fa6 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -54,7 +54,7 @@
#include <linux/usb/usbnet.h>
#include <linux/usb/cdc.h>
-#define DRIVER_VERSION "30-Nov-2010"
+#define DRIVER_VERSION "17-Jan-2011"
/* CDC NCM subclass 3.2.1 */
#define USB_CDC_NCM_NDP16_LENGTH_MIN 0x10
@@ -873,9 +873,11 @@ static void cdc_ncm_tx_timeout(unsigned long arg)
spin_unlock(&ctx->mtx);
- if (restart)
+ if (restart) {
+ spin_lock(&ctx->mtx);
cdc_ncm_tx_timeout_start(ctx);
- else if (ctx->netdev != NULL)
+ spin_unlock(&ctx->mtx);
+ } else if (ctx->netdev != NULL)
usbnet_start_xmit(NULL, ctx->netdev);
}
@@ -900,7 +902,6 @@ cdc_ncm_tx_fixup(struct usbnet *dev, struct sk_buff *skb, gfp_t flags)
skb_out = cdc_ncm_fill_tx_frame(ctx, skb);
if (ctx->tx_curr_skb != NULL)
need_timer = 1;
- spin_unlock(&ctx->mtx);
/* Start timer, if there is a remaining skb */
if (need_timer)
@@ -908,6 +909,8 @@ cdc_ncm_tx_fixup(struct usbnet *dev, struct sk_buff *skb, gfp_t flags)
if (skb_out)
dev->net->stats.tx_packets += ctx->tx_curr_frame_num;
+
+ spin_unlock(&ctx->mtx);
return skb_out;
error:
@@ -1021,7 +1024,7 @@ static int cdc_ncm_rx_fixup(struct usbnet *dev, struct sk_buff *skb_in)
(temp > CDC_NCM_MAX_DATAGRAM_SIZE) || (temp < ETH_HLEN)) {
pr_debug("invalid frame detected (ignored)"
"offset[%u]=%u, length=%u, skb=%p\n",
- x, offset, temp, skb_in);
+ x, offset, temp, skb_in);
if (!x)
goto error;
break;
--
1.7.0.4
^ permalink raw reply related
* Re: rps testing questions
From: Ben Hutchings @ 2011-01-17 13:08 UTC (permalink / raw)
To: mi wake; +Cc: netdev
In-Reply-To: <AANLkTin1pC=auiFBt83YomdhVgUO8uSdvq=tPaDu0=3U@mail.gmail.com>
On Mon, 2011-01-17 at 17:43 +0800, mi wake wrote:
> I am testing rps (Receive Packet Steering) on centos 5.5 with kernel 2.6.37.
> cpu: 8 core Intel.
> ethernet adapter: bnx2x
>
> Problem statement:
> enable rps with:
> echo "ff" > /sys/class/net/eth2/queues/rx-0/rps_cpus.
>
> running 1 instances of netperf TCP_RR: netperf -t TCP_RR -H 192.168.0.1 -c -C
> without rps: 9963.48(Trans Rate per sec)
> with rps: 9387.59(Trans Rate per sec)
>
> I also ran ab and tbench tests and found lower tps with rps enabled,
> although cpu usage is higher. With rps enabled, softirqs are balanced
> across the cpus.
>
> is there something wrong with my test?
In addition to what Eric said, check the interrupt moderation settings
(ethtool -c/-C options). One-way latency for a single request/response
test will be at least the interrupt moderation value.
I haven't tested RPS by itself (Solarflare NICs have plenty of hardware
queues) so I don't know whether it can improve latency. However, RFS
certainly does when there are many flows.
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH v3 00/16] make rpc_pipefs be mountable multiple time
From: Rob Landley @ 2011-01-17 12:30 UTC (permalink / raw)
To: Kirill A. Shutemov
Cc: Trond Myklebust, J. Bruce Fields, Neil Brown, Pavel Emelyanov,
linux-nfs, David S. Miller, Al Viro, containers, netdev,
linux-kernel
In-Reply-To: <1295012954-7769-1-git-send-email-kas@openvz.org>
On 01/14/2011 07:48 AM, Kirill A. Shutemov wrote:
> Prepare nfs/sunrpc stack to use multiple instances of rpc_pipefs.
> Only for client for now.
Ok, Google is being really unhelpful here.
What is rpc_pipefs for? What uses it, and to do what exactly? Is it
used by nfs server code, or by the client code, or both? Is it a way
for userspace to talk to the kernel, or for the kernel to talk to
itself? Is it used at mount time, or during filesystem operation?
I'm interested in giving this patch series a much more thorough review,
but I can't figure out what the subsystem it's modifying actually _is_.
(Maybe this is something to do with filesystems/nfs/rpc-cache.txt?)
Rob
^ permalink raw reply
* Re: Merging SSB and HND/AI support
From: Michael Büsch @ 2011-01-17 12:00 UTC (permalink / raw)
To: Florian Fainelli
Cc: Jonas Gorski, linux-mips-6z/3iImG2C8G8FEW9MqTrA,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <201101171220.52292.florian-p3rKhJxN3npAfugRpC6u6w@public.gmane.org>
On Mon, 2011-01-17 at 12:20 +0100, Florian Fainelli wrote:
> On Monday 17 January 2011 11:56:23 Michael Büsch wrote:
> > On Mon, 2011-01-17 at 11:46 +0100, Jonas Gorski wrote:
> > > Hello,
> > >
> > > I am currently looking into adding support for the newer Broadcom
> > > BCM47xx/53xx SoCs. They require having HND/AI support, which probably
> > > means merging the current SSB code and the HND/AI code from the
> > > brcm80211 driver. Is anyone already working on this?
> > >
> > > As far as I can see, there are two possibilities:
> > >
> > > a) Merge the HND/AI code into the current SSB code, or
> > >
> > > b) add the missing code for SoCs to brcm80211 and replace the SSB code
> > > with it.
> >
> > Why can't we keep those two platforms separated?
>
> That is also what I am wondering about. Considering that previous BCM47xx
> platforms use a MIPS4k core and newer ones use MIPS74k or later, you would not
> be able to build a single kernel for both which takes advantage of
> compile-time optimizations targeting MIPS74k. If this is not a big concern,
> then let's target a single kernel.
Ok, but it should be easily possible to compile both SSB and HND/AI
bus support into one kernel anyway. Nothing prevents drivers from having
an SSB and an HND/AI probe callback.
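Roughly what I mean, as a user-space sketch with made-up names. None of
these types exist in the kernel; struct dual_bus_driver, driver_probe() and
the wl_probe_*() callbacks are hypothetical, and a real driver would
register with each bus separately.

```c
#include <assert.h>
#include <stddef.h>

enum bus_kind { BUS_NONE, BUS_SSB, BUS_AI };

/* Hypothetical driver description with one probe callback per bus. */
struct dual_bus_driver {
	const char *name;
	int (*probe_ssb)(int core_id);
	int (*probe_ai)(int core_id);
};

/* Records which callback ran last, for demonstration only. */
static enum bus_kind last_probed = BUS_NONE;

static int wl_probe_ssb(int core_id)
{
	(void)core_id;
	last_probed = BUS_SSB;
	return 0;
}

static int wl_probe_ai(int core_id)
{
	(void)core_id;
	last_probed = BUS_AI;
	return 0;
}

/* Each bus calls only its own callback; returns -1 if there is none. */
static int driver_probe(const struct dual_bus_driver *drv,
			enum bus_kind bus, int core_id)
{
	int (*probe)(int) = NULL;

	if (bus == BUS_SSB)
		probe = drv->probe_ssb;
	else if (bus == BUS_AI)
		probe = drv->probe_ai;

	return probe ? probe(core_id) : -1;
}
```

The shared driver logic stays in one place and only the bus glue is
duplicated.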
--
Greetings Michael.
^ permalink raw reply
* Re: Merging SSB and HND/AI support
From: Michael Büsch @ 2011-01-17 11:57 UTC (permalink / raw)
To: Jonas Gorski
Cc: linux-mips-6z/3iImG2C8G8FEW9MqTrA,
linux-wireless-u79uwXL29TY76Z2rM5mHXA,
netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <AANLkTikJcug7LUTgX_YDD4Z8ZBrdkAdLq8_Epa6TkA5f-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On Mon, 2011-01-17 at 12:21 +0100, Jonas Gorski wrote:
> On 17 January 2011 11:56, Michael Büsch <mb-fseUSCV1ubazQB+pC5nmwQ@public.gmane.org> wrote:
> > On Mon, 2011-01-17 at 11:46 +0100, Jonas Gorski wrote:
> >> a) Merge the HND/AI code into the current SSB code, or
> >>
> >> b) add the missing code for SoCs to brcm80211 and replace the SSB code with it.
> >
> > Why can't we keep those two platforms separated?
> > Is there really a lot of shared code between SSB and HND/AI?
>
> Yes, as far as I understand the AI bus behaves mostly like an SSB bus
> except for places like enabling/disabling cores. E.g. the AI bus also
> has a common core, which has a bit for telling whether it's an SSB or AI
> bus, and has mostly the same registers as the SSB common cores (so
> most driver_chipcommon_* stuff also applies to the AI bus).
Well... I don't really like the idea of running one driver and
subsystem implementation on completely distinct types of silicon.
We will end up with the same mess that broadcom ended up with in
their "SB" code (broadcom's SSB backplane implementation).
For example, in their code the driver calls pci_enable_device() and
related PCI functions, even if there is no PCI device at all. The calls
are magically re-routed to the actual SB backplane.
You'd have to do the same mess with SSB. Calling ssb_device_enable()
will mean "enable the SSB device" if the backplane is SSB, and
"enable the HND/AI device" if the backplane is HND/AI.
So I'm still in favor of doing a separate HND/AI bus implementation,
even if that means duplicating a few lines of code. I think that compared
to the workarounds and conditionals needed for getting SSB to run on
HND/AI hardware, it will be a net win.
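
To make the contrast concrete, here is a rough userspace sketch of the two
designs. Every name in it is hypothetical; it is not taken from the SSB or
brcm80211 code and only illustrates where the conditionals end up:

```c
#include <assert.h>

enum backplane_sketch { BACKPLANE_SSB, BACKPLANE_AI };

/* Merged design: one device type covering both kinds of silicon. */
struct merged_dev_sketch {
	enum backplane_sketch type;
	int enabled_via;	/* records which path ran, for illustration */
};

/* One entry point whose meaning depends on the backplane underneath. */
static void merged_device_enable(struct merged_dev_sketch *dev)
{
	assert(dev);
	if (dev->type == BACKPLANE_SSB)
		dev->enabled_via = 1;	/* SSB-style core enable */
	else
		dev->enabled_via = 2;	/* HND/AI-style core enable */
}

/* Separate design: each bus has its own device type and enable helper. */
struct ssb_dev_sketch { int enabled; };
struct ai_dev_sketch  { int enabled; };

static void ssb_enable_sketch(struct ssb_dev_sketch *dev)
{
	assert(dev);
	dev->enabled = 1;
}

static void ai_enable_sketch(struct ai_dev_sketch *dev)
{
	assert(dev);
	dev->enabled = 1;
}
```

In the merged variant every helper needs a branch like the one in
merged_device_enable(); in the separate variant the type system says which
bus a call is for, at the cost of a second set of small helpers.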
> > So why do we need to replace or merge SSB in the first place? Can't
> > it co-exist with HND/AI?
>
> It probably can, but then the SSB code must be at least made AI aware
> so it doesn't try to attach itself if it finds one.
SSB doesn't search for SSB busses in the system, because there's no
way to do so. The architecture (or the PCI/PCMCIA/SDIO device) registers
the bus if it detected an SSB device. So for the embedded case, it's
hardcoded in the arch code. For the PCI case it simply depends on the
PCI IDs.
I don't see a problem here. Your arch code will already have to know
what machine it is running on. So it will have to decide whether to
register a SSB or HND/AI bus.
It's like a platform_device, although it doesn't use the platform_device
mechanism. There's no technical reason for that; it would be trivial to
port the SSB bus registration to use platform_device.
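
As a rough illustration of the arch code making that decision, here is a
userspace sketch. The enum and the *_sketch functions are hypothetical
placeholders for whatever registration calls the two bus implementations
would provide, and it assumes one backplane per machine:

```c
#include <assert.h>

enum backplane_sketch { BACKPLANE_SSB, BACKPLANE_AI };

/* Flags standing in for "this bus has been registered". */
static int ssb_registered;
static int ai_registered;

/* Placeholder for the SSB bus registration call. */
static int ssb_bus_register_sketch(void)
{
	ssb_registered = 1;
	return 0;
}

/* Placeholder for a separate HND/AI bus registration call. */
static int ai_bus_register_sketch(void)
{
	ai_registered = 1;
	return 0;
}

/*
 * Arch code already knows which machine it runs on, so it picks the bus.
 * Neither bus implementation has to probe for, or know about, the other.
 */
static int arch_register_backplane(enum backplane_sketch type)
{
	/* Assumption of this sketch: exactly one backplane per machine. */
	assert(!ssb_registered && !ai_registered);

	switch (type) {
	case BACKPLANE_SSB:
		return ssb_bus_register_sketch();
	case BACKPLANE_AI:
		return ai_bus_register_sketch();
	}
	return -1;
}
```

With this split the SSB code never needs to be made "AI aware"; the
choice is made once, in the arch setup.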
> Also I don't know
> if it is a good idea to let arch-specific code depend on code in
> staging.
Sure. The code needs to be cleaned up and moved to the mainline kernel
_anyway_. You don't get around this.
--
Greetings Michael.
^ permalink raw reply
* Re: [PATCH v4 05/10] net/fec: add dual fec support for mx28
From: Shawn Guo @ 2011-01-17 11:52 UTC (permalink / raw)
To: Lothar Waßmann
Cc: Uwe Kleine-König, gerg, B32542, netdev, s.hauer, jamie,
baruch, w.sang, r64343, eric, bryan.wu, jamie, davem,
linux-arm-kernel
In-Reply-To: <19763.64214.220441.325208@ipc1.ka-ro>
Hi Lothar,
On Mon, Jan 17, 2011 at 09:16:22AM +0100, Lothar Waßmann wrote:
> Hi,
>
> Shawn Guo writes:
> > On Fri, Jan 14, 2011 at 08:52:23AM +0100, Uwe Kleine-König wrote:
> > > On Fri, Jan 14, 2011 at 01:48:40PM +0800, Shawn Guo wrote:
> > > > Hi Uwe,
> > > >
> > > > On Thu, Jan 13, 2011 at 03:48:05PM +0100, Uwe Kleine-König wrote:
> > > >
> > > > [...]
> > > >
> > > > > > +/* Controller is ENET-MAC */
> > > > > > +#define FEC_QUIRK_ENET_MAC (1 << 0)
> > > > > does this really qualify to be a quirk?
> > > > >
> > > > My understanding is that ENET-MAC is a type of "quirky" FEC
> > > > controller.
> > > >
> > > > > > +/* Controller needs driver to swap frame */
> > > > > > +#define FEC_QUIRK_SWAP_FRAME (1 << 1)
> > > > > IMHO this is a bit misnamed. FEC_QUIRK_NEEDS_BE_DATA or similar would
> > > > > be more accurate.
> > > > >
> > > > When you make this change, you may want to pick a better name for
> > > > function swap_buffer too.
> > > >
> > > > [...]
> > > >
> > > > > > +static void *swap_buffer(void *bufaddr, int len)
> > > > > > +{
> > > > > > + int i;
> > > > > > + unsigned int *buf = bufaddr;
> > > > > > +
> > > > > > + for (i = 0; i < (len + 3) / 4; i++, buf++)
> > > > > > + *buf = cpu_to_be32(*buf);
> > > > > if len isn't a multiple of 4 this accesses bytes behind len. Is this
> > > > > generally OK here? (E.g. because skbs always have a length that is a
> > > > > multiple of 4?)
> > > > The len may not be a multiple of 4. But I believe bufaddr is always
> > > > a buffer allocated with a length that is a multiple of 4, and the 1-3
> > > > bytes beyond len very likely hold no data that matters. But
> > > > yes, it deserves a safer implementation.
> > > Did you test what happens if bufaddr isn't aligned? Does it work at all
> > > then?
> > >
> > I see many calls passing a len that is not a multiple of 4, but it
> > works fine.
> >
> That does not prove anything, actually.
>
> Anyway "bufaddr isn't aligned" != "len is not a multiple of 4".
> Is there any guarantee that the function cannot be called with a
> non-aligned buffer address?
>
Oops, I misunderstood the comment. With bounce buffer alignment
handling removed, the driver stops working. But at least the mx28
fec driver can work with FEC_ALIGNMENT 0x3 and does not necessarily
need 0xf.
I hope this answers your question.
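
For reference, a userspace sketch of what a more defensive swap_buffer
could look like. This is not the code from the patch: the name
swap_buffer_checked, the extra cap parameter and the swap32 helper are
assumptions made for illustration. It only addresses CPU-side accesses;
the DMA alignment required by the controller (FEC_ALIGNMENT) is a
separate hardware constraint that it does not change:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Unconditional 32-bit byte swap. Note that cpu_to_be32() in the patch
 * only swaps on little-endian CPUs; this helper always swaps.
 */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Swap every 32-bit word covering the first len bytes of buf.
 * cap is the allocated size of buf. Returns -1 instead of touching
 * bytes past the allocation when len rounded up to 4 exceeds cap.
 * memcpy() keeps the accesses valid even if buf is not 4-byte aligned.
 */
static int swap_buffer_checked(void *buf, size_t len, size_t cap)
{
	unsigned char *p = buf;
	size_t padded = (len + 3) & ~(size_t)3;
	size_t i;

	assert(buf != NULL);
	if (padded > cap)
		return -1;

	for (i = 0; i < padded; i += 4) {
		uint32_t w;

		memcpy(&w, p + i, sizeof(w));
		w = swap32(w);
		memcpy(p + i, &w, sizeof(w));
	}
	return 0;
}
```

Passing the allocation size makes the "buffer is always a multiple of 4"
assumption explicit and checkable rather than implied.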
--
Regards,
Shawn
^ permalink raw reply