* [PATCH] headers: net/ipv[46]/protocol.c header trim
From: Alexey Dobriyan @ 2009-09-07 12:38 UTC (permalink / raw)
To: davem; +Cc: netdev
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---
net/ipv4/protocol.c | 19 ++-----------------
net/ipv6/protocol.c | 15 ++-------------
2 files changed, 4 insertions(+), 30 deletions(-)
--- a/net/ipv4/protocol.c
+++ b/net/ipv4/protocol.c
@@ -22,26 +22,11 @@
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
-
-#include <asm/uaccess.h>
-#include <asm/system.h>
+#include <linux/cache.h>
#include <linux/module.h>
-#include <linux/types.h>
-#include <linux/kernel.h>
-#include <linux/string.h>
-#include <linux/socket.h>
-#include <linux/in.h>
-#include <linux/inet.h>
#include <linux/netdevice.h>
-#include <linux/timer.h>
-#include <net/ip.h>
+#include <linux/spinlock.h>
#include <net/protocol.h>
-#include <linux/skbuff.h>
-#include <net/sock.h>
-#include <net/icmp.h>
-#include <net/udp.h>
-#include <net/ipip.h>
-#include <linux/igmp.h>
struct net_protocol *inet_protos[MAX_INET_PROTOS] ____cacheline_aligned_in_smp;
static DEFINE_SPINLOCK(inet_proto_lock);
--- a/net/ipv6/protocol.c
+++ b/net/ipv6/protocol.c
@@ -20,20 +20,9 @@
* - Removed unused variable 'inet6_protocol_base'
* - Modified inet6_del_protocol() to correctly maintain copy bit.
*/
-
-#include <linux/errno.h>
-#include <linux/types.h>
-#include <linux/socket.h>
-#include <linux/sockios.h>
-#include <linux/net.h>
-#include <linux/in6.h>
+#include <linux/module.h>
#include <linux/netdevice.h>
-#include <linux/if_arp.h>
-
-#include <net/sock.h>
-#include <net/snmp.h>
-
-#include <net/ipv6.h>
+#include <linux/spinlock.h>
#include <net/protocol.h>
struct inet6_protocol *inet6_protos[MAX_INET_PROTOS];
^ permalink raw reply
* Re: [iproute2] tc action mirred question
From: Xiaofei Wu @ 2009-09-07 12:38 UTC (permalink / raw)
To: hadi; +Cc: linux netdev
In-Reply-To: <1252260806.4580.6.camel@dogo.mojatatu.com>
>> On node A,
>> wlan0, IP address 192.168.1.1/24 ; wlan1, IP address 192.168.2.1/24
>> I use command 'tc filter add dev wlan0 ... match ip src 192.168.1.0/24 ...
>> action mirred egress mirror dev wlan1' to mirror packets.
>> When I use 'tcpdump -i wlan1', I can 'see' the packets 'A(wlan0)->B' (node B will forward them to C).
>> How to forward the mirroring packets 'A(wlan1)' to D (then, node D forwards them to C) ?
>>
>>
>Is there a trick to this question or is it too basic? ;->
I am sorry.
Maybe it is very easy for you. But I didn't find enough documents(or examples) about 'tc' to help me.
I just want to know:
1) Could I forward the mirroring packets to another node ,and then route it to the destination(if I use
iproute2 (ip, tc ...) )? I described my purpose in my last email.
2) After I mirrored the packets, I should use 'ip route' , 'ip rule' to modify route tables. Is this right?
>You should repeat the same on wlan1 to mirror to wlan0 i.e on wlan1:
>match ip src 192.168.2.0/24 ...
>action mirred egress mirror dev wlan0
Would this cause loops?
>Note the node C will receive "wrong" src mac addresses on those
>interfaces; you may want to correct/edit them first before you send them
>out. Look at using the pedit action.
Regards,
Wu
^ permalink raw reply
* Re: [PATCH net-next-2.6] bonding: introduce primary_passive option
From: Jiri Pirko @ 2009-09-07 12:50 UTC (permalink / raw)
To: Nicolas de Pesloüan; +Cc: netdev, davem, fubar, bonding-devel
In-Reply-To: <4AA01F65.7010408@free.fr>
Thu, Sep 03, 2009 at 09:56:21PM CEST, nicolas.2p.debian@free.fr wrote:
> Jiri Pirko wrote:
>> (updated)
>>
>> In some cases there is not desirable to switch back to primary interface when
>> it's link recovers and rather stay with currently active one. We need to avoid
>> packetloss as much as we can in some cases. This is solved by introducing
>> primary_passive option. Note that enslaved primary slave is set as current
>> active no matter what.
>>
>> This patch depends on the following one:
>> [net-next-2.6] bonding: make ab_arp select active slaves as other modes
>> http://patchwork.ozlabs.org/patch/32684/
>>
>> Signed-off-by: Jiri Pirko <jpirko@redhat.com>
>>
>> diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
>> index d5181ce..e70fa8e 100644
>> --- a/Documentation/networking/bonding.txt
>> +++ b/Documentation/networking/bonding.txt
>> @@ -614,6 +614,17 @@ primary
>> The primary option is only valid for active-backup mode.
>> +primary_passive
>> +
>> + Specifies the behaviour of the primary slave in case of
>> + it's link recovery has been detected. By default (value 0) the
>> + primary slave is set as active slave immediately after the link
>> + recovery. If the value is 1 or 2 then current active slave doesn't
>> + change as long as it's link status doesn't change. This prevents
>> + the bonding device from flip-flopping. Plus if the value is 1 this
>> + behaviour happens only if the speed and duplex of primary slave is
>> + higher. It the value is 2 then it happens everytime.
>> +
>
> May I suggest the following option name and description instead ?
>
> -----
>
> primary_return
>
> Specifies the behavior of the current active slave when the primary was
> down and comes back up. This option is designed to prevent flip-flopping
> between the primary slave and other slaves. The possible values and
> their respective effects are:
>
> alway or 0 (default)
>
> The primary slave becomes the active slave whenever it comes
> back up.
>
> better or 1
>
> The primary slave becomes the active slave when it comes back
> up, if the speed and duplex of the primary slave is better
> than the speed and duplex of the current active slave.
>
> failure_only or 2
>
> The primary slave becomes the active slave only if the current
> active slave fails and the primary slave is up.
>
> When no slave are active, if the primary comes back up, it becomes the
> active slave, regardless of the value of primary_return.
>
> -----
>
> Then, to allow those logical names, I suggest you use the
> bond_parse_parm() function and the following constants and struct, to
> parse module params and sysfs configuration.
>
> enum {
> BOND_PRI_RETURN_ALWAYS = 0,
> BOND_PRI_RETURN_BETTER = 1,
> BOND_PRI_RETURN_FAILURE_ONLY = 2,
> };
>
> const struct bond_parm_tbl bond_pri_return_tbl[] = {
> { "always", BOND_PRI_RETURN_ALWAYS},
> { "better", BOND_PRI_RETURN_BETTER},
> { "failure_only", BOND_PRI_RETURN_FAILURE_ONLY},
> { NULL, -1},
> };
>
> Nicolas.
Ok, I like this, but I would stay rather with Jay's suggestion to call this
option primary_passive. Seems more accurate to me.
I'm going to repost soon.
Jirka
>
>> updelay
>> Specifies the time, in milliseconds, to wait before enabling a
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 699bfdd..65066c1 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -94,6 +94,7 @@ static int downdelay;
>> static int use_carrier = 1;
>> static char *mode;
>> static char *primary;
>> +static int primary_passive;
>> static char *lacp_rate;
>> static char *ad_select;
>> static char *xmit_hash_policy;
>> @@ -126,6 +127,9 @@ MODULE_PARM_DESC(mode, "Mode of operation : 0 for balance-rr, "
>> "6 for balance-alb");
>> module_param(primary, charp, 0);
>> MODULE_PARM_DESC(primary, "Primary network device to use");
>> +module_param(primary_passive, int, 0);
>> +MODULE_PARM_DESC(primary_passive, "Do not set primary slave active once it comes up; "
>> + "0 for off (default), 1 for on only if speed of primary is not higher, 2 for on");
>> module_param(lacp_rate, charp, 0);
>> MODULE_PARM_DESC(lacp_rate, "LACPDU tx rate to request from 802.3ad partner "
>> "(slow/fast)");
>> @@ -1070,6 +1074,25 @@ out:
>> }
>> +static bool bond_should_loose_active(struct bonding *bond)
>> +{
>> + struct slave *prim = bond->primary_slave;
>> + struct slave *curr = bond->curr_active_slave;
>> +
>> + if (!prim || !curr || curr->link != BOND_LINK_UP)
>> + return true;
>> + if (bond->force_primary) {
>> + bond->force_primary = false;
>> + return true;
>> + }
>> + if (bond->params.primary_passive == 1 &&
>> + (prim->speed < curr->speed ||
>> + (prim->speed == curr->speed && prim->duplex <= curr->duplex)))
>> + return false;
>> + if (bond->params.primary_passive == 2)
>> + return false;
>> + return true;
>> +}
>> /**
>> * find_best_interface - select the best available slave to be the active one
>> @@ -1094,7 +1117,8 @@ static struct slave *bond_find_best_slave(struct bonding *bond)
>> }
>> if ((bond->primary_slave) &&
>> - bond->primary_slave->link == BOND_LINK_UP) {
>> + bond->primary_slave->link == BOND_LINK_UP &&
>> + bond_should_loose_active(bond)) {
>> new_active = bond->primary_slave;
>> }
>> @@ -1675,8 +1699,10 @@ int bond_enslave(struct net_device *bond_dev,
>> struct net_device *slave_dev)
>> if (USES_PRIMARY(bond->params.mode) && bond->params.primary[0]) {
>> /* if there is a primary slave, remember it */
>> - if (strcmp(bond->params.primary, new_slave->dev->name) == 0)
>> + if (strcmp(bond->params.primary, new_slave->dev->name) == 0) {
>> bond->primary_slave = new_slave;
>> + bond->force_primary = true;
>> + }
>> }
>> write_lock_bh(&bond->curr_slave_lock);
>> @@ -4942,6 +4968,18 @@ static int bond_check_params(struct bond_params *params)
>> primary = NULL;
>> }
>> + if (primary) {
>> + if ((primary_passive != 0) && (primary_passive != 1) &&
>> + (primary_passive != 2)) {
>> + pr_warning(DRV_NAME
>> + ": Warning: primary_passive module parameter "
>> + "(%d), not of valid value (0/1/2), so it was "
>> + "set to 0\n",
>> + primary_passive);
>> + primary_passive = 0;
>> + }
>> + }
>> +
>> if (fail_over_mac) {
>> fail_over_mac_value = bond_parse_parm(fail_over_mac,
>> fail_over_mac_tbl);
>> @@ -4973,6 +5011,7 @@ static int bond_check_params(struct bond_params *params)
>> params->use_carrier = use_carrier;
>> params->lacp_fast = lacp_fast;
>> params->primary[0] = 0;
>> + params->primary_passive = primary_passive;
>> params->fail_over_mac = fail_over_mac_value;
>> if (primary) {
>> diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>> index 6044e12..e813d48 100644
>> --- a/drivers/net/bonding/bond_sysfs.c
>> +++ b/drivers/net/bonding/bond_sysfs.c
>> @@ -1212,6 +1212,59 @@ static DEVICE_ATTR(primary, S_IRUGO | S_IWUSR,
>> bonding_show_primary, bonding_store_primary);
>> /*
>> + * Show and set the primary_passive flag.
>> + */
>> +static ssize_t bonding_show_primary_passive(struct device *d,
>> + struct device_attribute *attr,
>> + char *buf)
>> +{
>> + struct bonding *bond = to_bond(d);
>> +
>> + return sprintf(buf, "%d\n", bond->params.primary_passive);
>> +}
>> +
>> +static ssize_t bonding_store_primary_passive(struct device *d,
>> + struct device_attribute *attr,
>> + const char *buf, size_t count)
>> +{
>> + int new_value, ret = count;
>> + struct bonding *bond = to_bond(d);
>> +
>> + if (!rtnl_trylock())
>> + return restart_syscall();
>> +
>> + if (sscanf(buf, "%d", &new_value) != 1) {
>> + pr_err(DRV_NAME
>> + ": %s: no primary_passive value specified.\n",
>> + bond->dev->name);
>> + ret = -EINVAL;
>> + goto out;
>> + }
>> + if (new_value == 0 || new_value == 1 || new_value == 2) {
>> + bond->params.primary_passive = new_value;
>> + pr_info(DRV_NAME ": %s: Setting primary_passive to %d.\n",
>> + bond->dev->name, new_value);
>> + if (new_value == 0 || new_value == 1) {
>> + bond->force_primary = true;
>> + read_lock(&bond->lock);
>> + write_lock_bh(&bond->curr_slave_lock);
>> + bond_select_active_slave(bond);
>> + write_unlock_bh(&bond->curr_slave_lock);
>> + read_unlock(&bond->lock);
>> + }
>> + } else {
>> + pr_info(DRV_NAME
>> + ": %s: Ignoring invalid primary_passive value %d.\n",
>> + bond->dev->name, new_value);
>> + }
>> +out:
>> + rtnl_unlock();
>> + return count;
>> +}
>> +static DEVICE_ATTR(primary_passive, S_IRUGO | S_IWUSR,
>> + bonding_show_primary_passive, bonding_store_primary_passive);
>> +
>> +/*
>> * Show and set the use_carrier flag.
>> */
>> static ssize_t bonding_show_carrier(struct device *d,
>> @@ -1500,6 +1553,7 @@ static struct attribute *per_bond_attrs[] = {
>> &dev_attr_num_unsol_na.attr,
>> &dev_attr_miimon.attr,
>> &dev_attr_primary.attr,
>> + &dev_attr_primary_passive.attr,
>> &dev_attr_use_carrier.attr,
>> &dev_attr_active_slave.attr,
>> &dev_attr_mii_status.attr,
>> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>> index 6290a50..b6287e0 100644
>> --- a/drivers/net/bonding/bonding.h
>> +++ b/drivers/net/bonding/bonding.h
>> @@ -131,6 +131,7 @@ struct bond_params {
>> int lacp_fast;
>> int ad_select;
>> char primary[IFNAMSIZ];
>> + int primary_passive;
>> __be32 arp_targets[BOND_MAX_ARP_TARGETS];
>> };
>> @@ -190,6 +191,7 @@ struct bonding {
>> struct slave *curr_active_slave;
>> struct slave *current_arp_slave;
>> struct slave *primary_slave;
>> + bool force_primary;
>> s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
>> rwlock_t lock;
>> rwlock_t curr_slave_lock;
>>
>
^ permalink raw reply
* Re: net_sched 00/07: classful multiqueue dummy scheduler
From: Eric Dumazet @ 2009-09-07 13:00 UTC (permalink / raw)
To: David Miller; +Cc: kaber, netdev
In-Reply-To: <20090907.015039.154939751.davem@davemloft.net>
David Miller a écrit :
> I gave these patches a very basic bashing with NIU, and it
> seems to work from what I've tried.
>
> I know that Jarek has expressed some questions about the callback
> scheme used by the new mq classful qdisc, as well as some other
> issues, but we can refine this using followon patches.
>
> For now I'm pushing this out so that it gets wider testing.
>
> Thanks everyone!
Very interesting :)
Had very litle time to test this, but got problems very fast, if rate estimator configured.
(Here, eth2 maps to tg3, that uses a num_tx_queues of 5, even on non multiqueue device)
So its real_num_tx_queues is 1, but we can play with tc and mq
# tc qdisc replace dev eth2 handle 1: root estimator 1sec 8sec mq
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 62414 bytes 401 pkt (dropped 0, overlimits 0 requeues 0)
rate 5456bit 4pps backlog 0b 0p requeues 0
# tc qdisc replace dev eth2 parent 1:1 estimator 1sec 8sec pfifo
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 12984 bytes 88 pkt (dropped 0, overlimits 0 requeues 0)
rate 4368bit 4pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 12984 bytes 88 pkt (dropped 0, overlimits 0 requeues 0)
rate 9624bit 8pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 36781 bytes 244 pkt (dropped 0, overlimits 0 requeues 0)
rate 34360Mbit 205872pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 36781 bytes 244 pkt (dropped 0, overlimits 0 requeues 0)
rate 19824bit 16pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 52663 bytes 348 pkt (dropped 0, overlimits 0 requeues 0)
rate 17457Mbit 105605pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 52663 bytes 348 pkt (dropped 0, overlimits 0 requeues 0)
rate 22560bit 19pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 71775 bytes 473 pkt (dropped 0, overlimits 0 requeues 0)
rate 11838Mbit 47402pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 71775 bytes 473 pkt (dropped 0, overlimits 0 requeues 0)
rate 23880bit 20pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 94755 bytes 623 pkt (dropped 0, overlimits 0 requeues 0)
rate 3562Mbit 18621pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 94755 bytes 623 pkt (dropped 0, overlimits 0 requeues 0)
rate 24440bit 20pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 112501 bytes 741 pkt (dropped 0, overlimits 0 requeues 0)
rate 734270Kbit 9562pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 112501 bytes 741 pkt (dropped 0, overlimits 0 requeues 0)
rate 24632bit 20pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 127137 bytes 836 pkt (dropped 0, overlimits 0 requeues 0)
rate 25390Mbit 4913pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 127137 bytes 836 pkt (dropped 0, overlimits 0 requeues 0)
rate 24960bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 150745 bytes 992 pkt (dropped 0, overlimits 0 requeues 0)
rate 6212Mbit 1693pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 150745 bytes 992 pkt (dropped 0, overlimits 0 requeues 0)
rate 25032bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 174008 bytes 1144 pkt (dropped 0, overlimits 0 requeues 0)
rate 29377Mbit 674pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 174008 bytes 1144 pkt (dropped 0, overlimits 0 requeues 0)
rate 24904bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 185976 bytes 1224 pkt (dropped 0, overlimits 0 requeues 0)
rate 13093Mbit 408pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 185976 bytes 1224 pkt (dropped 0, overlimits 0 requeues 0)
rate 25288bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 232272 bytes 1530 pkt (dropped 0, overlimits 0 requeues 0)
rate 5196Mbit 57pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 232272 bytes 1530 pkt (dropped 0, overlimits 0 requeues 0)
rate 24784bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 300071 bytes 1977 pkt (dropped 0, overlimits 0 requeues 0)
rate 8988Mbit 6pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 300071 bytes 1977 pkt (dropped 0, overlimits 0 requeues 0)
rate 24432bit 20pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 377495 bytes 2490 pkt (dropped 0, overlimits 0 requeues 0)
rate 20429Mbit 2pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 377495 bytes 2490 pkt (dropped 0, overlimits 0 requeues 0)
rate 24520bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 448070 bytes 2958 pkt (dropped 0, overlimits 0 requeues 0)
rate 5726Mbit 4pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 448070 bytes 2958 pkt (dropped 0, overlimits 0 requeues 0)
rate 24576bit 20pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 519926 bytes 3435 pkt (dropped 0, overlimits 0 requeues 0)
rate 265505Kbit 3pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 519926 bytes 3435 pkt (dropped 0, overlimits 0 requeues 0)
rate 24920bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 522242 bytes 3449 pkt (dropped 0, overlimits 0 requeues 0)
rate 232389Kbit 62pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 522242 bytes 3449 pkt (dropped 0, overlimits 0 requeues 0)
rate 25304bit 21pps backlog 0b 0p requeues 0
# tc -s -d qdisc show dev eth2
qdisc mq 1: root
Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
rate 177925Kbit 49pps backlog 0b 0p requeues 0
qdisc pfifo 8001: parent 1:1 limit 1000p
Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
rate 25400bit 21pps backlog 0b 0p requeues 0
<<<crash>>>
(On another term I had a "ping -i 0.1 192.168.20.120" that gave :
2009/08/07 14:53:42.498 64 bytes from 192.168.20.120: icmp_seq=1982 ttl=64 time=0.126 ms
2009/08/07 14:53:42.598 64 bytes from 192.168.20.120: icmp_seq=1983 ttl=64 time=0.118 ms
2009/08/07 14:53:42.698 64 bytes from 192.168.20.120: icmp_seq=1984 ttl=64 time=0.114 ms
2009/08/07 14:53:42.798 64 bytes from 192.168.20.120: icmp_seq=1985 ttl=64 time=0.123 ms
2009/08/07 14:53:42.898 64 bytes from 192.168.20.120: icmp_seq=1986 ttl=64 time=0.126 ms
2009/08/07 14:53:42.998 64 bytes from 192.168.20.120: icmp_seq=1987 ttl=64 time=0.119 ms
2009/08/07 14:53:43.098 64 bytes from 192.168.20.120: icmp_seq=1988 ttl=64 time=0.122 ms
2009/08/07 14:53:43.198 64 bytes from 192.168.20.120: icmp_seq=1989 ttl=64 time=0.119 ms
2009/08/07 14:53:43.298 64 bytes from 192.168.20.120: icmp_seq=1990 ttl=64 time=0.117 ms
2009/08/07 14:53:43.398 64 bytes from 192.168.20.120: icmp_seq=1991 ttl=64 time=0.117 ms
ping: sendmsg: No buffer space available
^ permalink raw reply
* Re: [iproute2] tc action mirred question
From: jamal @ 2009-09-07 12:54 UTC (permalink / raw)
To: Xiaofei Wu; +Cc: linux netdev
In-Reply-To: <749852.6581.qm@web111606.mail.gq1.yahoo.com>
On Mon, 2009-09-07 at 05:38 -0700, Xiaofei Wu wrote:
> I just want to know:
> 1) Could I forward the mirroring packets to another node ,and then route it to the destination(if I use
> iproute2 (ip, tc ...) )? I described my purpose in my last email.
Yes, you can mirror to another node(B/D). To route on that node(B/D),
your dst MAC address has to be correct for that destination node(B/D) to
accept it. You could try to run the destination node in promisc mode
and you may be able to get away without changing dst mac.
> 2) After I mirrored the packets, I should use 'ip route' , 'ip rule' to modify route tables. Is this right?
>
Assuming you are talking about B/D, yes you can do routing there if the
node accepts it..
> >You should repeat the same on wlan1 to mirror to wlan0 i.e on wlan1:
> >match ip src 192.168.2.0/24 ...
> >action mirred egress mirror dev wlan0
>
> Would this cause loops?
>
Ok, so this was the trick question;->
I dont see how the loop would happen - they are different "match" rules
i.e one is for 192.168.2.0/24 and the other is for 192.168.1.0/24
Whether it loops or not is easy for you to verify.
cheers,
jamal
^ permalink raw reply
* Re: net_sched 05/07: reintroduce dev->qdisc for use by sch_api
From: Patrick McHardy @ 2009-09-07 13:16 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev
In-Reply-To: <20090906185757.GA8833@ami.dom.local>
Jarek Poplawski wrote:
>> @@ -1383,7 +1375,7 @@ static int tc_ctl_tclass(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
>> pid = TC_H_MAKE(qid, pid);
>> } else {
>> if (qid == 0)
>> - qid = dev_queue->qdisc_sleeping->handle;
>> + qid = dev->qdisc->handle;
>
> Probably I miss something, but in mq root case it seems to never do
> anything we need. If so, it could be the example of possible issues
> elsewhere.
Sorry, I'm not sure what you're saying ..
> I thought this mq virtual root qdisc could be done more transparently
> and invisible for the current code, but it seems, in your
> implementation some pointers like this, or parent ids (especially
> TC_H_ROOT) might be different, and even if it works OK, needs a lot of
> verification. So, my question is, if it's really necessary.
Same here.
^ permalink raw reply
* Re: [PATCH 1/2] Add an alternative cs89x0 driver
From: Sascha Hauer @ 2009-09-07 13:21 UTC (permalink / raw)
To: Kurt Van Dijck; +Cc: netdev
In-Reply-To: <20090907123554.GA2928@e-circ.dyndns.org>
On Mon, Sep 07, 2009 at 02:35:54PM +0200, Kurt Van Dijck wrote:
> On Mon, Sep 07, 2009 at 12:24:34PM +0200, Sascha Hauer wrote:
> > Date: Mon, 7 Sep 2009 12:24:34 +0200
> > Subject: Re: [PATCH 1/2] Add an alternative cs89x0 driver
> > From: Sascha Hauer <s.hauer@pengutronix.de>
> > To: Kurt Van Dijck <kurt.van.dijck@eia.be>
> > Cc: netdev@vger.kernel.org
> > List-ID: <netdev.vger.kernel.org>
> >
> > Hi Kurt,
> >
> > On Wed, Aug 26, 2009 at 12:46:34PM +0200, Kurt Van Dijck wrote:
> > > Hi Sacha,
> > >
> > > I'm using a 2.6.25.
> > > Converting to your platform_device based driver,
> > > I needed to configure the irq (see patch, irq flags).
> > > Looking in the old cs89x0.c, it's done in the driver. Should I have
> > > configured the irq level elsewhere? Or is this patch valid to do?
> >
> > This is the way to go. I don't know if the cs89x0 has configurable
> > interrupt levels though.
> I haven't read any spec about cs89x0, but by looking in the existing
> code:
> 1) IRQF_TRIGGER_HIGH seems like the default
> 2) the old driver didn't do any irq config in the chip either.
> May I assume you were lucky testing the driver on a platform that had
> IRQF_TRIGGER_HIGH per default?
> I have it running (with patch) on a iMX31 (arm)
I tested it with an i.MX1. Looking at the code it seems that
__irq_set_trigger is only called if one of the trigger mask bits is set.
So the interrupt control registers are never touched when 0 is passed as
irq flags. The reset default for i.MX is rising edge.
Sascha
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply
* Re: net_sched 07/07: add classful multiqueue dummy scheduler
From: Patrick McHardy @ 2009-09-07 13:27 UTC (permalink / raw)
To: Jarek Poplawski; +Cc: netdev
In-Reply-To: <20090906200409.GB8833@ami.dom.local>
Jarek Poplawski wrote:
>> struct Qdisc_class_ops
>> {
>> /* Child qdisc manipulation */
>> + unsigned int (*select_queue)(struct Qdisc *, struct tcmsg *);
>> int (*graft)(struct Qdisc *, unsigned long cl,
>> struct Qdisc *, struct Qdisc **);
>> struct Qdisc * (*leaf)(struct Qdisc *, unsigned long cl);
>> @@ -122,6 +123,7 @@ struct Qdisc_ops
>> void (*reset)(struct Qdisc *);
>> void (*destroy)(struct Qdisc *);
>> int (*change)(struct Qdisc *, struct nlattr *arg);
>> + void (*attach)(struct Qdisc *);
>
> Probably it's a matter of taste, but I wonder why these two methods
> used only by one qdisc in max 2 places can't be functions instead
> (maybe even static in case of select_queue)? (And this mq sched could
> be tested with some flag instead of ->attach, I guess.)
Yes, we could also use normal functions. Either way is fine with me.
>> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
>> index d71f12b..2a78d54 100644
>> --- a/net/sched/sch_api.c
>> +++ b/net/sched/sch_api.c
>> @@ -678,6 +678,11 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
>> if (dev->flags & IFF_UP)
>> dev_deactivate(dev);
>>
>> + if (new && new->ops->attach) {
>> + new->ops->attach(new);
>> + num_q = 0;
>> + }
>> +
>
> Actually, I wonder if it's not cleaner to let replace all qdiscs with
> noops below like in qdisc delete case, and do this attaching in one
> place only (dev_activate).
I don't think that would work since dev_activate() allocates its own
qdiscs, which use different handles than those specified by userspace.
We also need the new qdisc for notifications. It would be a nice
cleanup however if you can make it work.
>> @@ -1095,10 +1100,16 @@ create_n_graft:
>> q = qdisc_create(dev, &dev->rx_queue,
>> tcm->tcm_parent, tcm->tcm_parent,
>> tca, &err);
>> - else
>> - q = qdisc_create(dev, netdev_get_tx_queue(dev, 0),
>> + else {
>> + unsigned int ntx = 0;
>> +
>> + if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue)
>> + ntx = p->ops->cl_ops->select_queue(p, tcm);
>
> So, this if could be probably made shorter with a common function, but
> the main point is: this probably works only for qdiscs having mq as a
> parent, and not below.
Yes. mq can only be attached to the root however, so its not
possible to use it as a child qdisc.
>> +static int mq_init(struct Qdisc *sch, struct nlattr *opt)
>> +{
>> + struct net_device *dev = qdisc_dev(sch);
>> + struct mq_sched *priv = qdisc_priv(sch);
>> + struct netdev_queue *dev_queue;
>> + struct Qdisc *qdisc;
>> + unsigned int ntx;
>> +
>> + if (sch->parent != TC_H_ROOT)
>> + return -EOPNOTSUPP;
>> +
>> + if (!netif_is_multiqueue(dev))
>> + return -EOPNOTSUPP;
>> +
>> + /* pre-allocate qdiscs, attachment can't fail */
>> + priv->qdiscs = kcalloc(dev->num_tx_queues, sizeof(priv->qdiscs[0]),
>> + GFP_KERNEL);
>
> I guess we could avoid this at all or at least to do it in one step with
> current ->attach.
It seemed easier this way, but I don't care much where its done exactly.
>> + if (priv->qdiscs == NULL)
>> + return -ENOMEM;
>> +
>> + for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
>> + dev_queue = netdev_get_tx_queue(dev, ntx);
>> + qdisc = qdisc_create_dflt(dev, dev_queue, &pfifo_fast_ops,
>> + TC_H_MAKE(TC_H_MAJ(sch->handle),
>> + TC_H_MIN(ntx + 1)));
>
> As I wrote in 05/07 comment, I wonder if we really can't achieve this
> with old TC_H_ROOT parentid, and maybe some mapping while dumping to
> the userspace only.
I don't see the advantage.
> Another possibility would be considering a new
> kind of root (mqroot?) to tell precisely, where a new qdisc should be
> added.
That's what mq is doing.
>> +static int mq_dump(struct Qdisc *sch, struct sk_buff *skb)
>> +{
>> + struct net_device *dev = qdisc_dev(sch);
>> + struct Qdisc *qdisc;
>> + unsigned int ntx;
>> +
>> + sch->q.qlen = 0;
>> + memset(&sch->bstats, 0, sizeof(sch->bstats));
>> + memset(&sch->qstats, 0, sizeof(sch->qstats));
>> +
>> + for (ntx = 0; ntx < dev->num_tx_queues; ntx++) {
>> + qdisc = netdev_get_tx_queue(dev, ntx)->qdisc_sleeping;
>> + spin_lock_bh(qdisc_lock(qdisc));
>> + sch->q.qlen += qdisc->q.qlen;
>> + sch->bstats.bytes += qdisc->bstats.bytes;
>> + sch->bstats.packets += qdisc->bstats.packets;
>> + sch->qstats.qlen += qdisc->qstats.qlen;
>
> Like in Christoph's case, we should probably use q.qlen instead.
Its done a few lines above. This simply sums up all members of qstats.
^ permalink raw reply
* Re: net_sched 00/07: classful multiqueue dummy scheduler
From: Patrick McHardy @ 2009-09-07 13:29 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <4AA503E4.2060504@gmail.com>
Eric Dumazet wrote:
> David Miller a écrit :
>> I gave these patches a very basic bashing with NIU, and it
>> seems to work from what I've tried.
>>
>> I know that Jarek has expressed some questions about the callback
>> scheme used by the new mq classful qdisc, as well as some other
>> issues, but we can refine this using followon patches.
>>
>> For now I'm pushing this out so that it gets wider testing.
>>
>> Thanks everyone!
>
> Very interesting :)
>
> Had very litle time to test this, but got problems very fast, if rate estimator configured.
I didn't test that, but I'll look into it.
> qdisc mq 1: root
> Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
> rate 177925Kbit 49pps backlog 0b 0p requeues 0
> qdisc pfifo 8001: parent 1:1 limit 1000p
> Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
> rate 25400bit 21pps backlog 0b 0p requeues 0
>
> <<<crash>>>
Did you capture the crash?
> (On another term I had a "ping -i 0.1 192.168.20.120" that gave :
>
> 2009/08/07 14:53:42.498 64 bytes from 192.168.20.120: icmp_seq=1982 ttl=64 time=0.126 ms
> 2009/08/07 14:53:42.598 64 bytes from 192.168.20.120: icmp_seq=1983 ttl=64 time=0.118 ms
> 2009/08/07 14:53:42.698 64 bytes from 192.168.20.120: icmp_seq=1984 ttl=64 time=0.114 ms
> 2009/08/07 14:53:42.798 64 bytes from 192.168.20.120: icmp_seq=1985 ttl=64 time=0.123 ms
> 2009/08/07 14:53:42.898 64 bytes from 192.168.20.120: icmp_seq=1986 ttl=64 time=0.126 ms
> 2009/08/07 14:53:42.998 64 bytes from 192.168.20.120: icmp_seq=1987 ttl=64 time=0.119 ms
> 2009/08/07 14:53:43.098 64 bytes from 192.168.20.120: icmp_seq=1988 ttl=64 time=0.122 ms
> 2009/08/07 14:53:43.198 64 bytes from 192.168.20.120: icmp_seq=1989 ttl=64 time=0.119 ms
> 2009/08/07 14:53:43.298 64 bytes from 192.168.20.120: icmp_seq=1990 ttl=64 time=0.117 ms
> 2009/08/07 14:53:43.398 64 bytes from 192.168.20.120: icmp_seq=1991 ttl=64 time=0.117 ms
> ping: sendmsg: No buffer space available
Was this also with rate estimators? No buffer space available
indicates that some class/qdisc isn't dequeued or the packets
are leaking, so the output of tc -s -d qdisc show ... might be
helpful.
^ permalink raw reply
* Re: net_sched 00/07: classful multiqueue dummy scheduler
From: Patrick McHardy @ 2009-09-07 14:23 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <4AA50ACF.9010400@trash.net>
[-- Attachment #1: Type: text/plain, Size: 2247 bytes --]
Patrick McHardy wrote:
> Eric Dumazet wrote:
>> Had very litle time to test this, but got problems very fast, if rate estimator configured.
>
> I didn't test that, but I'll look into it.
>
>> qdisc mq 1: root
>> Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
>> rate 177925Kbit 49pps backlog 0b 0p requeues 0
>> qdisc pfifo 8001: parent 1:1 limit 1000p
>> Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
>> rate 25400bit 21pps backlog 0b 0p requeues 0
>>
>> <<<crash>>>
>
> Did you capture the crash?
>
>> (On another term I had a "ping -i 0.1 192.168.20.120" that gave :
>>
>> 2009/08/07 14:53:42.498 64 bytes from 192.168.20.120: icmp_seq=1982 ttl=64 time=0.126 ms
>> 2009/08/07 14:53:42.598 64 bytes from 192.168.20.120: icmp_seq=1983 ttl=64 time=0.118 ms
>> 2009/08/07 14:53:42.698 64 bytes from 192.168.20.120: icmp_seq=1984 ttl=64 time=0.114 ms
>> 2009/08/07 14:53:42.798 64 bytes from 192.168.20.120: icmp_seq=1985 ttl=64 time=0.123 ms
>> 2009/08/07 14:53:42.898 64 bytes from 192.168.20.120: icmp_seq=1986 ttl=64 time=0.126 ms
>> 2009/08/07 14:53:42.998 64 bytes from 192.168.20.120: icmp_seq=1987 ttl=64 time=0.119 ms
>> 2009/08/07 14:53:43.098 64 bytes from 192.168.20.120: icmp_seq=1988 ttl=64 time=0.122 ms
>> 2009/08/07 14:53:43.198 64 bytes from 192.168.20.120: icmp_seq=1989 ttl=64 time=0.119 ms
>> 2009/08/07 14:53:43.298 64 bytes from 192.168.20.120: icmp_seq=1990 ttl=64 time=0.117 ms
>> 2009/08/07 14:53:43.398 64 bytes from 192.168.20.120: icmp_seq=1991 ttl=64 time=0.117 ms
>> ping: sendmsg: No buffer space available
>
> Was this also with rate estimators? No buffer space available
> indicates that some class/qdisc isn't dequeued or the packets
> are leaking, so the output of tc -s -d qdisc show ... might be
> helpful.
I figured out the bug, which is likely responsible for both
problems. When grafting a mq class and creating a rate estimator,
the new qdisc is not attached to the device queue yet and also
doesn't have TC_H_ROOT as parent, so qdisc_create() selects
qdisc_root_sleeping_lock() for the estimator, which belongs to
the qdisc that is getting replaced.
This is a patch I used for testing, but I'll come up with
something more elegant (I hope) as a final fix :)
[-- Attachment #2: x --]
[-- Type: text/plain, Size: 1497 bytes --]
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 2a78d54..428eb34 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -732,7 +732,8 @@ static struct lock_class_key qdisc_rx_lock;
*/
static struct Qdisc *
-qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
+qdisc_create(struct net_device *dev, struct Qdisc *p,
+ struct netdev_queue *dev_queue,
u32 parent, u32 handle, struct nlattr **tca, int *errp)
{
int err;
@@ -810,8 +811,9 @@ qdisc_create(struct net_device *dev, struct netdev_queue *dev_queue,
if (tca[TCA_RATE]) {
spinlock_t *root_lock;
- if ((sch->parent != TC_H_ROOT) &&
- !(sch->flags & TCQ_F_INGRESS))
+ if (((sch->parent != TC_H_ROOT) &&
+ !(sch->flags & TCQ_F_INGRESS)) &&
+ (!p || !p->ops->attach))
root_lock = qdisc_root_sleeping_lock(sch);
else
root_lock = qdisc_lock(sch);
@@ -1097,7 +1099,7 @@ create_n_graft:
if (!(n->nlmsg_flags&NLM_F_CREATE))
return -ENOENT;
if (clid == TC_H_INGRESS)
- q = qdisc_create(dev, &dev->rx_queue,
+ q = qdisc_create(dev, p, &dev->rx_queue,
tcm->tcm_parent, tcm->tcm_parent,
tca, &err);
else {
@@ -1106,7 +1108,7 @@ create_n_graft:
if (p && p->ops->cl_ops && p->ops->cl_ops->select_queue)
ntx = p->ops->cl_ops->select_queue(p, tcm);
- q = qdisc_create(dev, netdev_get_tx_queue(dev, ntx),
+ q = qdisc_create(dev, p, netdev_get_tx_queue(dev, ntx),
tcm->tcm_parent, tcm->tcm_handle,
tca, &err);
}
^ permalink raw reply related
* Re: [PATCH 00/12] Gigaset driver patches for 2.6.32
From: Daniel Walker @ 2009-09-07 14:30 UTC (permalink / raw)
To: Tilman Schmidt; +Cc: davem, linux-kernel, netdev, i4ldeveloper, Hansjoerg Lipp
In-Reply-To: <4AA4CD64.4070501@imap.cc>
On Mon, 2009-09-07 at 11:07 +0200, Tilman Schmidt wrote:
> Daniel,
>
> thanks for taking a look at my patches.
>
> On Sun, 06 Sep 2009 18:26:46 -0700, Daniel Walker wrote:
> > patches 6,7,10, and 12 all have checkpatch errors. Could you fix those?
>
> I have reissued patch 7, exchanging the "static" and "inline" keywords
> as requested.
>
> The other patches I'd much prefer to keep as they are. The "ERRORs"
> checkpatch.pl reported for them are results of keeping to the existing
> formatting of the patched files. Completely reformatting them would
> cloud the actual changes made by the patches.
Yeah, it looks like the whole file needs a checkpatch clean up.. Sounds
like your not willing to do that? Usually if a checkpatch cleanup comes
first prior to all your other changes , it doesn't usually cloud the
rest of the changes..
Daniel
^ permalink raw reply
* [PATCH -next] cdc_ether: additional Ericsson MBM PID's to the whitelist
From: Torgny Johansson @ 2009-09-07 14:36 UTC (permalink / raw)
To: David Brownell; +Cc: netdev
This is a re-submit since I did not see any feedback on the first submittal.
This patch adds seven PID's to the whitelist set of devices.
Devices added to the whitelist:
Ericsson Mobile Broadband Module variants (F3607gw and F3307)
Dell F3607gw variants
Toshiba F3607gw variants
Signed-off-by: Torgny Johansson <torgny.johansson@ericsson.com>
diff --git a/drivers/net/usb/Kconfig b/drivers/net/usb/Kconfig index c47237c..32d9356 100644
--- a/drivers/net/usb/Kconfig
+++ b/drivers/net/usb/Kconfig
@@ -174,7 +174,7 @@ config USB_NET_CDCETHER
* Ericsson Mobile Broadband Module (all variants)
* Motorola (DM100 and SB4100)
* Broadcom Cable Modem (reference design)
- * Toshiba (PCX1100U and F3507g)
+ * Toshiba (PCX1100U and F3507g/F3607gw)
* ...
This driver creates an interface named "ethX", where X depends on diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c index 4a6aff5..5d99106 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -544,20 +544,55 @@ static const struct usb_device_id products [] = {
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
.driver_info = (unsigned long) &cdc_info, }, {
- /* Ericsson F3307 */
+ /* Ericsson F3607gw ver 2 */
+ USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x1905, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long) &cdc_info, }, {
+ /* Ericsson F3607gw ver 3 */
USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x1906, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
.driver_info = (unsigned long) &cdc_info, }, {
+ /* Ericsson F3307 */
+ USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x190a, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long) &cdc_info, }, {
+ /* Ericsson F3307 ver 2 */
+ USB_DEVICE_AND_INTERFACE_INFO(0x0bdb, 0x1909, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long) &cdc_info, }, {
/* Toshiba F3507g */
USB_DEVICE_AND_INTERFACE_INFO(0x0930, 0x130b, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
.driver_info = (unsigned long) &cdc_info, }, {
+ /* Toshiba F3607gw */
+ USB_DEVICE_AND_INTERFACE_INFO(0x0930, 0x130c, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long) &cdc_info, }, {
+ /* Toshiba F3607gw ver 2 */
+ USB_DEVICE_AND_INTERFACE_INFO(0x0930, 0x1311, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long) &cdc_info, }, {
/* Dell F3507g */
USB_DEVICE_AND_INTERFACE_INFO(0x413c, 0x8147, USB_CLASS_COMM,
USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
.driver_info = (unsigned long) &cdc_info,
+}, {
+ /* Dell F3607gw */
+ USB_DEVICE_AND_INTERFACE_INFO(0x413c, 0x8183, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long) &cdc_info, }, {
+ /* Dell F3607gw ver 2 */
+ USB_DEVICE_AND_INTERFACE_INFO(0x413c, 0x8184, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_MDLM, USB_CDC_PROTO_NONE),
+ .driver_info = (unsigned long) &cdc_info,
},
{ }, // END
};
^ permalink raw reply
* Adding bridge interface to non-default network namespace crashes kernel
From: Atis Elsts @ 2009-09-07 15:07 UTC (permalink / raw)
To: netdev-u79uwXL29TY76Z2rM5mHXA; +Cc: containers-qjLDD68F18O7TbgM5vRIOg
[-- Attachment #1: Type: text/plain, Size: 491 bytes --]
Trying to add bridge interface from userspace program, after moving the
program to a new network namespace, causes kernel to crash. I am using latest
kernel version from git (2.6.31-rc9).
The bug is easy to reproduce - just compile and run the attached C program.
I see that bridge interface has NETIF_F_NETNS_LOCAL flag, but as I understand,
this flag simply means that a device cannot be *moved* across network
namespaces, not that it cannot be *created* in other namespaces.
--Atis
[-- Attachment #2: crash.c --]
[-- Type: text/x-csrc, Size: 571 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <linux/sched.h>
#include <linux/sockios.h>
int main() {
if (unshare(CLONE_NEWNET)) {
perror("unshare");
return -1;
}
int fd = socket(AF_INET, SOCK_DGRAM, 0);
if (fd < 0) {
perror("socket");
return -1;
}
const char *name = "lobridge";
if (ioctl(fd, SIOCBRADDBR, name) < 0) {
perror("ioctl SIOCBRADDBR");
return -1;
}
system("ip addr");
for (;;) {
printf("still running\n");
sleep(5);
}
}
[-- Attachment #3: syslog --]
[-- Type: text/plain, Size: 3289 bytes --]
Sep 7 16:06:19 debian kernel: [ 91.040000] ------------[ cut here ]------------
Sep 7 16:06:19 debian kernel: [ 91.040000] kernel BUG at fs/sysfs/group.c:65!
Sep 7 16:06:19 debian kernel: [ 91.040000] invalid opcode: 0000 [#1] SMP
Sep 7 16:06:19 debian kernel: [ 91.040000] last sysfs file: /sys/devices/virtual/net/lo/operstate
Sep 7 16:06:19 debian kernel: [ 91.040000] Modules linked in:
Sep 7 16:06:19 debian kernel: [ 91.040000]
Sep 7 16:06:19 debian kernel: [ 91.040000] Pid: 1667, comm: a.out Not tainted (2.6.31-rc9 #16)
Sep 7 16:06:19 debian kernel: [ 91.040000] EIP: 0060:[<c0218c5e>] EFLAGS: 00000246 CPU: 0
Sep 7 16:06:19 debian kernel: [ 91.040000] EIP is at internal_create_group+0x14e/0x180
Sep 7 16:06:19 debian kernel: [ 91.040000] EAX: 00000000 EBX: c71fb000 ECX: c053aae0 EDX: 00000000
Sep 7 16:06:19 debian kernel: [ 91.040000] ESI: 00000000 EDI: c71fb25c EBP: c6dede58 ESP: c6dede30
Sep 7 16:06:19 debian kernel: [ 91.040000] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Sep 7 16:06:19 debian kernel: [ 91.040000] Process a.out (pid: 1667, ti=c6dec000 task=c6df1920 task.ti=c6dec000)
Sep 7 16:06:19 debian kernel: [ 91.040000] Stack:
Sep 7 16:06:19 debian kernel: [ 91.040000] c71fb008 c053aae0 00000000 c71fb25c 00000000 00000000 c6dede58 c71fb000
Sep 7 16:06:19 debian kernel: [ 91.040000] <0> 00000000 c71fb25c c6dede60 c0218cbc c6dede84 c03c6bc3 00000000 00000000
Sep 7 16:06:19 debian kernel: [ 91.040000] <0> 00000000 c7102000 c71fb000 00000000 c7102000 c6dedea0 c03c348f 00000001
Sep 7 16:06:19 debian kernel: [ 91.040000] Call Trace:
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c0218cbc>] ? sysfs_create_group+0xc/0x10
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c03c6bc3>] ? br_sysfs_addbr+0x23/0xf0
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c03c348f>] ? br_add_bridge+0x18f/0x1a0
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c03c4328>] ? br_ioctl_deviceless_stub+0x1f8/0x210
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c03c4130>] ? br_ioctl_deviceless_stub+0x0/0x210
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c032e0cf>] ? sock_ioctl+0xbf/0x220
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c032e010>] ? sock_ioctl+0x0/0x220
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c01d4e28>] ? vfs_ioctl+0x28/0x80
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c01d4faa>] ? do_vfs_ioctl+0x6a/0x520
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c032d2a3>] ? sock_map_fd+0x43/0x70
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c032db22>] ? sys_socket+0x52/0x70
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c0118138>] ? do_page_fault+0x168/0x310
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c01d54b3>] ? sys_ioctl+0x53/0x70
Sep 7 16:06:19 debian kernel: [ 91.040000] [<c0102c04>] ? sysenter_do_call+0x12/0x22
Sep 7 16:06:19 debian kernel: [ 91.040000] Code: bd 8b 5d e0 83 45 ec 01 85 db 0f 84 63 ff ff ff 8b 12 89 f8 e8 e4 ce ff ff 8b 16 e9 53 ff ff ff 8b 40 18 85 c0 0f 85 e9 fe ff ff <0f> 0b eb fe 89 f8 e8 47 ea ff ff 8b 45 e8 83 c4 1c 5b 5e 5f 5d
Sep 7 16:06:19 debian kernel: [ 91.040000] EIP: [<c0218c5e>] internal_create_group+0x14e/0x180 SS:ESP 0068:c6dede30
Sep 7 16:06:19 debian kernel: [ 91.040000] ---[ end trace beadcfdb06c985eb ]---
[-- Attachment #4: Type: text/plain, Size: 206 bytes --]
_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers
^ permalink raw reply
* [patch]autosuspend for cdc-ether #2
From: Oliver Neukum @ 2009-09-07 15:16 UTC (permalink / raw)
To: David Brownell, Torgny Johansson, netdev, linux-usb,
David S. Miller
Hi,
this allows the use of remote wakeup to autosuspend online
devices. Nobody has made any comments. David, would you apply it?
Regards
Oliver
Signed-off-by: Oliver Neukum <oliver@neukum.org>
--
commit 7f2d9a0440f6453462bd7bf95f77eda17c16d865
Author: Oliver Neukum <oliver@neukum.org>
Date: Fri Aug 28 15:49:15 2009 +0200
usb: usbnet: runtime power management for active connections
Devices that support remote wakeup can be autosuspended even while the
interface is up. Transmissions are queued and processed after the device
has been woken up.
diff --git a/drivers/net/usb/cdc_ether.c b/drivers/net/usb/cdc_ether.c
index 4a6aff5..8ee5bd7 100644
--- a/drivers/net/usb/cdc_ether.c
+++ b/drivers/net/usb/cdc_ether.c
@@ -411,6 +411,12 @@ static int cdc_bind(struct usbnet *dev, struct usb_interface *intf)
return 0;
}
+static int cdc_manage_power(struct usbnet *dev, int on)
+{
+ dev->intf->needs_remote_wakeup = on;
+ return 0;
+}
+
static const struct driver_info cdc_info = {
.description = "CDC Ethernet Device",
.flags = FLAG_ETHER,
@@ -418,6 +424,7 @@ static const struct driver_info cdc_info = {
.bind = cdc_bind,
.unbind = usbnet_cdc_unbind,
.status = cdc_status,
+ .manage_power = cdc_manage_power,
};
/*-------------------------------------------------------------------------*/
@@ -570,6 +577,7 @@ static struct usb_driver cdc_driver = {
.disconnect = usbnet_disconnect,
.suspend = usbnet_suspend,
.resume = usbnet_resume,
+ .supports_autosuspend = 1,
};
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index edfd9e1..c21b3d2 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -575,6 +575,7 @@ EXPORT_SYMBOL_GPL(usbnet_unlink_rx_urbs);
int usbnet_stop (struct net_device *net)
{
struct usbnet *dev = netdev_priv(net);
+ struct driver_info *info = dev->driver_info;
int temp;
DECLARE_WAIT_QUEUE_HEAD_ONSTACK (unlink_wakeup);
DECLARE_WAITQUEUE (wait, current);
@@ -612,7 +613,10 @@ int usbnet_stop (struct net_device *net)
dev->flags = 0;
del_timer_sync (&dev->delay);
tasklet_kill (&dev->bh);
- usb_autopm_put_interface(dev->intf);
+ if (info->manage_power)
+ info->manage_power(dev, 0);
+ else
+ usb_autopm_put_interface(dev->intf);
return 0;
}
@@ -693,6 +697,13 @@ int usbnet_open (struct net_device *net)
// delay posting reads until we're fully open
tasklet_schedule (&dev->bh);
+
+ if (info->manage_power) {
+ retval = info->manage_power(dev, 1);
+ if (retval < 0)
+ goto done;
+ usb_autopm_put_interface(dev->intf);
+ }
return retval;
done:
usb_autopm_put_interface(dev->intf);
@@ -822,6 +833,7 @@ kevent (struct work_struct *work)
if (test_bit (EVENT_TX_HALT, &dev->flags)) {
unlink_urbs (dev, &dev->txq);
status = usb_clear_halt (dev->udev, dev->out);
+ usb_autopm_put_interface(dev->intf);
if (status < 0
&& status != -EPIPE
&& status != -ESHUTDOWN) {
@@ -893,17 +905,20 @@ static void tx_complete (struct urb *urb)
if (urb->status == 0) {
dev->net->stats.tx_packets++;
dev->net->stats.tx_bytes += entry->length;
+ usb_autopm_put_interface_async(dev->intf);
} else {
dev->net->stats.tx_errors++;
switch (urb->status) {
case -EPIPE:
+ /* we do not allow autosuspension */
usbnet_defer_kevent (dev, EVENT_TX_HALT);
break;
/* software-driven interface shutdown */
case -ECONNRESET: // async unlink
case -ESHUTDOWN: // hardware gone
+ usb_autopm_put_interface_async(dev->intf);
break;
// like rx, tx gets controller i/o faults during khubd delays
@@ -911,6 +926,7 @@ static void tx_complete (struct urb *urb)
case -EPROTO:
case -ETIME:
case -EILSEQ:
+ usb_mark_last_busy(dev->udev);
if (!timer_pending (&dev->delay)) {
mod_timer (&dev->delay,
jiffies + THROTTLE_JIFFIES);
@@ -919,8 +935,10 @@ static void tx_complete (struct urb *urb)
urb->status);
}
netif_stop_queue (dev->net);
+ usb_autopm_put_interface_async(dev->intf);
break;
default:
+ usb_autopm_put_interface_async(dev->intf);
if (netif_msg_tx_err (dev))
devdbg (dev, "tx err %d", entry->urb->status);
break;
@@ -996,8 +1014,29 @@ int usbnet_start_xmit (struct sk_buff *skb, struct net_device *net)
}
}
+
+
spin_lock_irqsave (&dev->txq.lock, flags);
+ retval = usb_autopm_get_interface_async(dev->intf);
+ if (retval < 0) {
+ spin_unlock_irqrestore (&dev->txq.lock, flags);
+ goto drop;
+ }
+
+#ifdef CONFIG_PM
+ /* if this triggers the device is still a sleep */
+ if (test_bit(EVENT_DEV_ASLEEP, &dev->flags)) {
+ /* transmission will be done in resume */
+ dev->deferred = urb;
+ /* no use to process more packets */
+ netif_stop_queue(net);
+ spin_unlock_irqrestore(&dev->txq.lock, flags);
+ retval = NET_XMIT_SUCCESS;
+ goto deferred;
+ }
+#endif
+
switch ((retval = usb_submit_urb (urb, GFP_ATOMIC))) {
case -EPIPE:
netif_stop_queue (net);
@@ -1028,6 +1067,7 @@ drop:
devdbg (dev, "> tx, len %d, type 0x%x",
length, skb->protocol);
}
+deferred:
return retval;
}
EXPORT_SYMBOL_GPL(usbnet_start_xmit);
@@ -1303,6 +1343,15 @@ int usbnet_suspend (struct usb_interface *intf, pm_message_t message)
struct usbnet *dev = usb_get_intfdata(intf);
if (!dev->suspend_count++) {
+ spin_lock_irq(&dev->txq.lock);
+ /* don't autosuspend while transmitting */
+ if (dev->txq.qlen && (message.event & PM_EVENT_AUTO)) {
+ spin_unlock_irq(&dev->txq.lock);
+ return -EBUSY;
+ } else {
+ set_bit(EVENT_DEV_ASLEEP, &dev->flags);
+ spin_unlock_irq(&dev->txq.lock);
+ }
/*
* accelerate emptying of the rx and queues, to avoid
* having everything error out.
@@ -1322,11 +1371,34 @@ EXPORT_SYMBOL_GPL(usbnet_suspend);
int usbnet_resume (struct usb_interface *intf)
{
- struct usbnet *dev = usb_get_intfdata(intf);
-
- if (!--dev->suspend_count)
+ struct usbnet *dev = usb_get_intfdata(intf);
+ struct sk_buff *skb;
+ struct urb *res;
+ int retval;
+
+ if (!--dev->suspend_count) {
+ spin_lock_irq(&dev->txq.lock);
+ res = dev->deferred;
+ dev->deferred = NULL;
+ clear_bit(EVENT_DEV_ASLEEP, &dev->flags);
+ spin_unlock_irq(&dev->txq.lock);
+ if (res) {
+ retval = usb_submit_urb(res, GFP_NOIO);
+ if (retval < 0) {
+ usb_free_urb(res);
+ netif_start_queue(dev->net);
+ usb_autopm_put_interface_async(dev->intf);
+ } else {
+ skb = (struct sk_buff *)res->context;
+ dev->net->trans_start = jiffies;
+ __skb_queue_tail (&dev->txq, skb);
+ if (!(dev->txq.qlen >= TX_QLEN(dev)))
+ netif_start_queue(dev->net);
+ }
+ }
tasklet_schedule (&dev->bh);
-
+ }
+
return 0;
}
EXPORT_SYMBOL_GPL(usbnet_resume);
diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h
index 310e18a..6fa0545 100644
--- a/include/linux/usb/usbnet.h
+++ b/include/linux/usb/usbnet.h
@@ -54,6 +54,7 @@ struct usbnet {
struct sk_buff_head txq;
struct sk_buff_head done;
struct urb *interrupt;
+ struct urb *deferred;
struct tasklet_struct bh;
struct work_struct kevent;
@@ -63,6 +64,8 @@ struct usbnet {
# define EVENT_RX_MEMORY 2
# define EVENT_STS_SPLIT 3
# define EVENT_LINK_RESET 4
+# define EVENT_DEV_WAKING 5
+# define EVENT_DEV_ASLEEP 6
};
static inline struct usb_driver *driver_of(struct usb_interface *intf)
@@ -100,6 +103,9 @@ struct driver_info {
/* see if peer is connected ... can sleep */
int (*check_connect)(struct usbnet *);
+ /* (dis)activate runtime power management */
+ int (*manage_power)(struct usbnet *, int);
+
/* for status polling */
void (*status)(struct usbnet *, struct urb *);
^ permalink raw reply related
* Re: Staging: cpc-usb CAN driver TODO list
From: Sebastian Haas @ 2009-09-07 15:35 UTC (permalink / raw)
To: Oliver Hartkopp
Cc: Wolfgang Grandegger, Greg KH, Linux Netdev List, Felipe Balbi
In-Reply-To: <4AA4E93F.90209@ems-wuensche.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Oliver,
Okay, I've just finished the first version today. It is much less effort
as I previously thought. 95% is written and compiles fine. Just the
sending part is missing and optimization (it is just a basic
implementation without USB anchors and stuff like that).
I think the further development should proceed at the Socket-CAN mailing
list, cause of things needs to be clarified specific to the core.
Cheers,
Sebastian
Sebastian Haas schrieb:
> Okay, thanks for the tips. Lets see what I can create today.
>
> Sebastian
>
> Oliver Hartkopp schrieb:
>> Wolfgang Grandegger wrote:
>>> On 09/07/2009 10:01 AM, Sebastian Haas wrote:
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>> Wolfgang,
>>>>
>>>> Wolfgang Grandegger schrieb:
>>>>> Hi Sebastian,
>>>>>
>>>>> On 09/07/2009 07:56 AM, Sebastian Haas wrote:
>>>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>>>> Hash: SHA1
>>>>>>
>>>>>> Oliver,
>>>>>>
>>>>>> I'm not yet sure how to actually start the development. There is so
>>>>>> much
>>>>>> to do, and I've not much time to spend on this, unfortunately. Because
>>>>>> of this I can't rewrite the whole driver on my own in order to get a
>>>>>> Socket-CAN driver but I can provide support, review patches, rent
>>>>>> devices and make tests here.
>>>>>>
>>>>>> Oliver, you are not familiar with USB and I'm not very familiar with
>>>>>> CAN
>>>>>> netdev internals, why not combining these twos. You are writing the CAN
>>>>>> part and write the USB part.
>>>>>>
>>>>>> I'll also write a specification which contains any information you need
>>>>>> to develop a CAN driver for the device (commands, sequences, error
>>>>>> handling).
>>>>> Alternatively, EMS Wuensche could also hire an expert doing the job ;-).
>>>>> Note that we do a lot of Socket-CAN work in our free time, which is a
>>>>> limited resource. Progress depends on funding to a certain extend.
>>>> Money is also a limited resource. ;-)
>>>>
>>>> Let's become serious again, I know and respect that many of Socket-CAN
>>>> and the Staging developers spend their free time working on it. We will
>>>> of course work on the driver, but since we've not much time it may take
>>>> several months. If someone wants to help, we would be very glad and
>>>> happy to support the person as far as we can with devices, answers and
>>>> tests.
>>> OK, no problem. I really appreciate your support for Socket-CAN so far.
>> Indeed. Me too.
>
>> I tried to take a second look into cpc-usb_drv.c and i would suggest to remove
>> all the procfs and the chardev stuff and then create a CAN netdev when you
>> identified an USB node analogue to
>
>> /* Detect available channels */
>> for (i = 0; i < EMS_PCMCIA_MAX_CHAN; i++) {
>> dev = alloc_sja1000dev(0);
>> if (dev == NULL) {
>> err = -ENOMEM;
>> goto failure_cleanup;
>> }
>
>> card->net_dev[i] = dev;
>> priv = netdev_priv(dev);
>> priv->priv = card;
>> SET_NETDEV_DEV(dev, &pdev->dev);
>
>> as you know from your ems_pcmcia.c driver
>
>> and
>
>> struct net_device *alloc_sja1000dev(int sizeof_priv)
>> {
>> struct net_device *dev;
>> struct sja1000_priv *priv;
>
>> dev = alloc_candev(sizeof(struct sja1000_priv) + sizeof_priv);
>> if (!dev)
>> return NULL;
>
>> priv = netdev_priv(dev);
>
>> priv->dev = dev;
>> priv->can.bittiming_const = &sja1000_bittiming_const;
>> priv->can.do_set_bittiming = sja1000_set_bittiming;
>> priv->can.do_set_mode = sja1000_set_mode;
>
>> if (sizeof_priv)
>> priv->priv = (void *)priv + sizeof(struct sja1000_priv);
>
>> return dev;
>> }
>
>> as you know from the sja1000.c (which can probably be used for the
>> LPC2119_PRODUCT_ID we should try to implement first).
>
>> Then we need something like this stuff
>
>> static const struct net_device_ops sja1000_netdev_ops = {
>> .ndo_open = sja1000_open,
>> .ndo_stop = sja1000_close,
>> .ndo_start_xmit = sja1000_start_xmit,
>> };
>
>> int register_sja1000dev(struct net_device *dev)
>> {
>> if (!sja1000_probe_chip(dev))
>> return -ENODEV;
>
>> dev->netdev_ops = &sja1000_netdev_ops;
>
>> dev->flags |= IFF_ECHO; /* we support local echo */
>
>> set_reset_mode(dev);
>> chipset_init(dev);
>
>> return register_candev(dev);
>> }
>
>> from sja1000.c
>
>> And then we have an USB CAN node that has a belonging CAN netdevice (maybe
>> there is something else we can look at that's used in other USB ethernet
>> adapters).
>
>> I know from the PEAK USB driver at
>
>> http://www.peak-system.com/fileadmin/media/linux/files/peak-linux-driver.6.11.tar.gz
>
>> that i just needed to duplicate and modify the usb rx/tx stuff and redirect
>> the CAN frames into the network stack. But this PEAK driver does not have a
>> netlink configuration interface and can only be taken as a limited example ...
>
>> I assume, when the driver (cpc_usb.c or ems_usb.c analogue to the ems_pcmcia.c
>> ?) is prepared as described above, one can go and connect the rx/tx dataflow
>> and the netlink configuration.
>
>> Unfortunately i'm short of time the next two weeks but maybe you can start and
>> create such a new C-file (probably based on ems_pcmcia.c) ?
>
>> Best regards,
>> Oliver
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkqlKDsACgkQpqRB8PJG7XwpdgCglYjvGi/z1z5tX1ozgF/YTA1j
9pwAnAkIHScIlic0KZNZEDWT09jUt3N1
=OVN3
-----END PGP SIGNATURE-----
--
EMS Dr. Thomas Wuensche e.K.
Sonnenhang 3
85304 Ilmmuenster
HRA Neuburg a.d. Donau, HR-Nr. 70.106
Phone: +49-8441-490260
Fax : +49-8441-81860
http://www.ems-wuensche.com
^ permalink raw reply
* Re: multicast routing and multiple interfaces with same IP
From: Ilia K. @ 2009-09-07 15:35 UTC (permalink / raw)
To: David Miller; +Cc: opurdila, netdev
In-Reply-To: <20090829.000154.263733862.davem@davemloft.net>
[-- Attachment #1: Type: text/plain, Size: 1106 bytes --]
Hi,
I'm attaching a new patch. The changes:
mroute.h:
- consistent name for interface index: vifc_lcl_ifindex
- union of vifc_lcl_addr and vifc_lcl_ifindex since either one of them
can be used
ipmr.c:
- case VIFF_USE_IFINDEX and case 0 had almost the same code, so using
fall through and if to eliminate code duplication
Regards,
Ilia.
On Sat, Aug 29, 2009 at 10:01 AM, David Miller<davem@davemloft.net> wrote:
> From: Octavian Purdila <opurdila@ixiacom.com>
> Date: Thu, 27 Aug 2009 02:53:18 +0300
>
>> I don't have context on multicast routing, but this caught my attention:
>>
>>>@@ -61,11 +61,13 @@
>>> unsigned int vifc_rate_limit; /* Rate limiter values (NI) */
>>> struct in_addr vifc_lcl_addr; /* Our address */
>>> struct in_addr vifc_rmt_addr; /* IPIP tunnel addr */
>>>+ int ifindex; /* Local interface index */
>>> };
>>>
>>
>> Wouldn't this break userspace ABI?
>>
>> Perhaps you could use a union between vifc_lcl_addr and vifc_ifindex, they seem
>> to be exclusive.
>
> Indeed, this will need to be fixed up.
>
[-- Attachment #2: vif_add.patch --]
[-- Type: text/x-diff, Size: 1587 bytes --]
=== modified file 'include/linux/mroute.h'
--- include/linux/mroute.h 2009-08-10 11:17:32 +0000
+++ include/linux/mroute.h 2009-09-07 15:16:43 +0000
@@ -59,13 +59,17 @@
unsigned char vifc_flags; /* VIFF_ flags */
unsigned char vifc_threshold; /* ttl limit */
unsigned int vifc_rate_limit; /* Rate limiter values (NI) */
- struct in_addr vifc_lcl_addr; /* Our address */
+ union {
+ struct in_addr vifc_lcl_addr; /* Local interface address */
+ int vifc_lcl_ifindex; /* Local interface index */
+ };
struct in_addr vifc_rmt_addr; /* IPIP tunnel addr */
};
-#define VIFF_TUNNEL 0x1 /* IPIP tunnel */
-#define VIFF_SRCRT 0x2 /* NI */
-#define VIFF_REGISTER 0x4 /* register vif */
+#define VIFF_TUNNEL 0x1 /* IPIP tunnel */
+#define VIFF_SRCRT 0x2 /* NI */
+#define VIFF_REGISTER 0x4 /* register vif */
+#define VIFF_USE_IFINDEX 0x8 /* use vifc_lcl_ifindex to find an interface */
/*
* Cache manipulation structures for mrouted and PIMd
=== modified file 'net/ipv4/ipmr.c'
--- net/ipv4/ipmr.c 2009-08-10 11:17:32 +0000
+++ net/ipv4/ipmr.c 2009-09-07 15:20:07 +0000
@@ -470,8 +470,18 @@
return err;
}
break;
+
+ case VIFF_USE_IFINDEX:
case 0:
- dev = ip_dev_find(net, vifc->vifc_lcl_addr.s_addr);
+ if (vifc->vifc_flags==VIFF_USE_IFINDEX) {
+ dev = dev_get_by_index(net, vifc->vifc_lcl_ifindex);
+ if (dev && dev->ip_ptr == NULL) {
+ dev_put(dev);
+ return -EADDRNOTAVAIL;
+ }
+ } else {
+ dev = ip_dev_find(net, vifc->vifc_lcl_addr.s_addr);
+ }
if (!dev)
return -EADDRNOTAVAIL;
err = dev_set_allmulti(dev, 1);
^ permalink raw reply
* Re: [iproute2] tc action mirred question
From: Xiaofei Wu @ 2009-09-07 16:05 UTC (permalink / raw)
To: hadi; +Cc: linux netdev
I am a newbie for 'traffic control' and Linux networking. So I ask some experts here to help me.
Maybe my questions are stupid. But I hope I can get your reply. Thank you!
>> I just want to know:
>> 1) Could I forward the mirroring packets to another node ,and then route it to the destination(if I use
>> iproute2 (ip, tc ...) )? I described my purpose in my last email.
>Yes, you can mirror to another node(B/D). To route on that node(B/D),
>your dst MAC address has to be correct for that destination node(B/D) to
>accept it. You could try to run the destination node in promisc mode
>and you may be able to get away without changing dst mac.
(1) Could I use pedit action to modify the dst MAC, so the destination node D will accept it, then forward it to
node C? (or use other tools to modify the dst MAC, please give me more information)
(2) If I use 'ifconfig wlan0 promisc ... ' on node D, would it route the mirroring packets (the dst MAC is incorrect)
to node C?
>> 2) After I mirrored the packets, I should use 'ip route' , 'ip rule' to modify route tables. Is this right?
>>
>Assuming you are talking about B/D, yes you can do routing there if the
>node accepts it..
Regards,
Wu
^ permalink raw reply
* Re: multicast routing and multiple interfaces with same IP
From: Octavian Purdila @ 2009-09-07 16:25 UTC (permalink / raw)
To: Ilia K.; +Cc: David Miller, netdev
In-Reply-To: <1b9338490909070835t3517bc36o396539e7ea1721fc@mail.gmail.com>
On Monday 07 September 2009 18:35:41 you wrote:
> Hi,
> I'm attaching a new patch. The changes:
> mroute.h:
> - consistent name for interface index: vifc_lcl_ifindex
> - union of vifc_lcl_addr and vifc_lcl_ifindex since either one of them
> can be used
> ipmr.c:
> - case VIFF_USE_IFINDEX and case 0 had almost the same code, so using
> fall through and if to eliminate code duplication
>
Hi Ilia,
Looks good to me, but there are a couple of code style issues reported by
./scripts/checkpatch.pl.
Also, here:
>+ } else {
>+ dev = ip_dev_find(net, vifc->vifc_lcl_addr.s_addr);
>+ }
Usually no braces are used for single line statements in if/else.
tavi
^ permalink raw reply
* Next Sept 7: Bug : skb_release_head_state on x86
From: Sachin Sant @ 2009-09-07 16:49 UTC (permalink / raw)
To: netdev; +Cc: Stephen Rothwell, linux-next, David Miller
In-Reply-To: <20090907210206.7830ba68.sfr@canb.auug.org.au>
Today's next kernel running on a x86 box crashed with
BUG: unable to handle kernel paging request at 00010090
IP: [<c034559d>] skb_release_head_state+0x20/0xac
*pdpt = 000000003455c001 *pde = 0000000000000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/topology/core_siblings
Modules linked in: ipv6 microcode fuse loop dm_mod ppdev rtc_cmos i2c_piix4
rtc_core i2c_core rtc_lib button sr_mod tg3 parport_pc sworks_agp cdrom floppy
parport agpgart pcspkr libphy sg ohci_hcd ehci_hcd sd_mod crc_t10dif usbcore
edd fan ide_pci_generic serverworks ide_core ata_generic pata_serverworks
libata ips scsi_mod thermal processor thermal_sys hwmon [last unloaded:
speedstep_lib]
Pid: 6, comm: ksoftirqd/1 Not tainted (2.6.31-rc9-autotest-next-20090907-5-pae
#1) eserver xSeries 235 -[86717AX]-
EIP: 0060:[<c034559d>] EFLAGS: 00010206 CPU: 1
EIP is at skb_release_head_state+0x20/0xac
EAX: 00000000 EBX: f44b5200 ECX: f44b5200 EDX: 00010090
ESI: f5548000 EDI: 00000000 EBP: f5c69dd4 ESP: f5c69dd0
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process ksoftirqd/1 (pid: 6, ti=f5c68000 task=f5c4f280 task.ti=f5c68000)
Stack:
f44b5200 f5c69de0 c0345398 f5c69e48 f5c69de8 c034542e f5c69e58 c0388807
<0> f44b5200 f5582900 ced1a038 c07ac124 ced1a030 3e6f7c09 eb152044 f4b4bc00
<0> 00000006 c05a594c f5c69e30 c036a2c0 c07ac124 f4b4bc00 f4b4bc00 eb152030
Call Trace:
[<c0345398>] ? __kfree_skb+0xb/0x71
[<c034542e>] ? consume_skb+0x30/0x32
[<c0388807>] ? arp_process+0x572/0x58e
[<c036a2c0>] ? ip_local_deliver_finish+0x143/0x207
[<c0388907>] ? arp_rcv+0xda/0xed
[<c034bdc2>] ? netif_receive_skb+0x43a/0x459
[<c034bee4>] ? napi_skb_finish+0x1e/0x33
[<c034c267>] ? napi_gro_receive+0x20/0x24
[<f8b3667f>] ? tg3_poll+0x5ed/0x802 [tg3]
[<c034c351>] ? net_rx_action+0x93/0x173
[<c013769c>] ? __do_softirq+0xa7/0x144
[<c013775f>] ? do_softirq+0x26/0x2b
[<c01377ae>] ? ksoftirqd+0x4a/0xae
[<c0137764>] ? ksoftirqd+0x0/0xae
[<c0146a2e>] ? kthread+0x61/0x66
[<c01469cd>] ? kthread+0x0/0x66
[<c0103507>] ? kernel_thread_helper+0x7/0x10
Code: fe ff ff 83 c4 0c 5b 5e 5f 5d c3 55 89 e5 53 89 c3 8b 40 18 85 c0 74 05
e8 22 ae 00 00 8b 53 1c c7 43 18 00 00 00 00 85 d2 74 11 <f0> ff 0a 0f 94 c0 84
c0 74 07 89 d0 e8 81 c6 05 00 83 7b 6c 00
EIP: [<c034559d>] skb_release_head_state+0x20/0xac SS:ESP 0068:f5c69dd0
CR2: 0000000000010090
---[ end trace 64c8710cf222dc04 ]---
At the time of crash, kernbench was running on this box.
The corresponding c code is :
0000000000002387 <skb_release_head_state>:
static void skb_release_head_state(struct sk_buff *skb)
{
2387: 55 push %rbp
2388: 48 89 e5 mov %rsp,%rbp
238b: 53 push %rbx
238c: 48 89 fb mov %rdi,%rbx
238f: 48 83 ec 08 sub $0x8,%rsp
skb_dst_drop():
/usr/local/autobench/var/tmp/build/linux/include/net/dst.h:179
}
...... <SNIP> ......
...... <SNIP> ......
skb_release_head_state():
/usr/local/autobench/var/tmp/build/linux/net/core/skbuff.c:395
skb_dst_drop(skb);
#ifdef CONFIG_XFRM
secpath_put(skb->sp);
23a1: 48 8b 7b 30 mov 0x30(%rbx),%rdi
skb_dst_drop():
/usr/local/autobench/var/tmp/build/linux/include/net/dst.h:181
skb->_skb_dst = 0UL;
23a5: 48 c7 43 28 00 00 00 movq $0x0,0x28(%rbx)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This line
23ac: 00
Let me know what other information is required to debug this.
Thanks
-Sachin
--
---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------
^ permalink raw reply
* Re: net_sched 05/07: reintroduce dev->qdisc for use by sch_api
From: Jarek Poplawski @ 2009-09-07 16:49 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netdev
In-Reply-To: <4AA507AD.6000403@trash.net>
On Mon, Sep 07, 2009 at 03:16:29PM +0200, Patrick McHardy wrote:
> Jarek Poplawski wrote:
> >> @@ -1383,7 +1375,7 @@ static int tc_ctl_tclass(struct sk_buff *skb, struct nlmsghdr *n, void *arg)
> >> pid = TC_H_MAKE(qid, pid);
> >> } else {
> >> if (qid == 0)
> >> - qid = dev_queue->qdisc_sleeping->handle;
> >> + qid = dev->qdisc->handle;
> >
> > Probably I miss something, but in mq root case it seems to never do
> > anything we need. If so, it could be the example of possible issues
> > elsewhere.
>
> Sorry, I'm not sure what you're saying ..
>
> > I thought this mq virtual root qdisc could be done more transparently
> > and invisible for the current code, but it seems, in your
> > implementation some pointers like this, or parent ids (especially
> > TC_H_ROOT) might be different, and even if it works OK, needs a lot of
> > verification. So, my question is, if it's really necessary.
>
> Same here.
Nevermind! I simply had a dream there could be preserved some old
meaning of "root" etc. within a queue but it doesn't make a sense with
this kind of interface.
Jarek P.
^ permalink raw reply
* Re: Next Sept 7: Bug : skb_release_head_state on x86
From: Eric Dumazet @ 2009-09-07 17:17 UTC (permalink / raw)
To: Sachin Sant; +Cc: netdev, Stephen Rothwell, linux-next, David Miller
In-Reply-To: <4AA5399A.405@in.ibm.com>
Sachin Sant a écrit :
> Today's next kernel running on a x86 box crashed with
>
> BUG: unable to handle kernel paging request at 00010090
> IP: [<c034559d>] skb_release_head_state+0x20/0xac
> *pdpt = 000000003455c001 *pde = 0000000000000000
> Oops: 0002 [#1] SMP
> last sysfs file: /sys/devices/system/cpu/cpu3/topology/core_siblings
> Modules linked in: ipv6 microcode fuse loop dm_mod ppdev rtc_cmos i2c_piix4
> rtc_core i2c_core rtc_lib button sr_mod tg3 parport_pc sworks_agp cdrom
> floppy
> parport agpgart pcspkr libphy sg ohci_hcd ehci_hcd sd_mod crc_t10dif
> usbcore
> edd fan ide_pci_generic serverworks ide_core ata_generic pata_serverworks
> libata ips scsi_mod thermal processor thermal_sys hwmon [last unloaded:
> speedstep_lib]
>
> Pid: 6, comm: ksoftirqd/1 Not tainted
> (2.6.31-rc9-autotest-next-20090907-5-pae
> #1) eserver xSeries 235 -[86717AX]-
> EIP: 0060:[<c034559d>] EFLAGS: 00010206 CPU: 1
> EIP is at skb_release_head_state+0x20/0xac
> EAX: 00000000 EBX: f44b5200 ECX: f44b5200 EDX: 00010090
> ESI: f5548000 EDI: 00000000 EBP: f5c69dd4 ESP: f5c69dd0
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process ksoftirqd/1 (pid: 6, ti=f5c68000 task=f5c4f280 task.ti=f5c68000)
> Stack:
> f44b5200 f5c69de0 c0345398 f5c69e48 f5c69de8 c034542e f5c69e58 c0388807
> <0> f44b5200 f5582900 ced1a038 c07ac124 ced1a030 3e6f7c09 eb152044 f4b4bc00
> <0> 00000006 c05a594c f5c69e30 c036a2c0 c07ac124 f4b4bc00 f4b4bc00 eb152030
> Call Trace:
This is a crash on a 32bit kernel
> [<c0345398>] ? __kfree_skb+0xb/0x71
> [<c034542e>] ? consume_skb+0x30/0x32
> [<c0388807>] ? arp_process+0x572/0x58e
> [<c036a2c0>] ? ip_local_deliver_finish+0x143/0x207
> [<c0388907>] ? arp_rcv+0xda/0xed
> [<c034bdc2>] ? netif_receive_skb+0x43a/0x459
> [<c034bee4>] ? napi_skb_finish+0x1e/0x33
> [<c034c267>] ? napi_gro_receive+0x20/0x24
> [<f8b3667f>] ? tg3_poll+0x5ed/0x802 [tg3]
> [<c034c351>] ? net_rx_action+0x93/0x173
> [<c013769c>] ? __do_softirq+0xa7/0x144
> [<c013775f>] ? do_softirq+0x26/0x2b
> [<c01377ae>] ? ksoftirqd+0x4a/0xae
> [<c0137764>] ? ksoftirqd+0x0/0xae
> [<c0146a2e>] ? kthread+0x61/0x66
> [<c01469cd>] ? kthread+0x0/0x66
> [<c0103507>] ? kernel_thread_helper+0x7/0x10
> Code: fe ff ff 83 c4 0c 5b 5e 5f 5d c3 55 89 e5 53 89 c3 8b 40 18 85 c0
> 74 05
> e8 22 ae 00 00 8b 53 1c c7 43 18 00 00 00 00 85 d2 74 11 <f0> ff 0a 0f
> 94 c0 84
> c0 74 07 89 d0 e8 81 c6 05 00 83 7b 6c 00
> EIP: [<c034559d>] skb_release_head_state+0x20/0xac SS:ESP 0068:f5c69dd0
> CR2: 0000000000010090
> ---[ end trace 64c8710cf222dc04 ]---
>
> At the time of crash, kernbench was running on this box.
>
> The corresponding c code is :
> 0000000000002387 <skb_release_head_state>:
> static void skb_release_head_state(struct sk_buff *skb) {
and you decode a 64 bits kernel
> 2387: 55 push %rbp 2388: 48 89
> e5 mov %rsp,%rbp
> 238b: 53 push %rbx 238c: 48 89
> fb mov %rdi,%rbx
> 238f: 48 83 ec 08 sub $0x8,%rsp
> skb_dst_drop():
> /usr/local/autobench/var/tmp/build/linux/include/net/dst.h:179
> }
> ...... <SNIP> ......
> ...... <SNIP> ......
>
> skb_release_head_state():
> /usr/local/autobench/var/tmp/build/linux/net/core/skbuff.c:395
> skb_dst_drop(skb);
> #ifdef CONFIG_XFRM
> secpath_put(skb->sp);
> 23a1: 48 8b 7b 30 mov 0x30(%rbx),%rdi
> skb_dst_drop():
> /usr/local/autobench/var/tmp/build/linux/include/net/dst.h:181
> skb->_skb_dst = 0UL;
> 23a5: 48 c7 43 28 00 00 00 movq $0x0,0x28(%rbx)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ This line
> 23ac: 00
>
This is more probably
<f0> ff 0a lock decl (%edx)
part of :
secpath_put(skb->sp);
So some skb has a strange/buggy skb->sp (value 0x00010090)
It looks like skb->cb[xxx] overwrote skb->sp
Please check you have CONFIG_XFRM=y, and that you did rebuild all your modules after
patching your kernel...
^ permalink raw reply
* Re: net_sched 00/07: classful multiqueue dummy scheduler
From: Eric Dumazet @ 2009-09-07 17:21 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, netdev
In-Reply-To: <4AA5175F.6030600@trash.net>
Patrick McHardy a écrit :
> Patrick McHardy wrote:
>> Eric Dumazet wrote:
>>> Had very litle time to test this, but got problems very fast, if rate estimator configured.
>> I didn't test that, but I'll look into it.
>>
>>> qdisc mq 1: root
>>> Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
>>> rate 177925Kbit 49pps backlog 0b 0p requeues 0
>>> qdisc pfifo 8001: parent 1:1 limit 1000p
>>> Sent 528702 bytes 3491 pkt (dropped 0, overlimits 0 requeues 0)
>>> rate 25400bit 21pps backlog 0b 0p requeues 0
>>>
>>> <<<crash>>>
>> Did you capture the crash?
No, in fact it was a freeze.
>>
>>> (On another term I had a "ping -i 0.1 192.168.20.120" that gave :
>>>
>>> 2009/08/07 14:53:42.498 64 bytes from 192.168.20.120: icmp_seq=1982 ttl=64 time=0.126 ms
>>> 2009/08/07 14:53:42.598 64 bytes from 192.168.20.120: icmp_seq=1983 ttl=64 time=0.118 ms
>>> 2009/08/07 14:53:42.698 64 bytes from 192.168.20.120: icmp_seq=1984 ttl=64 time=0.114 ms
>>> 2009/08/07 14:53:42.798 64 bytes from 192.168.20.120: icmp_seq=1985 ttl=64 time=0.123 ms
>>> 2009/08/07 14:53:42.898 64 bytes from 192.168.20.120: icmp_seq=1986 ttl=64 time=0.126 ms
>>> 2009/08/07 14:53:42.998 64 bytes from 192.168.20.120: icmp_seq=1987 ttl=64 time=0.119 ms
>>> 2009/08/07 14:53:43.098 64 bytes from 192.168.20.120: icmp_seq=1988 ttl=64 time=0.122 ms
>>> 2009/08/07 14:53:43.198 64 bytes from 192.168.20.120: icmp_seq=1989 ttl=64 time=0.119 ms
>>> 2009/08/07 14:53:43.298 64 bytes from 192.168.20.120: icmp_seq=1990 ttl=64 time=0.117 ms
>>> 2009/08/07 14:53:43.398 64 bytes from 192.168.20.120: icmp_seq=1991 ttl=64 time=0.117 ms
>>> ping: sendmsg: No buffer space available
>> Was this also with rate estimators? No buffer space available
>> indicates that some class/qdisc isn't dequeued or the packets
>> are leaking, so the output of tc -s -d qdisc show ... might be
>> helpful.
>
> I figured out the bug, which is likely responsible for both
> problems. When grafting a mq class and creating a rate estimator,
> the new qdisc is not attached to the device queue yet and also
> doesn't have TC_H_ROOT as parent, so qdisc_create() selects
> qdisc_root_sleeping_lock() for the estimator, which belongs to
> the qdisc that is getting replaced.
>
> This is a patch I used for testing, but I'll come up with
> something more elegant (I hope) as a final fix :)
Yes, this was the problem, and your patch fixed it.
Now adding CONFIG_SLUB_DEBUG_ON=y for next tries :)
Sep 7 16:37:55 erd kernel: [ 217.056813] =============================================================================
Sep 7 16:37:55 erd kernel: [ 217.056865] BUG kmalloc-256: Poison overwritten
Sep 7 16:37:55 erd kernel: [ 217.056910] -----------------------------------------------------------------------------
Sep 7 16:37:55 erd kernel: [ 217.056911]
Sep 7 16:37:55 erd kernel: [ 217.056990] INFO: 0xf6e622bc-0xf6e622bd. First byte 0x76 instead of 0x6b
Sep 7 16:37:55 erd kernel: [ 217.057049] INFO: Allocated in qdisc_alloc+0x1b/0x80 age=154593 cpu=2 pid=5165
Sep 7 16:37:55 erd kernel: [ 217.057094] INFO: Freed in qdisc_destroy+0x88/0xa0 age=139186 cpu=4 pid=5173
Sep 7 16:37:55 erd kernel: [ 217.057139] INFO: Slab 0xc16ddc40 objects=26 used=6 fp=0xf6e62260 flags=0x28040c3
Sep 7 16:37:55 erd kernel: [ 217.057184] INFO: Object 0xf6e62260 @offset=608 fp=0xf6e62850
Sep 7 16:37:55 erd kernel: [ 217.057184]
Sep 7 16:37:55 erd kernel: [ 217.057259] Bytes b4 0xf6e62250: d9 04 00 00 fc 6f fb ff 5a 5a 5a 5a 5a 5a 5a 5a Ù...üoûÿZZZZZZZZ
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62260: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62270: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62280: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62290: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e622a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e622b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 76 76 6b 6b kkkkkkkkkkkkvvkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e622c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e622d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e622e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e622f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62300: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62310: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62320: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62330: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62340: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Sep 7 16:37:55 erd kernel: [ 217.057771] Object 0xf6e62350: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk¥
Sep 7 16:37:55 erd kernel: [ 217.057771] Redzone 0xf6e62360: bb bb bb bb »»»»
Sep 7 16:37:55 erd kernel: [ 217.057771] Padding 0xf6e62388: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
Sep 7 16:37:55 erd kernel: [ 217.057771] Pid: 5334, comm: bash Not tainted 2.6.31-rc5-04006-gedfbc1d-dirty #188
Sep 7 16:37:55 erd kernel: [ 217.057771] Call Trace:
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02a6d5f>] print_trailer+0xcf/0x120
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02a6e69>] check_bytes_and_report+0xb9/0xe0
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02a7097>] check_object+0x1b7/0x200
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02a89b6>] __slab_alloc+0x3d6/0x5a0
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02a9602>] __kmalloc+0x172/0x180
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02e4c02>] ? load_elf_binary+0x122/0x1550
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02e4c02>] load_elf_binary+0x122/0x1550
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c035655e>] ? strrchr+0xe/0x30
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02e2644>] ? load_misc_binary+0x64/0x420
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c029190f>] ? page_address+0xcf/0xf0
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c0291aac>] ? kmap_high+0x1c/0x1e0
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c029190f>] ? page_address+0xcf/0xf0
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c029194a>] ? kunmap_high+0x1a/0x90
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02b20d7>] search_binary_handler+0xa7/0x240
Sep 7 16:37:55 erd kernel: [ 217.057771] [<c02b3686>] do_execve+0x2e6/0x3c0
Sep 7 16:37:56 erd kernel: [ 217.057771] [<c0201638>] sys_execve+0x28/0x60
Sep 7 16:37:56 erd kernel: [ 217.057771] [<c0202d08>] sysenter_do_call+0x12/0x26
Sep 7 16:37:56 erd kernel: [ 217.057771] FIX kmalloc-256: Restoring 0xf6e622bc-0xf6e622bd=0x6b
Sep 7 16:37:56 erd kernel: [ 217.057771]
Sep 7 16:37:56 erd kernel: [ 217.057771] FIX kmalloc-256: Marking all objects used
^ permalink raw reply
* Re: net_sched 00/07: classful multiqueue dummy scheduler
From: Patrick McHardy @ 2009-09-07 17:28 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <4AA54128.2050607@gmail.com>
Eric Dumazet wrote:
>> I figured out the bug, which is likely responsible for both
>> problems. When grafting a mq class and creating a rate estimator,
>> the new qdisc is not attached to the device queue yet and also
>> doesn't have TC_H_ROOT as parent, so qdisc_create() selects
>> qdisc_root_sleeping_lock() for the estimator, which belongs to
>> the qdisc that is getting replaced.
>>
>> This is a patch I used for testing, but I'll come up with
>> something more elegant (I hope) as a final fix :)
>
> Yes, this was the problem, and your patch fixed it.
Thanks for testing.
> Now adding CONFIG_SLUB_DEBUG_ON=y for next tries :)
>
> Sep 7 16:37:55 erd kernel: [ 217.056813] =============================================================================
> Sep 7 16:37:55 erd kernel: [ 217.056865] BUG kmalloc-256: Poison overwritten
> Sep 7 16:37:55 erd kernel: [ 217.056910] -----------------------------------------------------------------------------
> Sep 7 16:37:55 erd kernel: [ 217.056911]
> Sep 7 16:37:55 erd kernel: [ 217.056990] INFO: 0xf6e622bc-0xf6e622bd. First byte 0x76 instead of 0x6b
> Sep 7 16:37:55 erd kernel: [ 217.057049] INFO: Allocated in qdisc_alloc+0x1b/0x80 age=154593 cpu=2 pid=5165
> Sep 7 16:37:55 erd kernel: [ 217.057094] INFO: Freed in qdisc_destroy+0x88/0xa0 age=139186 cpu=4 pid=5173
> Sep 7 16:37:55 erd kernel: [ 217.057139] INFO: Slab 0xc16ddc40 objects=26 used=6 fp=0xf6e62260 flags=0x28040c3
> Sep 7 16:37:55 erd kernel: [ 217.057184] INFO: Object 0xf6e62260 @offset=608 fp=0xf6e62850
> Sep 7 16:37:55 erd kernel: [ 217.057184]
I'm unable to reproduce this. Could you send me the commands you
used that lead to this?
^ permalink raw reply
* Re: net_sched 00/07: classful multiqueue dummy scheduler
From: Eric Dumazet @ 2009-09-07 17:30 UTC (permalink / raw)
To: Patrick McHardy; +Cc: David Miller, netdev
In-Reply-To: <4AA542B4.4090206@trash.net>
Patrick McHardy a écrit :
> Eric Dumazet wrote:
>>> I figured out the bug, which is likely responsible for both
>>> problems. When grafting a mq class and creating a rate estimator,
>>> the new qdisc is not attached to the device queue yet and also
>>> doesn't have TC_H_ROOT as parent, so qdisc_create() selects
>>> qdisc_root_sleeping_lock() for the estimator, which belongs to
>>> the qdisc that is getting replaced.
>>>
>>> This is a patch I used for testing, but I'll come up with
>>> something more elegant (I hope) as a final fix :)
>> Yes, this was the problem, and your patch fixed it.
>
> Thanks for testing.
>
>> Now adding CONFIG_SLUB_DEBUG_ON=y for next tries :)
>>
>> Sep 7 16:37:55 erd kernel: [ 217.056813] =============================================================================
>> Sep 7 16:37:55 erd kernel: [ 217.056865] BUG kmalloc-256: Poison overwritten
>> Sep 7 16:37:55 erd kernel: [ 217.056910] -----------------------------------------------------------------------------
>> Sep 7 16:37:55 erd kernel: [ 217.056911]
>> Sep 7 16:37:55 erd kernel: [ 217.056990] INFO: 0xf6e622bc-0xf6e622bd. First byte 0x76 instead of 0x6b
>> Sep 7 16:37:55 erd kernel: [ 217.057049] INFO: Allocated in qdisc_alloc+0x1b/0x80 age=154593 cpu=2 pid=5165
>> Sep 7 16:37:55 erd kernel: [ 217.057094] INFO: Freed in qdisc_destroy+0x88/0xa0 age=139186 cpu=4 pid=5173
>> Sep 7 16:37:55 erd kernel: [ 217.057139] INFO: Slab 0xc16ddc40 objects=26 used=6 fp=0xf6e62260 flags=0x28040c3
>> Sep 7 16:37:55 erd kernel: [ 217.057184] INFO: Object 0xf6e62260 @offset=608 fp=0xf6e62850
>> Sep 7 16:37:55 erd kernel: [ 217.057184]
>
> I'm unable to reproduce this. Could you send me the commands you
> used that lead to this?
>
Sorry, this was *before* your last patch.
I tried to have more information, because I was not able to get console messages at crash time on this remote dev machine.
enabling SLUB checks got some hint of what the problem was (using memory block after its freeing by qdisc_destroy)
^ permalink raw reply
* Re: net_sched 00/07: classful multiqueue dummy scheduler
From: Patrick McHardy @ 2009-09-07 17:33 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <4AA54347.8020401@gmail.com>
Eric Dumazet wrote:
> Patrick McHardy a écrit :
>>> Sep 7 16:37:55 erd kernel: [ 217.056813] =============================================================================
>>> Sep 7 16:37:55 erd kernel: [ 217.056865] BUG kmalloc-256: Poison overwritten
>>> Sep 7 16:37:55 erd kernel: [ 217.056910] -----------------------------------------------------------------------------
>>> Sep 7 16:37:55 erd kernel: [ 217.056911]
>>> Sep 7 16:37:55 erd kernel: [ 217.056990] INFO: 0xf6e622bc-0xf6e622bd. First byte 0x76 instead of 0x6b
>>> Sep 7 16:37:55 erd kernel: [ 217.057049] INFO: Allocated in qdisc_alloc+0x1b/0x80 age=154593 cpu=2 pid=5165
>>> Sep 7 16:37:55 erd kernel: [ 217.057094] INFO: Freed in qdisc_destroy+0x88/0xa0 age=139186 cpu=4 pid=5173
>>> Sep 7 16:37:55 erd kernel: [ 217.057139] INFO: Slab 0xc16ddc40 objects=26 used=6 fp=0xf6e62260 flags=0x28040c3
>>> Sep 7 16:37:55 erd kernel: [ 217.057184] INFO: Object 0xf6e62260 @offset=608 fp=0xf6e62850
>>> Sep 7 16:37:55 erd kernel: [ 217.057184]
>> I'm unable to reproduce this. Could you send me the commands you
>> used that lead to this?
>>
>
> Sorry, this was *before* your last patch.
>
> I tried to have more information, because I was not able to get console messages at crash time on this remote dev machine.
>
> enabling SLUB checks got some hint of what the problem was (using memory block after its freeing by qdisc_destroy)
OK, that probably explains it, the spinlock operations were operating
on already freed memory.
I'll do some more testing and will send the final patch if no
other problems show up.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox