* [PATCH 3/6] [IPROUTE2]: Update pkt_sched.h (to resemble the kernel one)
From: Jesper Dangaard Brouer @ 2007-09-12 10:14 UTC (permalink / raw)
To: netdev@vger.kernel.org
Cc: Patrick McHardy, David S. Miller, Stephen Hemminger
commit ef065a43b8900fbc0763eac0fa0a9a8a00c8aaa2
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Tue Sep 11 16:17:46 2007 +0200
[IPROUTE2]: Update pkt_sched.h (to resemble the kernel one)
Extend the tc_ratespec struct with two parameters: 1) "cell_align",
which allows adjusting the alignment of the rate table, and 2) "overhead",
which allows adding a packet overhead before the lookup in the kernel.
This is done to add support for changing the rate table to
use the upper-boundary L2T (length to time) value. Currently we use the
lower boundary, which results in underestimating the actual bandwidth
usage.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/include/linux/pkt_sched.h b/include/linux/pkt_sched.h
index 268c515..919af93 100644
--- a/include/linux/pkt_sched.h
+++ b/include/linux/pkt_sched.h
@@ -77,8 +77,8 @@ struct tc_ratespec
{
unsigned char cell_log;
unsigned char __reserved;
- unsigned short feature;
- short addend;
+ unsigned short overhead;
+ short cell_align;
unsigned short mpu;
__u32 rate;
};
^ permalink raw reply related
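The header change in the patch above only renames two fields and keeps their types' sizes. A minimal sketch of a layout check, assuming a userspace build where `uint32_t` stands in for `__u32` and the reserved byte is renamed `reserved`; the struct names `tc_ratespec_old` and `tc_ratespec_new` and the helper function are hypothetical and exist only for this comparison:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch (field names taken from the removed lines). */
struct tc_ratespec_old {
	unsigned char	cell_log;
	unsigned char	reserved;	/* __reserved in the real header */
	unsigned short	feature;
	short		addend;
	unsigned short	mpu;
	uint32_t	rate;		/* __u32 in the real header */
};

/* Layout after the patch (field names taken from the added lines). */
struct tc_ratespec_new {
	unsigned char	cell_log;
	unsigned char	reserved;
	unsigned short	overhead;
	short		cell_align;
	unsigned short	mpu;
	uint32_t	rate;
};

/* Returns 1 when size and field offsets are identical in both layouts. */
static int tc_ratespec_layout_unchanged(void)
{
	return sizeof(struct tc_ratespec_old) == sizeof(struct tc_ratespec_new)
	    && offsetof(struct tc_ratespec_old, feature) ==
	       offsetof(struct tc_ratespec_new, overhead)
	    && offsetof(struct tc_ratespec_old, addend) ==
	       offsetof(struct tc_ratespec_new, cell_align)
	    && offsetof(struct tc_ratespec_old, mpu) ==
	       offsetof(struct tc_ratespec_new, mpu)
	    && offsetof(struct tc_ratespec_old, rate) ==
	       offsetof(struct tc_ratespec_new, rate);
}
```

The check says nothing about how an older kernel interprets the renamed fields; it only shows that the structure passed over netlink keeps the same size and offsets.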
* [PATCH 4/6] [IPROUTE2]: Overhead calculation is now done in the kernel
From: Jesper Dangaard Brouer @ 2007-09-12 10:14 UTC (permalink / raw)
To: netdev@vger.kernel.org
Cc: Patrick McHardy, David S. Miller, Stephen Hemminger
commit 07a74a2613440fc1a68d0faa7235ed7027532d78
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Tue Sep 11 16:59:58 2007 +0200
[IPROUTE2]: Overhead calculation is now done in the kernel.
The only current user is HTB. HTB overhead argument is now passed on
to the kernel (in the struct tc_ratespec). Also correct the data
types.
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/tc/q_htb.c b/tc/q_htb.c
index 53e3f78..310d36d 100644
--- a/tc/q_htb.c
+++ b/tc/q_htb.c
@@ -107,8 +107,9 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
__u32 rtab[256],ctab[256];
unsigned buffer=0,cbuffer=0;
int cell_log=-1,ccell_log = -1;
- unsigned mtu, mpu;
- unsigned char mpu8 = 0, overhead = 0;
+ unsigned mtu;
+ unsigned short mpu = 0;
+ unsigned short overhead = 0;
struct rtattr *tail;
memset(&opt, 0, sizeof(opt)); mtu = 1600; /* eth packet len */
@@ -127,12 +128,12 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
}
} else if (matches(*argv, "mpu") == 0) {
NEXT_ARG();
- if (get_u8(&mpu8, *argv, 10)) {
+ if (get_u16(&mpu, *argv, 10)) {
explain1("mpu"); return -1;
}
} else if (matches(*argv, "overhead") == 0) {
NEXT_ARG();
- if (get_u8(&overhead, *argv, 10)) {
+ if (get_u16(&overhead, *argv, 10)) {
explain1("overhead"); return -1;
}
} else if (matches(*argv, "quantum") == 0) {
@@ -206,9 +207,11 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
if (!buffer) buffer = opt.rate.rate / get_hz() + mtu;
if (!cbuffer) cbuffer = opt.ceil.rate / get_hz() + mtu;
-/* encode overhead and mpu, 8 bits each, into lower 16 bits */
- mpu = (unsigned)mpu8 | (unsigned)overhead << 8;
- opt.ceil.mpu = mpu; opt.rate.mpu = mpu;
+ opt.ceil.overhead = overhead;
+ opt.rate.overhead = overhead;
+
+ opt.ceil.mpu = mpu;
+ opt.rate.mpu = mpu;
if ((cell_log = tc_calc_rtable(opt.rate.rate, rtab, cell_log, mtu, mpu)) < 0) {
fprintf(stderr, "htb: failed to calculate rate table.\n");
diff --git a/tc/tc_core.c b/tc/tc_core.c
index 58155fb..1ab0ba0 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -73,8 +73,6 @@ int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
unsigned mpu)
{
int i;
- unsigned overhead = (mpu >> 8) & 0xFF;
- mpu = mpu & 0xFF;
if (mtu == 0)
mtu = 2047;
@@ -86,8 +84,6 @@ int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
}
for (i=0; i<256; i++) {
unsigned sz = (i<<cell_log);
- if (overhead)
- sz += overhead;
if (sz < mpu)
sz = mpu;
rtab[i] = tc_calc_xmittime(bps, sz);
^ permalink raw reply related
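For reference, the encoding removed by the patch above packed two 8-bit values into the lower 16 bits of the mpu argument. A small sketch of that old packing and unpacking, mirroring the removed lines in q_htb.c and tc_core.c; the function names are hypothetical:

```c
#include <assert.h>

/* Old userspace encoding: mpu in bits 0-7, overhead in bits 8-15. */
static unsigned pack_mpu_overhead(unsigned char mpu8, unsigned char overhead)
{
	return (unsigned)mpu8 | ((unsigned)overhead << 8);
}

/* Old decoding, as done at the top of tc_calc_rtable() before the patch. */
static void unpack_mpu_overhead(unsigned packed, unsigned *mpu,
				unsigned *overhead)
{
	assert(mpu && overhead);
	*overhead = (packed >> 8) & 0xFF;
	*mpu = packed & 0xFF;
}
```

With separate unsigned short fields for mpu and overhead, neither value is limited to 255 any more, which is why the parser switches from get_u8() to get_u16().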
* [PATCH 5/6] [IPROUTE2]: Cleanup: tc_calc_rtable()
From: Jesper Dangaard Brouer @ 2007-09-12 10:14 UTC (permalink / raw)
To: netdev@vger.kernel.org
Cc: Patrick McHardy, David S. Miller, Stephen Hemminger
commit e3bad6e344303fec9916d1420aade98a2e6c79cc
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Wed Sep 5 10:47:47 2007 +0200
[IPROUTE2]: Cleanup: tc_calc_rtable().
Change tc_calc_rtable() to take a tc_ratespec struct as an
argument. (cell_log still needs to be passed as a parameter,
because -1 indicates that the cell_log needs to be computed by the
function.)
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/tc/m_police.c b/tc/m_police.c
index 5d2528b..acdfd22 100644
--- a/tc/m_police.c
+++ b/tc/m_police.c
@@ -263,22 +263,20 @@ int act_parse_police(struct action_util *a,int *argc_p, char ***argv_p, int tca_
}
if (p.rate.rate) {
- if ((Rcell_log = tc_calc_rtable(p.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+ p.rate.mpu = mpu;
+ if (tc_calc_rtable(&p.rate, rtab, Rcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate rate table.\n");
return -1;
}
p.burst = tc_calc_xmittime(p.rate.rate, buffer);
- p.rate.cell_log = Rcell_log;
- p.rate.mpu = mpu;
}
p.mtu = mtu;
if (p.peakrate.rate) {
- if ((Pcell_log = tc_calc_rtable(p.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+ p.peakrate.mpu = mpu;
+ if (tc_calc_rtable(&p.peakrate, ptab, Pcell_log, mtu) < 0) {
fprintf(stderr, "POLICE: failed to calculate peak rate table.\n");
return -1;
}
- p.peakrate.cell_log = Pcell_log;
- p.peakrate.mpu = mpu;
}
tail = NLMSG_TAIL(n);
diff --git a/tc/q_cbq.c b/tc/q_cbq.c
index f2b4ce8..df98312 100644
--- a/tc/q_cbq.c
+++ b/tc/q_cbq.c
@@ -137,12 +137,11 @@ static int cbq_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
if (allot < (avpkt*3)/2)
allot = (avpkt*3)/2;
- if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, allot, mpu)) < 0) {
+ r.mpu = mpu;
+ if (tc_calc_rtable(&r, rtab, cell_log, allot) < 0) {
fprintf(stderr, "CBQ: failed to calculate rate table.\n");
return -1;
}
- r.cell_log = cell_log;
- r.mpu = mpu;
if (ewma_log < 0)
ewma_log = TC_CBQ_DEF_EWMA;
@@ -336,12 +335,11 @@ static int cbq_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
unsigned pktsize = wrr.allot;
if (wrr.allot < (lss.avpkt*3)/2)
wrr.allot = (lss.avpkt*3)/2;
- if ((cell_log = tc_calc_rtable(r.rate, rtab, cell_log, pktsize, mpu)) < 0) {
+ r.mpu = mpu;
+ if (tc_calc_rtable(&r, rtab, cell_log, pktsize) < 0) {
fprintf(stderr, "CBQ: failed to calculate rate table.\n");
return -1;
}
- r.cell_log = cell_log;
- r.mpu = mpu;
}
if (ewma_log < 0)
ewma_log = TC_CBQ_DEF_EWMA;
diff --git a/tc/q_htb.c b/tc/q_htb.c
index 310d36d..e24ad6d 100644
--- a/tc/q_htb.c
+++ b/tc/q_htb.c
@@ -213,19 +213,17 @@ static int htb_parse_class_opt(struct qdisc_util *qu, int argc, char **argv, str
opt.ceil.mpu = mpu;
opt.rate.mpu = mpu;
- if ((cell_log = tc_calc_rtable(opt.rate.rate, rtab, cell_log, mtu, mpu)) < 0) {
+ if (tc_calc_rtable(&opt.rate, rtab, cell_log, mtu) < 0) {
fprintf(stderr, "htb: failed to calculate rate table.\n");
return -1;
}
opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
- opt.rate.cell_log = cell_log;
- if ((ccell_log = tc_calc_rtable(opt.ceil.rate, ctab, cell_log, mtu, mpu)) < 0) {
+ if (tc_calc_rtable(&opt.ceil, ctab, ccell_log, mtu) < 0) {
fprintf(stderr, "htb: failed to calculate ceil rate table.\n");
return -1;
}
opt.cbuffer = tc_calc_xmittime(opt.ceil.rate, cbuffer);
- opt.ceil.cell_log = ccell_log;
tail = NLMSG_TAIL(n);
addattr_l(n, 1024, TCA_OPTIONS, NULL, 0);
diff --git a/tc/q_tbf.c b/tc/q_tbf.c
index 1fc05f4..c7b4f0f 100644
--- a/tc/q_tbf.c
+++ b/tc/q_tbf.c
@@ -170,21 +170,20 @@ static int tbf_parse_opt(struct qdisc_util *qu, int argc, char **argv, struct nl
opt.limit = lim;
}
- if ((Rcell_log = tc_calc_rtable(opt.rate.rate, rtab, Rcell_log, mtu, mpu)) < 0) {
+ opt.rate.mpu = mpu;
+ if (tc_calc_rtable(&opt.rate, rtab, Rcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate rate table.\n");
return -1;
}
opt.buffer = tc_calc_xmittime(opt.rate.rate, buffer);
- opt.rate.cell_log = Rcell_log;
- opt.rate.mpu = mpu;
+
if (opt.peakrate.rate) {
- if ((Pcell_log = tc_calc_rtable(opt.peakrate.rate, ptab, Pcell_log, mtu, mpu)) < 0) {
+ opt.peakrate.mpu = mpu;
+ if (tc_calc_rtable(&opt.peakrate, ptab, Pcell_log, mtu) < 0) {
fprintf(stderr, "TBF: failed to calculate peak rate table.\n");
return -1;
}
opt.mtu = tc_calc_xmittime(opt.peakrate.rate, mtu);
- opt.peakrate.cell_log = Pcell_log;
- opt.peakrate.mpu = mpu;
}
tail = NLMSG_TAIL(n);
diff --git a/tc/tc_core.c b/tc/tc_core.c
index 1ab0ba0..c713a18 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -69,10 +69,11 @@ unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks)
rtab[pkt_len>>cell_log] = pkt_xmit_time
*/
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
- unsigned mpu)
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu)
{
int i;
+ unsigned bps = r->rate;
+ unsigned mpu = r->mpu;
if (mtu == 0)
mtu = 2047;
@@ -88,6 +89,7 @@ int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu,
sz = mpu;
rtab[i] = tc_calc_xmittime(bps, sz);
}
+ r->cell_log=cell_log;
return cell_log;
}
diff --git a/tc/tc_core.h b/tc/tc_core.h
index a139da6..e98a7b4 100644
--- a/tc/tc_core.h
+++ b/tc/tc_core.h
@@ -13,7 +13,7 @@ long tc_core_time2ktime(long time);
long tc_core_ktime2time(long ktime);
unsigned tc_calc_xmittime(unsigned rate, unsigned size);
unsigned tc_calc_xmitsize(unsigned rate, unsigned ticks);
-int tc_calc_rtable(unsigned bps, __u32 *rtab, int cell_log, unsigned mtu, unsigned mpu);
+int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mtu);
int tc_setup_estimator(unsigned A, unsigned time_const, struct tc_estimator *est);
^ permalink raw reply related
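The calling convention after the cleanup above can be sketched in a self-contained form. This is an illustration under assumptions, not the iproute2 code: `struct ratespec` is a reduced stand-in for struct tc_ratespec, and `xmit_time_us()` is a hypothetical simplification of tc_calc_xmittime() that returns microseconds and ignores the tick conversion the real helper performs.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced, hypothetical stand-in for struct tc_ratespec. */
struct ratespec {
	unsigned char	cell_log;
	unsigned short	mpu;
	uint32_t	rate;	/* bytes per second */
};

/* Hypothetical simplification: microseconds needed to send size bytes. */
static uint32_t xmit_time_us(uint32_t rate, unsigned size)
{
	return (uint32_t)(((uint64_t)size * 1000000u) / rate);
}

/*
 * Fill the 256-entry rate table from the ratespec. A cell_log of -1
 * asks the function to pick the smallest shift that fits mtu into
 * 256 slots. The chosen value is stored back into the ratespec, so
 * callers no longer assign r->cell_log themselves.
 */
static int calc_rtable(struct ratespec *r, uint32_t *rtab, int cell_log,
		       unsigned mtu)
{
	unsigned bps = r->rate;
	unsigned mpu = r->mpu;
	int i;

	assert(bps != 0);
	if (mtu == 0)
		mtu = 2047;
	if (cell_log < 0) {
		cell_log = 0;
		while ((mtu >> cell_log) > 255)
			cell_log++;
	}
	for (i = 0; i < 256; i++) {
		unsigned sz = (unsigned)i << cell_log;

		if (sz < mpu)
			sz = mpu;
		rtab[i] = xmit_time_us(bps, sz);
	}
	r->cell_log = (unsigned char)cell_log;
	return cell_log;
}
```

A caller sets rate and mpu in the struct, passes -1 for cell_log, and only checks the return value for an error, which matches the shape of the updated call sites in the diff.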
* [PATCH 6/6] [IPROUTE2]: Change the rate table calc of transmit cost to use upper bound value
From: Jesper Dangaard Brouer @ 2007-09-12 10:14 UTC (permalink / raw)
To: netdev@vger.kernel.org
Cc: Patrick McHardy, David S. Miller, Stephen Hemminger
commit 2e3edbef7913ac43899c8258ee59d9032778cee1
Author: Jesper Dangaard Brouer <hawk@comx.dk>
Date: Wed Sep 5 15:24:51 2007 +0200
[IPROUTE2]: Change the rate table calc of transmit cost to use upper bound value.
Quoting Patrick McHardy: 'it's better to overestimate than underestimate
to stay in control of the queue'.
Illustrating the rate table array:
Legend description
rtab[x] : Array index x of rtab[x]
xmit_sz : Transmit size contained in rtab[x] (normally transmit time)
maps[a-b] : Packet sizes from a to b, will map into rtab[x]
Current/old rate table mapping (cell_log:3):
rtab[0]:=xmit_sz:0 maps[0-7]
rtab[1]:=xmit_sz:8 maps[8-15]
rtab[2]:=xmit_sz:16 maps[16-23]
rtab[3]:=xmit_sz:24 maps[24-31]
rtab[4]:=xmit_sz:32 maps[32-39]
rtab[5]:=xmit_sz:40 maps[40-47]
rtab[6]:=xmit_sz:48 maps[48-55]
New rate table mapping, with kernel cell_align support.
rtab[0]:=xmit_sz:8 maps[0-8]
rtab[1]:=xmit_sz:16 maps[9-16]
rtab[2]:=xmit_sz:24 maps[17-24]
rtab[3]:=xmit_sz:32 maps[25-32]
rtab[4]:=xmit_sz:40 maps[33-40]
rtab[5]:=xmit_sz:48 maps[41-48]
rtab[6]:=xmit_sz:56 maps[49-56]
New TC util on a kernel WITHOUT support for cell_align
rtab[0]:=xmit_sz:8 maps[0-7]
rtab[1]:=xmit_sz:16 maps[8-15]
rtab[2]:=xmit_sz:24 maps[16-23]
rtab[3]:=xmit_sz:32 maps[24-31]
rtab[4]:=xmit_sz:40 maps[32-39]
rtab[5]:=xmit_sz:48 maps[40-47]
rtab[6]:=xmit_sz:56 maps[48-55]
Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
diff --git a/tc/tc_core.c b/tc/tc_core.c
index c713a18..752b07c 100644
--- a/tc/tc_core.c
+++ b/tc/tc_core.c
@@ -84,11 +84,12 @@ int tc_calc_rtable(struct tc_ratespec *r, __u32 *rtab, int cell_log, unsigned mt
cell_log++;
}
for (i=0; i<256; i++) {
- unsigned sz = (i<<cell_log);
+ unsigned sz = ((i+1)<<cell_log);
if (sz < mpu)
sz = mpu;
rtab[i] = tc_calc_xmittime(bps, sz);
}
+ r->cell_align=-1; // Due to the sz calc
r->cell_log=cell_log;
return cell_log;
}
^ permalink raw reply related
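The three mapping tables in the commit message above can be reproduced with a short sketch. The userspace half follows the patched loop; the kernel half is an assumption about how a cell_align-aware lookup would pick a slot, since the kernel side is not part of the iproute2 patches shown here. Both function names are hypothetical.

```c
#include <assert.h>

/* Userspace side after the patch: slot i stores the upper edge of its cell. */
static unsigned slot_size_upper(int i, int cell_log)
{
	return (unsigned)(i + 1) << cell_log;
}

/*
 * Assumed kernel-side slot selection: add cell_align to the packet
 * length, clamp at zero, then shift by cell_log. With cell_align = 0
 * this is the plain pkt_len >> cell_log lookup.
 */
static int slot_for_len(unsigned pktlen, int cell_align, int cell_log)
{
	int slot = (int)pktlen + cell_align;

	if (slot < 0)
		slot = 0;
	return slot >> cell_log;
}
```

With cell_log 3 and cell_align -1, lengths 0 to 8 select slot 0 and lengths 9 to 16 select slot 1, as in the "with kernel cell_align support" table. With cell_align 0, length 8 already selects slot 1, which holds the cost of 16 bytes, so the new tc on such a kernel overestimates rather than underestimates.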
* net/bluetooth/hci_sock.c:352: error: storage size of 'ctv' isn't known
From: Robert P. J. Day @ 2007-09-12 10:15 UTC (permalink / raw)
To: netdev
[-- Attachment #1: Type: TEXT/PLAIN, Size: 799 bytes --]
latest git pull, "make allyesconfig" on i386:
...
CC net/bluetooth/hci_sock.o
net/bluetooth/hci_sock.c: In function ‘hci_sock_cmsg’:
net/bluetooth/hci_sock.c:352: error: storage size of ‘ctv’ isn’t known
net/bluetooth/hci_sock.c:352: warning: unused variable ‘ctv’
make[2]: *** [net/bluetooth/hci_sock.o] Error 1
make[1]: *** [net/bluetooth] Error 2
make: *** [net] Error 2
rday
p.s. dumb question -- what locale should i be using to get those
quotes to not make such a mess of my screen? thanks.
--
========================================================================
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA
http://crashcourse.ca
========================================================================
^ permalink raw reply
* Re: [PATCH] NET : convert IP route cache garbage colleciton from softirq processing to a workqueue
From: Eric Dumazet @ 2007-09-12 10:18 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Herbert Xu, David Miller, netdev@vger.kernel.org
In-Reply-To: <20070912100054.GA3649@infradead.org>
On Wed, 12 Sep 2007 11:00:54 +0100
Christoph Hellwig <hch@infradead.org> wrote:
> This looks nice in general, getting things out of softirq context is
> always good.
I am preparing a patch to net/ipv4/route.c to migrate rt_check_expire()
as well.
>
> On Tue, Sep 11, 2007 at 02:56:13PM +0200, Eric Dumazet wrote:
> > #if RT_CACHE_DEBUG >= 2
> > static atomic_t dst_total = ATOMIC_INIT(0);
> > #endif
> > -static unsigned long dst_gc_timer_expires;
> > -static unsigned long dst_gc_timer_inc = DST_GC_MAX;
> > -static void dst_run_gc(unsigned long);
> > +static struct {
> > + spinlock_t lock;
> > + struct dst_entry *list;
> > + unsigned long timer_inc;
> > + unsigned long timer_expires;
> > +} dst_garbage = {
> > + .lock = __SPIN_LOCK_UNLOCKED(dst_garbage.lock),
> > + .timer_inc = DST_GC_MAX,
> > +};
>
> Can you please get rid of this useless struct? It just complicates
> the code and means we can't use the proper DEFINE_SPINLOCK initializer.
With the standard DEFINE_SPINLOCK initializer, the lock lands in the
data section, while the list lands in the bss section.
This 'useless struct' keeps lock and list on the same cache line, which
reduces the latency of __dst_free(). I wish more structures were used in the
kernel instead of relying on the linker's arbitrary placement...
>
> > +DECLARE_DELAYED_WORK(dst_gc_work, dst_gc_task);
>
> This should be static.
Yes I agree
^ permalink raw reply
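The cache-line argument in the reply above can be illustrated in userspace. This is a sketch under assumptions: a 64-byte line size, an `int` standing in for spinlock_t, and an explicit C11 `_Alignas` that the quoted kernel code does not use. The alignment is added here only so the check holds regardless of where the linker places the object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed cache line size */

struct entry;		/* opaque stand-in for struct dst_entry */

/* Grouping the fields fixes their relative placement. */
struct garbage {
	int		lock;		/* stand-in for spinlock_t */
	struct entry	*list;
	unsigned long	timer_inc;
	unsigned long	timer_expires;
};

static _Alignas(CACHE_LINE) struct garbage dst_garbage_demo = {
	.timer_inc = 10,
};

/* True when both addresses fall into the same CACHE_LINE-sized block. */
static int same_cache_line(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE) == ((uintptr_t)b / CACHE_LINE);
}
```

Two separate globals give no such guarantee, because an initialized lock and a zero-initialized list pointer may end up in different sections.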
* Re: [PATCH 0/6] NET_SCHED: Rate table fixes
From: Patrick McHardy @ 2007-09-12 10:54 UTC (permalink / raw)
To: jdb; +Cc: netdev@vger.kernel.org, David S. Miller, Stephen Hemminger
In-Reply-To: <1189592020.26927.20.camel@localhost.localdomain>
Jesper Dangaard Brouer wrote:
> This set of patches, aim at fixing an issue with the rate table used
> by the rate based schedulers.
ACK for all the patches :)
^ permalink raw reply
* Re: [PATCH 09/16] net: Initialize the network namespace of network devices.
From: David Miller @ 2007-09-12 10:58 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m1odgcvoje.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:24:21 -0600
>
> Except for carefully selected pseudo devices all network
> interfaces should start out in the initial network namespace.
> Ultimately it will be register_netdev that examines what
> dev->nd_net is set to and places a device in a network namespace.
>
> This patch modifies alloc_netdev to initialize the network
> namespace a device is in with the initial network namespace.
> This gets it right for the vast majority of devices so their
> drivers need not be modified and for those few pseudo devices
> that need something different they can change this parameter
> before calling register_netdevice.
>
> The network namespace parameter on a network device is not
> reference counted as the devices are inside of a network namespace
> and cannot remain in that namespace past the lifetime of the
> network namespace.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied to net-2.6.24, thanks.
^ permalink raw reply
* Re: [PATCH] Move the definition of pr_err() into kernel.h
From: Stephen Hemminger @ 2007-09-12 10:59 UTC (permalink / raw)
To: Emil Medve; +Cc: linux-kernel, netdev, i2c, linux-omap-open-source, Emil Medve
In-Reply-To: <11895225651521-git-send-email-Emilian.Medve@Freescale.com>
On Tue, 11 Sep 2007 09:56:05 -0500
Emil Medve <Emilian.Medve@Freescale.com> wrote:
> Other pr_*() macros are already defined in kernel.h, but pr_err() was defined
> multiple times in several other places
>
> Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com>
pr_error seems better than pr_err
Please add the full set:
pr_alert
pr_critical
pr_error
pr_warn
pr_notice
^ permalink raw reply
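The suggested family of macros could look roughly like the following userspace sketch. It is not the kernel.h definition: `log_line()` is a hypothetical stand-in for printk() that formats into a buffer, and the `LVL_*` strings mirror the "<N>" log-level prefixes the kernel used for KERN_ALERT through KERN_NOTICE at the time. The macro names are the ones proposed in the reply above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Log-level prefixes, mirroring KERN_ALERT .. KERN_NOTICE. */
#define LVL_ALERT	"<1>"
#define LVL_CRIT	"<2>"
#define LVL_ERR		"<3>"
#define LVL_WARNING	"<4>"
#define LVL_NOTICE	"<5>"

static char last_line[256];	/* holds the most recent formatted message */

/* Hypothetical printk() stand-in: formats into last_line, returns length. */
static int log_line(const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);
	va_start(ap, fmt);
	n = vsnprintf(last_line, sizeof(last_line), fmt, ap);
	va_end(ap);
	return n;
}

/* The format must be a string literal so the prefix can be concatenated. */
#define pr_alert(...)		log_line(LVL_ALERT __VA_ARGS__)
#define pr_critical(...)	log_line(LVL_CRIT __VA_ARGS__)
#define pr_error(...)		log_line(LVL_ERR __VA_ARGS__)
#define pr_warn(...)		log_line(LVL_WARNING __VA_ARGS__)
#define pr_notice(...)		log_line(LVL_NOTICE __VA_ARGS__)
```

Defining the whole set in one place avoids the per-driver duplicates the original patch set out to remove.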
* Re: [PATCH 10/16] net: Make packet reception network namespace safe
From: David Miller @ 2007-09-12 11:00 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m1k5r0voh4.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:25:43 -0600
>
> This patch modifies every packet receive function
> registered with dev_add_pack() to drop packets if they
> are not from the initial network namespace.
>
> This should ensure that the various network stacks do
> not receive packets in anything but the initial network
> namespace until the code has been converted and is ready
> for them.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied to net-2.6.24, thanks.
^ permalink raw reply
* Re: [PATCH 3/6] [IPROUTE2]: Update pkt_sched.h (to resemble the kernel one)
From: Stephen Hemminger @ 2007-09-12 11:02 UTC (permalink / raw)
To: jdb; +Cc: netdev@vger.kernel.org, Patrick McHardy, David S. Miller
In-Reply-To: <1189592054.26927.24.camel@localhost.localdomain>
On Wed, 12 Sep 2007 12:14:14 +0200
Jesper Dangaard Brouer <jdb@comx.dk> wrote:
> commit ef065a43b8900fbc0763eac0fa0a9a8a00c8aaa2
> Author: Jesper Dangaard Brouer <hawk@comx.dk>
> Date: Tue Sep 11 16:17:46 2007 +0200
>
> [IPROUTE2]: Update pkt_sched.h (to resemble the kernel one)
>
> Extend the tc_ratespec struct with two parameters: 1) "cell_align",
> which allows adjusting the alignment of the rate table, and 2) "overhead",
> which allows adding a packet overhead before the lookup in the kernel.
>
> This is done to add support for changing the rate table to
> use the upper-boundary L2T (length to time) value. Currently we use the
> lower boundary, which results in underestimating the actual bandwidth
> usage.
>
> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
>
Okay, but this doesn't need a special patch. I periodically sync
up the headers before each release.
^ permalink raw reply
* Re: [PATCH 11/16] net: Make device event notification network namespace safe
From: David Miller @ 2007-09-12 11:02 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m1fy1ovoeo.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:27:11 -0600
>
> Every user of the network device notifiers is either a protocol
> stack or a pseudo device. If a protocol stack that does not have
> support for multiple network namespaces receives an event for a
> device that is not in the initial network namespace it quite possibly
> can get confused and do the wrong thing.
>
> To avoid problems until all of the protocol stacks are converted
> this patch modifies all netdev event handlers to ignore events on
> devices that are not in the initial network namespace.
>
> As the rest of the code is made network namespace aware these
> checks can be removed.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH 4/6] [IPROUTE2]: Overhead calculation is now done in the kernel
From: Stephen Hemminger @ 2007-09-12 11:05 UTC (permalink / raw)
To: jdb; +Cc: netdev@vger.kernel.org, Patrick McHardy, David S. Miller
In-Reply-To: <1189592079.26927.25.camel@localhost.localdomain>
On Wed, 12 Sep 2007 12:14:39 +0200
Jesper Dangaard Brouer <jdb@comx.dk> wrote:
> commit 07a74a2613440fc1a68d0faa7235ed7027532d78
> Author: Jesper Dangaard Brouer <hawk@comx.dk>
> Date: Tue Sep 11 16:59:58 2007 +0200
>
> [IPROUTE2]: Overhead calculation is now done in the kernel.
>
> The only current user is HTB. HTB overhead argument is now passed on
> to the kernel (in the struct tc_ratespec). Also correct the data
> types.
>
> Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
How is this binary compatible with older kernels?
^ permalink raw reply
* Re: [PATCH 12/16] net: Support multiple network namespaces with netlink
From: David Miller @ 2007-09-12 11:06 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m1bqccvock.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:28:27 -0600
>
> Each netlink socket will live in exactly one network namespace,
> this includes the controlling kernel sockets.
>
> This patch updates all of the existing netlink protocols
> to only support the initial network namespace. Requests
> by clients in other namespaces will get -ECONNREFUSED,
> as they would if the kernel did not have the support for
> that netlink protocol compiled in.
>
> As each netlink protocol is updated to be multiple network
> namespace safe it can register multiple kernel sockets
> to acquire a presence in the rest of the network namespaces.
>
> The implementation in af_netlink is a simple filter implementation
> at hash table insertion and hash table look up time.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied to net-2.6.24, thanks.
^ permalink raw reply
* Re: [PATCH] NET : convert IP route cache garbage colleciton from softirq processing to a workqueue
From: David Miller @ 2007-09-12 11:10 UTC (permalink / raw)
To: hch; +Cc: dada1, herbert, netdev
In-Reply-To: <20070912100054.GA3649@infradead.org>
From: Christoph Hellwig <hch@infradead.org>
Date: Wed, 12 Sep 2007 11:00:54 +0100
> Can you please get rid of this useless struct? It just complicates
> the code and means we can't use the proper DEFINE_SPINLOCK initializer.
As long as the linker can move global variables around we need
to do things like this to keep objects together when that's
what we want.
So it's not really useless, but perhaps deserves a comment.
^ permalink raw reply
* Re: [PATCH] NET : convert IP route cache garbage colleciton from softirq processing to a workqueue
From: David Miller @ 2007-09-12 11:12 UTC (permalink / raw)
To: dada1; +Cc: herbert, netdev
In-Reply-To: <20070912120845.1b2d77dc.dada1@cosmosbay.com>
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Wed, 12 Sep 2007 12:08:45 +0200
> Unfortunately, there is no equivalent for this one.
> On my Opterons this gives a nice "prefetchnta"
>
> prefetch(addr) is more like __builtin_prefetch(addr, 0, 3)
>
> I would like to avoid zapping the L2 cache with useless data.
>
> __builtin_prefetch() is included from gcc 3.1 (2002), so every
> platform should support it, as linux-2.6 requires gcc 3.2 at least.
>
> I guess you are going to tell me to first publish a patch to lkml :)
Basically, yes :-) You won't be the only person to find this
useful.
^ permalink raw reply
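The builtin discussed above takes an address, a read/write flag, and a temporal-locality hint from 0 to 3. A minimal sketch for GCC or Clang, assuming the non-temporal variant in question is the locality-0 form; the macro name `prefetch_nta` and the look-ahead distance of 8 elements are hypothetical choices for illustration:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Read prefetch with no temporal locality. On x86, GCC typically
 * emits prefetchnta for locality 0. The hint never changes program
 * results, only cache behaviour.
 */
#define prefetch_nta(addr)	__builtin_prefetch((addr), 0, 0)

/* Sum an array that is read once, hinting a few elements ahead. */
static long sum_streaming(const long *v, size_t n)
{
	long total = 0;
	size_t i;

	assert(v != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (i + 8 < n)
			prefetch_nta(&v[i + 8]);
		total += v[i];
	}
	return total;
}
```

By contrast, `__builtin_prefetch(addr, 0, 3)` asks for the data to be kept in all cache levels, which is the behaviour the quoted text attributes to prefetch(addr).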
* [PATCH] [POWERPC] ucc_geth: fix module removal
From: Anton Vorontsov @ 2007-09-12 11:25 UTC (permalink / raw)
To: linuxppc-dev; +Cc: netdev
- uccf should be set to NULL to not double-free memory on
subsequent calls;
- ind_hash_q and group_hash_q lists should be initialized in the
probe() function, instead of struct_init() (called by open()),
otherwise there will be an oops if the ucc_geth driver is removed
before 'ifconfig ethX up';
- add unregister_netdev();
- reorder geth_remove() steps.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
---
drivers/net/ucc_geth.c | 17 ++++++++++-------
1 files changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ucc_geth.c b/drivers/net/ucc_geth.c
index 9a38dfe..bc2b3bf 100644
--- a/drivers/net/ucc_geth.c
+++ b/drivers/net/ucc_geth.c
@@ -2080,8 +2080,10 @@ static void ucc_geth_memclean(struct ucc_geth_private *ugeth)
if (!ugeth)
return;
- if (ugeth->uccf)
+ if (ugeth->uccf) {
ucc_fast_free(ugeth->uccf);
+ ugeth->uccf = NULL;
+ }
if (ugeth->p_thread_data_tx) {
qe_muram_free(ugeth->thread_dat_tx_offset);
@@ -2312,10 +2314,6 @@ static int ucc_struct_init(struct ucc_geth_private *ugeth)
ug_info = ugeth->ug_info;
uf_info = &ug_info->uf_info;
- /* Create CQs for hash tables */
- INIT_LIST_HEAD(&ugeth->group_hash_q);
- INIT_LIST_HEAD(&ugeth->ind_hash_q);
-
if (!((uf_info->bd_mem_part == MEM_PART_SYSTEM) ||
(uf_info->bd_mem_part == MEM_PART_MURAM))) {
if (netif_msg_probe(ugeth))
@@ -3949,6 +3947,10 @@ static int ucc_geth_probe(struct of_device* ofdev, const struct of_device_id *ma
ugeth = netdev_priv(dev);
spin_lock_init(&ugeth->lock);
+ /* Create CQs for hash tables */
+ INIT_LIST_HEAD(&ugeth->group_hash_q);
+ INIT_LIST_HEAD(&ugeth->ind_hash_q);
+
dev_set_drvdata(device, dev);
/* Set the dev->base_addr to the gfar reg region */
@@ -4002,9 +4004,10 @@ static int ucc_geth_remove(struct of_device* ofdev)
struct net_device *dev = dev_get_drvdata(device);
struct ucc_geth_private *ugeth = netdev_priv(dev);
- dev_set_drvdata(device, NULL);
- ucc_geth_memclean(ugeth);
+ unregister_netdev(dev);
free_netdev(dev);
+ ucc_geth_memclean(ugeth);
+ dev_set_drvdata(device, NULL);
return 0;
}
--
1.5.0.6
^ permalink raw reply related
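The first item of the fix above is the usual "clear the pointer after freeing it" pattern. A userspace sketch, assuming free() in place of ucc_fast_free(); `struct priv`, `memclean()`, and the `free_calls` counter are hypothetical names used only to make the effect observable:

```c
#include <assert.h>	/* used by callers that exercise the sketch */
#include <stdlib.h>

/* Hypothetical stand-in for struct ucc_geth_private. */
struct priv {
	void *uccf;
};

static int free_calls;	/* counts how often the resource was released */

/* Safe to call repeatedly: the second call sees NULL and does nothing. */
static void memclean(struct priv *p)
{
	if (!p)
		return;

	if (p->uccf) {
		free(p->uccf);
		free_calls++;
		p->uccf = NULL;
	}
}
```

Without the NULL assignment, a second call would pass the stale pointer to the free routine again.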
* [PATCH] phy: implement release function
From: Anton Vorontsov @ 2007-09-12 11:26 UTC (permalink / raw)
To: netdev; +Cc: linuxppc-dev
Lately I've got this nice badness on mdio bus removal:
Device 'e0103120:06' does not have a release() function, it is broken and must be fixed.
------------[ cut here ]------------
Badness at drivers/base/core.c:107
NIP: c015c1a8 LR: c015c1a8 CTR: c0157488
REGS: c34bdcf0 TRAP: 0700 Not tainted (2.6.23-rc5-g9ebadfbb-dirty)
MSR: 00029032 <EE,ME,IR,DR> CR: 24088422 XER: 00000000
...
[c34bdda0] [c015c1a8] device_release+0x78/0x80 (unreliable)
[c34bddb0] [c01354cc] kobject_cleanup+0x80/0xbc
[c34bddd0] [c01365f0] kref_put+0x54/0x6c
[c34bdde0] [c013543c] kobject_put+0x24/0x34
[c34bddf0] [c015c384] put_device+0x1c/0x2c
[c34bde00] [c0180e84] mdiobus_unregister+0x2c/0x58
...
Actually nothing is broken; the device subsystem core just
expects a different "pattern" of resource management.
This patch implements the phy device's release function, which
gets rid of this badness.
It also fixes a small hidden bug; hopefully no new ones are introduced. ;-)
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
---
drivers/net/phy/mdio_bus.c | 9 +++++----
drivers/net/phy/phy_device.c | 13 +++++++++++++
include/linux/phy.h | 1 +
3 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index fc2f0e6..c30196d 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -91,9 +91,12 @@ int mdiobus_register(struct mii_bus *bus)
err = device_register(&phydev->dev);
- if (err)
+ if (err) {
printk(KERN_ERR "phy %d failed to register\n",
i);
+ phy_device_free(phydev);
+ phydev = NULL;
+ }
}
bus->phy_map[i] = phydev;
@@ -110,10 +113,8 @@ void mdiobus_unregister(struct mii_bus *bus)
int i;
for (i = 0; i < PHY_MAX_ADDR; i++) {
- if (bus->phy_map[i]) {
+ if (bus->phy_map[i])
device_unregister(&bus->phy_map[i]->dev);
- kfree(bus->phy_map[i]);
- }
}
}
EXPORT_SYMBOL(mdiobus_unregister);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index e275df8..80c283c 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -44,6 +44,17 @@ static struct phy_driver genphy_driver;
extern int mdio_bus_init(void);
extern void mdio_bus_exit(void);
+void phy_device_free(struct phy_device *phydev)
+{
+ kfree(phydev);
+}
+EXPORT_SYMBOL(phy_device_free);
+
+static void phy_device_release(struct device *dev)
+{
+ phy_device_free(to_phy_device(dev));
+}
+
struct phy_device* phy_device_create(struct mii_bus *bus, int addr, int phy_id)
{
struct phy_device *dev;
@@ -54,6 +65,8 @@ struct phy_device* phy_device_create(struct mii_bus *bus, int addr, int phy_id)
if (NULL == dev)
return (struct phy_device*) PTR_ERR((void*)-ENOMEM);
+ dev->dev.release = phy_device_release;
+
dev->speed = 0;
dev->duplex = -1;
dev->pause = dev->asym_pause = 0;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 2a65978..9ec1363 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -398,6 +398,7 @@ int phy_mii_ioctl(struct phy_device *phydev,
int phy_start_interrupts(struct phy_device *phydev);
void phy_print_status(struct phy_device *phydev);
struct phy_device* phy_device_create(struct mii_bus *bus, int addr, int phy_id);
+void phy_device_free(struct phy_device *phydev);
extern struct bus_type mdio_bus_type;
#endif /* __PHY_H */
--
1.5.0.6
^ permalink raw reply related
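The ownership pattern the device core expects can be sketched in userspace: the object is freed by a release callback when the last reference is dropped, not by an explicit kfree() after unregistering. The following is a simplified model, not the driver core: `struct device` here has only a plain integer refcount, `put_device()` is a reduced stand-in for the kobject machinery, and the `released` counter exists only to observe the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Reduced model of a refcounted device with a release callback. */
struct device {
	int refcount;
	void (*release)(struct device *dev);
};

struct phy_device {
	int addr;
	struct device dev;	/* embedded, as in the real phy_device */
};

/* Recover the containing phy_device from its embedded struct device. */
#define to_phy_device(d) \
	((struct phy_device *)((char *)(d) - offsetof(struct phy_device, dev)))

static int released;	/* counts completed releases */

static void phy_device_free(struct phy_device *phydev)
{
	free(phydev);
	released++;
}

static void phy_device_release(struct device *dev)
{
	phy_device_free(to_phy_device(dev));
}

static struct phy_device *phy_device_create(int addr)
{
	struct phy_device *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->addr = addr;
	p->dev.refcount = 1;
	p->dev.release = phy_device_release;
	return p;
}

/* Drop one reference; the last drop invokes the release callback. */
static void put_device(struct device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0 && dev->release)
		dev->release(dev);
}
```

In this model an extra reference held elsewhere keeps the object alive past unregistration, which is the case an unconditional kfree() in mdiobus_unregister() would not handle.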
* Re: [PATCH] Fix e100 on systems that have cache incoherent DMA
From: James Chapman @ 2007-09-12 11:30 UTC (permalink / raw)
To: David Acker
Cc: Jeff Garzik, Kok, Auke, John Ronciak, Jesse Brandeburg,
Jeff Kirsher, Milton Miller, netdev, e1000-devel, Scott Feldman
In-Reply-To: <46E700A3.5010000@roinet.com>
David Acker wrote:
> Jeff Garzik wrote:
>> David Acker wrote:
>>> Let me know if there is any other information I can provide you. I
>>> will look through the code to see what could be going on with your
>>> machine. I will also look into reproducing these results with a
>>> newer kernel. This may be tricky since compulab's patches are pretty
>>> stale and don't always apply easily.
>>
>>
>> pktgen outputs for the various cases modified/unmodified[/others?]
>> would be nice, if you have a spot of time.
>>
>> Jeff
>
> I am not familiar with pktgen but I seem to have it working for a simple
> test.
> I edited the 1-1 example from
> ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/ .
> The results with and without the patch are below.
It looks like you ran pktgen on the embedded system, which exercised
only the transmit path. Auke indicated that the lockup was in the RU.
Have you run pktgen on a test system to fire packets at the embedded
system at max rate? Also test what happens when you fire packets in both
directions simultaneously.
--
James Chapman
Katalix Systems Ltd
http://www.katalix.com
Catalysts for your Embedded Linux software development
^ permalink raw reply
* [PATCH 2/3] sk98lin: ethtool perm_addr build fix
From: Stephen Hemminger @ 2007-09-12 11:30 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linux Netdev
In-Reply-To: <20070911132819.2cc5bbf4@oldman>
Deal with API changes made while sk98lin was removed.
ethtool_ops no longer has a perm_addr hook.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
---
drivers/net/sk98lin/skethtool.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/drivers/net/sk98lin/skethtool.c b/drivers/net/sk98lin/skethtool.c
index 3646069..5a6da89 100644
--- a/drivers/net/sk98lin/skethtool.c
+++ b/drivers/net/sk98lin/skethtool.c
@@ -616,7 +616,6 @@ const struct ethtool_ops SkGeEthtoolOps = {
.get_pauseparam = getPauseParams,
.set_pauseparam = setPauseParams,
.get_link = ethtool_op_get_link,
- .get_perm_addr = ethtool_op_get_perm_addr,
.get_sg = ethtool_op_get_sg,
.set_sg = setScatterGather,
.get_tx_csum = ethtool_op_get_tx_csum,
--
1.5.2.5
^ permalink raw reply related
* [PATCH 3/3]: sk98lin: neuter device to only SysKonnect boards
From: Stephen Hemminger @ 2007-09-12 11:30 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Linux Netdev
In-Reply-To: <20070911132423.62347d5c@oldman>
The skge driver works better for all boards except older SysKonnect
boards.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
---
drivers/net/sk98lin/skge.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/drivers/net/sk98lin/skge.c b/drivers/net/sk98lin/skge.c
index bf21862..7dc9c9e 100644
--- a/drivers/net/sk98lin/skge.c
+++ b/drivers/net/sk98lin/skge.c
@@ -5168,10 +5168,17 @@ err_out:
#endif
static struct pci_device_id skge_pci_tbl[] = {
+#ifdef SK98LIN_ALL_DEVICES
{ PCI_VENDOR_ID_3COM, 0x1700, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
{ PCI_VENDOR_ID_3COM, 0x80eb, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+#endif
+#ifdef GENESIS
+ /* Generic SysKonnect SK-98xx Gigabit Ethernet Server Adapter */
{ PCI_VENDOR_ID_SYSKONNECT, 0x4300, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+#endif
+ /* Generic SysKonnect SK-98xx V2.0 Gigabit Ethernet Adapter */
{ PCI_VENDOR_ID_SYSKONNECT, 0x4320, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+#ifdef SK98LIN_ALL_DEVICES
/* DLink card does not have valid VPD so this driver gags
* { PCI_VENDOR_ID_DLINK, 0x4c00, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
*/
@@ -5180,6 +5187,7 @@ static struct pci_device_id skge_pci_tbl[] = {
{ PCI_VENDOR_ID_CNET, 0x434e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
{ PCI_VENDOR_ID_LINKSYS, 0x1032, PCI_ANY_ID, 0x0015, },
{ PCI_VENDOR_ID_LINKSYS, 0x1064, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+#endif
{ 0 }
};
--
1.5.2.5
^ permalink raw reply related
* Re: [PATCH 13/16] net: Make the device list and device lookups per namespace.
From: David Miller @ 2007-09-12 11:39 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m17in0vo0d.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:35:46 -0600
>
> This patch makes most of the generic device layer network
> namespace safe. This patch makes dev_base_head a
> network namespace variable, and then it picks up
> a few associated variables. The functions:
> dev_getbyhwaddr
> dev_getfirsthwbytype
> dev_get_by_flags
> dev_get_by_name
> __dev_get_by_name
> dev_get_by_index
> __dev_get_by_index
> dev_ioctl
> dev_ethtool
> dev_load
> wireless_process_ioctl
>
> were modified to take a network namespace argument, and
> deal with it.
>
> vlan_ioctl_set and brioctl_set were modified so their
> hooks will receive a network namespace argument.
>
> So basically anything in the core of the network stack that was
> affected by the change of dev_base was modified to handle
> multiple network namespaces. The rest of the network stack was
> simply modified to explicitly use &init_net, the initial network
> namespace. This can be fixed when those components of the network
> stack are modified to handle multiple network namespaces.
>
> For now the ifindex generator is left global.
>
> Fundamentally ifindex numbers are per namespace, or else
> we will have corner case problems with migration when
> we get that far.
>
> At the same time there are assumptions in the network stack
> that the ifindex of a network device won't change. Making
> the ifindex number global seems a good compromise until
> the network stack can cope with ifindex changes when
> you change namespaces, and the like.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied to net-2.6.24, thanks.
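
The shape of the change described above can be pictured with a small
userspace toy model. This is only a sketch: struct toy_net, struct
toy_dev, toy_register and toy_get_by_name are hypothetical names
invented for illustration and are not the kernel's structures or
functions. It shows a lookup that takes the namespace as an explicit
argument, with the ifindex generator left global as the commit
message says.

```c
#include <stddef.h>
#include <string.h>

#define TOY_IFNAMSIZ 16
#define TOY_MAX_DEVS 8

/* Hypothetical stand-ins for a network device and a network namespace. */
struct toy_dev {
    char name[TOY_IFNAMSIZ];
    int ifindex;
};

struct toy_net {
    struct toy_dev devs[TOY_MAX_DEVS]; /* per-namespace device list */
    size_t ndevs;
};

/* The ifindex generator stays global, so an index is unique across namespaces. */
static int toy_next_ifindex = 1;

/* Register a device in one namespace; returns its ifindex, or -1 on failure. */
static int toy_register(struct toy_net *net, const char *name)
{
    struct toy_dev *d;

    if (net->ndevs == TOY_MAX_DEVS || strlen(name) >= TOY_IFNAMSIZ)
        return -1;
    d = &net->devs[net->ndevs++];
    strcpy(d->name, name);
    d->ifindex = toy_next_ifindex++;
    return d->ifindex;
}

/* The lookup takes the namespace explicitly and searches only that namespace. */
static struct toy_dev *toy_get_by_name(struct toy_net *net, const char *name)
{
    for (size_t i = 0; i < net->ndevs; i++)
        if (strcmp(net->devs[i].name, name) == 0)
            return &net->devs[i];
    return NULL;
}
```

In this model two namespaces can each hold an "eth0", and the two
devices still get different ifindex values because the counter is
shared.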
^ permalink raw reply
* Re: [PATCH 14/16] net: Factor out __dev_alloc_name from dev_alloc_name
From: David Miller @ 2007-09-12 11:49 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m13axovnyf.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:36:56 -0600
>
> When forcibly changing the network namespace of a device
> I need something that can generate a name for the device
> in the new namespace without overwriting the old name.
>
> __dev_alloc_name provides me that functionality.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied to net-2.6.24, thanks.
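
As a rough illustration of "generate a name without overwriting the
old name", here is a userspace sketch. toy_alloc_name is a
hypothetical helper, not the kernel's __dev_alloc_name. It takes a
plain prefix rather than a format pattern and caps the search at 100
units, both of which are simplifying assumptions. The point is that
the result lands in a caller-supplied buffer, so the device's
current name is left alone until the caller decides to commit.

```c
#include <stdio.h>
#include <string.h>

#define TOY_IFNAMSIZ 16

/*
 * Write the first unused "<prefix><unit>" name into buf.
 * `used` lists the names already taken in the target namespace.
 * Returns the chosen unit number, or -1 if none of 0..99 is free.
 */
static int toy_alloc_name(const char *const *used, size_t nused,
                          const char *prefix, char buf[TOY_IFNAMSIZ])
{
    for (int unit = 0; unit < 100; unit++) {
        char candidate[TOY_IFNAMSIZ];
        int taken = 0;

        snprintf(candidate, sizeof(candidate), "%s%d", prefix, unit);
        for (size_t i = 0; i < nused; i++)
            if (strcmp(used[i], candidate) == 0)
                taken = 1;
        if (!taken) {
            strcpy(buf, candidate); /* only the scratch buffer is written */
            return unit;
        }
    }
    return -1;
}
```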
^ permalink raw reply
* Re: [PATCH 15/16] net: Implement network device movement between namespaces
From: David Miller @ 2007-09-12 11:54 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m1y7fgu9ax.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:38:46 -0600
>
> This patch introduces NETIF_F_NETNS_LOCAL, a flag to indicate
> a network device is local to a single network namespace and
> should never be moved. Useful for pseudo devices that need
> an instance in each network namespace (like the loopback
> device) and for any device we find that cannot handle multiple
> network namespaces so we may trap them in the initial network
> namespace.
>
> This patch introduces the function dev_change_net_namespace,
> a function used to move a network device from one network
> namespace to another. To the network device nothing
> special appears to happen; to the components of the network
> stack it appears as if the network device was unregistered
> in the network namespace it is in, and a new device
> was registered in the network namespace the device
> was moved to.
>
> This patch sets up a namespace device destructor that
> upon the exit of a network namespace moves all of the
> movable network devices to the initial network namespace
> so they are not lost.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied to net-2.6.24
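
The three behaviours in the commit message (a flag that pins a device,
a move operation, and a destructor that rescues movable devices when a
namespace exits) can be sketched as a userspace toy model.
TOY_F_NETNS_LOCAL, toy_change_net and toy_net_exit are hypothetical
names for illustration only. The real dev_change_net_namespace does
much more, since it has to make the move look like an unregister and
a register to the rest of the stack.

```c
#include <stddef.h>

#define TOY_F_NETNS_LOCAL 0x1u /* hypothetical stand-in for NETIF_F_NETNS_LOCAL */

struct toy_net { int id; };

struct toy_dev {
    unsigned int features;
    struct toy_net *net; /* namespace the device currently lives in */
};

/* Move dev to `to`; refuse if the device is pinned to its namespace. */
static int toy_change_net(struct toy_dev *dev, struct toy_net *to)
{
    if (dev->features & TOY_F_NETNS_LOCAL)
        return -1;
    dev->net = to; /* the toy model skips the unregister/register steps */
    return 0;
}

/* On namespace exit, push every movable device back to the initial namespace. */
static size_t toy_net_exit(struct toy_net *dying, struct toy_net *init,
                           struct toy_dev *devs, size_t n)
{
    size_t moved = 0;

    for (size_t i = 0; i < n; i++)
        if (devs[i].net == dying && toy_change_net(&devs[i], init) == 0)
            moved++;
    return moved;
}
```

A device carrying the flag (a loopback-like device in this model)
stays where it is, while every other device in the dying namespace
ends up in the initial one.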
^ permalink raw reply
* Re: [PATCH 16/16] net: netlink support for moving devices between network namespaces.
From: David Miller @ 2007-09-12 11:57 UTC (permalink / raw)
To: ebiederm; +Cc: netdev, containers
In-Reply-To: <m1tzq4u92n.fsf_-_@ebiederm.dsl.xmission.com>
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:43:44 -0600
>
> The simplest thing to implement is moving network devices between
> namespaces. With the same attribute IFLA_NET_NS_PID we could
> easily implement creating devices in the destination network
> namespace as well. However, that is a little bit trickier, so this
> patch sticks to what is simple and easy.
>
> A pid is used to identify a process that happens to be a member
> of the network namespace we want to move the network device to.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Applied to net-2.6.24, thanks.
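
For readers unfamiliar with how such an attribute travels over
rtnetlink, here is a minimal userspace sketch that only builds the
request in memory and does not send it. struct setns_req and
build_move_request are hypothetical names. Using RTM_SETLINK as the
message type is an assumption, since the thread does not say which
message types accept IFLA_NET_NS_PID. The layout macros come from the
standard Linux uapi headers.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_link.h>

/* Hypothetical request layout: header, link info, room for one attribute. */
struct setns_req {
    struct nlmsghdr nh;
    struct ifinfomsg ifi;
    char attrs[64];
};

/* Fill req with "move device ifindex into the namespace of process pid". */
static void build_move_request(struct setns_req *req, int ifindex, __u32 pid)
{
    struct rtattr *rta;

    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req->nh.nlmsg_type = RTM_SETLINK; /* assumed message type */
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    req->ifi.ifi_family = AF_UNSPEC;
    req->ifi.ifi_index = ifindex;

    /* Append IFLA_NET_NS_PID; its payload is the pid of any process
     * that is a member of the destination namespace. */
    rta = (struct rtattr *)((char *)req + NLMSG_ALIGN(req->nh.nlmsg_len));
    rta->rta_type = IFLA_NET_NS_PID;
    rta->rta_len = RTA_LENGTH(sizeof(pid));
    memcpy(RTA_DATA(rta), &pid, sizeof(pid));
    req->nh.nlmsg_len = NLMSG_ALIGN(req->nh.nlmsg_len) + rta->rta_len;
}
```

Actually sending the request would need a NETLINK_ROUTE socket and
sufficient privileges, which this sketch leaves out on purpose.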
^ permalink raw reply