* Re: [PATCH] e100: timer power saving
From: Arjan van de Ven @ 2007-09-09 22:37 UTC (permalink / raw)
To: Auke Kok; +Cc: jeff, netdev, shemminger
In-Reply-To: <20070906185137.30733.99594.stgit@localhost.localdomain>
Auke Kok wrote:
> From: Stephen Hemminger <shemminger@linux-foundation.org>
>
> Since E100 timer is 2HZ, use rounding to make timer occur on the
> correct boundary.
>
> Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
given that I was about to send out exactly this same patch...
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
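
For readers who have not met the rounding trick: the idea is to move a periodic timer's expiry onto a whole-second boundary so that it fires together with other rounded timers and the CPU wakes up less often. In the kernel the helper for this is round_jiffies(). The function below is only a simplified userland stand-in with a hypothetical name; the real helper also applies a per-CPU skew and other checks that are not modelled here.

```c
#include <assert.h>

/* Hypothetical stand-in for the idea behind round_jiffies(): snap an
 * expiry time (in ticks) to the nearest multiple of hz, i.e. to a
 * whole-second boundary. This is not the kernel's implementation. */
unsigned long round_to_second(unsigned long expires, unsigned long hz)
{
	assert(hz > 0);
	unsigned long rem = expires % hz;

	/* Less than half a second past the boundary: round down. */
	if (rem * 2 < hz)
		return expires - rem;
	/* Otherwise round up to the next boundary. */
	return expires - rem + hz;
}
```

With hz = 1000, an expiry of 2300 ticks becomes 2000 and an expiry of 2500 becomes 3000, so two timers armed at slightly different moments end up on the same tick.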
^ permalink raw reply
* [IPv6] BUG: NULL pointer dereference in(?) ip6_flush_pending_frames
From: Bernhard Schmidt @ 2007-09-09 22:24 UTC (permalink / raw)
To: netdev
Hi,
I'm running a public Teredo relay (IPv4-to-IPv6 migration protocol)
using Miredo. Every once in a while (a few minutes to days after
daemon restart) it becomes unusable and I see the following kernel
message:
BUG: unable to handle kernel NULL pointer dereference at virtual address
0000008c
printing eip:
c02640e6
*pde = 00000000
Oops: 0000 [#17]
SMP
Modules linked in: ip6table_filter ip6_tables af_packet tun bitrev crc32
ipt_LOG xt_tcpudp iptable_filter iptable_mangle ip_tables x_tables
dm_mod capability commoncap iTCO_wdt floppy e1000 rtc unix
CPU: 0
EIP: 0060:[<c02640e6>] Not tainted VLI
EFLAGS: 00210246 (2.6.21.3-iabg-pe750 #1)
EIP is at ip6_flush_pending_frames+0x97/0x121
eax: 00000000 ebx: d3e3ca80 ecx: db590380 edx: d3e3caf0
esi: d3e3cc80 edi: db590380 ebp: 00000002 esp: d4af7cd4
ds: 007b es: 007b fs: 00d8 gs: 0033 ss: 0068
Process miredo (pid: 17615, ti=d4af6000 task=cfd60030 task.ti=d4af6000)
Stack: 000005d0 00000000 d4af7d44 d4af7d54 d4af7d54 00000000 db590380
c0275ab5
00000000 00000000 00000040 00000000 00000000 d4af7d48 df4c6780
00000040
d4af7f44 d3e3ca80 3a000000 00000000 0000001c 003a0000 00000000
00000000
Call Trace:
[<c0275ab5>] rawv6_sendmsg+0x840/0xa63
[<c0258a09>] inet_sendmsg+0x3b/0x45
[<c021df73>] sock_sendmsg+0xbc/0xd4
[<c0123f99>] autoremove_wake_function+0x0/0x35
[<e087c911>] tun_chr_aio_read+0x29e/0x2a8 [tun]
[<c011025a>] default_wake_function+0x0/0xc
[<c021e29c>] sys_sendto+0x118/0x138
[<c014d03c>] do_readv_writev+0x17d/0x187
[<e087c673>] tun_chr_aio_read+0x0/0x2a8 [tun]
[<c021ef2e>] sys_socketcall+0x15e/0x242
[<c0102560>] syscall_call+0x7/0xb
=======================
Code: 8d 43 70 8b 48 04 39 c1 74 31 85 c9 74 2d ff 48 08 8b 11 8b 41 04
c7 41 04 00 00 00 00 c7 01 00 00 00 00 89 42 04 89 10 8b 41 28 <8b> b8
8c 00 00 00 85 ff 0f 85 6b ff ff ff eb 94 83 a3 84 01 00
EIP: [<c02640e6>] ip6_flush_pending_frames+0x97/0x121 SS:ESP
0068:d4af7cd4
I have not found anything related on netdev; I'll try a newer kernel to be
sure. Do you need any more information to debug this issue?
Hardware is a Dell PowerEdge 750 (i386 P4 HT), vanilla kernel 2.6.21.3
running Debian testing.
Thanks,
Bernhard
^ permalink raw reply
* Re: ne driver crashes when unloaded in 2.6.22.6
From: Chris Rankin @ 2007-09-09 22:12 UTC (permalink / raw)
To: Dan Williams; +Cc: netdev
In-Reply-To: <1189374958.6221.4.camel@xo-3E-67-34.localdomain>
--- Dan Williams <dcbw@redhat.com> wrote:
> Offhand question, does your ne2000 card support carrier detection?
Err... there is a /sys/class/net/eth0/carrier entry (I think - not in front of that machine right
now). IIRC it said "1" when I read it.
Cheers,
Chris
^ permalink raw reply
* Re: ne driver crashes when unloaded in 2.6.22.6
From: Dan Williams @ 2007-09-09 21:55 UTC (permalink / raw)
To: Chris Rankin; +Cc: netdev
In-Reply-To: <311869.99402.qm@web52909.mail.re2.yahoo.com>
On Sun, 2007-09-09 at 20:46 +0100, Chris Rankin wrote:
> Hi,
>
> While trying to get my NE2000 ISA card working with NetworkManager and Linux 2.6.22.6, I
> discovered that the ne module will cause the kernel to oops when it is unloaded. The problem is
> that the module's clean-up function tries to unregister a platform driver unconditionally,
> although the platform driver may never have been registered in the first place. I have created a
> patch which seems to work OK. The idea behind the patch is to de-initialise the unregistered
> ne_driver structure to the point where the unregister function will abort without error:
>
> --- drivers/net/ne.c.orig 2007-07-21 17:00:52.000000000 +0100
> +++ drivers/net/ne.c 2007-09-09 20:04:28.000000000 +0100
> @@ -272,6 +272,7 @@
> pnp_device_detach(idev);
> return -ENXIO;
> }
> + SET_NETDEV_DEV(dev, &idev->dev);
> ei_status.priv = (unsigned long)idev;
> break;
> }
> @@ -827,6 +828,7 @@
> free_netdev(dev);
> return err;
> }
> + SET_NETDEV_DEV(dev, &pdev->dev);
> platform_set_drvdata(pdev, dev);
> return 0;
> }
> @@ -909,11 +911,14 @@
> is at boot) and so the probe will get confused by any other 8390 cards.
> ISA device autoprobes on a running machine are not recommended anyway. */
>
> int __init init_module(void)
> {
> int this_dev, found = 0;
> int plat_found = !ne_init();
>
> + if ( !plat_found )
> + ne_driver.driver.bus = NULL;
> +
> for (this_dev = 0; this_dev < MAX_NE_CARDS; this_dev++) {
> struct net_device *dev = alloc_ei_netdev();
> if (!dev)
>
> The patch also contains a few SET_NETDEV_DEV() statements, which I am hoping will help these
> network cards work with NetworkManager. This isn't a complete solution because it still doesn't
> address the non-PnP ISA case. However, I think it's a start... At the moment, my eth device
> appears in /sys/class/net but is also completely ignored by lshal.
Awesome. SET_NETDEV_DEV is required because it establishes the 'device'
link in sysfs that HAL uses for additional information about the card.
At some point all drivers should get audited for SET_NETDEV_DEV (or
rather lack of it) and get fixed if they don't have it.
Offhand question, does your ne2000 card support carrier detection?
Dan
^ permalink raw reply
* [PATCH resend] Fix a lock problem in generic phy code
From: Hans-Jürgen Koch @ 2007-09-09 21:36 UTC (permalink / raw)
To: linux-kernel; +Cc: netdev, Jeff Garzik, afleming
In-Reply-To: <200708311430.09590.hjk@linutronix.de>
I already sent this patch on August 31. I never got an answer, so here it
is again.
Lock debugging finds a problem in phy.c and phy_device.c:
[ 3.420000] =================================
[ 3.420000] [ INFO: inconsistent lock state ]
[ 3.420000] 2.6.23-rc3-mm1 #21
[ 3.420000] ---------------------------------
[ 3.420000] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
[ 3.420000] swapper/1 [HC0[0]:SC1[1]:HE1:SE0] takes:
[ 3.420000] (&dev->lock){-+..}, at: [<c017da1c>] phy_timer+0x1c/0x4c8
[ 3.420000] {softirq-on-W} state was registered at:
[ 3.420000] [<c006021c>] lock_acquire+0x94/0xac
[ 3.420000] [<c026423c>] _spin_lock+0x40/0x50
[ 3.420000] [<c017e2b4>] phy_probe+0x40/0x88
[ 3.420000] [<c017e60c>] phy_attach+0xc8/0xfc
[ 3.420000] [<c017e6c4>] phy_connect+0x1c/0x58
[ 3.420000] [<c017fc44>] macb_probe+0x4fc/0x5bc
[ 3.420000] [<c0177b44>] platform_drv_probe+0x20/0x24
[ 3.420000] [<c0176210>] driver_probe_device+0xb0/0x1bc
[ 3.420000] [<c01764dc>] __driver_attach+0xe4/0xe8
[ 3.420000] [<c017550c>] bus_for_each_dev+0x54/0x80
[ 3.420000] [<c0176074>] driver_attach+0x20/0x28
[ 3.420000] [<c01758d4>] bus_add_driver+0x84/0x1e0
[ 3.420000] [<c0176708>] driver_register+0x54/0x90
[ 3.420000] [<c0177de0>] platform_driver_register+0x6c/0x88
[ 3.420000] [<c00167ec>] macb_init+0x14/0x1c
[ 3.420000] [<c00087e4>] kernel_init+0x9c/0x2b4
[ 3.420000] [<c0040338>] do_exit+0x0/0x8e8
[ 3.420000] irq event stamp: 115025
[ 3.420000] hardirqs last enabled at (115025): [<c0264958>] _spin_unlock_irq+0x30/0x60
[ 3.420000] hardirqs last disabled at (115024): [<c02642c8>] _spin_lock_irq+0x28/0x60
[ 3.420000] softirqs last enabled at (114999): [<c0042910>] __do_softirq+0xfc/0x12c
[ 3.420000] softirqs last disabled at (115022): [<c0042ee8>] irq_exit+0x68/0x7c
[ 3.420000]
[ 3.420000] other info that might help us debug this:
[ 3.420000] no locks held by swapper/1.
[ 3.420000]
[ 3.420000] stack backtrace:
[ 3.420000] [<c0028af4>] (dump_stack+0x0/0x14) from [<c005d8e4>] (print_usage_bug+0x120/0x150)
[ 3.420000] [<c005d7c4>] (print_usage_bug+0x0/0x150) from [<c005e5e8>] (mark_lock+0x574/0x6bc)
[ 3.420000] [<c005e074>] (mark_lock+0x0/0x6bc) from [<c005f4c0>] (__lock_acquire+0x4d0/0x1198)
[ 3.420000] [<c005eff0>] (__lock_acquire+0x0/0x1198) from [<c006021c>] (lock_acquire+0x94/0xac)
[ 3.420000] [<c0060188>] (lock_acquire+0x0/0xac) from [<c026423c>] (_spin_lock+0x40/0x50)
[ 3.420000] [<c02641fc>] (_spin_lock+0x0/0x50) from [<c017da1c>] (phy_timer+0x1c/0x4c8)
[ 3.420000] r5:c098a800 r4:00000104
[ 3.420000] [<c017da00>] (phy_timer+0x0/0x4c8) from [<c0046ee4>] (run_timer_softirq+0x1a0/0x218)
[ 3.420000] r7:c081ddd0 r6:c0331e80 r5:c098aa50 r4:00000104
[ 3.420000] [<c0046d44>] (run_timer_softirq+0x0/0x218) from [<c00428a8>] (__do_softirq+0x94/0x12c)
[ 3.420000] [<c0042814>] (__do_softirq+0x0/0x12c) from [<c0042ee8>] (irq_exit+0x68/0x7c)
[ 3.420000] [<c0042e80>] (irq_exit+0x0/0x7c) from [<c0023060>] (__exception_text_start+0x60/0x74)
[ 3.420000] r4:00000001
[ 3.420000] [<c0023000>] (__exception_text_start+0x0/0x74) from [<c0023b28>] (__irq_svc+0x48/0x74)
[ 3.420000] Exception stack(0xc081de70 to 0xc081deb8)
[ 3.420000] de60: c030b2a8 c081a800 60000013 c081c000
[ 3.420000] de80: c032db40 c081dee0 c032d758 c081dece 00000012 c030b398 00000034 c081df2c
[ 3.420000] dea0: c081deb8 c081deb8 c003d6f0 c003d6f8 60000013 ffffffff
[ 3.420000] r7:00000003 r6:00000001 r5:fefff000 r4:ffffffff
[ 3.420000] [<c003d494>] (vprintk+0x0/0x474) from [<c003d930>] (printk+0x28/0x30)
[ 3.420000] [<c003d908>] (printk+0x0/0x30) from [<c01d3558>] (sock_register+0x64/0x84)
[ 3.420000] r3:00000001 r2:00000002 r1:00000001 r0:c02e4c88
[ 3.420000] [<c01d34f4>] (sock_register+0x0/0x84) from [<c001b7bc>] (af_unix_init+0x40/0x88)
[ 3.420000] r5:00000000 r4:00000000
[ 3.420000] [<c001b77c>] (af_unix_init+0x0/0x88) from [<c00087e4>] (kernel_init+0x9c/0x2b4)
[ 3.420000] r4:00000000
[ 3.420000] [<c0008748>] (kernel_init+0x0/0x2b4) from [<c0040338>] (do_exit+0x0/0x8e8)
The following patch fixes it. Tested on an AT91SAM9263-EK board, kernel 2.6.23-rc4 and -rc3-mm1.
Signed-off-by: Hans J. Koch <hjk@linutronix.de>
---
Index: linux-2.6.23-rc/drivers/net/phy/phy_device.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy_device.c 2007-08-31 14:07:47.000000000 +0200
+++ linux-2.6.23-rc/drivers/net/phy/phy_device.c 2007-08-31 14:08:22.000000000 +0200
@@ -644,7 +644,7 @@
if (!(phydrv->flags & PHY_HAS_INTERRUPT))
phydev->irq = PHY_POLL;
- spin_lock(&phydev->lock);
+ spin_lock_bh(&phydev->lock);
/* Start out supporting everything. Eventually,
* a controller will attach, and may modify one
@@ -658,7 +658,7 @@
if (phydev->drv->probe)
err = phydev->drv->probe(phydev);
- spin_unlock(&phydev->lock);
+ spin_unlock_bh(&phydev->lock);
return err;
Index: linux-2.6.23-rc/drivers/net/phy/phy.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy.c 2007-08-31 14:15:20.000000000 +0200
+++ linux-2.6.23-rc/drivers/net/phy/phy.c 2007-08-31 14:15:43.000000000 +0200
@@ -755,7 +755,7 @@
*/
void phy_start(struct phy_device *phydev)
{
- spin_lock(&phydev->lock);
+ spin_lock_bh(&phydev->lock);
switch (phydev->state) {
case PHY_STARTING:
@@ -769,7 +769,7 @@
default:
break;
}
- spin_unlock(&phydev->lock);
+ spin_unlock_bh(&phydev->lock);
}
EXPORT_SYMBOL(phy_stop);
EXPORT_SYMBOL(phy_start);
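
The lockdep report above says the same lock is taken once with softirqs enabled (in phy_probe) and once from softirq context (in phy_timer). The toy model below is a userland sketch, not kernel code, and every name in it is hypothetical. It only illustrates why that combination can self-deadlock on one CPU and why the _bh variants avoid it: with softirqs disabled, the timer handler is deferred until the lock has been dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU. All names here are hypothetical. */
struct sim_cpu {
	bool lock_held;     /* stands in for phydev->lock */
	int  bh_disabled;   /* softirq-disable nesting count */
	bool timer_pending; /* a timer softirq is waiting to run */
	bool deadlocked;    /* handler found the lock held on its own CPU */
	int  timer_runs;    /* times the timer handler completed */
};

/* Models phy_timer(): it needs the lock before doing any work. */
static void sim_timer_softirq(struct sim_cpu *c)
{
	if (c->lock_held) {
		c->deadlocked = true; /* a real spinlock would spin forever */
		return;
	}
	c->timer_pending = false;
	c->timer_runs++;
}

/* An interrupt arrives; on exit, softirqs run unless they are disabled. */
void sim_interrupt(struct sim_cpu *c)
{
	c->timer_pending = true;
	if (c->bh_disabled == 0)
		sim_timer_softirq(c);
}

void sim_spin_lock(struct sim_cpu *c)   { c->lock_held = true; }
void sim_spin_unlock(struct sim_cpu *c) { c->lock_held = false; }

void sim_spin_lock_bh(struct sim_cpu *c)
{
	c->bh_disabled++;
	c->lock_held = true;
}

void sim_spin_unlock_bh(struct sim_cpu *c)
{
	assert(c->bh_disabled > 0);
	c->lock_held = false;
	if (--c->bh_disabled == 0 && c->timer_pending)
		sim_timer_softirq(c); /* deferred softirq runs now */
}
```

Taking the lock with sim_spin_lock() and then calling sim_interrupt() sets the deadlocked flag; doing the same with sim_spin_lock_bh() leaves the handler pending until sim_spin_unlock_bh() runs it.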
^ permalink raw reply
* Question about NAT-T and PF_KEY...
From: Stjepan Gros @ 2007-09-09 20:30 UTC (permalink / raw)
To: netdev; +Cc: ikev2-devel
Hi all,
I'm having problems telling the kernel to do ESP-in-UDP encapsulation.
The outgoing direction seems to work, but incoming packets on the other
side are passed to the ikev2 daemon instead of being decapsulated by the
kernel.
The only strange thing I'm noticing for now is the difference between
setkey and ip command outputs. In the ip command output the following
line appears (complete output is at the end of this mail).
encap type espinudp sport 4500 dport 4500 addr 111.0.0.0
with a strange address, 111.0.0.0, whose purpose and origin I don't know.
I also don't know how to manipulate that address via PF_KEY!
Any help would be much appreciated! In case this is not detailed enough
to point to the problem, I can send more information.
Thanks,
Stjepan
# ip xfrm state sh
src 10.0.0.2 dst 192.168.0.2
proto esp spi 0x8e19037d reqid 0 mode tunnel
replay-window 0
auth sha1 0xf928fc8f76092e08238934d1caa1d78f8d144bd8
enc des3_ede 0xc8a8d5cd9ea831854c37e02f54e6916d79fb575834bc5854
encap type espinudp sport 4500 dport 4500 addr 111.0.0.0
src 192.168.0.2 dst 10.0.0.2
proto esp spi 0x41a5ebfc reqid 0 mode tunnel
replay-window 0
auth sha1 0xa7a5a366761812cfee2c5855fd95aef87c2e3411
enc des3_ede 0xbc045267fd15c78c57aeada27f0bdc970164e69751083b51
encap type espinudp sport 4500 dport 4500 addr 111.0.0.0
10.0.0.2[4500] 192.168.0.2[4500]
esp-udp mode=tunnel spi=2384003965(0x8e19037d)
reqid=0(0x00000000)
E: 3des-cbc c8a8d5cd 9ea83185 4c37e02f 54e6916d 79fb5758
34bc5854
A: hmac-sha1 f928fc8f 76092e08 238934d1 caa1d78f 8d144bd8
seq=0x00000000 replay=0 flags=0x00000000 state=mature
created: Sep 9 20:11:45 2007 current: Sep 9 20:12:11 2007
diff: 26(s) hard: 0(s) soft: 0(s)
last: Sep 9 20:11:45 2007 hard: 0(s) soft: 0(s)
current: 432(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 3 hard: 0 soft: 0
sadb_seq=1 pid=16076 refcnt=0
192.168.0.2[4500] 10.0.0.2[4500]
esp-udp mode=tunnel spi=1101392892(0x41a5ebfc)
reqid=0(0x00000000)
E: 3des-cbc bc045267 fd15c78c 57aeada2 7f0bdc97 0164e697
51083b51
A: hmac-sha1 a7a5a366 761812cf ee2c5855 fd95aef8 7c2e3411
seq=0x00000000 replay=0 flags=0x00000000 state=mature
created: Sep 9 20:11:45 2007 current: Sep 9 20:12:11 2007
diff: 26(s) hard: 0(s) soft: 0(s)
last: hard: 0(s) soft: 0(s)
current: 0(bytes) hard: 0(bytes) soft: 0(bytes)
allocated: 0 hard: 0 soft: 0
sadb_seq=0 pid=16076 refcnt=0
^ permalink raw reply
* Re: [-mm patch] unexport raise_softirq_irqoff
From: Christoph Hellwig @ 2007-09-09 20:41 UTC (permalink / raw)
To: Adrian Bunk; +Cc: Andrew Morton, davem, linux-kernel, netdev
In-Reply-To: <20070909202544.GV3563@stusta.de>
On Sun, Sep 09, 2007 at 10:25:44PM +0200, Adrian Bunk wrote:
> On Fri, Aug 31, 2007 at 09:58:22PM -0700, Andrew Morton wrote:
> >...
> > Changes since 2.6.23-rc3-mm1:
> >...
> > git-net.patch
> >...
> > git trees
> >...
>
> raise_softirq_irqoff no longer has any modular user.
>
> Signed-off-by: Adrian Bunk <bunk@kernel.org>
This should probably go in through Dave's tree as it's removing this
rather annoying user.
^ permalink raw reply
* [-mm patch] net/sctp/socket.c: make 3 variables static
From: Adrian Bunk @ 2007-09-09 20:25 UTC (permalink / raw)
To: Andrew Morton, vladislav.yasevich, sri, davem
Cc: linux-kernel, lksctp-developers, netdev
In-Reply-To: <20070831215822.26e1432b.akpm@linux-foundation.org>
On Fri, Aug 31, 2007 at 09:58:22PM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc3-mm1:
>...
> git-net.patch
>...
> git trees
>...
This patch makes the following needlessly global variables static:
- sctp_memory_pressure
- sctp_memory_allocated
- sctp_sockets_allocated
Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
3c211ad074038414ebc156b1abbc3df78dc60cb2
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 37e7306..f53545a 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -112,9 +112,9 @@ extern int sysctl_sctp_mem[3];
extern int sysctl_sctp_rmem[3];
extern int sysctl_sctp_wmem[3];
-int sctp_memory_pressure;
-atomic_t sctp_memory_allocated;
-atomic_t sctp_sockets_allocated;
+static int sctp_memory_pressure;
+static atomic_t sctp_memory_allocated;
+static atomic_t sctp_sockets_allocated;
static void sctp_enter_memory_pressure(void)
{
^ permalink raw reply related
* [-mm patch] make tcp_splice_data_recv() static
From: Adrian Bunk @ 2007-09-09 20:25 UTC (permalink / raw)
To: Andrew Morton, Jens Axboe; +Cc: linux-kernel, netdev
In-Reply-To: <20070831215822.26e1432b.akpm@linux-foundation.org>
On Fri, Aug 31, 2007 at 09:58:22PM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc3-mm1:
>...
> git-block.patch
>...
> git trees
>...
tcp_splice_data_recv() can become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
233aefd2a215430c16bd02eca06fb8a4b6079f7a
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 22576e4..6623796 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -515,8 +515,9 @@ static inline void tcp_push(struct sock *sk, int flags, int mss_now,
}
}
-int tcp_splice_data_recv(read_descriptor_t *rd_desc, struct sk_buff *skb,
- unsigned int offset, size_t len)
+static int tcp_splice_data_recv(read_descriptor_t *rd_desc,
+ struct sk_buff *skb,
+ unsigned int offset, size_t len)
{
struct tcp_splice_state *tss = rd_desc->arg.data;
^ permalink raw reply related
* [-mm patch] unexport raise_softirq_irqoff
From: Adrian Bunk @ 2007-09-09 20:25 UTC (permalink / raw)
To: Andrew Morton, davem; +Cc: linux-kernel, netdev
In-Reply-To: <20070831215822.26e1432b.akpm@linux-foundation.org>
On Fri, Aug 31, 2007 at 09:58:22PM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc3-mm1:
>...
> git-net.patch
>...
> git trees
>...
raise_softirq_irqoff no longer has any modular user.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
eff0407b63757cdd4164a0bdde0313e8f154b6dc
diff --git a/kernel/softirq.c b/kernel/softirq.c
index abae56c..ce38b56 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -335,8 +335,6 @@ inline fastcall void raise_softirq_irqoff(unsigned int nr)
wakeup_softirqd();
}
-EXPORT_SYMBOL(raise_softirq_irqoff);
-
void fastcall raise_softirq(unsigned int nr)
{
unsigned long flags;
^ permalink raw reply related
* [2.6 patch] make sctp_addto_param() static
From: Adrian Bunk @ 2007-09-09 20:25 UTC (permalink / raw)
To: Wei Yongjun, Vlad Yasevich, sri, davem
Cc: linux-kernel, lksctp-developers, netdev
sctp_addto_param() can become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
include/net/sctp/structs.h | 1
net/sctp/sm_make_chunk.c | 39 ++++++++++++++++++-------------------
2 files changed, 20 insertions(+), 20 deletions(-)
38f8064114b9e89a6a911b2e3625a41cdb477bcd
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index c0d5848..ee4559b 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -726,7 +726,6 @@ int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
struct iovec *data);
void sctp_chunk_free(struct sctp_chunk *);
void *sctp_addto_chunk(struct sctp_chunk *, int len, const void *data);
-void *sctp_addto_param(struct sctp_chunk *, int len, const void *data);
struct sctp_chunk *sctp_chunkify(struct sk_buff *,
const struct sctp_association *,
struct sock *);
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 79856c9..cf3b560 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -839,6 +839,26 @@ err_chunk:
return retval;
}
+/* Append bytes to the end of a parameter. Will panic if chunk is not big
+ * enough.
+ */
+static void *sctp_addto_param(struct sctp_chunk *chunk, int len,
+ const void *data)
+{
+ void *target;
+ int chunklen = ntohs(chunk->chunk_hdr->length);
+
+ target = skb_put(chunk->skb, len);
+
+ memcpy(target, data, len);
+
+ /* Adjust the chunk length field. */
+ chunk->chunk_hdr->length = htons(chunklen + len);
+ chunk->chunk_end = skb_tail_pointer(chunk->skb);
+
+ return target;
+}
+
/* Make an ABORT chunk with a PROTOCOL VIOLATION cause code. */
struct sctp_chunk *sctp_make_abort_violation(
const struct sctp_association *asoc,
@@ -1146,25 +1166,6 @@ void *sctp_addto_chunk(struct sctp_chunk *chunk, int len, const void *data)
return target;
}
-/* Append bytes to the end of a parameter. Will panic if chunk is not big
- * enough.
- */
-void *sctp_addto_param(struct sctp_chunk *chunk, int len, const void *data)
-{
- void *target;
- int chunklen = ntohs(chunk->chunk_hdr->length);
-
- target = skb_put(chunk->skb, len);
-
- memcpy(target, data, len);
-
- /* Adjust the chunk length field. */
- chunk->chunk_hdr->length = htons(chunklen + len);
- chunk->chunk_end = skb_tail_pointer(chunk->skb);
-
- return target;
-}
-
/* Append bytes from user space to the end of a chunk. Will panic if
* chunk is not big enough.
* Returns a kernel err value.
^ permalink raw reply related
* [-mm patch] really unexport do_softirq
From: Adrian Bunk @ 2007-09-09 20:25 UTC (permalink / raw)
To: Andrew Morton, Robert Olsson, David S. Miller; +Cc: linux-kernel, netdev
In-Reply-To: <20070831215822.26e1432b.akpm@linux-foundation.org>
On Fri, Aug 31, 2007 at 09:58:22PM -0700, Andrew Morton wrote:
>...
> Changes since 2.6.23-rc3-mm1:
>...
> git-net.patch
>...
> git trees
>...
This hydra had more than one head...
Signed-off-by: Adrian Bunk <bunk@kernel.org>
---
arch/i386/kernel/irq.c | 2 --
arch/powerpc/kernel/irq.c | 1 -
arch/s390/kernel/irq.c | 1 -
arch/sh/kernel/irq.c | 1 -
arch/x86_64/kernel/irq.c | 1 -
5 files changed, 6 deletions(-)
68791fe88172ac3c2dbb0fbbffb8befc7b59e3f7
diff --git a/arch/i386/kernel/irq.c b/arch/i386/kernel/irq.c
index a6b2c7e..de1601f 100644
--- a/arch/i386/kernel/irq.c
+++ b/arch/i386/kernel/irq.c
@@ -231,8 +231,6 @@ asmlinkage void do_softirq(void)
local_irq_restore(flags);
}
-
-EXPORT_SYMBOL(do_softirq);
#endif
/*
diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index dfad0e4..65c2409 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -395,7 +395,6 @@ void do_softirq(void)
local_irq_restore(flags);
}
-EXPORT_SYMBOL(do_softirq);
/*
diff --git a/arch/s390/kernel/irq.c b/arch/s390/kernel/irq.c
index 8f0cbca..c36d812 100644
--- a/arch/s390/kernel/irq.c
+++ b/arch/s390/kernel/irq.c
@@ -95,7 +95,6 @@ asmlinkage void do_softirq(void)
local_irq_restore(flags);
}
-EXPORT_SYMBOL(do_softirq);
void init_irq_proc(void)
{
diff --git a/arch/sh/kernel/irq.c b/arch/sh/kernel/irq.c
index 0340498..4b49d03 100644
--- a/arch/sh/kernel/irq.c
+++ b/arch/sh/kernel/irq.c
@@ -245,7 +245,6 @@ asmlinkage void do_softirq(void)
local_irq_restore(flags);
}
-EXPORT_SYMBOL(do_softirq);
#endif
void __init init_IRQ(void)
diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 87423b7..3542f0c 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -236,4 +236,3 @@ asmlinkage void do_softirq(void)
}
local_irq_restore(flags);
}
-EXPORT_SYMBOL(do_softirq);
^ permalink raw reply related
* ne driver crashes when unloaded in 2.6.22.6
From: Chris Rankin @ 2007-09-09 19:46 UTC (permalink / raw)
To: netdev
Hi,
While trying to get my NE2000 ISA card working with NetworkManager and Linux 2.6.22.6, I
discovered that the ne module will cause the kernel to oops when it is unloaded. The problem is
that the module's clean-up function tries to unregister a platform driver unconditionally,
although the platform driver may never have been registered in the first place. I have created a
patch which seems to work OK. The idea behind the patch is to de-initialise the unregistered
ne_driver structure to the point where the unregister function will abort without error:
--- drivers/net/ne.c.orig 2007-07-21 17:00:52.000000000 +0100
+++ drivers/net/ne.c 2007-09-09 20:04:28.000000000 +0100
@@ -272,6 +272,7 @@
pnp_device_detach(idev);
return -ENXIO;
}
+ SET_NETDEV_DEV(dev, &idev->dev);
ei_status.priv = (unsigned long)idev;
break;
}
@@ -827,6 +828,7 @@
free_netdev(dev);
return err;
}
+ SET_NETDEV_DEV(dev, &pdev->dev);
platform_set_drvdata(pdev, dev);
return 0;
}
@@ -909,11 +911,14 @@
is at boot) and so the probe will get confused by any other 8390 cards.
ISA device autoprobes on a running machine are not recommended anyway. */
int __init init_module(void)
{
int this_dev, found = 0;
int plat_found = !ne_init();
+ if ( !plat_found )
+ ne_driver.driver.bus = NULL;
+
for (this_dev = 0; this_dev < MAX_NE_CARDS; this_dev++) {
struct net_device *dev = alloc_ei_netdev();
if (!dev)
The patch also contains a few SET_NETDEV_DEV() statements, which I am hoping will help these
network cards work with NetworkManager. This isn't a complete solution because it still doesn't
address the non-PnP ISA case. However, I think it's a start... At the moment, my eth device
appears in /sys/class/net but is also completely ignored by lshal.
Cheers,
Chris
^ permalink raw reply
* [PATCH] iproute2: patches from Debian.
From: Andreas Henriksson @ 2007-09-09 19:10 UTC (permalink / raw)
To: shemminger; +Cc: netdev
[-- Attachment #1: Type: text/plain, Size: 1109 bytes --]
Hello Stephen Hemminger and the rest of the people on the netdev list!
I'm posting a bunch of patches for iproute2. A few I've written myself
and have a Signed-off-by with my name in them; the others I've picked up
from the iproute package in Debian. I tried my best to add a decent
description, references to the Debian bug number and, when possible, a From
header suggesting who the original author might be.
They should all apply with some offset and sometimes a bit of fuzz to the
current iproute2 git tree.
new patches posted to the debian bug tracking system:
add-mpath-docs.diff
doublefree.diff
fix-layer-syntax.diff
new-manpages.diff
tick2time-fix.diff
wrandom-fix.diff
fixes in the debian iproute package:
empty_linkname.diff
ip_address_flush_loop.diff
lartc.diff
netbug_fix.diff
okey-ikey.diff
remove_tc_filters_reference.diff
patches updating help text syntax:
ip_route_usage.diff
ip_rule_usage.diff
patches fixing documentation typos:
fix_ss_typo.diff
ip.8-typo.diff
ip_address.diff
libnetlink_typo.diff
manpages-typo.diff
tcb_htb_typo.diff
tc_cbq_details_typo.diff
--
Regards,
Andreas Henriksson
[-- Attachment #2: empty_linkname.diff --]
[-- Type: text/x-patch, Size: 572 bytes --]
Disallow empty link name.
From: Alexander Wirt <formorer@debian.org>
Patch from debian iproute package.
diff -urNad iproute-20060323~/ip/iplink.c iproute-20060323/ip/iplink.c
--- iproute-20060323~/ip/iplink.c 2006-03-22 00:57:50.000000000 +0100
+++ iproute-20060323/ip/iplink.c 2006-09-08 21:07:14.000000000 +0200
@@ -384,6 +384,10 @@
}
if (newname && strcmp(dev, newname)) {
+ if (strlen(newname) == 0) {
+ printf("\"\" is not a valid device identifier\n");
+ return -1;
+ }
if (do_changename(dev, newname) < 0)
return -1;
dev = newname;
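
The check in this patch can be read in isolation as a small validation helper. The sketch below uses a hypothetical function name and writes to stderr, which is an assumption of this example rather than something the patch does.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 0 if the new link name is usable,
 * -1 if it is the empty string. */
int check_link_name(const char *newname)
{
	assert(newname != NULL);
	if (newname[0] == '\0') {
		fprintf(stderr, "\"\" is not a valid device identifier\n");
		return -1;
	}
	return 0;
}
```

A caller would run this before attempting the rename and bail out on -1, as the patched iplink.c does with its inline test.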
[-- Attachment #3: ip_address_flush_loop.diff --]
[-- Type: text/x-patch, Size: 1153 bytes --]
Abort flush after 10 seconds.
From: Alexander Wirt <formorer@debian.org>
Patch from Debian iproute package.
diff -urNad iproute-20060323~/ip/ipaddress.c iproute-20060323/ip/ipaddress.c
--- iproute-20060323~/ip/ipaddress.c 2006-09-08 17:02:03.000000000 +0200
+++ iproute-20060323/ip/ipaddress.c 2006-09-08 17:03:01.000000000 +0200
@@ -15,6 +15,7 @@
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
+#include <time.h>
#include <syslog.h>
#include <fcntl.h>
#include <sys/ioctl.h>
@@ -589,6 +590,7 @@
if (flush) {
int round = 0;
char flushb[4096-512];
+ time_t start = time(0);
filter.flushb = flushb;
filter.flushp = 0;
@@ -617,6 +619,12 @@
printf("Warum?\n");
return 1;
+ if (time(0) - start > 10) {
+ printf("\n*** Flush not completed after %ld seconds, %d entries remain ***\n",
+ (long)(time(0) - start), filter.flushed);
+ exit(1);
+ }
+
if (show_stats) {
printf("\n*** Round %d, deleting %d addresses ***\n", round, filter.flushed);
fflush(stdout);
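
The time limit added by this patch boils down to one comparison per round. A minimal sketch, with a hypothetical function name, that takes the two timestamps as arguments so it can be exercised without waiting:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: returns 1 once more than limit_seconds have
 * elapsed between start and now, 0 otherwise. */
int flush_timed_out(time_t start, time_t now, long limit_seconds)
{
	assert(limit_seconds >= 0);
	/* difftime() avoids assuming a particular integer type for time_t. */
	return difftime(now, start) > (double)limit_seconds;
}
```

In a flush loop this would be called once per round as flush_timed_out(start, time(0), 10), giving up when it returns 1.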
[-- Attachment #4: lartc.diff --]
[-- Type: text/x-patch, Size: 755 bytes --]
Add references to lartc, also drop bogus reference to tc-filters
Patch from Debian iproute package.
diff -Nur old/man/man8/ip.8 new/man/man8/ip.8
--- old/man/man8/ip.8 2005-01-05 22:40:29.000000000 +0000
+++ new/man/man8/ip.8 2005-01-05 22:47:03.000000000 +0000
@@ -1803,6 +1803,8 @@
.RB "IP Command reference " ip-cref.ps
.br
.RB "IP tunnels " ip-cref.ps
+.br
+.RB http://lartc.org/
.SH AUTHOR
diff -Nur old/man/man8/tc.8 new/man/man8/tc.8
--- old/man/man8/tc.8 2004-10-19 20:49:02.000000000 +0000
+++ new/man/man8/tc.8 2005-01-05 22:46:15.000000000 +0000
@@ -341,7 +341,7 @@
.BR tc-pfifo (8),
.BR tc-bfifo (8),
.BR tc-pfifo_fast (8),
-.BR tc-filters (8)
+.BR http://lartc.org/
.SH AUTHOR
Manpage maintained by bert hubert (ahu@ds9a.nl)
[-- Attachment #5: netbug_fix.diff --]
[-- Type: text/x-patch, Size: 1490 bytes --]
Fix misc/netbug script.
See the following debian bugs:
http://bugs.debian.org/289541
http://bugs.debian.org/313540
http://bugs.debian.org/313541
http://bugs.debian.org/313544
http://bugs.debian.org/347699
Changes from Debian iproute package rediffed to apply against current
iproute2 git tree.
--- iproute2/misc/netbug 2007-09-09 17:36:19.000000000 +0200
+++ iproute-20070313/misc/netbug 2007-09-09 20:42:01.000000000 +0200
@@ -1,23 +1,16 @@
#! /bin/bash
+set -e
+
echo -n "Send network configuration summary to [ENTER means kuznet@ms2.inr.ac.ru] "
IFS="" read mail || exit 1
[ -z "$mail" ] && mail=kuznet@ms2.inr.ac.ru
+netbug=`mktemp -d -t netbug.XXXXXX` || (echo "$0: Cannot create temporary directory" >&2; exit 1; )
+netbugtar=`tempfile -d $netbug --suffix=tar.gz` || (echo "$0: Cannot create temporary file" >&2; exit 1; )
+tmppath=$netbug
+trap "/bin/rm -rf $netbug $netbugtar" 0 1 2 3 13 15
-netbug=""
-while [ "$netbug" = "" ]; do
- netbug=`echo netbug.$$.$RANDOM`
- if [ -e /tmp/$netbug ]; then
- netbug=""
- fi
-done
-
-tmppath=/tmp/$netbug
-
-trap "rm -rf $tmppath $tmppath.tar.gz" 0 SIGINT
-
-mkdir $tmppath
mkdir $tmppath/net
cat /proc/slabinfo > $tmppath/slabinfo
@@ -44,9 +37,8 @@
fi
cd /tmp
-tar c $netbug | gzip -9c > $netbug.tar.gz
-
-uuencode $netbug.tar.gz $netbug.tar.gz | mail -s $netbug "$mail"
+tar c $tmppath | gzip -9c > $netbugtar
+uuencode $netbugtar $netbugtar | mail -s $netbug "$mail"
echo "Sending to <$mail>; subject is $netbug"
[-- Attachment #6: okey-ikey.diff --]
[-- Type: text/x-patch, Size: 681 bytes --]
Fix typo in GRE tunnels (i_key vs. o_key).
From: <herbert@gondor.apana.org.au>
If a dotted quad ikey is specified for GRE tunnels, it gets set as the
okey instead. This patch fixes it.
See Debian bug #200714 - http://bugs.debian.org/200714
Patch from Debian iproute package.
--- iproute/ip/iptunnel.c.orig 2003-07-10 11:47:06.000000000 +1000
+++ iproute/ip/iptunnel.c 2003-07-10 11:47:11.000000000 +1000
@@ -221,7 +221,7 @@
NEXT_ARG();
p->i_flags |= GRE_KEY;
if (strchr(*argv, '.'))
- p->o_key = get_addr32(*argv);
+ p->i_key = get_addr32(*argv);
else {
if (get_unsigned(&uval, *argv, 0)<0) {
fprintf(stderr, "invalid value of \"ikey\"\n");
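
To make the one-character fix easier to see, here is a self-contained sketch of the parsing logic: a dotted-quad argument and a plain number must both end up in the input key, never in the output key. The struct and function names are hypothetical, and inet_pton() and strtoul() stand in for iproute2's own get_addr32() and get_unsigned() helpers.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the tunnel parameters; keys are kept in
 * network byte order. */
struct tunnel_keys {
	uint32_t i_key;
	uint32_t o_key;
};

/* Parse an "ikey" argument. Returns 0 on success, -1 on bad input.
 * Only i_key is ever written. */
int parse_ikey(struct tunnel_keys *p, const char *arg)
{
	assert(p != NULL && arg != NULL);

	if (strchr(arg, '.')) {
		struct in_addr a;

		if (inet_pton(AF_INET, arg, &a) != 1)
			return -1;
		p->i_key = a.s_addr; /* already network byte order */
		return 0;
	}

	char *end;
	unsigned long v = strtoul(arg, &end, 0);

	if (*arg == '\0' || *end != '\0' || v > 0xffffffffUL)
		return -1;
	p->i_key = htonl((uint32_t)v);
	return 0;
}
```

After parse_ikey(&k, "1.2.3.4") the o_key field is untouched, which is exactly the property the original code violated.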
[-- Attachment #7: remove_tc_filters_reference.diff --]
[-- Type: text/x-patch, Size: 1084 bytes --]
Remove bogus reference to tc-filters(8) from tc(8) manpage.
See debian bug #289225 - http://bugs.debian.org/289225
Patch from Debian iproute package.
diff -urNad iproute-20070313~/man/man8/tc.8 iproute-20070313/man/man8/tc.8
--- iproute-20070313~/man/man8/tc.8 2007-06-10 20:22:40.000000000 +0200
+++ iproute-20070313/man/man8/tc.8 2007-06-10 20:23:16.000000000 +0200
@@ -202,8 +202,7 @@
tc filters
If tc filters are attached to a class, they are consulted first
for relevant instructions. Filters can match on all fields of a packet header,
-as well as on the firewall mark applied by ipchains or iptables. See
-.BR tc-filters (8).
+as well as on the firewall mark applied by ipchains or iptables.
.TP
Type of Service
Some qdiscs have built in rules for classifying packets based on the TOS field.
@@ -242,8 +241,7 @@
.TP
FILTERS
Filters have a three part ID, which is only needed when using a hashed
-filter hierarchy, for which see
-.BR tc-filters (8).
+filter hierarchy.
.SH UNITS
All parameters accept a floating point number, possibly followed by a unit.
.P
[-- Attachment #8: add-mpath-docs.diff --]
[-- Type: text/x-patch, Size: 3607 bytes --]
Add mpath to ip manpage.
From: Norbert Buchmuller <norbi@nix.hu>
The 'mpath' parameter of 'ip route' is not documented in the manual page
nor in ip-cref.tex.
...huge part of the text in the patch was taken from the net/ipv4/Kconfig
file in the Linux kernel source (2.6.18). I suppose that's OK because both
Linux and iproute2 are GPL'd, but I let you know anyway.
See Debian bug #428442 - http://bugs.debian.org/428442
diff -Naur iproute-20061002/doc/ip-cref.tex iproute-20061002.fixed/doc/ip-cref.tex
--- iproute-20061002/doc/ip-cref.tex 2007-06-11 19:26:52.000000000 +0200
+++ iproute-20061002.fixed/doc/ip-cref.tex 2007-06-11 20:34:09.000000000 +0200
@@ -1348,6 +1348,28 @@
route reflecting its relative bandwidth or quality.
\end{itemize}
+\item \verb|mpath MP_ALGO|
+
+--- the multipath algo to use.
+\verb|MP_ALGO| may be one of the following values:
+\begin{itemize}
+\item \verb|rr| --- round robin algorithm.
+Multipath routes are chosen according to Round Robin.
+\item \verb|drr| --- interface round robin algorithm.
+Connections are distributed in a round robin fashion over the
+available interfaces. This policy makes sense if the connections
+should be primarily distributed on interfaces and not on routes.
+\item \verb|random| --- random algorithm.
+Multipath routes are chosen in a random fashion. Actually,
+there is no weight for a route. The advantage of this policy
+is that it is implemented stateless and therefore introduces only
+a very small delay.
+\item \verb|wrandom| --- weighted random algorithm.
+Multipath routes are chosen in a weighted random fashion.
+The per route weights are the weights visible via ip route 2. As the
+corresponding state management introduces some overhead routing delay
+is increased.
+\end{itemize}
\item \verb|scope SCOPE_VAL|
--- the scope of the destinations covered by the route prefix.
diff -Naur iproute-20061002/man/man8/ip.8 iproute-20061002.fixed/man/man8/ip.8
--- iproute-20061002/man/man8/ip.8 2007-06-11 19:26:52.000000000 +0200
+++ iproute-20061002.fixed/man/man8/ip.8 2007-06-11 20:26:43.000000000 +0200
@@ -146,7 +146,9 @@
.B scope
.IR SCOPE " ] [ "
.B metric
-.IR METRIC " ]"
+.IR METRIC " ] [ "
+.B mpath
+.IR MP_ALGO " ]"
.ti -8
.IR INFO_SPEC " := " "NH OPTIONS FLAGS" " ["
@@ -205,6 +207,10 @@
.BR onlink " | " pervasive " ]"
.ti -8
+.IR MP_ALGO " := [ "
+.BR rr " | " drr " | " random " | " wrandom " ]"
+
+.ti -8
.IR RTPROTO " := [ "
.BR kernel " | " boot " | " static " |"
.IR NUMBER " ]"
@@ -1116,6 +1122,38 @@
.in -8
.TP
+.BI mpath " MP_ALGO"
+the multipath algo to use.
+.I MP_ALGO
+may be one of the following values:
+
+.in +8
+.B rr
+- round robin algorithm.
+Multipath routes are chosen according to Round Robin.
+.sp
+.B drr
+- interface round robin algorithm.
+Connections are distributed in a round robin fashion over the
+available interfaces. This policy makes sense if the connections
+should be primarily distributed on interfaces and not on routes.
+.sp
+.B random
+- random algorithm.
+Multipath routes are chosen in a random fashion. Actually,
+there is no weight for a route. The advantage of this policy
+is that it is implemented stateless and therefore introduces only
+a very small delay.
+.sp
+.B wrandom
+- weighted random algorithm.
+Multipath routes are chosen in a weighted random fashion.
+The per route weights are the weights visible via ip route 2. As the
+corresponding state management introduces some overhead routing delay
+is increased.
+.in -8
+
+.TP
.BI scope " SCOPE_VAL"
the scope of the destinations covered by the route prefix.
.I SCOPE_VAL
[-- Attachment #9: doublefree.diff --]
[-- Type: text/x-patch, Size: 1675 bytes --]
Fix corruption when using batch files with comments and broken lines.
See Debian bug #398912 - http://bugs.debian.org/398912
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
diff -urip iproute-20061002/include/utils.h iproute-20061002.fixed2/include/utils.h
--- iproute-20061002/include/utils.h 2006-10-02 22:13:34.000000000 +0200
+++ iproute-20061002.fixed2/include/utils.h 2007-08-16 00:51:58.000000000 +0200
@@ -132,7 +132,7 @@ int print_timestamp(FILE *fp);
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
extern int cmdlineno;
-extern size_t getcmdline(char **line, size_t *len, FILE *in);
+extern ssize_t getcmdline(char **line, size_t *len, FILE *in);
extern int makeargs(char *line, char *argv[], int maxargs);
#endif /* __UTILS_H__ */
diff -urip iproute-20061002/lib/utils.c iproute-20061002.fixed2/lib/utils.c
--- iproute-20061002/lib/utils.c 2006-10-02 22:13:34.000000000 +0200
+++ iproute-20061002.fixed2/lib/utils.c 2007-08-16 00:49:02.000000000 +0200
@@ -578,9 +578,9 @@ int print_timestamp(FILE *fp)
int cmdlineno;
/* Like glibc getline but handle continuation lines and comments */
-size_t getcmdline(char **linep, size_t *lenp, FILE *in)
+ssize_t getcmdline(char **linep, size_t *lenp, FILE *in)
{
- size_t cc;
+ ssize_t cc;
char *cp;
if ((cc = getline(linep, lenp, in)) < 0)
@@ -608,9 +608,11 @@ size_t getcmdline(char **linep, size_t *
if (cp)
*cp = '\0';
- *linep = realloc(*linep, strlen(*linep) + strlen(line1) + 1);
+ *lenp = strlen(*linep) + strlen(line1) + 1;
+ *linep = realloc(*linep, *lenp);
if (!*linep) {
fprintf(stderr, "Out of memory\n");
+ *lenp = 0;
return -1;
}
cc += cc1 - 2;
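The second hunk above matters because getline() treats *lenp as the allocated size of *linep; if the buffer is resized behind its back, the next call can write past the allocation. A minimal sketch of the same idea, assuming a hypothetical helper append_line() that is not part of iproute2. Unlike the patch, it leaves the caller's buffer untouched when realloc() fails.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from iproute2): append `tail` to the heap
 * buffer *linep and keep *lenp equal to the allocated size, which is
 * the invariant getline() depends on for its next call.
 */
static int append_line(char **linep, size_t *lenp, const char *tail)
{
	size_t need = strlen(*linep) + strlen(tail) + 1;
	char *grown = realloc(*linep, need);

	if (!grown)
		return -1;	/* *linep and *lenp are still valid here */

	strcat(grown, tail);	/* room for both strings plus the NUL */
	*linep = grown;
	*lenp = need;		/* record the new allocation size */
	return 0;
}
```

The size_t to ssize_t change in the first hunk is the other half of the fix: with an unsigned cc, the "< 0" test on the getline() result can never be true.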
[-- Attachment #10: fix-layer-syntax.diff --]
[-- Type: text/x-patch, Size: 2089 bytes --]
Fix ematch cmp and nbyte syntax help text.
From: Lionel Elie Mamane <lionel@mamane.lu>
The help/usage screens of ematch cmp and nbyte say the recognised symbolic
values for "layer FOO" are link, header and next-header, but the code
does _not_ implement that: it will recognise "next-header" as what is
supposed to be "header" and will not recognise "header". The right
symbolic values seem to be link, network, transport. Here is a patch
that changes the help/usage screen to match the code.
See Debian bug #438653 - http://bugs.debian.org/438653
diff --recursive -N -u iproute-20070313.deb/tc/em_cmp.c iproute-20070313.lio/tc/em_cmp.c
--- iproute-20070313.deb/tc/em_cmp.c 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313.lio/tc/em_cmp.c 2007-08-18 18:44:26.773897785 +0200
@@ -1,5 +1,5 @@
/*
- * em_cmp.c Simle coparison Ematch
+ * em_cmp.c Simple comparison Ematch
*
* This program is free software; you can distribute it and/or
* modify it under the terms of the GNU General Public License
@@ -32,7 +32,7 @@
"Usage: cmp(ALIGN at OFFSET [ ATTRS ] { eq | lt | gt } VALUE)\n" \
"where: ALIGN := { u8 | u16 | u32 }\n" \
" ATTRS := [ layer LAYER ] [ mask MASK ] [ trans ]\n" \
- " LAYER := { link | header | next-header | 0..%d }\n" \
+ " LAYER := { link | network | transport | 0..%d }\n" \
"\n" \
"Example: cmp(u16 at 3 layer 2 mask 0xff00 gt 20)\n",
TCF_LAYER_MAX);
diff --recursive -N -u iproute-20070313.deb/tc/em_nbyte.c iproute-20070313.lio/tc/em_nbyte.c
--- iproute-20070313.deb/tc/em_nbyte.c 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313.lio/tc/em_nbyte.c 2007-08-18 18:45:48.159607231 +0200
@@ -32,7 +32,7 @@
"Usage: nbyte(NEEDLE at OFFSET [layer LAYER])\n" \
"where: NEEDLE := { string | \"c-escape-sequence\" }\n" \
" OFFSET := int\n" \
- " LAYER := { link | header | next-header | 0..%d }\n" \
+ " LAYER := { link | network | transport | 0..%d }\n" \
"\n" \
"Example: nbyte(\"ababa\" at 12 layer 1)\n",
TCF_LAYER_MAX);
[-- Attachment #11: new-manpages.diff --]
[-- Type: text/x-patch, Size: 6032 bytes --]
Add new manpages and symlinks.
Symlink rtstat(8) and ctstat(8) to lnstat(8).
Add rtacct/nstat manpage based on doc/nstat.sgml as rtacct(8).
Symlink nstat(8) to rtacct(8).
Add arpd(8) manpage based on doc/arpd.sgml.
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
diff --git a/Makefile b/Makefile
index af0d5e4..6c976dd 100644
--- a/Makefile
+++ b/Makefile
@@ -53,6 +53,9 @@ install: all
install -m 0755 -d $(DESTDIR)$(MANDIR)/man8
install -m 0644 $(shell find man/man8 -maxdepth 1 -type f) $(DESTDIR)$(MANDIR)/man8
ln -sf tc-bfifo.8 $(DESTDIR)$(MANDIR)/man8/tc-pfifo.8
+ ln -sf lnstat.8 $(DESTDIR)$(MANDIR)/man8/rtstat.8
+ ln -sf lnstat.8 $(DESTDIR)$(MANDIR)/man8/ctstat.8
+ ln -sf rtacct.8 $(DESTDIR)$(MANDIR)/man8/nstat.8
install -m 0755 -d $(DESTDIR)$(MANDIR)/man3
install -m 0644 $(shell find man/man3 -maxdepth 1 -type f) $(DESTDIR)$(MANDIR)/man3
diff --git a/man/man8/arpd.8 b/man/man8/arpd.8
new file mode 100644
index 0000000..d172600
--- /dev/null
+++ b/man/man8/arpd.8
@@ -0,0 +1,66 @@
+.TH ARPD 8 "28 June, 2007"
+
+.SH NAME
+arpd \- userspace arp daemon.
+
+.SH SYNOPSIS
+Usage: arpd [ -lk ] [ -a N ] [ -b dbase ] [ -f file ] [ interfaces ]
+
+.SH DESCRIPTION
+The
+.B arpd
+daemon collects gratuitous ARP information, saving it on local disk and feeding it to kernel on demand to avoid redundant broadcasting due to limited size of kernel ARP cache.
+
+.SH OPTIONS
+.TP
+-h -?
+Print help
+.TP
+-l
+Dump arpd database to stdout and exit. Output consists of three columns: interface index, IP address and MAC address. Negative entries for dead hosts are also shown, in this case MAC address is replaced by word FAILED followed by colon and time when the fact that host is dead was proven the last time.
+.TP
+-f <FILE>
+Read and load arpd database from FILE in text format similar to that dumped by option -l. Exit after load, probably listing resulting database, if option -l is also given. If FILE is -, stdin is read to get ARP table.
+.TP
+-b <DATABASE>
+location of database file. Default location is /var/lib/arpd/arpd.db
+.TP
+-a <NUMBER>
+arpd not only passively listens for ARP on the wire, but also sends broadcast queries itself. NUMBER is the number of such queries to make before the destination is considered dead. When arpd is started as a kernel helper (i.e. with app_solicit enabled in sysctl or even with option -k) without this option and still has not learned enough information, you can observe 1 second gaps in service. Not fatal, but not good.
+.TP
+-k
+Suppress sending of broadcast queries by the kernel. It makes sense together with option -a.
+.TP
+-n <TIME>
+Timeout of negative cache. When resolution fails, arpd suppresses further attempts to resolve for this period. It makes sense only together with option -k. This timeout should not be too much longer than the boot time of a typical host not supporting gratuitous ARP. Default value is 60 seconds.
+.TP
+-R <RATE>
+Maximal steady rate of broadcasts sent by arpd in packets per second. Default value is 1.
+.TP
+-B <NUMBER>
+Number of broadcasts sent by arpd back to back. Default value is 3. Together with option -R this option allows policing broadcasting so that it does not exceed B+R*T over any interval of time T.
+.P
+<INTERFACE> is the name of networking interface to watch. If no interfaces given, arpd monitors all the interfaces. In this case arpd does not adjust sysctl parameters, it is supposed user does this himself after arpd is started.
+.P
+Signals
+.br
+arpd exits gracefully, syncing the database and restoring adjusted sysctl parameters, when it receives SIGINT or SIGTERM. SIGHUP syncs the database to disk. SIGUSR1 sends some statistics to syslog. The effect of other signals is undefined; they may corrupt the database and leave sysctl parameters in an unpredictable state.
+.P
+Note
+.br
+In order for arpd to be able to serve as ARP resolver, the kernel must be compiled with the option CONFIG_ARPD and, in the case when the interface list is not given on the command line, variable app_solicit on interfaces of interest should be set in /proc/sys/net/ipv4/neigh/*. If this is not done, arpd still collects gratuitous ARP information in its database.
+.SH EXAMPLES
+.TP
+arpd -b /var/tmp/arpd.db
+Start arpd to collect gratuitous ARP, but not messing with kernel functionality.
+.TP
+killall arpd ; arpd -l -b /var/tmp/arpd.db
+Look at result after some time.
+.TP
+arpd -b /var/tmp/arpd.db -a 1 eth0 eth1
+Enable kernel helper, leaving leading role to kernel.
+.TP
+arpd -b /var/tmp/arpd.db -a 3 -k eth0 eth1
+Completely replace kernel resolution on interfaces eth0 and eth1. In this case kernel still does unicast probing to validate entries, but all the broadcast activity is suppressed and made under authority of arpd.
+.PP
+This is the mode in which arpd is supposed to work normally. It is not the default, just to prevent accidental enabling of a too aggressive mode.
diff --git a/man/man8/rtacct.8 b/man/man8/rtacct.8
new file mode 100644
index 0000000..fb9afe8
--- /dev/null
+++ b/man/man8/rtacct.8
@@ -0,0 +1,48 @@
+.TH RTACCT 8 "27 June, 2007"
+
+.SH NAME
+nstat, rtacct - network statistics tools.
+
+.SH SYNOPSIS
+Usage: nstat [ -h?vVzrnasd:t: ] [ PATTERN [ PATTERN ] ]
+.br
+Usage: rtacct [ -h?vVzrnasd:t: ] [ ListOfRealms ]
+
+.SH DESCRIPTION
+.B nstat
+and
+.B rtacct
+are simple tools to monitor kernel snmp counters and network interface statistics.
+
+.SH OPTIONS
+.TP
+-h -?
+Print help
+.TP
+-v -V
+Print version
+.TP
+-z
+Dump zero counters too. By default they are not shown.
+.TP
+-r
+Reset history.
+.TP
+-n
+Do not display anything, only update history.
+.TP
+-a
+Dump absolute values of counters. The default is to calculate increments since the previous use.
+.TP
+-s
+Do not update history, so that the next time you will see counters including values accumulated to the moment of this measurement too.
+.TP
+-d <INTERVAL>
+Run in daemon mode collecting statistics. <INTERVAL> is interval between measurements in seconds.
+.TP
+-t <INTERVAL>
+Time interval to average rates. Default value is 60 seconds.
+
+.SH SEE ALSO
+lnstat(8)
+
[-- Attachment #12: tick2time-fix.diff --]
[-- Type: text/x-patch, Size: 1318 bytes --]
Fix overflow in time2tick / tick2time.
The helper functions get passed an unsigned int, which gets cast to long
and overflows.
See Debian bug #175462 - http://bugs.debian.org/175462
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
diff -uri iproute-20070313.orig/tc/tc_core.c iproute-20070313/tc/tc_core.c
--- iproute-20070313.orig/tc/tc_core.c 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313/tc/tc_core.c 2007-08-15 00:41:30.000000000 +0200
@@ -35,12 +35,12 @@
}
-long tc_core_time2tick(long time)
+unsigned tc_core_time2tick(unsigned time)
{
return time*tick_in_usec;
}
-long tc_core_tick2time(long tick)
+unsigned tc_core_tick2time(unsigned tick)
{
return tick/tick_in_usec;
}
diff -uri iproute-20070313.orig/tc/tc_core.h iproute-20070313/tc/tc_core.h
--- iproute-20070313.orig/tc/tc_core.h 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313/tc/tc_core.h 2007-08-15 00:41:49.000000000 +0200
@@ -7,8 +7,8 @@
#define TIME_UNITS_PER_SEC 1000000000
int tc_core_time2big(long time);
-long tc_core_time2tick(long time);
-long tc_core_tick2time(long tick);
+unsigned tc_core_time2tick(unsigned time);
+unsigned tc_core_tick2time(unsigned tick);
long tc_core_time2ktime(long time);
long tc_core_ktime2time(long ktime);
unsigned tc_calc_xmittime(unsigned rate, unsigned size);
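The overflow described above can be reproduced in isolation. A minimal sketch, assuming a 32-bit long (modelled here with int32_t) and a tick_in_usec fixed at 1 purely for illustration; the function names are hypothetical stand-ins for the old and new signatures, not the tc_core.c code itself.

```c
#include <stdint.h>

/* Illustration only: a fixed conversion factor of 1 tick per usec. */
static const uint32_t tick_in_usec = 1;

/* Old shape: parameter and result are signed 32-bit, like a 32-bit long. */
static int32_t old_style_time2tick(int32_t time)
{
	return time * (int32_t)tick_in_usec;
}

/* New shape: unsigned throughout, so large inputs keep their value. */
static uint32_t new_style_time2tick(uint32_t time)
{
	return time * tick_in_usec;
}
```

A caller holding an unsigned value above INT32_MAX, say 3000000000, gets it back unchanged from the unsigned form. Handing the same value to the signed form first converts it to int32_t, which is implementation-defined and yields a negative number on the usual two's complement compilers.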
[-- Attachment #13: wrandom-fix.diff --]
[-- Type: text/x-patch, Size: 1082 bytes --]
Fix off-by-one in print of wrandom algo.
From: Norbert Buchmuller <norbi@nix.hu>
The 'wrandom' multipath algo is recognised when adding the route, but
not resolved when it is printed (prints 'unknown'):
ianus:~# ip ro add 1.2.3.4 mpath wrandom nexthop dev ppp0 weight 1 nexthop dev ppp1 weight 2
ianus:~# ip ro get to 1.2.3.4
1.2.3.4 mpath unknown dev ppp0 src 62.77.192.67
cache mtu 1492 advmss 1452 hoplimit 64
ianus:~# ip ro del 1.2.3.4 mpath wrandom nexthop dev ppp0 weight 1 nexthop dev ppp1 weight 2
See Debian bug #428440 - http://bugs.debian.org/428440
diff -Naur iproute-20061002/ip/iproute.c iproute-20061002.fixed/ip/iproute.c
--- iproute-20061002/ip/iproute.c 2007-06-11 19:26:52.000000000 +0200
+++ iproute-20061002.fixed/ip/iproute.c 2007-06-11 19:27:29.000000000 +0200
@@ -358,7 +358,7 @@
__u32 mp_alg = *(__u32*) RTA_DATA(tb[RTA_MP_ALGO]);
if (mp_alg > IP_MP_ALG_NONE) {
fprintf(fp, "mpath %s ",
- mp_alg < IP_MP_ALG_MAX ? mp_alg_names[mp_alg] : "unknown");
+ mp_alg <= IP_MP_ALG_MAX ? mp_alg_names[mp_alg] : "unknown");
}
}
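The one-character fix only makes sense if IP_MP_ALG_MAX names the last valid index rather than the number of entries; that is an assumption here, and the enum and table below are hypothetical stand-ins written to match it. A minimal sketch of the bounds check:

```c
#include <stddef.h>

/* Hypothetical stand-ins; MP_ALG_MAX is the last valid index. */
enum mp_alg {
	MP_ALG_NONE,
	MP_ALG_RR,
	MP_ALG_DRR,
	MP_ALG_RANDOM,
	MP_ALG_WRANDOM,
	MP_ALG_COUNT
};
#define MP_ALG_MAX (MP_ALG_COUNT - 1)

static const char *const mp_alg_names[MP_ALG_MAX + 1] = {
	[MP_ALG_NONE]    = "none",
	[MP_ALG_RR]      = "rr",
	[MP_ALG_DRR]     = "drr",
	[MP_ALG_RANDOM]  = "random",
	[MP_ALG_WRANDOM] = "wrandom",
};

/* "<=" keeps the last entry reachable; "<" would report it as unknown. */
static const char *mp_alg_name(unsigned int alg)
{
	return alg <= MP_ALG_MAX ? mp_alg_names[alg] : "unknown";
}
```

With a strict "<" the lookup for MP_ALG_WRANDOM falls through to "unknown", which matches the symptom shown in the report.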
[-- Attachment #14: ip_route_usage.diff --]
[-- Type: text/x-patch, Size: 1032 bytes --]
Add src syntax to route help text.
From: Alexander Wirt <formorer@debian.org>
Add src option to ip_route usage.
See Debian bug #226142 - http://bugs.debian.org/226142
diff -urNad iproute-20061002~/ip/iproute.c iproute-20061002/ip/iproute.c
--- iproute-20061002~/ip/iproute.c 2006-10-15 16:44:25.000000000 +0200
+++ iproute-20061002/ip/iproute.c 2006-10-15 16:50:29.000000000 +0200
@@ -73,7 +73,7 @@
fprintf(stderr, "OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ]\n");
fprintf(stderr, " [ rtt NUMBER ] [ rttvar NUMBER ]\n");
fprintf(stderr, " [ window NUMBER] [ cwnd NUMBER ] [ initcwnd NUMBER ]\n");
- fprintf(stderr, " [ ssthresh NUMBER ] [ realms REALM ]\n");
+ fprintf(stderr, " [ ssthresh NUMBER ] [ realms REALM ] [ src ADDRESS ]\n");
fprintf(stderr, "TYPE := [ unicast | local | broadcast | multicast | throw |\n");
fprintf(stderr, " unreachable | prohibit | blackhole | nat ]\n");
fprintf(stderr, "TABLE_ID := [ local | main | default | all | NUMBER ]\n");
[-- Attachment #15: ip_rule_usage.diff --]
[-- Type: text/x-patch, Size: 935 bytes --]
Add syntax for rule prio help text.
From: Alexander Wirt <formorer@debian.org>
Add [ prio NUMBER ] to ip_rule.c
See Debian bug #213673 - http://bugs.debian.org/213673
diff -urNad iproute-20070313~/ip/iprule.c iproute-20070313/ip/iprule.c
--- iproute-20070313~/ip/iprule.c 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313/ip/iprule.c 2007-06-10 17:54:46.000000000 +0200
@@ -38,7 +38,7 @@
{
fprintf(stderr, "Usage: ip rule [ list | add | del | flush ] SELECTOR ACTION\n");
fprintf(stderr, "SELECTOR := [ not ] [ from PREFIX ] [ to PREFIX ] [ tos TOS ] [ fwmark FWMARK[/MASK] ]\n");
- fprintf(stderr, " [ dev STRING ] [ pref NUMBER ]\n");
+ fprintf(stderr, " [ dev STRING ] [ pref NUMBER ] [ prio NUMBER ]\n");
fprintf(stderr, "ACTION := [ table TABLE_ID ]\n");
fprintf(stderr, " [ prohibit | reject | unreachable ]\n");
fprintf(stderr, " [ realms [SRCREALM/]DSTREALM ]\n");
[-- Attachment #16: fix_ss_typo.diff --]
[-- Type: text/x-patch, Size: 603 bytes --]
Fix typo in ss manpage.
Patch from Debian iproute package.
diff -urNad iproute-20070313~/man/man8/ss.8 iproute-20070313/man/man8/ss.8
--- iproute-20070313~/man/man8/ss.8 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313/man/man8/ss.8 2007-06-10 19:36:04.000000000 +0200
@@ -9,7 +9,7 @@
is used to dump socket statistics. It allows showing information similar
to
.IR netstat .
-It can display more TCP information than state than other tools.
+It can display more TCP and state information than other tools.
.SH OPTIONS
These programs follow the usual GNU command line syntax, with long
[-- Attachment #17: ip.8-typo.diff --]
[-- Type: text/x-patch, Size: 430 bytes --]
Make the backslash visible in ip manpage.
See Debian bug #285507 - http://bugs.debian.org/285507
--- orig/man/man8/ip.8 2004-10-19 20:49:02.000000000 +0000
+++ new/man/man8/ip.8 2005-01-05 22:04:12.000000000 +0000
@@ -374,7 +374,7 @@
.BR "\-o" , " \-oneline"
output each record on a single line, replacing line feeds
with the
-.B '\'
+.B '\e\'
character. This is convenient when you want to count records
with
.BR wc (1)
[-- Attachment #18: ip_address.diff --]
[-- Type: text/x-patch, Size: 570 bytes --]
Strict syntax for advice in error message.
diff -ruN iproute-20051007.orig/ip/ipaddress.c iproute-20051007/ip/ipaddress.c
--- iproute-20051007.orig/ip/ipaddress.c 2005-09-21 21:33:18.000000000 +0200
+++ iproute-20051007/ip/ipaddress.c 2006-03-14 07:26:26.830934712 +0100
@@ -901,7 +901,7 @@
return ipaddr_list_or_flush(argc-1, argv+1, 1);
if (matches(*argv, "help") == 0)
usage();
- fprintf(stderr, "Command \"%s\" is unknown, try \"ip address help\".\n", *argv);
+ fprintf(stderr, "Command \"%s\" is unknown, try \"ip addr help\".\n", *argv);
exit(-1);
}
[-- Attachment #19: libnetlink_typo.diff --]
[-- Type: text/x-patch, Size: 559 bytes --]
Fix typo in libnetlink manpage (writen -> written).
diff -urNad iproute-20070313~/man/man3/libnetlink.3 iproute-20070313/man/man3/libnetlink.3
--- iproute-20070313~/man/man3/libnetlink.3 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313/man/man3/libnetlink.3 2007-06-10 19:28:30.000000000 +0200
@@ -187,7 +187,7 @@
This library should be named librtnetlink.
.SH AUTHORS
-netlink/rtnetlink was designed and writen by Alexey Kuznetsov.
+netlink/rtnetlink was designed and written by Alexey Kuznetsov.
Andi Kleen wrote the man page.
.SH SEE ALSO
[-- Attachment #20: manpages-typo.diff --]
[-- Type: text/x-patch, Size: 1753 bytes --]
Fix typos in tc-prio(8) manpage.
From: Alexander Wirt <formorer@debian.org>
diff -urNad iproute-20061002~/man/man8/tc-prio.8 iproute-20061002/man/man8/tc-prio.8
--- iproute-20061002~/man/man8/tc-prio.8 2006-10-15 17:06:41.000000000 +0200
+++ iproute-20061002/man/man8/tc-prio.8 2006-10-15 17:10:52.000000000 +0200
@@ -30,7 +30,7 @@
On creation with 'tc qdisc add', a fixed number of bands is created. Each
band is a class, although is not possible to add classes with 'tc qdisc
add', the number of bands to be created must instead be specified on the
-commandline attaching PRIO to its root.
+command line attaching PRIO to its root.
When dequeueing, band 0 is tried first and only if it did not deliver a
packet does PRIO try band 1, and so onwards. Maximum reliability packets
@@ -88,7 +88,7 @@
The four TOS bits (the 'TOS field') are defined as:
.nf
-Binary Decimcal Meaning
+Binary Decimal Meaning
-----------------------------------------
1000 8 Minimize delay (md)
0100 4 Maximize throughput (mt)
@@ -125,13 +125,13 @@
The second column contains the value of the relevant
four TOS bits, followed by their translated meaning. For example, 15 stands
-for a packet wanting Minimal Montetary Cost, Maximum Reliability, Maximum
+for a packet wanting Minimal Monetary Cost, Maximum Reliability, Maximum
Throughput AND Minimum Delay.
The fourth column lists the way the Linux kernel interprets the TOS bits, by
showing to which Priority they are mapped.
-The last column shows the result of the default priomap. On the commandline,
+The last column shows the result of the default priomap. On the command line,
the default priomap looks like this:
1, 2, 2, 2, 1, 2, 0, 0 , 1, 1, 1, 1, 1, 1, 1, 1
[-- Attachment #21: tcb_htb_typo.diff --]
[-- Type: text/x-patch, Size: 847 bytes --]
Fix typo in tc-htb(8) manpage (mininum -> minimum).
diff -urNad iproute-20070313~/man/man8/tc-htb.8 iproute-20070313/man/man8/tc-htb.8
--- iproute-20070313~/man/man8/tc-htb.8 2007-03-13 22:50:56.000000000 +0100
+++ iproute-20070313/man/man8/tc-htb.8 2007-06-10 19:30:08.000000000 +0200
@@ -137,7 +137,7 @@
.SH NOTES
Due to Unix timing constraints, the maximum ceil rate is not infinite and may in fact be quite low. On Intel,
there are 100 timer events per second, the maximum rate is that rate at which 'burst' bytes are sent each timer tick.
-From this, the mininum burst size for a specified rate can be calculated. For i386, a 10mbit rate requires a 12 kilobyte
+From this, the minimum burst size for a specified rate can be calculated. For i386, a 10mbit rate requires a 12 kilobyte
burst as 100*12kb*8 equals 10mbit.
.SH SEE ALSO
[-- Attachment #22: tc_cbq_details_typo.diff --]
[-- Type: text/x-patch, Size: 732 bytes --]
Fix typo in tc-cbq-details(8) manpage (occured -> occurred).
diff -urNad iproute-20070313~/man/man8/tc-cbq-details.8 iproute-20070313/man/man8/tc-cbq-details.8
--- iproute-20070313~/man/man8/tc-cbq-details.8 2007-06-10 19:25:18.000000000 +0200
+++ iproute-20070313/man/man8/tc-cbq-details.8 2007-06-10 19:25:58.000000000 +0200
@@ -210,7 +210,7 @@
priority. If found, choose it, and terminate.
.TP
(iii)
-Choose the class at which break out to the fallback algorithm occured. Terminate.
+Choose the class at which break out to the fallback algorithm occurred. Terminate.
.P
The packet is enqueued to the class which was chosen when either algorithm
terminated. It is therefore possible for a packet to be enqueued *not* at a
^ permalink raw reply related
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Arjan van de Ven @ 2007-09-09 18:18 UTC (permalink / raw)
To: Denys Vlasenko
Cc: Linus Torvalds, Nick Piggin, Satyam Sharma, Herbert Xu,
Paul Mackerras, Christoph Lameter, Chris Snook, Ilpo Jarvinen,
Paul E. McKenney, Stefan Richter, Linux Kernel Mailing List,
linux-arch, Netdev, Andrew Morton, ak, heiko.carstens,
David Miller, schwidefsky, wensong, horms, wjiang, cfriesen,
zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <200709091902.55388.vda.linux@googlemail.com>
On Sun, 9 Sep 2007 19:02:54 +0100
Denys Vlasenko <vda.linux@googlemail.com> wrote:
> Why is all this fixation on "volatile"? I don't think
> people want "volatile" keyword per se, they want atomic_read(&x) to
> _always_ compile into a memory-accessing instruction, not register
> access.
and ... why is that?
is there any valid, non-buggy code sequence that makes that a
reasonable requirement?
^ permalink raw reply
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Denys Vlasenko @ 2007-09-09 18:02 UTC (permalink / raw)
To: Linus Torvalds
Cc: Nick Piggin, Satyam Sharma, Herbert Xu, Paul Mackerras,
Christoph Lameter, Chris Snook, Ilpo Jarvinen, Paul E. McKenney,
Stefan Richter, Linux Kernel Mailing List, linux-arch, Netdev,
Andrew Morton, ak, heiko.carstens, David Miller, schwidefsky,
wensong, horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl,
segher
In-Reply-To: <alpine.LFD.0.999.0708170929580.30176@woody.linux-foundation.org>
On Friday 17 August 2007 17:48, Linus Torvalds wrote:
>
> On Fri, 17 Aug 2007, Nick Piggin wrote:
> >
> > That's not obviously just taste to me. Not when the primitive has many
> > (perhaps, the majority) of uses that do not require said barriers. And
> > this is not solely about the code generation (which, as Paul says, is
> > relatively minor even on x86). I prefer people to think explicitly
> > about barriers in their lockless code.
>
> Indeed.
>
> I think the important issues are:
>
> - "volatile" itself is simply a badly/weakly defined issue. The semantics
> of it as far as the compiler is concerned are really not very good, and
> in practice tends to boil down to "I will generate so bad code that
> nobody can accuse me of optimizing anything away".
>
> - "volatile" - regardless of how well or badly defined it is - is purely
> a compiler thing. It has absolutely no meaning for the CPU itself, so
> it at no point implies any CPU barriers. As a result, even if the
> compiler generates crap code and doesn't re-order anything, there's
> nothing that says what the CPU will do.
>
> - in other words, the *only* possible meaning for "volatile" is a purely
> single-CPU meaning. And if you only have a single CPU involved in the
> process, the "volatile" is by definition pointless (because even
> without a volatile, the compiler is required to make the C code appear
> consistent as far as a single CPU is concerned).
>
> So, let's take the example *buggy* code where we use "volatile" to wait
> for other CPU's:
>
> atomic_set(&var, 0);
> while (!atomic_read(&var))
> /* nothing */;
>
>
> which generates an endless loop if we don't have atomic_read() imply
> volatile.
>
> The point here is that it's buggy whether the volatile is there or not!
> Exactly because the user expects multi-processing behaviour, but
> "volatile" doesn't actually give any real guarantees about it. Another CPU
> may have done:
>
> external_ptr = kmalloc(..);
> /* Setup is now complete, inform the waiter */
> atomic_inc(&var);
>
> but the fact is, since the other CPU isn't serialized in any way, the
> "while-loop" (even in the presence of "volatile") doesn't actually work
> right! Whatever the "atomic_read()" was waiting for may not have
> completed, because we have no barriers!
Why is all this fixation on "volatile"? I don't think
people want "volatile" keyword per se, they want atomic_read(&x) to
_always_ compile into a memory-accessing instruction, not register access.
--
vda
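For reference, the point in the quoted text about missing barriers can be sketched in userspace C11. This is not the kernel API and not anyone's proposed patch; the names payload, ready, publish() and try_consume() are hypothetical. The release store and acquire load stand in for the explicit barriers the quoted example lacks.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;		/* data set up by the producer */
static atomic_int ready;	/* flag the consumer polls; starts at 0 */

/* Producer: the release store orders the payload write before the flag. */
static void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer: the acquire load orders the flag read before the payload read. */
static bool try_consume(int *out)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

A waiter would call try_consume() in a loop. Forcing a memory access on every read, with or without "volatile", gives neither of the two ordering guarantees above.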
^ permalink raw reply
* Re: [PATCH 03/16] net: Basic network namespace infrastructure.
From: Paul E. McKenney @ 2007-09-09 16:45 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Linux Containers, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller
In-Reply-To: <m1fy1otarm.fsf-T1Yj925okcoyDheHMi7gv2pdwda3JcWeAL8bYrjMMd8@public.gmane.org>
On Sun, Sep 09, 2007 at 04:04:45AM -0600, Eric W. Biederman wrote:
> "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
>
> > On Sat, Sep 08, 2007 at 03:15:34PM -0600, Eric W. Biederman wrote:
> >>
> >> This is the basic infrastructure needed to support network
> >> namespaces. This infrastructure is:
> >> - Registration functions to support initializing per network
> >> namespace data when a network namespace is created or destroyed.
> >>
> >> - struct net. The network namespace data structure.
> >> This structure will grow as variables are made per network
> >> namespace but this is the minimal starting point.
> >>
> >> - Functions to grab a reference to the network namespace.
> >> I provide both get/put functions that keep a network namespace
> >> from being freed. And hold/release functions serve as weak references
> >> and will warn if their count is not zero when the data structure
> >> is freed. Useful for dealing with more complicated data structures
> >> like the ipv4 route cache.
> >>
> >> - A list of all of the network namespaces so we can iterate over them.
> >>
> >> - A slab for the network namespace data structure allowing leaks
> >> to be spotted.
> >
> > If I understand this correctly, the only way to get to a namespace is
> > via get_net_ns_by_pid(), which contains the rcu_read_lock() that matches
> > the rcu_barrier() below.
>
> Not quite. That is the convoluted case for getting a namespace someone
> else is using. current->nsproxy->net_ns works and should require no
> locking to read (only the current process may modify it) and does hold
> a reference to the network namespace. Similarly for sock->sk_net.
Ah! Got it, thank you for the explanation.
> > So, is the get_net() in sock_copy() in this patch adding a reference to
> > an element that is guaranteed to already have at least one reference?
>
> Yes.
>
> > If not, how are we preventing sock_copy() from running concurrently with
> > cleanup_net()? Ah, I see -- in sock_copy() we are getting a reference
> > to the new struct sock that no one else can get a reference to, so OK.
> > Ditto for the get_net() in sk_alloc().
>
> > But I still don't understand what is protecting the get_net() in
> > dev_seq_open(). Is there an existing reference?
>
> Sort of. The directories under /proc/net are created when we create
> a network namespace and they are destroyed when the network namespace
> is removed. And those directories remember which network namespace
> they are for and that is what dev_seq_open is referencing.
>
> So the tricky case is what happens if we open a directory under /proc/net
> as we are cleaning up a network namespace.
Yep! ;-)
> > If so, how do we know
> > that it won't be removed just as we are trying to add our reference
> > (while at the same time cleanup_net() is running)? Ditto for the other
> > _open() operations in the same patch. And for netlink_seq_open().
> >
> > Enlightenment?
>
> Good spotting. It looks like you have found a legitimate race. Grr.
> I thought I had a reference to the network namespace there. I need to
> step back and think about this a bit, and see if I can come up with a
> legitimate idiom.
>
> I know the network namespace exists and I have not finished
> cleanup_net because I can still get to the /proc entries.
OK. Hmmm... I need to go review locking for /proc...
> I know I cannot use get_net for the reference in /proc because
> otherwise I could not release the network namespace unless I was to
> unmount the filesystem, which is not a desirable property.
>
> I think I can change the idiom to:
>
> struct net *maybe_get_net(struct net *net)
> {
> 	if (!atomic_inc_not_zero(&net->count))
> 		net = NULL;
> 	return net;
> }
>
> Which would make dev_seq_open be:
>
> static int dev_seq_open(struct inode *inode, struct file *file)
> {
> 	struct seq_file *seq;
> 	int res;
> 	res = seq_open(file, &dev_seq_ops);
> 	if (!res) {
> 		seq = file->private_data;
> 		seq->private = maybe_get_net(PROC_NET(inode));
> 		if (!seq->private) {
> 			res = -ENOENT;
> 			seq_release(inode, file);
> 		}
> 	}
> 	return res;
> }
>
> I'm still asking myself if I need any kind of locking to ensure
> struct net does not go away in the mean time, if so rcu_read_lock()
> should be sufficient.
Agreed -- and it might be possible to leverage the existing locking
in the /proc code.
Thanx, Paul
> I will read through the generic proc code very carefully after
> I have slept and see if the code above is sufficient,
> and if so update the patchset.
>
> Eric
^ permalink raw reply
* Re: r8169: slow samba performance
From: Francois Romieu @ 2007-09-09 15:33 UTC (permalink / raw)
To: David Madsen; +Cc: netdev
In-Reply-To: <ec7e30370709090044m55efbbcl3cb9e9489f2d5e6e@mail.gmail.com>
David Madsen <david.madsen@gmail.com> :
> >Does "acceptable" mean that there is a noticeable difference when compared
> >to the patch based on a busy-waiting loop ?
>
> I noticed a somewhat significant difference between patch #0002 and a
> busy wait loop with ndelay(10). Write performance was equivalent in
> both cases as should be the case. Read performance for me maxed out
Do you have some (gross) figure for the write performance ?
> around 150ish megabit whereas switching to the ndelay(10) loop brought
> up average performance around 350ish megabit while reading the same
> files over samba.
Hardly ecstatic. :o/
Do you see a difference in the system load too, say a few lines of 'vmstat 1' ?
Can you add the patch below on top of #0002 and see if there is some
benefit from it ?
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index b85ab4a..8d8fff3 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -2457,6 +2457,7 @@ static int rtl8169_start_xmit(struct sk_buff *skb, struct net_device *dev)
smp_wmb();
RTL_W8(TxPoll, NPQ); /* set polling bit */
+ RTL_R8(TxPoll);
if (TX_BUFFS_AVAIL(tp) < MAX_SKB_FRAGS) {
netif_stop_queue(dev);
I'd welcome if you could try the patch below on top of #0002 too:
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index b85ab4a..840df3b 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -2457,6 +2457,17 @@ static int rtl8169_start_xmit(struct sk_buff *skb, struct net_device *dev)
smp_wmb();
RTL_W8(TxPoll, NPQ); /* set polling bit */
+{
+ static unsigned int wait_max = 0;
+ unsigned i;
+
+ for (i = 0; (RTL_R8(TxPoll) & NPQ) && (i < 1000); i++)
+ ndelay(10);
+ if (i > wait_max) {
+ wait_max = i;
+ printk(KERN_INFO "%s: wait_max = %d\n", dev->name, wait_max);
+ }
+}
if (TX_BUFFS_AVAIL(tp) < MAX_SKB_FRAGS) {
netif_stop_queue(dev);
--
Ueimor
^ permalink raw reply related
* Re: wither bounds checking for networking sysctls
From: Eric W. Biederman @ 2007-09-09 15:26 UTC (permalink / raw)
To: Rick Jones; +Cc: Stephen Hemminger, Linux Network Development list
In-Reply-To: <46D84C8D.4070009@hp.com>
Rick Jones <rick.jones2@hp.com> writes:
> Stephen Hemminger wrote:
>> On Thu, 30 Aug 2007 18:09:17 -0700
>> Rick Jones <rick.jones2@hp.com> wrote:
>>
>>
>>> While messing about with "sysctl_tcp_rto_min" I went back and forth a bit as
>>> to whether there should have been bounds checking (as did some of the folks
>>> who did some internal review for me). That leads to the question - is it
>>> considered worthwhile to add a bit more bounds checking to sundry networking
>>> sysctls?
>>>
>>>rick jones
>>
>>
>> IMHO As long as any value from sysctl doesn't crash the kernel, we
>> should let it go.  Enforcing RFC policy or inter-dependencies seems
>> like a useless exercise.
>
> I was thinking more along the lines of more fundamental things - like precluding
> negative values when something is clearly a positive.
The sysctl infrastructure has some fairly simple support for
doing min/max type things. So if it makes sense it isn't hard to
make proc_dointvec_minmax the method and then set extra1 to point to
the min and extra2 to be the max.
Eric
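To make the extra1/extra2 idea concrete, here is a minimal userspace sketch of it. It is only a model under stated assumptions: struct int_ctl and int_ctl_write_minmax() are hypothetical names, not the kernel's struct ctl_table or proc_dointvec_minmax, and how the real handler treats an out-of-range write may differ between kernel versions.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a sysctl table entry.
 * It only mirrors the idea of a data pointer plus two bound pointers.
 */
struct int_ctl {
	int *data;		/* the tunable itself */
	const int *extra1;	/* lower bound, or NULL for none */
	const int *extra2;	/* upper bound, or NULL for none */
};

/* Store val only if it lies inside [*extra1, *extra2]. */
static int int_ctl_write_minmax(struct int_ctl *ctl, int val)
{
	if (ctl->extra1 && val < *ctl->extra1)
		return -EINVAL;
	if (ctl->extra2 && val > *ctl->extra2)
		return -EINVAL;
	*ctl->data = val;
	return 0;
}
```

With a lower bound of 1, a write of a negative value is refused and the tunable keeps its previous value, which is the "preclude negative values" case Rick describes.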
^ permalink raw reply
* Re: [PATCH v3 2/2][BNX2]: Add iSCSI support to BNX2 devices.
From: FUJITA Tomonori @ 2007-09-09 15:05 UTC (permalink / raw)
To: hch
Cc: jeff, tomof, open-iscsi, davem, mchristi, netdev, anilgv, talm,
lusinsky, uri, fujita.tomonori
In-Reply-To: <20070908120036.GB8478@infradead.org>
On Sat, 8 Sep 2007 13:00:36 +0100
Christoph Hellwig <hch@infradead.org> wrote:
> On Sat, Sep 08, 2007 at 07:32:27AM -0400, Jeff Garzik wrote:
> > FUJITA Tomonori wrote:
> > >Yeah, iommu code ignores the lld limitations (the problem is that the
> > >lld limitations are in request_queue and iommu code can't access
> > >request_queue). There is no way to tell iommu code about the lld
> > >limitations.
> >
> >
> > This fact very much wants fixing.
>
>
> Absolutely. Unfortunately everyone wastes their time on creating workarounds
> instead of fixing the underlying problem.
Any ideas on how to fix this?
I chatted to Jens and James on this last week.
- we could just copy the lld limitations to the device structure. It's
hacky, but the device structure already has hacky stuff.
- we could just link the device structure to the request_queue structure so
that iommu code can see the request_queue structure.
- we could remove the lld limitations from the request_queue structure and
have a new structure (something like struct io_restrictions). Then
somehow we could link the new structure with the request_queue and device
structures.
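As a rough illustration of the third option, here is a userspace sketch. Everything in it is an assumption made for the example: the field names, struct device_model, struct queue_model and iommu_segment_ok() are hypothetical, and only the name struct io_restrictions comes from the suggestion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared description of what the lld can handle. */
struct io_restrictions {
	unsigned int max_segment_size;		/* largest single segment, in bytes */
	unsigned long seg_boundary_mask;	/* a segment may not cross (mask + 1) */
};

/* Both sides point at the same restrictions instead of owning a copy. */
struct device_model { struct io_restrictions *io; };
struct queue_model  { struct io_restrictions *io; };

/* What iommu code could ask before merging: is [addr, addr + len) acceptable? */
static bool iommu_segment_ok(const struct device_model *dev,
			     unsigned long addr, unsigned int len)
{
	const struct io_restrictions *r = dev->io;

	if (!r || len == 0)
		return true;
	if (len > r->max_segment_size)
		return false;
	/* First and last byte must fall inside the same boundary window. */
	return (addr | r->seg_boundary_mask) ==
	       ((addr + len - 1) | r->seg_boundary_mask);
}
```

The point of the sketch is only that iommu code reaches the limits through the device, so it never needs to see request_queue.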
^ permalink raw reply
* Re: [PATCH 03/16] net: Basic network namespace infrastructure.
From: Eric W. Biederman @ 2007-09-09 10:18 UTC (permalink / raw)
To: Eric Dumazet
Cc: Linux Containers, netdev-u79uwXL29TY76Z2rM5mHXA, David Miller
In-Reply-To: <46E3B281.4030105-fPLkHRcR87vqlBn2x/YWAg@public.gmane.org>
Eric Dumazet <dada1-fPLkHRcR87vqlBn2x/YWAg@public.gmane.org> writes:
>
> Nice work Eric !
Thanks.
> "struct net" is not a very descriptive name imho, why not stick "ns" or
> "namespace" somewhere ?
My fingers rebelled, and struct net seems to be sufficiently descriptive.
However that is a cosmetic detail and if there is a general consensus
that renaming it to be struct netns or whatever would be a more
readable/maintainable name I can change it.
> Do we really need yet another "struct kmem_cache *net_cachep;" ?
> The object is so small that the standard caches should be OK (kzalloc())
The practical issue at this point in the cycle is visibility. With a
kmem cache it is easy to spot ref counting leaks or other problems
if they happen. Without it debugging is much more difficult. While I
am touched with your faith in my ability to write perfect patches I
think it makes a lot of sense to keep the cache at least until
sometime after the network namespace code is merged and people
generally have confidence in the implementation.
Eric
^ permalink raw reply
* Re: [PATCH 03/16] net: Basic network namespace infrastructure.
From: Eric W. Biederman @ 2007-09-09 10:04 UTC (permalink / raw)
To: paulmck; +Cc: David Miller, netdev, Linux Containers
In-Reply-To: <20070909003308.GA10417@linux.vnet.ibm.com>
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> writes:
> On Sat, Sep 08, 2007 at 03:15:34PM -0600, Eric W. Biederman wrote:
>>
>> This is the basic infrastructure needed to support network
>> namespaces. This infrastructure is:
>> - Registration functions to support initializing per network
>>   namespace data when a network namespace is created or destroyed.
>>
>> - struct net. The network namespace data structure.
>> This structure will grow as variables are made per network
>> namespace but this is the minimal starting point.
>>
>> - Functions to grab a reference to the network namespace.
>> I provide both get/put functions that keep a network namespace
>> from being freed. And hold/release functions serve as weak references
>> and will warn if their count is not zero when the data structure
>> is freed. Useful for dealing with more complicated data structures
>> like the ipv4 route cache.
>>
>> - A list of all of the network namespaces so we can iterate over them.
>>
>> - A slab for the network namespace data structure allowing leaks
>> to be spotted.
>
> If I understand this correctly, the only way to get to a namespace is
> via get_net_ns_by_pid(), which contains the rcu_read_lock() that matches
> the rcu_barrier() below.
Not quite. That is the convoluted case for getting a namespace someone
else is using. current->nsproxy->net_ns works and should require no
locking to read (only the current process may modify it) and does hold
a reference to the network namespace. Similarly for sock->sk_net.
> So, is the get_net() in sock_copy() in this patch adding a reference to
> an element that is guaranteed to already have at least one reference?
Yes.
> If not, how are we preventing sock_copy() from running concurrently with
> cleanup_net()? Ah, I see -- in sock_copy() we are getting a reference
> to the new struct sock that no one else can get a reference to, so OK.
> Ditto for the get_net() in sk_alloc().
> But I still don't understand what is protecting the get_net() in
> dev_seq_open(). Is there an existing reference?
Sort of.  The directories under /proc/net are created when we create
a network namespace and they are destroyed when the network namespace
is removed.  And those directories remember which network namespace
they are for and that is what dev_seq_open is referencing.
So the tricky case is what happens if we open a directory under /proc/net
as we are cleaning up a network namespace.
> If so, how do we know
> that it won't be removed just as we are trying to add our reference
> (while at the same time cleanup_net() is running)? Ditto for the other
> _open() operations in the same patch. And for netlink_seq_open().
>
> Enlightenment?
Good spotting. It looks like you have found a legitimate race. Grr.
I thought I had a reference to the network namespace there. I need to
step back and think about this a bit, and see if I can come up with a
legitimate idiom.
I know the network namespace exists and I have not finished
cleanup_net because I can still get to the /proc entries.
I know I cannot use get_net for the reference in /proc because
otherwise I could not release the network namespace unless I was to
unmount the filesystem, which is not a desirable property.
I think I can change the idiom to:
struct net *maybe_get_net(struct net *net)
{
	if (!atomic_inc_not_zero(&net->count))
		net = NULL;
	return net;
}
Which would make dev_seq_open be:
static int dev_seq_open(struct inode *inode, struct file *file)
{
	struct seq_file *seq;
	int res;
	res = seq_open(file, &dev_seq_ops);
	if (!res) {
		seq = file->private_data;
		seq->private = maybe_get_net(PROC_NET(inode));
		if (!seq->private) {
			res = -ENOENT;
			seq_release(inode, file);
		}
	}
	return res;
}
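For readers who want to see what the increment-unless-zero step does, here is a minimal userspace model using C11 atomics. It is a sketch under assumptions, not the kernel implementation: struct net_model, inc_not_zero() and maybe_get_net_model() are hypothetical names, and only the reference count is modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net: only the count is modelled. */
struct net_model {
	atomic_int count;
};

/* Increment *v unless it is zero; returns true if the increment happened. */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Take a reference only if somebody else still holds one. */
static struct net_model *maybe_get_net_model(struct net_model *net)
{
	return inc_not_zero(&net->count) ? net : NULL;
}
```

Once the count has reached zero, the caller gets NULL instead of resurrecting the object. The sketch does not address whether the memory holding the count is still valid when it is read, which is the open locking question below.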
I'm still asking myself if I need any kind of locking to ensure
struct net does not go away in the mean time, if so rcu_read_lock()
should be sufficient.
I will read through the generic proc code very carefully after
I have slept and see if the code above is sufficient,
and if so update the patchset.
Eric
^ permalink raw reply
* Bluetooth patches for 2.6.23-rc5
From: Marcel Holtmann @ 2007-09-09 9:37 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
Hi Dave,
here are four additional patches that should go into 2.6.23 before its
final release. Please pull and send them to Linus.
Regards
Marcel
Please pull from
git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6.git
This will update the following files:
drivers/bluetooth/hci_usb.c | 5 ++++-
net/bluetooth/hci_core.c | 8 +++-----
net/bluetooth/hci_sock.c | 27 +++++++++++++++++++++------
3 files changed, 28 insertions(+), 12 deletions(-)
through these ChangeSets:
Commit: 89f2783ded0a4fc98852cb9552bb27a80cd6a41a
Author: Marcel Holtmann <marcel@holtmann.org> Sun, 09 Sep 2007 08:39:49 +0200
[Bluetooth] Fix parameter list for event filter command
On device initialization the event filters are cleared. In case of
clearing the filters the extra condition type shall be omitted.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Commit: 7c631a67601f116d303cfb98a3d964a150090e38
Author: Marcel Holtmann <marcel@holtmann.org> Sun, 09 Sep 2007 08:39:43 +0200
[Bluetooth] Update security filter for Bluetooth 2.1
This patch updates the HCI security filter with support for the
Bluetooth 2.1 commands and events.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Commit: 767c5eb5d35aeb85987143f0a730bc21d3ecfb3d
Author: Marcel Holtmann <marcel@holtmann.org> Sun, 09 Sep 2007 08:39:34 +0200
[Bluetooth] Add compat handling for timestamp structure
The timestamp structure needs special handling in case of compat
programs. Use the same wrapping method the network core uses.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Commit: 26a4a06e7ff2874154eb3f4b4ba0514dc563b100
Author: Marcel Holtmann <marcel@holtmann.org> Sun, 09 Sep 2007 08:39:27 +0200
[Bluetooth] Add missing stat.byte_rx counter modification
With the support for hci_recv_fragment() the call to increase the
stat.byte_rx counter got accidentally removed. This patch fixes it.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
^ permalink raw reply
* Re: 2.6.23-rc4-mm1: e1000e napi lockup
From: Jiri Slaby @ 2007-09-09 9:58 UTC (permalink / raw)
Cc: Andrew Morton, netdev, e1000-devel, Auke Kok, David S. Miller
In-Reply-To: <46E0FB82.2040000@gmail.com>
On 09/07/2007 09:19 AM, Jiri Slaby wrote:
> Hi,
>
> I found a regression in 2.6.23-rc4-mm1 (since -rc3-mm1) in e1000e driver.
> napi_disable(&adapter->napi) in e1000_probe freezes the kernel on boot.
Ok, after these changes:
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index c1c64e2..f8ec537 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -1693,10 +1693,7 @@ quit_polling:
if (adapter->itr_setting & 3)
e1000_set_itr(adapter);
netif_rx_complete(poll_dev, napi);
- if (test_bit(__E1000_DOWN, &adapter->state))
- atomic_dec(&adapter->irq_sem);
- else
- e1000_irq_enable(adapter);
+ e1000_irq_enable(adapter);
return 0;
}
@@ -4257,7 +4254,6 @@ static int __devinit e1000_probe(struct pci_dev *pdev,
/* tell the stack to leave us alone until e1000_open() is called */
netif_carrier_off(netdev);
netif_stop_queue(netdev);
- napi_disable(&adapter->napi);
strcpy(netdev->name, "eth%d");
err = register_netdev(netdev);
I still have problems with the driver. When I do `ip link set eth0 up', ksoftirqd
runs with 100% CPU time, so I think you endlessly re-schedule some timer (or
the new napi layer?)
regards,
--
Jiri Slaby (jirislaby@gmail.com)
Faculty of Informatics, Masaryk University
^ permalink raw reply related
* Re: [PATCH 03/16] net: Basic network namespace infrastructure.
From: Eric Dumazet @ 2007-09-09 8:44 UTC (permalink / raw)
To: Eric W. Biederman; +Cc: David Miller, netdev, Linux Containers
In-Reply-To: <m1ejh8x3ih.fsf_-_@ebiederm.dsl.xmission.com>
Eric W. Biederman a écrit :
> This is the basic infrastructure needed to support network
> namespaces. This infrastructure is:
> - Registration functions to support initializing per network
>   namespace data when a network namespace is created or destroyed.
>
> - struct net. The network namespace data structure.
> This structure will grow as variables are made per network
> namespace but this is the minimal starting point.
>
> - Functions to grab a reference to the network namespace.
> I provide both get/put functions that keep a network namespace
> from being freed. And hold/release functions serve as weak references
> and will warn if their count is not zero when the data structure
> is freed. Useful for dealing with more complicated data structures
> like the ipv4 route cache.
>
> - A list of all of the network namespaces so we can iterate over them.
>
> - A slab for the network namespace data structure allowing leaks
> to be spotted.
>
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Nice work Eric !
"struct net" is not a very descriptive name imho, why not stick "ns" or
"namespace" somewhere ?
Do we really need yet another "struct kmem_cache *net_cachep;" ?
The object is so small that the standard caches should be OK (kzalloc())
^ permalink raw reply