Netdev List

* [PATCH] bridge: packets leaking out of disabled/blocked ports
From: Stephen Hemminger @ 2007-08-30 19:22 UTC (permalink / raw)
  To: wang dengyi, David S. Miller; +Cc: bridge, netdev
In-Reply-To: <426497.88154.qm@web51910.mail.re2.yahoo.com>

This patch fixes some packet leakage in bridge.  The bridging code
was allowing forward table entries to be generated even if a device
was being blocked. The fix is to not add forwarding database entries
unless the port is active.

The bug arose as part of the conversion to processing STP frames
through the normal receive path (in 2.6.17).

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>

--- a/net/bridge/br_fdb.c	2007-08-30 07:49:01.000000000 -0700
+++ b/net/bridge/br_fdb.c	2007-08-30 11:40:36.000000000 -0700
@@ -384,6 +384,11 @@ void br_fdb_update(struct net_bridge *br
 	if (hold_time(br) == 0)
 		return;

+	/* ignore packets unless we are using this port */
+	if (!(source->state == BR_STATE_LEARNING ||
+	      source->state == BR_STATE_FORWARDING))
+		return;
+
 	fdb = fdb_find(head, addr);
 	if (likely(fdb)) {
 		/* attempt to update an entry for a local interface */
--- a/net/bridge/br_input.c	2007-08-30 07:49:01.000000000 -0700
+++ b/net/bridge/br_input.c	2007-08-30 12:19:57.000000000 -0700
@@ -101,9 +101,8 @@ static int br_handle_local_finish(struct
 {
 	struct net_bridge_port *p = rcu_dereference(skb->dev->br_port);

-	if (p && p->state != BR_STATE_DISABLED)
+	if (p)
 		br_fdb_update(p->br, p, eth_hdr(skb)->h_source);
-
 	return 0;	 /* process further */
 }
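
For readers following the thread, here is a minimal user-space sketch of
the gate the patch adds to br_fdb_update(). It is not kernel code: the
enum, struct fdb_sketch, port_may_learn() and fdb_sketch_update() are
hypothetical names made up for illustration, and the "database" is
reduced to a counter. It only shows the rule that addresses are learned
while a port is in the learning or forwarding state and ignored
otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins for the kernel's BR_STATE_* port states. */
enum port_state {
	PORT_DISABLED,
	PORT_LISTENING,
	PORT_LEARNING,
	PORT_FORWARDING,
	PORT_BLOCKING,
};

/* Hypothetical, much-reduced forwarding database: a count of learned entries. */
struct fdb_sketch {
	unsigned int learned;
};

/* A port may contribute addresses only while it is learning or forwarding. */
static bool port_may_learn(enum port_state state)
{
	return state == PORT_LEARNING || state == PORT_FORWARDING;
}

/* Returns true if the source address was recorded, false if it was ignored. */
static bool fdb_sketch_update(struct fdb_sketch *fdb, enum port_state state)
{
	assert(fdb != NULL);
	if (!port_may_learn(state))
		return false;
	fdb->learned++;
	return true;
}
```

With this gate in the update routine itself, a caller such as the local
receive path no longer needs its own port state test, which matches the
simplification made in br_input.c above.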

^ permalink raw reply

* Re: [PATCH 13/30] net: Don't do pointless kmalloc return value casts in zd1211 driver
From: Jesper Juhl @ 2007-08-30 20:20 UTC (permalink / raw)
  To: Daniel Drake
  Cc: Linux Kernel Mailing List, netdev-u79uwXL29TY76Z2rM5mHXA,
	David S. Miller, Ulrich Kunitz,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <46D70467.1040109-aBrp7R+bbdUdnm+yROfE0A@public.gmane.org>

On 30/08/2007, Daniel Drake <dsd-aBrp7R+bbdUdnm+yROfE0A@public.gmane.org> wrote:
> Jesper Juhl wrote:
> > Since kmalloc() returns a void pointer there is no reason to cast
> > its return value.
> > This patch also removes a pointless initialization of a variable.
>
> NAK: adds a sparse warning
> zd_chip.c:116:15: warning: implicit cast to nocast type
>
Ok, I must admit I didn't check with sparse since it seemed pointless
- we usually never cast void pointers to other pointer types,
specifically because the C language nicely guarantees that the right
thing will happen without the cast.  Sometimes we have to cast them to
integer types, so sure we need the cast there.   But what on earth
makes a "zd_addr_t *" so special that we have to explicitly cast a
"void *" to that type?

I see it's defined as
  typedef u32 __nocast zd_addr_t;
in drivers/net/wireless/zd1211rw/zd_types.h , but why the __nocast ?

What would be wrong in applying my patch that removes the cast of the
kmalloc() return value and then also remove the "__nocast" here?


-- 
Jesper Juhl <jesper.juhl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

^ permalink raw reply

* Re: [PATCH 0/0] sky2: update for 2.6.24
From: Stephen Hemminger @ 2007-08-30 20:36 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: netdev
In-Reply-To: <20070829193922.078561651@linux-foundation.org>

Patch number 10 doesn't work right; it causes excess interrupts
and console messages.
Patches 8 and 9 are only needed for #10, so skip them as well.

So please only apply 1-7 to netdev for 2.6.24.
-- 
Stephen Hemminger <shemminger@linux-foundation.org>

^ permalink raw reply

* Re: [PATCH] bridge: packets leaking out of disabled/blocked ports
From: John W. Linville @ 2007-08-30 20:03 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: wang dengyi, David S. Miller, bridge, netdev
In-Reply-To: <20070830122258.1c606516@freepuppy.rosehill.hemminger.net>

On Thu, Aug 30, 2007 at 12:22:58PM -0700, Stephen Hemminger wrote:
> This patch fixes some packet leakage in bridge.  The bridging code
> was allowing forward table entries to be generated even if a device
> was being blocked. The fix is to not add forwarding database entries
> unless the port is active.

Seems reasonable -- ACK

John
-- 
John W. Linville
linville@tuxdriver.com

^ permalink raw reply

* [GIT PULL] SCTP updates
From: Vlad Yasevich @ 2007-08-30 21:26 UTC (permalink / raw)
  To: davem; +Cc: lksctp-developers, netdev

Hi David

Please pull the following changes since commit
b07d68b5ca4d55a16fab223d63d5fb36f89ff42f:
  Linus Torvalds (1):
        Linux 2.6.23-rc4

that are available in the git repository at:

  master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev.git master

Vlad Yasevich (7):
      SCTP: properly clean up fragment and ordering queues during FWD-TSN.
      SCTP: Assign stream sequence numbers to the entire message
      SCTP: Pick the correct port when binding to 0.
      SCTP: Uncomfirmed transports can't become Inactive
      SCTP: Do not retransmit chunks that are newer then rtt.
      SCTP: Correctly disable listening when backlog is 0.
      SCTP: Abort on COOKIE-ECHO if backlog is exceeded.

Wei Yongjun (4):
      SCTP: Fix sctp_addto_chunk() to add pad with correct length
      SCTP: Fix to encode PROTOCOL VIOLATION error cause correctly
      SCTP: Use net_ratelimit to suppress error messages print too fast
      SCTP: Fix to handle invalid parameter length correctly

 include/net/sctp/sm.h       |    2 +-
 include/net/sctp/structs.h  |    1 +
 include/net/sctp/ulpqueue.h |    1 +
 net/sctp/associola.c        |    7 ++-
 net/sctp/outqueue.c         |    7 +++
 net/sctp/sm_make_chunk.c    |  112 +++++++++++++++++++++++++++++-------------
 net/sctp/sm_sideeffect.c    |    8 ++-
 net/sctp/sm_statefuns.c     |   51 ++++++++++----------
 net/sctp/socket.c           |    3 +
 net/sctp/ulpqueue.c         |   75 ++++++++++++++++++++++++-----
 10 files changed, 190 insertions(+), 77 deletions(-)

Thanks
-vlad

^ permalink raw reply

* [PATCH] bonding: update some distro-specific documentation
From: Andy Gospodarek @ 2007-08-30 21:24 UTC (permalink / raw)
  To: netdev; +Cc: fubar


These changes update some of the distro-specific details for
configuring bonding.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>

---

 bonding.txt |   45 ++++++++++++++++++++++++++-------------------
 1 files changed, 26 insertions(+), 19 deletions(-)


diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 1da5666..52fc1d0 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -755,9 +755,9 @@ the system /etc/modules.conf or /etc/modprobe.conf configuration file.
 ------------------------------------------
 
 	This section applies to distros using a version of initscripts
-with bonding support, for example, Red Hat Linux 9 or Red Hat
-Enterprise Linux version 3 or 4.  On these systems, the network
-initialization scripts have some knowledge of bonding, and can be
+with bonding support, for example, Red Hat Linux 9, Red Hat Enterprise
+Linux version 3, 4 or 5, Fedora, etc.  On these systems, the network
+initialization scripts have some knowledge of bonding, and can be 
 configured to control bonding devices.
 
 	These distros will not automatically load the network adapter
@@ -802,15 +802,20 @@ BROADCAST=192.168.1.255
 ONBOOT=yes
 BOOTPROTO=none
 USERCTL=no
+BONDING_OPTS="mode=balance-alb miimon=100"
 
 	Be sure to change the networking specific lines (IPADDR,
 NETMASK, NETWORK and BROADCAST) to match your network configuration.
+You also need to set the BONDING_OPTS= line to specify the desired
+options for your bond0 interface.  Specifying bonding options in this
+way is the preferred method for configuring bonding interfaces.
 
-	Finally, it is necessary to edit /etc/modules.conf (or
+	It is no longer necessary to edit /etc/modules.conf (or
 /etc/modprobe.conf, depending upon your distro) to load the bonding
 module with your desired options when the bond0 interface is brought
 up.  The following lines in /etc/modules.conf (or modprobe.conf) will
-load the bonding module, and select its options:
+load the bonding module, and select its options but this is no longer
+the preferred method.
 
 alias bond0 bonding
 options bond0 mode=balance-alb miimon=100
@@ -826,8 +831,9 @@ up and running.
 ---------------------------------
 
 	Recent versions of initscripts (the version supplied with
-Fedora Core 3 and Red Hat Enterprise Linux 4 is reported to work) do
-have support for assigning IP information to bonding devices via DHCP.
+Fedora Core 3 and Red Hat Enterprise Linux 4 and later is reported to
+work) do have support for assigning IP information to bonding devices
+via DHCP.
 
 	To configure bonding for DHCP, configure it as described
 above, except replace the line "BOOTPROTO=none" with "BOOTPROTO=dhcp"
@@ -837,18 +843,19 @@ is case sensitive.
 3.2.2 Configuring Multiple Bonds with Initscripts
 -------------------------------------------------
 
-	At this writing, the initscripts package does not directly
-support loading the bonding driver multiple times, so the process for
-doing so is the same as described in the "Configuring Multiple Bonds
-Manually" section, below.
-
-	NOTE: It has been observed that some Red Hat supplied kernels
-are apparently unable to rename modules at load time (the "-o bond1"
-part).  Attempts to pass that option to modprobe will produce an
-"Operation not permitted" error.  This has been reported on some
-Fedora Core kernels, and has been seen on RHEL 4 as well.  On kernels
-exhibiting this problem, it will be impossible to configure multiple
-bonds with differing parameters.
+	Initscripts packages that are included with Fedora 7 and Red
+Hat Enterprise Linux 5 support multiple bonding interfaces by simply
+specifying the appropriate BONDING_OPTS= in ifcfg-bondX where X is
+the number of the bond.  Other distros may not include support in
+initscripts for multiple bonding interfaces, so you may need to follow
+the process as described in the "Configuring Multiple Bonds Manually"
+section, below.
+
+	It has been observed that much older kernels are apparently
+unable to rename modules at load time (the "-o bond1" part).  Attempts
+to pass that option to modprobe will produce an "Operation not 
+permitted" error.  On kernels exhibiting this problem, it will be 
+impossible to configure multiple bonds with differing parameters.
 
 3.3 Configuring Bonding Manually with Ifenslave
 -----------------------------------------------

^ permalink raw reply related

* Re: [PATCH 1/3] netxen: Avoid firmware load in PCI probe
From: Arnaldo Carvalho de Melo @ 2007-08-30 21:58 UTC (permalink / raw)
  To: dhananjay; +Cc: netdev, jeff, rob
In-Reply-To: <20070828115603.166212155@netxen.com>

Em Tue, Aug 28, 2007 at 05:23:25PM +0530, dhananjay@netxen.com escreveu:
> Loading firmware during PCI probe can lead to incorrect initialization,
> rendering the card unusable until next reboot.  This was introduced a while
> ago as a workaround for firmware bug, a better workaround was submitted for
> this a while ago. So removing original hack that loads firmware during probe.
> 
> Signed-off by: Dhananjay Phadke <dhananjay@netxen.com>

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>

I was having these problems and after applying this patch the NIC is
back working, thank you for fixing this!

- Arnaldo
 
> Index: netdev-2.6/drivers/net/netxen/netxen_nic_main.c
> ===================================================================
> --- netdev-2.6.orig/drivers/net/netxen/netxen_nic_main.c
> +++ netdev-2.6/drivers/net/netxen/netxen_nic_main.c
> @@ -639,10 +639,6 @@ netxen_nic_probe(struct pci_dev *pdev, c
>  			NETXEN_CRB_NORMALIZE(adapter,
>  				NETXEN_ROMUSB_GLB_PEGTUNE_DONE));
>  		/* Handshake with the card before we register the devices. */
> -		writel(0, NETXEN_CRB_NORMALIZE(adapter, CRB_CMDPEG_STATE));
> -		netxen_pinit_from_rom(adapter, 0);
> -		msleep(1);
> -		netxen_load_firmware(adapter);
>  		netxen_phantom_init(adapter, NETXEN_NIC_PEG_TUNE);
>  	}

^ permalink raw reply

* Re: [PATCH 5/5] Net: ath5k, kconfig changes
From: Nick Kossifidis @ 2007-08-30 22:18 UTC (permalink / raw)
  To: John W. Linville
  Cc: Christoph Hellwig, Jiri Slaby, linux-kernel, linux-wireless,
	netdev
In-Reply-To: <20070830123609.GA5140@tuxdriver.com>

2007/8/30, John W. Linville <linville@tuxdriver.com>:
> On Thu, Aug 30, 2007 at 04:38:09AM +0300, Nick Kossifidis wrote:
> > 2007/8/28, Christoph Hellwig <hch@infradead.org>:
>
> > > Also this whole patch seems rather pointless.  It saves only
> > > very little and turns the driver into a complete ifdef maze.
>
> > Also most
> > people will use 5212 code only, 5211 cards are on some old laptops and
> > 5210, well i couldn't even find  a 5210 for actual testing :P
>
> FWIW, I'd bet dollars to donuts that distros will enable them all
> together.
>
> Is saving code space the only reason to turn these off?  How much
> space do you save?
>
> Is there some way you can isolate and/or limit the number of ifdef
> blocks further?  If so, we might consider a version of this patch
> that depends on EMBEDDED or somesuch...?
>
> John

O.K. as a first step I'll limit this to the 5210 code only: a single
option like "support older 5210 chipsets", off by default, instead of
3 options. It's not just saving space, it's also saving some runtime
checks. It's not really a gain in performance though; most checks are
done during initialization and DFS setup. I just thought it would be
useful to save as much CPU as possible.

-- 
GPG ID: 0xD21DB2DB
As you read this post global entropy rises. Have Fun ;-)
Nick

^ permalink raw reply

* Re: [PATCH 13/30] net: Don't do pointless kmalloc return value casts in zd1211 driver
From: Joe Perches @ 2007-08-30 22:19 UTC (permalink / raw)
  To: Jesper Juhl
  Cc: Daniel Drake, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA, David S. Miller, Ulrich Kunitz,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <9a8748490708301320o49d8e794vc5c37ffc938006f1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Thu, 2007-08-30 at 22:20 +0200, Jesper Juhl wrote:
> Ok, I must admit I didn't check with sparse since it seemed pointless
> - we usually never cast void pointers to other pointer types,
> specifically because the C language nicely guarantees that the right
> thing will happen without the cast.  Sometimes we have to cast them to
> integer types, so sure we need the cast there.   But what on earth
> makes a "zd_addr_t *" so special that we have to explicitly cast a
> "void *" to that type?

http://marc.info/?l=linux-netdev&m=117113743902549&w=1

^ permalink raw reply

* [PATCH] Don't needlessly initialize variable to NULL in zd_chip   (was: Re: [PATCH 13/30] net: Don't do pointless kmalloc return value casts in zd1211 driver)
From: Jesper Juhl @ 2007-08-30 22:30 UTC (permalink / raw)
  To: Joe Perches
  Cc: Daniel Drake, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA, David S. Miller, Ulrich Kunitz,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA, Jesper Juhl
In-Reply-To: <1188512393.6062.156.camel@localhost>

On Friday 31 August 2007 00:19:53 Joe Perches wrote:
> On Thu, 2007-08-30 at 22:20 +0200, Jesper Juhl wrote:
> > Ok, I must admit I didn't check with sparse since it seemed pointless
> > - we usually never cast void pointers to other pointer types,
> > specifically because the C language nicely guarantees that the right
> > thing will happen without the cast.  Sometimes we have to cast them to
> > integer types, so sure we need the cast there.   But what on earth
> > makes a "zd_addr_t *" so special that we have to explicitly cast a
> > "void *" to that type?
> 
> http://marc.info/?l=linux-netdev&m=117113743902549&w=1
> 

Thank you for that link Joe.

I'm not sure I agree with the __nocast, but I respect that this is
the maintainer's choice.

But, I still think the first chunk of the patch that removes the 
initial variable initialization makes sense. 
Initializing the variable to NULL is pointless since it'll never be
used before the kmalloc() call. So here's a revised patch that just
gets rid of that little detail.



No need to initialize to NULL when variable is never used before 
it's assigned the return value of a kmalloc() call.

Signed-off-by: Jesper Juhl <jesper.juhl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---

diff --git a/drivers/net/wireless/zd1211rw/zd_chip.c b/drivers/net/wireless/zd1211rw/zd_chip.c
index c39f198..30872fe 100644
--- a/drivers/net/wireless/zd1211rw/zd_chip.c
+++ b/drivers/net/wireless/zd1211rw/zd_chip.c
@@ -106,7 +106,7 @@ int zd_ioread32v_locked(struct zd_chip *chip, u32 *values, const zd_addr_t *addr
 {
 	int r;
 	int i;
-	zd_addr_t *a16 = (zd_addr_t *)NULL;
+	zd_addr_t *a16;
 	u16 *v16;
 	unsigned int count16;
 

^ permalink raw reply related

* Re: [PATCH] Don't needlessly initialize variable to NULL in zd_chip (was: Re: [PATCH 13/30] net: Don't do pointless kmalloc return value casts in zd1211 driver)
From: Randy Dunlap @ 2007-08-30 22:42 UTC (permalink / raw)
  To: Jesper Juhl
  Cc: Joe Perches, Daniel Drake, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA, David S. Miller, Ulrich Kunitz,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <200708310030.31713.jesper.juhl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On Fri, 31 Aug 2007 00:30:31 +0200 Jesper Juhl wrote:

> On Friday 31 August 2007 00:19:53 Joe Perches wrote:
> > On Thu, 2007-08-30 at 22:20 +0200, Jesper Juhl wrote:
> > > Ok, I must admit I didn't check with sparse since it seemed pointless
> > > - we usually never cast void pointers to other pointer types,
> > > specifically because the C language nicely guarantees that the right
> > > thing will happen without the cast.  Sometimes we have to cast them to
> > > integer types, so sure we need the cast there.   But what on earth
> > > makes a "zd_addr_t *" so special that we have to explicitly cast a
> > > "void *" to that type?
> > 
> > http://marc.info/?l=linux-netdev&m=117113743902549&w=1
> > 
> 
> Thank you for that link Joe.
> 
> I'm not sure I agree with the __nocast, but I respect that this is
> the maintainer's choice.
> 
> But, I still think the first chunk of the patch that removes the 
> initial variable initialization makes sense. 
> Initializing the variable to NULL is pointless since it'll never be
> used before the kmalloc() call. So here's a revised patch that just
> gets rid of that little detail.


BTW:  http://marc.info/?l=linux-wireless&m=118831813500769&w=2


> No need to initialize to NULL when variable is never used before 
> it's assigned the return value of a kmalloc() call.
> 
> Signed-off-by: Jesper Juhl <jesper.juhl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> ---
> 
> diff --git a/drivers/net/wireless/zd1211rw/zd_chip.c b/drivers/net/wireless/zd1211rw/zd_chip.c
> index c39f198..30872fe 100644
> --- a/drivers/net/wireless/zd1211rw/zd_chip.c
> +++ b/drivers/net/wireless/zd1211rw/zd_chip.c
> @@ -106,7 +106,7 @@ int zd_ioread32v_locked(struct zd_chip *chip, u32 *values, const zd_addr_t *addr
>  {
>  	int r;
>  	int i;
> -	zd_addr_t *a16 = (zd_addr_t *)NULL;
> +	zd_addr_t *a16;
>  	u16 *v16;
>  	unsigned int count16;


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply

* Re: [PATCH] Don't needlessly initialize variable to NULL in zd_chip (was: Re: [PATCH 13/30] net: Don't do pointless kmalloc return value casts in zd1211 driver)
From: Jesper Juhl @ 2007-08-30 23:04 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: Joe Perches, Daniel Drake, Linux Kernel Mailing List,
	netdev-u79uwXL29TY76Z2rM5mHXA, David S. Miller, Ulrich Kunitz,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20070830154255.6a146c21.randy.dunlap-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

On 31/08/2007, Randy Dunlap <randy.dunlap-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org> wrote:
...
>
> BTW:  http://marc.info/?l=linux-wireless&m=118831813500769&w=2
>
...

Heh, thanks Randy.

All too often patches get missed since I don't happen to include the
right magic person to Cc. So I generally take a "better to have one Cc
too many than miss one" approach when sending patches - otherwise I
just end up resending it several times and eventually have to bother
Andrew to move it through -mm.

I see the point of people not getting things twice, but too many
patches slip through the cracks already (and not just my patches,
that's a general problem) so I'd rather inconvenience a few people
with one extra email than missing the proper maintainer by not Cc'ing
the right list.    Picking the right list/set of people to Cc is hard!

(whoops, mistakenly didn't do a reply-to-all initially; sorry Randy,
now you'll get this mail twice ;)

--
Jesper Juhl <jesper.juhl-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

^ permalink raw reply

* Re: [1/1] Block device throttling [Re: Distributed storage.]
From: Daniel Phillips @ 2007-08-30 23:20 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: Jens Axboe, netdev, linux-kernel, linux-fsdevel, Peter Zijlstra
In-Reply-To: <20070829085331.GA16607@2ka.mipt.ru>

On Wednesday 29 August 2007 01:53, Evgeniy Polyakov wrote:
> Then, if of course you will want, which I doubt, you can reread
> previous mails and find that it was pointed to that race and
> possibilities to solve it way too long ago.

What still bothers me about your response is that, while you know the 
race exists and do not disagree with my example, you don't seem to see 
that that race can eventually lock up the block device by repeatedly 
losing throttle counts which are never recovered.  What prevents that?

> > --- 2.6.22.clean/block/ll_rw_blk.c	2007-07-08 16:32:17.000000000 -0700
> > +++ 2.6.22/block/ll_rw_blk.c	2007-08-24 12:07:16.000000000 -0700
> > @@ -3237,6 +3237,15 @@ end_io:
> >   */
> >  void generic_make_request(struct bio *bio)
> >  {
> > +	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
> > +
> > +	if (q && q->metric) {
> > +		int need = bio->bi_reserved = q->metric(bio);
> > +		bio->queue = q;
>
> In case you have stacked device, this entry will be rewritten and you
> will lost all your account data.

It is a weakness all right.  Well,

-	if (q && q->metric) {
+	if (q && q->metric && !bio->queue) {

which fixes that problem.  Maybe there is a better fix possible.  Thanks 
for the catch!

The original conception was that this block throttling would apply only 
to the highest level submission of the bio, the one that crosses the 
boundary between filesystem (or direct block device application) and 
block layer.  Resubmitting a bio or submitting a dependent bio from 
inside a block driver does not need to be throttled because all 
resources required to guarantee completion must have been obtained 
_before_ the bio was allowed to proceed into the block layer.

The other principle we are trying to satisfy is that the throttling 
should not be released until bio->endio, which I am not completely sure 
about with the patch as modified above.  Your earlier idea of having 
the throttle protection only cover the actual bio submission is 
interesting and may be effective in some cases, in fact it may cover 
the specific case of ddsnap.  But we don't have to look any further 
than ddraid (distributed raid) to find a case it doesn't cover - the 
additional memory allocated to hold parity data has to be reserved 
until parity data is deallocated, long after the submission completes.
So while you manage to avoid some logistical difficulties, it also looks 
like you didn't solve the general problem.

Hopefully I will be able to report on whether my patch actually works
soon, when I get back from vacation.  The mechanism in ddsnap that this
is supposed to replace is effective; it is just ugly and tricky to verify.

Regards,

Daniel
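
To make the accounting rule discussed above concrete, here is a rough
user-space sketch, not the patch itself. All names (struct
throttle_queue, struct request_sketch, throttle_submit(),
throttle_complete()) are hypothetical, and returning -1 when the budget
is exhausted is an assumption made to keep the example small; a real
implementation would presumably make the submitter wait instead. The
sketch charges a request once, at its first submission, skips the
charge when a stacked driver resubmits it (the "!bio->queue" test), and
gives the reservation back only at completion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-queue budget of throttle units. */
struct throttle_queue {
	int available;
};

/* Hypothetical stand-in for a bio carrying its own accounting data. */
struct request_sketch {
	struct throttle_queue *queue;	/* queue that charged us, NULL if none */
	int reserved;			/* units currently held */
	int cost;			/* units this request needs */
};

/* Charge at first submission only; returns 0 on success, -1 if over budget. */
static int throttle_submit(struct throttle_queue *q, struct request_sketch *rq)
{
	assert(q != NULL && rq != NULL);
	if (rq->queue != NULL)
		return 0;	/* resubmitted by a stacked driver: already charged */
	if (q->available < rq->cost)
		return -1;	/* assumption: a real version would block here */
	q->available -= rq->cost;
	rq->reserved = rq->cost;
	rq->queue = q;
	return 0;
}

/* Release the reservation at completion time, to the queue that took it. */
static void throttle_complete(struct request_sketch *rq)
{
	assert(rq != NULL);
	if (rq->queue == NULL)
		return;
	rq->queue->available += rq->reserved;
	rq->reserved = 0;
	rq->queue = NULL;
}
```

Because the request remembers which queue charged it, passing it down
to a lower device neither overwrites the accounting data nor charges it
twice, and the units return to the original queue when it completes.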

^ permalink raw reply

* Re: [PATCH 13/30] net: Don't do pointless kmalloc return value casts in zd1211 driver
From: Daniel Drake @ 2007-08-30 23:47 UTC (permalink / raw)
  To: Jesper Juhl
  Cc: Linux Kernel Mailing List, netdev, David S. Miller, Ulrich Kunitz,
	linux-wireless
In-Reply-To: <9a8748490708301320o49d8e794vc5c37ffc938006f1@mail.gmail.com>

Jesper Juhl wrote:
> What would be wrong in applying my patch that removes the cast of the
> kmalloc() return value and then also remove the "__nocast" here?

We use it as a safety measure when coding. For example the write 
register function takes an address and a value. We got one of these the 
wrong way round once, and had a non-obvious bug.

nocast and sparse help us prevent this.

Daniel
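
The swapped-argument bug described above can be illustrated without
sparse. The sketch below uses a different technique from the driver's
__nocast annotation: it wraps the address in a one-member struct so
that a plain C compiler rejects the mix-up. The names reg_addr_t,
reg_write() and reg_read() and the 8-entry register array are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: an address that cannot be silently mixed with a value. */
typedef struct {
	uint32_t raw;
} reg_addr_t;

#define NUM_REGS 8u

/* Pretend register file, indexed by address. */
static uint32_t regs[NUM_REGS];

/* Write 'value' to the register at 'addr'. */
static void reg_write(reg_addr_t addr, uint32_t value)
{
	assert(addr.raw < NUM_REGS);
	regs[addr.raw] = value;
}

/* Read back the register at 'addr'. */
static uint32_t reg_read(reg_addr_t addr)
{
	assert(addr.raw < NUM_REGS);
	return regs[addr.raw];
}
```

A call such as reg_write((reg_addr_t){ 3 }, 0xabcd) compiles, while
reg_write(0xabcd, (reg_addr_t){ 3 }) is a type error. With two plain
u32 parameters both orders would compile, which is the kind of
non-obvious bug mentioned above.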

^ permalink raw reply

* [PATCH] make _minimum_ TCP retransmission timeout configurable take 2
From: Rick Jones @ 2007-08-31  0:09 UTC (permalink / raw)
  To: netdev

Enable configuration of the minimum TCP Retransmission Timeout via
a new sysctl "tcp_rto_min" to help those whose networks (e.g. cellular)
have quite variable RTTs avoid spurious RTOs.

Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Lamont Jones <lamont@hp.com>
---

diff -r 06d7322848a3 Documentation/networking/ip-sysctl.txt
--- a/Documentation/networking/ip-sysctl.txt	Mon Aug 27 18:32:35 2007 -0700
+++ b/Documentation/networking/ip-sysctl.txt	Thu Aug 30 17:06:16 2007 -0700
@@ -339,6 +339,13 @@ tcp_rmem - vector of 3 INTEGERs: min, de
 	selected receiver buffers for TCP socket. This value does not override
 	net.core.rmem_max, "static" selection via SO_RCVBUF does not use this.
 	Default: 87380*2 bytes.
+
+tcp_rto_min - INTEGER
+	The minimum value for the TCP Retransmission Timeout, expressed
+	in milliseconds for the convenience of the user.
+	This is bounded at the low-end by TCP_RTO_MIN and by TCP_RTO_MAX at
+	the high-end.	
+	Default: 200.
 
 tcp_sack - BOOLEAN
 	Enable select acknowledgments (SACKS).
diff -r 06d7322848a3 include/net/tcp.h
--- a/include/net/tcp.h	Mon Aug 27 18:32:35 2007 -0700
+++ b/include/net/tcp.h	Thu Aug 30 17:06:16 2007 -0700
@@ -232,6 +232,7 @@ extern int sysctl_tcp_workaround_signed_
 extern int sysctl_tcp_workaround_signed_windows;
 extern int sysctl_tcp_slow_start_after_idle;
 extern int sysctl_tcp_max_ssthresh;
+extern unsigned int sysctl_tcp_rto_min;
 
 extern atomic_t tcp_memory_allocated;
 extern atomic_t tcp_sockets_allocated;
diff -r 06d7322848a3 net/ipv4/sysctl_net_ipv4.c
--- a/net/ipv4/sysctl_net_ipv4.c	Mon Aug 27 18:32:35 2007 -0700
+++ b/net/ipv4/sysctl_net_ipv4.c	Thu Aug 30 17:06:16 2007 -0700
@@ -186,6 +186,32 @@ static int strategy_allowed_congestion_c
 
 }
 
+/* if there is ever a proc_dointvec_ms_jiffies_minmax we can get rid
+   of this routine */
+
+static int proc_tcp_rto_min(ctl_table *ctl, int write, struct file *filp,
+			    void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	u32 *valp = ctl->data;
+	u32 oldval = *valp;
+	int ret;
+
+	ret = proc_dointvec_ms_jiffies(ctl, write, filp, buffer, lenp, ppos);
+	if (ret)
+		return ret;
+
+	/* some bounds checking would be in order */   
+	if (write && *valp != oldval) {
+		if (*valp < TCP_RTO_MIN || *valp > TCP_RTO_MAX) {
+			*valp = oldval;
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+
 ctl_table ipv4_table[] = {
 	{
 		.ctl_name	= NET_IPV4_TCP_TIMESTAMPS,
@@ -819,6 +845,14 @@ ctl_table ipv4_table[] = {
 		.mode		= 0644,
 		.proc_handler	= &proc_dointvec,
 	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname	= "tcp_rto_min",
+		.data		= &sysctl_tcp_rto_min,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= &proc_tcp_rto_min
+	},
 	{ .ctl_name = 0 }
 };
 
diff -r 06d7322848a3 net/ipv4/tcp_input.c
--- a/net/ipv4/tcp_input.c	Mon Aug 27 18:32:35 2007 -0700
+++ b/net/ipv4/tcp_input.c	Thu Aug 30 17:06:16 2007 -0700
@@ -91,6 +91,8 @@ int sysctl_tcp_nometrics_save __read_mos
 
 int sysctl_tcp_moderate_rcvbuf __read_mostly = 1;
 int sysctl_tcp_abc __read_mostly;
+
+unsigned int sysctl_tcp_rto_min __read_mostly = TCP_RTO_MIN;
 
 #define FLAG_DATA		0x01 /* Incoming frame contained data.		*/
 #define FLAG_WIN_UPDATE		0x02 /* Incoming ACK was a window update.	*/
@@ -616,13 +618,13 @@ static void tcp_rtt_estimator(struct soc
 			if (tp->mdev_max < tp->rttvar)
 				tp->rttvar -= (tp->rttvar-tp->mdev_max)>>2;
 			tp->rtt_seq = tp->snd_nxt;
-			tp->mdev_max = TCP_RTO_MIN;
+			tp->mdev_max = sysctl_tcp_rto_min;
 		}
 	} else {
 		/* no previous measure. */
 		tp->srtt = m<<3;	/* take the measured time to be rtt */
 		tp->mdev = m<<1;	/* make sure rto = 3*rtt */
-		tp->mdev_max = tp->rttvar = max(tp->mdev, TCP_RTO_MIN);
+		tp->mdev_max = tp->rttvar = max(tp->mdev, sysctl_tcp_rto_min);
 		tp->rtt_seq = tp->snd_nxt;
 	}
 }
@@ -851,7 +853,7 @@ static void tcp_init_metrics(struct sock
 	}
 	if (dst_metric(dst, RTAX_RTTVAR) > tp->mdev) {
 		tp->mdev = dst_metric(dst, RTAX_RTTVAR);
-		tp->mdev_max = tp->rttvar = max(tp->mdev, TCP_RTO_MIN);
+		tp->mdev_max = tp->rttvar = max(tp->mdev, sysctl_tcp_rto_min);
 	}
 	tcp_set_rto(sk);
 	tcp_bound_rto(sk);
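
As an aside on the bounds check in proc_tcp_rto_min(), here is a small
user-space sketch of the same "reject and keep the old value" behaviour.
It is an illustration under assumptions, not the handler above: the
limits are hard-coded in milliseconds (200 and 120000, standing in for
TCP_RTO_MIN and TCP_RTO_MAX, which the kernel expresses in jiffies),
rto_min_set() is a hypothetical name, and it validates before storing
rather than storing and then restoring.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed limits in milliseconds, for illustration only. */
#define RTO_MIN_MS 200u
#define RTO_MAX_MS 120000u

/*
 * Store new_ms in *valp if it lies within [RTO_MIN_MS, RTO_MAX_MS].
 * Returns 0 on success, or -EINVAL with *valp left unchanged.
 */
static int rto_min_set(unsigned int *valp, unsigned int new_ms)
{
	assert(valp != NULL);
	if (new_ms < RTO_MIN_MS || new_ms > RTO_MAX_MS)
		return -EINVAL;
	*valp = new_ms;
	return 0;
}
```

Either ordering leaves the setting untouched after a rejected write;
checking first simply avoids a window in which an out-of-range value is
visible to readers.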

^ permalink raw reply

* Re: [PATCH] make _minimum_ TCP retransmission timeout configurable take 2
From: David Miller @ 2007-08-31  0:39 UTC (permalink / raw)
  To: rick.jones2; +Cc: netdev
In-Reply-To: <200708310009.RAA04175@tardy.cup.hp.com>

From: Rick Jones <rick.jones2@hp.com>
Date: Thu, 30 Aug 2007 17:09:04 -0700 (PDT)

> Enable configuration of the minimum TCP Retransmission Timeout via
> a new sysctl "tcp_rto_min" to help those whose networks (e.g. cellular)
> have quite variable RTTs avoid spurious RTOs.
> 
> Signed-off-by: Rick Jones <rick.jones2@hp.com>
> Signed-off-by: Lamont Jones <lamont@hp.com>

Thanks for doing this work Rick.

But as John Heffner and I both mentioned, it's pretty clear we should
do this as a routing metric.  Both for handling realistic scenarios
where the sysctl doesn't work, and to help prevent misuse (example:
someone decides that it would be _totally_ _awesome_ for "Carrier
Grade Linux" to set this to 3 seconds by default in /etc/sysctl.conf
and crap like that).

^ permalink raw reply

* Re: [PATCH] make _minimum_ TCP retransmission timeout configurable take 2
From: Rick Jones @ 2007-08-31  1:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20070830.173912.79067694.davem@davemloft.net>

David Miller wrote:
> From: Rick Jones <rick.jones2@hp.com>
> Date: Thu, 30 Aug 2007 17:09:04 -0700 (PDT)
> 
> 
>>Enable configuration of the minimum TCP Retransmission Timeout via
>>a new sysctl "tcp_rto_min" to help those whose networks (e.g. cellular)
>>have quite variable RTTs avoid spurious RTOs.
>>
>>Signed-off-by: Rick Jones <rick.jones2@hp.com>
>>Signed-off-by: Lamont Jones <lamont@hp.com>
> 
> 
> Thanks for doing this work Rick.
> 
> But as John Heffner and I both mentioned, it's pretty clear we should
> do this as a routing metric.  Both for handling realistic scenarios
> where the sysctl doesn't work, and to help prevent misuse (example:
> someone decides that it would be _totally_ _awesome_ for "Carrier
> Grade Linux" to set this to 3 seconds by default in /etc/sysctl.conf
> and crap like that).

If nothing else it was worth the practice :)  I'll be happy with either 
mechanism, just wasn't sure if the jury was still out on whether making 
it a routing metric was really necessary.  I can see where it would be 
goodness if one had separate paths out of a system, one with the highly 
variable RTT and one with non-trivial loss rates, just that thus far I've
not come across any :)  I've only seen one path with high RTT 
variability and the other path with trivial loss rates.

Also, not surprisingly, the folks for whom I'm doing this are a trifle
"anxious" so I figured that simplicity was worthwhile.  Particularly if 
it was going to be the case those folks were going to be asking for 
back-ports.

Anyhow, I'll try grubbing around the source code (already doing that to 
see about writing a pet tcp cong module) but if pointers to the likely 
relevant files were available I could try to help thrash-out the routing 
metric version.  Like I said, the consumers of this are a trifle, well, 
"anxious" :)

rick

^ permalink raw reply

* Re: [Bugme-new] [Bug 8961] New: BUG triggered by oidentd in netlink code
From: Andrew Morton @ 2007-08-31  1:08 UTC (permalink / raw)
  To: netdev; +Cc: bugme-daemon, link
In-Reply-To: <bug-8961-10286@http.bugzilla.kernel.org/>

On Thu, 30 Aug 2007 07:41:31 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8961

This looks serious.

>            Summary: BUG triggered by oidentd in netlink code
>            Product: Other
>            Version: 2.5
>      KernelVersion: 2.6.22.3
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: other_other@kernel-bugs.osdl.org
>         ReportedBy: link@miggy.org
> 
> 
> Most recent kernel where this bug did not occur: 2.6.21.2
> Distribution: Debian/Etch
> Hardware Environment: uk2.net host server
> lspci says->
> 00:00.0 Host bridge: Intel Corporation 82845G/GL[Brookdale-G]/GE/PE DRAM
> Controller/Host-Hub Interface (rev 03)
> 00:02.0 VGA compatible controller: Intel Corporation 82845G/GL[Brookdale-G]/GE
> Chipset Integrated Graphics Device (rev 03)
> 00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
> USB UHCI Controller #1 (rev 02)
> 00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
> USB UHCI Controller #2 (rev 02)
> 00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M)
> USB UHCI Controller #3 (rev 02)
> 00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI
> Controller (rev 02)
> 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 82)
> 00:1f.0 ISA bridge: Intel Corporation 82801DB/DBL (ICH4/ICH4-L) LPC Interface
> Bridge (rev 02)
> 00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 02)
> 00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM
> (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 02)
> 03:06.0 RAID bus controller: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID (rev 01)
> 03:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL-8139/8139C/8139C+ (rev 10)
> Software Environment: oidentd
> Problem Description:
> Something in oidentd's use of netlink is triggering a BUG
> Steps to reproduce:
> Setup a Debian/Etch box, install oidentd, run a 2.6.22.3 kernel, ensure there
> are sufficient connections to the oidentd service and wait.
> 
> 'Oops' output:
> 
> Aug 29 23:28:44 bowl kernel: [349587.500440] BUG: unable to handle kernel NULL
> pointer dereference<1>BUG: unable to handle kernel NULL pointer dereference at
> virtual address 00000054
> Aug 29 23:28:44 bowl kernel: [349587.500454]  printing eip:
> Aug 29 23:28:45 bowl kernel: [349587.500457] c03318ae
> Aug 29 23:28:45 bowl kernel: [349587.500459] *pde = 00000000
> Aug 29 23:28:45 bowl kernel: [349587.500464] Oops: 0000 [#1]
> Aug 29 23:28:45 bowl kernel: [349587.500466] PREEMPT SMP
> Aug 29 23:28:46 bowl kernel: [349587.500474] Modules linked in: w83627hf
> hwmon_vid i2c_isa
> Aug 29 23:28:46 bowl kernel: [349587.500483] CPU:    0
> Aug 29 23:28:47 bowl kernel: [349587.500485] EIP:    0060:[<c03318ae>]    Not
> tainted VLI
> Aug 29 23:28:47 bowl kernel: [349587.500487] EFLAGS: 00010246   (2.6.22.3 #1)
> Aug 29 23:28:47 bowl kernel: [349587.500499] EIP is at netlink_rcv_skb+0xa/0x7e
> Aug 29 23:28:48 bowl kernel: [349587.500506] eax: 00000000   ebx: 00000000  
> ecx: c148d2a0   edx: c0398819
> Aug 29 23:28:48 bowl kernel: [349587.500510] esi: 00000000   edi: c0398819  
> ebp: c7a21c8c   esp: c7a21c80
> Aug 29 23:28:48 bowl kernel: [349587.500517] ds: 007b   es: 007b   fs: 00d8 
> gs: 0033  ss: 0068
> Aug 29 23:28:50 bowl kernel: [349587.500521] Process oidentd (pid: 17943,
> ti=c7a20000 task=cee231c0 task.ti=c7a20000)
> Aug 29 23:28:51 bowl kernel: [349587.500527] Stack: 00000000 c7a21cac f7c8ba78
> c7a21ca4 c0331962 c0398819 f7c8ba00 0000004c
> Aug 29 23:28:52 bowl kernel: [349587.500542]        f736f000 c7a21cb4 c03988e3
> 00000001 f7c8ba00 c7a21cc4 c03312a5 0000004c
> Aug 29 23:28:54 bowl kernel: [349587.500558]        f7c8ba00 c7a21cd4 c0330681
> f7c8ba00 e4695280 c7a21d00 c03307c6 7fffffff
> Aug 29 23:28:54 bowl kernel: [349587.500578] Call Trace:
> Aug 29 23:28:54 bowl kernel: [349587.500581]  [<c010361a>]
> show_trace_log_lvl+0x1c/0x33
> Aug 29 23:28:55 bowl kernel: [349587.500591]  [<c01036d4>]
> show_stack_log_lvl+0x8d/0xaa
> Aug 29 23:28:57 bowl kernel: [349587.500595]  [<c010390e>]
> show_registers+0x1cb/0x321
> Aug 29 23:28:59 bowl kernel: [349587.500604]  [<c0103bff>] die+0x112/0x1e1
> Aug 29 23:29:00 bowl kernel: [349587.500607]  [<c01132d2>]
> do_page_fault+0x229/0x565
> Aug 29 23:29:05 bowl kernel: [349587.500618]  [<c03c8d3a>] error_code+0x72/0x78
> Aug 29 23:29:07 bowl kernel: [349587.500625]  [<c0331962>]
> netlink_run_queue+0x40/0x76
> Aug 29 23:29:07 bowl kernel: [349587.500632]  [<c03988e3>]
> inet_diag_rcv+0x1f/0x2c
> Aug 29 23:29:07 bowl kernel: [349587.500639]  [<c03312a5>]
> netlink_data_ready+0x57/0x59
> Aug 29 23:29:08 bowl kernel: [349587.500643]  [<c0330681>]
> netlink_sendskb+0x24/0x45
> Aug 29 23:29:08 bowl kernel: [349587.500651]  [<c03307c6>]
> netlink_unicast+0x100/0x116
> Aug 29 23:29:08 bowl kernel: [349587.500656]  [<c0330f83>]
> netlink_sendmsg+0x1c2/0x280
> Aug 29 23:29:09 bowl kernel: [349587.500664]  [<c02fcce9>]
> sock_sendmsg+0xba/0xd5
> Aug 29 23:29:12 bowl kernel: [349587.500671]  [<c02fe4d1>]
> sys_sendmsg+0x17b/0x1e8
> Aug 29 23:29:12 bowl kernel: [349587.500676]  [<c02fe92d>]
> sys_socketcall+0x230/0x24d
> Aug 29 23:29:13 bowl kernel: [349587.500684]  [<c01028d2>] syscall_call+0x7/0xb
> Aug 29 23:29:13 bowl kernel: [349587.500691]  =======================
> Aug 29 23:29:13 bowl kernel: [349587.500693] Code: f0 ff 4e 18 0f 94 c0 84 c0
> 0f 84 66 ff ff ff 89 f0 e8 86 e2 fc ff e9 5a ff ff ff f0 ff 40 10 eb be 55 89
> e5 57 89 d7 56 89 c6 53 <8b> 50 54 83 fa 10 72 55 8b 9e 9c 00 00 00 31 c9 8b 03
> 83 f8 0f
> Aug 29 23:29:13 bowl kernel: [349587.500770] EIP: [<c03318ae>]
> netlink_rcv_skb+0xa/0x7e SS:ESP 0068:c7a21c80
> Aug 29 23:29:13 bowl kernel: [349587.501851]  at virtual address 00000054
> Aug 29 23:29:13 bowl kernel: [349587.501913]  printing eip:
> Aug 29 23:29:14 bowl kernel: [349587.501963] c03318ae
> Aug 29 23:29:14 bowl kernel: [349587.502022] *pde = 00000000
> Aug 29 23:29:15 bowl kernel: [349587.502079] Oops: 0000 [#2]
> Aug 29 23:29:15 bowl kernel: [349587.502136] PREEMPT SMP
> Aug 29 23:29:15 bowl kernel: [349587.502271] Modules linked in: w83627hf
> hwmon_vid i2c_isa
> Aug 29 23:29:16 bowl kernel: [349587.502489] CPU:    1
> Aug 29 23:29:16 bowl kernel: [349587.502490] EIP:    0060:[<c03318ae>]    Not
> tainted VLI
> Aug 29 23:29:17 bowl kernel: [349587.502491] EFLAGS: 00010246   (2.6.22.3 #1)
> Aug 29 23:29:17 bowl kernel: [349587.502647] EIP is at netlink_rcv_skb+0xa/0x7e
> Aug 29 23:29:17 bowl kernel: [349587.502691] eax: 00000000   ebx: 00000000  
> ecx: c14346a0   edx: c0398819
> Aug 29 23:29:17 bowl kernel: [349587.502737] esi: 00000000   edi: c0398819  
> ebp: e37f3c8c   esp: e37f3c80
> Aug 29 23:29:17 bowl kernel: [349587.502783] ds: 007b   es: 007b   fs: 00d8 
> gs: 0033  ss: 0068
> Aug 29 23:29:17 bowl kernel: [349587.502828] Process oidentd (pid: 17945,
> ti=e37f2000 task=dc69e6e0 task.ti=e37f2000)
> Aug 29 23:29:18 bowl kernel: [349587.502875] Stack: 00000000 e37f3cac f7c8ba78
> e37f3ca4 c0331962 c0398819 f7c8ba00 0000004c
> Aug 29 23:29:18 bowl kernel: [349587.503198]        f736f000 e37f3cb4 c03988e3
> 00000001 f7c8ba00 e37f3cc4 c03312a5 0000004c
> Aug 29 23:29:18 bowl kernel: [349587.503519]        f7c8ba00 e37f3cd4 c0330681
> f7c8ba00 e1a35a80 e37f3d00 c03307c6 7fffffff
> Aug 29 23:29:18 bowl kernel: [349587.503839] Call Trace:
> Aug 29 23:29:18 bowl kernel: [349587.503917]  [<c010361a>]
> show_trace_log_lvl+0x1c/0x33
> Aug 29 23:29:18 bowl kernel: [349587.503994]  [<c01036d4>]
> show_stack_log_lvl+0x8d/0xaa
> Aug 29 23:29:18 bowl kernel: [349587.504067]  [<c010390e>]
> show_registers+0x1cb/0x321
> Aug 29 23:29:18 bowl kernel: [349587.504142]  [<c0103bff>] die+0x112/0x1e1
> Aug 29 23:29:18 bowl kernel: [349587.504215]  [<c01132d2>]
> do_page_fault+0x229/0x565
> Aug 29 23:29:18 bowl kernel: [349587.504290]  [<c03c8d3a>] error_code+0x72/0x78
> Aug 29 23:29:18 bowl kernel: [349587.504366]  [<c0331962>]
> netlink_run_queue+0x40/0x76
> Aug 29 23:29:18 bowl kernel: [349587.504440]  [<c03988e3>]
> inet_diag_rcv+0x1f/0x2c
> Aug 29 23:29:18 bowl kernel: [349587.504514]  [<c03312a5>]
> netlink_data_ready+0x57/0x59
> Aug 29 23:29:18 bowl kernel: [349587.504589]  [<c0330681>]
> netlink_sendskb+0x24/0x45
> Aug 29 23:29:18 bowl kernel: [349587.504662]  [<c03307c6>]
> netlink_unicast+0x100/0x116
> Aug 29 23:29:19 bowl kernel: [349587.504736]  [<c0330f83>]
> netlink_sendmsg+0x1c2/0x280
> Aug 29 23:29:19 bowl kernel: [349587.504809]  [<c02fcce9>]
> sock_sendmsg+0xba/0xd5
> Aug 29 23:29:19 bowl kernel: [349587.504885]  [<c02fe4d1>]
> sys_sendmsg+0x17b/0x1e8
> Aug 29 23:29:19 bowl kernel: [349587.504958]  [<c02fe92d>]
> sys_socketcall+0x230/0x24d
> Aug 29 23:29:19 bowl kernel: [349587.505032]  [<c01028d2>] syscall_call+0x7/0xb
> Aug 29 23:29:19 bowl kernel: [349587.505105]  =======================
> Aug 29 23:29:19 bowl kernel: [349587.505146] Code: f0 ff 4e 18 0f 94 c0 84 c0
> 0f 84 66 ff ff ff 89 f0 e8 86 e2 fc ff e9 5a ff ff ff f0 ff 40 10 eb be 55 89
> e5 57 89 d7 56 89 c6 53 <8b> 50 54 83 fa 10 72 55 8b 9e 9c 00 00 00 31 c9 8b 03
> 83 f8 0f
> Aug 29 23:29:19 bowl kernel: [349587.507160] EIP: [<c03318ae>]
> netlink_rcv_skb+0xa/0x7e SS:ESP 0068:e37f3c80
> Aug 29 23:43:48 bowl kernel: [350485.786725] BUG: unable to handle kernel NULL
> pointer dereference<1>BUG: unable to handle kernel NULL pointer dereference at
> virtual address 00000054
> Aug 29 23:43:48 bowl kernel: [350485.786739]  printing eip:
> Aug 29 23:43:48 bowl kernel: [350485.786743] c03318ae
> Aug 29 23:43:48 bowl kernel: [350485.786745] *pde = 00000000
> Aug 29 23:43:48 bowl kernel: [350485.786750] Oops: 0000 [#3]
> Aug 29 23:43:49 bowl kernel: [350485.786751] PREEMPT SMP
> Aug 29 23:43:49 bowl kernel: [350485.786755] Modules linked in: w83627hf
> hwmon_vid i2c_isa
> Aug 29 23:43:49 bowl kernel: [350485.786763] CPU:    0
> Aug 29 23:43:49 bowl kernel: [350485.786765] EIP:    0060:[<c03318ae>]    Not
> tainted VLI
> Aug 29 23:43:49 bowl kernel: [350485.786766] EFLAGS: 00010246   (2.6.22.3 #1)
> Aug 29 23:43:49 bowl kernel: [350485.786781] EIP is at netlink_rcv_skb+0xa/0x7e
> Aug 29 23:43:49 bowl kernel: [350485.786785] eax: 00000000   ebx: 00000000  
> ecx: c148d2a0   edx: c0398819
> Aug 29 23:43:49 bowl kernel: [350485.786789] esi: 00000000   edi: c0398819  
> ebp: dee05c8c   esp: dee05c80
> Aug 29 23:43:50 bowl kernel: [350485.786792] ds: 007b   es: 007b   fs: 00d8 
> gs: 0033  ss: 0068
> Aug 29 23:43:50 bowl kernel: [350485.786795] Process oidentd (pid: 21495,
> ti=dee04000 task=dc69e6e0 task.ti=dee04000)
> Aug 29 23:43:50 bowl kernel: [350485.786798] Stack: 00000000 dee05cac f7c8ba78
> dee05ca4 c0331962 c0398819 f7c8ba00 0000004c
> Aug 29 23:43:50 bowl kernel: [350485.786807]        f736f000 dee05cb4 c03988e3
> 00000001 f7c8ba00 dee05cc4 c03312a5 0000004c
> Aug 29 23:43:51 bowl kernel: [350485.786816]        f7c8ba00 dee05cd4 c0330681
> f7c8ba00 e4695980 dee05d00 c03307c6 7fffffff
> Aug 29 23:43:51 bowl kernel: [350485.786829] Call Trace:
> Aug 29 23:43:51 bowl kernel: [350485.786832]  [<c010361a>]
> show_trace_log_lvl+0x1c/0x33
> Aug 29 23:43:51 bowl kernel: [350485.786839]  [<c01036d4>]
> show_stack_log_lvl+0x8d/0xaa
> Aug 29 23:43:52 bowl kernel: [350485.786844]  [<c010390e>]
> show_registers+0x1cb/0x321
> Aug 29 23:43:52 bowl kernel: [350485.786848]  [<c0103bff>] die+0x112/0x1e1
> Aug 29 23:43:52 bowl kernel: [350485.786852]  [<c01132d2>]
> do_page_fault+0x229/0x565
> Aug 29 23:43:52 bowl kernel: [350485.786859]  [<c03c8d3a>] error_code+0x72/0x78
> Aug 29 23:43:52 bowl kernel: [350485.786870]  [<c0331962>]
> netlink_run_queue+0x40/0x76
> Aug 29 23:43:52 bowl kernel: [350485.786875]  [<c03988e3>]
> inet_diag_rcv+0x1f/0x2c
> Aug 29 23:43:52 bowl kernel: [350485.786880]  [<c03312a5>]
> netlink_data_ready+0x57/0x59
> Aug 29 23:43:53 bowl kernel: [350485.786885]  [<c0330681>]
> netlink_sendskb+0x24/0x45
> Aug 29 23:43:53 bowl kernel: [350485.786889]  [<c03307c6>]
> netlink_unicast+0x100/0x116
> Aug 29 23:43:53 bowl kernel: [350485.786893]  [<c0330f83>]
> netlink_sendmsg+0x1c2/0x280
> Aug 29 23:43:53 bowl kernel: [350485.786898]  [<c02fcce9>]
> sock_sendmsg+0xba/0xd5
> Aug 29 23:43:53 bowl kernel: [350485.786909]  [<c02fe4d1>]
> sys_sendmsg+0x17b/0x1e8
> Aug 29 23:43:53 bowl kernel: [350485.786914]  [<c02fe92d>]
> sys_socketcall+0x230/0x24d
> Aug 29 23:43:53 bowl kernel: [350485.786919]  [<c01028d2>] syscall_call+0x7/0xb
> Aug 29 23:43:53 bowl kernel: [350485.786923]  =======================
> Aug 29 23:43:53 bowl kernel: [350485.786926] Code: f0 ff 4e 18 0f 94 c0 84 c0
> 0f 84 66 ff ff ff 89 f0 e8 86 e2 fc ff e9 5a ff ff ff f0 ff 40 10 eb be 55 89
> e5 57 89 d7 56 89 c6 53 <8b> 50 54 83 fa 10 72 55 8b 9e 9c 00 00 00 31 c9 8b 03
> 83 f8 0f
> Aug 29 23:43:53 bowl kernel: [350485.786976] EIP: [<c03318ae>]
> netlink_rcv_skb+0xa/0x7e SS:ESP 0068:dee05c80
> Aug 29 23:43:53 bowl kernel: [350485.790485]  at virtual address 00000054
> Aug 29 23:43:53 bowl kernel: [350485.790557]  printing eip:
> Aug 29 23:43:53 bowl kernel: [350485.790613] c03318ae
> Aug 29 23:43:53 bowl kernel: [350485.790665] *pde = 00000000
> Aug 29 23:43:53 bowl kernel: [350485.790727] Oops: 0000 [#4]
> Aug 29 23:43:53 bowl kernel: [350485.790779] PREEMPT SMP
> Aug 29 23:43:53 bowl kernel: [350485.790907] Modules linked in: w83627hf
> hwmon_vid i2c_isa
> Aug 29 23:43:53 bowl kernel: [350485.791103] CPU:    1
> Aug 29 23:43:53 bowl kernel: [350485.791104] EIP:    0060:[<c03318ae>]    Not
> tainted VLI
> Aug 29 23:43:54 bowl kernel: [350485.791106] EFLAGS: 00010246   (2.6.22.3 #1)
> Aug 29 23:43:54 bowl kernel: [350485.791241] EIP is at netlink_rcv_skb+0xa/0x7e
> Aug 29 23:43:56 bowl kernel: [350485.791286] eax: 00000000   ebx: 00000000  
> ecx: c153a920   edx: c0398819
> Aug 29 23:43:57 bowl kernel: [350485.791336] esi: 00000000   edi: c0398819  
> ebp: eaa85c8c   esp: eaa85c80
> Aug 29 23:43:57 bowl kernel: [350485.791389] ds: 007b   es: 007b   fs: 00d8 
> gs: 0033  ss: 0068
> Aug 29 23:43:57 bowl kernel: [350485.791441] Process oidentd (pid: 21497,
> ti=eaa84000 task=caca0330 task.ti=eaa84000)
> Aug 29 23:43:57 bowl kernel: [350485.791492] Stack: 00000000 eaa85cac f7c8ba78
> eaa85ca4 c0331962 c0398819 f7c8ba00 0000004c
> Aug 29 23:43:57 bowl kernel: [350485.791825]        f736f000 eaa85cb4 c03988e3
> 00000001 f7c8ba00 eaa85cc4 c03312a5 0000004c
> Aug 29 23:43:57 bowl kernel: [350485.792158]        f7c8ba00 eaa85cd4 c0330681
> f7c8ba00 e9d49180 eaa85d00 c03307c6 7fffffff
> Aug 29 23:43:57 bowl kernel: [350485.792491] Call Trace:
> Aug 29 23:43:57 bowl kernel: [350485.792572]  [<c010361a>]
> show_trace_log_lvl+0x1c/0x33
> Aug 29 23:43:57 bowl kernel: [350485.792653]  [<c01036d4>]
> show_stack_log_lvl+0x8d/0xaa
> Aug 29 23:43:57 bowl kernel: [350485.792731]  [<c010390e>]
> show_registers+0x1cb/0x321
> Aug 29 23:43:58 bowl kernel: [350485.792808]  [<c0103bff>] die+0x112/0x1e1
> Aug 29 23:43:58 bowl kernel: [350485.792885]  [<c01132d2>]
> do_page_fault+0x229/0x565
> Aug 29 23:43:58 bowl kernel: [350485.792963]  [<c03c8d3a>] error_code+0x72/0x78
> Aug 29 23:43:58 bowl kernel: [350485.793043]  [<c0331962>]
> netlink_run_queue+0x40/0x76
> Aug 29 23:43:58 bowl kernel: [350485.793123]  [<c03988e3>]
> inet_diag_rcv+0x1f/0x2c
> Aug 29 23:43:58 bowl kernel: [350485.793208]  [<c03312a5>]
> netlink_data_ready+0x57/0x59
> Aug 29 23:43:58 bowl kernel: [350485.793290]  [<c0330681>]
> netlink_sendskb+0x24/0x45
> Aug 29 23:43:58 bowl kernel: [350485.793373]  [<c03307c6>]
> netlink_unicast+0x100/0x116
> Aug 29 23:43:59 bowl kernel: [350485.793455]  [<c0330f83>]
> netlink_sendmsg+0x1c2/0x280
> Aug 29 23:43:59 bowl kernel: [350485.793538]  [<c02fcce9>]
> sock_sendmsg+0xba/0xd5
> Aug 29 23:43:59 bowl kernel: [350485.793641]  [<c02fe4d1>]
> sys_sendmsg+0x17b/0x1e8
> Aug 29 23:43:59 bowl kernel: [350485.793732]  [<c02fe92d>]
> sys_socketcall+0x230/0x24d
> Aug 29 23:43:59 bowl kernel: [350485.793822]  [<c01028d2>] syscall_call+0x7/0xb
> Aug 29 23:44:00 bowl kernel: [350485.793919]  =======================
> Aug 29 23:44:00 bowl kernel: [350485.793964] Code: f0 ff 4e 18 0f 94 c0 84 c0
> 0f 84 66 ff ff ff 89 f0 e8 86 e2 fc ff e9 5a ff ff ff f0 ff 40 10 eb be 55 89
> e5 57 89 d7 56 89 c6 53 <8b> 50 54 83 fa 10 72 55 8b 9e 9c 00 00 00 31 c9 8b 03
> 83 f8 0f
> Aug 29 23:44:00 bowl kernel: [350485.796458] EIP: [<c03318ae>]
> netlink_rcv_skb+0xa/0x7e SS:ESP 0068:eaa85c80
> 
> 
> -- 
> Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.

^ permalink raw reply

* wither bounds checking for networking sysctls
From: Rick Jones @ 2007-08-31  1:09 UTC (permalink / raw)
  To: Linux Network Development list

While messing about with "sysctl_tcp_rto_min" I went back and forth a 
bit as to whether there should have been bounds checking (as did some of 
the folks who did some internal review for me).  That leads to the 
question - is it considered worthwhile to add a bit more bounds checking 
to sundry networking sysctls?

rick jones
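
To make the question concrete, here is a minimal user-space sketch of one
form such bounds checking could take: an out-of-range write is rejected and
the old value is kept, which is roughly the behaviour of the kernel's
proc_dointvec_minmax handler. The struct, the function name, and the limits
used in the usage note are hypothetical and do not come from any posted patch.

```c
#include <errno.h>

/* Hypothetical model of a bounded integer tunable (not kernel code). */
struct bounded_tunable {
	int value;	/* current setting */
	int min;	/* smallest value accepted */
	int max;	/* largest value accepted */
};

/*
 * Store new_value only if it lies within [min, max].
 * Returns 0 on success, or -EINVAL and leaves the old value untouched.
 */
static int bounded_tunable_set(struct bounded_tunable *t, int new_value)
{
	if (new_value < t->min || new_value > t->max)
		return -EINVAL;
	t->value = new_value;
	return 0;
}
```

With a tunable initialised as { 200, 1, 120000 } (assumed millisecond
limits), setting 3000 succeeds while setting 0 or 120001 fails with -EINVAL.
Rejecting rather than clamping is a design choice of this sketch: the
administrator learns that the request was not honoured.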

^ permalink raw reply

* Re: [Lksctp-developers] SCTP: Fix dead loop while received unexpected chunk with length set to zero
From: Wei Yongjun @ 2007-08-31  2:38 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: lksctp-developers, netdev
In-Reply-To: <46D6C9F2.5020702@hp.com>

Vlad Yasevich wrote:
> Wei Yongjun wrote:
>   
>> Vlad Yasevich wrote:
>>     
>>> Wei Yongjun wrote:
>>>  
>>>       
>>>> Vlad Yasevich wrote:
>>>>    
>>>>         
>>>>> NACK
>>>>>
>>>>> Section 8.4:
>>>>>
>>>>>    An SCTP packet is called an "out of the blue" (OOTB) packet if it is
>>>>>    correctly formed (i.e., passed the receiver's CRC32c check; see
>>>>>    Section 6.8), but the receiver is not able to identify the
>>>>>    association to which this packet belongs.
>>>>>
>>>>>
>>>>> I would argue that the packet is not correctly formed in this case
>>>>> and deserves a protocol violation ABORT in return.
>>>>>
>>>>> -vlad
>>>>>         
>>>>>           
>>>> Per your comment, the patch has been changed.
>>>> It has been split in two: this mail resolves the dead loop problem,
>>>> and the other, coming in a separate mail, discards a partial chunk
>>>> whose chunk length is set to zero.
>>>>     
>>>>         
>>> I am starting to question the entire OOTB packet handling.  There are way
>>> too many functions that do almost the same thing and all handle OOTB a
>>> little differently.
>>>
>>> sctp_sf_do_9_2_reshutack() is also called during sctp_sf_do_dupcook_a()
>>> processing, so checking for INIT chunk is wrong.  Checking for just the
>>> chunkhdr_t should be enough.
>>>   
>>>       
>> This has been changed.
>>     
>>> sctp_sf_tabort_8_4_8 is used directly as well (not just through the state
>>> machine).  Not sure if the header verification is appropriate.
>>>   
>>>       
>> It is needed, because sctp_sf_tabort_8_4_8() is called to handle an OOTB
>> packet before the header length is checked.
>>     
>
> But now we are doing the same thing twice (and this is not the only place).
> I know I am being really picky here, but I am starting to think the ootb handling
> is a mess and I really don't want to add to the mess.
>
> Until I (or someone else) prove that it's not a mess or fix it, I am going
> to hold off on these patches.
>
> Feel free to go through the spec and fix all the OOTB handling.
>
> Thanks
> -vlad
>   
Hi vlad:

I think this problem must be fixed as soon as possible, because it is
a security hole. It makes the SCTP module unsafe: once the module is
loaded, a single malformed SCTP packet can put my system into a dead loop
in which it responds to nothing, and the console freezes too. The effect
is the same as a kernel panic, and it can also be used to attack other
machines by sending too many ABORT packets.

Maybe someone can provide a better patch for this problem. I'd be
pleased to see someone resolve it.

Regards
Wei Yongjun
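
The failure mode described above can be illustrated with a small user-space
sketch. It is not the lksctp code; the function name and buffer layout are
hypothetical. It walks a buffer of SCTP-style chunks (1-byte type, 1-byte
flags, 2-byte big-endian length that includes the 4-byte header, padded to
a 4-byte boundary) and shows the check that keeps a zero length from
stalling the loop.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_HDR_LEN 4	/* type (1) + flags (1) + length (2) */

/*
 * Count the chunks in buf[0..len). Returns the count, or -1 if a chunk
 * header is truncated or carries an impossible length.
 */
static int count_chunks(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		size_t chunk_len, padded;

		if (len - off < CHUNK_HDR_LEN)
			return -1;	/* truncated header */

		/* The length field is big-endian and includes the header. */
		chunk_len = ((size_t)buf[off + 2] << 8) | buf[off + 3];

		/*
		 * Without this check a length of 0 would leave off
		 * unchanged and the loop would never terminate.
		 */
		if (chunk_len < CHUNK_HDR_LEN || chunk_len > len - off)
			return -1;

		padded = (chunk_len + 3) & ~(size_t)3;	/* 4-byte padding */
		if (padded > len - off)
			padded = len - off;	/* last chunk may lack padding */
		off += padded;
		n++;
	}
	return n;
}
```

A buffer holding the single header { 1, 0, 0, 0 } (length field zero) is
reported as malformed instead of being walked forever. Whether the right
reaction in SCTP is to discard the packet or to answer with an ABORT is the
open question in this thread, and the sketch takes no position on it.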


^ permalink raw reply

* Re: wither bounds checking for networking sysctls
From: Stephen Hemminger @ 2007-08-31  3:59 UTC (permalink / raw)
  To: Rick Jones; +Cc: Linux Network Development list
In-Reply-To: <46D76A3D.9090207@hp.com>

On Thu, 30 Aug 2007 18:09:17 -0700
Rick Jones <rick.jones2@hp.com> wrote:

> While messing about with "sysctl_tcp_rto_min" I went back and forth a 
> bit as to whether there should have been bounds checking (as did some of 
> the folks who did some internal review for me).  That leads to the 
> question - is it considered worthwhile to add a bit more bounds checking 
> to sundry networking sysctls?
> 
> rick jones

IMHO, as long as no value from sysctl can crash the kernel, we
should let it go. Enforcing RFC policy or inter-dependencies seems
like a useless exercise.

^ permalink raw reply

* [ofa-general] Re: [PATCH RFC] iw_cxgb3: Support "iwarp-only" interfaces to avoid 4-tuple conflicts with the host stack.
From: Roland Dreier @ 2007-08-31  4:27 UTC (permalink / raw)
  To: Steve Wise; +Cc: netdev, ewg, general, linux-kernel
In-Reply-To: <1187905185.5547.13.camel@stevo-desktop>

 > The sysadmin creates "for iwarp use only" alias interfaces of the form
 > "devname:iw*" where devname is the native interface name (eg eth0) for the
 > iwarp netdev device.  The alias label can be anything starting with "iw".
 > The "iw" immediately after the ':' is the key used by the iwarp driver.

What's wrong with my suggestion of having the iwarp driver create an
"iwX" interface to go with the normal "ethX" interface?  It seems
simpler to me, and there's a somewhat similar precedent with how
mac80211 devices create both wlan0 and wmaster0 interfaces.

 - R.

^ permalink raw reply

* Re: [PATCH] make _minimum_ TCP retransmission timeout configurable take 2
From: John Heffner @ 2007-08-31  4:52 UTC (permalink / raw)
  To: Rick Jones; +Cc: netdev
In-Reply-To: <46D769C1.8090808@hp.com>

Rick Jones wrote:
> Like I said, the consumers of this are a trifle, well, 
> "anxious" :)

Just curious, did you or this customer try with F-RTO enabled?  Or is 
this case you're dealing with truly hopeless?

   -John

^ permalink raw reply

* Re: [PATCH] make _minimum_ TCP retransmission timeout configurable take 2
From: David Miller @ 2007-08-31  5:09 UTC (permalink / raw)
  To: rick.jones2; +Cc: netdev
In-Reply-To: <46D769C1.8090808@hp.com>

From: Rick Jones <rick.jones2@hp.com>
Date: Thu, 30 Aug 2007 18:07:13 -0700

> Anyhow, I'll try grubbing around the source code (already doing that to 
> see about writing a pet tcp cong module) but if pointers to the likely 
> relevant files were available I could try to help thrash-out the routing 
> metric version.  Like I said, the consumers of this are a trifle, well, 
> "anxious" :)

The change is actually a lot simpler than the sysctl version.

In fact it borders on trivial :-)

Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index c91476c..dff3192 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -351,6 +351,8 @@ enum
 #define RTAX_INITCWND RTAX_INITCWND
 	RTAX_FEATURES,
 #define RTAX_FEATURES RTAX_FEATURES
+	RTAX_RTO_MIN,
+#define RTAX_RTO_MIN RTAX_RTO_MIN
 	__RTAX_MAX
 };
 
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9785df3..1ee7212 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -555,6 +555,16 @@ static void tcp_event_data_recv(struct sock *sk, struct sk_buff *skb)
 		tcp_grow_window(sk, skb);
 }
 
+static u32 tcp_rto_min(struct sock *sk)
+{
+	struct dst_entry *dst = __sk_dst_get(sk);
+	u32 rto_min = TCP_RTO_MIN;
+
+	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN))
+		rto_min = dst->metrics[RTAX_RTO_MIN-1];
+	return rto_min;
+}
+
 /* Called to compute a smoothed rtt estimate. The data fed to this
  * routine either comes from timestamps, or from segments that were
  * known _not_ to have been retransmitted [see Karn/Partridge
@@ -616,13 +626,13 @@ static void tcp_rtt_estimator(struct sock *sk, const __u32 mrtt)
 			if (tp->mdev_max < tp->rttvar)
 				tp->rttvar -= (tp->rttvar-tp->mdev_max)>>2;
 			tp->rtt_seq = tp->snd_nxt;
-			tp->mdev_max = TCP_RTO_MIN;
+			tp->mdev_max = tcp_rto_min(sk);
 		}
 	} else {
 		/* no previous measure. */
 		tp->srtt = m<<3;	/* take the measured time to be rtt */
 		tp->mdev = m<<1;	/* make sure rto = 3*rtt */
-		tp->mdev_max = tp->rttvar = max(tp->mdev, TCP_RTO_MIN);
+		tp->mdev_max = tp->rttvar = max(tp->mdev, tcp_rto_min(sk));
 		tp->rtt_seq = tp->snd_nxt;
 	}
 }
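
For readers without a kernel tree at hand, the lookup logic of the helper
can be modelled in user space roughly as follows. This is a sketch under
stated assumptions, not kernel code: every model_ and MODEL_ name is a
hypothetical stand-in, the numeric constants are illustrative, and the
default floor is expressed in milliseconds rather than jiffies. The idea it
shows is that the per-route value is used only when the metric is marked as
locked, and the compiled-in default applies otherwise. The sketch also
tolerates a missing route entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; values are illustrative. */
enum { MODEL_RTAX_LOCK = 1, MODEL_RTAX_RTO_MIN = 13, MODEL_RTAX_MAX = 14 };
#define MODEL_TCP_RTO_MIN 200u	/* default floor, in ms for this sketch */

struct model_dst {
	uint32_t metrics[MODEL_RTAX_MAX - 1];	/* metric N lives at index N-1 */
};

/* The lock metric is a bitmask: bit N set means metric N is locked. */
static int model_metric_locked(const struct model_dst *dst, int metric)
{
	return (dst->metrics[MODEL_RTAX_LOCK - 1] >> metric) & 1;
}

/* Return the per-route minimum RTO if locked, else the global default. */
static uint32_t model_rto_min(const struct model_dst *dst)
{
	uint32_t rto_min = MODEL_TCP_RTO_MIN;

	/* A socket may have no cached route, so tolerate NULL here. */
	if (dst && model_metric_locked(dst, MODEL_RTAX_RTO_MIN))
		rto_min = dst->metrics[MODEL_RTAX_RTO_MIN - 1];
	return rto_min;
}
```

Storing 3000 in the RTO_MIN slot has no effect until the matching lock bit
is set; after that, model_rto_min() returns 3000 for that route only.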



^ permalink raw reply related

* Re: [GIT PULL] SCTP updates
From: David Miller @ 2007-08-31  5:14 UTC (permalink / raw)
  To: vladislav.yasevich; +Cc: lksctp-developers, netdev
In-Reply-To: <11885091901093-git-send-email-vladislav.yasevich@hp.com>

From: Vlad Yasevich <vladislav.yasevich@hp.com>
Date: Thu, 30 Aug 2007 17:26:30 -0400

> that are available in the git repository at:
> 
>   master.kernel.org:/pub/scm/linux/kernel/git/vxy/lksctp-dev.git master

Pulled, thanks a lot Vlad.

^ permalink raw reply
