Netdev List


* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-29 13:30 UTC (permalink / raw)
  To: Badalian Vyacheslav; +Cc: netdev
In-Reply-To: <46D56C60.3060702@bigtelecom.ru>

On Wed, Aug 29, 2007 at 04:53:52PM +0400, Badalian Vyacheslav wrote:
...
> We have had this kernel panic (when deleting HTB) on all 2.6.18-x versions.
> On older kernels (2.6.x) we had another panic (when deleting a tc filter)...
> In summary, we have had TC panics for a year now ;) The sysctl option "reboot on panic"

I'm not sure: do you mean it was less often? Did you try to report it
here? (Delete HTB: qdisc or classes?)

> saves us. We have now brought up 2 backup computers and can try any patches to fix
> this problem.
> 
> Also, on 2.6.22 we get a strange death: black screen, no response to the keyboard,
> no info in netconsole, hard disk LED solid red. "Black Dead"
> 

Yes, with all black it could be harder... Maybe 'set -x' at the
beginning (after #!/bin/sh line) of a script could manage to save
something before reboot or send with netconsole (but there could be
a lot of this with a large script...). Netconsole could be troublesome
too. One HTB deadlock problem during similar deleting was fixed in
2.6.23-rc (HTB timer problem) but the log was different. Anyway,
we probably need some more information (and trying).

Jarek P.

^ permalink raw reply

* [PATCH] Remove write-only variable from pktgen_thread
From: Pavel Emelyanov @ 2007-08-29 13:22 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oleg Nesterov, Sukadev Bhattiprolu, Linux Containers,
	Linux Kernel Mailing List, Linux Netdev List

The pktgen_thread.pid is set to current->pid and is never used
after this. So remove it altogether.

Found during isolating the explicit pid/tgid usage.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>

---

diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 3a3154e..93695c2 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -380,7 +380,6 @@ struct pktgen_thread {
 	/* Field for thread to receive "posted" events terminate, stop ifs etc. */
 
 	u32 control;
-	int pid;
 	int cpu;
 
 	wait_queue_head_t queue;
@@ -3462,8 +3461,6 @@ static int pktgen_thread_worker(void *ar
 
 	init_waitqueue_head(&t->queue);
 
-	t->pid = current->pid;
-
 	pr_debug("pktgen: starting pktgen/%d:  pid=%d\n", cpu, task_pid_nr(current));
 
 	max_before_softirq = t->max_before_softirq;

^ permalink raw reply related

* Re: [PATCH 4/5] Net: ath5k, license is GPLv2
From: Xavier Bestel @ 2007-08-29 13:13 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Johannes Berg, linville, linux-kernel, linux-wireless, netdev
In-Reply-To: <4af2d03a0708290335x1a9aa38o566e9d416505f280@mail.gmail.com>

On Wed, 2007-08-29 at 08:35 -0200, Jiri Slaby wrote:
> On 8/29/07, Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Tue, 2007-08-28 at 12:00 -0400, Jiri Slaby wrote:
> >
> > > The files are available only under GPLv2 since now.
> >
> > Since the BSD people are already getting upset about (for various
> > reasons among which seem to be a clear non-understanding) I'd suggest
> > changing it to:
> 
> yes, please. Can somebody do it, I'm away from my box.
> 
> > + * Parts of this file were originally licenced under the BSD licence:
> > + *
> > >  * Permission to use, copy, modify, and distribute this software for any
> > >  * purpose with or without fee is hereby granted, provided that the above
> > >  * copyright notice and this permission notice appear in all copies.
> > >  *
> > >  * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
> > WARRANTIES
> > >  * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> > >  * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> > >  * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> > >  * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> > >  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> > >  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> > + *
> > + * Further changes to this file since the moment this notice was extended
> > + * are now distributed under the terms of the GPL version two as published
> > + * by the Free Software Foundation <yaddaya>
> >
> > johannes

How about asking for changes to be dual-licenced too ?

	Xav



^ permalink raw reply

* Re: Tc bug (kernel crash) more info
From: Badalian Vyacheslav @ 2007-08-29 12:53 UTC (permalink / raw)
  To: Jarek Poplawski; +Cc: netdev
In-Reply-To: <20070829121408.GB3575@ff.dom.local>

Jarek Poplawski wrote:
> On Wed, Aug 29, 2007 at 01:34:47PM +0200, Jarek Poplawski wrote:
>   
>> On 29-08-2007 11:34, Badalian Vyacheslav wrote:
>>     
>>> Crashed again. Do you need more panic reports, or does this message have
>>> all the info needed to fix the bug?
>>>       
> ...
>   
>> If it's possible you can try it shortly without e.g. netconsole or
>> even without CONFIG_SMP.
>>     
>
> ...or maybe even dare to try something current like 2.6.23-rc4?
>
> Jarek P.
>
>   
We have had this kernel panic (when deleting HTB) on all 2.6.18-x versions.
On older kernels (2.6.x) we had another panic (when deleting a tc filter)...
In summary, we have had TC panics for a year now ;) The sysctl option "reboot on panic"
saves us. We have now brought up 2 backup computers and can try any patches to fix
this problem.

Also, on 2.6.22 we get a strange death: black screen, no response to the keyboard,
no info in netconsole, hard disk LED solid red. "Black Dead"

^ permalink raw reply

* Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)
From: OBATA Noboru @ 2007-08-29 12:26 UTC (permalink / raw)
  To: davem; +Cc: shemminger, yoshfuji, netdev
In-Reply-To: <20070828.133057.107937654.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Subject: Re: [PATCH 2.6.22] TCP: Make TCP_RTO_MAX a variable (take 2)
Date: Tue, 28 Aug 2007 13:30:57 -0700 (PDT)

> From: OBATA Noboru <noboru.obata.ar@hitachi.com>
> Date: Tue, 28 Aug 2007 22:04:47 +0900 (JST)
> 
> > (1) Make the application timeouts longer.  (Steve has shown that
> >     making an application timeouts twice the failover detection
> >     timeout would be a solution.)
> 
> This is the only feasible solution to your problem.

What about another option to let TCP have a notification?

Can it be a solution if it is standardized?

-- 
OBATA Noboru (noboru.obata.ar@hitachi.com)

^ permalink raw reply

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-29 12:14 UTC (permalink / raw)
  To: Badalian Vyacheslav; +Cc: netdev
In-Reply-To: <20070829113447.GA3575@ff.dom.local>

On Wed, Aug 29, 2007 at 01:34:47PM +0200, Jarek Poplawski wrote:
> On 29-08-2007 11:34, Badalian Vyacheslav wrote:
> > Crashed again. Do you need more panic reports, or does this message have
> > all the info needed to fix the bug?
...
> If it's possible you can try it shortly without e.g. netconsole or
> even without CONFIG_SMP.

...or maybe even dare to try something current like 2.6.23-rc4?

Jarek P.

^ permalink raw reply

* Re: Tc bug (kernel crash) more info
From: Jarek Poplawski @ 2007-08-29 11:34 UTC (permalink / raw)
  To: Badalian Vyacheslav; +Cc: netdev
In-Reply-To: <46D53D9C.5070204@bigtelecom.ru>

On 29-08-2007 11:34, Badalian Vyacheslav wrote:
> Crashed again. Do you need more panic reports, or does this message have
> all the info needed to fix the bug?

Hi,

Please, try to not create new threads each time: reply to the previous
one if you have something new. And this one doesn't seem to show more.
You have written earlier it's '1-5 times on week', so you should have
got used to it a little, so no need to panic...

It would be better to say whether there was some previous kernel
version that worked better for you.

It seems, there could be some locking problem and your script could
mess htb queue from the second cpu (or is interrupted). Probably you
could have something more in logs about this, and maybe even this
script could be helpful (you should mask secret things only). Or maybe
you could try to add some echos to this script to figure out the
part which is the most suspected. Of course .config and dmesg (zipped)
could be helpful too.

If it's possible you can try it shortly without e.g. netconsole or
even without CONFIG_SMP.

Regards,
Jarek P.

^ permalink raw reply

* Re: [PATCH] Prefix each line of multiline printk(KERN_<level> "foo\nbar") with KERN_<level>
From: Maciej W. Rozycki @ 2007-08-29 11:22 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Mike Frysinger, Joe Perches, linux-kernel, blinux-list,
	cluster-devel, discuss, jffs-dev, linux-acpi, linux-ide,
	linux-mips, linux-mm, linux-mtd, linux-scsi, mpt_linux_developer,
	netdev, osst-users, parisc-linux, tpmdd-devel, uclinux-dist-devel
In-Reply-To: <Pine.LNX.4.64.0708261305020.31149@anakin>

On Sun, 26 Aug 2007, Geert Uytterhoeven wrote:

> What I mean is that probably there used to be a printk() call starting with
> `\n'. Then someone added a `KERN_ERR' in front of it.

 I gather '\n' at the beginning is to assure the following line is output 
on a separate line rather than as a continuation of another one which may 
have been output without a trailing '\n'.  A situation where printk() is 
called with a string containing no trailing '\n' may be discouraged, but 
there are some more or less justified exceptions.  For example the SCSI 
disk spin-up code is one.

 Therefore it may be reasonable for more critical messages -- perhaps not 
ones at KERN_ERR, but certainly KERN_CRIT and higher ones -- that may 
potentially happen asynchronously to start with '\n'.  In this case a call 
would look like this:

	printk("\n" KERN_CRIT "The actual message.\n");

Of course based on "console_loglevel" and "default_message_level" the 
leading '\n' may still get swallowed from what gets printed to the console 
terminal, but in reality I do not think that poses a problem, as these 
both can be set by a system administrator according to the local policy.

  Maciej

^ permalink raw reply
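
The string concatenation in the printk() call above can be checked outside the kernel. The following is only a user-space sketch, not kernel code: it assumes that KERN_CRIT expands to the string literal "<2>", as in 2.6-era kernels, and the helper names critical_message() and level_follows_newline() are made up for illustration.

```c
#include <string.h>

/* Assumption: user-space stand-in for the kernel macro, which in
 * 2.6-era kernels expands to the string literal "<2>". */
#define KERN_CRIT "<2>"

/* Hypothetical helper: adjacent string literals are concatenated at
 * compile time, so the level marker lands right after the newline. */
const char *critical_message(void)
{
	return "\n" KERN_CRIT "The actual message.\n";
}

/* Returns nonzero when msg starts with '\n' followed by the level marker. */
int level_follows_newline(const char *msg)
{
	return msg[0] == '\n' &&
	       strncmp(msg + 1, KERN_CRIT, strlen(KERN_CRIT)) == 0;
}
```

With these definitions, critical_message() yields a string that begins with a newline, then the level marker, then the text, which is the layout the message above proposes for asynchronous critical messages.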

* Re: [PATCH 2.6.23 0/2] cxgb3 - Fix dev->priv usage
From: Jeff Garzik @ 2007-08-29 10:42 UTC (permalink / raw)
  To: Roland Dreier, Divy Le Ray; +Cc: netdev, linux-kernel, Steve Wise
In-Reply-To: <ada3ay26gys.fsf@cisco.com>

Roland Dreier wrote:
> Looks OK to me but I would just roll up the second patch into the
> first patch and let Jeff merge it as one commit.  There's no point in
> creating an intermediate tree that doesn't build -- it just breaks git
> bisect for no useful purpose.

Agreed -- this needs to be in a single patch.

	Jeff




^ permalink raw reply

* Re: [PATCH 4/5] Net: ath5k, license is GPLv2
From: Jiri Slaby @ 2007-08-29 10:35 UTC (permalink / raw)
  To: Johannes Berg
  Cc: linville-2XuSBdqkA4R54TAoqtyWWQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1188381582.19891.6.camel-YfaajirXv214zXjbi5bjpg@public.gmane.org>

On 8/29/07, Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org> wrote:
> On Tue, 2007-08-28 at 12:00 -0400, Jiri Slaby wrote:
>
> > The files are available only under GPLv2 since now.
>
> Since the BSD people are already getting upset about (for various
> reasons among which seem to be a clear non-understanding) I'd suggest
> changing it to:

yes, please. Can somebody do it, I'm away from my box.

> + * Parts of this file were originally licenced under the BSD licence:
> + *
> >  * Permission to use, copy, modify, and distribute this software for any
> >  * purpose with or without fee is hereby granted, provided that the above
> >  * copyright notice and this permission notice appear in all copies.
> >  *
> >  * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
> WARRANTIES
> >  * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
> >  * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
> >  * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> >  * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
> >  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
> >  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
> + *
> + * Further changes to this file since the moment this notice was extended
> + * are now distributed under the terms of the GPL version two as published
> + * by the Free Software Foundation <yaddaya>
>
> johannes
>

^ permalink raw reply

* [patch 7/7] s390: Drop ARP packages on HiperSockets interface with NOARP attribute.
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka, Klaus D. Wacker
In-Reply-To: <20070829092651.411517000@linux.vnet.ibm.com>

[-- Attachment #1: 711-qeth-arp.diff --]
[-- Type: text/plain, Size: 1699 bytes --]

From: Klaus D. Wacker <kdwacker@de.ibm.com>

A network interface can get ARP packets even when the interface has
NOARP specified. In a HiperSockets environment this disturbs receiving
systems when packets are sent on the multicast queue. (E.g. TCP/IP on
z/VM issues messages reporting invalid data on the HiperSockets
interface.)
Qeth will no longer send ARP packets on a HiperSockets interface when
the interface has the NOARP attribute.

Signed-off-by: Klaus D. Wacker <kdwacker@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
---

 drivers/s390/net/qeth_main.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -2505,7 +2505,7 @@ qeth_rebuild_skb_fake_ll_tr(struct qeth_
 	struct iphdr *ip_hdr;

 	QETH_DBF_TEXT(trace,5,"skbfktr");
-	skb_set_mac_header(skb, -QETH_FAKE_LL_LEN_TR);
+	skb_set_mac_header(skb, (int)-QETH_FAKE_LL_LEN_TR);
 	/* this is a fake ethernet header */
 	fake_hdr = tr_hdr(skb);

@@ -4710,9 +4710,15 @@ qeth_send_packet(struct qeth_card *card,
 	if (card->info.type != QETH_CARD_TYPE_IQD)
 		rc = qeth_do_send_packet(card, queue, new_skb, hdr,
 					 elements_needed, ctx);
-	else
+	else {
+		if ((skb->protocol == htons(ETH_P_ARP)) &&
+		    (card->dev->flags & IFF_NOARP)) {
+			__qeth_free_new_skb(skb, new_skb);
+			return -EPERM;
+		}
 		rc = qeth_do_send_packet_fast(card, queue, new_skb, hdr,
 					      elements_needed, ctx);
+	}
 	if (!rc) {
 		card->stats.tx_packets++;
 		card->stats.tx_bytes += tx_bytes;

-- 

^ permalink raw reply
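
The condition added to qeth_send_packet() in the patch above can be tried in user space. This is a minimal sketch under stated assumptions, not the driver code: the constants are local copies (0x0806 is the ARP EtherType, 0x80 is the Linux value of IFF_NOARP), and drops_arp_on_noarp() is a hypothetical name.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Local copies so the sketch does not depend on kernel headers. */
#define SKETCH_ETH_P_ARP 0x0806	/* EtherType for ARP */
#define SKETCH_IFF_NOARP 0x80	/* Linux value of IFF_NOARP */

/* Hypothetical helper mirroring the new test: protocol_be is the
 * frame's protocol in network byte order, dev_flags the interface
 * flags. Returns 1 when the frame would be dropped. */
int drops_arp_on_noarp(uint16_t protocol_be, unsigned int dev_flags)
{
	return protocol_be == htons(SKETCH_ETH_P_ARP) &&
	       (dev_flags & SKETCH_IFF_NOARP) != 0;
}
```

Only the combination of an ARP frame and a NOARP interface is dropped; IP traffic on the same interface, and ARP on an interface without the flag, pass through.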

* [patch 6/7] s390: provide specific message for OSA-adapters exclusively used
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka
In-Reply-To: <20070829092651.411517000@linux.vnet.ibm.com>

[-- Attachment #1: 710-qeth-exclusive.diff --]
[-- Type: text/plain, Size: 3523 bytes --]

From: Ursula Braun <braunu@de.ibm.com>

Exclusive usage of OSA-cards has been introduced. Even though Linux
does not make use of it, qeth should be prepared to receive a bad RC
for some initialization steps. A meaningful message is now given,
if an OSA-device is set online, even though the OSA-adapter is already
exclusively used by another host.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
---

 drivers/s390/net/qeth_main.c |   28 +++++++++++++++++++---------
 drivers/s390/net/qeth_mpc.h  |    1 +
 2 files changed, 20 insertions(+), 9 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -1541,16 +1541,21 @@ qeth_idx_write_cb(struct qeth_channel *c
 	card = CARD_FROM_CDEV(channel->ccwdev);
 
 	if (!(QETH_IS_IDX_ACT_POS_REPLY(iob->data))) {
-		PRINT_ERR("IDX_ACTIVATE on write channel device %s: negative "
-			  "reply\n", CARD_WDEV_ID(card));
+		if (QETH_IDX_ACT_CAUSE_CODE(iob->data) == 0x19)
+			PRINT_ERR("IDX_ACTIVATE on write channel device %s: "
+				"adapter exclusively used by another host\n",
+				CARD_WDEV_ID(card));
+		else
+			PRINT_ERR("IDX_ACTIVATE on write channel device %s: "
+				"negative reply\n", CARD_WDEV_ID(card));
 		goto out;
 	}
 	memcpy(&temp, QETH_IDX_ACT_FUNC_LEVEL(iob->data), 2);
 	if ((temp & ~0x0100) != qeth_peer_func_level(card->info.func_level)) {
 		PRINT_WARN("IDX_ACTIVATE on write channel device %s: "
-			   "function level mismatch "
-			   "(sent: 0x%x, received: 0x%x)\n",
-			   CARD_WDEV_ID(card), card->info.func_level, temp);
+			"function level mismatch "
+			"(sent: 0x%x, received: 0x%x)\n",
+			CARD_WDEV_ID(card), card->info.func_level, temp);
 		goto out;
 	}
 	channel->state = CH_STATE_UP;
@@ -1596,8 +1601,13 @@ qeth_idx_read_cb(struct qeth_channel *ch
 			goto out;
 	}
 	if (!(QETH_IS_IDX_ACT_POS_REPLY(iob->data))) {
-		PRINT_ERR("IDX_ACTIVATE on read channel device %s: negative "
-			  "reply\n", CARD_RDEV_ID(card));
+		if (QETH_IDX_ACT_CAUSE_CODE(iob->data) == 0x19)
+			PRINT_ERR("IDX_ACTIVATE on read channel device %s: "
+				"adapter exclusively used by another host\n",
+				CARD_RDEV_ID(card));
+		else
+			PRINT_ERR("IDX_ACTIVATE on read channel device %s: "
+				"negative reply\n", CARD_RDEV_ID(card));
 		goto out;
 	}
 
@@ -1612,8 +1622,8 @@ qeth_idx_read_cb(struct qeth_channel *ch
 	memcpy(&temp, QETH_IDX_ACT_FUNC_LEVEL(iob->data), 2);
 	if (temp != qeth_peer_func_level(card->info.func_level)) {
 		PRINT_WARN("IDX_ACTIVATE on read channel device %s: function "
-			   "level mismatch (sent: 0x%x, received: 0x%x)\n",
-			   CARD_RDEV_ID(card), card->info.func_level, temp);
+			"level mismatch (sent: 0x%x, received: 0x%x)\n",
+			CARD_RDEV_ID(card), card->info.func_level, temp);
 		goto out;
 	}
 	memcpy(&card->token.issuer_rm_r,
Index: linux-2.6-uschi/drivers/s390/net/qeth_mpc.h
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_mpc.h
+++ linux-2.6-uschi/drivers/s390/net/qeth_mpc.h
@@ -565,6 +565,7 @@ extern unsigned char IDX_ACTIVATE_WRITE[
 #define QETH_IDX_ACT_QDIO_DEV_REALADDR(buffer) (buffer+0x20)
 #define QETH_IS_IDX_ACT_POS_REPLY(buffer) (((buffer)[0x08]&3)==2)
 #define QETH_IDX_REPLY_LEVEL(buffer) (buffer+0x12)
+#define QETH_IDX_ACT_CAUSE_CODE(buffer) (buffer)[0x09]
 
 #define PDU_ENCAPSULATION(buffer) \
 	(buffer + *(buffer + (*(buffer+0x0b)) + \

-- 

^ permalink raw reply
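
The reply handling in the patch above boils down to a three-way classification of the IDX ACTIVATE reply buffer. The sketch below restates it in plain C, using only the offsets and values visible in the diff (byte 0x08 masked with 3 equal to 2 for a positive reply, byte 0x09 equal to 0x19 for exclusive use). The enum and the function name are illustrative assumptions, not part of qeth.

```c
/* Illustrative result type; not part of the driver. */
enum idx_reply {
	IDX_REPLY_POSITIVE,
	IDX_REPLY_EXCLUSIVE,	/* adapter exclusively used by another host */
	IDX_REPLY_NEGATIVE
};

/* Hypothetical helper: buffer must hold at least 0x0a bytes. */
enum idx_reply classify_idx_reply(const unsigned char *buffer)
{
	if ((buffer[0x08] & 3) == 2)
		return IDX_REPLY_POSITIVE;
	if (buffer[0x09] == 0x19)
		return IDX_REPLY_EXCLUSIVE;
	return IDX_REPLY_NEGATIVE;
}
```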

* [patch 5/7] s390: crash during reboot after failing online setting
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka
In-Reply-To: <20070829092651.411517000@linux.vnet.ibm.com>

[-- Attachment #1: 709-qeth-crash.diff --]
[-- Type: text/plain, Size: 2457 bytes --]

From: Ursula Braun <braunu@de.ibm.com>

Online setting of a qeth device may fail for instance because of:
- out-of-memory condition when allocating qdio queues
- IDX ACTIVATE problem
- ...
Such a device is still returned in a driver_for_each_device loop
processed in qeth_reboot_event(), which calls
qeth_clear_qdio_buffers(). Make sure qeth_clear_output_buffer() is
called only if the qdio queues have been successfully allocated
during initialization of a qeth device.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
---

 drivers/s390/net/qeth_main.c |   20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -3356,10 +3356,12 @@ out_freeoutq:
 	while (i > 0)
 		kfree(card->qdio.out_qs[--i]);
 	kfree(card->qdio.out_qs);
+	card->qdio.out_qs = NULL;
 out_freepool:
 	qeth_free_buffer_pool(card);
 out_freeinq:
 	kfree(card->qdio.in_q);
+	card->qdio.in_q = NULL;
 out_nomem:
 	atomic_set(&card->qdio.state, QETH_QDIO_UNINITIALIZED);
 	return -ENOMEM;
@@ -3375,16 +3377,20 @@ qeth_free_qdio_buffers(struct qeth_card 
 		QETH_QDIO_UNINITIALIZED)
 		return;
 	kfree(card->qdio.in_q);
+	card->qdio.in_q = NULL;
 	/* inbound buffer pool */
 	qeth_free_buffer_pool(card);
 	/* free outbound qdio_qs */
-	for (i = 0; i < card->qdio.no_out_queues; ++i){
-		for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j)
-			qeth_clear_output_buffer(card->qdio.out_qs[i],
-					&card->qdio.out_qs[i]->bufs[j]);
-		kfree(card->qdio.out_qs[i]);
+	if (card->qdio.out_qs) {
+		for (i = 0; i < card->qdio.no_out_queues; ++i) {
+			for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j)
+				qeth_clear_output_buffer(card->qdio.out_qs[i],
+						&card->qdio.out_qs[i]->bufs[j]);
+			kfree(card->qdio.out_qs[i]);
+		}
+		kfree(card->qdio.out_qs);
+		card->qdio.out_qs = NULL;
 	}
-	kfree(card->qdio.out_qs);
 }
 
 static void
@@ -3395,7 +3401,7 @@ qeth_clear_qdio_buffers(struct qeth_card
 	QETH_DBF_TEXT(trace, 2, "clearqdbf");
 	/* clear outbound buffers to free skbs */
 	for (i = 0; i < card->qdio.no_out_queues; ++i)
-		if (card->qdio.out_qs[i]){
+		if (card->qdio.out_qs && card->qdio.out_qs[i]) {
 			for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j)
 				qeth_clear_output_buffer(card->qdio.out_qs[i],
 						&card->qdio.out_qs[i]->bufs[j]);

-- 

^ permalink raw reply
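
The fix above relies on two habits: reset a pointer to NULL after freeing it, and test the pointer before walking what it points to. A user-space sketch of the same pattern follows. The struct and function names are hypothetical, and malloc()/free() stand in for the kernel allocators.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the queue pointers kept in the card. */
struct sketch_qdio {
	void *in_q;
	void **out_qs;
	int no_out_queues;
};

/* Frees everything and resets the pointers, so calling it a second
 * time (or after a failed allocation) is harmless. */
void sketch_free_queues(struct sketch_qdio *q)
{
	int i;

	free(q->in_q);
	q->in_q = NULL;
	if (q->out_qs) {
		for (i = 0; i < q->no_out_queues; ++i)
			free(q->out_qs[i]);
		free(q->out_qs);
		q->out_qs = NULL;
	}
}

/* Counts the outbound queues that would be cleared; returns 0 when
 * the queue array was never allocated, instead of dereferencing NULL. */
int sketch_clear_queues(const struct sketch_qdio *q)
{
	int i, visited = 0;

	for (i = 0; i < q->no_out_queues; ++i)
		if (q->out_qs && q->out_qs[i])
			++visited;
	return visited;
}
```

A card whose setup failed before the outbound queues were allocated has no_out_queues set but out_qs NULL; with the guard in place the reboot-time walk simply finds nothing to clear.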

* [patch 4/7] s390: Announce tx checksumming for qeth devices in TSO/EDDP mode
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka
In-Reply-To: <20070829092651.411517000@linux.vnet.ibm.com>

[-- Attachment #1: 706-qeth-tx-chksum.diff --]
[-- Type: text/plain, Size: 5129 bytes --]

From: Frank Blaschka <frank.blaschka@de.ibm.com>

TSO requires tx checksumming. For non-GSO frames in TSO/EDDP mode we
have to manually calculate the checksum.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
---



 drivers/s390/net/qeth_main.c |   82 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 68 insertions(+), 14 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -4555,6 +4555,53 @@ qeth_get_elements_no(struct qeth_card *c
         return elements_needed;
 }
 
+static void qeth_tx_csum(struct sk_buff *skb)
+{
+	int tlen;
+
+	if (skb->protocol == htons(ETH_P_IP)) {
+		tlen = ntohs(ip_hdr(skb)->tot_len) - (ip_hdr(skb)->ihl << 2);
+		switch (ip_hdr(skb)->protocol) {
+		case IPPROTO_TCP:
+			tcp_hdr(skb)->check = 0;
+			tcp_hdr(skb)->check = csum_tcpudp_magic(
+				ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
+				tlen, ip_hdr(skb)->protocol,
+				skb_checksum(skb, skb_transport_offset(skb),
+					tlen, 0));
+			break;
+		case IPPROTO_UDP:
+			udp_hdr(skb)->check = 0;
+			udp_hdr(skb)->check = csum_tcpudp_magic(
+				ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
+				tlen, ip_hdr(skb)->protocol,
+				skb_checksum(skb, skb_transport_offset(skb),
+					tlen, 0));
+			break;
+		}
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		switch (ipv6_hdr(skb)->nexthdr) {
+		case IPPROTO_TCP:
+			tcp_hdr(skb)->check = 0;
+			tcp_hdr(skb)->check = csum_ipv6_magic(
+				&ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr,
+				ipv6_hdr(skb)->payload_len,
+				ipv6_hdr(skb)->nexthdr,
+				skb_checksum(skb, skb_transport_offset(skb),
+					ipv6_hdr(skb)->payload_len, 0));
+			break;
+		case IPPROTO_UDP:
+			udp_hdr(skb)->check = 0;
+			udp_hdr(skb)->check = csum_ipv6_magic(
+				&ipv6_hdr(skb)->saddr, &ipv6_hdr(skb)->daddr,
+				ipv6_hdr(skb)->payload_len,
+				ipv6_hdr(skb)->nexthdr,
+				skb_checksum(skb, skb_transport_offset(skb),
+					ipv6_hdr(skb)->payload_len, 0));
+			break;
+		}
+	}
+}
 
 static int
 qeth_send_packet(struct qeth_card *card, struct sk_buff *skb)
@@ -4640,6 +4687,10 @@ qeth_send_packet(struct qeth_card *card,
 		elements_needed += elems;
 	}
 
+	if ((large_send == QETH_LARGE_SEND_NO) &&
+	    (skb->ip_summed == CHECKSUM_PARTIAL))
+		qeth_tx_csum(new_skb);
+
 	if (card->info.type != QETH_CARD_TYPE_IQD)
 		rc = qeth_do_send_packet(card, queue, new_skb, hdr,
 					 elements_needed, ctx);
@@ -6387,20 +6438,18 @@ qeth_deregister_addr_entry(struct qeth_c
 static u32
 qeth_ethtool_get_tx_csum(struct net_device *dev)
 {
-	/* We may need to say that we support tx csum offload if
-	 * we do EDDP or TSO. There are discussions going on to
-	 * enforce rules in the stack and in ethtool that make
-	 * SG and TSO depend on HW_CSUM. At the moment there are
-	 * no such rules....
-	 * If we say yes here, we have to checksum outbound packets
-	 * any time. */
-	return 0;
+	return (dev->features & NETIF_F_HW_CSUM) != 0;
 }
 
 static int
 qeth_ethtool_set_tx_csum(struct net_device *dev, u32 data)
 {
-	return -EINVAL;
+	if (data)
+		dev->features |= NETIF_F_HW_CSUM;
+	else
+		dev->features &= ~NETIF_F_HW_CSUM;
+
+	return 0;
 }
 
 static u32
@@ -7414,7 +7463,8 @@ qeth_start_ipa_tso(struct qeth_card *car
 	}
 	if (rc && (card->options.large_send == QETH_LARGE_SEND_TSO)){
 		card->options.large_send = QETH_LARGE_SEND_NO;
-		card->dev->features &= ~ (NETIF_F_TSO | NETIF_F_SG);
+		card->dev->features &= ~(NETIF_F_TSO | NETIF_F_SG |
+						NETIF_F_HW_CSUM);
 	}
 	return rc;
 }
@@ -7554,22 +7604,26 @@ qeth_set_large_send(struct qeth_card *ca
 	card->options.large_send = type;
 	switch (card->options.large_send) {
 	case QETH_LARGE_SEND_EDDP:
-		card->dev->features |= NETIF_F_TSO | NETIF_F_SG;
+		card->dev->features |= NETIF_F_TSO | NETIF_F_SG |
+					NETIF_F_HW_CSUM;
 		break;
 	case QETH_LARGE_SEND_TSO:
 		if (qeth_is_supported(card, IPA_OUTBOUND_TSO)){
-			card->dev->features |= NETIF_F_TSO | NETIF_F_SG;
+			card->dev->features |= NETIF_F_TSO | NETIF_F_SG |
+						NETIF_F_HW_CSUM;
 		} else {
 			PRINT_WARN("TSO not supported on %s. "
 				   "large_send set to 'no'.\n",
 				   card->dev->name);
-			card->dev->features &= ~(NETIF_F_TSO | NETIF_F_SG);
+			card->dev->features &= ~(NETIF_F_TSO | NETIF_F_SG |
+						NETIF_F_HW_CSUM);
 			card->options.large_send = QETH_LARGE_SEND_NO;
 			rc = -EOPNOTSUPP;
 		}
 		break;
 	default: /* includes QETH_LARGE_SEND_NO */
-		card->dev->features &= ~(NETIF_F_TSO | NETIF_F_SG);
+		card->dev->features &= ~(NETIF_F_TSO | NETIF_F_SG |
+					NETIF_F_HW_CSUM);
 		break;
 	}
 	if (card->state == CARD_STATE_UP)

-- 

^ permalink raw reply
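
The checksum that qeth_tx_csum() fills in above is the Internet checksum: a 16-bit ones' complement sum, folded and inverted. The sketch below shows only that core computation in user space, as described in RFC 1071. It leaves out the TCP/UDP pseudo-header that csum_tcpudp_magic() and csum_ipv6_magic() add in the kernel, and sketch_inet_checksum() is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Internet checksum over len bytes, treating the
 * data as big-endian 16-bit words. Returns the checksum as a host-order
 * value. Intended for buffers of at most 64 KiB, so the 32-bit
 * accumulator cannot overflow. */
uint16_t sketch_inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Add up all complete 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is padded with a zero low byte. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

Running the same function over data that already contains its own checksum gives 0, which is how a receiver verifies a packet.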

* [patch 3/7] s390: dont return the return values of void functions.
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka, Heiko Carstens
In-Reply-To: <20070829092651.411517000@linux.vnet.ibm.com>

[-- Attachment #1: 705-qeth-return.diff --]
[-- Type: text/plain, Size: 1619 bytes --]

From: Heiko Carstens <heiko.carstens@de.ibm.com>

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
---

 drivers/s390/net/qeth.h     |    4 ++--
 drivers/s390/net/qeth_sys.c |    8 ++++----
 2 files changed, 6 insertions(+), 6 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth.h
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth.h
+++ linux-2.6-uschi/drivers/s390/net/qeth.h
@@ -1178,9 +1178,9 @@ qeth_ipaddr_to_string(enum qeth_prot_ver
 		      char *buf)
 {
 	if (proto == QETH_PROT_IPV4)
-		return qeth_ipaddr4_to_string(addr, buf);
+		qeth_ipaddr4_to_string(addr, buf);
 	else if (proto == QETH_PROT_IPV6)
-		return qeth_ipaddr6_to_string(addr, buf);
+		qeth_ipaddr6_to_string(addr, buf);
 }
 
 static inline int
Index: linux-2.6-uschi/drivers/s390/net/qeth_sys.c
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_sys.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_sys.c
@@ -1760,10 +1760,10 @@ qeth_remove_device_attributes(struct dev
 {
 	struct qeth_card *card = dev->driver_data;
 
-	if (card->info.type == QETH_CARD_TYPE_OSN)
-		return sysfs_remove_group(&dev->kobj,
-					  &qeth_osn_device_attr_group);
-
+	if (card->info.type == QETH_CARD_TYPE_OSN) {
+		sysfs_remove_group(&dev->kobj, &qeth_osn_device_attr_group);
+		return;
+	}
 	sysfs_remove_group(&dev->kobj, &qeth_device_attr_group);
 	sysfs_remove_group(&dev->kobj, &qeth_device_ipato_group);
 	sysfs_remove_group(&dev->kobj, &qeth_device_vipa_group);

-- 

^ permalink raw reply

* [patch 2/7] s390: enforce a rate limit for inbound scatter gather messages
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka
In-Reply-To: <20070829092651.411517000@linux.vnet.ibm.com>

[-- Attachment #1: 704-qeth-rate-limit.diff --]
[-- Type: text/plain, Size: 1369 bytes --]

From: Frank Blaschka <frank.blaschka@de.ibm.com>

Under memory pressure, scatter-gather mode switching messages must be
rate limited.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: Ursula Braun <braunu@de.ibm.com>
---

 drivers/s390/net/qeth_main.c |   13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -2803,13 +2803,16 @@ qeth_queue_input_buffer(struct qeth_card
 		if (newcount < count) {
 			/* we are in memory shortage so we switch back to
 			   traditional skb allocation and drop packages */
-			if (atomic_cmpxchg(&card->force_alloc_skb, 0, 1))
-				printk(KERN_WARNING
-					"qeth: switch to alloc skb\n");
+			if (!atomic_read(&card->force_alloc_skb) &&
+			    net_ratelimit())
+				PRINT_WARN("Switch to alloc skb\n");
+			atomic_set(&card->force_alloc_skb, 3);
 			count = newcount;
 		} else {
-			if (atomic_cmpxchg(&card->force_alloc_skb, 1, 0))
-				printk(KERN_WARNING "qeth: switch to sg\n");
+			if ((atomic_read(&card->force_alloc_skb) == 1) &&
+			    net_ratelimit())
+				PRINT_WARN("Switch to sg\n");
+			atomic_add_unless(&card->force_alloc_skb, -1, 0);
 		}
 
 		/*

-- 

^ permalink raw reply
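
The patch above replaces the 0/1 flag with a small countdown: a shortage sets force_alloc_skb to 3, and each refill without shortage decrements it until it reaches 0 again. The following user-space sketch models only that state machine. It uses a plain int instead of atomic_t, ignores net_ratelimit(), and the enum and function names are assumptions for illustration.

```c
/* Illustrative event type; the driver prints a warning instead. */
enum sg_event {
	SG_NONE,
	SG_SWITCH_TO_ALLOC,	/* "Switch to alloc skb" */
	SG_SWITCH_TO_SG		/* "Switch to sg" */
};

/* Hypothetical helper: one pass through the refill path. shortage is
 * nonzero when fewer buffers than requested could be set up. */
enum sg_event sketch_queue_step(int *force_alloc_skb, int shortage)
{
	enum sg_event ev = SG_NONE;

	if (shortage) {
		/* Report only on the transition out of sg mode. */
		if (*force_alloc_skb == 0)
			ev = SG_SWITCH_TO_ALLOC;
		*force_alloc_skb = 3;
	} else {
		/* Report only on the last step back to sg mode. */
		if (*force_alloc_skb == 1)
			ev = SG_SWITCH_TO_SG;
		if (*force_alloc_skb != 0)
			--*force_alloc_skb;
	}
	return ev;
}
```

Because the counter has to run down from 3 before sg mode resumes, a device that alternates between shortage and recovery on every refill no longer produces a message pair each time.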

* [patch 1/7] s390: ungrouping a device must not be interruptible
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka
In-Reply-To: <20070829092651.411517000@linux.vnet.ibm.com>

[-- Attachment #1: 703-qeth-ungroup.diff --]
[-- Type: text/plain, Size: 1494 bytes --]

From: Ursula Braun <braunu@de.ibm.com>

Problem:
A recovery thread must not be active when device is removed.
In qeth_remove_device() an interruptible wait operation is used
to wait until a qeth recovery thread is finished. If a user really
interrupts the ungroup operation of a qeth device while a recovery
is running, cio and qeth are out of sync (device already removed
from cio, but kept in qeth). A subsequent module unload of qeth
then results in a kernel oops.

Solution:
Do not allow interruption of ungroup operation to guarantee
finishing of a potentially running qeth recovery thread.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
---

 drivers/s390/net/qeth_main.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Index: linux-2.6-uschi/drivers/s390/net/qeth_main.c
===================================================================
--- linux-2.6-uschi.orig/drivers/s390/net/qeth_main.c
+++ linux-2.6-uschi/drivers/s390/net/qeth_main.c
@@ -561,7 +561,7 @@ qeth_set_offline(struct ccwgroup_device 
 }
 
 static int
-qeth_wait_for_threads(struct qeth_card *card, unsigned long threads);
+qeth_threads_running(struct qeth_card *card, unsigned long threads);
 
 
 static void
@@ -576,8 +576,7 @@ qeth_remove_device(struct ccwgroup_devic
 	if (!card)
 		return;
 
-	if (qeth_wait_for_threads(card, 0xffffffff))
-		return;
+	wait_event(card->wait_q, qeth_threads_running(card, 0xffffffff) == 0);
 
 	if (cgdev->state == CCWGROUP_ONLINE){
 		card->use_hard_stop = 1;

-- 

^ permalink raw reply

* [patch 0/7] s390 - qeth patches for 2.6.23-rc3 (resend)
From: Ursula Braun @ 2007-08-29  9:26 UTC (permalink / raw)
  To: jgarzik, netdev, linux-s390; +Cc: frank.blaschka

-- 
Jeff,

this is a resend of the s390 / qeth patches sent on Monday.
This time I have changed the wrong Subject line prefix of the patches
from "qeth" to "s390". Sorry!

qeth patches for 2.6.23-rc3:
- do not allow interruption of "ungroup"
- scatter gather mode: enforce rate limit
- don't return void function return values
- add tx checksumming for TSO/EDDP mode
- invoke qeth_clear_output_buffer only for allocated qdio queues
- add specific message for exclusively used OSA-adapters
- drop ARP packets on HiperSockets

Regards,   Ursula Braun

^ permalink raw reply

* Re: [PATCH 4/5] Net: ath5k, license is GPLv2
From: Johannes Berg @ 2007-08-29  9:59 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: linville-2XuSBdqkA4R54TAoqtyWWQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <7515194658758617-+5AFNAhbZwkm4RdzfppkhA@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 1325 bytes --]

On Tue, 2007-08-28 at 12:00 -0400, Jiri Slaby wrote:

> The files are available only under GPLv2 from now on.

Since the BSD people are already getting upset about this (for various
reasons, among which seems to be a clear lack of understanding), I'd
suggest changing it to:


+ * Parts of this file were originally licenced under the BSD licence:
+ *
>  * Permission to use, copy, modify, and distribute this software for any
>  * purpose with or without fee is hereby granted, provided that the above
>  * copyright notice and this permission notice appear in all copies.
>  *
>  * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
>  * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
>  * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
>  * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
>  * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
>  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
>  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ *
+ * Further changes to this file since the moment this notice was extended
+ * are now distributed under the terms of the GPL version two as published
+ * by the Free Software Foundation <yaddaya>

johannes

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 190 bytes --]

^ permalink raw reply

* Tc bug (kernel crash) more info
From: Badalian Vyacheslav @ 2007-08-29  9:34 UTC (permalink / raw)
  To: netdev

It crashed again.  Do you need more panic reports, or does this message
have all the information needed to fix the bug?

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000008
 printing eip:
c01bf041
*pde = 00000000
Oops: 0000 [#1]
SMP
Modules linked in: cls_u32 sch_sfq sch_htb netconsole xt_tcpudp iptable_filter ip_tables x_tables e752x_edac edac_mc i2c_i801
CPU:    2
EIP:    0060:[<c01bf041>]    Not tainted VLI
EFLAGS: 00010282   (2.6.22-gentoo-r5-fw #6)
EIP is at rb_erase+0x110/0x22f
eax: e40fa334   ebx: 00000000   ecx: 00000000   edx: e40fa334
esi: e6add334   edi: e5a86134   ebp: f6840428   esp: c21c5d20
ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Process swapper (pid: 0, ti=c21c4000 task=c21b8a90 task.ti=c21c4000)
Stack: 00000001 e5a86134 00000000 e5a86000 00000055 f88391a7 f6840080 00057857
       00000000 f0ac4600 f6840080 f883aa3d f6e53ec0 e6b18380 f6e53ec0 00000008
       f6840428 f6840000 00000000 f6840080 00000000 61cf32bc 00000001 e6b18380
Call Trace:
 [<f88391a7>] htb_safe_rb_erase+0x43/0x51 [sch_htb]
 [<f883aa3d>] htb_dequeue+0x145/0x6d4 [sch_htb]
 [<f88618fe>] sfq_enqueue+0x1c/0x18a [sch_sfq]
 [<c02b7592>] __qdisc_run+0x1e/0x188
 [<c02adcce>] dev_queue_xmit+0x152/0x25c
 [<c02c8f77>] ip_output+0x280/0x2b9
 [<c02c51cc>] ip_forward_finish+0x0/0x2e
 [<c02c5465>] ip_forward+0x26b/0x2c6
 [<c02c51cc>] ip_forward_finish+0x0/0x2e
 [<c02c41fb>] ip_rcv+0x484/0x4bd
 [<c02a8a0d>] __netdev_alloc_skb+0x1c/0x35
 [<c02abd54>] netif_receive_skb+0x2b8/0x319
 [<c0238034>] e1000_clean_rx_irq+0x375/0x441
 [<c0237cbf>] e1000_clean_rx_irq+0x0/0x441
 [<c02370ea>] e1000_clean+0x71/0x237
 [<c02ada90>] net_rx_action+0x91/0x17d
 [<c011c39a>] __do_softirq+0x5d/0xc1
 [<c011c430>] do_softirq+0x32/0x36
 [<c010439a>] do_IRQ+0x7e/0x90
 [<c010d461>] smp_apic_timer_interrupt+0x74/0x80
 [<c010439a>] do_IRQ+0x7e/0x90
 [<c0102ed3>] common_interrupt+0x23/0x28
 [<c0100ab2>] mwait_idle_with_hints+0x3c/0x40
 [<c0100bbe>] cpu_idle+0x5a/0x6f
 =======================
Code: 01 00 00 8b 4e 08 39 d9 0f 85 85 00 00 00 8b 4e 04 8b 01 a8 01 75 14 83 c8 01 89 ea 89 01 89 f0 83 26 fe e8 1e fd ff ff 8b 4e 04 <8b> 59 08 85 db 74 06 8b 03 a8 01 74 15 8b 41 04 85 c0 0f 84 c6
EIP: [<c01bf041>] rb_erase+0x110/0x22f SS:ESP 0068:c21c5d20
Kernel panic - not syncing: Fatal exception in interrupt
Rebooting in 3 seconds..
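
A note on reading the oops above: the marked opcode bytes "<8b> 59 08"
decode to "mov 0x8(%ecx),%ebx", and the register dump shows
ecx: 00000000, which matches the faulting address 00000008. The sketch
below models a 32-bit struct rb_node with fixed-width fields to show
which member sits at offset 8. It assumes the field order used by
include/linux/rbtree.h of that era (rb_parent_color, rb_right, rb_left);
struct rb_node32 and rb_field_at() are hypothetical names used only for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the assumed 32-bit layout: every field is 4 bytes wide. */
struct rb_node32 {
	uint32_t rb_parent_color; /* parent pointer, colour in bit 0 */
	uint32_t rb_right;        /* 32-bit pointer to the right child */
	uint32_t rb_left;         /* 32-bit pointer to the left child */
};

/* Name the member that a load at (node + offset) would touch. */
static const char *rb_field_at(size_t offset)
{
	switch (offset) {
	case offsetof(struct rb_node32, rb_parent_color):
		return "rb_parent_color";
	case offsetof(struct rb_node32, rb_right):
		return "rb_right";
	case offsetof(struct rb_node32, rb_left):
		return "rb_left";
	default:
		return "unknown";
	}
}
```

Under that assumption, rb_field_at(0x8) names rb_left, so the faulting
load is a read of ->rb_left through a NULL node pointer. The preceding
bytes "8b 4e 04" ("mov 0x4(%esi),%ecx") suggest that the NULL pointer
was itself loaded from the ->rb_right field of the node in esi.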

^ permalink raw reply

* Re: [Cbe-oss-dev] [PATCH] spidernet: fix interrupt reason recognition
From: Ishizaki Kou @ 2007-08-29  8:58 UTC (permalink / raw)
  To: linas; +Cc: netdev, cbe-oss-dev
In-Reply-To: <200708220026.l7M0QODr013175@toshiba.co.jp>


Linas-san,

Ishizaki Kou wrote:
> Linas Vepstas wrote:
> > On Mon, Aug 20, 2007 at 10:13:27PM +0900, Ishizaki Kou wrote:
> > > Please apply this to 2.6.23.
> > 
> > I'll review and forward shortly.  Kick me if you don't see a formal
> > reply in a few days.
> > 
> > > And also, please apply the following Arnd-san's patch to fix a problem
> > > that spidernet driver sometimes causes a BUG_ON at open.
> > > 
> > >  http://patchwork.ozlabs.org/cbe-oss-dev/patch?id=12211
> > 
> > Are you sure? This patch no longer applies cleanly, in part because
> 
> I see. I'll send another applicable patch.
> 
> > your patch "[PATCH] spidernet: improve interrupt handling" 
> > from Mon, 09 Jul 2007 added a spider_net_enable_interrupts(card); 
> > at the end of spider_net_open().  Because of this, it seems like 
> > Arnd's patch is no longer needed, right?
> 
> As you pointed out, we intended that "[PATCH] spidernet: improve
> interrupt handling" solves the same problem which Arnd's patch solves.
> 
> When spider_net_open() is called, interrupt reasons sometimes remain
> on interrupt status register, even though they are masked by mask
> register.  With this patch, spider_net_interrupt() compares the value
> of interrupt status register with SPIDER_NET_INTX_MASK_VALUE, not with
> interrupt mask register value.  As a result, spider_net_interrupt()
> (which is called from request_irq() in spider_net_open()) starts
> polling and causes BUG_ON().
> 
> So, netif_poll_enable() must be called before request_irq() is
> called. This is the reason that we also need Arnd's patch.


How about the following two patches that I posted last week:

 http://patchwork.ozlabs.org/cbe-oss-dev/patch?id=12997
 http://patchwork.ozlabs.org/cbe-oss-dev/patch?id=13049

Best regards,
Kou Ishizaki

^ permalink raw reply

* Re: [1/1] Block device throttling [Re: Distributed storage.]
From: Evgeniy Polyakov @ 2007-08-29  8:53 UTC (permalink / raw)
  To: Daniel Phillips
  Cc: Jens Axboe, netdev, linux-kernel, linux-fsdevel, Peter Zijlstra
In-Reply-To: <200708281408.06618.phillips@phunq.net>

On Tue, Aug 28, 2007 at 02:08:04PM -0700, Daniel Phillips (phillips@phunq.net) wrote:
> On Tuesday 28 August 2007 10:54, Evgeniy Polyakov wrote:
> > On Tue, Aug 28, 2007 at 10:27:59AM -0700, Daniel Phillips (phillips@phunq.net) wrote:
> > > > We do not care about one cpu being able to increase its counter
> > > > higher than the limit, such inaccuracy (maximum bios in flight
> > > > thus can be more than limit, difference is equal to the number of
> > > > CPUs - 1) is the price for removing the atomic operation. I
> > > > thought I pointed it out in the original description, but I
> > > > might have forgotten: if it becomes an issue, atomic operations
> > > > can be introduced there. Any uber-precise measurements in the
> > > > case when we are close to the edge will not give us any benefit
> > > > at all, since we are already in the grey area.
> > >
> > > This is not just inaccurate, it is suicide.  Keep leaking throttle
> > > counts and eventually all of them will be gone.  No more IO
> > > on that block device!
> >
> > First, because the number of increment and decrement operations is
> > the same, it will dance around the limit in both directions.
> 
> No.  Please go and read the description of the race again.  A count
> gets irretrievably lost because the write operation of the first
> decrement is overwritten by the second. Data gets lost.  Atomic 
> operations exist to prevent that sort of thing.  You either need to use 
> them or have a deep understanding of SMP read and write ordering in 
> order to preserve data integrity by some equivalent algorithm.

I think you should complete your emotional email with a description of
how atomic types work and how processors access data, just to give a
lesson to those who never knew how SMP works but still create patches
and have the nerve to send and even discuss them.
Then, if you want to (which I doubt), you can reread the previous
mails and find that this race, and possible ways to solve it, were
pointed out long ago.
Anyway, I prefer to look like I do not know how SMP and atomic operations
work and thus stay away from this discussion.

> --- 2.6.22.clean/block/ll_rw_blk.c	2007-07-08 16:32:17.000000000 -0700
> +++ 2.6.22/block/ll_rw_blk.c	2007-08-24 12:07:16.000000000 -0700
> @@ -3237,6 +3237,15 @@ end_io:
>   */
>  void generic_make_request(struct bio *bio)
>  {
> +	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
> +
> +	if (q && q->metric) {
> +		int need = bio->bi_reserved = q->metric(bio);
> +		bio->queue = q;

In case you have a stacked device, this entry will be rewritten and you
will lose all your accounting data.

> +		wait_event_interruptible(q->throttle_wait, atomic_read(&q->available) >= need);
> +		atomic_sub(need, &q->available);
> +	}

-- 
	Evgeniy Polyakov
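
The lost decrement described in the quoted text can be replayed
deterministically. The sketch below is plain userspace C11, not kernel
code: lost_update_replay() spells out by hand the load and store halves
of two non-atomic decrements in the racy order, and atomic_replay()
performs the same two decrements with atomic_fetch_sub(), where no other
access can fall between the load and the store. Both function names are
hypothetical.

```c
#include <stdatomic.h>

/* Two CPUs decrement a plain counter; the bad interleaving is written
 * out step by step instead of relying on real thread timing. */
static int lost_update_replay(int start)
{
	int counter = start;
	int cpu0 = counter;  /* CPU0 loads the counter */
	int cpu1 = counter;  /* CPU1 loads the same, still unchanged value */

	counter = cpu0 - 1;  /* CPU0 stores its result */
	counter = cpu1 - 1;  /* CPU1 overwrites it: one decrement is lost */
	return counter;
}

/* The same two decrements as indivisible read-modify-write operations. */
static int atomic_replay(int start)
{
	atomic_int counter;

	atomic_init(&counter, start);
	atomic_fetch_sub(&counter, 1);
	atomic_fetch_sub(&counter, 1);
	return atomic_load(&counter);
}
```

With a starting value of 10, the hand-replayed race ends at 9 although
two decrements ran, while the atomic version ends at 8.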

^ permalink raw reply

* Re: RFC: issues concerning the next NAPI interface
From: Jan-Bernd Themann @ 2007-08-29  8:43 UTC (permalink / raw)
  To: James Chapman
  Cc: David Miller, shemminger, akepner, netdev, raisch, themann,
	linux-kernel, linuxppc-dev, meder, tklein, stefan.roscher
In-Reply-To: <46D52B14.8010508@katalix.com>

On Wednesday 29 August 2007 10:15, James Chapman wrote:
> Jan-Bernd Themann wrote:
> > What I'm trying to improve with this approach is interrupt
> > mitigation for NICs where the hardware support for interrupt
> > mitigation is limited. I'm not trying to improve this for NICs
> > that work well with the means their HW provides. I'm aware of
> > the fact that this scheme has its tradeoffs and certainly
> > can not be as good as a HW approach.
> > So I'm grateful for any ideas that have fewer tradeoffs and
> > provide a mechanism to reduce interrupts without depending on
> > HW support of the NIC.
> > 
> > In the end I want to reduce the CPU utilization. And one way
> > to do that is LRO which also works only well if there are more
> > than just a very few packets to aggregate. So at least our
> > driver (eHEA) would benefit from a mix of timer based polling
> > and plain NAPI (depending on load situations).
> 
> Wouldn't you achieve the same result by enabling hardware interrupt 
> mitigation in eHEA in combination with NAPI? Presumably a 10G interface 
> has hardware mitigation features?

Quote from above: "What I'm trying to improve with this approach 
is interrupt mitigation for NICs where the hardware support for
interrupt mitigation is limited"

So guess why I'm doing that ;-)

> 
> > If there is no need for a generic mechanism for this kind of
> > network adapters, then we can just leave this to each device
> > driver.
> 
> I've been looking at this from a different angle. My goal is to optimize 
> NAPI packet forwarding rates while minimizing packet latency. Using 
> hardware interrupt mitigation hurts latency so I'm investigating ways to 
> turn it off without risking NAPI poll on/off thrashing at certain packet 
> rates.
> 
> Jan-Bernd, I think I've found a solution to the issue that you 
> highlighted with my scheme yesterday and it doesn't involve generating 
> other interrupts using hrtimers etc. :) Initial results are very 
> encouraging in my setups. Would you be willing to test it with eHEA? I 
> don't have a 10G setup. If results are encouraging, I'll post an RFC to 
> ask for review / feedback from the NAPI experts here. What do you think?
> 

I'm not sure which solution you mean. If you post your RFC, please start
a new thread (with a different title).


^ permalink raw reply

* Re: RFC: issues concerning the next NAPI interface
From: Jan-Bernd Themann @ 2007-08-29  8:31 UTC (permalink / raw)
  To: David Miller
  Cc: jchapman, shemminger, akepner, netdev, raisch, themann,
	linux-kernel, linuxppc-dev, meder, tklein, stefan.roscher
In-Reply-To: <20070829.012916.02298847.davem@davemloft.net>

On Wednesday 29 August 2007 10:29, David Miller wrote:
> From: Jan-Bernd Themann <ossthema@de.ibm.com>
> Date: Wed, 29 Aug 2007 09:10:15 +0200
> 
> > In the end I want to reduce the CPU utilization. And one way
> > to do that is LRO which also works only well if there are more
> > > than just a very few packets to aggregate. So at least our
> > driver (eHEA) would benefit from a mix of timer based polling
> > and plain NAPI (depending on load situations).
> > 
> > If there is no need for a generic mechanism for this kind of
> > network adapters, then we can just leave this to each device
> > driver.
> 
> No objections from me either way, if something works then
> fine.
> 
> Let's come back to this once you have a tested sample implementation
> that does what you want, ok?

Sounds good

^ permalink raw reply

* Re: RFC: issues concerning the next NAPI interface
From: David Miller @ 2007-08-29  8:29 UTC (permalink / raw)
  To: ossthema
  Cc: jchapman, shemminger, akepner, netdev, raisch, themann,
	linux-kernel, linuxppc-dev, meder, tklein, stefan.roscher
In-Reply-To: <46D51BD7.6040904@de.ibm.com>

From: Jan-Bernd Themann <ossthema@de.ibm.com>
Date: Wed, 29 Aug 2007 09:10:15 +0200

> In the end I want to reduce the CPU utilization. And one way
> to do that is LRO which also works only well if there are more
> than just a very few packets to aggregate. So at least our
> driver (eHEA) would benefit from a mix of timer based polling
> and plain NAPI (depending on load situations).
> 
> If there is no need for a generic mechanism for this kind of
> network adapters, then we can just leave this to each device
> driver.

No objections from me either way, if something works then
fine.

Let's come back to this once you have a tested sample implementation
that does what you want, ok?

^ permalink raw reply
