* Re: 2.6.20->2.6.21 - networking dies after random time
From: Jarek Poplawski @ 2007-08-07 10:09 UTC (permalink / raw)
To: Chuck Ebbert
Cc: Ingo Molnar, Marcin Ślusarz, Thomas Gleixner, Linus Torvalds,
Jean-Baptiste Vignaud, linux-kernel, shemminger, linux-net,
netdev, Andrew Morton, Alan Cox
In-Reply-To: <46B75DD4.5080709@redhat.com>
On Mon, Aug 06, 2007 at 01:43:48PM -0400, Chuck Ebbert wrote:
> On 08/06/2007 03:03 AM, Ingo Molnar wrote:
> >
> > But, since level types don't need this retriggers too much I think
> > this "don't mask interrupts by default" idea should be rethinked:
> > is there enough gain to risk such hard to diagnose errors?
> >
> >
>
> I reverted those masking changes in Fedora and the baffling problem
> with 3Com 3C905 network adapters went away.
>
> Before, they would print:
>
> eth0: transmit timed out, tx_status 00 status e601.
> diagnostics: net 0ccc media 8880 dma 0000003a fifo 0000
> eth0: Interrupt posted but not delivered -- IRQ blocked by another device?
> Flags; bus-master 1, dirty 295757(13) current 295757(13)
> Transmit list 00000000 vs. f7150a20.
> 0: @f7150200 length 80000070 status 0c010070
> 1: @f71502a0 length 80000070 status 0c010070
> 2: @f7150340 length 8000005c status 0c01005c
>
> Now they just work, apparently...
>
> So why not just revert the change?
>
Ingo has written about such a possibility. But it would also be good
to know precisely which place is to blame. Since this diagnosis takes
time, I think Chuck is right, and maybe at least the temporary patch
for resend.c, without the warning, should be recommended for the
stable trees (2.6.21 and 2.6.22)?
Jarek P.
^ permalink raw reply
* [PATCH] phy layer: fix phy_mii_ioctl for autonegotiation
From: Domen Puncer @ 2007-08-07 10:12 UTC (permalink / raw)
To: netdev; +Cc: macro
Fix a thinko (?) in setting phydev->autoneg.
Signed-off-by: Domen Puncer <domen.puncer@telargo.com>
---
This fixes my "mii.h -> ethtool.h advertising #defines". I'm not sure
why and how they're translated, but it does work now.
Maybe they're just ignored, since mii-tool directly reads and writes
MII registers.
drivers/net/phy/phy.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Index: work-powerpc.git/drivers/net/phy/phy.c
===================================================================
--- work-powerpc.git.orig/drivers/net/phy/phy.c
+++ work-powerpc.git/drivers/net/phy/phy.c
@@ -261,7 +261,7 @@ void phy_sanitize_settings(struct phy_de
/* Sanitize settings based on PHY capabilities */
if ((features & SUPPORTED_Autoneg) == 0)
- phydev->autoneg = 0;
+ phydev->autoneg = AUTONEG_DISABLE;
idx = phy_find_valid(phy_find_setting(phydev->speed, phydev->duplex),
features);
@@ -374,7 +374,7 @@ int phy_mii_ioctl(struct phy_device *phy
if (mii_data->phy_id == phydev->addr) {
switch(mii_data->reg_num) {
case MII_BMCR:
- if (val & (BMCR_RESET|BMCR_ANENABLE))
+ if ((val & (BMCR_RESET|BMCR_ANENABLE)) == 0)
phydev->autoneg = AUTONEG_DISABLE;
else
phydev->autoneg = AUTONEG_ENABLE;
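A minimal standalone sketch of the corrected check, for anyone who wants
to see the truth table outside the driver. The constant values are
reproduced from memory of mii.h and ethtool.h and should be treated as
assumptions; autoneg_from_bmcr() is a hypothetical helper, not a
function in phy.c.

```c
#include <assert.h>   /* for the usage checks described below */
#include <stdint.h>

/* Assumed values, believed to match linux/mii.h and linux/ethtool.h. */
#define BMCR_RESET      0x8000
#define BMCR_ANENABLE   0x1000
#define AUTONEG_DISABLE 0x00
#define AUTONEG_ENABLE  0x01

/*
 * Hypothetical helper mirroring the patched branch: autonegotiation is
 * treated as disabled only when neither RESET nor ANENABLE is set in
 * the value written to BMCR.
 */
static int autoneg_from_bmcr(uint16_t val)
{
	if ((val & (BMCR_RESET | BMCR_ANENABLE)) == 0)
		return AUTONEG_DISABLE;
	return AUTONEG_ENABLE;
}
```

With these definitions a BMCR write of 0x2100 (100 Mbit/s, full duplex,
autonegotiation off) maps to AUTONEG_DISABLE, while any value with
BMCR_ANENABLE or BMCR_RESET set maps to AUTONEG_ENABLE. The pre-patch
test had the two branches swapped.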
^ permalink raw reply
* Re: Distributed storage.
From: Jens Axboe @ 2007-08-07 12:05 UTC (permalink / raw)
To: Daniel Phillips
Cc: Evgeniy Polyakov, netdev, linux-kernel, linux-fsdevel,
Peter Zijlstra
In-Reply-To: <200708051423.45484.phillips@phunq.net>
On Sun, Aug 05 2007, Daniel Phillips wrote:
> A simple way to solve the stable accounting field issue is to add a new
> pointer to struct bio that is owned by the top level submitter
> (normally generic_make_request but not always) and is not affected by
> any recursive resubmission. Then getting rid of that field later
> becomes somebody's summer project, which is not all that urgent because
> struct bio is already bloated up with a bunch of dubious fields and is
> a transient structure anyway.
Thanks for your insights. Care to detail what bloat and dubious fields
struct bio has?
And we don't add temporary fields out of laziness, hoping that "someone"
will later kill it again and rewrite it in a nicer fashion. Hint: that
never happens, bloat sticks.
--
Jens Axboe
^ permalink raw reply
* Re: 2.6.20->2.6.21 - networking dies after random time
From: Jarek Poplawski @ 2007-08-07 12:13 UTC (permalink / raw)
To: Marcin Ślusarz
Cc: Ingo Molnar, Thomas Gleixner, Linus Torvalds,
Jean-Baptiste Vignaud, linux-kernel, shemminger, linux-net,
netdev, Andrew Morton, Alan Cox
In-Reply-To: <20070807095246.GB3223@ff.dom.local>
On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote:
> On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
> > 2007/8/7, Jarek Poplawski <jarkao2@o2.pl>:
> > > On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote:
> > > > Network card still locks up (tested on 2.6.22.1). I had to upload more
> > > > data than usual (~350 MB vs ~1-100 MB) to trigger that bug but it
> > > > might be a coincidence...
> > >
> > > Thanks! It's a good news after all - it would be really strange why
> > > this place doesn't hit more people (it seems there is some safety
> > > elsewhere for this).
> > >
> > > BTW: I hope, this previous Thomas' patch with Ingo's warning to resend.c
> > > (with a warning), had no problems with a similar load?
> > I always tested on 500-600 MB "dataset"
> >
> > > PS: Marcin, if you need a break in this testing let us know!
> > No, i don't need a break. I'll have more time in next weeks.
>
> Great! So, I'll try to send a patch with _SW_RESEND in a few hours,
> if Ingo doesn't prepare something for you.
So, let's try this idea next: a modified version of Ingo's "x86: activate
HARDIRQS_SW_RESEND" patch.
(Don't forget to run make oldconfig before make.)
For testing only.
Cheers,
Jarek P.
PS: alas there was not even time for "compile checking"...
---
diff -Nurp 2.6.22.1-/arch/i386/Kconfig 2.6.22.1/arch/i386/Kconfig
--- 2.6.22.1-/arch/i386/Kconfig 2007-07-09 01:32:17.000000000 +0200
+++ 2.6.22.1/arch/i386/Kconfig 2007-08-07 13:13:03.000000000 +0200
@@ -1252,6 +1252,10 @@ config GENERIC_PENDING_IRQ
depends on GENERIC_HARDIRQS && SMP
default y
+config HARDIRQS_SW_RESEND
+ bool
+ default y
+
config X86_SMP
bool
depends on SMP && !X86_VOYAGER
diff -Nurp 2.6.22.1-/arch/x86_64/Kconfig 2.6.22.1/arch/x86_64/Kconfig
--- 2.6.22.1-/arch/x86_64/Kconfig 2007-07-09 01:32:17.000000000 +0200
+++ 2.6.22.1/arch/x86_64/Kconfig 2007-08-07 13:13:03.000000000 +0200
@@ -690,6 +690,10 @@ config GENERIC_PENDING_IRQ
depends on GENERIC_HARDIRQS && SMP
default y
+config HARDIRQS_SW_RESEND
+ bool
+ default y
+
menu "Power management options"
source kernel/power/Kconfig
diff -Nurp 2.6.22.1-/kernel/irq/manage.c 2.6.22.1/kernel/irq/manage.c
--- 2.6.22.1-/kernel/irq/manage.c 2007-07-09 01:32:17.000000000 +0200
+++ 2.6.22.1/kernel/irq/manage.c 2007-08-07 13:13:03.000000000 +0200
@@ -169,6 +169,14 @@ void enable_irq(unsigned int irq)
desc->depth--;
}
spin_unlock_irqrestore(&desc->lock, flags);
+#ifdef CONFIG_HARDIRQS_SW_RESEND
+ /*
+ * Do a bh disable/enable pair to trigger any pending
+ * irq resend logic:
+ */
+ local_bh_disable();
+ local_bh_enable();
+#endif
}
EXPORT_SYMBOL(enable_irq);
diff -Nurp 2.6.22.1-/kernel/irq/resend.c 2.6.22.1/kernel/irq/resend.c
--- 2.6.22.1-/kernel/irq/resend.c 2007-07-09 01:32:17.000000000 +0200
+++ 2.6.22.1/kernel/irq/resend.c 2007-08-07 13:57:54.000000000 +0200
@@ -62,16 +62,24 @@ void check_irq_resend(struct irq_desc *d
*/
desc->chip->enable(irq);
+ /*
+ * Temporary hack to figure out more about the problem, which
+ * is causing the ancient network cards to die.
+ */
+
if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
desc->status = (status & ~IRQ_PENDING) | IRQ_REPLAY;
- if (!desc->chip || !desc->chip->retrigger ||
- !desc->chip->retrigger(irq)) {
+ if (desc->handle_irq == handle_edge_irq) {
+ if (desc->chip->retrigger)
+ desc->chip->retrigger(irq);
+ return;
+ }
#ifdef CONFIG_HARDIRQS_SW_RESEND
- /* Set it pending and activate the softirq: */
- set_bit(irq, irqs_resend);
- tasklet_schedule(&resend_tasklet);
+ WARN_ON_ONCE(1);
+ /* Set it pending and activate the softirq: */
+ set_bit(irq, irqs_resend);
+ tasklet_schedule(&resend_tasklet);
#endif
- }
}
}
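The state test at the top of the modified check_irq_resend() can be read
in isolation. A minimal sketch, using made-up bit values
(DEMO_IRQ_PENDING and DEMO_IRQ_REPLAY are placeholders, not the kernel's
real flag values) and hypothetical helper names:

```c
#include <assert.h>   /* for the usage checks described below */
#include <stdbool.h>

/* Placeholder bit values for illustration only. */
#define DEMO_IRQ_REPLAY  0x1u
#define DEMO_IRQ_PENDING 0x2u

/*
 * True only when the interrupt is marked pending and has not already
 * been replayed, which is the condition tested in the patch above.
 */
static bool needs_resend(unsigned int status)
{
	return (status & (DEMO_IRQ_PENDING | DEMO_IRQ_REPLAY)) ==
	       DEMO_IRQ_PENDING;
}

/* Clear the pending bit and set the replay bit, as the patch does. */
static unsigned int mark_replayed(unsigned int status)
{
	return (status & ~DEMO_IRQ_PENDING) | DEMO_IRQ_REPLAY;
}
```

Once mark_replayed() has run, needs_resend() returns false for the same
status word, so a given pending interrupt is replayed at most once until
the replay bit is cleared elsewhere.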
^ permalink raw reply
* Re: Possible bug in realtek 8169 ethernet driver
From: Bram @ 2007-08-07 12:45 UTC (permalink / raw)
To: Francois Romieu; +Cc: linux-kernel, netdev
In-Reply-To: <20070806210637.GA18611@electric-eye.fr.zoreil.com>
Francois Romieu wrote:
> Bram <bram@linux.kernel.as.avontuur.org> :
> [...]
> > The device now works! But, it still comes up as eth2 instead of eth0,
> > even though it's first detected as eth0. There are no other network
>
> Check the udev rules and/or your init scripts ?
>
You're right, it's a udev script assigning new names to unknown cards, I
wasn't aware of that.
Thanks,
Bram
^ permalink raw reply
* Re: [PATCH RFC]: napi_struct V5
From: jamal @ 2007-08-07 12:52 UTC (permalink / raw)
To: David Miller; +Cc: netdev, shemminger, jgarzik, rusty
In-Reply-To: <20070805.232423.21363072.davem@davemloft.net>
On Sun, 2007-05-08 at 23:24 -0700, David Miller wrote:
>
> 3) Attempt to bring NAPI howto as uptodate as is possible for such
> a rotting document. :)
That doc is out of date on the split of work - it focuses mostly on
describing the original tulip, which did not mix rx and tx in the
napi_poll(). AFAIK, no driver does that today (although i really liked
that scheme, there is a lot of fscked hardware out there that won't work
well with it). Where are the firemen when you need them?
Scanning your changes on the drivers for hardware i possess, I don't see
any issues.
cheers,
jamal
^ permalink raw reply
* Re: 2.6.20->2.6.21 - networking dies after random time
From: Jarek Poplawski @ 2007-08-07 12:55 UTC (permalink / raw)
To: Marcin Ślusarz
Cc: Ingo Molnar, Thomas Gleixner, Linus Torvalds,
Jean-Baptiste Vignaud, linux-kernel, shemminger, linux-net,
netdev, Andrew Morton, Alan Cox
In-Reply-To: <20070807121339.GA3946@ff.dom.local>
On Tue, Aug 07, 2007 at 02:13:39PM +0200, Jarek Poplawski wrote:
> On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote:
> > On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote:
...
> > > No, i don't need a break. I'll have more time in next weeks.
> >
> > Great! So, I'll try to send a patch with _SW_RESEND in a few hours,
> > if Ingo doesn't prepare something for you.
>
> So, the let's try this idea yet: modified Ingo's "x86: activate
> HARDIRQS_SW_RESEND" patch.
> (Don't forget about make oldconfig before make.)
> For testing only.
>
> Cheers,
> Jarek P.
>
> PS: alas there was not even time for "compile checking"...
And here is one more patch to test the same idea (chip->retrigger()).
Let's try i386 way! (I hope I will not be arrested for this...)
(Should be tested without any previous patches.)
Jarek P.
PS: as above
---
diff -Nurp 2.6.22.1-/arch/x86_64/kernel/io_apic.c 2.6.22.1/arch/x86_64/kernel/io_apic.c
--- 2.6.22.1-/arch/x86_64/kernel/io_apic.c 2007-07-09 01:32:17.000000000 +0200
+++ 2.6.22.1/arch/x86_64/kernel/io_apic.c 2007-08-07 14:37:45.000000000 +0200
@@ -1311,15 +1311,8 @@ static unsigned int startup_ioapic_irq(u
static int ioapic_retrigger_irq(unsigned int irq)
{
struct irq_cfg *cfg = &irq_cfg[irq];
- cpumask_t mask;
- unsigned long flags;
-
- spin_lock_irqsave(&vector_lock, flags);
- cpus_clear(mask);
- cpu_set(first_cpu(cfg->domain), mask);
- send_IPI_mask(mask, cfg->vector);
- spin_unlock_irqrestore(&vector_lock, flags);
+ send_IPI_self(cfg->vector);
return 1;
}
^ permalink raw reply
* Re: e100 (was: eepro100) - Nobody Cares (hardware?)
From: ericj @ 2007-08-07 12:49 UTC (permalink / raw)
To: Kok, Auke; +Cc: Jeff Garzik, NetDev
In-Reply-To: <46B7C095.5010202@intel.com>
[-- Attachment #1: Type: text/plain, Size: 2006 bytes --]
On Mon, 06 Aug 2007 17:45:09 -0700, Kok, Auke wrote
> [moving to netdev mailinglist]
> Eric,
>
> please don't forget that an entire team here at Intel is
> dedicated to supporting e100 and pro/1000 devices from Intel.
>
> Most of the pro/100 features are documented in the SDM which
> contains some references to the eeprom parts. Mostly the
> device doesn't need much configuration from the eeprom to work
> (unlike gigE parts). The SDM can be downloaded from our sf.net
> project page:
>
>
> http://sourceforge.net/project/showfiles.php?group_id=42302&package_id=68544
>
> The issue that you are reporting:
>
> "My system boots fine but when I try to bring up the onboard
> ethernet (an EEPro 100 VE) I get a "Nobody Cares" message and
> the interrupt is disabled."
>
> However has been recently patched. This should have worked
> regardless of whether you used e100 or eepro100 (noting that
> nobody supports eepro100 anymore, you should really use e100
> for all tests).
>
> if you look in drivers/pci/quirks.c you'll find that there is
> specific code for e100 devices. If this quirk doesn't work for
> you then we'll need to dig into that. For this I'd like you to
> gather:
>
> - `ethtool -e eth0` output
> - `lspci -n` output
>
> this will allow me to check the quirck code and see if it has
> the right device ID. I'm suspecting that the device ID is
> missing somehow, or the workaround fails.
>
> Auke
Thanks for the help.
Here are the lspci -n and ethtool -e outputs. I am attaching both the
results for the 'bad' unit and for another one which is supposedly
identical except for some battery charge circuitry.
The eeprom data on the bad one may be a little odd due to my trying to
make it match that of the good one, including that I forgot what the
real MAC address was supposed to be.
I can get one that I haven't screwed up if you need it, but it will
probably take all day.
--
"A hunch is creativity trying to tell you something" -- Frank Capra
Eric Johnson
[-- Attachment #2: lspci_good.txt --]
[-- Type: application/octet-stream, Size: 585 bytes --]
00:00.0 0600: 8086:3580 (rev 02)
00:00.1 0880: 8086:3584 (rev 02)
00:00.3 0880: 8086:3585 (rev 02)
00:02.0 0300: 8086:3582 (rev 02)
00:02.1 0380: 8086:3582 (rev 02)
00:1d.0 0c03: 8086:24c2 (rev 02)
00:1d.1 0c03: 8086:24c4 (rev 02)
00:1d.2 0c03: 8086:24c7 (rev 02)
00:1d.7 0c03: 8086:24cd (rev 02)
00:1e.0 0604: 8086:244e (rev 82)
00:1f.0 0601: 8086:24c0 (rev 02)
00:1f.1 0101: 8086:24cb (rev 02)
00:1f.3 0c05: 8086:24c3 (rev 02)
00:1f.5 0401: 8086:24c5 (rev 02)
01:08.0 0200: 8086:103a (rev 82)
01:0c.0 0280: 1814:0302
01:0d.0 0607: 104c:ac55 (rev 01)
01:0d.1 0607: 104c:ac55 (rev 01)
[-- Attachment #3: lspci_bad.txt --]
[-- Type: application/octet-stream, Size: 585 bytes --]
00:00.0 0600: 8086:3580 (rev 02)
00:00.1 0880: 8086:3584 (rev 02)
00:00.3 0880: 8086:3585 (rev 02)
00:02.0 0300: 8086:3582 (rev 02)
00:02.1 0380: 8086:3582 (rev 02)
00:1d.0 0c03: 8086:24c2 (rev 02)
00:1d.1 0c03: 8086:24c4 (rev 02)
00:1d.2 0c03: 8086:24c7 (rev 02)
00:1d.7 0c03: 8086:24cd (rev 02)
00:1e.0 0604: 8086:244e (rev 82)
00:1f.0 0601: 8086:24c0 (rev 02)
00:1f.1 0101: 8086:24cb (rev 02)
00:1f.3 0c05: 8086:24c3 (rev 02)
00:1f.5 0401: 8086:24c5 (rev 02)
01:08.0 0200: 8086:103a (rev 82)
01:0c.0 0280: 1814:0302
01:0d.0 0607: 104c:ac55 (rev 01)
01:0d.1 0607: 104c:ac55 (rev 01)
[-- Attachment #4: ethtool_bad.txt --]
[-- Type: application/octet-stream, Size: 486 bytes --]
Offset Values
------ ------
0x0000 00 02 b3 c0 ff ee 00 00 00 00 ff ff ff ff ff ff
0x0010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 42 09
[-- Attachment #5: ethtool_good.txt --]
[-- Type: application/octet-stream, Size: 486 bytes --]
Offset Values
------ ------
0x0000 00 1b ec 00 00 57 00 00 00 00 ff ff ff ff ff ff
0x0010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 08 48
^ permalink raw reply
* Re: fscked clock sources revisited
From: jamal @ 2007-08-07 13:19 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Robert.Olsson, shemminger, kaber
In-Reply-To: <1185848076.5162.39.camel@localhost>
On Mon, 2007-30-07 at 22:14 -0400, jamal wrote:
> I am going to test with hpet when i get the chance
Couldn't figure out how to turn hpet on/off, so didn't test.
> and perhaps turn off all the other sources if nothing good comes out; i
> need my numbers ;->
Here are some numbers that make the mystery even more interesting. This
is with kernel 2.6.22-rc4. Repeating with kernel 2.6.23-rc1 didn't show
anything different. I went back to test on 2.6.22-rc4 because it is the
base for my batching patches - and since those drove me to this test, i
wanted something that reduces variables when comparing with batching.
I picked udp for this test because i can select different packet sizes.
i used iperf. The sender is a dual opteron with tg3. The receiver is a
dual xeon.
The default HZ is 250. Each packet size was run 3 times with each
clock source. The experiment made sure that the receiver wasn't a
bottleneck (increased socket buffer sizes etc).
Packet (bytes) | jiffies (1/250) | tsc           | acpi_pm
---------------|-----------------|---------------|--------------
64             | 141, 145, 142   | 131, 136, 130 | 103, 104, 110
128            | 256, 256, 256   | 274, 260, 269 | 216, 206, 220
512            | 513, 513, 513   | 886, 886, 886 | 828, 814, 806
1280           | 684, 684, 684   | 951, 951, 951 | 951, 951, 951
So i was wrong to declare jiffies as being good. The last batch of
experiments were based on only 64 byte UDP. Clearly as packet size goes
up, the results are worse with jiffies.
At this point, i decided to recompile the kernel with HZ=1000 and the
observations show that the jiffies results are improved.
Packet (bytes) | jiffies (1/1000) | tsc           | acpi_pm
---------------|------------------|---------------|--------------
64             | 145, 135, 135    | 131, 137, 139 | 110, 110, 108
128            | 257, 257, 257    | 270, 264, 250 | 218, 216, 217
512            | 819, 776, 819    | 886, 886, 886 | 841, 824, 846
1280           | 855, 855, 855    | 951, 950, 951 | 951, 951, 951
Still not as good as the other two at large packet sizes.
For this machine: the ideal clock source would be jiffies with
HZ=1000 up to about 100 bytes, then change to tsc. Of course i could pick
tsc but people have dissed it so far - i probably didn't hit the
condition where it goes into deep slumber.
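The crossover can be checked mechanically from the HZ=1000 table above.
A minimal sketch, assuming the figures are throughput so that higher is
better (the message does not state the unit); the type and function
names are hypothetical:

```c
#include <assert.h>   /* for the usage checks described below */

enum clock_source { SRC_JIFFIES = 0, SRC_TSC = 1, SRC_ACPI_PM = 2, SRC_COUNT = 3 };
#define RUNS 3

struct row {
	int packet_size;             /* bytes */
	int runs[SRC_COUNT][RUNS];   /* three runs per clock source */
};

/* Figures copied from the HZ=1000 table. */
static const struct row hz1000_rows[] = {
	{   64, { {145, 135, 135}, {131, 137, 139}, {110, 110, 108} } },
	{  128, { {257, 257, 257}, {270, 264, 250}, {218, 216, 217} } },
	{  512, { {819, 776, 819}, {886, 886, 886}, {841, 824, 846} } },
	{ 1280, { {855, 855, 855}, {951, 950, 951}, {951, 951, 951} } },
};

/* Sum of the three runs; comparing sums is equivalent to comparing means. */
static int run_sum(const struct row *r, enum clock_source src)
{
	int sum = 0;
	for (int i = 0; i < RUNS; i++)
		sum += r->runs[src][i];
	return sum;
}

/* Clock source with the strictly highest total for this packet size. */
static enum clock_source best_source(const struct row *r)
{
	enum clock_source best = SRC_JIFFIES;
	for (int s = 1; s < SRC_COUNT; s++)
		if (run_sum(r, (enum clock_source)s) > run_sum(r, best))
			best = (enum clock_source)s;
	return best;
}
```

On these numbers jiffies comes out ahead only at 64 bytes and tsc leads
at 128 and 512 bytes. At 1280 bytes tsc and acpi_pm are effectively
tied (acpi_pm is ahead by a single unit over three runs).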
Any insights? This makes it hard to quantify batching experimental
improvements as i feel it could be architecture or worse machine
dependent.
cheers,
jamal
^ permalink raw reply
* [RFC] stuff from tcp-2.6 partially merged to upcoming net-2.6.24?
From: Ilpo Järvinen @ 2007-08-07 13:19 UTC (permalink / raw)
To: David Miller; +Cc: Netdev
Hi Dave,
...Noticed you were planning to open net-2.6.24 tree... IMHO, part of the
stuff in tcp-2.6 could be merged to 2.6.24. I suggest that most of the
stuff which is not directly related to the rbtree, new lost marker, or
sacktag reorganization is taken. Some of those things are very trivial
to take as they do not introduce any conflicts. Besides that, there
is some stuff that would need some work if taken, as it is built on
top of stuff that will remain only in tcp-2.6 (includes left_out removal
and IsReno/Fack conversion)... But if it's ok, I could try to come up with
a solution even for those... Perhaps do this in two (or more) stages by
first taking the trivial ones...
I tried rebasing tcp-2.6 (there's some not yet submitted work on top of
it too) onto be1b685fe6c9928848b26b568eaa86ba8ce0046c; the result is
here:
http://www.cs.helsinki.fi/u/ijjarvin/tcp-rebase/{before,after}
...There was at least one gotcha (sacktag's flag reset position change
when sacktag_state is created). But all in all, conflicts weren't that
hard to resolve... One may resolve some things differently than I did,
so YMMV if you want to try that yourself... :-) ...I also diffed
all.patch'es to see if there was some undesired side-effect from diff
but didn't find any. Currently only compile tested.
Do you have any suggestion how I should proceed? Or do you perhaps object
to such a partial merge completely? ...I could try to come up with a cleaned
up patch series which has the original changes and their bug fixes combined
into a single patch per change (would provide cleaner history and shouldn't
be very hard to do either)...
--
i.
^ permalink raw reply
* Fw: [Bug 8845] New: Kernel 2.6.23-RC2: TCP + ICH9 + Amule + Hours = Freeze of my Debian => Reboot
From: Stephen Hemminger @ 2007-08-07 13:37 UTC (permalink / raw)
To: netdev
Any takers?
Subject: [Bug 8845] New: Kernel 2.6.23-RC2: TCP + ICH9 + Amule + Hours = Freeze of my Debian => Reboot
http://bugzilla.kernel.org/show_bug.cgi?id=8845
Summary: Kernel 2.6.23-RC2: TCP + ICH9 + Amule + Hours = Freeze
of my Debian => Reboot
Product: Networking
Version: 2.5
KernelVersion: Kernel 2.6.23-RC2
Platform: All
OS/Version: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: IPV4
AssignedTo: shemminger@osdl.org
ReportedBy: v4ahixbmb4@forgetably.com
Hello!
I had the same problem with kernel 2.6.22.
I have a very recent Asus P5K-VM motherboard, and with the same harddrive and
software on an Asus P5B-VM I had no problem. I have no problem with Ktorrent.
When I use amule with only the udp mode (Kademlia), my Debian Etch/Lenny works
alright for 24 hours straight, with dozens of active downloads.
When I use the regular tcp mode (edonkey protocol), after 3 hours my Debian
completely freezes (no hard drive activity, no console accessible, impossible to
trigger a reboot from the keyboard with the right sequence of keys), and I have
to reboot.
I reproduced this more than ten times.
I use an old D-Link DFE-530TX ethernet card with which I never had any problem
over the years. I have a cable internet connection (1 MB/s up/ 30MB/s down)
There is nothing in the logs before the freeze.
--
Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
^ permalink raw reply
* [PATCH] [iputils] Print received packets as icmp_seq
From: Alexander Graf @ 2007-08-07 13:49 UTC (permalink / raw)
To: netdev; +Cc: Alexander Graf
Now, ping and ping6 print the packets which are actually received, too, not
only the amount of sent packets.
It has the format:
icmp_seq=received/seq
Signed-off-by: Alexander Graf <sohalt@gmail.com>
---
ping_common.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/ping_common.c b/ping_common.c
index be36cbd..83be553 100644
--- a/ping_common.c
+++ b/ping_common.c
@@ -711,7 +711,7 @@ restamp:
} else {
int i;
__u8 *cp, *dp;
- printf("%d bytes from %s: icmp_seq=%u", cc, from, seq);
+ printf("%d bytes from %s: icmp_seq=%li/%u", cc, from, nreceived, seq);
if (hops >= 0)
printf(" ttl=%d", hops);
--
1.5.2.4
^ permalink raw reply related
* [PATCH] [iputils] Print packet loss with more precision
From: Alexander Graf @ 2007-08-07 13:49 UTC (permalink / raw)
To: netdev; +Cc: Alexander Graf
In-Reply-To: <1186494590225-git-send-email-sohalt@gmail.com>
Signed-off-by: Alexander Graf <sohalt@gmail.com>
---
ping_common.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/ping_common.c b/ping_common.c
index 83be553..49acab2 100644
--- a/ping_common.c
+++ b/ping_common.c
@@ -795,9 +795,9 @@ void finish(void)
if (nerrors)
printf(", +%ld errors", nerrors);
if (ntransmitted) {
- printf(", %d%% packet loss",
- (int) ((((long long)(ntransmitted - nreceived)) * 100) /
- ntransmitted));
+ printf(", %f%% packet loss",
+ (((long long)(ntransmitted - nreceived)) * 100.0) /
+ ntransmitted);
printf(", time %ldms", 1000*tv.tv_sec+tv.tv_usec/1000);
}
putchar('\n');
--
1.5.2.4
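The effect of the change is easiest to see with small counts. A minimal
sketch of the old and new expressions side by side; the helper names are
hypothetical and the parameters simply mirror the ntransmitted and
nreceived counters used in finish():

```c
#include <assert.h>

/* Old expression: integer division truncates toward zero. */
static int loss_percent_int(long ntransmitted, long nreceived)
{
	assert(ntransmitted > 0);
	return (int)((((long long)(ntransmitted - nreceived)) * 100) /
		     ntransmitted);
}

/* New expression: multiplying by 100.0 promotes the math to double. */
static double loss_percent_double(long ntransmitted, long nreceived)
{
	assert(ntransmitted > 0);
	return (((long long)(ntransmitted - nreceived)) * 100.0) /
	       ntransmitted;
}
```

With 1000 packets sent and 999 received, the integer form reports 0%
while the floating-point form reports roughly 0.1%. Note that a bare
"%f" conversion prints six decimal places.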
^ permalink raw reply related
* [ofa-general] [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
From: Steve Wise @ 2007-08-07 14:37 UTC (permalink / raw)
To: Roland Dreier, David S. Miller; +Cc: netdev, linux-kernel, OpenFabrics General
Networking experts,
I'd like input on the patch below, and help in solving this bug
properly. iWARP devices that support both native stack TCP and iWARP
(aka RDMA over TCP/IP/Ethernet) connections on the same interface need
the fix below or some similar fix to the RDMA connection manager.
This is a BUG in the Linux RDMA-CMA code as it stands today.
Here is the issue:
Consider an mpi cluster running mvapich2. And the cluster runs
MPI/Sockets jobs concurrently with MPI/RDMA jobs. It is possible,
without the patch below, for MPI/Sockets processes to mistakenly get
incoming RDMA connections and vice versa. The way mvapich2 works is
that the ranks all bind and listen to a random port (retrying new random
ports if the bind fails with "in use"). Once they get a free port and
bind/listen, they advertise that port number to the peers to do
connection setup. Currently, without the patch below, the mpi/rdma
processes can end up binding/listening to the _same_ port number as the
mpi/sockets processes running over the native tcp stack. This is due to
duplicate port spaces for native stack TCP and the rdma cm's RDMA_PS_TCP
port space. If this happens, then the connections can get screwed up.
The correct solution in my mind is to use the host stack's TCP port
space for _all_ RDMA_PS_TCP port allocations. The patch below is a
minimal delta to unify the port spaces by using the kernel stack to bind
ports. This is done by allocating a kernel socket and binding to the
appropriate local addr/port. It also allows the kernel stack to pick
ephemeral ports by virtue of just passing in port 0 on the kernel bind
operation.
There has been a discussion already on the RDMA list if anyone is
interested:
http://www.mail-archive.com/general@lists.openfabrics.org/msg05162.html
Thanks,
Steve.
---
RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
This is needed for iwarp providers that support native and rdma
connections over the same interface.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
---
drivers/infiniband/core/cma.c | 27 ++++++++++++++++++++++++++-
1 files changed, 26 insertions(+), 1 deletions(-)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 9e0ab04..e4d2d7f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -111,6 +111,7 @@ struct rdma_id_private {
struct rdma_cm_id id;
struct rdma_bind_list *bind_list;
+ struct socket *sock;
struct hlist_node node;
struct list_head list;
struct list_head listen_list;
@@ -695,6 +696,8 @@ static void cma_release_port(struct rdma
kfree(bind_list);
}
mutex_unlock(&lock);
+ if (id_priv->sock)
+ sock_release(id_priv->sock);
}
void rdma_destroy_id(struct rdma_cm_id *id)
@@ -1790,6 +1793,25 @@ static int cma_use_port(struct idr *ps,
return 0;
}
+static int cma_get_tcp_port(struct rdma_id_private *id_priv)
+{
+ int ret;
+ struct socket *sock;
+
+ ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
+ if (ret)
+ return ret;
+ ret = sock->ops->bind(sock,
+ (struct sockaddr *)&id_priv->id.route.addr.src_addr,
+ ip_addr_size(&id_priv->id.route.addr.src_addr));
+ if (ret) {
+ sock_release(sock);
+ return ret;
+ }
+ id_priv->sock = sock;
+ return 0;
+}
+
static int cma_get_port(struct rdma_id_private *id_priv)
{
struct idr *ps;
@@ -1801,6 +1823,9 @@ static int cma_get_port(struct rdma_id_p
break;
case RDMA_PS_TCP:
ps = &tcp_ps;
+ ret = cma_get_tcp_port(id_priv); /* Synch with native stack */
+ if (ret)
+ goto out;
break;
case RDMA_PS_UDP:
ps = &udp_ps;
@@ -1815,7 +1840,7 @@ static int cma_get_port(struct rdma_id_p
else
ret = cma_use_port(ps, id_priv);
mutex_unlock(&lock);
-
+out:
return ret;
}
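The same reservation idea can be tried from userspace. A minimal sketch,
assuming a Linux/POSIX sockets environment; reserve_tcp_port() is a
hypothetical helper and this is only an analogy to the in-kernel
sock_create_kern()/bind() pair, not the patch itself:

```c
#include <assert.h>   /* for the usage checks described below */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Bind a TCP socket to 127.0.0.1:port without listening on it.
 * Passing port 0 lets the stack pick an ephemeral port.
 * Returns the socket fd and stores the bound port in *port_out,
 * or returns -1 on failure. The port stays reserved until close(fd).
 */
static int reserve_tcp_port(unsigned short port, unsigned short *port_out)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = htons(port);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	*port_out = ntohs(addr.sin_port);
	return fd;
}
```

While the first descriptor is open, a second reserve_tcp_port() call for
the same port is expected to fail with EADDRINUSE, which is the property
the patch relies on to keep the native stack and the RDMA CM from
handing out the same port.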
^ permalink raw reply related
* [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
From: Evgeniy Polyakov @ 2007-08-07 14:54 UTC (permalink / raw)
To: Steve Wise
Cc: netdev, Roland Dreier, linux-kernel, OpenFabrics General,
David S. Miller
In-Reply-To: <46B883B5.8040702@opengridcomputing.com>
Hi Steve.
On Tue, Aug 07, 2007 at 09:37:41AM -0500, Steve Wise (swise@opengridcomputing.com) wrote:
> +static int cma_get_tcp_port(struct rdma_id_private *id_priv)
> +{
> + int ret;
> + struct socket *sock;
> +
> + ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
> + if (ret)
> + return ret;
> + ret = sock->ops->bind(sock,
> + (struct socketaddr
> *)&id_priv->id.route.addr.src_addr,
> + ip_addr_size(&id_priv->id.route.addr.src_addr));
Leaving aside the talk about broken offloading, this one will result in
the case where usual network dataflow can enter private rdma land, i.e.
after the bind succeeds this socket is accessible via any other network
device. Is that intended?
And this is quite a noticeable overhead per rdma connection, btw.
--
Evgeniy Polyakov
^ permalink raw reply
* Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
From: Steve Wise @ 2007-08-07 15:06 UTC (permalink / raw)
To: Evgeniy Polyakov
Cc: Roland Dreier, David S. Miller, netdev, linux-kernel, Sean Hefty,
OpenFabrics General
In-Reply-To: <20070807145441.GA24895@2ka.mipt.ru>
Evgeniy Polyakov wrote:
> Hi Steve.
>
> On Tue, Aug 07, 2007 at 09:37:41AM -0500, Steve Wise (swise@opengridcomputing.com) wrote:
>> +static int cma_get_tcp_port(struct rdma_id_private *id_priv)
>> +{
>> + int ret;
>> + struct socket *sock;
>> +
>> + ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
>> + if (ret)
>> + return ret;
>> + ret = sock->ops->bind(sock,
>> + (struct socketaddr
>> *)&id_priv->id.route.addr.src_addr,
>> + ip_addr_size(&id_priv->id.route.addr.src_addr));
>
> If get away from talks about broken offloading, this one will result in
> the case, when usual network dataflow can enter private rdma land, i.e.
> after bind succeeded this socket is accessible via any other network
> device. Is it inteded?
> And this is quite noticeble overhead per rdma connection, btw.
>
I'm not sure I understand your question? What do you mean by
"accessible"? The intention is to _just_ reserve the addr/port.
The socket struct alloc and bind was a simple way to do this. I
assume we'll have to come up with a better way though.
Namely provide a low level interface to the port space allocator
allowing both rdma and the host tcp stack to share the space without
requiring a socket struct for rdma connections.
Or maybe we'll come up with a different and better solution to this issue...
Steve.
^ permalink raw reply
* [ofa-general] Re: [PATCH RFC] RDMA/CMA: Allocate PS_TCP ports from the host TCP port space.
From: Evgeniy Polyakov @ 2007-08-07 15:39 UTC (permalink / raw)
To: Steve Wise
Cc: netdev, Roland Dreier, linux-kernel, OpenFabrics General,
David S. Miller
In-Reply-To: <46B88A75.3040004@opengridcomputing.com>
On Tue, Aug 07, 2007 at 10:06:29AM -0500, Steve Wise (swise@opengridcomputing.com) wrote:
> >On Tue, Aug 07, 2007 at 09:37:41AM -0500, Steve Wise
> >(swise@opengridcomputing.com) wrote:
> >>+static int cma_get_tcp_port(struct rdma_id_private *id_priv)
> >>+{
> >>+ int ret;
> >>+ struct socket *sock;
> >>+
> >>+ ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
> >>+ if (ret)
> >>+ return ret;
> >>+ ret = sock->ops->bind(sock,
> >>+ (struct socketaddr
> >>*)&id_priv->id.route.addr.src_addr,
> >>+ ip_addr_size(&id_priv->id.route.addr.src_addr));
> >
> >If get away from talks about broken offloading, this one will result in
> >the case, when usual network dataflow can enter private rdma land, i.e.
> >after bind succeeded this socket is accessible via any other network
> >device. Is it inteded?
> >And this is quite noticeble overhead per rdma connection, btw.
> >
>
> I'm not sure I understand your question? What do you mean by
> "accessible"? The intention is to _just_ reserve the addr/port.
The RDMA ->bind() above ends up in tcp_v4_get_port(), which only adds the
socket to bhash. That hash is consulted only for new sockets created for
listening connections or an explicit bind; network traffic checks only the
listening and established hashes, which are not affected by the above change,
so it was a false alarm on my side. It does allow one to 'grab' a port and
forbid its possible reuse.
--
Evgeniy Polyakov
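
The "grab a port without accepting traffic" effect described above can be
observed from userspace as well. The following is only a sketch under stated
assumptions: it uses the ordinary sockets API rather than sock_create_kern(),
and reserve_port() is a hypothetical helper name, not part of the patch. A
socket that is bound but never put into listen state holds the port, so a
second bind to the same port is expected to fail with EADDRINUSE.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a TCP socket to INADDR_ANY:port and never call
 * listen(). Pass port 0 to let the kernel pick a free port. On success the
 * fd is returned and, if bound_port is non-NULL, the port actually bound
 * is stored there. On failure -1 is returned and errno is preserved.
 */
static int reserve_port(unsigned short port, unsigned short *bound_port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int saved_errno;
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        saved_errno = errno;   /* close() may overwrite errno */
        close(fd);
        errno = saved_errno;
        return -1;
    }

    if (bound_port)
        *bound_port = ntohs(addr.sin_port);
    return fd;
}
```

Holding the returned fd open keeps the port reserved; closing it releases the
port again.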
^ permalink raw reply
* Re: [PATCH] drivers/net/ibmveth.c: memset fix
From: Brian King @ 2007-08-07 15:40 UTC (permalink / raw)
To: Mariusz Kozlowski; +Cc: Jeff Garzik, santil, netdev, linux-kernel
In-Reply-To: <200708062344.03443.m.kozlowski@tuxland.pl>
Mariusz Kozlowski wrote:
>>>> Looks like memset() is zeroing wrong nr of bytes.
>>> Good catch, however, I think we can just remove this memset altogether
>>> since the memory gets allocated via kzalloc.
>> Correct, that memset() is superfluous.
>
> Ok. Then this should do it.
Acked-by: Brian King <brking@linux.vnet.ibm.com>
>
> Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
>
> drivers/net/ibmveth.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> --- linux-2.6.23-rc1-mm2-a/drivers/net/ibmveth.c 2007-08-01 08:43:46.000000000 +0200
> +++ linux-2.6.23-rc1-mm2-b/drivers/net/ibmveth.c 2007-08-06 23:32:13.000000000 +0200
> @@ -963,7 +963,7 @@ static int __devinit ibmveth_probe(struc
> {
> int rc, i;
> struct net_device *netdev;
> - struct ibmveth_adapter *adapter = NULL;
> + struct ibmveth_adapter *adapter;
>
> unsigned char *mac_addr_p;
> unsigned int *mcastFilterSize_p;
> @@ -997,7 +997,6 @@ static int __devinit ibmveth_probe(struc
> SET_MODULE_OWNER(netdev);
>
> adapter = netdev->priv;
> - memset(adapter, 0, sizeof(adapter));
> dev->dev.driver_data = netdev;
>
> adapter->vdev = dev;
--
Brian King
Linux on Power Virtualization
IBM Linux Technology Center
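
For reference, the "wrong nr of bytes" problem in the removed memset() can be
shown in isolation. This is a minimal sketch, not driver code: struct
demo_adapter is a hypothetical stand-in for the real adapter structure. With
a pointer argument, sizeof(adapter) is the size of the pointer, while
sizeof(*adapter) is the size of the object it points to.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver's private state; this is not the
 * real struct ibmveth_adapter. */
struct demo_adapter {
    void *vdev;
    unsigned long counters[8];
    unsigned char mac[6];
};

/* Same mistake as the removed line: only pointer-size bytes are cleared. */
static size_t bytes_cleared_buggy(struct demo_adapter *a)
{
    size_t n = sizeof(a);      /* size of the pointer, not the struct */

    memset(a, 0, n);
    return n;
}

/* What was presumably intended: clear the whole structure. */
static size_t bytes_cleared_fixed(struct demo_adapter *a)
{
    size_t n = sizeof(*a);     /* size of the pointed-to struct */

    memset(a, 0, n);
    return n;
}
```

In the driver itself neither form is needed, since the memory is already
zeroed at allocation time as noted in the thread.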
^ permalink raw reply
* Re: [PATCH] e1000e: New pci-express e1000 driver (currently for ICH9 devices only)
From: Jeff Garzik @ 2007-08-07 16:26 UTC (permalink / raw)
To: Kok, Auke
Cc: NetDev, Andrew Morton, Arjan van de Ven, Ronciak, John,
Andi Kleen
In-Reply-To: <46B79934.8010405@intel.com>
Kok, Auke wrote:
> From: Auke Kok <auke-jan.h.kok@intel.com>
> Date: Mon, 6 Aug 2007 14:14:44 -0700
> Subject: [PATCH] e1000e: New pci-express e1000 driver (currently for
> ICH9 devices only)
>
> This driver implements support for the ICH9 on-board LAN ethernet
> device. The device is similar to ICH8.
>
> The driver encompasses code to support 82571/2/3, es2lan and ICH8
> devices as well, but those device IDs are disabled and will be
> "lifted" from the e1000 driver over one at a time once this driver
> receives some more live time.
>
> Changes to the last snapshot posted are exclusively in the internal
> hardware API organization. Many thanks to Jeff Garzik for jumping in
> and getting this organized with a keen eye on the future layout.
>
> Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Thanks for posting the patch in a git-am friendly format :)
I merged this into netdev-2.6.git#e1000e just now, and pulled it into
netdev-2.6.git#ALL so that Andrew's -mm tree will automatically pick up
this driver.
Please submit e1000e in the form of follow-up patches to #e1000e, rather
than reposting the entire driver.
We'll leave it on this side branch for a little while, to give others a
chance to review and test, and give you (auke) a chance to update for
Andi's comments etc.
Jeff
^ permalink raw reply
* Re: 2.6.20->2.6.21 - networking dies after random time
From: Jean-Baptiste Vignaud @ 2007-08-07 17:16 UTC (permalink / raw)
To: jarkao2
Cc: cebbert, mingo, marcin.slusarz, tglx, torvalds, linux-kernel,
shemminger, linux-net, netdev, akpm, alan
> On Tue, Aug 07, 2007 at 11:21:07AM +0200, Jean-Baptiste Vignaud wrote:
> >
> > > > * interrupts (i use irqbalance, but problem was the same without)
> > >
> > > I wonder if you tried without SMP too?
> >
> > No i did not. Do you think that this can be a problem ?
> > To test with no SMP, do i need to recompile kernel or is there a kernel parameter ?
>
> It's always better to exclude any complications if it's possible.
> Yes, there is the kernel parameter for this: nosmp. So, if you
> have some time to spare I think 2.6.23-rc2 with this nosmp
> could be an interesting option.
So this afternoon I compiled 2.6.23-rc2 with the same options as 2.6.23-rc1
and edited grub.conf to add nosmp, but after reboot the box did not respond.
Back home, I saw that the kernel failed because it was unable to find the
partitions (mdadm failed, then LVM). After a few tests, removing nosmp let
the kernel boot correctly. It seems that even the Fedora-provided kernels
have the same behavior (well, at least 2.6.22.1-41.fc7).
Jb
^ permalink raw reply
* Re: e100 (was: eepro100) - Nobody Cares (hardware?)
From: ericj @ 2007-08-07 18:01 UTC (permalink / raw)
To: Kok, Auke; +Cc: Jeff Garzik, NetDev
In-Reply-To: <46B7C095.5010202@intel.com>
I want to thank everyone who helped with this.
It was proven to be a hardware issue. The board designer had left a GPIO
pin in an indeterminate state because he was planning to use it later to
do something with the battery charge circuitry.
I apologize for wasting everyone's time.
On Mon, 06 Aug 2007 17:45:09 -0700, Kok, Auke wrote
> [moving to netdev mailinglist]
>
> ericj wrote:
> > On Mon, 6 Aug 2007 11:20:58 -0500, ericj wrote
> >> On Mon, 06 Aug 2007 12:13:28 -0400, Jeff Garzik wrote
> >>> eepro100 is going to be removed. Please try e100 on 2.6.22 or
> >>> 2.6.23-rc2.
> >
> >> I will give the 2.6.23 a try.
> >
> > I tried 2.6.23-rc2 and there was no change.
> >
> > There is now some question from the hardware guys about whether the
> > eeproms were properly configured before shipping the boards. Is there
> > any documentation of the eeprom on an EE Pro 100 VE (ICH4) so that I can
> > figure out if any of the settings in there might be causing the problem?
> >
> > The only fields I know of for sure are the MAC address at the beginning
> > and the checksum at the end. I also see from the driver code that there
> > is at least one byte controlling wake-on-lan, which I don't care about -
> > unless it's the problem.
> >
> > Thanks for ethtool, by the way. It's been helpful in looking at this and
> > comparing the eeprom to an earlier version of the board that works.
>
> Eric,
>
> please don't forget that an entire team here at Intel is
> dedicated to supporting e100 and pro/1000 devices from Intel.
>
> Most of the pro/100 features are documented in the SDM which
> contains some references to the eeprom parts. Mostly the
> device doesn't need much configuration from the eeprom to work
> (unlike gigE parts). The SDM can be downloaded from our sf.net
> project page:
>
> http://sourceforge.net/project/showfiles.php?group_id=42302&package_id=68544
>
> The issue that you are reporting:
>
> "My system boots fine but when I try to bring up the onboard
> ethernet (an EEPro 100 VE) I get a "Nobody Cares" message and
> the interrupt is disabled."
>
> However has been recently patched. This should have worked
> regardless of whether you used e100 or eepro100 (noting that
> nobody supports eepro100 anymore, you should really use e100
> for all tests).
>
> if you look in drivers/pci/quirks.c you'll find that there is
> specific code for e100 devices. If this quirk doesn't work for
> you then we'll need to dig into that. For this I'd like you to
> gather:
>
> - `ethtool -e eth0` output
> - `lspci -n` output
>
> this will allow me to check the quirk code and see if it has
> the right device ID. I'm suspecting that the device ID is
> missing somehow, or the workaround fails.
>
> Auke
--
"A hunch is creativity trying to tell you something" -- Frank Capra
Eric Johnson
^ permalink raw reply
* Re: e100
From: Kok, Auke @ 2007-08-07 18:03 UTC (permalink / raw)
To: ericj; +Cc: Jeff Garzik, NetDev
In-Reply-To: <20070807175942.M75711@ericj.net>
ericj wrote:
> I want to thank everyone who helped with this.
>
> It was proven to be a hardware issue. The board designer had left a GPIO
> pin in an indeterminate state because he was planning to use it later to
> do something with the battery charge circuitry.
>
> I apologize for wasting everyone's time.
happens to everyone :)
Thanks for letting us know.
Auke
>
> On Mon, 06 Aug 2007 17:45:09 -0700, Kok, Auke wrote
>> [moving to netdev mailinglist]
>>
>> ericj wrote:
>>> On Mon, 6 Aug 2007 11:20:58 -0500, ericj wrote
>>>> On Mon, 06 Aug 2007 12:13:28 -0400, Jeff Garzik wrote
>>>>> eepro100 is going to be removed. Please try e100 on 2.6.22 or
>>>>> 2.6.23-rc2.
>>>> I will give the 2.6.23 a try.
>>> I tried 2.6.23-rc2 and there was no change.
>>>
>>> There is now some question from the hardware guys about whether the
>>> eeproms were properly configured before shipping the boards. Is there
>>> any documentation of the eeprom on an EE Pro 100 VE (ICH4) so that I can
>>> figure out if any of the settings in there might be causing the problem?
>>>
>>> The only fields I know of for sure are the MAC address at the beginning
>>> and the checksum at the end. I also see from the driver code that there
>>> is at least one byte controlling wake-on-lan, which I don't care about -
>>> unless it's the problem.
>>>
>>> Thanks for ethtool, by the way. It's been helpful in looking at this and
>>> comparing the eeprom to an earlier version of the board that works.
>> Eric,
>>
>> please don't forget that an entire team here at Intel is
>> dedicated to supporting e100 and pro/1000 devices from Intel.
>>
>> Most of the pro/100 features are documented in the SDM which
>> contains some references to the eeprom parts. Mostly the
>> device doesn't need much configuration from the eeprom to work
>> (unlike gigE parts). The SDM can be downloaded from our sf.net
>> project page:
>>
>> http://sourceforge.net/project/showfiles.php?group_id=42302&package_id=68544
>> The issue that you are reporting:
>>
>> "My system boots fine but when I try to bring up the onboard
>> ethernet (an EEPro 100 VE) I get a "Nobody Cares" message and
>> the interrupt is disabled."
>>
>> However has been recently patched. This should have worked
>> regardless of whether you used e100 or eepro100 (noting that
>> nobody supports eepro100 anymore, you should really use e100
>> for all tests).
>>
>> if you look in drivers/pci/quirks.c you'll find that there is
>> specific code for e100 devices. If this quirk doesn't work for
>> you then we'll need to dig into that. For this I'd like you to
>> gather:
>>
>> - `ethtool -e eth0` output
>> - `lspci -n` output
>>
>> this will allow me to check the quirk code and see if it has
>> the right device ID. I'm suspecting that the device ID is
>> missing somehow, or the workaround fails.
>>
>> Auke
>
>
> --
>
> "A hunch is creativity trying to tell you something" -- Frank Capra
>
> Eric Johnson
^ permalink raw reply
* Re: Distributed storage.
From: Daniel Phillips @ 2007-08-07 18:24 UTC (permalink / raw)
To: Jens Axboe
Cc: Evgeniy Polyakov, netdev, linux-kernel, linux-fsdevel,
Peter Zijlstra
In-Reply-To: <20070807120523.GX5245@kernel.dk>
On Tuesday 07 August 2007 05:05, Jens Axboe wrote:
> On Sun, Aug 05 2007, Daniel Phillips wrote:
> > A simple way to solve the stable accounting field issue is to add a
> > new pointer to struct bio that is owned by the top level submitter
> > (normally generic_make_request but not always) and is not affected
> > by any recursive resubmission. Then getting rid of that field
> > later becomes somebody's summer project, which is not all that
> > urgent because struct bio is already bloated up with a bunch of
> > dubious fields and is a transient structure anyway.
>
> Thanks for your insights. Care to detail what bloat and dubious
> fields struct bio has?
First obvious one I see is bi_rw separate from bi_flags. Front_size and
back_size smell dubious. Is max_vecs really necessary? You could
reasonably assume bi_vcnt rounded up to a power of two and bury the
details of making that work behind wrapper functions to change the
number of bvecs, if anybody actually needs that. Bi_endio and
bi_destructor could be combined. I don't see a lot of users of bi_idx,
that looks like a soft target. See what happened to struct page when a
couple of folks got serious about attacking it, some really deep hacks
were done to pare off a few bytes here and there. But struct bio as a
space waster is not nearly in the same ballpark.
It would be interesting to see if bi_bdev could be made read only.
Generally, each stage in the block device stack knows what the next
stage is going to be, so why do we have to write that in the bio? For
error reporting from interrupt context? Anyway, if Evgeniy wants to do
the patch, I will happily unload the task of convincing you that random
fields are/are not needed in struct bio :-)
Regards,
Daniel
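
The idea of dropping max_vecs and assuming a capacity of bi_vcnt rounded up
to a power of two could be hidden behind a small wrapper along these lines.
This is only a sketch: demo_round_up_pow2() and demo_bvec_capacity() are
hypothetical names for illustration, not existing block layer functions.

```c
/*
 * Hypothetical helper: smallest power of two >= n, for n >= 1.
 * Returns 0 if the result would not fit in an unsigned int.
 */
static unsigned int demo_round_up_pow2(unsigned int n)
{
    unsigned int p = 1;

    if (n > 0x80000000u)
        return 0;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical wrapper: the implied number of bvec slots for a bio that
 * currently uses vcnt of them, so no separate max_vecs field is stored.
 */
static unsigned int demo_bvec_capacity(unsigned int vcnt)
{
    return demo_round_up_pow2(vcnt ? vcnt : 1);
}
```

Code that grows or shrinks the vector would then go through the wrapper and
reallocate only when vcnt crosses a power-of-two boundary.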
^ permalink raw reply
* [RFC] cubic: backoff after slow start
From: Stephen Hemminger @ 2007-08-07 18:37 UTC (permalink / raw)
To: Injong Rhee, Sangtae Ha; +Cc: netdev
CUBIC takes several unnecessary iterations to converge out of slow start. This
is most noticeable over a link where the bottleneck queue size is much larger
than the BDP, and the sender has to "fill the pipe" in slow start before the
first loss. Typical consumer broadband links seem to have a large queue (up to
2 seconds) that needs to be filled before the first loss.
A possible fix is to use a beta of .5 (same as original TCP) when leaving
slow start. Originally, the Linux version didn't do slow start so it probably
never was observed.
--- a/net/ipv4/tcp_cubic.c 2007-08-02 12:16:22.000000000 +0100
+++ b/net/ipv4/tcp_cubic.c 2007-08-03 15:57:12.000000000 +0100
@@ -289,7 +289,11 @@ static u32 bictcp_recalc_ssthresh(struct
ca->loss_cwnd = tp->snd_cwnd;
- return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
+ /* Initial backoff when leaving slow start */
+ if (tp->snd_ssthresh == 0x7fffffff)
+ return max(tp->snd_cwnd >> 1U, 2U);
+ else
+ return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
}
static u32 bictcp_undo_cwnd(struct sock *sk)
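
To see what the change does numerically, the arithmetic can be pulled out
into a standalone sketch. Everything named demo_* is hypothetical, and the
constants are assumptions to be checked against tcp_cubic.c: a scale of 1024
for BICTCP_BETA_SCALE and a default beta of 819. The only logic taken from
the patch is the test against the initial ssthresh value 0x7fffffff.

```c
#include <stdint.h>

#define DEMO_BETA_SCALE       1024u        /* assumed BICTCP_BETA_SCALE */
#define DEMO_BETA             819u         /* assumed default beta */
#define DEMO_INITIAL_SSTHRESH 0x7fffffffu  /* "still in first slow start" */

static uint32_t demo_max_u32(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/* New ssthresh after a loss, following the logic of the patch above. */
static uint32_t demo_recalc_ssthresh(uint32_t snd_cwnd, uint32_t snd_ssthresh)
{
    /* First loss after slow start: back off by half, as original TCP does. */
    if (snd_ssthresh == DEMO_INITIAL_SSTHRESH)
        return demo_max_u32(snd_cwnd >> 1, 2u);

    /* Later losses: the usual CUBIC multiplicative decrease. */
    return demo_max_u32((snd_cwnd * DEMO_BETA) / DEMO_BETA_SCALE, 2u);
}
```

Under these assumed constants, a window of 100 segments drops to 50 on the
first loss after slow start and to 79 on later losses.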
^ permalink raw reply
* [RFT] sky2: turn on pci power
From: Stephen Hemminger @ 2007-08-07 19:12 UTC (permalink / raw)
To: Florian Lohoff; +Cc: Michal Piotrowski, netdev
In-Reply-To: <20070725072202.GA4905@paradigm.rfc822.org>
This setup step got dropped in the 2.6.23 Yukon-EX configuration; maybe
this fixes your problem?
--- a/drivers/net/sky2.c 2007-08-06 04:39:36.000000000 -0400
+++ b/drivers/net/sky2.c 2007-08-07 14:50:25.000000000 -0400
@@ -222,6 +222,8 @@ static void sky2_power_on(struct sky2_hw
if (hw->chip_id == CHIP_ID_YUKON_EC_U || hw->chip_id == CHIP_ID_YUKON_EX) {
u32 reg;
+ sky2_pci_write32(hw, PCI_DEV_REG3, 0);
+
reg = sky2_pci_read32(hw, PCI_DEV_REG4);
/* set all bits to 0 except bits 15..12 and 8 */
reg &= P_ASPM_CONTROL_MSK;
^ permalink raw reply