* RE: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
@ 2006-10-13 21:40 Aleksey Gorelov
  2006-10-13 21:42 ` Ryan Richter
  0 siblings, 1 reply; 39+ messages in thread
From: Aleksey Gorelov @ 2006-10-13 21:40 UTC (permalink / raw)
  To: ryan, linux-kernel, xhejtman, auke-jan.h.kok

>-----Original Message-----
>From: linux-kernel-owner@vger.kernel.org 
>[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Ryan Richter
>Sent: Friday, October 13, 2006 2:26 PM
>To: linux-kernel@vger.kernel.org
>Subject: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
>
>I have a new system based on the Intel 965G chipset, and all 
>the kernels
>I've used - 2.6.18, .19-rc1, and .19-rc2 - have failed to reset the
>machine on a reboot.  "Machine Restart" is printed, but it just hangs
>there.  SysRQ is non-functional at that point.

  A similar issue has been discussed in the adjacent thread "Machine reboot". Is it an Intel
motherboard, or does it just carry an Intel chipset? Does building the e1000 driver as a module
and running 'rmmod e1000' just before reboot help?

Aleks.



^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-13 21:40 Machine restart doesn't work - Intel 965G, 2.6.19-rc2 Aleksey Gorelov
@ 2006-10-13 21:42 ` Ryan Richter
  2006-10-13 21:45   ` Lukas Hejtmanek
  0 siblings, 1 reply; 39+ messages in thread
From: Ryan Richter @ 2006-10-13 21:42 UTC (permalink / raw)
  To: Aleksey Gorelov; +Cc: linux-kernel, xhejtman, auke-jan.h.kok

On Fri, Oct 13, 2006 at 02:40:29PM -0700, Aleksey Gorelov wrote:
> >-----Original Message-----
> >From: linux-kernel-owner@vger.kernel.org 
> >[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Ryan Richter
> >Sent: Friday, October 13, 2006 2:26 PM
> >To: linux-kernel@vger.kernel.org
> >Subject: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
> >
> >I have a new system based on the Intel 965G chipset, and all 
> >the kernels
> >I've used - 2.6.18, .19-rc1, and .19-rc2 - have failed to reset the
> >machine on a reboot.  "Machine Restart" is printed, but it just hangs
> >there.  SysRQ is non-functional at that point.
> 
>   The similar issue has been discussed in adjacent thread "Machine
>   reboot". Is it Intel motherboard, or just carries Intel chipset ?
>   Does building e1000 driver as a module and 'rmmod e1000' just before
>   reboot help ?

It's an Intel DG965RY board.  I'll try out your suggestion on Monday.

Thanks,
-ryan


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-13 21:42 ` Ryan Richter
@ 2006-10-13 21:45   ` Lukas Hejtmanek
  2006-10-13 21:46     ` Ryan Richter
  0 siblings, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2006-10-13 21:45 UTC (permalink / raw)
  To: Ryan Richter; +Cc: Aleksey Gorelov, linux-kernel, auke-jan.h.kok

On Fri, Oct 13, 2006 at 05:42:50PM -0400, Ryan Richter wrote:
> >   The similar issue has been discussed in adjacent thread "Machine
> >   reboot". Is it Intel motherboard, or just carries Intel chipset ?
> >   Does building e1000 driver as a module and 'rmmod e1000' just before
> >   reboot help ?
> 
> It's an Intel DG965RY board.  I'll try out your suggestion on Monday.

Btw, are you using i386 or x86_64 architecture?

-- 
Lukáš Hejtmánek


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-13 21:45   ` Lukas Hejtmanek
@ 2006-10-13 21:46     ` Ryan Richter
  2006-10-13 21:49       ` Lukas Hejtmanek
  2006-10-13 21:50       ` Aleksey Gorelov
  0 siblings, 2 replies; 39+ messages in thread
From: Ryan Richter @ 2006-10-13 21:46 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Aleksey Gorelov, linux-kernel, auke-jan.h.kok

On Fri, Oct 13, 2006 at 11:45:23PM +0200, Lukas Hejtmanek wrote:
> On Fri, Oct 13, 2006 at 05:42:50PM -0400, Ryan Richter wrote:
> > >   The similar issue has been discussed in adjacent thread "Machine
> > >   reboot". Is it Intel motherboard, or just carries Intel chipset ?
> > >   Does building e1000 driver as a module and 'rmmod e1000' just before
> > >   reboot help ?
> > 
> > It's an Intel DG965RY board.  I'll try out your suggestion on Monday.
> 
> Btw, are you using i386 or x86_64 architecture?

x86_64.

-ryan


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-13 21:46     ` Ryan Richter
@ 2006-10-13 21:49       ` Lukas Hejtmanek
  2006-10-13 21:51         ` Ryan Richter
  2006-10-17 18:00         ` Ryan Richter
  2006-10-13 21:50       ` Aleksey Gorelov
  1 sibling, 2 replies; 39+ messages in thread
From: Lukas Hejtmanek @ 2006-10-13 21:49 UTC (permalink / raw)
  To: Ryan Richter; +Cc: Aleksey Gorelov, linux-kernel, auke-jan.h.kok

On Fri, Oct 13, 2006 at 05:46:08PM -0400, Ryan Richter wrote:
> > > >   The similar issue has been discussed in adjacent thread "Machine
> > > >   reboot". Is it Intel motherboard, or just carries Intel chipset ?
> > > >   Does building e1000 driver as a module and 'rmmod e1000' just before
> > > >   reboot help ?
> > > 
> > > It's an Intel DG965RY board.  I'll try out your suggestion on Monday.
> > 
> > Btw, are you using i386 or x86_64 architecture?
> 
> x86_64.

Hm, I'm also using x86_64 and 2.6.19-rc1-git9 works OK for me regardless of
e1000. 2.6.18 hangs if e1000 is built in.

Could you also try exactly 2.6.19-rc1-git9?

-- 
Lukáš Hejtmánek


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-13 21:46     ` Ryan Richter
  2006-10-13 21:49       ` Lukas Hejtmanek
@ 2006-10-13 21:50       ` Aleksey Gorelov
  1 sibling, 0 replies; 39+ messages in thread
From: Aleksey Gorelov @ 2006-10-13 21:50 UTC (permalink / raw)
  To: Ryan Richter, Lukas Hejtmanek
  Cc: Aleksey Gorelov, linux-kernel, auke-jan.h.kok



--- Ryan Richter <ryan@tau.solarneutrino.net> wrote:

> On Fri, Oct 13, 2006 at 11:45:23PM +0200, Lukas Hejtmanek wrote:
> > On Fri, Oct 13, 2006 at 05:42:50PM -0400, Ryan Richter wrote:
> > > >   The similar issue has been discussed in adjacent thread "Machine
> > > >   reboot". Is it Intel motherboard, or just carries Intel chipset ?
> > > >   Does building e1000 driver as a module and 'rmmod e1000' just before
> > > >   reboot help ?
> > > 
> > > It's an Intel DG965RY board.  I'll try out your suggestion on Monday.
> > 
> > Btw, are you using i386 or x86_64 architecture?
> 
> x86_64.
> 
And mine is i386.

Aleks.



* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-13 21:49       ` Lukas Hejtmanek
@ 2006-10-13 21:51         ` Ryan Richter
  2006-10-17 18:00         ` Ryan Richter
  1 sibling, 0 replies; 39+ messages in thread
From: Ryan Richter @ 2006-10-13 21:51 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Aleksey Gorelov, linux-kernel, auke-jan.h.kok

On Fri, Oct 13, 2006 at 11:49:22PM +0200, Lukas Hejtmanek wrote:
> On Fri, Oct 13, 2006 at 05:46:08PM -0400, Ryan Richter wrote:
> > > > >   The similar issue has been discussed in adjacent thread "Machine
> > > > >   reboot". Is it Intel motherboard, or just carries Intel chipset ?
> > > > >   Does building e1000 driver as a module and 'rmmod e1000' just before
> > > > >   reboot help ?
> > > > 
> > > > It's an Intel DG965RY board.  I'll try out your suggestion on Monday.
> > > 
> > > Btw, are you using i386 or x86_64 architecture?
> > 
> > x86_64.
> 
> Hm, I'm also using x86_64 and 2.6.19-rc1-git9 works OK for me regardless of
> e1000. 2.6.18 hangs if e1000 is built in.
> 
> Could you also try exactly 2.6.19-rc1-git9?

Will do, I'll try that on Monday also.  I'd do it now, but obviously I
can't reboot the machine remotely...

Thanks,
-ryan


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-13 21:49       ` Lukas Hejtmanek
  2006-10-13 21:51         ` Ryan Richter
@ 2006-10-17 18:00         ` Ryan Richter
  2006-10-17 20:53           ` Aleksey Gorelov
  1 sibling, 1 reply; 39+ messages in thread
From: Ryan Richter @ 2006-10-17 18:00 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Aleksey Gorelov, linux-kernel, auke-jan.h.kok

On Fri, Oct 13, 2006 at 11:49:22PM +0200, Lukas Hejtmanek wrote:
> On Fri, Oct 13, 2006 at 05:46:08PM -0400, Ryan Richter wrote:
> > > > >   The similar issue has been discussed in adjacent thread "Machine
> > > > >   reboot". Is it Intel motherboard, or just carries Intel chipset ?
> > > > >   Does building e1000 driver as a module and 'rmmod e1000' just before
> > > > >   reboot help ?
> > > > 
> > > > It's an Intel DG965RY board.  I'll try out your suggestion on Monday.
> > > 
> > > Btw, are you using i386 or x86_64 architecture?
> > 
> > x86_64.
> 
> Hm, I'm also using x86_64 and 2.6.19-rc1-git9 works OK for me regardless of
> e1000. 2.6.18 hangs if e1000 is built in.
> 
> Could you also try exactly 2.6.19-rc1-git9?

2.6.19-rc1-git9 doesn't work any better for me.  I haven't tried
unloading the e1000 module yet.  Since I run the machine off an nfsroot,
it will require some creativity to test that.

-ryan


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-17 18:00         ` Ryan Richter
@ 2006-10-17 20:53           ` Aleksey Gorelov
  2006-10-17 21:17             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000? Auke Kok
  2006-10-17 22:27             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 Ryan Richter
  0 siblings, 2 replies; 39+ messages in thread
From: Aleksey Gorelov @ 2006-10-17 20:53 UTC (permalink / raw)
  To: Ryan Richter, Lukas Hejtmanek
  Cc: Aleksey Gorelov, linux-kernel, auke-jan.h.kok



--- Ryan Richter <ryan@tau.solarneutrino.net> wrote:
> 
> 2.6.19-rc1-git9 doesn't work any better for me.  I haven't tried
> unloading the e1000 module yet.  Since I run the machine off an nfsroot,
> it will require some creativity to test that.
> 
> -ryan

You may try the following patch instead if it's easier for you. It'll likely break suspend stuff,
but you won't need to play around with modules.

Aleks.

--- linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c.orig	2006-10-17 13:36:06.000000000 -0700
+++ linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c	2006-10-17 13:36:50.000000000 -0700
@@ -4847,6 +4847,7 @@
 static void e1000_shutdown(struct pci_dev *pdev)
 {
 	e1000_suspend(pdev, PMSG_SUSPEND);
+	pci_set_power_state(pdev, PCI_D0);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER





* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000?
  2006-10-17 20:53           ` Aleksey Gorelov
@ 2006-10-17 21:17             ` Auke Kok
  2006-10-17 22:14               ` dared1st
  2007-11-20 14:38               ` e1000 driver problems Lukas Hejtmanek
  2006-10-17 22:27             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 Ryan Richter
  1 sibling, 2 replies; 39+ messages in thread
From: Auke Kok @ 2006-10-17 21:17 UTC (permalink / raw)
  To: Aleksey Gorelov
  Cc: Ryan Richter, Lukas Hejtmanek, linux-kernel, auke-jan.h.kok,
	Jesse Brandeburg, Ronciak, John

Aleksey Gorelov wrote:
> 
> --- Ryan Richter <ryan@tau.solarneutrino.net> wrote:
>> 2.6.19-rc1-git9 doesn't work any better for me.  I haven't tried
>> unloading the e1000 module yet.  Since I run the machine off an nfsroot,
>> it will require some creativity to test that.
>>
>> -ryan
> 
> You may try the following patch instead if it's easier for you. It'll likely break suspend stuff,
> but you won't need to play around with modules.
> 
> Aleks.
> 
> --- linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c.orig	2006-10-17 13:36:06.000000000 -0700
> +++ linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c	2006-10-17 13:36:50.000000000 -0700
> @@ -4847,6 +4847,7 @@
>  static void e1000_shutdown(struct pci_dev *pdev)
>  {
>  	e1000_suspend(pdev, PMSG_SUSPEND);
> +	pci_set_power_state(pdev, PCI_D0);
>  }
>  
>  #ifdef CONFIG_NET_POLL_CONTROLLER

I wouldn't do it like this, since e1000_suspend already calls pci_set_power_state() 
right before it exits, and doing two of those in close succession might result in an 
undetermined state.

I would be more interested in forcing D3 state instead of the current 
`pci_set_power_state(pdev, pci_choose_state(pdev, state));` in e1000_suspend, so can you 
try this instead?

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index ce0d35f..30ceeec 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -4793,7 +4793,7 @@ #endif

         pci_disable_device(pdev);

-       pci_set_power_state(pdev, pci_choose_state(pdev, state));
+       pci_set_power_state(pdev, PCI_D3hot);

         return 0;
  }


alternatively, you can try PCI_D3cold or PCI_D0, but setting the device to D0 is a 
no-op: the device is already in D0 at run-time, so that's silly.

In any case: this is not a driver bug, but really (unfortunately) a platform issue, so 
this fix is not suitable for general cases *at all*, and we'd have to validate this 
nasty workaround on all other chipsets that e1000 supports too, something that ain't 
going to happen I'm sure.

Constructive: I've just spent some time working with e100+suspend+shutdown+netconsole, 
so I'll audit e1000 for that in the next few weeks and make sure that all works 
properly. Perhaps that yields something for you.

Cheers,

Auke


* RE: Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000?
  2006-10-17 21:17             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000? Auke Kok
@ 2006-10-17 22:14               ` dared1st
  2007-11-20 14:38               ` e1000 driver problems Lukas Hejtmanek
  1 sibling, 0 replies; 39+ messages in thread
From: dared1st @ 2006-10-17 22:14 UTC (permalink / raw)
  To: 'Auke Kok'
  Cc: 'Ryan Richter', 'Lukas Hejtmanek', linux-kernel,
	'Jesse Brandeburg', 'Ronciak, John'



>-----Original Message-----
>From: Auke Kok [mailto:auke-jan.h.kok@intel.com]
>Sent: Tuesday, October 17, 2006 2:17 PM
>To: Aleksey Gorelov
>Cc: Ryan Richter; Lukas Hejtmanek; linux-kernel@vger.kernel.org; auke-
>jan.h.kok@intel.com; Jesse Brandeburg; Ronciak, John
>Subject: Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000?
>
>Aleksey Gorelov wrote:
>>
>> --- Ryan Richter <ryan@tau.solarneutrino.net> wrote:
>>> 2.6.19-rc1-git9 doesn't work any better for me.  I haven't tried
>>> unloading the e1000 module yet.  Since I run the machine off an nfsroot,
>>> it will require some creativity to test that.
>>>
>>> -ryan
>>
>> You may try the following patch instead if it's easier for you. It'll
>likely break suspend stuff,
>> but you won't need to play around with modules.
>>
>> Aleks.
>>
>> --- linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c.orig	2006-10-17
>13:36:06.000000000 -0700
>> +++ linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c	2006-10-17
>13:36:50.000000000 -0700
>> @@ -4847,6 +4847,7 @@
>>  static void e1000_shutdown(struct pci_dev *pdev)
>>  {
>>  	e1000_suspend(pdev, PMSG_SUSPEND);
>> +	pci_set_power_state(pdev, PCI_D0);
>>  }
>>
>>  #ifdef CONFIG_NET_POLL_CONTROLLER
>
>I wouldn't do that like this, since e1000_suspend already does a
>pci_set_power_state()
>right before it exits, and doing two of those closely after another might
>result in an
>undetermined state.
>
>I would be more interested in forcing D3 state instead of the current
>`pci_set_power_state(pdev, pci_choose_state(pdev, state));` in
>e1000_suspend, so can you
>try this instead?


But how is this different from the original variant? 
pci_choose_state(pdev, PMSG_SUSPEND) returns PCI_D3hot, and it is exactly when
the LAN device is in this state that the machine does not reboot. That's why I
tried PCI_D0 in the first place (actually, I originally just removed the
pci_set_power_state call altogether). I did not try PCI_D3cold, though.
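
To make the point above concrete, here is a minimal user-space sketch of the
shutdown path as described in this thread. It is not kernel code: the enum,
struct and function names are hypothetical stand-ins, and the "reboot only
succeeds with the NIC in D0" rule is an assumption that models the behaviour
reported on the affected BIOS versions, not a documented property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's PCI power states (not the real pci.h). */
enum model_power_state { MODEL_D0, MODEL_D3HOT };

struct model_nic {
    enum model_power_state state;
};

/* Models e1000_suspend(pdev, PMSG_SUSPEND): as noted in the message,
 * pci_choose_state() selects D3hot for a suspend message. */
void model_suspend(struct model_nic *nic)
{
    assert(nic != NULL);
    nic->state = MODEL_D3HOT;
}

/* Models e1000_shutdown(). restore_d0 selects the workaround patch,
 * which puts the device back into D0 after the suspend call. */
void model_shutdown(struct model_nic *nic, bool restore_d0)
{
    model_suspend(nic);
    if (restore_d0)
        nic->state = MODEL_D0;
}

/* Assumption: the affected BIOS only completes a reset with the NIC in D0. */
bool model_affected_bios_can_reboot(const struct model_nic *nic)
{
    assert(nic != NULL);
    return nic->state == MODEL_D0;
}
```

In this model, forcing PCI_D3hot in the suspend routine leaves the device in
the same state as the unpatched driver, which is why it would not be expected
to change the outcome.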

>
>diff --git a/drivers/net/e1000/e1000_main.c
>b/drivers/net/e1000/e1000_main.c
>index ce0d35f..30ceeec 100644
>--- a/drivers/net/e1000/e1000_main.c
>+++ b/drivers/net/e1000/e1000_main.c
>@@ -4793,7 +4793,7 @@ #endif
>
>         pci_disable_device(pdev);
>
>-       pci_set_power_state(pdev, pci_choose_state(pdev, state));
>+       pci_set_power_state(pdev, PCI_D3hot);
>
>         return 0;
>  }
>

>
>alternatively, you can try PCI_D3cold or PCI_D0, but setting the device to
>D0 is a
>no-op: the device is already in D0 at run-time, so that's silly.
>
>In any case: this is not a driver bug, but really (unfortunately) a
>platform issue, so
>this fix is not suitable for general cases *at all*, and we'd have to
>validate this
>nasty workaround on all other chipsets that e1000 supports too, something
>that ain't
>going to happen I'm sure.

Anyway, both patches are rather ugly, and will become even uglier with
checks for a particular platform and system_state. I wish the BIOS engineers
would fix it 'the right way'.

Aleks

>
>constructive: I've just spend some time working with
>e100+suspend+shutdown+netconsole,
>so I'll audit e1000 for that in the next few weeks and make sure that all
>works
>properly. Perhaps that yields something for you.
>
>Cheers,
>
>Auke



* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-17 20:53           ` Aleksey Gorelov
  2006-10-17 21:17             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000? Auke Kok
@ 2006-10-17 22:27             ` Ryan Richter
  2006-10-20 17:57               ` Auke Kok
  1 sibling, 1 reply; 39+ messages in thread
From: Ryan Richter @ 2006-10-17 22:27 UTC (permalink / raw)
  To: Aleksey Gorelov; +Cc: Lukas Hejtmanek, linux-kernel, auke-jan.h.kok

On Tue, Oct 17, 2006 at 01:53:15PM -0700, Aleksey Gorelov wrote:
> 
> 
> --- Ryan Richter <ryan@tau.solarneutrino.net> wrote:
> > 
> > 2.6.19-rc1-git9 doesn't work any better for me.  I haven't tried
> > unloading the e1000 module yet.  Since I run the machine off an nfsroot,
> > it will require some creativity to test that.
> > 
> > -ryan
> 
> You may try the following patch instead if it's easier for you. It'll
> likely break suspend stuff,
> but you won't need to play around with modules.
> 
> Aleks.
> 
> --- linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c.orig	2006-10-17 13:36:06.000000000 -0700
> +++ linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c	2006-10-17 13:36:50.000000000 -0700
> @@ -4847,6 +4847,7 @@
>  static void e1000_shutdown(struct pci_dev *pdev)
>  {
>  	e1000_suspend(pdev, PMSG_SUSPEND);
> +	pci_set_power_state(pdev, PCI_D0);
>  }
>  
>  #ifdef CONFIG_NET_POLL_CONTROLLER


This patch allows the machine to reboot normally.

Thanks,
-ryan


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-17 22:27             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 Ryan Richter
@ 2006-10-20 17:57               ` Auke Kok
  2006-10-20 18:06                 ` Lukas Hejtmanek
  2006-10-20 18:07                 ` Ryan Richter
  0 siblings, 2 replies; 39+ messages in thread
From: Auke Kok @ 2006-10-20 17:57 UTC (permalink / raw)
  To: Ryan Richter
  Cc: Aleksey Gorelov, Lukas Hejtmanek, linux-kernel, auke-jan.h.kok

Ryan Richter wrote:
> On Tue, Oct 17, 2006 at 01:53:15PM -0700, Aleksey Gorelov wrote:
>>
>> --- Ryan Richter <ryan@tau.solarneutrino.net> wrote:
>>> 2.6.19-rc1-git9 doesn't work any better for me.  I haven't tried
>>> unloading the e1000 module yet.  Since I run the machine off an nfsroot,
>>> it will require some creativity to test that.
>>>
>>> -ryan
>> You may try the following patch instead if it's easier for you. It'll
>> likely break suspend stuff,
>> but you won't need to play around with modules.
>>
>> Aleks.
>>
>> --- linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c.orig	2006-10-17 13:36:06.000000000 -0700
>> +++ linux-2.6.19-rc2/drivers/net/e1000/e1000_main.c	2006-10-17 13:36:50.000000000 -0700
>> @@ -4847,6 +4847,7 @@
>>  static void e1000_shutdown(struct pci_dev *pdev)
>>  {
>>  	e1000_suspend(pdev, PMSG_SUSPEND);
>> +	pci_set_power_state(pdev, PCI_D0);
>>  }
>>  
>>  #ifdef CONFIG_NET_POLL_CONTROLLER
> 
> 
> This patch allows the machine to reboot normally.

To all that are seeing this problem:

can you send me (off-list is OK) the motherboard number+name, the BIOS versions (+ where 
you downloaded them from) that you have tried and for each version, whether it worked 
without this workaround or not?

Thanks,

Auke



* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-20 17:57               ` Auke Kok
@ 2006-10-20 18:06                 ` Lukas Hejtmanek
  2006-10-21 17:34                   ` Ryan Richter
  2006-10-20 18:07                 ` Ryan Richter
  1 sibling, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2006-10-20 18:06 UTC (permalink / raw)
  To: Auke Kok; +Cc: Ryan Richter, Aleksey Gorelov, linux-kernel

On Fri, Oct 20, 2006 at 10:57:29AM -0700, Auke Kok wrote:
> To all that are seeing this problem:
> 
> can you send me (off-list is OK) the motherboard number+name, the BIOS 
> versions (+ where you downloaded them from) that you have tried and for 
> each version, whether it worked without this workaround or not?

Three days ago, Intel released a new BIOS version that claims to fix this issue.

I've tested it with the 2.6.18 kernel, which was previously unable to restart. It
works now, so the fix seems to have been successful.

-- 
Lukáš Hejtmánek


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-20 17:57               ` Auke Kok
  2006-10-20 18:06                 ` Lukas Hejtmanek
@ 2006-10-20 18:07                 ` Ryan Richter
  2006-10-20 18:17                   ` Auke Kok
  2006-10-23 20:52                   ` Aleksey Gorelov
  1 sibling, 2 replies; 39+ messages in thread
From: Ryan Richter @ 2006-10-20 18:07 UTC (permalink / raw)
  To: Auke Kok; +Cc: Aleksey Gorelov, Lukas Hejtmanek, linux-kernel

On Fri, Oct 20, 2006 at 10:57:29AM -0700, Auke Kok wrote:
> To all that are seeing this problem:
> 
> can you send me (off-list is OK) the motherboard number+name, the BIOS 
> versions (+ where you downloaded them from) that you have tried and for 
> each version, whether it worked without this workaround or not?

I've got an Intel DG965RY with BIOS version 1250.  That's the only BIOS
I've tried (I flashed it first thing when I got the machine), and the
workaround works.

-ryan


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-20 18:07                 ` Ryan Richter
@ 2006-10-20 18:17                   ` Auke Kok
  2006-10-20 18:29                     ` Lukas Hejtmanek
  2006-10-23 20:52                   ` Aleksey Gorelov
  1 sibling, 1 reply; 39+ messages in thread
From: Auke Kok @ 2006-10-20 18:17 UTC (permalink / raw)
  To: Ryan Richter; +Cc: Aleksey Gorelov, Lukas Hejtmanek, linux-kernel

Ryan Richter wrote:
> On Fri, Oct 20, 2006 at 10:57:29AM -0700, Auke Kok wrote:
>> To all that are seeing this problem:
>>
>> can you send me (off-list is OK) the motherboard number+name, the BIOS 
>> versions (+ where you downloaded them from) that you have tried and for 
>> each version, whether it worked without this workaround or not?
> 
> I've got an Intel DG965RY with BIOS version 1250.  That's the only BIOS
> I've tried (I flashed it first thing when I got the machine), and the
> workaround works.

OK, thanks.

Lukas Hejtmanek wrote:
 > Three days ago, Intel released a new BIOS version that claims to fix this issue.
 >
 > I've tested it with 2.6.18 kernel which was unable to restart, it works now so
 > it seems that fix was successful.

This is incomplete information. Which version did you have before? What is your 
motherboard number/name? etc. Please be complete.

Please provide what I asked for, if you can. I really need to know _everything_.

Auke


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-20 18:17                   ` Auke Kok
@ 2006-10-20 18:29                     ` Lukas Hejtmanek
  0 siblings, 0 replies; 39+ messages in thread
From: Lukas Hejtmanek @ 2006-10-20 18:29 UTC (permalink / raw)
  To: Auke Kok; +Cc: Ryan Richter, Aleksey Gorelov, linux-kernel

On Fri, Oct 20, 2006 at 11:17:45AM -0700, Auke Kok wrote:
> Lukas Hejtmanek wrote:
> > Three days ago, Intel released a new BIOS version that claims to fix this 
> issue.
> >
> > I've tested it with 2.6.18 kernel which was unable to restart, it works 
> now so
> > it seems that fix was successful.
> 
> this is incomplete information. Which version did you have before? what is 
> your motherboard number/name? etc. Please be complete.
> 
> Please provide what I asked for, if you can. I really need to know 
> _everything_

Board name: DP965LT
BIOS version: 1458
2.6.18, 2.6.19-rc1 reboots OK.

BIOS version: 816
2.6.18 reboots OK.

BIOS version: 1162, 1176, 1250
2.6.18 cannot reboot. (rmmod e1000 causes reboot OK)
2.6.19-rc1 reboots OK (no additional patches)

-- 
Lukáš Hejtmánek


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-20 18:06                 ` Lukas Hejtmanek
@ 2006-10-21 17:34                   ` Ryan Richter
  2006-10-21 17:56                     ` Auke Kok
  0 siblings, 1 reply; 39+ messages in thread
From: Ryan Richter @ 2006-10-21 17:34 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Auke Kok, Aleksey Gorelov, linux-kernel

On Fri, Oct 20, 2006 at 08:06:10PM +0200, Lukas Hejtmanek wrote:
> On Fri, Oct 20, 2006 at 10:57:29AM -0700, Auke Kok wrote:
> > To all that are seeing this problem:
> > 
> > can you send me (off-list is OK) the motherboard number+name, the BIOS 
> > versions (+ where you downloaded them from) that you have tried and for 
> > each version, whether it worked without this workaround or not?
> 
> Three days ago, Intel released a new BIOS version that claims to fix
> this issue.
> 
> I've tested it with 2.6.18 kernel which was unable to restart, it
> works now so it seems that fix was successful.

I just tried the 1458 BIOS without the workaround and it's working fine.

Thanks!
-ryan


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-21 17:34                   ` Ryan Richter
@ 2006-10-21 17:56                     ` Auke Kok
  0 siblings, 0 replies; 39+ messages in thread
From: Auke Kok @ 2006-10-21 17:56 UTC (permalink / raw)
  To: Ryan Richter, Allan, Bruce W
  Cc: Lukas Hejtmanek, Aleksey Gorelov, linux-kernel

Ryan Richter wrote:
> On Fri, Oct 20, 2006 at 08:06:10PM +0200, Lukas Hejtmanek wrote:
>> On Fri, Oct 20, 2006 at 10:57:29AM -0700, Auke Kok wrote:
>>> To all that are seeing this problem:
>>>
>>> can you send me (off-list is OK) the motherboard number+name, the BIOS 
>>> versions (+ where you downloaded them from) that you have tried and for 
>>> each version, whether it worked without this workaround or not?
>>
>> Three days ago, Intel released a new BIOS version that claims to fix
>> this issue.
>>
>> I've tested it with 2.6.18 kernel which was unable to restart, it
>> works now so it seems that fix was successful.
> 
> I just tried the 1458 BIOS without the workaround and it's working fine.

okay, looks like the latest BIOS fixes it (hang on reboot/restart) for everyone. Thanks 
for reporting back in, I'll make sure my colleagues write this down for everyone.

Cheers,

Auke


* Re: Machine restart doesn't work - Intel 965G, 2.6.19-rc2
  2006-10-20 18:07                 ` Ryan Richter
  2006-10-20 18:17                   ` Auke Kok
@ 2006-10-23 20:52                   ` Aleksey Gorelov
  1 sibling, 0 replies; 39+ messages in thread
From: Aleksey Gorelov @ 2006-10-23 20:52 UTC (permalink / raw)
  To: Ryan Richter, Auke Kok; +Cc: Aleksey Gorelov, Lukas Hejtmanek, linux-kernel



--- Ryan Richter <ryan@tau.solarneutrino.net> wrote:

> On Fri, Oct 20, 2006 at 10:57:29AM -0700, Auke Kok wrote:
> > To all that are seeing this problem:
> > 
> > can you send me (off-list is OK) the motherboard number+name, the BIOS 
> > versions (+ where you downloaded them from) that you have tried and for 
> > each version, whether it worked without this workaround or not?
> 
> I've got an Intel DG965RY with BIOS version 1250.  That's the only BIOS
> I've tried (I flashed it first thing when I got the machine), and the
> workaround works.
Mine was an Intel DG965WH with the same BIOS version. I don't have the board at the moment, but
I'll try to retest if I get my hands on it again.

Aleks.



* e1000 driver problems
  2006-10-17 21:17             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000? Auke Kok
  2006-10-17 22:14               ` dared1st
@ 2007-11-20 14:38               ` Lukas Hejtmanek
  2007-11-26 23:26                 ` Kok, Auke
  1 sibling, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-11-20 14:38 UTC (permalink / raw)
  To: Auke Kok; +Cc: linux-kernel

Hello,

I have a ThinkPad T61 laptop with an 82566MM Gigabit Network Connection (rev 03)
(8086:1049), running kernel 2.6.24-rc3. The e1000e driver does not work (the card
is not detected although it is PCI-E). With the e1000 driver it works mostly OK
unless I force the speed to 100 Mbit/s (ethtool -s eth0 autoneg off speed 100).

I got messages about a device hang:
Nov 20 10:57:24 anubis kernel: [  212.307502] e1000: eth0: e1000_watchdog: 10/100 speed: disabling TSO
Nov 20 11:03:02 anubis kernel: [  242.811474]   Tx Queue             <0>
Nov 20 11:03:02 anubis kernel: [  242.811476]   TDH                  <80>
Nov 20 11:03:02 anubis kernel: [  242.811478]   TDT                  <81>
Nov 20 11:03:02 anubis kernel: [  242.811480]   next_to_use          <81>
Nov 20 11:03:02 anubis kernel: [  242.811482]   next_to_clean        <80>
Nov 20 11:03:02 anubis kernel: [  242.811484] buffer_info[next_to_clean]
Nov 20 11:03:02 anubis kernel: [  242.811486]   time_stamp <100079cdf>
Nov 20 11:03:02 anubis kernel: [  242.811488]   next_to_watch        <80>
Nov 20 11:03:02 anubis kernel: [  242.811489]   jiffies <100079e68>
Nov 20 11:03:02 anubis kernel: [  242.811491]   next_to_watch.status <0>
Nov 20 11:03:04 anubis kernel: [  243.000047]   Tx Queue             <0>
Nov 20 11:03:04 anubis kernel: [  243.000049]   TDH                  <80>
Nov 20 11:03:04 anubis kernel: [  243.000051]   TDT                  <81>
Nov 20 11:03:04 anubis kernel: [  243.000053]   next_to_use          <81>
and so on.

Is this a known problem?


-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-11-20 14:38               ` e1000 driver problems Lukas Hejtmanek
@ 2007-11-26 23:26                 ` Kok, Auke
  2007-11-27 15:07                   ` Lukas Hejtmanek
  0 siblings, 1 reply; 39+ messages in thread
From: Kok, Auke @ 2007-11-26 23:26 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Auke Kok, linux-kernel

Lukas Hejtmanek wrote:
> Hello,
> 
> I have laptop thinkpad T61 with 82566MM Gigabit Network Connection (rev 03)
> (8086:1049). I have kernel 2.6.24-rc3. E1000E driver does not work (the card
> is not detected although it is PCI-E), with E1000 driver, it works mostly OK
> unless I force speed to 100Mbits. (ethtool -s eth0 autoneg off speed 100)

This device (the ICH8 onboard NIC) will not be supported in e1000e until 2.6.25.

> I got message about device hang:
> Nov 20 10:57:24 anubis kernel: [  212.307502] e1000: eth0: e1000_watchdog:
> 10/100 speed: disabling TSO
> Nov 20 11:03:02 anubis kernel: [  242.811474]   Tx Queue             <0>
> Nov 20 11:03:02 anubis kernel: [  242.811476]   TDH                  <80>
> Nov 20 11:03:02 anubis kernel: [  242.811478]   TDT                  <81>
> Nov 20 11:03:02 anubis kernel: [  242.811480]   next_to_use          <81>
> Nov 20 11:03:02 anubis kernel: [  242.811482]   next_to_clean        <80>
> Nov 20 11:03:02 anubis kernel: [  242.811484] buffer_info[next_to_clean]
> Nov 20 11:03:02 anubis kernel: [  242.811486]   time_stamp <100079cdf>
> Nov 20 11:03:02 anubis kernel: [  242.811488]   next_to_watch        <80>
> Nov 20 11:03:02 anubis kernel: [  242.811489]   jiffies <100079e68>
> Nov 20 11:03:02 anubis kernel: [  242.811491]   next_to_watch.status <0>
> Nov 20 11:03:04 anubis kernel: [  243.000047]   Tx Queue             <0>
> Nov 20 11:03:04 anubis kernel: [  243.000049]   TDH                  <80>
> Nov 20 11:03:04 anubis kernel: [  243.000051]   TDT                  <81>
> Nov 20 11:03:04 anubis kernel: [  243.000053]   next_to_use          <81>
> and so on.
> 
> Is it known problem?

there have indeed been known reports of these "fake" hangs. basically the counter
logic for these newer devices reports hangs too quickly at 10/100 speeds while
they actually are not occurring (the line is just very busy).

The fix for this has been to grant more time for the hardware to recover from this
busy state. I'll make sure to check if the upstream drivers are OK in this regard.
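
A rough sketch of the arithmetic behind such a report, using the time_stamp
and jiffies values from the dump quoted above. It assumes the driver reports
a hang once jiffies has passed time_stamp by more than tx_timeout_factor * HZ
ticks; that comparison, the HZ values tried, and the function names are
assumptions for illustration only (the thread does not state the kernel's HZ).

```python
# Values copied from the register dump quoted in this thread.
TIME_STAMP = 0x100079CDF   # buffer_info[next_to_clean].time_stamp
JIFFIES = 0x100079E68      # jiffies at the time of the dump


def elapsed_jiffies(time_stamp: int, jiffies: int) -> int:
    """Ticks between queuing the descriptor and the hang check."""
    return jiffies - time_stamp


def looks_hung(elapsed: int, timeout_factor: int, hz: int) -> bool:
    """Assumed check: more than timeout_factor seconds' worth of ticks passed."""
    return elapsed > timeout_factor * hz


elapsed = elapsed_jiffies(TIME_STAMP, JIFFIES)
for hz in (100, 250, 1000):        # common CONFIG_HZ choices, not from the thread
    for factor in (1, 4, 8):
        print(f"HZ={hz} factor={factor} "
              f"elapsed={elapsed / hz:.3f}s hung={looks_hung(elapsed, factor, hz)}")
```

Whether this one sample counts as a hang depends entirely on HZ and on the
factor, which is why granting the hardware more time can silence the report
without anything else changing.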

you can try our out-of-tree e1000 driver (7.6.x or newer) which should work OK for
you with respect to this problem. Please give that a try.

Cheers,

Auke


* Re: e1000 driver problems
  2007-11-26 23:26                 ` Kok, Auke
@ 2007-11-27 15:07                   ` Lukas Hejtmanek
  2007-11-27 16:48                     ` Kok, Auke
  0 siblings, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-11-27 15:07 UTC (permalink / raw)
  To: Kok, Auke; +Cc: linux-kernel

On Mon, Nov 26, 2007 at 03:26:08PM -0800, Kok, Auke wrote:
> The fix for this has been to grant more time for the hardware to recover
> from this busy state. I'll make sure to check if the upstream drivers are OK
> in this regard.
> 
> you can try our out-of-tree e1000 driver (7.6.x or newer) which should work
> OK for you with respect to this problem. Please give that a try.

unfortunately, the 7.6.9 driver cannot be compiled with 2.6.24-rc3-git2
kernel due to compilation errors.

-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-11-27 15:07                   ` Lukas Hejtmanek
@ 2007-11-27 16:48                     ` Kok, Auke
  2007-11-27 17:31                       ` Lukas Hejtmanek
  0 siblings, 1 reply; 39+ messages in thread
From: Kok, Auke @ 2007-11-27 16:48 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: linux-kernel

Lukas Hejtmanek wrote:
> On Mon, Nov 26, 2007 at 03:26:08PM -0800, Kok, Auke wrote:
>> The fix for this has been to grant more time for the hardware to recover
>> from this busy state. I'll make sure to check if the upstream drivers are OK
>> in this regard.
>>
>> you can try our out-of-tree e1000 driver (7.6.x or newer) which should work
>> OK for you with respect to this problem. Please give that a try.
> 
> unfortunately, the 7.6.9 driver cannot be compiled with 2.6.24-rc3-git2
> kernel due to compilation errors.

but the in-kernel version of e1000 supports the ich8 lan device just fine and can
be builtin. also this kernel has the first release of e1000e which supports the
ich9 onboard lan device.

Auke


* Re: e1000 driver problems
  2007-11-27 16:48                     ` Kok, Auke
@ 2007-11-27 17:31                       ` Lukas Hejtmanek
  2007-11-27 17:40                         ` Kok, Auke
  0 siblings, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-11-27 17:31 UTC (permalink / raw)
  To: Kok, Auke; +Cc: linux-kernel

On Tue, Nov 27, 2007 at 08:48:52AM -0800, Kok, Auke wrote:
> > unfortunately, the 7.6.9 driver cannot be compiled with 2.6.24-rc3-git2
> > kernel due to compilation errors.
> 
> but the in-kernel version of e1000 supports the ich8 lan device just fine
> and can be builtin. also this kernel has the first release of e1000e which
> supports the ich9 onboard lan device.

I'm afraid I'm missing the point, as you have stated that the in-kernel drivers
have a problem with the suspicious board hang...

-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-11-27 17:31                       ` Lukas Hejtmanek
@ 2007-11-27 17:40                         ` Kok, Auke
  2007-11-27 17:44                           ` Lukas Hejtmanek
  0 siblings, 1 reply; 39+ messages in thread
From: Kok, Auke @ 2007-11-27 17:40 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: NetDev


[moving this discussion to netdev, dropping lkml]

Lukas Hejtmanek wrote:
> On Tue, Nov 27, 2007 at 08:48:52AM -0800, Kok, Auke wrote:
>>> unfortunately, the 7.6.9 driver cannot be compiled with 2.6.24-rc3-git2
>>> kernel due to compilation errors.
>> but the in-kernel version of e1000 supports the ich8 lan device just fine
>> and can be builtin. also this kernel has the first release of e1000e which
>> supports the ich9 onboard lan device.
> 
> I'm afraid, I'm missing the point as you have stated that in-kernel drivers
> have problem with suspicious board hang...

my mistake, sorry for that confusion.

the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in the
code for a line that says:

	adapter->tx_timeout_factor = 8;

change that number to 16 to cope with the problem on 10mbit link partners. However
this won't fix the issue on 100mbit partners, and we will need to investigate
further if that is the case.

Auke


* Re: e1000 driver problems
  2007-11-27 17:40                         ` Kok, Auke
@ 2007-11-27 17:44                           ` Lukas Hejtmanek
  2007-11-27 18:23                             ` Kok, Auke
  0 siblings, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-11-27 17:44 UTC (permalink / raw)
  To: Kok, Auke; +Cc: NetDev

On Tue, Nov 27, 2007 at 09:40:08AM -0800, Kok, Auke wrote:
> > I'm afraid, I'm missing the point as you have stated that in-kernel drivers
> > have problem with suspicious board hang...
> 
> my mistake, sorry for that confusion.
> 
> the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in the
> code for a line that says:
> 
> 	adapter->tx_timeout_factor = 8;
> 
> change that number to 16 to cope with the problem on 10mbit link partners.
> However this won't fix the issue on 100mbit partners and we need to
> investigate that if that is the case.

100mbit is the issue in my case.

-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-11-27 17:44                           ` Lukas Hejtmanek
@ 2007-11-27 18:23                             ` Kok, Auke
  2007-12-03  9:50                               ` Lukas Hejtmanek
  0 siblings, 1 reply; 39+ messages in thread
From: Kok, Auke @ 2007-11-27 18:23 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: NetDev

Lukas Hejtmanek wrote:
> On Tue, Nov 27, 2007 at 09:40:08AM -0800, Kok, Auke wrote:
>>> I'm afraid, I'm missing the point as you have stated that in-kernel drivers
>>> have problem with suspicious board hang...
>> my mistake, sorry for that confusion.
>>
>> the fake hangs on 82562/6 devices occur on 10mbit link only. You can check in the
>> code for a line that says:
>>
>> 	adapter->tx_timeout_factor = 8;
>>
>> change that number to 16 to cope with the problem on 10mbit link partners.
>> However this won't fix the issue on 100mbit partners and we need to
>> investigate that if that is the case.
> 
> 100mbit is the issue in my case.
> 

can you see if your problem goes away with this patch?

---
e1000: increase tx timeout factor for 100mbit speeds

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index b7c3070..2e46a15 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2601,7 +2601,7 @@ e1000_watchdog(unsigned long data)
                        case SPEED_100:
                                txb2b = 0;
                                netdev->tx_queue_len = 100;
-                               /* maybe add some timeout factor ? */
+                               adapter->tx_timeout_factor = 4;
                                break;
                        }
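
For readers following along, a small model of what the hunk above changes. It
is only a sketch: the factor 8 for 10 Mbit comes from the line quoted earlier
in this thread and the factor 4 for 100 Mbit from this patch, while the
default of 1 for all other speeds and the function name are assumptions.

```python
# Factor per link speed in Mbit/s, before and after the patch above.
FACTORS_BEFORE = {10: 8}            # existing line quoted earlier in the thread
FACTORS_AFTER = {10: 8, 100: 4}     # with the SPEED_100 case patched
DEFAULT_FACTOR = 1                  # assumption: not stated in the thread


def tx_timeout_factor(speed_mbit: int, patched: bool) -> int:
    """Return the timeout multiplier the watchdog would pick for this speed."""
    table = FACTORS_AFTER if patched else FACTORS_BEFORE
    return table.get(speed_mbit, DEFAULT_FACTOR)


for speed in (10, 100, 1000):
    # factor * HZ ticks is 'factor' seconds, whatever HZ is
    print(speed, tx_timeout_factor(speed, patched=False),
          tx_timeout_factor(speed, patched=True))
```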


* Re: e1000 driver problems
  2007-11-27 18:23                             ` Kok, Auke
@ 2007-12-03  9:50                               ` Lukas Hejtmanek
  2007-12-03 15:20                                 ` Kok, Auke
  0 siblings, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-12-03  9:50 UTC (permalink / raw)
  To: Kok, Auke; +Cc: NetDev

On Tue, Nov 27, 2007 at 10:23:00AM -0800, Kok, Auke wrote:
> can you see if your problem goes away with this patch?

I cannot test it right now, but a friend of mine has the same card with a 2.6.23.1
kernel, and there the problem does not go away. He also tried the 7.6.12 module
from SourceForge; your patch did not help. Moreover, it just hangs network
connections after a while, with no message in dmesg about a hang.

-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-12-03  9:50                               ` Lukas Hejtmanek
@ 2007-12-03 15:20                                 ` Kok, Auke
  2007-12-04 15:03                                   ` Lukas Hejtmanek
                                                     ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Kok, Auke @ 2007-12-03 15:20 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: NetDev

Lukas Hejtmanek wrote:
> On Tue, Nov 27, 2007 at 10:23:00AM -0800, Kok, Auke wrote:
>> can you see if your problem goes away with this patch?
> 
> I cannot test it right now but friend of mine has the same card with 2.6.23.1
> kernel. it does not. he also tried module 7.6.12 from source fourge, your patch
> did not help. Moreover, it just hangs network connections after while. No
> message in dmesg about hangup.

can you open up a ticket on e1000.sf.net and fill in a full bugreport including
output of all of these? :

- ethtool -i eth0
- ethtool -e eth0
- lspci -vv
- full dmesg (not just the driver parts)
- dmidecode
- cat /proc/interrupts
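
One possible way to gather all of the above into a single report is sketched
below. The commands and the eth0 interface name are the ones listed above; the
section labels, the helper names and the injectable runner argument are made
up for this example, and some of the commands need root.

```python
import subprocess

# Commands from the list above; the dictionary keys are arbitrary labels.
COMMANDS = {
    "ethtool-i": ["ethtool", "-i", "eth0"],
    "ethtool-e": ["ethtool", "-e", "eth0"],
    "lspci": ["lspci", "-vv"],
    "dmesg": ["dmesg"],
    "dmidecode": ["dmidecode"],
    "interrupts": ["cat", "/proc/interrupts"],
}


def run(argv):
    """Run one command; return its output, or a note if it could not start."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, check=False)
    except OSError as exc:
        return f"could not run {argv[0]}: {exc}\n"
    return done.stdout + done.stderr


def collect(commands=COMMANDS, runner=run):
    """Build one text report with a labelled section per command."""
    sections = []
    for name, argv in commands.items():
        header = f"===== {name}: {' '.join(argv)} ====="
        sections.append(f"{header}\n{runner(argv)}")
    return "\n".join(sections)
```

Calling print(collect()) as root and redirecting the output to a file gives
one attachment for the ticket.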

I'd like to see if maybe we can reproduce this in our lab.

Thanks,

Auke


* Re: e1000 driver problems
  2007-12-03 15:20                                 ` Kok, Auke
@ 2007-12-04 15:03                                   ` Lukas Hejtmanek
  2007-12-04 15:34                                   ` Lukas Hejtmanek
  2007-12-18 11:08                                   ` Lukas Hejtmanek
  2 siblings, 0 replies; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-12-04 15:03 UTC (permalink / raw)
  To: Kok, Auke; +Cc: NetDev

On Mon, Dec 03, 2007 at 07:20:48AM -0800, Kok, Auke wrote:
> can you open up a ticket on e1000.sf.net and fill in a full bugreport
> including
> output of all of these? :
> 
> - ethtool -i eth0
> - ethtool -e eth0
> - lspci -vv
> - full dmesg (not just the driver parts)
> - dmidecode
> - cat /proc/interrupts
> 
> I'd like to see if maybe we can reproduce this in our lab.

I'll try, but it may take some time. Meanwhile, I found that the "hung" card
transmits but does not receive anything, including ARP.

-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-12-03 15:20                                 ` Kok, Auke
  2007-12-04 15:03                                   ` Lukas Hejtmanek
@ 2007-12-04 15:34                                   ` Lukas Hejtmanek
  2007-12-04 16:02                                     ` Matt Mathis
  2007-12-18 11:08                                   ` Lukas Hejtmanek
  2 siblings, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-12-04 15:34 UTC (permalink / raw)
  To: Kok, Auke; +Cc: NetDev

Another problem on my own machine: if I force the speed to 100Mb on a 1GE network,
I can reach 95Mbps bidirectional using UDP (0.34% loss) but only 1.2Mbps
using TCP. A bug? If I enable autonegotiation and run at 1GE, I can get 95Mb
on TCP to the target machine (the target machine has only a 100Mb connection).

-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-12-04 15:34                                   ` Lukas Hejtmanek
@ 2007-12-04 16:02                                     ` Matt Mathis
  2007-12-04 17:08                                       ` Lukas Hejtmanek
  0 siblings, 1 reply; 39+ messages in thread
From: Matt Mathis @ 2007-12-04 16:02 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: NetDev

This is probably not an e1000 problem, but a general Ethernet "feature".  If 
you defeat auto-negotiation to force the data rate, you implicitly defeat 
duplex negotiation as well.  You need to explicitly set the duplex mode.

In full duplex mode, send and receive are fully independent, and packets
can flow concurrently in both directions.

In half duplex mode the receiver assumes it is hearing its own transmission, 
and ignores (or errors) frames that are partially overlapped by its 
own transmission.

See for example: S. Shalunov and R. Carlson. "Detecting Duplex Mismatch on 
Ethernet"
http://www.springerlink.com/content/vapet12x584e6wyn/

Best regards,
--MM--

On Tue, 4 Dec 2007, Lukas Hejtmanek wrote:

> Another problem at my own machine. If I force speed to 100Mb on 1GE network,
> I can reach 95Mbps bidirectional using UDP (0.34% loses) but only 1.2Mbps
> using TCP. A bug? If I enable autonegotiation and run at 1GE, I can got 95Mb
> on TCP to the target machine (the target machine has only 100Mb connection).
>


* Re: e1000 driver problems
  2007-12-04 16:02                                     ` Matt Mathis
@ 2007-12-04 17:08                                       ` Lukas Hejtmanek
  2007-12-04 17:19                                         ` Kok, Auke
  0 siblings, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-12-04 17:08 UTC (permalink / raw)
  To: Matt Mathis; +Cc: NetDev

On Tue, Dec 04, 2007 at 11:02:23AM -0500, Matt Mathis wrote:
> This is probably not an e1000 problem, but a general Ethernet "feature".  
> If you defeat auto-negotiation to force the data rate, you implicitly 
> defeat duplex negotiation as well.  You need to explicitly set the duplex 
> mode.

ethtool reports that I have a full-duplex line:
ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Supports auto-negotiation: Yes
        Advertised link modes:  Not reported
        Advertised auto-negotiation: No
        Speed: 100Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: umbg
        Wake-on: d
        Current message level: 0x00000007 (7)
        Link detected: yes


-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-12-04 17:08                                       ` Lukas Hejtmanek
@ 2007-12-04 17:19                                         ` Kok, Auke
  2007-12-04 18:05                                           ` Rick Jones
  2007-12-04 19:59                                           ` Lukas Hejtmanek
  0 siblings, 2 replies; 39+ messages in thread
From: Kok, Auke @ 2007-12-04 17:19 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Matt Mathis, NetDev

Lukas Hejtmanek wrote:
> On Tue, Dec 04, 2007 at 11:02:23AM -0500, Matt Mathis wrote:
>> This is probably not an e1000 problem, but a general Ethernet "feature".  
>> If you defeat auto-negotiation to force the data rate, you implicitly 
>> defeat duplex negotiation as well.  You need to explicitly set the duplex 
>> mode.
> 
> ethtool reports, that I have full duplex line:

Matt was almost right - the link partner however might be working in half duplex
mode. in case you force speed/duplex, you are *required* to force the exact same
setting on the link partner, otherwise things like this can happen.

if you "just" want to disable gigabit speed, get the latest ethtool and run:

   ethtool -s eth0 advertise 0x0f

which keeps autonegotiation enabled but tells the e1000 card to _not_ advertise
gigabit speed capability to the link partner. The link partner and the e1000 can
then decide with autonegotiation whether to use 10half, 10full, 100half or 100full.
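
To see what a given mask advertises, the bits can be decoded as in the sketch
below. The bit values are the ones documented for ethtool's "advertise"
option as far as this example is aware; the function name is made up here.

```python
# Link-mode bits for "ethtool -s <dev> advertise <mask>".
ADVERTISE_BITS = {
    0x001: "10baseT/Half",
    0x002: "10baseT/Full",
    0x004: "100baseT/Half",
    0x008: "100baseT/Full",
    0x010: "1000baseT/Half",
    0x020: "1000baseT/Full",
}


def decode_advertise(mask: int) -> list:
    """Return the link modes selected by an advertise mask, lowest bit first."""
    return [name for bit, name in sorted(ADVERTISE_BITS.items()) if mask & bit]


print(decode_advertise(0x0F))
```

With 0x0f only the four 10/100 modes are set, so gigabit is never offered to
the link partner while autonegotiation itself stays on.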

Auke


* Re: e1000 driver problems
  2007-12-04 17:19                                         ` Kok, Auke
@ 2007-12-04 18:05                                           ` Rick Jones
  2007-12-04 19:59                                           ` Lukas Hejtmanek
  1 sibling, 0 replies; 39+ messages in thread
From: Rick Jones @ 2007-12-04 18:05 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Kok, Auke, Matt Mathis, NetDev

Here is some boilerplate on autoneg which I've been using in other 
forums for a number of years when questions about autoneg vs hardcoding 
and duplex-mismatch arise:


How 100Base-T Autoneg is supposed to work:

When both sides of the link are set to autoneg, they will "negotiate"
the duplex setting and select full-duplex if both sides can do
full-duplex.

If one side is hardcoded and not using autoneg, the autoneg process
will "fail" and the side trying to autoneg is required by spec to use
half-duplex mode.

If one side is using half-duplex, and the other is using full-duplex,
sorrow and woe is the usual result.

So, the following table shows what will happen given various settings
on each side:

                  Auto       Half       Full

    Auto        Happiness   Lucky      Sorrow

    Half        Lucky       Happiness  Sorrow

    Full        Sorrow      Sorrow     Happiness

Happiness means that there is a good shot of everything going well.
Lucky means that things will likely go well, but not because you did
anything correctly :) Sorrow means that there _will_ be a duplex
mis-match.
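
The table above can also be written as a few lines of code. This is only a
sketch of the rules as described here: it assumes that both ends are capable
of full duplex and that a side left on autoneg falls back to half duplex when
its partner is hardcoded. The function names are invented for the example.

```python
SETTINGS = ("auto", "half", "full")


def resolve(local: str, peer: str) -> tuple:
    """Return the duplex each side ends up using, as (local, peer)."""
    for setting in (local, peer):
        if setting not in SETTINGS:
            raise ValueError(f"unknown setting: {setting!r}")
    if local == "auto" and peer == "auto":
        return ("full", "full")      # negotiation succeeds
    if local == "auto":
        return ("half", peer)        # negotiation fails, fall back to half
    if peer == "auto":
        return (local, "half")
    return (local, peer)             # both hardcoded


def outcome(local: str, peer: str) -> str:
    """Name the table cell for a pair of configured settings."""
    local_duplex, peer_duplex = resolve(local, peer)
    if local_duplex != peer_duplex:
        return "Sorrow"              # duplex mismatch
    if local != peer:
        return "Lucky"               # matched only because of the fallback
    return "Happiness"


for a in SETTINGS:
    print(a, [outcome(a, b) for b in SETTINGS])
```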

When there is a duplex mismatch, on the side running half-duplex you
will see various errors and probably a number of _LATE_ collisions
("normal" collisions don't count here).  On the side running
full-duplex you will see things like FCS errors.  Note that those
errors are not necessarily conclusive, they are simply indicators.

Further, it is important to keep in mind that a "clean" ping (or the
like - eg "linkloop" or default netperf TCP_RR) test result is
inconclusive here - a duplex mismatch causes lost traffic _only_ when
both sides of the link try to speak at the same time. A typical ping
test, being synchronous, one at a time request/response, never tries
to have both sides talking at the same time.
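
A toy illustration of that last point, with invented timings: each frame is a
(start, end) interval on the wire, and a mismatch can only hurt where a frame
from one side overlaps a frame from the other. The numbers and names below
are made up for the sketch and do not model real Ethernet timing.

```python
def overlapping_pairs(a_frames, b_frames):
    """Count pairs of frames, one per side, that are on the wire together."""
    return sum(
        1
        for a_start, a_end in a_frames
        for b_start, b_end in b_frames
        if a_start < b_end and b_start < a_end
    )


# Ping-like traffic: B replies only after A's request has finished.
ping_a = [(t, t + 1) for t in range(0, 100, 10)]
ping_b = [(t + 2, t + 3) for t in range(0, 100, 10)]

# Bulk traffic in both directions: both sides send back to back.
bulk_a = [(t, t + 1) for t in range(10)]
bulk_b = [(t + 0.5, t + 1.5) for t in range(10)]

print("ping overlaps:", overlapping_pairs(ping_a, ping_b))
print("bulk overlaps:", overlapping_pairs(bulk_a, bulk_b))
```

The request/response pattern never has both sides sending at once, so it
stays clean, while the bidirectional stream overlaps constantly.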

Finally, when/if you migrate to 1000Base-T, everything has to be set
to auto-neg anyway.

rick jones


* Re: e1000 driver problems
  2007-12-04 17:19                                         ` Kok, Auke
  2007-12-04 18:05                                           ` Rick Jones
@ 2007-12-04 19:59                                           ` Lukas Hejtmanek
  2007-12-04 20:38                                             ` Kok, Auke
  1 sibling, 1 reply; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-12-04 19:59 UTC (permalink / raw)
  To: Kok, Auke; +Cc: Matt Mathis, NetDev

On Tue, Dec 04, 2007 at 09:19:11AM -0800, Kok, Auke wrote:
> if you "just" want to disable gigabit speed, get the latest ethtool and run:
> 
>    ethtool -s eth0 advertise 0x0f
> 

thanks. You may then want to let the people behind http://www.lesswatts.org/ know,
so they can change their network-related tips&tricks. I turned off autonegotiation
according to that web site.

-- 
Lukáš Hejtmánek


* Re: e1000 driver problems
  2007-12-04 19:59                                           ` Lukas Hejtmanek
@ 2007-12-04 20:38                                             ` Kok, Auke
  0 siblings, 0 replies; 39+ messages in thread
From: Kok, Auke @ 2007-12-04 20:38 UTC (permalink / raw)
  To: Lukas Hejtmanek; +Cc: Matt Mathis, NetDev

Lukas Hejtmanek wrote:
> On Tue, Dec 04, 2007 at 09:19:11AM -0800, Kok, Auke wrote:
>> if you "just" want to disable gigabit speed, get the latest ethtool and run:
>>
>>    ethtool -s eth0 advertise 0x0f
>>
> 
> thanks. You may then let know people behind http://www.lesswatts.org/ to
> change tips&tricks related to the network. I turned off autonegotiation
> according to that web.

I have sent this out to them recently. This feature however is not available in
all drivers, they explicitly need to be able to handle the advertise mask flag,
and it requires a newer ethtool.

Cheers,

Auke


* Re: e1000 driver problems
  2007-12-03 15:20                                 ` Kok, Auke
  2007-12-04 15:03                                   ` Lukas Hejtmanek
  2007-12-04 15:34                                   ` Lukas Hejtmanek
@ 2007-12-18 11:08                                   ` Lukas Hejtmanek
  2 siblings, 0 replies; 39+ messages in thread
From: Lukas Hejtmanek @ 2007-12-18 11:08 UTC (permalink / raw)
  To: Kok, Auke; +Cc: NetDev

Hello,

the problem seems to be gone in 2.6.24-rc kernels.

-- 
Lukáš Hejtmánek

Thread overview: 39+ messages
2006-10-13 21:40 Machine restart doesn't work - Intel 965G, 2.6.19-rc2 Aleksey Gorelov
2006-10-13 21:42 ` Ryan Richter
2006-10-13 21:45   ` Lukas Hejtmanek
2006-10-13 21:46     ` Ryan Richter
2006-10-13 21:49       ` Lukas Hejtmanek
2006-10-13 21:51         ` Ryan Richter
2006-10-17 18:00         ` Ryan Richter
2006-10-17 20:53           ` Aleksey Gorelov
2006-10-17 21:17             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 / e1000? Auke Kok
2006-10-17 22:14               ` dared1st
2007-11-20 14:38               ` e1000 driver problems Lukas Hejtmanek
2007-11-26 23:26                 ` Kok, Auke
2007-11-27 15:07                   ` Lukas Hejtmanek
2007-11-27 16:48                     ` Kok, Auke
2007-11-27 17:31                       ` Lukas Hejtmanek
2007-11-27 17:40                         ` Kok, Auke
2007-11-27 17:44                           ` Lukas Hejtmanek
2007-11-27 18:23                             ` Kok, Auke
2007-12-03  9:50                               ` Lukas Hejtmanek
2007-12-03 15:20                                 ` Kok, Auke
2007-12-04 15:03                                   ` Lukas Hejtmanek
2007-12-04 15:34                                   ` Lukas Hejtmanek
2007-12-04 16:02                                     ` Matt Mathis
2007-12-04 17:08                                       ` Lukas Hejtmanek
2007-12-04 17:19                                         ` Kok, Auke
2007-12-04 18:05                                           ` Rick Jones
2007-12-04 19:59                                           ` Lukas Hejtmanek
2007-12-04 20:38                                             ` Kok, Auke
2007-12-18 11:08                                   ` Lukas Hejtmanek
2006-10-17 22:27             ` Machine restart doesn't work - Intel 965G, 2.6.19-rc2 Ryan Richter
2006-10-20 17:57               ` Auke Kok
2006-10-20 18:06                 ` Lukas Hejtmanek
2006-10-21 17:34                   ` Ryan Richter
2006-10-21 17:56                     ` Auke Kok
2006-10-20 18:07                 ` Ryan Richter
2006-10-20 18:17                   ` Auke Kok
2006-10-20 18:29                     ` Lukas Hejtmanek
2006-10-23 20:52                   ` Aleksey Gorelov
2006-10-13 21:50       ` Aleksey Gorelov