public inbox for linux-kernel@vger.kernel.org
* REVISED: Experimentation with Athlon and fast_page_copy
@ 2001-05-04 17:22 Seth Goldberg
  2001-05-04 19:48 ` Brian Gerst
  2001-05-04 21:09 ` Alan Cox
  0 siblings, 2 replies; 26+ messages in thread
From: Seth Goldberg @ 2001-05-04 17:22 UTC (permalink / raw)
  To: linux-kernel

Hi,
 
 After removing my head from my a**, I revised the code that checks
the memory copy in the fast_page_copy routine.  The machine then proceeded
not to stop at my panic, but I got my "normal" oopses.  I then had an
idea and removed all the prefetch instructions from the beginning of the
routine and tried the resulting kernel.  I now have no crashes.
What could this mean?

Here is another patch just so you can keep me honest if I
made another mistake:

-------------------------
diff -r ./arch/i386/lib/mmx.c ../lin2/linux/arch/i386/lib/mmx.c
149,150c149
<
< /*    __asm__ __volatile__ (
---
>       __asm__ __volatile__ (
158c157
<               "3: movw $0x1AEB, 1b\n"
---
>               "3: movw $0x1AEB, 1b\n" /* jmp on 26 bytes */
166c165
< */
---
>
170c169
<               "1: nop\n" /* prefetch 320(%0)\n" */
---
>               "1: prefetch 320(%0)\n"                                         
-------------------------

  Please let me know if that makes sense :).

  Thank you,
   Seth

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-04 22:26   ` Aaron Tiensivu
@ 2001-05-04 18:10     ` Bobby D. Bryant
  2001-05-05  6:43       ` John R Lenton
  2001-05-05  0:26     ` Joseph Carter
  1 sibling, 1 reply; 26+ messages in thread
From: Bobby D. Bryant @ 2001-05-04 18:10 UTC (permalink / raw)
  To: linux-kernel

Aaron Tiensivu wrote:

> > What still stands out is that exactly _zero_ people have reported the same
> > problem with non VIA chipset Athlons.
>
> This might be grasping at straws [...] This could be (total conjecture)
> related somehow to the corruption bugs they are admitting to in
> the 686B although they are blaming the SB Live now.

Just another data point (the news is in the final paragraph):

I recently built two near-twin systems using Athlon 1.2's and VIA chipsets
(EPoX 8KTA3), and have *never* been able to get either to boot an
Athlon-optimized kernel, having tried 2.4.0, 2.4.2, 2.4.4, and about 5
different -ac* variants of 2.4.3.

They do boot PIII kernels reliably for all those variants, though they still
suffer occasional oopses, hangs, or crashes (as discussed in other threads).

However (and here's the part I haven't mentioned before), yesterday I switched
one of them to a new mb with a non-VIA chipset (Asus A7A266), and it booted the
first Athlon kernel I tried (2.4.4).  No other changes to .config, same
processor as before, same memory, same disks, same video, same case, same power
cord, you name it.

Bobby Bryant
Austin, Texas




* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-04 17:22 REVISED: Experimentation with Athlon and fast_page_copy Seth Goldberg
@ 2001-05-04 19:48 ` Brian Gerst
  2001-05-04 21:09 ` Alan Cox
  1 sibling, 0 replies; 26+ messages in thread
From: Brian Gerst @ 2001-05-04 19:48 UTC (permalink / raw)
  To: Seth Goldberg; +Cc: linux-kernel

Seth Goldberg wrote:
> 
> Hi,
> 
>  After removing my head from my a**, I revised the code that checks
> the memory copy in the fast_page_copy routine.  The machine then proceeded
> not to stop at my panic, but I got my "normal" oopses.  I then had an
> idea and removed all the prefetch instructions from the beginning of the
> routine and tried the resulting kernel.  I now have no crashes.
> What could this mean?

What are your "normal" oopses?

--

				Brian Gerst


* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-04 17:22 REVISED: Experimentation with Athlon and fast_page_copy Seth Goldberg
  2001-05-04 19:48 ` Brian Gerst
@ 2001-05-04 21:09 ` Alan Cox
  2001-05-04 22:26   ` Aaron Tiensivu
  2001-05-09  2:11   ` Tom Leete
  1 sibling, 2 replies; 26+ messages in thread
From: Alan Cox @ 2001-05-04 21:09 UTC (permalink / raw)
  To: Seth Goldberg; +Cc: linux-kernel

> the memory copy in the fast_page_copy routine.  The machine then
> proceeded
> not to stop at my panic, but I got my "normal" oopses.  I then had an

Ok

> idea and removed all the prefetch instructions from the beginning of the
> routine and tried the resulting kernel.  I now have no crashes.
> What could this mean?

I think it has to mean a hardware problem.

> Here is another patch just so you can keep me honest if I
> made another mistake:

There is a mistake but you won't trigger it. It is no longer 26 bytes 8)
That patch is only used when the prefetchw faults with an illegal instruction
and is done so you can boot an Athlon kernel on a lesser CPU.
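
For readers puzzling over the `movw $0x1AEB, 1b` line in the patch, here is
a minimal sketch of the byte arithmetic. x86 stores a 16-bit immediate low
byte first, so the two bytes written over label 1 are EB 1A, and 0xEB is the
opcode of a short jmp with an 8-bit displacement (here 0x1A = 26). The struct
and function names below are hypothetical and exist only for this
illustration; none of them come from mmx.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for this illustration; not a kernel structure. */
struct short_jmp {
    uint8_t opcode;        /* first byte in memory */
    int8_t  displacement;  /* second byte in memory, signed rel8 */
};

/* Split a 16-bit immediate the way a little-endian x86 store lays it out. */
struct short_jmp decode_movw_imm(uint16_t imm)
{
    struct short_jmp j;

    j.opcode = (uint8_t)(imm & 0xFF);       /* low byte is stored first */
    j.displacement = (int8_t)(imm >> 8);    /* high byte follows */
    return j;
}

/* 0xEB is the opcode of JMP rel8 (short jump). */
int is_short_jmp(struct short_jmp j)
{
    return j.opcode == 0xEB;
}

/*
 * A short jmp is 2 bytes long and its displacement is relative to the
 * address of the next instruction, so the target is offset + 2 + rel8.
 */
long short_jmp_target(long jmp_offset, struct short_jmp j)
{
    assert(is_short_jmp(j));
    return jmp_offset + 2 + j.displacement;
}
```

Decoding 0x1AEB this way gives opcode 0xEB and displacement 26, which is why
the displacement has to be kept in step with the amount of code it is meant
to skip.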

The prefetch instructions hint to the CPU what memory we will access very soon.
The primary effect of that is that we hit full theoretical memory bandwidth
when copying pages. It doesn't really change execution behaviour in any other
way, which then does rather point to a CPU or other hardware problem. The very
early Athlons had prefetch bugs, but we would not trigger those and no reporters
have such an early CPU.
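
To make the "hint only" point concrete, here is a rough userspace sketch,
not the kernel routine. It assumes GCC or Clang, whose `__builtin_prefetch`
stands in for the Athlon prefetch instruction; the function name, the
64-byte step and the 4096-byte page size are choices made for this example.
The 320-byte lookahead mirrors the `prefetch 320(%0)` in the patch. With or
without the hint, the bytes copied are identical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES     4096u  /* assumed page size for this sketch */
#define CHUNK_BYTES    64u    /* bytes copied per loop iteration */
#define PREFETCH_AHEAD 320u   /* lookahead distance, as in the patch */

/* Copy one page, hinting at data a few chunks ahead of the current one. */
void copy_page_with_hint(void *dst, const void *src)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t off;

    assert(dst != NULL && src != NULL);

    for (off = 0; off < PAGE_BYTES; off += CHUNK_BYTES) {
#if defined(__GNUC__)
        /* Only hint at addresses that are still inside the source page. */
        if (off + PREFETCH_AHEAD < PAGE_BYTES)
            __builtin_prefetch(s + off + PREFETCH_AHEAD, 0, 0);
#endif
        /* The copy itself does not depend on the hint in any way. */
        memcpy(d + off, s + off, CHUNK_BYTES);
    }
}
```

Deleting the `__builtin_prefetch` line changes timing at most, never the
result, which is the property the argument above relies on.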

What still stands out is that exactly _zero_ people have reported the same
problem with non VIA chipset Athlons.



* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-04 21:09 ` Alan Cox
@ 2001-05-04 22:26   ` Aaron Tiensivu
  2001-05-04 18:10     ` Bobby D. Bryant
  2001-05-05  0:26     ` Joseph Carter
  2001-05-09  2:11   ` Tom Leete
  1 sibling, 2 replies; 26+ messages in thread
From: Aaron Tiensivu @ 2001-05-04 22:26 UTC (permalink / raw)
  To: Alan Cox, linux-kernel


> What still stands out is that exactly _zero_ people have reported the same
> problem with non VIA chipset Athlons.

This might be grasping at straws, but I remember VIA problems in the "good old
days" of Socket 7 with CPU/PCI Prefetches and especially Read-around-Write
settings that would cause issues like we're seeing with the Athlon
pre-fetches. This could be (total conjecture) related somehow to the
corruption bugs they are admitting to in the 686B although they are blaming
the SB Live now.





* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-04 22:26   ` Aaron Tiensivu
  2001-05-04 18:10     ` Bobby D. Bryant
@ 2001-05-05  0:26     ` Joseph Carter
  2001-05-05  3:51       ` Chris Wedgwood
  1 sibling, 1 reply; 26+ messages in thread
From: Joseph Carter @ 2001-05-05  0:26 UTC (permalink / raw)
  To: Aaron Tiensivu; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 903 bytes --]

On Fri, May 04, 2001 at 06:26:14PM -0400, Aaron Tiensivu wrote:
> This might be grasping at straws, but I remember VIA problems in the "good old
> days" of Socket 7 with CPU/PCI Prefetches and especially Read-around-Write
> settings that would cause issues like we're seeing with the Athlon
> pre-fetches. This could be (total conjecture) related somehow to the
> corruption bugs they are admitting to in the 686B although they are blaming
> the SB Live now.

I don't see how they figure, but in case there was any doubt I have a VIA
KT133A/686B board (Abit KT7A) and don't experience anything resembling
disk corruption unless the box crashes for some other reason.  I do seem
to be experiencing AGP problems in spades, but my disks at least are fine.

-- 
Joseph Carter <knghtbrd@debian.org>                Free software developer

<_Anarchy_> Argh.. who's handing out the paper bags  8)


[-- Attachment #2: Type: application/pgp-signature, Size: 273 bytes --]


* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-05  0:26     ` Joseph Carter
@ 2001-05-05  3:51       ` Chris Wedgwood
  2001-05-05  4:08         ` Seth Goldberg
  2001-05-05  5:45         ` REVISED: Experimentation with Athlon and fast_page_copy Joseph Carter
  0 siblings, 2 replies; 26+ messages in thread
From: Chris Wedgwood @ 2001-05-05  3:51 UTC (permalink / raw)
  To: Joseph Carter; +Cc: Aaron Tiensivu, linux-kernel

On Fri, May 04, 2001 at 05:26:57PM -0700, Joseph Carter wrote:

    I don't see how they figure, but in case there was any doubt I
    have a VIA KT133A/686B board (Abit KT7A) and don't experience
    anything resembling disk corruption unless the box crashes for
    some other reason.  I do seem to be experiencing AGP problems in
    spades, but my disks at least are fine.

I too see no disk problems whatsoever (nothing really interesting
there, many people do not) but am also seeing AGP problems.

In fact, I had to disable AGP to stop X locking the box hard... yet
agpgart and the video driver (NVidia[1]) both claim to support the
chipset -- does anyone actually have this working?




  --cw

[1] Yeah, closed source, blah blah blah. Right now it works for me,
    and it works _VERY_ well.


* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-05  3:51       ` Chris Wedgwood
@ 2001-05-05  4:08         ` Seth Goldberg
       [not found]           ` <20010505163204.A29622@metastasis.f00f.org>
  2001-05-05  5:45         ` REVISED: Experimentation with Athlon and fast_page_copy Joseph Carter
  1 sibling, 1 reply; 26+ messages in thread
From: Seth Goldberg @ 2001-05-05  4:08 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Joseph Carter, Aaron Tiensivu, linux-kernel

Chris Wedgwood wrote:
> 
> On Fri, May 04, 2001 at 05:26:57PM -0700, Joseph Carter wrote:
> 
>     I don't see how they figure, but in case there was any doubt I
>     have a VIA KT133A/686B board (Abit KT7A) and don't experience
>     anything resembling disk corruption unless the box crashes for
>     some other reason.  I do seem to be experiencing AGP problems in
>     spades, but my disks at least are fine.
> 
> I too see no disk problems whatsoever (nothing really interesting
> there, many people do not) but am also seeing AGP problems.
> 
> In fact, I had to disable AGP to stop X locking the box hard... yet
> agpgart and the video driver (NVidia[1]) both claim to support the
> chipset -- does anyone actually have this working?

  My IWILL (KT133A) + GeForce 256 are working fine over AGP.

  --S


* Athlon and fast_page_copy: What's it worth ? :)
       [not found]           ` <20010505163204.A29622@metastasis.f00f.org>
@ 2001-05-05  5:03             ` Seth Goldberg
  2001-05-05  6:20               ` Mark Hahn
                                 ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Seth Goldberg @ 2001-05-05  5:03 UTC (permalink / raw)
  To: linux-kernel

Hi,

  Before I go any further with this investigation, I'd like to get an
idea of how much of a performance improvement the K7 fast_page_copy will
give me.  Can someone suggest the best benchmark to test the speed of this
routine?

 Thanks,
  Seth


* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-05  3:51       ` Chris Wedgwood
  2001-05-05  4:08         ` Seth Goldberg
@ 2001-05-05  5:45         ` Joseph Carter
  1 sibling, 0 replies; 26+ messages in thread
From: Joseph Carter @ 2001-05-05  5:45 UTC (permalink / raw)
  To: Chris Wedgwood; +Cc: Aaron Tiensivu, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1418 bytes --]

On Sat, May 05, 2001 at 03:51:13PM +1200, Chris Wedgwood wrote:
>     I don't see how they figure, but in case there was any doubt I
>     have a VIA KT133A/686B board (Abit KT7A) and don't experience
>     anything resembling disk corruption unless the box crashes for
>     some other reason.  I do seem to be experiencing AGP problems in
>     spades, but my disks at least are fine.
> 
> I too see no disk problems whatsoever (nothing really interesting
> there, many people do not) but am also seeing AGP problems.
> 
> In fact, I had to disable AGP to stop X locking the box hard... yet
> agpgart and the video driver (NVidia[1]) both claim to support the
> chipset -- does anyone actually have this working?

Not an option with the Radeon unfortunately.  At least, not yet.  Whenever
I find the solution (recently a bunch of people have suggested a bunch of
things to try on dri-devel - thanks guys!) I'll post to that list what
fixed it since I know I am not the only person seeing this kind of
problem.  I think some of the guys are looking into improving the docs a
bit, so maybe if I find it soon the problem and workaround will get
documented.  =)

-- 
Joseph Carter <knghtbrd@debian.org>                Free software developer

<hop> kb: I demand integrity and honesty in those who i do business with
<hop> i know my demands are unreasonable, but a guy can dream, can't he?


[-- Attachment #2: Type: application/pgp-signature, Size: 273 bytes --]


* Re: Athlon and fast_page_copy: What's it worth ? :)
  2001-05-05  5:03             ` Athlon and fast_page_copy: What's it worth ? :) Seth Goldberg
@ 2001-05-05  6:20               ` Mark Hahn
  2001-05-05  9:15                 ` Tom Leete
  2001-05-05  7:17               ` Alan Cox
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 26+ messages in thread
From: Mark Hahn @ 2001-05-05  6:20 UTC (permalink / raw)
  To: Seth Goldberg; +Cc: linux-kernel

On Fri, 4 May 2001, Seth Goldberg wrote:

> Hi,
> 
>   Before I go any further with this investigation, I'd like to get an
> idea
> of how much of a performance improvement the K7 fast_page_copy will give
> me.
> Can someone suggest the best benchmark to test the speed of this
> routine?

Arjan van de Ven did the code, and he wrote a little test harness.
I've hacked it a bit (http://brain.mcmaster.ca/~hahn/athlon.c);
on my duron/600, kt133, pc133 cas2, it looks like this:

clear_page by 'normal_clear_page'        took 7221 cycles (324.6 MB/s)
clear_page by 'slow_zero_page'           took 7232 cycles (324.1 MB/s)
clear_page by 'fast_clear_page'          took 6110 cycles (383.6 MB/s)
clear_page by 'faster_clear_page'        took 2574 cycles (910.6 MB/s)

copy_page by 'normal_copy_page'  took 7224 cycles (324.4 MB/s)
copy_page by 'slow_copy_page'    took 7223 cycles (324.5 MB/s)
copy_page by 'fast_copy_page'    took 4662 cycles (502.7 MB/s)
copy_page by 'faster_copy'       took 2746 cycles (853.5 MB/s)
copy_page by 'even_faster'       took 2802 cycles (836.5 MB/s)

70% faster!
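
The MB/s column can be reproduced from the cycle counts. The sketch below
assumes a 4096-byte page, a clock of exactly 600 MHz for the Duron, and
"MB" meaning 2^20 bytes; those assumptions are not stated in the message,
but they do reproduce the reported figures. The function names are made up
for this example.

```c
#include <assert.h>

#define PAGE_BYTES 4096.0             /* assumed page size */
#define MIB        (1024.0 * 1024.0)  /* "MB" taken as 2^20 bytes */

/* Throughput in MB/s for one page processed in `cycles` at `mhz` MHz. */
double page_mb_per_s(double mhz, double cycles)
{
    double seconds;

    assert(mhz > 0.0 && cycles > 0.0);
    seconds = cycles / (mhz * 1e6);   /* time taken for one page */
    return PAGE_BYTES / seconds / MIB;
}

/* How many times faster the second routine is than the first. */
double speedup(double slow_cycles, double fast_cycles)
{
    assert(slow_cycles > 0.0 && fast_cycles > 0.0);
    return slow_cycles / fast_cycles;
}
```

With these assumptions, 7221 cycles at 600 MHz comes out at about
324.6 MB/s and 2746 cycles at about 853.5 MB/s, matching the table. The
ratio 7224 / 2746 is roughly 2.6, so 'faster_copy' moves about 2.6 times
the bandwidth of 'normal_copy_page' in this run.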



* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-04 18:10     ` Bobby D. Bryant
@ 2001-05-05  6:43       ` John R Lenton
  2001-05-05  7:20         ` Alan Cox
  0 siblings, 1 reply; 26+ messages in thread
From: John R Lenton @ 2001-05-05  6:43 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 975 bytes --]

On Sat, May 05, 2001 at 12:10:06AM +0600, Bobby D. Bryant wrote:
> They do boot PIII kernels reliably for all those variants, though they still
> suffer occasional oopses, hangs, or crashes (as discussed in other threads).

and as happens with my SMP PIII VIA-based boxes (and I've finally
fixed the memory, so I no longer get the oopses, just solid
hardware hangs).

> However (and here's the part I haven't mentioned before), yesterday I switched
> one of them to a new mb with a non-VIA chipset (Asus A7A266), and it booted the
> first Athlon kernel I tried (2.4.4).  No other changes to .config, same
> processor as before, same memory, same disks, same video, same case, same power
> cord, you name it.

damn. I guess the saving of 200$ on the MSI has probably been
300$ down the drain :(

-- 
John Lenton (john@grulic.org.ar) -- Random fortune:
If you treat people right they will treat you right -- 90% of the time.
		-- Franklin Delano Roosevelt

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]


* Re: Athlon and fast_page_copy: What's it worth ? :)
  2001-05-05  5:03             ` Athlon and fast_page_copy: What's it worth ? :) Seth Goldberg
  2001-05-05  6:20               ` Mark Hahn
@ 2001-05-05  7:17               ` Alan Cox
  2001-05-05 14:19               ` Jonathan Morton
  2001-05-08 21:16               ` Arjan van de Ven
  3 siblings, 0 replies; 26+ messages in thread
From: Alan Cox @ 2001-05-05  7:17 UTC (permalink / raw)
  To: Seth Goldberg; +Cc: linux-kernel

>   Before I go any further with this investigation, I'd like to get an
> idea
> of how much of a performance improvement the K7 fast_page_copy will give
> me.
> Can someone suggest the best benchmark to test the speed of this
> routine?

About 30% on page copies. Its impact in the real world is very dependent on
the job mix.



* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-05  6:43       ` John R Lenton
@ 2001-05-05  7:20         ` Alan Cox
  2001-05-07  1:26           ` John R Lenton
  0 siblings, 1 reply; 26+ messages in thread
From: Alan Cox @ 2001-05-05  7:20 UTC (permalink / raw)
  To: John R Lenton; +Cc: linux-kernel

> > one of them to a new mb with a non-VIA chipset (Asus A7A266), and it booted the
> > first Athlon kernel I tried (2.4.4).  No other changes to .config, same
> > processor as before, same memory, same disks, same video, same case, same power
> > cord, you name it.
> 
> damn. I guess the saving of 200$ on the MSI has probably been
> 300$ down the drain :(

Don't panic just yet. Manfred's observation could mean we hit chipset-specific
behaviour on prefetches.


* Re: Athlon and fast_page_copy: What's it worth ? :)
  2001-05-05  6:20               ` Mark Hahn
@ 2001-05-05  9:15                 ` Tom Leete
  0 siblings, 0 replies; 26+ messages in thread
From: Tom Leete @ 2001-05-05  9:15 UTC (permalink / raw)
  To: Mark Hahn; +Cc: Seth Goldberg, linux-kernel

Mark Hahn wrote:
> 
> On Fri, 4 May 2001, Seth Goldberg wrote:
> 
> > Hi,
> >
> >   Before I go any further with this investigation, I'd like to get an
> > idea
> > of how much of a performance improvement the K7 fast_page_copy will give
> > me.
> > Can someone suggest the best benchmark to test the speed of this
> > routine?
> 
> Arjan van de Ven did the code, and he wrote a little test harness.
> I've hacked it a bit (http://brain.mcmaster.ca/~hahn/athlon.c);
> on my duron/600, kt133, pc133 cas2, it looks like this:
> 
> clear_page by 'normal_clear_page'        took 7221 cycles (324.6 MB/s)
> clear_page by 'slow_zero_page'           took 7232 cycles (324.1 MB/s)
> clear_page by 'fast_clear_page'          took 6110 cycles (383.6 MB/s)
> clear_page by 'faster_clear_page'        took 2574 cycles (910.6 MB/s)
> 
> copy_page by 'normal_copy_page'  took 7224 cycles (324.4 MB/s)
> copy_page by 'slow_copy_page'    took 7223 cycles (324.5 MB/s)
> copy_page by 'fast_copy_page'    took 4662 cycles (502.7 MB/s)
> copy_page by 'faster_copy'       took 2746 cycles (853.5 MB/s)
> copy_page by 'even_faster'       took 2802 cycles (836.5 MB/s)
> 
> 70% faster!
> 

I've played with this some, too. I find that Arjan's tests are very sensitive
to the number of hw interrupts serviced. On UP I see 2-3 interrupts per
page copy on average with my normal workload. On Athlon, interrupts hit 'rep
movs' (a long, interruptible vector-path instruction) much harder than they do
MMX movq (direct-path) instructions.

With hands off and no networking, breakeven is about the canonical 512
bytes, and page copy is about 30% better, as Alan says. With ethers up and X
running, MMX gets better by comparison: 40-60% for me. I haven't seen 70%
better, but I'd like to.

Cheers,
Tom
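
A breakeven point like the 512 bytes mentioned above is usually applied as a
simple size check in front of the two copy paths. The following is only a
sketch of that idea under stated assumptions: the names are hypothetical,
the threshold is just the figure quoted in this message, and a plain
64-byte block copy stands in for the real MMX loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WIDE_COPY_BREAKEVEN 512u  /* figure quoted above; treat as tunable */
#define WIDE_BLOCK_BYTES    64u   /* stand-in block size for the wide path */

enum copy_path { COPY_PATH_SIMPLE, COPY_PATH_WIDE };

/* Below the breakeven the setup cost of the wide path is not worth paying. */
enum copy_path choose_copy_path(size_t len)
{
    return len < WIDE_COPY_BREAKEVEN ? COPY_PATH_SIMPLE : COPY_PATH_WIDE;
}

/* Copy len bytes and report which path was taken. */
enum copy_path copy_with_breakeven(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    enum copy_path path = choose_copy_path(len);
    size_t off = 0;

    assert(dst != NULL && src != NULL);

    if (path == COPY_PATH_WIDE) {
        /* Wide path: whole blocks only (stand-in for the MMX loop). */
        for (; off + WIDE_BLOCK_BYTES <= len; off += WIDE_BLOCK_BYTES)
            memcpy(d + off, s + off, WIDE_BLOCK_BYTES);
    }

    /* Simple path, also used for the tail left over by the wide path. */
    for (; off < len; off++)
        d[off] = s[off];

    return path;
}
```

Where the threshold really belongs depends on the interrupt load described
above, so it would need to be measured rather than assumed.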

-- 
The Daemons lurk and are dumb. -- Emerson


* Re: Athlon and fast_page_copy: What's it worth ? :)
  2001-05-05  5:03             ` Athlon and fast_page_copy: What's it worth ? :) Seth Goldberg
  2001-05-05  6:20               ` Mark Hahn
  2001-05-05  7:17               ` Alan Cox
@ 2001-05-05 14:19               ` Jonathan Morton
  2001-05-05 14:41                 ` Alan Cox
  2001-05-05 15:17                 ` Jonathan Morton
  2001-05-08 21:16               ` Arjan van de Ven
  3 siblings, 2 replies; 26+ messages in thread
From: Jonathan Morton @ 2001-05-05 14:19 UTC (permalink / raw)
  To: Mark Hahn, Seth Goldberg; +Cc: linux-kernel

At 7:20 am +0100 5/5/2001, Mark Hahn wrote:
>On Fri, 4 May 2001, Seth Goldberg wrote:
>
>> Hi,
>>
>>   Before I go any further with this investigation, I'd like to get an
>> idea
>> of how much of a performance improvement the K7 fast_page_copy will give
>> me.
>> Can someone suggest the best benchmark to test the speed of this
>> routine?
>
>Arjan van de Ven did the code, and he wrote a little test harness.
>I've hacked it a bit (http://brain.mcmaster.ca/~hahn/athlon.c);
>on my duron/600, kt133, pc133 cas2, it looks like this:
>
>clear_page by 'normal_clear_page'        took 7221 cycles (324.6 MB/s)
>clear_page by 'slow_zero_page'           took 7232 cycles (324.1 MB/s)
>clear_page by 'fast_clear_page'          took 6110 cycles (383.6 MB/s)
>clear_page by 'faster_clear_page'        took 2574 cycles (910.6 MB/s)
>
>copy_page by 'normal_copy_page'  took 7224 cycles (324.4 MB/s)
>copy_page by 'slow_copy_page'    took 7223 cycles (324.5 MB/s)
>copy_page by 'fast_copy_page'    took 4662 cycles (502.7 MB/s)
>copy_page by 'faster_copy'       took 2746 cycles (853.5 MB/s)
>copy_page by 'even_faster'       took 2802 cycles (836.5 MB/s)
>
>70% faster!


On my Athlon 1GHz, Abit KT7, PC133 set to "Turbo" (not quite sure what the
actual CAS rating is, but it works):
[chromi@beryllium compsci]$ ./athlon
1000.047 MHz
clear_page by 'normal_clear_page'        took 10769 cycles (362.7 MB/s)
clear_page by 'slow_zero_page'           took 10349 cycles (377.5 MB/s)
clear_page by 'fast_clear_page'          took 10868 cycles (359.4 MB/s)
clear_page by 'faster_clear_page'        took 4345 cycles (899.1 MB/s)

copy_page by 'normal_copy_page'  took 11242 cycles (347.5 MB/s)
copy_page by 'slow_copy_page'    took 11245 cycles (347.4 MB/s)
copy_page by 'fast_copy_page'    took 7951 cycles (491.3 MB/s)
copy_page by 'faster_copy'       took 4765 cycles (819.7 MB/s)
copy_page by 'even_faster'       took 4955 cycles (788.3 MB/s)

My wild guess is that with the "faster" code, the K7 is avoiding loading
cache lines just to write them out again, and is just writing tons of data.
The PPC G4 - and perhaps even the G3 - performs a similar trick
automatically, without special assembly...

Perhaps the IWILL m/board doesn't like this behaviour, and somehow assumes
that all written cachelines have been read beforehand.  I heard of some
m/boards - particularly those with more than 3 DIMM slots - using a "helper
chip" to boost the signal to the last slot or two, so maybe that is a
problem?  How many slots does the IWILL have?

--------------------------------------------------------------
from:     Jonathan "Chromatix" Morton
mail:     chromi@cyberspace.org  (not for attachments)
big-mail: chromatix@penguinpowered.com
uni-mail: j.d.morton@lancaster.ac.uk

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-----END GEEK CODE BLOCK-----




* Re: Athlon and fast_page_copy: What's it worth ? :)
  2001-05-05 14:19               ` Jonathan Morton
@ 2001-05-05 14:41                 ` Alan Cox
  2001-05-05 15:17                 ` Jonathan Morton
  1 sibling, 0 replies; 26+ messages in thread
From: Alan Cox @ 2001-05-05 14:41 UTC (permalink / raw)
  To: Jonathan Morton; +Cc: Mark Hahn, Seth Goldberg, linux-kernel

> My wild guess is that with the "faster" code, the K7 is avoiding loading
> cache lines just to write them out again, and is just writing tons of data.
> The PPC G4 - and perhaps even the G3 - performs a similar trick
> automatically, without special assembly...

X86 has done that since the K5 era. 

No, the main thing that the MMX copier does is to read and write in 64-bit
wide chunks, and then more importantly to prefetch pending data. Thus instead
of stalling on reads there is a continual stream of data from the SDRAM hitting
the CPU ready for us to write it back out (and the write-out is buffered too).

Alan
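
The "64-bit wide chunks" half of that description can be sketched in
portable C. This is an illustration only, not the MMX code: `uint64_t`
loads and stores stand in for movq, the function name is made up, and the
length is assumed to be a multiple of 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy len bytes as a sequence of 64-bit words; len must be a multiple of 8. */
void copy_words64(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t off;

    assert(len % sizeof(uint64_t) == 0);

    for (off = 0; off < len; off += sizeof(uint64_t)) {
        uint64_t w;

        /* memcpy keeps the access legal for any alignment; compilers
         * typically turn each call into a single 64-bit load or store. */
        memcpy(&w, s + off, sizeof w);
        memcpy(d + off, &w, sizeof w);
    }
}
```

A 4096-byte page is 512 such words, so the loop runs 512 times per page.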



* Re: Athlon and fast_page_copy: What's it worth ? :)
  2001-05-05 14:19               ` Jonathan Morton
  2001-05-05 14:41                 ` Alan Cox
@ 2001-05-05 15:17                 ` Jonathan Morton
  1 sibling, 0 replies; 26+ messages in thread
From: Jonathan Morton @ 2001-05-05 15:17 UTC (permalink / raw)
  To: Alan Cox; +Cc: Mark Hahn, Seth Goldberg, linux-kernel

At 3:41 pm +0100 5/5/2001, Alan Cox wrote:
>> My wild guess is that with the "faster" code, the K7 is avoiding loading
>> cache lines just to write them out again, and is just writing tons of data.
>> The PPC G4 - and perhaps even the G3 - performs a similar trick
>> automatically, without special assembly...
>
>X86 has done that since the K5 era.
>
>No the main thing that the mmx copier does is to read and write in 64bit
>wide chunks

Just for the record, this can be done on any PPC, by using the FPU
registers (which are much faster than x86 FPU).  AltiVec can do 128-bit
wide transfers.

>and then more importantly to prefetch pending data.

That's a tougher one.  AltiVec (in the G4) can do this, but I suspect it
can be emulated using the pipeline on earlier PowerPCs, by queueing up a
line of FPU load instructions and then a queue of FPU saves.  However, the
601 and 603 don't have a superscalar FPU, though I wonder if that would
actually affect a simple load/store operation.

This is rapidly getting offtopic, though...

--------------------------------------------------------------
from:     Jonathan "Chromatix" Morton
mail:     chromi@cyberspace.org  (not for attachments)
big-mail: chromatix@penguinpowered.com
uni-mail: j.d.morton@lancaster.ac.uk

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)
-----END GEEK CODE BLOCK-----




* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-05  7:20         ` Alan Cox
@ 2001-05-07  1:26           ` John R Lenton
  2001-05-07  1:30             ` Jeremy
  0 siblings, 1 reply; 26+ messages in thread
From: John R Lenton @ 2001-05-07  1:26 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

On Sat, May 05, 2001 at 08:20:56AM +0100, Alan Cox wrote:
> Don't panic just yet. Manfred's observation could mean we hit chipset-specific
> behaviour on prefetches.

OK - Please let me know when to start.

-- 
John Lenton (john@grulic.org.ar) -- Random fortune:
BOFH excuse #349:

Stray Alpha Particles from memory packaging caused Hard Memory Error on Server.


* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-07  1:26           ` John R Lenton
@ 2001-05-07  1:30             ` Jeremy
  0 siblings, 0 replies; 26+ messages in thread
From: Jeremy @ 2001-05-07  1:30 UTC (permalink / raw)
  To: Alan Cox, linux-kernel

Have non-production VIA KT133A, will test :) (Tyan mobo, 1.33 GHz, tulip eth, an
IDE drive, nothing really exciting, just a fast Athlon)

-j

John R Lenton enlightened recipients with the following on 06May2001:
> On Sat, May 05, 2001 at 08:20:56AM +0100, Alan Cox wrote:
> > Don't panic just yet. Manfred's observation could mean we hit chipset-specific
> > behaviour on prefetches.
> 
> OK - Please let me know when to start.

-- 
---------------------------------------------------------------------------
                          heffner at darkness.net
                       Darkness Network Engineering
                   PGP public key available on request
            My thoughts and opinions represent no one but myself
---------------------------------------------------------------------------


* Re: Athlon and fast_page_copy: What's it worth ? :)
  2001-05-05  5:03             ` Athlon and fast_page_copy: What's it worth ? :) Seth Goldberg
                                 ` (2 preceding siblings ...)
  2001-05-05 14:19               ` Jonathan Morton
@ 2001-05-08 21:16               ` Arjan van de Ven
  3 siblings, 0 replies; 26+ messages in thread
From: Arjan van de Ven @ 2001-05-08 21:16 UTC (permalink / raw)
  To: Seth Goldberg; +Cc: linux-kernel

In article <3AF389BD.81F9B398@home.com> you wrote:
> Hi,

>   Before I go any further with this investigation, I'd like to get an
> idea
> of how much of a performance improvement the K7 fast_page_copy will give
> me.
> Can someone suggest the best benchmark to test the speed of this
> routine?

http://www.fenrus.demon.nl/athlon.c

is a userspace benchmark of the current code vs C etc


* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-04 21:09 ` Alan Cox
  2001-05-04 22:26   ` Aaron Tiensivu
@ 2001-05-09  2:11   ` Tom Leete
  2001-05-09  8:49     ` Alan Cox
  1 sibling, 1 reply; 26+ messages in thread
From: Tom Leete @ 2001-05-09  2:11 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
> 
> > the memory copy in the fast_page_copy routine.  The machine then
> > proceeded
> > not to stop at my panic, but I got my "normal" oopses.  I then had an
> 
> Ok
> 
> > idea and removed all the prefetch instructions from the beginning of the
> > routine and tried the resulting kernel.  I now have no crashes.
> > What could this mean?
> 
> I think it has to mean a hardware problem.

I don't think so, reasons below
 
> What still stands out is that exactly _zero_ people have reported the same
> problem with non VIA chipset Athlons.

Not any more :-(

Hi Alan,

IIRC this thread is about boot going catatonic right after unloading
__initmem. I'm seeing that in 2.4.5-pre1 with Athlon stepping 2, AMD 751,
MS-6195 mobo, 128M. The machine is fine with kernels up through 2.4.4-pre3,
and still works with them.

On that gear, there is no crash. The keyboard and display are alive and
SysRq works. I have copied the stack trace for pid=1 and the processor dump.
I'm short of time but I have a kind typist electrifying the trace, and I'll
try to generate something ksymoops can digest.

Here is what a quick eyeballing of System.map shows.

The code is at the end of init/main.c:init(). The processor dump shows
init() halted in default_idle() from the sequence L6 -> init -> cpu_idle.

Trace of pid 1 shows it stuck in D state. The last addresses listed are from
filemap_nopage -> do_execve -> do_no_page -> handle_mm_fault -> __pmd_alloc
-> rwsem_down_write_failed -> stext_lock -> system_call. That looks fishy.

Earlier, it looks like handle_mm_fault is being triggered from
fast_clear_page.

I'll post the full dump as soon as I have it.

Btw, the above happens with both gcc-2.95.3 and gcc-3.0-[20010423] compiled
kernels.

Cheers,
Tom

-- 
The Daemons lurk and are dumb. -- Emerson

* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-09  2:11   ` Tom Leete
@ 2001-05-09  8:49     ` Alan Cox
  2001-05-09 11:38       ` Tom Leete
  0 siblings, 1 reply; 26+ messages in thread
From: Alan Cox @ 2001-05-09  8:49 UTC (permalink / raw)
  To: Tom Leete; +Cc: Alan Cox, linux-kernel

> > What still stands out is that exactly _zero_ people have reported the same
> > problem with non VIA chipset Athlons.
> 
> Not any more :-(

Still the same

> IIRC this thread is about boot going catatonic right after unloading
> __initmem.

Nope. It's about memory corruption. Your bug sounds very different.

> Earlier, it looks like handle_mm_fault is being triggered from
> fast_clear_page.

That would be messy. The other way around is sane but not that way

* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-09  8:49     ` Alan Cox
@ 2001-05-09 11:38       ` Tom Leete
  2001-05-09 12:38         ` Alan Cox
  0 siblings, 1 reply; 26+ messages in thread
From: Tom Leete @ 2001-05-09 11:38 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
> 
> 
> > IIRC this thread is about boot going catatonic right after unloading
> > __initmem.
> 
> Nope. It's about memory corruption. Your bug sounds very different.
> 
> > Earlier, it looks like handle_mm_fault is being triggered from
> > fast_clear_page.
> 
> That would be messy. The other way around is sane but not that way

Indeed, I was confused. Looks like ide-dma is getting goofy somehow.

Here is a decoded trace. Typos are likely. If the problem is not obvious to
anyone, I'll switch around my serial console setup to get some better info.

Warning (Oops_read): Code line not seen, dumping what data is available

Trace; ffff037f <END_OF_CODE+3fcfb2ab/????>
Trace; ffff0000 <END_OF_CODE+3fcfaf2c/????>
Trace; ffff0000 <END_OF_CODE+3fcfaf2c/????>
Trace; ffff0720 <END_OF_CODE+3fcfb64c/????>
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b9aca <ide_dmaproc+11a/210>
Trace; c01b9380 <ide_dma_intr+0/b0>
Trace; c01b9940 <dma_timer_expiry+0/70>
Trace; c01bd457 <do_rw_disk+257/300>
Trace; c01b4d2a <ide_wait_stat+7a/e0>
Trace; c01b5010 <start_request+160/210>
Trace; c01b51ff <ide_do_request+10f/340>
Trace; c01b3430 <ali_cleanup+10/70>
Trace; c0132e45 <__wait_on_buffer+75/90>
Trace; c0134026 <bread+16/70>
Trace; c018665c <__make_request+43c/6f0>
Trace; c01866ce <__make_request+4ae/6f0>
Trace; c01866e6 <__make_request+4c6/6f0>
Trace; c018665c <__make_request+43c/6f0>
Trace; c01866ce <__make_request+4ae/6f0>
Trace; c01866e6 <__make_request+4c6/6f0>
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b9aca <ide_dmaproc+11a/210>
Trace; c01b9380 <ide_dma_intr+0/b0>
Trace; c01b9940 <dma_timer_expiry+0/70>
Trace; c01bd457 <do_rw_disk+257/300>
Trace; c01b4d2a <ide_wait_stat+7a/e0>
Trace; c01b5010 <start_request+160/210>
Trace; c01b51ff <ide_do_request+10f/340>
Trace; c01b546e <do_ide_request+e/20>
Trace; c01134d0 <schedule+200/3e0>
Trace; c012bd2c <free_shortage+1c/c0>
Trace; c012cbd7 <__alloc_pages+87/300>
Trace; c022c12f <fast_copy_page+f/90>
Trace; c0125f4d <filemap_nopage+2bd/420>
Trace; c0125f58 <filemap_nopage+2c8/420>
Trace; c022c0ca <fast_clear_page+a/60>
Trace; c0122b7d <handle_mm_fault+cd/e0>
Trace; c012bd2c <free_shortage+1c/c0>
Trace; c0122a15 <do_no_page+45/e0>
Trace; c022c0ca <fast_clear_page+a/60>
Trace; c0222b7d <packet_ioctl+17d/350>
Trace; c0225476 <do_xprt_transmit+46/3d0>
Trace; c02254c3 <do_xprt_transmit+93/3d0>
Trace; c0112ba9 <do_page_fault+2a9/450>
Trace; c01239cf <do_munmap+5f/280>
Trace; c0112900 <do_page_fault+0/450>
Trace; c0106ddc <error_code+34/3c>
Trace; c022be20 <clear_user+30/40>
Trace; c0112900 <do_page_fault+0/450>
Trace; c0134026 <bread+16/70>
Trace; c018665c <__make_request+43c/6f0>
Trace; c01866ce <__make_request+4ae/6f0>
Trace; c01866e6 <__make_request+4c6/6f0>
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b9aca <ide_dmaproc+11a/210>
Trace; c01b9380 <ide_dma_intr+0/b0>
Trace; c01b9940 <dma_timer_expiry+0/70>
Trace; c01bd457 <do_rw_disk+257/300>
Trace; c01b4d2a <ide_wait_stat+7a/e0>
Trace; c01b5010 <start_request+160/210>
Trace; c01b51ff <ide_do_request+10f/340>
Trace; c01b546e <do_ide_request+e/20>
Trace; c01134d0 <schedule+200/3e0>
Trace; c012be03 <inactive_shortage+33/90>
Trace; c0124d05 <__find_get_page+35/80>
Trace; c013ae1b <search_binary_handler+17b/190>
Trace; c0125d44 <filemap_nopage+b4/420>
Trace; c013af79 <do_execve+149/200>
Trace; c0122a15 <do_no_page+45/e0>
Trace; c012267d <vmtruncate+12d/160>
Trace; c0112ba9 <do_page_fault+2a9/450>
Trace; c022d075 <rwsem_down_write_failed+65/140>
Trace; c022f396 <stext_lock+37a/16bc>
Trace; c0106cbf <system_call+33/38>


1 warning issued.  Results may not be reliable.
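
A decoded trace like the one above can be scanned mechanically for symbols
that show up more than once, which is one rough way to spot possible
re-entry. The following is only a minimal sketch, not part of ksymoops: the
helper names parse_trace and repeated_symbols are hypothetical, the regular
expression assumes the "Trace; addr <symbol+offset/size>" layout shown above,
and the sample is a handful of lines copied from the trace. A repeated symbol
is a hint, not proof, since stale addresses can linger on the stack.

```python
import re
from collections import Counter

# Matches ksymoops lines such as: Trace; c01b956a <ide_build_dmatable+2a/120>
# The size field may be "????" when ksymoops cannot resolve the symbol.
TRACE_RE = re.compile(
    r"^Trace;\s+([0-9a-f]+)\s+<([^+>]+)\+([0-9a-f]+)/([0-9a-f?]+)>"
)


def parse_trace(text):
    """Return (address, symbol, offset) tuples in the order they are listed."""
    entries = []
    for line in text.splitlines():
        m = TRACE_RE.match(line.strip())
        if m:
            addr, sym, off, _size = m.groups()
            entries.append((int(addr, 16), sym, int(off, 16)))
    return entries


def repeated_symbols(entries, min_count=2):
    """Map each symbol seen at least min_count times to its count."""
    counts = Counter(sym for _addr, sym, _off in entries)
    return {sym: n for sym, n in counts.items() if n >= min_count}


# A few lines copied from the trace above, used purely as sample input.
SAMPLE = """\
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b51ff <ide_do_request+10f/340>
Trace; c0134026 <bread+16/70>
Trace; c01b956a <ide_build_dmatable+2a/120>
Trace; c01b3fb5 <ide_set_handler+55/60>
Trace; c01b51ff <ide_do_request+10f/340>
"""

entries = parse_trace(SAMPLE)
repeats = repeated_symbols(entries)
for sym, n in sorted(repeats.items()):
    print(f"{sym}: {n}")
```

Feeding the full trace in place of SAMPLE would list every symbol that
appears more than once, the ide_* entries among them.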

Cheers,
Tom

-- 
The Daemons lurk and are dumb. -- Emerson

* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-09 11:38       ` Tom Leete
@ 2001-05-09 12:38         ` Alan Cox
  2001-05-09 13:02           ` Tom Leete
  0 siblings, 1 reply; 26+ messages in thread
From: Alan Cox @ 2001-05-09 12:38 UTC (permalink / raw)
  To: Tom Leete; +Cc: Alan Cox, linux-kernel

> Trace; ffff037f <END_OF_CODE+3fcfb2ab/????>
> Trace; ffff0000 <END_OF_CODE+3fcfaf2c/????>
> Trace; ffff0000 <END_OF_CODE+3fcfaf2c/????>
> Trace; ffff0720 <END_OF_CODE+3fcfb64c/????>

Let's ignore the crap above.

> Trace; c01b956a <ide_build_dmatable+2a/120>
> Trace; c01b3fb5 <ide_set_handler+55/60>
> Trace; c01b9aca <ide_dmaproc+11a/210>
> Trace; c01b9380 <ide_dma_intr+0/b0>
> Trace; c01b9940 <dma_timer_expiry+0/70>
> Trace; c01bd457 <do_rw_disk+257/300>
> Trace; c01b4d2a <ide_wait_stat+7a/e0>
> Trace; c01b5010 <start_request+160/210>
> Trace; c01b51ff <ide_do_request+10f/340>

We seem to be several layers into recursive use of the ide driver, which
shouldn't happen. In fact, if these are the same interface, the second
dmatable build would leave HWIF(drive)->sg_table wrong.

> Trace; c01866ce <__make_request+4ae/6f0>
> Trace; c01866e6 <__make_request+4c6/6f0>
> Trace; c01b956a <ide_build_dmatable+2a/120>
> Trace; c01b3fb5 <ide_set_handler+55/60>
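
The sg_table concern above can be illustrated with a toy model. This is only
a sketch under stated assumptions, not the kernel's ide-dma code: the Hwif
class and the build_dmatable and start_dma functions below are hypothetical
stand-ins, and the only property modelled is that one interface owns one
scatter-gather table, so a second build before the first transfer starts
replaces the first request's entries.

```python
class Hwif:
    """Toy stand-in for one IDE interface owning a single sg table."""

    def __init__(self):
        self.sg_table = []


def build_dmatable(hwif, segments):
    # Overwrites the one table shared by every request on this interface.
    hwif.sg_table = list(segments)


def start_dma(hwif):
    # The transfer uses whatever the table holds at the moment it starts.
    return list(hwif.sg_table)


hwif = Hwif()
first = [("buf_a", 4096)]   # segments for the outer request (made-up values)
second = [("buf_b", 512)]   # segments for the nested request (made-up values)

build_dmatable(hwif, first)
build_dmatable(hwif, second)  # nested build on the same interface

# The outer request now starts its transfer with the nested request's table.
transferred_for_first = start_dma(hwif)
clobbered = transferred_for_first != first
print(clobbered)
```

In this model the outer request ends up transferring the nested request's
segments, which is the kind of mismatch that would follow if the two builds
in the trace really were on the same interface.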

* Re: REVISED: Experimentation with Athlon and fast_page_copy
  2001-05-09 12:38         ` Alan Cox
@ 2001-05-09 13:02           ` Tom Leete
  0 siblings, 0 replies; 26+ messages in thread
From: Tom Leete @ 2001-05-09 13:02 UTC (permalink / raw)
  To: Alan Cox; +Cc: linux-kernel

Alan Cox wrote:
> 
> > Trace; c01b956a <ide_build_dmatable+2a/120>
> > Trace; c01b3fb5 <ide_set_handler+55/60>
> > Trace; c01b9aca <ide_dmaproc+11a/210>
> > Trace; c01b9380 <ide_dma_intr+0/b0>
> > Trace; c01b9940 <dma_timer_expiry+0/70>
> > Trace; c01bd457 <do_rw_disk+257/300>
> > Trace; c01b4d2a <ide_wait_stat+7a/e0>
> > Trace; c01b5010 <start_request+160/210>
> > Trace; c01b51ff <ide_do_request+10f/340>
> 
> We seem to be several layers into recursive use of the ide driver - which
> shouldnt happen. In fact if these are the same interface the second dmatable
> build would leave HWIF(drive)->sg_table wrong.
> 
> > Trace; c01866ce <__make_request+4ae/6f0>
> > Trace; c01866e6 <__make_request+4c6/6f0>
> > Trace; c01b956a <ide_build_dmatable+2a/120>
> > Trace; c01b3fb5 <ide_set_handler+55/60>

I think it smells like a configuration problem: I have a pair of ATAPI
drives on the second IDE interface, which I run with SCSI emulation. I'll
see if I can get a better look, with arguments to the calls.

Thanks,
Tom

-- 
The Daemons lurk and are dumb. -- Emerson

end of thread, other threads:[~2001-05-09 13:04 UTC | newest]

Thread overview: 26+ messages
2001-05-04 17:22 REVISED: Experimentation with Athlon and fast_page_copy Seth Goldberg
2001-05-04 19:48 ` Brian Gerst
2001-05-04 21:09 ` Alan Cox
2001-05-04 22:26   ` Aaron Tiensivu
2001-05-04 18:10     ` Bobby D. Bryant
2001-05-05  6:43       ` John R Lenton
2001-05-05  7:20         ` Alan Cox
2001-05-07  1:26           ` John R Lenton
2001-05-07  1:30             ` Jeremy
2001-05-05  0:26     ` Joseph Carter
2001-05-05  3:51       ` Chris Wedgwood
2001-05-05  4:08         ` Seth Goldberg
     [not found]           ` <20010505163204.A29622@metastasis.f00f.org>
2001-05-05  5:03             ` Athlon and fast_page_copy: What's it worth ? :) Seth Goldberg
2001-05-05  6:20               ` Mark Hahn
2001-05-05  9:15                 ` Tom Leete
2001-05-05  7:17               ` Alan Cox
2001-05-05 14:19               ` Jonathan Morton
2001-05-05 14:41                 ` Alan Cox
2001-05-05 15:17                 ` Jonathan Morton
2001-05-08 21:16               ` Arjan van de Ven
2001-05-05  5:45         ` REVISED: Experimentation with Athlon and fast_page_copy Joseph Carter
2001-05-09  2:11   ` Tom Leete
2001-05-09  8:49     ` Alan Cox
2001-05-09 11:38       ` Tom Leete
2001-05-09 12:38         ` Alan Cox
2001-05-09 13:02           ` Tom Leete
