* Blue G3 and machine check
@ 1999-03-14 13:35 Benjamin Herrenschmidt
1999-03-15 16:42 ` Ryuichi Oikawa
1999-03-24 9:30 ` Gabriel Paubert
0 siblings, 2 replies; 33+ messages in thread
From: Benjamin Herrenschmidt @ 1999-03-14 13:35 UTC
To: linuxppc-dev, Paul Mackerras
While reading the macbsd mailing list, I've seen that the Blue G3 causes a
machine check exception while probing a non-existent PCI slot. I don't
have time to look into this and I don't have one of those machines, but
if anyone wants to have a try at fixing this...
(With luck, the second PCI bridge will have been properly set up by OF or
BootX and this fix could be enough to get the machine to boot further
than it's doing now).
--
E-Mail: <mailto:bh40@calva.net>
BenH. Web : <http://calvaweb.calvacom.fr/bh40/>
[[ This message was sent via the linuxppc-dev mailing list. Replies are ]]
[[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]]
[[ reply is of general interest. Please check http://lists.linuxppc.org/ ]]
[[ and http://www.linuxppc.org/ for useful information before posting. ]]
* Re: Blue G3 and machine check
From: Ryuichi Oikawa @ 1999-03-15 16:42 UTC
To: bh40; +Cc: linuxppc-dev, paulus

> While reading the macbsd mailing list, I've seen that the Blue G3 causes a
> machine check exception while probing a non-existent PCI slot. I don't
> have time to look into this and I don't have one of those machines, but
> if anyone wants to have a try at fixing this...
>
> (With luck, the second PCI bridge will have been properly set up by OF or
> BootX and this fix could be enough to get the machine to boot further
> than it's doing now).

As far as I tested (kernel source 2.2.1 from samba and BootX 1.0.2b),
the kernel freezes within head.S before jumping to start_kernel. It seems
to completely freeze after the following instructions:

/*
 * Go back to running unmapped so we can load up new values
 * for SDR1 (hash table pointer) and the segment registers
 * and change to using our exception vectors.
 * On the 8xx, all we have to do is invalidate the TLB to clear
 * the old 8M byte TLB mappings and load the page table base register.
 */
#ifndef CONFIG_8xx
        lis     r6,_SDR1@ha
        lwz     r6,_SDR1@l(r6)
#else
        /* The right way to do this would be to track it down through
         * init's TSS like the context switch code does, but this is
         * easier......until someone changes init's static structures.
         */
        lis     r6, swapper_pg_dir@h
        tophys(r6,r6,0)
        ori     r6, r6, swapper_pg_dir@l
        mtspr   M_TWB, r6
#endif
        lis     r4,2f@h
        ori     r4,r4,2f@l
        tophys(r4,r4,r3)
        li      r3,MSR_KERNEL & ~(MSR_IR|MSR_DR)
        mtspr   SRR0,r4
        mtspr   SRR1,r3
        rfi

Next, I looked into the MacOS ROM file to see if there's a hint or
something, and found an OF boot script:

<CHRP-BOOT>
<COMPATIBLE>
iMac,1 PowerMac1,1 PowerBook1,1
</COMPATIBLE>
<DESCRIPTION>
MacROM for NewWorld.
</DESCRIPTION>
<BOOT-SCRIPT>
here >r
dev /
" model" active-package get-package-property
abort" can't find MODEL"
decode-string 2swap 2drop
" iMac,1" $= ?dup 0= if
  " compatible" active-package get-package-property
  abort" can't find COMPATIBLE"
  false >r
  begin
    dup while
    decode-string here over 2swap bounds ?do
      i c@ dup [char] A [char] Z between if h# 20 xor then c,
    loop
    2dup " powermac1,1" $= r> or >r
    2dup " powerbook1,1" $= r> or >r
    2drop
  repeat
  2drop r>
then
r> here - allot
0= abort" this image is not for this platform"

dev /openprom
0 0 " supports-bootinfo" property
device-end

" /chosen" find-package 0= abort" can't find '/chosen'" constant /chosen
" memory" /chosen get-package-property abort" memory??" decode-int constant xmem 2drop
" mmu" /chosen get-package-property abort" mmu??" decode-int constant xmmu 2drop

" AAPL,debug" " /" find-package 0= abort" can't find '/'" get-package-property
if false else 2drop true then ( debug? ) constant debug?

debug? if cr ." checking for RELEASE-LOAD-AREA" then
" release-load-area" $find 0= if 2drop false then ( xt|0 ) constant 'release-load-area
debug? if 'release-load-area if ." , found it" else ." , not found" then then

: do-translate " translate" xmmu $call-method ;
: do-map " map" xmmu $call-method ;
: do-unmap " unmap" xmmu $call-method ;
: claim-mem " claim" xmem $call-method ;
: release-mem " release" xmem $call-method ;
: claim-virt " claim" xmmu $call-method ;
: release-virt " release" xmmu $call-method ;

1000 constant pagesz
pagesz 1- constant pagesz-1
-1000 constant pagemask

h# 004000 constant elf-offset
h# 00CCE8 constant elf-size
elf-size pagesz-1 + pagemask and constant elf-pages
h# 010CE8 constant lzss-offset
h# 1CB7A2 constant lzss-size
lzss-size pagesz-1 + pagemask and constant lzss-pages
h# 1DC48A constant info-size
info-size pagesz-1 + pagemask and constant info-pages

0 value load-base-claim
0 value info-base
'release-load-area if
  load-base to info-base
else
  load-base info-pages 0 ['] claim-mem catch if 3drop 0 then to load-base-claim
  info-pages 1000 claim-virt to info-base
  load-base info-base info-pages 10 do-map
then

lzss-pages 400000 claim-mem constant rom-phys
lzss-pages 1000 claim-virt constant rom-virt
rom-phys rom-virt lzss-pages 10 do-map

elf-pages 1000 claim-mem constant elf-phys
elf-pages 1000 claim-virt constant elf-virt
elf-phys elf-virt elf-pages 10 do-map
info-base elf-offset + elf-virt elf-size move

debug? if cr ." elf-phys,elf-virt,elf-pages: " elf-phys u. ." , " elf-virt u. ." , " elf-pages u. then
debug? if cr ." copying compressed ROM image" then
rom-virt lzss-pages 0 fill
info-base lzss-offset + rom-virt lzss-size move

'release-load-area 0= if
  info-base info-pages do-unmap
  load-base-claim ?dup if info-pages release-mem then
then

debug? if cr ." MacOS-ROM phys,virt,size: " rom-phys u. ." , " rom-virt u. ." , " lzss-size u. then
debug? if cr ." finding/creating '/rom/macos' package" then
device-end 0 to my-self
" /rom" find-device
" macos" ['] find-device catch if 2drop new-device " macos" device-name finish-device then
" /rom/macos" find-device
debug? if cr ." creating 'AAPL,toolbox-image,lzss' property" then
rom-virt encode-int lzss-size encode-int encode+ " AAPL,toolbox-image,lzss" property
device-end

debug? if cr ." copying MacOS.elf to load-base" then
'release-load-area if
  load-base elf-pages + 'release-load-area execute
else
  load-base elf-pages 0 claim-mem
  load-base dup elf-pages 0 do-map
then
elf-virt load-base elf-size move
elf-virt elf-pages do-unmap
elf-virt elf-pages release-virt
elf-phys elf-pages release-mem

debug? if cr ." init-program" then init-program
debug? if cr ." .registers" .registers then
debug? if cr ." go" cr then go
cr ." end of BOOT-SCRIPT"
</BOOT-SCRIPT>
</CHRP-BOOT>
...MacOS icon bitmap and ELF header follows...

The Blue G3's OF contains an ELF loader package; is this the loading
script? Though I don't understand OF/Forth very well, this may be useful
for prom_init or somewhere.

I still can't reach the entry point. What should I do next?

Thanks in advance,
Ryuichi Oikawa
roikawa@rr.iij4u.or.jp
http://www.rr.iij4u.or.jp/~roikawa
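
The size arithmetic in the boot script above can be checked by hand. The following is a minimal C sketch, not part of the original message, assuming the script's bare numbers are hexadecimal (which the "-1000 constant pagemask" line suggests, since -0x1000 is a 4 KB page mask). The macro and function names are made up for illustration; the constants are copied from the script.

```c
#include <stdint.h>

/* Constants copied from the boot script, read as hexadecimal. */
#define PAGESZ      0x1000u
#define ELF_OFFSET  0x004000u
#define ELF_SIZE    0x00CCE8u
#define LZSS_OFFSET 0x010CE8u
#define LZSS_SIZE   0x1CB7A2u
#define INFO_SIZE   0x1DC48Au

/* C equivalent of the Forth phrase "size pagesz-1 + pagemask and":
 * round a byte count up to the next multiple of the page size. */
uint32_t round_up_to_page(uint32_t size)
{
    return (size + (PAGESZ - 1u)) & ~(PAGESZ - 1u);
}
```

With these values, elf-pages comes out as 0xD000, lzss-pages as 0x1CC000 and info-pages as 0x1DD000 (byte counts, despite the names). The offsets are also self-consistent: elf-offset + elf-size equals lzss-offset, and lzss-offset + lzss-size equals info-size, so the image appears to be laid out as a header, then the ELF, then the compressed ROM.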
* Re: Blue G3 and machine check
From: Geert Uytterhoeven @ 1999-03-15 17:09 UTC
To: Ryuichi Oikawa; +Cc: bh40, linuxppc-dev, paulus

On Tue, 16 Mar 1999, Ryuichi Oikawa wrote:
> > While reading the macbsd mailing list, I've seen that the Blue G3 causes a
> > machine check exception while probing a non-existent PCI slot. I don't
> > have time to look into this and I don't have one of those machines, but
> > if anyone wants to have a try at fixing this...
> >
> > (With luck, the second PCI bridge will have been properly set up by OF or
> > BootX and this fix could be enough to get the machine to boot further
> > than it's doing now).
> As far as I tested (kernel source 2.2.1 from samba and BootX 1.0.2b),
> the kernel freezes within head.S before jumping to start_kernel. It seems
> to completely freeze after the following instructions:

[...]

Hmm, very similar to the place where I suspect my CHRP box crashes. The
change was made sometime in January, but I still haven't found time to
track it down :-(

Greetings,

Geert (ashamed he's still running 2.2.0)

--
Geert Uytterhoeven Geert.Uytterhoeven@cs.kuleuven.ac.be
Wavelets, Linux/{m68k~Amiga,PPC~CHRP} http://www.cs.kuleuven.ac.be/~geert/
Department of Computer Science -- Katholieke Universiteit Leuven -- Belgium
* Re: Blue G3 and machine check
From: Gabriel Paubert @ 1999-03-24 9:30 UTC
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Paul Mackerras

On Sun, 14 Mar 1999, Benjamin Herrenschmidt wrote:
>
> While reading the macbsd mailing list, I've seen that the Blue G3 causes a
> machine check exception while probing a non-existent PCI slot. I don't
> have time to look into this and I don't have one of those machines, but
> if anyone wants to have a try at fixing this...

In the case of a non-present device, the MPC106 will terminate the cycle
with a master abort and assert both the MCP (machine check) and TEA
(transfer error) pins.

There are 2 possible quick fixes for this (I can't test them but they are
very likely to work):

- disable the MCP and TEA pins in the MPC106: respectively bits 11 (0x800)
  and 10 (0x400) in the PICR1 configuration space register at offset 0xA8.
- finer control (affects only master abort cycles) by clearing bit 1 of
  ErrEnR1 at offset 0xc0.

I don't like these solutions, however, because they cure the symptom
rather than the cause (and I like to have HW error reports). AFAIR there
is code on the Alpha to handle machine checks when accessing PCI config
space; maybe it will give some ideas for a truly correct fix.

Note that you probably only need to protect the PCI config space accesses,
for example by adding a handler to the actual accesses with code looking
like this (memop may be l[bhw]z, st[bhw], l[hw]brx, st[hw]brx):

        sync
        isync
1:      memop reg,addr
2:      sync
3:      isync
4:

(I think that at least one of the isyncs is probably unnecessary, and
perhaps both, but I'd rather choose the safe solution for this).

in the exception table (replace 5f by 4b for the stores):
        .long 1b,5f
        .long 2b,5f
        .long 3b,5f

and in the fixup section (for the loads only):
5:      li reg,-1
        b 4b

It would be good if somebody could test that this works (I don't have any
Mac, but I want to enable HW error reports through machine checks on my
boards in the future).

Note that you might want to encapsulate this in macros with a name like
checked_{ld,st}_{8,be16,be32,le16,le32} or in separate subroutines to
limit code expansion, since it's not performance critical.

Regards,
Gabriel.
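
The two quick fixes described above boil down to clearing a few bits in MPC106 configuration registers. The following is a minimal C sketch of just the bit manipulation, not tested on hardware. The register names, offsets and bit positions are taken from the message; the function names are hypothetical, and the 8-bit width used for ErrEnR1 is an assumption.

```c
#include <stdint.h>

/* Offsets and bits as given in the message above. */
#define MPC106_PICR1            0xA8u
#define MPC106_ERRENR1          0xC0u
#define PICR1_MCP_EN            (1u << 11)   /* 0x800: machine check pin  */
#define PICR1_TEA_EN            (1u << 10)   /* 0x400: transfer error pin */
#define ERRENR1_MASTER_ABORT_EN (1u << 1)

/* First fix: new PICR1 value with both MCP and TEA reporting disabled. */
uint32_t picr1_disable_mcp_tea(uint32_t picr1)
{
    return picr1 & ~(PICR1_MCP_EN | PICR1_TEA_EN);
}

/* Second fix: new ErrEnR1 value with only master-abort reporting disabled
 * (register assumed to be 8 bits wide in this sketch). */
uint8_t errenr1_disable_master_abort(uint8_t errenr1)
{
    return (uint8_t)(errenr1 & ~ERRENR1_MASTER_ABORT_EN);
}
```

A caller would read the register at the given offset from the bridge's own config space, pass the value through one of these helpers, and write the result back. The second variant leaves all other error reporting untouched.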
* Re: Blue G3 and machine check
From: Paul Mackerras @ 1999-03-24 23:12 UTC
To: paubert; +Cc: bh40, linuxppc-dev

Gabriel Paubert <paubert@iram.es> wrote:

> Note that you probably only need to protect the PCI config space accesses,

If we are getting machine checks on config space accesses, then it is
truly borken. Config space accesses in PCI are supposed to return ~0
if there is no device there, precisely so that you can safely probe to
see whether the device is there.

Did the original poster say whether the machine checks were on config
space accesses or I/O or memory space accesses? It's common enough for
drivers written for intel linux to go probing I/O ports to try to find
devices to talk to.

> for example by adding a handler to the actual accesses with code looking
> like (memop may be l[bhw]z, st[bhw], l[hw]brx, st[hw]brx):
>
>         sync
>         isync
> 1:      memop reg,addr
> 2:      sync
> 3:      isync
> 4:
>
> (I think that at least one of the isync is probably unnecessary and
> perhaps both, but I'd rather choose the safe solution for this).
> in the exception table (replace 5f by 4b for the stores):
>         .long 1b,5f
>         .long 2b,5f
>         .long 3b,5f
>
> and in the fixup section (for the loads only):
> 5:      li reg,-1
>         b 4b

I think this is not sufficient because you are not generally
guaranteed anything about the state of the registers after a machine
check. AFAICS, we would have to save the contents of all the
registers (at least all of the callee-saved ones) and restore them
from memory if a machine check occurs. We could use setjmp/longjmp to
do this. And yes, we do need the sync after the access, but I don't
see why we would need the isync.

Paul.
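
The setjmp/longjmp idea can be illustrated in user space. The following is a minimal C sketch, not kernel code: the machine check is simulated by an ordinary function call, and every name in it is hypothetical. It shows the shape of the approach, where the register context is saved before the risky access and restored from memory if the access faults, with ~0 returned as for an empty slot.

```c
#include <setjmp.h>
#include <stdint.h>

static jmp_buf probe_env;          /* register context saved before the access */
static volatile int probe_active;  /* nonzero while a guarded access is running */

/* Stand-in for the machine check handler: if a guarded access is in
 * progress, restore the saved context instead of panicking. */
void simulated_machine_check(void)
{
    if (probe_active)
        longjmp(probe_env, 1);
}

/* Hypothetical raw accessor: "faults" when no device answers. */
uint32_t raw_config_read(int device_present)
{
    if (!device_present)
        simulated_machine_check();
    return 0x12345678u;            /* made-up register contents */
}

/* Guarded read: returns ~0 when the access took a machine check. */
uint32_t checked_config_read(int device_present)
{
    uint32_t val;

    probe_active = 1;
    if (setjmp(probe_env) != 0) {  /* reached again via longjmp */
        probe_active = 0;
        return 0xFFFFFFFFu;
    }
    val = raw_config_read(device_present);
    probe_active = 0;
    return val;
}
```

In a real kernel the longjmp would be issued from the machine check exception path, and the saved context would need to cover whatever state the exception can clobber; this sketch only demonstrates the control flow.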
* Re: Blue G3 and machine check
From: Gabriel Paubert @ 1999-03-25 11:20 UTC
To: Paul.Mackerras; +Cc: bh40, linuxppc-dev

On Thu, 25 Mar 1999, Paul Mackerras wrote:

> Gabriel Paubert <paubert@iram.es> wrote:
>
> > Note that you probably only need to protect the PCI config space accesses,
>
> If we are getting machine checks on config space accesses, then it is
> truly borken. Config space accesses in PCI are supposed to return ~0
> if there is no device there, precisely so that you can safely probe to
> see whether the device is there.

No, the PCI connector also has a presence detect pin which should be used
for this. The PCI specification is very clear that the only cycles which
are expected to end with a Master Abort are the special cycles.
Configuration cycles are like any other cycles, and a Master Abort may
result in a device pulling the SERR line and taking exceptions in this
case.

> Did the original poster say whether the machine checks were on config
> space accesses or I/O or memory space accesses? It's common enough for
> drivers written for intel linux to go probing I/O ports to try to find
> devices to talk to.

They were on PCI config space IIRC.

> I think this is not sufficient because you are not generally
> guaranteed anything about the state of the registers after a machine
> check. AFAICS, we would have to save the contents of all the
> registers (at least all of the callee-saved ones) and restore them
> from memory if a machine check occurs. We could use setjmp/longjmp to
> do this. And yes, we do need the sync after the access, but I don't
> see why we would need the isync.

No, I don't think we need the isync after either, and perhaps not before,
since sync guarantees "that no subsequent instructions appear to be
initiated until the sync instruction completes".

There is also a recoverable flag in the MSR and I don't know what its
state was in this case. But the worst is that you are not guaranteed
anything about SRR0, so an in-memory per-processor flag saying 'hey, I
might actually get a machine check' might be required.

For the registers, I can't believe that after a sync/isync sequence, any
implementation will ever randomly modify any other register than the
destination for the loads (and the address register for update form
instructions). And yes, I just reread the following:

"Note that if the error is caused by the memory subsystem, incorrect data
could be loaded into the processor and register contents could be
corrupted regardless of whether the exception is considered recoverable by
the SRR1 bit corresponding to MSR[RI]."

But I interpret it as the registers modified by the instruction and the
potential use of the corrupted data by subsequent instructions, which
should be bounded by the following sync; if you interpret it very
liberally all registers could be corrupted, not only GPRs (including the
stack pointer) but why not also LR, CTR, XER, CR, FPRs, FPSCR, BATs,
segments, timebase, decrementer, SDR1, SPRGn, HID0 and others.

Gabriel.
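
The per-processor flag mentioned above avoids relying on SRR0: the handler decides whether a machine check was expected by looking at memory, not at the faulting address. The following is a minimal user-space C sketch of that bookkeeping, with the machine check simulated by a direct call. All names, the CPU count and the returned ID value are made up for illustration.

```c
#include <stdint.h>

#define SKETCH_NR_CPUS 2           /* arbitrary for this sketch */

struct mc_probe_state {
    volatile int expected;         /* set around a guarded access        */
    volatile int taken;            /* set by the handler if one happened */
};

struct mc_probe_state mc_probe[SKETCH_NR_CPUS];

/* Called from the (simulated) machine check vector.
 * Returns 1 if the check was expected and absorbed, 0 if it is genuine. */
int machine_check_hook(int cpu)
{
    if (!mc_probe[cpu].expected)
        return 0;                  /* not probing: report a real HW error */
    mc_probe[cpu].taken = 1;
    return 1;
}

/* Guarded probe: a real version would do "sync; load; sync" between
 * setting and clearing the flag. */
uint32_t probe_with_flag(int cpu, int device_present)
{
    uint32_t val = 0xFFFFFFFFu;

    mc_probe[cpu].taken = 0;
    mc_probe[cpu].expected = 1;
    if (device_present)
        val = 0x00011057u;         /* made-up ID of a present device */
    else
        machine_check_hook(cpu);   /* simulated master abort */
    mc_probe[cpu].expected = 0;

    return mc_probe[cpu].taken ? 0xFFFFFFFFu : val;
}
```

Because the flag is cleared outside the guarded window, a machine check at any other time still reaches the normal error path, which keeps the hardware error reports intact.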
* Apple Job Posting and Good News for LinuxPPC developers
From: Kevin B. Hendricks @ 1999-03-25 16:46 UTC
To: linuxppc-dev

Hi,

Just in case anyone missed this. Apple is hiring someone whose job
description will include making it easier for people to get enough
hardware info to get Linux PPC to run on Apple's new hardware.

We should get one of our own into this role and get a leg up on helping
Apple open up its hardware specs for Linux PPC.

What do you think?

Kevin

Title: Technology Manager-Linux
Location: Santa Clara Valley, CA

Technology Manager - Linux Support. Responsible for making Linux
developers who support Apple's hardware more successful. Strengthens
relationships with established developers such as TerraSoft and others.
Works independently and in close collaboration with sales and marketing
and fellow Technology and Partnership Managers. Builds strong internal and
external relationships in order to effectively deliver Apple's message to
Linux developers and champion their needs and issues within Apple. Works
to help Linux developers get seeded with new Apple hardware to guarantee
support for our new CPUs. Maintains high level of technical and market
expertise. Communicates Apple product and marketing messages to assigned
developers and the developer community at large in both one-on-one and
one-to-many settings.

Position requires a very high level of experience related to all flavors
of Linux and UNIX, and requires excellent relationship skills. Expertise
in the following is a must: Macintosh hardware, Linux (LinuxPPC, MkLinux,
etc) and UNIX. Persuasiveness and the ability to deliver presentations to
audiences of various sizes are required. The successful candidate will be
results driven and possess a world-class attention to detail, remarkable
follow-through, and an ability to totally focus on customer service.
Normally requires BA/BS in scientific, engineering discipline or business,
marketing, or communications plus 8-9 yr. exp. (or MA/MS/MBA plus 6-7 yr.)
or equivalent experience.

For consideration on this position send your resume indicating appropriate
Attn code to:

Apple Computer, Inc.
1 Infinite Loop, MS:38-3CE
Cupertino, CA 95014
Attn:CG1003963NET
FAX: 408-974-5691
email: applejobs@apple.com
(Please email resume by pasting resume into email message document area -
do not attach enclosures)

Principals only. No phone calls please. Apple Computer has a corporate
commitment to the principle of diversity. In that spirit, we welcome
applications from all individuals.
* Re: Apple Job Posting and Good News for LinuxPPC developers
From: David Edelsohn @ 1999-03-25 19:12 UTC
To: linuxppc-dev

>>>>> "Kevin B Hendricks" writes:

Kevin> Just in case anyone missed this. Apple is hiring someone whose job description
Kevin> will include making it easier for people to get enough hardware info to get
Kevin> Linux PPC to run on Apple's new hardware.

Kevin> We should get one of our own into this role and get a leg up on helping apple
Kevin> open up its hardware specs for Linux PPC

My group at IBM Research also is looking for someone with operating
system development experience to join our project:

http://www.research.ibm.com/kitchawan/

The project is targeted at multiprocessor systems utilizing 64-bit processors
including PowerPC64. An additional aspect is compatibility with glibc.

David
===============================================================================
David Edelsohn                                      T.J. Watson Research Center
dje@watson.ibm.com                                  P.O. Box 218
+1 914 945 4364 (TL 862)                            Yorktown Heights, NY 10598
* Re: Apple Job Posting and Good News for LinuxPPC developers
From: Gabriel Paubert @ 1999-03-26 11:31 UTC
To: David Edelsohn; +Cc: linuxppc-dev

> My group at IBM Research also is looking for someone with operating
> system development experience to join our project:
>
> http://www.research.ibm.com/kitchawan/
>
> The project is targeted at multiprocessor systems utilizing 64-bit processors
> including PowerPC64. An additional aspect is compatibility with glibc.

Do 64 bit PPC processors that can be bought as chips (not in a system)
exist (i.e., can you buy a Power3 in a BGA package and the chipset that
goes with it)?

Not many people have seen a 620 AFAICT :-( and the G4 I've seen announced
by Motorola is not what I expected: it's a super 750 with Altivec + 604
FPU (single cycle double precision multiplier) + SMP capabilities.

But at least I expected:

- a 64 bit version (means >32 address pins, perhaps ~40 or so, 64 is
  obviously overkill right now and I don't ask for it)

- more superscalar (not 2+branch, but at least 4 way)

- longer in flight instruction queue

- 2 FPU: not sure, but divides seem to kill current PPC, one
  unit which can do all instructions (including sqrt) + one mult/add only
  would be great. OTOH, on FFT benchmarks which interest me also (not a
  single div), PPC are excellent.

- 2 LSU to feed all these units: hey, Pentium and PPro can do 2 memory
  accesses per clock since they came out. They need it because they
  require many load/store to compensate for the small number of registers
  but I'd expect it to be beneficial even on PPC with large and fast
  backside L2 caches. Power2 has had 2 LSU since the beginning AFAICT.

Gabriel.
* Re: Apple Job Posting and Good News for LinuxPPC developers
From: David Edelsohn @ 1999-03-26 16:13 UTC
To: Gabriel Paubert; +Cc: linuxppc-dev

>>>>> Gabriel Paubert writes:

Gabriel> Do 64 bit PPC processors that can be bought as chips (not in a system)
Gabriel> exist (i.e., can you buy a Power3 in a BGA package and the chipset that
Gabriel> goes with) ?

I doubt it. The Power3 comes from the server organization and is
intended for their use. IBM, however, does understand customer demand:
if there was "enough" market interest, I am sure that it would find a way
to sell it. The Power3 processor price is nowhere near a PPC750 or G4 and
does not include AltiVec.

David
* Re: Apple Job Posting and Good News for LinuxPPC developers
From: Guy Sotomayor @ 1999-03-27 6:27 UTC
To: dje; +Cc: paubert, linuxppc-dev

> Gabriel> Do 64 bit PPC processors that can be bought as chips (not in a system)
> Gabriel> exist (i.e., can you buy a Power3 in a BGA package and the chipset that
> Gabriel> goes with) ?
>
> I doubt it. The Power3 comes from the server organization and is
> intended for their use. IBM, however, does understand customer demand:
> if there was "enough" market interest, I am sure that it would find a way
> to sell it. The Power3 processor price is nowhere near a PPC750 or G4 and
> does not include AltiVec.
>

Actually I think the 64bit PPCs are called PowerPC II. They're reasonably
nice chips last time I looked. They are expensive to put together in a
system (i.e. 8MB L2, 128bit data bus, etc). Also the 6xx bus is a split
transaction bus, so the bus controllers are a bit more complex too. I
also think that all the data paths are ECC'd with parity on addresses.

For a good example of what type of systems are built using these things,
look at the S70 Advanced server.

TTFN - Guy
* Re: Apple Job Posting and Good News for LinuxPPC developers
From: David Edelsohn @ 1999-03-27 20:44 UTC
To: ggs; +Cc: paubert, linuxppc-dev

>>>>> Guy Sotomayor writes:

Guy> Actually I think the 64bit PPCs are called PowerPC II. They're reasonably
Guy> nice chips last time I looked. They are expensive to put together in a
Guy> system (ie 8MB L2, 128bit data bus, etc). Also the 6xx bus is a split
Guy> transaction bus, so the bus controllers are a bit more complex too. I
Guy> also think that all the data paths are ECC'd with parity on addresses.

Guy> For a good example on what type of systems are built using these things,
Guy> look at the S70 Advanced server.

64-bit PowerPC is an architecture, not a single chip. The first
implementation was the PPC620 from Somerset, eventually only used by
Groupe Bull. The S70 and S7A use a chip from IBM Rochester also used in
AS/400s. The Power3 (aka PPC630fp) is yet a third 64-bit PowerPC
implementation.

IBM's recent 64-bit PowerPC chips are augmented to support
additional requirements of AS/400 systems.

David
* Re: Apple Job Posting and Good News for LinuxPPC developers
From: Holger Bettag @ 1999-04-02 12:11 UTC
To: linuxppc-dev

Gabriel Paubert <paubert@iram.es> writes:

[beefier PowerPCs]
> Not many people have seen a 620 AFAICT :-( and the G4 I've seen announced
> by Motorola is not what I expected: it's a super 750 with Altivec + 604
> FPU (single cycle double precision multiplier) + SMP capabilities.
>
> But at least I expected:
>
> - a 64 bit version (means >32 address pins, perhaps ~40 or so, 64 is
> obviously overkill right now and I don't ask for it)
>
I have heard rumours that the "Max" core has provisions to physically
address more than 4GB of memory (via the MMU's segment registers).
Processes would still be limited to a 4GB logical address space, though.

> - more superscalar (not 2+branch, but at least 4 way)
>
The 604e is four-way superscalar, but it has slightly lower integer
performance per clock cycle than the 750. Apparently you can't extract
significantly more parallelism than 2 instructions per clock from
real-world code, so the 604e's abilities are wasted, while the 750's
exceptionally elegant branch handling leads to measurable benefits over
the 604e's more traditional (though very sophisticated) branch prediction.

> - longer in flight instruction queue
>
Compared to the 750, "Max" has additional reservation stations and a
longer completion queue. Motorola's first estimates were 10% more integer
performance per clock cycle than the 750, but this probably includes the
effect of the larger L2.

BTW, a larger number of unresolved "in flight" instructions would
jeopardize the processor's unique (among superscalar CPUs) ability to
directly execute branches without predicting them, because the outcome of
branch conditions would more often be still "in flight", too.

> - 2 FPU: not sure, but divides seem to kill current PPC, one
> unit which can do all instructions(including sqrt) + one mult/add only
> would be great. OTOH, on FFT benchmarks which interest me also (not a
> single div), PPC are excellent.
>
AFAIK, the divide algorithm used in PowerPC FPUs was selected specifically
to re-use as much of the multiplier as possible; i.e. saving transistors
was the foremost goal. With today's silicon structure sizes, it would
probably be a very good idea to have a separate divider. Possibly even one
that uses a faster algorithm (like estimation plus refinement as is done
explicitly in AltiVec).

> - 2 LSU to feed all these units: hey Pentium and PPro can do 2 memory
> accesses por clock since they came out. They need it because they
> require many load/store to compensate for the small number of registers
> but I'd expect it to be beneficial even on PPC with large and fast
> backside L2 caches. Power2 has had 2 LSU since the beginning AFAICT.
>
AFAIK, Pentium and PPro/PII/PIII do not have the equivalent of two
full-blown LSUs. In the best case, they can do either two loads or one
load plus one store, but not two store operations. Furthermore, their L1
cache is not really dual-ported, only dual-banked, so that two accesses
can only be carried out in parallel if they hit different banks.

And finally, if you ever have an algorithm where loads or stores are the
bottleneck, the performance will be limited by main memory bandwidth
anyway, regardless of L1 bandwidth (unless you are register-starved, of
course, but that will almost never happen with 32 GPRs + 32 FPRs (+ 32
VRs)).

Holger
* Re: Apple Job Posting and Good News for LinuxPPC developers 1999-04-02 12:11 ` Holger Bettag @ 1999-04-02 17:11 ` David Edelsohn 1999-04-02 22:19 ` Douglas Godfrey 1999-04-03 17:42 ` Holger Bettag 1999-04-05 16:06 ` Gabriel Paubert 1 sibling, 2 replies; 33+ messages in thread From: David Edelsohn @ 1999-04-02 17:11 UTC (permalink / raw) To: Holger Bettag; +Cc: linuxppc-dev >>>>> Holger Bettag writes: Holger> I have heard rumours that the "Max" core has provisions to physically address Holger> more than 4GB of memory (via the MMU's segment registers). Processes would Holger> still be limited to a 4GB logical address space, though. The PowerPC architecture always has been able to address more than 32-bits of "logical" address space. That is the reason for the PowerPC terminology of "effective address", "virtual address", and "real address". The intermediate "virtual address" space of a 32-bit PowerPC implementation is 52 bits. Pointers still are 32-bits, but a cooperating operating system and compiler can allow an application to address more virtual memory through runtime modifications to the virtual segments mapped by the segment registers, like memory overlays. That was the original reason for the design of the MMU in the POWER (predecessor of PowerPC) architecture. I do not believe that any compiler / OS combination takes advantage of this facility. David [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
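The effective/virtual address split described above can be sketched in C. This is only an illustration of the 32-bit PowerPC segmented translation (4-bit segment register index, 16-bit page index, 12-bit byte offset, 24-bit VSID); the names `split_ea`, `ea_to_va` and `vsid_table` are hypothetical and do not come from any kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a 32-bit effective address on a 32-bit PowerPC. */
struct ea_fields {
    unsigned sr_index;    /* top 4 bits: selects one of 16 segment registers */
    unsigned page_index;  /* next 16 bits: page within the 256 MB segment */
    unsigned byte_offset; /* low 12 bits: byte within the 4 KB page */
};

static struct ea_fields split_ea(uint32_t ea)
{
    struct ea_fields f;
    f.sr_index = ea >> 28;
    f.page_index = (ea >> 12) & 0xFFFFu;
    f.byte_offset = ea & 0xFFFu;
    return f;
}

/*
 * vsid_table is a hypothetical stand-in for the 16 segment registers:
 * entry n holds the 24-bit virtual segment ID the OS loaded into SRn.
 * The result is the 52-bit virtual address: 24 + 16 + 12 bits.
 */
static uint64_t ea_to_va(const uint32_t vsid_table[16], uint32_t ea)
{
    struct ea_fields f = split_ea(ea);
    uint64_t vsid = vsid_table[f.sr_index];

    assert(vsid <= 0xFFFFFFu); /* VSIDs are 24 bits wide */
    return (vsid << 28) | ((uint64_t)f.page_index << 12) | f.byte_offset;
}
```

Reloading an entry of `vsid_table` at run time is the "overlay" trick mentioned above: the same 32-bit pointer then lands in a different part of the 52-bit virtual space.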
* Re: Apple Job Posting and Good News for LinuxPPC developers
1999-04-02 17:11 ` David Edelsohn
@ 1999-04-02 22:19 ` Douglas Godfrey
1999-04-03 17:42 ` Holger Bettag
1 sibling, 0 replies; 33+ messages in thread
From: Douglas Godfrey @ 1999-04-02 22:19 UTC (permalink / raw)
To: linuxppc-dev
Reply to David Edelsohn, 4/2/99 12:11 PM -0500:
Re: Apple Job Posting and Good News for LinuxPPC devel
>>>>>> Holger Bettag writes:
>
>Holger> I have heard rumours that the "Max" core has provisions to
>Holger> physically address more than 4GB of memory (via the MMU's segment
>Holger> registers). Processes would still be limited to a 4GB logical
>Holger> address space, though.
>
> The PowerPC architecture always has been able to address more than
>32-bits of "logical" address space. That is the reason for the PowerPC
>terminology of "effective address", "virtual address", and "real address".
>The intermediate "virtual address" space of a 32-bit PowerPC
>implementation is 52 bits.
>
> Pointers still are 32-bits, but a cooperating operating system and
>compiler can allow an application to address more virtual memory through
>runtime modifications to the virtual segments mapped by the segment
>registers, like memory overlays. That was the original reason for the
>design of the MMU in the POWER (predecessor of PowerPC) architecture. I
>do not believe that any compiler / OS combination takes advantage of this
>facility.
>
Only the IBM Mainframe and the AS/400 use windowed virtual storage to
access more than 32 bits of address with 32 bit pointers. The IBM Mainframe
can access a 44 bit address range by using a 32 bit pointer as the base
address of a contiguous storage segment aligned on a 4k page boundary, with
a window size of up to 16 MB. The old AS/400 did something similar to access
its virtual storage mapped database, while the new AS/400 just switches to
the PPC 64 bit addressing mode and uses full 64 bit pointers.
Thanx... Doug
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Apple Job Posting and Good News for LinuxPPC developers 1999-04-02 17:11 ` David Edelsohn 1999-04-02 22:19 ` Douglas Godfrey @ 1999-04-03 17:42 ` Holger Bettag 1999-04-05 16:11 ` Gabriel Paubert 1 sibling, 1 reply; 33+ messages in thread From: Holger Bettag @ 1999-04-03 17:42 UTC (permalink / raw) To: linuxppc-dev David Edelsohn <dje@watson.ibm.com> writes: > > >>>>> Holger Bettag writes: > > Holger> I have heard rumours that the "Max" core has provisions to > Holger> physically address more than 4GB of memory (via the MMU's segment > Holger> registers). Processes would still be limited to a 4GB logical > Holger> address space, though. > > The PowerPC architecture always has been able to address more than > 32-bits of "logical" address space. That is the reason for the PowerPC > terminology of "effective address", "virtual address", and "real address". > The intermediate "virtual address" space of a 32-bit PowerPC > implementation is 52 bits. > OK, that's true in the usual PowerPC terminology. > Pointers still are 32-bits, but a cooperating operating system and ^^ That's what I meant with processes being limited to a 4GB logical address space, because the segment registers are not accessible with only user mode privileges. > compiler can allow an application to address more virtual memory through > runtime modifications to the virtual segments mapped by the segment > registers, like memory overlays. That was the original reason for the > design of the MMU in the POWER (predecessor of PowerPC) architecture. I > do not believe that any compiler / OS combination takes advantage of this > facility. > Well, so far there are no 32bit PowerPCs that can handle more than 4GB of RAM. The rumours I heard indicate that "Max" might be able to handle more. That would make messing with segments worth the effort (and would generate all the neat problems of segmented addressing schemes). Holger [[ This message was sent via the linuxppc-dev mailing list. 
Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Apple Job Posting and Good News for LinuxPPC developers 1999-04-03 17:42 ` Holger Bettag @ 1999-04-05 16:11 ` Gabriel Paubert 0 siblings, 0 replies; 33+ messages in thread From: Gabriel Paubert @ 1999-04-05 16:11 UTC (permalink / raw) To: Holger Bettag; +Cc: linuxppc-dev On 3 Apr 1999, Holger Bettag wrote: > > compiler can allow an application to address more virtual memory through > > runtime modifications to the virtual segments mapped by the segment > > registers, like memory overlays. That was the original reason for the > > design of the MMU in the POWER (predecessor of PowerPC) architecture. I > > do not believe that any compiler / OS combination takes advantage of this > > facility. > > > Well, so far there are no 32bit PowerPCs that can handle more than 4GB of > RAM. The rumours I heard indicate that "Max" might be able to handle more. > That would make messing with segments worth the effort (and would generate > all the neat problems of segmented addressing schemes). There is no reasonable way a 32 bit PPC will handle more than 4Gb of physical RAM. However, with relatively simple patches, Linux could handle 4 Gb per process virtual address space and 3 Gb physical RAM on a 32 bit PPC. Gabriel. [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Apple Job Posting and Good News for LinuxPPC developers 1999-04-02 12:11 ` Holger Bettag 1999-04-02 17:11 ` David Edelsohn @ 1999-04-05 16:06 ` Gabriel Paubert 1999-04-06 5:53 ` Douglas Godfrey 1 sibling, 1 reply; 33+ messages in thread From: Gabriel Paubert @ 1999-04-05 16:06 UTC (permalink / raw) To: Holger Bettag; +Cc: linuxppc-dev On 2 Apr 1999, Holger Bettag wrote: > > - a 64 bit version (means >32 address pins, perhaps ~40 or so, 64 is > > obviously overkill right now and I don't ask for it) > > > I have heard rumours that the "Max" core has provisions to physically address > more than 4GB of memory (via the MMU's segment registers). Processes would > still be limited to a 4GB logical address space, though. This will be virtual and already exists. There is provision for 4Gb address space in the hash table defined for 32 bit processors and extending it is far too messy. Only a 64 bit PPC can be expected to have more than 32 address pins. > > - more superscalar (not 2+branch, but at least 4 way) > > > The 604e is four-way superscalar, but it has slightly lower integer performance > per clock cycle than the 750. Apparently you can't extract significantly more > parallelism than 2 instructions per clock from real-world code, so the 604e's > abilities are wasted, while the 750's exceptionally elegant branch handling > leads to measurable benefits over the 604e's more traditional (though very > sophisticated) branch prediction. It depends of your definition of real world code. Floating point intensive scientific code is often extremely well behaved in this respect and could use more than 2 instructions/clock (as long as you don't call the math library for scalar transcendental operations). > > - longer in flight instruction queue > > > Compared to the 750, "Max" has additional reservation stations and a longer > completion queue. 
> Motorola's first estimates were 10% more integer performance
> per clock cycle than the 750, but this probably includes the effect of
> the larger L2.
>
> BTW, a larger number of unresolved "in flight" instructions would jeopardize
> the processors unique (among superscalar CPUs) ability to directly execute
> branches without predicting them, because the outcome of branch conditions
> would more often be still "in flight", too.
On PPC, with 8 CR fields, you can often prepare condition codes a long
time in advance, especially on scientific code. I've checked the code
generated by GCC, and I often have 20 to 40 instructions between CR
setting and the actual branch (which are highly predictable BTW). Besides
this, the innermost loop is terminated by a bdnz when the iteration count
can be computed beforehand. In this last case GCC does a bad job at
optimizing the outer loop level but it's not that critical:

Loop not using ctr:

     104:   7c 03 f0 40     cmplw   r3,r30
     108:   7c 78 1b 78     mr      r24,r3
     10c:   40 80 00 9c     bge     1a8 <fftrtr+0x1a8>
     110:   54 69 10 3a     rlwinm  r9,r3,2,0,29
     114:   7c 09 b0 2e     lwzx    r0,r9,r22
     118:   38 63 00 01     addi    r3,r3,1
     11c:   7c 03 f0 40     cmplw   r3,r30
     ... [snipped 33 instructions] ...
     1a4:   41 80 ff 70     blt     114 <fftrtr+0x114>

Nested loops with bdnz on inner loop:

     298:   7f 03 c3 78     mr      r3,r24
     29c:   7c 03 f0 40     cmplw   r3,r30
     2a0:   40 80 01 04     bge     3a4 <fftrtr+0x3a4>
[Fair enough, 0 iteration count may actually happen, although it would
be better to use r24 instead of r3 for cmplw]
     2a4:   54 64 10 3a     rlwinm  r4,r3,2,0,29
     2a8:   7c 04 b0 2e     lwzx    r0,r4,r22
     2ac:   57 26 18 38     rlwinm  r6,r25,3,0,28
     2b0:   54 00 18 38     rlwinm  r0,r0,3,0,28
     2b4:   7c fd 02 14     add     r7,r29,r0
     2b8:   7d 07 32 14     add     r8,r7,r6
     2bc:   35 fa ff ff     addic.  r15,r26,-1
     2c0:   7d e9 03 a6     mtctr   r15
     2c4:   41 82 00 d0     beq     394 <fftrtr+0x394>
[Actually here the iteration count can never be zero but the compiler
can't be blamed for not knowing it]
     2c8:   7e 0c 83 78     mr      r12,r16
     2cc:   38 a0 00 00     li      r5,0
     2d0:   7d 9b 60 50     subf    r12,r27,r12
     ... [snipped 47 instructions] ...
     390:   42 00 ff 40     bdnz    2d0 <fftrtr+0x2d0>
     394:   38 63 00 01     addi    r3,r3,1
     398:   7c 03 f0 40     cmplw   r3,r30
     39c:   38 84 00 04     addi    r4,r4,4
[Quite bad placement, neither r3 nor r30 are touched in the inner loop]
     3a0:   41 80 ff 08     blt     2a8 <fftrtr+0x2a8>

I don't ask for 40 or 70 in flight instructions like PPro/K7, but
something closer to the 604 (15-20). 6 is very little, especially because
it blocks the processor after divides (and the FPU is also blocked, but
that's the next paragraph).

And BTW, I know that a lot of code is not as predictable as scientific
code. I've written an Intel 486SX emulator which is the only code right
now able to properly initialize my S3 video, and there the branch density
is significantly higher; indeed instruction dispatch and addressing mode
decoding can probably be considered a worst case in this respect. It's
true that this code would not benefit a lot from much higher parallelism,
but I consider it non-representative: it's a byte per byte interpreter
which, at 24kB code+data, is small enough to be put in early boot
code/Flash/whatever, and runs at a sufficient speed for this job on a
200MHz 603e (thanks to the fact that it does not thrash the cache).

However, even in this code the branch immediately following the CR
setting instruction is the exception; 5 instructions are common. For
indirect branches the ctr is typically set quite a lot of instructions
before the branch (and there are other conditional branches in between
anyway).

If you want to see the code, it is at:
ftp://vcorr1.iram.es/pub/linux-2.2/mvme2600.generic-patch-2.2.4
in the arch/ppc/prepboot directory.
> > - 2 FPU: not sure, but divides seem to kill current PPC, one
> > unit which can do all instructions(including sqrt) + one mult/add only
> > would be great. OTOH, on FFT benchmarks which interest me also (not a
> > single div), PPC are excellent.
> > > AFAIK, the divide algorithm used in PowerPC FPUs was selected specifically > to re-use as much of the multiplier as possible; i.e. saving transistors > was the foremost goal. > > With nowadays silicon structure sizes, it would probably be a very good idea > to have a separate divider. Possibly even one that uses a faster algorithm > (like estimation plus refinement as is done explicitly in AltiVec). At least we agree on that one. BTW, I wrote some time ago a square root routine which is much faster than the one currently implemented in glibc. Anyone wants to test it ? I was not able to prove mathmatically that it is fully IEEE compliant but I suspect so (and I can't test 2^53 possibilities. The code is free of divides and takes advantage of the intermediate 106 bit precision of fused multiply/add instructions). > > - 2 LSU to feed all these units: hey Pentium and PPro can do 2 memory > > accesses por clock since they came out. They need it because they > > require many load/store to compensate for the small number of registers > > but I'd expect it to be beneficial even on PPC with large and fast > > backside L2 caches. Power2 has had 2 LSU since the beginning AFAICT. > > > AFAIK, Pentium and PPro/PII/PIII do not have the equivalent of two full-blown > LSUs. In best case, they can do either two loads or one load plus one store, > but not two store operations. Furthermore, their L1 cache is not really > dual-ported, only dual-banked, so that two accesses can only be carried out > in parallel if they hit different banks. Pentium has (can even perform 2 stack pushes per clock of register and/or immediate values). P6 core (PPro/PII...) is 1 load + 1 store. Of course it's banked, but I would not care very much if it were implemented as in Power2, with special instructions which allow to load and store 2 FPR simultaneously. (even with alignment restrictions: with AltiVec, the HW to perform 16 byte load and stores is already there. 
I know it's more complex to implement when you need to access 2 different FPR, one of which may be ready and not the other, but it's great for complex arithmetic). > And finally, if you ever have an algorithm where loads or stores are the > bottleneck, the performance will be limited by main memory bandwidth anyway, > regardless of L1 bandwidth (unless you are register-starved, of course, but > that will almost never happen with 32 GPRs + 32 FPRs (+ 32VRs)). No, I want to saturate L2 BW, which is going to be in the 10 Gb/s range soon (500 Mhz @ 128 bit is 8Gb/s, double for 256 bit). With L2 caches larger than 1 Mb, my apps basically do not thrash L2 cache (and are even careful enough not to cause too many writebacks to it which could halve effective BW), provided it has enough associativity (4 way at least). Besides that, 2 LSU would help in saving/restoring registers on procedure entry/exit (especially on 64 bit PPC since there is no lmd/stmd). But this is a side effect: if you worried about the overhead of procedure entry/exit, you've got other problems to cater first. Gabriel. [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
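The L2 bandwidth figures above (500 MHz at 128 bits giving 8 GB/s, double that for 256 bits) follow from a one-line calculation. The sketch below assumes one full-width transfer per clock cycle, which is an idealised peak and not a measured figure; the function name `peak_bandwidth_bytes` is made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Peak bytes per second for a bus that moves bus_bits on every clock.
 * Assumption: exactly one transfer per cycle, no turnaround or stalls.
 */
static uint64_t peak_bandwidth_bytes(uint64_t clock_hz, unsigned bus_bits)
{
    assert(bus_bits % 8 == 0); /* whole bytes only */
    return clock_hz * (bus_bits / 8);
}
```

For example, `peak_bandwidth_bytes(500000000ULL, 128)` gives 8000000000 bytes per second, matching the 8 GB/s quoted in the message.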
* Re: Apple Job Posting and Good News for LinuxPPC developers
1999-04-05 16:06 ` Gabriel Paubert
@ 1999-04-06 5:53 ` Douglas Godfrey
0 siblings, 0 replies; 33+ messages in thread
From: Douglas Godfrey @ 1999-04-06 5:53 UTC (permalink / raw)
To: linuxppc-dev
Reply to Gabriel Paubert, 4/5/99 6:06 PM +0200:
Re: Apple Job Posting and Good News for LinuxPPC devel
>
>No, I want to saturate L2 BW, which is going to be in the 10 Gb/s range
>soon (500 Mhz @ 128 bit is 8Gb/s, double for 256 bit). With L2 caches
>larger than 1 Mb, my apps basically do not thrash L2 cache (and are even
>careful enough not to cause too many writebacks to it which could halve
>effective BW), provided it has enough associativity (4 way at least).
>
With IBM's recent announcement of embedded DRAM macros for their Blue
Logic library, the previous limits on L2 cache bandwidth no longer apply.
IBM can deliver a G3 or G4 CPU with 2 MB of embedded DRAM as L2 with a
1024 bit wide L2 cache bus. This would allow a 4 way set associative cache
with 4 simultaneous L2 cache transfers for read or write with a 2.5 ns DRAM
speed. The CPU die size would be only 50% larger and would only need 5 more
mask steps. The cost of such a G3 or G4 CPU should be only 30% to 50%
higher than current generation G3 CPUs, about the same as current Intel
Pentium III CPUs.

Combine this with the cache controller from the Power 3 CPU and IBM will
be able to achieve between 8 GB/sec and 32 GB/sec L2 cache bandwidth. A
550 MHz PPC G3 probably cannot even approach using that much bandwidth.
Thanx... Doug
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Apple Job Posting and Good News for LinuxPPC developers 1999-03-25 16:46 ` Apple Job Posting and Good News for LinuxPPC developers Kevin B. Hendricks 1999-03-25 19:12 ` David Edelsohn @ 1999-03-26 6:08 ` Nathan Hurst 1999-03-26 13:51 ` sean o'malley 1999-03-26 20:33 ` N.G. Temme 1 sibling, 2 replies; 33+ messages in thread From: Nathan Hurst @ 1999-03-26 6:08 UTC (permalink / raw) To: Kevin B. Hendricks; +Cc: linuxppc-dev On Thu, 25 Mar 1999, Kevin B. Hendricks wrote: > We should get one of our own into this role and get a leg up on helping apple > open up its hardware specs for Linux PPC > > What do you think? I'm voting for Paul Mackerras to get the job.. :-) (I'm sure he _love_ to spend his time working solely on linuxpmac.. :-) [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Apple Job Posting and Good News for LinuxPPC developers
1999-03-26 6:08 ` Nathan Hurst
@ 1999-03-26 13:51 ` sean o'malley
1999-03-28 5:08 ` Cort Dougan
1999-03-26 20:33 ` N.G. Temme
1 sibling, 1 reply; 33+ messages in thread
From: sean o'malley @ 1999-03-26 13:51 UTC (permalink / raw)
To: Nathan Hurst; +Cc: linuxppc-dev
It sounded like it was more political and PR based than actual development.
I'm wondering if it would entail a whole department/unit with multiple
people: maybe a couple of developers to help solve some of the nastier
stuff, a PR person to help promote linuxppc and our efforts (judging from
some of the stuff RHS has been saying, he isn't too aware of our project or
our project status), a person to deal with corporate management (convincing
them to support linuxppc: use, drivers, info, etc.), and a person tracking
the development of the project and of Linux projects in general.

It would be nice to have someone who comes from our project.
>On Thu, 25 Mar 1999, Kevin B. Hendricks wrote:
>
>> We should get one of our own into this role and get a leg up on helping
>> apple open up its hardware specs for Linux PPC
>>
>> What do you think?
>
>I'm voting for Paul Mackerras to get the job.. :-)
>(I'm sure he _love_ to spend his time working solely on linuxpmac.. :-)
>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Apple Job Posting and Good News for LinuxPPC developers 1999-03-26 13:51 ` sean o'malley @ 1999-03-28 5:08 ` Cort Dougan 0 siblings, 0 replies; 33+ messages in thread From: Cort Dougan @ 1999-03-28 5:08 UTC (permalink / raw) To: sean o'malley; +Cc: Nathan Hurst, linuxppc-dev Look at this thing from what Apple has done in the past. This just means Linux is the Apple OS of the week. They don't have a good track record with keeping with their plans. } It sounded like it was more political and PR based than actual development. } Im wondering if it would entail a whole department/unit with multiple } people. Maybe a couple of developers to help solve some of the more nasty } stuff , a pr person to help promote linuxppc and our efforts (judging from } some of the stuff RHS has been saying he isnt to aware of our project or [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Apple Job Posting and Good News for LinuxPPC developers 1999-03-26 6:08 ` Nathan Hurst 1999-03-26 13:51 ` sean o'malley @ 1999-03-26 20:33 ` N.G. Temme 1 sibling, 0 replies; 33+ messages in thread From: N.G. Temme @ 1999-03-26 20:33 UTC (permalink / raw) To: linuxppc-dev >On Thu, 25 Mar 1999, Kevin B. Hendricks wrote: > >> We should get one of our own into this role and get a leg up on helping >>apple >> open up its hardware specs for Linux PPC >> >> What do you think? > >I'm voting for Paul Mackerras to get the job.. :-) >(I'm sure he _love_ to spend his time working solely on linuxpmac.. :-) > me too [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Blue G3 and machine check
1999-03-25 11:20 ` Gabriel Paubert
1999-03-25 16:46 ` Apple Job Posting and Good News for LinuxPPC developers Kevin B. Hendricks
@ 1999-03-29 23:44 ` Paul Mackerras
1999-03-30 11:41 ` Gabriel Paubert
1 sibling, 1 reply; 33+ messages in thread
From: Paul Mackerras @ 1999-03-29 23:44 UTC (permalink / raw)
To: paubert; +Cc: bh40, linuxppc-dev
Gabriel Paubert <paubert@iram.es> wrote:
> No, the PCI connector also has a presence detect pin which should be used
> for this. The PCI specification is very clear that the only cycles
> which are expected to end with a Master Abort are the special cycles.
> Configuration cycles are like any other cycles and a Master Abort may
> result in a device pulling the SERR line and taking exceptions in this
> case.
The PCI spec says that the host bridge must unambiguously report
attempts to read the vendor ID of nonexistent devices, and that it is
adequate for the host bridge to return ~0 on read accesses to config
space registers of nonexistent devices.

I guess a machine check can be regarded as pretty unambiguous.
Sigh. :-(
> But the worst is that you are not guaranteed anything about SRR0, so an in
> memory per processor flag telling 'hey, I might actually get a machine
> check, might be required'. For the registers, I can't believe that after a
> sync/isync sequence, any implementation will ever randomly modify any
> other register than the destination for the loads (and the address
> register for update form instructions).
Imagine that an interrupt occurs between the load/store and the sync.
The CPU could be in full superscalar flight when it gets the error
ack. The registers could certainly be in an inconsistent state when
we get to the machine check handler. So we at least need to disable
interrupts around the access.
> And yes, I just reread the following: "Note that if the error is caused by
> the memory subsystem, incorrect data could be loaded into the processor
> and register contents could be corrupted regardless of whether the
> exception is considered recoverable by the SRR1 bit corresponding to
> MSR[RI]."
>
> But I interpret it as the registers modified by the instruction and the
> potential use of the corrupted data by subsequent instructions, which
> should be bounded by following sync; if you interpret it very liberally
> all registers could be corrupted, not only GPR (including the stack
> pointer) but why not also LR, CTR, XER, CR, FPRs, FPSCR, BATS, segments,
> timebase, decrementer, SDR1, SPRGn, HID0 and others.
Indeed. :-)

I think it's likely that the following sequence will work OK:

	mtmsr to disable interrupts
	sync
	load/store
	sync
	re-enable interrupts if necessary

and if we get a machine check on the second sync, the registers should
be OK.

Thoughts?
Paul.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Blue G3 and machine check
1999-03-29 23:44 ` Blue G3 and machine check Paul Mackerras
@ 1999-03-30 11:41 ` Gabriel Paubert
1999-03-31 16:20 ` Ryuichi Oikawa
0 siblings, 1 reply; 33+ messages in thread
From: Gabriel Paubert @ 1999-03-30 11:41 UTC (permalink / raw)
To: Paul.Mackerras; +Cc: bh40, linuxppc-dev
On Tue, 30 Mar 1999, Paul Mackerras wrote:
> The PCI spec says that the host bridge must unambiguously report
> attempts to read the vendor ID of nonexistent devices, and that it is
> adequate for the host bridge to return ~0 on read accesses to config
> space registers of nonexistent devices.
>
> I guess a machine check can be regarded as pretty unambiguous.
> Sigh. :-(
Indeed. But if it only happens on access through the P2P bridges, it means
that the bridge transforms Master Abort on the secondary side into Target
Aborts on the primary. IIRC there is a bit in the configuration of the
bridge to control this.
> Imagine that an interrupt occurs between the load/store and the sync.
> The CPU could be in full superscalar flight when it gets the error
> ack. The registers could certainly be in an inconsistent state when
> we get to the machine check handler. So we at least need to disable
> interrupts around the access.
I have always said that the first thing to do is to disable interrupts.
There is no hope of getting it running with interrupts enabled. Indeed all
accesses to the PCI config space should be performed with interrupts
disabled and protected by a spinlock on SMP; given the indirect nature of
most bridges, a layer that locks and checks for basically valid parameters,
as done on Intel in arch/i386/kernel/bios32.c, is necessary: most ECC memory
controllers report error status in PCI config space and you need to access
it from interrupts if you decide to handle memory errors properly.

But you need to be extremely careful because of the situation that might
arise: you hold a spinlock, a machine check occurs and you try to get the
same spinlock to clear the error status. Deadlock in sight...
> > And yes, I just reread the following: "Note that if the error is caused by
> > the memory subsystem, incorrect data could be loaded into the processor
> > and register contents could be corrupted regardless of whether the
> > exception is considered recoverable by the SRR1 bit corresponding to
> > MSR[RI]."
> >
> > But I interpret it as the registers modified by the instruction and the
> > potential use of the corrupted data by subsequent instructions, which
> > should be bounded by following sync; if you interpret it very liberally
> > all registers could be corrupted, not only GPR (including the stack
> > pointer) but why not also LR, CTR, XER, CR, FPRs, FPSCR, BATS, segments,
> > timebase, decrementer, SDR1, SPRGn, HID0 and others.
>
> Indeed. :-)
>
> I think it's likely that the following sequence will work OK:
>
> 	mtmsr to disable interrupts
> 	sync
> 	load/store
> 	sync
> 	re-enable interrupts if necessary
>
> and if we get a machine check on the second sync, the registers should
> be OK.
>
> Thoughts?
Will SRR0 point at or after the sync instruction? Adding an isync stops
fetching and might act as a barrier on the point to which SRR0 progresses.
It needs some checking, and it might also depend on the actual delay of the
machine check in the bridge and processor; in most processors the machine
check and interrupt pins are filtered and take a few clocks to reach the
core, while the transfer error acknowledge obviously does not suffer from
this problem.
Gabriel.
^ permalink raw reply [flat|nested] 33+ messages in thread
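The PCI rule quoted in this exchange (a config read of a nonexistent device may simply return ~0) is what a well-behaved probe relies on. The following is a minimal sketch of such a probe, not the kernel's code: `cfg_read16_fn`, `pci_slot_populated` and `fake_read16` are hypothetical names, and the accessor is a software stand-in so the example runs without hardware. On the Blue G3 discussed here the read itself raises a machine check, so this check alone would not be enough.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_VENDOR_ID_OFFSET 0x00u   /* vendor ID sits at config offset 0 */
#define CFG_NO_DEVICE        0xFFFFu /* all ones: nothing answered */

/* Hypothetical config-space accessor: returns a 16-bit register value. */
typedef uint16_t (*cfg_read16_fn)(unsigned bus, unsigned devfn, unsigned offset);

/* Returns nonzero when a device answered the vendor ID read. */
static int pci_slot_populated(cfg_read16_fn read16, unsigned bus, unsigned devfn)
{
    assert(read16 != 0);
    return read16(bus, devfn, CFG_VENDOR_ID_OFFSET) != CFG_NO_DEVICE;
}

/* Stand-in for hardware: only bus 0, device 4, function 0 exists. */
static uint16_t fake_read16(unsigned bus, unsigned devfn, unsigned offset)
{
    if (bus == 0 && devfn == (4u << 3) && offset == CFG_VENDOR_ID_OFFSET)
        return 0x1234; /* arbitrary made-up vendor ID */
    return CFG_NO_DEVICE;
}
```

With the stand-in accessor, `pci_slot_populated(fake_read16, 0, 4u << 3)` is true and every other slot reports empty.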
* Re: Blue G3 and machine check
1999-03-30 11:41 ` Gabriel Paubert
@ 1999-03-31 16:20 ` Ryuichi Oikawa
1999-03-31 18:39 ` Gabriel Paubert
1999-04-04 1:17 ` Joel Klecker
0 siblings, 2 replies; 33+ messages in thread
From: Ryuichi Oikawa @ 1999-03-31 16:20 UTC (permalink / raw)
To: paubert; +Cc: Paul.Mackerras, bh40, linuxppc-dev
From: Gabriel Paubert <paubert@iram.es>
Subject: Re: Blue G3 and machine check
> Indeed. But if it only happens on access through the P2P bridges, it means
> that the bridge transforms Master Abort on the secondary side into Target
> Aborts on the primary. IIRC there is a bit in the configuration of the
> bridge to control this.
Though I'm not sure, do you mean for example something like below?

pmac_pci.c:

 __initfunc(unsigned long pmac_find_bridges(unsigned long mem_start, unsigned long mem_end))
 {
     int bus;
     struct bridge_data *bridge;
+    struct device_node *p2pbridge;

     bridge_list = 0;
     max_bus = 0;
     add_bridges(find_devices("bandit"), &mem_start);
     add_bridges(find_devices("chaos"), &mem_start);
     add_bridges(find_devices("pci"), &mem_start);
     bridges = (struct bridge_data **) mem_start;
     mem_start += (max_bus + 1) * sizeof(struct bridge_data *);
     memset(bridges, 0, (max_bus + 1) * sizeof(struct bridge_data *));
     for (bridge = bridge_list; bridge != NULL; bridge = bridge->next)
         for (bus = bridge->bus_number; bus <= bridge->max_bus; ++bus)
             bridges[bus] = bridge;

+    if((p2pbridge = find_devices("pci-bridge")) && !strcmp(p2pbridge->parent->name, "pci")) {
+        unsigned char devfn;
+        unsigned short val;
+
+        if(!pci_device_loc(p2pbridge, &bus, &devfn)) {
+            grackle_pcibios_read_config_word(0, devfn, PCI_BRIDGE_CONTROL, &val);
+            val &= ~PCI_BRIDGE_CTL_MASTER_ABORT;
+            grackle_pcibios_write_config_word(0, devfn, PCI_BRIDGE_CONTROL, val);
+            grackle_pcibios_read_config_word(0, devfn, PCI_BRIDGE_CONTROL, &val);
+        }
+    }
+
     return mem_start;
 }

As a matter of fact, the kernel booted successfully up to the point of
executing /sbin/init (I haven't prepared a filesystem yet) after adding two
more fixes:

prom.c:

      * If the pci host bridge has an interrupt-map property,
      * look for our node in it.
      */
     if (np->parent != 0 && pci_addrs != 0
         && (imp = (struct pci_intr_map *)
             get_property(np->parent, "interrupt-map", &ml)) != 0
         && (ip = (int *) get_property(np, "interrupts", &l)) != 0) {
-        unsigned int busdevfn = pci_addrs[0].addr.a_hi & 0xffff00;
+        /* P2P bridge's interrupt map contains no bus number */
+        unsigned int devfn = pci_addrs[0].addr.a_hi & 0x00ff00;
         np->n_intrs = 0;
         np->intrs = (struct interrupt_info *) mem_start;
         for (i = 0; (ml -= sizeof(struct pci_intr_map)) >= 0; ++i) {
             if (imp[i].addr.a_hi == devfn) {
                 np->intrs[np->n_intrs].line = imp[i].intr;
                 np->intrs[np->n_intrs].sense = 0;
                 ++np->n_intrs;
             }

ide-pmac.c:

     *rp = NULL;
     *pp = removables;
     for (i = 0, np = atas; i < MAX_HWIFS && np != NULL; np = np->next) {
+        struct device_node *tp;
+        int hosted_by_mac_io;
+
+        for (tp = np->parent, hosted_by_mac_io = 0; tp; tp = tp->parent)
+            if (tp->type &&
+                (!strcmp(tp->type, "mac-io") || !strcmp(tp->type, "dbdma"))) {
+                hosted_by_mac_io = 1;
+                break;
+            }
+        if (!hosted_by_mac_io)
+            continue;
         if (np->n_addrs == 0) {
             printk(KERN_WARNING "ide: no address for device %s\n",
                    np->full_name);
             continue;
         }

These are the boot messages on the screen before the kernel panic
(hand-copied, so typos may exist).

.....
(scsi0) <Adaptec AHA 294X Ultra2 SCSI host adaptor> found at PCI 4/0
(scsi0) Wide channel, SCSI ID=7, 32/255 SCBs
(scsi0) Down loading sequecncer code... 407 instructions down loaded
scsi0: Adaptec ...
scsi1: MESH
scsi: 2 hosts
Vendor: IBM Model: DDRS-39130D Rev: DC2A
Type: Direct-access
Detected scsi disk sda at scsi0, channel 0, id 0, lun 0
......
(scsi0:0:0:0) Synchronous at 80.0 Mbytes/sec, offset 15
SCSI device sda: hdwr sectors=512, Sectors=17850000 [8715MB] [8.7GB]
....
PPP version 2.3.3
TCP compression code copyright 1989 Regents of the University of California
PPP line discipline registered.
eth0: BMAC+ at 00:05:02:09:f8:f3
Partition check:
 sda: sda1 sda2 sda3 sda4 sda5
 sdb: sdb1 sdb2 sdb3 sdb4
 sdc: sdc1 sdc2 sdc3 sdc4
VFS: Mounted root (hfs filesystem) readonly.
Freeing unused kernel memory: 112k init 32k prep
Warning: unable to open an initial console.
kernel panic: No init found. Try passing init= option to kernel.

I put a precompiled kernel (2.2.1) at
ftp://ppc.linux.or.jp/pub/users/oikawa/linux-pmac/bluemacg3/vmlinux-2.2.1-challenger2.gz

Benjamin Herrenschmidt gave me much useful advice on this. Could you give
me recommended or suggested fixes for the machine check exception? I can
try them and report back.

Regards,
Ryuichi Oikawa
roikawa@rr.iij4u.or.jp
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Blue G3 and machine check 1999-03-31 16:20 ` Ryuichi Oikawa @ 1999-03-31 18:39 ` Gabriel Paubert 1999-04-05 16:36 ` Ryuichi Oikawa 1999-04-04 1:17 ` Joel Klecker 1 sibling, 1 reply; 33+ messages in thread From: Gabriel Paubert @ 1999-03-31 18:39 UTC (permalink / raw) To: Ryuichi Oikawa; +Cc: Paul.Mackerras, bh40, linuxppc-dev On Thu, 1 Apr 1999, Ryuichi Oikawa wrote: > Though I'm not sure, do you mean for example something like below? > > pmac_pci.c: > __initfunc(unsigned long pmac_find_bridges(unsigned long mem_start, unsigned long mem_end)) > { > int bus; > struct bridge_data *bridge; > + struct device_node *p2pbridge; > > bridge_list = 0; > max_bus = 0; > add_bridges(find_devices("bandit"), &mem_start); > add_bridges(find_devices("chaos"), &mem_start); > add_bridges(find_devices("pci"), &mem_start); > bridges = (struct bridge_data **) mem_start; > mem_start += (max_bus + 1) * sizeof(struct bridge_data *); > memset(bridges, 0, (max_bus + 1) * sizeof(struct bridge_data *)); > for (bridge = bridge_list; bridge != NULL; bridge = bridge->next) > for (bus = bridge->bus_number; bus <= bridge->max_bus; ++bus) > bridges[bus] = bridge; > > + if((p2pbridge = find_devices("pci-bridge")) && !strcmp(p2pbridge->parent->name, "pci")) { > + unsigned char devfn; > + unsigned short val; > + > + if(!pci_device_loc(p2pbridge, &bus, &devfn)) { > + grackle_pcibios_read_config_word(0, devfn, PCI_BRIDGE_CONTROL, &val); > + val &= ~PCI_BRIDGE_CTL_MASTER_ABORT; Yes, this was along these lines. But I consider this more as a temporary workaround than anything else. > Could you give me recommended/suggested fix codes on machine check > exception? I think I can try them and report. Can you first confirm that it is a machine check ? 
The code is quite explicit in traps.c:

	switch( regs->msr & 0x0000F000)
	{
	case (1<<12) :
		printk("Machine check signal - probably due to mm fault\n"
			"with mmu off\n");
		break;
	case (1<<13) :
		printk("Transfer error ack signal\n");
		break;
	case (1<<14) :
		printk("Data parity signal\n");
		break;
	case (1<<15) :
		printk("Address parity signal\n");
		break;
	default:
		printk("Unknown values in msr\n");

If it is a machine check then you should modify the corresponding case to
handle a foreseen machine check. Note that earlier there is a comment
about MBX boards doing basically the same which is `handled' by simply
ignoring it.

Then how do you tell that the machine check is foreseen ? I was thinking
about the possibility of setting a global flag (per processor on SMP)
saying that you are expecting a machine check:

- in every grackle_xxx set the flag, mb(), perform the access mb() again,
clear the flag:

volatile int expect_machine_check = 0;

int grackle_pcibios_read_config_byte(unsigned char bus, unsigned char dev_fn,
				     unsigned char offset, unsigned char *val)
{
	struct bridge_data *bp;

	if (bus > max_bus || (bp = bridges[bus]) == 0)
		return PCIBIOS_DEVICE_NOT_FOUND;
	out_be32(bp->cfg_addr, GRACKLE_CFA(bus, dev_fn, offset));
+	expect_machine_check = 1;
+	mb();
	*val = in_8(bp->cfg_data + (offset & 3));
+	mb();
+	expect_machine_check = 0;
	return PCIBIOS_SUCCESSFUL;
}

however this won't work because the SRR0 on the machine check might point
to the load instruction of in_8, leading to an infinite machine check
loop. So you have to go the harder way:

replacing the in_8 with something like:

	asm volatile(
		"sync; "
		"1: lbzx %0,%1,%2;"
		"2: sync;"
		"3: isync;"
		"4: ;"
		" .section .fixup;"
		"5: li %0,-1;"
		" b 4b;"
		" .previous;"
		" .section __ex_table;"
		" .long 1b,5b,2b,5b,3b,5b;"
		" .previous;"
		: "r" (val)
		: "b" (bp->cfg_data), "r" (offset & 3))
	)

and I've still probably forgotten something. I would also like to know
at which instruction SRR0 points when we have a machine check. The
architecture description deliberately gives a lot of latitude to the chip
designers, it might even be necessary to perform some lengthy operation to
make sure that this works because the machine check might be delayed
enough clocks to let the processor proceed past the isync.

Then you have to modify the machine check handler to use the fixup table
as in arch/ppc/mm/fault.c:

	/* Are we prepared to handle this fault? */
	if ((fixup = search_exception_table(regs->nip)) != 0) {
		regs->nip = fixup;
		return;
	}

Regards,
Gabriel.
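The fixup-table idea described above can be modelled outside the kernel. The following is only a user-space sketch, not the kernel's search_exception_table(): the struct layout and the names ex_entry, find_fixup and handle_machine_check are hypothetical, chosen for illustration. It shows how several instruction addresses (the labels 1, 2 and 3 in the asm) can share one fixup address (label 5), and how the handler either redirects the saved instruction pointer or reports the fault as unexpected.

```c
#include <stddef.h>

/* One entry pairs the address of an instruction that may fault with the
 * address of the code that recovers from the fault (hypothetical layout). */
struct ex_entry {
	unsigned long insn;	/* address of the instruction that may fault */
	unsigned long fixup;	/* address to resume at if it does */
};

/* Linear search: return the fixup address, or 0 if the faulting address
 * was not registered, i.e. the fault was not anticipated. */
unsigned long find_fixup(const struct ex_entry *table, size_t n,
			 unsigned long nip)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (table[i].insn == nip)
			return table[i].fixup;
	return 0;
}

/* Model of the handler's decision: redirect *nip and return 1 when the
 * fault was anticipated; otherwise leave *nip alone and return 0 so the
 * caller can treat the machine check as fatal. */
int handle_machine_check(const struct ex_entry *table, size_t n,
			 unsigned long *nip)
{
	unsigned long fixup = find_fixup(table, n, *nip);

	if (fixup == 0)
		return 0;
	*nip = fixup;
	return 1;
}
```

Registering the load and the two instructions after it against the same fixup covers the case where the saved address does not point exactly at the load, which is the uncertainty about SRR0 raised above.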
* Re: Blue G3 and machine check 1999-03-31 18:39 ` Gabriel Paubert @ 1999-04-05 16:36 ` Ryuichi Oikawa 1999-04-05 17:11 ` Gabriel Paubert 0 siblings, 1 reply; 33+ messages in thread From: Ryuichi Oikawa @ 1999-04-05 16:36 UTC (permalink / raw) To: paubert; +Cc: roikawa, Paul.Mackerras, bh40, linuxppc-dev Hello, > Can you first confirm that it is a machine check ? The code > is quite explicit in traps.c: > > switch( regs->msr & 0x0000F000) > { > case (1<<12) : > printk("Machine check signal - probably due to mm fault\n" > "with mmu off\n"); > break; > case (1<<13) : > printk("Transfer error ack signal\n"); > break; > case (1<<14) : > printk("Data parity signal\n"); > break; > case (1<<15) : > printk("Address parity signal\n"); > break; > default: > printk("Unknown values in msr\n"); ^^^^^^^^^^^^^^^^ It was reached here. Isn't it a machine check? Before I try to do your suggesion, I'd like to confirm a few things for my understandig. MPC106 user's manual says, "The SERR signal is used to report PCI address parity errors, PCI data parity errors on a special-cycle command, target-abort, or any other errors where the result is potentially catastrophic. The SERR signal is also asserted for master-abort, except if it happens for a PCI configuration access or special-cycle transaction. " Because MPC106 cannot master abort as far as P2P bridges are acting normally, P2P bridges have to report master abort to the host bridge. According DEC21154 user's manual it forwards a master abort as a target abort when master abort mode bit in bridge control register is set 1, except special-cycle transaction. Therefore, in this case scanning PCI devices with configuration reads must cause master abort, forwarded as a target abort and then MPC106 asserts SERR. We cannot know if it is really a target abort until we check the status register of the nearst P2P bridge to the target device. 
Therefore the ways to work around this problem may be, from easiest to
most difficult:
 a) Disable master abort forwarding for all P2P bridges, which I tried,
    but this also disables master abort forwarding for usual R/W
    transactions.
 b) Disable master abort forwarding for all P2P bridges, walking through
    the PCI device tree from the top to the target device, before starting
    configuration transactions, and restore it after the transactions
    are terminated.
 c) Always enable master abort forwarding for all P2P bridges, and have
    the exception handler recover from the system error if
     - the exception is caused by a PCI configuration transaction,
     - the host bridge received a target abort,
     - the status register of the nearest P2P bridge to the target device
       shows master abort (how to know the target device?)
    and set the pcibios_config_read_xx() return value to ~0 (how?).
    We also have to rewrite pcibios_config_xx() to be machine check
    exception safe. That's along the lines of your suggestion, I think.

Can I assume this, or not?

Probably I can try this config read function,

> int grackle_pcibios_read_config_byte(unsigned char bus, unsigned char dev_fn,
>				     unsigned char offset, unsigned char *val)
> {
>	struct bridge_data *bp;
>
>	if (bus > max_bus || (bp = bridges[bus]) == 0)
>		return PCIBIOS_DEVICE_NOT_FOUND;
>	out_be32(bp->cfg_addr, GRACKLE_CFA(bus, dev_fn, offset));
> +	expect_machine_check = 1;
> +	mb();
>	*val = in_8(bp->cfg_data + (offset & 3));
> +	mb();
> +	expect_machine_check = 0;
>	return PCIBIOS_SUCCESSFUL;
> }
> however this won't work because the SRR0 on the machine check might point
> to the load instruction of in_8, leading to an infinite machine check
> loop. So you have to go the harder way:
>
> replacing the in_8 with something like:
>
>	asm volatile(
>		"sync; "
>		"1: lbzx %0,%1,%2;"
>		"2: sync;"
>		"3: isync;"
>		"4: ;"
>		" .section .fixup;"
>		"5: li %0,-1;"
>		" b 4b;"
>		" .previous;"
>		" .section __ex_table;"
>		" .long 1b,5b,2b,5b,3b,5b;"
>		" .previous;"
>		: "r" (val)
>		: "b" (bp->cfg_data), "r" (offset & 3))
>	)

but from this point on it is beyond my understanding. I don't believe I
can write a correct exception handler, which seems very complicated, but
it may be worth trying. In any case it will have to wait until next
weekend or the one after. I'll have to read more kernel code and PPC
documents.

> and I've still probably forgotten something. I would also like to know
> at which instruction SRR0 points when we have a machine check. The
> architecture description deliberately gives a lot of latitude to the chip
> designers, it might even be necessary to perform some lengthy operation to
> make sure that this works because the machine check might be delayed
> enough clocks to let the processor proceed past the isync.
>
> Then you have to modify the machine check handler to use the fixup table
> as in arch/ppc/mm/fault.c:
>
>	/* Are we prepared to handle this fault? */
>	if ((fixup = search_exception_table(regs->nip)) != 0) {
>		regs->nip = fixup;
>		return;
>	}

Thank you for your advice.

Ryuichi Oikawa
roikawa@rr.iij4u.or.jp

P.S. In order to run Linux on the Blue G3, the pci-ide driver (UltraATA
controller, CMD646) has to be modified for PPC. Has anyone tried this?
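Option (a) in the message above comes down to clearing one bit in the bridge control register of each P2P bridge. The sketch below models this in user space under stated assumptions: struct fake_bridge and the cfg_read16/cfg_write16 helpers are hypothetical stand-ins for real configuration space accessors, and the two constants carry the values from the Linux pci.h header (register offset 0x3e, master abort mode bit 0x20). It is an illustration of the read-modify-write, not the code that was actually booted.

```c
#include <stdint.h>

#define PCI_BRIDGE_CONTROL		0x3e	/* offset in a type 1 (bridge) header */
#define PCI_BRIDGE_CTL_MASTER_ABORT	0x20	/* master abort mode bit */

/* Hypothetical stand-in for one bridge's 256-byte configuration space. */
struct fake_bridge {
	uint8_t cfg[256];
};

/* Configuration space is little-endian: low byte at the lower offset. */
static uint16_t cfg_read16(const struct fake_bridge *b, unsigned int off)
{
	return (uint16_t)(b->cfg[off] | (b->cfg[off + 1] << 8));
}

static void cfg_write16(struct fake_bridge *b, unsigned int off, uint16_t v)
{
	b->cfg[off] = (uint8_t)(v & 0xff);
	b->cfg[off + 1] = (uint8_t)(v >> 8);
}

/* Clear the master abort mode bit so the bridge stops turning a master
 * abort on its secondary side into a target abort on its primary side.
 * All other bridge control bits are preserved. The previous register
 * value is returned so a caller could inspect or restore it. */
uint16_t disable_master_abort_forwarding(struct fake_bridge *b)
{
	uint16_t old = cfg_read16(b, PCI_BRIDGE_CONTROL);

	cfg_write16(b, PCI_BRIDGE_CONTROL,
		    (uint16_t)(old & ~PCI_BRIDGE_CTL_MASTER_ABORT));
	return old;
}
```

With forwarding disabled, a configuration read of an empty slot behind the bridge is expected to complete with all ones rather than raising an error, which is why probing code usually treats a vendor ID of 0xffff as "no device".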
* Re: Blue G3 and machine check
1999-04-05 16:36 ` Ryuichi Oikawa
@ 1999-04-05 17:11 ` Gabriel Paubert
0 siblings, 0 replies; 33+ messages in thread
From: Gabriel Paubert @ 1999-04-05 17:11 UTC (permalink / raw)
To: Ryuichi Oikawa; +Cc: Paul.Mackerras, bh40, linuxppc-dev

On Tue, 6 Apr 1999, Ryuichi Oikawa wrote:

> Hello,
> >	default:
> >		printk("Unknown values in msr\n");
>			^^^^^^^^^^^^^^^^
> It was reached here. Isn't it a machine check?

I realize now that the code in traps.c is completely bogus. Please apply
first the following patch (it seems that the author did not realize that
bits are given in big endian ordering in PPC doc):

--- linux-2.2.4/arch/ppc/kernel/traps.c	Tue Jan 5 19:13:56 1999
+++ linux/arch/ppc/kernel/traps.c	Mon Apr 5 19:04:57 1999
@@ -104,19 +104,19 @@
 	printk("Machine check in kernel mode.\n");
 	printk("Caused by (from msr): ");
 	printk("regs %p ",regs);
-	switch( regs->msr & 0x0000F000)
+	switch( (regs->msr & 0x000F0000) >> 16 )
 	{
-	case (1<<12) :
+	case (8) :
 		printk("Machine check signal - probably due to mm fault\n"
 			"with mmu off\n");
 		break;
-	case (1<<13) :
+	case (4) :
 		printk("Transfer error ack signal\n");
 		break;
-	case (1<<14) :
+	case (2) :
 		printk("Data parity signal\n");
 		break;
-	case (1<<15) :
+	case (1) :
 		printk("Address parity signal\n");
 		break;
 	default:

> Before I try to do your suggesion, I'd like to confirm a few things
> for my understandig. MPC106 user's manual says,
> "The SERR signal is used to report PCI address parity errors,
> PCI data parity errors on a special-cycle command, target-abort,
> or any other errors where the result is potentially catastrophic.
> The SERR signal is also asserted for master-abort, except if it
> happens for a PCI configuration access or special-cycle transaction. "

I did not know there was an exception for PCI configuration cycles (I
thought it was for special cycles only, which are designed to end in
master abort) and I don't like it :-( : it makes the bridges non
transparent wrt error handling.
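The bit-ordering point behind the traps.c patch can be checked with a small stand-alone program. This is a sketch only: PPC_BIT32 and machine_check_cause are hypothetical names introduced here, and the strings are shortened from the printk messages. PowerPC documentation numbers bits from the most significant end, so "bit 12" of a 32-bit register is the mask 1 << (31 - 12), not 1 << 12.

```c
#include <stdint.h>

/* PowerPC manuals call the most significant bit of a 32-bit register
 * bit 0, so documented bit n corresponds to this mask. */
#define PPC_BIT32(n)	((uint32_t)1 << (31 - (n)))

/* Decode the cause field the same way the patched switch does:
 * documented bits 12..15 land in the mask 0x000F0000. */
const char *machine_check_cause(uint32_t msr)
{
	switch ((msr & 0x000F0000) >> 16) {
	case 8:
		return "Machine check signal";
	case 4:
		return "Transfer error ack signal";
	case 2:
		return "Data parity signal";
	case 1:
		return "Address parity signal";
	default:
		return "Unknown values in msr";
	}
}
```

A value with only 1 << 12 set, which is what the unpatched code tested for, falls through to the default case here, consistent with the "Unknown values in msr" report quoted at the top of this message.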
> > Because MPC106 cannot master abort as far as P2P bridges are acting > normally, P2P bridges have to report master abort to the host bridge. > According DEC21154 user's manual it forwards a master abort as a target > abort when master abort mode bit in bridge control register is set 1, > except special-cycle transaction. Therefore, in this case scanning PCI > devices with configuration reads must cause master abort, forwarded as > a target abort and then MPC106 asserts SERR. We cannot know if it is > really a target abort until we check the status register of the nearst > P2P bridge to the target device. > > Therefore the ways work through this problem may be, from easiest way > to difficult, > a) Disable master abort fowarding for all P2P bridges, which I tried, > but this also disables master abort forwarding for usual R/W > transactions. In most systems the serr signal is never signaled. This is an acceptable workaround for now. The PCI transaction times out and nothing serious happens. > b) Disable master abort fowarding for all P2P bridges walking through > PCI device tree from the top to the target device before starting > configuration transactions, and restore after the transactions > are terminated. Would result in inconsistent handling of errors on SMP, don't do that. > c) Always enable master abort fowarding for all P2P bridges and exception > handler recovers system error if > - exception is caused by PCI configration transaction, > - host bridge recieved a target abort, > - status register of the nearst P2P bridge to the target device > shows master abort (how to know the target device?) > and sets pcibios_config_read_xx() return value to ~0 (how?). > We also have to rewrite pcibios_config_xx() as machine check exception > safe. That's along your suggestion, I think. Indeed. But if may not be the simplest. So don't hold your breath. > Can I assume this, or not? Yes. 
> > replacing the in_8 with something like: > > > > asm volatile( > > "sync; " > > "1: lbzx %0,%1,%2;" > > "2: sync;" > > "3: isync;" > > "4: ;" > > " .section .fixup;" > > "5: li %0,-1;" > > " b 4b;" > > " .previous;" > > " .section __ex_table;" > > " .long 1b,5b,2b,5b,3b,5b;" > > " .previous;" > > : "r" (val) > > : "b" (bp->cfg_data), "r" (offset & 3)) > > ) > > > but it is beyond my understanding from the next. I don't believe I can > write correct exception handler which seems very complicated. But it may > be worth to try. Anyway it'll be next or after next weekend. I'll have to > read more kernel code and PPC documents. Actually I think it would be easier to provide a few `machine check safe functions' in arch/ppc/kernel/ which would be called something like safe_{read,write}[bwl] (you are encouraged to suggest better names) and would return a status indicating whether the operation had succeeded or not with prototypes like: int safe_readb(volatile u_char *, u_char *) int safe_writeb(u_char, volatile u_char *) returning either 0 (success) or -ENXIO on error (would it be the right error code). Export these functions since they might be useful in some other cases. Regards, Gabriel. P.S: the patch I've put on my ftp server for the MVME2600 (ftp://vcorr1.iram.es/pub/linux-2.2/mvem2600.generic-patch-2.2.4) includes some modifications to the PCI code which go in the right direction. However, it still needs some work. [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
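The calling convention proposed above for safe_readb (status as the return value, data through a pointer) can be illustrated with a user-space model. This is a sketch under loud assumptions: a real version would need the asm and exception-table fixup discussed earlier in the thread, whereas here the fault is faked with a hypothetical simulate_fault flag, and device_present is an invented caller added only to show how the status would be consumed.

```c
#include <errno.h>

/* Hypothetical test hook standing in for the hardware: when non-zero,
 * the next read behaves as if a machine check was taken and fixed up. */
int simulate_fault;

/* Model of the proposed prototype: 0 on success, -ENXIO when the access
 * faulted. On failure *val is set to all ones, mirroring the "li %0,-1"
 * in the fixup code. */
int safe_readb(volatile unsigned char *addr, unsigned char *val)
{
	if (simulate_fault) {
		*val = 0xff;
		return -ENXIO;
	}
	*val = *addr;
	return 0;
}

/* Invented caller: a probe treats both a failed access and an all-ones
 * byte as "nothing there". */
int device_present(volatile unsigned char *id_reg)
{
	unsigned char id;

	if (safe_readb(id_reg, &id) != 0)
		return 0;
	return id != 0xff;
}
```

Returning the status separately from the data lets a caller distinguish a faulted access from a register that legitimately reads back as 0xff, which a bare in_8() replacement could not do.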
* Re: Blue G3 and machine check 1999-03-31 16:20 ` Ryuichi Oikawa 1999-03-31 18:39 ` Gabriel Paubert @ 1999-04-04 1:17 ` Joel Klecker 1999-04-09 5:58 ` Joel Klecker 1 sibling, 1 reply; 33+ messages in thread From: Joel Klecker @ 1999-04-04 1:17 UTC (permalink / raw) To: Ryuichi Oikawa; +Cc: linuxppc-dev At 01:20 +0900 1999-04-01, Ryuichi Oikawa wrote: [snip] >I put precompiled kernel(2.2.1) on > >ftp://ppc.linux.or.jp/pub/users/oikawa/linux-pmac/bluemacg3/vmlinux-2. >2.1-challenger2.gz >I was given many useful advice on this by Benjamin Herrenschmidt. Any chance you could put up a proper diff for the sources used to build that kernel? I'd really appreciate it. Thanks. -- Joel Klecker (aka Espy) Debian GNU/Linux Developer <URL:mailto:jk@espy.org> <URL:mailto:espy@debian.org> <URL:http://web.espy.org/> <URL:http://www.debian.org/> [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Blue G3 and machine check 1999-04-04 1:17 ` Joel Klecker @ 1999-04-09 5:58 ` Joel Klecker 1999-04-09 16:12 ` Ryuichi Oikawa 0 siblings, 1 reply; 33+ messages in thread From: Joel Klecker @ 1999-04-09 5:58 UTC (permalink / raw) To: linuxppc-dev At 17:17 -0800 1999-04-03, Joel Klecker wrote: >At 01:20 +0900 1999-04-01, Ryuichi Oikawa wrote: >[snip] >>I put precompiled kernel(2.2.1) on >> >>ftp://ppc.linux.or.jp/pub/users/oikawa/linux-pmac/bluemacg3/vmlinux-2. >>2.1-challenger2.gz >>I was given many useful advice on this by Benjamin Herrenschmidt. > >Any chance you could put up a proper diff for the sources used to build that >kernel? I'd really appreciate it. Thanks. Since I received absolutely no response to this, let me elaborate. I have had success in booting and running a system with this kernel, but its config does not suit my needs, therefore I really would like to have the source. The pseudo-diffs in the message I originally replied to do not seem to be enough to produce a working kernel, are they really all of the changes? If they are, then I would like a copy of the .config so I can see what the differences in configuration are. Thanks again. -- Joel Klecker (aka Espy) Debian GNU/Linux Developer <URL:mailto:jk@espy.org> <URL:mailto:espy@debian.org> <URL:http://web.espy.org/> <URL:http://www.debian.org/> [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Blue G3 and machine check 1999-04-09 5:58 ` Joel Klecker @ 1999-04-09 16:12 ` Ryuichi Oikawa 0 siblings, 0 replies; 33+ messages in thread From: Ryuichi Oikawa @ 1999-04-09 16:12 UTC (permalink / raw) To: jk; +Cc: linuxppc-dev From: Joel Klecker <jk@espy.org> Subject: Re: Blue G3 and machine check > >>ftp://ppc.linux.or.jp/pub/users/oikawa/linux-pmac/bluemacg3/vmlinux-2. > >>2.1-challenger2.gz > >>I was given many useful advice on this by Benjamin Herrenschmidt. > > > >Any chance you could put up a proper diff for the sources used to build that > >kernel? I'd really appreciate it. Thanks. > > Since I received absolutely no response to this, let me elaborate. I > have had success in booting and running a system with this kernel, > but its config does not suit my needs, therefore I really would like > to have the source. The pseudo-diffs in the message I originally > replied to do not seem to be enough to produce a working kernel, are > they really all of the changes? Yes, that was all. > If they are, then I would like a copy > of the .config so I can see what the differences in configuration > are. Thanks again. I thought it was almost default... Then I'll make complete diff to 2.2.1 kernel from samba and put it later on ftp://ppc.linux.or.jp/pub/users/oikawa/linux-pmac/bluemacg3/ But please note that - it is minimal workaround and should not be used for long time - ultraATA chip(CMD646) driver has not been ported yet, so that if you need it, you have to rewrite yourself. But usual ATA bus, which CDROM is connected, does work. - it is a diff to kernel 2.2.1, not to latest one Regards, Ryuichi Oikawa roikawa@rr.iij4u.or.jp [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. 
]] ^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: Blue G3 and machine check 1999-03-24 23:12 ` Paul Mackerras 1999-03-25 11:20 ` Gabriel Paubert @ 1999-03-25 12:10 ` Benjamin Herrenschmidt 1 sibling, 0 replies; 33+ messages in thread From: Benjamin Herrenschmidt @ 1999-03-25 12:10 UTC (permalink / raw) To: Paul.Mackerras, linuxppc-dev On Thu, Mar 25, 1999, Paul Mackerras <paulus@cs.anu.edu.au> wrote: >Did the original poster say whether the machine checks were on config >space accesses or I/O or memory space accesses? It's common enough >for drivers written for intel linux to go probing I/O ports to try to >find devices to talk to. The original poster wrote that the machine check happens when reading config space on a non-existent device thru the PCI<->PCI bridge. The proposed fix was simply to add two globals "machine_check_expected" and "machine_check_received". The first one set to true before the probe, and the machine check exception handler incrementing the second one when the first one is true. -- E-Mail: <mailto:bh40@calva.net> BenH. Web : <http://calvaweb.calvacom.fr/bh40/> [[ This message was sent via the linuxppc-dev mailing list. Replies are ]] [[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]] [[ reply is of general interest. Please check http://lists.linuxppc.org/ ]] [[ and http://www.linuxppc.org/ for useful information before posting. ]] ^ permalink raw reply [flat|nested] 33+ messages in thread
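The two-global scheme described in the message above can be sketched as a user-space model. Only the two variable names come from the message; machine_check_handler, probe_faulted and the two accessor stand-ins are hypothetical, and the memory barriers a real kernel version would need around the access are omitted.

```c
/* Set to true before a probe that may legitimately fault. */
volatile int machine_check_expected;
/* Incremented by the handler when a fault arrives during such a probe. */
volatile int machine_check_received;

/* Model of the handler: returns 1 if the machine check was anticipated
 * and counted, 0 if it should be treated as fatal. */
int machine_check_handler(void)
{
	if (machine_check_expected) {
		machine_check_received++;
		return 1;
	}
	return 0;
}

/* Stand-in for a config read of a device that exists: no fault. */
void present_device_access(void)
{
}

/* Stand-in for a config read of an empty slot: the hardware would raise
 * a machine check, modelled here by calling the handler directly. */
void absent_device_access(void)
{
	machine_check_handler();
}

/* Bracket one access with the flag and report whether it faulted. */
int probe_faulted(void (*access)(void))
{
	int faulted;

	machine_check_received = 0;
	machine_check_expected = 1;
	access();
	machine_check_expected = 0;
	faulted = machine_check_received != 0;
	return faulted;
}
```

This model says nothing about where execution resumes after the handler returns; on real hardware that depends on the saved instruction address, which is the point the rest of the thread goes on to discuss.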