* Software Emulation Exception...
From: Wohlgemuth, Jason @ 2000-04-07 13:54 UTC
To: 'linuxppc-embedded@lists.linuxppc.org'

Hi everyone,

I was hoping you all would pass some knowledge on to me. Occasionally our application dies and our board goes into a loop of printing

"Software Emulation %s/%d NIP: %lx *NIP: 0x%x code: %x"

which comes from arch/ppc/traps.c. Would the fact that I have kernel math emulation (KME) turned on increase the chances of this error? Without KME the Soft_emulate_8xx routine gets called, and I didn't know whether that handles this a little better.

Thanks,
Jason

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

* Re: Software Emulation Exception...
From: Pavel Roskin @ 2000-04-07 15:20 UTC
To: Wohlgemuth, Jason
Cc: 'linuxppc-embedded@lists.linuxppc.org'

Hello, Jason!

> I was hoping you all would pass some knowledge on to me. Occasionally our
> application dies and our board goes into a loop of printing
>
> "Software Emulation %s/%d NIP: %lx *NIP: 0x%x code: %x"

This happened to me when I tried to read more than 8 megabytes of data from ATA Flash in one go, e.g. with

    dd if=/dev/hda of=/dev/null count=17000

The problem has gone after upgrading to the Montavista kernel (from their CDK 1.0).

However, I have no idea what caused the problem and whether it has been fixed.

By the way, I found a newer kernel (March, 23) in ftp://ftp.mvista.com/pub/CDK/updates/1.0/ppc_8xx/SRPMS/
I haven't tested it yet.

Pavel Roskin

* Re: Software Emulation Exception...
From: Dan Malek @ 2000-04-07 15:44 UTC
To: Pavel Roskin
Cc: Wohlgemuth, Jason, 'linuxppc-embedded@lists.linuxppc.org'

> > "Software Emulation %s/%d NIP: %lx *NIP: 0x%x code: %x"

So, decode the address/instruction and see if it makes sense.

> The problem has gone after upgrading to the Montavista kernel (from their
> CDK 1.0)

Since I know something about both of your custom board designs, I can say that the software is unlikely to be the trouble, although it could affect the problem. This exception is typical of custom designs that haven't properly designed the external device interfaces to the 8xx bus. I strongly suspect one of the external devices isn't working correctly, and when the processor is executing code from DRAM it gets some trash from the bus instead of the proper instruction. After something like this occurs, dump the memory at this address and check to make sure what the processor claims to have executed is really there.

> However, I have no idea what caused the problem and whether it has been
> fixed.

If you haven't attached a logic analyzer, tripped the error, and verified the bus timing you will never know what caused it. Things like this don't fix themselves :-). These problems are sensitive to the instruction stream (the instruction pointer, the instruction itself, and access to the device). Moving the code around may appear to "fix" the problem, but it is still there waiting to happen once the product is shipping......

-- Dan
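
The "decode the address/instruction" step can be started by hand from the trap message: the *NIP value is the 32-bit instruction word the processor tried to execute. The following is a minimal sketch, assuming a POSIX shell whose arithmetic accepts hexadecimal constants; the function names are hypothetical and not part of any kernel tool. It extracts the primary opcode (the top 6 bits of a PowerPC instruction word) and flags the floating-point opcode ranges, for which the 8xx core has no hardware. A word that does not decode to anything sensible hints that trash was fetched.

```shell
#!/bin/sh
# Hypothetical helpers for a first look at the *NIP instruction word.

# Primary opcode: the top 6 bits of the 32-bit instruction word.
ppc_primary_opcode() {
    echo $(( ($1 >> 26) & 0x3f ))
}

# Extended opcode (X-form): the 10 bits just above the lowest bit.
# Only meaningful for primary opcodes that use it, such as 31 and 63.
ppc_extended_opcode() {
    echo $(( ($1 >> 1) & 0x3ff ))
}

# Rough classification. Indexed floating-point loads/stores live under
# primary opcode 31 and are NOT detected by this sketch.
ppc_classify() {
    op=$(ppc_primary_opcode "$1")
    case "$op" in
        0)  echo "primary opcode 0: reserved, probably data rather than code" ;;
        48|49|50|51|52|53|54|55)
            echo "primary opcode $op: floating-point load/store" ;;
        59|63)
            echo "primary opcode $op: floating-point arithmetic" ;;
        *)  echo "primary opcode $op: look it up in the PowerPC opcode table" ;;
    esac
}
```

For example, `ppc_classify 0xfc00002a` reports a floating-point arithmetic instruction, which an application built with hardware floating point would legitimately contain, whereas an all-zero word is more consistent with the processor having read something other than code.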

* Re: Software Emulation Exception...
From: Marcus Sundberg @ 2000-04-10 9:37 UTC
To: Dan Malek
Cc: Pavel Roskin, Wohlgemuth, Jason, 'linuxppc-embedded@lists.linuxppc.org'

Dan Malek <dan@netx4.com> writes:

> > > "Software Emulation %s/%d NIP: %lx *NIP: 0x%x code: %x"
>
> So, decode the address/instruction and see if it makes sense.
>
> > The problem has gone after upgrading to the Montavista kernel (from their
> > CDK 1.0)
>
> Since I know something about both of your custom board designs, I
> can say that the software is unlikely to be the trouble, although it
> could affect the problem. This exception is typical of custom designs
> that haven't properly designed the external device interfaces to the
> 8xx bus.

Well, it also happens to be typical to the MM problem solved by the patches at http://www.zeta.org.au/~linsol/ (which are included in the Montavista kernel).

And by Pavel's description it's almost 100% certain that the MM bug is the problem.

//Marcus
--
Signature under construction, please come back later.

* Re: Software Emulation Exception...
From: Dan Malek @ 2000-04-10 16:26 UTC
To: Marcus Sundberg
Cc: Pavel Roskin, Wohlgemuth, Jason, 'linuxppc-embedded@lists.linuxppc.org'

Marcus Sundberg wrote:

> Well, it also happens to be typical to the MM problem solved by
> the patches at http://www.zeta.org.au/~linsol/.....

That's pretty interesting, since the Software Emulation trap is caused by fetching trash from memory, and the MMU changes are to properly track dirty data pages. I don't buy it. All you did by adding these changes was eliminate (or change) the data TLB miss timing. Have you ever looked at the 8xx bus when it gets a TLB fault? Pretty damn weird. I've seen several external devices not behave properly because of this.

> And by Pavel's description it's almost 100% certain that the MM
> bug is the problem.

Well, almost 100% doesn't cut it for me. If you can't determine the actual cause of the problem and prove it has been corrected, it is still broken.

-- Dan

* Re: Software Emulation Exception...
From: Jeff Millar @ 2000-04-11 2:36 UTC
To: Dan Malek, Marcus Sundberg
Cc: Pavel Roskin, Wohlgemuth, Jason, linuxppc-embedded

Pavel was running on a virgin RPX-lite board with PCMCIA adapter and a Sandisk flash disk plugged in. Not on the new hardware. So, is it a problem with the RPX board or the Sandisk?

jeff

* Re: Software Emulation Exception...
From: Dan Malek @ 2000-04-11 2:03 UTC
To: Jeff Millar
Cc: Marcus Sundberg, Pavel Roskin, Wohlgemuth, Jason, linuxppc-embedded

Jeff Millar wrote:

> .... So, is it a
> problem with the RPX board or the Sandisk?

Like I said before. I can't guess, and neither should you. The fact is the software emulation trap is caused by fetching an instruction that can't be properly decoded. We need to determine what it fetched and where that came from.

These are some fundamental engineering skills we are trying to use here. Why aren't we using them?

The PCMCIA interfaces have their timing requirements. On the 8xx, these are all very programmable. Look at the card specifications and the values programmed. Do they seem reasonable? Don't simply run maximum, excessively long timing on the PCMCIA. This can be just as bad on some cards or other external devices as running too fast.

Attach a logic analyzer. Does the timing on the 8xx bus look reasonable? Use one of the GPIO pins to trigger the analyzer when you trap the fault (as early as possible). Look at the bus signals. What really happened?

-- Dan

* Re: Software Emulation Exception...
From: Marcus Sundberg @ 2000-04-11 8:50 UTC
To: Dan Malek
Cc: Pavel Roskin, Wohlgemuth, Jason, 'linuxppc-embedded@lists.linuxppc.org'

Dan Malek <dan@netx4.com> writes:

> Marcus Sundberg wrote:
>
> > Well, it also happens to be typical to the MM problem solved by
> > the patches at http://www.zeta.org.au/~linsol/.....
>
> That's pretty interesting, since the Software Emulation trap is
> caused by fetching trash from memory, and the MMU changes are to
> properly track dirty data pages. I don't buy it. All you did
> by adding these changes was eliminate (or change) the data TLB
> miss timing.

Well, I've verified the problem on MBX, ADS, FADS, RPX Lite/Classic and 3 different custom boards, with 823, 850, 860, 860T and 860P CPUs. It has a 100% chance of crashing without the patch, and a 0% chance of crashing with it.

Pavel's description in particular sounded very much like this problem. All you have to do to experience the problem is allocate memory until the RAM gets low (allocating buffer cache by copying files around is enough).

If you are running a dynamically linked glibc2 app, what is most likely to happen is that the (dirty) jump table is thrown out by the broken MM code. When the application tries to call a function in a shared library, the jump table will be read in from disk again, but as it's not filled in properly the app will crash, usually by jumping into some data area and getting an illegal instruction. What happens next is usually that the app's parent will run, and crash for the same reason. This will go on until we get up to init, which cannot be killed, and get stuck in an endless loop of trying to execute the same illegal instruction.

If you are running a statically linked app, or using libc 1.99, it is more common to get a segfault, but in the end it all depends on which pages have been erroneously thrown out and which are accessed first.

//Marcus
--
Signature under construction, please come back later.

* Re: Software Emulation Exception...
From: Dan Malek @ 2000-04-11 12:08 UTC
To: Marcus Sundberg
Cc: Pavel Roskin, Wohlgemuth, Jason, 'linuxppc-embedded@lists.linuxppc.org'

Marcus Sundberg wrote:

> .... It has a 100% chance of crashing without the patch, and a 0%
> chance of crashing with it.

I agree the patch is necessary to fix a problem managing dirty pages......

> All you have to do to experience the problem is allocate memory
> until the RAM gets low

....but this wasn't the test....

> .... If you are running a dynamically linked glibc2
> app what is most likely to happen is that the (dirty) jump table
> is thrown out by the broken MM code.

If, if, if......Is this really what is happening?

I am concerned about proving a real problem and fixing it. If you aren't, that's OK. I have other things to do.

-- Dan

* Re: Software Emulation Exception...
From: Marcus Sundberg @ 2000-04-11 14:51 UTC
To: Dan Malek
Cc: Pavel Roskin, Wohlgemuth, Jason, 'linuxppc-embedded@lists.linuxppc.org'

Dan Malek <dan@netx4.com> writes:

> Marcus Sundberg wrote:
>
> > .... It has a 100% chance of crashing without the patch, and a 0%
> > chance of crashing with it.
>
> I agree the patch is necessary to fix a problem managing dirty
> pages......
>
> > All you have to do to experience the problem is allocate memory
> > until the RAM gets low
>
> ....but this wasn't the test....

I haven't seen any posts about "the test", but I doubt you can find a better test for the dirty page bug.

> > .... If you are running a dynamically linked glibc2
> > app what is most likely to happen is that the (dirty) jump table
> > is thrown out by the broken MM code.
>
> If, if, if......Is this really what is happening?

For me it is - I've watched the pte_val of a dirty page containing the jump table change to zero (kernel throwing the page out), and seen the crash when the page is being paged back in from disk.

Obviously I can't be sure about other people's problems as I haven't seen them happen, but if both the symptoms and the fix are identical there is a good chance that the problem is the same as well.

> I am concerned about proving a real problem and fixing it.
> If you aren't, that's OK. I have other things to do.

A (serious) real problem has been found, and the fix is available.

//Marcus
--
Signature under construction, please come back later.

* Unpacking the mvista kernel RPMs
From: Graham Stoney @ 2000-04-10 3:14 UTC
To: Pavel Roskin
Cc: Wohlgemuth Jason, 'linuxppc-embedded@lists.linuxppc.org'

Pavel Roskin writes:

> By the way, I found a newer kernel (March, 23) in
> ftp://ftp.mvista.com/pub/CDK/updates/1.0/ppc_8xx/SRPMS/

How does one go about unpacking one of these RPMs?

I tried ftp'ing the rpm and going:

    % rpm -i hhl-kernel-2.2.13-7.src.rpm
    % rpm -bp /usr/src/redhat/SPECS/hhl-kernel-2.2.13.spec
    ...
    cp: configs/%{_hhl_kernel_configs}: No such file or directory
    Bad exit status from /var/tmp/rpm-tmp.689 (%prep)

Can anyone please help me get the hang of RPM? I'm sure it's _supposed_ to make my life easier, but...

Thanks,
Graham

* RE: Unpacking the mvista kernel RPMs
From: Jason Wohlgemuth @ 2000-04-10 3:25 UTC
To: Graham Stoney, Pavel Roskin
Cc: linuxppc-embedded

Graham,

If you just want to get the files out, try this. Work within a temp directory:

    rpm2cpio < kernel.rpm > kernel.cpio
    cpio --make-directories --extract --verbose < kernel.cpio

Enjoy,
Jason
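
The two commands above can be wrapped into a small helper. This is a minimal sketch, assuming rpm2cpio and GNU cpio are installed; the function name unpack_rpm and its arguments are made up for illustration. It only extracts the RPM payload into a scratch directory and does not apply any patches.

```shell
#!/bin/sh
# unpack_rpm RPMFILE [DESTDIR]
# Hypothetical helper: extract the payload of an RPM into DESTDIR
# (default: a new directory named after the RPM) without installing it.
unpack_rpm() {
    if [ $# -lt 1 ] || [ ! -f "$1" ]; then
        echo "usage: unpack_rpm RPMFILE [DESTDIR]" >&2
        return 2
    fi
    rpm=$1
    dest=${2:-$(basename "$rpm" .rpm).d}

    # The path must stay valid after changing directory, so make it absolute.
    case "$rpm" in
        /*) ;;
        *)  rpm=$(pwd)/$rpm ;;
    esac

    mkdir -p "$dest" || return 1
    # Run in a subshell so the caller's working directory is untouched.
    (
        cd "$dest" &&
        rpm2cpio "$rpm" | cpio --extract --make-directories --verbose
    )
}
```

Something like `unpack_rpm hhl-kernel-2.2.13-7.src.rpm` would then leave the spec file, tarball and patches in `hhl-kernel-2.2.13-7.src.d/`, where they can be inspected or applied by hand.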

* Re: Unpacking the mvista kernel RPMs
From: Graham Stoney @ 2000-04-10 4:50 UTC
To: Jason Wohlgemuth
Cc: Graham Stoney, Pavel Roskin, linuxppc-embedded

Jason Wohlgemuth writes:

> If you just want to get the files out try this:
...
> rpm2cpio < kernel.rpm > kernel.cpio

Sure, this works, but it doesn't automagically apply the patches. Perhaps I'm just asking too much :-(.

Thanks,
Graham

* Re: Unpacking the mvista kernel RPMs
From: Pavel Roskin @ 2000-04-10 17:47 UTC
To: Graham Stoney
Cc: Jason Wohlgemuth, linuxppc-embedded

Hello!

> Sure, this works, but it doesn't automagically apply the patches.
> Perhaps I'm just asking too much :-(.

You can install source rpm's and use "rpm -bp foo.spec" from /usr/src/redhat/SPECS to "prepare" the sources. All the patches will be applied. "man rpm" for details.

I agree that it would be nice to do it in one step from foo.src.rpm just like "rpm -tp" works on tarballs. But the world is not perfect.

Pavel Roskin

* Re: Unpacking the mvista kernel RPMs
From: Joe Green @ 2000-04-10 16:18 UTC
To: Graham Stoney, Pavel Roskin
Cc: Wohlgemuth Jason, 'linuxppc-embedded@lists.linuxppc.org'

On Sun, 09 Apr 2000, Graham Stoney wrote:

> Pavel Roskin writes:
> > By the way, I found a newer kernel (March, 23) in
> > ftp://ftp.mvista.com/pub/CDK/updates/1.0/ppc_8xx/SRPMS/
>
> How does one go about unpacking one of these RPMs?
>
> I tried ftp'ing the rpm and going:
> % rpm -i hhl-kernel-2.2.13-7.src.rpm
> % rpm -bp /usr/src/redhat/SPECS/hhl-kernel-2.2.13.spec
> ...
> cp: configs/%{_hhl_kernel_configs}: No such file or directory
> Bad exit status from /var/tmp/rpm-tmp.689 (%prep)

First of all, if you just want to build a kernel, you can use the kernel-source RPM rather than the SRPM. This installs a kernel source tree that is preconfigured to use the cross-development tools. You only need to use the SRPM if you want to rebuild the RPMs.

For rebuilding the RPMs, you need to use the "rpmconfig" support. I've been meaning to write a README about this. We're defining our RPM spec files in terms of macros that can be changed for different hosts and targets, so that we can have a common source base for all. You'll have to bear with us a while if you want to work with this architecture right now, because it's still developing.

To build the 2.2.13-7 kernel RPM, you need to install the hhl-rpmconfig-0.0.2-1 RPM. Then you need to configure RPM to include the correct macro files. When building on a Hard Hat Linux or Red Hat Linux system, I do this by creating a .rpmrc file in my home directory that contains the following:

    macrofiles: /usr/lib/rpm/macros:/opt/hardhat/config/rpm/hosts/i686-pc-linux-gnu:/opt/hardhat/config/rpm/targets/%{_target}:~/.rpmmacros

That should all be one line; not sure how it will come out in email.

The /opt/hardhat/config/rpm/hosts/i686-pc-linux-gnu macro file will provide host-specific macros. The /opt/hardhat/config/rpm/targets/%{_target} specification will select a target macro file based on the target specified when invoking rpm. For the 8xx target, we use the rpm option "--target ppc_8xx-hardhat-linux".

I hope this helps.

--
Joe Green <jgreen@mvista.com>
MontaVista Software, Inc.
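
Because the macrofiles entry has to stay on a single line, generating the file from a script avoids mail-wrapping accidents. This is a minimal sketch using the paths quoted above; the function name write_rpmrc is hypothetical, and the combined rpm command in the trailing comment is an assumption pieced together from this thread, not a documented MontaVista procedure.

```shell
#!/bin/sh
# write_rpmrc FILE
# Hypothetical helper: write the single-line macrofiles entry to FILE.
# For real use FILE would be "$HOME/.rpmrc"; back up any existing copy first.
write_rpmrc() {
    [ $# -eq 1 ] || { echo "usage: write_rpmrc FILE" >&2; return 2; }
    # Single quotes keep %{_target} and ~ literal; rpm expands them itself.
    printf '%s\n' 'macrofiles: /usr/lib/rpm/macros:/opt/hardhat/config/rpm/hosts/i686-pc-linux-gnu:/opt/hardhat/config/rpm/targets/%{_target}:~/.rpmmacros' > "$1"
}

# Assumed follow-up (option placement may differ between rpm versions):
#   rpm -bp --target ppc_8xx-hardhat-linux /usr/src/redhat/SPECS/hhl-kernel-2.2.13.spec
```

Writing to a scratch file first, for example `write_rpmrc /tmp/rpmrc.test`, makes it easy to confirm with `wc -l` that the entry really is one line before copying it into place.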