From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S263292AbTLOGW2 (ORCPT ); Mon, 15 Dec 2003 01:22:28 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S263303AbTLOGW2 (ORCPT ); Mon, 15 Dec 2003 01:22:28 -0500 Received: from k-kdom.nishanet.com ([65.125.12.2]:5134 "EHLO mail2k.k-kdom.nishanet.com") by vger.kernel.org with ESMTP id S263292AbTLOGWV (ORCPT ); Mon, 15 Dec 2003 01:22:21 -0500 Message-ID: <3FDD531B.40306@nishanet.com> Date: Mon, 15 Dec 2003 01:22:19 -0500 From: Bob User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4 X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux-kernel Subject: Re: IRQ disabled (SATA) on NForce2 and my theory References: <200312151113.52165.ross@datscreative.com.au> In-Reply-To: <200312151113.52165.ross@datscreative.com.au> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Ross Dickson wrote: > > > >>It was most noticeable for the timer interrupt, because the timer >>interrupt is basically always at "high load" and a lack of it would >>result in a hard lockup of the board. However, it now seems like the >>timer interrupt isn't the only interrupt suffering from this issue. >> >> > > >>, I think inserting the small delay in the appropriate IRQ >>handler might fix this, too. >> >> >> >>But there's still the question, why the delay is actually needed for >>NForce2 boards. That would basically mean that you'll have to >>introduce the delay for *every* IRQ, to avoid a lockup of any device >>that will do high load at some time. I bet that, if I put my Firewire >>Card back in (or just use the onboard Firewire ports) and stream a >>video from my DV cam onto the harddisk, it would lock up as well after >>a very short time, since those who know DV also know that DV has a >>very high bandwidth, half an hour of film is like 40GB or the >>like. (However, I can't test this right now, because my DV cam is >>currently not accessible) >> >> I can't get usb or agp8 to work without crashing, though ide onboard and on cards is stable. Might be that delay on all interrupts is neededs. >>So, we're still not "rock solid" with NForce2, I guess... >>Any idea? >> >> >>Regards, >>Julien >>- >> >> > > >If it is a C1 disconnect reconnection as we suspect then it should affect all >local apic interrupts so the following is the conservative approach which may help. > >here is one of many educated? guesswork posts on topic > >http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/2307.html > >If you want to put the delay in all local apic irq acks then remove the delay code >from apic.c and put it in > >/usr/src/linux-2.4.23-rd2/include/asm-i386/apic.h > >also needs to bring in delay.h > >I am trying it now for my patched 2.4.23 (it boots and runs OK so far) kern as >follows but if it is really needed there then it would be better merged within >the macro style of the bad ioapic selection code and given a kernel config >selection mechanism. > > >#ifndef __ASM_APIC_H >#define __ASM_APIC_H > >#include >#include >#include >#include >#include > > > >static inline void ack_APIC_irq(void) >{ > >#if defined(CONFIG_MK7) && defined(CONFIG_BLK_DEV_AMD74XX) > /* > * on 2200XP & nforce2 chipset we need at least 500ns delay here > * to stop lockups with udma100 drive. try to scale delay time > * with cpu speed. Ross Dickson. > */ > ndelay((cpu_khz >> 12)+200 ); /* don't ack too soon or hard lockup */ >#endif > > /* > * ack_APIC_irq() actually gets compiled as a single instruction: > * - a single rmw on Pentium/82489DX > * - a single write on P6+ cores (CONFIG_X86_GOOD_APIC) > * ... yummie. > */ > > /* Docs say use 0 for future compatibility */ > apic_write_around(APIC_EOI, 0); >} > > >Also note I stuffed up the syntax of the original #ifdef, code still works but >only tests the first param not both. The ifdef code should also be adjusted for >the ioapic patch if it is to be used widely on other chipsets and processor types. >Also others more familiar with the kernel build system could choose better >config params to test against. > >Anyhow we are still flying blind so far as manufacturers comments on this >topic. > > ...instead of sending nforce2 boards back saying they don't work with linux, try using competitive jealousy between Phoenix and Award, since Award has a bios update which fixes the lockups(except usb and agp8 and maybe firewire). >So maybe occasional lockups could be caused by this on other AMD cpu systems? >I don't know. > >Don't forget to recompile & install modules given the inline code above. > >Please cc me as to how it goes if you try it. > >Regards >Ross > Another clue to all of this is there is a delay for onboard amd74xx and that never crashed for me, but offboard promise and sii did, on fsck or grep etc. I'm hoping these latest patches with debug=1 will give a clue what the Award bios update does! -Bob