From: Mark Hounschell <markh@compro.net>
To: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"dmarkh@cfl.rr.com" <dmarkh@cfl.rr.com>,
Alain Knaff <alain@knaff.lu>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"fdutils@fdutils.linux.lu" <fdutils@fdutils.linux.lu>,
"Li, Shaohua" <shaohua.li@intel.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28
Date: Fri, 08 Jan 2010 12:42:37 -0500
Message-ID: <4B476E8D.1030308@compro.net>
In-Reply-To: <1261600243.16916.56.camel@localhost.localdomain>
On 12/23/2009 03:30 PM, Pallipadi, Venkatesh wrote:
>>> Can you try this one line patch either on .28 or .32 (with /proc/interrupts
>>> output).
>>> This disables hpet2 and lapic timer should then be used on CPU 0. If things
>>> work with this test patch, we will know that the failure is somehow related
>>> to HPET usage in MSI mode.
>>>
>>> Thanks,
>>> Venki
>>>
>>> Reduce the rating of percpu hpet timer
>>>
>>> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
>>> ---
>>> arch/x86/kernel/hpet.c | 2 +-
>>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
>>> index cafb1c6..f89d17a 100644
>>> --- a/arch/x86/kernel/hpet.c
>>> +++ b/arch/x86/kernel/hpet.c
>>> @@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
>>>  	hpet_setup_irq(hdev);
>>>  	evt->irq = hdev->irq;
>>>
>>> -	evt->rating = 110;
>>> +	evt->rating = 40;
>>>  	evt->features = CLOCK_EVT_FEAT_ONESHOT;
>>>  	if (hdev->flags & HPET_DEV_PERI_CAP)
>>>  		evt->features |= CLOCK_EVT_FEAT_PERIODIC;
>>
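A rough sketch of why the one-line change has this effect, assuming the
clockevents core simply prefers the device with the higher rating for a
CPU. The struct and function names below are made up for illustration and
are not the kernel's. The lapic rating is passed in as a parameter because
its actual value is not shown in the patch above; 100 is only an assumed
value. Feature flags and cpumask checks are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for a clock event device. */
struct clockevent_sketch {
	const char *name;
	int rating;
};

/* Return the candidate with the highest rating (first one wins on ties). */
static const struct clockevent_sketch *
pick_clockevent(const struct clockevent_sketch *devs, size_t n)
{
	assert(devs != NULL && n > 0);
	const struct clockevent_sketch *best = &devs[0];

	for (size_t i = 1; i < n; i++)
		if (devs[i].rating > best->rating)
			best = &devs[i];
	return best;
}

/* Name of the timer that would drive CPU 0 for the given ratings. */
static const char *cpu0_timer_name(int hpet2_rating, int lapic_rating)
{
	const struct clockevent_sketch devs[] = {
		{ "lapic", lapic_rating },	/* assumed rating, see above */
		{ "hpet2", hpet2_rating },	/* 110 before the patch, 40 after */
	};

	return pick_clockevent(devs, 2)->name;
}
```

With hpet2 at 110 and an assumed lapic rating of 100,
cpu0_timer_name(110, 100) picks "hpet2"; with the patched value,
cpu0_timer_name(40, 100) picks "lapic".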
>> That made it work. Used 2.6.32.2
>>
>> cat /proc/interrupts
>>            CPU0       CPU1       CPU2       CPU3
>>   0:         82          0          0          1   IO-APIC-edge     timer
>>   1:          0          0          0         67   IO-APIC-edge     i8042
>>   3:          0          0          0          6   IO-APIC-edge
>>   4:          0          0          0          4   IO-APIC-edge
>>   6:          0          0          0          4   IO-APIC-edge     floppy
>>   8:          0          0          0          8   IO-APIC-edge     rtc0
>>   9:          0          0          0          0   IO-APIC-fasteoi  acpi
>>  12:          0          0         10       1519   IO-APIC-edge     i8042
>>  14:          0          0         39      10995   IO-APIC-edge     pata_atiixp
>>  15:          0          0          3        391   IO-APIC-edge     pata_atiixp
>>  16:          0          0          2        606   IO-APIC-fasteoi  aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
>>  17:          0          0          0          3   IO-APIC-fasteoi  ehci_hcd:usb1, parport0, ni-pci-gpib
>>  18:          0          0         10       2168   IO-APIC-fasteoi  ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
>>  19:          0          0          0        130   IO-APIC-fasteoi  aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
>>  22:          0          0          8       1151   IO-APIC-fasteoi  ahci
>>  24:          0          0          0          0   HPET_MSI-edge    hpet2
>>  29:          0          0          0         48   PCI-MSI-edge     sky2@pci:0000:04:00.0
>> NMI:          0          0          0          0   Non-maskable interrupts
>> LOC:      34842      30177      29672      29632   Local timer interrupts
>> SPU:          0          0          0          0   Spurious interrupts
>> PMI:          0          0          0          0   Performance monitoring interrupts
>> PND:          0          0          0          0   Performance pending work
>> RES:      17501      20449      16670      11224   Rescheduling interrupts
>> CAL:      10554       2336       1102       1071   Function call interrupts
>> TLB:        364        562        753        468   TLB shootdowns
>> ERR:          0
>> MIS:          0
>>
>>
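For anyone repeating this test, a small helper like the following can sum
the per-CPU counters on a single /proc/interrupts line, to confirm that
hpet2 stays at zero while LOC keeps counting. This is only a sketch: the
function name is made up, and it assumes the layout shown above (a label
ending in ':' followed by one decimal counter per CPU).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line.
 * Returns -1 if the line has no "label:" prefix.
 */
static long sum_irq_counts(const char *line, int ncpus)
{
	assert(line != NULL && ncpus > 0);

	const char *p = strchr(line, ':');
	if (p == NULL)
		return -1;
	p++;	/* skip the ':' after the label */

	long total = 0;
	for (int i = 0; i < ncpus; i++) {
		char *end;
		long v = strtol(p, &end, 10);

		if (end == p)	/* fewer columns than CPUs, e.g. ERR/MIS */
			break;
		total += v;
		p = end;
	}
	return total;
}
```

Fed the hpet2 line from the output above with ncpus = 4 it returns 0, and
for the LOC line it returns 124323.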
>> # fdformat /dev/fd0u1440
>> Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
>> Formatting ... done
>> Verifying ... done
>
> Hmmm.. That's very interesting indeed.
>
> That clearly says that HPET MSI interrupts are somehow causing some
> caching side effect in the chipset that results in this floppy DMA
> failure.
>
> Here is what we have so far.
> IRQ 0 is based on HPET legacy interrupt and HPET device is also capable
> of MSI on this platform. So we also have a percpu hpet (hpet2 tied to
> CPU0). percpu hpet was added to avoid the usage of IRQ0+LAPIC broadcast
> in cases where LAPIC timer will stop working in deep C-state. As we have
> only one HPET channel free for percpu HPET, we only have hpet2 tied to
> CPU 0 and other CPUs still have to go through IRQ0+LAPIC broadcast with
> deep C-state.
>
> One problem here is that percpu hpet should only get used when LAPIC
> cannot be used (that is when CPU enters deep C-state). Using hpet2 in
> place of LAPIC timer even when deep C-state is not supported is not
> right in terms of performance. We need some changes here to fix that
> [Problem 1].
>
> But, that still does not explain why we are seeing this problem in the
> first place. I mean, using hpet2 is not optimal, but should not have
> functionality issues like this. Even fixing [Problem 1] above, we may
> see this problem on some other platform that supports deep C-state and
> so has hpet2 enabled for a valid reason.
>
> Also, I am not sure whether the problem also happens if legacy HPET
> interrupts are used during run time in place of LAPIC timer (it may be
> worth trying this with a simple test patch; let me think about it). In
> this case, legacy HPET interrupt rightly goes quiet after boot, giving
> priority to LAPIC timer.
>
> With hpet MSI interrupts, we do a write followed by a read of an HPET
> memory-mapped register to set an HPET channel timeout + a read of the
> global HPET timer. This happens on every timer interrupt on CPU 0. And
> we also have an MSI interrupt being delivered to CPU 0. I cannot think
> of any reason why this can break DMA. We can probably try adding some
> dummy HPET read after the DMA write, to see if that flushes things
> properly.
>
I haven't seen any activity on this thread in a while. Just curious, are we
still working on this?
Is there anything else I can do to help?
Thanks
Mark