From: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
To: "markh@compro.net" <markh@compro.net>
Cc: Andi Kleen <andi@firstfloor.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"dmarkh@cfl.rr.com" <dmarkh@cfl.rr.com>,
Alain Knaff <alain@knaff.lu>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"fdutils@fdutils.linux.lu" <fdutils@fdutils.linux.lu>,
"Li, Shaohua" <shaohua.li@intel.com>, Ingo Molnar <mingo@elte.hu>
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28
Date: Wed, 23 Dec 2009 12:30:43 -0800
Message-ID: <1261600243.16916.56.camel@localhost.localdomain>
In-Reply-To: <4B3270FE.5090607@compro.net>
On Wed, 2009-12-23 at 11:35 -0800, Mark Hounschell wrote:
> On 12/23/2009 02:18 PM, Pallipadi, Venkatesh wrote:
> > On Wed, Dec 23, 2009 at 09:41:50AM -0800, Mark Hounschell wrote:
> >> On 12/23/2009 11:38 AM, Andi Kleen wrote:
> >>> Linus Torvalds <torvalds@linux-foundation.org> writes:
> >>>
> >>>> It's not using the lapic for CPU0.
> >>>>
> >>>> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
> >>>> expensive to reprogram (compared to the local apic). And having different
> >>>> timers for different CPU's is just odd.
> >>>>
> >>>> The fact that the timer subsystem can do this and it all (mostly) works at
> >>>> all is nice and impressive, but doesn't make it any less crazy ;)
> >>>
> >>> I suspect it's a system where the APIC timer stops in deeper idle
> >>> states and it supports them. In this case CPU #0 does timer broadcasts
> >>> when needed to wake the other CPUs up from deep C, but for that it has
> >>> to run with HPET. At least the other ones can still enjoy the LAPIC
> >>> timer.
> >>>
> >>> This might suggest that Mark's floppy controller doesn't like
> >>> deep C? Mark, did you try booting with processor.max_cstate=1
> >>> and HPET enabled?
> >>
> >> I just did and /proc/interrupts looks the same and the floppy still does
> >> not format.
> >>
> >
> > Can you try this one line patch either on .28 or .32 (with /proc/interrupts
> > output).
> > This disables hpet2 and lapic timer should then be used on CPU 0. If things
> > work with this test patch, we will know that the failure is somehow related
> > to HPET usage in MSI mode.
> >
> > Thanks,
> > Venki
> >
> > Reduce the rating of percpu hpet timer
> >
> > Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
> > ---
> > arch/x86/kernel/hpet.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
> > index cafb1c6..f89d17a 100644
> > --- a/arch/x86/kernel/hpet.c
> > +++ b/arch/x86/kernel/hpet.c
> > @@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
> >  	hpet_setup_irq(hdev);
> >  	evt->irq = hdev->irq;
> >
> > -	evt->rating = 110;
> > +	evt->rating = 40;
> >  	evt->features = CLOCK_EVT_FEAT_ONESHOT;
> >  	if (hdev->flags & HPET_DEV_PERI_CAP)
> >  		evt->features |= CLOCK_EVT_FEAT_PERIODIC;
>
> That made it work. Used 2.6.32.2
>
> cat /proc/interrupts
>            CPU0       CPU1       CPU2       CPU3
>   0:         82          0          0          1   IO-APIC-edge      timer
>   1:          0          0          0         67   IO-APIC-edge      i8042
>   3:          0          0          0          6   IO-APIC-edge
>   4:          0          0          0          4   IO-APIC-edge
>   6:          0          0          0          4   IO-APIC-edge      floppy
>   8:          0          0          0          8   IO-APIC-edge      rtc0
>   9:          0          0          0          0   IO-APIC-fasteoi   acpi
>  12:          0          0         10       1519   IO-APIC-edge      i8042
>  14:          0          0         39      10995   IO-APIC-edge      pata_atiixp
>  15:          0          0          3        391   IO-APIC-edge      pata_atiixp
>  16:          0          0          2        606   IO-APIC-fasteoi   aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
>  17:          0          0          0          3   IO-APIC-fasteoi   ehci_hcd:usb1, parport0, ni-pci-gpib
>  18:          0          0         10       2168   IO-APIC-fasteoi   ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
>  19:          0          0          0        130   IO-APIC-fasteoi   aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
>  22:          0          0          8       1151   IO-APIC-fasteoi   ahci
>  24:          0          0          0          0   HPET_MSI-edge     hpet2
>  29:          0          0          0         48   PCI-MSI-edge      sky2@pci:0000:04:00.0
> NMI:          0          0          0          0   Non-maskable interrupts
> LOC:      34842      30177      29672      29632   Local timer interrupts
> SPU:          0          0          0          0   Spurious interrupts
> PMI:          0          0          0          0   Performance monitoring interrupts
> PND:          0          0          0          0   Performance pending work
> RES:      17501      20449      16670      11224   Rescheduling interrupts
> CAL:      10554       2336       1102       1071   Function call interrupts
> TLB:        364        562        753        468   TLB shootdowns
> ERR:          0
> MIS:          0
>
>
> # fdformat /dev/fd0u1440
> Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
> Formatting ... done
> Verifying ... done
Hmmm.. That's very interesting indeed.

That clearly says that HPET MSI interrupts are somehow causing some
caching side effect in the chipset that results in this floppy DMA
failure.

Here is what we have so far.

IRQ 0 is based on the HPET legacy interrupt, and the HPET device is also
capable of MSI on this platform. So we also have a percpu hpet (hpet2 tied
to CPU 0). percpu hpet was added to avoid the usage of IRQ0+LAPIC broadcast
in cases where the LAPIC timer stops working in deep C-states. As we have
only one HPET channel free for percpu HPET, only hpet2 is tied to CPU 0,
and the other CPUs still have to go through IRQ0+LAPIC broadcast in deep
C-states.

One problem here is that percpu hpet should only get used when the LAPIC
cannot be used (that is, when the CPU enters a deep C-state). Using hpet2
in place of the LAPIC timer even when deep C-states are not supported is
not right in terms of performance. We need some changes here to fix that
[Problem 1].
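
To make the effect of the one-line test patch concrete, here is a minimal sketch of rating-based timer selection. It is not the kernel's clockevents code: struct clock_dev, pick_best() and the two tables are hypothetical names for illustration, and the LAPIC rating of 100 is an assumption. Only the hpet2 ratings (110 before, 40 after) come from the patch quoted above.

```c
#include <stddef.h>

/* Simplified stand-in for a clock event device; not the kernel's struct. */
struct clock_dev {
	const char *name;
	int rating;	/* a higher rating is preferred */
};

/*
 * Candidate timers on CPU 0. The LAPIC rating of 100 is an assumption;
 * 110 and 40 are the hpet2 ratings before and after the test patch.
 */
static const struct clock_dev before_patch[] = {
	{ "lapic", 100 }, { "hpet2", 110 },
};
static const struct clock_dev after_patch[] = {
	{ "lapic", 100 }, { "hpet2", 40 },
};

/* Return the candidate with the highest rating, or NULL if there are none. */
static const struct clock_dev *pick_best(const struct clock_dev *devs, size_t n)
{
	const struct clock_dev *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || devs[i].rating > best->rating)
			best = &devs[i];
	return best;
}
```

Under these assumptions, pick_best(before_patch, 2) selects hpet2 and pick_best(after_patch, 2) selects lapic, which is consistent with hpet2 showing zero interrupts in the /proc/interrupts output above.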
But that still does not explain why we are seeing this problem in the
first place. I mean, using hpet2 is not optimal, but it should not cause
functionality issues like this. Even after fixing [Problem 1] above, we
may see this problem on some other platform that supports deep C-states
and so has hpet2 enabled for a valid reason.

Also, I am not sure whether the problem also happens if legacy HPET
interrupts are used at run time in place of the LAPIC timer (it may be
worth trying this with a simple test patch; let me think about it). In
this case, the legacy HPET interrupt rightly goes quiet after boot, giving
priority to the LAPIC timer.

With HPET MSI interrupts, we do a write followed by a read of an HPET
memory-mapped register to set an HPET channel timeout, plus a read of the
global HPET timer. This happens on every timer interrupt on CPU 0. And we
also have the MSI interrupt being delivered to CPU 0. I cannot think of
any reason why this can break DMA. We can probably try adding some dummy
HPET read after the DMA write, to see if that flushes things properly.
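
The register traffic described above can be sketched as follows. This is only an illustration against a mock register file, not the code in arch/x86/kernel/hpet.c: struct mock_hpet and all function names are hypothetical, and the "counter advances on every access" behaviour is an assumption made so the late-deadline case can be exercised without hardware.

```c
#include <stdint.h>

/* Mock register file standing in for the memory-mapped HPET (hypothetical). */
struct mock_hpet {
	uint32_t counter;		/* global main counter */
	uint32_t cmp;			/* comparator of one channel */
	uint32_t ticks_per_access;	/* ticks that elapse per register access */
};

static uint32_t hpet_read_counter(struct mock_hpet *h)
{
	h->counter += h->ticks_per_access;
	return h->counter;
}

static void hpet_write_cmp(struct mock_hpet *h, uint32_t val)
{
	h->counter += h->ticks_per_access;
	h->cmp = val;
}

static uint32_t hpet_read_cmp(struct mock_hpet *h)
{
	h->counter += h->ticks_per_access;
	return h->cmp;
}

/*
 * Arm the channel to fire 'delta' ticks from now.
 * Returns 0 on success, or -1 if the deadline had already passed by the
 * time the programming sequence finished.
 */
static int hpet_set_next_event(struct mock_hpet *h, uint32_t delta)
{
	uint32_t target = hpet_read_counter(h) + delta;

	hpet_write_cmp(h, target);	/* write the channel timeout */
	(void)hpet_read_cmp(h);		/* read it back after the write */

	/* Read of the global counter; signed difference handles wraparound. */
	return (int32_t)(hpet_read_counter(h) - target) >= 0 ? -1 : 0;
}
```

In this sketch each call costs four register accesses, which gives a rough feel for why reprogramming the HPET on every timer interrupt is more expensive than using the local APIC timer.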
Thanks,
Venki