From: Mark Hounschell <markh@compro.net>
To: markh@compro.net
Cc: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"fdutils@fdutils.linux.lu" <fdutils@fdutils.linux.lu>,
"Li, Shaohua" <shaohua.li@intel.com>, Ingo Molnar <mingo@elte.hu>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)
Date: Wed, 23 Dec 2009 10:57:54 -0500
Message-ID: <4B323E02.4030107@compro.net>
In-Reply-To: <4B32386B.2060509@compro.net>
On 12/23/2009 10:34 AM, Mark Hounschell wrote:
> On 12/23/2009 10:10 AM, Pallipadi, Venkatesh wrote:
>>
>>
>>> -----Original Message-----
>>> From: Mark Hounschell [mailto:markh@compro.net]
>>> Sent: Wednesday, December 23, 2009 5:03 AM
>>> To: Pallipadi, Venkatesh
>>> Cc: dmarkh@cfl.rr.com; Linus Torvalds; Alain Knaff; Linux Kernel Mailing List; fdutils@fdutils.linux.lu; Li, Shaohua; Ingo Molnar
>>> Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)
>>>
>>> On 12/22/2009 07:22 PM, Mark Hounschell wrote:
>>>> On 12/22/2009 06:37 PM, Pallipadi, Venkatesh wrote:
>>>>> On Tue, 2009-12-22 at 09:57 -0800, Mark Hounschell wrote:
>>>>>> On 12/22/2009 12:38 PM, Linus Torvalds wrote:
>>>>>>>
>>>>>>> [ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
>>>>>>> details, but Mark is basically chasing down a situation where the floppy
>>>>>>> driver seems to have trouble formatting floppies, and it happened
>>>>>>> between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
>>>>>>> memory block transfers the wrong value for the first byte of the block.
>>>>>>>
>>>>>>> Which should be impossible, but whatever. Some part of the system has a
>>>>>>> cached buffer that isn't flushed.
>>>>>>>
>>>>>>> What gets _you_ guys involved is that Mark cannot reproduce the bug if
>>>>>>> HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
>>>>>>> pure luck while bisecting, because some time during his bisect, his
>>>>>>> machine wouldn't even boot with HPET.
>>>>>>>
>>>>>>> So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
>>>>>>> 2.6.28 (and current -git) does not. Any ideas? ]
>>>>>>>
>>>>>>> On Tue, 22 Dec 2009, Mark Hounschell wrote:
>>>>>>>>
>>>>>>>> Ok, I may have something that might help.
>>>>>>>>
>>>>>>>> # git bisect bad
>>>>>>>> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
>>>>>>>> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
>>>>>>>> Author: venkatesh.pallipadi@intel.com <venkatesh.pallipadi@intel.com>
>>>>>>>> Date: Fri Sep 5 18:02:18 2008 -0700
>>>>>>>>
>>>>>>>> x86: HPET_MSI Initialise per-cpu HPET timers
>>>>>>>>
>>>>>>>> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
>>>>>>>> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
>>>>>>>> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
>>>>>>>> timer will eliminate the need for timer broadcasting with IRQ 0 when there
>>>>>>>> is non-functional LAPIC timer across CPU deep C-states.
>>>>>>>>
>>>>>>>> If there are more CPUs than number of available timers, CPUs that do not
>>>>>>>> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>>>>>>>>
>>>>>>>> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
>>>>>>>> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
>>>>>>>> Signed-off-by: Ingo Molnar <mingo@elte.hu>
>>>>>>>>
>>>>>>>> And of course this was the first commit that I could not boot if I had hpet
>>>>>>>> enabled. To get this one to boot (single user mode only) I had to add the
>>>>>>>> quiet cmdline option and the following patch to arch/x86/kernel/hpet.c
>>>>>>>>
>>>>>>>> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
>>>>>>>>
>>>>>>>> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
>>>>>>>> {
>>>>>>>>
>>>>>>>> if (request_irq(dev->irq, hpet_interrupt_handler,
>>>>>>>> - IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
>>>>>>>> + IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
>>>>>>>> return -1;
>>>>>>>>
>>>>>>>> disable_irq(dev->irq);
>>>>>>>>
>>>>>>>> AND add the quiet cmdline option.
>>>>>>>
>>>>>>> Ok, so we know why HPET didn't boot for you, and that was fixed later (by
>>>>>>> that 5ceb1a04). But is this also when the floppy started mis-behaving?
>>>>>>>
>>>>>>
>>>>>> Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops
>>>>>> working and also when I could no longer boot with hpet enabled.
>>>>>
>>>>>
>>>>> I am missing something here. Is commit 26afe5f2 where the system does not
>>>>> boot with HPET, or is it where the floppy stops working when you boot
>>>>> with HPET enabled?
>>>>>
>>>>
>>>> As it happens, both happen there. Commit 5ceb1a04 is where it starts
>>>> booting _again_ with hpet enabled. So I took that patch (5ceb1a04) and
>>>> applied it to (26afe5f2f) to be able to boot with hpet enabled. I had to
>>>> use the quiet option to get to a login prompt, but there is where the
>>>> floppy format first fails, just as it does in 2.6.28 and up.
>>>>
>>>>> Can you try "idle=halt" with both .27 and .28 with /proc/interrupts
>>>>> output in each case. With that option, we should be using local APIC
>>>>> timer and PIT, HPET or HPET with MSI should not really matter. Does it
>>>>> still fail with .28 with that option?
>>>>>
>>>
>>> 2.6.28 still fails with that option.
>>>
>>> 2.6.27.41 /proc/interrupts with idle=halt
>>>
>>>            CPU0       CPU1       CPU2       CPU3
>>>   0:        126          0          0          1   IO-APIC-edge      timer
>>>   1:          0          0          1        157   IO-APIC-edge      i8042
>>>   3:          0          0          0          6   IO-APIC-edge
>>>   4:          0          0          0          6   IO-APIC-edge
>>>   6:          0          0          0          4   IO-APIC-edge      floppy
>>>   8:          0          0          0          1   IO-APIC-edge      rtc0
>>>   9:          0          0          0          0   IO-APIC-fasteoi   acpi
>>>  12:          0          0          1        128   IO-APIC-edge      i8042
>>>  14:          0          0         34       4457   IO-APIC-edge      pata_atiixp
>>>  15:          0          0          4        480   IO-APIC-edge      pata_atiixp
>>>  16:          0          0          0        397   IO-APIC-fasteoi   aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel
>>>  17:          0          0          0          2   IO-APIC-fasteoi   ehci_hcd:usb1
>>>  18:          0          0          0          0   IO-APIC-fasteoi   ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
>>>  19:          0          0          0        142   IO-APIC-fasteoi   aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
>>>  22:          0          0          4       1154   IO-APIC-fasteoi   ahci
>>> 219:          0          0          3         63   PCI-MSI-edge      eth0
>>> NMI:          0          0          0          0   Non-maskable interrupts
>>> LOC:      91539      91964      92525      91181   Local timer interrupts
>>> RES:       2888       3873       2434       2721   Rescheduling interrupts
>>> CAL:        240        245        247         84   function call interrupts
>>> TLB:        768        628        526        512   TLB shootdowns
>>> SPU:          0          0          0          0   Spurious interrupts
>>> ERR:          0
>>> MIS:          0
>>>
>>> 2.6.28 /proc/interrupts with idle=halt
>>>
>>>             CPU0       CPU1       CPU2       CPU3
>>>    0:        126          0          2          0   IO-APIC-edge      timer
>>>    1:          0          0        192          0   IO-APIC-edge      i8042
>>>    3:          0          0          6          0   IO-APIC-edge
>>>    4:          0          0          6          0   IO-APIC-edge
>>>    6:          0          0          4          0   IO-APIC-edge      floppy
>>>    8:          0          0          1          0   IO-APIC-edge      rtc0
>>>    9:          0          0          0          0   IO-APIC-fasteoi   acpi
>>>   12:          0          0        128          1   IO-APIC-edge      i8042
>>>   14:          0          1     147114        396   IO-APIC-edge      pata_atiixp
>>>   15:          0          0        646          2   IO-APIC-edge      pata_atiixp
>>>   16:          0          0        396          0   IO-APIC-fasteoi   aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel
>>>   17:          0          0          0          0   IO-APIC-fasteoi   ehci_hcd:usb1
>>>   18:          0          0          0          0   IO-APIC-fasteoi   ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
>>>   19:          0          0        362          1   IO-APIC-fasteoi   aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
>>>   22:          0          0        874          1   IO-APIC-fasteoi   ahci
>>> 1274:          0          0        193          4   PCI-MSI-edge      eth0
>>> 1279:     513207          0          0          0   HPET_MSI-edge     hpet2
>>>  NMI:          0          0          0          0   Non-maskable interrupts
>>>  LOC:        268     513395     513138     522088   Local timer interrupts
>>>  RES:       3262       3679       2573       3746   Rescheduling interrupts
>>>  CAL:        131        166         57        147   Function call interrupts
>>>  TLB:        680        438        450        639   TLB shootdowns
>>>  SPU:          0          0          0          0   Spurious interrupts
>>>  ERR:          0
>>>  MIS:          0
>>>
>>
>> Hmm. Looks like hpet2 is still getting used instead of local APIC timer in .28 case.
>>
>> I was expecting a low count for hpet2 and the local timer counts on all CPUs to be around the same value. The above shows CPU 0 depending on hpet2 for some reason even with idle=halt. Can you send the output of the two below in the .28 case?
>> /proc/timer_list
>
> Attached.
>
>> grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
>
> I have no /sys/devices/system/cpu/cpu0/cpuidle on this machine.
> Maybe because of
>
> #
> # CPU Frequency scaling
> #
> # CONFIG_CPU_FREQ is not set
> # CONFIG_CPU_IDLE is not set
>
> Would it be OK if when you ask for 2.6.28 info, I use a 2.6.32.2 kernel?
> That kernel also fails fdformat with hpet enabled on these machines.
>
I do have this on 2.6.32.2 though.
# grep . /sys/devices/system/cpu/cpuidle/current_*
/sys/devices/system/cpu/cpuidle/current_driver:acpi_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:ladder
Want me to go back to 2.6.28 and show this?
Mark
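
For anyone comparing the two /proc/interrupts dumps quoted above, here is a
minimal Python sketch of that comparison. It is not from the thread: the
helper names (parse_interrupts, timer_summary, starved_cpus) and the 100x
threshold are made up for illustration, and SAMPLE is a trimmed copy of the
2.6.28 dump. It parses the text, picks out the HPET_MSI rows, and flags CPUs
whose local timer (LOC) count is far below the busiest CPU's, which is the
imbalance Venkatesh points at for CPU0.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: (per_cpu_counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        # Rows such as ERR/MIS carry fewer counts than there are CPUs.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def timer_summary(table):
    """Return the LOC per-CPU counts and every HPET_MSI row, keyed by IRQ label."""
    loc = table["LOC"][0]
    hpet = {label: counts
            for label, (counts, desc) in table.items()
            if desc.startswith("HPET_MSI")}
    return loc, hpet


def starved_cpus(loc, factor=100):
    """CPUs whose LOC count is more than `factor` times below the busiest CPU.

    The factor is an arbitrary illustrative threshold, not a kernel rule.
    """
    peak = max(loc)
    return [cpu for cpu, n in enumerate(loc) if n * factor < peak]


# Trimmed from the 2.6.28 idle=halt dump quoted above.
SAMPLE = """\
            CPU0       CPU1       CPU2       CPU3
   0:        126          0          2          0   IO-APIC-edge      timer
1279:     513207          0          0          0   HPET_MSI-edge     hpet2
 LOC:        268     513395     513138     522088   Local timer interrupts
 ERR:          0
"""

if __name__ == "__main__":
    loc, hpet = timer_summary(parse_interrupts(SAMPLE))
    print("LOC per CPU:", loc)
    print("HPET_MSI rows:", hpet)
    print("CPUs with a starved local timer:", starved_cpus(loc))
```

To run it against a live system instead of the sample, pass the contents of
/proc/interrupts to parse_interrupts(). On the 2.6.27.41 dump the LOC counts
are all within a few percent of each other, so starved_cpus() would return
an empty list there.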