From: ebiederm@xmission.com (Eric W. Biederman)
To: Martin Wilck <martin.wilck@fujitsu-siemens.com>
Cc: Haren Myneni <hbabu@us.ibm.com>,
	"vgoyal@in.ibm.com" <vgoyal@in.ibm.com>,
	"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: PATCH/RFC: [kdump] fix APIC shutdown sequence
Date: Wed, 08 Aug 2007 09:21:23 -0600
Message-ID: <m1myx2dpcc.fsf@ebiederm.dsl.xmission.com>
In-Reply-To: <46B986D5.2010407@fujitsu-siemens.com> (Martin Wilck's message of "Wed, 08 Aug 2007 11:03:17 +0200")

Martin Wilck <martin.wilck@fujitsu-siemens.com> writes:

> Hello Eric,
>
>> How bad is it if you just run with irqpoll in the kdump kernel?
>> If running with irqpoll is usable that is probably preferable
>> to putting in a hardware work around we can survive without.
>
> Yes, I tried that. No effect.

Ok.  Later in the thread it sounds like you have retried this and
irqpoll is working now.  

>> Have you done any looking at moving where the kernel initializes
>> io_apics?  One of the todo items on the path is to leave
>> io_apic mode enabled and just startup the kernel in io_apic
>> mode.
>
> I have tried to recover from the "IRR set" situation in several ways by
> changing setup_IO_APIC_irq(). But I haven't found a way to recover from
> this situation once disable_IO_APIC() had been called.

Yes.  The long term goal is to remove the need for calling
disable_IO_APIC(), because that makes the code simpler etc.

Once we get the kernel to the point where it can start in
ioapic mode (and not in i8259 mode) we can remove the
disable code from the kexec on panic path.

> I concluded that the sequence of events
> "send INT message - never receive EOI - disable IO-APIC pin"
> messes up the IO-APIC (at least this specific one in the
> PCIEx-PCI bridge of the ICH7).

It is quite possible.  I have observed a lot of obscure bugs in the
corner cases of the state machines, although it is possible
this is correct behavior and it is just specific to level
triggered interrupts, which are almost exclusively not on
the first ioapic in a system like the one you describe.

I suspect the issue is that we never send the EOI message from
the local apic, and so it waits forever.  Or that we have
reprogrammed the vectors by the time we send the EOI message
so that the EOI and the ioapic don't agree on the vector
number when the EOI message is sent.  Grumble silly level
triggered interrupts grumble.

Eric
