* RE: one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
@ 2006-09-06 18:59 Moore, Robert
2006-09-06 20:04 ` keith mannthey
0 siblings, 1 reply; 30+ messages in thread
From: Moore, Robert @ 2006-09-06 18:59 UTC (permalink / raw)
To: kmannth, Bjorn Helgaas
Cc: Len Brown, Li, Shaohua, Mattia Dongili, Andrew Morton, lkml,
linux acpi, KAMEZAWA Hiroyuki
From one of the ACPI guys:
> Get hid
> Look for driver
> If you find a match, load it
> If no match, get CID
> Look for driver
> If you find a match, load it
> If you did not find an hid or cid match, punt
> -----Original Message-----
> From: linux-acpi-owner@vger.kernel.org
> [mailto:linux-acpi-owner@vger.kernel.org] On Behalf Of keith mannthey
> Sent: Wednesday, September 06, 2006 11:14 AM
> To: Bjorn Helgaas
> Cc: Len Brown; Moore, Robert; Li, Shaohua; Mattia Dongili; Andrew Morton;
> lkml; linux acpi; KAMEZAWA Hiroyuki
> Subject: Re: one more ACPI Error (utglobal-0125): Unknown exception code:
> 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
>
> On Fri, 2006-09-01 at 17:20 -0600, Bjorn Helgaas wrote:
> > On Friday 01 September 2006 17:01, keith mannthey wrote:
> > > On Thu, 2006-08-31 at 21:15 -0600, Bjorn Helgaas wrote:
> > > > > The current ACPI driver binding algorithm in acpi_bus_find_driver()
> > > > > looks at each driver, checking whether it can match either the _HID
> > > > > or the _CID of a device. Since we try the motherboard driver first,
> > > > > it matches the memory device _CID.
> > >
> > > > Ok I reverted the motherboard driver patch and cooked up the following
> > > > patch that works for my issue.
> > > >
> > > > It creates the idea that acpi_match_ids has a type of request to check
> > > > against for _HID, _CID or both. See acpi_bus_match_req. I then fix up
> > > > all the needed callers to change the API to acpi_match_ids and
> > > > acpi_bus_match so that callers can say what they want to match
> > > > against.
> > > >
> > > > Then in acpi_bus_find_driver I have it do 2 passes to search for _HID
> > > > first then the _CID.
> > > >
> > > > Does this look like it is in the right ballpark or should we be doing
> > > > something else? Built/tested against 2.6.18-rc4-mm3.
> >
> > > Conceptually I like this much better than mucking with the motherboard
> > > driver. I'm not sure the important people have signed off on this
> > > strategy of binding with _HID first, then _CID (hi, Len :-)) Maybe
> > > there are ramifications that we need to consider. But I think it
> > > is a better match for "what people expect should happen."
>
> ACPI folks can we get some response to this? This problem has been
> reported a few times against the -mm tree and I would like to get the
> proper fix (whatever it is) upstream sometime soon.
>
> Bjorn thanks for the help and for pointing the error reports in the
> right direction.
>
> Thanks,
> Keith
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-06 18:59 one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3] Moore, Robert
@ 2006-09-06 20:04 ` keith mannthey
2006-09-07 2:03 ` one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA " Shaohua Li
0 siblings, 1 reply; 30+ messages in thread
From: keith mannthey @ 2006-09-06 20:04 UTC (permalink / raw)
To: Moore, Robert
Cc: Bjorn Helgaas, Len Brown, Li, Shaohua, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
On Wed, 2006-09-06 at 11:59 -0700, Moore, Robert wrote:
> From one of the ACPI guys:
>
> > Get hid
> > Look for driver
> > If you find a match, load it
> > If no match, get CID
> > Look for driver
> > If you find a match, load it
> > If you did not find an hid or cid match, punt
I think this is what my patch is doing.

When looking for a driver (acpi_bus_find_driver):
  I check against the HID
    return if found
  Then I check against the CID
    return if found
  else
    punt
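The lookup order described above can be sketched in plain C. This is only a minimal illustration under stated assumptions, not the actual patch: the `sketch_device` and `sketch_driver` structures and the `find_driver()` helper are hypothetical stand-ins for the kernel's ACPI structures and acpi_bus_find_driver(), and the "PNP0C80" memory-device _HID used in the usage note is an assumption that does not appear in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_device {
    const char *hid;        /* _HID of the device */
    const char *cid;        /* _CID, or NULL if the device has none */
};

struct sketch_driver {
    const char *name;
    const char *ids[4];     /* NULL-terminated list of IDs the driver claims */
};

/* Return 1 if the driver lists the given ID, 0 otherwise. */
static int driver_claims(const struct sketch_driver *drv, const char *id)
{
    if (id == NULL)
        return 0;
    for (size_t i = 0; i < 4 && drv->ids[i] != NULL; i++)
        if (strcmp(drv->ids[i], id) == 0)
            return 1;
    return 0;
}

/* Two passes over the registered drivers: _HID first, then _CID. */
static const struct sketch_driver *
find_driver(const struct sketch_device *dev,
            const struct sketch_driver *drivers, size_t n)
{
    for (size_t i = 0; i < n; i++)          /* pass 1: match on _HID */
        if (driver_claims(&drivers[i], dev->hid))
            return &drivers[i];
    for (size_t i = 0; i < n; i++)          /* pass 2: match on _CID */
        if (driver_claims(&drivers[i], dev->cid))
            return &drivers[i];
    return NULL;                            /* no match: punt */
}
```

With a motherboard driver claiming PNP0C01/PNP0C02 registered ahead of a memory hotplug driver claiming the device's _HID, the first pass skips the motherboard driver and returns the memory hotplug driver, even though the motherboard driver comes first in the list.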
Any objections to pushing this into -mm and dropping the motherboard
change?
Thanks,
Keith
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-06 20:04 ` keith mannthey
@ 2006-09-07 2:03 ` Shaohua Li
2006-09-07 15:25 ` Bjorn Helgaas
0 siblings, 1 reply; 30+ messages in thread
From: Shaohua Li @ 2006-09-07 2:03 UTC (permalink / raw)
To: kmannth
Cc: Moore, Robert, Bjorn Helgaas, Len Brown, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-09-07 at 04:04 +0800, keith mannthey wrote:
> On Wed, 2006-09-06 at 11:59 -0700, Moore, Robert wrote:
> > From one of the ACPI guys:
> >
> > > Get hid
> > > Look for driver
> > > If you find a match, load it
> > > If no match, get CID
> > > Look for driver
> > > If you find a match, load it
> > > If you did not find an hid or cid match, punt
>
> I think this is what my patch is doing.
>
> when looking for a driver: (acpi_bus_find_driver)
> I check against the HID
> return if found
> Then I check against the CID
> return if found
> else
> punt
>
> Any objections to pushing this into -mm and dropping the motherboard
> change?
I'd prefer not to go this way. The ACPI driver model is already messy
enough; let's not make it worse. We are converting the ACPI driver
model to the Linux driver model, and this will make that attempt difficult.
We can let the motherboard driver not bind to your device (say we don't
register the motherboard driver, but just reserve the resources of the
device). Is that OK with you? (I remember Bjorn said he wants to reserve the
mem region of the device too.)
Thanks,
Shaohua
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-07 2:03 ` one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA " Shaohua Li
@ 2006-09-07 15:25 ` Bjorn Helgaas
2006-09-08 0:57 ` Shaohua Li
0 siblings, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2006-09-07 15:25 UTC (permalink / raw)
To: Shaohua Li
Cc: kmannth, Moore, Robert, Len Brown, Mattia Dongili, Andrew Morton,
lkml, linux acpi, KAMEZAWA Hiroyuki
On Wednesday 06 September 2006 20:03, Shaohua Li wrote:
> On Thu, 2006-09-07 at 04:04 +0800, keith mannthey wrote:
> > On Wed, 2006-09-06 at 11:59 -0700, Moore, Robert wrote:
> > > From one of the ACPI guys:
> > >
> > > > Get hid
> > > > Look for driver
> > > > If you find a match, load it
> > > > If no match, get CID
> > > > Look for driver
> > > > If you find a match, load it
> > > > If you did not find an hid or cid match, punt
> >
> > I think this is what my patch is doing.
> >
> > when looking for a driver: (acpi_bus_find_driver)
> > I check against the HID
> > return if found
> > Then I check against the CID
> > return if found
> > else
> > punt
> >
> > Any objections to pushing this into -mm and dropping the motherboard
> > change?
> I'd prefer not take this way. The ACPI driver model is already mess
> enough, let's don't make it worse. We are converting the ACPI driver
> model to Linux driver model, this will make the attempt difficult.
I see that driver_bind() and driver_probe_device() don't mesh well
with the idea that multiple drivers might be able to claim a device,
because there doesn't seem to be a way to prioritize one driver
over another. Is that the problem you're referring to?
If we decide that "try HID first, then try CID" is the right thing,
I think we should figure out how to make that work. Maybe that
means extending the driver model somehow.
> We can let the motherboard driver not bind to your device (say we didn't
> register the motherboard driver, but just reserve the resource of the
> deivce). Is it ok to you? (I remember Bjorn said he wants to reserve the
> mem region of the device too).
My point was that ACPI tells us what resources the device uses,
and we should reserve all of them so we accurately model the system.
Reserving resources without registering the driver sounds like a hack
to work around broken behavior elsewhere, so I don't think it's a
good idea.
Bjorn
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-07 15:25 ` Bjorn Helgaas
@ 2006-09-08 0:57 ` Shaohua Li
2006-09-08 2:27 ` Bjorn Helgaas
0 siblings, 1 reply; 30+ messages in thread
From: Shaohua Li @ 2006-09-08 0:57 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: kmannth, Moore, Robert, Len Brown, Mattia Dongili, Andrew Morton,
lkml, linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
> On Wednesday 06 September 2006 20:03, Shaohua Li wrote:
> > On Thu, 2006-09-07 at 04:04 +0800, keith mannthey wrote:
> > > On Wed, 2006-09-06 at 11:59 -0700, Moore, Robert wrote:
> > > > From one of the ACPI guys:
> > > >
> > > > > Get hid
> > > > > Look for driver
> > > > > If you find a match, load it
> > > > > If no match, get CID
> > > > > Look for driver
> > > > > If you find a match, load it
> > > > > If you did not find an hid or cid match, punt
> > >
> > > I think this is what my patch is doing.
> > >
> > > when looking for a driver: (acpi_bus_find_driver)
> > > I check against the HID
> > > return if found
> > > Then I check against the CID
> > > return if found
> > > else
> > > punt
> > >
> > > Any objections to pushing this into -mm and dropping the motherboard
> > > change?
>
> > I'd prefer not take this way. The ACPI driver model is already mess
> > enough, let's don't make it worse. We are converting the ACPI driver
> > model to Linux driver model, this will make the attempt difficult.
>
> I see that driver_bind() and driver_probe_device() don't mesh well
> with the idea that multiple drivers might be able to claim a device,
> because there doesn't seem to be a way to prioritize one driver
> over another. Is that the problem you're referring to?
Yes.
> If we decide that "try HID first, then try CID" is the right thing,
> I think we should figure out how to make that work. Maybe that
> means extending the driver model somehow.
I don't think it's easy, especially since I guess no other bus needs it.
> > We can let the motherboard driver not bind to your device (say we didn't
> > register the motherboard driver, but just reserve the resource of the
> > deivce). Is it ok to you? (I remember Bjorn said he wants to reserve the
> > mem region of the device too).
>
> My point was that ACPI tells us what resources the device uses,
> and we should reserve all of them so we accurately model the system.
>
> Reserving resources without registering the driver sounds like a hack
> to work around broken behavior elsewhere, so I don't think it's a
> good idea.
Does the memory hotplug device really need to return PNP0C01/PNP0C02?
What's the purpose?
Thanks,
Shaohua
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-08 0:57 ` Shaohua Li
@ 2006-09-08 2:27 ` Bjorn Helgaas
2006-09-13 1:27 ` keith mannthey
0 siblings, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2006-09-08 2:27 UTC (permalink / raw)
To: Shaohua Li
Cc: Bjorn Helgaas, kmannth, Moore, Robert, Len Brown, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
> On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
>> If we decide that "try HID first, then try CID" is the right thing,
>> I think we should figure out how to make that work. Maybe that
>> means extending the driver model somehow.
> Don't think it's easy, especially no other bus needs it I guess.
I agree it's probably not easy, but I think having the right
semantics is more important than fitting cleanly into the
driver model. But I know that without code, I'm just venting
hot air, not contributing to a solution.
How's the ACPI driver model integration going, anyway? I seem
to recall some patches a while back, but I don't think they're
in the tree yet.
> Do we really need the memory hotplug device returns pnp0c01/pnp0c02?
> What's the purpose?
I don't know. But I think Keith already determined that a BIOS change
is not likely. I hate to ask for BIOS changes like this because it
feels like asking them to avoid broken things in Linux.
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-08 2:27 ` Bjorn Helgaas
@ 2006-09-13 1:27 ` keith mannthey
2006-09-13 14:51 ` Bjorn Helgaas
0 siblings, 1 reply; 30+ messages in thread
From: keith mannthey @ 2006-09-13 1:27 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Shaohua Li, Moore, Robert, Len Brown, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-09-07 at 20:27 -0600, Bjorn Helgaas wrote:
> > On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
> >> If we decide that "try HID first, then try CID" is the right thing,
> >> I think we should figure out how to make that work. Maybe that
> >> means extending the driver model somehow.
> > Don't think it's easy, especially no other bus needs it I guess.
>
> I agree it's probably not easy, but I think having the right
> semantics is more important than fitting cleanly into the
> driver model. But I know that without code, I'm just venting
> hot air, not contributing to a solution.
>
> How's the ACPI driver model integration going, anyway? I seem
> to recall some patches a while back, but I don't think they're
> in the tree yet.
>
> > Do we really need the memory hotplug device returns pnp0c01/pnp0c02?
> > What's the purpose?
>
> I don't know. But I think Keith already determined that a BIOS change
> is not likely. I hate to ask for BIOS changes like this because it
> feels like asking them to avoid broken things in Linux.
Ok, my motherboard patch was dropped from -mm, so I am broken again but
others are fixed. Is the answer that we do nothing about this issue?
I am pretty sure my SSDT table is valid. If no one can point out
in the spec where my device is malformed by having both HID and CID, I
will not be able to even start the request to change the BIOS (it would be
a waste of my time). Sure, having the CID on the memory device may be
overkill, but is it wrong?
Unless someone can show me an alternate solution I am going to push the
check HID before CID patch to -mm in the next day or two.
Thanks,
Keith
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-13 1:27 ` keith mannthey
@ 2006-09-13 14:51 ` Bjorn Helgaas
2006-09-14 3:01 ` Shaohua Li
0 siblings, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2006-09-13 14:51 UTC (permalink / raw)
To: kmannth
Cc: Shaohua Li, Moore, Robert, Len Brown, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
On Tuesday 12 September 2006 19:27, keith mannthey wrote:
> On Thu, 2006-09-07 at 20:27 -0600, Bjorn Helgaas wrote:
> > > On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
> > >> If we decide that "try HID first, then try CID" is the right thing,
> > >> I think we should figure out how to make that work. Maybe that
> > >> means extending the driver model somehow.
> > > Don't think it's easy, especially no other bus needs it I guess.
> >
> > I agree it's probably not easy, but I think having the right
> > semantics is more important than fitting cleanly into the
> > driver model. But I know that without code, I'm just venting
> > hot air, not contributing to a solution.
> >
> > How's the ACPI driver model integration going, anyway? I seem
> > to recall some patches a while back, but I don't think they're
> > in the tree yet.
> >
> > > Do we really need the memory hotplug device returns pnp0c01/pnp0c02?
> > > What's the purpose?
> >
> > I don't know. But I think Keith already determined that a BIOS change
> > is not likely. I hate to ask for BIOS changes like this because it
> > feels like asking them to avoid broken things in Linux.
>
> Ok my motherboard patch was dropped from -mm so I am broken again but
> others are fixed. Is the answer that we do nothing about this issues?
>
> I am pretty sure my SSDT table is valid if someone *cannot* point out
> in the spec where my device is malformed by having both HID and CID I
> will not be able even start the request to change the BIOS (it would be
> a waste of my time). Sure having the CID of the memory device may be
> overkill but is it wrong?
I think that your SSDT is valid. I can't point to a specific
reference in the spec, but I think the "try _HID first, then try
_CID" strategy is clearly the intent. Otherwise, there would be
no reason to separate _HID from _CID.
> Unless someone can show me a alternate solution I am going to push the
> check HID before CID patch to -mm in the next day or two.
I support this, although I do understand that it will make it more
difficult to integrate ACPI into the driver model because the driver
model currently only does one pass to check whether a driver can claim
a device.
Bjorn
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-13 14:51 ` Bjorn Helgaas
@ 2006-09-14 3:01 ` Shaohua Li
2006-09-14 16:36 ` Bjorn Helgaas
2006-09-14 17:55 ` keith mannthey
0 siblings, 2 replies; 30+ messages in thread
From: Shaohua Li @ 2006-09-14 3:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: kmannth, Moore, Robert, Len Brown, Mattia Dongili, Andrew Morton,
lkml, linux acpi, KAMEZAWA Hiroyuki
On Wed, 2006-09-13 at 22:51 +0800, Bjorn Helgaas wrote:
> On Tuesday 12 September 2006 19:27, keith mannthey wrote:
> > On Thu, 2006-09-07 at 20:27 -0600, Bjorn Helgaas wrote:
> > > > On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
> > > > >> If we decide that "try HID first, then try CID" is the right thing,
> > > > >> I think we should figure out how to make that work. Maybe that
> > > > >> means extending the driver model somehow.
> > > > > Don't think it's easy, especially no other bus needs it I guess.
> > > >
> > > > I agree it's probably not easy, but I think having the right
> > > > semantics is more important than fitting cleanly into the
> > > > driver model. But I know that without code, I'm just venting
> > > > hot air, not contributing to a solution.
> > > >
> > > > How's the ACPI driver model integration going, anyway? I seem
> > > > to recall some patches a while back, but I don't think they're
> > > > in the tree yet.
> > > >
> > > > > Do we really need the memory hotplug device returns pnp0c01/pnp0c02?
> > > > > What's the purpose?
> > > >
> > > > I don't know. But I think Keith already determined that a BIOS change
> > > > is not likely. I hate to ask for BIOS changes like this because it
> > > > feels like asking them to avoid broken things in Linux.
> > >
> > > Ok my motherboard patch was dropped from -mm so I am broken again but
> > > others are fixed. Is the answer that we do nothing about this issues?
> > >
> > > I am pretty sure my SSDT table is valid if someone *cannot* point out
> > > in the spec where my device is malformed by having both HID and CID I
> > > will not be able even start the request to change the BIOS (it would be
> > > a waste of my time). Sure having the CID of the memory device may be
> > > overkill but is it wrong?
>
> I think that your SSDT is valid. I can't point to a specific
> reference in the spec, but I think the "try _HID first, then try
> _CID" strategy is clearly the intent. Otherwise, there would be
> no reason to separate _HID from _CID.
The spec actually doesn't mention PNP0C01/PNP0C02. It's hard to say this
is valid or invalid.
The 'try _HID first, then _CID' approach has another downside. It depends
heavily on the driver being loaded before the device is scanned. Say the
motherboard driver loads first and the mem hotplug driver isn't loaded yet:
in this situation, if you scan the mem hotplug device, the mechanism will
fail, as the two-pass search will still bind the motherboard driver to the
device.
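The ordering concern can be illustrated with a small model of a two-pass lookup that only sees the drivers registered at the moment the device is scanned. This is a hedged sketch, not kernel code: `sketch_driver` and `bind_at_scan()` are hypothetical names, each driver claims a single ID for brevity, and "PNP0C80" as the memory device's _HID is an assumption not taken from this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical driver record: one claimed ID per driver, for brevity. */
struct sketch_driver {
    const char *name;
    const char *id;
};

/*
 * Two-pass lookup over the drivers registered so far: try the device's
 * _HID against every driver first, then fall back to its _CID.
 * Returns the name of the driver that binds, or NULL if none matches.
 */
static const char *bind_at_scan(const char *hid, const char *cid,
                                const struct sketch_driver *reg, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(reg[i].id, hid) == 0)
            return reg[i].name;
    for (size_t i = 0; cid != NULL && i < n; i++)
        if (strcmp(reg[i].id, cid) == 0)
            return reg[i].name;
    return NULL;
}
```

If only the motherboard driver (claiming PNP0C01) is registered when the memory device is scanned, the _HID pass finds nothing and the _CID pass binds the motherboard driver. The two-pass search picks the memory hotplug driver only when that driver is already registered at scan time.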
If you take the two-pass search, I have a feeling this will make ACPI
never able to convert to the Linux driver model.
If you really want to work around the issue, I'd prefer to have a blacklist or
something to let ACPI not use the _CID for your device, but please don't
mess with the ACPI core itself.
Thanks,
Shaohua
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-14 3:01 ` Shaohua Li
@ 2006-09-14 16:36 ` Bjorn Helgaas
2006-09-15 1:39 ` Shaohua Li
2006-09-14 17:55 ` keith mannthey
1 sibling, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2006-09-14 16:36 UTC (permalink / raw)
To: Shaohua Li
Cc: kmannth, Moore, Robert, Len Brown, Mattia Dongili, Andrew Morton,
lkml, linux acpi, KAMEZAWA Hiroyuki
On Wednesday 13 September 2006 21:01, Shaohua Li wrote:
> On Wed, 2006-09-13 at 22:51 +0800, Bjorn Helgaas wrote:
> > I think that your SSDT is valid. I can't point to a specific
> > reference in the spec, but I think the "try _HID first, then try
> > _CID" strategy is clearly the intent. Otherwise, there would be
> > no reason to separate _HID from _CID.
> The spec actually doesn't mention PNP0C01/PNP0C02. It's hard to say this
> is valid or invalid.
This problem is more general than just Keith's situation. This
could happen with any device that has both _HID and _CID. As
soon as you have both _HID and _CID, you can have a driver that
claims the _HID and another that claims the _CID.
The spec obviously anticipates this situation, which is why I
think the SSDT is valid from the ACPI spec point of view.
Now, if you have some definition of the programming model of
PNP0C01/PNP0C02, and the memory device doesn't conform to that
model, then I would agree that the SSDT is invalid. But I
don't know where a PNP0C01/PNP0C02 programming model is defined.
The linux driver does nothing more than reserve the resources
of the device, so it doesn't use any programming model at all.
The memory device (in fact, any ACPI device at all) trivially
conforms to this "null programming model."
> The 'try _HID first then _CID' has another downside. It highly depends
> on the driver is loaded first and then load the device. See motherboard
> driver loads first and the mem hotplug driver isn't loaded, in this
> situation if you scan the mem hotplug device, the mechanism will fail as
> the two pass search will still bind motherboard driver to the device.
I agree, this is a problem that will have to be resolved. And it's
really not just an ACPI problem. A PCI driver can claim devices based
on a class or a vendor/device/subvendor/subdevice with wildcards.
Another driver can claim devices with a specific vendor/device/etc.
Some devices may match with both drivers.
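The PCI overlap can be sketched as follows. This is a simplified illustration that assumes an ID table entry with only vendor, device, class, and class mask (the subvendor/subdevice fields are left out), and all `sketch_` names and the numeric IDs in the usage note are hypothetical; it is not the kernel's matching code.

```c
#include <stdint.h>

#define SKETCH_ANY_ID 0xffffffffu   /* wildcard, in the spirit of PCI_ANY_ID */

/* Hypothetical, trimmed-down ID table entry. */
struct sketch_pci_id {
    uint32_t vendor;
    uint32_t device;
    uint32_t dev_class;     /* class code to compare against */
    uint32_t class_mask;    /* 0 means "ignore the class" */
};

/* Hypothetical, trimmed-down device description. */
struct sketch_pci_dev {
    uint32_t vendor;
    uint32_t device;
    uint32_t dev_class;
};

/* Return 1 if the table entry matches the device, 0 otherwise. */
static int id_matches(const struct sketch_pci_id *id,
                      const struct sketch_pci_dev *dev)
{
    return (id->vendor == SKETCH_ANY_ID || id->vendor == dev->vendor) &&
           (id->device == SKETCH_ANY_ID || id->device == dev->device) &&
           ((id->dev_class ^ dev->dev_class) & id->class_mask) == 0;
}
```

A class-based entry such as `{ SKETCH_ANY_ID, SKETCH_ANY_ID, 0x010601, 0xffffff }` and a specific entry such as `{ 0x8086, 0x2922, 0, 0 }` both match a device with vendor 0x8086, device 0x2922, and class 0x010601, so which driver binds depends on something other than the match itself.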
PCI has a /sys/bus/pci/driver/XXX/{bind,unbind} mechanism to cause a
driver to release a device and bind another driver to it. Maybe we
could do something similar for ACPI.
Bjorn
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-14 16:36 ` Bjorn Helgaas
@ 2006-09-15 1:39 ` Shaohua Li
2006-09-19 10:22 ` Bjorn Helgaas
0 siblings, 1 reply; 30+ messages in thread
From: Shaohua Li @ 2006-09-15 1:39 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: kmannth, Moore, Robert, Len Brown, Mattia Dongili, Andrew Morton,
lkml, linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-09-14 at 10:36 -0600, Bjorn Helgaas wrote:
> On Wednesday 13 September 2006 21:01, Shaohua Li wrote:
> > On Wed, 2006-09-13 at 22:51 +0800, Bjorn Helgaas wrote:
> > > I think that your SSDT is valid. I can't point to a specific
> > > reference in the spec, but I think the "try _HID first, then try
> > > _CID" strategy is clearly the intent. Otherwise, there would be
> > > no reason to separate _HID from _CID.
>
> > The spec actually doesn't mention PNP0C01/PNP0C02. It's hard to say this
> > is valid or invalid.
>
> This problem is more general than just Keith's situation. This
> could happen with any device that has both _HID and _CID. As
> soon as you have both _HID and _CID, you can have a driver that
> claims the _HID and another that claims the _CID.
We haven't seen such an issue before, so I don't think it's generic. We do
have some devices with _CID, like a PCIe root bridge that claims PNP0A03 (PCI
root bridge), but they are really compatible.
> The spec obviously anticipates this situation, which is why I
> think the SSDT is valid from the ACPI spec point of view.
>
> Now, if you have some definition of the programming model of
> PNP0C01/PNP0C02, and the memory device doesn't conform to that
> model, then I would agree that the SSDT is invalid. But I
> don't know where a PNP0C01/PNP0C02 programming model is defined.
>
> The linux driver does nothing more than reserve the resources
> of the device, so it doesn't use any programming model at all.
> The memory device (in fact, any ACPI device at all) trivially
> conforms to this "null programming model."
>
> > The 'try _HID first then _CID' has another downside. It highly depends
> > on the driver is loaded first and then load the device. See motherboard
> > driver loads first and the mem hotplug driver isn't loaded, in this
> > situation if you scan the mem hotplug device, the mechanism will fail as
> > the two pass search will still bind motherboard driver to the device.
>
> I agree, this is a problem that will have to be resolved. And it's
> really not just an ACPI problem. A PCI driver can claim devices based
> on a class or a vendor/device/subvendor/subdevice with wildcards.
> Another driver can claim devices with a specific vendor/device/etc.
> Some devices may match with both drivers.
I'd prefer not to change the ACPI core at this stage and just work around
Keith's issue until we find this is really a generic problem.
> PCI has a /sys/bus/pci/driver/XXX/{bind,unbind} mechanism to cause a
> driver to release a device and bind another driver to it. Maybe we
> could do something similar for ACPI.
After we convert the ACPI core to the Linux driver model, we will have that
capability. But I'm not sure if this can help Keith.
Thanks,
Shaohua
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-15 1:39 ` Shaohua Li
@ 2006-09-19 10:22 ` Bjorn Helgaas
0 siblings, 0 replies; 30+ messages in thread
From: Bjorn Helgaas @ 2006-09-19 10:22 UTC (permalink / raw)
To: Shaohua Li
Cc: kmannth, Moore, Robert, Len Brown, Mattia Dongili, Andrew Morton,
lkml, linux acpi, KAMEZAWA Hiroyuki
On Thursday 14 September 2006 19:39, Shaohua Li wrote:
> > PCI has a /sys/bus/pci/driver/XXX/{bind,unbind} mechanism to cause a
> > driver to release a device and bind another driver to it. Maybe we
> > could do something similar for ACPI.
> After we convert acpi core to Linux driver model, we have the
> capability. But not sure if this can help Keith.
When will the conversion to the Linux driver model happen?
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-14 3:01 ` Shaohua Li
2006-09-14 16:36 ` Bjorn Helgaas
@ 2006-09-14 17:55 ` keith mannthey
2006-09-15 1:52 ` Shaohua Li
1 sibling, 1 reply; 30+ messages in thread
From: keith mannthey @ 2006-09-14 17:55 UTC (permalink / raw)
To: Shaohua Li
Cc: Bjorn Helgaas, Moore, Robert, Len Brown, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-09-14 at 11:01 +0800, Shaohua Li wrote:
> On Wed, 2006-09-13 at 22:51 +0800, Bjorn Helgaas wrote:
> > On Tuesday 12 September 2006 19:27, keith mannthey wrote:
> > > On Thu, 2006-09-07 at 20:27 -0600, Bjorn Helgaas wrote:
> > > > > On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
> > I think that your SSDT is valid. I can't point to a specific
> > reference in the spec, but I think the "try _HID first, then try
> > _CID" strategy is clearly the intent. Otherwise, there would be
> > no reason to separate _HID from _CID.
> The spec actually doesn't mention PNP0C01/PNP0C02. It's hard to say this
> is valid or invalid.
Let's work on the assumption that it is valid until someone points out where
a spec says it isn't.
> The 'try _HID first then _CID' has another downside. It highly depends
> on the driver is loaded first and then load the device. See motherboard
> driver loads first and the mem hotplug driver isn't loaded, in this
> situation if you scan the mem hotplug device, the mechanism will fail as
> the two pass search will still bind motherboard driver to the device.
Any solution depends on the mem hotplug device being loaded. This
doesn't appear to be a _HID before _CID specific issue.
> If you take the two pass search, I have a feeling this will make acpi
> never be able to convert Linux driver model.
I am not trying to break forward work but what I do want is a solution
to my problem.
> If you really want to workaround the issue, I prefer have a blacklist or
> something to let ACPI not use the _CID for your device, but please don't
> mess the ACPI core itself.
My first pass at fixing the problem was, I guess, a hack of sorts that caused
problems for others (motherboard add returning != 0 on unknown devices). I
don't want another Keith-grown hack that breaks other people.
Can you elaborate on what you think would be a safe way to do what you
propose, since the ACPI core (can't/won't?) be fixed? I can imagine a
couple of different ways to fix this but I would like some feedback
before I go off and work on the 3rd pass of this fix.
1. Make the memory device get scanned before the motherboard device
somehow. Implicitly reorder the devices in the list. Perhaps a priority
sort of some kind to have _HID devices always before _CID devices during
the scan?
2. Have the motherboard device (if it finds the right ACPI device type)
hook into the memory device somehow.
3. Some special blacklist of the motherboard device on my specific
system.
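As a rough illustration of the third option, a blacklist check might look like the sketch below. Everything here is an assumption made for illustration: `motherboard_skip_hids` and `motherboard_may_bind()` are hypothetical names, the list is keyed on _HID rather than on a specific system, and "PNP0C80" as the memory device's _HID is not taken from this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of _HIDs the motherboard driver should leave alone. */
static const char *const motherboard_skip_hids[] = {
    "PNP0C80",      /* assumed _HID of the memory device */
    NULL,
};

/* Return 1 if the motherboard driver may bind, 0 if the _HID is blacklisted. */
static int motherboard_may_bind(const char *hid)
{
    for (size_t i = 0; motherboard_skip_hids[i] != NULL; i++)
        if (strcmp(motherboard_skip_hids[i], hid) == 0)
            return 0;
    return 1;
}
```

A check of this kind would leave the driver lookup itself untouched; the open question is where such a list would live and how it would be restricted to the affected systems.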
Thanks,
Keith
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-14 17:55 ` keith mannthey
@ 2006-09-15 1:52 ` Shaohua Li
2006-09-21 0:27 ` keith mannthey
0 siblings, 1 reply; 30+ messages in thread
From: Shaohua Li @ 2006-09-15 1:52 UTC (permalink / raw)
To: kmannth
Cc: Bjorn Helgaas, Moore, Robert, Len Brown, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-09-14 at 10:55 -0700, keith mannthey wrote:
> On Thu, 2006-09-14 at 11:01 +0800, Shaohua Li wrote:
> > On Wed, 2006-09-13 at 22:51 +0800, Bjorn Helgaas wrote:
> > > On Tuesday 12 September 2006 19:27, keith mannthey wrote:
> > > > On Thu, 2006-09-07 at 20:27 -0600, Bjorn Helgaas wrote:
> > > > > > On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
>
> > > I think that your SSDT is valid. I can't point to a specific
> > > reference in the spec, but I think the "try _HID first, then try
> > > _CID" strategy is clearly the intent. Otherwise, there would be
> > > no reason to separate _HID from _CID.
> > The spec actually doesn't mention PNP0C01/PNP0C02. It's hard to say this
> > is valid or invalid.
>
> Lets work on the assumption it is valid until someone points out in a
> spec that says it isn't.
>
> > The 'try _HID first then _CID' has another downside. It highly depends
> > on the driver is loaded first and then load the device. See motherboard
> > driver loads first and the mem hotplug driver isn't loaded, in this
> > situation if you scan the mem hotplug device, the mechanism will fail as
> > the two pass search will still bind motherboard driver to the device.
> Any solution depends on the mem hotplug device being loaded. This
> doesn't appear to be _HID before _CID specific issue .
>
> > If you take the two pass search, I have a feeling this will make acpi
> > never be able to convert Linux driver model.
>
> I am not trying to break forward work but what I do want is a solution
> to my problem.
>
> > If you really want to workaround the issue, I prefer have a blacklist or
> > something to let ACPI not use the _CID for your device, but please don't
> > mess the ACPI core itself.
>
> My fist pass to fix the problem was I guess a hack of sorts that caused
> others problems (motherboard add return != 0 on unknown devices). I
> don't want another Keith grown hack that breaks other people.
>
> Can you elaborate on what you think would be safe way to do what you
> propose since the ACPI core (can't/won't?) be fixed? I can imagine a
> couple of different ways to fix this but I would like some feedback
> before I go off and work on the 3rd pass of this fix.
>
> 1. Make the memory device get scanned before the motherboard device
> somehow. Implicitly reorder the devices in the list. Perhaps a priority
> sort of sorts to have _HID devices always before _CID devices during
> the scan?
This will change the scan order of ACPI devices, and sounds too hacky to me.
> 2. Have the motherboard device (if it finds the right acpi device type)
> hook into the memory device somehow.
>
> 3. Some special blacklist of the motherboard device on my specific
> system.
Either one of the two looks ok.
Thanks,
Shaohua
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-09-15 1:52 ` Shaohua Li
@ 2006-09-21 0:27 ` keith mannthey
0 siblings, 0 replies; 30+ messages in thread
From: keith mannthey @ 2006-09-21 0:27 UTC (permalink / raw)
To: Shaohua Li
Cc: Bjorn Helgaas, Moore, Robert, Len Brown, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
[-- Attachment #1: Type: text/plain, Size: 3531 bytes --]
On Fri, 2006-09-15 at 09:52 +0800, Shaohua Li wrote:
> On Thu, 2006-09-14 at 10:55 -0700, keith mannthey wrote:
> > On Thu, 2006-09-14 at 11:01 +0800, Shaohua Li wrote:
> > > On Wed, 2006-09-13 at 22:51 +0800, Bjorn Helgaas wrote:
> > > > On Tuesday 12 September 2006 19:27, keith mannthey wrote:
> > > > > On Thu, 2006-09-07 at 20:27 -0600, Bjorn Helgaas wrote:
> > > > > > > On Thu, 2006-09-07 at 09:25 -0600, Bjorn Helgaas wrote:
> >
> > > > I think that your SSDT is valid. I can't point to a specific
> > > > reference in the spec, but I think the "try _HID first, then try
> > > > _CID" strategy is clearly the intent. Otherwise, there would be
> > > > no reason to separate _HID from _CID.
> > > The spec actually doesn't mention PNP0C01/PNP0C02. It's hard to say this
> > > is valid or invalid.
> >
> > Lets work on the assumption it is valid until someone points out in a
> > spec that says it isn't.
> >
> > > The 'try _HID first then _CID' has another downside. It highly depends
> > > on the driver is loaded first and then load the device. See motherboard
> > > driver loads first and the mem hotplug driver isn't loaded, in this
> > > situation if you scan the mem hotplug device, the mechanism will fail as
> > > the two pass search will still bind motherboard driver to the device.
> > Any solution depends on the mem hotplug device being loaded. This
> > doesn't appear to be _HID before _CID specific issue .
> >
> > > If you take the two pass search, I have a feeling this will make acpi
> > > never be able to convert Linux driver model.
> >
> > I am not trying to break forward work but what I do want is a solution
> > to my problem.
> >
> > > If you really want to workaround the issue, I prefer have a blacklist or
> > > something to let ACPI not use the _CID for your device, but please don't
> > > mess the ACPI core itself.
> >
> > My fist pass to fix the problem was I guess a hack of sorts that caused
> > others problems (motherboard add return != 0 on unknown devices). I
> > don't want another Keith grown hack that breaks other people.
> >
> > Can you elaborate on what you think would be safe way to do what you
> > propose since the ACPI core (can't/won't?) be fixed? I can imagine a
> > couple of different ways to fix this but I would like some feedback
> > before I go off and work on the 3rd pass of this fix.
Ok off I went....
> > 1. Make the memory device get scanned before the motherboard device
> > somehow. Implicitly reorder the devices in the list. Perhaps a priority
> > sort of sorts to have _HID devices always before _CID devices during
> > the scan?
> This will change the scan order of ACPI devices, and sounds too hacky to me.
An ACPI driver only has a name, with no _HID/_CID context. The handle reads
the namespace to fill in the device. There is no way to explicitly
order the list correctly for all cases as we are trying to reorder the
driver list.
I looked into making the memory device dynamically change itself to
not contain a _CID (thus changing the namespace) but it got pretty ugly.
This is perhaps doable but a little on the ugly side.
So I just flip the order of the device list for all cases in a simple
patch. This is a total hack and workaround for just my current
situation.
> > 3. Some special blacklist of the motherboard device on my specific
> > system.
Uhh this looks to require a whole new black list infrastructure?
Any more ideas????
Thanks,
Keith
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
[-- Attachment #2: patch-acpi-scanfix-v2 --]
[-- Type: text/plain, Size: 396 bytes --]
--- linux-2.6.17/drivers/acpi/scan.c 2006-09-20 13:58:06.000000000 -0700
+++ linux-2.6.18-rc6-mm2-works/drivers/acpi/scan.c 2006-09-20 11:31:33.000000000 -0700
@@ -601,7 +601,7 @@
return -ENODEV;
spin_lock(&acpi_device_lock);
- list_add_tail(&driver->node, &acpi_bus_drivers);
+ list_add(&driver->node, &acpi_bus_drivers);
spin_unlock(&acpi_device_lock);
acpi_driver_attach(driver);
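The effect of swapping list_add_tail() for list_add() can be sketched with a toy circular list. This is only an illustration: toy_list, toy_driver and the helper names are hypothetical stand-ins, not the kernel's list_head API, and the assumption is that the driver search walks the list from the head.

```c
#include <stddef.h>

/* Toy circular doubly linked list, modelled loosely on the kernel's
 * struct list_head. All names here are made up for illustration. */
struct toy_list {
	struct toy_list *next, *prev;
};

struct toy_driver {
	const char *name;
	struct toy_list node;
};

static void toy_list_init(struct toy_list *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry between two known neighbours. */
static void toy_insert(struct toy_list *entry, struct toy_list *prev,
		       struct toy_list *next)
{
	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
}

/* Insert right after the head: the newest entry is found first. */
static void toy_list_add(struct toy_list *entry, struct toy_list *head)
{
	toy_insert(entry, head, head->next);
}

/* Insert right before the head: the oldest entry is found first. */
static void toy_list_add_tail(struct toy_list *entry, struct toy_list *head)
{
	toy_insert(entry, head->prev, head);
}

/* Return the driver a head-to-tail search would try first, or NULL. */
static struct toy_driver *toy_first_driver(struct toy_list *head)
{
	if (head->next == head)
		return NULL;
	return (struct toy_driver *)((char *)head->next -
				     offsetof(struct toy_driver, node));
}
```

With tail insertion the first registered driver is tried first; with head insertion the most recently registered driver is tried first, so the outcome still depends on registration order.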
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
@ 2006-08-31 17:02 Moore, Robert
2006-08-31 17:56 ` keith mannthey
0 siblings, 1 reply; 30+ messages in thread
From: Moore, Robert @ 2006-08-31 17:02 UTC (permalink / raw)
To: kmannth, Len Brown
Cc: Li, Shaohua, Mattia Dongili, Andrew Morton, lkml, linux acpi,
KAMEZAWA Hiroyuki
Return AE_OK to continue the walk. AE_CTRL_DEPTH will cause the walk to
continue, but go no further down the current branch of the namespace.
Anything other than these two exceptions will completely abort the walk.
Bob
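A minimal sketch of the walk semantics described above, using a toy tree rather than the real ACPICA namespace. Everything here is an assumption made for illustration: toy_status, toy_node, toy_walk, the callback and the numeric constant values are hypothetical stand-ins, not ACPICA definitions.

```c
#include <stddef.h>

typedef unsigned int toy_status;

/* Illustrative values only; not taken from the ACPICA headers. */
#define TOY_AE_OK         0u
#define TOY_AE_CTRL_DEPTH 0x4006u
#define TOY_AE_ERROR      0x0001u

struct toy_node {
	int id;
	struct toy_node *child;   /* first child, or NULL */
	struct toy_node *sibling; /* next sibling, or NULL */
};

typedef toy_status (*toy_callback)(struct toy_node *node, void *context);

/* Depth-first walk over node and its siblings.
 * TOY_AE_OK         : keep walking, descend into children.
 * TOY_AE_CTRL_DEPTH : keep walking, but skip this node's children.
 * anything else     : abort the whole walk and return that value. */
static toy_status toy_walk(struct toy_node *node, toy_callback cb,
			   void *context)
{
	for (; node != NULL; node = node->sibling) {
		toy_status status = cb(node, context);

		if (status == TOY_AE_CTRL_DEPTH)
			continue;
		if (status != TOY_AE_OK)
			return status;

		status = toy_walk(node->child, cb, context);
		if (status != TOY_AE_OK)
			return status;
	}
	return TOY_AE_OK;
}

/* Example callback: counts visits, prunes below id 2, aborts on id 99. */
static toy_status toy_visit(struct toy_node *node, void *context)
{
	int *visited = context;

	(*visited)++;
	if (node->id == 2)
		return TOY_AE_CTRL_DEPTH;
	if (node->id == 99)
		return TOY_AE_ERROR;
	return TOY_AE_OK;
}
```

In this model a callback that returns an unexpected value, such as a negative errno, falls into the "anything else" case and ends the walk with that value.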
> -----Original Message-----
> From: keith mannthey [mailto:kmannth@us.ibm.com]
> Sent: Thursday, August 31, 2006 9:49 AM
> To: Len Brown
> Cc: Moore, Robert; Li, Shaohua; Mattia Dongili; Andrew Morton; lkml;
> linux acpi; KAMEZAWA Hiroyuki
> Subject: Re: one more ACPI Error (utglobal-0125): Unknown exception
> code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
>
> On Thu, 2006-08-31 at 02:48 -0400, Len Brown wrote:
> > On Tuesday 29 August 2006 16:04, Moore, Robert wrote:
> > > As far as the unknown exception,
> > >
> > > >[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
> > > >[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
> > > >[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
> > >
> > > I would guess that the callback routine for walk_resources is
> > > returning a non-zero status value which is causing an immediate
> > > abort of the walk with that value -- and the value is bogus.
>
> Before I put this check in, acpi_motherboard_add always attached
> itself to any resource type. I simply changed it so if the type is not
> ACPI_RESOURCE_TYPE_IO or ACPI_RESOURCE_TYPE_FIXED_IO it doesn't attach
> and can continue to find the correct device to attach to.
>
> Perhaps the motherboard device needs to attach to more device types?
>
> It was suggested by the ACPI folks to return -EINVAL. Should something
> else be returned?
>
>
> Thanks,
> Keith
>
> > Yep, see -EINVAL below.
> >
> > -Len
> >
> >
> > http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/hot-add-mem-x86_64-acpi-motherboard-fix.patch
> >
> >
> >
> > From: Keith Mannthey <kmannth@us.ibm.com>
> >
> > This patch set allow SPARSEMEM and RESERVE based hot-add to work. I have
> > test both options and they work as expected. I am adding memory to the
> > 2nd node of a numa system (x86_64).
> >
> > Major changes from last set is the config change and RESERVE enablment.
> >
> >
> > This patch:
> >
> >
> > Make ACPI motherboard driver not attach to devices/handles it dosen't expect.
> > Fix a bug where the motherboard driver attached to hot-add memory event and
> > caused the add memory call to fail.
> >
> > Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
> > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Cc: Andi Kleen <ak@muc.de>
> > Signed-off-by: Andrew Morton <akpm@osdl.org>
> > ---
> >
> >
> > diff -puN drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix drivers/acpi/motherboard.c
> > --- a/drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix
> > +++ a/drivers/acpi/motherboard.c
> > @@ -87,6 +87,7 @@ static acpi_status acpi_reserve_io_range
> > }
> > } else {
> > /* Memory mapped IO? */
> > + return -EINVAL;
> > }
> >
> > if (requested_res)
> > @@ -96,11 +97,16 @@ static acpi_status acpi_reserve_io_range
> >
> > static int acpi_motherboard_add(struct acpi_device *device)
> > {
> > + acpi_status status;
> > if (!device)
> > return -EINVAL;
> > - acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> > +
> > + status = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> > acpi_reserve_io_ranges, NULL);
> >
> > + if (ACPI_FAILURE(status))
> > + return -ENODEV;
> > +
> > return 0;
> > }
> >
> > _
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-08-31 17:02 Moore, Robert
@ 2006-08-31 17:56 ` keith mannthey
0 siblings, 0 replies; 30+ messages in thread
From: keith mannthey @ 2006-08-31 17:56 UTC (permalink / raw)
To: Moore, Robert
Cc: Len Brown, Li, Shaohua, Mattia Dongili, Andrew Morton, lkml,
linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-08-31 at 10:02 -0700, Moore, Robert wrote:
> Return AE_OK to continue the walk. AE_CTRL_DEPTH will cause the walk to
> continue, but go no further down the current branch of the namespace.
>
> Anything other than these two exceptions will completely abort the walk.
Let me check with AE_OK (this is non-zero?). It will be several hours.
Thanks,
Keith
> > -----Original Message-----
> > From: keith mannthey [mailto:kmannth@us.ibm.com]
> > Sent: Thursday, August 31, 2006 9:49 AM
> > To: Len Brown
> > Cc: Moore, Robert; Li, Shaohua; Mattia Dongili; Andrew Morton; lkml;
> > linux acpi; KAMEZAWA Hiroyuki
> > Subject: Re: one more ACPI Error (utglobal-0125): Unknown exception
> > code:0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
> >
> > On Thu, 2006-08-31 at 02:48 -0400, Len Brown wrote:
> > > On Tuesday 29 August 2006 16:04, Moore, Robert wrote:
> > > > As far as the unknown exception,
> > > >
> > > > >[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
> > > > >[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
> > > > >[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
> > > >
> > > > I would guess that the callback routine for walk_resources is
> > > > returning a non-zero status value which is causing an immediate
> > > > abort of the walk with that value -- and the value is bogus.
> >
> > Before I put this check in, acpi_motherboard_add always attached
> > itself to any resource type. I simply changed it so if the type is not
> > ACPI_RESOURCE_TYPE_IO or ACPI_RESOURCE_TYPE_FIXED_IO it doesn't attach
> > and can continue to find the correct device to attach to.
> >
> > Perhaps the motherboard device needs to attach to more device types?
> >
> > It was suggested by the ACPI folks to return -EINVAL. Should something
> > else be returned?
> >
> >
> > Thanks,
> > Keith
> >
> > > Yep, see -EINVAL below.
> > >
> > > -Len
> > >
> > >
> > > http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/hot-add-mem-x86_64-acpi-motherboard-fix.patch
> > >
> > >
> > >
> > > From: Keith Mannthey <kmannth@us.ibm.com>
> > >
> > > This patch set allow SPARSEMEM and RESERVE based hot-add to work. I have
> > > test both options and they work as expected. I am adding memory to the
> > > 2nd node of a numa system (x86_64).
> > >
> > > Major changes from last set is the config change and RESERVE enablment.
> > >
> > >
> > > This patch:
> > >
> > >
> > > Make ACPI motherboard driver not attach to devices/handles it dosen't expect.
> > > Fix a bug where the motherboard driver attached to hot-add memory event and
> > > caused the add memory call to fail.
> > >
> > > Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
> > > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > Cc: Andi Kleen <ak@muc.de>
> > > Signed-off-by: Andrew Morton <akpm@osdl.org>
> > > ---
> > >
> > >
> > > diff -puN drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix drivers/acpi/motherboard.c
> > > --- a/drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix
> > > +++ a/drivers/acpi/motherboard.c
> > > @@ -87,6 +87,7 @@ static acpi_status acpi_reserve_io_range
> > > }
> > > } else {
> > > /* Memory mapped IO? */
> > > + return -EINVAL;
> > > }
> > >
> > > if (requested_res)
> > > @@ -96,11 +97,16 @@ static acpi_status acpi_reserve_io_range
> > >
> > > static int acpi_motherboard_add(struct acpi_device *device)
> > > {
> > > + acpi_status status;
> > > if (!device)
> > > return -EINVAL;
> > > - acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> > > +
> > > + status = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> > > acpi_reserve_io_ranges, NULL);
> > >
> > > + if (ACPI_FAILURE(status))
> > > + return -ENODEV;
> > > +
> > > return 0;
> > > }
> > >
> > > _
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
@ 2006-08-29 20:04 Moore, Robert
2006-08-31 6:48 ` Len Brown
0 siblings, 1 reply; 30+ messages in thread
From: Moore, Robert @ 2006-08-29 20:04 UTC (permalink / raw)
To: Li, Shaohua, Mattia Dongili, Andrew Morton; +Cc: linux-kernel, linux-acpi
As far as the unknown exception,
>[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
>[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
>[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
I would guess that the callback routine for walk_resources is returning
a non-zero status value which is causing an immediate abort of the walk
with that value -- and the value is bogus.
Bob
> -----Original Message-----
> From: linux-acpi-owner@vger.kernel.org [mailto:linux-acpi-
> owner@vger.kernel.org] On Behalf Of Li, Shaohua
> Sent: Monday, August 28, 2006 7:06 PM
> To: Mattia Dongili; Andrew Morton
> Cc: linux-kernel@vger.kernel.org; linux-acpi@vger.kernel.org
> Subject: RE: one more ACPI Error (utglobal-0125): Unknown exception code:
> 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
>
>
>
> >-----Original Message-----
> >From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
> >owner@vger.kernel.org] On Behalf Of Mattia Dongili
> >Sent: Tuesday, August 29, 2006 4:24 AM
> >To: Andrew Morton
> >Cc: linux-kernel@vger.kernel.org; linux-acpi@vger.kernel.org
> >Subject: one more ACPI Error (utglobal-0125): Unknown exception code:
> >0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
> >
> >On Sat, Aug 26, 2006 at 04:09:22PM -0700, Andrew Morton wrote:
> >>
> >> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/
> >[...]
> >> git-acpi.patch
> >
> >Sorry for reporting separately, I deleted the other thread on the issue.
> >Here we go:
> >[ 9.386644] PCI: Using ACPI for IRQ routing
> >[ 9.386688] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
> >[ 9.391209] ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [20060707]
> >[ 9.391521] [<c0103a9f>] dump_trace+0x1ef/0x230
> >[ 9.391626] [<c0103b06>] show_trace_log_lvl+0x26/0x40
> >[ 9.391724] [<c01042bb>] show_trace+0x1b/0x20
> >[ 9.391820] [<c01043a4>] dump_stack+0x24/0x30
> >[ 9.391918] [<c0249f15>] acpi_format_exception+0xa3/0xb0
> >[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
> >[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
> >[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
> >[ 9.394977] [<c0255890>] acpi_bus_driver_init+0x2b/0x7c
> >[ 9.395742] [<c02568da>] acpi_bus_register_driver+0xa1/0x123
> >[ 9.396507] [<c0418adb>] acpi_motherboard_init+0x17/0xfb
> >[ 9.397268] [<c01003d0>] init+0x80/0x290
> >[ 9.397343] [<c0103593>] kernel_thread_helper+0x7/0x14
> >[ 9.397439] =======================
> >
> >full dmesg: http://oioio.altervista.org/linux/dmesg-2.6.18-rc4-mm3-1
> >config: http://oioio.altervista.org/linux/config-2.6.18-rc4-mm3-1
> >DSDT: http://oioio.altervista.org/linux/DSDT.aml
> > http://oioio.altervista.org/linux/DSDT.dsl
> >lspci: http://oioio.altervista.org/linux/lspci-v
> Below patch is the root cause.
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/hot-add-mem-x86_64-acpi-motherboard-fix.patch
>
> motherboard driver is expected to reserve resources used by motherboard,
> so hotplug will not fail. I don't know why memory hotplug guys change
> it.
>
> Thanks,
> Shaohua
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-08-29 20:04 one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA " Moore, Robert
@ 2006-08-31 6:48 ` Len Brown
2006-08-31 16:48 ` keith mannthey
0 siblings, 1 reply; 30+ messages in thread
From: Len Brown @ 2006-08-31 6:48 UTC (permalink / raw)
To: Moore, Robert
Cc: Li, Shaohua, Mattia Dongili, Andrew Morton, linux-kernel,
linux-acpi, Keith Mannthey, KAMEZAWA Hiroyuki
On Tuesday 29 August 2006 16:04, Moore, Robert wrote:
> As far as the unknown exception,
>
> >[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
> >[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
> >[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
>
> I would guess that the callback routine for walk_resources is returning
> a non-zero status value which is causing an immediate abort of the walk
> with that value -- and the value is bogus.
Yep, see -EINVAL below.
-Len
http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/hot-add-mem-x86_64-acpi-motherboard-fix.patch
From: Keith Mannthey <kmannth@us.ibm.com>
This patch set allow SPARSEMEM and RESERVE based hot-add to work. I have
test both options and they work as expected. I am adding memory to the
2nd node of a numa system (x86_64).
Major changes from last set is the config change and RESERVE enablment.
This patch:
Make ACPI motherboard driver not attach to devices/handles it dosen't expect.
Fix a bug where the motherboard driver attached to hot-add memory event and
caused the add memory call to fail.
Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---
diff -puN drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix drivers/acpi/motherboard.c
--- a/drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix
+++ a/drivers/acpi/motherboard.c
@@ -87,6 +87,7 @@ static acpi_status acpi_reserve_io_range
}
} else {
/* Memory mapped IO? */
+ return -EINVAL;
}
if (requested_res)
@@ -96,11 +97,16 @@ static acpi_status acpi_reserve_io_range
static int acpi_motherboard_add(struct acpi_device *device)
{
+ acpi_status status;
if (!device)
return -EINVAL;
- acpi_walk_resources(device->handle, METHOD_NAME__CRS,
+
+ status = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
acpi_reserve_io_ranges, NULL);
+ if (ACPI_FAILURE(status))
+ return -ENODEV;
+
return 0;
}
_
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-08-31 6:48 ` Len Brown
@ 2006-08-31 16:48 ` keith mannthey
2006-08-31 23:06 ` Bjorn Helgaas
0 siblings, 1 reply; 30+ messages in thread
From: keith mannthey @ 2006-08-31 16:48 UTC (permalink / raw)
To: Len Brown
Cc: Moore, Robert, Li, Shaohua, Mattia Dongili, Andrew Morton, lkml,
linux acpi, KAMEZAWA Hiroyuki
On Thu, 2006-08-31 at 02:48 -0400, Len Brown wrote:
> On Tuesday 29 August 2006 16:04, Moore, Robert wrote:
> > As far as the unknown exception,
> >
> > >[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
> > >[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
> > >[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
> >
> > I would guess that the callback routine for walk_resources is returning
> > a non-zero status value which is causing an immediate abort of the walk
> > with that value -- and the value is bogus.
Before I put this check in, acpi_motherboard_add always attached itself
to any resource type. I simply changed it so if the type is not
ACPI_RESOURCE_TYPE_IO or ACPI_RESOURCE_TYPE_FIXED_IO it doesn't attach
and can continue to find the correct device to attach to.
Perhaps the motherboard device needs to attach to more device types?
It was suggested by the ACPI folks to return -EINVAL. Should something else
be returned?
Thanks,
Keith
> Yep, see -EINVAL below.
>
> -Len
>
> http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/hot-add-mem-x86_64-acpi-motherboard-fix.patch
>
>
>
> From: Keith Mannthey <kmannth@us.ibm.com>
>
> This patch set allow SPARSEMEM and RESERVE based hot-add to work. I have
> test both options and they work as expected. I am adding memory to the
> 2nd node of a numa system (x86_64).
>
> Major changes from last set is the config change and RESERVE enablment.
>
>
> This patch:
>
>
> Make ACPI motherboard driver not attach to devices/handles it dosen't expect.
> Fix a bug where the motherboard driver attached to hot-add memory event and
> caused the add memory call to fail.
>
> Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Cc: Andi Kleen <ak@muc.de>
> Signed-off-by: Andrew Morton <akpm@osdl.org>
> ---
>
>
> diff -puN drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix drivers/acpi/motherboard.c
> --- a/drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix
> +++ a/drivers/acpi/motherboard.c
> @@ -87,6 +87,7 @@ static acpi_status acpi_reserve_io_range
> }
> } else {
> /* Memory mapped IO? */
> + return -EINVAL;
> }
>
> if (requested_res)
> @@ -96,11 +97,16 @@ static acpi_status acpi_reserve_io_range
>
> static int acpi_motherboard_add(struct acpi_device *device)
> {
> + acpi_status status;
> if (!device)
> return -EINVAL;
> - acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> +
> + status = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> acpi_reserve_io_ranges, NULL);
>
> + if (ACPI_FAILURE(status))
> + return -ENODEV;
> +
> return 0;
> }
>
> _
^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
2006-08-31 16:48 ` keith mannthey
@ 2006-08-31 23:06 ` Bjorn Helgaas
[not found] ` <1157073592.5649.29.camel@keithlap>
0 siblings, 1 reply; 30+ messages in thread
From: Bjorn Helgaas @ 2006-08-31 23:06 UTC (permalink / raw)
To: kmannth
Cc: Len Brown, Moore, Robert, Li, Shaohua, Mattia Dongili,
Andrew Morton, lkml, linux acpi, KAMEZAWA Hiroyuki
On Thursday 31 August 2006 10:48, keith mannthey wrote:
> On Thu, 2006-08-31 at 02:48 -0400, Len Brown wrote:
> > On Tuesday 29 August 2006 16:04, Moore, Robert wrote:
> > > As far as the unknown exception,
> > >
> > > >[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
> > > >[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
> > > >[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
> > >
> > > I would guess that the callback routine for walk_resources is returning
> > > a non-zero status value which is causing an immediate abort of the walk
> > > with that value -- and the value is bogus.
>
> Before I put this check in, acpi_motherboard_add always attached itself
> to any resource type. I simply changed it so if the type is not
> ACPI_RESOURCE_TYPE_IO or ACPI_RESOURCE_TYPE_FIXED_IO it doesn't attach
> and can continue to find the correct device to attach to.
>
> Perhaps the motherboard device needs to attach to more device types?
>
> It was suggested by the ACPI folks to return -EINVAL. Should something else
> be returned?
Problem 1: acpi_reserve_io_ranges() needs to return an acpi_status
like AE_OK or AE_CTRL_TERMINATE, not a -EINVAL.
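To see where the 0xFFFFFFEA in the log comes from, here is a small sketch under stated assumptions: status codes are treated as unsigned 32-bit values and EINVAL is taken to be 22, as on Linux. The toy_ names are hypothetical stand-ins, and the second function is only one possible shape for a callback that stays within the status-code space; it is not the actual fix.

```c
#include <stdint.h>

typedef uint32_t toy_acpi_status; /* assumption: unsigned 32-bit status */

#define TOY_AE_OK  0u
#define TOY_EINVAL 22 /* value of EINVAL on Linux */

enum toy_resource_type {
	TOY_RES_IO,
	TOY_RES_FIXED_IO,
	TOY_RES_MEMORY
};

/* What "return -EINVAL;" turns into when the function's return type is
 * an unsigned status: the value wraps modulo 2^32. */
static toy_acpi_status toy_status_from_errno(int negative_errno)
{
	return (toy_acpi_status)negative_errno;
}

/* One possible callback shape: count the I/O resources it would reserve,
 * ignore everything else, and always hand back a real status code. */
static toy_acpi_status toy_reserve_io_ranges(enum toy_resource_type type,
					     int *reserved)
{
	if (type == TOY_RES_IO || type == TOY_RES_FIXED_IO)
		(*reserved)++;
	return TOY_AE_OK;
}
```

Since -22 wraps to 4294967274, which is 0xFFFFFFEA, the "Unknown exception code" in the report matches a negative errno leaking into a status return.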
Problem 2: I don't understand how your patch works. An ACPI device
has a list of resources it uses. Are you saying that claiming all
the IO port resources of a PNP0C01 or PNP0C02 device causes the ACPI
memory hotplug driver to fail?
Is there some conflict between those PNP0C01 resources and the
resources of a hotplug memory device? Can you figure out exactly
what the conflict is by disassembling the DSDT for those devices?
We should understand this better before introducing special cases
to the motherboard driver. We should be able to trust the ACPI
description of the motherboard resources. The motherboard driver
currently claims only I/O port resources, but it really should
claim MMIO resources as well.
> > http://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/hot-add-mem-x86_64-acpi-motherboard-fix.patch
> >
> > From: Keith Mannthey <kmannth@us.ibm.com>
> > ...
> > Make ACPI motherboard driver not attach to devices/handles it dosen't expect.
> > Fix a bug where the motherboard driver attached to hot-add memory event and
> > caused the add memory call to fail.
> >
> > Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
> > Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Cc: Andi Kleen <ak@muc.de>
> > Signed-off-by: Andrew Morton <akpm@osdl.org>
> > ---
> > diff -puN drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix drivers/acpi/motherboard.c
> > --- a/drivers/acpi/motherboard.c~hot-add-mem-x86_64-acpi-motherboard-fix
> > +++ a/drivers/acpi/motherboard.c
> > @@ -87,6 +87,7 @@ static acpi_status acpi_reserve_io_range
> > }
> > } else {
> > /* Memory mapped IO? */
> > + return -EINVAL;
> > }
> >
> > if (requested_res)
> > @@ -96,11 +97,16 @@ static acpi_status acpi_reserve_io_range
> >
> > static int acpi_motherboard_add(struct acpi_device *device)
> > {
> > + acpi_status status;
> > if (!device)
> > return -EINVAL;
> > - acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> > +
> > + status = acpi_walk_resources(device->handle, METHOD_NAME__CRS,
> > acpi_reserve_io_ranges, NULL);
> >
> > + if (ACPI_FAILURE(status))
> > + return -ENODEV;
> > +
> > return 0;
> > }
^ permalink raw reply [flat|nested] 30+ messages in thread
* RE: one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
@ 2006-08-29 2:05 Li, Shaohua
0 siblings, 0 replies; 30+ messages in thread
From: Li, Shaohua @ 2006-08-29 2:05 UTC (permalink / raw)
To: Mattia Dongili, Andrew Morton; +Cc: linux-kernel, linux-acpi
>-----Original Message-----
>From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
>owner@vger.kernel.org] On Behalf Of Mattia Dongili
>Sent: Tuesday, August 29, 2006 4:24 AM
>To: Andrew Morton
>Cc: linux-kernel@vger.kernel.org; linux-acpi@vger.kernel.org
>Subject: one more ACPI Error (utglobal-0125): Unknown exception code:
>0xFFFFFFEA [Re: 2.6.18-rc4-mm3]
>
>On Sat, Aug 26, 2006 at 04:09:22PM -0700, Andrew Morton wrote:
>>
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/
>[...]
>> git-acpi.patch
>
>Sorry for reporting separately, I deleted the other thread on the issue.
>Here we go:
>[ 9.386644] PCI: Using ACPI for IRQ routing
>[ 9.386688] PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
>[ 9.391209] ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [20060707]
>[ 9.391521] [<c0103a9f>] dump_trace+0x1ef/0x230
>[ 9.391626] [<c0103b06>] show_trace_log_lvl+0x26/0x40
>[ 9.391724] [<c01042bb>] show_trace+0x1b/0x20
>[ 9.391820] [<c01043a4>] dump_stack+0x24/0x30
>[ 9.391918] [<c0249f15>] acpi_format_exception+0xa3/0xb0
>[ 9.392729] [<c0246fb6>] acpi_ut_status_exit+0x31/0x5e
>[ 9.393453] [<c0243352>] acpi_walk_resources+0x10e/0x11b
>[ 9.394174] [<c025697e>] acpi_motherboard_add+0x22/0x31
>[ 9.394977] [<c0255890>] acpi_bus_driver_init+0x2b/0x7c
>[ 9.395742] [<c02568da>] acpi_bus_register_driver+0xa1/0x123
>[ 9.396507] [<c0418adb>] acpi_motherboard_init+0x17/0xfb
>[ 9.397268] [<c01003d0>] init+0x80/0x290
>[ 9.397343] [<c0103593>] kernel_thread_helper+0x7/0x14
>[ 9.397439] =======================
>
>full dmesg: http://oioio.altervista.org/linux/dmesg-2.6.18-rc4-mm3-1
>config: http://oioio.altervista.org/linux/config-2.6.18-rc4-mm3-1
>DSDT: http://oioio.altervista.org/linux/DSDT.aml
> http://oioio.altervista.org/linux/DSDT.dsl
>lspci: http://oioio.altervista.org/linux/lspci-v
Below patch is the root cause.
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc4/2.6.18-rc4-mm3/broken-out/hot-add-mem-x86_64-acpi-motherboard-fix.patch
motherboard driver is expected to reserve resources used by motherboard,
so hotplug will not fail. I don't know why memory hotplug guys change
it.
Thanks,
Shaohua
^ permalink raw reply [flat|nested] 30+ messages in thread
[parent not found: <20060826160922.3324a707.akpm@osdl.org>]
Thread overview: 30+ messages
2006-09-06 18:59 one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA [Re: 2.6.18-rc4-mm3] Moore, Robert
2006-09-06 20:04 ` keith mannthey
2006-09-07 2:03 ` one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA " Shaohua Li
2006-09-07 15:25 ` Bjorn Helgaas
2006-09-08 0:57 ` Shaohua Li
2006-09-08 2:27 ` Bjorn Helgaas
2006-09-13 1:27 ` keith mannthey
2006-09-13 14:51 ` Bjorn Helgaas
2006-09-14 3:01 ` Shaohua Li
2006-09-14 16:36 ` Bjorn Helgaas
2006-09-15 1:39 ` Shaohua Li
2006-09-19 10:22 ` Bjorn Helgaas
2006-09-14 17:55 ` keith mannthey
2006-09-15 1:52 ` Shaohua Li
2006-09-21 0:27 ` keith mannthey
-- strict thread matches above, loose matches on Subject: below --
2006-08-31 17:02 Moore, Robert
2006-08-31 17:56 ` keith mannthey
2006-08-29 20:04 one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA " Moore, Robert
2006-08-31 6:48 ` Len Brown
2006-08-31 16:48 ` keith mannthey
2006-08-31 23:06 ` Bjorn Helgaas
[not found] ` <1157073592.5649.29.camel@keithlap>
2006-09-01 2:39 ` one more ACPI Error (utglobal-0125): Unknown exception code:0xFFFFFFEA " Shaohua Li
2006-09-01 3:31 ` keith mannthey
2006-09-01 3:15 ` one more ACPI Error (utglobal-0125): Unknown exception code: 0xFFFFFFEA " Bjorn Helgaas
2006-09-01 3:56 ` KAMEZAWA Hiroyuki
2006-09-01 23:01 ` keith mannthey
2006-09-01 23:20 ` Bjorn Helgaas
2006-09-06 18:14 ` keith mannthey
2006-08-29 2:05 Li, Shaohua
[not found] <20060826160922.3324a707.akpm@osdl.org>
2006-08-28 20:24 ` Mattia Dongili