* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
[not found] ` <55869329.4040908@pr.hu>
@ 2015-06-21 14:03 ` Bjorn Helgaas
2015-06-21 14:19 ` Boszormenyi Zoltan
0 siblings, 1 reply; 10+ messages in thread
From: Bjorn Helgaas @ 2015-06-21 14:03 UTC (permalink / raw)
To: Boszormenyi Zoltan
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
[+cc linux-pci]
Hi Boszormenyi,
On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan <zboszor@pr.hu> wrote:
> Hi,
>
> please, cc me, I am not subscribed to lkml.
>
>> Hi,
>>
>> [lkml.org still broken --> no accurate mail header info possible...]
>>
>> Just to ask the obvious:
>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>> (since the machine comes up empty at initial-boot scan, too)
>
> I will try it, too, but I am not sure it would work.
>
> Currently I can't test it because the last time I completely discharged
> the battery. I also disconnected it to be able to get the realtek chip back
> immediately for faster testing. Now, that I have reconnected the battery,
> I need to wait for it to be charged somewhat to be able to reproduce
> losing the network chip.
>
>> Also, you could try diffing lspci -vvxxx -s.... output
>> of working vs. "distorting" kernel version - perhaps some register setup
>> has been changed (e.g. due to power management improvements or some such),
>> which may encourage the card
>> to get a problematic/corrupt state.
>
> I attached a tarball that contains lspci -vvxxx for
> - all devices / only the network chip
> - before / after "modprobe r8169"
> - for all 3 kernel versions tested.
>
> I figured out that if I type the modprobe and lspci in the same command line,
> I can get diagnostics out of the machine, after all.
>
> It's not just the Realtek chip that has changed parameters.
>
> (Vague idea) I noticed that some devices have changed like this:
>
> - Memory behind bridge: 80000000-801fffff
> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
> + Memory behind bridge: ff000000-ff1fffff
> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>
> Can't this cause a problem? E.g. programming the bridge with an address range
> that the bridge doesn't actually support?
This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
v3.18.16 dmesg log, so we can compare them?
These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
the code to see what might be going on:
acpi PNP0A08:00: host bridge window expanded to [mem
0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
ignored
pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
64bit pref]: address conflict with PCI Bus 0000:00 [mem
0xf0000000-0xfed8ffff window]
Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 14:03 ` ACPI regression? Was Re: Ethernet chip disappeared from lspci Bjorn Helgaas
@ 2015-06-21 14:19 ` Boszormenyi Zoltan
2015-06-21 15:37 ` Boszormenyi Zoltan
2015-06-21 17:25 ` Jiang Liu
0 siblings, 2 replies; 10+ messages in thread
From: Boszormenyi Zoltan @ 2015-06-21 14:19 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 2859 bytes --]
On 2015-06-21 16:03, Bjorn Helgaas wrote:
> [+cc linux-pci]
>
> Hi Boszormenyi,
>
> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan <zboszor@pr.hu> wrote:
>> Hi,
>>
>> please, cc me, I am not subscribed to lkml.
>>
>>> Hi,
>>>
>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>
>>> Just to ask the obvious:
>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>> (since the machine comes up empty at initial-boot scan, too)
>> I will try it, too, but I am not sure it would work.
>>
>> Currently I can't test it because the last time I completely discharged
>> the battery. I also disconnected it to be able to get the realtek chip back
>> immediately for faster testing. Now, that I have reconnected the battery,
>> I need to wait for it to be charged somewhat to be able to reproduce
>> losing the network chip.
>>
>>> Also, you could try diffing lspci -vvxxx -s.... output
>>> of working vs. "distorting" kernel version - perhaps some register setup
>>> has been changed (e.g. due to power management improvements or some such),
>>> which may encourage the card
>>> to get a problematic/corrupt state.
>> I attached a tarball that contains lspci -vvxxx for
>> - all devices / only the network chip
>> - before / after "modprobe r8169"
>> - for all 3 kernel versions tested.
>>
>> I figured out that if I type the modprobe and lspci in the same command line,
>> I can get diagnostics out of the machine, after all.
>>
>> It's not just the Realtek chip that has changed parameters.
>>
>> (Vague idea) I noticed that some devices have changed like this:
>>
>> - Memory behind bridge: 80000000-801fffff
>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>> + Memory behind bridge: ff000000-ff1fffff
>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>
>> Can't this cause a problem? E.g. programming the bridge with an address range
>> that the bridge doesn't actually support?
> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
> v3.18.16 dmesg log, so we can compare them?
I collected all 3 for you to compare them, compressed, attached.
BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>
> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
> the code to see what might be going on:
>
> acpi PNP0A08:00: host bridge window expanded to [mem
> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
> ignored
> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
> 0xf0000000-0xfed8ffff window]
>
> Bjorn
>
Thanks,
Zoltán Böszörményi
[-- Attachment #2: dmesg.tgz --]
[-- Type: application/x-compressed-tar, Size: 39096 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 14:19 ` Boszormenyi Zoltan
@ 2015-06-21 15:37 ` Boszormenyi Zoltan
2015-06-21 17:25 ` Jiang Liu
1 sibling, 0 replies; 10+ messages in thread
From: Boszormenyi Zoltan @ 2015-06-21 15:37 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
On 2015-06-21 16:19, Boszormenyi Zoltan wrote:
> On 2015-06-21 16:03, Bjorn Helgaas wrote:
>> [+cc linux-pci]
>>
>> Hi Boszormenyi,
>>
>> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan <zboszor@pr.hu> wrote:
>>> Hi,
>>>
>>> please, cc me, I am not subscribed to lkml.
>>>
>>>> Hi,
>>>>
>>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>>
>>>> Just to ask the obvious:
>>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>>> (since the machine comes up empty at initial-boot scan, too)
>>> I will try it, too, but I am not sure it would work.
>>>
>>> Currently I can't test it because the last time I completely discharged
>>> the battery. I also disconnected it to be able to get the realtek chip back
>>> immediately for faster testing. Now, that I have reconnected the battery,
>>> I need to wait for it to be charged somewhat to be able to reproduce
>>> losing the network chip.
>>>
>>>> Also, you could try diffing lspci -vvxxx -s.... output
>>>> of working vs. "distorting" kernel version - perhaps some register setup
>>>> has been changed (e.g. due to power management improvements or some such),
>>>> which may encourage the card
>>>> to get a problematic/corrupt state.
>>> I attached a tarball that contains lspci -vvxxx for
>>> - all devices / only the network chip
>>> - before / after "modprobe r8169"
>>> - for all 3 kernel versions tested.
>>>
>>> I figured out that if I type the modprobe and lspci in the same command line,
>>> I can get diagnostics out of the machine, after all.
>>>
>>> It's not just the Realtek chip that has changed parameters.
>>>
>>> (Vague idea) I noticed that some devices have changed like this:
>>>
>>> - Memory behind bridge: 80000000-801fffff
>>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>>> + Memory behind bridge: ff000000-ff1fffff
>>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>>
>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>> that the bridge doesn't actually support?
>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>> v3.18.16 dmesg log, so we can compare them?
> I collected all 3 for you to compare them, compressed, attached.
>
> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
Reverting this one didn't help.
>
>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>> the code to see what might be going on:
>>
>> acpi PNP0A08:00: host bridge window expanded to [mem
>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
>> ignored
>> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
>> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
>> 0xf0000000-0xfed8ffff window]
>>
>> Bjorn
>>
> Thanks,
> Zoltán Böszörményi
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 14:19 ` Boszormenyi Zoltan
2015-06-21 15:37 ` Boszormenyi Zoltan
@ 2015-06-21 17:25 ` Jiang Liu
2015-06-21 17:55 ` Jiang Liu
2015-06-21 18:28 ` ACPI regression? Was Re: Ethernet chip disappeared from lspci Boszormenyi Zoltan
1 sibling, 2 replies; 10+ messages in thread
From: Jiang Liu @ 2015-06-21 17:25 UTC (permalink / raw)
To: Boszormenyi Zoltan, Bjorn Helgaas
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
On 2015/6/21 22:19, Boszormenyi Zoltan wrote:
> On 2015-06-21 16:03, Bjorn Helgaas wrote:
>> [+cc linux-pci]
>>
>> Hi Boszormenyi,
>>
>> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan <zboszor@pr.hu> wrote:
>>> Hi,
>>>
>>> please, cc me, I am not subscribed to lkml.
>>>
>>>> Hi,
>>>>
>>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>>
>>>> Just to ask the obvious:
>>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>>> (since the machine comes up empty at initial-boot scan, too)
>>> I will try it, too, but I am not sure it would work.
>>>
>>> Currently I can't test it because the last time I completely discharged
>>> the battery. I also disconnected it to be able to get the realtek chip back
>>> immediately for faster testing. Now, that I have reconnected the battery,
>>> I need to wait for it to be charged somewhat to be able to reproduce
>>> losing the network chip.
>>>
>>>> Also, you could try diffing lspci -vvxxx -s.... output
>>>> of working vs. "distorting" kernel version - perhaps some register setup
>>>> has been changed (e.g. due to power management improvements or some such),
>>>> which may encourage the card
>>>> to get a problematic/corrupt state.
>>> I attached a tarball that contains lspci -vvxxx for
>>> - all devices / only the network chip
>>> - before / after "modprobe r8169"
>>> - for all 3 kernel versions tested.
>>>
>>> I figured out that if I type the modprobe and lspci in the same command line,
>>> I can get diagnostics out of the machine, after all.
>>>
>>> It's not just the Realtek chip that has changed parameters.
>>>
>>> (Vague idea) I noticed that some devices have changed like this:
>>>
>>> - Memory behind bridge: 80000000-801fffff
>>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>>> + Memory behind bridge: ff000000-ff1fffff
>>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>>
>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>> that the bridge doesn't actually support?
>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>> v3.18.16 dmesg log, so we can compare them?
>
> I collected all 3 for you to compare them, compressed, attached.
>
> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>
>>
>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>> the code to see what might be going on:
>>
>> acpi PNP0A08:00: host bridge window expanded to [mem
>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
>> ignored
>> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
>> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
>> 0xf0000000-0xfed8ffff window]
>>
>> Bjorn
Hi Bjorn and Boszormenyi,
The 3.18 kernel logs this message:
[ 0.126248] acpi PNP0A08:00: host bridge window [0x400000000-0xfffffffff] (ignored, not CPU addressable)
while 4.1-rc8 logs a different one:
[ 0.127051] acpi PNP0A08:00: host bridge window expanded to [mem 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window] ignored
That smells like a 32-bit overflow or a 64-bit value being cut off.
Hi Boszormenyi, could you please provide an acpidump from the
machine?
Thanks!
Gerry
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 17:25 ` Jiang Liu
@ 2015-06-21 17:55 ` Jiang Liu
2015-06-21 18:55 ` Boszormenyi Zoltan
2015-06-21 18:28 ` ACPI regression? Was Re: Ethernet chip disappeared from lspci Boszormenyi Zoltan
1 sibling, 1 reply; 10+ messages in thread
From: Jiang Liu @ 2015-06-21 17:55 UTC (permalink / raw)
To: Boszormenyi Zoltan, Bjorn Helgaas
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
On 2015/6/22 1:25, Jiang Liu wrote:
[...]
>>>> - Memory behind bridge: 80000000-801fffff
>>>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>>>> + Memory behind bridge: ff000000-ff1fffff
>>>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>>>
>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>> that the bridge doesn't actually support?
>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>> v3.18.16 dmesg log, so we can compare them?
>>
>> I collected all 3 for you to compare them, compressed, attached.
>>
>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>
>>>
>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>> the code to see what might be going on:
>>>
>>> acpi PNP0A08:00: host bridge window expanded to [mem
>>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
>>> ignored
>>> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
>>> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
>>> 0xf0000000-0xfed8ffff window]
>>>
>>> Bjorn
> Hi Bjorn and Boszormenyi,
> From the 3.18 kernel, we got a message:
> [ 0.126248] acpi PNP0A08:00: host bridge window
> [0x400000000-0xfffffffff] (ignored, not CPU addressable)
> And from 4.1.-rc8, we got another message:
> [ 0.127051] acpi PNP0A08:00: host bridge window expanded to [mem
> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window] ignored
>
> That smells like a 32bit overflow or 64bit cut-off issue.
Hi Bjorn and Boszormenyi,
v3.18.16 uses u64 to compare resource ranges. Recent changes switched
to resource_size_t, which may be u32 or u64 depending on the
configuration. So the resource range [0x400000000-0xfffffffff] may
have been truncated to [0x00000000-0xffffffff], which would cause
the trouble.
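To illustrate the suspected cut-off outside the kernel, here is a minimal userspace sketch. It is not kernel code: res32_t and store_window() are made-up names that stand in for a 32-bit resource_size_t and for the plain assignments of the ACPI minimum/maximum values into a struct resource.

```c
#include <stdint.h>

/* Hypothetical stand-in for resource_size_t on a kernel where it is u32. */
typedef uint32_t res32_t;

/* Hypothetical stand-in for the start/end pair of a struct resource. */
struct win32 {
	res32_t start;
	res32_t end;
};

/*
 * Store 64-bit ACPI minimum/maximum values into the narrower fields.
 * The conversion silently keeps only the low 32 bits, which is the
 * truncation suspected in this thread.
 */
static struct win32 store_window(uint64_t minimum, uint64_t maximum)
{
	struct win32 w;

	w.start = (res32_t)minimum;	/* upper 32 bits are dropped */
	w.end = (res32_t)maximum;	/* upper 32 bits are dropped */
	return w;
}
```

Calling store_window(0x400000000ULL, 0xfffffffffULL) leaves start at 0x00000000 and end at 0xffffffff, the same range that shows up in the 4.1-rc8 "host bridge window expanded" message.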
Hi Boszormenyi,
Could you please try the following test patch against v4.1-rc8?
Thanks!
Gerry
-------------------------------------------------------------------
diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
index 8244f013f210..d7b8c392c420 100644
--- a/drivers/acpi/resource.c
+++ b/drivers/acpi/resource.c
@@ -206,6 +206,11 @@ static bool acpi_decode_space(struct resource_win *win,
res->start = attr->minimum;
res->end = attr->maximum;
+ if (res->start != attr->minimum || res->end != attr->maximum) {
+ pr_warn("resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
+ attr->minimum, attr->maximum);
+ return false;
+ }
/*
* For bridges that translate addresses across the bridge,
-----------------------------------------------------------------------------
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 17:25 ` Jiang Liu
2015-06-21 17:55 ` Jiang Liu
@ 2015-06-21 18:28 ` Boszormenyi Zoltan
1 sibling, 0 replies; 10+ messages in thread
From: Boszormenyi Zoltan @ 2015-06-21 18:28 UTC (permalink / raw)
To: Jiang Liu, Bjorn Helgaas
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 3710 bytes --]
On 2015-06-21 19:25, Jiang Liu wrote:
> On 2015/6/21 22:19, Boszormenyi Zoltan wrote:
>> On 2015-06-21 16:03, Bjorn Helgaas wrote:
>>> [+cc linux-pci]
>>>
>>> Hi Boszormenyi,
>>>
>>> On Sun, Jun 21, 2015 at 5:34 AM, Boszormenyi Zoltan <zboszor@pr.hu> wrote:
>>>> Hi,
>>>>
>>>> please, cc me, I am not subscribed to lkml.
>>>>
>>>>> Hi,
>>>>>
>>>>> [lkml.org still broken --> no accurate mail header info possible...]
>>>>>
>>>>> Just to ask the obvious:
>>>>> I assume using /sys/bus/pci/rescan does not help once it's broken?
>>>>> (since the machine comes up empty at initial-boot scan, too)
>>>> I will try it, too, but I am not sure it would work.
>>>>
>>>> Currently I can't test it because the last time I completely discharged
>>>> the battery. I also disconnected it to be able to get the realtek chip back
>>>> immediately for faster testing. Now, that I have reconnected the battery,
>>>> I need to wait for it to be charged somewhat to be able to reproduce
>>>> losing the network chip.
>>>>
>>>>> Also, you could try diffing lspci -vvxxx -s.... output
>>>>> of working vs. "distorting" kernel version - perhaps some register setup
>>>>> has been changed (e.g. due to power management improvements or some such),
>>>>> which may encourage the card
>>>>> to get a problematic/corrupt state.
>>>> I attached a tarball that contains lspci -vvxxx for
>>>> - all devices / only the network chip
>>>> - before / after "modprobe r8169"
>>>> - for all 3 kernel versions tested.
>>>>
>>>> I figured out that if I type the modprobe and lspci in the same command line,
>>>> I can get diagnostics out of the machine, after all.
>>>>
>>>> It's not just the Realtek chip that has changed parameters.
>>>>
>>>> (Vague idea) I noticed that some devices have changed like this:
>>>>
>>>> - Memory behind bridge: 80000000-801fffff
>>>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>>>> + Memory behind bridge: ff000000-ff1fffff
>>>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>>>
>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>> that the bridge doesn't actually support?
>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>> v3.18.16 dmesg log, so we can compare them?
>> I collected all 3 for you to compare them, compressed, attached.
>>
>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>
>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>> the code to see what might be going on:
>>>
>>> acpi PNP0A08:00: host bridge window expanded to [mem
>>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
>>> ignored
>>> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
>>> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
>>> 0xf0000000-0xfed8ffff window]
>>>
>>> Bjorn
> Hi Bjorn and Boszormenyi,
> From the 3.18 kernel, we got a message:
> [ 0.126248] acpi PNP0A08:00: host bridge window
> [0x400000000-0xfffffffff] (ignored, not CPU addressable)
> And from 4.1.-rc8, we got another message:
> [ 0.127051] acpi PNP0A08:00: host bridge window expanded to [mem
> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window] ignored
>
> That smells like a 32bit overflow or 64bit cut-off issue.
>
> Hi Boszormenyi, could you please help to provide acpidump from the
> machine?
I already did in a previous mail which was only sent to LKML, but here it is again.
Thanks,
Zoltán
> Thanks!
> Gerry
>
>
>
>
[-- Attachment #2: acpidump.tgz --]
[-- Type: application/x-compressed-tar, Size: 49038 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 17:55 ` Jiang Liu
@ 2015-06-21 18:55 ` Boszormenyi Zoltan
2015-06-21 19:59 ` Boszormenyi Zoltan
0 siblings, 1 reply; 10+ messages in thread
From: Boszormenyi Zoltan @ 2015-06-21 18:55 UTC (permalink / raw)
To: Jiang Liu, Bjorn Helgaas
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 3506 bytes --]
On 2015-06-21 19:55, Jiang Liu wrote:
> On 2015/6/22 1:25, Jiang Liu wrote:
> [...]
>>>>> - Memory behind bridge: 80000000-801fffff
>>>>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>>>>> + Memory behind bridge: ff000000-ff1fffff
>>>>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>>>>
>>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>>> that the bridge doesn't actually support?
>>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>>> v3.18.16 dmesg log, so we can compare them?
>>> I collected all 3 for you to compare them, compressed, attached.
>>>
>>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>>
>>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>>> the code to see what might be going on:
>>>>
>>>> acpi PNP0A08:00: host bridge window expanded to [mem
>>>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
>>>> ignored
>>>> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
>>>> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
>>>> 0xf0000000-0xfed8ffff window]
>>>>
>>>> Bjorn
>> Hi Bjorn and Boszormenyi,
>> From the 3.18 kernel, we got a message:
>> [ 0.126248] acpi PNP0A08:00: host bridge window
>> [0x400000000-0xfffffffff] (ignored, not CPU addressable)
>> And from 4.1.-rc8, we got another message:
>> [ 0.127051] acpi PNP0A08:00: host bridge window expanded to [mem
>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window] ignored
>>
>> That smells like a 32bit overflow or 64bit cut-off issue.
> Hi Bjorn and Boszormenyi,
> With v3.18.6, it uses u64 to compare resource ranges. We changed to use
> resource_size_t with recent changes, and resource_size_t
> may be u32 or u64 depending on configuration. So resource range
> [0x400000000-0xfffffffff] may have been cut-off as
> [0x00000000-0xffffffff], thus cause the trouble.
>
> Hi Boszormenyi,
> Could you please help to try following test patch?
> against v4.1-rc8?
I have tried it. The result (dmesg, lspci before/after modprobe) is attached.
The "not CPU addressable" message shows up once in dmesg.
The device shows up in lspci and the module can be loaded. The previously
experienced sluggishness is gone now, but the network doesn't work after modprobe.
I think it was an expected outcome, since that particular range is ignored with this patch.
Thanks,
Zoltán
> Thanks!
> Gerry
> -------------------------------------------------------------------
> diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
> index 8244f013f210..d7b8c392c420 100644
> --- a/drivers/acpi/resource.c
> +++ b/drivers/acpi/resource.c
> @@ -206,6 +206,11 @@ static bool acpi_decode_space(struct resource_win *win,
>
> res->start = attr->minimum;
> res->end = attr->maximum;
> + if (res->start != attr->minimum || res->end != attr->maximum) {
> + pr_warn("resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
> + attr->minimum, attr->maximum);
> + return false;
> + }
>
> /*
> * For bridges that translate addresses across the bridge,
> -----------------------------------------------------------------------------
>
[-- Attachment #2: dmesg-lspci-xx2.tgz --]
[-- Type: application/x-compressed-tar, Size: 21863 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: ACPI regression? Was Re: Ethernet chip disappeared from lspci
2015-06-21 18:55 ` Boszormenyi Zoltan
@ 2015-06-21 19:59 ` Boszormenyi Zoltan
2015-06-23 4:12 ` [Patch v1] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel Jiang Liu
0 siblings, 1 reply; 10+ messages in thread
From: Boszormenyi Zoltan @ 2015-06-21 19:59 UTC (permalink / raw)
To: Jiang Liu, Bjorn Helgaas
Cc: Andreas Mohr, Rafael J. Wysocki, Linux Kernel Mailing List,
ACPI Devel Maling List, linux-pci@vger.kernel.org
On 2015-06-21 20:55, Boszormenyi Zoltan wrote:
> On 2015-06-21 19:55, Jiang Liu wrote:
>> On 2015/6/22 1:25, Jiang Liu wrote:
>> [...]
>>>>>> - Memory behind bridge: 80000000-801fffff
>>>>>> - Prefetchable memory behind bridge: 0000000080200000-00000000803fffff
>>>>>> + Memory behind bridge: ff000000-ff1fffff
>>>>>> + Prefetchable memory behind bridge: 00000000ff200000-00000000ff3fffff
>>>>>>
>>>>>> Can't this cause a problem? E.g. programming the bridge with an address range
>>>>>> that the bridge doesn't actually support?
>>>>> This worked in v3.18.16, but not in v4.0.5 or v4.1.0-rc8. You
>>>>> attached a v4.1.0-rc8 dmesg log earlier. Would you mind collecting a
>>>>> v3.18.16 dmesg log, so we can compare them?
>>>> I collected all 3 for you to compare them, compressed, attached.
>>>>
>>>> BTW, I browsed git log and found 2ea3d266bab3b497238113b20136f7c3f69ad9c0
>>>> as suspicious. I will try the 4.0/4.1 kernels with this one reverted.
>>>>
>>>>> These (from the v4.1.0-rc8 dmesg) look wrong, but I'll have to look at
>>>>> the code to see what might be going on:
>>>>>
>>>>> acpi PNP0A08:00: host bridge window expanded to [mem
>>>>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window]
>>>>> ignored
>>>>> pci 0000:00:1c.1: can't claim BAR 15 [mem 0xfdf00000-0xfdffffff
>>>>> 64bit pref]: address conflict with PCI Bus 0000:00 [mem
>>>>> 0xf0000000-0xfed8ffff window]
>>>>>
>>>>> Bjorn
>>> Hi Bjorn and Boszormenyi,
>>> From the 3.18 kernel, we got a message:
>>> [ 0.126248] acpi PNP0A08:00: host bridge window
>>> [0x400000000-0xfffffffff] (ignored, not CPU addressable)
>>> And from 4.1.-rc8, we got another message:
>>> [ 0.127051] acpi PNP0A08:00: host bridge window expanded to [mem
>>> 0x00000000-0xffffffff window]; [mem 0x00000000-0xffffffff window] ignored
>>>
>>> That smells like a 32bit overflow or 64bit cut-off issue.
>> Hi Bjorn and Boszormenyi,
>> With v3.18.6, it uses u64 to compare resource ranges. We changed to use
>> resource_size_t with recent changes, and resource_size_t
>> may be u32 or u64 depending on configuration. So resource range
>> [0x400000000-0xfffffffff] may have been cut-off as
>> [0x00000000-0xffffffff], thus cause the trouble.
>>
>> Hi Boszormenyi,
>> Could you please help to try following test patch?
>> against v4.1-rc8?
> I have tried it. The result (dmesg, lspci before/after modprobe) is attached.
> The "not CPU addressable" message shows up once in dmesg.
> The device shows up in lspci and the module can be loaded. The previously
> experienced sluggishness is gone now, but the network doesn't work after modprobe.
> I think it was an expected outcome, since that particular range is ignored with this patch.
Hm, I can see a very similar message in 3.18.16, so it was not
the expected outcome.
After building the "official" r8168 from Realtek for 4.1.0-rc8,
the difference in lspci from the working 3.18.16 is nil, before
and after modprobe. (r8168 was built for 3.18.16, that's why.)
However, connman (similar to NetworkManager) still sees the network
connectivity as "down". I checked that the firmware files are there in
/lib/firmware/rtl_nic.
With r8168 (the "official" Realtek driver), the kernel message about
"link up" appears immediately and connman can configure the network.
I have tried the patch on 4.0.5, too, with the same result.
So there may be another problem in the r8169 driver itself besides
this ACPI problem, but no matter what I do, I can't seem to
enable debugging messages for r8169.
So, for now I can use r8168 instead of r8169 with this patch.
Thanks,
Zoltán
>
> Thanks,
> Zoltán
>
>> Thanks!
>> Gerry
>> -------------------------------------------------------------------
>> diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
>> index 8244f013f210..d7b8c392c420 100644
>> --- a/drivers/acpi/resource.c
>> +++ b/drivers/acpi/resource.c
>> @@ -206,6 +206,11 @@ static bool acpi_decode_space(struct resource_win *win,
>>
>> res->start = attr->minimum;
>> res->end = attr->maximum;
>> + if (res->start != attr->minimum || res->end != attr->maximum) {
>> + pr_warn("resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
>> + attr->minimum, attr->maximum);
>> + return false;
>> + }
>>
>> /*
>> * For bridges that translate addresses across the bridge,
>> -----------------------------------------------------------------------------
>>
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Patch v1] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel
2015-06-21 19:59 ` Boszormenyi Zoltan
@ 2015-06-23 4:12 ` Jiang Liu
2015-06-23 7:35 ` Ingo Molnar
0 siblings, 1 reply; 10+ messages in thread
From: Jiang Liu @ 2015-06-23 4:12 UTC (permalink / raw)
To: Rafael J . Wysocki, Bjorn Helgaas, Boszormenyi Zoltan, Len Brown
Cc: Jiang Liu, LKML, linux-pci, linux-acpi, x86 @ kernel . org
The data type resource_size_t may be 32 bits or 64 bits wide, depending
on CONFIG_PHYS_ADDR_T_64BIT. So reject ACPI resource descriptors that
would overflow resource_size_t on a 32-bit kernel.
This issue was triggered on a platform running a 32-bit kernel with an
ACPI resource descriptor with address range [0x400000000-0xfffffffff].
Please refer to https://lkml.org/lkml/2015/6/19/277 for more information.
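For readers who want to try the rejection logic outside the kernel, the following is a rough userspace sketch of the store-and-compare idea used by this patch. It is an illustration under assumptions, not the kernel implementation: narrow_res_t, struct narrow_win and decode_window() are hypothetical names, and narrow_res_t is fixed to 32 bits to mimic a kernel built without CONFIG_PHYS_ADDR_T_64BIT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit resource_size_t. */
typedef uint32_t narrow_res_t;

/* Hypothetical stand-in for the fields the patch fills in. */
struct narrow_win {
	narrow_res_t start;
	narrow_res_t end;
	narrow_res_t offset;
};

/*
 * Compute the window in 64 bits, store it into the narrow fields, then
 * compare the stored values with the 64-bit originals. Any mismatch
 * means the descriptor does not fit and the window is rejected.
 */
static bool decode_window(struct narrow_win *win, uint64_t minimum,
			  uint64_t maximum, uint64_t translation_offset)
{
	uint64_t start = minimum + translation_offset;
	uint64_t end = maximum + translation_offset;

	win->offset = (narrow_res_t)translation_offset;
	win->start = (narrow_res_t)start;
	win->end = (narrow_res_t)end;

	/* The sizeof test lets a 64-bit build skip the comparisons. */
	if (sizeof(narrow_res_t) < sizeof(uint64_t) &&
	    (translation_offset != win->offset ||
	     start != win->start || end != win->end))
		return false;	/* not representable, ignore the window */

	return true;
}
```

With these assumptions, decode_window() returns false for [0x400000000-0xfffffffff] and true for a window such as [0xf0000000-0xfed8ffff], which fits in 32 bits.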
Reported-by: Boszormenyi Zoltan <zboszor@pr.hu>
Fixes: 593669c2ac0f ("x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation")
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: stable@vger.kernel.org # 4.0
---
drivers/acpi/resource.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/resource.c b/drivers/acpi/resource.c
index 8244f013f210..f1c966e05078 100644
--- a/drivers/acpi/resource.c
+++ b/drivers/acpi/resource.c
@@ -193,6 +193,7 @@ static bool acpi_decode_space(struct resource_win *win,
u8 iodec = attr->granularity == 0xfff ? ACPI_DECODE_10 : ACPI_DECODE_16;
bool wp = addr->info.mem.write_protect;
u64 len = attr->address_length;
+ u64 start, end, offset = 0;
struct resource *res = &win->res;
/*
@@ -204,9 +205,6 @@ static bool acpi_decode_space(struct resource_win *win,
pr_debug("ACPI: Invalid address space min_addr_fix %d, max_addr_fix %d, len %llx\n",
addr->min_address_fixed, addr->max_address_fixed, len);
- res->start = attr->minimum;
- res->end = attr->maximum;
-
/*
* For bridges that translate addresses across the bridge,
* translation_offset is the offset that must be added to the
@@ -214,12 +212,22 @@ static bool acpi_decode_space(struct resource_win *win,
* primary side. Non-bridge devices must list 0 for all Address
* Translation offset bits.
*/
- if (addr->producer_consumer == ACPI_PRODUCER) {
- res->start += attr->translation_offset;
- res->end += attr->translation_offset;
- } else if (attr->translation_offset) {
+ if (addr->producer_consumer == ACPI_PRODUCER)
+ offset = attr->translation_offset;
+ else if (attr->translation_offset)
pr_debug("ACPI: translation_offset(%lld) is invalid for non-bridge device.\n",
attr->translation_offset);
+ start = attr->minimum + offset;
+ end = attr->maximum + offset;
+
+ win->offset = offset;
+ res->start = start;
+ res->end = end;
+ if (sizeof(resource_size_t) < sizeof(u64) &&
+ (offset != win->offset || start != res->start || end != res->end)) {
+ pr_warn("acpi resource window ([%#llx-%#llx] ignored, not CPU addressable)\n",
+ attr->minimum, attr->maximum);
+ return false;
}
switch (addr->resource_type) {
@@ -236,8 +244,6 @@ static bool acpi_decode_space(struct resource_win *win,
return false;
}
- win->offset = attr->translation_offset;
-
if (addr->producer_consumer == ACPI_PRODUCER)
res->flags |= IORESOURCE_WINDOW;
--
1.7.10.4
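The check added by the patch can be illustrated outside the kernel. The following is only a userspace sketch, not the kernel code: `narrow_size_t`, `struct window` and `decode_window()` are hypothetical names, and `narrow_size_t` stands in for resource_size_t on a kernel built without CONFIG_PHYS_ADDR_T_64BIT. It stores 64-bit bounds into the narrower type and rejects the window if reading them back gives different values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for resource_size_t on a 32-bit kernel
 * (assumption: CONFIG_PHYS_ADDR_T_64BIT is not set). */
typedef uint32_t narrow_size_t;

/* Hypothetical, simplified counterpart of the resource window. */
struct window {
	narrow_size_t start;
	narrow_size_t end;
	narrow_size_t offset;
};

/*
 * Compute the translated range in 64 bits, store it into the narrow
 * fields, then compare the stored values with the 64-bit originals.
 * Any mismatch means the store truncated the address, so the window
 * is not addressable with this type and is rejected.
 */
bool decode_window(struct window *w, uint64_t min, uint64_t max,
		   uint64_t offset)
{
	uint64_t start = min + offset;
	uint64_t end = max + offset;

	w->offset = (narrow_size_t)offset;
	w->start = (narrow_size_t)start;
	w->end = (narrow_size_t)end;

	/* The narrow fields are promoted back to 64 bits for comparison. */
	return w->offset == offset && w->start == start && w->end == end;
}
```

With the range from the report, `decode_window(&w, 0x400000000ULL, 0xfffffffffULL, 0)` returns false and leaves `w` holding [0x0-0xffffffff], which shows how the truncated window would otherwise have appeared to cover the entire 32-bit address space. A range that fits, such as [0x80000000-0x801fffff], is stored unchanged and accepted.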
* Re: [Patch v1] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel
2015-06-23 4:12 ` [Patch v1] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel Jiang Liu
@ 2015-06-23 7:35 ` Ingo Molnar
0 siblings, 0 replies; 10+ messages in thread
From: Ingo Molnar @ 2015-06-23 7:35 UTC (permalink / raw)
To: Jiang Liu
Cc: Rafael J . Wysocki, Bjorn Helgaas, Boszormenyi Zoltan, Len Brown,
LKML, linux-pci, linux-acpi, x86 @ kernel . org
* Jiang Liu <jiang.liu@linux.intel.com> wrote:
> The data type resource_size_t may be 32 bits or 64 bits depending on
> CONFIG_PHYS_ADDR_T_64BIT. So reject ACPI resource descriptors which
> will cause resource_size_t overflow with 32bit kernel
>
> This issue was triggered on a platform running 32bit kernel with an
> ACPI resource descriptor with address range [0x400000000-0xfffffffff].
> Please refer to https://lkml.org/lkml/2015/6/19/277 for more information.
>
> Reported-by: Boszormenyi Zoltan <zboszor@pr.hu>
> Fixes: 593669c2ac0f ("x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation")
> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
> Cc: stable@vger.kernel.org # 4.0
Yeah, so please use the customary changelog style we use in the kernel:
" Current code does (A), this causes problem (B) when doing (C).
In that case the user notices (D).
We can improve this doing (E), because now the user will experience (F),
which is more desirable."
Please fill in A-F accordingly.
In particular your changelog is missing 'B' and 'D': what exactly is a
'resource_size_t overflow' and what does the user notice from it?
Your changelog is also missing 'F'.
Thanks,
Ingo
Thread overview: 10+ messages
[not found] <55841815.5000701@pr.hu>
[not found] ` <558419B2.7010703@pr.hu>
[not found] ` <55841D48.8080809@pr.hu>
[not found] ` <12950452.K8inU2UIYe@vostro.rjw.lan>
[not found] ` <55869329.4040908@pr.hu>
2015-06-21 14:03 ` ACPI regression? Was Re: Ethernet chip disappeared from lspci Bjorn Helgaas
2015-06-21 14:19 ` Boszormenyi Zoltan
2015-06-21 15:37 ` Boszormenyi Zoltan
2015-06-21 17:25 ` Jiang Liu
2015-06-21 17:55 ` Jiang Liu
2015-06-21 18:55 ` Boszormenyi Zoltan
2015-06-21 19:59 ` Boszormenyi Zoltan
2015-06-23 4:12 ` [Patch v1] PCI, ACPI: Fix regressions caused by resource_size_t overflow with 32bit kernel Jiang Liu
2015-06-23 7:35 ` Ingo Molnar
2015-06-21 18:28 ` ACPI regression? Was Re: Ethernet chip disappeared from lspci Boszormenyi Zoltan