* Re: Acpi deadlocks with 3.7.0-rc4 [not found] <50A513A8.9010404@redhat.com> @ 2012-11-28 16:01 ` Linus Torvalds 2012-11-28 16:21 ` Zdenek Kabelac 0 siblings, 1 reply; 14+ messages in thread From: Linus Torvalds @ 2012-11-28 16:01 UTC (permalink / raw) To: Zdenek Kabelac, Len Brown, Rafael J. Wysocki, Linux ACPI; +Cc: LKML Adding more people (and the acpi list) to this report. I'm seeing *very* few changes to the core suspend/resume path in 3.7, and while there are some acpia updates, they seem to be pretty mild too. I think the acpi_os_wait_semaphore thing is a red herring - that's just stale on the stack. Do you have the register state from the oops? Or at least the "Code:" line? It would be nice to see exactly where the oops happens, and I cannot line up your "acpi_ns_lookup + 0xa1/0x5b9" with any code due to different compilers (and configurations etc). Linus On Thu, Nov 15, 2012 at 8:09 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: > Hello > > > I've already seen twice this oops after resuming my Lenovo T61 in docking > station. > > Since for some reason currently the serial line doesn't work correctly after > resume > (while I'm pretty sure it used to work in past) here is at least > hand-written oops > message from mobile camera picture. > > From the trace it seem os_wait semaphore is accessed twice. > Unsure which device is behind it - but it seem docking station is need to > hit this issue. > > > kernel 3.7.0-rc4 > > Pid: pm-suspend > > RIP: acpi_ns_lookup + 0xa1/0x5b9 > > Call Trace: > > ? acpi_os_wait_semaphore + 0x136/0x149 > acpi_ns_get_mode + 0x96/0x102 > ? __lock_is_held +0x5f/0x90 > acpi_ns_evaluate +0x47/0x2de > ? _raw_spin_lock_irqsave > ? acpi_ut_evaluate_object > ? sub_preempt_count > ? pnpacpi_can_wakeup > acpi_rs_get_method_data > ? acpi_os_signal_semaphore > acpi_walk_resources > ? acpi_ut_release_mutex > pnpacpi_build_resource_template > ? acpi_bus_get_device > pnpacpi_set_resources > ? pnp_device_shutdown > pnp_start_dev > pnp_bus_resume > dpm_run_callback > device_resume > dpm_resume > dpm_resume_end > ? acpi_suspend_begin_old > suspend_devices_and_enter > pm_suspend > state_store > kobj_attr_store > sysfs_write_file > vgs_write > sys_write > system_call_fastpath > > Zdenek > > > PS: jpg on request > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 16:01 ` Acpi deadlocks with 3.7.0-rc4 Linus Torvalds @ 2012-11-28 16:21 ` Zdenek Kabelac 2012-11-28 17:02 ` Linus Torvalds 2012-11-28 18:35 ` Rafael J. Wysocki 0 siblings, 2 replies; 14+ messages in thread From: Zdenek Kabelac @ 2012-11-28 16:21 UTC (permalink / raw) To: Linus Torvalds; +Cc: Len Brown, Rafael J. Wysocki, Linux ACPI, LKML Dne 28.11.2012 17:01, Linus Torvalds napsal(a): > Adding more people (and the acpi list) to this report. > > I'm seeing *very* few changes to the core suspend/resume path in 3.7, > and while there are some acpia updates, they seem to be pretty mild > too. > > I think the acpi_os_wait_semaphore thing is a red herring - that's > just stale on the stack. > > Do you have the register state from the oops? Or at least the "Code:" > line? It would be nice to see exactly where the oops happens, and I > cannot line up your "acpi_ns_lookup + 0xa1/0x5b9" with any code due > to different compilers (and configurations etc). > > Linus > I've opened https://bugzilla.kernel.org/show_bug.cgi?id=51071 and attached picture there which is all I have. I'll try to decode exact code line. In all cases - I've played even with 3.4 kernel - which also does not survive multiple resumes in docking station - though it just leaves black screen - so this oops is rather 'progress' and it also could be false path. It's probably not a regression from 3.6 - since this problem was there for much longer - but now it has just become much more visible. As I usually do not have reason to make multiple suspend/resumes in dock I've not noticed it until now when I've tried to capture this ACPI trace. Zdenek ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 16:21 ` Zdenek Kabelac @ 2012-11-28 17:02 ` Linus Torvalds 2012-11-28 17:27 ` Zdenek Kabelac 2012-11-28 18:35 ` Rafael J. Wysocki 1 sibling, 1 reply; 14+ messages in thread From: Linus Torvalds @ 2012-11-28 17:02 UTC (permalink / raw) To: Zdenek Kabelac; +Cc: Len Brown, Rafael J. Wysocki, Linux ACPI, LKML On Wed, Nov 28, 2012 at 8:21 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: > > I've opened https://bugzilla.kernel.org/show_bug.cgi?id=51071 > and attached picture there which is all I have. > > I'll try to decode exact code line. Uhhuh. It's missing much of the relevant parts of the code line, in particular the actual oopsing instruction. But what is there decodes to 41 b8 10 00 00 00 mov $0x10,%r8d 48 c7 c1 88 52 64 81 mov $0xffffffff81645288,%rcx 31 c0 xor %eax,%eax 48 c7 c2 98 52 64 81 mov $0xffffffff81645298,%rdx bf 00 04 00 0. mov $0x0.00400,%edi .. oops in here .. 74 33 je 0x50 48 89 df mov %rbx,%rdi e8 4d c9 00 00 callq ? <offset 0xc972> 48 89 d9 mov %rbx,%rcx 48 c7 c2 0a .. .. .. mov $0x......0a,%rdx which isn't really very obvious. Do you have that kernel around (or at least the same compiler and configuration)? Doing a objdump --disassemble drivers/acpi/acpica/nsaccess.o might help pinpoint where that is.. > It's probably not a regression from 3.6 - since this problem was there for > much longer - but now it has just become much more visible. Ok. Linus ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 17:02 ` Linus Torvalds @ 2012-11-28 17:27 ` Zdenek Kabelac 2012-11-28 19:07 ` Linus Torvalds 2012-11-28 20:31 ` Rafael J. Wysocki 0 siblings, 2 replies; 14+ messages in thread From: Zdenek Kabelac @ 2012-11-28 17:27 UTC (permalink / raw) To: Linus Torvalds; +Cc: Len Brown, Rafael J. Wysocki, Linux ACPI, LKML Dne 28.11.2012 18:02, Linus Torvalds napsal(a): > On Wed, Nov 28, 2012 at 8:21 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: >> >> I've opened https://bugzilla.kernel.org/show_bug.cgi?id=51071 >> and attached picture there which is all I have. >> >> I'll try to decode exact code line. > > Uhhuh. It's missing much of the relevant parts of the code line, in > particular the actual oopsing instruction. But what is there decodes > to > > 41 b8 10 00 00 00 mov $0x10,%r8d > 48 c7 c1 88 52 64 81 mov $0xffffffff81645288,%rcx > 31 c0 xor %eax,%eax > 48 c7 c2 98 52 64 81 mov $0xffffffff81645298,%rdx > bf 00 04 00 0. mov $0x0.00400,%edi > > .. oops in here .. > > 74 33 je 0x50 > 48 89 df mov %rbx,%rdi > e8 4d c9 00 00 callq ? <offset 0xc972> > 48 89 d9 mov %rbx,%rcx > 48 c7 c2 0a .. .. .. mov $0x......0a,%rdx > > which isn't really very obvious. Do you have that kernel around (or at > least the same compiler and configuration)? Doing a > > objdump --disassemble drivers/acpi/acpica/nsaccess.o I've attached bigger disasfun script output to BZ 51071. https://bugzilla.kernel.org/show_bug.cgi?id=51071#c1 if (ACPI_GET_DESCRIPTOR_TYPE(prefix_node) != 00000000000000a1 <acpi_ns_lookup+0xa1> cmpb $0xf,0x8(%rbx) 00000000000000a5 <acpi_ns_lookup+0xa5> je 0da <acpi_ns_lookup+0xda> seems to be going out of bounds. Zdenek ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 17:27 ` Zdenek Kabelac @ 2012-11-28 19:07 ` Linus Torvalds 2012-11-28 20:05 ` Rafael J. Wysocki 2012-11-29 10:13 ` Zdenek Kabelac 2012-11-28 20:31 ` Rafael J. Wysocki 1 sibling, 2 replies; 14+ messages in thread From: Linus Torvalds @ 2012-11-28 19:07 UTC (permalink / raw) To: Zdenek Kabelac; +Cc: Len Brown, Rafael J. Wysocki, Linux ACPI, LKML [-- Attachment #1: Type: text/plain, Size: 913 bytes --] On Wed, Nov 28, 2012 at 9:27 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: > > I've attached bigger disasfun script output to BZ 51071. > https://bugzilla.kernel.org/show_bug.cgi?id=51071#c1 > > > if (ACPI_GET_DESCRIPTOR_TYPE(prefix_node) != > 00000000000000a1 <acpi_ns_lookup+0xa1> cmpb $0xf,0x8(%rbx) > 00000000000000a5 <acpi_ns_lookup+0xa5> je 0da <acpi_ns_lookup+0xda> > > seems to be going out of bounds. The whole "prefix_node" pointer is bogus. It seems to have the value 0x1000. I wonder how that happened. It's loaded from 'scope_info->scope.node', and it *should* be a valid pointer. Can you add a print-out of scope_info->common.descriptor_type and check that it is ACPI_DESC_TYPE_STATE_WSCOPE (== 8). If it is not, return early. Or just something like the attatched, which just uses the root node (and warns once) if it's not a valid WSCOPE thing. Linus [-- Attachment #2: patch.diff --] [-- Type: application/octet-stream, Size: 638 bytes --] drivers/acpi/acpica/nsaccess.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c index 23db53ce2293..3238cabba61a 100644 --- a/drivers/acpi/acpica/nsaccess.c +++ b/drivers/acpi/acpica/nsaccess.c @@ -320,6 +320,8 @@ acpi_ns_lookup(union acpi_generic_state *scope_info, acpi_gbl_root_node)); prefix_node = acpi_gbl_root_node; + } else if (WARN_ON_ONCE(scope_info->common.descriptor_type != ACPI_DESC_TYPE_STATE_WSCOPE)) { + prefix_node = acpi_gbl_root_node; } else { prefix_node = scope_info->scope.node; if (ACPI_GET_DESCRIPTOR_TYPE(prefix_node) != ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 19:07 ` Linus Torvalds @ 2012-11-28 20:05 ` Rafael J. Wysocki 2012-11-29 10:13 ` Zdenek Kabelac 1 sibling, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2012-11-28 20:05 UTC (permalink / raw) To: linux-acpi; +Cc: Linus Torvalds, Zdenek Kabelac, Len Brown, LKML On Wednesday, November 28, 2012 11:07:32 AM Linus Torvalds wrote: > On Wed, Nov 28, 2012 at 9:27 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: > > > > I've attached bigger disasfun script output to BZ 51071. > > https://bugzilla.kernel.org/show_bug.cgi?id=51071#c1 > > > > > > if (ACPI_GET_DESCRIPTOR_TYPE(prefix_node) != > > 00000000000000a1 <acpi_ns_lookup+0xa1> cmpb $0xf,0x8(%rbx) > > 00000000000000a5 <acpi_ns_lookup+0xa5> je 0da <acpi_ns_lookup+0xda> > > > > seems to be going out of bounds. > > The whole "prefix_node" pointer is bogus. It seems to have the value 0x1000. > > I wonder how that happened. It's loaded from 'scope_info->scope.node', > and it *should* be a valid pointer. Well, suppose that pnpacpi_build_resource_template() passes a handle that's not a valid pointer to acpi_walk_resources(). What happens then is that it is passed directly to acpi_rs_get_method_data() and from there to acpi_ut_evaluate_object() - without validation (acpi_rs_get_method_data() even has a comment about the parameters validity guaranteed by the caller, heh, heh). Then it becomes the prefix_node and is written into info->prefix_node. acpi_ns_evaluate() takes that and passes it to acpi_ns_get_node() along with info->pathname that is just the name of the method to evaluate, which is a valid string, so the "if (!pathname)" block in acpi_ns_get_node() is not executed and we get scope_info.scope.node = prefix_node, which is our bad pointer. A pointer to that scope_info is passed to acpi_ns_lookup() and we get the above. So the code in pnpacpi_build_resource_template() is at fault by passing a wrong pointer to acpi_walk_resources(). And the pointer is wrong probably because the struct acpi_device pointed to by dev->data in there has been removed during a previous suspend or resume (I'm not sure which one does that), but the PNP layer has no idea about that. And that bug has been there for quite a while (like forever?). Thanks, Rafael -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 19:07 ` Linus Torvalds 2012-11-28 20:05 ` Rafael J. Wysocki @ 2012-11-29 10:13 ` Zdenek Kabelac 2012-11-29 10:59 ` Rafael J. Wysocki 1 sibling, 1 reply; 14+ messages in thread From: Zdenek Kabelac @ 2012-11-29 10:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: Len Brown, Rafael J. Wysocki, Linux ACPI, LKML Dne 28.11.2012 20:07, Linus Torvalds napsal(a): > whole "prefix_node" pointer is bogus. It seems to have the value 0x1000. Tested also this patch with this result: https://bugzilla.kernel.org/show_bug.cgi?id=51071#c8 So while it's made it pass suspend/resume, it's not really usable as docking then. Zdenek ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-29 10:13 ` Zdenek Kabelac @ 2012-11-29 10:59 ` Rafael J. Wysocki 2012-11-29 12:26 ` Zdenek Kabelac 0 siblings, 1 reply; 14+ messages in thread From: Rafael J. Wysocki @ 2012-11-29 10:59 UTC (permalink / raw) To: Zdenek Kabelac; +Cc: Linus Torvalds, Len Brown, Linux ACPI, LKML On Thursday, November 29, 2012 11:13:10 AM Zdenek Kabelac wrote: > Dne 28.11.2012 20:07, Linus Torvalds napsal(a): > > whole "prefix_node" pointer is bogus. It seems to have the value 0x1000. > > Tested also this patch with this result: > > https://bugzilla.kernel.org/show_bug.cgi?id=51071#c8 > > So while it's made it pass suspend/resume, it's not really usable > as docking then. This just makes acpi_ns_lookup() always return acpi_gbl_root_node for things looked up by acpi_ns_get_node() as far as I can say. Hmm. If my theory correct, the patch below should catch the bug. Can you please test it? Rafael --- drivers/acpi/scan.c | 1 + drivers/pnp/pnpacpi/rsparser.c | 3 +++ 2 files changed, 4 insertions(+) Index: linux/drivers/acpi/scan.c =================================================================== --- linux.orig/drivers/acpi/scan.c +++ linux/drivers/acpi/scan.c @@ -704,6 +704,7 @@ static void acpi_device_unregister(struc mutex_unlock(&acpi_device_lock); acpi_detach_data(device->handle, acpi_bus_data_handler); + device->handle = ERR_PTR(-ENODEV); acpi_device_remove_files(device); device_unregister(&device->dev); Index: linux/drivers/pnp/pnpacpi/rsparser.c =================================================================== --- linux.orig/drivers/pnp/pnpacpi/rsparser.c +++ linux/drivers/pnp/pnpacpi/rsparser.c @@ -611,6 +611,9 @@ int pnpacpi_build_resource_template(stru int res_cnt = 0; acpi_status status; + if (WARN_ON_ONCE(IS_ERR(handle))) + return PTR_ERR(handle); + status = acpi_walk_resources(handle, METHOD_NAME__CRS, pnpacpi_count_resources, &res_cnt); if (ACPI_FAILURE(status)) { -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-29 10:59 ` Rafael J. Wysocki @ 2012-11-29 12:26 ` Zdenek Kabelac 2012-11-29 16:59 ` Rafael J. Wysocki 0 siblings, 1 reply; 14+ messages in thread From: Zdenek Kabelac @ 2012-11-29 12:26 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linus Torvalds, Len Brown, Linux ACPI, LKML Dne 29.11.2012 11:59, Rafael J. Wysocki napsal(a): > On Thursday, November 29, 2012 11:13:10 AM Zdenek Kabelac wrote: >> Dne 28.11.2012 20:07, Linus Torvalds napsal(a): >>> whole "prefix_node" pointer is bogus. It seems to have the value 0x1000. >> >> Tested also this patch with this result: >> >> https://bugzilla.kernel.org/show_bug.cgi?id=51071#c8 >> >> So while it's made it pass suspend/resume, it's not really usable >> as docking then. > > This just makes acpi_ns_lookup() always return acpi_gbl_root_node > for things looked up by acpi_ns_get_node() as far as I can say. > > Hmm. > > If my theory correct, the patch below should catch the bug. Can you please > test it? > Ok now crashing right after 'undock' button press: https://bugzilla.kernel.org/show_bug.cgi?id=51071#c10 Zdenek ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-29 12:26 ` Zdenek Kabelac @ 2012-11-29 16:59 ` Rafael J. Wysocki 0 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2012-11-29 16:59 UTC (permalink / raw) To: Zdenek Kabelac; +Cc: Linus Torvalds, Len Brown, Linux ACPI, LKML On Thursday, November 29, 2012 01:26:48 PM Zdenek Kabelac wrote: > Dne 29.11.2012 11:59, Rafael J. Wysocki napsal(a): > > On Thursday, November 29, 2012 11:13:10 AM Zdenek Kabelac wrote: > >> Dne 28.11.2012 20:07, Linus Torvalds napsal(a): > >>> whole "prefix_node" pointer is bogus. It seems to have the value 0x1000. > >> > >> Tested also this patch with this result: > >> > >> https://bugzilla.kernel.org/show_bug.cgi?id=51071#c8 > >> > >> So while it's made it pass suspend/resume, it's not really usable > >> as docking then. > > > > This just makes acpi_ns_lookup() always return acpi_gbl_root_node > > for things looked up by acpi_ns_get_node() as far as I can say. > > > > Hmm. > > > > If my theory correct, the patch below should catch the bug. Can you please > > test it? > > > > Ok now crashing right after 'undock' button press: > > https://bugzilla.kernel.org/show_bug.cgi?id=51071#c10 And that's because I made a mistake in the patch. We're currently testing the appended one and it's showing that the added WARN_ON_ONCE() actually triggers, so the theory appears to be correct. I think we can debug this further in the Bugzilla. Thanks, Rafael --- drivers/acpi/scan.c | 2 ++ drivers/pnp/pnpacpi/rsparser.c | 3 +++ 2 files changed, 5 insertions(+) Index: linux/drivers/acpi/scan.c =================================================================== --- linux.orig/drivers/acpi/scan.c +++ linux/drivers/acpi/scan.c @@ -707,6 +707,8 @@ static void acpi_device_unregister(struc acpi_device_remove_files(device); device_unregister(&device->dev); + + device->handle = ERR_PTR(-ENODEV); } /* -------------------------------------------------------------------------- Index: linux/drivers/pnp/pnpacpi/rsparser.c =================================================================== --- linux.orig/drivers/pnp/pnpacpi/rsparser.c +++ linux/drivers/pnp/pnpacpi/rsparser.c @@ -611,6 +611,9 @@ int pnpacpi_build_resource_template(stru int res_cnt = 0; acpi_status status; + if (WARN_ON_ONCE(IS_ERR(handle))) + return PTR_ERR(handle); + status = acpi_walk_resources(handle, METHOD_NAME__CRS, pnpacpi_count_resources, &res_cnt); if (ACPI_FAILURE(status)) { -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 17:27 ` Zdenek Kabelac 2012-11-28 19:07 ` Linus Torvalds @ 2012-11-28 20:31 ` Rafael J. Wysocki 2012-11-29 9:03 ` Zdenek Kabelac 1 sibling, 1 reply; 14+ messages in thread From: Rafael J. Wysocki @ 2012-11-28 20:31 UTC (permalink / raw) To: Zdenek Kabelac; +Cc: linux-acpi, Linus Torvalds, Len Brown, LKML On Wednesday, November 28, 2012 06:27:50 PM Zdenek Kabelac wrote: > Dne 28.11.2012 18:02, Linus Torvalds napsal(a): > > On Wed, Nov 28, 2012 at 8:21 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: > >> > >> I've opened https://bugzilla.kernel.org/show_bug.cgi?id=51071 > >> and attached picture there which is all I have. I wonder if you can try to apply the patch below and see if that makes any difference? Rafael --- drivers/pnp/pnpacpi/rsparser.c | 8 ++++++++ 1 file changed, 8 insertions(+) Index: linux/drivers/pnp/pnpacpi/rsparser.c =================================================================== --- linux.orig/drivers/pnp/pnpacpi/rsparser.c +++ linux/drivers/pnp/pnpacpi/rsparser.c @@ -610,6 +610,14 @@ int pnpacpi_build_resource_template(stru struct acpi_resource *resource; int res_cnt = 0; acpi_status status; + int ret; + + /* Sanity check. */ + ret = acpi_bus_get_device(handle, &acpi_dev); + if (ret) { + dev_err(&dev->dev, "ACPI node is invalid in %s\n", __func__); + return ret; + } status = acpi_walk_resources(handle, METHOD_NAME__CRS, pnpacpi_count_resources, &res_cnt); -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 20:31 ` Rafael J. Wysocki @ 2012-11-29 9:03 ` Zdenek Kabelac 2012-11-29 10:09 ` Rafael J. Wysocki 0 siblings, 1 reply; 14+ messages in thread From: Zdenek Kabelac @ 2012-11-29 9:03 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: linux-acpi, Linus Torvalds, Len Brown, LKML Dne 28.11.2012 21:31, Rafael J. Wysocki napsal(a): > On Wednesday, November 28, 2012 06:27:50 PM Zdenek Kabelac wrote: >> Dne 28.11.2012 18:02, Linus Torvalds napsal(a): >>> On Wed, Nov 28, 2012 at 8:21 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: >>>> >>>> I've opened https://bugzilla.kernel.org/show_bug.cgi?id=51071 >>>> and attached picture there which is all I have. > > I wonder if you can try to apply the patch below and see if that makes any > difference? > Yep - extended BZ with 2 new pictures - it's now crashing earlier within the call of acpi_bus_get_device(). Zdenek ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-29 9:03 ` Zdenek Kabelac @ 2012-11-29 10:09 ` Rafael J. Wysocki 0 siblings, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2012-11-29 10:09 UTC (permalink / raw) To: linux-acpi; +Cc: Zdenek Kabelac, Linus Torvalds, Len Brown, LKML On Thursday, November 29, 2012 10:03:53 AM Zdenek Kabelac wrote: > Dne 28.11.2012 21:31, Rafael J. Wysocki napsal(a): > > On Wednesday, November 28, 2012 06:27:50 PM Zdenek Kabelac wrote: > >> Dne 28.11.2012 18:02, Linus Torvalds napsal(a): > >>> On Wed, Nov 28, 2012 at 8:21 AM, Zdenek Kabelac <zkabelac@redhat.com> wrote: > >>>> > >>>> I've opened https://bugzilla.kernel.org/show_bug.cgi?id=51071 > >>>> and attached picture there which is all I have. > > > > I wonder if you can try to apply the patch below and see if that makes any > > difference? > > > > Yep - extended BZ with 2 new pictures - it's now crashing earlier within > the call of acpi_bus_get_device(). Pretty much as expected. So this is the problem I was thinking it was: the previous suspend-resume cycle unregistered the struct acpi_device pointed to by dev->data in pnpacpi_build_resource_template(), so the handle pointer in there is now invalid. Thanks, Rafael -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: Acpi deadlocks with 3.7.0-rc4 2012-11-28 16:21 ` Zdenek Kabelac 2012-11-28 17:02 ` Linus Torvalds @ 2012-11-28 18:35 ` Rafael J. Wysocki 1 sibling, 0 replies; 14+ messages in thread From: Rafael J. Wysocki @ 2012-11-28 18:35 UTC (permalink / raw) To: Zdenek Kabelac; +Cc: Linus Torvalds, Len Brown, Linux ACPI, LKML On Wednesday, November 28, 2012 05:21:24 PM Zdenek Kabelac wrote: > Dne 28.11.2012 17:01, Linus Torvalds napsal(a): > > Adding more people (and the acpi list) to this report. > > > > I'm seeing *very* few changes to the core suspend/resume path in 3.7, > > and while there are some acpia updates, they seem to be pretty mild > > too. > > > > I think the acpi_os_wait_semaphore thing is a red herring - that's > > just stale on the stack. > > > > Do you have the register state from the oops? Or at least the "Code:" > > line? It would be nice to see exactly where the oops happens, and I > > cannot line up your "acpi_ns_lookup + 0xa1/0x5b9" with any code due > > to different compilers (and configurations etc). > > > > Linus > > > > > I've opened https://bugzilla.kernel.org/show_bug.cgi?id=51071 > and attached picture there which is all I have. > > I'll try to decode exact code line. > > > In all cases - I've played even with 3.4 kernel - which also does not survive > multiple resumes in docking station - though it just leaves black screen - so > this oops is rather 'progress' and it also could be false path. > > It's probably not a regression from 3.6 - since this problem was there for > much longer - but now it has just become much more visible. > > As I usually do not have reason to make multiple suspend/resumes in dock I've > not noticed it until now when I've tried to capture this ACPI trace. Well, if ACPI PNP devices are involved, and it looks like they are, that definitely is not a new problem. We're now in the process of reworking this area quite a bit and we'll do our best to address this issue too in the process. Thanks, Rafael -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2012-11-29 16:59 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <50A513A8.9010404@redhat.com>
2012-11-28 16:01 ` Acpi deadlocks with 3.7.0-rc4 Linus Torvalds
2012-11-28 16:21 ` Zdenek Kabelac
2012-11-28 17:02 ` Linus Torvalds
2012-11-28 17:27 ` Zdenek Kabelac
2012-11-28 19:07 ` Linus Torvalds
2012-11-28 20:05 ` Rafael J. Wysocki
2012-11-29 10:13 ` Zdenek Kabelac
2012-11-29 10:59 ` Rafael J. Wysocki
2012-11-29 12:26 ` Zdenek Kabelac
2012-11-29 16:59 ` Rafael J. Wysocki
2012-11-28 20:31 ` Rafael J. Wysocki
2012-11-29 9:03 ` Zdenek Kabelac
2012-11-29 10:09 ` Rafael J. Wysocki
2012-11-28 18:35 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox