All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thierry Reding <thierry.reding@gmail.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Wonhong Kwon <wonhong.kwon@lge.com>
Cc: Len Brown <len.brown@intel.com>, Pavel Machek <pavel@ucw.cz>,
	linux-next@vger.kernel.org, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Runtime PM causes oops on next-20151015
Date: Thu, 15 Oct 2015 11:22:13 +0200	[thread overview]
Message-ID: <20151015092213.GA26614@ulmo.nvidia.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 4279 bytes --]

Hi Rafael, Wonhong,

Todays linux-next breaks rather spectacularly for drivers using runtime
PM. The culprit seems to be this commit:

	commit 7d24068e144adc03b805806645d732cf79488717
	Author: Wonhong Kwon <wonhongkwon@gmail.com>
	Date:   Tue Oct 6 10:10:20 2015 +0900

	    PM / hibernate: Move pm_init/pm_disk_init to late_initcall_sync

	    pm_init is being invoked by core_initcall and hibernate_image_size_init
	    calculates preferred image size (image_size) based on total pages
	    (totalram_pages). This totalram_pages can be modified during various
	    initcall-s phase and this can cause miscalculated image_size.

	    For example, when CMA is being used, init_cma_reserved_pageblock tries
	    to change the totalram_pages and this job is done during core_initcall.
	    In order words, the totalram_pages doesn't take CMA reserved pages into
	    account when image_size is calculated and it can be too small.

	    Move pm_init and pm_disk_init to late_initcall_sync so that it happens
	    after all other initcall-s change the totalram_pages.

	    Reported-by: Sangseok Lee <sangseok.lee@lge.com>
	    Signed-off-by: Wonhong Kwon <wonhong.kwon@lge.com>
	    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I can't reply to it directly because I don't have it in any of my mail
boxes (it seems to have been sent only to the linux-pm mailing list,
even Google finds only a single match).

Here's an extract of the oops:

	[    1.395928] Unable to handle kernel NULL pointer dereference at virtual address 00000100
	[    1.404013] pgd = ffffffc000e0e000
	[    1.407417] [00000100] *pgd=000000013c007003, *pud=000000013c007003, *pmd=000000013c008003, *pte=0060000050041707
	[    1.417746] Internal error: Oops: 96000005 [#1] PREEMPT SMP
	[    1.423316] Modules linked in:
	[    1.426400] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.3.0-rc5-next-20151015+ #1338
	[    1.434138] Hardware name: NVIDIA Tegra210 P2371 (P2180/P2597) reference board (DT)
	[    1.441789] task: ffffffc0bc0a8000 ti: ffffffc0bc084000 task.ti: ffffffc0bc084000
	[    1.449280] PC is at __queue_work+0x2c/0x240
	[    1.453551] LR is at queue_work_on+0x60/0x78
	...
	[    1.836517] Call trace:
	[    1.838968] [<ffffffc0000b4134>] __queue_work+0x2c/0x240
	[    1.844280] [<ffffffc0000b43a4>] queue_work_on+0x5c/0x78
	[    1.849599] [<ffffffc00052e508>] rpm_idle+0xc0/0x140
	[    1.854565] [<ffffffc00052e5dc>] __pm_runtime_idle+0x54/0x98
	[    1.860229] [<ffffffc00052414c>] driver_probe_device+0x164/0x2f8
	[    1.866236] [<ffffffc000524378>] __driver_attach+0x98/0xa0
	[    1.871724] [<ffffffc00052231c>] bus_for_each_dev+0x5c/0xa0
	[    1.877294] [<ffffffc000523ab4>] driver_attach+0x1c/0x28
	[    1.882608] [<ffffffc0005236e4>] bus_add_driver+0x1cc/0x238
	[    1.888180] [<ffffffc000524b1c>] driver_register+0x5c/0xf8
	[    1.893675] [<ffffffc000431800>] mipi_dsi_driver_register_full+0x50/0x60
	[    1.900374] [<ffffffc000ba1570>] panel_simple_init+0x2c/0x44
	[    1.906035] [<ffffffc000082934>] do_one_initcall+0x8c/0x1a0
	[    1.911612] [<ffffffc000b80aa8>] kernel_init_freeable+0x150/0x1f8
	[    1.917711] [<ffffffc0007f78fc>] kernel_init+0xc/0xe0

Instrumenting the code shows that pm_wq (passed to queue_work in the
rpm_idle() function) is NULL at this point. This matches up with the
change done in the above-mentioned commit, since now pm_wq only gets
initialized at late_initcall time, whereas all built-in drivers will
already be probed at device_initcall time. So I suspect that this is
going to cause crashes on a whole lot of systems (essentially every
system that tries to use runtime PM from a built-in driver).

Given the commit message I suspect that the right fix would be to split
pm_init() into two functions, one that initializes the hibernation image
and another with the PM core initialization. The pm_hibernate_init() is
probably going to work fine as late_initcall (I assume this was tested)
but the rest should probably stay at core_initcall.

I can provide a patch for the latter if everyone agrees that it's the
right thing, but in the meantime, can you please drop the above patch
from your tree to unbreak linux-next for all affected users?

Thanks,
Thierry

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

             reply	other threads:[~2015-10-15  9:22 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-15  9:22 Thierry Reding [this message]
2015-10-15 10:04 ` Runtime PM causes oops on next-20151015 Michael Ellerman
2015-10-15 13:26   ` Thierry Reding
2015-10-15 21:24     ` Rafael J. Wysocki
2015-10-15 23:35 ` David Kwon (권원홍)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151015092213.GA26614@ulmo.nvidia.com \
    --to=thierry.reding@gmail.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-next@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=pavel@ucw.cz \
    --cc=rjw@rjwysocki.net \
    --cc=wonhong.kwon@lge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.