From: Peter Zijlstra <peterz@infradead.org>
To: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: rjw@rjwysocki.net, oleg@redhat.com, mingo@kernel.org,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, mgorman@suse.de, ebiederm@xmission.com,
	bigeasy@linutronix.de, Will Deacon <will@kernel.org>,
	linux-kernel@vger.kernel.org, tj@kernel.org,
	linux-pm@vger.kernel.org, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v3 6/6] freezer,sched: Rewrite core freezer logic
Date: Wed, 26 Oct 2022 14:12:02 +0200
Message-ID: <Y1kkEltJfojaFjEC@hirez.programming.kicks-ass.net>
In-Reply-To: <Y1kdRNNfUeAU+FNl@hirez.programming.kicks-ass.net>

On Wed, Oct 26, 2022 at 01:43:00PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 26, 2022 at 01:32:31PM +0300, Ville Syrjälä wrote:
> > Short form looks to be this:
> > <4>[  355.437846] 1 lock held by rs:main Q:Reg/359:
> > <4>[  355.438418]  #0: ffff88844693b758 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x1b/0x30
> > <4>[  355.438432] rs:main Q:Reg/359 holding locks while freezing
> 
> > <4>[  355.438429] ------------[ cut here ]------------
> > <4>[  355.438432] rs:main Q:Reg/359 holding locks while freezing
> > <4>[  355.438439] WARNING: CPU: 0 PID: 6211 at kernel/freezer.c:134 __set_task_frozen+0x86/0xb0
> > <4>[  355.438447] Modules linked in: snd_hda_intel i915 mei_hdcp mei_pxp drm_display_helper drm_kms_helper vgem drm_shmem_helper snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core snd_pcm prime_numbers ttm drm_buddy syscopyarea sysfillrect sysimgblt fb_sys_fops fuse x86_pkg_temp_thermal coretemp kvm_intel btusb btrtl btbcm btintel kvm irqbypass bluetooth crct10dif_pclmul crc32_pclmul ecdh_generic ghash_clmulni_intel ecc e1000e mei_me i2c_i801 ptp mei i2c_smbus pps_core lpc_ich video wmi [last unloaded: drm_kms_helper]
> > <4>[  355.438521] CPU: 0 PID: 6211 Comm: rtcwake Tainted: G     U             6.1.0-rc2-CI_DRM_12295-g3844a56a0922+ #1
> > <4>[  355.438526] Hardware name:  /NUC5i7RYB, BIOS RYBDWi35.86A.0385.2020.0519.1558 05/19/2020
> > <4>[  355.438530] RIP: 0010:__set_task_frozen+0x86/0xb0
> > <4>[  355.438536] Code: 83 60 09 00 00 85 c0 74 2a 48 89 df e8 ac 02 9b 00 8b 93 38 05 00 00 48 8d b3 48 07 00 00 48 c7 c7 a0 62 2b 82 e8 ee c1 9a 00 <0f> 0b c6 05 51 75 e3 02 01 c7 43 18 00 80 00 00 b8 00 80 00 00 5b
> > <4>[  355.438541] RSP: 0018:ffffc900012cbcf0 EFLAGS: 00010086
> > <4>[  355.438546] RAX: 0000000000000000 RBX: ffff88810d090040 RCX: 0000000000000004
> > <4>[  355.438550] RDX: 0000000000000004 RSI: 00000000fffff5de RDI: 00000000ffffffff
> > <4>[  355.438553] RBP: 0000000000000000 R08: 0000000000000000 R09: c0000000fffff5de
> > <4>[  355.438557] R10: 00000000002335f8 R11: ffffc900012cbb88 R12: 0000000000000246
> > <4>[  355.438561] R13: ffffffff81165430 R14: 0000000000000000 R15: ffff88810d090040
> > <4>[  355.438565] FS:  00007fcfa43c7740(0000) GS:ffff888446800000(0000) knlGS:0000000000000000
> > <4>[  355.438569] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > <4>[  355.438582] CR2: 00007fceb380f6b8 CR3: 0000000117c5c004 CR4: 00000000003706f0
> > <4>[  355.438586] Call Trace:
> > <4>[  355.438589]  <TASK>
> > <4>[  355.438592]  task_call_func+0xc4/0xe0
> > <4>[  355.438600]  freeze_task+0x84/0xe0
> > <4>[  355.438607]  try_to_freeze_tasks+0xac/0x260
> > <4>[  355.438616]  freeze_processes+0x56/0xb0
> > <4>[  355.438622]  pm_suspend.cold.7+0x1d9/0x31c
> > <4>[  355.438629]  state_store+0x7b/0xe0
> > <4>[  355.438637]  kernfs_fop_write_iter+0x124/0x1c0
> > <4>[  355.438644]  vfs_write+0x34f/0x4e0
> > <4>[  355.438655]  ksys_write+0x57/0xd0
> > <4>[  355.438663]  do_syscall_64+0x3a/0x90
> > <4>[  355.438670]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> 
> Oh I think I see what's going on.
> 
> It's a very narrow race between schedule() and task_call_func().
> 
>   CPU0						CPU1
> 
>   __schedule()
>     rq_lock();
>     prev_state = READ_ONCE(prev->__state);
>     if (... && prev_state) {
>       deactivate_task(rq, prev, ...)
>         prev->on_rq = 0;
> 
> 						task_call_func()
> 						  raw_spin_lock_irqsave(p->pi_lock);
> 						  state = READ_ONCE(p->__state);
> 						  smp_rmb();
> 						  if (... || p->on_rq) // false!!!
> 						    rq = __task_rq_lock()
> 
> 						  ret = func();
> 
>     next = pick_next_task();
>     rq = context_switch(prev, next)
>       prepare_lock_switch()
>         spin_release(&__rq_lockp(rq)->dep_map...)
> 
> 
> 
> So while the task is on its way out, it still holds rq->lock for a
> little while, and right then task_call_func() comes in and figures it
> doesn't need rq->lock anymore (because the task is already dequeued --
> but still running there) and then the __set_task_frozen() thing observes
> it's holding rq->lock and yells murder.
> 
> Could you please give the below a spin?

Urgh.. that'll narrow the race further, but won't solve it; that
prepare_lock_switch() is after we clear ->on_cpu.

Let me ponder this a wee bit more..
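
For anyone following along, the interleaving in the quoted diagram can be
replayed as a small single-threaded userspace model. This is only a sketch
under stated assumptions, not kernel code: struct task_model and all the
function names below are made up for illustration, the elided part of the
"... || p->on_rq" condition is not modelled, and "holds_rq_lock" merely
stands in for lockdep's view of rq->lock ownership.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the task state in the diagram. */
struct task_model {
	int  state;          /* non-zero: prev is on its way to sleep */
	bool on_rq;          /* prev is still enqueued on the runqueue */
	bool holds_rq_lock;  /* lockdep still attributes rq->lock to prev */
};

/* CPU0, first half of __schedule(): rq_lock() and deactivate_task(). */
static void cpu0_schedule_dequeue(struct task_model *prev)
{
	prev->holds_rq_lock = true;
	if (prev->state)
		prev->on_rq = false;
}

/* CPU0, later: prepare_lock_switch() releases lock ownership for lockdep. */
static void cpu0_schedule_switch(struct task_model *prev)
{
	prev->holds_rq_lock = false;
}

/*
 * CPU1: modelled task_call_func(). Returns true when the callback runs
 * without rq->lock and finds the target still holding it, which is the
 * situation that produces the "holding locks while freezing" warning.
 */
static bool cpu1_task_call_func(const struct task_model *p)
{
	if (p->on_rq) {
		/* Would take rq->lock and so serialise against CPU0. */
		return false;
	}
	return p->holds_rq_lock;
}

/* Replay the diagram with CPU1 running inside or after the window. */
static bool simulate_interleaving(bool call_inside_window)
{
	struct task_model prev = { .state = 1, .on_rq = true, .holds_rq_lock = false };
	bool warned = false;

	cpu0_schedule_dequeue(&prev);
	if (call_inside_window)
		warned = cpu1_task_call_func(&prev);
	cpu0_schedule_switch(&prev);
	if (!call_inside_window)
		warned = cpu1_task_call_func(&prev);

	assert(!prev.holds_rq_lock);	/* CPU0 always ends up dropping the lock */
	return warned;
}
```

In this model simulate_interleaving(true) reports the warning case, because
CPU1 sees on_rq == 0 and skips the lock while CPU0 is still between the
dequeue and prepare_lock_switch(); simulate_interleaving(false) does not.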
