From: "Marc Hartmayer" <mhartmay@linux.ibm.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>, linux-kernel@vger.kernel.org
Cc: Lai Jiangshan <jiangshan.ljs@antgroup.com>,
Valentin Schneider <vschneid@redhat.com>,
Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
Heiko Carstens <hca@linux.ibm.com>,
Sven Schnelle <svens@linux.ibm.com>,
Mete Durlu <meted@linux.ibm.com>
Subject: Re: [PATCH 1/4] workqueue: Reap workers via kthread_stop() and remove detach_completion
Date: Tue, 23 Jul 2024 18:19:37 +0200
Message-ID: <87le1sjd2e.fsf@linux.ibm.com>
In-Reply-To: <20240621073225.3600-2-jiangshanlai@gmail.com>
On Fri, Jun 21, 2024 at 03:32 PM +0800, Lai Jiangshan <jiangshanlai@gmail.com> wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
>
> The code to kick off the destruction of workers is now in a process
> context (idle_cull_fn()), so kthread_stop() can be used in the process
> context to replace the work of pool->detach_completion.
>
> The wakeup in wake_dying_workers() is unneeded after this change, but it
> is harmless, just keep it here until the next patch renames wake_dying_workers()
> rather than renaming it again and again.
>
> Cc: Valentin Schneider <vschneid@redhat.com>
> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> ---
> kernel/workqueue.c | 35 +++++++++++++++++++----------------
> 1 file changed, 19 insertions(+), 16 deletions(-)
>
Hi Lai,
A bisect of a regression in our CI on s390x led to this patch. The bug
is easy to reproduce (so far I have only tested it on s390x; I will
try to test it on x86 as well):
1. Start a Linux QEMU/KVM guest with 2 cores, running a kernel that
includes this patch, and enable `panic_on_warn=1` for the guest kernel.
2. Run the following command in the KVM guest:
$ dd if=/dev/zero of=/dev/null & while : ; do chcpu -d 1; chcpu -e 1; done
3. Wait for the crash, e.g.:
2024/07/23 18:01:21 [M83LP63]: [ 157.267727] ------------[ cut here ]------------
2024/07/23 18:01:21 [M83LP63]: [ 157.267735] WARNING: CPU: 21 PID: 725 at kernel/workqueue.c:3340 worker_thread+0x54e/0x558
2024/07/23 18:01:21 [M83LP63]: [ 157.267746] Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables sunrpc dm_service_time s390_trng vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel
2024/07/23 18:01:21 [M83LP63]: loop dm_multipath configfs nfnetlink lcs ctcm fsm zfcp scsi_transport_fc ghash_s390 prng chacha_s390 libchacha aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common scm_block eadm_sch scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt rng_core autofs4
2024/07/23 18:01:21 [M83LP63]: [ 157.267792] CPU: 21 PID: 725 Comm: kworker/dying Not tainted 6.10.0-rc2-00239-g68f83057b913 #95
2024/07/23 18:01:21 [M83LP63]: [ 157.267796] Hardware name: IBM 3906 M04 704 (LPAR)
2024/07/23 18:01:21 [M83LP63]: [ 157.267802] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
2024/07/23 18:01:21 [M83LP63]: [ 157.267797] Krnl PSW : 0704d00180000000 000003d600fcd9fa (worker_thread+0x552/0x558)
2024/07/23 18:01:21 [M83LP63]: [ 157.267806] Krnl GPRS: 6479696e6700776f 000002c901b62780 000003d602493ec8 000002c914954600
2024/07/23 18:01:21 [M83LP63]: [ 157.267809] 0000000000000000 0000000000000008 000002c901a85400 000002c90719e840
2024/07/23 18:01:21 [M83LP63]: [ 157.267811] 000002c90719e880 000002c901a85420 000002c91127adf0 000002c901a85400
2024/07/23 18:01:21 [M83LP63]: [ 157.267813] 000002c914954600 0000000000000000 000003d600fcd772 000003560452bd98
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] Krnl Code: 000003d600fcd9ec: c0e500674262 brasl %r14,000003d601cb5eb0
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcd9f2: a7f4ffc8 brc 15,000003d600fcd982
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] #000003d600fcd9f6: af000000 mc 0,0
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] >000003d600fcd9fa: a7f4fec2 brc 15,000003d600fcd77e
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcd9fe: 0707 bcr 0,%r7
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcda00: c00400682e10 brcl 0,000003d601cd3620
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcda06: eb7ff0500024 stmg %r7,%r15,80(%r15)
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcda0c: b90400ef lgr %r14,%r15
2024/07/23 18:01:21 [M83LP63]: [ 157.267853] Call Trace:
2024/07/23 18:01:21 [M83LP63]: [ 157.267855] [<000003d600fcd9fa>] worker_thread+0x552/0x558
2024/07/23 18:01:21 [M83LP63]: [ 157.267859] ([<000003d600fcd772>] worker_thread+0x2ca/0x558)
2024/07/23 18:01:21 [M83LP63]: [ 157.267862] [<000003d600fd6c80>] kthread+0x120/0x128
2024/07/23 18:01:21 [M83LP63]: [ 157.267865] [<000003d600f5305c>] __ret_from_fork+0x3c/0x58
2024/07/23 18:01:21 [M83LP63]: [ 157.267868] [<000003d601cc746a>] ret_from_fork+0xa/0x30
2024/07/23 18:01:21 [M83LP63]: [ 157.267873] Last Breaking-Event-Address:
2024/07/23 18:01:21 [M83LP63]: [ 157.267874] [<000003d600fcd778>] worker_thread+0x2d0/0x558
2024/07/23 18:01:21 [M83LP63]: [ 157.267879] Kernel panic - not syncing: kernel: panic_on_warn set ...
2024/07/23 18:01:22 [M83LP63]: [ 157.267881] CPU: 21 PID: 725 Comm: kworker/dying Not tainted 6.10.0-rc2-00239-g68f83057b913 #95
2024/07/23 18:01:22 [M83LP63]: [ 157.267884] Hardware name: IBM 3906 M04 704 (LPAR)
2024/07/23 18:01:22 [M83LP63]: [ 157.267885] Call Trace:
2024/07/23 18:01:22 [M83LP63]: [ 157.267886] [<000003d600fa7f8c>] panic+0x1ec/0x308
2024/07/23 18:01:22 [M83LP63]: [ 157.267892] [<000003d600fa822c>] check_panic_on_warn+0x84/0x88
2024/07/23 18:01:22 [M83LP63]: [ 157.267896] [<000003d600fa846e>] __warn+0xa6/0x160
2024/07/23 18:01:22 [M83LP63]: [ 157.267899] [<000003d601c8ac7e>] report_bug+0x146/0x1c0
2024/07/23 18:01:22 [M83LP63]: [ 157.267903] [<000003d600f50e64>] monitor_event_exception+0x44/0x80
2024/07/23 18:01:22 [M83LP63]: [ 157.267905] [<000003d601cb67e0>] __do_pgm_check+0xf0/0x1b0
2024/07/23 18:01:22 [M83LP63]: [ 157.267911] [<000003d601cc75a8>] pgm_check_handler+0x118/0x168
2024/07/23 18:01:22 [M83LP63]: [ 157.267914] [<000003d600fcd9fa>] worker_thread+0x552/0x558
2024/07/23 18:01:22 [M83LP63]: [ 157.267917] ([<000003d600fcd772>] worker_thread+0x2ca/0x558)
2024/07/23 18:01:22 [M83LP63]: [ 157.267920] [<000003d600fd6c80>] kthread+0x120/0x128
2024/07/23 18:01:22 [M83LP63]: [ 157.267923] [<000003d600f5305c>] __ret_from_fork+0x3c/0x58
2024/07/23 18:01:22 [M83LP63]: [ 157.267926] [<000003d601cc746a>] ret_from_fork+0xa/0x30
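The reproduction steps above can be sketched as a small script with a bounded loop instead of `while :`. This is only an illustration: the `CHCPU` and `DRY_RUN` variables and the `run` and `toggle_cpu` helpers are hypothetical names introduced here, not part of the original report. It assumes `chcpu` from util-linux is available and that the script runs as root inside the guest.

```shell
#!/bin/sh
# Sketch of the hotplug reproducer with a bounded iteration count.
# CHCPU and DRY_RUN are hypothetical knobs added for illustration only.
CHCPU=${CHCPU:-chcpu}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# toggle_cpu CPU COUNT: take CPU offline and online again, COUNT times.
toggle_cpu() {
    cpu=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        run "$CHCPU" -d "$cpu" || return 1
        run "$CHCPU" -e "$cpu" || return 1
        i=$((i + 1))
    done
}
```

A possible invocation, mirroring step 2, would be `dd if=/dev/zero of=/dev/null & toggle_cpu 1 1000; kill $!`. The iteration count of 1000 is an arbitrary choice; how many cycles are needed before the warning triggers is not stated in the report.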
--
Kind regards
Marc Hartmayer
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Wolfgang Wendt
Management: David Faller
Registered office: Böblingen
Registration court: Amtsgericht Stuttgart, HRB 243294