Re: [cxl:for-7.0/cxl-init] [dax/hmem, e820, resource] bc62f5b308: BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]

public inbox for nvdimm@lists.linux.dev
From: <dan.j.williams@intel.com>
To: kernel test robot <oliver.sang@intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Cc: <oe-lkp@lists.linux.dev>, <lkp@intel.com>,
	Alison Schofield <alison.schofield@intel.com>,
	Vishal Verma <vishal.l.verma@intel.com>,
	"Ira Weiny" <ira.weiny@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	<linux-cxl@vger.kernel.org>, Dave Jiang <dave.jiang@intel.com>,
	"Smita Koralahalli" <Smita.KoralahalliChannabasappa@amd.com>,
	<linux-kernel@vger.kernel.org>, <nvdimm@lists.linux.dev>,
	<oliver.sang@intel.com>
Subject: Re: [cxl:for-7.0/cxl-init] [dax/hmem, e820, resource] bc62f5b308: BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]
Date: Thu, 22 Jan 2026 12:18:58 -0800
Message-ID: <69728632c464b_1d33100dd@dwillia2-mobl4.notmuch>
In-Reply-To: <202601211001.82fe0f1b-lkp@intel.com>

kernel test robot wrote:
> 
> 
> Hello,
> 
> FYI. we don't have enough knowledge to understand how the issues we found
> in the tests are related with the code. we just run the tests up to 200 times
> for both this commit and parent, noticed there are various random issues on
> this commit, but always clean on parent.
> 
> 
> =========================================================================================
> tbox_group/testcase/rootfs/kconfig/compiler/sleep:
>   vm-snb/boot/debian-11.1-i386-20220923.cgz/i386-randconfig-141-20260117/gcc-14/1
> 
> 29317f8dc6ed601e bc62f5b308cbdedf29132fe96e9
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :200          2%           5:200   dmesg.BUG:soft_lockup-CPU##stuck_for#s![kworker##:#]
>            :200          2%           5:200   dmesg.BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#]
>            :200          8%          17:200   dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]
>            :200          2%           4:200   dmesg.BUG:workqueue_lockup-pool
>            :200          0%           1:200   dmesg.EIP:__schedule
>            :200          0%           1:200   dmesg.EIP:_raw_spin_unlock_irq
>            :200          2%           4:200   dmesg.EIP:_raw_spin_unlock_irqrestore
>            :200          6%          11:200   dmesg.EIP:console_emit_next_record
>            :200          0%           1:200   dmesg.EIP:finish_task_switch
>            :200          3%           6:200   dmesg.EIP:lock_acquire
>            :200          1%           2:200   dmesg.EIP:lock_release
>            :200          1%           2:200   dmesg.EIP:queue_work_on
>            :200          0%           1:200   dmesg.EIP:rcu_preempt_deferred_qs_irqrestore
>            :200          1%           2:200   dmesg.EIP:timekeeping_notify
>            :200          0%           1:200   dmesg.INFO:rcu_preempt_detected_stalls_on_CPUs/tasks
>            :200          0%           1:200   dmesg.INFO:task_blocked_for_more_than#seconds
>            :200         14%          27:200   dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks
> 
> below is full report.

So this is good data, but I do not know what to do with it. The
RCU_STRICT_GRACE_PERIOD feature seems to want to make RCU usage bugs
more detectable, but at the risk of false positives. My concern is that
this patch disturbs 32-bit x86 builds just enough to make the softlockup
detector start getting upset about this rcu_gp::strict_work_handler
workqueue.

Unless this causes actual boot failures, all I can assume is that this
is a false positive report. Nothing in this patch touches workqueues or
object lifetimes, so the most likely explanation is a side effect of
instruction cache layout, or similar.
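
For readers weighing the "always clean on parent" claim against the
false-positive theory, here is a minimal sketch of the arithmetic behind
the robot's table. It assumes the %reproduction column is simply the
rounded percentage of failing runs, and that runs are independent; the
helper names reproduction_pct and chance_all_clean are hypothetical and
not part of any LKP tooling. The counts are copied from the quoted table.

```python
# Sketch only: recompute the %reproduction column from fail:runs counts
# and estimate how likely 200 clean parent runs would be if the parent
# had the same underlying failure rate. Helper names are hypothetical.

# (fails, runs) on commit bc62f5b308, copied from the quoted report.
REPORT = {
    "dmesg.BUG:soft_lockup-CPU##stuck_for#s![swapper:#]": (17, 200),
    "dmesg.BUG:workqueue_lockup-pool": (4, 200),
    "dmesg.EIP:console_emit_next_record": (11, 200),
    "dmesg.Kernel_panic-not_syncing:softlockup:hung_tasks": (27, 200),
}


def reproduction_pct(fails, runs):
    """Percentage of failing runs, rounded to a whole number."""
    return round(100 * fails / runs)


def chance_all_clean(fails, runs, parent_runs=200):
    """Probability of zero failures in parent_runs independent runs,
    assuming the parent shared the observed failure rate fails/runs."""
    rate = fails / runs
    return (1 - rate) ** parent_runs


for name, (fails, runs) in REPORT.items():
    pct = reproduction_pct(fails, runs)
    clean = chance_all_clean(fails, runs)
    print(f"{pct:3d}%  P(parent clean)={clean:.2e}  {name}")
```

Under those assumptions, a parent with the same failure rate would be
very unlikely to pass all 200 runs, so the difference between the two
commits looks real. That says nothing about whether the lockups are a
genuine bug or a timing side effect, which is the open question above.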

Thread overview: 2+ messages
2026-01-21  5:11 [cxl:for-7.0/cxl-init] [dax/hmem, e820, resource] bc62f5b308: BUG:soft_lockup-CPU##stuck_for#s![kworker:#:#] kernel test robot
2026-01-22 20:18 ` dan.j.williams [this message]
