From: Qian Cai <quic_qiancai@quicinc.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>,
Andrew Morton <akpm@linux-foundation.org>,
Nicolas Saenz Julienne <nsaenzju@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Linux-MM <linux-mm@kvack.org>, <kafai@fb.com>,
<kpsingh@kernel.org>
Subject: Re: [PATCH 0/6] Drain remote per-cpu directly v3
Date: Thu, 19 May 2022 09:29:45 -0400
Message-ID: <YoZGSd6yQL3EP8tk@qian>
In-Reply-To: <20220518171503.GQ1790663@paulmck-ThinkPad-P17-Gen-1>
On Wed, May 18, 2022 at 10:15:03AM -0700, Paul E. McKenney wrote:
> So does this python script somehow change the tracing state? (It does
> not look to me like it does, but I could easily be missing something.)
No, I don't think so either. It pretty much just offlines memory sections
one at a time.
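For illustration only, since the actual script is not shown in this thread:
a minimal sketch of offlining memory blocks one at a time through sysfs. It
assumes the per-block `online` attribute under /sys/devices/system/memory,
which matches the online_store() frame in the trace below. The function name,
the `root` parameter and the `dry_run` flag are hypothetical, added so the
sketch can be tried safely against a scratch directory.

```python
import os
import re
from typing import List


def offline_memory_blocks(root: str = "/sys/devices/system/memory",
                          dry_run: bool = True) -> List[str]:
    """Try to offline each memoryN block under `root`, in numeric order.

    Returns the names of the blocks that were attempted. With dry_run=True
    (the default, a hypothetical safety switch) nothing is written.
    """
    # Only directories named memory<N> are memory blocks; other entries
    # such as block_size_bytes are skipped.
    names = sorted(
        (n for n in os.listdir(root) if re.fullmatch(r"memory\d+", n)),
        key=lambda n: int(n[len("memory"):]),
    )
    attempted = []
    for name in names:
        path = os.path.join(root, name, "online")
        if not os.path.isfile(path):
            continue
        attempted.append(name)
        if dry_run:
            continue
        try:
            # Writing "0" asks the kernel to offline this block.
            with open(path, "w") as f:
                f.write("0")
        except OSError as err:
            # Offlining can legitimately fail (for example, busy pages).
            print(f"{name}: offline failed: {err}")
    return attempted
```

Calling offline_memory_blocks(dry_run=False) as root would attempt the real
offlining, so the default is a dry run that only lists the blocks.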
> Either way, is there something else waiting for these RCU flavors?
> (There should not be.) Nevertheless, if so, there should be
> a synchronize_rcu_tasks(), synchronize_rcu_tasks_rude(), or
> synchronize_rcu_tasks_trace() on some other blocked task's stack
> somewhere.
There are only three blocked tasks when this happens. The kmemleak_scan()
task is just a victim, waiting for the locks held by the stuck
offline_pages()->synchronize_rcu() task.
task:kmemleak state:D stack:25824 pid: 1033 ppid: 2 flags:0x00000008
Call trace:
__switch_to
__schedule
schedule
percpu_rwsem_wait
__percpu_down_read
percpu_down_read.constprop.0
get_online_mems
kmemleak_scan
kmemleak_scan_thread
kthread
ret_from_fork
task:cppc_fie state:D stack:23472 pid: 1848 ppid: 2 flags:0x00000008
Call trace:
__switch_to
__schedule
lockdep_recursion
task:tee state:D stack:24816 pid:16733 ppid: 16732 flags:0x0000020c
Call trace:
__switch_to
__schedule
schedule
schedule_timeout
__wait_for_common
wait_for_completion
__wait_rcu_gp
synchronize_rcu
lru_cache_disable
__alloc_contig_migrate_range
isolate_single_pageblock
start_isolate_page_range
offline_pages
memory_subsys_offline
device_offline
online_store
dev_attr_store
sysfs_kf_write
kernfs_fop_write_iter
new_sync_write
vfs_write
ksys_write
__arm64_sys_write
invoke_syscall
el0_svc_common.constprop.0
do_el0_svc
el0_svc
el0t_64_sync_handler
el0t_64_sync
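As a rough aid for reading dumps like the three above, here is a minimal
sketch that groups frames by task and lists the tasks that have a given
symbol on their stack. It assumes the simplified layout shown in this
message (a "task:" header line, then "Call trace:", then one symbol per
line). The function names are hypothetical, and raw console output with
timestamps would need extra cleanup first.

```python
import re
from typing import Dict, List

# Matches headers such as:
#   task:tee state:D stack:24816 pid:16733 ppid: 16732 flags:0x0000020c
TASK_RE = re.compile(
    r"^task:(?P<comm>\S+)\s+state:(?P<state>\S+)\s.*?pid:\s*(?P<pid>\d+)"
)


def parse_blocked_tasks(dump: str) -> List[Dict]:
    """Split a task dump into one record per task, with its stack frames."""
    tasks: List[Dict] = []
    current = None
    for raw in dump.splitlines():
        line = raw.strip()
        match = TASK_RE.match(line)
        if match:
            current = {
                "comm": match.group("comm"),
                "state": match.group("state"),
                "pid": int(match.group("pid")),
                "frames": [],
            }
            tasks.append(current)
        elif current is not None and line and line != "Call trace:":
            # Drop a trailing "+0x.../0x..." offset if one is present.
            current["frames"].append(line.split("+", 1)[0])
    return tasks


def waiting_in(tasks: List[Dict], symbol: str) -> List[str]:
    """Return the names of tasks whose stack contains `symbol`."""
    return [t["comm"] for t in tasks if symbol in t["frames"]]
```

Calling waiting_in(parse_blocked_tasks(text), "synchronize_rcu") on such a
dump picks out the tasks sitting in a grace-period wait, and the same call
with "get_online_mems" picks out the tasks blocked behind the hotplug lock.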
> Or maybe something sleeps waiting for an RCU Tasks * callback to
> be invoked. In that case (and in the above case, for that matter),
> at least one of these pointers would be non-NULL on some CPU:
>
> 1. rcu_tasks__percpu.cblist.head
> 2. rcu_tasks_rude__percpu.cblist.head
> 3. rcu_tasks_trace__percpu.cblist.head
>
> The ->func field of the pointed-to structure contains a pointer to
> the callback function, which will help work out what is going on.
> (Most likely a wakeup being lost or not provided.)
What would be an easy way to find those out? I can't see anything
interesting in the sysrq-t output.
> Alternatively, if your system has hundreds of thousands of tasks and
> you have attached BPF programs to short-lived socket structures and you
> don't yet have the workaround, then you can see hangs. (I am working on a
> longer-term fix.) In the short term, applying the workaround is the right
> thing to do. (Adding a couple of the BPF guys on CC for their thoughts.)
The system is pretty much idle after a fresh reboot. The only workload is
to run the script.