From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
rientjes@google.com, linux-mm@kvack.org, oleg@redhat.com
Subject: Re: [PATCH v3] mm,oom: speed up select_bad_process() loop.
Date: Thu, 19 May 2016 08:53:29 +0200 [thread overview]
Message-ID: <20160519065329.GA26110@dhcp22.suse.cz> (raw)
In-Reply-To: <20160518140932.6643b963e8d3fc49ff64df8d@linux-foundation.org>
On Wed 18-05-16 14:09:32, Andrew Morton wrote:
> On Wed, 18 May 2016 16:15:45 +0200 Michal Hocko <mhocko@kernel.org> wrote:
>
> > > This patch adds a counter to signal_struct for tracking how many
> > > TIF_MEMDIE threads are in a given thread group, and check it at
> > > oom_scan_process_thread() so that select_bad_process() can use
> > > for_each_process() rather than for_each_process_thread().
> >
> > OK, this looks correct. Strictly speaking the patch is missing any note
> > on _why_ this is needed or an improvement. I would add something like
> > the following:
> > "
> > Although the original code was correct it was quite inefficient because
> > each thread group was scanned num_threads times which can be a lot
> > especially with processes with many threads. Even though the OOM is
> > extremely cold path it is always good to be as effective as possible
> > when we are inside rcu_read_lock() - aka unpreemptible context.
> > "
>
> This sounds quite rubbery to me. Lots of code calls
> for_each_process_thread() and presumably that isn't causing problems.
Yeah, many paths call for_each_process_thread but they are
O(num_threads) while this is O(num_threads^2).
> We're bloating up the signal_struct to solve some problem on a
> rarely-called slowpath with no evidence that there is actually a
> problem to be solved.
signal_struct has some holes[1] so we can stitch it there. Long term I
would like to to move this logic into the mm_struct, it would be just
larger surgery I guess.
[1]
die__process_function: tag not supported (INVALID)!
struct signal_struct {
atomic_t sigcnt; /* 0 4 */
atomic_t live; /* 4 4 */
int nr_threads; /* 8 4 */
/* XXX 4 bytes hole, try to pack */
struct list_head thread_head; /* 16 16 */
wait_queue_head_t wait_chldexit; /* 32 24 */
struct task_struct * curr_target; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct sigpending shared_pending; /* 64 24 */
int group_exit_code; /* 88 4 */
int notify_count; /* 92 4 */
struct task_struct * group_exit_task; /* 96 8 */
int group_stop_count; /* 104 4 */
unsigned int flags; /* 108 4 */
unsigned int is_child_subreaper:1; /* 112:31 4 */
unsigned int has_child_subreaper:1; /* 112:30 4 */
/* XXX 30 bits hole, try to pack */
int posix_timer_id; /* 116 4 */
struct list_head posix_timers; /* 120 16 */
/* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
struct hrtimer real_timer; /* 136 88 */
/* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
struct pid * leader_pid; /* 224 8 */
ktime_t it_real_incr; /* 232 8 */
struct cpu_itimer it[2]; /* 240 48 */
/* --- cacheline 4 boundary (256 bytes) was 32 bytes ago --- */
struct thread_group_cputimer cputimer; /* 288 32 */
/* --- cacheline 5 boundary (320 bytes) --- */
struct task_cputime cputime_expires; /* 320 24 */
atomic_t tick_dep_mask; /* 344 4 */
/* XXX 4 bytes hole, try to pack */
struct list_head cpu_timers[3]; /* 352 48 */
/* --- cacheline 6 boundary (384 bytes) was 16 bytes ago --- */
struct pid * tty_old_pgrp; /* 400 8 */
int leader; /* 408 4 */
/* XXX 4 bytes hole, try to pack */
struct tty_struct * tty; /* 416 8 */
struct autogroup * autogroup; /* 424 8 */
seqlock_t stats_lock; /* 432 8 */
cputime_t utime; /* 440 8 */
/* --- cacheline 7 boundary (448 bytes) --- */
cputime_t stime; /* 448 8 */
cputime_t cutime; /* 456 8 */
cputime_t cstime; /* 464 8 */
cputime_t gtime; /* 472 8 */
cputime_t cgtime; /* 480 8 */
struct prev_cputime prev_cputime; /* 488 24 */
/* --- cacheline 8 boundary (512 bytes) --- */
long unsigned int nvcsw; /* 512 8 */
long unsigned int nivcsw; /* 520 8 */
long unsigned int cnvcsw; /* 528 8 */
long unsigned int cnivcsw; /* 536 8 */
long unsigned int min_flt; /* 544 8 */
long unsigned int maj_flt; /* 552 8 */
long unsigned int cmin_flt; /* 560 8 */
long unsigned int cmaj_flt; /* 568 8 */
/* --- cacheline 9 boundary (576 bytes) --- */
long unsigned int inblock; /* 576 8 */
long unsigned int oublock; /* 584 8 */
long unsigned int cinblock; /* 592 8 */
long unsigned int coublock; /* 600 8 */
long unsigned int maxrss; /* 608 8 */
long unsigned int cmaxrss; /* 616 8 */
struct task_io_accounting ioac; /* 624 56 */
/* --- cacheline 10 boundary (640 bytes) was 40 bytes ago --- */
long long unsigned int sum_sched_runtime; /* 680 8 */
struct rlimit rlim[16]; /* 688 256 */
/* --- cacheline 14 boundary (896 bytes) was 48 bytes ago --- */
struct taskstats * stats; /* 944 8 */
oom_flags_t oom_flags; /* 952 4 */
short int oom_score_adj; /* 956 2 */
short int oom_score_adj_min; /* 958 2 */
/* --- cacheline 15 boundary (960 bytes) --- */
struct mutex cred_guard_mutex; /* 960 40 */
/* size: 1000, cachelines: 16, members: 58 */
/* sum members: 988, holes: 3, sum holes: 12 */
/* bit holes: 1, sum bit holes: 30 bits */
/* last cacheline: 40 bytes */
};
--
Michal Hocko
SUSE Labs
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-05-19 6:53 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-05-18 12:20 [PATCH v2] mm,oom: speed up select_bad_process() loop Tetsuo Handa
2016-05-18 12:51 ` Michal Hocko
2016-05-18 13:30 ` [PATCH v3] " Tetsuo Handa
2016-05-18 14:15 ` Michal Hocko
2016-05-18 21:09 ` Andrew Morton
2016-05-19 6:53 ` Michal Hocko [this message]
2016-05-19 7:17 ` Michal Hocko
2016-05-19 8:17 ` Michal Hocko
2016-05-20 1:50 ` Oleg Nesterov
2016-05-20 2:13 ` Oleg Nesterov
2016-05-20 6:42 ` Michal Hocko
2016-05-20 22:04 ` Oleg Nesterov
2016-05-18 14:44 ` Tetsuo Handa
2016-05-20 7:50 ` Michal Hocko
2016-05-20 11:51 ` Tetsuo Handa
2016-05-20 12:09 ` Michal Hocko
2016-05-20 13:41 ` Tetsuo Handa
2016-05-20 13:44 ` Tetsuo Handa
2016-05-20 15:23 ` Michal Hocko
2016-05-20 15:56 ` Tetsuo Handa
2016-05-23 7:55 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160519065329.GA26110@dhcp22.suse.cz \
--to=mhocko@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=linux-mm@kvack.org \
--cc=oleg@redhat.com \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).