linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	rientjes@google.com, linux-mm@kvack.org, oleg@redhat.com
Subject: Re: [PATCH v3] mm,oom: speed up select_bad_process() loop.
Date: Thu, 19 May 2016 08:53:29 +0200	[thread overview]
Message-ID: <20160519065329.GA26110@dhcp22.suse.cz> (raw)
In-Reply-To: <20160518140932.6643b963e8d3fc49ff64df8d@linux-foundation.org>

On Wed 18-05-16 14:09:32, Andrew Morton wrote:
> On Wed, 18 May 2016 16:15:45 +0200 Michal Hocko <mhocko@kernel.org> wrote:
> 
> > > This patch adds a counter to signal_struct for tracking how many
> > > TIF_MEMDIE threads are in a given thread group, and check it at
> > > oom_scan_process_thread() so that select_bad_process() can use
> > > for_each_process() rather than for_each_process_thread().
> > 
> > OK, this looks correct. Strictly speaking the patch is missing any note
> > on _why_ this is needed or an improvement. I would add something like
> > the following:
> > "
> > Although the original code was correct it was quite inefficient because
> > each thread group was scanned num_threads times which can be a lot
> > especially with processes with many threads. Even though the OOM is
> > extremely cold path it is always good to be as effective as possible
> > when we are inside rcu_read_lock() - aka unpreemptible context.
> > "
> 
> This sounds quite rubbery to me.  Lots of code calls
> for_each_process_thread() and presumably that isn't causing problems. 

Yeah, many paths call for_each_process_thread but they are
O(num_threads) while this is O(num_threads^2).

> We're bloating up the signal_struct to solve some problem on a
> rarely-called slowpath with no evidence that there is actually a
> problem to be solved.

signal_struct has some holes[1] so we can stitch it there. Long term I
would like to to move this logic into the mm_struct, it would be just
larger surgery I guess.

[1]

die__process_function: tag not supported (INVALID)!
struct signal_struct {
	atomic_t                   sigcnt;               /*     0     4 */
	atomic_t                   live;                 /*     4     4 */
	int                        nr_threads;           /*     8     4 */

	/* XXX 4 bytes hole, try to pack */

	struct list_head           thread_head;          /*    16    16 */
	wait_queue_head_t          wait_chldexit;        /*    32    24 */
	struct task_struct *       curr_target;          /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct sigpending          shared_pending;       /*    64    24 */
	int                        group_exit_code;      /*    88     4 */
	int                        notify_count;         /*    92     4 */
	struct task_struct *       group_exit_task;      /*    96     8 */
	int                        group_stop_count;     /*   104     4 */
	unsigned int               flags;                /*   108     4 */
	unsigned int               is_child_subreaper:1; /*   112:31  4 */
	unsigned int               has_child_subreaper:1; /*   112:30  4 */

	/* XXX 30 bits hole, try to pack */

	int                        posix_timer_id;       /*   116     4 */
	struct list_head           posix_timers;         /*   120    16 */
	/* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */
	struct hrtimer             real_timer;           /*   136    88 */
	/* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
	struct pid *               leader_pid;           /*   224     8 */
	ktime_t                    it_real_incr;         /*   232     8 */
	struct cpu_itimer          it[2];                /*   240    48 */
	/* --- cacheline 4 boundary (256 bytes) was 32 bytes ago --- */
	struct thread_group_cputimer cputimer;           /*   288    32 */
	/* --- cacheline 5 boundary (320 bytes) --- */
	struct task_cputime        cputime_expires;      /*   320    24 */
	atomic_t                   tick_dep_mask;        /*   344     4 */

	/* XXX 4 bytes hole, try to pack */

	struct list_head           cpu_timers[3];        /*   352    48 */
	/* --- cacheline 6 boundary (384 bytes) was 16 bytes ago --- */
	struct pid *               tty_old_pgrp;         /*   400     8 */
	int                        leader;               /*   408     4 */

	/* XXX 4 bytes hole, try to pack */

	struct tty_struct *        tty;                  /*   416     8 */
	struct autogroup *         autogroup;            /*   424     8 */
	seqlock_t                  stats_lock;           /*   432     8 */
	cputime_t                  utime;                /*   440     8 */
	/* --- cacheline 7 boundary (448 bytes) --- */
	cputime_t                  stime;                /*   448     8 */
	cputime_t                  cutime;               /*   456     8 */
	cputime_t                  cstime;               /*   464     8 */
	cputime_t                  gtime;                /*   472     8 */
	cputime_t                  cgtime;               /*   480     8 */
	struct prev_cputime        prev_cputime;         /*   488    24 */
	/* --- cacheline 8 boundary (512 bytes) --- */
	long unsigned int          nvcsw;                /*   512     8 */
	long unsigned int          nivcsw;               /*   520     8 */
	long unsigned int          cnvcsw;               /*   528     8 */
	long unsigned int          cnivcsw;              /*   536     8 */
	long unsigned int          min_flt;              /*   544     8 */
	long unsigned int          maj_flt;              /*   552     8 */
	long unsigned int          cmin_flt;             /*   560     8 */
	long unsigned int          cmaj_flt;             /*   568     8 */
	/* --- cacheline 9 boundary (576 bytes) --- */
	long unsigned int          inblock;              /*   576     8 */
	long unsigned int          oublock;              /*   584     8 */
	long unsigned int          cinblock;             /*   592     8 */
	long unsigned int          coublock;             /*   600     8 */
	long unsigned int          maxrss;               /*   608     8 */
	long unsigned int          cmaxrss;              /*   616     8 */
	struct task_io_accounting  ioac;                 /*   624    56 */
	/* --- cacheline 10 boundary (640 bytes) was 40 bytes ago --- */
	long long unsigned int     sum_sched_runtime;    /*   680     8 */
	struct rlimit              rlim[16];             /*   688   256 */
	/* --- cacheline 14 boundary (896 bytes) was 48 bytes ago --- */
	struct taskstats *         stats;                /*   944     8 */
	oom_flags_t                oom_flags;            /*   952     4 */
	short int                  oom_score_adj;        /*   956     2 */
	short int                  oom_score_adj_min;    /*   958     2 */
	/* --- cacheline 15 boundary (960 bytes) --- */
	struct mutex               cred_guard_mutex;     /*   960    40 */

	/* size: 1000, cachelines: 16, members: 58 */
	/* sum members: 988, holes: 3, sum holes: 12 */
	/* bit holes: 1, sum bit holes: 30 bits */
	/* last cacheline: 40 bytes */
};
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-05-19  6:53 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-18 12:20 [PATCH v2] mm,oom: speed up select_bad_process() loop Tetsuo Handa
2016-05-18 12:51 ` Michal Hocko
2016-05-18 13:30   ` [PATCH v3] " Tetsuo Handa
2016-05-18 14:15     ` Michal Hocko
2016-05-18 21:09       ` Andrew Morton
2016-05-19  6:53         ` Michal Hocko [this message]
2016-05-19  7:17           ` Michal Hocko
2016-05-19  8:17             ` Michal Hocko
2016-05-20  1:50           ` Oleg Nesterov
2016-05-20  2:13             ` Oleg Nesterov
2016-05-20  6:42             ` Michal Hocko
2016-05-20 22:04               ` Oleg Nesterov
2016-05-18 14:44     ` Tetsuo Handa
2016-05-20  7:50     ` Michal Hocko
2016-05-20 11:51       ` Tetsuo Handa
2016-05-20 12:09         ` Michal Hocko
2016-05-20 13:41           ` Tetsuo Handa
2016-05-20 13:44             ` Tetsuo Handa
2016-05-20 15:23             ` Michal Hocko
2016-05-20 15:56               ` Tetsuo Handa
2016-05-23  7:55                 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160519065329.GA26110@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-mm@kvack.org \
    --cc=oleg@redhat.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).