From: "Bruno Prémont" <bonbons@linux-vserver.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: sedat.dilek@gmail.com, Mike Galbraith <efault@gmx.de>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Mike Frysinger <vapier.adi@gmail.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
LKML <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
"Paul E. McKenney" <paul.mckenney@linaro.org>,
Pekka Enberg <penberg@kernel.org>
Subject: Re: 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression?
Date: Thu, 28 Apr 2011 22:23:01 +0200 [thread overview]
Message-ID: <20110428222301.0b745a0a@neptune.home> (raw)
In-Reply-To: <alpine.LFD.2.02.1104282044120.3005@ionos>
[-- Attachment #1: Type: text/plain, Size: 3704 bytes --]
On Thu, 28 April 2011 Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 28 Apr 2011, Sedat Dilek wrote:
> > On Thu, Apr 28, 2011 at 3:30 PM, Mike Galbraith <efault@gmx.de> wrote:
> > rt_rq[0]:
> > .rt_nr_running : 0
> > .rt_throttled : 0
>
> > .rt_time : 888.893877
>
> > .rt_time : 950.005460
>
> So rt_time is constantly accumulated, but never decreased. The
> decrease happens in the timer callback. Looks like the timer is not
> running for whatever reason.
>
> Can you add the following patch as well ?
>
> Thanks,
>
> tglx
>
> --- linux-2.6.orig/kernel/sched.c
> +++ linux-2.6/kernel/sched.c
> @@ -172,7 +172,7 @@ static enum hrtimer_restart sched_rt_per
> idle = do_sched_rt_period_timer(rt_b, overrun);
> }
>
> - return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;
> + return HRTIMER_RESTART;
This doesn't help here.
Be it applied on top of the others, full diff attached
or applied alone (with throttling printk).
Could it be that NO_HZ=y has some importance in this matter?
Extended throttling printk (Linus asked what exact values were looking
like):
[ 401.000119] sched: RT throttling activated 950012539 > 950000000
Equivalent to what Sedat sees (/proc/sched_debug):
rt_rq[0]:
.rt_nr_running : 2
.rt_throttled : 1
.rt_time : 950.012539
.rt_runtime : 950.000000
/proc/$(pidof rcu_kthread)/sched captured at regular intervals:
Thu Apr 28 21:33:41 CEST 2011
rcu_kthread (6, #threads: 1)
---------------------------------------------------------
se.exec_start : 0.000000
se.vruntime : 0.000703
se.sum_exec_runtime : 903.067982
nr_switches : 23752
nr_voluntary_switches : 23751
nr_involuntary_switches : 1
se.load.weight : 1024
policy : 1
prio : 98
clock-delta : 912
Thu Apr 28 21:34:11 CEST 2011
rcu_kthread (6, #threads: 1)
---------------------------------------------------------
se.exec_start : 0.000000
se.vruntime : 0.000703
se.sum_exec_runtime : 974.899495
nr_switches : 25721
nr_voluntary_switches : 25720
nr_involuntary_switches : 1
se.load.weight : 1024
policy : 1
prio : 98
clock-delta : 1098
Thu Apr 28 21:34:41 CEST 2011
rcu_kthread (6, #threads: 1)
---------------------------------------------------------
se.exec_start : 0.000000
se.vruntime : 0.000703
se.sum_exec_runtime : 974.899495
nr_switches : 25721
nr_voluntary_switches : 25720
nr_involuntary_switches : 1
se.load.weight : 1024
policy : 1
prio : 98
clock-delta : 1126
Thu Apr 28 21:35:11 CEST 2011
rcu_kthread (6, #threads: 1)
> }
>
> static
[-- Attachment #2: sched_rt.diff --]
[-- Type: text/x-patch, Size: 2051 bytes --]
diff --git a/kernel/sched.c b/kernel/sched.c
index 312f8b9..aad1b88 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -172,7 +172,7 @@ static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer)
idle = do_sched_rt_period_timer(rt_b, overrun);
}
- return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;
+ return /* idle ? HRTIMER_NORESTART : */ HRTIMER_RESTART;
}
static
@@ -460,7 +460,7 @@ struct rq {
u64 nohz_stamp;
unsigned char nohz_balance_kick;
#endif
- unsigned int skip_clock_update;
+ int skip_clock_update;
/* capture load from *all* tasks on this cpu: */
struct load_weight load;
@@ -642,8 +642,8 @@ static void update_rq_clock(struct rq *rq)
{
s64 delta;
- if (rq->skip_clock_update)
- return;
+/* if (rq->skip_clock_update > 0)
+ return; */
delta = sched_clock_cpu(cpu_of(rq)) - rq->clock;
rq->clock += delta;
@@ -4035,7 +4035,7 @@ static inline void schedule_debug(struct task_struct *prev)
static void put_prev_task(struct rq *rq, struct task_struct *prev)
{
- if (prev->se.on_rq)
+ if (prev->se.on_rq || rq->skip_clock_update < 0)
update_rq_clock(rq);
prev->sched_class->put_prev_task(rq, prev);
}
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index e7cebdc..2feae93 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -572,8 +572,15 @@ static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun)
enqueue = 1;
}
- if (enqueue)
+ if (enqueue) {
+ /*
+ * Tag a forced clock update if we're coming out of idle
+ * so rq->clock_task will be updated when we schedule().
+ */
+ if (rq->curr == rq->idle)
+ rq->skip_clock_update = -1;
sched_rt_rq_enqueue(rt_rq);
+ }
raw_spin_unlock(&rq->lock);
}
@@ -608,6 +615,7 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
return 0;
if (rt_rq->rt_time > runtime) {
+ printk_once(KERN_WARNING "sched: RT throttling activated %llu > %llu\n", rt_rq->rt_time, runtime);
rt_rq->rt_throttled = 1;
if (rt_rq_throttled(rt_rq)) {
sched_rt_rq_dequeue(rt_rq);
next prev parent reply other threads:[~2011-04-28 20:23 UTC|newest]
Thread overview: 88+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-04-24 18:21 2.6.39-rc4+: Kernel leaking memory during FS scanning, regression? Bruno Prémont
2011-04-24 21:59 ` Bruno Prémont
2011-04-25 2:42 ` KOSAKI Motohiro
2011-04-25 7:47 ` Mike Frysinger
2011-04-25 9:17 ` Bruno Prémont
2011-04-25 9:25 ` Pekka Enberg
2011-04-25 10:34 ` Bruno Prémont
2011-04-25 11:41 ` Bruno Prémont
2011-04-25 11:47 ` Pekka Enberg
2011-04-25 12:11 ` Bruno Prémont
2011-04-25 12:14 ` Tetsuo Handa
2011-04-25 12:21 ` Tetsuo Handa
2011-04-25 15:22 ` Linus Torvalds
2011-04-25 16:04 ` Bruno Prémont
2011-04-25 16:31 ` Linus Torvalds
2011-04-25 17:00 ` Bruno Prémont
2011-04-25 17:10 ` Linus Torvalds
2011-04-25 17:20 ` Linus Torvalds
2011-04-25 18:36 ` Bruno Prémont
2011-04-25 19:16 ` Paul E. McKenney
2011-04-25 21:10 ` Bruno Prémont
2011-04-25 21:26 ` Paul E. McKenney
2011-04-25 21:30 ` Linus Torvalds
2011-04-25 21:49 ` Paul E. McKenney
2011-04-26 6:19 ` Bruno Prémont
2011-04-26 11:27 ` Paul E. McKenney
2011-04-26 16:38 ` Bruno Prémont
2011-04-26 17:09 ` Bruno Prémont
2011-04-26 17:18 ` Linus Torvalds
2011-04-26 22:28 ` Thomas Gleixner
2011-04-27 6:15 ` Bruno Prémont
2011-04-27 18:41 ` Bruno Prémont
2011-04-27 19:16 ` Pádraig Brady
2011-04-27 19:34 ` Bruno Prémont
2011-04-27 22:05 ` Paul E. McKenney
2011-04-27 20:40 ` Bruno Prémont
2011-04-27 22:07 ` Paul E. McKenney
2011-04-28 6:10 ` Bruno Prémont
2011-04-27 22:06 ` Thomas Gleixner
2011-04-27 22:27 ` Paul E. McKenney
2011-04-27 22:32 ` Thomas Gleixner
2011-04-27 22:59 ` Paul E. McKenney
2011-04-27 23:28 ` Linus Torvalds
2011-04-27 23:46 ` Linus Torvalds
2011-04-28 9:09 ` Thomas Gleixner
2011-04-28 9:17 ` Sedat Dilek
2011-04-28 9:40 ` Thomas Gleixner
2011-04-28 10:12 ` Mike Galbraith
2011-04-28 9:45 ` Sedat Dilek
2011-04-28 10:26 ` Paul E. McKenney
2011-04-28 13:30 ` Mike Galbraith
2011-04-28 15:28 ` Sedat Dilek
2011-04-28 15:44 ` Sedat Dilek
2011-04-28 15:48 ` Linus Torvalds
2011-04-28 18:49 ` Thomas Gleixner
2011-04-28 20:23 ` Bruno Prémont [this message]
2011-04-28 20:29 ` Thomas Gleixner
2011-04-28 20:44 ` Bruno Prémont
2011-04-28 21:04 ` Thomas Gleixner
2011-04-28 21:51 ` john stultz
2011-04-28 22:02 ` Thomas Gleixner
2011-04-28 23:06 ` Sedat Dilek
2011-04-28 23:35 ` Sedat Dilek
2011-04-29 0:42 ` Paul E. McKenney
2011-04-29 9:34 ` Thomas Gleixner
2011-04-29 7:55 ` Sedat Dilek
2011-04-29 18:09 ` Mike Frysinger
2011-04-29 18:26 ` Thomas Gleixner
2011-04-29 19:31 ` Bruno Prémont
2011-04-29 20:10 ` Thomas Gleixner
2011-04-29 20:14 ` Bruno Prémont
2011-04-30 9:14 ` Sedat Dilek
2011-04-28 20:41 ` Sedat Dilek
2011-04-28 19:22 ` Mike Galbraith
2011-04-27 21:55 ` Paul E. McKenney
2011-04-28 6:22 ` Bruno Prémont
2011-04-28 10:26 ` Paul E. McKenney
2011-04-26 17:12 ` Linus Torvalds
2011-04-26 18:50 ` Paul E. McKenney
2011-04-26 19:17 ` Sedat Dilek
2011-04-27 22:02 ` Paul E. McKenney
2011-04-25 22:08 ` Mike Frysinger
2011-04-25 17:29 ` Paul E. McKenney
2011-04-25 18:13 ` Sedat Dilek
2011-04-25 18:28 ` Paul E. McKenney
2011-04-25 17:26 ` Paul E. McKenney
2011-04-27 10:28 ` Catalin Marinas
2011-04-25 17:51 ` Pekka Enberg
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110428222301.0b745a0a@neptune.home \
--to=bonbons@linux-vserver.org \
--cc=a.p.zijlstra@chello.nl \
--cc=efault@gmx.de \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@elte.hu \
--cc=paul.mckenney@linaro.org \
--cc=paulmck@linux.vnet.ibm.com \
--cc=penberg@kernel.org \
--cc=sedat.dilek@gmail.com \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=vapier.adi@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).