stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Ying Xue <ying.xue@windriver.com>,
	Fan Du <fan.du@windriver.com>, Yong Zhang <yong.zhang0@gmail.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	peterz@infradead.org, Ingo Molnar <mingo@kernel.org>,
	Li Zefan <lizefan@huawei.com>
Subject: [PATCH 3.4 27/30] sched/rt: Avoid updating RT entry timeout twice within one tick period
Date: Tue, 11 Feb 2014 11:06:27 -0800	[thread overview]
Message-ID: <20140211184647.919162730@linuxfoundation.org> (raw)
In-Reply-To: <20140211184647.149512538@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Ying Xue <ying.xue@windriver.com>

commit 57d2aa00dcec67afa52478730f2b524521af14fb upstream.

The issue below was found in 2.6.34-rt rather than mainline rt
kernel, but the issue still exists upstream as well.

So please let me describe how it was noticed on 2.6.34-rt:

On this version, each softirq has its own thread, it means there
is at least one RT FIFO task per cpu. The priority of these
tasks is set to 49 by default. If user launches an RT FIFO task
with priority lower than 49 of softirq RT tasks, it's possible
there are two RT FIFO tasks enqueued one cpu runqueue at one
moment. By current strategy of balancing RT tasks, when it comes
to RT tasks, we really need to put them off to a CPU that they
can run on as soon as possible. Even if it means a bit of cache
line flushing, we want RT tasks to be run with the least latency.

When the user RT FIFO task which just launched before is
running, the sched timer tick of the current cpu happens. In this
tick period, the timeout value of the user RT task will be
updated once. Subsequently, we try to wake up one softirq RT
task on its local cpu. As the priority of current user RT task
is lower than the softirq RT task, the current task will be
preempted by the higher priority softirq RT task. Before
preemption, we check to see if current can readily move to a
different cpu. If so, we will reschedule to allow the RT push logic
to try to move current somewhere else. Whenever the woken
softirq RT task runs, it first tries to migrate the user FIFO RT
task over to a cpu that is running a task of lesser priority. If
migration is done, it will send a reschedule request to the found
cpu by IPI interrupt. Once the target cpu responds the IPI
interrupt, it will pick the migrated user RT task to preempt its
current task. When the user RT task is running on the new cpu,
the sched timer tick of the cpu fires. So it will tick the user
RT task again. This also means the RT task timeout value will be
updated again. As the migration may be done in one tick period,
it means the user RT task timeout value will be updated twice
within one tick.

If we set a limit on the amount of cpu time for the user RT task
by setrlimit(RLIMIT_RTTIME), the SIGXCPU signal should be posted
upon reaching the soft limit.

But exactly when the SIGXCPU signal should be sent depends on the
RT task timeout value. In fact the timeout mechanism of sending
the SIGXCPU signal assumes the RT task timeout is increased once
every tick.

However, currently the timeout value may be added twice per
tick. So it results in the SIGXCPU signal being sent earlier
than expected.

To solve this issue, we prevent the timeout value from increasing
twice within one tick time by remembering the jiffies value of
last updating the timeout. As long as the RT task's jiffies is
different with the global jiffies value, we allow its timeout to
be updated.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Fan Du <fan.du@windriver.com>
Reviewed-by: Yong Zhang <yong.zhang0@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1342508623-2887-1-git-send-email-ying.xue@windriver.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ lizf: backported to 3.4: adjust context ]
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 include/linux/sched.h |    1 +
 kernel/sched/rt.c     |    6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1237,6 +1237,7 @@ struct sched_entity {
 struct sched_rt_entity {
 	struct list_head run_list;
 	unsigned long timeout;
+	unsigned long watchdog_stamp;
 	unsigned int time_slice;
 	int nr_cpus_allowed;
 
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2002,7 +2002,11 @@ static void watchdog(struct rq *rq, stru
 	if (soft != RLIM_INFINITY) {
 		unsigned long next;
 
-		p->rt.timeout++;
+		if (p->rt.watchdog_stamp != jiffies) {
+			p->rt.timeout++;
+			p->rt.watchdog_stamp = jiffies;
+		}
+
 		next = DIV_ROUND_UP(min(soft, hard), USEC_PER_SEC/HZ);
 		if (p->rt.timeout > next)
 			p->cputime_expires.sched_exp = p->se.sum_exec_runtime;



  parent reply	other threads:[~2014-02-11 19:06 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-11 19:06 [PATCH 3.4 00/30] 3.4.80-stable review Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 01/30] SELinux: Fix memory leak upon loading policy Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 02/30] intel-iommu: fix off-by-one in pagetable freeing Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 03/30] audit: correct a type mismatch in audit_syscall_exit() Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 04/30] mmc: atmel-mci: fix timeout errors in SDIO mode when using DMA Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 05/30] slub: Fix calculation of cpu slabs Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 07/30] ACPI / init: Flag use of ACPI and ACPI idioms for power supplies to regulator API Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 08/30] mtd: mxc_nand: remove duplicated ecc_stats counting Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 09/30] ore: Fix wrong math in allocation of per device BIO Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 10/30] IB/qib: Fix QP check when looping back to/from QP1 Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 11/30] spidev: fix hang when transfer_one_message fails Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 12/30] NFSv4: OPEN must handle the NFS4ERR_IO return code correctly Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 13/30] nfs4.1: properly handle ENOTSUP in SECINFO_NO_NAME Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 14/30] sunrpc: Fix infinite loop in RPC state machine Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 15/30] dm: wait until embedded kobject is released before destroying a device Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 16/30] dm space map common: make sure new space is used during extend Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 20/30] drm/radeon: set the full cache bit for fences on r7xx+ Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 21/30] drm/radeon/DCE4+: clear bios scratch dpms bit (v2) Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 22/30] PCI: Enable ARI if dev and upstream bridge support it; disable otherwise Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 23/30] hpfs: deadlock and race in directory lseek() Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 24/30] sched/rt: Fix SCHED_RR across cgroups Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 25/30] sched,rt: fix isolated CPUs leaving root_task_group indefinitely throttled Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 26/30] sched: Unthrottle rt runqueues in __disable_runtime() Greg Kroah-Hartman
2014-02-11 19:06 ` Greg Kroah-Hartman [this message]
2014-02-11 19:06 ` [PATCH 3.4 28/30] rtc-cmos: Add an alarm disable quirk Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 29/30] timekeeping: Avoid possible deadlock from clock_was_set_delayed Greg Kroah-Hartman
2014-02-11 19:06 ` [PATCH 3.4 30/30] 3.4.y: timekeeping: fix 32-bit overflow in get_monotonic_boottime Greg Kroah-Hartman
2014-02-12  4:19 ` [PATCH 3.4 00/30] 3.4.80-stable review Guenter Roeck
2014-02-12 18:55   ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140211184647.919162730@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=fan.du@windriver.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=stable@vger.kernel.org \
    --cc=ying.xue@windriver.com \
    --cc=yong.zhang0@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).