* Updating kernels to latest stable versions
@ 2009-10-19 15:28 Marcin Juszkiewicz
2009-10-19 15:28 ` [PATCH 01/12] linux 2.6.18: update to latest stable patch Marcin Juszkiewicz
` (14 more replies)
0 siblings, 15 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:28 UTC (permalink / raw)
To: openembedded-devel
Some time ago I asked about updating kernels to recent stable patch.
There was nearly no discussion on it so this time I am sending mails for
review. Did not build checked those patches yet.
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH 01/12] linux 2.6.18: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
@ 2009-10-19 15:28 ` Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 02/12] linux 2.6.20: " Marcin Juszkiewicz
` (13 subsequent siblings)
14 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:28 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.18.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.18.bb b/recipes/linux/linux_2.6.18.bb
index b445325..df76415 100644
--- a/recipes/linux/linux_2.6.18.bb
+++ b/recipes/linux/linux_2.6.18.bb
@@ -6,11 +6,12 @@ require linux.inc
DEFAULT_PREFERENCE = "-1"
DEFAULT_PREFERENCE_avr32 = "1"
-PR = "r0"
+PR = "r1"
PARALLEL_MAKE=""
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.8.bz2;patch=1 \
file://defconfig \
"
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 02/12] linux 2.6.20: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
2009-10-19 15:28 ` [PATCH 01/12] linux 2.6.18: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-21 11:12 ` [PATCH] linux 2.6.20: added patch to fix PATH_MAX related build error Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 03/12] linux 2.6.21: update to latest stable patch Marcin Juszkiewicz
` (12 subsequent siblings)
14 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.20.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.20.bb b/recipes/linux/linux_2.6.20.bb
index 1614c9d..c701a3b 100644
--- a/recipes/linux/linux_2.6.20.bb
+++ b/recipes/linux/linux_2.6.20.bb
@@ -5,9 +5,10 @@ DEFAULT_PREFERENCE_at91sam9263ek = "20"
DEFAULT_PREFERENCE_at91sam9261ek = "20"
DEFAULT_PREFERENCE_at91sam9260ek = "20"
-PR = "r8"
+PR = "r9"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.21.bz2;patch=1 \
file://defconfig"
SRC_URI_append_n2100 = "\
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 03/12] linux 2.6.21: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
2009-10-19 15:28 ` [PATCH 01/12] linux 2.6.18: update to latest stable patch Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 02/12] linux 2.6.20: " Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 04/12] linux 2.6.22: " Marcin Juszkiewicz
` (11 subsequent siblings)
14 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.21.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.21.bb b/recipes/linux/linux_2.6.21.bb
index c4c7297..bbac0c2 100644
--- a/recipes/linux/linux_2.6.21.bb
+++ b/recipes/linux/linux_2.6.21.bb
@@ -5,9 +5,10 @@ DEFAULT_PREFERENCE_at91sam9263ek = "-1"
DEFAULT_PREFERENCE_gumstix-connex = "1"
DEFAULT_PREFERENCE_gumstix-verdex = "1"
-PR = "r12"
+PR = "r13"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.7.bz2;patch=1 \
file://tsc2003.c \
file://tsc2003-config.diff;patch=1 \
file://defconfig \
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 04/12] linux 2.6.22: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (2 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 03/12] linux 2.6.21: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 05/12] linux 2.6.23: " Marcin Juszkiewicz
` (10 subsequent siblings)
14 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.22.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.22.bb b/recipes/linux/linux_2.6.22.bb
index bb8749f..a2d4f55 100644
--- a/recipes/linux/linux_2.6.22.bb
+++ b/recipes/linux/linux_2.6.22.bb
@@ -5,9 +5,10 @@ DEFAULT_PREFERENCE = "-1"
DEFAULT_PREFERENCE_cm-x270 = "-1"
DEFAULT_PREFERENCE_bd-neon = "0"
-PR = "r5"
+PR = "r6"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.22.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.19.bz2;patch=1 \
file://defconfig \
"
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 05/12] linux 2.6.23: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (3 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 04/12] linux 2.6.22: " Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-21 10:23 ` [PATCH] linux 2.6.23: keep sched-cfs locally updated to 2.6.23.17 Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 06/12] linux 2.6.24: update to latest stable patch Marcin Juszkiewicz
` (9 subsequent siblings)
14 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.23.bb | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/recipes/linux/linux_2.6.23.bb b/recipes/linux/linux_2.6.23.bb
index 0f62a1c..754b795 100644
--- a/recipes/linux/linux_2.6.23.bb
+++ b/recipes/linux/linux_2.6.23.bb
@@ -6,7 +6,7 @@ DEFAULT_PREFERENCE_mpc8313e-rdb = "1"
DEFAULT_PREFERENCE_mpc8323e-rdb = "1"
DEFAULT_PREFERENCE_avr32 = "1"
-PR = "r12"
+PR = "r13"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 \
file://binutils-buildid-arm.patch;patch=1 \
@@ -15,7 +15,7 @@ SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 \
"
# Bug fixes on the 2.6.23.x stable branch
-SRC_URI += "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-2.6.23.12.bz2;patch=1"
+SRC_URI += "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-2.6.23.17.bz2;patch=1"
# Real-time preemption (includes CFS). This is experimental and requires a different defconfig.
#SRC_URI += "file://patch-2.6.23.12-rt14;patch=1"
# Only the Completely Fair Scheduler (CFS), the official backport from 2.6.24
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 06/12] linux 2.6.24: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (4 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 05/12] linux 2.6.23: " Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-21 10:32 ` [PATCH] linux 2.6.24: drop time.h.patch - it is in 2.6.24.7 update Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 07/12] linux 2.6.25: update to latest stable patch Marcin Juszkiewicz
` (8 subsequent siblings)
14 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.24.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.24.bb b/recipes/linux/linux_2.6.24.bb
index bfcfb89..5a09516 100644
--- a/recipes/linux/linux_2.6.24.bb
+++ b/recipes/linux/linux_2.6.24.bb
@@ -12,9 +12,10 @@ DEFAULT_PREFERENCE_oxnas = "1"
DEFAULT_PREFERENCE_hipox = "1"
DEFAULT_PREFERENCE_cs-e9302 = "1"
-PR = "r33"
+PR = "r34"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.24.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.7.bz2;patch=1 \
file://squashfs-lzma-2.6.24.patch;patch=1 \
file://ubifs-v2.6.24.patch;patch=1 \
file://time.h.patch;patch=1 \
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 07/12] linux 2.6.25: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (5 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 06/12] linux 2.6.24: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 08/12] linux 2.6.26: " Marcin Juszkiewicz
` (7 subsequent siblings)
14 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.25.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.25.bb b/recipes/linux/linux_2.6.25.bb
index b6ed3d8..aff0fc7 100644
--- a/recipes/linux/linux_2.6.25.bb
+++ b/recipes/linux/linux_2.6.25.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r6"
+PR = "r7"
# Mark archs/machines that this kernel supports
DEFAULT_PREFERENCE = "-1"
@@ -15,6 +15,7 @@ DEFAULT_PREFERENCE_at91-l9260 = "1"
DEFAULT_PREFERENCE_m8050 = "1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.25.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.20.bz2;patch=1 \
file://defconfig"
SRC_URI_append_mpc8313e-rdb = "\
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 08/12] linux 2.6.26: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (6 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 07/12] linux 2.6.25: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-21 11:55 ` [PATCH] linux 2.6.26: do not apply 2.6.26.5 for topas910 Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 09/12] linux 2.6.27: update to latest stable patch Marcin Juszkiewicz
` (6 subsequent siblings)
14 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.26.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.26.bb b/recipes/linux/linux_2.6.26.bb
index 53d5557..e1bb68c 100644
--- a/recipes/linux/linux_2.6.26.bb
+++ b/recipes/linux/linux_2.6.26.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r9"
+PR = "r10"
# Mark archs/machines that this kernel supports
DEFAULT_PREFERENCE = "-1"
@@ -11,6 +11,7 @@ DEFAULT_PREFERENCE_topas910 = "1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.26.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.8.bz2;patch=1 \
file://defconfig"
SRC_URI_append_boc01 = "\
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 09/12] linux 2.6.27: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (7 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 08/12] linux 2.6.26: " Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-22 10:01 ` [PATCH] linux: updated 2.6.27 to 2.6.27.38 Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 10/12] linux 2.6.28: update to latest stable patch Marcin Juszkiewicz
` (5 subsequent siblings)
14 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.27.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.27.bb b/recipes/linux/linux_2.6.27.bb
index 0b873a6..70b9028 100644
--- a/recipes/linux/linux_2.6.27.bb
+++ b/recipes/linux/linux_2.6.27.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r8"
+PR = "r9"
# Mark archs/machines that this kernel supports
DEFAULT_PREFERENCE = "-1"
@@ -9,6 +9,7 @@ DEFAULT_PREFERENCE_progear = "1"
DEFAULT_PREFERENCE_simpad = "-1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.37.bz2;patch=1 \
file://defconfig "
SRC_URI_append_boc01 = "\
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 10/12] linux 2.6.28: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (8 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 09/12] linux 2.6.27: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 11/12] linux 2.6.29: " Marcin Juszkiewicz
` (4 subsequent siblings)
14 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.28.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.28.bb b/recipes/linux/linux_2.6.28.bb
index 98cf191..05fe815 100644
--- a/recipes/linux/linux_2.6.28.bb
+++ b/recipes/linux/linux_2.6.28.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r12"
+PR = "r13"
# Mark archs/machines that this kernel supports
DEFAULT_PREFERENCE = "-1"
@@ -14,6 +14,7 @@ DEFAULT_PREFERENCE_wrap = "1"
DEFAULT_PREFERENCE_tx27 = "1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.28.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.10.bz2;patch=1 \
file://defconfig"
SRC_URI_append_at91sam9263ek = " \
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 11/12] linux 2.6.29: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (9 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 10/12] linux 2.6.28: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-21 12:26 ` [PATCH] linux 2.6.29: removed two patches from micro2440 which are also in stable update Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Marcin Juszkiewicz
` (3 subsequent siblings)
14 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.29.bb | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/recipes/linux/linux_2.6.29.bb b/recipes/linux/linux_2.6.29.bb
index f6e65d7..55e36b4 100644
--- a/recipes/linux/linux_2.6.29.bb
+++ b/recipes/linux/linux_2.6.29.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r9"
+PR = "r10"
S = "${WORKDIR}/linux-2.6.29"
@@ -19,6 +19,7 @@ DEFAULT_PREFERENCE_tqm8540 = "1"
DEFAULT_PREFERENCE_stamp9g20evb = "1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.29.tar.bz2 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.6.bz2;patch=1 \
file://defconfig"
SRC_URI_append_boc01 = "\
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH 12/12] linux 2.6.31: update to latest stable patch
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (10 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 11/12] linux 2.6.29: " Marcin Juszkiewicz
@ 2009-10-19 15:29 ` Marcin Juszkiewicz
2009-10-23 9:01 ` [PATCH] linux: 2.6.31 -> 2.6.31.5 update Marcin Juszkiewicz
2009-10-23 18:01 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Michael Smith
2009-10-19 16:28 ` Updating kernels to latest stable versions Stanislav Brabec
` (2 subsequent siblings)
14 siblings, 2 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 15:29 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.31.bb | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/recipes/linux/linux_2.6.31.bb b/recipes/linux/linux_2.6.31.bb
index 455ff19..ffefe60 100644
--- a/recipes/linux/linux_2.6.31.bb
+++ b/recipes/linux/linux_2.6.31.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r2"
+PR = "r3"
S = "${WORKDIR}/linux-${PV}"
@@ -9,7 +9,7 @@ DEFAULT_PREFERENCE = "-1"
DEFAULT_PREFERENCE_db1200 = "1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
- ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.3.bz2;patch=1 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.4.bz2;patch=1 \
http://maxim.org.za/AT91RM9200/2.6/2.6.31-at91.patch.gz;patch=1 \
file://defconfig"
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (11 preceding siblings ...)
2009-10-19 15:29 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-19 16:28 ` Stanislav Brabec
2009-10-19 16:47 ` Marcin Juszkiewicz
2009-10-19 16:49 ` Otavio Salvador
2009-10-19 17:52 ` Khem Raj
2009-12-14 10:56 ` Marcin Juszkiewicz
14 siblings, 2 replies; 38+ messages in thread
From: Stanislav Brabec @ 2009-10-19 16:28 UTC (permalink / raw)
To: openembedded-devel
Marcin Juszkiewicz wrote:
> Some time ago I asked about updating kernels to recent stable patch.
> There was nearly no discussion on it so this time I am sending mails for
> review. Did not build checked those patches yet.
Maybe it should be 2.6.21.7-r0 instead of 2.6.21-r13 and do the same for
other patched recipes.
--
Stanislav Brabec
http://www.penguin.cz/~utx/zaurus
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 16:28 ` Updating kernels to latest stable versions Stanislav Brabec
@ 2009-10-19 16:47 ` Marcin Juszkiewicz
2009-10-19 20:08 ` Phil Blundell
2009-10-19 16:49 ` Otavio Salvador
1 sibling, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 16:47 UTC (permalink / raw)
To: openembedded-devel
Dnia poniedziałek, 19 października 2009 o 18:28:33 Stanislav Brabec
napisał(a):
> Maybe it should be 2.6.21.7-r0 instead of 2.6.21-r13 and do the same for
> other patched recipes.
No, your suggestion will break everyone who uses PREF_VER_linux = "2.6.21" for
example.
Regards,
--
JID: hrw@jabber.org
Website: http://marcin.juszkiewicz.com.pl/
LinkedIn: http://www.linkedin.com/in/marcinjuszkiewicz
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 16:28 ` Updating kernels to latest stable versions Stanislav Brabec
2009-10-19 16:47 ` Marcin Juszkiewicz
@ 2009-10-19 16:49 ` Otavio Salvador
1 sibling, 0 replies; 38+ messages in thread
From: Otavio Salvador @ 2009-10-19 16:49 UTC (permalink / raw)
To: openembedded-devel
Hello,
On Mon, Oct 19, 2009 at 2:28 PM, Stanislav Brabec <utx@penguin.cz> wrote:
> Marcin Juszkiewicz wrote:
>> Some time ago I asked about updating kernels to recent stable patch.
>> There was nearly no discussion on it so this time I am sending mails for
>> review. Did not build checked those patches yet.
>
> Maybe it should be 2.6.21.7-r0 instead of 2.6.21-r13 and do the same for
> other patched recipes.
Personally I prefer 2.6.21-r13 since it allows for easier updating of
machines and also do not demand a image change just to use it.
Since it is a stable update it ought to not include regressions as far
as the patches works.
--
Otavio Salvador O.S. Systems
E-mail: otavio@ossystems.com.br http://www.ossystems.com.br
Mobile: +55 53 9981-7854 http://projetos.ossystems.com.br
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (12 preceding siblings ...)
2009-10-19 16:28 ` Updating kernels to latest stable versions Stanislav Brabec
@ 2009-10-19 17:52 ` Khem Raj
2009-10-19 18:08 ` Mike Westerhof
2009-12-14 10:56 ` Marcin Juszkiewicz
14 siblings, 1 reply; 38+ messages in thread
From: Khem Raj @ 2009-10-19 17:52 UTC (permalink / raw)
To: openembedded-devel
On Mon, Oct 19, 2009 at 8:28 AM, Marcin Juszkiewicz
<marcin@juszkiewicz.com.pl> wrote:
> Some time ago I asked about updating kernels to recent stable patch.
> There was nearly no discussion on it so this time I am sending mails for
> review. Did not build checked those patches yet.
You might have to try builds because there could be some backported
patches already applied which now are in the new minor release. (I
haven't looked myself if there is such a case.)
>
>
>
>
> _______________________________________________
> Openembedded-devel mailing list
> Openembedded-devel@lists.openembedded.org
> http://lists.linuxtogo.org/cgi-bin/mailman/listinfo/openembedded-devel
>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 17:52 ` Khem Raj
@ 2009-10-19 18:08 ` Mike Westerhof
2009-10-19 18:18 ` Marcin Juszkiewicz
0 siblings, 1 reply; 38+ messages in thread
From: Mike Westerhof @ 2009-10-19 18:08 UTC (permalink / raw)
To: openembedded-devel
Khem Raj wrote:
> On Mon, Oct 19, 2009 at 8:28 AM, Marcin Juszkiewicz
> <marcin@juszkiewicz.com.pl> wrote:
>> Some time ago I asked about updating kernels to recent stable patch.
>> There was nearly no discussion on it so this time I am sending mails for
>> review. Did not build checked those patches yet.
>
> You might have to try builds because there could be some backported
> patches already applied which now are in the new minor release. (I
> haven't looked myself if there is such a case.)
There may also be platform-specific patches that won't apply with the
newer kernel updates, a possibility that is more likely with the
pressure to have maintainers use the generic kernel recipes instead of
just creating their own specific recipe.
And no, I haven't yet looked to see this will be a problem for the
ixp4xx kernels; that's not a big concern and I'll fix that myself should
it be necessary. I merely mention this to point out that we probably
need to move carefully on blanket updates to the kernel if we wish to
encourage use of the generic kernel recipes -- that's all. :)
Regards,
Mike (mwester)
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 18:08 ` Mike Westerhof
@ 2009-10-19 18:18 ` Marcin Juszkiewicz
2009-10-21 12:20 ` Marcin Juszkiewicz
0 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-19 18:18 UTC (permalink / raw)
To: openembedded-devel
Dnia poniedziałek, 19 października 2009 o 20:08:14 Mike Westerhof napisał(a):
> Khem Raj wrote:
> > You might have to try builds because there could be some backported
> > patches already applied which now are in the new minor release. (I
> > haven't looked myself if there is such a case.)
> There may also be platform-specific patches that won't apply with the
> newer kernel updates, a possibility that is more likely with the
> pressure to have maintainers use the generic kernel recipes instead of
> just creating their own specific recipe.
>
> And no, I haven't yet looked to see this will be a problem for the
> ixp4xx kernels; that's not a big concern and I'll fix that myself should
> it be necessary. I merely mention this to point out that we probably
> need to move carefully on blanket updates to the kernel if we wish to
> encourage use of the generic kernel recipes -- that's all. :)
And those reasons are why I did not pushed any of those this time (before I
updated .30 and .31). Tomorrow will do some build testing.
Regards,
--
JID: hrw@jabber.org
Website: http://marcin.juszkiewicz.com.pl/
LinkedIn: http://www.linkedin.com/in/marcinjuszkiewicz
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 16:47 ` Marcin Juszkiewicz
@ 2009-10-19 20:08 ` Phil Blundell
2009-10-20 12:45 ` Otavio Salvador
0 siblings, 1 reply; 38+ messages in thread
From: Phil Blundell @ 2009-10-19 20:08 UTC (permalink / raw)
To: openembedded-devel
On Mon, 2009-10-19 at 18:47 +0200, Marcin Juszkiewicz wrote:
> Dnia poniedziałek, 19 października 2009 o 18:28:33 Stanislav Brabec
> napisał(a):
>
> > Maybe it should be 2.6.21.7-r0 instead of 2.6.21-r13 and do the same for
> > other patched recipes.
>
> No, your suggestion will break everyone who uses PREF_VER_linux = "2.6.21" for
> example.
I guess that depends on your definition of "breaks". One way of looking
at it is that folks who set PREFERRED_VERSION_linux = "2.6.21" are
wanting the actual version 2.6.21, not 2.6.21.7, and might be dismayed
to find that they have suddenly been bumped to the latter with no way to
get back to what they had before.
2.6.21.7 clearly is a new upstream release and I would have thought that
it is appropriate to reflect that in PV rather than hiding it behind a
PR increment.
p.
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 20:08 ` Phil Blundell
@ 2009-10-20 12:45 ` Otavio Salvador
0 siblings, 0 replies; 38+ messages in thread
From: Otavio Salvador @ 2009-10-20 12:45 UTC (permalink / raw)
To: openembedded-devel
Hello Phil,
On Mon, Oct 19, 2009 at 6:08 PM, Phil Blundell <philb@gnu.org> wrote:
> On Mon, 2009-10-19 at 18:47 +0200, Marcin Juszkiewicz wrote:
>> Dnia poniedziałek, 19 października 2009 o 18:28:33 Stanislav Brabec
>> napisał(a):
>>
>> > Maybe it should be 2.6.21.7-r0 instead of 2.6.21-r13 and do the same for
>> > other patched recipes.
>>
>> No, your suggestion will break everyone who uses PREF_VER_linux = "2.6.21" for
>> example.
>
> I guess that depends on your definition of "breaks". One way of looking
> at it is that folks who set PREFERRED_VERSION_linux = "2.6.21" are
> wanting the actual version 2.6.21, not 2.6.21.7, and might be dismayed
> to find that they have suddenly been bumped to the latter with no way to
> get back to what they had before.
>
> 2.6.21.7 clearly is a new upstream release and I would have thought that
> it is appropriate to reflect that in PV rather than hiding it behind a
> PR increment.
Well, it is mostly bugfixes into a release. It is the same thing we
do when we apply a patch in a package and bump the revision. The only
difference is that there's a team doing it outside of OE and this
patches are huge.
It is NOT a upstream release; it is a stable release being "updated"
to fix critical bugs and regressions.
--
Otavio Salvador O.S. Systems
E-mail: otavio@ossystems.com.br http://www.ossystems.com.br
Mobile: +55 53 9981-7854 http://projetos.ossystems.com.br
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH] linux 2.6.23: keep sched-cfs locally updated to 2.6.23.17
2009-10-19 15:29 ` [PATCH 05/12] linux 2.6.23: " Marcin Juszkiewicz
@ 2009-10-21 10:23 ` Marcin Juszkiewicz
2009-10-21 10:30 ` [PATCH] linux 2.6.23: added time.h.patch to make it build for ARM Marcin Juszkiewicz
0 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 10:23 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
.../linux-2.6.23/sched-cfs-v2.6.23.12-v24.1.patch | 8567 ++++++++++++++++++++
recipes/linux/linux_2.6.23.bb | 4 +-
2 files changed, 8569 insertions(+), 2 deletions(-)
create mode 100644 recipes/linux/linux-2.6.23/sched-cfs-v2.6.23.12-v24.1.patch
diff --git a/recipes/linux/linux-2.6.23/sched-cfs-v2.6.23.12-v24.1.patch b/recipes/linux/linux-2.6.23/sched-cfs-v2.6.23.12-v24.1.patch
new file mode 100644
index 0000000..77ee5c8
--- /dev/null
+++ b/recipes/linux/linux-2.6.23/sched-cfs-v2.6.23.12-v24.1.patch
@@ -0,0 +1,8567 @@
+---
+ Documentation/sched-design-CFS.txt | 67 +
+ Makefile | 2
+ arch/i386/Kconfig | 11
+ drivers/kvm/kvm.h | 10
+ fs/pipe.c | 9
+ fs/proc/array.c | 21
+ fs/proc/base.c | 2
+ fs/proc/proc_misc.c | 15
+ include/linux/cgroup.h | 12
+ include/linux/cpuset.h | 5
+ include/linux/kernel.h | 7
+ include/linux/kernel_stat.h | 3
+ include/linux/nodemask.h | 94 +
+ include/linux/sched.h | 174 ++
+ include/linux/taskstats.h | 7
+ include/linux/topology.h | 5
+ init/Kconfig | 26
+ init/main.c | 3
+ kernel/delayacct.c | 8
+ kernel/exit.c | 6
+ kernel/fork.c | 5
+ kernel/ksysfs.c | 8
+ kernel/sched.c | 2310 +++++++++++++++++++++++--------------
+ kernel/sched_debug.c | 289 +++-
+ kernel/sched_fair.c | 885 ++++++--------
+ kernel/sched_idletask.c | 26
+ kernel/sched_rt.c | 54
+ kernel/sched_stats.h | 40
+ kernel/sysctl.c | 40
+ kernel/timer.c | 7
+ kernel/tsacct.c | 4
+ kernel/user.c | 249 +++
+ mm/memory_hotplug.c | 7
+ mm/page_alloc.c | 50
+ mm/vmscan.c | 4
+ net/unix/af_unix.c | 4
+ 36 files changed, 2883 insertions(+), 1586 deletions(-)
+
+--- linux-2.6.23.orig/Documentation/sched-design-CFS.txt
++++ linux-2.6.23/Documentation/sched-design-CFS.txt
+@@ -115,5 +115,72 @@ Some implementation details:
+ - reworked/sanitized SMP load-balancing: the runqueue-walking
+ assumptions are gone from the load-balancing code now, and
+ iterators of the scheduling modules are used. The balancing code got
+ quite a bit simpler as a result.
+
++
++Group scheduler extension to CFS
++================================
++
++Normally the scheduler operates on individual tasks and strives to provide
++fair CPU time to each task. Sometimes, it may be desirable to group tasks
++and provide fair CPU time to each such task group. For example, it may
++be desirable to first provide fair CPU time to each user on the system
++and then to each task belonging to a user.
++
++CONFIG_FAIR_GROUP_SCHED strives to achieve exactly that. It lets
++SCHED_NORMAL/BATCH tasks be be grouped and divides CPU time fairly among such
++groups. At present, there are two (mutually exclusive) mechanisms to group
++tasks for CPU bandwidth control purpose:
++
++ - Based on user id (CONFIG_FAIR_USER_SCHED)
++ In this option, tasks are grouped according to their user id.
++ - Based on "cgroup" pseudo filesystem (CONFIG_FAIR_CGROUP_SCHED)
++ This options lets the administrator create arbitrary groups
++ of tasks, using the "cgroup" pseudo filesystem. See
++ Documentation/cgroups.txt for more information about this
++ filesystem.
++
++Only one of these options to group tasks can be chosen and not both.
++
++Group scheduler tunables:
++
++When CONFIG_FAIR_USER_SCHED is defined, a directory is created in sysfs for
++each new user and a "cpu_share" file is added in that directory.
++
++ # cd /sys/kernel/uids
++ # cat 512/cpu_share # Display user 512's CPU share
++ 1024
++ # echo 2048 > 512/cpu_share # Modify user 512's CPU share
++ # cat 512/cpu_share # Display user 512's CPU share
++ 2048
++ #
++
++CPU bandwidth between two users are divided in the ratio of their CPU shares.
++For ex: if you would like user "root" to get twice the bandwidth of user
++"guest", then set the cpu_share for both the users such that "root"'s
++cpu_share is twice "guest"'s cpu_share
++
++
++When CONFIG_FAIR_CGROUP_SCHED is defined, a "cpu.shares" file is created
++for each group created using the pseudo filesystem. See example steps
++below to create task groups and modify their CPU share using the "cgroups"
++pseudo filesystem
++
++ # mkdir /dev/cpuctl
++ # mount -t cgroup -ocpu none /dev/cpuctl
++ # cd /dev/cpuctl
++
++ # mkdir multimedia # create "multimedia" group of tasks
++ # mkdir browser # create "browser" group of tasks
++
++ # #Configure the multimedia group to receive twice the CPU bandwidth
++ # #that of browser group
++
++ # echo 2048 > multimedia/cpu.shares
++ # echo 1024 > browser/cpu.shares
++
++ # firefox & # Launch firefox and move it to "browser" group
++ # echo <firefox_pid> > browser/tasks
++
++ # #Launch gmplayer (or your favourite movie player)
++ # echo <movie_player_pid> > multimedia/tasks
+--- linux-2.6.23.orig/Makefile
++++ linux-2.6.23/Makefile
+@@ -1,9 +1,9 @@
+ VERSION = 2
+ PATCHLEVEL = 6
+ SUBLEVEL = 23
+-EXTRAVERSION = .17
++EXTRAVERSION = .17-cfs-v24.1
+ NAME = Arr Matey! A Hairy Bilge Rat!
+
+ # *DOCUMENTATION*
+ # To see a list of typical targets execute "make help"
+ # More info can be located in ./README
+--- linux-2.6.23.orig/arch/i386/Kconfig
++++ linux-2.6.23/arch/i386/Kconfig
+@@ -212,10 +212,21 @@ config X86_ES7000
+ Only choose this option if you have such a system, otherwise you
+ should say N here.
+
+ endchoice
+
++config SCHED_NO_NO_OMIT_FRAME_POINTER
++ bool "Single-depth WCHAN output"
++ default y
++ help
++ Calculate simpler /proc/<PID>/wchan values. If this option
++ is disabled then wchan values will recurse back to the
++ caller function. This provides more accurate wchan values,
++ at the expense of slightly more scheduling overhead.
++
++ If in doubt, say "Y".
++
+ config PARAVIRT
+ bool "Paravirtualization support (EXPERIMENTAL)"
+ depends on EXPERIMENTAL
+ depends on !(X86_VISWS || X86_VOYAGER)
+ help
+--- linux-2.6.23.orig/drivers/kvm/kvm.h
++++ linux-2.6.23/drivers/kvm/kvm.h
+@@ -623,10 +623,20 @@ void __kvm_mmu_free_some_pages(struct kv
+ int kvm_mmu_load(struct kvm_vcpu *vcpu);
+ void kvm_mmu_unload(struct kvm_vcpu *vcpu);
+
+ int kvm_hypercall(struct kvm_vcpu *vcpu, struct kvm_run *run);
+
++static inline void kvm_guest_enter(void)
++{
++ current->flags |= PF_VCPU;
++}
++
++static inline void kvm_guest_exit(void)
++{
++ current->flags &= ~PF_VCPU;
++}
++
+ static inline int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva,
+ u32 error_code)
+ {
+ return vcpu->mmu.page_fault(vcpu, gva, error_code);
+ }
+--- linux-2.6.23.orig/fs/pipe.c
++++ linux-2.6.23/fs/pipe.c
+@@ -43,12 +43,11 @@ void pipe_wait(struct pipe_inode_info *p
+
+ /*
+ * Pipes are system-local resources, so sleeping on them
+ * is considered a noninteractive wait:
+ */
+- prepare_to_wait(&pipe->wait, &wait,
+- TASK_INTERRUPTIBLE | TASK_NONINTERACTIVE);
++ prepare_to_wait(&pipe->wait, &wait, TASK_INTERRUPTIBLE);
+ if (pipe->inode)
+ mutex_unlock(&pipe->inode->i_mutex);
+ schedule();
+ finish_wait(&pipe->wait, &wait);
+ if (pipe->inode)
+@@ -381,11 +380,11 @@ redo:
+ }
+ mutex_unlock(&inode->i_mutex);
+
+ /* Signal writers asynchronously that there is more room. */
+ if (do_wakeup) {
+- wake_up_interruptible(&pipe->wait);
++ wake_up_interruptible_sync(&pipe->wait);
+ kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
+ }
+ if (ret > 0)
+ file_accessed(filp);
+ return ret;
+@@ -554,11 +553,11 @@ redo2:
+ pipe->waiting_writers--;
+ }
+ out:
+ mutex_unlock(&inode->i_mutex);
+ if (do_wakeup) {
+- wake_up_interruptible(&pipe->wait);
++ wake_up_interruptible_sync(&pipe->wait);
+ kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
+ }
+ if (ret > 0)
+ file_update_time(filp);
+ return ret;
+@@ -648,11 +647,11 @@ pipe_release(struct inode *inode, int de
+ pipe->writers -= decw;
+
+ if (!pipe->readers && !pipe->writers) {
+ free_pipe_info(inode);
+ } else {
+- wake_up_interruptible(&pipe->wait);
++ wake_up_interruptible_sync(&pipe->wait);
+ kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
+ kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
+ }
+ mutex_unlock(&inode->i_mutex);
+
+--- linux-2.6.23.orig/fs/proc/array.c
++++ linux-2.6.23/fs/proc/array.c
+@@ -365,15 +365,22 @@ static cputime_t task_stime(struct task_
+ * grows monotonically - apps rely on that):
+ */
+ stime = nsec_to_clock_t(p->se.sum_exec_runtime) -
+ cputime_to_clock_t(task_utime(p));
+
+- p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
++ if (stime >= 0)
++ p->prev_stime = max(p->prev_stime, clock_t_to_cputime(stime));
++
+ return p->prev_stime;
+ }
+ #endif
+
++static cputime_t task_gtime(struct task_struct *p)
++{
++ return p->gtime;
++}
++
+ static int do_task_stat(struct task_struct *task, char *buffer, int whole)
+ {
+ unsigned long vsize, eip, esp, wchan = ~0UL;
+ long priority, nice;
+ int tty_pgrp = -1, tty_nr = 0;
+@@ -385,10 +392,11 @@ static int do_task_stat(struct task_stru
+ struct mm_struct *mm;
+ unsigned long long start_time;
+ unsigned long cmin_flt = 0, cmaj_flt = 0;
+ unsigned long min_flt = 0, maj_flt = 0;
+ cputime_t cutime, cstime, utime, stime;
++ cputime_t cgtime, gtime;
+ unsigned long rsslim = 0;
+ char tcomm[sizeof(task->comm)];
+ unsigned long flags;
+
+ state = *get_task_state(task);
+@@ -403,10 +411,11 @@ static int do_task_stat(struct task_stru
+ get_task_comm(tcomm, task);
+
+ sigemptyset(&sigign);
+ sigemptyset(&sigcatch);
+ cutime = cstime = utime = stime = cputime_zero;
++ cgtime = gtime = cputime_zero;
+
+ rcu_read_lock();
+ if (lock_task_sighand(task, &flags)) {
+ struct signal_struct *sig = task->signal;
+
+@@ -420,27 +429,30 @@ static int do_task_stat(struct task_stru
+
+ cmin_flt = sig->cmin_flt;
+ cmaj_flt = sig->cmaj_flt;
+ cutime = sig->cutime;
+ cstime = sig->cstime;
++ cgtime = sig->cgtime;
+ rsslim = sig->rlim[RLIMIT_RSS].rlim_cur;
+
+ /* add up live thread stats at the group level */
+ if (whole) {
+ struct task_struct *t = task;
+ do {
+ min_flt += t->min_flt;
+ maj_flt += t->maj_flt;
+ utime = cputime_add(utime, task_utime(t));
+ stime = cputime_add(stime, task_stime(t));
++ gtime = cputime_add(gtime, task_gtime(t));
+ t = next_thread(t);
+ } while (t != task);
+
+ min_flt += sig->min_flt;
+ maj_flt += sig->maj_flt;
+ utime = cputime_add(utime, sig->utime);
+ stime = cputime_add(stime, sig->stime);
++ gtime = cputime_add(gtime, sig->gtime);
+ }
+
+ sid = signal_session(sig);
+ pgid = process_group(task);
+ ppid = rcu_dereference(task->real_parent)->tgid;
+@@ -454,10 +466,11 @@ static int do_task_stat(struct task_stru
+ if (!whole) {
+ min_flt = task->min_flt;
+ maj_flt = task->maj_flt;
+ utime = task_utime(task);
+ stime = task_stime(task);
++ gtime = task_gtime(task);
+ }
+
+ /* scale priority and nice values from timeslices to -20..20 */
+ /* to make it look like a "normal" Unix priority/nice value */
+ priority = task_prio(task);
+@@ -471,11 +484,11 @@ static int do_task_stat(struct task_stru
+ /* convert nsec -> ticks */
+ start_time = nsec_to_clock_t(start_time);
+
+ res = sprintf(buffer, "%d (%s) %c %d %d %d %d %d %u %lu \
+ %lu %lu %lu %lu %lu %ld %ld %ld %ld %d 0 %llu %lu %ld %lu %lu %lu %lu %lu \
+-%lu %lu %lu %lu %lu %lu %lu %lu %d %d %u %u %llu\n",
++%lu %lu %lu %lu %lu %lu %lu %lu %d %d %u %u %llu %lu %ld\n",
+ task->pid,
+ tcomm,
+ state,
+ ppid,
+ pgid,
+@@ -516,11 +529,13 @@ static int do_task_stat(struct task_stru
+ 0UL,
+ task->exit_signal,
+ task_cpu(task),
+ task->rt_priority,
+ task->policy,
+- (unsigned long long)delayacct_blkio_ticks(task));
++ (unsigned long long)delayacct_blkio_ticks(task),
++ cputime_to_clock_t(gtime),
++ cputime_to_clock_t(cgtime));
+ if (mm)
+ mmput(mm);
+ return res;
+ }
+
+--- linux-2.6.23.orig/fs/proc/base.c
++++ linux-2.6.23/fs/proc/base.c
+@@ -302,11 +302,11 @@ static int proc_pid_wchan(struct task_st
+ static int proc_pid_schedstat(struct task_struct *task, char *buffer)
+ {
+ return sprintf(buffer, "%llu %llu %lu\n",
+ task->sched_info.cpu_time,
+ task->sched_info.run_delay,
+- task->sched_info.pcnt);
++ task->sched_info.pcount);
+ }
+ #endif
+
+ /* The badness from the OOM killer */
+ unsigned long badness(struct task_struct *p, unsigned long uptime);
+--- linux-2.6.23.orig/fs/proc/proc_misc.c
++++ linux-2.6.23/fs/proc/proc_misc.c
+@@ -441,20 +441,22 @@ static const struct file_operations proc
+ static int show_stat(struct seq_file *p, void *v)
+ {
+ int i;
+ unsigned long jif;
+ cputime64_t user, nice, system, idle, iowait, irq, softirq, steal;
++ cputime64_t guest;
+ u64 sum = 0;
+ struct timespec boottime;
+ unsigned int *per_irq_sum;
+
+ per_irq_sum = kzalloc(sizeof(unsigned int)*NR_IRQS, GFP_KERNEL);
+ if (!per_irq_sum)
+ return -ENOMEM;
+
+ user = nice = system = idle = iowait =
+ irq = softirq = steal = cputime64_zero;
++ guest = cputime64_zero;
+ getboottime(&boottime);
+ jif = boottime.tv_sec;
+
+ for_each_possible_cpu(i) {
+ int j;
+@@ -465,26 +467,28 @@ static int show_stat(struct seq_file *p,
+ idle = cputime64_add(idle, kstat_cpu(i).cpustat.idle);
+ iowait = cputime64_add(iowait, kstat_cpu(i).cpustat.iowait);
+ irq = cputime64_add(irq, kstat_cpu(i).cpustat.irq);
+ softirq = cputime64_add(softirq, kstat_cpu(i).cpustat.softirq);
+ steal = cputime64_add(steal, kstat_cpu(i).cpustat.steal);
++ guest = cputime64_add(guest, kstat_cpu(i).cpustat.guest);
+ for (j = 0; j < NR_IRQS; j++) {
+ unsigned int temp = kstat_cpu(i).irqs[j];
+ sum += temp;
+ per_irq_sum[j] += temp;
+ }
+ }
+
+- seq_printf(p, "cpu %llu %llu %llu %llu %llu %llu %llu %llu\n",
++ seq_printf(p, "cpu %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
+ (unsigned long long)cputime64_to_clock_t(user),
+ (unsigned long long)cputime64_to_clock_t(nice),
+ (unsigned long long)cputime64_to_clock_t(system),
+ (unsigned long long)cputime64_to_clock_t(idle),
+ (unsigned long long)cputime64_to_clock_t(iowait),
+ (unsigned long long)cputime64_to_clock_t(irq),
+ (unsigned long long)cputime64_to_clock_t(softirq),
+- (unsigned long long)cputime64_to_clock_t(steal));
++ (unsigned long long)cputime64_to_clock_t(steal),
++ (unsigned long long)cputime64_to_clock_t(guest));
+ for_each_online_cpu(i) {
+
+ /* Copy values here to work around gcc-2.95.3, gcc-2.96 */
+ user = kstat_cpu(i).cpustat.user;
+ nice = kstat_cpu(i).cpustat.nice;
+@@ -492,20 +496,23 @@ static int show_stat(struct seq_file *p,
+ idle = kstat_cpu(i).cpustat.idle;
+ iowait = kstat_cpu(i).cpustat.iowait;
+ irq = kstat_cpu(i).cpustat.irq;
+ softirq = kstat_cpu(i).cpustat.softirq;
+ steal = kstat_cpu(i).cpustat.steal;
+- seq_printf(p, "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu\n",
++ guest = kstat_cpu(i).cpustat.guest;
++ seq_printf(p,
++ "cpu%d %llu %llu %llu %llu %llu %llu %llu %llu %llu\n",
+ i,
+ (unsigned long long)cputime64_to_clock_t(user),
+ (unsigned long long)cputime64_to_clock_t(nice),
+ (unsigned long long)cputime64_to_clock_t(system),
+ (unsigned long long)cputime64_to_clock_t(idle),
+ (unsigned long long)cputime64_to_clock_t(iowait),
+ (unsigned long long)cputime64_to_clock_t(irq),
+ (unsigned long long)cputime64_to_clock_t(softirq),
+- (unsigned long long)cputime64_to_clock_t(steal));
++ (unsigned long long)cputime64_to_clock_t(steal),
++ (unsigned long long)cputime64_to_clock_t(guest));
+ }
+ seq_printf(p, "intr %llu", (unsigned long long)sum);
+
+ #ifndef CONFIG_SMP
+ /* Touches too many cache lines on SMP setups */
+--- /dev/null
++++ linux-2.6.23/include/linux/cgroup.h
+@@ -0,0 +1,12 @@
++#ifndef _LINUX_CGROUP_H
++#define _LINUX_CGROUP_H
++
++/*
++ * Control groups are not backported - we use a few compatibility
++ * defines to be able to use the upstream sched.c as-is:
++ */
++#define task_pid_nr(task) (task)->pid
++#define task_pid_vnr(task) (task)->pid
++#define find_task_by_vpid(pid) find_task_by_pid(pid)
++
++#endif
+--- linux-2.6.23.orig/include/linux/cpuset.h
++++ linux-2.6.23/include/linux/cpuset.h
+@@ -144,8 +144,13 @@ static inline int cpuset_do_slab_mem_spr
+ return 0;
+ }
+
+ static inline void cpuset_track_online_nodes(void) {}
+
++static inline cpumask_t cpuset_cpus_allowed_locked(struct task_struct *p)
++{
++ return cpu_possible_map;
++}
++
+ #endif /* !CONFIG_CPUSETS */
+
+ #endif /* _LINUX_CPUSET_H */
+--- linux-2.6.23.orig/include/linux/kernel.h
++++ linux-2.6.23/include/linux/kernel.h
+@@ -59,10 +59,17 @@ extern const char linux_proc_banner[];
+ #define KERN_WARNING "<4>" /* warning conditions */
+ #define KERN_NOTICE "<5>" /* normal but significant condition */
+ #define KERN_INFO "<6>" /* informational */
+ #define KERN_DEBUG "<7>" /* debug-level messages */
+
++/*
++ * Annotation for a "continued" line of log printout (only done after a
++ * line that had no enclosing \n). Only to be used by core/arch code
++ * during early bootup (a continued line is not SMP-safe otherwise).
++ */
++#define KERN_CONT ""
++
+ extern int console_printk[];
+
+ #define console_loglevel (console_printk[0])
+ #define default_message_loglevel (console_printk[1])
+ #define minimum_console_loglevel (console_printk[2])
+--- linux-2.6.23.orig/include/linux/kernel_stat.h
++++ linux-2.6.23/include/linux/kernel_stat.h
+@@ -21,10 +21,11 @@ struct cpu_usage_stat {
+ cputime64_t softirq;
+ cputime64_t irq;
+ cputime64_t idle;
+ cputime64_t iowait;
+ cputime64_t steal;
++ cputime64_t guest;
+ };
+
+ struct kernel_stat {
+ struct cpu_usage_stat cpustat;
+ unsigned int irqs[NR_IRQS];
+@@ -50,9 +51,11 @@ static inline int kstat_irqs(int irq)
+
+ return sum;
+ }
+
+ extern void account_user_time(struct task_struct *, cputime_t);
++extern void account_user_time_scaled(struct task_struct *, cputime_t);
+ extern void account_system_time(struct task_struct *, int, cputime_t);
++extern void account_system_time_scaled(struct task_struct *, cputime_t);
+ extern void account_steal_time(struct task_struct *, cputime_t);
+
+ #endif /* _LINUX_KERNEL_STAT_H */
+--- linux-2.6.23.orig/include/linux/nodemask.h
++++ linux-2.6.23/include/linux/nodemask.h
+@@ -336,46 +336,108 @@ static inline void __nodes_remap(nodemas
+ if (!nodes_empty(mask)) \
+ for ((node) = 0; (node) < 1; (node)++)
+ #endif /* MAX_NUMNODES */
+
+ /*
++ * Bitmasks that are kept for all the nodes.
++ */
++enum node_states {
++ N_POSSIBLE, /* The node could become online at some point */
++ N_ONLINE, /* The node is online */
++ N_NORMAL_MEMORY, /* The node has regular memory */
++#ifdef CONFIG_HIGHMEM
++ N_HIGH_MEMORY, /* The node has regular or high memory */
++#else
++ N_HIGH_MEMORY = N_NORMAL_MEMORY,
++#endif
++ N_CPU, /* The node has one or more cpus */
++ NR_NODE_STATES
++};
++
++/*
+ * The following particular system nodemasks and operations
+ * on them manage all possible and online nodes.
+ */
+
+-extern nodemask_t node_online_map;
+-extern nodemask_t node_possible_map;
++extern nodemask_t node_states[NR_NODE_STATES];
+
+ #if MAX_NUMNODES > 1
+-#define num_online_nodes() nodes_weight(node_online_map)
+-#define num_possible_nodes() nodes_weight(node_possible_map)
+-#define node_online(node) node_isset((node), node_online_map)
+-#define node_possible(node) node_isset((node), node_possible_map)
+-#define first_online_node first_node(node_online_map)
+-#define next_online_node(nid) next_node((nid), node_online_map)
++static inline int node_state(int node, enum node_states state)
++{
++ return node_isset(node, node_states[state]);
++}
++
++static inline void node_set_state(int node, enum node_states state)
++{
++ __node_set(node, &node_states[state]);
++}
++
++static inline void node_clear_state(int node, enum node_states state)
++{
++ __node_clear(node, &node_states[state]);
++}
++
++static inline int num_node_state(enum node_states state)
++{
++ return nodes_weight(node_states[state]);
++}
++
++#define for_each_node_state(__node, __state) \
++ for_each_node_mask((__node), node_states[__state])
++
++#define first_online_node first_node(node_states[N_ONLINE])
++#define next_online_node(nid) next_node((nid), node_states[N_ONLINE])
++
+ extern int nr_node_ids;
+ #else
+-#define num_online_nodes() 1
+-#define num_possible_nodes() 1
+-#define node_online(node) ((node) == 0)
+-#define node_possible(node) ((node) == 0)
++
++static inline int node_state(int node, enum node_states state)
++{
++ return node == 0;
++}
++
++static inline void node_set_state(int node, enum node_states state)
++{
++}
++
++static inline void node_clear_state(int node, enum node_states state)
++{
++}
++
++static inline int num_node_state(enum node_states state)
++{
++ return 1;
++}
++
++#define for_each_node_state(node, __state) \
++ for ( (node) = 0; (node) == 0; (node) = 1)
++
+ #define first_online_node 0
+ #define next_online_node(nid) (MAX_NUMNODES)
+ #define nr_node_ids 1
++
+ #endif
+
++#define node_online_map node_states[N_ONLINE]
++#define node_possible_map node_states[N_POSSIBLE]
++
+ #define any_online_node(mask) \
+ ({ \
+ int node; \
+ for_each_node_mask(node, (mask)) \
+ if (node_online(node)) \
+ break; \
+ node; \
+ })
+
+-#define node_set_online(node) set_bit((node), node_online_map.bits)
+-#define node_set_offline(node) clear_bit((node), node_online_map.bits)
++#define num_online_nodes() num_node_state(N_ONLINE)
++#define num_possible_nodes() num_node_state(N_POSSIBLE)
++#define node_online(node) node_state((node), N_ONLINE)
++#define node_possible(node) node_state((node), N_POSSIBLE)
++
++#define node_set_online(node) node_set_state((node), N_ONLINE)
++#define node_set_offline(node) node_clear_state((node), N_ONLINE)
+
+-#define for_each_node(node) for_each_node_mask((node), node_possible_map)
+-#define for_each_online_node(node) for_each_node_mask((node), node_online_map)
++#define for_each_node(node) for_each_node_state(node, N_POSSIBLE)
++#define for_each_online_node(node) for_each_node_state(node, N_ONLINE)
+
+ #endif /* __LINUX_NODEMASK_H */
+--- linux-2.6.23.orig/include/linux/sched.h
++++ linux-2.6.23/include/linux/sched.h
+@@ -1,10 +1,21 @@
+ #ifndef _LINUX_SCHED_H
+ #define _LINUX_SCHED_H
+
+ #include <linux/auxvec.h> /* For AT_VECTOR_SIZE */
+
++/* backporting helper macro: */
++#define cpu_sibling_map(cpu) cpu_sibling_map[cpu]
++
++/*
++ * * Control groups are not backported - we use a few compatibility
++ * * defines to be able to use the upstream sched.c as-is:
++ * */
++#define task_pid_nr(task) (task)->pid
++#define task_pid_vnr(task) (task)->pid
++#define find_task_by_vpid(pid) find_task_by_pid(pid)
++
+ /*
+ * cloning flags:
+ */
+ #define CSIGNAL 0x000000ff /* signal mask to be sent at exit */
+ #define CLONE_VM 0x00000100 /* set if VM shared between processes */
+@@ -84,10 +95,11 @@ struct sched_param {
+ #include <linux/param.h>
+ #include <linux/resource.h>
+ #include <linux/timer.h>
+ #include <linux/hrtimer.h>
+ #include <linux/task_io_accounting.h>
++#include <linux/kobject.h>
+
+ #include <asm/processor.h>
+
+ struct exec_domain;
+ struct futex_pi_state;
+@@ -133,10 +145,11 @@ extern unsigned long nr_active(void);
+ extern unsigned long nr_iowait(void);
+ extern unsigned long weighted_cpuload(const int cpu);
+
+ struct seq_file;
+ struct cfs_rq;
++struct task_group;
+ #ifdef CONFIG_SCHED_DEBUG
+ extern void proc_sched_show_task(struct task_struct *p, struct seq_file *m);
+ extern void proc_sched_set_task(struct task_struct *p);
+ extern void
+ print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq);
+@@ -171,12 +184,11 @@ print_cfs_rq(struct seq_file *m, int cpu
+ #define TASK_TRACED 8
+ /* in tsk->exit_state */
+ #define EXIT_ZOMBIE 16
+ #define EXIT_DEAD 32
+ /* in tsk->state again */
+-#define TASK_NONINTERACTIVE 64
+-#define TASK_DEAD 128
++#define TASK_DEAD 64
+
+ #define __set_task_state(tsk, state_value) \
+ do { (tsk)->state = (state_value); } while (0)
+ #define set_task_state(tsk, state_value) \
+ set_mb((tsk)->state, (state_value))
+@@ -276,10 +288,14 @@ static inline void touch_all_softlockup_
+ #endif
+
+
+ /* Attach to any functions which should be ignored in wchan output. */
+ #define __sched __attribute__((__section__(".sched.text")))
++
++/* Linker adds these: start and end of __sched functions */
++extern char __sched_text_start[], __sched_text_end[];
++
+ /* Is this address in the __sched functions? */
+ extern int in_sched_functions(unsigned long addr);
+
+ #define MAX_SCHEDULE_TIMEOUT LONG_MAX
+ extern signed long FASTCALL(schedule_timeout(signed long timeout));
+@@ -513,10 +529,12 @@ struct signal_struct {
+ * and for reaped dead child processes forked by this group.
+ * Live threads maintain their own counters and add to these
+ * in __exit_signal, except for the group leader.
+ */
+ cputime_t utime, stime, cutime, cstime;
++ cputime_t gtime;
++ cputime_t cgtime;
+ unsigned long nvcsw, nivcsw, cnvcsw, cnivcsw;
+ unsigned long min_flt, maj_flt, cmin_flt, cmaj_flt;
+ unsigned long inblock, oublock, cinblock, coublock;
+
+ /*
+@@ -593,12 +611,27 @@ struct user_struct {
+ #endif
+
+ /* Hash table maintenance information */
+ struct hlist_node uidhash_node;
+ uid_t uid;
++
++#ifdef CONFIG_FAIR_USER_SCHED
++ struct task_group *tg;
++#ifdef CONFIG_SYSFS
++ struct kset kset;
++ struct subsys_attribute user_attr;
++ struct work_struct work;
++#endif
++#endif
+ };
+
++#ifdef CONFIG_FAIR_USER_SCHED
++extern int uids_kobject_init(void);
++#else
++static inline int uids_kobject_init(void) { return 0; }
++#endif
++
+ extern struct user_struct *find_user(uid_t);
+
+ extern struct user_struct root_user;
+ #define INIT_USER (&root_user)
+
+@@ -606,17 +639,21 @@ struct backing_dev_info;
+ struct reclaim_state;
+
+ #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
+ struct sched_info {
+ /* cumulative counters */
+- unsigned long pcnt; /* # of times run on this cpu */
++ unsigned long pcount; /* # of times run on this cpu */
+ unsigned long long cpu_time, /* time spent on the cpu */
+ run_delay; /* time spent waiting on a runqueue */
+
+ /* timestamps */
+ unsigned long long last_arrival,/* when we last ran on a cpu */
+ last_queued; /* when we were last queued to run */
++#ifdef CONFIG_SCHEDSTATS
++ /* BKL stats */
++ unsigned int bkl_count;
++#endif
+ };
+ #endif /* defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT) */
+
+ #ifdef CONFIG_SCHEDSTATS
+ extern const struct file_operations proc_schedstat_operations;
+@@ -747,43 +784,42 @@ struct sched_domain {
+ unsigned int balance_interval; /* initialise to 1. units in ms. */
+ unsigned int nr_balance_failed; /* initialise to 0 */
+
+ #ifdef CONFIG_SCHEDSTATS
+ /* load_balance() stats */
+- unsigned long lb_cnt[CPU_MAX_IDLE_TYPES];
+- unsigned long lb_failed[CPU_MAX_IDLE_TYPES];
+- unsigned long lb_balanced[CPU_MAX_IDLE_TYPES];
+- unsigned long lb_imbalance[CPU_MAX_IDLE_TYPES];
+- unsigned long lb_gained[CPU_MAX_IDLE_TYPES];
+- unsigned long lb_hot_gained[CPU_MAX_IDLE_TYPES];
+- unsigned long lb_nobusyg[CPU_MAX_IDLE_TYPES];
+- unsigned long lb_nobusyq[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_count[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_failed[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_balanced[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_imbalance[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_gained[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_hot_gained[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_nobusyg[CPU_MAX_IDLE_TYPES];
++ unsigned int lb_nobusyq[CPU_MAX_IDLE_TYPES];
+
+ /* Active load balancing */
+- unsigned long alb_cnt;
+- unsigned long alb_failed;
+- unsigned long alb_pushed;
++ unsigned int alb_count;
++ unsigned int alb_failed;
++ unsigned int alb_pushed;
+
+ /* SD_BALANCE_EXEC stats */
+- unsigned long sbe_cnt;
+- unsigned long sbe_balanced;
+- unsigned long sbe_pushed;
++ unsigned int sbe_count;
++ unsigned int sbe_balanced;
++ unsigned int sbe_pushed;
+
+ /* SD_BALANCE_FORK stats */
+- unsigned long sbf_cnt;
+- unsigned long sbf_balanced;
+- unsigned long sbf_pushed;
++ unsigned int sbf_count;
++ unsigned int sbf_balanced;
++ unsigned int sbf_pushed;
+
+ /* try_to_wake_up() stats */
+- unsigned long ttwu_wake_remote;
+- unsigned long ttwu_move_affine;
+- unsigned long ttwu_move_balance;
++ unsigned int ttwu_wake_remote;
++ unsigned int ttwu_move_affine;
++ unsigned int ttwu_move_balance;
+ #endif
+ };
+
+-extern int partition_sched_domains(cpumask_t *partition1,
+- cpumask_t *partition2);
++extern void partition_sched_domains(int ndoms_new, cpumask_t *doms_new);
+
+ #endif /* CONFIG_SMP */
+
+ /*
+ * A runqueue laden with a single nice 0 task scores a weighted_cpuload of
+@@ -851,27 +887,32 @@ struct uts_namespace;
+
+ struct rq;
+ struct sched_domain;
+
+ struct sched_class {
+- struct sched_class *next;
++ const struct sched_class *next;
+
+ void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup);
+ void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep);
+- void (*yield_task) (struct rq *rq, struct task_struct *p);
++ void (*yield_task) (struct rq *rq);
+
+ void (*check_preempt_curr) (struct rq *rq, struct task_struct *p);
+
+ struct task_struct * (*pick_next_task) (struct rq *rq);
+ void (*put_prev_task) (struct rq *rq, struct task_struct *p);
+
++#ifdef CONFIG_SMP
+ unsigned long (*load_balance) (struct rq *this_rq, int this_cpu,
+- struct rq *busiest,
+- unsigned long max_nr_move, unsigned long max_load_move,
++ struct rq *busiest, unsigned long max_load_move,
+ struct sched_domain *sd, enum cpu_idle_type idle,
+ int *all_pinned, int *this_best_prio);
+
++ int (*move_one_task) (struct rq *this_rq, int this_cpu,
++ struct rq *busiest, struct sched_domain *sd,
++ enum cpu_idle_type idle);
++#endif
++
+ void (*set_curr_task) (struct rq *rq);
+ void (*task_tick) (struct rq *rq, struct task_struct *p);
+ void (*task_new) (struct rq *rq, struct task_struct *p);
+ };
+
+@@ -885,46 +926,52 @@ struct load_weight {
+ * Current field usage histogram:
+ *
+ * 4 se->block_start
+ * 4 se->run_node
+ * 4 se->sleep_start
+- * 4 se->sleep_start_fair
+ * 6 se->load.weight
+- * 7 se->delta_fair
+- * 15 se->wait_runtime
+ */
+ struct sched_entity {
+- long wait_runtime;
+- unsigned long delta_fair_run;
+- unsigned long delta_fair_sleep;
+- unsigned long delta_exec;
+- s64 fair_key;
+ struct load_weight load; /* for load-balancing */
+ struct rb_node run_node;
+ unsigned int on_rq;
+
+ u64 exec_start;
+ u64 sum_exec_runtime;
++ u64 vruntime;
+ u64 prev_sum_exec_runtime;
+- u64 wait_start_fair;
+- u64 sleep_start_fair;
+
+ #ifdef CONFIG_SCHEDSTATS
+ u64 wait_start;
+ u64 wait_max;
+- s64 sum_wait_runtime;
+
+ u64 sleep_start;
+ u64 sleep_max;
+ s64 sum_sleep_runtime;
+
+ u64 block_start;
+ u64 block_max;
+ u64 exec_max;
++ u64 slice_max;
+
+- unsigned long wait_runtime_overruns;
+- unsigned long wait_runtime_underruns;
++ u64 nr_migrations;
++ u64 nr_migrations_cold;
++ u64 nr_failed_migrations_affine;
++ u64 nr_failed_migrations_running;
++ u64 nr_failed_migrations_hot;
++ u64 nr_forced_migrations;
++ u64 nr_forced2_migrations;
++
++ u64 nr_wakeups;
++ u64 nr_wakeups_sync;
++ u64 nr_wakeups_migrate;
++ u64 nr_wakeups_local;
++ u64 nr_wakeups_remote;
++ u64 nr_wakeups_affine;
++ u64 nr_wakeups_affine_attempts;
++ u64 nr_wakeups_passive;
++ u64 nr_wakeups_idle;
+ #endif
+
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ struct sched_entity *parent;
+ /* rq on which this entity is (to be) queued: */
+@@ -949,11 +996,11 @@ struct task_struct {
+ #endif
+ #endif
+
+ int prio, static_prio, normal_prio;
+ struct list_head run_list;
+- struct sched_class *sched_class;
++ const struct sched_class *sched_class;
+ struct sched_entity se;
+
+ #ifdef CONFIG_PREEMPT_NOTIFIERS
+ /* list of struct preempt_notifier: */
+ struct hlist_head preempt_notifiers;
+@@ -1019,11 +1066,12 @@ struct task_struct {
+ struct completion *vfork_done; /* for vfork() */
+ int __user *set_child_tid; /* CLONE_CHILD_SETTID */
+ int __user *clear_child_tid; /* CLONE_CHILD_CLEARTID */
+
+ unsigned int rt_priority;
+- cputime_t utime, stime;
++ cputime_t utime, stime, utimescaled, stimescaled;
++ cputime_t gtime;
+ cputime_t prev_utime, prev_stime;
+ unsigned long nvcsw, nivcsw; /* context switch counts */
+ struct timespec start_time; /* monotonic time */
+ struct timespec real_start_time; /* boot based time */
+ /* mm fault and swap info: this can arguably be seen as either mm-specific or thread-specific */
+@@ -1312,10 +1360,11 @@ static inline void put_task_struct(struc
+ #define PF_ALIGNWARN 0x00000001 /* Print alignment warning msgs */
+ /* Not implemented yet, only for 486*/
+ #define PF_STARTING 0x00000002 /* being created */
+ #define PF_EXITING 0x00000004 /* getting shut down */
+ #define PF_EXITPIDONE 0x00000008 /* pi exit done on shut down */
++#define PF_VCPU 0x00000010 /* I'm a virtual CPU */
+ #define PF_FORKNOEXEC 0x00000040 /* forked but didn't exec */
+ #define PF_SUPERPRIV 0x00000100 /* used super-user privileges */
+ #define PF_DUMPCORE 0x00000200 /* dumped core */
+ #define PF_SIGNALED 0x00000400 /* killed by a signal */
+ #define PF_MEMALLOC 0x00000800 /* Allocating memory */
+@@ -1399,19 +1448,30 @@ extern void idle_task_exit(void);
+ static inline void idle_task_exit(void) {}
+ #endif
+
+ extern void sched_idle_next(void);
+
++#ifdef CONFIG_SCHED_DEBUG
+ extern unsigned int sysctl_sched_latency;
+ extern unsigned int sysctl_sched_min_granularity;
+ extern unsigned int sysctl_sched_wakeup_granularity;
+ extern unsigned int sysctl_sched_batch_wakeup_granularity;
+-extern unsigned int sysctl_sched_stat_granularity;
+-extern unsigned int sysctl_sched_runtime_limit;
+-extern unsigned int sysctl_sched_compat_yield;
+ extern unsigned int sysctl_sched_child_runs_first;
+ extern unsigned int sysctl_sched_features;
++extern unsigned int sysctl_sched_migration_cost;
++extern unsigned int sysctl_sched_nr_migrate;
++#ifdef CONFIG_FAIR_GROUP_SCHED
++extern unsigned int sysctl_sched_min_bal_int_shares;
++extern unsigned int sysctl_sched_max_bal_int_shares;
++#endif
++
++int sched_nr_latency_handler(struct ctl_table *table, int write,
++ struct file *file, void __user *buffer, size_t *length,
++ loff_t *ppos);
++#endif
++
++extern unsigned int sysctl_sched_compat_yield;
+
+ #ifdef CONFIG_RT_MUTEXES
+ extern int rt_mutex_getprio(struct task_struct *p);
+ extern void rt_mutex_setprio(struct task_struct *p, int prio);
+ extern void rt_mutex_adjust_pi(struct task_struct *p);
+@@ -1841,10 +1901,22 @@ extern long sched_getaffinity(pid_t pid,
+
+ extern int sched_mc_power_savings, sched_smt_power_savings;
+
+ extern void normalize_rt_tasks(void);
+
++#ifdef CONFIG_FAIR_GROUP_SCHED
++
++extern struct task_group init_task_group;
++
++extern struct task_group *sched_create_group(void);
++extern void sched_destroy_group(struct task_group *tg);
++extern void sched_move_task(struct task_struct *tsk);
++extern int sched_group_set_shares(struct task_group *tg, unsigned long shares);
++extern unsigned long sched_group_shares(struct task_group *tg);
++
++#endif
++
+ #ifdef CONFIG_TASK_XACCT
+ static inline void add_rchar(struct task_struct *tsk, ssize_t amt)
+ {
+ tsk->rchar += amt;
+ }
+@@ -1879,8 +1951,16 @@ static inline void inc_syscr(struct task
+ static inline void inc_syscw(struct task_struct *tsk)
+ {
+ }
+ #endif
+
++#ifdef CONFIG_SMP
++void migration_init(void);
++#else
++static inline void migration_init(void)
++{
++}
++#endif
++
+ #endif /* __KERNEL__ */
+
+ #endif
+--- linux-2.6.23.orig/include/linux/taskstats.h
++++ linux-2.6.23/include/linux/taskstats.h
+@@ -29,11 +29,11 @@
+ * b) add comment indicating new version number at end of struct
+ * c) add new fields after version comment; maintain 64-bit alignment
+ */
+
+
+-#define TASKSTATS_VERSION 5
++#define TASKSTATS_VERSION 6
+ #define TS_COMM_LEN 32 /* should be >= TASK_COMM_LEN
+ * in linux/sched.h */
+
+ struct taskstats {
+
+@@ -150,10 +150,15 @@ struct taskstats {
+ __u64 write_bytes; /* bytes of write I/O */
+ __u64 cancelled_write_bytes; /* bytes of cancelled write I/O */
+
+ __u64 nvcsw; /* voluntary_ctxt_switches */
+ __u64 nivcsw; /* nonvoluntary_ctxt_switches */
++
++ /* time accounting for SMT machines */
++ __u64 ac_utimescaled; /* utime scaled on frequency etc */
++ __u64 ac_stimescaled; /* stime scaled on frequency etc */
++ __u64 cpu_scaled_run_real_total; /* scaled cpu_run_real_total */
+ };
+
+
+ /*
+ * Commands sent from userspace
+--- linux-2.6.23.orig/include/linux/topology.h
++++ linux-2.6.23/include/linux/topology.h
+@@ -157,19 +157,18 @@
+ .max_interval = 4, \
+ .busy_factor = 64, \
+ .imbalance_pct = 125, \
+ .cache_nice_tries = 1, \
+ .busy_idx = 2, \
+- .idle_idx = 0, \
+- .newidle_idx = 0, \
++ .idle_idx = 1, \
++ .newidle_idx = 2, \
+ .wake_idx = 1, \
+ .forkexec_idx = 1, \
+ .flags = SD_LOAD_BALANCE \
+ | SD_BALANCE_NEWIDLE \
+ | SD_BALANCE_EXEC \
+ | SD_WAKE_AFFINE \
+- | SD_WAKE_IDLE \
+ | BALANCE_FOR_PKG_POWER,\
+ .last_balance = jiffies, \
+ .balance_interval = 1, \
+ .nr_balance_failed = 0, \
+ }
+--- linux-2.6.23.orig/init/Kconfig
++++ linux-2.6.23/init/Kconfig
+@@ -271,18 +271,44 @@ config LOG_BUF_SHIFT
+ 12 => 4 KB
+
+ config CPUSETS
+ bool "Cpuset support"
+ depends on SMP
++ #
++ # disabled for now - depends on control groups, which
++ # are hard to backport:
++ #
++ depends on 0
+ help
+ This option will let you create and manage CPUSETs which
+ allow dynamically partitioning a system into sets of CPUs and
+ Memory Nodes and assigning tasks to run only within those sets.
+ This is primarily useful on large SMP or NUMA systems.
+
+ Say N if unsure.
+
++config FAIR_GROUP_SCHED
++ bool "Fair group CPU scheduler"
++ default y
++ depends on EXPERIMENTAL
++ help
++ This feature lets CPU scheduler recognize task groups and control CPU
++ bandwidth allocation to such task groups.
++
++choice
++ depends on FAIR_GROUP_SCHED
++ prompt "Basis for grouping tasks"
++ default FAIR_USER_SCHED
++
++config FAIR_USER_SCHED
++ bool "user id"
++ help
++ This option will choose userid as the basis for grouping
++ tasks, thus providing equal CPU bandwidth to each user.
++
++endchoice
++
+ config SYSFS_DEPRECATED
+ bool "Create deprecated sysfs files"
+ default y
+ help
+ This option creates deprecated symlinks such as the
+--- linux-2.6.23.orig/init/main.c
++++ linux-2.6.23/init/main.c
+@@ -750,15 +750,12 @@ static int __init nosoftlockup_setup(cha
+ __setup("nosoftlockup", nosoftlockup_setup);
+
+ static void __init do_pre_smp_initcalls(void)
+ {
+ extern int spawn_ksoftirqd(void);
+-#ifdef CONFIG_SMP
+- extern int migration_init(void);
+
+ migration_init();
+-#endif
+ spawn_ksoftirqd();
+ if (!nosoftlockup)
+ spawn_softlockup_task();
+ }
+
+--- linux-2.6.23.orig/kernel/delayacct.c
++++ linux-2.6.23/kernel/delayacct.c
+@@ -113,15 +113,21 @@ int __delayacct_add_tsk(struct taskstats
+ tmp = (s64)d->cpu_run_real_total;
+ cputime_to_timespec(tsk->utime + tsk->stime, &ts);
+ tmp += timespec_to_ns(&ts);
+ d->cpu_run_real_total = (tmp < (s64)d->cpu_run_real_total) ? 0 : tmp;
+
++ tmp = (s64)d->cpu_scaled_run_real_total;
++ cputime_to_timespec(tsk->utimescaled + tsk->stimescaled, &ts);
++ tmp += timespec_to_ns(&ts);
++ d->cpu_scaled_run_real_total =
++ (tmp < (s64)d->cpu_scaled_run_real_total) ? 0 : tmp;
++
+ /*
+ * No locking available for sched_info (and too expensive to add one)
+ * Mitigate by taking snapshot of values
+ */
+- t1 = tsk->sched_info.pcnt;
++ t1 = tsk->sched_info.pcount;
+ t2 = tsk->sched_info.run_delay;
+ t3 = tsk->sched_info.cpu_time;
+
+ d->cpu_count += t1;
+
+--- linux-2.6.23.orig/kernel/exit.c
++++ linux-2.6.23/kernel/exit.c
+@@ -109,10 +109,11 @@ static void __exit_signal(struct task_st
+ * We won't ever get here for the group leader, since it
+ * will have been the last reference on the signal_struct.
+ */
+ sig->utime = cputime_add(sig->utime, tsk->utime);
+ sig->stime = cputime_add(sig->stime, tsk->stime);
++ sig->gtime = cputime_add(sig->gtime, tsk->gtime);
+ sig->min_flt += tsk->min_flt;
+ sig->maj_flt += tsk->maj_flt;
+ sig->nvcsw += tsk->nvcsw;
+ sig->nivcsw += tsk->nivcsw;
+ sig->inblock += task_io_get_inblock(tsk);
+@@ -1240,10 +1241,15 @@ static int wait_task_zombie(struct task_
+ psig->cstime =
+ cputime_add(psig->cstime,
+ cputime_add(p->stime,
+ cputime_add(sig->stime,
+ sig->cstime)));
++ psig->cgtime =
++ cputime_add(psig->cgtime,
++ cputime_add(p->gtime,
++ cputime_add(sig->gtime,
++ sig->cgtime)));
+ psig->cmin_flt +=
+ p->min_flt + sig->min_flt + sig->cmin_flt;
+ psig->cmaj_flt +=
+ p->maj_flt + sig->maj_flt + sig->cmaj_flt;
+ psig->cnvcsw +=
+--- linux-2.6.23.orig/kernel/fork.c
++++ linux-2.6.23/kernel/fork.c
+@@ -875,10 +875,12 @@ static inline int copy_signal(unsigned l
+
+ sig->leader = 0; /* session leadership doesn't inherit */
+ sig->tty_old_pgrp = NULL;
+
+ sig->utime = sig->stime = sig->cutime = sig->cstime = cputime_zero;
++ sig->gtime = cputime_zero;
++ sig->cgtime = cputime_zero;
+ sig->nvcsw = sig->nivcsw = sig->cnvcsw = sig->cnivcsw = 0;
+ sig->min_flt = sig->maj_flt = sig->cmin_flt = sig->cmaj_flt = 0;
+ sig->inblock = sig->oublock = sig->cinblock = sig->coublock = 0;
+ sig->sum_sched_runtime = 0;
+ INIT_LIST_HEAD(&sig->cpu_timers[0]);
+@@ -1045,10 +1047,13 @@ static struct task_struct *copy_process(
+
+ p->utime = cputime_zero;
+ p->stime = cputime_zero;
+ p->prev_utime = cputime_zero;
+ p->prev_stime = cputime_zero;
++ p->gtime = cputime_zero;
++ p->utimescaled = cputime_zero;
++ p->stimescaled = cputime_zero;
+
+ #ifdef CONFIG_TASK_XACCT
+ p->rchar = 0; /* I/O counter: bytes read */
+ p->wchar = 0; /* I/O counter: bytes written */
+ p->syscr = 0; /* I/O counter: read syscalls */
+--- linux-2.6.23.orig/kernel/ksysfs.c
++++ linux-2.6.23/kernel/ksysfs.c
+@@ -12,10 +12,11 @@
+ #include <linux/string.h>
+ #include <linux/sysfs.h>
+ #include <linux/module.h>
+ #include <linux/init.h>
+ #include <linux/kexec.h>
++#include <linux/sched.h>
+
+ #define KERNEL_ATTR_RO(_name) \
+ static struct subsys_attribute _name##_attr = __ATTR_RO(_name)
+
+ #define KERNEL_ATTR_RW(_name) \
+@@ -114,9 +115,16 @@ static int __init ksysfs_init(void)
+ notes_attr.size = notes_size;
+ error = sysfs_create_bin_file(&kernel_subsys.kobj,
+ ¬es_attr);
+ }
+
++ /*
++ * Create "/sys/kernel/uids" directory and corresponding root user's
++ * directory under it.
++ */
++ if (!error)
++ error = uids_kobject_init();
++
+ return error;
+ }
+
+ core_initcall(ksysfs_init);
+--- linux-2.6.23.orig/kernel/sched.c
++++ linux-2.6.23/kernel/sched.c
+@@ -42,10 +42,11 @@
+ #include <linux/profile.h>
+ #include <linux/freezer.h>
+ #include <linux/vmalloc.h>
+ #include <linux/blkdev.h>
+ #include <linux/delay.h>
++#include <linux/pid_namespace.h>
+ #include <linux/smp.h>
+ #include <linux/threads.h>
+ #include <linux/timer.h>
+ #include <linux/rcupdate.h>
+ #include <linux/cpu.h>
+@@ -59,21 +60,23 @@
+ #include <linux/tsacct_kern.h>
+ #include <linux/kprobes.h>
+ #include <linux/delayacct.h>
+ #include <linux/reciprocal_div.h>
+ #include <linux/unistd.h>
++#include <linux/pagemap.h>
+
+ #include <asm/tlb.h>
++#include <asm/irq_regs.h>
+
+ /*
+ * Scheduler clock - returns current time in nanosec units.
+ * This is default implementation.
+ * Architectures and sub-architectures can override this.
+ */
+ unsigned long long __attribute__((weak)) sched_clock(void)
+ {
+- return (unsigned long long)jiffies * (1000000000 / HZ);
++ return (unsigned long long)jiffies * (NSEC_PER_SEC / HZ);
+ }
+
+ /*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+@@ -93,24 +96,22 @@ unsigned long long __attribute__((weak))
+ #define MAX_USER_PRIO (USER_PRIO(MAX_PRIO))
+
+ /*
+ * Some helpers for converting nanosecond timing to jiffy resolution
+ */
+-#define NS_TO_JIFFIES(TIME) ((TIME) / (1000000000 / HZ))
+-#define JIFFIES_TO_NS(TIME) ((TIME) * (1000000000 / HZ))
++#define NS_TO_JIFFIES(TIME) ((unsigned long)(TIME) / (NSEC_PER_SEC / HZ))
++#define JIFFIES_TO_NS(TIME) ((TIME) * (NSEC_PER_SEC / HZ))
+
+ #define NICE_0_LOAD SCHED_LOAD_SCALE
+ #define NICE_0_SHIFT SCHED_LOAD_SHIFT
+
+ /*
+ * These are the 'tuning knobs' of the scheduler:
+ *
+- * Minimum timeslice is 5 msecs (or 1 jiffy, whichever is larger),
+- * default timeslice is 100 msecs, maximum timeslice is 800 msecs.
++ * default timeslice is 100 msecs (used only for SCHED_RR tasks).
+ * Timeslices get refilled after they expire.
+ */
+-#define MIN_TIMESLICE max(5 * HZ / 1000, 1)
+ #define DEF_TIMESLICE (100 * HZ / 1000)
+
+ #ifdef CONFIG_SMP
+ /*
+ * Divide a load by a sched group cpu_power : (load / sg->__cpu_power)
+@@ -130,28 +131,10 @@ static inline void sg_inc_cpu_power(stru
+ sg->__cpu_power += val;
+ sg->reciprocal_cpu_power = reciprocal_value(sg->__cpu_power);
+ }
+ #endif
+
+-#define SCALE_PRIO(x, prio) \
+- max(x * (MAX_PRIO - prio) / (MAX_USER_PRIO / 2), MIN_TIMESLICE)
+-
+-/*
+- * static_prio_timeslice() scales user-nice values [ -20 ... 0 ... 19 ]
+- * to time slice values: [800ms ... 100ms ... 5ms]
+- */
+-static unsigned int static_prio_timeslice(int static_prio)
+-{
+- if (static_prio == NICE_TO_PRIO(19))
+- return 1;
+-
+- if (static_prio < NICE_TO_PRIO(0))
+- return SCALE_PRIO(DEF_TIMESLICE * 4, static_prio);
+- else
+- return SCALE_PRIO(DEF_TIMESLICE, static_prio);
+-}
+-
+ static inline int rt_policy(int policy)
+ {
+ if (unlikely(policy == SCHED_FIFO) || unlikely(policy == SCHED_RR))
+ return 1;
+ return 0;
+@@ -168,45 +151,115 @@ static inline int task_has_rt_policy(str
+ struct rt_prio_array {
+ DECLARE_BITMAP(bitmap, MAX_RT_PRIO+1); /* include 1 bit for delimiter */
+ struct list_head queue[MAX_RT_PRIO];
+ };
+
+-struct load_stat {
+- struct load_weight load;
+- u64 load_update_start, load_update_last;
+- unsigned long delta_fair, delta_exec, delta_stat;
++#ifdef CONFIG_FAIR_GROUP_SCHED
++
++#include <linux/cgroup.h>
++
++struct cfs_rq;
++
++/* task group related information */
++struct task_group {
++#ifdef CONFIG_FAIR_CGROUP_SCHED
++ struct cgroup_subsys_state css;
++#endif
++ /* schedulable entities of this group on each cpu */
++ struct sched_entity **se;
++ /* runqueue "owned" by this group on each cpu */
++ struct cfs_rq **cfs_rq;
++ unsigned long shares;
++ /* spinlock to serialize modification to shares */
++ spinlock_t lock;
++ struct rcu_head rcu;
++};
++
++/* Default task group's sched entity on each cpu */
++static DEFINE_PER_CPU(struct sched_entity, init_sched_entity);
++/* Default task group's cfs_rq on each cpu */
++static DEFINE_PER_CPU(struct cfs_rq, init_cfs_rq) ____cacheline_aligned_in_smp;
++
++static struct sched_entity *init_sched_entity_p[NR_CPUS];
++static struct cfs_rq *init_cfs_rq_p[NR_CPUS];
++
++/* Default task group.
++ * Every task in system belong to this group at bootup.
++ */
++struct task_group init_task_group = {
++ .se = init_sched_entity_p,
++ .cfs_rq = init_cfs_rq_p,
+ };
+
++#ifdef CONFIG_FAIR_USER_SCHED
++# define INIT_TASK_GRP_LOAD 2*NICE_0_LOAD
++#else
++# define INIT_TASK_GRP_LOAD NICE_0_LOAD
++#endif
++
++static int init_task_group_load = INIT_TASK_GRP_LOAD;
++
++/* return group to which a task belongs */
++static inline struct task_group *task_group(struct task_struct *p)
++{
++ struct task_group *tg;
++
++#ifdef CONFIG_FAIR_USER_SCHED
++ tg = p->user->tg;
++#elif defined(CONFIG_FAIR_CGROUP_SCHED)
++ tg = container_of(task_subsys_state(p, cpu_cgroup_subsys_id),
++ struct task_group, css);
++#else
++ tg = &init_task_group;
++#endif
++ return tg;
++}
++
++/* Change a task's cfs_rq and parent entity if it moves across CPUs/groups */
++static inline void set_task_cfs_rq(struct task_struct *p, unsigned int cpu)
++{
++ p->se.cfs_rq = task_group(p)->cfs_rq[cpu];
++ p->se.parent = task_group(p)->se[cpu];
++}
++
++#else
++
++static inline void set_task_cfs_rq(struct task_struct *p, unsigned int cpu) { }
++
++#endif /* CONFIG_FAIR_GROUP_SCHED */
++
+ /* CFS-related fields in a runqueue */
+ struct cfs_rq {
+ struct load_weight load;
+ unsigned long nr_running;
+
+- s64 fair_clock;
+ u64 exec_clock;
+- s64 wait_runtime;
+- u64 sleeper_bonus;
+- unsigned long wait_runtime_overruns, wait_runtime_underruns;
++ u64 min_vruntime;
+
+ struct rb_root tasks_timeline;
+ struct rb_node *rb_leftmost;
+ struct rb_node *rb_load_balance_curr;
+-#ifdef CONFIG_FAIR_GROUP_SCHED
+ /* 'curr' points to currently running entity on this cfs_rq.
+ * It is set to NULL otherwise (i.e when none are currently running).
+ */
+ struct sched_entity *curr;
++
++ unsigned long nr_spread_over;
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
+ struct rq *rq; /* cpu runqueue to which this cfs_rq is attached */
+
+- /* leaf cfs_rqs are those that hold tasks (lowest schedulable entity in
++ /*
++ * leaf cfs_rqs are those that hold tasks (lowest schedulable entity in
+ * a hierarchy). Non-leaf lrqs hold other higher schedulable entities
+ * (like users, containers etc.)
+ *
+ * leaf_cfs_rq_list ties together list of leaf cfs_rq's in a cpu. This
+ * list is used during load balance.
+ */
+- struct list_head leaf_cfs_rq_list; /* Better name : task_cfs_rq_list? */
++ struct list_head leaf_cfs_rq_list;
++ struct task_group *tg; /* group that "owns" this runqueue */
+ #endif
+ };
+
+ /* Real-Time classes' related field in a runqueue: */
+ struct rt_rq {
+@@ -221,11 +274,12 @@ struct rt_rq {
+ * Locking rule: those places that want to lock multiple runqueues
+ * (such as the load balancing or the thread migration code), lock
+ * acquire operations must be ordered by ascending &runqueue.
+ */
+ struct rq {
+- spinlock_t lock; /* runqueue lock */
++ /* runqueue lock: */
++ spinlock_t lock;
+
+ /*
+ * nr_running and cpu_load should be in the same cacheline because
+ * remote CPUs use both these fields when doing load calculation.
+ */
+@@ -234,19 +288,21 @@ struct rq {
+ unsigned long cpu_load[CPU_LOAD_IDX_MAX];
+ unsigned char idle_at_tick;
+ #ifdef CONFIG_NO_HZ
+ unsigned char in_nohz_recently;
+ #endif
+- struct load_stat ls; /* capture load from *all* tasks on this cpu */
++ /* capture load from *all* tasks on this cpu: */
++ struct load_weight load;
+ unsigned long nr_load_updates;
+ u64 nr_switches;
+
+ struct cfs_rq cfs;
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+- struct list_head leaf_cfs_rq_list; /* list of leaf cfs_rq on this cpu */
++ /* list of leaf cfs_rq on this cpu: */
++ struct list_head leaf_cfs_rq_list;
+ #endif
+- struct rt_rq rt;
++ struct rt_rq rt;
+
+ /*
+ * This is part of a global counter where only the total sum
+ * over all CPUs matters. A task can increase this counter on
+ * one CPU and if it got migrated afterwards it may decrease
+@@ -272,34 +328,38 @@ struct rq {
+ struct sched_domain *sd;
+
+ /* For active balancing */
+ int active_balance;
+ int push_cpu;
+- int cpu; /* cpu of this runqueue */
++ /* cpu of this runqueue: */
++ int cpu;
+
+ struct task_struct *migration_thread;
+ struct list_head migration_queue;
+ #endif
+
+ #ifdef CONFIG_SCHEDSTATS
+ /* latency stats */
+ struct sched_info rq_sched_info;
+
+ /* sys_sched_yield() stats */
+- unsigned long yld_exp_empty;
+- unsigned long yld_act_empty;
+- unsigned long yld_both_empty;
+- unsigned long yld_cnt;
++ unsigned int yld_exp_empty;
++ unsigned int yld_act_empty;
++ unsigned int yld_both_empty;
++ unsigned int yld_count;
+
+ /* schedule() stats */
+- unsigned long sched_switch;
+- unsigned long sched_cnt;
+- unsigned long sched_goidle;
++ unsigned int sched_switch;
++ unsigned int sched_count;
++ unsigned int sched_goidle;
+
+ /* try_to_wake_up() stats */
+- unsigned long ttwu_cnt;
+- unsigned long ttwu_local;
++ unsigned int ttwu_count;
++ unsigned int ttwu_local;
++
++ /* BKL stats */
++ unsigned int bkl_count;
+ #endif
+ struct lock_class_key rq_lock_key;
+ };
+
+ static DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+@@ -380,10 +440,45 @@ static void update_rq_clock(struct rq *r
+ #define this_rq() (&__get_cpu_var(runqueues))
+ #define task_rq(p) cpu_rq(task_cpu(p))
+ #define cpu_curr(cpu) (cpu_rq(cpu)->curr)
+
+ /*
++ * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
++ */
++#ifdef CONFIG_SCHED_DEBUG
++# define const_debug __read_mostly
++#else
++# define const_debug static const
++#endif
++
++/*
++ * Debugging: various feature bits
++ */
++enum {
++ SCHED_FEAT_NEW_FAIR_SLEEPERS = 1,
++ SCHED_FEAT_WAKEUP_PREEMPT = 2,
++ SCHED_FEAT_START_DEBIT = 4,
++ SCHED_FEAT_TREE_AVG = 8,
++ SCHED_FEAT_APPROX_AVG = 16,
++};
++
++const_debug unsigned int sysctl_sched_features =
++ SCHED_FEAT_NEW_FAIR_SLEEPERS * 1 |
++ SCHED_FEAT_WAKEUP_PREEMPT * 1 |
++ SCHED_FEAT_START_DEBIT * 1 |
++ SCHED_FEAT_TREE_AVG * 0 |
++ SCHED_FEAT_APPROX_AVG * 0;
++
++#define sched_feat(x) (sysctl_sched_features & SCHED_FEAT_##x)
++
++/*
++ * Number of tasks to iterate in a single balance run.
++ * Limited because this is done with IRQs disabled.
++ */
++const_debug unsigned int sysctl_sched_nr_migrate = 32;
++
++/*
+ * For kernel-internal use: high-speed (but slightly incorrect) per-cpu
+ * clock constructed from sched_clock():
+ */
+ unsigned long long cpu_clock(int cpu)
+ {
+@@ -391,40 +486,39 @@ unsigned long long cpu_clock(int cpu)
+ unsigned long flags;
+ struct rq *rq;
+
+ local_irq_save(flags);
+ rq = cpu_rq(cpu);
+- update_rq_clock(rq);
++ /*
++ * Only call sched_clock() if the scheduler has already been
++ * initialized (some code might call cpu_clock() very early):
++ */
++ if (rq->idle)
++ update_rq_clock(rq);
+ now = rq->clock;
+ local_irq_restore(flags);
+
+ return now;
+ }
+-
+-#ifdef CONFIG_FAIR_GROUP_SCHED
+-/* Change a task's ->cfs_rq if it moves across CPUs */
+-static inline void set_task_cfs_rq(struct task_struct *p)
+-{
+- p->se.cfs_rq = &task_rq(p)->cfs;
+-}
+-#else
+-static inline void set_task_cfs_rq(struct task_struct *p)
+-{
+-}
+-#endif
++EXPORT_SYMBOL_GPL(cpu_clock);
+
+ #ifndef prepare_arch_switch
+ # define prepare_arch_switch(next) do { } while (0)
+ #endif
+ #ifndef finish_arch_switch
+ # define finish_arch_switch(prev) do { } while (0)
+ #endif
+
++static inline int task_current(struct rq *rq, struct task_struct *p)
++{
++ return rq->curr == p;
++}
++
+ #ifndef __ARCH_WANT_UNLOCKED_CTXSW
+ static inline int task_running(struct rq *rq, struct task_struct *p)
+ {
+- return rq->curr == p;
++ return task_current(rq, p);
+ }
+
+ static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
+ {
+ }
+@@ -449,11 +543,11 @@ static inline void finish_lock_switch(st
+ static inline int task_running(struct rq *rq, struct task_struct *p)
+ {
+ #ifdef CONFIG_SMP
+ return p->oncpu;
+ #else
+- return rq->curr == p;
++ return task_current(rq, p);
+ #endif
+ }
+
+ static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
+ {
+@@ -494,44 +588,40 @@ static inline void finish_lock_switch(st
+ * Must be called interrupts disabled.
+ */
+ static inline struct rq *__task_rq_lock(struct task_struct *p)
+ __acquires(rq->lock)
+ {
+- struct rq *rq;
+-
+-repeat_lock_task:
+- rq = task_rq(p);
+- spin_lock(&rq->lock);
+- if (unlikely(rq != task_rq(p))) {
++ for (;;) {
++ struct rq *rq = task_rq(p);
++ spin_lock(&rq->lock);
++ if (likely(rq == task_rq(p)))
++ return rq;
+ spin_unlock(&rq->lock);
+- goto repeat_lock_task;
+ }
+- return rq;
+ }
+
+ /*
+ * task_rq_lock - lock the runqueue a given task resides on and disable
+- * interrupts. Note the ordering: we can safely lookup the task_rq without
++ * interrupts. Note the ordering: we can safely lookup the task_rq without
+ * explicitly disabling preemption.
+ */
+ static struct rq *task_rq_lock(struct task_struct *p, unsigned long *flags)
+ __acquires(rq->lock)
+ {
+ struct rq *rq;
+
+-repeat_lock_task:
+- local_irq_save(*flags);
+- rq = task_rq(p);
+- spin_lock(&rq->lock);
+- if (unlikely(rq != task_rq(p))) {
++ for (;;) {
++ local_irq_save(*flags);
++ rq = task_rq(p);
++ spin_lock(&rq->lock);
++ if (likely(rq == task_rq(p)))
++ return rq;
+ spin_unlock_irqrestore(&rq->lock, *flags);
+- goto repeat_lock_task;
+ }
+- return rq;
+ }
+
+-static inline void __task_rq_unlock(struct rq *rq)
++static void __task_rq_unlock(struct rq *rq)
+ __releases(rq->lock)
+ {
+ spin_unlock(&rq->lock);
+ }
+
+@@ -542,11 +632,11 @@ static inline void task_rq_unlock(struct
+ }
+
+ /*
+ * this_rq_lock - lock this runqueue and disable interrupts.
+ */
+-static inline struct rq *this_rq_lock(void)
++static struct rq *this_rq_lock(void)
+ __acquires(rq->lock)
+ {
+ struct rq *rq;
+
+ local_irq_disable();
+@@ -576,10 +666,11 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_sleep
+ void sched_clock_idle_wakeup_event(u64 delta_ns)
+ {
+ struct rq *rq = cpu_rq(smp_processor_id());
+ u64 now = sched_clock();
+
++ touch_softlockup_watchdog();
+ rq->idle_clock += delta_ns;
+ /*
+ * Override the previous timestamp and ignore all
+ * sched_clock() deltas that occured while we idled,
+ * and use the PM-provided delta_ns to advance the
+@@ -642,23 +733,10 @@ static inline void resched_task(struct t
+ assert_spin_locked(&task_rq(p)->lock);
+ set_tsk_need_resched(p);
+ }
+ #endif
+
+-static u64 div64_likely32(u64 divident, unsigned long divisor)
+-{
+-#if BITS_PER_LONG == 32
+- if (likely(divident <= 0xffffffffULL))
+- return (u32)divident / divisor;
+- do_div(divident, divisor);
+-
+- return divident;
+-#else
+- return divident / divisor;
+-#endif
+-}
+-
+ #if BITS_PER_LONG == 32
+ # define WMULT_CONST (~0UL)
+ #else
+ # define WMULT_CONST (1UL << 32)
+ #endif
+@@ -696,27 +774,25 @@ static inline unsigned long
+ calc_delta_fair(unsigned long delta_exec, struct load_weight *lw)
+ {
+ return calc_delta_mine(delta_exec, NICE_0_LOAD, lw);
+ }
+
+-static void update_load_add(struct load_weight *lw, unsigned long inc)
++static inline void update_load_add(struct load_weight *lw, unsigned long inc)
+ {
+ lw->weight += inc;
+- lw->inv_weight = 0;
+ }
+
+-static void update_load_sub(struct load_weight *lw, unsigned long dec)
++static inline void update_load_sub(struct load_weight *lw, unsigned long dec)
+ {
+ lw->weight -= dec;
+- lw->inv_weight = 0;
+ }
+
+ /*
+ * To aid in avoiding the subversion of "niceness" due to uneven distribution
+ * of tasks with abnormal "nice" values across CPUs the contribution that
+ * each task makes to its run queue's load is weighted according to its
+- * scheduling class and "nice" value. For SCHED_NORMAL tasks this is just a
++ * scheduling class and "nice" value. For SCHED_NORMAL tasks this is just a
+ * scaled version of the new time slice allocation that they receive on time
+ * slice expiry etc.
+ */
+
+ #define WEIGHT_IDLEPRIO 2
+@@ -774,76 +850,62 @@ struct rq_iterator {
+ void *arg;
+ struct task_struct *(*start)(void *);
+ struct task_struct *(*next)(void *);
+ };
+
+-static int balance_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
+- unsigned long max_nr_move, unsigned long max_load_move,
+- struct sched_domain *sd, enum cpu_idle_type idle,
+- int *all_pinned, unsigned long *load_moved,
+- int *this_best_prio, struct rq_iterator *iterator);
++#ifdef CONFIG_SMP
++static unsigned long
++balance_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
++ unsigned long max_load_move, struct sched_domain *sd,
++ enum cpu_idle_type idle, int *all_pinned,
++ int *this_best_prio, struct rq_iterator *iterator);
++
++static int
++iter_move_one_task(struct rq *this_rq, int this_cpu, struct rq *busiest,
++ struct sched_domain *sd, enum cpu_idle_type idle,
++ struct rq_iterator *iterator);
++#endif
++
++#ifdef CONFIG_CGROUP_CPUACCT
++static void cpuacct_charge(struct task_struct *tsk, u64 cputime);
++#else
++static inline void cpuacct_charge(struct task_struct *tsk, u64 cputime) {}
++#endif
+
+ #include "sched_stats.h"
+-#include "sched_rt.c"
+-#include "sched_fair.c"
+ #include "sched_idletask.c"
++#include "sched_fair.c"
++#include "sched_rt.c"
+ #ifdef CONFIG_SCHED_DEBUG
+ # include "sched_debug.c"
+ #endif
+
+ #define sched_class_highest (&rt_sched_class)
+
+-static void __update_curr_load(struct rq *rq, struct load_stat *ls)
+-{
+- if (rq->curr != rq->idle && ls->load.weight) {
+- ls->delta_exec += ls->delta_stat;
+- ls->delta_fair += calc_delta_fair(ls->delta_stat, &ls->load);
+- ls->delta_stat = 0;
+- }
+-}
+-
+ /*
+ * Update delta_exec, delta_fair fields for rq.
+ *
+ * delta_fair clock advances at a rate inversely proportional to
+- * total load (rq->ls.load.weight) on the runqueue, while
++ * total load (rq->load.weight) on the runqueue, while
+ * delta_exec advances at the same rate as wall-clock (provided
+ * cpu is not idle).
+ *
+ * delta_exec / delta_fair is a measure of the (smoothened) load on this
+ * runqueue over any given interval. This (smoothened) load is used
+ * during load balance.
+ *
+- * This function is called /before/ updating rq->ls.load
++ * This function is called /before/ updating rq->load
+ * and when switching tasks.
+ */
+-static void update_curr_load(struct rq *rq)
+-{
+- struct load_stat *ls = &rq->ls;
+- u64 start;
+-
+- start = ls->load_update_start;
+- ls->load_update_start = rq->clock;
+- ls->delta_stat += rq->clock - start;
+- /*
+- * Stagger updates to ls->delta_fair. Very frequent updates
+- * can be expensive.
+- */
+- if (ls->delta_stat >= sysctl_sched_stat_granularity)
+- __update_curr_load(rq, ls);
+-}
+-
+ static inline void inc_load(struct rq *rq, const struct task_struct *p)
+ {
+- update_curr_load(rq);
+- update_load_add(&rq->ls.load, p->se.load.weight);
++ update_load_add(&rq->load, p->se.load.weight);
+ }
+
+ static inline void dec_load(struct rq *rq, const struct task_struct *p)
+ {
+- update_curr_load(rq);
+- update_load_sub(&rq->ls.load, p->se.load.weight);
++ update_load_sub(&rq->load, p->se.load.weight);
+ }
+
+ static void inc_nr_running(struct task_struct *p, struct rq *rq)
+ {
+ rq->nr_running++;
+@@ -856,12 +918,10 @@ static void dec_nr_running(struct task_s
+ dec_load(rq, p);
+ }
+
+ static void set_load_weight(struct task_struct *p)
+ {
+- p->se.wait_runtime = 0;
+-
+ if (task_has_rt_policy(p)) {
+ p->se.load.weight = prio_to_weight[0] * 2;
+ p->se.load.inv_weight = prio_to_wmult[0] >> 1;
+ return;
+ }
+@@ -949,24 +1009,10 @@ static void activate_task(struct rq *rq,
+ enqueue_task(rq, p, wakeup);
+ inc_nr_running(p, rq);
+ }
+
+ /*
+- * activate_idle_task - move idle task to the _front_ of runqueue.
+- */
+-static inline void activate_idle_task(struct task_struct *p, struct rq *rq)
+-{
+- update_rq_clock(rq);
+-
+- if (p->state == TASK_UNINTERRUPTIBLE)
+- rq->nr_uninterruptible--;
+-
+- enqueue_task(rq, p, 0);
+- inc_nr_running(p, rq);
+-}
+-
+-/*
+ * deactivate_task - remove a task from the runqueue.
+ */
+ static void deactivate_task(struct rq *rq, struct task_struct *p, int sleep)
+ {
+ if (p->state == TASK_UNINTERRUPTIBLE)
+@@ -986,45 +1032,76 @@ inline int task_curr(const struct task_s
+ }
+
+ /* Used instead of source_load when we know the type == 0 */
+ unsigned long weighted_cpuload(const int cpu)
+ {
+- return cpu_rq(cpu)->ls.load.weight;
++ return cpu_rq(cpu)->load.weight;
+ }
+
+ static inline void __set_task_cpu(struct task_struct *p, unsigned int cpu)
+ {
++ set_task_cfs_rq(p, cpu);
+ #ifdef CONFIG_SMP
++ /*
++ * After ->cpu is set up to a new value, task_rq_lock(p, ...) can be
++ * successfuly executed on another CPU. We must ensure that updates of
++ * per-task data have been completed by this moment.
++ */
++ smp_wmb();
+ task_thread_info(p)->cpu = cpu;
+- set_task_cfs_rq(p);
+ #endif
+ }
+
+ #ifdef CONFIG_SMP
+
++/*
++ * Is this task likely cache-hot:
++ */
++static inline int
++task_hot(struct task_struct *p, u64 now, struct sched_domain *sd)
++{
++ s64 delta;
++
++ if (p->sched_class != &fair_sched_class)
++ return 0;
++
++ if (sysctl_sched_migration_cost == -1)
++ return 1;
++ if (sysctl_sched_migration_cost == 0)
++ return 0;
++
++ delta = now - p->se.exec_start;
++
++ return delta < (s64)sysctl_sched_migration_cost;
++}
++
++
+ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
+ {
+ int old_cpu = task_cpu(p);
+ struct rq *old_rq = cpu_rq(old_cpu), *new_rq = cpu_rq(new_cpu);
+- u64 clock_offset, fair_clock_offset;
++ struct cfs_rq *old_cfsrq = task_cfs_rq(p),
++ *new_cfsrq = cpu_cfs_rq(old_cfsrq, new_cpu);
++ u64 clock_offset;
+
+ clock_offset = old_rq->clock - new_rq->clock;
+- fair_clock_offset = old_rq->cfs.fair_clock - new_rq->cfs.fair_clock;
+-
+- if (p->se.wait_start_fair)
+- p->se.wait_start_fair -= fair_clock_offset;
+- if (p->se.sleep_start_fair)
+- p->se.sleep_start_fair -= fair_clock_offset;
+
+ #ifdef CONFIG_SCHEDSTATS
+ if (p->se.wait_start)
+ p->se.wait_start -= clock_offset;
+ if (p->se.sleep_start)
+ p->se.sleep_start -= clock_offset;
+ if (p->se.block_start)
+ p->se.block_start -= clock_offset;
++ if (old_cpu != new_cpu) {
++ schedstat_inc(p, se.nr_migrations);
++ if (task_hot(p, old_rq->clock, NULL))
++ schedstat_inc(p, se.nr_forced2_migrations);
++ }
+ #endif
++ p->se.vruntime -= old_cfsrq->min_vruntime -
++ new_cfsrq->min_vruntime;
+
+ __set_task_cpu(p, new_cpu);
+ }
+
+ struct migration_req {
+@@ -1075,73 +1152,75 @@ void wait_task_inactive(struct task_stru
+ {
+ unsigned long flags;
+ int running, on_rq;
+ struct rq *rq;
+
+-repeat:
+- /*
+- * We do the initial early heuristics without holding
+- * any task-queue locks at all. We'll only try to get
+- * the runqueue lock when things look like they will
+- * work out!
+- */
+- rq = task_rq(p);
++ for (;;) {
++ /*
++ * We do the initial early heuristics without holding
++ * any task-queue locks at all. We'll only try to get
++ * the runqueue lock when things look like they will
++ * work out!
++ */
++ rq = task_rq(p);
+
+- /*
+- * If the task is actively running on another CPU
+- * still, just relax and busy-wait without holding
+- * any locks.
+- *
+- * NOTE! Since we don't hold any locks, it's not
+- * even sure that "rq" stays as the right runqueue!
+- * But we don't care, since "task_running()" will
+- * return false if the runqueue has changed and p
+- * is actually now running somewhere else!
+- */
+- while (task_running(rq, p))
+- cpu_relax();
++ /*
++ * If the task is actively running on another CPU
++ * still, just relax and busy-wait without holding
++ * any locks.
++ *
++ * NOTE! Since we don't hold any locks, it's not
++ * even sure that "rq" stays as the right runqueue!
++ * But we don't care, since "task_running()" will
++ * return false if the runqueue has changed and p
++ * is actually now running somewhere else!
++ */
++ while (task_running(rq, p))
++ cpu_relax();
+
+- /*
+- * Ok, time to look more closely! We need the rq
+- * lock now, to be *sure*. If we're wrong, we'll
+- * just go back and repeat.
+- */
+- rq = task_rq_lock(p, &flags);
+- running = task_running(rq, p);
+- on_rq = p->se.on_rq;
+- task_rq_unlock(rq, &flags);
++ /*
++ * Ok, time to look more closely! We need the rq
++ * lock now, to be *sure*. If we're wrong, we'll
++ * just go back and repeat.
++ */
++ rq = task_rq_lock(p, &flags);
++ running = task_running(rq, p);
++ on_rq = p->se.on_rq;
++ task_rq_unlock(rq, &flags);
+
+- /*
+- * Was it really running after all now that we
+- * checked with the proper locks actually held?
+- *
+- * Oops. Go back and try again..
+- */
+- if (unlikely(running)) {
+- cpu_relax();
+- goto repeat;
+- }
++ /*
++ * Was it really running after all now that we
++ * checked with the proper locks actually held?
++ *
++ * Oops. Go back and try again..
++ */
++ if (unlikely(running)) {
++ cpu_relax();
++ continue;
++ }
+
+- /*
+- * It's not enough that it's not actively running,
+- * it must be off the runqueue _entirely_, and not
+- * preempted!
+- *
+- * So if it wa still runnable (but just not actively
+- * running right now), it's preempted, and we should
+- * yield - it could be a while.
+- */
+- if (unlikely(on_rq)) {
+- yield();
+- goto repeat;
+- }
++ /*
++ * It's not enough that it's not actively running,
++ * it must be off the runqueue _entirely_, and not
++ * preempted!
++ *
++ * So if it wa still runnable (but just not actively
++ * running right now), it's preempted, and we should
++ * yield - it could be a while.
++ */
++ if (unlikely(on_rq)) {
++ schedule_timeout_uninterruptible(1);
++ continue;
++ }
+
+- /*
+- * Ahh, all good. It wasn't running, and it wasn't
+- * runnable, which means that it will never become
+- * running in the future either. We're all done!
+- */
++ /*
++ * Ahh, all good. It wasn't running, and it wasn't
++ * runnable, which means that it will never become
++ * running in the future either. We're all done!
++ */
++ break;
++ }
+ }
+
+ /***
+ * kick_process - kick a running thread to enter/exit the kernel
+ * @p: the to-be-kicked thread
+@@ -1171,11 +1250,11 @@ void kick_process(struct task_struct *p)
+ * according to the scheduling class and "nice" value.
+ *
+ * We want to under-estimate the load of migration sources, to
+ * balance conservatively.
+ */
+-static inline unsigned long source_load(int cpu, int type)
++static unsigned long source_load(int cpu, int type)
+ {
+ struct rq *rq = cpu_rq(cpu);
+ unsigned long total = weighted_cpuload(cpu);
+
+ if (type == 0)
+@@ -1186,11 +1265,11 @@ static inline unsigned long source_load(
+
+ /*
+ * Return a high guess at the load of a migration-target cpu weighted
+ * according to the scheduling class and "nice" value.
+ */
+-static inline unsigned long target_load(int cpu, int type)
++static unsigned long target_load(int cpu, int type)
+ {
+ struct rq *rq = cpu_rq(cpu);
+ unsigned long total = weighted_cpuload(cpu);
+
+ if (type == 0)
+@@ -1228,11 +1307,11 @@ find_idlest_group(struct sched_domain *s
+ int local_group;
+ int i;
+
+ /* Skip over this group if it has no CPUs allowed */
+ if (!cpus_intersects(group->cpumask, p->cpus_allowed))
+- goto nextgroup;
++ continue;
+
+ local_group = cpu_isset(this_cpu, group->cpumask);
+
+ /* Tally up the load of all CPUs in the group */
+ avg_load = 0;
+@@ -1256,13 +1335,11 @@ find_idlest_group(struct sched_domain *s
+ this = group;
+ } else if (avg_load < min_load) {
+ min_load = avg_load;
+ idlest = group;
+ }
+-nextgroup:
+- group = group->next;
+- } while (group != sd->groups);
++ } while (group = group->next, group != sd->groups);
+
+ if (!idlest || 100*this_load < imbalance*min_load)
+ return NULL;
+ return idlest;
+ }
+@@ -1390,12 +1467,17 @@ static int wake_idle(int cpu, struct tas
+
+ for_each_domain(cpu, sd) {
+ if (sd->flags & SD_WAKE_IDLE) {
+ cpus_and(tmp, sd->span, p->cpus_allowed);
+ for_each_cpu_mask(i, tmp) {
+- if (idle_cpu(i))
++ if (idle_cpu(i)) {
++ if (i != task_cpu(p)) {
++ schedstat_inc(p,
++ se.nr_wakeups_idle);
++ }
+ return i;
++ }
+ }
+ } else {
+ break;
+ }
+ }
+@@ -1422,11 +1504,11 @@ static inline int wake_idle(int cpu, str
+ *
+ * returns failure only if the task is already active.
+ */
+ static int try_to_wake_up(struct task_struct *p, unsigned int state, int sync)
+ {
+- int cpu, this_cpu, success = 0;
++ int cpu, orig_cpu, this_cpu, success = 0;
+ unsigned long flags;
+ long old_state;
+ struct rq *rq;
+ #ifdef CONFIG_SMP
+ struct sched_domain *sd, *this_sd = NULL;
+@@ -1441,19 +1523,20 @@ static int try_to_wake_up(struct task_st
+
+ if (p->se.on_rq)
+ goto out_running;
+
+ cpu = task_cpu(p);
++ orig_cpu = cpu;
+ this_cpu = smp_processor_id();
+
+ #ifdef CONFIG_SMP
+ if (unlikely(task_running(rq, p)))
+ goto out_activate;
+
+ new_cpu = cpu;
+
+- schedstat_inc(rq, ttwu_cnt);
++ schedstat_inc(rq, ttwu_count);
+ if (cpu == this_cpu) {
+ schedstat_inc(rq, ttwu_local);
+ goto out_set_cpu;
+ }
+
+@@ -1484,10 +1567,17 @@ static int try_to_wake_up(struct task_st
+
+ if (this_sd->flags & SD_WAKE_AFFINE) {
+ unsigned long tl = this_load;
+ unsigned long tl_per_task;
+
++ /*
++ * Attract cache-cold tasks on sync wakeups:
++ */
++ if (sync && !task_hot(p, rq->clock, this_sd))
++ goto out_set_cpu;
++
++ schedstat_inc(p, se.nr_wakeups_affine_attempts);
+ tl_per_task = cpu_avg_load_per_task(this_cpu);
+
+ /*
+ * If sync wakeup then subtract the (maximum possible)
+ * effect of the currently running task from the load
+@@ -1503,10 +1593,11 @@ static int try_to_wake_up(struct task_st
+ * This domain has SD_WAKE_AFFINE and
+ * p is cache cold in this domain, and
+ * there is no bad imbalance.
+ */
+ schedstat_inc(this_sd, ttwu_move_affine);
++ schedstat_inc(p, se.nr_wakeups_affine);
+ goto out_set_cpu;
+ }
+ }
+
+ /*
+@@ -1514,10 +1605,11 @@ static int try_to_wake_up(struct task_st
+ * limit is reached.
+ */
+ if (this_sd->flags & SD_WAKE_BALANCE) {
+ if (imbalance*this_load <= 100*load) {
+ schedstat_inc(this_sd, ttwu_move_balance);
++ schedstat_inc(p, se.nr_wakeups_passive);
+ goto out_set_cpu;
+ }
+ }
+ }
+
+@@ -1539,22 +1631,22 @@ out_set_cpu:
+ cpu = task_cpu(p);
+ }
+
+ out_activate:
+ #endif /* CONFIG_SMP */
++ schedstat_inc(p, se.nr_wakeups);
++ if (sync)
++ schedstat_inc(p, se.nr_wakeups_sync);
++ if (orig_cpu != cpu)
++ schedstat_inc(p, se.nr_wakeups_migrate);
++ if (cpu == this_cpu)
++ schedstat_inc(p, se.nr_wakeups_local);
++ else
++ schedstat_inc(p, se.nr_wakeups_remote);
+ update_rq_clock(rq);
+ activate_task(rq, p, 1);
+- /*
+- * Sync wakeups (i.e. those types of wakeups where the waker
+- * has indicated that it will leave the CPU in short order)
+- * don't trigger a preemption, if the woken up task will run on
+- * this cpu. (in this case the 'I will reschedule' promise of
+- * the waker guarantees that the freshly woken up task is going
+- * to be considered on this CPU.)
+- */
+- if (!sync || cpu != this_cpu)
+- check_preempt_curr(rq, p);
++ check_preempt_curr(rq, p);
+ success = 1;
+
+ out_running:
+ p->state = TASK_RUNNING;
+ out:
+@@ -1581,32 +1673,24 @@ int fastcall wake_up_state(struct task_s
+ *
+ * __sched_fork() is basic setup used by init_idle() too:
+ */
+ static void __sched_fork(struct task_struct *p)
+ {
+- p->se.wait_start_fair = 0;
+ p->se.exec_start = 0;
+ p->se.sum_exec_runtime = 0;
+ p->se.prev_sum_exec_runtime = 0;
+- p->se.delta_exec = 0;
+- p->se.delta_fair_run = 0;
+- p->se.delta_fair_sleep = 0;
+- p->se.wait_runtime = 0;
+- p->se.sleep_start_fair = 0;
+
+ #ifdef CONFIG_SCHEDSTATS
+ p->se.wait_start = 0;
+- p->se.sum_wait_runtime = 0;
+ p->se.sum_sleep_runtime = 0;
+ p->se.sleep_start = 0;
+ p->se.block_start = 0;
+ p->se.sleep_max = 0;
+ p->se.block_max = 0;
+ p->se.exec_max = 0;
++ p->se.slice_max = 0;
+ p->se.wait_max = 0;
+- p->se.wait_runtime_overruns = 0;
+- p->se.wait_runtime_underruns = 0;
+ #endif
+
+ INIT_LIST_HEAD(&p->run_list);
+ p->se.on_rq = 0;
+
+@@ -1633,16 +1717,18 @@ void sched_fork(struct task_struct *p, i
+ __sched_fork(p);
+
+ #ifdef CONFIG_SMP
+ cpu = sched_balance_self(cpu, SD_BALANCE_FORK);
+ #endif
+- __set_task_cpu(p, cpu);
++ set_task_cpu(p, cpu);
+
+ /*
+ * Make sure we do not leak PI boosting priority to the child:
+ */
+ p->prio = current->normal_prio;
++ if (!rt_prio(p->prio))
++ p->sched_class = &fair_sched_class;
+
+ #if defined(CONFIG_SCHEDSTATS) || defined(CONFIG_TASK_DELAY_ACCT)
+ if (likely(sched_info_on()))
+ memset(&p->sched_info, 0, sizeof(p->sched_info));
+ #endif
+@@ -1655,44 +1741,28 @@ void sched_fork(struct task_struct *p, i
+ #endif
+ put_cpu();
+ }
+
+ /*
+- * After fork, child runs first. (default) If set to 0 then
+- * parent will (try to) run first.
+- */
+-unsigned int __read_mostly sysctl_sched_child_runs_first = 1;
+-
+-/*
+ * wake_up_new_task - wake up a newly created task for the first time.
+ *
+ * This function will do some initial scheduler statistics housekeeping
+ * that must be done for every newly created context, then puts the task
+ * on the runqueue and wakes it.
+ */
+ void fastcall wake_up_new_task(struct task_struct *p, unsigned long clone_flags)
+ {
+ unsigned long flags;
+ struct rq *rq;
+- int this_cpu;
+
+ rq = task_rq_lock(p, &flags);
+ BUG_ON(p->state != TASK_RUNNING);
+- this_cpu = smp_processor_id(); /* parent's CPU */
+ update_rq_clock(rq);
+
+ p->prio = effective_prio(p);
+
+- if (rt_prio(p->prio))
+- p->sched_class = &rt_sched_class;
+- else
+- p->sched_class = &fair_sched_class;
+-
+- if (!p->sched_class->task_new || !sysctl_sched_child_runs_first ||
+- (clone_flags & CLONE_VM) || task_cpu(p) != this_cpu ||
+- !current->se.on_rq) {
+-
++ if (!p->sched_class->task_new || !current->se.on_rq) {
+ activate_task(rq, p, 0);
+ } else {
+ /*
+ * Let the scheduling class do new task startup
+ * management (if any):
+@@ -1793,15 +1863,15 @@ prepare_task_switch(struct rq *rq, struc
+ * with a prepare_task_switch call before the context switch.
+ * finish_task_switch will reconcile locking set up by prepare_task_switch,
+ * and do any other architecture-specific cleanup actions.
+ *
+ * Note that we may have delayed dropping an mm in context_switch(). If
+- * so, we finish that here outside of the runqueue lock. (Doing it
++ * so, we finish that here outside of the runqueue lock. (Doing it
+ * with the lock held can cause deadlocks; see schedule() for
+ * details.)
+ */
+-static inline void finish_task_switch(struct rq *rq, struct task_struct *prev)
++static void finish_task_switch(struct rq *rq, struct task_struct *prev)
+ __releases(rq->lock)
+ {
+ struct mm_struct *mm = rq->prev_mm;
+ long prev_state;
+
+@@ -1847,11 +1917,11 @@ asmlinkage void schedule_tail(struct tas
+ #ifdef __ARCH_WANT_UNLOCKED_CTXSW
+ /* In this case, finish_task_switch does not reenable preemption */
+ preempt_enable();
+ #endif
+ if (current->set_child_tid)
+- put_user(current->pid, current->set_child_tid);
++ put_user(task_pid_vnr(current), current->set_child_tid);
+ }
+
+ /*
+ * context_switch - switch to the new MM and the new
+ * thread's register state.
+@@ -1979,56 +2049,30 @@ unsigned long nr_active(void)
+ * Update rq->cpu_load[] statistics. This function is usually called every
+ * scheduler tick (TICK_NSEC).
+ */
+ static void update_cpu_load(struct rq *this_rq)
+ {
+- u64 fair_delta64, exec_delta64, idle_delta64, sample_interval64, tmp64;
+- unsigned long total_load = this_rq->ls.load.weight;
+- unsigned long this_load = total_load;
+- struct load_stat *ls = &this_rq->ls;
++ unsigned long this_load = this_rq->load.weight;
+ int i, scale;
+
+ this_rq->nr_load_updates++;
+- if (unlikely(!(sysctl_sched_features & SCHED_FEAT_PRECISE_CPU_LOAD)))
+- goto do_avg;
+-
+- /* Update delta_fair/delta_exec fields first */
+- update_curr_load(this_rq);
+-
+- fair_delta64 = ls->delta_fair + 1;
+- ls->delta_fair = 0;
+-
+- exec_delta64 = ls->delta_exec + 1;
+- ls->delta_exec = 0;
+-
+- sample_interval64 = this_rq->clock - ls->load_update_last;
+- ls->load_update_last = this_rq->clock;
+-
+- if ((s64)sample_interval64 < (s64)TICK_NSEC)
+- sample_interval64 = TICK_NSEC;
+-
+- if (exec_delta64 > sample_interval64)
+- exec_delta64 = sample_interval64;
+-
+- idle_delta64 = sample_interval64 - exec_delta64;
+-
+- tmp64 = div64_64(SCHED_LOAD_SCALE * exec_delta64, fair_delta64);
+- tmp64 = div64_64(tmp64 * exec_delta64, sample_interval64);
+-
+- this_load = (unsigned long)tmp64;
+-
+-do_avg:
+
+ /* Update our load: */
+ for (i = 0, scale = 1; i < CPU_LOAD_IDX_MAX; i++, scale += scale) {
+ unsigned long old_load, new_load;
+
+ /* scale is effectively 1 << i now, and >> i divides by scale */
+
+ old_load = this_rq->cpu_load[i];
+ new_load = this_load;
+-
++ /*
++ * Round up the averaging division if load is increasing. This
++ * prevents us from getting stuck on 9 if the load is 10, for
++ * example.
++ */
++ if (new_load > old_load)
++ new_load += scale-1;
+ this_rq->cpu_load[i] = (old_load*(scale-1) + new_load) >> i;
+ }
+ }
+
+ #ifdef CONFIG_SMP
+@@ -2101,11 +2145,11 @@ static void double_lock_balance(struct r
+ }
+
+ /*
+ * If dest_cpu is allowed for this process, migrate the task to it.
+ * This is accomplished by forcing the cpu_allowed mask to only
+- * allow dest_cpu, which will force the cpu onto dest_cpu. Then
++ * allow dest_cpu, which will force the cpu onto dest_cpu. Then
+ * the cpu_allowed mask is restored.
+ */
+ static void sched_migrate_task(struct task_struct *p, int dest_cpu)
+ {
+ struct migration_req req;
+@@ -2176,44 +2220,69 @@ int can_migrate_task(struct task_struct
+ * We do not migrate tasks that are:
+ * 1) running (obviously), or
+ * 2) cannot be migrated to this CPU due to cpus_allowed, or
+ * 3) are cache-hot on their current CPU.
+ */
+- if (!cpu_isset(this_cpu, p->cpus_allowed))
++ if (!cpu_isset(this_cpu, p->cpus_allowed)) {
++ schedstat_inc(p, se.nr_failed_migrations_affine);
+ return 0;
++ }
+ *all_pinned = 0;
+
+- if (task_running(rq, p))
++ if (task_running(rq, p)) {
++ schedstat_inc(p, se.nr_failed_migrations_running);
+ return 0;
++ }
++
++ /*
++ * Aggressive migration if:
++ * 1) task is cache cold, or
++ * 2) too many balance attempts have failed.
++ */
++
++ if (!task_hot(p, rq->clock, sd) ||
++ sd->nr_balance_failed > sd->cache_nice_tries) {
++#ifdef CONFIG_SCHEDSTATS
++ if (task_hot(p, rq->clock, sd)) {
++ schedstat_inc(sd, lb_hot_gained[idle]);
++ schedstat_inc(p, se.nr_forced_migrations);
++ }
++#endif
++ return 1;
++ }
+
++ if (task_hot(p, rq->clock, sd)) {
++ schedstat_inc(p, se.nr_failed_migrations_hot);
++ return 0;
++ }
+ return 1;
+ }
+
+-static int balance_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
+- unsigned long max_nr_move, unsigned long max_load_move,
+- struct sched_domain *sd, enum cpu_idle_type idle,
+- int *all_pinned, unsigned long *load_moved,
+- int *this_best_prio, struct rq_iterator *iterator)
++static unsigned long
++balance_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
++ unsigned long max_load_move, struct sched_domain *sd,
++ enum cpu_idle_type idle, int *all_pinned,
++ int *this_best_prio, struct rq_iterator *iterator)
+ {
+- int pulled = 0, pinned = 0, skip_for_load;
++ int loops = 0, pulled = 0, pinned = 0, skip_for_load;
+ struct task_struct *p;
+ long rem_load_move = max_load_move;
+
+- if (max_nr_move == 0 || max_load_move == 0)
++ if (max_load_move == 0)
+ goto out;
+
+ pinned = 1;
+
+ /*
+ * Start the load-balancing iterator:
+ */
+ p = iterator->start(iterator->arg);
+ next:
+- if (!p)
++ if (!p || loops++ > sysctl_sched_nr_migrate)
+ goto out;
+ /*
+- * To help distribute high priority tasks accross CPUs we don't
++ * To help distribute high priority tasks across CPUs we don't
+ * skip a task if it will be the highest priority task (i.e. smallest
+ * prio value) on its new queue regardless of its load weight
+ */
+ skip_for_load = (p->se.load.weight >> 1) > rem_load_move +
+ SCHED_LOAD_SCALE_FUZZ;
+@@ -2226,31 +2295,30 @@ next:
+ pull_task(busiest, p, this_rq, this_cpu);
+ pulled++;
+ rem_load_move -= p->se.load.weight;
+
+ /*
+- * We only want to steal up to the prescribed number of tasks
+- * and the prescribed amount of weighted load.
++ * We only want to steal up to the prescribed amount of weighted load.
+ */
+- if (pulled < max_nr_move && rem_load_move > 0) {
++ if (rem_load_move > 0) {
+ if (p->prio < *this_best_prio)
+ *this_best_prio = p->prio;
+ p = iterator->next(iterator->arg);
+ goto next;
+ }
+ out:
+ /*
+- * Right now, this is the only place pull_task() is called,
++ * Right now, this is one of only two places pull_task() is called,
+ * so we can safely collect pull_task() stats here rather than
+ * inside pull_task().
+ */
+ schedstat_add(sd, lb_gained[idle], pulled);
+
+ if (all_pinned)
+ *all_pinned = pinned;
+- *load_moved = max_load_move - rem_load_move;
+- return pulled;
++
++ return max_load_move - rem_load_move;
+ }
+
+ /*
+ * move_tasks tries to move up to max_load_move weighted load from busiest to
+ * this_rq, as part of a balancing operation within domain "sd".
+@@ -2261,42 +2329,65 @@ out:
+ static int move_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
+ unsigned long max_load_move,
+ struct sched_domain *sd, enum cpu_idle_type idle,
+ int *all_pinned)
+ {
+- struct sched_class *class = sched_class_highest;
++ const struct sched_class *class = sched_class_highest;
+ unsigned long total_load_moved = 0;
+ int this_best_prio = this_rq->curr->prio;
+
+ do {
+ total_load_moved +=
+ class->load_balance(this_rq, this_cpu, busiest,
+- ULONG_MAX, max_load_move - total_load_moved,
++ max_load_move - total_load_moved,
+ sd, idle, all_pinned, &this_best_prio);
+ class = class->next;
+ } while (class && max_load_move > total_load_moved);
+
+ return total_load_moved > 0;
+ }
+
++static int
++iter_move_one_task(struct rq *this_rq, int this_cpu, struct rq *busiest,
++ struct sched_domain *sd, enum cpu_idle_type idle,
++ struct rq_iterator *iterator)
++{
++ struct task_struct *p = iterator->start(iterator->arg);
++ int pinned = 0;
++
++ while (p) {
++ if (can_migrate_task(p, busiest, this_cpu, sd, idle, &pinned)) {
++ pull_task(busiest, p, this_rq, this_cpu);
++ /*
++ * Right now, this is only the second place pull_task()
++ * is called, so we can safely collect pull_task()
++ * stats here rather than inside pull_task().
++ */
++ schedstat_inc(sd, lb_gained[idle]);
++
++ return 1;
++ }
++ p = iterator->next(iterator->arg);
++ }
++
++ return 0;
++}
++
+ /*
+ * move_one_task tries to move exactly one task from busiest to this_rq, as
+ * part of active balancing operations within "domain".
+ * Returns 1 if successful and 0 otherwise.
+ *
+ * Called with both runqueues locked.
+ */
+ static int move_one_task(struct rq *this_rq, int this_cpu, struct rq *busiest,
+ struct sched_domain *sd, enum cpu_idle_type idle)
+ {
+- struct sched_class *class;
+- int this_best_prio = MAX_PRIO;
++ const struct sched_class *class;
+
+ for (class = sched_class_highest; class; class = class->next)
+- if (class->load_balance(this_rq, this_cpu, busiest,
+- 1, ULONG_MAX, sd, idle, NULL,
+- &this_best_prio))
++ if (class->move_one_task(this_rq, this_cpu, busiest, sd, idle))
+ return 1;
+
+ return 0;
+ }
+
+@@ -2313,11 +2404,11 @@ find_busiest_group(struct sched_domain *
+ struct sched_group *busiest = NULL, *this = NULL, *group = sd->groups;
+ unsigned long max_load, avg_load, total_load, this_load, total_pwr;
+ unsigned long max_pull;
+ unsigned long busiest_load_per_task, busiest_nr_running;
+ unsigned long this_load_per_task, this_nr_running;
+- int load_idx;
++ int load_idx, group_imb = 0;
+ #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+ int power_savings_balance = 1;
+ unsigned long leader_nr_running = 0, min_load_per_task = 0;
+ unsigned long min_nr_running = ULONG_MAX;
+ struct sched_group *group_min = NULL, *group_leader = NULL;
+@@ -2332,23 +2423,26 @@ find_busiest_group(struct sched_domain *
+ load_idx = sd->newidle_idx;
+ else
+ load_idx = sd->idle_idx;
+
+ do {
+- unsigned long load, group_capacity;
++ unsigned long load, group_capacity, max_cpu_load, min_cpu_load;
+ int local_group;
+ int i;
++ int __group_imb = 0;
+ unsigned int balance_cpu = -1, first_idle_cpu = 0;
+ unsigned long sum_nr_running, sum_weighted_load;
+
+ local_group = cpu_isset(this_cpu, group->cpumask);
+
+ if (local_group)
+ balance_cpu = first_cpu(group->cpumask);
+
+ /* Tally up the load of all CPUs in the group */
+ sum_weighted_load = sum_nr_running = avg_load = 0;
++ max_cpu_load = 0;
++ min_cpu_load = ~0UL;
+
+ for_each_cpu_mask(i, group->cpumask) {
+ struct rq *rq;
+
+ if (!cpu_isset(i, *cpus))
+@@ -2365,12 +2459,17 @@ find_busiest_group(struct sched_domain *
+ first_idle_cpu = 1;
+ balance_cpu = i;
+ }
+
+ load = target_load(i, load_idx);
+- } else
++ } else {
+ load = source_load(i, load_idx);
++ if (load > max_cpu_load)
++ max_cpu_load = load;
++ if (min_cpu_load > load)
++ min_cpu_load = load;
++ }
+
+ avg_load += load;
+ sum_nr_running += rq->nr_running;
+ sum_weighted_load += weighted_cpuload(i);
+ }
+@@ -2392,23 +2491,27 @@ find_busiest_group(struct sched_domain *
+
+ /* Adjust by relative CPU power of the group */
+ avg_load = sg_div_cpu_power(group,
+ avg_load * SCHED_LOAD_SCALE);
+
++ if ((max_cpu_load - min_cpu_load) > SCHED_LOAD_SCALE)
++ __group_imb = 1;
++
+ group_capacity = group->__cpu_power / SCHED_LOAD_SCALE;
+
+ if (local_group) {
+ this_load = avg_load;
+ this = group;
+ this_nr_running = sum_nr_running;
+ this_load_per_task = sum_weighted_load;
+ } else if (avg_load > max_load &&
+- sum_nr_running > group_capacity) {
++ (sum_nr_running > group_capacity || __group_imb)) {
+ max_load = avg_load;
+ busiest = group;
+ busiest_nr_running = sum_nr_running;
+ busiest_load_per_task = sum_weighted_load;
++ group_imb = __group_imb;
+ }
+
+ #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+ /*
+ * Busy processors will not participate in power savings
+@@ -2476,19 +2579,22 @@ group_next:
+ if (this_load >= avg_load ||
+ 100*max_load <= sd->imbalance_pct*this_load)
+ goto out_balanced;
+
+ busiest_load_per_task /= busiest_nr_running;
++ if (group_imb)
++ busiest_load_per_task = min(busiest_load_per_task, avg_load);
++
+ /*
+ * We're trying to get all the cpus to the average_load, so we don't
+ * want to push ourselves above the average load, nor do we wish to
+ * reduce the max loaded cpu below the average load, as either of these
+ * actions would just result in more rebalancing later, and ping-pong
+ * tasks around. Thus we look for the minimum possible imbalance.
+ * Negative imbalances (*we* are more loaded than anyone else) will
+ * be counted as no imbalance for these purposes -- we can't fix that
+- * by pulling tasks to us. Be careful of negative numbers as they'll
++ * by pulling tasks to us. Be careful of negative numbers as they'll
+ * appear as very large values with unsigned longs.
+ */
+ if (max_load <= busiest_load_per_task)
+ goto out_balanced;
+
+@@ -2650,11 +2756,11 @@ static int load_balance(int this_cpu, st
+ */
+ if (idle != CPU_NOT_IDLE && sd->flags & SD_SHARE_CPUPOWER &&
+ !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
+ sd_idle = 1;
+
+- schedstat_inc(sd, lb_cnt[idle]);
++ schedstat_inc(sd, lb_count[idle]);
+
+ redo:
+ group = find_busiest_group(sd, this_cpu, &imbalance, idle, &sd_idle,
+ &cpus, balance);
+
+@@ -2803,11 +2909,11 @@ load_balance_newidle(int this_cpu, struc
+ */
+ if (sd->flags & SD_SHARE_CPUPOWER &&
+ !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
+ sd_idle = 1;
+
+- schedstat_inc(sd, lb_cnt[CPU_NEWLY_IDLE]);
++ schedstat_inc(sd, lb_count[CPU_NEWLY_IDLE]);
+ redo:
+ group = find_busiest_group(sd, this_cpu, &imbalance, CPU_NEWLY_IDLE,
+ &sd_idle, &cpus, NULL);
+ if (!group) {
+ schedstat_inc(sd, lb_nobusyg[CPU_NEWLY_IDLE]);
+@@ -2919,11 +3025,11 @@ static void active_load_balance(struct r
+
+ target_rq = cpu_rq(target_cpu);
+
+ /*
+ * This condition is "impossible", if it occurs
+- * we need to fix it. Originally reported by
++ * we need to fix it. Originally reported by
+ * Bjorn Helgaas on a 128-cpu setup.
+ */
+ BUG_ON(busiest_rq == target_rq);
+
+ /* move a task from busiest_rq to target_rq */
+@@ -2937,11 +3043,11 @@ static void active_load_balance(struct r
+ cpu_isset(busiest_cpu, sd->span))
+ break;
+ }
+
+ if (likely(sd)) {
+- schedstat_inc(sd, alb_cnt);
++ schedstat_inc(sd, alb_count);
+
+ if (move_one_task(target_rq, target_cpu, busiest_rq,
+ sd, CPU_IDLE))
+ schedstat_inc(sd, alb_pushed);
+ else
+@@ -2951,11 +3057,11 @@ static void active_load_balance(struct r
+ }
+
+ #ifdef CONFIG_NO_HZ
+ static struct {
+ atomic_t load_balancer;
+- cpumask_t cpu_mask;
++ cpumask_t cpu_mask;
+ } nohz ____cacheline_aligned = {
+ .load_balancer = ATOMIC_INIT(-1),
+ .cpu_mask = CPU_MASK_NONE,
+ };
+
+@@ -3030,11 +3136,11 @@ static DEFINE_SPINLOCK(balancing);
+ * It checks each scheduling domain to see if it is due to be balanced,
+ * and initiates a balancing operation if so.
+ *
+ * Balancing parameters are set up in arch_init_sched_domains.
+ */
+-static inline void rebalance_domains(int cpu, enum cpu_idle_type idle)
++static void rebalance_domains(int cpu, enum cpu_idle_type idle)
+ {
+ int balance = 1;
+ struct rq *rq = cpu_rq(cpu);
+ unsigned long interval;
+ struct sched_domain *sd;
+@@ -3214,22 +3320,10 @@ static inline void trigger_load_balance(
+ */
+ static inline void idle_balance(int cpu, struct rq *rq)
+ {
+ }
+
+-/* Avoid "used but not defined" warning on UP */
+-static int balance_tasks(struct rq *this_rq, int this_cpu, struct rq *busiest,
+- unsigned long max_nr_move, unsigned long max_load_move,
+- struct sched_domain *sd, enum cpu_idle_type idle,
+- int *all_pinned, unsigned long *load_moved,
+- int *this_best_prio, struct rq_iterator *iterator)
+-{
+- *load_moved = 0;
+-
+- return 0;
+-}
+-
+ #endif
+
+ DEFINE_PER_CPU(struct kernel_stat, kstat);
+
+ EXPORT_PER_CPU_SYMBOL(kstat);
+@@ -3244,11 +3338,11 @@ unsigned long long task_sched_runtime(st
+ u64 ns, delta_exec;
+ struct rq *rq;
+
+ rq = task_rq_lock(p, &flags);
+ ns = p->se.sum_exec_runtime;
+- if (rq->curr == p) {
++ if (task_current(rq, p)) {
+ update_rq_clock(rq);
+ delta_exec = rq->clock - p->se.exec_start;
+ if ((s64)delta_exec > 0)
+ ns += delta_exec;
+ }
+@@ -3258,11 +3352,10 @@ unsigned long long task_sched_runtime(st
+ }
+
+ /*
+ * Account user cpu time to a process.
+ * @p: the process that the cpu time gets accounted to
+- * @hardirq_offset: the offset to subtract from hardirq_count()
+ * @cputime: the cpu time spent in user space since the last update
+ */
+ void account_user_time(struct task_struct *p, cputime_t cputime)
+ {
+ struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
+@@ -3277,10 +3370,39 @@ void account_user_time(struct task_struc
+ else
+ cpustat->user = cputime64_add(cpustat->user, tmp);
+ }
+
+ /*
++ * Account guest cpu time to a process.
++ * @p: the process that the cpu time gets accounted to
++ * @cputime: the cpu time spent in virtual machine since the last update
++ */
++static void account_guest_time(struct task_struct *p, cputime_t cputime)
++{
++ cputime64_t tmp;
++ struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
++
++ tmp = cputime_to_cputime64(cputime);
++
++ p->utime = cputime_add(p->utime, cputime);
++ p->gtime = cputime_add(p->gtime, cputime);
++
++ cpustat->user = cputime64_add(cpustat->user, tmp);
++ cpustat->guest = cputime64_add(cpustat->guest, tmp);
++}
++
++/*
++ * Account scaled user cpu time to a process.
++ * @p: the process that the cpu time gets accounted to
++ * @cputime: the cpu time spent in user space since the last update
++ */
++void account_user_time_scaled(struct task_struct *p, cputime_t cputime)
++{
++ p->utimescaled = cputime_add(p->utimescaled, cputime);
++}
++
++/*
+ * Account system cpu time to a process.
+ * @p: the process that the cpu time gets accounted to
+ * @hardirq_offset: the offset to subtract from hardirq_count()
+ * @cputime: the cpu time spent in kernel space since the last update
+ */
+@@ -3289,10 +3411,13 @@ void account_system_time(struct task_str
+ {
+ struct cpu_usage_stat *cpustat = &kstat_this_cpu.cpustat;
+ struct rq *rq = this_rq();
+ cputime64_t tmp;
+
++ if ((p->flags & PF_VCPU) && (irq_count() - hardirq_offset == 0))
++ return account_guest_time(p, cputime);
++
+ p->stime = cputime_add(p->stime, cputime);
+
+ /* Add system time to cpustat. */
+ tmp = cputime_to_cputime64(cputime);
+ if (hardirq_count() - hardirq_offset)
+@@ -3308,10 +3433,21 @@ void account_system_time(struct task_str
+ /* Account for system time used */
+ acct_update_integrals(p);
+ }
+
+ /*
++ * Account scaled system cpu time to a process.
++ * @p: the process that the cpu time gets accounted to
++ * @hardirq_offset: the offset to subtract from hardirq_count()
++ * @cputime: the cpu time spent in kernel space since the last update
++ */
++void account_system_time_scaled(struct task_struct *p, cputime_t cputime)
++{
++ p->stimescaled = cputime_add(p->stimescaled, cputime);
++}
++
++/*
+ * Account for involuntary wait time.
+ * @p: the process from which the cpu time has been stolen
+ * @steal: the cpu time spent in involuntary wait
+ */
+ void account_steal_time(struct task_struct *p, cputime_t steal)
+@@ -3404,43 +3540,56 @@ EXPORT_SYMBOL(sub_preempt_count);
+ /*
+ * Print scheduling while atomic bug:
+ */
+ static noinline void __schedule_bug(struct task_struct *prev)
+ {
+- printk(KERN_ERR "BUG: scheduling while atomic: %s/0x%08x/%d\n",
+- prev->comm, preempt_count(), prev->pid);
++ struct pt_regs *regs = get_irq_regs();
++
++ printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
++ prev->comm, prev->pid, preempt_count());
++
+ debug_show_held_locks(prev);
+ if (irqs_disabled())
+ print_irqtrace_events(prev);
+- dump_stack();
++
++ if (regs)
++ show_regs(regs);
++ else
++ dump_stack();
+ }
+
+ /*
+ * Various schedule()-time debugging checks and statistics:
+ */
+ static inline void schedule_debug(struct task_struct *prev)
+ {
+ /*
+- * Test if we are atomic. Since do_exit() needs to call into
++ * Test if we are atomic. Since do_exit() needs to call into
+ * schedule() atomically, we ignore that path for now.
+ * Otherwise, whine if we are scheduling when we should not be.
+ */
+ if (unlikely(in_atomic_preempt_off()) && unlikely(!prev->exit_state))
+ __schedule_bug(prev);
+
+ profile_hit(SCHED_PROFILING, __builtin_return_address(0));
+
+- schedstat_inc(this_rq(), sched_cnt);
++ schedstat_inc(this_rq(), sched_count);
++#ifdef CONFIG_SCHEDSTATS
++ if (unlikely(prev->lock_depth >= 0)) {
++ schedstat_inc(this_rq(), bkl_count);
++ schedstat_inc(prev, sched_info.bkl_count);
++ }
++#endif
+ }
+
+ /*
+ * Pick up the highest-prio task:
+ */
+ static inline struct task_struct *
+ pick_next_task(struct rq *rq, struct task_struct *prev)
+ {
+- struct sched_class *class;
++ const struct sched_class *class;
+ struct task_struct *p;
+
+ /*
+ * Optimization: we know that if all tasks are in
+ * the fair class we can call that function directly:
+@@ -3485,13 +3634,17 @@ need_resched:
+ release_kernel_lock(prev);
+ need_resched_nonpreemptible:
+
+ schedule_debug(prev);
+
+- spin_lock_irq(&rq->lock);
+- clear_tsk_need_resched(prev);
++ /*
++ * Do the rq-clock update outside the rq lock:
++ */
++ local_irq_disable();
+ __update_rq_clock(rq);
++ spin_lock(&rq->lock);
++ clear_tsk_need_resched(prev);
+
+ if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
+ if (unlikely((prev->state & TASK_INTERRUPTIBLE) &&
+ unlikely(signal_pending(prev)))) {
+ prev->state = TASK_RUNNING;
+@@ -3530,11 +3683,11 @@ need_resched_nonpreemptible:
+ EXPORT_SYMBOL(schedule);
+
+ #ifdef CONFIG_PREEMPT
+ /*
+ * this is the entry point to schedule() from in-kernel preemption
+- * off of preempt_enable. Kernel preemptions off return from interrupt
++ * off of preempt_enable. Kernel preemptions off return from interrupt
+ * occur there and call schedule directly.
+ */
+ asmlinkage void __sched preempt_schedule(void)
+ {
+ struct thread_info *ti = current_thread_info();
+@@ -3542,36 +3695,39 @@ asmlinkage void __sched preempt_schedule
+ struct task_struct *task = current;
+ int saved_lock_depth;
+ #endif
+ /*
+ * If there is a non-zero preempt_count or interrupts are disabled,
+- * we do not want to preempt the current task. Just return..
++ * we do not want to preempt the current task. Just return..
+ */
+ if (likely(ti->preempt_count || irqs_disabled()))
+ return;
+
+-need_resched:
+- add_preempt_count(PREEMPT_ACTIVE);
+- /*
+- * We keep the big kernel semaphore locked, but we
+- * clear ->lock_depth so that schedule() doesnt
+- * auto-release the semaphore:
+- */
++ do {
++ add_preempt_count(PREEMPT_ACTIVE);
++
++ /*
++ * We keep the big kernel semaphore locked, but we
++ * clear ->lock_depth so that schedule() doesnt
++ * auto-release the semaphore:
++ */
+ #ifdef CONFIG_PREEMPT_BKL
+- saved_lock_depth = task->lock_depth;
+- task->lock_depth = -1;
++ saved_lock_depth = task->lock_depth;
++ task->lock_depth = -1;
+ #endif
+- schedule();
++ schedule();
+ #ifdef CONFIG_PREEMPT_BKL
+- task->lock_depth = saved_lock_depth;
++ task->lock_depth = saved_lock_depth;
+ #endif
+- sub_preempt_count(PREEMPT_ACTIVE);
++ sub_preempt_count(PREEMPT_ACTIVE);
+
+- /* we could miss a preemption opportunity between schedule and now */
+- barrier();
+- if (unlikely(test_thread_flag(TIF_NEED_RESCHED)))
+- goto need_resched;
++ /*
++ * Check again in case we missed a preemption opportunity
++ * between schedule and now.
++ */
++ barrier();
++ } while (unlikely(test_thread_flag(TIF_NEED_RESCHED)));
+ }
+ EXPORT_SYMBOL(preempt_schedule);
+
+ /*
+ * this is the entry point to schedule() from kernel preemption
+@@ -3587,33 +3743,36 @@ asmlinkage void __sched preempt_schedule
+ int saved_lock_depth;
+ #endif
+ /* Catch callers which need to be fixed */
+ BUG_ON(ti->preempt_count || !irqs_disabled());
+
+-need_resched:
+- add_preempt_count(PREEMPT_ACTIVE);
+- /*
+- * We keep the big kernel semaphore locked, but we
+- * clear ->lock_depth so that schedule() doesnt
+- * auto-release the semaphore:
+- */
++ do {
++ add_preempt_count(PREEMPT_ACTIVE);
++
++ /*
++ * We keep the big kernel semaphore locked, but we
++ * clear ->lock_depth so that schedule() doesnt
++ * auto-release the semaphore:
++ */
+ #ifdef CONFIG_PREEMPT_BKL
+- saved_lock_depth = task->lock_depth;
+- task->lock_depth = -1;
++ saved_lock_depth = task->lock_depth;
++ task->lock_depth = -1;
+ #endif
+- local_irq_enable();
+- schedule();
+- local_irq_disable();
++ local_irq_enable();
++ schedule();
++ local_irq_disable();
+ #ifdef CONFIG_PREEMPT_BKL
+- task->lock_depth = saved_lock_depth;
++ task->lock_depth = saved_lock_depth;
+ #endif
+- sub_preempt_count(PREEMPT_ACTIVE);
++ sub_preempt_count(PREEMPT_ACTIVE);
+
+- /* we could miss a preemption opportunity between schedule and now */
+- barrier();
+- if (unlikely(test_thread_flag(TIF_NEED_RESCHED)))
+- goto need_resched;
++ /*
++ * Check again in case we missed a preemption opportunity
++ * between schedule and now.
++ */
++ barrier();
++ } while (unlikely(test_thread_flag(TIF_NEED_RESCHED)));
+ }
+
+ #endif /* CONFIG_PREEMPT */
+
+ int default_wake_function(wait_queue_t *curr, unsigned mode, int sync,
+@@ -3622,25 +3781,24 @@ int default_wake_function(wait_queue_t *
+ return try_to_wake_up(curr->private, mode, sync);
+ }
+ EXPORT_SYMBOL(default_wake_function);
+
+ /*
+- * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
+- * wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
++ * The core wakeup function. Non-exclusive wakeups (nr_exclusive == 0) just
++ * wake everything up. If it's an exclusive wakeup (nr_exclusive == small +ve
+ * number) then we wake all the non-exclusive tasks and one exclusive task.
+ *
+ * There are circumstances in which we can try to wake a task which has already
+- * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
++ * started to run but is not in state TASK_RUNNING. try_to_wake_up() returns
+ * zero in this (rare) case, and we handle it by continuing to scan the queue.
+ */
+ static void __wake_up_common(wait_queue_head_t *q, unsigned int mode,
+ int nr_exclusive, int sync, void *key)
+ {
+- struct list_head *tmp, *next;
++ wait_queue_t *curr, *next;
+
+- list_for_each_safe(tmp, next, &q->task_list) {
+- wait_queue_t *curr = list_entry(tmp, wait_queue_t, task_list);
++ list_for_each_entry_safe(curr, next, &q->task_list, task_list) {
+ unsigned flags = curr->flags;
+
+ if (curr->func(curr, mode, sync, key) &&
+ (flags & WQ_FLAG_EXCLUSIVE) && !--nr_exclusive)
+ break;
+@@ -3702,11 +3860,11 @@ __wake_up_sync(wait_queue_head_t *q, uns
+ __wake_up_common(q, mode, nr_exclusive, sync, NULL);
+ spin_unlock_irqrestore(&q->lock, flags);
+ }
+ EXPORT_SYMBOL_GPL(__wake_up_sync); /* For internal use only */
+
+-void fastcall complete(struct completion *x)
++void complete(struct completion *x)
+ {
+ unsigned long flags;
+
+ spin_lock_irqsave(&x->wait.lock, flags);
+ x->done++;
+@@ -3714,11 +3872,11 @@ void fastcall complete(struct completion
+ 1, 0, NULL);
+ spin_unlock_irqrestore(&x->wait.lock, flags);
+ }
+ EXPORT_SYMBOL(complete);
+
+-void fastcall complete_all(struct completion *x)
++void complete_all(struct completion *x)
+ {
+ unsigned long flags;
+
+ spin_lock_irqsave(&x->wait.lock, flags);
+ x->done += UINT_MAX/2;
+@@ -3726,210 +3884,123 @@ void fastcall complete_all(struct comple
+ 0, 0, NULL);
+ spin_unlock_irqrestore(&x->wait.lock, flags);
+ }
+ EXPORT_SYMBOL(complete_all);
+
+-void fastcall __sched wait_for_completion(struct completion *x)
+-{
+- might_sleep();
+-
+- spin_lock_irq(&x->wait.lock);
+- if (!x->done) {
+- DECLARE_WAITQUEUE(wait, current);
+-
+- wait.flags |= WQ_FLAG_EXCLUSIVE;
+- __add_wait_queue_tail(&x->wait, &wait);
+- do {
+- __set_current_state(TASK_UNINTERRUPTIBLE);
+- spin_unlock_irq(&x->wait.lock);
+- schedule();
+- spin_lock_irq(&x->wait.lock);
+- } while (!x->done);
+- __remove_wait_queue(&x->wait, &wait);
+- }
+- x->done--;
+- spin_unlock_irq(&x->wait.lock);
+-}
+-EXPORT_SYMBOL(wait_for_completion);
+-
+-unsigned long fastcall __sched
+-wait_for_completion_timeout(struct completion *x, unsigned long timeout)
++static inline long __sched
++do_wait_for_common(struct completion *x, long timeout, int state)
+ {
+- might_sleep();
+-
+- spin_lock_irq(&x->wait.lock);
+ if (!x->done) {
+ DECLARE_WAITQUEUE(wait, current);
+
+ wait.flags |= WQ_FLAG_EXCLUSIVE;
+ __add_wait_queue_tail(&x->wait, &wait);
+ do {
+- __set_current_state(TASK_UNINTERRUPTIBLE);
++ if (state == TASK_INTERRUPTIBLE &&
++ signal_pending(current)) {
++ __remove_wait_queue(&x->wait, &wait);
++ return -ERESTARTSYS;
++ }
++ __set_current_state(state);
+ spin_unlock_irq(&x->wait.lock);
+ timeout = schedule_timeout(timeout);
+ spin_lock_irq(&x->wait.lock);
+ if (!timeout) {
+ __remove_wait_queue(&x->wait, &wait);
+- goto out;
++ return timeout;
+ }
+ } while (!x->done);
+ __remove_wait_queue(&x->wait, &wait);
+ }
+ x->done--;
+-out:
+- spin_unlock_irq(&x->wait.lock);
+ return timeout;
+ }
+-EXPORT_SYMBOL(wait_for_completion_timeout);
+
+-int fastcall __sched wait_for_completion_interruptible(struct completion *x)
++static long __sched
++wait_for_common(struct completion *x, long timeout, int state)
+ {
+- int ret = 0;
+-
+ might_sleep();
+
+ spin_lock_irq(&x->wait.lock);
+- if (!x->done) {
+- DECLARE_WAITQUEUE(wait, current);
+-
+- wait.flags |= WQ_FLAG_EXCLUSIVE;
+- __add_wait_queue_tail(&x->wait, &wait);
+- do {
+- if (signal_pending(current)) {
+- ret = -ERESTARTSYS;
+- __remove_wait_queue(&x->wait, &wait);
+- goto out;
+- }
+- __set_current_state(TASK_INTERRUPTIBLE);
+- spin_unlock_irq(&x->wait.lock);
+- schedule();
+- spin_lock_irq(&x->wait.lock);
+- } while (!x->done);
+- __remove_wait_queue(&x->wait, &wait);
+- }
+- x->done--;
+-out:
++ timeout = do_wait_for_common(x, timeout, state);
+ spin_unlock_irq(&x->wait.lock);
+-
+- return ret;
++ return timeout;
+ }
+-EXPORT_SYMBOL(wait_for_completion_interruptible);
+
+-unsigned long fastcall __sched
+-wait_for_completion_interruptible_timeout(struct completion *x,
+- unsigned long timeout)
++void __sched wait_for_completion(struct completion *x)
+ {
+- might_sleep();
+-
+- spin_lock_irq(&x->wait.lock);
+- if (!x->done) {
+- DECLARE_WAITQUEUE(wait, current);
++ wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_UNINTERRUPTIBLE);
++}
++EXPORT_SYMBOL(wait_for_completion);
+
+- wait.flags |= WQ_FLAG_EXCLUSIVE;
+- __add_wait_queue_tail(&x->wait, &wait);
+- do {
+- if (signal_pending(current)) {
+- timeout = -ERESTARTSYS;
+- __remove_wait_queue(&x->wait, &wait);
+- goto out;
+- }
+- __set_current_state(TASK_INTERRUPTIBLE);
+- spin_unlock_irq(&x->wait.lock);
+- timeout = schedule_timeout(timeout);
+- spin_lock_irq(&x->wait.lock);
+- if (!timeout) {
+- __remove_wait_queue(&x->wait, &wait);
+- goto out;
+- }
+- } while (!x->done);
+- __remove_wait_queue(&x->wait, &wait);
+- }
+- x->done--;
+-out:
+- spin_unlock_irq(&x->wait.lock);
+- return timeout;
++unsigned long __sched
++wait_for_completion_timeout(struct completion *x, unsigned long timeout)
++{
++ return wait_for_common(x, timeout, TASK_UNINTERRUPTIBLE);
+ }
+-EXPORT_SYMBOL(wait_for_completion_interruptible_timeout);
++EXPORT_SYMBOL(wait_for_completion_timeout);
+
+-static inline void
+-sleep_on_head(wait_queue_head_t *q, wait_queue_t *wait, unsigned long *flags)
++int __sched wait_for_completion_interruptible(struct completion *x)
+ {
+- spin_lock_irqsave(&q->lock, *flags);
+- __add_wait_queue(q, wait);
+- spin_unlock(&q->lock);
++ long t = wait_for_common(x, MAX_SCHEDULE_TIMEOUT, TASK_INTERRUPTIBLE);
++ if (t == -ERESTARTSYS)
++ return t;
++ return 0;
+ }
++EXPORT_SYMBOL(wait_for_completion_interruptible);
+
+-static inline void
+-sleep_on_tail(wait_queue_head_t *q, wait_queue_t *wait, unsigned long *flags)
++unsigned long __sched
++wait_for_completion_interruptible_timeout(struct completion *x,
++ unsigned long timeout)
+ {
+- spin_lock_irq(&q->lock);
+- __remove_wait_queue(q, wait);
+- spin_unlock_irqrestore(&q->lock, *flags);
++ return wait_for_common(x, timeout, TASK_INTERRUPTIBLE);
+ }
++EXPORT_SYMBOL(wait_for_completion_interruptible_timeout);
+
+-void __sched interruptible_sleep_on(wait_queue_head_t *q)
++static long __sched
++sleep_on_common(wait_queue_head_t *q, int state, long timeout)
+ {
+ unsigned long flags;
+ wait_queue_t wait;
+
+ init_waitqueue_entry(&wait, current);
+
+- current->state = TASK_INTERRUPTIBLE;
++ __set_current_state(state);
+
+- sleep_on_head(q, &wait, &flags);
+- schedule();
+- sleep_on_tail(q, &wait, &flags);
++ spin_lock_irqsave(&q->lock, flags);
++ __add_wait_queue(q, &wait);
++ spin_unlock(&q->lock);
++ timeout = schedule_timeout(timeout);
++ spin_lock_irq(&q->lock);
++ __remove_wait_queue(q, &wait);
++ spin_unlock_irqrestore(&q->lock, flags);
++
++ return timeout;
++}
++
++void __sched interruptible_sleep_on(wait_queue_head_t *q)
++{
++ sleep_on_common(q, TASK_INTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
+ }
+ EXPORT_SYMBOL(interruptible_sleep_on);
+
+ long __sched
+ interruptible_sleep_on_timeout(wait_queue_head_t *q, long timeout)
+ {
+- unsigned long flags;
+- wait_queue_t wait;
+-
+- init_waitqueue_entry(&wait, current);
+-
+- current->state = TASK_INTERRUPTIBLE;
+-
+- sleep_on_head(q, &wait, &flags);
+- timeout = schedule_timeout(timeout);
+- sleep_on_tail(q, &wait, &flags);
+-
+- return timeout;
++ return sleep_on_common(q, TASK_INTERRUPTIBLE, timeout);
+ }
+ EXPORT_SYMBOL(interruptible_sleep_on_timeout);
+
+ void __sched sleep_on(wait_queue_head_t *q)
+ {
+- unsigned long flags;
+- wait_queue_t wait;
+-
+- init_waitqueue_entry(&wait, current);
+-
+- current->state = TASK_UNINTERRUPTIBLE;
+-
+- sleep_on_head(q, &wait, &flags);
+- schedule();
+- sleep_on_tail(q, &wait, &flags);
++ sleep_on_common(q, TASK_UNINTERRUPTIBLE, MAX_SCHEDULE_TIMEOUT);
+ }
+ EXPORT_SYMBOL(sleep_on);
+
+ long __sched sleep_on_timeout(wait_queue_head_t *q, long timeout)
+ {
+- unsigned long flags;
+- wait_queue_t wait;
+-
+- init_waitqueue_entry(&wait, current);
+-
+- current->state = TASK_UNINTERRUPTIBLE;
+-
+- sleep_on_head(q, &wait, &flags);
+- timeout = schedule_timeout(timeout);
+- sleep_on_tail(q, &wait, &flags);
+-
+- return timeout;
++ return sleep_on_common(q, TASK_UNINTERRUPTIBLE, timeout);
+ }
+ EXPORT_SYMBOL(sleep_on_timeout);
+
+ #ifdef CONFIG_RT_MUTEXES
+
+@@ -3944,38 +4015,44 @@ EXPORT_SYMBOL(sleep_on_timeout);
+ * Used by the rt_mutex code to implement priority inheritance logic.
+ */
+ void rt_mutex_setprio(struct task_struct *p, int prio)
+ {
+ unsigned long flags;
+- int oldprio, on_rq;
++ int oldprio, on_rq, running;
+ struct rq *rq;
+
+ BUG_ON(prio < 0 || prio > MAX_PRIO);
+
+ rq = task_rq_lock(p, &flags);
+ update_rq_clock(rq);
+
+ oldprio = p->prio;
+ on_rq = p->se.on_rq;
+- if (on_rq)
++ running = task_current(rq, p);
++ if (on_rq) {
+ dequeue_task(rq, p, 0);
++ if (running)
++ p->sched_class->put_prev_task(rq, p);
++ }
+
+ if (rt_prio(prio))
+ p->sched_class = &rt_sched_class;
+ else
+ p->sched_class = &fair_sched_class;
+
+ p->prio = prio;
+
+ if (on_rq) {
++ if (running)
++ p->sched_class->set_curr_task(rq);
+ enqueue_task(rq, p, 0);
+ /*
+ * Reschedule if we are currently running on this runqueue and
+ * our priority decreased, or if we are not currently running on
+ * this runqueue and our priority is higher than the current's
+ */
+- if (task_running(rq, p)) {
++ if (running) {
+ if (p->prio > oldprio)
+ resched_task(rq->curr);
+ } else {
+ check_preempt_curr(rq, p);
+ }
+@@ -4135,13 +4212,13 @@ struct task_struct *idle_task(int cpu)
+
+ /**
+ * find_process_by_pid - find a process with a matching PID value.
+ * @pid: the pid in question.
+ */
+-static inline struct task_struct *find_process_by_pid(pid_t pid)
++static struct task_struct *find_process_by_pid(pid_t pid)
+ {
+- return pid ? find_task_by_pid(pid) : current;
++ return pid ? find_task_by_vpid(pid) : current;
+ }
+
+ /* Actually do priority change: must hold rq lock. */
+ static void
+ __setscheduler(struct rq *rq, struct task_struct *p, int policy, int prio)
+@@ -4177,11 +4254,11 @@ __setscheduler(struct rq *rq, struct tas
+ * NOTE that the task may be already dead.
+ */
+ int sched_setscheduler(struct task_struct *p, int policy,
+ struct sched_param *param)
+ {
+- int retval, oldprio, oldpolicy = -1, on_rq;
++ int retval, oldprio, oldpolicy = -1, on_rq, running;
+ unsigned long flags;
+ struct rq *rq;
+
+ /* may grab non-irq protected spin_locks */
+ BUG_ON(in_interrupt());
+@@ -4259,22 +4336,30 @@ recheck:
+ spin_unlock_irqrestore(&p->pi_lock, flags);
+ goto recheck;
+ }
+ update_rq_clock(rq);
+ on_rq = p->se.on_rq;
+- if (on_rq)
++ running = task_current(rq, p);
++ if (on_rq) {
+ deactivate_task(rq, p, 0);
++ if (running)
++ p->sched_class->put_prev_task(rq, p);
++ }
++
+ oldprio = p->prio;
+ __setscheduler(rq, p, policy, param->sched_priority);
++
+ if (on_rq) {
++ if (running)
++ p->sched_class->set_curr_task(rq);
+ activate_task(rq, p, 0);
+ /*
+ * Reschedule if we are currently running on this runqueue and
+ * our priority decreased, or if we are not currently running on
+ * this runqueue and our priority is higher than the current's
+ */
+- if (task_running(rq, p)) {
++ if (running) {
+ if (p->prio > oldprio)
+ resched_task(rq->curr);
+ } else {
+ check_preempt_curr(rq, p);
+ }
+@@ -4314,12 +4399,12 @@ do_sched_setscheduler(pid_t pid, int pol
+ * sys_sched_setscheduler - set/change the scheduler policy and RT priority
+ * @pid: the pid in question.
+ * @policy: new policy.
+ * @param: structure containing the new RT priority.
+ */
+-asmlinkage long sys_sched_setscheduler(pid_t pid, int policy,
+- struct sched_param __user *param)
++asmlinkage long
++sys_sched_setscheduler(pid_t pid, int policy, struct sched_param __user *param)
+ {
+ /* negative values for policy are not valid */
+ if (policy < 0)
+ return -EINVAL;
+
+@@ -4341,26 +4426,24 @@ asmlinkage long sys_sched_setparam(pid_t
+ * @pid: the pid in question.
+ */
+ asmlinkage long sys_sched_getscheduler(pid_t pid)
+ {
+ struct task_struct *p;
+- int retval = -EINVAL;
++ int retval;
+
+ if (pid < 0)
+- goto out_nounlock;
++ return -EINVAL;
+
+ retval = -ESRCH;
+ read_lock(&tasklist_lock);
+ p = find_process_by_pid(pid);
+ if (p) {
+ retval = security_task_getscheduler(p);
+ if (!retval)
+ retval = p->policy;
+ }
+ read_unlock(&tasklist_lock);
+-
+-out_nounlock:
+ return retval;
+ }
+
+ /**
+ * sys_sched_getscheduler - get the RT priority of a thread
+@@ -4369,14 +4452,14 @@ out_nounlock:
+ */
+ asmlinkage long sys_sched_getparam(pid_t pid, struct sched_param __user *param)
+ {
+ struct sched_param lp;
+ struct task_struct *p;
+- int retval = -EINVAL;
++ int retval;
+
+ if (!param || pid < 0)
+- goto out_nounlock;
++ return -EINVAL;
+
+ read_lock(&tasklist_lock);
+ p = find_process_by_pid(pid);
+ retval = -ESRCH;
+ if (!p)
+@@ -4392,11 +4475,10 @@ asmlinkage long sys_sched_getparam(pid_t
+ /*
+ * This one might sleep, we cannot do it with a spinlock held ...
+ */
+ retval = copy_to_user(param, &lp, sizeof(*param)) ? -EFAULT : 0;
+
+-out_nounlock:
+ return retval;
+
+ out_unlock:
+ read_unlock(&tasklist_lock);
+ return retval;
+@@ -4418,11 +4500,11 @@ long sched_setaffinity(pid_t pid, cpumas
+ return -ESRCH;
+ }
+
+ /*
+ * It is not safe to call set_cpus_allowed with the
+- * tasklist_lock held. We will bump the task_struct's
++ * tasklist_lock held. We will bump the task_struct's
+ * usage count and then drop tasklist_lock.
+ */
+ get_task_struct(p);
+ read_unlock(&tasklist_lock);
+
+@@ -4435,12 +4517,25 @@ long sched_setaffinity(pid_t pid, cpumas
+ if (retval)
+ goto out_unlock;
+
+ cpus_allowed = cpuset_cpus_allowed(p);
+ cpus_and(new_mask, new_mask, cpus_allowed);
++ again:
+ retval = set_cpus_allowed(p, new_mask);
+
++ if (!retval) {
++ cpus_allowed = cpuset_cpus_allowed(p);
++ if (!cpus_subset(new_mask, cpus_allowed)) {
++ /*
++ * We must have raced with a concurrent cpuset
++ * update. Just reset the cpus_allowed to the
++ * cpuset's cpus_allowed
++ */
++ new_mask = cpus_allowed;
++ goto again;
++ }
++ }
+ out_unlock:
+ put_task_struct(p);
+ mutex_unlock(&sched_hotcpu_mutex);
+ return retval;
+ }
+@@ -4552,12 +4647,12 @@ asmlinkage long sys_sched_getaffinity(pi
+ */
+ asmlinkage long sys_sched_yield(void)
+ {
+ struct rq *rq = this_rq_lock();
+
+- schedstat_inc(rq, yld_cnt);
+- current->sched_class->yield_task(rq, current);
++ schedstat_inc(rq, yld_count);
++ current->sched_class->yield_task(rq);
+
+ /*
+ * Since we are going to call schedule() anyway, there's
+ * no need to preempt or enable interrupts:
+ */
+@@ -4601,11 +4696,11 @@ EXPORT_SYMBOL(cond_resched);
+
+ /*
+ * cond_resched_lock() - if a reschedule is pending, drop the given lock,
+ * call schedule, and on return reacquire the lock.
+ *
+- * This works OK both with and without CONFIG_PREEMPT. We do strange low-level
++ * This works OK both with and without CONFIG_PREEMPT. We do strange low-level
+ * operations here to prevent schedule() from being called twice (once via
+ * spin_unlock(), once by hand).
+ */
+ int cond_resched_lock(spinlock_t *lock)
+ {
+@@ -4655,11 +4750,11 @@ void __sched yield(void)
+ sys_sched_yield();
+ }
+ EXPORT_SYMBOL(yield);
+
+ /*
+- * This task is about to go to sleep on IO. Increment rq->nr_iowait so
++ * This task is about to go to sleep on IO. Increment rq->nr_iowait so
+ * that process accounting knows that this is a task in IO wait state.
+ *
+ * But don't do that if it is a deliberate, throttling IO wait (this task
+ * has set its backing_dev_info: the queue against which it should throttle)
+ */
+@@ -4747,15 +4842,16 @@ asmlinkage long sys_sched_get_priority_m
+ */
+ asmlinkage
+ long sys_sched_rr_get_interval(pid_t pid, struct timespec __user *interval)
+ {
+ struct task_struct *p;
+- int retval = -EINVAL;
++ unsigned int time_slice;
++ int retval;
+ struct timespec t;
+
+ if (pid < 0)
+- goto out_nounlock;
++ return -EINVAL;
+
+ retval = -ESRCH;
+ read_lock(&tasklist_lock);
+ p = find_process_by_pid(pid);
+ if (!p)
+@@ -4763,16 +4859,32 @@ long sys_sched_rr_get_interval(pid_t pid
+
+ retval = security_task_getscheduler(p);
+ if (retval)
+ goto out_unlock;
+
+- jiffies_to_timespec(p->policy == SCHED_FIFO ?
+- 0 : static_prio_timeslice(p->static_prio), &t);
++ /*
++ * Time slice is 0 for SCHED_FIFO tasks and for SCHED_OTHER
++ * tasks that are on an otherwise idle runqueue:
++ */
++ time_slice = 0;
++ if (p->policy == SCHED_RR) {
++ time_slice = DEF_TIMESLICE;
++ } else {
++ struct sched_entity *se = &p->se;
++ unsigned long flags;
++ struct rq *rq;
++
++ rq = task_rq_lock(p, &flags);
++ if (rq->cfs.load.weight)
++ time_slice = NS_TO_JIFFIES(sched_slice(&rq->cfs, se));
++ task_rq_unlock(rq, &flags);
++ }
+ read_unlock(&tasklist_lock);
++ jiffies_to_timespec(time_slice, &t);
+ retval = copy_to_user(interval, &t, sizeof(t)) ? -EFAULT : 0;
+-out_nounlock:
+ return retval;
++
+ out_unlock:
+ read_unlock(&tasklist_lock);
+ return retval;
+ }
+
+@@ -4782,32 +4894,33 @@ static void show_task(struct task_struct
+ {
+ unsigned long free = 0;
+ unsigned state;
+
+ state = p->state ? __ffs(p->state) + 1 : 0;
+- printk("%-13.13s %c", p->comm,
++ printk(KERN_INFO "%-13.13s %c", p->comm,
+ state < sizeof(stat_nam) - 1 ? stat_nam[state] : '?');
+ #if BITS_PER_LONG == 32
+ if (state == TASK_RUNNING)
+- printk(" running ");
++ printk(KERN_CONT " running ");
+ else
+- printk(" %08lx ", thread_saved_pc(p));
++ printk(KERN_CONT " %08lx ", thread_saved_pc(p));
+ #else
+ if (state == TASK_RUNNING)
+- printk(" running task ");
++ printk(KERN_CONT " running task ");
+ else
+- printk(" %016lx ", thread_saved_pc(p));
++ printk(KERN_CONT " %016lx ", thread_saved_pc(p));
+ #endif
+ #ifdef CONFIG_DEBUG_STACK_USAGE
+ {
+ unsigned long *n = end_of_stack(p);
+ while (!*n)
+ n++;
+ free = (unsigned long)n - (unsigned long)end_of_stack(p);
+ }
+ #endif
+- printk("%5lu %5d %6d\n", free, p->pid, p->parent->pid);
++ printk(KERN_CONT "%5lu %5d %6d\n", free,
++ task_pid_nr(p), task_pid_nr(p->parent));
+
+ if (state != TASK_RUNNING)
+ show_stack(p, NULL);
+ }
+
+@@ -4909,22 +5022,22 @@ cpumask_t nohz_cpu_mask = CPU_MASK_NONE;
+ * This idea comes from the SD scheduler of Con Kolivas:
+ */
+ static inline void sched_init_granularity(void)
+ {
+ unsigned int factor = 1 + ilog2(num_online_cpus());
+- const unsigned long limit = 100000000;
++ const unsigned long limit = 200000000;
+
+ sysctl_sched_min_granularity *= factor;
+ if (sysctl_sched_min_granularity > limit)
+ sysctl_sched_min_granularity = limit;
+
+ sysctl_sched_latency *= factor;
+ if (sysctl_sched_latency > limit)
+ sysctl_sched_latency = limit;
+
+- sysctl_sched_runtime_limit = sysctl_sched_latency;
+- sysctl_sched_wakeup_granularity = sysctl_sched_min_granularity / 2;
++ sysctl_sched_wakeup_granularity *= factor;
++ sysctl_sched_batch_wakeup_granularity *= factor;
+ }
+
+ #ifdef CONFIG_SMP
+ /*
+ * This is how migration works:
+@@ -4946,11 +5059,11 @@ static inline void sched_init_granularit
+ * Change a given task's CPU affinity. Migrate the thread to a
+ * proper CPU and schedule it away if the CPU it's executing on
+ * is removed from the allowed bitmask.
+ *
+ * NOTE: the caller must have a valid reference to the task, the
+- * task must not exit() & deallocate itself prematurely. The
++ * task must not exit() & deallocate itself prematurely. The
+ * call is not atomic; no spinlocks may be held.
+ */
+ int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
+ {
+ struct migration_req req;
+@@ -4983,11 +5096,11 @@ out:
+ return ret;
+ }
+ EXPORT_SYMBOL_GPL(set_cpus_allowed);
+
+ /*
+- * Move (not current) task off this cpu, onto dest cpu. We're doing
++ * Move (not current) task off this cpu, onto dest cpu. We're doing
+ * this because either it can't run here any more (set_cpus_allowed()
+ * away from this CPU, or CPU going down), or because we're
+ * attempting to rebalance this task on exec (sched_exec).
+ *
+ * So we race with normal scheduler movements, but that's OK, as long
+@@ -5045,10 +5158,12 @@ static int migration_thread(void *data)
+ set_current_state(TASK_INTERRUPTIBLE);
+ while (!kthread_should_stop()) {
+ struct migration_req *req;
+ struct list_head *head;
+
++ try_to_freeze();
++
+ spin_lock_irq(&rq->lock);
+
+ if (cpu_is_offline(cpu)) {
+ spin_unlock_irq(&rq->lock);
+ goto wait_to_die;
+@@ -5089,50 +5204,69 @@ wait_to_die:
+ __set_current_state(TASK_RUNNING);
+ return 0;
+ }
+
+ #ifdef CONFIG_HOTPLUG_CPU
++
++static int __migrate_task_irq(struct task_struct *p, int src_cpu, int dest_cpu)
++{
++ int ret;
++
++ local_irq_disable();
++ ret = __migrate_task(p, src_cpu, dest_cpu);
++ local_irq_enable();
++ return ret;
++}
++
+ /*
+- * Figure out where task on dead CPU should go, use force if neccessary.
++ * Figure out where task on dead CPU should go, use force if necessary.
+ * NOTE: interrupts should be disabled by the caller
+ */
+ static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *p)
+ {
+ unsigned long flags;
+ cpumask_t mask;
+ struct rq *rq;
+ int dest_cpu;
+
+-restart:
+- /* On same node? */
+- mask = node_to_cpumask(cpu_to_node(dead_cpu));
+- cpus_and(mask, mask, p->cpus_allowed);
+- dest_cpu = any_online_cpu(mask);
+-
+- /* On any allowed CPU? */
+- if (dest_cpu == NR_CPUS)
+- dest_cpu = any_online_cpu(p->cpus_allowed);
++ do {
++ /* On same node? */
++ mask = node_to_cpumask(cpu_to_node(dead_cpu));
++ cpus_and(mask, mask, p->cpus_allowed);
++ dest_cpu = any_online_cpu(mask);
++
++ /* On any allowed CPU? */
++ if (dest_cpu == NR_CPUS)
++ dest_cpu = any_online_cpu(p->cpus_allowed);
++
++ /* No more Mr. Nice Guy. */
++ if (dest_cpu == NR_CPUS) {
++ cpumask_t cpus_allowed = cpuset_cpus_allowed_locked(p);
++ /*
++ * Try to stay on the same cpuset, where the
++ * current cpuset may be a subset of all cpus.
++ * The cpuset_cpus_allowed_locked() variant of
++ * cpuset_cpus_allowed() will not block. It must be
++ * called within calls to cpuset_lock/cpuset_unlock.
++ */
++ rq = task_rq_lock(p, &flags);
++ p->cpus_allowed = cpus_allowed;
++ dest_cpu = any_online_cpu(p->cpus_allowed);
++ task_rq_unlock(rq, &flags);
+
+- /* No more Mr. Nice Guy. */
+- if (dest_cpu == NR_CPUS) {
+- rq = task_rq_lock(p, &flags);
+- cpus_setall(p->cpus_allowed);
+- dest_cpu = any_online_cpu(p->cpus_allowed);
+- task_rq_unlock(rq, &flags);
+-
+- /*
+- * Don't tell them about moving exiting tasks or
+- * kernel threads (both mm NULL), since they never
+- * leave kernel.
+- */
+- if (p->mm && printk_ratelimit())
+- printk(KERN_INFO "process %d (%s) no "
+- "longer affine to cpu%d\n",
+- p->pid, p->comm, dead_cpu);
+- }
+- if (!__migrate_task(p, dead_cpu, dest_cpu))
+- goto restart;
++ /*
++ * Don't tell them about moving exiting tasks or
++ * kernel threads (both mm NULL), since they never
++ * leave kernel.
++ */
++ if (p->mm && printk_ratelimit()) {
++ printk(KERN_INFO "process %d (%s) no "
++ "longer affine to cpu%d\n",
++ task_pid_nr(p), p->comm, dead_cpu);
++ }
++ }
++ } while (!__migrate_task_irq(p, dead_cpu, dest_cpu));
+ }
+
+ /*
+ * While a dead CPU has no uninterruptible tasks queued at this point,
+ * it might still have a nonzero ->nr_uninterruptible counter, because
+@@ -5156,27 +5290,27 @@ static void migrate_nr_uninterruptible(s
+ /* Run through task list and migrate tasks from the dead cpu. */
+ static void migrate_live_tasks(int src_cpu)
+ {
+ struct task_struct *p, *t;
+
+- write_lock_irq(&tasklist_lock);
++ read_lock(&tasklist_lock);
+
+ do_each_thread(t, p) {
+ if (p == current)
+ continue;
+
+ if (task_cpu(p) == src_cpu)
+ move_task_off_dead_cpu(src_cpu, p);
+ } while_each_thread(t, p);
+
+- write_unlock_irq(&tasklist_lock);
++ read_unlock(&tasklist_lock);
+ }
+
+ /*
+ * Schedules idle task to be the next runnable task on current CPU.
+- * It does so by boosting its priority to highest possible and adding it to
+- * the _front_ of the runqueue. Used by CPU offline code.
++ * It does so by boosting its priority to highest possible.
++ * Used by CPU offline code.
+ */
+ void sched_idle_next(void)
+ {
+ int this_cpu = smp_processor_id();
+ struct rq *rq = cpu_rq(this_cpu);
+@@ -5192,12 +5326,12 @@ void sched_idle_next(void)
+ */
+ spin_lock_irqsave(&rq->lock, flags);
+
+ __setscheduler(rq, p, SCHED_FIFO, MAX_RT_PRIO-1);
+
+- /* Add idle task to the _front_ of its priority queue: */
+- activate_idle_task(p, rq);
++ update_rq_clock(rq);
++ activate_task(rq, p, 0);
+
+ spin_unlock_irqrestore(&rq->lock, flags);
+ }
+
+ /*
+@@ -5219,26 +5353,25 @@ void idle_task_exit(void)
+ static void migrate_dead(unsigned int dead_cpu, struct task_struct *p)
+ {
+ struct rq *rq = cpu_rq(dead_cpu);
+
+ /* Must be exiting, otherwise would be on tasklist. */
+- BUG_ON(p->exit_state != EXIT_ZOMBIE && p->exit_state != EXIT_DEAD);
++ BUG_ON(!p->exit_state);
+
+ /* Cannot have done final schedule yet: would have vanished. */
+ BUG_ON(p->state == TASK_DEAD);
+
+ get_task_struct(p);
+
+ /*
+ * Drop lock around migration; if someone else moves it,
+- * that's OK. No task can be added to this CPU, so iteration is
++ * that's OK. No task can be added to this CPU, so iteration is
+ * fine.
+- * NOTE: interrupts should be left disabled --dev@
+ */
+- spin_unlock(&rq->lock);
++ spin_unlock_irq(&rq->lock);
+ move_task_off_dead_cpu(dead_cpu, p);
+- spin_lock(&rq->lock);
++ spin_lock_irq(&rq->lock);
+
+ put_task_struct(p);
+ }
+
+ /* release_task() removes task from tasklist, so we won't find dead tasks. */
+@@ -5265,34 +5398,52 @@ static void migrate_dead_tasks(unsigned
+ static struct ctl_table sd_ctl_dir[] = {
+ {
+ .procname = "sched_domain",
+ .mode = 0555,
+ },
+- {0,},
++ {0, },
+ };
+
+ static struct ctl_table sd_ctl_root[] = {
+ {
+ .ctl_name = CTL_KERN,
+ .procname = "kernel",
+ .mode = 0555,
+ .child = sd_ctl_dir,
+ },
+- {0,},
++ {0, },
+ };
+
+ static struct ctl_table *sd_alloc_ctl_entry(int n)
+ {
+ struct ctl_table *entry =
+- kmalloc(n * sizeof(struct ctl_table), GFP_KERNEL);
+-
+- BUG_ON(!entry);
+- memset(entry, 0, n * sizeof(struct ctl_table));
++ kcalloc(n, sizeof(struct ctl_table), GFP_KERNEL);
+
+ return entry;
+ }
+
++static void sd_free_ctl_entry(struct ctl_table **tablep)
++{
++ struct ctl_table *entry;
++
++ /*
++ * In the intermediate directories, both the child directory and
++ * procname are dynamically allocated and could fail but the mode
++ * will always be set. In the lowest directory the names are
++ * static strings and all have proc handlers.
++ */
++ for (entry = *tablep; entry->mode; entry++) {
++ if (entry->child)
++ sd_free_ctl_entry(&entry->child);
++ if (entry->proc_handler == NULL)
++ kfree(entry->procname);
++ }
++
++ kfree(*tablep);
++ *tablep = NULL;
++}
++
+ static void
+ set_table_entry(struct ctl_table *entry,
+ const char *procname, void *data, int maxlen,
+ mode_t mode, proc_handler *proc_handler)
+ {
+@@ -5306,10 +5457,13 @@ set_table_entry(struct ctl_table *entry,
+ static struct ctl_table *
+ sd_alloc_ctl_domain_table(struct sched_domain *sd)
+ {
+ struct ctl_table *table = sd_alloc_ctl_entry(12);
+
++ if (table == NULL)
++ return NULL;
++
+ set_table_entry(&table[0], "min_interval", &sd->min_interval,
+ sizeof(long), 0644, proc_doulongvec_minmax);
+ set_table_entry(&table[1], "max_interval", &sd->max_interval,
+ sizeof(long), 0644, proc_doulongvec_minmax);
+ set_table_entry(&table[2], "busy_idx", &sd->busy_idx,
+@@ -5329,10 +5483,11 @@ sd_alloc_ctl_domain_table(struct sched_d
+ set_table_entry(&table[9], "cache_nice_tries",
+ &sd->cache_nice_tries,
+ sizeof(int), 0644, proc_dointvec_minmax);
+ set_table_entry(&table[10], "flags", &sd->flags,
+ sizeof(int), 0644, proc_dointvec_minmax);
++ /* &table[11] is terminator */
+
+ return table;
+ }
+
+ static ctl_table *sd_alloc_ctl_cpu_table(int cpu)
+@@ -5343,10 +5498,12 @@ static ctl_table *sd_alloc_ctl_cpu_table
+ char buf[32];
+
+ for_each_domain(cpu, sd)
+ domain_num++;
+ entry = table = sd_alloc_ctl_entry(domain_num + 1);
++ if (table == NULL)
++ return NULL;
+
+ i = 0;
+ for_each_domain(cpu, sd) {
+ snprintf(buf, 32, "domain%d", i);
+ entry->procname = kstrdup(buf, GFP_KERNEL);
+@@ -5357,28 +5514,48 @@ static ctl_table *sd_alloc_ctl_cpu_table
+ }
+ return table;
+ }
+
+ static struct ctl_table_header *sd_sysctl_header;
+-static void init_sched_domain_sysctl(void)
++static void register_sched_domain_sysctl(void)
+ {
+ int i, cpu_num = num_online_cpus();
+ struct ctl_table *entry = sd_alloc_ctl_entry(cpu_num + 1);
+ char buf[32];
+
++ WARN_ON(sd_ctl_dir[0].child);
+ sd_ctl_dir[0].child = entry;
+
+- for (i = 0; i < cpu_num; i++, entry++) {
++ if (entry == NULL)
++ return;
++
++ for_each_online_cpu(i) {
+ snprintf(buf, 32, "cpu%d", i);
+ entry->procname = kstrdup(buf, GFP_KERNEL);
+ entry->mode = 0555;
+ entry->child = sd_alloc_ctl_cpu_table(i);
++ entry++;
+ }
++
++ WARN_ON(sd_sysctl_header);
+ sd_sysctl_header = register_sysctl_table(sd_ctl_root);
+ }
++
++/* may be called multiple times per register */
++static void unregister_sched_domain_sysctl(void)
++{
++ if (sd_sysctl_header)
++ unregister_sysctl_table(sd_sysctl_header);
++ sd_sysctl_header = NULL;
++ if (sd_ctl_dir[0].child)
++ sd_free_ctl_entry(&sd_ctl_dir[0].child);
++}
+ #else
+-static void init_sched_domain_sysctl(void)
++static void register_sched_domain_sysctl(void)
++{
++}
++static void unregister_sched_domain_sysctl(void)
+ {
+ }
+ #endif
+
+ /*
+@@ -5401,57 +5578,62 @@ migration_call(struct notifier_block *nf
+ case CPU_UP_PREPARE:
+ case CPU_UP_PREPARE_FROZEN:
+ p = kthread_create(migration_thread, hcpu, "migration/%d", cpu);
+ if (IS_ERR(p))
+ return NOTIFY_BAD;
++ p->flags |= PF_NOFREEZE;
+ kthread_bind(p, cpu);
+ /* Must be high prio: stop_machine expects to yield to it. */
+ rq = task_rq_lock(p, &flags);
+ __setscheduler(rq, p, SCHED_FIFO, MAX_RT_PRIO-1);
+ task_rq_unlock(rq, &flags);
+ cpu_rq(cpu)->migration_thread = p;
+ break;
+
+ case CPU_ONLINE:
+ case CPU_ONLINE_FROZEN:
+- /* Strictly unneccessary, as first user will wake it. */
++ /* Strictly unnecessary, as first user will wake it. */
+ wake_up_process(cpu_rq(cpu)->migration_thread);
+ break;
+
+ #ifdef CONFIG_HOTPLUG_CPU
+ case CPU_UP_CANCELED:
+ case CPU_UP_CANCELED_FROZEN:
+ if (!cpu_rq(cpu)->migration_thread)
+ break;
+- /* Unbind it from offline cpu so it can run. Fall thru. */
++ /* Unbind it from offline cpu so it can run. Fall thru. */
+ kthread_bind(cpu_rq(cpu)->migration_thread,
+ any_online_cpu(cpu_online_map));
+ kthread_stop(cpu_rq(cpu)->migration_thread);
+ cpu_rq(cpu)->migration_thread = NULL;
+ break;
+
+ case CPU_DEAD:
+ case CPU_DEAD_FROZEN:
++ cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */
+ migrate_live_tasks(cpu);
+ rq = cpu_rq(cpu);
+ kthread_stop(rq->migration_thread);
+ rq->migration_thread = NULL;
+ /* Idle task back to normal (off runqueue, low prio) */
+- rq = task_rq_lock(rq->idle, &flags);
++ spin_lock_irq(&rq->lock);
+ update_rq_clock(rq);
+ deactivate_task(rq, rq->idle, 0);
+ rq->idle->static_prio = MAX_PRIO;
+ __setscheduler(rq, rq->idle, SCHED_NORMAL, 0);
+ rq->idle->sched_class = &idle_sched_class;
+ migrate_dead_tasks(cpu);
+- task_rq_unlock(rq, &flags);
++ spin_unlock_irq(&rq->lock);
++ cpuset_unlock();
+ migrate_nr_uninterruptible(rq);
+ BUG_ON(rq->nr_running != 0);
+
+- /* No need to migrate the tasks: it was best-effort if
+- * they didn't take sched_hotcpu_mutex. Just wake up
+- * the requestors. */
++ /*
++ * No need to migrate the tasks: it was best-effort if
++ * they didn't take sched_hotcpu_mutex. Just wake up
++ * the requestors.
++ */
+ spin_lock_irq(&rq->lock);
+ while (!list_empty(&rq->migration_queue)) {
+ struct migration_req *req;
+
+ req = list_entry(rq->migration_queue.next,
+@@ -5475,125 +5657,125 @@ migration_call(struct notifier_block *nf
+ static struct notifier_block __cpuinitdata migration_notifier = {
+ .notifier_call = migration_call,
+ .priority = 10
+ };
+
+-int __init migration_init(void)
++void __init migration_init(void)
+ {
+ void *cpu = (void *)(long)smp_processor_id();
+ int err;
+
+ /* Start one for the boot CPU: */
+ err = migration_call(&migration_notifier, CPU_UP_PREPARE, cpu);
+ BUG_ON(err == NOTIFY_BAD);
+ migration_call(&migration_notifier, CPU_ONLINE, cpu);
+ register_cpu_notifier(&migration_notifier);
+-
+- return 0;
+ }
+ #endif
+
+ #ifdef CONFIG_SMP
+
+ /* Number of possible processor ids */
+ int nr_cpu_ids __read_mostly = NR_CPUS;
+ EXPORT_SYMBOL(nr_cpu_ids);
+
+-#undef SCHED_DOMAIN_DEBUG
+-#ifdef SCHED_DOMAIN_DEBUG
+-static void sched_domain_debug(struct sched_domain *sd, int cpu)
+-{
+- int level = 0;
++#ifdef CONFIG_SCHED_DEBUG
+
+- if (!sd) {
+- printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
+- return;
++static int sched_domain_debug_one(struct sched_domain *sd, int cpu, int level)
++{
++ struct sched_group *group = sd->groups;
++ cpumask_t groupmask;
++ char str[NR_CPUS];
++
++ cpumask_scnprintf(str, NR_CPUS, sd->span);
++ cpus_clear(groupmask);
++
++ printk(KERN_DEBUG "%*s domain %d: ", level, "", level);
++
++ if (!(sd->flags & SD_LOAD_BALANCE)) {
++ printk("does not load-balance\n");
++ if (sd->parent)
++ printk(KERN_ERR "ERROR: !SD_LOAD_BALANCE domain"
++ " has parent");
++ return -1;
+ }
+
+- printk(KERN_DEBUG "CPU%d attaching sched-domain:\n", cpu);
++ printk(KERN_CONT "span %s\n", str);
+
++ if (!cpu_isset(cpu, sd->span)) {
++ printk(KERN_ERR "ERROR: domain->span does not contain "
++ "CPU%d\n", cpu);
++ }
++ if (!cpu_isset(cpu, group->cpumask)) {
++ printk(KERN_ERR "ERROR: domain->groups does not contain"
++ " CPU%d\n", cpu);
++ }
++
++ printk(KERN_DEBUG "%*s groups:", level + 1, "");
+ do {
+- int i;
+- char str[NR_CPUS];
+- struct sched_group *group = sd->groups;
+- cpumask_t groupmask;
+-
+- cpumask_scnprintf(str, NR_CPUS, sd->span);
+- cpus_clear(groupmask);
+-
+- printk(KERN_DEBUG);
+- for (i = 0; i < level + 1; i++)
+- printk(" ");
+- printk("domain %d: ", level);
+-
+- if (!(sd->flags & SD_LOAD_BALANCE)) {
+- printk("does not load-balance\n");
+- if (sd->parent)
+- printk(KERN_ERR "ERROR: !SD_LOAD_BALANCE domain"
+- " has parent");
++ if (!group) {
++ printk("\n");
++ printk(KERN_ERR "ERROR: group is NULL\n");
+ break;
+ }
+
+- printk("span %s\n", str);
++ if (!group->__cpu_power) {
++ printk(KERN_CONT "\n");
++ printk(KERN_ERR "ERROR: domain->cpu_power not "
++ "set\n");
++ break;
++ }
+
+- if (!cpu_isset(cpu, sd->span))
+- printk(KERN_ERR "ERROR: domain->span does not contain "
+- "CPU%d\n", cpu);
+- if (!cpu_isset(cpu, group->cpumask))
+- printk(KERN_ERR "ERROR: domain->groups does not contain"
+- " CPU%d\n", cpu);
+-
+- printk(KERN_DEBUG);
+- for (i = 0; i < level + 2; i++)
+- printk(" ");
+- printk("groups:");
+- do {
+- if (!group) {
+- printk("\n");
+- printk(KERN_ERR "ERROR: group is NULL\n");
+- break;
+- }
++ if (!cpus_weight(group->cpumask)) {
++ printk(KERN_CONT "\n");
++ printk(KERN_ERR "ERROR: empty group\n");
++ break;
++ }
+
+- if (!group->__cpu_power) {
+- printk("\n");
+- printk(KERN_ERR "ERROR: domain->cpu_power not "
+- "set\n");
+- }
++ if (cpus_intersects(groupmask, group->cpumask)) {
++ printk(KERN_CONT "\n");
++ printk(KERN_ERR "ERROR: repeated CPUs\n");
++ break;
++ }
+
+- if (!cpus_weight(group->cpumask)) {
+- printk("\n");
+- printk(KERN_ERR "ERROR: empty group\n");
+- }
++ cpus_or(groupmask, groupmask, group->cpumask);
+
+- if (cpus_intersects(groupmask, group->cpumask)) {
+- printk("\n");
+- printk(KERN_ERR "ERROR: repeated CPUs\n");
+- }
++ cpumask_scnprintf(str, NR_CPUS, group->cpumask);
++ printk(KERN_CONT " %s", str);
++
++ group = group->next;
++ } while (group != sd->groups);
++ printk(KERN_CONT "\n");
++
++ if (!cpus_equal(sd->span, groupmask))
++ printk(KERN_ERR "ERROR: groups don't span domain->span\n");
++
++ if (sd->parent && !cpus_subset(groupmask, sd->parent->span))
++ printk(KERN_ERR "ERROR: parent span is not a superset "
++ "of domain->span\n");
++ return 0;
++}
+
+- cpus_or(groupmask, groupmask, group->cpumask);
++static void sched_domain_debug(struct sched_domain *sd, int cpu)
++{
++ int level = 0;
+
+- cpumask_scnprintf(str, NR_CPUS, group->cpumask);
+- printk(" %s", str);
++ if (!sd) {
++ printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
++ return;
++ }
+
+- group = group->next;
+- } while (group != sd->groups);
+- printk("\n");
+-
+- if (!cpus_equal(sd->span, groupmask))
+- printk(KERN_ERR "ERROR: groups don't span "
+- "domain->span\n");
++ printk(KERN_DEBUG "CPU%d attaching sched-domain:\n", cpu);
+
++ for (;;) {
++ if (sched_domain_debug_one(sd, cpu, level))
++ break;
+ level++;
+ sd = sd->parent;
+ if (!sd)
+- continue;
+-
+- if (!cpus_subset(groupmask, sd->span))
+- printk(KERN_ERR "ERROR: parent span is not a superset "
+- "of domain->span\n");
+-
+- } while (sd);
++ break;
++ }
+ }
+ #else
+ # define sched_domain_debug(sd, cpu) do { } while (0)
+ #endif
+
+@@ -5698,11 +5880,11 @@ static int __init isolated_cpu_setup(cha
+ if (ints[i] < NR_CPUS)
+ cpu_set(ints[i], cpu_isolated_map);
+ return 1;
+ }
+
+-__setup ("isolcpus=", isolated_cpu_setup);
++__setup("isolcpus=", isolated_cpu_setup);
+
+ /*
+ * init_sched_build_groups takes the cpumask we wish to span, and a pointer
+ * to a function which identifies what group(along with sched group) a CPU
+ * belongs to. The return value of group_fn must be a >= 0 and < NR_CPUS
+@@ -5755,11 +5937,11 @@ init_sched_build_groups(cpumask_t span,
+ /**
+ * find_next_best_node - find the next node to include in a sched_domain
+ * @node: node whose sched_domain we're building
+ * @used_nodes: nodes already in the sched_domain
+ *
+- * Find the next node to include in a given scheduling domain. Simply
++ * Find the next node to include in a given scheduling domain. Simply
+ * finds the closest node not already in the @used_nodes map.
+ *
+ * Should use nodemask_t.
+ */
+ static int find_next_best_node(int node, unsigned long *used_nodes)
+@@ -5795,11 +5977,11 @@ static int find_next_best_node(int node,
+ /**
+ * sched_domain_node_span - get a cpumask for a node's sched_domain
+ * @node: node whose cpumask we're constructing
+ * @size: number of nodes to include in this span
+ *
+- * Given a node, construct a good cpumask for its sched_domain to span. It
++ * Given a node, construct a good cpumask for its sched_domain to span. It
+ * should be one that prevents unnecessary balancing, but also spreads tasks
+ * out optimally.
+ */
+ static cpumask_t sched_domain_node_span(int node)
+ {
+@@ -5832,12 +6014,12 @@ int sched_smt_power_savings = 0, sched_m
+ */
+ #ifdef CONFIG_SCHED_SMT
+ static DEFINE_PER_CPU(struct sched_domain, cpu_domains);
+ static DEFINE_PER_CPU(struct sched_group, sched_group_cpus);
+
+-static int cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map,
+- struct sched_group **sg)
++static int
++cpu_to_cpu_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+ {
+ if (sg)
+ *sg = &per_cpu(sched_group_cpus, cpu);
+ return cpu;
+ }
+@@ -5850,44 +6032,44 @@ static int cpu_to_cpu_group(int cpu, con
+ static DEFINE_PER_CPU(struct sched_domain, core_domains);
+ static DEFINE_PER_CPU(struct sched_group, sched_group_core);
+ #endif
+
+ #if defined(CONFIG_SCHED_MC) && defined(CONFIG_SCHED_SMT)
+-static int cpu_to_core_group(int cpu, const cpumask_t *cpu_map,
+- struct sched_group **sg)
++static int
++cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+ {
+ int group;
+- cpumask_t mask = cpu_sibling_map[cpu];
++ cpumask_t mask = cpu_sibling_map(cpu);
+ cpus_and(mask, mask, *cpu_map);
+ group = first_cpu(mask);
+ if (sg)
+ *sg = &per_cpu(sched_group_core, group);
+ return group;
+ }
+ #elif defined(CONFIG_SCHED_MC)
+-static int cpu_to_core_group(int cpu, const cpumask_t *cpu_map,
+- struct sched_group **sg)
++static int
++cpu_to_core_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+ {
+ if (sg)
+ *sg = &per_cpu(sched_group_core, cpu);
+ return cpu;
+ }
+ #endif
+
+ static DEFINE_PER_CPU(struct sched_domain, phys_domains);
+ static DEFINE_PER_CPU(struct sched_group, sched_group_phys);
+
+-static int cpu_to_phys_group(int cpu, const cpumask_t *cpu_map,
+- struct sched_group **sg)
++static int
++cpu_to_phys_group(int cpu, const cpumask_t *cpu_map, struct sched_group **sg)
+ {
+ int group;
+ #ifdef CONFIG_SCHED_MC
+ cpumask_t mask = cpu_coregroup_map(cpu);
+ cpus_and(mask, mask, *cpu_map);
+ group = first_cpu(mask);
+ #elif defined(CONFIG_SCHED_SMT)
+- cpumask_t mask = cpu_sibling_map[cpu];
++ cpumask_t mask = cpu_sibling_map(cpu);
+ cpus_and(mask, mask, *cpu_map);
+ group = first_cpu(mask);
+ #else
+ group = cpu;
+ #endif
+@@ -5927,28 +6109,27 @@ static void init_numa_sched_groups_power
+ struct sched_group *sg = group_head;
+ int j;
+
+ if (!sg)
+ return;
+-next_sg:
+- for_each_cpu_mask(j, sg->cpumask) {
+- struct sched_domain *sd;
++ do {
++ for_each_cpu_mask(j, sg->cpumask) {
++ struct sched_domain *sd;
+
+- sd = &per_cpu(phys_domains, j);
+- if (j != first_cpu(sd->groups->cpumask)) {
+- /*
+- * Only add "power" once for each
+- * physical package.
+- */
+- continue;
+- }
++ sd = &per_cpu(phys_domains, j);
++ if (j != first_cpu(sd->groups->cpumask)) {
++ /*
++ * Only add "power" once for each
++ * physical package.
++ */
++ continue;
++ }
+
+- sg_inc_cpu_power(sg, sd->groups->__cpu_power);
+- }
+- sg = sg->next;
+- if (sg != group_head)
+- goto next_sg;
++ sg_inc_cpu_power(sg, sd->groups->__cpu_power);
++ }
++ sg = sg->next;
++ } while (sg != group_head);
+ }
+ #endif
+
+ #ifdef CONFIG_NUMA
+ /* Free memory allocated for various sched_group structures */
+@@ -6055,12 +6236,12 @@ static int build_sched_domains(const cpu
+ int sd_allnodes = 0;
+
+ /*
+ * Allocate the per-node list of sched groups
+ */
+- sched_group_nodes = kzalloc(sizeof(struct sched_group *)*MAX_NUMNODES,
+- GFP_KERNEL);
++ sched_group_nodes = kcalloc(MAX_NUMNODES, sizeof(struct sched_group *),
++ GFP_KERNEL);
+ if (!sched_group_nodes) {
+ printk(KERN_WARNING "Can not alloc sched group node list\n");
+ return -ENOMEM;
+ }
+ sched_group_nodes_bycpu[first_cpu(*cpu_map)] = sched_group_nodes;
+@@ -6118,22 +6299,22 @@ static int build_sched_domains(const cpu
+
+ #ifdef CONFIG_SCHED_SMT
+ p = sd;
+ sd = &per_cpu(cpu_domains, i);
+ *sd = SD_SIBLING_INIT;
+- sd->span = cpu_sibling_map[i];
++ sd->span = cpu_sibling_map(i);
+ cpus_and(sd->span, sd->span, *cpu_map);
+ sd->parent = p;
+ p->child = sd;
+ cpu_to_cpu_group(i, cpu_map, &sd->groups);
+ #endif
+ }
+
+ #ifdef CONFIG_SCHED_SMT
+ /* Set up CPU (sibling) groups */
+ for_each_cpu_mask(i, *cpu_map) {
+- cpumask_t this_sibling_map = cpu_sibling_map[i];
++ cpumask_t this_sibling_map = cpu_sibling_map(i);
+ cpus_and(this_sibling_map, this_sibling_map, *cpu_map);
+ if (i != first_cpu(this_sibling_map))
+ continue;
+
+ init_sched_build_groups(this_sibling_map, cpu_map,
+@@ -6291,26 +6472,37 @@ static int build_sched_domains(const cpu
+ error:
+ free_sched_groups(cpu_map);
+ return -ENOMEM;
+ #endif
+ }
++
++static cpumask_t *doms_cur; /* current sched domains */
++static int ndoms_cur; /* number of sched domains in 'doms_cur' */
++
++/*
++ * Special case: If a kmalloc of a doms_cur partition (array of
++ * cpumask_t) fails, then fallback to a single sched domain,
++ * as determined by the single cpumask_t fallback_doms.
++ */
++static cpumask_t fallback_doms;
++
+ /*
+- * Set up scheduler domains and groups. Callers must hold the hotplug lock.
++ * Set up scheduler domains and groups. Callers must hold the hotplug lock.
++ * For now this just excludes isolated cpus, but could be used to
++ * exclude other special cases in the future.
+ */
+ static int arch_init_sched_domains(const cpumask_t *cpu_map)
+ {
+- cpumask_t cpu_default_map;
+ int err;
+
+- /*
+- * Setup mask for cpus without special case scheduling requirements.
+- * For now this just excludes isolated cpus, but could be used to
+- * exclude other special cases in the future.
+- */
+- cpus_andnot(cpu_default_map, *cpu_map, cpu_isolated_map);
+-
+- err = build_sched_domains(&cpu_default_map);
++ ndoms_cur = 1;
++ doms_cur = kmalloc(sizeof(cpumask_t), GFP_KERNEL);
++ if (!doms_cur)
++ doms_cur = &fallback_doms;
++ cpus_andnot(*doms_cur, *cpu_map, cpu_isolated_map);
++ err = build_sched_domains(doms_cur);
++ register_sched_domain_sysctl();
+
+ return err;
+ }
+
+ static void arch_destroy_sched_domains(const cpumask_t *cpu_map)
+@@ -6324,41 +6516,83 @@ static void arch_destroy_sched_domains(c
+ */
+ static void detach_destroy_domains(const cpumask_t *cpu_map)
+ {
+ int i;
+
++ unregister_sched_domain_sysctl();
++
+ for_each_cpu_mask(i, *cpu_map)
+ cpu_attach_domain(NULL, i);
+ synchronize_sched();
+ arch_destroy_sched_domains(cpu_map);
+ }
+
+ /*
+- * Partition sched domains as specified by the cpumasks below.
+- * This attaches all cpus from the cpumasks to the NULL domain,
+- * waits for a RCU quiescent period, recalculates sched
+- * domain information and then attaches them back to the
+- * correct sched domains
++ * Partition sched domains as specified by the 'ndoms_new'
++ * cpumasks in the array doms_new[] of cpumasks. This compares
++ * doms_new[] to the current sched domain partitioning, doms_cur[].
++ * It destroys each deleted domain and builds each new domain.
++ *
++ * 'doms_new' is an array of cpumask_t's of length 'ndoms_new'.
++ * The masks don't intersect (don't overlap.) We should setup one
++ * sched domain for each mask. CPUs not in any of the cpumasks will
++ * not be load balanced. If the same cpumask appears both in the
++ * current 'doms_cur' domains and in the new 'doms_new', we can leave
++ * it as it is.
++ *
++ * The passed in 'doms_new' should be kmalloc'd. This routine takes
++ * ownership of it and will kfree it when done with it. If the caller
++ * failed the kmalloc call, then it can pass in doms_new == NULL,
++ * and partition_sched_domains() will fallback to the single partition
++ * 'fallback_doms'.
++ *
+ * Call with hotplug lock held
+ */
+-int partition_sched_domains(cpumask_t *partition1, cpumask_t *partition2)
++void partition_sched_domains(int ndoms_new, cpumask_t *doms_new)
+ {
+- cpumask_t change_map;
+- int err = 0;
++ int i, j;
+
+- cpus_and(*partition1, *partition1, cpu_online_map);
+- cpus_and(*partition2, *partition2, cpu_online_map);
+- cpus_or(change_map, *partition1, *partition2);
+-
+- /* Detach sched domains from all of the affected cpus */
+- detach_destroy_domains(&change_map);
+- if (!cpus_empty(*partition1))
+- err = build_sched_domains(partition1);
+- if (!err && !cpus_empty(*partition2))
+- err = build_sched_domains(partition2);
++ /* always unregister in case we don't destroy any domains */
++ unregister_sched_domain_sysctl();
+
+- return err;
++ if (doms_new == NULL) {
++ ndoms_new = 1;
++ doms_new = &fallback_doms;
++ cpus_andnot(doms_new[0], cpu_online_map, cpu_isolated_map);
++ }
++
++ /* Destroy deleted domains */
++ for (i = 0; i < ndoms_cur; i++) {
++ for (j = 0; j < ndoms_new; j++) {
++ if (cpus_equal(doms_cur[i], doms_new[j]))
++ goto match1;
++ }
++ /* no match - a current sched domain not in new doms_new[] */
++ detach_destroy_domains(doms_cur + i);
++match1:
++ ;
++ }
++
++ /* Build new domains */
++ for (i = 0; i < ndoms_new; i++) {
++ for (j = 0; j < ndoms_cur; j++) {
++ if (cpus_equal(doms_new[i], doms_cur[j]))
++ goto match2;
++ }
++ /* no match - add a new doms_new */
++ build_sched_domains(doms_new + i);
++match2:
++ ;
++ }
++
++ /* Remember the new sched domains */
++ if (doms_cur != &fallback_doms)
++ kfree(doms_cur);
++ doms_cur = doms_new;
++ ndoms_cur = ndoms_new;
++
++ register_sched_domain_sysctl();
+ }
+
+ #if defined(CONFIG_SCHED_MC) || defined(CONFIG_SCHED_SMT)
+ static int arch_reinit_sched_domains(void)
+ {
+@@ -6434,11 +6668,11 @@ int sched_create_sysfs_power_savings_ent
+ return err;
+ }
+ #endif
+
+ /*
+- * Force a reinitialization of the sched domains hierarchy. The domains
++ * Force a reinitialization of the sched domains hierarchy. The domains
+ * and groups cannot be updated in place without racing with the balancing
+ * code, so we temporarily attach all running cpus to the NULL domain
+ * which will prevent rebalancing while the sched domains are recalculated.
+ */
+ static int update_sched_domains(struct notifier_block *nfb,
+@@ -6485,12 +6719,10 @@ void __init sched_init_smp(void)
+ cpu_set(smp_processor_id(), non_isolated_cpus);
+ mutex_unlock(&sched_hotcpu_mutex);
+ /* XXX: Theoretical race here - CPU may be hotplugged now */
+ hotcpu_notifier(update_sched_domains, 0);
+
+- init_sched_domain_sysctl();
+-
+ /* Move init over to a non-isolated CPU */
+ if (set_cpus_allowed(current, non_isolated_cpus) < 0)
+ BUG();
+ sched_init_granularity();
+ }
+@@ -6501,40 +6733,29 @@ void __init sched_init_smp(void)
+ }
+ #endif /* CONFIG_SMP */
+
+ int in_sched_functions(unsigned long addr)
+ {
+- /* Linker adds these: start and end of __sched functions */
+- extern char __sched_text_start[], __sched_text_end[];
+-
+ return in_lock_functions(addr) ||
+ (addr >= (unsigned long)__sched_text_start
+ && addr < (unsigned long)__sched_text_end);
+ }
+
+-static inline void init_cfs_rq(struct cfs_rq *cfs_rq, struct rq *rq)
++static void init_cfs_rq(struct cfs_rq *cfs_rq, struct rq *rq)
+ {
+ cfs_rq->tasks_timeline = RB_ROOT;
+- cfs_rq->fair_clock = 1;
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ cfs_rq->rq = rq;
+ #endif
++ cfs_rq->min_vruntime = (u64)(-(1LL << 20));
+ }
+
+ void __init sched_init(void)
+ {
+- u64 now = sched_clock();
+ int highest_cpu = 0;
+ int i, j;
+
+- /*
+- * Link up the scheduling class hierarchy:
+- */
+- rt_sched_class.next = &fair_sched_class;
+- fair_sched_class.next = &idle_sched_class;
+- idle_sched_class.next = NULL;
+-
+ for_each_possible_cpu(i) {
+ struct rt_prio_array *array;
+ struct rq *rq;
+
+ rq = cpu_rq(i);
+@@ -6543,14 +6764,32 @@ void __init sched_init(void)
+ rq->nr_running = 0;
+ rq->clock = 1;
+ init_cfs_rq(&rq->cfs, rq);
+ #ifdef CONFIG_FAIR_GROUP_SCHED
+ INIT_LIST_HEAD(&rq->leaf_cfs_rq_list);
+- list_add(&rq->cfs.leaf_cfs_rq_list, &rq->leaf_cfs_rq_list);
++ {
++ struct cfs_rq *cfs_rq = &per_cpu(init_cfs_rq, i);
++ struct sched_entity *se =
++ &per_cpu(init_sched_entity, i);
++
++ init_cfs_rq_p[i] = cfs_rq;
++ init_cfs_rq(cfs_rq, rq);
++ cfs_rq->tg = &init_task_group;
++ list_add(&cfs_rq->leaf_cfs_rq_list,
++ &rq->leaf_cfs_rq_list);
++
++ init_sched_entity_p[i] = se;
++ se->cfs_rq = &rq->cfs;
++ se->my_q = cfs_rq;
++ se->load.weight = init_task_group_load;
++ se->load.inv_weight =
++ div64_64(1ULL<<32, init_task_group_load);
++ se->parent = NULL;
++ }
++ init_task_group.shares = init_task_group_load;
++ spin_lock_init(&init_task_group.lock);
+ #endif
+- rq->ls.load_update_last = now;
+- rq->ls.load_update_start = now;
+
+ for (j = 0; j < CPU_LOAD_IDX_MAX; j++)
+ rq->cpu_load[j] = 0;
+ #ifdef CONFIG_SMP
+ rq->sd = NULL;
+@@ -6631,30 +6870,44 @@ void __might_sleep(char *file, int line)
+ }
+ EXPORT_SYMBOL(__might_sleep);
+ #endif
+
+ #ifdef CONFIG_MAGIC_SYSRQ
++static void normalize_task(struct rq *rq, struct task_struct *p)
++{
++ int on_rq;
++ update_rq_clock(rq);
++ on_rq = p->se.on_rq;
++ if (on_rq)
++ deactivate_task(rq, p, 0);
++ __setscheduler(rq, p, SCHED_NORMAL, 0);
++ if (on_rq) {
++ activate_task(rq, p, 0);
++ resched_task(rq->curr);
++ }
++}
++
+ void normalize_rt_tasks(void)
+ {
+ struct task_struct *g, *p;
+ unsigned long flags;
+ struct rq *rq;
+- int on_rq;
+
+ read_lock_irq(&tasklist_lock);
+ do_each_thread(g, p) {
+- p->se.fair_key = 0;
+- p->se.wait_runtime = 0;
++ /*
++ * Only normalize user tasks:
++ */
++ if (!p->mm)
++ continue;
++
+ p->se.exec_start = 0;
+- p->se.wait_start_fair = 0;
+- p->se.sleep_start_fair = 0;
+ #ifdef CONFIG_SCHEDSTATS
+ p->se.wait_start = 0;
+ p->se.sleep_start = 0;
+ p->se.block_start = 0;
+ #endif
+- task_rq(p)->cfs.fair_clock = 0;
+ task_rq(p)->clock = 0;
+
+ if (!rt_task(p)) {
+ /*
+ * Renice negative nice level userspace
+@@ -6665,30 +6918,13 @@ void normalize_rt_tasks(void)
+ continue;
+ }
+
+ spin_lock_irqsave(&p->pi_lock, flags);
+ rq = __task_rq_lock(p);
+-#ifdef CONFIG_SMP
+- /*
+- * Do not touch the migration thread:
+- */
+- if (p == rq->migration_thread)
+- goto out_unlock;
+-#endif
+
+- update_rq_clock(rq);
+- on_rq = p->se.on_rq;
+- if (on_rq)
+- deactivate_task(rq, p, 0);
+- __setscheduler(rq, p, SCHED_NORMAL, 0);
+- if (on_rq) {
+- activate_task(rq, p, 0);
+- resched_task(rq->curr);
+- }
+-#ifdef CONFIG_SMP
+- out_unlock:
+-#endif
++ normalize_task(rq, p);
++
+ __task_rq_unlock(rq);
+ spin_unlock_irqrestore(&p->pi_lock, flags);
+ } while_each_thread(g, p);
+
+ read_unlock_irq(&tasklist_lock);
+@@ -6722,12 +6958,12 @@ struct task_struct *curr_task(int cpu)
+ * set_curr_task - set the current task for a given cpu.
+ * @cpu: the processor in question.
+ * @p: the task pointer to set.
+ *
+ * Description: This function must only be used when non-maskable interrupts
+- * are serviced on a separate stack. It allows the architecture to switch the
+- * notion of the current task on a cpu in a non-blocking manner. This function
++ * are serviced on a separate stack. It allows the architecture to switch the
++ * notion of the current task on a cpu in a non-blocking manner. This function
+ * must be called with all CPU's synchronized, and interrupts disabled, the
+ * and caller must save the original value of the current task (see
+ * curr_task() above) and restore that value before reenabling interrupts and
+ * re-starting the system.
+ *
+@@ -6737,5 +6973,427 @@ void set_curr_task(int cpu, struct task_
+ {
+ cpu_curr(cpu) = p;
+ }
+
+ #endif
++
++#ifdef CONFIG_FAIR_GROUP_SCHED
++
++/* allocate runqueue etc for a new task group */
++struct task_group *sched_create_group(void)
++{
++ struct task_group *tg;
++ struct cfs_rq *cfs_rq;
++ struct sched_entity *se;
++ struct rq *rq;
++ int i;
++
++ tg = kzalloc(sizeof(*tg), GFP_KERNEL);
++ if (!tg)
++ return ERR_PTR(-ENOMEM);
++
++ tg->cfs_rq = kzalloc(sizeof(cfs_rq) * NR_CPUS, GFP_KERNEL);
++ if (!tg->cfs_rq)
++ goto err;
++ tg->se = kzalloc(sizeof(se) * NR_CPUS, GFP_KERNEL);
++ if (!tg->se)
++ goto err;
++
++ for_each_possible_cpu(i) {
++ rq = cpu_rq(i);
++
++ cfs_rq = kmalloc_node(sizeof(struct cfs_rq), GFP_KERNEL,
++ cpu_to_node(i));
++ if (!cfs_rq)
++ goto err;
++
++ se = kmalloc_node(sizeof(struct sched_entity), GFP_KERNEL,
++ cpu_to_node(i));
++ if (!se)
++ goto err;
++
++ memset(cfs_rq, 0, sizeof(struct cfs_rq));
++ memset(se, 0, sizeof(struct sched_entity));
++
++ tg->cfs_rq[i] = cfs_rq;
++ init_cfs_rq(cfs_rq, rq);
++ cfs_rq->tg = tg;
++
++ tg->se[i] = se;
++ se->cfs_rq = &rq->cfs;
++ se->my_q = cfs_rq;
++ se->load.weight = NICE_0_LOAD;
++ se->load.inv_weight = div64_64(1ULL<<32, NICE_0_LOAD);
++ se->parent = NULL;
++ }
++
++ for_each_possible_cpu(i) {
++ rq = cpu_rq(i);
++ cfs_rq = tg->cfs_rq[i];
++ list_add_rcu(&cfs_rq->leaf_cfs_rq_list, &rq->leaf_cfs_rq_list);
++ }
++
++ tg->shares = NICE_0_LOAD;
++ spin_lock_init(&tg->lock);
++
++ return tg;
++
++err:
++ for_each_possible_cpu(i) {
++ if (tg->cfs_rq)
++ kfree(tg->cfs_rq[i]);
++ if (tg->se)
++ kfree(tg->se[i]);
++ }
++ kfree(tg->cfs_rq);
++ kfree(tg->se);
++ kfree(tg);
++
++ return ERR_PTR(-ENOMEM);
++}
++
++/* rcu callback to free various structures associated with a task group */
++static void free_sched_group(struct rcu_head *rhp)
++{
++ struct task_group *tg = container_of(rhp, struct task_group, rcu);
++ struct cfs_rq *cfs_rq;
++ struct sched_entity *se;
++ int i;
++
++ /* now it should be safe to free those cfs_rqs */
++ for_each_possible_cpu(i) {
++ cfs_rq = tg->cfs_rq[i];
++ kfree(cfs_rq);
++
++ se = tg->se[i];
++ kfree(se);
++ }
++
++ kfree(tg->cfs_rq);
++ kfree(tg->se);
++ kfree(tg);
++}
++
++/* Destroy runqueue etc associated with a task group */
++void sched_destroy_group(struct task_group *tg)
++{
++ struct cfs_rq *cfs_rq = NULL;
++ int i;
++
++ for_each_possible_cpu(i) {
++ cfs_rq = tg->cfs_rq[i];
++ list_del_rcu(&cfs_rq->leaf_cfs_rq_list);
++ }
++
++ BUG_ON(!cfs_rq);
++
++ /* wait for possible concurrent references to cfs_rqs complete */
++ call_rcu(&tg->rcu, free_sched_group);
++}
++
++/* change task's runqueue when it moves between groups.
++ * The caller of this function should have put the task in its new group
++ * by now. This function just updates tsk->se.cfs_rq and tsk->se.parent to
++ * reflect its new group.
++ */
++void sched_move_task(struct task_struct *tsk)
++{
++ int on_rq, running;
++ unsigned long flags;
++ struct rq *rq;
++
++ rq = task_rq_lock(tsk, &flags);
++
++ if (tsk->sched_class != &fair_sched_class) {
++ set_task_cfs_rq(tsk, task_cpu(tsk));
++ goto done;
++ }
++
++ update_rq_clock(rq);
++
++ running = task_current(rq, tsk);
++ on_rq = tsk->se.on_rq;
++
++ if (on_rq) {
++ dequeue_task(rq, tsk, 0);
++ if (unlikely(running))
++ tsk->sched_class->put_prev_task(rq, tsk);
++ }
++
++ set_task_cfs_rq(tsk, task_cpu(tsk));
++
++ if (on_rq) {
++ if (unlikely(running))
++ tsk->sched_class->set_curr_task(rq);
++ enqueue_task(rq, tsk, 0);
++ }
++
++done:
++ task_rq_unlock(rq, &flags);
++}
++
++static void set_se_shares(struct sched_entity *se, unsigned long shares)
++{
++ struct cfs_rq *cfs_rq = se->cfs_rq;
++ struct rq *rq = cfs_rq->rq;
++ int on_rq;
++
++ spin_lock_irq(&rq->lock);
++
++ on_rq = se->on_rq;
++ if (on_rq)
++ dequeue_entity(cfs_rq, se, 0);
++
++ se->load.weight = shares;
++ se->load.inv_weight = div64_64((1ULL<<32), shares);
++
++ if (on_rq)
++ enqueue_entity(cfs_rq, se, 0);
++
++ spin_unlock_irq(&rq->lock);
++}
++
++int sched_group_set_shares(struct task_group *tg, unsigned long shares)
++{
++ int i;
++
++ spin_lock(&tg->lock);
++ if (tg->shares == shares)
++ goto done;
++
++ tg->shares = shares;
++ for_each_possible_cpu(i)
++ set_se_shares(tg->se[i], shares);
++
++done:
++ spin_unlock(&tg->lock);
++ return 0;
++}
++
++unsigned long sched_group_shares(struct task_group *tg)
++{
++ return tg->shares;
++}
++
++#endif /* CONFIG_FAIR_GROUP_SCHED */
++
++#ifdef CONFIG_FAIR_CGROUP_SCHED
++
++/* return corresponding task_group object of a cgroup */
++static inline struct task_group *cgroup_tg(struct cgroup *cgrp)
++{
++ return container_of(cgroup_subsys_state(cgrp, cpu_cgroup_subsys_id),
++ struct task_group, css);
++}
++
++static struct cgroup_subsys_state *
++cpu_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cgrp)
++{
++ struct task_group *tg;
++
++ if (!cgrp->parent) {
++ /* This is early initialization for the top cgroup */
++ init_task_group.css.cgroup = cgrp;
++ return &init_task_group.css;
++ }
++
++ /* we support only 1-level deep hierarchical scheduler atm */
++ if (cgrp->parent->parent)
++ return ERR_PTR(-EINVAL);
++
++ tg = sched_create_group();
++ if (IS_ERR(tg))
++ return ERR_PTR(-ENOMEM);
++
++ /* Bind the cgroup to task_group object we just created */
++ tg->css.cgroup = cgrp;
++
++ return &tg->css;
++}
++
++static void
++cpu_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
++{
++ struct task_group *tg = cgroup_tg(cgrp);
++
++ sched_destroy_group(tg);
++}
++
++static int
++cpu_cgroup_can_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
++ struct task_struct *tsk)
++{
++ /* We don't support RT-tasks being in separate groups */
++ if (tsk->sched_class != &fair_sched_class)
++ return -EINVAL;
++
++ return 0;
++}
++
++static void
++cpu_cgroup_attach(struct cgroup_subsys *ss, struct cgroup *cgrp,
++ struct cgroup *old_cont, struct task_struct *tsk)
++{
++ sched_move_task(tsk);
++}
++
++static int cpu_shares_write_uint(struct cgroup *cgrp, struct cftype *cftype,
++ u64 shareval)
++{
++ return sched_group_set_shares(cgroup_tg(cgrp), shareval);
++}
++
++static u64 cpu_shares_read_uint(struct cgroup *cgrp, struct cftype *cft)
++{
++ struct task_group *tg = cgroup_tg(cgrp);
++
++ return (u64) tg->shares;
++}
++
++static struct cftype cpu_files[] = {
++ {
++ .name = "shares",
++ .read_uint = cpu_shares_read_uint,
++ .write_uint = cpu_shares_write_uint,
++ },
++};
++
++static int cpu_cgroup_populate(struct cgroup_subsys *ss, struct cgroup *cont)
++{
++ return cgroup_add_files(cont, ss, cpu_files, ARRAY_SIZE(cpu_files));
++}
++
++struct cgroup_subsys cpu_cgroup_subsys = {
++ .name = "cpu",
++ .create = cpu_cgroup_create,
++ .destroy = cpu_cgroup_destroy,
++ .can_attach = cpu_cgroup_can_attach,
++ .attach = cpu_cgroup_attach,
++ .populate = cpu_cgroup_populate,
++ .subsys_id = cpu_cgroup_subsys_id,
++ .early_init = 1,
++};
++
++#endif /* CONFIG_FAIR_CGROUP_SCHED */
++
++#ifdef CONFIG_CGROUP_CPUACCT
++
++/*
++ * CPU accounting code for task groups.
++ *
++ * Based on the work by Paul Menage (menage@google.com) and Balbir Singh
++ * (balbir@in.ibm.com).
++ */
++
++/* track cpu usage of a group of tasks */
++struct cpuacct {
++ struct cgroup_subsys_state css;
++ /* cpuusage holds pointer to a u64-type object on every cpu */
++ u64 *cpuusage;
++};
++
++struct cgroup_subsys cpuacct_subsys;
++
++/* return cpu accounting group corresponding to this container */
++static inline struct cpuacct *cgroup_ca(struct cgroup *cont)
++{
++ return container_of(cgroup_subsys_state(cont, cpuacct_subsys_id),
++ struct cpuacct, css);
++}
++
++/* return cpu accounting group to which this task belongs */
++static inline struct cpuacct *task_ca(struct task_struct *tsk)
++{
++ return container_of(task_subsys_state(tsk, cpuacct_subsys_id),
++ struct cpuacct, css);
++}
++
++/* create a new cpu accounting group */
++static struct cgroup_subsys_state *cpuacct_create(
++ struct cgroup_subsys *ss, struct cgroup *cont)
++{
++ struct cpuacct *ca = kzalloc(sizeof(*ca), GFP_KERNEL);
++
++ if (!ca)
++ return ERR_PTR(-ENOMEM);
++
++ ca->cpuusage = alloc_percpu(u64);
++ if (!ca->cpuusage) {
++ kfree(ca);
++ return ERR_PTR(-ENOMEM);
++ }
++
++ return &ca->css;
++}
++
++/* destroy an existing cpu accounting group */
++static void
++cpuacct_destroy(struct cgroup_subsys *ss, struct cgroup *cont)
++{
++ struct cpuacct *ca = cgroup_ca(cont);
++
++ free_percpu(ca->cpuusage);
++ kfree(ca);
++}
++
++/* return total cpu usage (in nanoseconds) of a group */
++static u64 cpuusage_read(struct cgroup *cont, struct cftype *cft)
++{
++ struct cpuacct *ca = cgroup_ca(cont);
++ u64 totalcpuusage = 0;
++ int i;
++
++ for_each_possible_cpu(i) {
++ u64 *cpuusage = percpu_ptr(ca->cpuusage, i);
++
++ /*
++ * Take rq->lock to make 64-bit addition safe on 32-bit
++ * platforms.
++ */
++ spin_lock_irq(&cpu_rq(i)->lock);
++ totalcpuusage += *cpuusage;
++ spin_unlock_irq(&cpu_rq(i)->lock);
++ }
++
++ return totalcpuusage;
++}
++
++static struct cftype files[] = {
++ {
++ .name = "usage",
++ .read_uint = cpuusage_read,
++ },
++};
++
++static int cpuacct_populate(struct cgroup_subsys *ss, struct cgroup *cont)
++{
++ return cgroup_add_files(cont, ss, files, ARRAY_SIZE(files));
++}
++
++/*
++ * charge this task's execution time to its accounting group.
++ *
++ * called with rq->lock held.
++ */
++static void cpuacct_charge(struct task_struct *tsk, u64 cputime)
++{
++ struct cpuacct *ca;
++
++ if (!cpuacct_subsys.active)
++ return;
++
++ ca = task_ca(tsk);
++ if (ca) {
++ u64 *cpuusage = percpu_ptr(ca->cpuusage, task_cpu(tsk));
++
++ *cpuusage += cputime;
++ }
++}
++
++struct cgroup_subsys cpuacct_subsys = {
++ .name = "cpuacct",
++ .create = cpuacct_create,
++ .destroy = cpuacct_destroy,
++ .populate = cpuacct_populate,
++ .subsys_id = cpuacct_subsys_id,
++};
++#endif /* CONFIG_CGROUP_CPUACCT */
+--- linux-2.6.23.orig/kernel/sched_debug.c
++++ linux-2.6.23/kernel/sched_debug.c
+@@ -26,104 +26,125 @@
+ seq_printf(m, x); \
+ else \
+ printk(x); \
+ } while (0)
+
++/*
++ * Ease the printing of nsec fields:
++ */
++static long long nsec_high(long long nsec)
++{
++ if (nsec < 0) {
++ nsec = -nsec;
++ do_div(nsec, 1000000);
++ return -nsec;
++ }
++ do_div(nsec, 1000000);
++
++ return nsec;
++}
++
++static unsigned long nsec_low(long long nsec)
++{
++ if (nsec < 0)
++ nsec = -nsec;
++
++ return do_div(nsec, 1000000);
++}
++
++#define SPLIT_NS(x) nsec_high(x), nsec_low(x)
++
+ static void
+ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
+ {
+ if (rq->curr == p)
+ SEQ_printf(m, "R");
+ else
+ SEQ_printf(m, " ");
+
+- SEQ_printf(m, "%15s %5d %15Ld %13Ld %13Ld %9Ld %5d ",
++ SEQ_printf(m, "%15s %5d %9Ld.%06ld %9Ld %5d ",
+ p->comm, p->pid,
+- (long long)p->se.fair_key,
+- (long long)(p->se.fair_key - rq->cfs.fair_clock),
+- (long long)p->se.wait_runtime,
++ SPLIT_NS(p->se.vruntime),
+ (long long)(p->nvcsw + p->nivcsw),
+ p->prio);
+ #ifdef CONFIG_SCHEDSTATS
+- SEQ_printf(m, "%15Ld %15Ld %15Ld %15Ld %15Ld\n",
+- (long long)p->se.sum_exec_runtime,
+- (long long)p->se.sum_wait_runtime,
+- (long long)p->se.sum_sleep_runtime,
+- (long long)p->se.wait_runtime_overruns,
+- (long long)p->se.wait_runtime_underruns);
++ SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld\n",
++ SPLIT_NS(p->se.vruntime),
++ SPLIT_NS(p->se.sum_exec_runtime),
++ SPLIT_NS(p->se.sum_sleep_runtime));
+ #else
+- SEQ_printf(m, "%15Ld %15Ld %15Ld %15Ld %15Ld\n",
+- 0LL, 0LL, 0LL, 0LL, 0LL);
++ SEQ_printf(m, "%15Ld %15Ld %15Ld.%06ld %15Ld.%06ld %15Ld.%06ld\n",
++ 0LL, 0LL, 0LL, 0L, 0LL, 0L, 0LL, 0L);
+ #endif
+ }
+
+ static void print_rq(struct seq_file *m, struct rq *rq, int rq_cpu)
+ {
+ struct task_struct *g, *p;
++ unsigned long flags;
+
+ SEQ_printf(m,
+ "\nrunnable tasks:\n"
+- " task PID tree-key delta waiting"
+- " switches prio"
+- " sum-exec sum-wait sum-sleep"
+- " wait-overrun wait-underrun\n"
+- "------------------------------------------------------------------"
+- "----------------"
+- "------------------------------------------------"
+- "--------------------------------\n");
++ " task PID tree-key switches prio"
++ " exec-runtime sum-exec sum-sleep\n"
++ "------------------------------------------------------"
++ "----------------------------------------------------\n");
+
+- read_lock_irq(&tasklist_lock);
++ read_lock_irqsave(&tasklist_lock, flags);
+
+ do_each_thread(g, p) {
+ if (!p->se.on_rq || task_cpu(p) != rq_cpu)
+ continue;
+
+ print_task(m, rq, p);
+ } while_each_thread(g, p);
+
+- read_unlock_irq(&tasklist_lock);
++ read_unlock_irqrestore(&tasklist_lock, flags);
+ }
+
+-static void
+-print_cfs_rq_runtime_sum(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
++void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
+ {
+- s64 wait_runtime_rq_sum = 0;
+- struct task_struct *p;
+- struct rb_node *curr;
+- unsigned long flags;
++ s64 MIN_vruntime = -1, min_vruntime, max_vruntime = -1,
++ spread, rq0_min_vruntime, spread0;
+ struct rq *rq = &per_cpu(runqueues, cpu);
++ struct sched_entity *last;
++ unsigned long flags;
+
+- spin_lock_irqsave(&rq->lock, flags);
+- curr = first_fair(cfs_rq);
+- while (curr) {
+- p = rb_entry(curr, struct task_struct, se.run_node);
+- wait_runtime_rq_sum += p->se.wait_runtime;
+-
+- curr = rb_next(curr);
+- }
+- spin_unlock_irqrestore(&rq->lock, flags);
+-
+- SEQ_printf(m, " .%-30s: %Ld\n", "wait_runtime_rq_sum",
+- (long long)wait_runtime_rq_sum);
+-}
+-
+-void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
+-{
+ SEQ_printf(m, "\ncfs_rq\n");
+
+-#define P(x) \
+- SEQ_printf(m, " .%-30s: %Ld\n", #x, (long long)(cfs_rq->x))
+-
+- P(fair_clock);
+- P(exec_clock);
+- P(wait_runtime);
+- P(wait_runtime_overruns);
+- P(wait_runtime_underruns);
+- P(sleeper_bonus);
+-#undef P
++ SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "exec_clock",
++ SPLIT_NS(cfs_rq->exec_clock));
+
+- print_cfs_rq_runtime_sum(m, cpu, cfs_rq);
++ spin_lock_irqsave(&rq->lock, flags);
++ if (cfs_rq->rb_leftmost)
++ MIN_vruntime = (__pick_next_entity(cfs_rq))->vruntime;
++ last = __pick_last_entity(cfs_rq);
++ if (last)
++ max_vruntime = last->vruntime;
++ min_vruntime = rq->cfs.min_vruntime;
++ rq0_min_vruntime = per_cpu(runqueues, 0).cfs.min_vruntime;
++ spin_unlock_irqrestore(&rq->lock, flags);
++ SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "MIN_vruntime",
++ SPLIT_NS(MIN_vruntime));
++ SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "min_vruntime",
++ SPLIT_NS(min_vruntime));
++ SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "max_vruntime",
++ SPLIT_NS(max_vruntime));
++ spread = max_vruntime - MIN_vruntime;
++ SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "spread",
++ SPLIT_NS(spread));
++ spread0 = min_vruntime - rq0_min_vruntime;
++ SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "spread0",
++ SPLIT_NS(spread0));
++ SEQ_printf(m, " .%-30s: %ld\n", "nr_running", cfs_rq->nr_running);
++ SEQ_printf(m, " .%-30s: %ld\n", "load", cfs_rq->load.weight);
++#ifdef CONFIG_SCHEDSTATS
++ SEQ_printf(m, " .%-30s: %d\n", "bkl_count",
++ rq->bkl_count);
++#endif
++ SEQ_printf(m, " .%-30s: %ld\n", "nr_spread_over",
++ cfs_rq->nr_spread_over);
+ }
+
+ static void print_cpu(struct seq_file *m, int cpu)
+ {
+ struct rq *rq = &per_cpu(runqueues, cpu);
+@@ -139,35 +160,36 @@ static void print_cpu(struct seq_file *m
+ SEQ_printf(m, "\ncpu#%d\n", cpu);
+ #endif
+
+ #define P(x) \
+ SEQ_printf(m, " .%-30s: %Ld\n", #x, (long long)(rq->x))
++#define PN(x) \
++ SEQ_printf(m, " .%-30s: %Ld.%06ld\n", #x, SPLIT_NS(rq->x))
+
+ P(nr_running);
+ SEQ_printf(m, " .%-30s: %lu\n", "load",
+- rq->ls.load.weight);
+- P(ls.delta_fair);
+- P(ls.delta_exec);
++ rq->load.weight);
+ P(nr_switches);
+ P(nr_load_updates);
+ P(nr_uninterruptible);
+ SEQ_printf(m, " .%-30s: %lu\n", "jiffies", jiffies);
+- P(next_balance);
++ PN(next_balance);
+ P(curr->pid);
+- P(clock);
+- P(idle_clock);
+- P(prev_clock_raw);
++ PN(clock);
++ PN(idle_clock);
++ PN(prev_clock_raw);
+ P(clock_warps);
+ P(clock_overflows);
+ P(clock_deep_idle_events);
+- P(clock_max_delta);
++ PN(clock_max_delta);
+ P(cpu_load[0]);
+ P(cpu_load[1]);
+ P(cpu_load[2]);
+ P(cpu_load[3]);
+ P(cpu_load[4]);
+ #undef P
++#undef PN
+
+ print_cfs_stats(m, cpu);
+
+ print_rq(m, rq, cpu);
+ }
+@@ -175,16 +197,29 @@ static void print_cpu(struct seq_file *m
+ static int sched_debug_show(struct seq_file *m, void *v)
+ {
+ u64 now = ktime_to_ns(ktime_get());
+ int cpu;
+
+- SEQ_printf(m, "Sched Debug Version: v0.05-v20, %s %.*s\n",
++ SEQ_printf(m, "Sched Debug Version: v0.07, %s %.*s\n",
+ init_utsname()->release,
+ (int)strcspn(init_utsname()->version, " "),
+ init_utsname()->version);
+
+- SEQ_printf(m, "now at %Lu nsecs\n", (unsigned long long)now);
++ SEQ_printf(m, "now at %Lu.%06ld msecs\n", SPLIT_NS(now));
++
++#define P(x) \
++ SEQ_printf(m, " .%-40s: %Ld\n", #x, (long long)(x))
++#define PN(x) \
++ SEQ_printf(m, " .%-40s: %Ld.%06ld\n", #x, SPLIT_NS(x))
++ PN(sysctl_sched_latency);
++ PN(sysctl_sched_min_granularity);
++ PN(sysctl_sched_wakeup_granularity);
++ PN(sysctl_sched_batch_wakeup_granularity);
++ PN(sysctl_sched_child_runs_first);
++ P(sysctl_sched_features);
++#undef PN
++#undef P
+
+ for_each_online_cpu(cpu)
+ print_cpu(m, cpu);
+
+ SEQ_printf(m, "\n");
+@@ -200,11 +235,11 @@ static void sysrq_sched_debug_show(void)
+ static int sched_debug_open(struct inode *inode, struct file *filp)
+ {
+ return single_open(filp, sched_debug_show, NULL);
+ }
+
+-static struct file_operations sched_debug_fops = {
++static const struct file_operations sched_debug_fops = {
+ .open = sched_debug_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+ };
+@@ -224,10 +259,11 @@ static int __init init_sched_debug_procf
+
+ __initcall(init_sched_debug_procfs);
+
+ void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
+ {
++ unsigned long nr_switches;
+ unsigned long flags;
+ int num_threads = 1;
+
+ rcu_read_lock();
+ if (lock_task_sighand(p, &flags)) {
+@@ -235,53 +271,126 @@ void proc_sched_show_task(struct task_st
+ unlock_task_sighand(p, &flags);
+ }
+ rcu_read_unlock();
+
+ SEQ_printf(m, "%s (%d, #threads: %d)\n", p->comm, p->pid, num_threads);
+- SEQ_printf(m, "----------------------------------------------\n");
++ SEQ_printf(m,
++ "---------------------------------------------------------\n");
++#define __P(F) \
++ SEQ_printf(m, "%-35s:%21Ld\n", #F, (long long)F)
+ #define P(F) \
+- SEQ_printf(m, "%-25s:%20Ld\n", #F, (long long)p->F)
++ SEQ_printf(m, "%-35s:%21Ld\n", #F, (long long)p->F)
++#define __PN(F) \
++ SEQ_printf(m, "%-35s:%14Ld.%06ld\n", #F, SPLIT_NS((long long)F))
++#define PN(F) \
++ SEQ_printf(m, "%-35s:%14Ld.%06ld\n", #F, SPLIT_NS((long long)p->F))
++
++ PN(se.exec_start);
++ PN(se.vruntime);
++ PN(se.sum_exec_runtime);
+
+- P(se.wait_runtime);
+- P(se.wait_start_fair);
+- P(se.exec_start);
+- P(se.sleep_start_fair);
+- P(se.sum_exec_runtime);
++ nr_switches = p->nvcsw + p->nivcsw;
+
+ #ifdef CONFIG_SCHEDSTATS
+- P(se.wait_start);
+- P(se.sleep_start);
+- P(se.block_start);
+- P(se.sleep_max);
+- P(se.block_max);
+- P(se.exec_max);
+- P(se.wait_max);
+- P(se.wait_runtime_overruns);
+- P(se.wait_runtime_underruns);
+- P(se.sum_wait_runtime);
++ PN(se.wait_start);
++ PN(se.sleep_start);
++ PN(se.block_start);
++ PN(se.sleep_max);
++ PN(se.block_max);
++ PN(se.exec_max);
++ PN(se.slice_max);
++ PN(se.wait_max);
++ P(sched_info.bkl_count);
++ P(se.nr_migrations);
++ P(se.nr_migrations_cold);
++ P(se.nr_failed_migrations_affine);
++ P(se.nr_failed_migrations_running);
++ P(se.nr_failed_migrations_hot);
++ P(se.nr_forced_migrations);
++ P(se.nr_forced2_migrations);
++ P(se.nr_wakeups);
++ P(se.nr_wakeups_sync);
++ P(se.nr_wakeups_migrate);
++ P(se.nr_wakeups_local);
++ P(se.nr_wakeups_remote);
++ P(se.nr_wakeups_affine);
++ P(se.nr_wakeups_affine_attempts);
++ P(se.nr_wakeups_passive);
++ P(se.nr_wakeups_idle);
++
++ {
++ u64 avg_atom, avg_per_cpu;
++
++ avg_atom = p->se.sum_exec_runtime;
++ if (nr_switches)
++ do_div(avg_atom, nr_switches);
++ else
++ avg_atom = -1LL;
++
++ avg_per_cpu = p->se.sum_exec_runtime;
++ if (p->se.nr_migrations) {
++ avg_per_cpu = div64_64(avg_per_cpu,
++ p->se.nr_migrations);
++ } else {
++ avg_per_cpu = -1LL;
++ }
++
++ __PN(avg_atom);
++ __PN(avg_per_cpu);
++ }
+ #endif
+- SEQ_printf(m, "%-25s:%20Ld\n",
+- "nr_switches", (long long)(p->nvcsw + p->nivcsw));
++ __P(nr_switches);
++ SEQ_printf(m, "%-35s:%21Ld\n",
++ "nr_voluntary_switches", (long long)p->nvcsw);
++ SEQ_printf(m, "%-35s:%21Ld\n",
++ "nr_involuntary_switches", (long long)p->nivcsw);
++
+ P(se.load.weight);
+ P(policy);
+ P(prio);
++#undef PN
++#undef __PN
+ #undef P
++#undef __P
+
+ {
+ u64 t0, t1;
+
+ t0 = sched_clock();
+ t1 = sched_clock();
+- SEQ_printf(m, "%-25s:%20Ld\n",
++ SEQ_printf(m, "%-35s:%21Ld\n",
+ "clock-delta", (long long)(t1-t0));
+ }
+ }
+
+ void proc_sched_set_task(struct task_struct *p)
+ {
+ #ifdef CONFIG_SCHEDSTATS
+- p->se.sleep_max = p->se.block_max = p->se.exec_max = p->se.wait_max = 0;
+- p->se.wait_runtime_overruns = p->se.wait_runtime_underruns = 0;
++ p->se.wait_max = 0;
++ p->se.sleep_max = 0;
++ p->se.sum_sleep_runtime = 0;
++ p->se.block_max = 0;
++ p->se.exec_max = 0;
++ p->se.slice_max = 0;
++ p->se.nr_migrations = 0;
++ p->se.nr_migrations_cold = 0;
++ p->se.nr_failed_migrations_affine = 0;
++ p->se.nr_failed_migrations_running = 0;
++ p->se.nr_failed_migrations_hot = 0;
++ p->se.nr_forced_migrations = 0;
++ p->se.nr_forced2_migrations = 0;
++ p->se.nr_wakeups = 0;
++ p->se.nr_wakeups_sync = 0;
++ p->se.nr_wakeups_migrate = 0;
++ p->se.nr_wakeups_local = 0;
++ p->se.nr_wakeups_remote = 0;
++ p->se.nr_wakeups_affine = 0;
++ p->se.nr_wakeups_affine_attempts = 0;
++ p->se.nr_wakeups_passive = 0;
++ p->se.nr_wakeups_idle = 0;
++ p->sched_info.bkl_count = 0;
+ #endif
+- p->se.sum_exec_runtime = 0;
+- p->se.prev_sum_exec_runtime = 0;
++ p->se.sum_exec_runtime = 0;
++ p->se.prev_sum_exec_runtime = 0;
++ p->nvcsw = 0;
++ p->nivcsw = 0;
+ }
+--- linux-2.6.23.orig/kernel/sched_fair.c
++++ linux-2.6.23/kernel/sched_fair.c
+@@ -20,29 +20,38 @@
+ * Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra <pzijlstr@redhat.com>
+ */
+
+ /*
+ * Targeted preemption latency for CPU-bound tasks:
+- * (default: 20ms, units: nanoseconds)
++ * (default: 20ms * (1 + ilog(ncpus)), units: nanoseconds)
+ *
+ * NOTE: this latency value is not the same as the concept of
+- * 'timeslice length' - timeslices in CFS are of variable length.
+- * (to see the precise effective timeslice length of your workload,
+- * run vmstat and monitor the context-switches field)
++ * 'timeslice length' - timeslices in CFS are of variable length
++ * and have no persistent notion like in traditional, time-slice
++ * based scheduling concepts.
+ *
+- * On SMP systems the value of this is multiplied by the log2 of the
+- * number of CPUs. (i.e. factor 2x on 2-way systems, 3x on 4-way
+- * systems, 4x on 8-way systems, 5x on 16-way systems, etc.)
+- * Targeted preemption latency for CPU-bound tasks:
++ * (to see the precise effective timeslice length of your workload,
++ * run vmstat and monitor the context-switches (cs) field)
+ */
+-unsigned int sysctl_sched_latency __read_mostly = 20000000ULL;
++unsigned int sysctl_sched_latency = 20000000ULL;
+
+ /*
+ * Minimal preemption granularity for CPU-bound tasks:
+- * (default: 2 msec, units: nanoseconds)
++ * (default: 4 msec * (1 + ilog(ncpus)), units: nanoseconds)
++ */
++unsigned int sysctl_sched_min_granularity = 4000000ULL;
++
++/*
++ * is kept at sysctl_sched_latency / sysctl_sched_min_granularity
+ */
+-unsigned int sysctl_sched_min_granularity __read_mostly = 2000000ULL;
++static unsigned int sched_nr_latency = 5;
++
++/*
++ * After fork, child runs first. (default) If set to 0 then
++ * parent will (try to) run first.
++ */
++const_debug unsigned int sysctl_sched_child_runs_first = 1;
+
+ /*
+ * sys_sched_yield() compat mode
+ *
+ * This option switches the agressive yield implementation of the
+@@ -50,56 +59,29 @@ unsigned int sysctl_sched_min_granularit
+ */
+ unsigned int __read_mostly sysctl_sched_compat_yield;
+
+ /*
+ * SCHED_BATCH wake-up granularity.
+- * (default: 25 msec, units: nanoseconds)
++ * (default: 10 msec * (1 + ilog(ncpus)), units: nanoseconds)
+ *
+ * This option delays the preemption effects of decoupled workloads
+ * and reduces their over-scheduling. Synchronous workloads will still
+ * have immediate wakeup/sleep latencies.
+ */
+-unsigned int sysctl_sched_batch_wakeup_granularity __read_mostly = 25000000UL;
++unsigned int sysctl_sched_batch_wakeup_granularity = 10000000UL;
+
+ /*
+ * SCHED_OTHER wake-up granularity.
+- * (default: 1 msec, units: nanoseconds)
++ * (default: 10 msec * (1 + ilog(ncpus)), units: nanoseconds)
+ *
+ * This option delays the preemption effects of decoupled workloads
+ * and reduces their over-scheduling. Synchronous workloads will still
+ * have immediate wakeup/sleep latencies.
+ */
+-unsigned int sysctl_sched_wakeup_granularity __read_mostly = 1000000UL;
+-
+-unsigned int sysctl_sched_stat_granularity __read_mostly;
+-
+-/*
+- * Initialized in sched_init_granularity() [to 5 times the base granularity]:
+- */
+-unsigned int sysctl_sched_runtime_limit __read_mostly;
+-
+-/*
+- * Debugging: various feature bits
+- */
+-enum {
+- SCHED_FEAT_FAIR_SLEEPERS = 1,
+- SCHED_FEAT_SLEEPER_AVG = 2,
+- SCHED_FEAT_SLEEPER_LOAD_AVG = 4,
+- SCHED_FEAT_PRECISE_CPU_LOAD = 8,
+- SCHED_FEAT_START_DEBIT = 16,
+- SCHED_FEAT_SKIP_INITIAL = 32,
+-};
+-
+-unsigned int sysctl_sched_features __read_mostly =
+- SCHED_FEAT_FAIR_SLEEPERS *1 |
+- SCHED_FEAT_SLEEPER_AVG *0 |
+- SCHED_FEAT_SLEEPER_LOAD_AVG *1 |
+- SCHED_FEAT_PRECISE_CPU_LOAD *0 |
+- SCHED_FEAT_START_DEBIT *1 |
+- SCHED_FEAT_SKIP_INITIAL *0;
++unsigned int sysctl_sched_wakeup_granularity = 10000000UL;
+
+-extern struct sched_class fair_sched_class;
++const_debug unsigned int sysctl_sched_migration_cost = 500000UL;
+
+ /**************************************************************
+ * CFS operations on generic schedulable entities:
+ */
+
+@@ -109,47 +91,22 @@ extern struct sched_class fair_sched_cla
+ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
+ {
+ return cfs_rq->rq;
+ }
+
+-/* currently running entity (if any) on this cfs_rq */
+-static inline struct sched_entity *cfs_rq_curr(struct cfs_rq *cfs_rq)
+-{
+- return cfs_rq->curr;
+-}
+-
+ /* An entity is a task if it doesn't "own" a runqueue */
+ #define entity_is_task(se) (!se->my_q)
+
+-static inline void
+-set_cfs_rq_curr(struct cfs_rq *cfs_rq, struct sched_entity *se)
+-{
+- cfs_rq->curr = se;
+-}
+-
+ #else /* CONFIG_FAIR_GROUP_SCHED */
+
+ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
+ {
+ return container_of(cfs_rq, struct rq, cfs);
+ }
+
+-static inline struct sched_entity *cfs_rq_curr(struct cfs_rq *cfs_rq)
+-{
+- struct rq *rq = rq_of(cfs_rq);
+-
+- if (unlikely(rq->curr->sched_class != &fair_sched_class))
+- return NULL;
+-
+- return &rq->curr->se;
+-}
+-
+ #define entity_is_task(se) 1
+
+-static inline void
+-set_cfs_rq_curr(struct cfs_rq *cfs_rq, struct sched_entity *se) { }
+-
+ #endif /* CONFIG_FAIR_GROUP_SCHED */
+
+ static inline struct task_struct *task_of(struct sched_entity *se)
+ {
+ return container_of(se, struct task_struct, se);
+@@ -158,20 +115,42 @@ static inline struct task_struct *task_o
+
+ /**************************************************************
+ * Scheduling class tree data structure manipulation methods:
+ */
+
++static inline u64 max_vruntime(u64 min_vruntime, u64 vruntime)
++{
++ s64 delta = (s64)(vruntime - min_vruntime);
++ if (delta > 0)
++ min_vruntime = vruntime;
++
++ return min_vruntime;
++}
++
++static inline u64 min_vruntime(u64 min_vruntime, u64 vruntime)
++{
++ s64 delta = (s64)(vruntime - min_vruntime);
++ if (delta < 0)
++ min_vruntime = vruntime;
++
++ return min_vruntime;
++}
++
++static inline s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
++{
++ return se->vruntime - cfs_rq->min_vruntime;
++}
++
+ /*
+ * Enqueue an entity into the rb-tree:
+ */
+-static inline void
+-__enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
++static void __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+ struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
+ struct rb_node *parent = NULL;
+ struct sched_entity *entry;
+- s64 key = se->fair_key;
++ s64 key = entity_key(cfs_rq, se);
+ int leftmost = 1;
+
+ /*
+ * Find the right place in the rbtree:
+ */
+@@ -180,11 +159,11 @@ __enqueue_entity(struct cfs_rq *cfs_rq,
+ entry = rb_entry(parent, struct sched_entity, run_node);
+ /*
+ * We dont care about collisions. Nodes with
+ * the same key stay together.
+ */
+- if (key - entry->fair_key < 0) {
++ if (key < entity_key(cfs_rq, entry)) {
+ link = &parent->rb_left;
+ } else {
+ link = &parent->rb_right;
+ leftmost = 0;
+ }
+@@ -197,28 +176,18 @@ __enqueue_entity(struct cfs_rq *cfs_rq,
+ if (leftmost)
+ cfs_rq->rb_leftmost = &se->run_node;
+
+ rb_link_node(&se->run_node, parent, link);
+ rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
+- update_load_add(&cfs_rq->load, se->load.weight);
+- cfs_rq->nr_running++;
+- se->on_rq = 1;
+-
+- schedstat_add(cfs_rq, wait_runtime, se->wait_runtime);
+ }
+
+-static inline void
+-__dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
++static void __dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+ if (cfs_rq->rb_leftmost == &se->run_node)
+ cfs_rq->rb_leftmost = rb_next(&se->run_node);
+- rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
+- update_load_sub(&cfs_rq->load, se->load.weight);
+- cfs_rq->nr_running--;
+- se->on_rq = 0;
+
+- schedstat_add(cfs_rq, wait_runtime, -se->wait_runtime);
++ rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
+ }
+
+ static inline struct rb_node *first_fair(struct cfs_rq *cfs_rq)
+ {
+ return cfs_rq->rb_leftmost;
+@@ -227,308 +196,206 @@ static inline struct rb_node *first_fair
+ static struct sched_entity *__pick_next_entity(struct cfs_rq *cfs_rq)
+ {
+ return rb_entry(first_fair(cfs_rq), struct sched_entity, run_node);
+ }
+
++static inline struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq)
++{
++ struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
++ struct sched_entity *se = NULL;
++ struct rb_node *parent;
++
++ while (*link) {
++ parent = *link;
++ se = rb_entry(parent, struct sched_entity, run_node);
++ link = &parent->rb_right;
++ }
++
++ return se;
++}
++
+ /**************************************************************
+ * Scheduling class statistics methods:
+ */
+
++#ifdef CONFIG_SCHED_DEBUG
++int sched_nr_latency_handler(struct ctl_table *table, int write,
++ struct file *filp, void __user *buffer, size_t *lenp,
++ loff_t *ppos)
++{
++ int ret = proc_dointvec_minmax(table, write, filp, buffer, lenp, ppos);
++
++ if (ret || !write)
++ return ret;
++
++ sched_nr_latency = DIV_ROUND_UP(sysctl_sched_latency,
++ sysctl_sched_min_granularity);
++
++ return 0;
++}
++#endif
++
+ /*
+- * Calculate the preemption granularity needed to schedule every
+- * runnable task once per sysctl_sched_latency amount of time.
+- * (down to a sensible low limit on granularity)
+- *
+- * For example, if there are 2 tasks running and latency is 10 msecs,
+- * we switch tasks every 5 msecs. If we have 3 tasks running, we have
+- * to switch tasks every 3.33 msecs to get a 10 msecs observed latency
+- * for each task. We do finer and finer scheduling up to until we
+- * reach the minimum granularity value.
+- *
+- * To achieve this we use the following dynamic-granularity rule:
+- *
+- * gran = lat/nr - lat/nr/nr
++ * The idea is to set a period in which each task runs once.
+ *
+- * This comes out of the following equations:
++ * When there are too many tasks (sysctl_sched_nr_latency) we have to stretch
++ * this period because otherwise the slices get too small.
+ *
+- * kA1 + gran = kB1
+- * kB2 + gran = kA2
+- * kA2 = kA1
+- * kB2 = kB1 - d + d/nr
+- * lat = d * nr
+- *
+- * Where 'k' is key, 'A' is task A (waiting), 'B' is task B (running),
+- * '1' is start of time, '2' is end of time, 'd' is delay between
+- * 1 and 2 (during which task B was running), 'nr' is number of tasks
+- * running, 'lat' is the the period of each task. ('lat' is the
+- * sched_latency that we aim for.)
++ * p = (nr <= nl) ? l : l*nr/nl
+ */
+-static long
+-sched_granularity(struct cfs_rq *cfs_rq)
++static u64 __sched_period(unsigned long nr_running)
+ {
+- unsigned int gran = sysctl_sched_latency;
+- unsigned int nr = cfs_rq->nr_running;
++ u64 period = sysctl_sched_latency;
++ unsigned long nr_latency = sched_nr_latency;
+
+- if (nr > 1) {
+- gran = gran/nr - gran/nr/nr;
+- gran = max(gran, sysctl_sched_min_granularity);
++ if (unlikely(nr_running > nr_latency)) {
++ period *= nr_running;
++ do_div(period, nr_latency);
+ }
+
+- return gran;
++ return period;
+ }
+
+ /*
+- * We rescale the rescheduling granularity of tasks according to their
+- * nice level, but only linearly, not exponentially:
++ * We calculate the wall-time slice from the period by taking a part
++ * proportional to the weight.
++ *
++ * s = p*w/rw
+ */
+-static long
+-niced_granularity(struct sched_entity *curr, unsigned long granularity)
++static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- u64 tmp;
++ u64 slice = __sched_period(cfs_rq->nr_running);
+
+- if (likely(curr->load.weight == NICE_0_LOAD))
+- return granularity;
+- /*
+- * Positive nice levels get the same granularity as nice-0:
+- */
+- if (likely(curr->load.weight < NICE_0_LOAD)) {
+- tmp = curr->load.weight * (u64)granularity;
+- return (long) (tmp >> NICE_0_SHIFT);
+- }
+- /*
+- * Negative nice level tasks get linearly finer
+- * granularity:
+- */
+- tmp = curr->load.inv_weight * (u64)granularity;
++ slice *= se->load.weight;
++ do_div(slice, cfs_rq->load.weight);
+
+- /*
+- * It will always fit into 'long':
+- */
+- return (long) (tmp >> (WMULT_SHIFT-NICE_0_SHIFT));
++ return slice;
+ }
+
+-static inline void
+-limit_wait_runtime(struct cfs_rq *cfs_rq, struct sched_entity *se)
++/*
++ * We calculate the vruntime slice.
++ *
++ * vs = s/w = p/rw
++ */
++static u64 __sched_vslice(unsigned long rq_weight, unsigned long nr_running)
+ {
+- long limit = sysctl_sched_runtime_limit;
++ u64 vslice = __sched_period(nr_running);
+
+- /*
+- * Niced tasks have the same history dynamic range as
+- * non-niced tasks:
+- */
+- if (unlikely(se->wait_runtime > limit)) {
+- se->wait_runtime = limit;
+- schedstat_inc(se, wait_runtime_overruns);
+- schedstat_inc(cfs_rq, wait_runtime_overruns);
+- }
+- if (unlikely(se->wait_runtime < -limit)) {
+- se->wait_runtime = -limit;
+- schedstat_inc(se, wait_runtime_underruns);
+- schedstat_inc(cfs_rq, wait_runtime_underruns);
+- }
++ vslice *= NICE_0_LOAD;
++ do_div(vslice, rq_weight);
++
++ return vslice;
+ }
+
+-static inline void
+-__add_wait_runtime(struct cfs_rq *cfs_rq, struct sched_entity *se, long delta)
++static u64 sched_vslice(struct cfs_rq *cfs_rq)
+ {
+- se->wait_runtime += delta;
+- schedstat_add(se, sum_wait_runtime, delta);
+- limit_wait_runtime(cfs_rq, se);
++ return __sched_vslice(cfs_rq->load.weight, cfs_rq->nr_running);
+ }
+
+-static void
+-add_wait_runtime(struct cfs_rq *cfs_rq, struct sched_entity *se, long delta)
++static u64 sched_vslice_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- schedstat_add(cfs_rq, wait_runtime, -se->wait_runtime);
+- __add_wait_runtime(cfs_rq, se, delta);
+- schedstat_add(cfs_rq, wait_runtime, se->wait_runtime);
++ return __sched_vslice(cfs_rq->load.weight + se->load.weight,
++ cfs_rq->nr_running + 1);
+ }
+
+ /*
+ * Update the current task's runtime statistics. Skip current tasks that
+ * are not in our scheduling class.
+ */
+ static inline void
+-__update_curr(struct cfs_rq *cfs_rq, struct sched_entity *curr)
++__update_curr(struct cfs_rq *cfs_rq, struct sched_entity *curr,
++ unsigned long delta_exec)
+ {
+- unsigned long delta, delta_exec, delta_fair, delta_mine;
+- struct load_weight *lw = &cfs_rq->load;
+- unsigned long load = lw->weight;
++ unsigned long delta_exec_weighted;
++ u64 vruntime;
+
+- delta_exec = curr->delta_exec;
+ schedstat_set(curr->exec_max, max((u64)delta_exec, curr->exec_max));
+
+ curr->sum_exec_runtime += delta_exec;
+- cfs_rq->exec_clock += delta_exec;
+-
+- if (unlikely(!load))
+- return;
+-
+- delta_fair = calc_delta_fair(delta_exec, lw);
+- delta_mine = calc_delta_mine(delta_exec, curr->load.weight, lw);
+-
+- if (cfs_rq->sleeper_bonus > sysctl_sched_min_granularity) {
+- delta = min((u64)delta_mine, cfs_rq->sleeper_bonus);
+- delta = min(delta, (unsigned long)(
+- (long)sysctl_sched_runtime_limit - curr->wait_runtime));
+- cfs_rq->sleeper_bonus -= delta;
+- delta_mine -= delta;
++ schedstat_add(cfs_rq, exec_clock, delta_exec);
++ delta_exec_weighted = delta_exec;
++ if (unlikely(curr->load.weight != NICE_0_LOAD)) {
++ delta_exec_weighted = calc_delta_fair(delta_exec_weighted,
++ &curr->load);
+ }
++ curr->vruntime += delta_exec_weighted;
+
+- cfs_rq->fair_clock += delta_fair;
+ /*
+- * We executed delta_exec amount of time on the CPU,
+- * but we were only entitled to delta_mine amount of
+- * time during that period (if nr_running == 1 then
+- * the two values are equal)
+- * [Note: delta_mine - delta_exec is negative]:
++ * maintain cfs_rq->min_vruntime to be a monotonic increasing
++ * value tracking the leftmost vruntime in the tree.
+ */
+- add_wait_runtime(cfs_rq, curr, delta_mine - delta_exec);
++ if (first_fair(cfs_rq)) {
++ vruntime = min_vruntime(curr->vruntime,
++ __pick_next_entity(cfs_rq)->vruntime);
++ } else
++ vruntime = curr->vruntime;
++
++ cfs_rq->min_vruntime =
++ max_vruntime(cfs_rq->min_vruntime, vruntime);
+ }
+
+ static void update_curr(struct cfs_rq *cfs_rq)
+ {
+- struct sched_entity *curr = cfs_rq_curr(cfs_rq);
++ struct sched_entity *curr = cfs_rq->curr;
++ u64 now = rq_of(cfs_rq)->clock;
+ unsigned long delta_exec;
+
+ if (unlikely(!curr))
+ return;
+
+ /*
+ * Get the amount of time the current task was running
+ * since the last time we changed load (this cannot
+ * overflow on 32 bits):
+ */
+- delta_exec = (unsigned long)(rq_of(cfs_rq)->clock - curr->exec_start);
++ delta_exec = (unsigned long)(now - curr->exec_start);
++
++ __update_curr(cfs_rq, curr, delta_exec);
++ curr->exec_start = now;
+
+- curr->delta_exec += delta_exec;
++ if (entity_is_task(curr)) {
++ struct task_struct *curtask = task_of(curr);
+
+- if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) {
+- __update_curr(cfs_rq, curr);
+- curr->delta_exec = 0;
++ cpuacct_charge(curtask, delta_exec);
+ }
+- curr->exec_start = rq_of(cfs_rq)->clock;
+ }
+
+ static inline void
+ update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- se->wait_start_fair = cfs_rq->fair_clock;
+ schedstat_set(se->wait_start, rq_of(cfs_rq)->clock);
+ }
+
+ /*
+- * We calculate fair deltas here, so protect against the random effects
+- * of a multiplication overflow by capping it to the runtime limit:
+- */
+-#if BITS_PER_LONG == 32
+-static inline unsigned long
+-calc_weighted(unsigned long delta, unsigned long weight, int shift)
+-{
+- u64 tmp = (u64)delta * weight >> shift;
+-
+- if (unlikely(tmp > sysctl_sched_runtime_limit*2))
+- return sysctl_sched_runtime_limit*2;
+- return tmp;
+-}
+-#else
+-static inline unsigned long
+-calc_weighted(unsigned long delta, unsigned long weight, int shift)
+-{
+- return delta * weight >> shift;
+-}
+-#endif
+-
+-/*
+ * Task is being enqueued - update stats:
+ */
+ static void update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- s64 key;
+-
+ /*
+ * Are we enqueueing a waiting task? (for current tasks
+ * a dequeue/enqueue event is a NOP)
+ */
+- if (se != cfs_rq_curr(cfs_rq))
++ if (se != cfs_rq->curr)
+ update_stats_wait_start(cfs_rq, se);
+- /*
+- * Update the key:
+- */
+- key = cfs_rq->fair_clock;
+-
+- /*
+- * Optimize the common nice 0 case:
+- */
+- if (likely(se->load.weight == NICE_0_LOAD)) {
+- key -= se->wait_runtime;
+- } else {
+- u64 tmp;
+-
+- if (se->wait_runtime < 0) {
+- tmp = -se->wait_runtime;
+- key += (tmp * se->load.inv_weight) >>
+- (WMULT_SHIFT - NICE_0_SHIFT);
+- } else {
+- tmp = se->wait_runtime;
+- key -= (tmp * se->load.inv_weight) >>
+- (WMULT_SHIFT - NICE_0_SHIFT);
+- }
+- }
+-
+- se->fair_key = key;
+-}
+-
+-/*
+- * Note: must be called with a freshly updated rq->fair_clock.
+- */
+-static inline void
+-__update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
+-{
+- unsigned long delta_fair = se->delta_fair_run;
+-
+- schedstat_set(se->wait_max, max(se->wait_max,
+- rq_of(cfs_rq)->clock - se->wait_start));
+-
+- if (unlikely(se->load.weight != NICE_0_LOAD))
+- delta_fair = calc_weighted(delta_fair, se->load.weight,
+- NICE_0_SHIFT);
+-
+- add_wait_runtime(cfs_rq, se, delta_fair);
+ }
+
+ static void
+ update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- unsigned long delta_fair;
+-
+- if (unlikely(!se->wait_start_fair))
+- return;
+-
+- delta_fair = (unsigned long)min((u64)(2*sysctl_sched_runtime_limit),
+- (u64)(cfs_rq->fair_clock - se->wait_start_fair));
+-
+- se->delta_fair_run += delta_fair;
+- if (unlikely(abs(se->delta_fair_run) >=
+- sysctl_sched_stat_granularity)) {
+- __update_stats_wait_end(cfs_rq, se);
+- se->delta_fair_run = 0;
+- }
+-
+- se->wait_start_fair = 0;
++ schedstat_set(se->wait_max, max(se->wait_max,
++ rq_of(cfs_rq)->clock - se->wait_start));
+ schedstat_set(se->wait_start, 0);
+ }
+
+ static inline void
+ update_stats_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- update_curr(cfs_rq);
+ /*
+ * Mark the end of the wait period if dequeueing a
+ * waiting task:
+ */
+- if (se != cfs_rq_curr(cfs_rq))
++ if (se != cfs_rq->curr)
+ update_stats_wait_end(cfs_rq, se);
+ }
+
+ /*
+ * We are picking a new current task - update its stats:
+@@ -540,83 +407,32 @@ update_stats_curr_start(struct cfs_rq *c
+ * We are starting a new run period:
+ */
+ se->exec_start = rq_of(cfs_rq)->clock;
+ }
+
+-/*
+- * We are descheduling a task - update its stats:
+- */
+-static inline void
+-update_stats_curr_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
+-{
+- se->exec_start = 0;
+-}
+-
+ /**************************************************
+ * Scheduling class queueing methods:
+ */
+
+-static void __enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
++static void
++account_entity_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- unsigned long load = cfs_rq->load.weight, delta_fair;
+- long prev_runtime;
+-
+- /*
+- * Do not boost sleepers if there's too much bonus 'in flight'
+- * already:
+- */
+- if (unlikely(cfs_rq->sleeper_bonus > sysctl_sched_runtime_limit))
+- return;
+-
+- if (sysctl_sched_features & SCHED_FEAT_SLEEPER_LOAD_AVG)
+- load = rq_of(cfs_rq)->cpu_load[2];
+-
+- delta_fair = se->delta_fair_sleep;
+-
+- /*
+- * Fix up delta_fair with the effect of us running
+- * during the whole sleep period:
+- */
+- if (sysctl_sched_features & SCHED_FEAT_SLEEPER_AVG)
+- delta_fair = div64_likely32((u64)delta_fair * load,
+- load + se->load.weight);
+-
+- if (unlikely(se->load.weight != NICE_0_LOAD))
+- delta_fair = calc_weighted(delta_fair, se->load.weight,
+- NICE_0_SHIFT);
+-
+- prev_runtime = se->wait_runtime;
+- __add_wait_runtime(cfs_rq, se, delta_fair);
+- delta_fair = se->wait_runtime - prev_runtime;
++ update_load_add(&cfs_rq->load, se->load.weight);
++ cfs_rq->nr_running++;
++ se->on_rq = 1;
++}
+
+- /*
+- * Track the amount of bonus we've given to sleepers:
+- */
+- cfs_rq->sleeper_bonus += delta_fair;
++static void
++account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
++{
++ update_load_sub(&cfs_rq->load, se->load.weight);
++ cfs_rq->nr_running--;
++ se->on_rq = 0;
+ }
+
+ static void enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- struct task_struct *tsk = task_of(se);
+- unsigned long delta_fair;
+-
+- if ((entity_is_task(se) && tsk->policy == SCHED_BATCH) ||
+- !(sysctl_sched_features & SCHED_FEAT_FAIR_SLEEPERS))
+- return;
+-
+- delta_fair = (unsigned long)min((u64)(2*sysctl_sched_runtime_limit),
+- (u64)(cfs_rq->fair_clock - se->sleep_start_fair));
+-
+- se->delta_fair_sleep += delta_fair;
+- if (unlikely(abs(se->delta_fair_sleep) >=
+- sysctl_sched_stat_granularity)) {
+- __enqueue_sleeper(cfs_rq, se);
+- se->delta_fair_sleep = 0;
+- }
+-
+- se->sleep_start_fair = 0;
+-
+ #ifdef CONFIG_SCHEDSTATS
+ if (se->sleep_start) {
+ u64 delta = rq_of(cfs_rq)->clock - se->sleep_start;
+
+ if ((s64)delta < 0)
+@@ -644,38 +460,99 @@ static void enqueue_sleeper(struct cfs_r
+ * Blocking time is in units of nanosecs, so shift by 20 to
+ * get a milliseconds-range estimation of the amount of
+ * time that the task spent sleeping:
+ */
+ if (unlikely(prof_on == SLEEP_PROFILING)) {
++ struct task_struct *tsk = task_of(se);
++
+ profile_hits(SLEEP_PROFILING, (void *)get_wchan(tsk),
+ delta >> 20);
+ }
+ }
+ #endif
+ }
+
++static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
++{
++#ifdef CONFIG_SCHED_DEBUG
++ s64 d = se->vruntime - cfs_rq->min_vruntime;
++
++ if (d < 0)
++ d = -d;
++
++ if (d > 3*sysctl_sched_latency)
++ schedstat_inc(cfs_rq, nr_spread_over);
++#endif
++}
++
++static void
++place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
++{
++ u64 vruntime;
++
++ vruntime = cfs_rq->min_vruntime;
++
++ if (sched_feat(TREE_AVG)) {
++ struct sched_entity *last = __pick_last_entity(cfs_rq);
++ if (last) {
++ vruntime += last->vruntime;
++ vruntime >>= 1;
++ }
++ } else if (sched_feat(APPROX_AVG) && cfs_rq->nr_running)
++ vruntime += sched_vslice(cfs_rq)/2;
++
++ /*
++ * The 'current' period is already promised to the current tasks,
++ * however the extra weight of the new task will slow them down a
++ * little, place the new task so that it fits in the slot that
++ * stays open at the end.
++ */
++ if (initial && sched_feat(START_DEBIT))
++ vruntime += sched_vslice_add(cfs_rq, se);
++
++ if (!initial) {
++ /* sleeps upto a single latency don't count. */
++ if (sched_feat(NEW_FAIR_SLEEPERS) && entity_is_task(se))
++ vruntime -= sysctl_sched_latency;
++
++ /* ensure we never gain time by being placed backwards. */
++ vruntime = max_vruntime(se->vruntime, vruntime);
++ }
++
++ se->vruntime = vruntime;
++}
++
+ static void
+ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int wakeup)
+ {
+ /*
+- * Update the fair clock.
++ * Update run-time statistics of the 'current'.
+ */
+ update_curr(cfs_rq);
+
+- if (wakeup)
++ if (wakeup) {
++ place_entity(cfs_rq, se, 0);
+ enqueue_sleeper(cfs_rq, se);
++ }
+
+ update_stats_enqueue(cfs_rq, se);
+- __enqueue_entity(cfs_rq, se);
++ check_spread(cfs_rq, se);
++ if (se != cfs_rq->curr)
++ __enqueue_entity(cfs_rq, se);
++ account_entity_enqueue(cfs_rq, se);
+ }
+
+ static void
+ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int sleep)
+ {
++ /*
++ * Update run-time statistics of the 'current'.
++ */
++ update_curr(cfs_rq);
++
+ update_stats_dequeue(cfs_rq, se);
+ if (sleep) {
+- se->sleep_start_fair = cfs_rq->fair_clock;
+ #ifdef CONFIG_SCHEDSTATS
+ if (entity_is_task(se)) {
+ struct task_struct *tsk = task_of(se);
+
+ if (tsk->state & TASK_INTERRUPTIBLE)
+@@ -683,72 +560,68 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
+ if (tsk->state & TASK_UNINTERRUPTIBLE)
+ se->block_start = rq_of(cfs_rq)->clock;
+ }
+ #endif
+ }
+- __dequeue_entity(cfs_rq, se);
++
++ if (se != cfs_rq->curr)
++ __dequeue_entity(cfs_rq, se);
++ account_entity_dequeue(cfs_rq, se);
+ }
+
+ /*
+ * Preempt the current task with a newly woken task if needed:
+ */
+ static void
+-__check_preempt_curr_fair(struct cfs_rq *cfs_rq, struct sched_entity *se,
+- struct sched_entity *curr, unsigned long granularity)
++check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
+ {
+- s64 __delta = curr->fair_key - se->fair_key;
+ unsigned long ideal_runtime, delta_exec;
+
+- /*
+- * ideal_runtime is compared against sum_exec_runtime, which is
+- * walltime, hence do not scale.
+- */
+- ideal_runtime = max(sysctl_sched_latency / cfs_rq->nr_running,
+- (unsigned long)sysctl_sched_min_granularity);
+-
+- /*
+- * If we executed more than what the latency constraint suggests,
+- * reduce the rescheduling granularity. This way the total latency
+- * of how much a task is not scheduled converges to
+- * sysctl_sched_latency:
+- */
++ ideal_runtime = sched_slice(cfs_rq, curr);
+ delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
+ if (delta_exec > ideal_runtime)
+- granularity = 0;
+-
+- /*
+- * Take scheduling granularity into account - do not
+- * preempt the current task unless the best task has
+- * a larger than sched_granularity fairness advantage:
+- *
+- * scale granularity as key space is in fair_clock.
+- */
+- if (__delta > niced_granularity(curr, granularity))
+ resched_task(rq_of(cfs_rq)->curr);
+ }
+
+-static inline void
++static void
+ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
+ {
+- /*
+- * Any task has to be enqueued before it get to execute on
+- * a CPU. So account for the time it spent waiting on the
+- * runqueue. (note, here we rely on pick_next_task() having
+- * done a put_prev_task_fair() shortly before this, which
+- * updated rq->fair_clock - used by update_stats_wait_end())
+- */
+- update_stats_wait_end(cfs_rq, se);
++ /* 'current' is not kept within the tree. */
++ if (se->on_rq) {
++ /*
++ * Any task has to be enqueued before it get to execute on
++ * a CPU. So account for the time it spent waiting on the
++ * runqueue.
++ */
++ update_stats_wait_end(cfs_rq, se);
++ __dequeue_entity(cfs_rq, se);
++ }
++
+ update_stats_curr_start(cfs_rq, se);
+- set_cfs_rq_curr(cfs_rq, se);
++ cfs_rq->curr = se;
++#ifdef CONFIG_SCHEDSTATS
++ /*
++ * Track our maximum slice length, if the CPU's load is at
++ * least twice that of our own weight (i.e. dont track it
++ * when there are only lesser-weight tasks around):
++ */
++ if (rq_of(cfs_rq)->load.weight >= 2*se->load.weight) {
++ se->slice_max = max(se->slice_max,
++ se->sum_exec_runtime - se->prev_sum_exec_runtime);
++ }
++#endif
+ se->prev_sum_exec_runtime = se->sum_exec_runtime;
+ }
+
+ static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
+ {
+- struct sched_entity *se = __pick_next_entity(cfs_rq);
++ struct sched_entity *se = NULL;
+
+- set_next_entity(cfs_rq, se);
++ if (first_fair(cfs_rq)) {
++ se = __pick_next_entity(cfs_rq);
++ set_next_entity(cfs_rq, se);
++ }
+
+ return se;
+ }
+
+ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
+@@ -758,37 +631,28 @@ static void put_prev_entity(struct cfs_r
+ * was not called and update_curr() has to be done:
+ */
+ if (prev->on_rq)
+ update_curr(cfs_rq);
+
+- update_stats_curr_end(cfs_rq, prev);
+-
+- if (prev->on_rq)
++ check_spread(cfs_rq, prev);
++ if (prev->on_rq) {
+ update_stats_wait_start(cfs_rq, prev);
+- set_cfs_rq_curr(cfs_rq, NULL);
++ /* Put 'current' back into the tree. */
++ __enqueue_entity(cfs_rq, prev);
++ }
++ cfs_rq->curr = NULL;
+ }
+
+ static void entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
+ {
+- struct sched_entity *next;
+-
+- /*
+- * Dequeue and enqueue the task to update its
+- * position within the tree:
+- */
+- dequeue_entity(cfs_rq, curr, 0);
+- enqueue_entity(cfs_rq, curr, 0);
+-
+ /*
+- * Reschedule if another task tops the current one.
++ * Update run-time statistics of the 'current'.
+ */
+- next = __pick_next_entity(cfs_rq);
+- if (next == curr)
+- return;
++ update_curr(cfs_rq);
+
+- __check_preempt_curr_fair(cfs_rq, next, curr,
+- sched_granularity(cfs_rq));
++ if (cfs_rq->nr_running > 1 || !sched_feat(WAKEUP_PREEMPT))
++ check_preempt_tick(cfs_rq, curr);
+ }
+
+ /**************************************************
+ * CFS operations on tasks:
+ */
+@@ -819,27 +683,32 @@ static inline struct cfs_rq *group_cfs_r
+ /* Given a group's cfs_rq on one cpu, return its corresponding cfs_rq on
+ * another cpu ('this_cpu')
+ */
+ static inline struct cfs_rq *cpu_cfs_rq(struct cfs_rq *cfs_rq, int this_cpu)
+ {
+- /* A later patch will take group into account */
+- return &cpu_rq(this_cpu)->cfs;
++ return cfs_rq->tg->cfs_rq[this_cpu];
+ }
+
+ /* Iterate thr' all leaf cfs_rq's on a runqueue */
+ #define for_each_leaf_cfs_rq(rq, cfs_rq) \
+ list_for_each_entry(cfs_rq, &rq->leaf_cfs_rq_list, leaf_cfs_rq_list)
+
+-/* Do the two (enqueued) tasks belong to the same group ? */
+-static inline int is_same_group(struct task_struct *curr, struct task_struct *p)
++/* Do the two (enqueued) entities belong to the same group ? */
++static inline int
++is_same_group(struct sched_entity *se, struct sched_entity *pse)
+ {
+- if (curr->se.cfs_rq == p->se.cfs_rq)
++ if (se->cfs_rq == pse->cfs_rq)
+ return 1;
+
+ return 0;
+ }
+
++static inline struct sched_entity *parent_entity(struct sched_entity *se)
++{
++ return se->parent;
++}
++
+ #else /* CONFIG_FAIR_GROUP_SCHED */
+
+ #define for_each_sched_entity(se) \
+ for (; se; se = NULL)
+
+@@ -868,15 +737,21 @@ static inline struct cfs_rq *cpu_cfs_rq(
+ }
+
+ #define for_each_leaf_cfs_rq(rq, cfs_rq) \
+ for (cfs_rq = &rq->cfs; cfs_rq; cfs_rq = NULL)
+
+-static inline int is_same_group(struct task_struct *curr, struct task_struct *p)
++static inline int
++is_same_group(struct sched_entity *se, struct sched_entity *pse)
+ {
+ return 1;
+ }
+
++static inline struct sched_entity *parent_entity(struct sched_entity *se)
++{
++ return NULL;
++}
++
+ #endif /* CONFIG_FAIR_GROUP_SCHED */
+
+ /*
+ * The enqueue_task method is called before nr_running is
+ * increased. Here we update the fair scheduling stats and
+@@ -890,10 +765,11 @@ static void enqueue_task_fair(struct rq
+ for_each_sched_entity(se) {
+ if (se->on_rq)
+ break;
+ cfs_rq = cfs_rq_of(se);
+ enqueue_entity(cfs_rq, se, wakeup);
++ wakeup = 1;
+ }
+ }
+
+ /*
+ * The dequeue_task method is called before nr_running is
+@@ -909,97 +785,95 @@ static void dequeue_task_fair(struct rq
+ cfs_rq = cfs_rq_of(se);
+ dequeue_entity(cfs_rq, se, sleep);
+ /* Don't dequeue parent if it has other entities besides us */
+ if (cfs_rq->load.weight)
+ break;
++ sleep = 1;
+ }
+ }
+
+ /*
+ * sched_yield() support is very simple - we dequeue and enqueue.
+ *
+ * If compat_yield is turned on then we requeue to the end of the tree.
+ */
+-static void yield_task_fair(struct rq *rq, struct task_struct *p)
++static void yield_task_fair(struct rq *rq)
+ {
+- struct cfs_rq *cfs_rq = task_cfs_rq(p);
+- struct rb_node **link = &cfs_rq->tasks_timeline.rb_node;
+- struct sched_entity *rightmost, *se = &p->se;
+- struct rb_node *parent;
++ struct task_struct *curr = rq->curr;
++ struct cfs_rq *cfs_rq = task_cfs_rq(curr);
++ struct sched_entity *rightmost, *se = &curr->se;
+
+ /*
+ * Are we the only task in the tree?
+ */
+ if (unlikely(cfs_rq->nr_running == 1))
+ return;
+
+- if (likely(!sysctl_sched_compat_yield)) {
++ if (likely(!sysctl_sched_compat_yield) && curr->policy != SCHED_BATCH) {
+ __update_rq_clock(rq);
+ /*
+- * Dequeue and enqueue the task to update its
+- * position within the tree:
++ * Update run-time statistics of the 'current'.
+ */
+- dequeue_entity(cfs_rq, &p->se, 0);
+- enqueue_entity(cfs_rq, &p->se, 0);
++ update_curr(cfs_rq);
+
+ return;
+ }
+ /*
+ * Find the rightmost entry in the rbtree:
+ */
+- do {
+- parent = *link;
+- link = &parent->rb_right;
+- } while (*link);
+-
+- rightmost = rb_entry(parent, struct sched_entity, run_node);
++ rightmost = __pick_last_entity(cfs_rq);
+ /*
+ * Already in the rightmost position?
+ */
+- if (unlikely(rightmost == se))
++ if (unlikely(rightmost->vruntime < se->vruntime))
+ return;
+
+ /*
+ * Minimally necessary key value to be last in the tree:
++ * Upon rescheduling, sched_class::put_prev_task() will place
++ * 'current' within the tree based on its new key value.
+ */
+- se->fair_key = rightmost->fair_key + 1;
+-
+- if (cfs_rq->rb_leftmost == &se->run_node)
+- cfs_rq->rb_leftmost = rb_next(&se->run_node);
+- /*
+- * Relink the task to the rightmost position:
+- */
+- rb_erase(&se->run_node, &cfs_rq->tasks_timeline);
+- rb_link_node(&se->run_node, parent, link);
+- rb_insert_color(&se->run_node, &cfs_rq->tasks_timeline);
++ se->vruntime = rightmost->vruntime + 1;
+ }
+
+ /*
+ * Preempt the current task with a newly woken task if needed:
+ */
+-static void check_preempt_curr_fair(struct rq *rq, struct task_struct *p)
++static void check_preempt_wakeup(struct rq *rq, struct task_struct *p)
+ {
+ struct task_struct *curr = rq->curr;
+ struct cfs_rq *cfs_rq = task_cfs_rq(curr);
++ struct sched_entity *se = &curr->se, *pse = &p->se;
+ unsigned long gran;
+
+ if (unlikely(rt_prio(p->prio))) {
+ update_rq_clock(rq);
+ update_curr(cfs_rq);
+ resched_task(curr);
+ return;
+ }
+-
+- gran = sysctl_sched_wakeup_granularity;
+ /*
+- * Batch tasks prefer throughput over latency:
++ * Batch tasks do not preempt (their preemption is driven by
++ * the tick):
+ */
+ if (unlikely(p->policy == SCHED_BATCH))
+- gran = sysctl_sched_batch_wakeup_granularity;
++ return;
++
++ if (!sched_feat(WAKEUP_PREEMPT))
++ return;
++
++ while (!is_same_group(se, pse)) {
++ se = parent_entity(se);
++ pse = parent_entity(pse);
++ }
+
+- if (is_same_group(curr, p))
+- __check_preempt_curr_fair(cfs_rq, &p->se, &curr->se, gran);
++ gran = sysctl_sched_wakeup_granularity;
++ if (unlikely(se->load.weight != NICE_0_LOAD))
++ gran = calc_delta_fair(gran, &se->load);
++
++ if (pse->vruntime + gran < se->vruntime)
++ resched_task(curr);
+ }
+
+ static struct task_struct *pick_next_task_fair(struct rq *rq)
+ {
+ struct cfs_rq *cfs_rq = &rq->cfs;
+@@ -1028,10 +902,11 @@ static void put_prev_task_fair(struct rq
+ cfs_rq = cfs_rq_of(se);
+ put_prev_entity(cfs_rq, se);
+ }
+ }
+
++#ifdef CONFIG_SMP
+ /**************************************************
+ * Fair scheduling class load-balancing methods:
+ */
+
+ /*
+@@ -1039,11 +914,11 @@ static void put_prev_task_fair(struct rq
+ * during the whole iteration, the current task might be
+ * dequeued so the iterator has to be dequeue-safe. Here we
+ * achieve that by always pre-iterating before returning
+ * the current task:
+ */
+-static inline struct task_struct *
++static struct task_struct *
+ __load_balance_iterator(struct cfs_rq *cfs_rq, struct rb_node *curr)
+ {
+ struct task_struct *p;
+
+ if (!curr)
+@@ -1076,25 +951,27 @@ static int cfs_rq_best_prio(struct cfs_r
+ struct task_struct *p;
+
+ if (!cfs_rq->nr_running)
+ return MAX_PRIO;
+
+- curr = __pick_next_entity(cfs_rq);
++ curr = cfs_rq->curr;
++ if (!curr)
++ curr = __pick_next_entity(cfs_rq);
++
+ p = task_of(curr);
+
+ return p->prio;
+ }
+ #endif
+
+ static unsigned long
+ load_balance_fair(struct rq *this_rq, int this_cpu, struct rq *busiest,
+- unsigned long max_nr_move, unsigned long max_load_move,
++ unsigned long max_load_move,
+ struct sched_domain *sd, enum cpu_idle_type idle,
+ int *all_pinned, int *this_best_prio)
+ {
+ struct cfs_rq *busy_cfs_rq;
+- unsigned long load_moved, total_nr_moved = 0, nr_moved;
+ long rem_load_move = max_load_move;
+ struct rq_iterator cfs_rq_iterator;
+
+ cfs_rq_iterator.start = load_balance_start_fair;
+ cfs_rq_iterator.next = load_balance_next_fair;
+@@ -1118,29 +995,52 @@ load_balance_fair(struct rq *this_rq, in
+
+ *this_best_prio = cfs_rq_best_prio(this_cfs_rq);
+ #else
+ # define maxload rem_load_move
+ #endif
+- /* pass busy_cfs_rq argument into
++ /*
++ * pass busy_cfs_rq argument into
+ * load_balance_[start|next]_fair iterators
+ */
+ cfs_rq_iterator.arg = busy_cfs_rq;
+- nr_moved = balance_tasks(this_rq, this_cpu, busiest,
+- max_nr_move, maxload, sd, idle, all_pinned,
+- &load_moved, this_best_prio, &cfs_rq_iterator);
+-
+- total_nr_moved += nr_moved;
+- max_nr_move -= nr_moved;
+- rem_load_move -= load_moved;
++ rem_load_move -= balance_tasks(this_rq, this_cpu, busiest,
++ maxload, sd, idle, all_pinned,
++ this_best_prio,
++ &cfs_rq_iterator);
+
+- if (max_nr_move <= 0 || rem_load_move <= 0)
++ if (rem_load_move <= 0)
+ break;
+ }
+
+ return max_load_move - rem_load_move;
+ }
+
++static int
++move_one_task_fair(struct rq *this_rq, int this_cpu, struct rq *busiest,
++ struct sched_domain *sd, enum cpu_idle_type idle)
++{
++ struct cfs_rq *busy_cfs_rq;
++ struct rq_iterator cfs_rq_iterator;
++
++ cfs_rq_iterator.start = load_balance_start_fair;
++ cfs_rq_iterator.next = load_balance_next_fair;
++
++ for_each_leaf_cfs_rq(busiest, busy_cfs_rq) {
++ /*
++ * pass busy_cfs_rq argument into
++ * load_balance_[start|next]_fair iterators
++ */
++ cfs_rq_iterator.arg = busy_cfs_rq;
++ if (iter_move_one_task(this_rq, this_cpu, busiest, sd, idle,
++ &cfs_rq_iterator))
++ return 1;
++ }
++
++ return 0;
++}
++#endif
++
+ /*
+ * scheduler tick hitting a task of our scheduling class:
+ */
+ static void task_tick_fair(struct rq *rq, struct task_struct *curr)
+ {
+@@ -1151,51 +1051,44 @@ static void task_tick_fair(struct rq *rq
+ cfs_rq = cfs_rq_of(se);
+ entity_tick(cfs_rq, se);
+ }
+ }
+
++#define swap(a, b) do { typeof(a) tmp = (a); (a) = (b); (b) = tmp; } while (0)
++
+ /*
+ * Share the fairness runtime between parent and child, thus the
+ * total amount of pressure for CPU stays equal - new tasks
+ * get a chance to run but frequent forkers are not allowed to
+ * monopolize the CPU. Note: the parent runqueue is locked,
+ * the child is not running yet.
+ */
+ static void task_new_fair(struct rq *rq, struct task_struct *p)
+ {
+ struct cfs_rq *cfs_rq = task_cfs_rq(p);
+- struct sched_entity *se = &p->se, *curr = cfs_rq_curr(cfs_rq);
++ struct sched_entity *se = &p->se, *curr = cfs_rq->curr;
++ int this_cpu = smp_processor_id();
+
+ sched_info_queued(p);
+
+ update_curr(cfs_rq);
+- update_stats_enqueue(cfs_rq, se);
+- /*
+- * Child runs first: we let it run before the parent
+- * until it reschedules once. We set up the key so that
+- * it will preempt the parent:
+- */
+- se->fair_key = curr->fair_key -
+- niced_granularity(curr, sched_granularity(cfs_rq)) - 1;
+- /*
+- * The first wait is dominated by the child-runs-first logic,
+- * so do not credit it with that waiting time yet:
+- */
+- if (sysctl_sched_features & SCHED_FEAT_SKIP_INITIAL)
+- se->wait_start_fair = 0;
++ place_entity(cfs_rq, se, 1);
+
+- /*
+- * The statistical average of wait_runtime is about
+- * -granularity/2, so initialize the task with that:
+- */
+- if (sysctl_sched_features & SCHED_FEAT_START_DEBIT)
+- se->wait_runtime = -(sched_granularity(cfs_rq) / 2);
++ /* 'curr' will be NULL if the child belongs to a different group */
++ if (sysctl_sched_child_runs_first && this_cpu == task_cpu(p) &&
++ curr && curr->vruntime < se->vruntime) {
++ /*
++ * Upon rescheduling, sched_class::put_prev_task() will place
++ * 'current' within the tree based on its new key value.
++ */
++ swap(curr->vruntime, se->vruntime);
++ }
+
+- __enqueue_entity(cfs_rq, se);
++ enqueue_task_fair(rq, p, 0);
++ resched_task(rq->curr);
+ }
+
+-#ifdef CONFIG_FAIR_GROUP_SCHED
+ /* Account for a task changing its policy or group.
+ *
+ * This routine is mostly called to set cfs_rq->curr field when a task
+ * migrates between groups/classes.
+ */
+@@ -1204,30 +1097,29 @@ static void set_curr_task_fair(struct rq
+ struct sched_entity *se = &rq->curr->se;
+
+ for_each_sched_entity(se)
+ set_next_entity(cfs_rq_of(se), se);
+ }
+-#else
+-static void set_curr_task_fair(struct rq *rq)
+-{
+-}
+-#endif
+
+ /*
+ * All the scheduling class methods:
+ */
+-struct sched_class fair_sched_class __read_mostly = {
++static const struct sched_class fair_sched_class = {
++ .next = &idle_sched_class,
+ .enqueue_task = enqueue_task_fair,
+ .dequeue_task = dequeue_task_fair,
+ .yield_task = yield_task_fair,
+
+- .check_preempt_curr = check_preempt_curr_fair,
++ .check_preempt_curr = check_preempt_wakeup,
+
+ .pick_next_task = pick_next_task_fair,
+ .put_prev_task = put_prev_task_fair,
+
++#ifdef CONFIG_SMP
+ .load_balance = load_balance_fair,
++ .move_one_task = move_one_task_fair,
++#endif
+
+ .set_curr_task = set_curr_task_fair,
+ .task_tick = task_tick_fair,
+ .task_new = task_new_fair,
+ };
+@@ -1235,9 +1127,12 @@ struct sched_class fair_sched_class __re
+ #ifdef CONFIG_SCHED_DEBUG
+ static void print_cfs_stats(struct seq_file *m, int cpu)
+ {
+ struct cfs_rq *cfs_rq;
+
++#ifdef CONFIG_FAIR_GROUP_SCHED
++ print_cfs_rq(m, cpu, &cpu_rq(cpu)->cfs);
++#endif
+ for_each_leaf_cfs_rq(cpu_rq(cpu), cfs_rq)
+ print_cfs_rq(m, cpu, cfs_rq);
+ }
+ #endif
+--- linux-2.6.23.orig/kernel/sched_idletask.c
++++ linux-2.6.23/kernel/sched_idletask.c
+@@ -35,37 +35,55 @@ dequeue_task_idle(struct rq *rq, struct
+
+ static void put_prev_task_idle(struct rq *rq, struct task_struct *prev)
+ {
+ }
+
++#ifdef CONFIG_SMP
+ static unsigned long
+ load_balance_idle(struct rq *this_rq, int this_cpu, struct rq *busiest,
+- unsigned long max_nr_move, unsigned long max_load_move,
+- struct sched_domain *sd, enum cpu_idle_type idle,
+- int *all_pinned, int *this_best_prio)
++ unsigned long max_load_move,
++ struct sched_domain *sd, enum cpu_idle_type idle,
++ int *all_pinned, int *this_best_prio)
+ {
+ return 0;
+ }
+
++static int
++move_one_task_idle(struct rq *this_rq, int this_cpu, struct rq *busiest,
++ struct sched_domain *sd, enum cpu_idle_type idle)
++{
++ return 0;
++}
++#endif
++
+ static void task_tick_idle(struct rq *rq, struct task_struct *curr)
+ {
+ }
+
++static void set_curr_task_idle(struct rq *rq)
++{
++}
++
+ /*
+ * Simple, special scheduling class for the per-CPU idle tasks:
+ */
+-static struct sched_class idle_sched_class __read_mostly = {
++const struct sched_class idle_sched_class = {
++ /* .next is NULL */
+ /* no enqueue/yield_task for idle tasks */
+
+ /* dequeue is not valid, we print a debug message there: */
+ .dequeue_task = dequeue_task_idle,
+
+ .check_preempt_curr = check_preempt_curr_idle,
+
+ .pick_next_task = pick_next_task_idle,
+ .put_prev_task = put_prev_task_idle,
+
++#ifdef CONFIG_SMP
+ .load_balance = load_balance_idle,
++ .move_one_task = move_one_task_idle,
++#endif
+
++ .set_curr_task = set_curr_task_idle,
+ .task_tick = task_tick_idle,
+ /* no .task_new for idle tasks */
+ };
+--- linux-2.6.23.orig/kernel/sched_rt.c
++++ linux-2.6.23/kernel/sched_rt.c
+@@ -5,11 +5,11 @@
+
+ /*
+ * Update the current task's runtime statistics. Skip current tasks that
+ * are not in our scheduling class.
+ */
+-static inline void update_curr_rt(struct rq *rq)
++static void update_curr_rt(struct rq *rq)
+ {
+ struct task_struct *curr = rq->curr;
+ u64 delta_exec;
+
+ if (!task_has_rt_policy(curr))
+@@ -21,10 +21,11 @@ static inline void update_curr_rt(struct
+
+ schedstat_set(curr->se.exec_max, max(curr->se.exec_max, delta_exec));
+
+ curr->se.sum_exec_runtime += delta_exec;
+ curr->se.exec_start = rq->clock;
++ cpuacct_charge(curr, delta_exec);
+ }
+
+ static void enqueue_task_rt(struct rq *rq, struct task_struct *p, int wakeup)
+ {
+ struct rt_prio_array *array = &rq->rt.active;
+@@ -57,13 +58,13 @@ static void requeue_task_rt(struct rq *r
+
+ list_move_tail(&p->run_list, array->queue + p->prio);
+ }
+
+ static void
+-yield_task_rt(struct rq *rq, struct task_struct *p)
++yield_task_rt(struct rq *rq)
+ {
+- requeue_task_rt(rq, p);
++ requeue_task_rt(rq, rq->curr);
+ }
+
+ /*
+ * Preempt the current task with a newly woken task if needed:
+ */
+@@ -96,10 +97,11 @@ static void put_prev_task_rt(struct rq *
+ {
+ update_curr_rt(rq);
+ p->se.exec_start = 0;
+ }
+
++#ifdef CONFIG_SMP
+ /*
+ * Load-balancing iterator. Note: while the runqueue stays locked
+ * during the whole iteration, the current task might be
+ * dequeued so the iterator has to be dequeue-safe. Here we
+ * achieve that by always pre-iterating before returning
+@@ -170,45 +172,57 @@ static struct task_struct *load_balance_
+ return p;
+ }
+
+ static unsigned long
+ load_balance_rt(struct rq *this_rq, int this_cpu, struct rq *busiest,
+- unsigned long max_nr_move, unsigned long max_load_move,
+- struct sched_domain *sd, enum cpu_idle_type idle,
+- int *all_pinned, int *this_best_prio)
++ unsigned long max_load_move,
++ struct sched_domain *sd, enum cpu_idle_type idle,
++ int *all_pinned, int *this_best_prio)
+ {
+- int nr_moved;
+ struct rq_iterator rt_rq_iterator;
+- unsigned long load_moved;
+
+ rt_rq_iterator.start = load_balance_start_rt;
+ rt_rq_iterator.next = load_balance_next_rt;
+ /* pass 'busiest' rq argument into
+ * load_balance_[start|next]_rt iterators
+ */
+ rt_rq_iterator.arg = busiest;
+
+- nr_moved = balance_tasks(this_rq, this_cpu, busiest, max_nr_move,
+- max_load_move, sd, idle, all_pinned, &load_moved,
+- this_best_prio, &rt_rq_iterator);
++ return balance_tasks(this_rq, this_cpu, busiest, max_load_move, sd,
++ idle, all_pinned, this_best_prio, &rt_rq_iterator);
++}
++
++static int
++move_one_task_rt(struct rq *this_rq, int this_cpu, struct rq *busiest,
++ struct sched_domain *sd, enum cpu_idle_type idle)
++{
++ struct rq_iterator rt_rq_iterator;
++
++ rt_rq_iterator.start = load_balance_start_rt;
++ rt_rq_iterator.next = load_balance_next_rt;
++ rt_rq_iterator.arg = busiest;
+
+- return load_moved;
++ return iter_move_one_task(this_rq, this_cpu, busiest, sd, idle,
++ &rt_rq_iterator);
+ }
++#endif
+
+ static void task_tick_rt(struct rq *rq, struct task_struct *p)
+ {
++ update_curr_rt(rq);
++
+ /*
+ * RR tasks need a special form of timeslice management.
+ * FIFO tasks have no timeslices.
+ */
+ if (p->policy != SCHED_RR)
+ return;
+
+ if (--p->time_slice)
+ return;
+
+- p->time_slice = static_prio_timeslice(p->static_prio);
++ p->time_slice = DEF_TIMESLICE;
+
+ /*
+ * Requeue to the end of queue if we are not the only element
+ * on the queue:
+ */
+@@ -216,19 +230,31 @@ static void task_tick_rt(struct rq *rq,
+ requeue_task_rt(rq, p);
+ set_tsk_need_resched(p);
+ }
+ }
+
+-static struct sched_class rt_sched_class __read_mostly = {
++static void set_curr_task_rt(struct rq *rq)
++{
++ struct task_struct *p = rq->curr;
++
++ p->se.exec_start = rq->clock;
++}
++
++const struct sched_class rt_sched_class = {
++ .next = &fair_sched_class,
+ .enqueue_task = enqueue_task_rt,
+ .dequeue_task = dequeue_task_rt,
+ .yield_task = yield_task_rt,
+
+ .check_preempt_curr = check_preempt_curr_rt,
+
+ .pick_next_task = pick_next_task_rt,
+ .put_prev_task = put_prev_task_rt,
+
++#ifdef CONFIG_SMP
+ .load_balance = load_balance_rt,
++ .move_one_task = move_one_task_rt,
++#endif
+
++ .set_curr_task = set_curr_task_rt,
+ .task_tick = task_tick_rt,
+ };
+--- linux-2.6.23.orig/kernel/sched_stats.h
++++ linux-2.6.23/kernel/sched_stats.h
+@@ -14,22 +14,22 @@ static int show_schedstat(struct seq_fil
+ seq_printf(seq, "timestamp %lu\n", jiffies);
+ for_each_online_cpu(cpu) {
+ struct rq *rq = cpu_rq(cpu);
+ #ifdef CONFIG_SMP
+ struct sched_domain *sd;
+- int dcnt = 0;
++ int dcount = 0;
+ #endif
+
+ /* runqueue-specific stats */
+ seq_printf(seq,
+- "cpu%d %lu %lu %lu %lu %lu %lu %lu %lu %lu %llu %llu %lu",
++ "cpu%d %u %u %u %u %u %u %u %u %u %llu %llu %lu",
+ cpu, rq->yld_both_empty,
+- rq->yld_act_empty, rq->yld_exp_empty, rq->yld_cnt,
+- rq->sched_switch, rq->sched_cnt, rq->sched_goidle,
+- rq->ttwu_cnt, rq->ttwu_local,
++ rq->yld_act_empty, rq->yld_exp_empty, rq->yld_count,
++ rq->sched_switch, rq->sched_count, rq->sched_goidle,
++ rq->ttwu_count, rq->ttwu_local,
+ rq->rq_sched_info.cpu_time,
+- rq->rq_sched_info.run_delay, rq->rq_sched_info.pcnt);
++ rq->rq_sched_info.run_delay, rq->rq_sched_info.pcount);
+
+ seq_printf(seq, "\n");
+
+ #ifdef CONFIG_SMP
+ /* domain-specific stats */
+@@ -37,29 +37,28 @@ static int show_schedstat(struct seq_fil
+ for_each_domain(cpu, sd) {
+ enum cpu_idle_type itype;
+ char mask_str[NR_CPUS];
+
+ cpumask_scnprintf(mask_str, NR_CPUS, sd->span);
+- seq_printf(seq, "domain%d %s", dcnt++, mask_str);
++ seq_printf(seq, "domain%d %s", dcount++, mask_str);
+ for (itype = CPU_IDLE; itype < CPU_MAX_IDLE_TYPES;
+ itype++) {
+- seq_printf(seq, " %lu %lu %lu %lu %lu %lu %lu "
+- "%lu",
+- sd->lb_cnt[itype],
++ seq_printf(seq, " %u %u %u %u %u %u %u %u",
++ sd->lb_count[itype],
+ sd->lb_balanced[itype],
+ sd->lb_failed[itype],
+ sd->lb_imbalance[itype],
+ sd->lb_gained[itype],
+ sd->lb_hot_gained[itype],
+ sd->lb_nobusyq[itype],
+ sd->lb_nobusyg[itype]);
+ }
+- seq_printf(seq, " %lu %lu %lu %lu %lu %lu %lu %lu %lu"
+- " %lu %lu %lu\n",
+- sd->alb_cnt, sd->alb_failed, sd->alb_pushed,
+- sd->sbe_cnt, sd->sbe_balanced, sd->sbe_pushed,
+- sd->sbf_cnt, sd->sbf_balanced, sd->sbf_pushed,
++ seq_printf(seq,
++ " %u %u %u %u %u %u %u %u %u %u %u %u\n",
++ sd->alb_count, sd->alb_failed, sd->alb_pushed,
++ sd->sbe_count, sd->sbe_balanced, sd->sbe_pushed,
++ sd->sbf_count, sd->sbf_balanced, sd->sbf_pushed,
+ sd->ttwu_wake_remote, sd->ttwu_move_affine,
+ sd->ttwu_move_balance);
+ }
+ preempt_enable();
+ #endif
+@@ -99,11 +98,11 @@ const struct file_operations proc_scheds
+ static inline void
+ rq_sched_info_arrive(struct rq *rq, unsigned long long delta)
+ {
+ if (rq) {
+ rq->rq_sched_info.run_delay += delta;
+- rq->rq_sched_info.pcnt++;
++ rq->rq_sched_info.pcount++;
+ }
+ }
+
+ /*
+ * Expects runqueue lock to be held for atomicity of update
+@@ -155,18 +154,18 @@ static inline void sched_info_dequeued(s
+ * long it was waiting to run. We also note when it began so that we
+ * can keep stats on how long its timeslice is.
+ */
+ static void sched_info_arrive(struct task_struct *t)
+ {
+- unsigned long long now = sched_clock(), delta = 0;
++ unsigned long long now = task_rq(t)->clock, delta = 0;
+
+ if (t->sched_info.last_queued)
+ delta = now - t->sched_info.last_queued;
+ sched_info_dequeued(t);
+ t->sched_info.run_delay += delta;
+ t->sched_info.last_arrival = now;
+- t->sched_info.pcnt++;
++ t->sched_info.pcount++;
+
+ rq_sched_info_arrive(task_rq(t), delta);
+ }
+
+ /*
+@@ -186,20 +185,21 @@ static void sched_info_arrive(struct tas
+ */
+ static inline void sched_info_queued(struct task_struct *t)
+ {
+ if (unlikely(sched_info_on()))
+ if (!t->sched_info.last_queued)
+- t->sched_info.last_queued = sched_clock();
++ t->sched_info.last_queued = task_rq(t)->clock;
+ }
+
+ /*
+ * Called when a process ceases being the active-running process, either
+ * voluntarily or involuntarily. Now we can calculate how long we ran.
+ */
+ static inline void sched_info_depart(struct task_struct *t)
+ {
+- unsigned long long delta = sched_clock() - t->sched_info.last_arrival;
++ unsigned long long delta = task_rq(t)->clock -
++ t->sched_info.last_arrival;
+
+ t->sched_info.cpu_time += delta;
+ rq_sched_info_depart(task_rq(t), delta);
+ }
+
+--- linux-2.6.23.orig/kernel/sysctl.c
++++ linux-2.6.23/kernel/sysctl.c
+@@ -211,35 +211,35 @@ static ctl_table root_table[] = {
+ { .ctl_name = 0 }
+ };
+
+ #ifdef CONFIG_SCHED_DEBUG
+ static unsigned long min_sched_granularity_ns = 100000; /* 100 usecs */
+-static unsigned long max_sched_granularity_ns = 1000000000; /* 1 second */
++static unsigned long max_sched_granularity_ns = NSEC_PER_SEC; /* 1 second */
+ static unsigned long min_wakeup_granularity_ns; /* 0 usecs */
+-static unsigned long max_wakeup_granularity_ns = 1000000000; /* 1 second */
++static unsigned long max_wakeup_granularity_ns = NSEC_PER_SEC; /* 1 second */
+ #endif
+
+-static ctl_table kern_table[] = {
++static struct ctl_table kern_table[] = {
+ #ifdef CONFIG_SCHED_DEBUG
+ {
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "sched_min_granularity_ns",
+ .data = &sysctl_sched_min_granularity,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+- .proc_handler = &proc_dointvec_minmax,
++ .proc_handler = &sched_nr_latency_handler,
+ .strategy = &sysctl_intvec,
+ .extra1 = &min_sched_granularity_ns,
+ .extra2 = &max_sched_granularity_ns,
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "sched_latency_ns",
+ .data = &sysctl_sched_latency,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+- .proc_handler = &proc_dointvec_minmax,
++ .proc_handler = &sched_nr_latency_handler,
+ .strategy = &sysctl_intvec,
+ .extra1 = &min_sched_granularity_ns,
+ .extra2 = &max_sched_granularity_ns,
+ },
+ {
+@@ -264,47 +264,43 @@ static ctl_table kern_table[] = {
+ .extra1 = &min_wakeup_granularity_ns,
+ .extra2 = &max_wakeup_granularity_ns,
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
+- .procname = "sched_stat_granularity_ns",
+- .data = &sysctl_sched_stat_granularity,
++ .procname = "sched_child_runs_first",
++ .data = &sysctl_sched_child_runs_first,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+- .proc_handler = &proc_dointvec_minmax,
+- .strategy = &sysctl_intvec,
+- .extra1 = &min_wakeup_granularity_ns,
+- .extra2 = &max_wakeup_granularity_ns,
++ .proc_handler = &proc_dointvec,
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
+- .procname = "sched_runtime_limit_ns",
+- .data = &sysctl_sched_runtime_limit,
++ .procname = "sched_features",
++ .data = &sysctl_sched_features,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+- .proc_handler = &proc_dointvec_minmax,
+- .strategy = &sysctl_intvec,
+- .extra1 = &min_sched_granularity_ns,
+- .extra2 = &max_sched_granularity_ns,
++ .proc_handler = &proc_dointvec,
+ },
+ {
+ .ctl_name = CTL_UNNUMBERED,
+- .procname = "sched_child_runs_first",
+- .data = &sysctl_sched_child_runs_first,
++ .procname = "sched_migration_cost",
++ .data = &sysctl_sched_migration_cost,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = &proc_dointvec,
+ },
++#if defined(CONFIG_FAIR_GROUP_SCHED) && defined(CONFIG_SMP)
+ {
+ .ctl_name = CTL_UNNUMBERED,
+- .procname = "sched_features",
+- .data = &sysctl_sched_features,
++ .procname = "sched_nr_migrate",
++ .data = &sysctl_sched_nr_migrate,
+ .maxlen = sizeof(unsigned int),
+- .mode = 0644,
++ .mode = 644,
+ .proc_handler = &proc_dointvec,
+ },
+ #endif
++#endif
+ {
+ .ctl_name = CTL_UNNUMBERED,
+ .procname = "sched_compat_yield",
+ .data = &sysctl_sched_compat_yield,
+ .maxlen = sizeof(unsigned int),
+--- linux-2.6.23.orig/kernel/timer.c
++++ linux-2.6.23/kernel/timer.c
+@@ -824,14 +824,17 @@ void update_process_times(int user_tick)
+ {
+ struct task_struct *p = current;
+ int cpu = smp_processor_id();
+
+ /* Note: this timer irq context must be accounted for as well. */
+- if (user_tick)
++ if (user_tick) {
+ account_user_time(p, jiffies_to_cputime(1));
+- else
++ account_user_time_scaled(p, jiffies_to_cputime(1));
++ } else {
+ account_system_time(p, HARDIRQ_OFFSET, jiffies_to_cputime(1));
++ account_system_time_scaled(p, jiffies_to_cputime(1));
++ }
+ run_local_timers();
+ if (rcu_pending(cpu))
+ rcu_check_callbacks(cpu, user_tick);
+ scheduler_tick();
+ run_posix_cpu_timers(p);
+--- linux-2.6.23.orig/kernel/tsacct.c
++++ linux-2.6.23/kernel/tsacct.c
+@@ -60,10 +60,14 @@ void bacct_add_tsk(struct taskstats *sta
+ stats->ac_ppid = pid_alive(tsk) ?
+ rcu_dereference(tsk->real_parent)->tgid : 0;
+ rcu_read_unlock();
+ stats->ac_utime = cputime_to_msecs(tsk->utime) * USEC_PER_MSEC;
+ stats->ac_stime = cputime_to_msecs(tsk->stime) * USEC_PER_MSEC;
++ stats->ac_utimescaled =
++ cputime_to_msecs(tsk->utimescaled) * USEC_PER_MSEC;
++ stats->ac_stimescaled =
++ cputime_to_msecs(tsk->stimescaled) * USEC_PER_MSEC;
+ stats->ac_minflt = tsk->min_flt;
+ stats->ac_majflt = tsk->maj_flt;
+
+ strncpy(stats->ac_comm, tsk->comm, sizeof(stats->ac_comm));
+ }
+--- linux-2.6.23.orig/kernel/user.c
++++ linux-2.6.23/kernel/user.c
+@@ -48,40 +48,242 @@ struct user_struct root_user = {
+ .locked_shm = 0,
+ #ifdef CONFIG_KEYS
+ .uid_keyring = &root_user_keyring,
+ .session_keyring = &root_session_keyring,
+ #endif
++#ifdef CONFIG_FAIR_USER_SCHED
++ .tg = &init_task_group,
++#endif
+ };
+
+ /*
+ * These routines must be called with the uidhash spinlock held!
+ */
+-static inline void uid_hash_insert(struct user_struct *up, struct hlist_head *hashent)
++static inline void uid_hash_insert(struct user_struct *up,
++ struct hlist_head *hashent)
+ {
+ hlist_add_head(&up->uidhash_node, hashent);
+ }
+
+ static inline void uid_hash_remove(struct user_struct *up)
+ {
+ hlist_del_init(&up->uidhash_node);
+ }
+
+-static inline struct user_struct *uid_hash_find(uid_t uid, struct hlist_head *hashent)
++static inline struct user_struct *uid_hash_find(uid_t uid,
++ struct hlist_head *hashent)
+ {
+ struct user_struct *user;
+ struct hlist_node *h;
+
+ hlist_for_each_entry(user, h, hashent, uidhash_node) {
+- if(user->uid == uid) {
++ if (user->uid == uid) {
+ atomic_inc(&user->__count);
+ return user;
+ }
+ }
+
+ return NULL;
+ }
+
++#ifdef CONFIG_FAIR_USER_SCHED
++
++static struct kobject uids_kobject; /* represents /sys/kernel/uids directory */
++static DEFINE_MUTEX(uids_mutex);
++
++static void sched_destroy_user(struct user_struct *up)
++{
++ sched_destroy_group(up->tg);
++}
++
++static int sched_create_user(struct user_struct *up)
++{
++ int rc = 0;
++
++ up->tg = sched_create_group();
++ if (IS_ERR(up->tg))
++ rc = -ENOMEM;
++
++ return rc;
++}
++
++static void sched_switch_user(struct task_struct *p)
++{
++ sched_move_task(p);
++}
++
++static inline void uids_mutex_lock(void)
++{
++ mutex_lock(&uids_mutex);
++}
++
++static inline void uids_mutex_unlock(void)
++{
++ mutex_unlock(&uids_mutex);
++}
++
++/* return cpu shares held by the user */
++ssize_t cpu_shares_show(struct kset *kset, char *buffer)
++{
++ struct user_struct *up = container_of(kset, struct user_struct, kset);
++
++ return sprintf(buffer, "%lu\n", sched_group_shares(up->tg));
++}
++
++/* modify cpu shares held by the user */
++ssize_t cpu_shares_store(struct kset *kset, const char *buffer, size_t size)
++{
++ struct user_struct *up = container_of(kset, struct user_struct, kset);
++ unsigned long shares;
++ int rc;
++
++ sscanf(buffer, "%lu", &shares);
++
++ rc = sched_group_set_shares(up->tg, shares);
++
++ return (rc ? rc : size);
++}
++
++static void user_attr_init(struct subsys_attribute *sa, char *name, int mode)
++{
++ sa->attr.name = name; sa->attr.owner = NULL;
++ sa->attr.mode = mode;
++ sa->show = cpu_shares_show;
++ sa->store = cpu_shares_store;
++}
++
++/* Create "/sys/kernel/uids/<uid>" directory and
++ * "/sys/kernel/uids/<uid>/cpu_share" file for this user.
++ */
++static int user_kobject_create(struct user_struct *up)
++{
++ struct kset *kset = &up->kset;
++ struct kobject *kobj = &kset->kobj;
++ int error;
++
++ memset(kset, 0, sizeof(struct kset));
++ kobj->parent = &uids_kobject; /* create under /sys/kernel/uids dir */
++ kobject_set_name(kobj, "%d", up->uid);
++ kset_init(kset);
++ user_attr_init(&up->user_attr, "cpu_share", 0644);
++
++ error = kobject_add(kobj);
++ if (error)
++ goto done;
++
++ error = sysfs_create_file(kobj, &up->user_attr.attr);
++ if (error)
++ kobject_del(kobj);
++
++ kobject_uevent(kobj, KOBJ_ADD);
++
++done:
++ return error;
++}
++
++/* create these in sysfs filesystem:
++ * "/sys/kernel/uids" directory
++ * "/sys/kernel/uids/0" directory (for root user)
++ * "/sys/kernel/uids/0/cpu_share" file (for root user)
++ */
++int __init uids_kobject_init(void)
++{
++ int error;
++
++ /* create under /sys/kernel dir */
++ uids_kobject.parent = &kernel_subsys.kobj;
++ uids_kobject.kset = &kernel_subsys;
++ kobject_set_name(&uids_kobject, "uids");
++ kobject_init(&uids_kobject);
++
++ error = kobject_add(&uids_kobject);
++ if (!error)
++ error = user_kobject_create(&root_user);
++
++ return error;
++}
++
++/* work function to remove sysfs directory for a user and free up
++ * corresponding structures.
++ */
++static void remove_user_sysfs_dir(struct work_struct *w)
++{
++ struct user_struct *up = container_of(w, struct user_struct, work);
++ struct kobject *kobj = &up->kset.kobj;
++ unsigned long flags;
++ int remove_user = 0;
++
++ /* Make uid_hash_remove() + sysfs_remove_file() + kobject_del()
++ * atomic.
++ */
++ uids_mutex_lock();
++
++ local_irq_save(flags);
++
++ if (atomic_dec_and_lock(&up->__count, &uidhash_lock)) {
++ uid_hash_remove(up);
++ remove_user = 1;
++ spin_unlock_irqrestore(&uidhash_lock, flags);
++ } else {
++ local_irq_restore(flags);
++ }
++
++ if (!remove_user)
++ goto done;
++
++ sysfs_remove_file(kobj, &up->user_attr.attr);
++ kobject_uevent(kobj, KOBJ_REMOVE);
++ kobject_del(kobj);
++
++ sched_destroy_user(up);
++ key_put(up->uid_keyring);
++ key_put(up->session_keyring);
++ kmem_cache_free(uid_cachep, up);
++
++done:
++ uids_mutex_unlock();
++}
++
++/* IRQs are disabled and uidhash_lock is held upon function entry.
++ * IRQ state (as stored in flags) is restored and uidhash_lock released
++ * upon function exit.
++ */
++static inline void free_user(struct user_struct *up, unsigned long flags)
++{
++ /* restore back the count */
++ atomic_inc(&up->__count);
++ spin_unlock_irqrestore(&uidhash_lock, flags);
++
++ INIT_WORK(&up->work, remove_user_sysfs_dir);
++ schedule_work(&up->work);
++}
++
++#else /* CONFIG_FAIR_USER_SCHED */
++
++static void sched_destroy_user(struct user_struct *up) { }
++static int sched_create_user(struct user_struct *up) { return 0; }
++static void sched_switch_user(struct task_struct *p) { }
++static inline int user_kobject_create(struct user_struct *up) { return 0; }
++static inline void uids_mutex_lock(void) { }
++static inline void uids_mutex_unlock(void) { }
++
++/* IRQs are disabled and uidhash_lock is held upon function entry.
++ * IRQ state (as stored in flags) is restored and uidhash_lock released
++ * upon function exit.
++ */
++static inline void free_user(struct user_struct *up, unsigned long flags)
++{
++ uid_hash_remove(up);
++ spin_unlock_irqrestore(&uidhash_lock, flags);
++ sched_destroy_user(up);
++ key_put(up->uid_keyring);
++ key_put(up->session_keyring);
++ kmem_cache_free(uid_cachep, up);
++}
++
++#endif /* CONFIG_FAIR_USER_SCHED */
++
+ /*
+ * Locate the user_struct for the passed UID. If found, take a ref on it. The
+ * caller must undo that ref with free_uid().
+ *
+ * If the user_struct could not be found, return NULL.
+@@ -104,26 +306,26 @@ void free_uid(struct user_struct *up)
+
+ if (!up)
+ return;
+
+ local_irq_save(flags);
+- if (atomic_dec_and_lock(&up->__count, &uidhash_lock)) {
+- uid_hash_remove(up);
+- spin_unlock_irqrestore(&uidhash_lock, flags);
+- key_put(up->uid_keyring);
+- key_put(up->session_keyring);
+- kmem_cache_free(uid_cachep, up);
+- } else {
++ if (atomic_dec_and_lock(&up->__count, &uidhash_lock))
++ free_user(up, flags);
++ else
+ local_irq_restore(flags);
+- }
+ }
+
+ struct user_struct * alloc_uid(struct user_namespace *ns, uid_t uid)
+ {
+ struct hlist_head *hashent = uidhashentry(ns, uid);
+ struct user_struct *up;
+
++ /* Make uid_hash_find() + user_kobject_create() + uid_hash_insert()
++ * atomic.
++ */
++ uids_mutex_lock();
++
+ spin_lock_irq(&uidhash_lock);
+ up = uid_hash_find(uid, hashent);
+ spin_unlock_irq(&uidhash_lock);
+
+ if (!up) {
+@@ -148,27 +350,51 @@ struct user_struct * alloc_uid(struct us
+ if (alloc_uid_keyring(new, current) < 0) {
+ kmem_cache_free(uid_cachep, new);
+ return NULL;
+ }
+
++ if (sched_create_user(new) < 0) {
++ key_put(new->uid_keyring);
++ key_put(new->session_keyring);
++ kmem_cache_free(uid_cachep, new);
++ return NULL;
++ }
++
++ if (user_kobject_create(new)) {
++ sched_destroy_user(new);
++ key_put(new->uid_keyring);
++ key_put(new->session_keyring);
++ kmem_cache_free(uid_cachep, new);
++ uids_mutex_unlock();
++ return NULL;
++ }
++
+ /*
+ * Before adding this, check whether we raced
+ * on adding the same user already..
+ */
+ spin_lock_irq(&uidhash_lock);
+ up = uid_hash_find(uid, hashent);
+ if (up) {
++ /* This case is not possible when CONFIG_FAIR_USER_SCHED
++ * is defined, since we serialize alloc_uid() using
++ * uids_mutex. Hence no need to call
++ * sched_destroy_user() or remove_user_sysfs_dir().
++ */
+ key_put(new->uid_keyring);
+ key_put(new->session_keyring);
+ kmem_cache_free(uid_cachep, new);
+ } else {
+ uid_hash_insert(new, hashent);
+ up = new;
+ }
+ spin_unlock_irq(&uidhash_lock);
+
+ }
++
++ uids_mutex_unlock();
++
+ return up;
+ }
+
+ void switch_uid(struct user_struct *new_user)
+ {
+@@ -182,10 +408,11 @@ void switch_uid(struct user_struct *new_
+ old_user = current->user;
+ atomic_inc(&new_user->processes);
+ atomic_dec(&old_user->processes);
+ switch_uid_keyring(new_user);
+ current->user = new_user;
++ sched_switch_user(current);
+
+ /*
+ * We need to synchronize with __sigqueue_alloc()
+ * doing a get_uid(p->user).. If that saw the old
+ * user value, we need to wait until it has exited
+--- linux-2.6.23.orig/mm/memory_hotplug.c
++++ linux-2.6.23/mm/memory_hotplug.c
+@@ -215,10 +215,14 @@ int online_pages(unsigned long pfn, unsi
+ }
+ zone->present_pages += onlined_pages;
+ zone->zone_pgdat->node_present_pages += onlined_pages;
+
+ setup_per_zone_pages_min();
++ if (onlined_pages) {
++ kswapd_run(zone_to_nid(zone));
++ node_set_state(zone_to_nid(zone), N_HIGH_MEMORY);
++ }
+
+ if (need_zonelists_rebuild)
+ build_all_zonelists();
+ vm_total_pages = nr_free_pagecache_pages();
+ writeback_set_ratelimit();
+@@ -269,13 +273,10 @@ int add_memory(int nid, u64 start, u64 s
+ if (!node_online(nid)) {
+ pgdat = hotadd_new_pgdat(nid, start);
+ if (!pgdat)
+ return -ENOMEM;
+ new_pgdat = 1;
+- ret = kswapd_run(nid);
+- if (ret)
+- goto error;
+ }
+
+ /* call arch's memory hotadd */
+ ret = arch_add_memory(nid, start, size);
+
+--- linux-2.6.23.orig/mm/page_alloc.c
++++ linux-2.6.23/mm/page_alloc.c
+@@ -45,17 +45,25 @@
+ #include <asm/tlbflush.h>
+ #include <asm/div64.h>
+ #include "internal.h"
+
+ /*
+- * MCD - HACK: Find somewhere to initialize this EARLY, or make this
+- * initializer cleaner
++ * Array of node states.
+ */
+-nodemask_t node_online_map __read_mostly = { { [0] = 1UL } };
+-EXPORT_SYMBOL(node_online_map);
+-nodemask_t node_possible_map __read_mostly = NODE_MASK_ALL;
+-EXPORT_SYMBOL(node_possible_map);
++nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
++ [N_POSSIBLE] = NODE_MASK_ALL,
++ [N_ONLINE] = { { [0] = 1UL } },
++#ifndef CONFIG_NUMA
++ [N_NORMAL_MEMORY] = { { [0] = 1UL } },
++#ifdef CONFIG_HIGHMEM
++ [N_HIGH_MEMORY] = { { [0] = 1UL } },
++#endif
++ [N_CPU] = { { [0] = 1UL } },
++#endif /* NUMA */
++};
++EXPORT_SYMBOL(node_states);
++
+ unsigned long totalram_pages __read_mostly;
+ unsigned long totalreserve_pages __read_mostly;
+ long nr_swap_pages;
+ int percpu_pagelist_fraction;
+
+@@ -2070,18 +2078,39 @@ static void build_zonelist_cache(pg_data
+ pgdat->node_zonelists[i].zlcache_ptr = NULL;
+ }
+
+ #endif /* CONFIG_NUMA */
+
++/* Any regular memory on that node ? */
++static void check_for_regular_memory(pg_data_t *pgdat)
++{
++#ifdef CONFIG_HIGHMEM
++ enum zone_type zone_type;
++
++ for (zone_type = 0; zone_type <= ZONE_NORMAL; zone_type++) {
++ struct zone *zone = &pgdat->node_zones[zone_type];
++ if (zone->present_pages)
++ node_set_state(zone_to_nid(zone), N_NORMAL_MEMORY);
++ }
++#endif
++}
++
+ /* return values int ....just for stop_machine_run() */
+ static int __build_all_zonelists(void *dummy)
+ {
+ int nid;
+
+ for_each_online_node(nid) {
+- build_zonelists(NODE_DATA(nid));
+- build_zonelist_cache(NODE_DATA(nid));
++ pg_data_t *pgdat = NODE_DATA(nid);
++
++ build_zonelists(pgdat);
++ build_zonelist_cache(pgdat);
++
++ /* Any memory on that node */
++ if (pgdat->node_present_pages)
++ node_set_state(nid, N_HIGH_MEMORY);
++ check_for_regular_memory(pgdat);
+ }
+ return 0;
+ }
+
+ void build_all_zonelists(void)
+@@ -2322,18 +2351,21 @@ static struct per_cpu_pageset boot_pages
+ * per cpu pageset array in struct zone.
+ */
+ static int __cpuinit process_zones(int cpu)
+ {
+ struct zone *zone, *dzone;
++ int node = cpu_to_node(cpu);
++
++ node_set_state(node, N_CPU); /* this node has a cpu */
+
+ for_each_zone(zone) {
+
+ if (!populated_zone(zone))
+ continue;
+
+ zone_pcp(zone, cpu) = kmalloc_node(sizeof(struct per_cpu_pageset),
+- GFP_KERNEL, cpu_to_node(cpu));
++ GFP_KERNEL, node);
+ if (!zone_pcp(zone, cpu))
+ goto bad;
+
+ setup_pageset(zone_pcp(zone, cpu), zone_batchsize(zone));
+
+--- linux-2.6.23.orig/mm/vmscan.c
++++ linux-2.6.23/mm/vmscan.c
+@@ -1845,11 +1845,10 @@ static int __zone_reclaim(struct zone *z
+ return nr_reclaimed >= nr_pages;
+ }
+
+ int zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
+ {
+- cpumask_t mask;
+ int node_id;
+
+ /*
+ * Zone reclaim reclaims unmapped file backed pages and
+ * slab pages if we are over the defined limits.
+@@ -1882,11 +1881,10 @@ int zone_reclaim(struct zone *zone, gfp_
+ * have associated processors. This will favor the local processor
+ * over remote processors and spread off node memory allocations
+ * as wide as possible.
+ */
+ node_id = zone_to_nid(zone);
+- mask = node_to_cpumask(node_id);
+- if (!cpus_empty(mask) && node_id != numa_node_id())
++ if (node_state(node_id, N_CPU) && node_id != numa_node_id())
+ return 0;
+ return __zone_reclaim(zone, gfp_mask, order);
+ }
+ #endif
+--- linux-2.6.23.orig/net/unix/af_unix.c
++++ linux-2.6.23/net/unix/af_unix.c
+@@ -331,11 +331,11 @@ static inline int unix_writable(struct s
+ static void unix_write_space(struct sock *sk)
+ {
+ read_lock(&sk->sk_callback_lock);
+ if (unix_writable(sk)) {
+ if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
+- wake_up_interruptible(sk->sk_sleep);
++ wake_up_interruptible_sync(sk->sk_sleep);
+ sk_wake_async(sk, 2, POLL_OUT);
+ }
+ read_unlock(&sk->sk_callback_lock);
+ }
+
+@@ -1640,11 +1640,11 @@ static int unix_dgram_recvmsg(struct kio
+ err = 0;
+ unix_state_unlock(sk);
+ goto out_unlock;
+ }
+
+- wake_up_interruptible(&u->peer_wait);
++ wake_up_interruptible_sync(&u->peer_wait);
+
+ if (msg->msg_name)
+ unix_copy_addr(msg, skb->sk);
+
+ if (size > skb->len)
diff --git a/recipes/linux/linux_2.6.23.bb b/recipes/linux/linux_2.6.23.bb
index 754b795..1485444 100644
--- a/recipes/linux/linux_2.6.23.bb
+++ b/recipes/linux/linux_2.6.23.bb
@@ -18,8 +18,8 @@ SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 \
SRC_URI += "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-2.6.23.17.bz2;patch=1"
# Real-time preemption (includes CFS). This is experimental and requires a different defconfig.
#SRC_URI += "file://patch-2.6.23.12-rt14;patch=1"
-# Only the Completely Fair Scheduler (CFS), the official backport from 2.6.24
-SRC_URI += "http://people.redhat.com/mingo/cfs-scheduler/sched-cfs-v2.6.23.12-v24.1.patch;patch=1"
+# Only the Completely Fair Scheduler (CFS), the official backport from 2.6.24 (adapted for 2.6.23.17)
+SRC_URI += "file://sched-cfs-v2.6.23.12-v24.1.patch;patch=1"
# Add support for squashfs-lzma (a highly compressed read-only filesystem)
SRC_URI += "http://kamikaze.waninkoko.info/patches/2.6.23/klight1/broken-out/squashfs-lzma-2.6.23.patch;patch=1"
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] linux 2.6.23: added time.h.patch to make it build for ARM
2009-10-21 10:23 ` [PATCH] linux 2.6.23: keep sched-cfs locally updated to 2.6.23.17 Marcin Juszkiewicz
@ 2009-10-21 10:30 ` Marcin Juszkiewicz
2009-10-21 11:37 ` [PATCH] linux 2.6.23: dropped em-x270 copy of time.h.patch Marcin Juszkiewicz
0 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 10:30 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux-2.6.23/time.h.patch | 25 +++++++++++++++++++++++++
recipes/linux/linux_2.6.23.bb | 2 ++
2 files changed, 27 insertions(+), 0 deletions(-)
create mode 100644 recipes/linux/linux-2.6.23/time.h.patch
diff --git a/recipes/linux/linux-2.6.23/time.h.patch b/recipes/linux/linux-2.6.23/time.h.patch
new file mode 100644
index 0000000..fd22f3a
--- /dev/null
+++ b/recipes/linux/linux-2.6.23/time.h.patch
@@ -0,0 +1,25 @@
+....since some architectures don't support __udivdi3() (and
+ we don't want to use that, anyway).
+
+ Signed-off-by: Segher Boessenkool
+ ---
+ include/linux/time.h | 4 ++++
+ 1 files changed, 4 insertions(+), 0 deletions(-)
+
+
+Index: linux-2.6.24/include/linux/time.h
+===================================================================
+--- linux-2.6.24.orig/include/linux/time.h 2008-06-23 11:17:09.021841180 +0200
++++ linux-2.6.24/include/linux/time.h 2008-06-23 11:18:34.445167140 +0200
+@@ -173,6 +173,11 @@
+ {
+ ns += a->tv_nsec;
+ while(unlikely(ns >= NSEC_PER_SEC)) {
++ /* The following asm() prevents the compiler from
++ * optimising this loop into a modulo operation.
++ */
++ asm("" : "+r"(ns));
++
+ ns -= NSEC_PER_SEC;
+ a->tv_sec++;
+ }
diff --git a/recipes/linux/linux_2.6.23.bb b/recipes/linux/linux_2.6.23.bb
index 1485444..69cdb57 100644
--- a/recipes/linux/linux_2.6.23.bb
+++ b/recipes/linux/linux_2.6.23.bb
@@ -23,6 +23,8 @@ SRC_URI += "file://sched-cfs-v2.6.23.12-v24.1.patch;patch=1"
# Add support for squashfs-lzma (a highly compressed read-only filesystem)
SRC_URI += "http://kamikaze.waninkoko.info/patches/2.6.23/klight1/broken-out/squashfs-lzma-2.6.23.patch;patch=1"
+SRC_URI += "file://time.h.patch;patch=1"
+
# The Atmel patch doesn't apply against 2.6.23.12 :(
SRC_URI_avr32 = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2 \
file://defconfig \
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] linux 2.6.24: drop time.h.patch - it is in 2.6.24.7 update
2009-10-19 15:29 ` [PATCH 06/12] linux 2.6.24: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-21 10:32 ` Marcin Juszkiewicz
0 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 10:32 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux-2.6.24/time.h.patch | 25 -------------------------
recipes/linux/linux_2.6.24.bb | 1 -
2 files changed, 0 insertions(+), 26 deletions(-)
delete mode 100644 recipes/linux/linux-2.6.24/time.h.patch
diff --git a/recipes/linux/linux-2.6.24/time.h.patch b/recipes/linux/linux-2.6.24/time.h.patch
deleted file mode 100644
index fd22f3a..0000000
--- a/recipes/linux/linux-2.6.24/time.h.patch
+++ /dev/null
@@ -1,25 +0,0 @@
-....since some architectures don't support __udivdi3() (and
- we don't want to use that, anyway).
-
- Signed-off-by: Segher Boessenkool
- ---
- include/linux/time.h | 4 ++++
- 1 files changed, 4 insertions(+), 0 deletions(-)
-
-
-Index: linux-2.6.24/include/linux/time.h
-===================================================================
---- linux-2.6.24.orig/include/linux/time.h 2008-06-23 11:17:09.021841180 +0200
-+++ linux-2.6.24/include/linux/time.h 2008-06-23 11:18:34.445167140 +0200
-@@ -173,6 +173,11 @@
- {
- ns += a->tv_nsec;
- while(unlikely(ns >= NSEC_PER_SEC)) {
-+ /* The following asm() prevents the compiler from
-+ * optimising this loop into a modulo operation.
-+ */
-+ asm("" : "+r"(ns));
-+
- ns -= NSEC_PER_SEC;
- a->tv_sec++;
- }
diff --git a/recipes/linux/linux_2.6.24.bb b/recipes/linux/linux_2.6.24.bb
index 5a09516..b92428e 100644
--- a/recipes/linux/linux_2.6.24.bb
+++ b/recipes/linux/linux_2.6.24.bb
@@ -18,7 +18,6 @@ SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.24.tar.bz2 \
${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.7.bz2;patch=1 \
file://squashfs-lzma-2.6.24.patch;patch=1 \
file://ubifs-v2.6.24.patch;patch=1 \
- file://time.h.patch;patch=1 \
file://defconfig"
# Moved away temporarely until committed properly (work in progress).
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] linux 2.6.20: added patch to fix PATH_MAX related build error
2009-10-19 15:29 ` [PATCH 02/12] linux 2.6.20: " Marcin Juszkiewicz
@ 2009-10-21 11:12 ` Marcin Juszkiewicz
0 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 11:12 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
...ude-limits.h-in-sumversion.c-for-PATH_MAX.patch | 29 ++++++++++++++++++++
recipes/linux/linux_2.6.20.bb | 1 +
2 files changed, 30 insertions(+), 0 deletions(-)
create mode 100644 recipes/linux/linux-2.6.20/0001-kbuild-include-limits.h-in-sumversion.c-for-PATH_MAX.patch
diff --git a/recipes/linux/linux-2.6.20/0001-kbuild-include-limits.h-in-sumversion.c-for-PATH_MAX.patch b/recipes/linux/linux-2.6.20/0001-kbuild-include-limits.h-in-sumversion.c-for-PATH_MAX.patch
new file mode 100644
index 0000000..4871601
--- /dev/null
+++ b/recipes/linux/linux-2.6.20/0001-kbuild-include-limits.h-in-sumversion.c-for-PATH_MAX.patch
@@ -0,0 +1,29 @@
+From fc31c7716355a226b8ed4e16f4581e5c8fa53570 Mon Sep 17 00:00:00 2001
+From: Mike Frysinger <vapier@gentoo.org>
+Date: Thu, 17 May 2007 14:57:20 -0400
+Subject: [PATCH] kbuild: include limits.h in sumversion.c for PATH_MAX
+
+POSIX says limits.h defines PATH_MAX so we should include it (which fixes
+compiling on some systems like OS X).
+
+Signed-off-by: Mike Frysinger <vapier@gentoo.org>
+Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
+---
+ scripts/mod/sumversion.c | 1 +
+ 1 files changed, 1 insertions(+), 0 deletions(-)
+
+diff --git a/scripts/mod/sumversion.c b/scripts/mod/sumversion.c
+index 6873d5a..d9cc690 100644
+--- a/scripts/mod/sumversion.c
++++ b/scripts/mod/sumversion.c
+@@ -7,6 +7,7 @@
+ #include <ctype.h>
+ #include <errno.h>
+ #include <string.h>
++#include <limits.h>
+ #include "modpost.h"
+
+ /*
+--
+1.6.3.3
+
diff --git a/recipes/linux/linux_2.6.20.bb b/recipes/linux/linux_2.6.20.bb
index c701a3b..07c85bc 100644
--- a/recipes/linux/linux_2.6.20.bb
+++ b/recipes/linux/linux_2.6.20.bb
@@ -9,6 +9,7 @@ PR = "r9"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.21.bz2;patch=1 \
+ file://0001-kbuild-include-limits.h-in-sumversion.c-for-PATH_MAX.patch;patch=1 \
file://defconfig"
SRC_URI_append_n2100 = "\
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] linux 2.6.23: dropped em-x270 copy of time.h.patch
2009-10-21 10:30 ` [PATCH] linux 2.6.23: added time.h.patch to make it build for ARM Marcin Juszkiewicz
@ 2009-10-21 11:37 ` Marcin Juszkiewicz
0 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 11:37 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
.../em-x270/01-prevent_loop_timespec_add_ns.patch | 19 -------------------
recipes/linux/linux_2.6.23.bb | 3 +--
2 files changed, 1 insertions(+), 21 deletions(-)
delete mode 100644 recipes/linux/linux-2.6.23/em-x270/01-prevent_loop_timespec_add_ns.patch
diff --git a/recipes/linux/linux-2.6.23/em-x270/01-prevent_loop_timespec_add_ns.patch b/recipes/linux/linux-2.6.23/em-x270/01-prevent_loop_timespec_add_ns.patch
deleted file mode 100644
index a2f6e17..0000000
--- a/recipes/linux/linux-2.6.23/em-x270/01-prevent_loop_timespec_add_ns.patch
+++ /dev/null
@@ -1,19 +0,0 @@
----
- include/linux/time.h | 4 ++++
- 1 files changed, 4 insertions(+), 0 deletions(-)
-diff --git a/include/linux/time.h b/include/linux/time.h
-index 2091a19..d32ef0a 100644
---- a/include/linux/time.h
-+++ b/include/linux/time.h
-@@ -173,6 +173,10 @@ static inline void timespec_add_ns(struct timespec *a, u64 ns)
- {
- ns += a->tv_nsec;
- while(unlikely(ns >= NSEC_PER_SEC)) {
-+ /* The following asm() prevents the compiler from
-+ * optimising this loop into a modulo operation. */
-+ asm("" : "+r"(ns));
-+
- ns -= NSEC_PER_SEC;
- a->tv_sec++;
- }
-
diff --git a/recipes/linux/linux_2.6.23.bb b/recipes/linux/linux_2.6.23.bb
index 69cdb57..65f400d 100644
--- a/recipes/linux/linux_2.6.23.bb
+++ b/recipes/linux/linux_2.6.23.bb
@@ -31,8 +31,7 @@ SRC_URI_avr32 = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-2.6.23.tar.bz2
http://avr32linux.org/twiki/pub/Main/LinuxPatches/linux-2.6.23.atmel.3.patch.bz2;patch=1 \
"
SRC_URI_append_em-x270 = "\
- file://em-x270.patch;patch=1 \
- file://01-prevent_loop_timespec_add_ns.patch;patch=1"
+ file://em-x270.patch;patch=1 "
SRC_URI_append_cm-x270 = "\
file://0001-cm-x270-base2.patch;patch=1 \
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] linux 2.6.26: do not apply 2.6.26.5 for topas910
2009-10-19 15:29 ` [PATCH 08/12] linux 2.6.26: " Marcin Juszkiewicz
@ 2009-10-21 11:55 ` Marcin Juszkiewicz
0 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 11:55 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.26.bb | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/recipes/linux/linux_2.6.26.bb b/recipes/linux/linux_2.6.26.bb
index e1bb68c..775d2d8 100644
--- a/recipes/linux/linux_2.6.26.bb
+++ b/recipes/linux/linux_2.6.26.bb
@@ -37,8 +37,7 @@ SRC_URI_append_mpc8313e-rdb = "\
file://mpc8313e-rdb-eth-fixed.patch;patch=1 \
"
-SRC_URI_append_topas910 = " ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-2.6.26.5.bz2;patch=1 \
- http://www.bplan-gmbh.org/data/toshiba/topas/linux/2.6.26.5/patch_2.6.26.5_topas910.bz2;patch=1"
+SRC_URI_append_topas910 = "http://www.bplan-gmbh.org/data/toshiba/topas/linux/2.6.26.5/patch_2.6.26.5_topas910.bz2;patch=1"
# see http://bugzilla.kernel.org/show_bug.cgi?id=11143
do_stage_append() {
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 18:18 ` Marcin Juszkiewicz
@ 2009-10-21 12:20 ` Marcin Juszkiewicz
2009-10-26 20:35 ` Marcin Juszkiewicz
0 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 12:20 UTC (permalink / raw)
To: openembedded-devel
Dnia poniedziałek, 19 października 2009 o 20:18:28 Marcin Juszkiewicz napisał(a):
> And those reasons are why I did not pushed any of those this time (before I
> updated .30 and .31). Tomorrow will do some build testing.
I did a check for applying all patches for those machines (they alter SRC_URI):
2.6.18 - at91sam9263ek
2.6.19 - at91sam9263ek
2.6.20 - n2100, at91sam9263ek
2.6.21 - simpad, kb9202, at91sam9263ek, sarge-at91, gumstix-verdex (FAIL), gumstix-connex (FAIL)
2.6.22 - cm-x270, bd-neon
2.6.23 - em-x270, cm-x270, mpc8313e-rdb, mpc8323e-rdb
2.6.24 - simpad, gesbc-9302, mpc8313e-rdb, cm-x270, ts72xx (FAIL), oxnas (FAIL), hipox (FAIL)
2.6.25 - mpc8313e-rdb, cm-x270, at32stk1000 (FAIL on fetching), at91-l9260, m8050
2.6.26 - boc01, mpc8313e-rdb, topas910
2.6.27 - boc01, progear, simpad
2.6.28 - at91sam9263ek, ronetix-pm9263, stb225, collie, tosa, gamecube, tx27 FAIL
2.6.29 - boc01, canyonlands, micro2440 FAIL, tosa, ep93xx, cm-x270, stamp9g20evb
2.6.30 - mpc8315e-rdb, at91sam9263ek
2.6.31 - boc01, ep93xx
So this changeset breaks few devices at patch level:
gumstix-verdex/connex in 2.6.21, ts72xx/oxnas/hipox in 2.6.24,
tx27 in 2.6.28 and micro2440 in 2.6.29.
at32stk1000 patch was not available for fetching.
Regards,
--
JID: hrw@jabber.org
Website: http://marcin.juszkiewicz.com.pl/
LinkedIn: http://www.linkedin.com/in/marcinjuszkiewicz
^ permalink raw reply [flat|nested] 38+ messages in thread
* [PATCH] linux 2.6.29: removed two patches from micro2440 which are also in stable update
2009-10-19 15:29 ` [PATCH 11/12] linux 2.6.29: " Marcin Juszkiewicz
@ 2009-10-21 12:26 ` Marcin Juszkiewicz
0 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-21 12:26 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
...mp-fix-for-the-40-relocation-binutils-pro.patch | 49 ------------------
...2-GRO-Disable-GRO-on-legacy-netif_rx-path.patch | 54 --------------------
recipes/linux/linux_2.6.29.bb | 2 -
3 files changed, 0 insertions(+), 105 deletions(-)
delete mode 100644 recipes/linux/linux-2.6.29/micro2440/0005-920T-Temp-fix-for-the-40-relocation-binutils-pro.patch
delete mode 100644 recipes/linux/linux-2.6.29/micro2440/0012-GRO-Disable-GRO-on-legacy-netif_rx-path.patch
diff --git a/recipes/linux/linux-2.6.29/micro2440/0005-920T-Temp-fix-for-the-40-relocation-binutils-pro.patch b/recipes/linux/linux-2.6.29/micro2440/0005-920T-Temp-fix-for-the-40-relocation-binutils-pro.patch
deleted file mode 100644
index 6b8aaf4..0000000
--- a/recipes/linux/linux-2.6.29/micro2440/0005-920T-Temp-fix-for-the-40-relocation-binutils-pro.patch
+++ /dev/null
@@ -1,49 +0,0 @@
-From a4cba996cb77da4afc26c35402a70c3f008afe96 Mon Sep 17 00:00:00 2001
-From: Michel Pollet <buserror@gmail.com>
-Date: Sat, 14 Mar 2009 10:34:32 +0000
-Subject: [PATCH] 920T: Temp(?) fix for the 40 relocation binutils problem
-
-This prevents the modules failing to load when made
-with modern toolchains. There is no way to prevent binutils
-to generate these relocations, and on the 920t they are
-in fact not needed. So this patch just skip them.
-
-Signed-off-by: Michel Pollet <buserror@gmail.com>
----
- arch/arm/include/asm/elf.h | 1 +
- arch/arm/kernel/module.c | 7 +++++++
- 2 files changed, 8 insertions(+), 0 deletions(-)
-
-diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
-index a58378c..ce3b36e 100644
---- a/arch/arm/include/asm/elf.h
-+++ b/arch/arm/include/asm/elf.h
-@@ -50,6 +50,7 @@ typedef struct user_fp elf_fpregset_t;
- #define R_ARM_ABS32 2
- #define R_ARM_CALL 28
- #define R_ARM_JUMP24 29
-+#define R_ARM_V4BX 40
-
- /*
- * These are used to set parameters in the core dumps.
-diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
-index dab48f2..fa03392 100644
---- a/arch/arm/kernel/module.c
-+++ b/arch/arm/kernel/module.c
-@@ -132,6 +132,13 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
- *(u32 *)loc |= offset & 0x00ffffff;
- break;
-
-+#ifdef CONFIG_CPU_ARM920T
-+ /* modern toolchain generate V4BX for the modules, and there is no
-+ * way to skip them being generated in the .ko, so in our case, we just
-+ * can ignore them */
-+ case R_ARM_V4BX: /* Ignore these sections */
-+ break;
-+#endif
- default:
- printk(KERN_ERR "%s: unknown relocation: %u\n",
- module->name, ELF32_R_TYPE(rel->r_info));
---
-1.5.6.3
-
diff --git a/recipes/linux/linux-2.6.29/micro2440/0012-GRO-Disable-GRO-on-legacy-netif_rx-path.patch b/recipes/linux/linux-2.6.29/micro2440/0012-GRO-Disable-GRO-on-legacy-netif_rx-path.patch
deleted file mode 100644
index bfad6d8..0000000
--- a/recipes/linux/linux-2.6.29/micro2440/0012-GRO-Disable-GRO-on-legacy-netif_rx-path.patch
+++ /dev/null
@@ -1,54 +0,0 @@
-From 11c0b33d2a046a37bcd96528faa0e93359ef4a4b Mon Sep 17 00:00:00 2001
-From: Herbert Xu <herbert@gondor.apana.org.au>
-Date: Thu, 26 Mar 2009 00:59:10 -0700
-Subject: [PATCH] GRO: Disable GRO on legacy netif_rx path
-
-When I fixed the GRO crash in the legacy receive path I used
-napi_complete to replace __napi_complete. Unfortunately they're
-not the same when NETPOLL is enabled, which may result in us
-not calling __napi_complete at all.
-
-What's more, we really do need to keep the __napi_complete call
-within the IRQ-off section since in theory an IRQ can occur in
-between and fill up the backlog to the maximum, causing us to
-lock up.
-
-Since we can't seem to find a fix that works properly right now,
-this patch reverts all the GRO support from the netif_rx path.
-
-Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-Signed-off-by: David S. Miller <davem@davemloft.net>
-Signed-off-by: Michel Pollet <buserror@gmail.com>
----
- net/core/dev.c | 9 +++------
- 1 files changed, 3 insertions(+), 6 deletions(-)
-
-diff --git a/net/core/dev.c b/net/core/dev.c
-index e3fe5c7..e438f54 100644
---- a/net/core/dev.c
-+++ b/net/core/dev.c
-@@ -2588,18 +2588,15 @@ static int process_backlog(struct napi_struct *napi, int quota)
- local_irq_disable();
- skb = __skb_dequeue(&queue->input_pkt_queue);
- if (!skb) {
-+ __napi_complete(napi);
- local_irq_enable();
-- napi_complete(napi);
-- goto out;
-+ break;
- }
- local_irq_enable();
-
-- napi_gro_receive(napi, skb);
-+ netif_receive_skb(skb);
- } while (++work < quota && jiffies == start_time);
-
-- napi_gro_flush(napi);
--
--out:
- return work;
- }
-
---
-1.5.6.3
-
diff --git a/recipes/linux/linux_2.6.29.bb b/recipes/linux/linux_2.6.29.bb
index 6027968..240faf2 100644
--- a/recipes/linux/linux_2.6.29.bb
+++ b/recipes/linux/linux_2.6.29.bb
@@ -43,14 +43,12 @@ SRC_URI_append_micro2440 = " \
file://0002-S3C-Backported-openmoko-s-touchscreen-filters.patch;patch=1 \
file://0003-VENDOR-armworks-logo.patch;patch=1 \
file://0004-920T-Use-specific-920t-mtune.patch;patch=1 \
- file://0005-920T-Temp-fix-for-the-40-relocation-binutils-pro.patch;patch=1 \
file://0006-S3C-Allow-the-machine-code-to-get-the-BBT-table-fro.patch;patch=1 \
file://0007-MINI2440-Add-machine-support.patch;patch=1 \
file://0008-MINI2440-Delays-command-check-response-on-SD.patch;patch=1 \
file://0009-MINI2440-Rename-the-SoC-tty-names.patch;patch=1 \
file://0010-MINI2440-creates-a-mini2440_defconfig-file.patch;patch=1 \
file://0011-MINI2440-Add-touchscreen-support.patch;patch=1 \
- file://0012-GRO-Disable-GRO-on-legacy-netif_rx-path.patch;patch=1 \
"
SRC_URI_append_tosa = " \
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] linux: updated 2.6.27 to 2.6.27.38
2009-10-19 15:29 ` [PATCH 09/12] linux 2.6.27: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-22 10:01 ` Marcin Juszkiewicz
0 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-22 10:01 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.27.bb | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/recipes/linux/linux_2.6.27.bb b/recipes/linux/linux_2.6.27.bb
index 70b9028..4562a05 100644
--- a/recipes/linux/linux_2.6.27.bb
+++ b/recipes/linux/linux_2.6.27.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r9"
+PR = "r10"
# Mark archs/machines that this kernel supports
DEFAULT_PREFERENCE = "-1"
@@ -9,7 +9,7 @@ DEFAULT_PREFERENCE_progear = "1"
DEFAULT_PREFERENCE_simpad = "-1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
- ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.37.bz2;patch=1 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.38.bz2;patch=1 \
file://defconfig "
SRC_URI_append_boc01 = "\
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH] linux: 2.6.31 -> 2.6.31.5 update
2009-10-19 15:29 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Marcin Juszkiewicz
@ 2009-10-23 9:01 ` Marcin Juszkiewicz
2009-10-23 18:02 ` Michael Smith
2009-10-23 18:01 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Michael Smith
1 sibling, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-23 9:01 UTC (permalink / raw)
To: openembedded-devel
Signed-off-by: Marcin Juszkiewicz <marcin@juszkiewicz.com.pl>
---
recipes/linux/linux_2.6.31.bb | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/recipes/linux/linux_2.6.31.bb b/recipes/linux/linux_2.6.31.bb
index f30280f..cef4393 100644
--- a/recipes/linux/linux_2.6.31.bb
+++ b/recipes/linux/linux_2.6.31.bb
@@ -1,6 +1,6 @@
require linux.inc
-PR = "r3"
+PR = "r4"
S = "${WORKDIR}/linux-${PV}"
@@ -9,7 +9,7 @@ DEFAULT_PREFERENCE = "-1"
DEFAULT_PREFERENCE_db1200 = "1"
SRC_URI = "${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/linux-${PV}.tar.bz2 \
- ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.4.bz2;patch=1 \
+ ${KERNELORG_MIRROR}/pub/linux/kernel/v2.6/patch-${PV}.5.bz2;patch=1 \
http://maxim.org.za/AT91RM9200/2.6/2.6.31-at91.patch.gz;patch=1 \
file://defconfig"
--
1.6.5
^ permalink raw reply related [flat|nested] 38+ messages in thread
* Re: [PATCH 12/12] linux 2.6.31: update to latest stable patch
2009-10-19 15:29 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Marcin Juszkiewicz
2009-10-23 9:01 ` [PATCH] linux: 2.6.31 -> 2.6.31.5 update Marcin Juszkiewicz
@ 2009-10-23 18:01 ` Michael Smith
1 sibling, 0 replies; 38+ messages in thread
From: Michael Smith @ 2009-10-23 18:01 UTC (permalink / raw)
To: openembedded-devel
Marcin Juszkiewicz <marcin <at> juszkiewicz.com.pl> writes:
>
>
> Signed-off-by: Marcin Juszkiewicz <marcin <at> juszkiewicz.com.pl>
> ---
> recipes/linux/linux_2.6.31.bb | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
Acked-By: Michael Smith <msmith@cbnco.com>
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: [PATCH] linux: 2.6.31 -> 2.6.31.5 update
2009-10-23 9:01 ` [PATCH] linux: 2.6.31 -> 2.6.31.5 update Marcin Juszkiewicz
@ 2009-10-23 18:02 ` Michael Smith
0 siblings, 0 replies; 38+ messages in thread
From: Michael Smith @ 2009-10-23 18:02 UTC (permalink / raw)
To: openembedded-devel
Marcin Juszkiewicz <marcin <at> juszkiewicz.com.pl> writes:
> Signed-off-by: Marcin Juszkiewicz <marcin <at> juszkiewicz.com.pl>
> ---
> recipes/linux/linux_2.6.31.bb | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
Acked-By: Michael Smith <msmith@cbnco.com>
(but the checksum for the 2.6.31.5 patch is missing)
Mike
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-21 12:20 ` Marcin Juszkiewicz
@ 2009-10-26 20:35 ` Marcin Juszkiewicz
2009-10-26 20:52 ` Koen Kooi
0 siblings, 1 reply; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-10-26 20:35 UTC (permalink / raw)
To: openembedded-devel; +Cc: Crofton
Dnia środa, 21 października 2009 o 14:20:33 Marcin Juszkiewicz napisał(a
> I did a check for applying all patches for those machines (they alter
> SRC_URI):
>
> 2.6.18 - at91sam9263ek
> 2.6.19 - at91sam9263ek
> 2.6.20 - n2100, at91sam9263ek
> 2.6.21 - simpad, kb9202, at91sam9263ek, sarge-at91, gumstix-verdex (FAIL),
> gumstix-connex (FAIL)
> 2.6.22 - cm-x270, bd-neon
> 2.6.23 - em-x270, cm-x270, mpc8313e-rdb, mpc8323e-rdb
> 2.6.24 - simpad, gesbc-9302, mpc8313e-rdb, cm-x270, ts72xx (FAIL), oxnas
> (FAIL), hipox (FAIL)
> 2.6.25 - mpc8313e-rdb, cm-x270, at32stk1000 (FAIL on
> fetching), at91-l9260, m8050
> 2.6.26 - boc01, mpc8313e-rdb, topas910
> 2.6.27 - boc01, progear, simpad
> 2.6.28 - at91sam9263ek, ronetix-pm9263, stb225, collie, tosa, gamecube,
> tx27 FAIL
> 2.6.29 - boc01, canyonlands, micro2440 FAIL, tosa, ep93xx,
> cm-x270, stamp9g20evb
> 2.6.30 - mpc8315e-rdb, at91sam9263ek
> 2.6.31 - boc01, ep93xx
>
> So this changeset breaks few devices at patch level:
> gumstix-verdex/connex in 2.6.21, ts72xx/oxnas/hipox in 2.6.24,
> tx27 in 2.6.28 and micro2440 in 2.6.29.
>
> at32stk1000 patch was not available for fetching.
Did someone who maintain any of failed devices took a look at it?
Regards,
--
JID: hrw@jabber.org
Website: http://marcin.juszkiewicz.com.pl/
LinkedIn: http://www.linkedin.com/in/marcinjuszkiewicz
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-26 20:35 ` Marcin Juszkiewicz
@ 2009-10-26 20:52 ` Koen Kooi
2009-10-26 21:27 ` Philip Balister
0 siblings, 1 reply; 38+ messages in thread
From: Koen Kooi @ 2009-10-26 20:52 UTC (permalink / raw)
To: openembedded-devel
On 26-10-09 21:35, Marcin Juszkiewicz wrote:
> Dnia środa, 21 października 2009 o 14:20:33 Marcin Juszkiewicz napisał(a
>
>> I did a check for applying all patches for those machines (they alter
>> SRC_URI):
>>
>> 2.6.18 - at91sam9263ek
>> 2.6.19 - at91sam9263ek
>> 2.6.20 - n2100, at91sam9263ek
>> 2.6.21 - simpad, kb9202, at91sam9263ek, sarge-at91, gumstix-verdex (FAIL),
>> gumstix-connex (FAIL)
>> 2.6.22 - cm-x270, bd-neon
>> 2.6.23 - em-x270, cm-x270, mpc8313e-rdb, mpc8323e-rdb
>> 2.6.24 - simpad, gesbc-9302, mpc8313e-rdb, cm-x270, ts72xx (FAIL), oxnas
>> (FAIL), hipox (FAIL)
>> 2.6.25 - mpc8313e-rdb, cm-x270, at32stk1000 (FAIL on
>> fetching), at91-l9260, m8050
>> 2.6.26 - boc01, mpc8313e-rdb, topas910
>> 2.6.27 - boc01, progear, simpad
>> 2.6.28 - at91sam9263ek, ronetix-pm9263, stb225, collie, tosa, gamecube,
>> tx27 FAIL
>> 2.6.29 - boc01, canyonlands, micro2440 FAIL, tosa, ep93xx,
>> cm-x270, stamp9g20evb
>> 2.6.30 - mpc8315e-rdb, at91sam9263ek
>> 2.6.31 - boc01, ep93xx
>>
>> So this changeset breaks few devices at patch level:
>> gumstix-verdex/connex in 2.6.21, ts72xx/oxnas/hipox in 2.6.24,
>> tx27 in 2.6.28 and micro2440 in 2.6.29.
>>
>> at32stk1000 patch was not available for fetching.
>
> Did someone who maintain any of failed devices took a look at it?
Somewhere on my TODO is to update the pxa gumstix to 2.6.3x, but it's "a
few weeks away" down.
regards,
Koen
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-26 20:52 ` Koen Kooi
@ 2009-10-26 21:27 ` Philip Balister
2009-10-26 22:07 ` Koen Kooi
0 siblings, 1 reply; 38+ messages in thread
From: Philip Balister @ 2009-10-26 21:27 UTC (permalink / raw)
To: openembedded-devel
On 10/26/2009 04:52 PM, Koen Kooi wrote:
> On 26-10-09 21:35, Marcin Juszkiewicz wrote:
>> Dnia środa, 21 października 2009 o 14:20:33 Marcin Juszkiewicz napisał(a
>>
>>> I did a check for applying all patches for those machines (they alter
>>> SRC_URI):
>>>
>>> 2.6.18 - at91sam9263ek
>>> 2.6.19 - at91sam9263ek
>>> 2.6.20 - n2100, at91sam9263ek
>>> 2.6.21 - simpad, kb9202, at91sam9263ek, sarge-at91, gumstix-verdex
>>> (FAIL),
>>> gumstix-connex (FAIL)
>>> 2.6.22 - cm-x270, bd-neon
>>> 2.6.23 - em-x270, cm-x270, mpc8313e-rdb, mpc8323e-rdb
>>> 2.6.24 - simpad, gesbc-9302, mpc8313e-rdb, cm-x270, ts72xx (FAIL), oxnas
>>> (FAIL), hipox (FAIL)
>>> 2.6.25 - mpc8313e-rdb, cm-x270, at32stk1000 (FAIL on
>>> fetching), at91-l9260, m8050
>>> 2.6.26 - boc01, mpc8313e-rdb, topas910
>>> 2.6.27 - boc01, progear, simpad
>>> 2.6.28 - at91sam9263ek, ronetix-pm9263, stb225, collie, tosa, gamecube,
>>> tx27 FAIL
>>> 2.6.29 - boc01, canyonlands, micro2440 FAIL, tosa, ep93xx,
>>> cm-x270, stamp9g20evb
>>> 2.6.30 - mpc8315e-rdb, at91sam9263ek
>>> 2.6.31 - boc01, ep93xx
>>>
>>> So this changeset breaks few devices at patch level:
>>> gumstix-verdex/connex in 2.6.21, ts72xx/oxnas/hipox in 2.6.24,
>>> tx27 in 2.6.28 and micro2440 in 2.6.29.
>>>
>>> at32stk1000 patch was not available for fetching.
>>
>> Did someone who maintain any of failed devices took a look at it?
>
> Somewhere on my TODO is to update the pxa gumstix to 2.6.3x, but it's "a
> few weeks away" down.
There is also some traffic on the gmustix list about this happening.
Philip
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-26 21:27 ` Philip Balister
@ 2009-10-26 22:07 ` Koen Kooi
0 siblings, 0 replies; 38+ messages in thread
From: Koen Kooi @ 2009-10-26 22:07 UTC (permalink / raw)
To: openembedded-devel
On 26-10-09 22:27, Philip Balister wrote:
> On 10/26/2009 04:52 PM, Koen Kooi wrote:
>> On 26-10-09 21:35, Marcin Juszkiewicz wrote:
>>> Dnia środa, 21 października 2009 o 14:20:33 Marcin Juszkiewicz napisał(a
>>>
>>>> I did a check for applying all patches for those machines (they alter
>>>> SRC_URI):
>>>>
>>>> 2.6.18 - at91sam9263ek
>>>> 2.6.19 - at91sam9263ek
>>>> 2.6.20 - n2100, at91sam9263ek
>>>> 2.6.21 - simpad, kb9202, at91sam9263ek, sarge-at91, gumstix-verdex
>>>> (FAIL),
>>>> gumstix-connex (FAIL)
>>>> 2.6.22 - cm-x270, bd-neon
>>>> 2.6.23 - em-x270, cm-x270, mpc8313e-rdb, mpc8323e-rdb
>>>> 2.6.24 - simpad, gesbc-9302, mpc8313e-rdb, cm-x270, ts72xx (FAIL),
>>>> oxnas
>>>> (FAIL), hipox (FAIL)
>>>> 2.6.25 - mpc8313e-rdb, cm-x270, at32stk1000 (FAIL on
>>>> fetching), at91-l9260, m8050
>>>> 2.6.26 - boc01, mpc8313e-rdb, topas910
>>>> 2.6.27 - boc01, progear, simpad
>>>> 2.6.28 - at91sam9263ek, ronetix-pm9263, stb225, collie, tosa, gamecube,
>>>> tx27 FAIL
>>>> 2.6.29 - boc01, canyonlands, micro2440 FAIL, tosa, ep93xx,
>>>> cm-x270, stamp9g20evb
>>>> 2.6.30 - mpc8315e-rdb, at91sam9263ek
>>>> 2.6.31 - boc01, ep93xx
>>>>
>>>> So this changeset breaks few devices at patch level:
>>>> gumstix-verdex/connex in 2.6.21, ts72xx/oxnas/hipox in 2.6.24,
>>>> tx27 in 2.6.28 and micro2440 in 2.6.29.
>>>>
>>>> at32stk1000 patch was not available for fetching.
>>>
>>> Did someone who maintain any of failed devices took a look at it?
>>
>> Somewhere on my TODO is to update the pxa gumstix to 2.6.3x, but it's "a
>> few weeks away" down.
>
> There is also some traffic on the gmustix list about this happening.
Which is why it's on my todo :)
regards,
Koen
^ permalink raw reply [flat|nested] 38+ messages in thread
* Re: Updating kernels to latest stable versions
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
` (13 preceding siblings ...)
2009-10-19 17:52 ` Khem Raj
@ 2009-12-14 10:56 ` Marcin Juszkiewicz
14 siblings, 0 replies; 38+ messages in thread
From: Marcin Juszkiewicz @ 2009-12-14 10:56 UTC (permalink / raw)
To: openembedded-devel
Dnia poniedziałek, 19 października 2009 o 17:28:58 Marcin Juszkiewicz
napisał(a):
> Some time ago I asked about updating kernels to recent stable patch.
Today I pushed all my updates to .dev tree - please shout if it breaks your
device.
Regards,
--
JID: hrw@jabber.org
Website: http://marcin.juszkiewicz.com.pl/
LinkedIn: http://www.linkedin.com/in/marcinjuszkiewicz
^ permalink raw reply [flat|nested] 38+ messages in thread
end of thread, other threads:[~2009-12-14 10:58 UTC | newest]
Thread overview: 38+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-10-19 15:28 Updating kernels to latest stable versions Marcin Juszkiewicz
2009-10-19 15:28 ` [PATCH 01/12] linux 2.6.18: update to latest stable patch Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 02/12] linux 2.6.20: " Marcin Juszkiewicz
2009-10-21 11:12 ` [PATCH] linux 2.6.20: added patch to fix PATH_MAX related build error Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 03/12] linux 2.6.21: update to latest stable patch Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 04/12] linux 2.6.22: " Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 05/12] linux 2.6.23: " Marcin Juszkiewicz
2009-10-21 10:23 ` [PATCH] linux 2.6.23: keep sched-cfs locally updated to 2.6.23.17 Marcin Juszkiewicz
2009-10-21 10:30 ` [PATCH] linux 2.6.23: added time.h.patch to make it build for ARM Marcin Juszkiewicz
2009-10-21 11:37 ` [PATCH] linux 2.6.23: dropped em-x270 copy of time.h.patch Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 06/12] linux 2.6.24: update to latest stable patch Marcin Juszkiewicz
2009-10-21 10:32 ` [PATCH] linux 2.6.24: drop time.h.patch - it is in 2.6.24.7 update Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 07/12] linux 2.6.25: update to latest stable patch Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 08/12] linux 2.6.26: " Marcin Juszkiewicz
2009-10-21 11:55 ` [PATCH] linux 2.6.26: do not apply 2.6.26.5 for topas910 Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 09/12] linux 2.6.27: update to latest stable patch Marcin Juszkiewicz
2009-10-22 10:01 ` [PATCH] linux: updated 2.6.27 to 2.6.27.38 Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 10/12] linux 2.6.28: update to latest stable patch Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 11/12] linux 2.6.29: " Marcin Juszkiewicz
2009-10-21 12:26 ` [PATCH] linux 2.6.29: removed two patches from micro2440 which are also in stable update Marcin Juszkiewicz
2009-10-19 15:29 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Marcin Juszkiewicz
2009-10-23 9:01 ` [PATCH] linux: 2.6.31 -> 2.6.31.5 update Marcin Juszkiewicz
2009-10-23 18:02 ` Michael Smith
2009-10-23 18:01 ` [PATCH 12/12] linux 2.6.31: update to latest stable patch Michael Smith
2009-10-19 16:28 ` Updating kernels to latest stable versions Stanislav Brabec
2009-10-19 16:47 ` Marcin Juszkiewicz
2009-10-19 20:08 ` Phil Blundell
2009-10-20 12:45 ` Otavio Salvador
2009-10-19 16:49 ` Otavio Salvador
2009-10-19 17:52 ` Khem Raj
2009-10-19 18:08 ` Mike Westerhof
2009-10-19 18:18 ` Marcin Juszkiewicz
2009-10-21 12:20 ` Marcin Juszkiewicz
2009-10-26 20:35 ` Marcin Juszkiewicz
2009-10-26 20:52 ` Koen Kooi
2009-10-26 21:27 ` Philip Balister
2009-10-26 22:07 ` Koen Kooi
2009-12-14 10:56 ` Marcin Juszkiewicz
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.