Linux Documentation
* [RFC PATCH v2 2/6] uaccess: add untagged_addr definition for other arches
From: Andrey Konovalov @ 2018-03-27 16:57 UTC
  To: Catalin Marinas, Will Deacon, Jonathan Corbet, Mark Rutland,
	Robin Murphy, Al Viro, Andrey Konovalov, James Morse, Kees Cook,
	Bart Van Assche, Kate Stewart, Greg Kroah-Hartman,
	Thomas Gleixner, Philippe Ombredanne, Andrew Morton, Ingo Molnar,
	Kirill A . Shutemov, Dan Williams, Aneesh Kumar K . V, Zi Yan,
	linux-arm-kernel, linux-doc, linux-kernel, linux-mm
  Cc: Dmitry Vyukov, Kostya Serebryany, Evgeniy Stepanov, Lee Smith,
	Ramana Radhakrishnan, Jacob Bramley, Ruben Ayrapetyan
In-Reply-To: <cover.1522169685.git.andreyknvl@google.com>

To allow arm64 syscalls to accept tagged pointers from userspace, we must
untag them when they are passed to the kernel. Since untagging is done in
generic parts of the kernel (like the mm subsystem), the untagged_addr
macro should be defined for all architectures.

Define it as a no-op for all architectures other than arm64.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 include/linux/uaccess.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index efe79c1cdd47..c045b4eff95e 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -13,6 +13,10 @@
 
 #include <asm/uaccess.h>
 
+#ifndef untagged_addr
+#define untagged_addr(addr) addr
+#endif
+
 /*
  * Architectures should provide two primitives (raw_copy_{to,from}_user())
  * and get rid of their private instances of copy_{to,from}_user() and
-- 
2.17.0.rc0.231.g781580f067-goog
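
To illustrate the commit message above, here is a minimal userspace sketch,
not the kernel code. It assumes the tag lives in the top byte of a 64-bit
user address (as with arm64 top-byte-ignore). SKETCH_TAG_SHIFT,
sketch_untagged_addr_arm64() and sketch_set_tag() are hypothetical names, and
the fallback macro is parenthesized here, unlike the bare "addr" in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the tag occupies bits 63..56 of a user address. */
#define SKETCH_TAG_SHIFT 56

/* Hypothetical arch-specific definition: clear the tag byte. */
#define sketch_untagged_addr_arm64(addr) \
	((uint64_t)(addr) & ((UINT64_C(1) << SKETCH_TAG_SHIFT) - 1))

/* Generic fallback in the spirit of the patch: a no-op. */
#ifndef untagged_addr
#define untagged_addr(addr) (addr)
#endif

/* Helper for the sketch: place a tag in the top byte of an address. */
static inline uint64_t sketch_set_tag(uint64_t addr, uint8_t tag)
{
	uint64_t tagged = sketch_untagged_addr_arm64(addr) |
			  ((uint64_t)tag << SKETCH_TAG_SHIFT);

	/* Untagging must recover the original low 56 bits. */
	assert(sketch_untagged_addr_arm64(tagged) ==
	       sketch_untagged_addr_arm64(addr));
	return tagged;
}
```

With the fallback, untagged_addr(p) evaluates to p unchanged, so generic code
can call it unconditionally and only an architecture that overrides the macro
pays for the masking.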

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [RFC PATCH v2 5/6] lib, arm64: untag addrs passed to strncpy_from_user and strnlen_user
From: Andrey Konovalov @ 2018-03-27 16:57 UTC
  To: Catalin Marinas, Will Deacon, Jonathan Corbet, Mark Rutland,
	Robin Murphy, Al Viro, Andrey Konovalov, James Morse, Kees Cook,
	Bart Van Assche, Kate Stewart, Greg Kroah-Hartman,
	Thomas Gleixner, Philippe Ombredanne, Andrew Morton, Ingo Molnar,
	Kirill A . Shutemov, Dan Williams, Aneesh Kumar K . V, Zi Yan,
	linux-arm-kernel, linux-doc, linux-kernel, linux-mm
  Cc: Dmitry Vyukov, Kostya Serebryany, Evgeniy Stepanov, Lee Smith,
	Ramana Radhakrishnan, Jacob Bramley, Ruben Ayrapetyan
In-Reply-To: <cover.1522169685.git.andreyknvl@google.com>

strncpy_from_user and strnlen_user accept user addresses as arguments, and
do not go through the same path as copy_from_user and others, so here we
need to separately handle the case of tagged user addresses as well.

Untag user pointers passed to these functions.

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 lib/strncpy_from_user.c | 2 ++
 lib/strnlen_user.c      | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index b53e1b5d80f4..97467cd2bc59 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -106,6 +106,8 @@ long strncpy_from_user(char *dst, const char __user *src, long count)
 	if (unlikely(count <= 0))
 		return 0;
 
+	src = untagged_addr(src);
+
 	max_addr = user_addr_max();
 	src_addr = (unsigned long)src;
 	if (likely(src_addr < max_addr)) {
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 60d0bbda8f5e..8b5f56466e00 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -108,6 +108,8 @@ long strnlen_user(const char __user *str, long count)
 	if (unlikely(count <= 0))
 		return 0;
 
+	str = untagged_addr(str);
+
 	max_addr = user_addr_max();
 	src_addr = (unsigned long)str;
 	if (likely(src_addr < max_addr)) {
-- 
2.17.0.rc0.231.g781580f067-goog
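
The reason untagging has to happen before the "src_addr < max_addr" test in
the hunks above can be sketched in userspace C. This is an illustration under
stated assumptions, not the kernel implementation: a 48-bit user address
space and a tag in bits 63..56 are assumed, and every sketch_* name is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: 48-bit user address space, tag in bits 63..56. */
#define SKETCH_USER_ADDR_MAX (UINT64_C(1) << 48)
#define SKETCH_TAG_MASK      (UINT64_C(0xff) << 56)

/* Clear the tag byte of a user address. */
static inline uint64_t sketch_untag(uint64_t addr)
{
	uint64_t plain = addr & ~SKETCH_TAG_MASK;

	assert((plain & SKETCH_TAG_MASK) == 0);
	return plain;
}

/* Same shape as the "src_addr < max_addr" test, on the raw address. */
static inline bool sketch_in_range_raw(uint64_t addr)
{
	return addr < SKETCH_USER_ADDR_MAX;
}

/* The same test applied after untagging, as the patch arranges. */
static inline bool sketch_in_range_untagged(uint64_t addr)
{
	return sketch_untag(addr) < SKETCH_USER_ADDR_MAX;
}
```

A pointer carrying a non-zero tag compares greater than the limit when used
raw, so the range test would reject it even though the untagged address is a
valid user address.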


* Re: [PATCH] docs/memory-barriers.txt: Fix broken DMA vs MMIO ordering example
From: Paul E. McKenney @ 2018-03-27 15:02 UTC
  To: Will Deacon
  Cc: linux-kernel, linux-doc, okaya, Benjamin Herrenschmidt,
	Arnd Bergmann, Jason Gunthorpe, Peter Zijlstra, Ingo Molnar,
	Jonathan Corbet
In-Reply-To: <1522156287-15169-1-git-send-email-will.deacon@arm.com>

On Tue, Mar 27, 2018 at 02:11:27PM +0100, Will Deacon wrote:
> The section of memory-barriers.txt that describes the dma_Xmb() barriers
> has an incorrect example claiming that a wmb() is required after writing
> to coherent memory in order for those writes to be visible to a device
> before a subsequent MMIO access using writel() can reach the device.
> 
> In fact, this ordering guarantee is provided (at significant cost on some
> architectures such as arm and power) by writel, so the wmb() is not
> necessary. writel_relaxed exists for cases where this ordering is not
> required.
> 
> Fix the example and update the text to make this clearer.
> 
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Reported-by: Sinan Kaya <okaya@codeaurora.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>

Good catch, queued on my lkmm branch, thank you!

							Thanx, Paul

> ---
>  Documentation/memory-barriers.txt | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index a863009849a3..3247547d1c36 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -1909,9 +1909,6 @@ There are some more advanced barrier functions:
>  		/* assign ownership */
>  		desc->status = DEVICE_OWN;
> 
> -		/* force memory to sync before notifying device via MMIO */
> -		wmb();
> -
>  		/* notify device of new descriptors */
>  		writel(DESC_NOTIFY, doorbell);
>  	}
> @@ -1919,11 +1916,15 @@ There are some more advanced barrier functions:
>       The dma_rmb() allows us guarantee the device has released ownership
>       before we read the data from the descriptor, and the dma_wmb() allows
>       us to guarantee the data is written to the descriptor before the device
> -     can see it now has ownership.  The wmb() is needed to guarantee that the
> -     cache coherent memory writes have completed before attempting a write to
> -     the cache incoherent MMIO region.
> -
> -     See Documentation/DMA-API.txt for more information on consistent memory.
> +     can see it now has ownership.  Note that, when using writel(), a prior
> +     wmb() is not needed to guarantee that the cache coherent memory writes
> +     have completed before writing to the MMIO region.  The cheaper
> +     writel_relaxed() does not provide this guarantee and must not be used
> +     here.
> +
> +     See the subsection "Kernel I/O barrier effects" for more information on
> +     relaxed I/O accessors and the Documentation/DMA-API.txt file for more
> +     information on consistent memory.
> 
> 
>  MMIO WRITE BARRIER
> -- 
> 2.1.4
> 


* Re: [PATCH V2 3/9] dt-bindings: Tegra186 tachometer device tree bindings
From: Rob Herring @ 2018-03-27 15:00 UTC
  To: Rajkumar Rampelli
  Cc: mark.rutland, thierry.reding, jonathanh, jdelvare, linux, corbet,
	catalin.marinas, will.deacon, kstewart, gregkh, pombredanne,
	mmaddireddy, mperttunen, arnd, timur, andy.gross, xuwei5, elder,
	heiko, krzk, ard.biesheuvel, devicetree, linux-kernel, linux-pwm,
	linux-tegra, linux-hwmon, linux-doc, linux-arm-kernel, ldewangan
In-Reply-To: <20180327145249.xjoo42qow34ksdle@rob-hp-laptop>

On Tue, Mar 27, 2018 at 09:52:49AM -0500, Rob Herring wrote:
> On Wed, Mar 21, 2018 at 10:10:38AM +0530, Rajkumar Rampelli wrote:
> > Supply Device tree binding documentation for the NVIDIA
> > Tegra186 SoC's Tachometer Controller
> > 
> > Signed-off-by: Rajkumar Rampelli <rrajk@nvidia.com>
> > ---
> > 
> > V2: Renamed compatible string to "nvidia,tegra186-pwm-tachometer"
> >     Renamed dt property values of clock-names and reset-names to "tachometer"
> >     from "tach"
> 
> Read my prior comments on v1.

Also, I'm trying to make sense of who you Cc'ed on this. There's a ton
of folks I know that I'm pretty sure don't care about this series. Start
with get_maintainer.pl and add people you know need to see this series.

Rob

* Re: [PATCH V2 3/9] dt-bindings: Tegra186 tachometer device tree bindings
From: Rob Herring @ 2018-03-27 14:52 UTC
  To: Rajkumar Rampelli
  Cc: mark.rutland, thierry.reding, jonathanh, jdelvare, linux, corbet,
	catalin.marinas, will.deacon, kstewart, gregkh, pombredanne,
	mmaddireddy, mperttunen, arnd, timur, andy.gross, xuwei5, elder,
	heiko, krzk, ard.biesheuvel, devicetree, linux-kernel, linux-pwm,
	linux-tegra, linux-hwmon, linux-doc, linux-arm-kernel, ldewangan
In-Reply-To: <1521607244-29734-4-git-send-email-rrajk@nvidia.com>

On Wed, Mar 21, 2018 at 10:10:38AM +0530, Rajkumar Rampelli wrote:
> Supply Device tree binding documentation for the NVIDIA
> Tegra186 SoC's Tachometer Controller
> 
> Signed-off-by: Rajkumar Rampelli <rrajk@nvidia.com>
> ---
> 
> V2: Renamed compatible string to "nvidia,tegra186-pwm-tachometer"
>     Renamed dt property values of clock-names and reset-names to "tachometer"
>     from "tach"

Read my prior comments on v1.

Rob

* Re: [PATCH v4 4/8] dt-bindings: Add doc for the Ingenic TCU drivers
From: Rob Herring @ 2018-03-27 14:46 UTC
  To: Paul Cercueil
  Cc: Thomas Gleixner, Jason Cooper, Marc Zyngier, Lee Jones,
	Daniel Lezcano, Ralf Baechle, Jonathan Corbet, Mark Rutland,
	James Hogan, Maarten ter Huurne, linux-clk, devicetree,
	linux-kernel, linux-mips, linux-doc
In-Reply-To: <20180317232901.14129-5-paul@crapouillou.net>

On Sun, Mar 18, 2018 at 12:28:57AM +0100, Paul Cercueil wrote:
> Add documentation about how to properly use the Ingenic TCU
> (Timer/Counter Unit) drivers from devicetree.
> 
> Signed-off-by: Paul Cercueil <paul@crapouillou.net>
> ---
>  .../bindings/clock/ingenic,tcu-clocks.txt          | 42 ++++++++++++++++
>  .../bindings/interrupt-controller/ingenic,tcu.txt  | 39 +++++++++++++++
>  .../devicetree/bindings/mfd/ingenic,tcu.txt        | 56 ++++++++++++++++++++++
>  .../devicetree/bindings/timer/ingenic,tcu.txt      | 41 ++++++++++++++++
>  4 files changed, 178 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/clock/ingenic,tcu-clocks.txt
>  create mode 100644 Documentation/devicetree/bindings/interrupt-controller/ingenic,tcu.txt
>  create mode 100644 Documentation/devicetree/bindings/mfd/ingenic,tcu.txt
>  create mode 100644 Documentation/devicetree/bindings/timer/ingenic,tcu.txt
> 
>  v4: New patch in this series. Corresponds to V2 patches 3-4-5 with
>  added content.
> 
> diff --git a/Documentation/devicetree/bindings/clock/ingenic,tcu-clocks.txt b/Documentation/devicetree/bindings/clock/ingenic,tcu-clocks.txt
> new file mode 100644
> index 000000000000..471d27078599
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/clock/ingenic,tcu-clocks.txt
> @@ -0,0 +1,42 @@
> +Ingenic SoC TCU binding
> +
> +The TCU is the Timer/Counter Unit present in all Ingenic SoCs. It features 8
> +channels, each one having its own clock, that can be started and stopped,
> +reparented, and reclocked.
> +
> +Required properties:
> +- compatible : One of:
> +  * ingenic,jz4740-tcu-clocks,
> +  * ingenic,jz4770-tcu-clocks,
> +  * ingenic,jz4780-tcu-clocks.
> +- clocks : List of phandle & clock specifiers for clocks external to the TCU.
> +  The "pclk", "rtc" and "ext" clocks should be provided.
> +- clock-names : List of name strings for the external clocks.
> +- #clock-cells: Should be 1.
> +  Clock consumers specify this argument to identify a clock. The valid values
> +  may be found in <dt-bindings/clock/ingenic,tcu.h>.
> +
> +Example:

Let's just put one complete example in instead of all these duplicated 
and incomplete examples.

> +
> +/ {
> +	tcu: mfd@10002000 {
> +		compatible = "ingenic,tcu", "simple-mfd", "syscon";
> +		reg = <0x10002000 0x1000>;
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0x0 0x10002000 0x1000>;
> +
> +		tcu_clk: clocks@10 {
> +			compatible = "ingenic,jz4740-tcu-clocks";
> +			reg = <0x10 0xff0>;
> +
> +			clocks = <&ext>, <&rtc>, <&pclk>;
> +			clock-names = "ext", "rtc", "pclk";
> +
> +			#clock-cells = <1>;
> +		};
> +	};
> +};
> +
> +For information about the top-level "ingenic,tcu" compatible node and other
> +children nodes, see Documentation/devicetree/bindings/mfd/ingenic,tcu.txt.
> diff --git a/Documentation/devicetree/bindings/interrupt-controller/ingenic,tcu.txt b/Documentation/devicetree/bindings/interrupt-controller/ingenic,tcu.txt
> new file mode 100644
> index 000000000000..7f3af2da77cd
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/interrupt-controller/ingenic,tcu.txt
> @@ -0,0 +1,39 @@
> +Ingenic SoCs Timer/Counter Unit Interrupt Controller
> +
> +Required properties:
> +
> +- compatible : should be "ingenic,<socname>-tcu-intc". Valid strings are:
> +  * ingenic,jz4740-tcu-intc
> +  * ingenic,jz4770-tcu-intc
> +  * ingenic,jz4780-tcu-intc
> +- interrupt-controller : Identifies the node as an interrupt controller
> +- #interrupt-cells : Specifies the number of cells needed to encode an
> +  interrupt source. The value shall be 1.
> +- interrupt-parent : phandle of the interrupt controller.
> +- interrupts : Specifies the interrupt the controller is connected to.
> +
> +Example:
> +
> +/ {
> +	tcu: mfd@10002000 {
> +		compatible = "ingenic,tcu", "simple-mfd", "syscon";
> +		reg = <0x10002000 0x1000>;
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0x0 0x10002000 0x1000>;
> +
> +		tcu_irq: interrupt-controller@20 {
> +			compatible = "ingenic,jz4740-tcu-intc";
> +			reg = <0x20 0x20>;
> +
> +			interrupt-controller;
> +			#interrupt-cells = <1>;
> +
> +			interrupt-parent = <&intc>;
> +			interrupts = <15>;

The interrupt controller doesn't require any clocks?

> +		};
> +	};
> +};
> +
> +For information about the top-level "ingenic,tcu" compatible node and other
> +children nodes, see Documentation/devicetree/bindings/mfd/ingenic,tcu.txt.
> diff --git a/Documentation/devicetree/bindings/mfd/ingenic,tcu.txt b/Documentation/devicetree/bindings/mfd/ingenic,tcu.txt
> new file mode 100644
> index 000000000000..5742c3f21550
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mfd/ingenic,tcu.txt
> @@ -0,0 +1,56 @@
> +Ingenic JZ47xx SoCs Timer/Counter Unit devicetree bindings
> +----------------------------------------------------------
> +
> +For a description of the TCU hardware and drivers, have a look at
> +Documentation/mips/ingenic-tcu.txt.
> +
> +The TCU is implemented as a parent node, whose role is to create the
> +regmap, and child nodes for the various drivers listed in the aforementioned
> +document.
> +
> +Required properties:
> +
> +- compatible: must be "ingenic,tcu", "simple-mfd", "syscon";
> +- reg: Should be the offset/length value corresponding to the TCU registers
> +- #address-cells: Should be <1>;
> +- #size-cells: Should be <1>;
> +- ranges: Should be one range for the full TCU registers area
> +
> +Accepted children nodes:
> +- Documentation/devicetree/bindings/interrupt-controller/ingenic,tcu.txt
> +- Documentation/devicetree/bindings/clock/ingenic,tcu-clocks.txt
> +- Documentation/devicetree/bindings/timer/ingenic,tcu.txt
> +
> +
> +Example:
> +
> +/ {
> +	tcu: mfd@10002000 {
> +		compatible = "ingenic,tcu", "simple-mfd", "syscon";
> +		reg = <0x10002000 0x1000>;
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0x0 0x10002000 0x1000>;
> +
> +		tcu_irq: interrupt-controller@20 {
> +			compatible = "ingenic,jz4740-tcu-intc";
> +			reg = <0x20 0x20>;

I think you should drop this node and make the parent node the interrupt 
controller. That is the normal pattern where the parent node handles 
all the common functions. Otherwise, there is no need to have the parent 
node. You should then also drop simple-mfd as then you can control 
initialization order by initializing interrupt controller before 
its clients.

> +			...
> +		};
> +
> +		tcu_clk: clocks@10 {
> +			compatible = "ingenic,jz4740-tcu-clocks";
> +			reg = <0x10 0xff0>;
> +			...
> +		};
> +
> +		tcu_timer: timer@10 {
> +			compatible = "ingenic,jz4740-tcu";
> +			reg = <0x10 0xff0>;

Is this copy-n-paste, or do you really have 2 nodes at the same address?
The latter is not valid.

> +			...
> +		};
> +	};
> +};
> +
> +For more information about the children node, refer to the documents listed
> +above in the "Accepted children nodes" section.
> diff --git a/Documentation/devicetree/bindings/timer/ingenic,tcu.txt b/Documentation/devicetree/bindings/timer/ingenic,tcu.txt
> new file mode 100644
> index 000000000000..f910b7e96783
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/timer/ingenic,tcu.txt
> @@ -0,0 +1,41 @@
> +Ingenic JZ47xx SoCs Timer/Counter Unit driver
> +---------------------------------------------
> +
> +Required properties:
> +
> +- compatible : should be "ingenic,<socname>-tcu". Valid strings are:
> +  * ingenic,jz4740-tcu
> +  * ingenic,jz4770-tcu
> +  * ingenic,jz4780-tcu
> +- interrupt-parent : phandle of the TCU interrupt controller.
> +- interrupts : Specifies the interrupts the controller is connected to.
> +- clocks : List of phandle & clock specifiers for the TCU clocks.
> +- clock-names : List of name strings for the TCU clocks.
> +
> +Example:
> +
> +/ {
> +	tcu: mfd@10002000 {
> +		compatible = "ingenic,tcu", "simple-mfd", "syscon";
> +		reg = <0x10002000 0x1000>;
> +		#address-cells = <1>;
> +		#size-cells = <1>;
> +		ranges = <0x0 0x10002000 0x1000>;
> +
> +		tcu_timer: timer@10 {
> +			compatible = "ingenic,jz4740-tcu";
> +			reg = <0x10 0xff0>;
> +
> +			clocks = <&tcu_clk 0>, <&tcu_clk 1>, <&tcu_clk 2>, <&tcu_clk 3>,
> +					 <&tcu_clk 4>, <&tcu_clk 5>, <&tcu_clk 6>, <&tcu_clk 7>;
> +			clock-names = "timer0", "timer1", "timer2", "timer3",
> +						  "timer4", "timer5", "timer6", "timer7";
> +
> +			interrupt-parent = <&tcu_irq>;
> +			interrupts = <0 1 2 3 4 5 6 7>;

Thinking about this some more... You simply have 8 timers (and no other 
functions?) with some internal clock and irq controls for each timer. I 
don't think it really makes sense to create separate clock and irq 
drivers in that case. That would be like creating clock drivers for 
every clock divider in timers, pwms, uarts, etc. Unless the clocks get 
exposed to other parts of the system, then there is no point.

> +		};
> +	};
> +};
> +
> +For information about the top-level "ingenic,tcu" compatible node and other
> +children nodes, see Documentation/devicetree/bindings/mfd/ingenic,tcu.txt.
> -- 
> 2.11.0
> 

* Re: [PATCH v6 2/2] cpuset: Add cpuset.sched_load_balance to v2
From: Waiman Long @ 2018-03-27 14:23 UTC
  To: Tejun Heo
  Cc: Juri Lelli, Li Zefan, Johannes Weiner, Peter Zijlstra,
	Ingo Molnar, cgroups, linux-kernel, linux-doc, kernel-team, pjt,
	luto, efault, torvalds, Roman Gushchin
In-Reply-To: <20180327140259.GN1840639@devbig577.frc2.facebook.com>

On 03/27/2018 10:02 AM, Tejun Heo wrote:
> Hello,
>
> On Mon, Mar 26, 2018 at 04:28:49PM -0400, Waiman Long wrote:
>> Maybe we can have a different root level flag, say,
>> sched_partition_domain that is equivalent to !sched_load_balance.
>> However, I am still not sure if we should enforce that no task should be
>> in the root cgroup when the flag is set.
>>
>> Tejun and Peter, what are your thoughts on this?
> I haven't looked into the other issues too much but we for sure cannot
> empty the root cgroup.
>
> Thanks.
>
Now, I have a different idea. How about we add a special root-only knob,
say, "cpuset.cpus.isolated", that contains the list of CPUs that are
still owned by root but do not participate in load balancing? All the
tasks in the root are load-balanced among the remaining CPUs.

A child can then be created that holds some or all of the CPUs in the
isolated set. It will then have a separate root domain if load balancing
is on, or be an isolated cpuset if load balancing is off.

Will that idea work?

Cheers,
Longman
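
The CPU-set arithmetic behind this proposal can be sketched in C. This is one
reading of the idea, not kernel code: a 64-bit word stands in for a cpumask,
and sketch_cpumask, sketch_root_balanced() and sketch_child_ok() are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t sketch_cpumask;

/* CPUs over which root tasks would be load-balanced: root minus isolated. */
static inline sketch_cpumask sketch_root_balanced(sketch_cpumask root_cpus,
						  sketch_cpumask isolated)
{
	/* The isolated CPUs are still owned by root. */
	assert((isolated & ~root_cpus) == 0);
	return root_cpus & ~isolated;
}

/* A child may hold some or all of the isolated CPUs, but no others. */
static inline int sketch_child_ok(sketch_cpumask child_cpus,
				  sketch_cpumask isolated)
{
	return child_cpus != 0 && (child_cpus & ~isolated) == 0;
}
```

With eight CPUs in root and CPUs 6-7 isolated, root tasks would be balanced
over CPUs 0-5, and a child could claim CPU 6, CPU 7, or both.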



* Re: [PATCH v6 2/2] cpuset: Add cpuset.sched_load_balance to v2
From: Tejun Heo @ 2018-03-27 14:02 UTC
  To: Waiman Long
  Cc: Juri Lelli, Li Zefan, Johannes Weiner, Peter Zijlstra,
	Ingo Molnar, cgroups, linux-kernel, linux-doc, kernel-team, pjt,
	luto, efault, torvalds, Roman Gushchin
In-Reply-To: <bf79b45e-7716-65af-03ca-7112dc367371@redhat.com>

Hello,

On Mon, Mar 26, 2018 at 04:28:49PM -0400, Waiman Long wrote:
> Maybe we can have a different root level flag, say,
> sched_partition_domain that is equivalent to !sched_load_balance.
> However, I am still not sure if we should enforce that no task should be
> in the root cgroup when the flag is set.
> 
> Tejun and Peter, what are your thoughts on this?

I haven't looked into the other issues too much but we for sure cannot
empty the root cgroup.

Thanks.

-- 
tejun

* [PATCH] docs/memory-barriers.txt: Fix broken DMA vs MMIO ordering example
From: Will Deacon @ 2018-03-27 13:11 UTC
  To: linux-kernel, linux-doc
  Cc: okaya, Will Deacon, Benjamin Herrenschmidt, Arnd Bergmann,
	Jason Gunthorpe, Paul E. McKenney, Peter Zijlstra, Ingo Molnar,
	Jonathan Corbet

The section of memory-barriers.txt that describes the dma_Xmb() barriers
has an incorrect example claiming that a wmb() is required after writing
to coherent memory in order for those writes to be visible to a device
before a subsequent MMIO access using writel() can reach the device.

In fact, this ordering guarantee is provided (at significant cost on some
architectures such as arm and power) by writel, so the wmb() is not
necessary. writel_relaxed exists for cases where this ordering is not
required.

Fix the example and update the text to make this clearer.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reported-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 Documentation/memory-barriers.txt | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index a863009849a3..3247547d1c36 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1909,9 +1909,6 @@ There are some more advanced barrier functions:
 		/* assign ownership */
 		desc->status = DEVICE_OWN;
 
-		/* force memory to sync before notifying device via MMIO */
-		wmb();
-
 		/* notify device of new descriptors */
 		writel(DESC_NOTIFY, doorbell);
 	}
@@ -1919,11 +1916,15 @@ There are some more advanced barrier functions:
      The dma_rmb() allows us guarantee the device has released ownership
      before we read the data from the descriptor, and the dma_wmb() allows
      us to guarantee the data is written to the descriptor before the device
-     can see it now has ownership.  The wmb() is needed to guarantee that the
-     cache coherent memory writes have completed before attempting a write to
-     the cache incoherent MMIO region.
-
-     See Documentation/DMA-API.txt for more information on consistent memory.
+     can see it now has ownership.  Note that, when using writel(), a prior
+     wmb() is not needed to guarantee that the cache coherent memory writes
+     have completed before writing to the MMIO region.  The cheaper
+     writel_relaxed() does not provide this guarantee and must not be used
+     here.
+
+     See the subsection "Kernel I/O barrier effects" for more information on
+     relaxed I/O accessors and the Documentation/DMA-API.txt file for more
+     information on consistent memory.
 
 
 MMIO WRITE BARRIER
-- 
2.1.4
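
The write side of the corrected example can be modeled in userspace C to show
where the barriers sit. This is a sketch only: sketch_dma_wmb() and
sketch_writel() are hypothetical stand-ins built from C11 fences and a
volatile store, they model the ordering intent and are not equivalent to the
kernel primitives, and the constant values are arbitrary.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEVICE_OWN  1u	/* arbitrary value for the sketch */
#define DESC_NOTIFY 1u	/* arbitrary value for the sketch */

struct sketch_desc {
	uint32_t data;
	uint32_t status;
};

/* Stand-in for dma_wmb(): orders two writes to coherent memory. */
static inline void sketch_dma_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/*
 * Stand-in for writel(): modeled as a full fence followed by the store,
 * reflecting that prior memory writes are ordered before the MMIO write.
 */
static inline void sketch_writel(uint32_t val, volatile uint32_t *reg)
{
	atomic_thread_fence(memory_order_seq_cst);
	*reg = val;
}

static inline void sketch_post_desc(struct sketch_desc *desc, uint32_t data,
				    volatile uint32_t *doorbell)
{
	/* The caller must currently own the descriptor. */
	assert(desc->status != DEVICE_OWN);

	desc->data = data;

	/* Data must be visible before ownership is handed over. */
	sketch_dma_wmb();
	desc->status = DEVICE_OWN;

	/* No separate wmb() here: the writel() stand-in orders the store. */
	sketch_writel(DESC_NOTIFY, doorbell);
}
```

A relaxed doorbell write would correspond to dropping the fence inside
sketch_writel(), which is why the updated text says writel_relaxed() must not
be used at this point.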


* [PATCH] Documentation/process: update FUSE project website
From: Martin Kepplinger @ 2018-03-27 12:59 UTC
  To: corbet; +Cc: mchehab, linux-doc, linux-kernel, Martin Kepplinger

According to the old project site, https://sourceforge.net/projects/fuse/,
the project has moved to https://github.com/libfuse/, so update the
link to point to the latest libfuse release.

Signed-off-by: Martin Kepplinger <martink@posteo.de>
---
 Documentation/process/changes.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst
index 4f19a9725f76..ddc029734b25 100644
--- a/Documentation/process/changes.rst
+++ b/Documentation/process/changes.rst
@@ -430,7 +430,7 @@ udev
 FUSE
 ----
 
-- <http://sourceforge.net/projects/fuse>
+- <https://github.com/libfuse/libfuse/releases>
 
 mcelog
 ------
-- 
2.14.2


* [PATCH v13 1/3] mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS is enabled
From: Ram Pai @ 2018-03-27  9:09 UTC
  To: mpe, mingo, akpm
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram, corbet, arnd
In-Reply-To: <1522141768-25485-1-git-send-email-linuxram@us.ibm.com>

The VM_PKEY_BITx bits are defined only if
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS is enabled. powerpc also needs
these bits. Hence, let's define the VM_PKEY_BITx bits for any
architecture that enables CONFIG_ARCH_HAS_PKEYS.

cc: Michael Ellerman <mpe@ellerman.id.au>
cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h |    2 ++
 fs/proc/task_mmu.c               |    4 ++--
 include/linux/mm.h               |    9 +++++----
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 0d3c630..99344d7 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -26,6 +26,8 @@
 # define VM_PKEY_BIT2	VM_HIGH_ARCH_2
 # define VM_PKEY_BIT3	VM_HIGH_ARCH_3
 # define VM_PKEY_BIT4	VM_HIGH_ARCH_4
+#elif !defined(VM_PKEY_BIT4)
+# define VM_PKEY_BIT4	VM_HIGH_ARCH_4
 #endif
 
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ec6d298..6b996d0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -679,13 +679,13 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_MERGEABLE)]	= "mg",
 		[ilog2(VM_UFFD_MISSING)]= "um",
 		[ilog2(VM_UFFD_WP)]	= "uw",
-#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_ARCH_HAS_PKEYS
 		/* These come out via ProtectionKey: */
 		[ilog2(VM_PKEY_BIT0)]	= "",
 		[ilog2(VM_PKEY_BIT1)]	= "",
 		[ilog2(VM_PKEY_BIT2)]	= "",
 		[ilog2(VM_PKEY_BIT3)]	= "",
-#endif
+#endif /* CONFIG_ARCH_HAS_PKEYS */
 	};
 	size_t i;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ad06d42..ad207ad 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -228,15 +228,16 @@ extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
 #define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
 
-#if defined(CONFIG_X86)
-# define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
-#if defined (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)
+#ifdef CONFIG_ARCH_HAS_PKEYS
 # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
 # define VM_PKEY_BIT1	VM_HIGH_ARCH_1
 # define VM_PKEY_BIT2	VM_HIGH_ARCH_2
 # define VM_PKEY_BIT3	VM_HIGH_ARCH_3
-#endif
+#endif /* CONFIG_ARCH_HAS_PKEYS */
+
+#if defined(CONFIG_X86)
+# define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
 #elif defined(CONFIG_PPC)
 # define VM_SAO		VM_ARCH_1	/* Strong Access Ordering (powerpc) */
 #elif defined(CONFIG_PARISC)
-- 
1.7.1
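
How a 4-bit protection key is packed into the flag bits defined above can be
sketched in userspace C. This is an illustration, not the kernel code: it
assumes VM_HIGH_ARCH_BIT_0 is bit 32 of a 64-bit flags word, all SKETCH_* and
sketch_* names are hypothetical, and the fifth bit that the powerpc header in
the diff also defines is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: VM_HIGH_ARCH_BIT_0 is bit 32 of a 64-bit vm_flags word. */
#define SKETCH_VM_PKEY_SHIFT 32
#define SKETCH_VM_PKEY_BIT0  (UINT64_C(1) << (SKETCH_VM_PKEY_SHIFT + 0))
#define SKETCH_VM_PKEY_BIT1  (UINT64_C(1) << (SKETCH_VM_PKEY_SHIFT + 1))
#define SKETCH_VM_PKEY_BIT2  (UINT64_C(1) << (SKETCH_VM_PKEY_SHIFT + 2))
#define SKETCH_VM_PKEY_BIT3  (UINT64_C(1) << (SKETCH_VM_PKEY_SHIFT + 3))
#define SKETCH_VM_PKEY_MASK  (SKETCH_VM_PKEY_BIT0 | SKETCH_VM_PKEY_BIT1 | \
			      SKETCH_VM_PKEY_BIT2 | SKETCH_VM_PKEY_BIT3)

/* Extract the protection key stored in the flags word. */
static inline int sketch_vma_pkey(uint64_t vm_flags)
{
	return (int)((vm_flags & SKETCH_VM_PKEY_MASK) >> SKETCH_VM_PKEY_SHIFT);
}

/* Store a protection key, leaving all other flag bits untouched. */
static inline uint64_t sketch_set_pkey(uint64_t vm_flags, int pkey)
{
	assert(pkey >= 0 && pkey < 16);	/* a 4-bit value */
	return (vm_flags & ~SKETCH_VM_PKEY_MASK) |
	       ((uint64_t)pkey << SKETCH_VM_PKEY_SHIFT);
}
```

Because the key bits sit above the low flag bits, setting or reading a key
does not disturb the ordinary VM_* flags in the same word.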


* [PATCH v13 3/3] mm, x86, powerpc: display pkey in smaps only if arch supports pkeys
From: Ram Pai @ 2018-03-27  9:09 UTC
  To: mpe, mingo, akpm
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram, corbet, arnd
In-Reply-To: <1522141768-25485-1-git-send-email-linuxram@us.ibm.com>

Currently, the architecture-specific code is expected to
display the protection keys in smaps for a given vma.
This can lead to redundant code and possibly to divergent
formats in which the key gets displayed.

This patch changes the implementation. It displays the
pkey only if the architecture supports pkeys, i.e.
arch_pkeys_enabled() returns true. This patch also
provides the x86 implementation of arch_pkeys_enabled().

The x86 arch_show_smap() function is no longer needed,
so delete it.

cc: Michael Ellerman <mpe@ellerman.id.au>
cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
(fixed compilation errors for x86 configs)
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mmu_context.h |    5 -----
 arch/x86/include/asm/mmu_context.h     |    5 -----
 arch/x86/include/asm/pkeys.h           |    1 +
 arch/x86/kernel/fpu/xstate.c           |    5 +++++
 arch/x86/kernel/setup.c                |    8 --------
 fs/proc/task_mmu.c                     |   10 +++++-----
 include/linux/pkeys.h                  |    7 ++++++-
 7 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 051b3d6..566b3c2 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -203,11 +203,6 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 #define thread_pkey_regs_restore(new_thread, old_thread)
 #define thread_pkey_regs_init(thread)
 
-static inline int vma_pkey(struct vm_area_struct *vma)
-{
-	return 0;
-}
-
 static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
 {
 	return 0x0UL;
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 1de72ce..e597d09 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -295,11 +295,6 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 
 	return (vma->vm_flags & vma_pkey_mask) >> VM_PKEY_SHIFT;
 }
-#else
-static inline int vma_pkey(struct vm_area_struct *vma)
-{
-	return 0;
-}
 #endif
 
 /*
diff --git a/arch/x86/include/asm/pkeys.h b/arch/x86/include/asm/pkeys.h
index a0ba1ff..f6c287b 100644
--- a/arch/x86/include/asm/pkeys.h
+++ b/arch/x86/include/asm/pkeys.h
@@ -6,6 +6,7 @@
 
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val);
+extern bool arch_pkeys_enabled(void);
 
 /*
  * Try to dedicate one of the protection keys to be used as an
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 87a57b7..4f566e9 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -945,6 +945,11 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 
 	return 0;
 }
+
+bool arch_pkeys_enabled(void)
+{
+	return boot_cpu_has(X86_FEATURE_OSPKE);
+}
 #endif /* ! CONFIG_ARCH_HAS_PKEYS */
 
 /*
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 4c616be..117ed01 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1307,11 +1307,3 @@ static int __init register_kernel_offset_dumper(void)
 	return 0;
 }
 __initcall(register_kernel_offset_dumper);
-
-void arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
-{
-	if (!boot_cpu_has(X86_FEATURE_OSPKE))
-		return;
-
-	seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
-}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 6d83bb7..70aa912 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -18,10 +18,12 @@
 #include <linux/page_idle.h>
 #include <linux/shmem_fs.h>
 #include <linux/uaccess.h>
+#include <linux/pkeys.h>
 
 #include <asm/elf.h>
 #include <asm/tlb.h>
 #include <asm/tlbflush.h>
+#include <asm/mmu_context.h>
 #include "internal.h"
 
 void task_mem(struct seq_file *m, struct mm_struct *mm)
@@ -733,10 +735,6 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 }
 #endif /* HUGETLB_PAGE */
 
-void __weak arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
-{
-}
-
 static int show_smap(struct seq_file *m, void *v, int is_pid)
 {
 	struct proc_maps_private *priv = m->private;
@@ -856,9 +854,11 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
 			   (unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
 
 	if (!rollup_mode) {
-		arch_show_smap(m, vma);
+		if (arch_pkeys_enabled())
+			seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
 		show_smap_vma_flags(m, vma);
 	}
+
 	m_cache_vma(m, vma);
 	return ret;
 }
diff --git a/include/linux/pkeys.h b/include/linux/pkeys.h
index 0794ca7..49dff15 100644
--- a/include/linux/pkeys.h
+++ b/include/linux/pkeys.h
@@ -3,7 +3,6 @@
 #define _LINUX_PKEYS_H
 
 #include <linux/mm_types.h>
-#include <asm/mmu_context.h>
 
 #ifdef CONFIG_ARCH_HAS_PKEYS
 #include <asm/pkeys.h>
@@ -13,6 +12,7 @@
 #define arch_override_mprotect_pkey(vma, prot, pkey) (0)
 #define PKEY_DEDICATED_EXECUTE_ONLY 0
 #define ARCH_VM_PKEY_FLAGS 0
+#define vma_pkey(vma) 0
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
@@ -35,6 +35,11 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return 0;
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+	return false;
+}
+
 static inline void copy_init_pkru_to_fpregs(void)
 {
 }
-- 
1.7.1


^ permalink raw reply related

* [PATCH v13 2/3] mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
From: Ram Pai @ 2018-03-27  9:09 UTC (permalink / raw)
  To: mpe, mingo, akpm
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram, corbet, arnd
In-Reply-To: <1522141768-25485-1-git-send-email-linuxram@us.ibm.com>

Currently only 4 bits are allocated in the vma flags to hold 16
keys. This is sufficient for x86. PowerPC supports 32 keys,
which needs 5 bits. This patch allocates an additional bit.
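
As a rough illustration of why the fifth bit is needed, here is a
userspace sketch (not the kernel's code). PKEY_SHIFT, pkey_mask() and
flags_to_pkey() are names made up for this example, and placing the
field at bit 32 of a 64-bit flags word is an assumption:

```c
#include <stdint.h>

/* Assumption for illustration: the pkey field starts at bit 32. */
#define PKEY_SHIFT 32

/* Mask covering nbits consecutive flag bits starting at PKEY_SHIFT. */
static uint64_t pkey_mask(unsigned int nbits)
{
	return ((UINT64_C(1) << nbits) - 1) << PKEY_SHIFT;
}

/* Extract the key stored in flags, reading an nbits-wide field. */
static unsigned int flags_to_pkey(uint64_t flags, unsigned int nbits)
{
	return (unsigned int)((flags & pkey_mask(nbits)) >> PKEY_SHIFT);
}
```

With key 31 stored in the flags, a 4-bit field reads back 15 while a
5-bit field reads back 31, so 4 bits cannot represent all 32 keys.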

cc: Dave Hansen <dave.hansen@intel.com>
cc: Michael Ellerman <mpe@ellerman.id.au>
cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
cc: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 fs/proc/task_mmu.c |    1 +
 include/linux/mm.h |    3 ++-
 2 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 6b996d0..6d83bb7 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -685,6 +685,7 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
 		[ilog2(VM_PKEY_BIT1)]	= "",
 		[ilog2(VM_PKEY_BIT2)]	= "",
 		[ilog2(VM_PKEY_BIT3)]	= "",
+		[ilog2(VM_PKEY_BIT4)]	= "",
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 	};
 	size_t i;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ad207ad..d534f46 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -231,9 +231,10 @@ extern int overcommit_kbytes_handler(struct ctl_table *, int, void __user *,
 #ifdef CONFIG_ARCH_HAS_PKEYS
 # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
 # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
-# define VM_PKEY_BIT1	VM_HIGH_ARCH_1
+# define VM_PKEY_BIT1	VM_HIGH_ARCH_1	/* on x86 and 5-bit value on ppc64   */
 # define VM_PKEY_BIT2	VM_HIGH_ARCH_2
 # define VM_PKEY_BIT3	VM_HIGH_ARCH_3
+# define VM_PKEY_BIT4	VM_HIGH_ARCH_4
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 
 #if defined(CONFIG_X86)
-- 
1.7.1


^ permalink raw reply related

* [PATCH v13 0/3] mm, x86, powerpc: Enhancements to Memory Protection Keys.
From: Ram Pai @ 2018-03-27  9:09 UTC (permalink / raw)
  To: mpe, mingo, akpm
  Cc: linuxppc-dev, linux-mm, x86, linux-arch, linux-doc,
	linux-kselftest, linux-kernel, dave.hansen, benh, paulus,
	khandual, aneesh.kumar, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram, corbet, arnd

This patch series provides arch-neutral enhancements to
enable memory keys on new architectures, and the corresponding
changes in x86 and powerpc specific code to support that.

a) Provides the ability to support up to 32 keys. PowerPC
   can handle 32 keys and hence needs this.

b) Arch-neutral code, and not the arch-specific code,
   determines the format of the string that displays the key
   for each vma in smaps.
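
Since (b) fixes a single format for the smaps line ("ProtectionKey:"
followed by the key as an unsigned decimal), a userspace reader could
pick it up roughly as follows. This is only a sketch; the helper name
parse_pkey_line() is hypothetical and not part of this series:

```c
#include <stdio.h>

/*
 * Parse one line of smaps output. Returns 1 and stores the key in
 * *pkey if the line is a "ProtectionKey:" line, 0 otherwise.
 */
static int parse_pkey_line(const char *line, unsigned int *pkey)
{
	/* The space in the format matches any amount of padding. */
	return sscanf(line, "ProtectionKey: %u", pkey) == 1;
}
```

On architectures where arch_pkeys_enabled() returns false the line is
not printed at all, so a reader has to treat it as optional.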

History:
-------
version v13:
	(1) fixed a git bisect error. :(

version v12:
	(1) fixed compilation errors seen with various x86
	    configs.

version v11:
	(1) code that displays key in smaps is no longer
	    defined under CONFIG_ARCH_HAS_PKEYS.
	    - Comment by Eric W. Biederman and Michal Hocko
	(2) merged two patches that implemented (1).
	    - Comment by Michal Hocko

version prior to v11:
	(1) used one additional bit from VM_HIGH_ARCH_*
	    to support 32 keys.
	    - Suggestion by Dave Hansen.
	(2) powerpc specific changes to support memory keys.


Ram Pai (3):
  mm, powerpc, x86: define VM_PKEY_BITx bits if CONFIG_ARCH_HAS_PKEYS
    is enabled
  mm, powerpc, x86: introduce an additional vma bit for powerpc pkey
  mm, x86, powerpc: display pkey in smaps only if arch supports pkeys

 arch/powerpc/include/asm/mmu_context.h |    5 -----
 arch/powerpc/include/asm/pkeys.h       |    2 ++
 arch/x86/include/asm/mmu_context.h     |    5 -----
 arch/x86/include/asm/pkeys.h           |    1 +
 arch/x86/kernel/fpu/xstate.c           |    5 +++++
 arch/x86/kernel/setup.c                |    8 --------
 fs/proc/task_mmu.c                     |   15 ++++++++-------
 include/linux/mm.h                     |   12 +++++++-----
 include/linux/pkeys.h                  |    7 ++++++-
 9 files changed, 29 insertions(+), 31 deletions(-)


^ permalink raw reply

* Re: [PATCH v6 2/2] cpuset: Add cpuset.sched_load_balance to v2
From: Mike Galbraith @ 2018-03-27  6:54 UTC (permalink / raw)
  To: Waiman Long, Juri Lelli
  Cc: Tejun Heo, Li Zefan, Johannes Weiner, Peter Zijlstra, Ingo Molnar,
	cgroups, linux-kernel, linux-doc, kernel-team, pjt, luto,
	torvalds, Roman Gushchin
In-Reply-To: <bf79b45e-7716-65af-03ca-7112dc367371@redhat.com>

On Mon, 2018-03-26 at 16:28 -0400, Waiman Long wrote:
> 
> The sched_load_balance flag isn't something that is passed to the
> scheduler. It only affects the CPU topology of the system. So I
> suspect that a process in the root cgroup will be load balanced among
> the CPUs in one of the child cgroups.

Yes, among CPUs that remain part of topology (and intersect affinity).

> That doesn't look right unless
> we enforce that no process can be in the root cgroup in this case.

caveat: quite a few kthreads are nailed to the floor of root.

	-Mike



^ permalink raw reply

* Re: [PATCH v6 2/2] cpuset: Add cpuset.sched_load_balance to v2
From: Juri Lelli @ 2018-03-27  6:17 UTC (permalink / raw)
  To: Waiman Long
  Cc: Tejun Heo, Li Zefan, Johannes Weiner, Peter Zijlstra, Ingo Molnar,
	cgroups, linux-kernel, linux-doc, kernel-team, pjt, luto, efault,
	torvalds, Roman Gushchin
In-Reply-To: <bf79b45e-7716-65af-03ca-7112dc367371@redhat.com>

On 26/03/18 16:28, Waiman Long wrote:
> On 03/26/2018 08:47 AM, Juri Lelli wrote:
> > On 23/03/18 14:44, Waiman Long wrote:
> >> On 03/23/2018 03:59 AM, Juri Lelli wrote:
> > [...]
> >
> >>> OK, thanks for confirming. Can you tell again however why do you think
> >>> we need to remove sched_load_balance from root level? Won't we end up
> >>> having tasks put on isolated sets?
> >> The root cgroup is special in that it owns all the resources in the system.
> >> We generally don't want restrictions to be put on the root cgroup. A child
> >> cgroup has to be created to have constraints put on it. In fact, most of
> >> the controller files don't show up in the v2 cgroup root at all.
> >>
> >> An isolated cgroup has to be put under root, e.g.
> >>
> >>       Root
> >>      /    \
> >> isolated  balanced
> >>
> >>> Also, I guess children groups with more than one CPU will need to be
> >>> able to load balance across their CPUs, no matter what their parent
> >>> group does?
> >> The purpose of an isolated cpuset is to have a dedicated set of CPUs to
> >> be used by a certain application that makes its own scheduling decision
> >> by placing tasks explicitly on specific CPUs. It just doesn't make sense
> >> to have a CPU in an isolated cpuset participate in load balancing in
> >> another cpuset. If one wants load balancing in a child cpuset, the parent
> >> cpuset should have load balancing turned on as well.
> > Isolated with CPUs overlapping some other cpuset makes little sense, I
> > agree. What I have in mind however is an isolated set of CPUs that don't
> > overlap with any other cpuset (as your balanced set above). In this case
> > I think it makes sense to let the sys admin decide if "automatic" load
> > balancing has to be performed (by the scheduler) or no load balancing at
> > all has to take place?
> >
> > Further extending your example:
> >
> >              Root [0-3]
> > 	     /        \
> >         group1 [0-1] group2[2-3]
> >
> > Why should we prevent load balancing to be disabled at root level (so
> > that for example tasks still residing in root group are not freely
> > migrated around, potentially disturbing both sub-groups)?
> >
> > Then one can decide that group1 is a "userspace managed" group (no load
> > balancing takes place) and group2 is balanced by the scheduler.
> >
> > And this is not DEADLINE specific, IMHO.
> >
> >> As I look into the code, it seems like root domain is probably somewhat
> >> associated with cpu_exclusive only. Whether sched_load_balance is set
> >> doesn't really matter.  I will need to look further on the conditions
> >> where a new root domain is created.
> > I checked again myself (sched domains code is always a maze :) and I
> > believe that sched_load_balance flag indeed controls domains (sched and
> > root) creation and configuration. Changing the flag triggers a potential
> > rebuild and separate sched/root domains are generated if subgroups have
> > non overlapping cpumasks.  cpu_exclusive only enforces this latter
> > condition.
> 
> Right, I ran some tests and figured out that to have root_domain at the
> child cgroup level, we do need to disable load balancing at the root
> cgroup level and enable it in child cgroups that are mutually disjoint
> in their cpu lists. The cpu_exclusive flag isn't really needed.

It seems to make little sense at root level indeed. 

> I am not against doing that at the root cgroup, but it is kind of weird
> in terms of semantics. If we disable load balancing in the root cgroup,
> but enable it at child cgroups, what does that mean to the processes
> that are still in the root cgroup?

It might be up to the different scheduling classes I guess. See more on
this below.

> The sched_load_balance flag isn't something that is passed to the
> scheduler. It only affects the CPU topology of the system. So I
> suspect that a process in the root cgroup will be load balanced among
> the CPUs in one of the child cgroups. That doesn't look right unless
> we enforce that no process can be in the root cgroup in this case.
> 
> Real cpu isolation will then require that we disable load balancing at
> root, and enable load balancing in child cgroups that only contain CPUs
> outside of the isolated CPU list. Again, it is still possible that some
> tasks in the root cgroup, if present, may be using some of the isolated
> CPUs.

So, for DEADLINE this is currently a problem. We know that this is
broken (and Mathieu already proposed patches to fix it [1]). What we
want, I think, is to deny setting a task to DEADLINE if its current
affinity could overlap some exclusive set (root domain as per above), as
for example in your case if the task is residing in the root group.
Since DEADLINE bases load balancing on root domains, once those have
been correctly created, tasks shouldn't be able to escape. And if no
task can reside on the root level once sched_load_balance has been
disabled, it seems we won't have the problem you fear.

RT looks similar in this sense (load balancing using root domains info),
but no admission control is performed, so I guess we could fall into your
problematic situation.

FAIR uses a mix of sched domains and root domains information to perform
load balancing, so once tasks are divided among configured sets all
should work OK, but again there might still be some tasks left in the root
group. :/ I'm not sure what happens to those w.r.t. load balancing.

> Maybe we can have a different root level flag, say,
> sched_partition_domain that is equivalent to !sched_load_balance.
> However, I am still not sure if we should enforce that no task should be
> in the root cgroup when the flag is set.
> 
> Tejun and Peter, what are your thoughts on this?

Let's see what they think. :)

Thanks for the discussion!

Best,

- Juri

[1] https://marc.info/?l=linux-kernel&m=151855397701977&w=2

^ permalink raw reply

* Re: [PATCH v3 06/11] dt-bindings: i3c: Add macros to help fill I3C/I2C device's reg property
From: Rob Herring @ 2018-03-26 22:25 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Wolfram Sang, linux-i2c, Jonathan Corbet, linux-doc,
	Greg Kroah-Hartman, Arnd Bergmann, Przemyslaw Sroka,
	Arkadiusz Golec, Alan Douglas, Bartosz Folta, Damian Kos,
	Alicja Jurasik-Urbaniak, Cyprian Wronka, Suresh Punnoose,
	Rafal Ciepiela, Thomas Petazzoni, Nishanth Menon, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala, devicetree, linux-kernel,
	Vitor Soares, Geert Uytterhoeven, Linus Walleij, Xiang Lin,
	linux-gpio
In-Reply-To: <20180323110020.19080-7-boris.brezillon@bootlin.com>

On Fri, Mar 23, 2018 at 12:00:15PM +0100, Boris Brezillon wrote:
> The reg property of devices connected to an I3C bus has 3 cells, and
> filling them manually is not trivial. Provide macros to help with
> that.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
> ---
>  include/dt-bindings/i3c/i3c.h | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
>  create mode 100644 include/dt-bindings/i3c/i3c.h

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* Re: [PATCH v3 05/11] dt-bindings: i3c: Document core bindings
From: Rob Herring @ 2018-03-26 22:24 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Wolfram Sang, linux-i2c, Jonathan Corbet, linux-doc,
	Greg Kroah-Hartman, Arnd Bergmann, Przemyslaw Sroka,
	Arkadiusz Golec, Alan Douglas, Bartosz Folta, Damian Kos,
	Alicja Jurasik-Urbaniak, Cyprian Wronka, Suresh Punnoose,
	Rafal Ciepiela, Thomas Petazzoni, Nishanth Menon, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala, devicetree, linux-kernel,
	Vitor Soares, Geert Uytterhoeven, Linus Walleij, Xiang Lin,
	linux-gpio, Boris Brezillon
In-Reply-To: <20180323110020.19080-6-boris.brezillon@bootlin.com>

On Fri, Mar 23, 2018 at 12:00:14PM +0100, Boris Brezillon wrote:
> From: Boris Brezillon <boris.brezillon@free-electrons.com>
> 
> A new I3C subsystem has been added and a generic description has been
> created to represent the I3C bus and the devices connected on it.
> 
> Document this generic representation.

Mostly looks fine, a couple of clarifications below.

> 
> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
> ---
> Changes in v3:
> - Rename {i2c,i3c}-scl-frequency DT prop into {i2c,i3c}-scl-hz
> - Rework the way we expose the provisional ID and LVR information
> - Rename dynamic-address into assigned-address
> - Enforce the I3C master node name
> 
> Changes in v2:
> - Define how to describe I3C devices in the DT and when it should be
>   used. Note that the parsing of I3C devices is not yet implemented in
>   the framework. Will be added when someone really needs it.
> ---
>  Documentation/devicetree/bindings/i3c/i3c.txt | 140 ++++++++++++++++++++++++++
>  1 file changed, 140 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/i3c/i3c.txt
> 
> diff --git a/Documentation/devicetree/bindings/i3c/i3c.txt b/Documentation/devicetree/bindings/i3c/i3c.txt
> new file mode 100644
> index 000000000000..ed858228d26b
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/i3c/i3c.txt
> @@ -0,0 +1,140 @@
> +Generic device tree bindings for I3C busses
> +===========================================
> +
> +This document describes generic bindings that should be used to describe I3C
> +busses in a device tree.
> +
> +Required properties
> +-------------------
> +
> +- #address-cells  - should be <3>. Read more about addresses below.
> +- #size-cells     - should be <0>.
> +- compatible      - name of the I3C master controller driving the I3C bus
> +
> +For other required properties e.g. to describe register sets,
> +clocks, etc. check the binding documentation of the specific driver.
> +The node describing an I3C bus should be named i3c-master.
> +
> +Optional properties
> +-------------------
> +
> +These properties may not be supported by all I3C master drivers. Each I3C
> +master binding should specify which of them are supported.
> +
> +- i3c-scl-hz: frequency of the SCL signal used for I3C transfers.
> +	      When undefined the core sets it to 12.5MHz.
> +
> +- i2c-scl-hz: frequency of the SCL signal used for I2C transfers.
> +	      When undefined, the core looks at LVR (Legacy Virtual Register)
> +	      values of I2C devices described in the device tree to determine
> +	      the maximum I2C frequency.
> +
> +I2C devices
> +===========
> +
> +Each I2C device connected to the bus should be described in a subnode. All
> +properties described in Documentation/devicetree/bindings/i2c/i2c.txt are
> +valid here, but several new properties have been added.
> +
> +New constraint on existing properties:
> +--------------------------------------
> +- reg: contains 3 cells
> +  + first cell : still encoding the I2C address
> +
> +  + second cell: should have bit 31 set to 1 to signify that this is an I2C
> +		 device. Bits 0 to 7 encode the I3C LVR (Legacy Virtual
> +		 Register):
> +
> +	bit[7:5]: I2C device index. Possible values
> +	* 0: I2C device has a 50 ns spike filter
> +	* 1: I2C device does not have a 50 ns spike filter but supports high
> +	     frequency on SCL
> +	* 2: I2C device does not have a 50 ns spike filter and is not tolerant
> +	     to high frequencies
> +	* 3-7: reserved
> +
> +	bit[4]: tell whether the device operates in FM (Fast Mode) or FM+ mode
> +	* 0: FM+ mode
> +	* 1: FM mode
> +
> +	bit[3:0]: device type
> +	* 0-15: reserved
> +
> +  + third cell: should be 0
> +
> +I3C devices
> +===========
> +
> +All I3C devices are supposed to support DAA (Dynamic Address Assignment), and
> +are thus discoverable. So, by default, I3C devices do not have to be described
> +in the device tree.
> +This being said, one might want to attach extra resources to these devices,
> +and those resources may have to be described in the device tree, which in turn
> +means we have to describe I3C devices.
> +
> +Another use case for describing an I3C device in the device tree is when this
> +I3C device has a static address and we want to assign it a specific dynamic
> +address before the DAA takes place (so that other devices on the bus can't

The static address is the I2C address and the dynamic address is an I3C
address. That could be clearer throughout.

> +take this dynamic address).
> +
> +The I3C device should be named <device-type>@<static-address>,<i3c-pid>,

s/static-address/static-i2c-address/

> +where device-type is describing the type of device connected on the bus
> +(gpio-controller, sensor, ...).
> +
> +Required properties
> +-------------------
> +- reg: contains 3 cells
> +  + first cell : encodes the I2C address. Should be 0 if the device does not
> +		 have one (0 is not a valid I3C address).

Change here to "encodes the static I2C address". 

0 is not a valid I2C address?

> +
> +  + second and third cells: should encode the ProvisionalID. The second cell
> +			    contains the manufacturer ID left-shifted by 1.
> +			    The third cell contains ORing of the part ID
> +			    left-shifted by 16, the instance ID left-shifted
> +			    by 12 and the extra information. This encoding is
> +			    following the PID definition provided by the I3C
> +			    specification.
> +
> +Optional properties
> +-------------------
> +- assigned-address: dynamic address to be assigned to this device. This
> +		    property is only valid if the I3C device has a static
> +		    address (first cell of the reg property != 0).
> +
> +
> +Example:
> +
> +	i3c-master@d040000 {
> +		compatible = "cdns,i3c-master";
> +		clocks = <&coreclock>, <&i3csysclock>;
> +		clock-names = "pclk", "sysclk";
> +		interrupts = <3 0>;
> +		reg = <0x0d040000 0x1000>;
> +		#address-cells = <3>;
> +		#size-cells = <0>;
> +
> +		status = "okay";
> +		i2c-scl-hz = <100000>;
> +
> +		/* I2C device. */
> +		nunchuk: nunchuk@52 {
> +			compatible = "nintendo,nunchuk";
> +			reg = <0x52 0x80000010 0x0>;
> +		};
> +
> +		/* I3C device with a static address. */
> +		thermal_sensor: sensor@68,39200144004 {
> +			reg = <0x68 0x392 0x144004>;
> +			assigned-address = <0xa>;
> +		};
> +
> +		/*
> +		 * I3C device without a static address but requiring resources
> +		 * described in the DT.
> +		 */
> +		sensor@0,39200154004 {
> +			reg = <0x0 0x392 0x154004>;
> +			clocks = <&clock_provider 0>;
> +		};
> +	};
> +
> -- 
> 2.14.1
> 
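
For readers following the reg description in the quoted binding, the
cell encoding can be sketched in C as below. This is only an
illustration of the text above, not code from the series: all helper
names are hypothetical, and the unit-address layout (second and third
cells printed as one hex string) is inferred from the two sensor nodes
in the example rather than stated by the binding.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* LVR: bits [7:5] = I2C device index, bit 4 = 1 for FM, 0 for FM+. */
static uint8_t i2c_lvr(unsigned int index, int is_fm)
{
	return (uint8_t)(((index & 0x7) << 5) | (is_fm ? 0x10 : 0));
}

/* Second reg cell of an I2C device: bit 31 set, LVR in bits [7:0]. */
static uint32_t i2c_reg_cell1(uint8_t lvr)
{
	return UINT32_C(0x80000000) | lvr;
}

/* Second reg cell of an I3C device: manufacturer ID left-shifted by 1. */
static uint32_t i3c_pid_cell1(uint32_t manuf_id)
{
	return manuf_id << 1;
}

/* Third reg cell: part ID << 16 | instance ID << 12 | extra info. */
static uint32_t i3c_pid_cell2(uint32_t part_id, uint32_t inst_id,
			      uint32_t extra)
{
	return (part_id << 16) | (inst_id << 12) | extra;
}

/* Unit address as it appears in the example node names (inferred). */
static int i3c_unit_address(char *buf, size_t len, uint32_t addr,
			    uint32_t cell1, uint32_t cell2)
{
	return snprintf(buf, len, "%" PRIx32 ",%" PRIx32 "%08" PRIx32,
			addr, cell1, cell2);
}
```

With these helpers, an I2C device with index 0 in FM mode gives the
second cell 0x80000010 used for the nunchuk node, and manufacturer ID
0x1c9 with part ID 0x14, instance ID 0x4 and extra info 0x4 gives
<0x392 0x144004>, i.e. the node name sensor@68,39200144004 for static
address 0x68.
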

^ permalink raw reply

* Re: [PATCH v3 11/11] dt-bindings: gpio: Add bindings for Cadence I3C gpio expander
From: Rob Herring @ 2018-03-26 22:25 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Wolfram Sang, linux-i2c, Jonathan Corbet, linux-doc,
	Greg Kroah-Hartman, Arnd Bergmann, Przemyslaw Sroka,
	Arkadiusz Golec, Alan Douglas, Bartosz Folta, Damian Kos,
	Alicja Jurasik-Urbaniak, Cyprian Wronka, Suresh Punnoose,
	Rafal Ciepiela, Thomas Petazzoni, Nishanth Menon, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala, devicetree, linux-kernel,
	Vitor Soares, Geert Uytterhoeven, Linus Walleij, Xiang Lin,
	linux-gpio
In-Reply-To: <20180323110020.19080-12-boris.brezillon@bootlin.com>

On Fri, Mar 23, 2018 at 12:00:20PM +0100, Boris Brezillon wrote:
> Document the Cadence I3C gpio expander bindings.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
> ---
>  .../devicetree/bindings/gpio/gpio-cdns-i3c.txt     | 38 ++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/gpio/gpio-cdns-i3c.txt
> 
> diff --git a/Documentation/devicetree/bindings/gpio/gpio-cdns-i3c.txt b/Documentation/devicetree/bindings/gpio/gpio-cdns-i3c.txt
> new file mode 100644
> index 000000000000..634b1f268215
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/gpio/gpio-cdns-i3c.txt
> @@ -0,0 +1,38 @@
> +* Cadence I3C GPIO expander
> +
> +The Cadence I3C GPIO expander provides 8 GPIOs controllable over I3C.
> +These GPIOs can be configured in output or input mode and if they are in input
> +mode they can generate IBIs (In Band Interrupts).
> +
> +Required properties for GPIO node:
> +- reg : 3 cells encoding the I3C static address (none in our case) and the I3C
> +	Provisional ID. See Documentation/devicetree/bindings/i3c/i3c.txt for
> +	more details.
> +	Should be <0x0 0x392 0x0>.
> +- gpio-controller : Marks the device node as a gpio controller.
> +- #gpio-cells : Should be two. The first cell is the pin number and
> +  the second cell is used to specify the gpio polarity:
> +      0 = active high
> +      1 = active low
> +- interrupt-controller: Marks the device node as an interrupt controller.
> +- #interrupt-cells : Should be 2.  The first cell is the GPIO number.
> +  The second cell bits[3:0] is used to specify trigger type and level flags:
> +      1 = low-to-high edge triggered.
> +      2 = high-to-low edge triggered.
> +      3 = triggered on both edges.
> +      4 = active high level-sensitive.
> +      8 = active low level-sensitive.
> +
> +Example:
> +
> +	i3c-master@xxx {
> +		...
> +		i3c_gpio_expander: gpio@0,1c9,0 {

The unit address is wrong here.

> +			reg = <0 0x392 0x0>;
> +			gpio-controller;
> +			#gpio-cells = <2>;
> +			interrupt-controller;
> +			#interrupt-cells = <2>;
> +		};
> +		...
> +	};
> -- 
> 2.14.1
> 

^ permalink raw reply

* Re: [PATCH v6 2/2] cpuset: Add cpuset.sched_load_balance to v2
From: Waiman Long @ 2018-03-26 20:28 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Tejun Heo, Li Zefan, Johannes Weiner, Peter Zijlstra, Ingo Molnar,
	cgroups, linux-kernel, linux-doc, kernel-team, pjt, luto, efault,
	torvalds, Roman Gushchin
In-Reply-To: <20180326124711.GE5942@localhost.localdomain>

On 03/26/2018 08:47 AM, Juri Lelli wrote:
> On 23/03/18 14:44, Waiman Long wrote:
>> On 03/23/2018 03:59 AM, Juri Lelli wrote:
> [...]
>
>>> OK, thanks for confirming. Can you tell again however why do you think
>>> we need to remove sched_load_balance from root level? Won't we end up
>>> having tasks put on isolated sets?
>> The root cgroup is special in that it owns all the resources in the system.
>> We generally don't want restrictions to be put on the root cgroup. A child
>> cgroup has to be created to have constraints put on it. In fact, most of
>> the controller files don't show up in the v2 cgroup root at all.
>>
>> An isolated cgroup has to be put under root, e.g.
>>
>>       Root
>>      /    \
>> isolated  balanced
>>
>>> Also, I guess children groups with more than one CPU will need to be
>>> able to load balance across their CPUs, no matter what their parent
>>> group does?
>> The purpose of an isolated cpuset is to have a dedicated set of CPUs to
>> be used by a certain application that makes its own scheduling decision
>> by placing tasks explicitly on specific CPUs. It just doesn't make sense
>> to have a CPU in an isolated cpuset participate in load balancing in
>> another cpuset. If one wants load balancing in a child cpuset, the parent
>> cpuset should have load balancing turned on as well.
> Isolated with CPUs overlapping some other cpuset makes little sense, I
> agree. What I have in mind however is an isolated set of CPUs that don't
> overlap with any other cpuset (as your balanced set above). In this case
> I think it makes sense to let the sys admin decide if "automatic" load
> balancing has to be performed (by the scheduler) or no load balancing at
> all has to take place?
>
> Further extending your example:
>
>              Root [0-3]
> 	     /        \
>         group1 [0-1] group2[2-3]
>
> Why should we prevent load balancing to be disabled at root level (so
> that for example tasks still residing in root group are not freely
> migrated around, potentially disturbing both sub-groups)?
>
> Then one can decide that group1 is a "userspace managed" group (no load
> balancing takes place) and group2 is balanced by the scheduler.
>
> And this is not DEADLINE specific, IMHO.
>
>> As I look into the code, it seems like root domain is probably somewhat
>> associated with cpu_exclusive only. Whether sched_load_balance is set
>> doesn't really matter.  I will need to look further on the conditions
>> where a new root domain is created.
> I checked again myself (sched domains code is always a maze :) and I
> believe that sched_load_balance flag indeed controls domains (sched and
> root) creation and configuration. Changing the flag triggers a potential
> rebuild and separate sched/root domains are generated if subgroups have
> non overlapping cpumasks.  cpu_exclusive only enforces this latter
> condition.

Right, I ran some tests and figured out that to have a root_domain at
the child cgroup level, we do need to disable load balancing at the root
cgroup level and enable it in child cgroups that are mutually disjoint
in their cpu lists. The cpu_exclusive flag isn't really needed.
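
To make the condition above concrete, here is a toy user-space model
(not the kernel code; struct cpuset_model and build_partitions() are
hypothetical names). It assumes load balancing is already disabled at
the root and simply merges the cpumasks of load-balanced children into
disjoint partitions, roughly the way I understand the domain rebuild to
behave:

```c
/* Toy model: a cpuset is a CPU bitmask plus its sched_load_balance flag. */
struct cpuset_model {
	const char *name;
	unsigned long cpus;	/* bit n set => CPU n is in the cpuset */
	int load_balance;	/* models the sched_load_balance flag */
};

/*
 * Merge the masks of all load-balanced child cpusets into disjoint
 * partitions. Overlapping masks end up in the same partition. Returns
 * the number of partitions written to out[] (out must hold n entries).
 */
static int build_partitions(const struct cpuset_model *cs, int n,
			    unsigned long *out)
{
	int count = 0;

	for (int i = 0; i < n; i++) {
		unsigned long mask = cs[i].cpus;
		int j = 0;

		/* Children without load balancing get no partition. */
		if (!cs[i].load_balance || !mask)
			continue;

		/* Absorb every existing partition that overlaps this mask. */
		while (j < count) {
			if (out[j] & mask) {
				mask |= out[j];
				out[j] = out[--count];
			} else {
				j++;
			}
		}
		out[count++] = mask;
	}
	return count;
}
```

With group1 on CPUs 0-1 (mask 0x3) and group2 on CPUs 2-3 (mask 0xc),
both load balanced, the model yields two partitions. If the masks
overlap, they collapse into a single partition.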

I am not against doing that at the root cgroup, but it is kind of weird
in terms of semantics. If we disable load balancing in the root cgroup,
but enable it in child cgroups, what does that mean for the processes
that are still in the root cgroup?

The sched_load_balance flag isn't something that is passed to the
scheduler. It only affects the CPU topology of the system. So I
suspect that a process in the root cgroup will be load balanced among
the CPUs in one of the child cgroups. That doesn't look right unless
we enforce that no process can be in the root cgroup in this case.

Real cpu isolation will then require that we disable load balancing at
root, and enable load balancing in child cgroups that only contain CPUs
outside of the isolated CPU list. Again, it is still possible that some
tasks in the root cgroup, if present, may be using some of the isolated
CPUs.
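
As a rough illustration of that last point, again a toy model with
hypothetical helper names rather than kernel code: the isolated CPUs are
whatever the root owns that no load-balanced child covers, and a task
left in the root cgroup, which I assume here keeps the full root mask as
its allowed CPUs, can still overlap them.

```c
/* Toy model: CPU sets are bitmasks, bit n set => CPU n is in the set. */

/*
 * Return the CPUs of the root cpuset that are not covered by any
 * load-balanced child, i.e. the CPUs left isolated in this model.
 */
static unsigned long isolated_cpus(unsigned long root_cpus,
				   const unsigned long *balanced_children,
				   int n)
{
	unsigned long balanced = 0;

	for (int i = 0; i < n; i++)
		balanced |= balanced_children[i];

	return root_cpus & ~balanced;
}

/*
 * Return the isolated CPUs that a task is still allowed to run on.
 * A non-zero result means the task can disturb the isolated set.
 */
static unsigned long root_task_intrusion(unsigned long task_allowed,
					 unsigned long isolated)
{
	return task_allowed & isolated;
}
```

For a root of CPUs 0-3 with one balanced child on CPUs 0-1, CPUs 2-3
come out as isolated, and a root task allowed on all four CPUs still
intersects them.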

Maybe we can have a different root level flag, say,
sched_partition_domain that is equivalent to !sched_load_balance.
However, I am still not sure if we should enforce that no task should be
in the root cgroup when the flag is set.

Tejun and Peter, what are your thoughts on this?

Cheers,
Longman

 

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v12 04/22] selftests/vm: typecast the pkey register
From: Thiago Jung Bauermann @ 2018-03-26 19:38 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Ram Pai, shuahkh, linux-kselftest, mpe, linuxppc-dev, linux-mm,
	x86, linux-arch, linux-doc, linux-kernel, mingo, akpm, benh,
	paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	ebiederm, arnd
In-Reply-To: <00081300-e891-3381-3acd-e3312e54fb58@intel.com>


Dave Hansen <dave.hansen@intel.com> writes:

> On 02/21/2018 05:55 PM, Ram Pai wrote:
>> -static inline unsigned int _rdpkey_reg(int line)
>> +static inline pkey_reg_t _rdpkey_reg(int line)
>>  {
>> -	unsigned int pkey_reg = __rdpkey_reg();
>> +	pkey_reg_t pkey_reg = __rdpkey_reg();
>>
>> -	dprintf4("rdpkey_reg(line=%d) pkey_reg: %x shadow: %x\n",
>> +	dprintf4("rdpkey_reg(line=%d) pkey_reg: %016lx shadow: %016lx\n",
>>  			line, pkey_reg, shadow_pkey_reg);
>>  	assert(pkey_reg == shadow_pkey_reg);
>
> Hmm.  So we're using %lx for an int?  Doesn't the compiler complain
> about this?

It doesn't because dprintf4() doesn't have the annotation that tells the
compiler that it takes printf-like arguments. Once I add it:

--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -54,6 +54,10 @@
 #define DPRINT_IN_SIGNAL_BUF_SIZE 4096
 extern int dprint_in_signal;
 extern char dprint_in_signal_buffer[DPRINT_IN_SIGNAL_BUF_SIZE];
+
+#ifdef __GNUC__
+__attribute__((format(printf, 1, 2)))
+#endif
 static inline void sigsafe_printf(const char *format, ...)
 {
 	va_list ap;

Then it does complain about it. I'm working on a fix where each arch
will define a format string to use for its pkey_reg_t and use it like
this:

--- a/tools/testing/selftests/vm/pkey-helpers.h
+++ b/tools/testing/selftests/vm/pkey-helpers.h
@@ -19,6 +19,7 @@
 #define u32 uint32_t
 #define u64 uint64_t
 #define pkey_reg_t u32
+#define PKEY_REG_FMT "%016x"

 #ifdef __i386__
 #ifndef SYS_mprotect_key
@@ -112,7 +113,8 @@ static inline pkey_reg_t _read_pkey_reg(int line)
 {
 	pkey_reg_t pkey_reg = __read_pkey_reg();

-	dprintf4("read_pkey_reg(line=%d) pkey_reg: %016lx shadow: %016lx\n",
+	dprintf4("read_pkey_reg(line=%d) pkey_reg: "PKEY_REG_FMT
+			" shadow: "PKEY_REG_FMT"\n",
 			line, pkey_reg, shadow_pkey_reg);
 	assert(pkey_reg == shadow_pkey_reg);
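
For anyone who wants to try the two ideas outside the selftest, here is
a minimal standalone sketch. It is not the patch itself:
checked_snprintf() and format_pkey_reg() are hypothetical helpers, and
the typedef assumes a 32-bit pkey register where uint32_t is unsigned
int. The attribute lets the compiler check arguments against the format
string, and the per-type format macro keeps the two in sync.

```c
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 32-bit register; other arches would pick another type. */
typedef uint32_t pkey_reg_t;
/* Must match pkey_reg_t; assumes uint32_t is unsigned int. */
#define PKEY_REG_FMT "%016x"

/*
 * printf-like wrapper. The attribute says argument 3 is the format
 * string and the variadic arguments start at position 4, so GCC and
 * Clang can warn about mismatched conversions at the call site.
 */
#ifdef __GNUC__
__attribute__((format(printf, 3, 4)))
#endif
static int checked_snprintf(char *buf, size_t size, const char *format, ...)
{
	va_list ap;
	int len;

	va_start(ap, format);
	len = vsnprintf(buf, size, format, ap);
	va_end(ap);
	return len;
}

/* Format a register value using the per-type format macro. */
static int format_pkey_reg(char *buf, size_t size, int line, pkey_reg_t reg)
{
	return checked_snprintf(buf, size,
				"read_pkey_reg(line=%d) pkey_reg: " PKEY_REG_FMT,
				line, reg);
}
```

Changing PKEY_REG_FMT to "%016lx" while pkey_reg_t stays 32-bit should
then produce a -Wformat warning instead of silently printing garbage.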

--
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH] fix one dead link in ia64/xen.txt
From: 慕冬亮 @ 2018-03-26 18:03 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Jonathan Corbet, yamada.masahiro, Sergei Trofimovich, tony.luck,
	Bjørn Forsman, linux-doc, linux-kernel
In-Reply-To: <91363263-1ab8-5513-23a3-bbcea802d788@citrix.com>

On Tue, Mar 20, 2018 at 4:17 PM, Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
> On 20/03/18 19:56, Dongliang Mu wrote:
>> Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
>> ---
>>  Documentation/ia64/xen.txt | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/ia64/xen.txt b/Documentation/ia64/xen.txt
>> index a12c74ce2773..464d4c29b8b5 100644
>> --- a/Documentation/ia64/xen.txt
>> +++ b/Documentation/ia64/xen.txt
>> @@ -26,8 +26,8 @@ Getting and Building Xen and Dom0
>>      DomainU OS  : RHEL5
>>
>>   1. Download source
>> -    # hg clone http://xenbits.xensource.com/ext/ia64/xen-unstable.hg
>> -    # cd xen-unstable.hg
>> +    # hg clone http://xenbits.xensource.com/ext/ia64/xen-unstable
>> +    # cd xen-unstable
>>      # hg clone http://xenbits.xensource.com/ext/ia64/linux-2.6.18-xen.hg
>>
>>   2. # make world
>
> The last commit in that repository is almost 9 years old, and IA64
> support was dropped from Xen mainline 6 years ago.
>
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=570c311ca2c7a1131570cdfc77e977bc7a9bf4c0
>
> There are a number of other dead links in this doc, and those which
> aren't dead refer to Linux 2.6.x.  I'd just remove the entire file,
> rather than pretend that any of this still might work.  (If by some
> miracle it does still function, it's 10 years behind on security fixes...)
>

If that is the situation, I agree that this file should be deleted.

> ~Andrew

^ permalink raw reply

* Re: [PATCH v2] Documentation/CodingStyle: Add an example for braces
From: Gary R Hook @ 2018-03-26 16:49 UTC (permalink / raw)
  To: Jonathan Corbet; +Cc: Jani Nikula, linux-doc, linux-kernel
In-Reply-To: <20180326103249.42789cdb@lwn.net>

On 03/26/2018 11:32 AM, Jonathan Corbet wrote:
> On Mon, 26 Mar 2018 11:28:03 -0500
> Gary R Hook <gary.hook@amd.com> wrote:
> 
>> Submitting a v3 because the example could better illuminate the options
>> by using a loop construct inside of an if, addressing Jani's point but
>> without opening the door to later criticism.
>>
>> I also like the verbiage in v2/3 better, but I'll let Jonathan make the call.
> 
> As I told you, I was applying the first version; I did that last week.

Forgive me; I was out of the office. I've seen maintainers comment but not
necessarily execute immediately, and therefore I try to learn how each
works, but here I made an assumption. No worries and sorry to bother.

> 
>> BTW which tree should these be developed against? I used torvalds, but
>> I'm not entirely sure that was the proper one?
> 
> The MAINTAINERS file will (almost) always answer that question for
> you:	
> 
> 	T:	git git://git.lwn.net/linux.git docs-next

Good point. I should know better by now.

Again, thank you.

Gary

^ permalink raw reply

* Re: [PATCH v2] Documentation/CodingStyle: Add an example for braces
From: Jonathan Corbet @ 2018-03-26 16:32 UTC (permalink / raw)
  To: Gary R Hook; +Cc: Jani Nikula, linux-doc, linux-kernel
In-Reply-To: <724fc548-0e40-11a3-2a9b-3dd0db0de880@amd.com>

On Mon, 26 Mar 2018 11:28:03 -0500
Gary R Hook <gary.hook@amd.com> wrote:

> Submitting a v3 because the example could better illuminate the options
> by using a loop construct inside of an if, addressing Jani's point but
> without opening the door to later criticism.
>
> I also like the verbiage in v2/3 better, but I'll let Jonathan make the call.

As I told you, I was applying the first version; I did that last week.

> BTW which tree should these be developed against? I used torvalds, but 
> I'm not entirely sure that was the proper one?

The MAINTAINERS file will (almost) always answer that question for
you:	

	T:	git git://git.lwn.net/linux.git docs-next

For a patch like this it doesn't matter, since there is no other work on
the file to conflict with.

Thanks,

jon

^ permalink raw reply

* Re: [PATCH v2] Documentation/CodingStyle: Add an example for braces
From: Gary R Hook @ 2018-03-26 16:28 UTC (permalink / raw)
  To: Jani Nikula, Jonathan Corbet; +Cc: linux-doc, linux-kernel
In-Reply-To: <87woy4iide.fsf@intel.com>

On 03/22/2018 04:12 AM, Jani Nikula wrote:
> On Wed, 21 Mar 2018, Jonathan Corbet <corbet@lwn.net> wrote:
>> To head that off, I think I'll apply your first version instead, sorry
>> Jani.
> 
> No worries.
> 

Submitting a v3 because the example could better illuminate the options
by using a loop construct inside of an if, addressing Jani's point but
without opening the door to later criticism.

I also like the verbiage in v2/3 better, but I'll let Jonathan make the call.

BTW which tree should these be developed against? I used torvalds, but 
I'm not entirely sure that was the proper one?

Gary


^ permalink raw reply
