LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC 3/5] powerpc/mpic: Add support for non-contiguous irq ranges
From: Scott Wood @ 2018-08-08  5:50 UTC (permalink / raw)
  To: Bharat Bhushan, Rob Herring
  Cc: benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au,
	galak@kernel.crashing.org, mark.rutland@arm.com,
	kstewart@linuxfoundation.org, gregkh@linuxfoundation.org,
	devicetree@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, keescook@chromium.org,
	tyreld@linux.vnet.ibm.com, joe@perches.com
In-Reply-To: <AM5PR0401MB2545AF363F5DD258205177FC9A260@AM5PR0401MB2545.eurprd04.prod.outlook.com>

On Wed, 2018-08-08 at 03:37 +0000, Bharat Bhushan wrote:
> > -----Original Message-----
> > From: Scott Wood [mailto:oss@buserror.net]
> > Sent: Wednesday, August 8, 2018 2:34 AM
> > To: Rob Herring <robh@kernel.org>; Bharat Bhushan
> > <bharat.bhushan@nxp.com>
> > Cc: benh@kernel.crashing.org; paulus@samba.org; mpe@ellerman.id.au;
> > galak@kernel.crashing.org; mark.rutland@arm.com;
> > kstewart@linuxfoundation.org; gregkh@linuxfoundation.org;
> > devicetree@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > kernel@vger.kernel.org; keescook@chromium.org;
> > tyreld@linux.vnet.ibm.com; joe@perches.com
> > Subject: Re: [RFC 3/5] powerpc/mpic: Add support for non-contiguous irq
> > ranges
> > 
> > On Tue, 2018-08-07 at 12:09 -0600, Rob Herring wrote:
> > > On Fri, Jul 27, 2018 at 03:17:59PM +0530, Bharat Bhushan wrote:
> > > > Freescale MPIC h/w may not support all interrupt sources reported by
> > > > hardware, "last-interrupt-source" or platform. On these platforms a
> > > > misconfigured device tree that assigns one of the reserved
> > > > interrupts leaves a non-functioning system without warning.
> > > 
> > > There are lots of ways to misconfigure DTs. I don't think this is
> > > special and needs a property.
> > 
> > Yeah, the system will be just as non-functioning if you specify a valid-
> > but-
> > wrong-for-the-device interrupt number.
> 
> Some is one additional benefits of this changes, MPIC have reserved regions
> for un-supported interrupts and read/writes to these reserved regions seams
> have no effect.
> MPIC driver reads/writes to the reserved regions during init/uninit and
> save/restore state.
> 
> Let me know if it make sense to have these changes for mentioned reasons.

The driver has been doing this forever with no ill effect.  What is the
motivation for this change?

-Scott

^ permalink raw reply

* Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020
From: Scott Wood @ 2018-08-08  5:55 UTC (permalink / raw)
  To: Bharat Bhushan, benh@kernel.crashing.org, paulus@samba.org,
	mpe@ellerman.id.au, galak@kernel.crashing.org,
	mark.rutland@arm.com, kstewart@linuxfoundation.org,
	gregkh@linuxfoundation.org, devicetree@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
  Cc: robh@kernel.org, keescook@chromium.org, tyreld@linux.vnet.ibm.com,
	joe@perches.com
In-Reply-To: <AM5PR0401MB254592DC0E917FD292B323BA9A260@AM5PR0401MB2545.eurprd04.prod.outlook.com>

On Wed, 2018-08-08 at 03:44 +0000, Bharat Bhushan wrote:
> > -----Original Message-----
> > From: Scott Wood [mailto:oss@buserror.net]
> > Sent: Wednesday, August 8, 2018 2:44 AM
> > To: Bharat Bhushan <bharat.bhushan@nxp.com>;
> > benh@kernel.crashing.org; paulus@samba.org; mpe@ellerman.id.au;
> > galak@kernel.crashing.org; mark.rutland@arm.com;
> > kstewart@linuxfoundation.org; gregkh@linuxfoundation.org;
> > devicetree@vger.kernel.org; linuxppc-dev@lists.ozlabs.org; linux-
> > kernel@vger.kernel.org
> > Cc: robh@kernel.org; keescook@chromium.org; tyreld@linux.vnet.ibm.com;
> > joe@perches.com
> > Subject: Re: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020
> > 
> > On Fri, 2018-07-27 at 15:18 +0530, Bharat Bhushan wrote:
> > > MPIC on NXP (Freescale) P2020 supports following irq
> > > ranges:
> > >   > 0 - 11      (External interrupt)
> > >   > 16 - 79     (Internal interrupt)
> > >   > 176 - 183   (Messaging interrupt)
> > >   > 224 - 231   (Shared message signaled interrupt)
> > 
> > Why don't you convert to the 4-cell interrupt specifiers that make dealing
> > with these ranges less error-prone?
> 
> Ok , will do if we agree to have this series as per comment on other patch.

If you're concerned with errors, this would be a good things to do regardless.
 Actually, it seems that p2020si-post.dtsi already uses 4-cell interrupts.

What is motivating this patchset?  Is there something wrong in the existing
dts files?


> 
> > 
> > > diff --git a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > index 1006950..49ff348 100644
> > > --- a/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > +++ b/arch/powerpc/platforms/85xx/mpc85xx_rdb.c
> > > @@ -57,6 +57,11 @@ void __init mpc85xx_rdb_pic_init(void)
> > >  			MPIC_BIG_ENDIAN |
> > >  			MPIC_SINGLE_DEST_CPU,
> > >  			0, 256, " OpenPIC  ");
> > > +	} else if (of_machine_is_compatible("fsl,P2020RDB-PC")) {
> > > +		mpic = mpic_alloc(NULL, 0,
> > > +		  MPIC_BIG_ENDIAN |
> > > +		  MPIC_SINGLE_DEST_CPU,
> > > +		  0, 0, " OpenPIC  ");
> > >  	} else {
> > >  		mpic = mpic_alloc(NULL, 0,
> > >  		  MPIC_BIG_ENDIAN |
> > 
> > I don't think we want to grow a list of every single revision of every
> > board in
> > these platform files.
> 
> One other confusing observation I have is that "irq_count" from platform
> code is given precedence over "last-interrupt-source" in device-tree.
> Should not device-tree should have precedence otherwise there is no point
> using " last-interrupt-source" if platform code passes "irq_count" in
> mpic_alloc().

Maybe, though I don't think it matters much given that last-interrupt-source
was only added to avoid having to pass irq_count in platform code.

-Scott

^ permalink raw reply

* Re: [RFC PATCH 0/3] New device-tree format  and Opal based idle save-restore
From: Gautham R Shenoy @ 2018-08-08  6:02 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Akshay Adiga, linux-kernel, linuxppc-dev, npiggin, benh, ego,
	huntbag
In-Reply-To: <87zhxybceu.fsf@concordia.ellerman.id.au>

Hello Michael,

On Tue, Aug 07, 2018 at 10:15:37PM +1000, Michael Ellerman wrote:
> > Skiboot patch-set for device-tree is posted here :
> > https://patchwork.ozlabs.org/project/skiboot/list/?series=58934
> 
> I don't see a device tree binding documented anywhere?
> 
> There is an existing binding defined for ARM chips, presumably it
> doesn't do everything we need. But are there good reasons why we are not
> using it as a base?
> 
> See: Documentation/devicetree/bindings/arm/idle-states.txt
>

In case of ARM, the idle-states node is a child of cpus node. Each
child of the idle-states node is a node describing that particular
idle state.

	idle-states {
		entry-method = "psci";

		CPU_RETENTION_0_0: cpu-retention-0-0 {
			compatible = "arm,idle-state";
			arm,psci-suspend-param = <0x0010000>;
			entry-latency-us = <20>;
			exit-latency-us = <40>;
			min-residency-us = <80>;
			status = "disabled"
		};


		CPU_SLEEP_0_0: cpu-sleep-0-0 {
			compatible = "arm,idle-state";
			local-timer-stop;
			arm,psci-suspend-param = <0x0010000>;
			entry-latency-us = <250>;
			exit-latency-us = <500>;
			min-residency-us = <950>;
			status = "okay"
		};

		.
		.
		.
	}


Furthermore, each CPU can have a different set of cpu-idle states
due to the asymmetric nature of the processors units on the board.
Thus, there is an additional property for each cpu called
cpu-idle-states which points to the containers of the idle states
themselves.


cpus {
	#size-cells = <0>;
	#address-cells = <2>;

	CPU0: cpu@0 {
		device_type = "cpu";
		compatible = "arm,cortex-a57";
		reg = <0x0 0x0>;
		enable-method = "psci";
		cpu-idle-states = <&CPU_RETENTION_0_0 &CPU_SLEEP_0_0
				   &CLUSTER_RETENTION_0 &CLUSTER_SLEEP_0>;
	};

	. . .
	. . .
	. . .
	. . .

	CPU8: cpu@100000000 {
		device_type = "cpu";
		compatible = "arm,cortex-a53";
		reg = <0x1 0x0>;
		enable-method = "psci";
		cpu-idle-states = <&CPU_RETENTION_1_0 &CPU_SLEEP_1_0
				   &CLUSTER_RETENTION_1 &CLUSTER_SLEEP_1>;
	};


In our case, we already have an "ibm,opal/power-mgt/" node in the
device tree where we have defined the idle state so far. This was the
reason to stick the new device tree format under this existing node
that has been specially earmarked for power management related bits,
instead of defining the new format under the cpus node.

Also, in our case, since all the CPU nodes are symmetric they will
have the same set of idle states. Hence, we wouldn't need the
"cpu-idle-states" property for each CPU.

As for the properties of idle states themselves, the only common
things between the ARM idle-states and our case are the compatible,
exit-latency-us, min-residency-us. In addition to this we need the
flags which indicate the nature of the resource loss (Hypervisors
state loss, Timebase loss, etc..) , the psscr_val and the psscr_mask
corresponding to the stop states which the ARM device-tree doesn't
provide.

For this reason we have opted for a new bindings since the overlap
between these two platforms is minimal.

> 
> The way you're using compatible is not really consistent with its
> traditional meaning.
> 
> eg, you have multiple states with:
> 
>                 compatible = "ibm,state-v1",
>                             "cpuoffline",
>                             "opal-supported";
> 
> 
> This would typically mean that all those state are all "compatible" with
> some semantics defined by the name "ibm,state-v1". What you're trying to
> say (I think) is that each state is "version 1" of *that state*. And
> only kernels that understand version 1 should use the state.

Ok, I see what you mean here. Perhaps, we should have had something
like "ibm,stop0-v1" , "ibm,stop1-v2", "ibm,stop2-v2" etc, where
version1, version2 etc, pertains to the versions of those specific
states.

Thus a kernel that knows about "version 1" of stop0 and stop2 and
"version 2" of stop1 will end up using only stop0 and stop1 since it
doesn't know "version 2" of stop2. 

In such a case, kernel should fallback to OPAL for stop2. Does this
make sense ?

> 
> And "cpuoffline" and "opal-supported" definitely don't belong in
> compatible AFAICS, they should simply be boolean properties of the
> node.

I agree. These should be flags.

> 
> cheers
> 

^ permalink raw reply

* RE: [RFC 5/5] powerpc/fsl: Add supported-irq-ranges for P2020
From: Bharat Bhushan @ 2018-08-08  6:28 UTC (permalink / raw)
  To: Scott Wood, benh@kernel.crashing.org, paulus@samba.org,
	mpe@ellerman.id.au, galak@kernel.crashing.org,
	mark.rutland@arm.com, kstewart@linuxfoundation.org,
	gregkh@linuxfoundation.org, devicetree@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
  Cc: robh@kernel.org, keescook@chromium.org, tyreld@linux.vnet.ibm.com,
	joe@perches.com
In-Reply-To: <aeff83355ee5235cfd6107dba06a5f7957b45c92.camel@buserror.net>

DQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogU2NvdHQgV29vZCBbbWFp
bHRvOm9zc0BidXNlcnJvci5uZXRdDQo+IFNlbnQ6IFdlZG5lc2RheSwgQXVndXN0IDgsIDIwMTgg
MTE6MjYgQU0NCj4gVG86IEJoYXJhdCBCaHVzaGFuIDxiaGFyYXQuYmh1c2hhbkBueHAuY29tPjsN
Cj4gYmVuaEBrZXJuZWwuY3Jhc2hpbmcub3JnOyBwYXVsdXNAc2FtYmEub3JnOyBtcGVAZWxsZXJt
YW4uaWQuYXU7DQo+IGdhbGFrQGtlcm5lbC5jcmFzaGluZy5vcmc7IG1hcmsucnV0bGFuZEBhcm0u
Y29tOw0KPiBrc3Rld2FydEBsaW51eGZvdW5kYXRpb24ub3JnOyBncmVna2hAbGludXhmb3VuZGF0
aW9uLm9yZzsNCj4gZGV2aWNldHJlZUB2Z2VyLmtlcm5lbC5vcmc7IGxpbnV4cHBjLWRldkBsaXN0
cy5vemxhYnMub3JnOyBsaW51eC0NCj4ga2VybmVsQHZnZXIua2VybmVsLm9yZw0KPiBDYzogcm9i
aEBrZXJuZWwub3JnOyBrZWVzY29va0BjaHJvbWl1bS5vcmc7IHR5cmVsZEBsaW51eC52bmV0Lmli
bS5jb207DQo+IGpvZUBwZXJjaGVzLmNvbQ0KPiBTdWJqZWN0OiBSZTogW1JGQyA1LzVdIHBvd2Vy
cGMvZnNsOiBBZGQgc3VwcG9ydGVkLWlycS1yYW5nZXMgZm9yIFAyMDIwDQo+IA0KPiBPbiBXZWQs
IDIwMTgtMDgtMDggYXQgMDM6NDQgKzAwMDAsIEJoYXJhdCBCaHVzaGFuIHdyb3RlOg0KPiA+ID4g
LS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gPiA+IEZyb206IFNjb3R0IFdvb2QgW21haWx0
bzpvc3NAYnVzZXJyb3IubmV0XQ0KPiA+ID4gU2VudDogV2VkbmVzZGF5LCBBdWd1c3QgOCwgMjAx
OCAyOjQ0IEFNDQo+ID4gPiBUbzogQmhhcmF0IEJodXNoYW4gPGJoYXJhdC5iaHVzaGFuQG54cC5j
b20+Ow0KPiA+ID4gYmVuaEBrZXJuZWwuY3Jhc2hpbmcub3JnOyBwYXVsdXNAc2FtYmEub3JnOyBt
cGVAZWxsZXJtYW4uaWQuYXU7DQo+ID4gPiBnYWxha0BrZXJuZWwuY3Jhc2hpbmcub3JnOyBtYXJr
LnJ1dGxhbmRAYXJtLmNvbTsNCj4gPiA+IGtzdGV3YXJ0QGxpbnV4Zm91bmRhdGlvbi5vcmc7IGdy
ZWdraEBsaW51eGZvdW5kYXRpb24ub3JnOw0KPiA+ID4gZGV2aWNldHJlZUB2Z2VyLmtlcm5lbC5v
cmc7IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMub3JnOyBsaW51eC0NCj4gPiA+IGtlcm5lbEB2
Z2VyLmtlcm5lbC5vcmcNCj4gPiA+IENjOiByb2JoQGtlcm5lbC5vcmc7IGtlZXNjb29rQGNocm9t
aXVtLm9yZzsNCj4gPiA+IHR5cmVsZEBsaW51eC52bmV0LmlibS5jb207IGpvZUBwZXJjaGVzLmNv
bQ0KPiA+ID4gU3ViamVjdDogUmU6IFtSRkMgNS81XSBwb3dlcnBjL2ZzbDogQWRkIHN1cHBvcnRl
ZC1pcnEtcmFuZ2VzIGZvcg0KPiA+ID4gUDIwMjANCj4gPiA+DQo+ID4gPiBPbiBGcmksIDIwMTgt
MDctMjcgYXQgMTU6MTggKzA1MzAsIEJoYXJhdCBCaHVzaGFuIHdyb3RlOg0KPiA+ID4gPiBNUElD
IG9uIE5YUCAoRnJlZXNjYWxlKSBQMjAyMCBzdXBwb3J0cyBmb2xsb3dpbmcgaXJxDQo+ID4gPiA+
IHJhbmdlczoNCj4gPiA+ID4gICA+IDAgLSAxMSAgICAgIChFeHRlcm5hbCBpbnRlcnJ1cHQpDQo+
ID4gPiA+ICAgPiAxNiAtIDc5ICAgICAoSW50ZXJuYWwgaW50ZXJydXB0KQ0KPiA+ID4gPiAgID4g
MTc2IC0gMTgzICAgKE1lc3NhZ2luZyBpbnRlcnJ1cHQpDQo+ID4gPiA+ICAgPiAyMjQgLSAyMzEg
ICAoU2hhcmVkIG1lc3NhZ2Ugc2lnbmFsZWQgaW50ZXJydXB0KQ0KPiA+ID4NCj4gPiA+IFdoeSBk
b24ndCB5b3UgY29udmVydCB0byB0aGUgNC1jZWxsIGludGVycnVwdCBzcGVjaWZpZXJzIHRoYXQg
bWFrZQ0KPiA+ID4gZGVhbGluZyB3aXRoIHRoZXNlIHJhbmdlcyBsZXNzIGVycm9yLXByb25lPw0K
PiA+DQo+ID4gT2sgLCB3aWxsIGRvIGlmIHdlIGFncmVlIHRvIGhhdmUgdGhpcyBzZXJpZXMgYXMg
cGVyIGNvbW1lbnQgb24gb3RoZXIgcGF0Y2guDQo+IA0KPiBJZiB5b3UncmUgY29uY2VybmVkIHdp
dGggZXJyb3JzLCB0aGlzIHdvdWxkIGJlIGEgZ29vZCB0aGluZ3MgdG8gZG8gcmVnYXJkbGVzcy4N
Cj4gIEFjdHVhbGx5LCBpdCBzZWVtcyB0aGF0IHAyMDIwc2ktcG9zdC5kdHNpIGFscmVhZHkgdXNl
cyA0LWNlbGwgaW50ZXJydXB0cy4NCj4gDQo+IFdoYXQgaXMgbW90aXZhdGluZyB0aGlzIHBhdGNo
c2V0PyAgSXMgdGhlcmUgc29tZXRoaW5nIHdyb25nIGluIHRoZSBleGlzdGluZw0KPiBkdHMgZmls
ZXM/DQoNClRoZXJlIGlzIG5vIGVycm9yIGluIGRldmljZSB0cmVlLiBNYWluIG1vdGl2YXRpb24g
aXMgdG8gaW1wcm92ZSBjb2RlIGZvciBmb2xsb3dpbmcgcmVhc29uczogDQogIC0gV2hpbGUgY29k
ZSBzdHVkeSBpdCB3YXMgZm91bmQgdGhhdCBpZiBhIHJlc2VydmVkIGlycS1udW1iZXIgdXNlZCB0
aGVuIHRoZXJlIGFyZSBubyBjaGVjayBpbiBkcml2ZXIuIGlycSB3aWxsIGJlIGNvbmZpZ3VyZWQg
YXMgY29ycmVjdCBhbmQgaW50ZXJydXB0IHdpbGwgbmV2ZXIgZmlyZS4NCiAtIFdhcm5pbmdzIHdl
cmUgb2JzZXJ2ZWQgb24gZGV2ZWxvcG1lbnQgcGxhdGZvcm0gKHNpbXVsYXRvcikgd2hlbiByZWFk
L3dyaXRlIHRvIHJlc2VydmVkIE1QSUMgcmVhc29uIGR1cmluZyBpbml0Lg0KICANCj4gDQo+IA0K
PiA+DQo+ID4gPg0KPiA+ID4gPiBkaWZmIC0tZ2l0IGEvYXJjaC9wb3dlcnBjL3BsYXRmb3Jtcy84
NXh4L21wYzg1eHhfcmRiLmMNCj4gPiA+ID4gYi9hcmNoL3Bvd2VycGMvcGxhdGZvcm1zLzg1eHgv
bXBjODV4eF9yZGIuYw0KPiA+ID4gPiBpbmRleCAxMDA2OTUwLi40OWZmMzQ4IDEwMDY0NA0KPiA+
ID4gPiAtLS0gYS9hcmNoL3Bvd2VycGMvcGxhdGZvcm1zLzg1eHgvbXBjODV4eF9yZGIuYw0KPiA+
ID4gPiArKysgYi9hcmNoL3Bvd2VycGMvcGxhdGZvcm1zLzg1eHgvbXBjODV4eF9yZGIuYw0KPiA+
ID4gPiBAQCAtNTcsNiArNTcsMTEgQEAgdm9pZCBfX2luaXQgbXBjODV4eF9yZGJfcGljX2luaXQo
dm9pZCkNCj4gPiA+ID4gIAkJCU1QSUNfQklHX0VORElBTiB8DQo+ID4gPiA+ICAJCQlNUElDX1NJ
TkdMRV9ERVNUX0NQVSwNCj4gPiA+ID4gIAkJCTAsIDI1NiwgIiBPcGVuUElDICAiKTsNCj4gPiA+
ID4gKwl9IGVsc2UgaWYgKG9mX21hY2hpbmVfaXNfY29tcGF0aWJsZSgiZnNsLFAyMDIwUkRCLVBD
IikpIHsNCj4gPiA+ID4gKwkJbXBpYyA9IG1waWNfYWxsb2MoTlVMTCwgMCwNCj4gPiA+ID4gKwkJ
ICBNUElDX0JJR19FTkRJQU4gfA0KPiA+ID4gPiArCQkgIE1QSUNfU0lOR0xFX0RFU1RfQ1BVLA0K
PiA+ID4gPiArCQkgIDAsIDAsICIgT3BlblBJQyAgIik7DQo+ID4gPiA+ICAJfSBlbHNlIHsNCj4g
PiA+ID4gIAkJbXBpYyA9IG1waWNfYWxsb2MoTlVMTCwgMCwNCj4gPiA+ID4gIAkJICBNUElDX0JJ
R19FTkRJQU4gfA0KPiA+ID4NCj4gPiA+IEkgZG9uJ3QgdGhpbmsgd2Ugd2FudCB0byBncm93IGEg
bGlzdCBvZiBldmVyeSBzaW5nbGUgcmV2aXNpb24gb2YNCj4gPiA+IGV2ZXJ5IGJvYXJkIGluIHRo
ZXNlIHBsYXRmb3JtIGZpbGVzLg0KPiA+DQo+ID4gT25lIG90aGVyIGNvbmZ1c2luZyBvYnNlcnZh
dGlvbiBJIGhhdmUgaXMgdGhhdCAiaXJxX2NvdW50IiBmcm9tDQo+ID4gcGxhdGZvcm0gY29kZSBp
cyBnaXZlbiBwcmVjZWRlbmNlIG92ZXIgImxhc3QtaW50ZXJydXB0LXNvdXJjZSIgaW4gZGV2aWNl
LQ0KPiB0cmVlLg0KPiA+IFNob3VsZCBub3QgZGV2aWNlLXRyZWUgc2hvdWxkIGhhdmUgcHJlY2Vk
ZW5jZSBvdGhlcndpc2UgdGhlcmUgaXMgbm8NCj4gPiBwb2ludCB1c2luZyAiIGxhc3QtaW50ZXJy
dXB0LXNvdXJjZSIgaWYgcGxhdGZvcm0gY29kZSBwYXNzZXMNCj4gPiAiaXJxX2NvdW50IiBpbiBt
cGljX2FsbG9jKCkuDQo+IA0KPiBNYXliZSwgdGhvdWdoIEkgZG9uJ3QgdGhpbmsgaXQgbWF0dGVy
cyBtdWNoIGdpdmVuIHRoYXQgbGFzdC1pbnRlcnJ1cHQtc291cmNlDQo+IHdhcyBvbmx5IGFkZGVk
IHRvIGF2b2lkIGhhdmluZyB0byBwYXNzIGlycV9jb3VudCBpbiBwbGF0Zm9ybSBjb2RlLg0KDQpU
aGFua3MgZm9yIGNsYXJpZnlpbmc7DQoNCk15IHVuZGVyc3RhbmRpbmcgd2FzIHRoYXQgImxhc3Qt
aW50ZXJydXB0LXNvdXJjZSIgYWRkZWQgdG8gZW5zdXJlIHRoYXQgd2UgY2FuIG92ZXItcmlkZSB2
YWx1ZSBwYXNzZWQgZnJvbSBwbGF0Zm9ybSBjb2RlLiBJbiB0aGF0IGNhc2Ugd2UgZG8gbm90IG5l
ZWQgdG8gY2hhbmdlIGNvZGUgYW5kIGNhbiBjb250cm9sIGZyb20gZGV2aWNlIHRyZWUuDQoNClRo
YW5rcw0KLUJoYXJhdA0KDQoNCj4gDQo+IC1TY290dA0KDQo=

^ permalink raw reply

* [PATCH] lib/test_hexdump: fix failure on big endian cpu
From: Christophe Leroy @ 2018-08-08  6:37 UTC (permalink / raw)
  To: Andy Shevchenko, Andrew Morton, Linus Torvalds
  Cc: linux-kernel, linuxppc-dev, Michael Ellerman

On a big endian cpu, test_hexdump fails as follows. The logs show
that bytes are expected in reversed order.

[...]
[   16.643648] test_hexdump: Len: 24 buflen: 130 strlen: 97
[   16.648681] test_hexdump: Result: 97 'be32db7b 0a1893b2 70bac424 7d83349b a69c31ad 9c0face9                    .2.{....p..$}.4...1.....'
[   16.660951] test_hexdump: Expect: 97 '7bdb32be b293180a 24c4ba70 9b34837d ad319ca6 e9ac0f9c                    .2.{....p..$}.4...1.....'
[   16.673129] test_hexdump: Len: 8 buflen: 130 strlen: 77
[   16.678113] test_hexdump: Result: 77 'be32db7b0a1893b2                                                     .2.{....'
[   16.688660] test_hexdump: Expect: 77 'b293180a7bdb32be                                                     .2.{....'
[   16.699170] test_hexdump: Len: 6 buflen: 131 strlen: 87
[   16.704238] test_hexdump: Result: 87 'be32 db7b 0a18                                                                   .2.{..'
[   16.715511] test_hexdump: Expect: 87 '32be 7bdb 180a                                                                   .2.{..'
[   16.726864] test_hexdump: Len: 24 buflen: 131 strlen: 97
[   16.731902] test_hexdump: Result: 97 'be32db7b 0a1893b2 70bac424 7d83349b a69c31ad 9c0face9                    .2.{....p..$}.4...1.....'
[   16.744175] test_hexdump: Expect: 97 '7bdb32be b293180a 24c4ba70 9b34837d ad319ca6 e9ac0f9c                    .2.{....p..$}.4...1.....'
[   16.756379] test_hexdump: Len: 32 buflen: 131 strlen: 101
[   16.761507] test_hexdump: Result: 101 'be32db7b0a1893b2 70bac4247d83349b a69c31ad9c0face9 4cd1199943b1af0c  .2.{....p..$}.4...1.....L...C...'
[   16.774212] test_hexdump: Expect: 101 'b293180a7bdb32be 9b34837d24c4ba70 e9ac0f9cad319ca6 0cafb1439919d14c  .2.{....p..$}.4...1.....L...C...'
[   16.786763] test_hexdump: failed 801 out of 1184 tests

This patch fixes it.

Fixes: 64d1d77a44697 ("hexdump: introduce test suite")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 lib/test_hexdump.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/lib/test_hexdump.c b/lib/test_hexdump.c
index 3f415d8101f3..626f580b4ff7 100644
--- a/lib/test_hexdump.c
+++ b/lib/test_hexdump.c
@@ -18,7 +18,7 @@ static const unsigned char data_b[] = {
 
 static const unsigned char data_a[] = ".2.{....p..$}.4...1.....L...C...";
 
-static const char * const test_data_1_le[] __initconst = {
+static const char * const test_data_1[] __initconst = {
 	"be", "32", "db", "7b", "0a", "18", "93", "b2",
 	"70", "ba", "c4", "24", "7d", "83", "34", "9b",
 	"a6", "9c", "31", "ad", "9c", "0f", "ac", "e9",
@@ -32,16 +32,33 @@ static const char * const test_data_2_le[] __initconst = {
 	"d14c", "9919", "b143", "0caf",
 };
 
+static const char * const test_data_2_be[] __initconst = {
+	"be32", "db7b", "0a18", "93b2",
+	"70ba", "c424", "7d83", "349b",
+	"a69c", "31ad", "9c0f", "ace9",
+	"4cd1", "1999", "43b1", "af0c",
+};
+
 static const char * const test_data_4_le[] __initconst = {
 	"7bdb32be", "b293180a", "24c4ba70", "9b34837d",
 	"ad319ca6", "e9ac0f9c", "9919d14c", "0cafb143",
 };
 
+static const char * const test_data_4_be[] __initconst = {
+	"be32db7b", "0a1893b2", "70bac424", "7d83349b",
+	"a69c31ad", "9c0face9", "4cd11999", "43b1af0c",
+};
+
 static const char * const test_data_8_le[] __initconst = {
 	"b293180a7bdb32be", "9b34837d24c4ba70",
 	"e9ac0f9cad319ca6", "0cafb1439919d14c",
 };
 
+static const char * const test_data_8_be[] __initconst = {
+	"be32db7b0a1893b2", "70bac4247d83349b",
+	"a69c31ad9c0face9", "4cd1199943b1af0c",
+};
+
 #define FILL_CHAR	'#'
 
 static unsigned total_tests __initdata;
@@ -56,6 +73,7 @@ static void __init test_hexdump_prepare_test(size_t len, int rowsize,
 	size_t l = len;
 	int gs = groupsize, rs = rowsize;
 	unsigned int i;
+	const bool is_be = IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
 
 	if (rs != 16 && rs != 32)
 		rs = 16;
@@ -67,13 +85,13 @@ static void __init test_hexdump_prepare_test(size_t len, int rowsize,
 		gs = 1;
 
 	if (gs == 8)
-		result = test_data_8_le;
+		result = is_be ? test_data_8_be : test_data_8_le;
 	else if (gs == 4)
-		result = test_data_4_le;
+		result = is_be ? test_data_4_be : test_data_4_le;
 	else if (gs == 2)
-		result = test_data_2_le;
+		result = is_be ? test_data_2_be : test_data_2_le;
 	else
-		result = test_data_1_le;
+		result = test_data_1;
 
 	/* hex dump */
 	p = test;
-- 
2.13.3

^ permalink raw reply related

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Christoph Hellwig @ 2018-08-08  6:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Christoph Hellwig, Michael S. Tsirkin, Will Deacon,
	Anshuman Khandual, virtualization, linux-kernel, linuxppc-dev,
	aik, robh, joe, elfring, david, jasowang, mpe, linuxram, haren,
	paulus, srikar, robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <2103ecfe52d23cec03f185d08a87bfad9c9d82b5.camel@kernel.crashing.org>

On Wed, Aug 08, 2018 at 06:32:45AM +1000, Benjamin Herrenschmidt wrote:
> As for the flag itself, while we could set it from qemu when we get
> notified that the guest is going secure, both Michael and I think it's
> rather gross, it requires qemu to go iterate all virtio devices and
> "poke" something into them.

You don't need to set them the time you go secure.  You just need to
set the flag from the beginning on any VM you might want to go secure.
Or for simplicity just any VM - if the DT/ACPI tables exposed by
qemu are good enough that will always exclude a iommu and not set a
DMA offset, so nothing will change on the qemu side of he processing,
and with the new direct calls for the direct dma ops performance in
the guest won't change either.

> It's nicer if we have a way in the guest virtio driver to do something
> along the lines of
> 
> 	if ((flags & VIRTIO_F_IOMMU_PLATFORM) || arch_virtio_wants_dma_ops())
> 
> Which would have the same effect and means the issue is entirely
> contained in the guest.

It would not be the same effect.  The problem with that is that you must
now assumes that your qemu knows that for example you might be passing
a dma offset if the bus otherwise requires it.  Or in other words:
you potentially break the contract between qemu and the guest of always
passing down physical addresses.  If we explicitly change that contract
through using a flag that says you pass bus address everything is fine.

Note that in practice your scheme will probably just work for your
initial prototype, but chances are it will get us in trouble later on.

^ permalink raw reply

* Re: [PATCH] lib/test_hexdump: fix failure on big endian cpu
From: Michael Ellerman @ 2018-08-08  7:25 UTC (permalink / raw)
  To: Christophe Leroy, Andy Shevchenko, Andrew Morton, Linus Torvalds
  Cc: linux-kernel, linuxppc-dev
In-Reply-To: <f3112437f62c2f48300535510918e8be1dceacfb.1533610877.git.christophe.leroy@c-s.fr>

Christophe Leroy <christophe.leroy@c-s.fr> writes:
> diff --git a/lib/test_hexdump.c b/lib/test_hexdump.c
> index 3f415d8101f3..626f580b4ff7 100644
> --- a/lib/test_hexdump.c
> +++ b/lib/test_hexdump.c
> @@ -32,16 +32,33 @@ static const char * const test_data_2_le[] __initconst = {
>  	"d14c", "9919", "b143", "0caf",
>  };
>  
> +static const char * const test_data_2_be[] __initconst = {
> +	"be32", "db7b", "0a18", "93b2",
> +	"70ba", "c424", "7d83", "349b",
> +	"a69c", "31ad", "9c0f", "ace9",
> +	"4cd1", "1999", "43b1", "af0c",
> +};
> +
>  static const char * const test_data_4_le[] __initconst = {
>  	"7bdb32be", "b293180a", "24c4ba70", "9b34837d",
>  	"ad319ca6", "e9ac0f9c", "9919d14c", "0cafb143",
>  };
>  
> +static const char * const test_data_4_be[] __initconst = {
> +	"be32db7b", "0a1893b2", "70bac424", "7d83349b",
> +	"a69c31ad", "9c0face9", "4cd11999", "43b1af0c",
> +};
> +

Is there a reason we can't do it all at compile time?

eg:

static const char * const test_data_4[] __initconst = {
#ifdef CONFIG_CPU_LITTLE_ENDIAN
	"7bdb32be", "b293180a", "24c4ba70", "9b34837d",
	"ad319ca6", "e9ac0f9c", "9919d14c", "0cafb143",
#else
	"be32db7b", "0a1893b2", "70bac424", "7d83349b",
	"a69c31ad", "9c0face9", "4cd11999", "43b1af0c",
#endif
};


cheers

^ permalink raw reply

* Re: [PATCH] lib/test_hexdump: fix failure on big endian cpu
From: Christophe LEROY @ 2018-08-08  7:44 UTC (permalink / raw)
  To: Michael Ellerman, Andy Shevchenko, Andrew Morton, Linus Torvalds
  Cc: linux-kernel, linuxppc-dev
In-Reply-To: <87mutxb9qy.fsf@concordia.ellerman.id.au>



Le 08/08/2018 à 09:25, Michael Ellerman a écrit :
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> diff --git a/lib/test_hexdump.c b/lib/test_hexdump.c
>> index 3f415d8101f3..626f580b4ff7 100644
>> --- a/lib/test_hexdump.c
>> +++ b/lib/test_hexdump.c
>> @@ -32,16 +32,33 @@ static const char * const test_data_2_le[] __initconst = {
>>   	"d14c", "9919", "b143", "0caf",
>>   };
>>   
>> +static const char * const test_data_2_be[] __initconst = {
>> +	"be32", "db7b", "0a18", "93b2",
>> +	"70ba", "c424", "7d83", "349b",
>> +	"a69c", "31ad", "9c0f", "ace9",
>> +	"4cd1", "1999", "43b1", "af0c",
>> +};
>> +
>>   static const char * const test_data_4_le[] __initconst = {
>>   	"7bdb32be", "b293180a", "24c4ba70", "9b34837d",
>>   	"ad319ca6", "e9ac0f9c", "9919d14c", "0cafb143",
>>   };
>>   
>> +static const char * const test_data_4_be[] __initconst = {
>> +	"be32db7b", "0a1893b2", "70bac424", "7d83349b",
>> +	"a69c31ad", "9c0face9", "4cd11999", "43b1af0c",
>> +};
>> +
> 
> Is there a reason we can't do it all at compile time?

Codyingstyle suggests to use IS_ENABLED() as much as possible.
I checked symbols inside resulting vmlinux, only the BE ones are there 
so it has
the same effect as an #ifdef

Extract from Documentation/process/codying-style.rst:

Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
symbol into a C boolean expression, and use it in a normal C conditional:

.. code-block:: c

	if (IS_ENABLED(CONFIG_SOMETHING)) {
		...
	}

The compiler will constant-fold the conditional away, and include or exclude
the block of code just as with an #ifdef, so this will not add any runtime
overhead.  However, this approach still allows the C compiler to see the 
code
inside the block, and check it for correctness (syntax, types, symbol
references, etc).  Thus, you still have to use an #ifdef if the code 
inside the
block references symbols that will not exist if the condition is not met.

Christophe

> 
> eg:
> 
> static const char * const test_data_4[] __initconst = {
> #ifdef CONFIG_CPU_LITTLE_ENDIAN
> 	"7bdb32be", "b293180a", "24c4ba70", "9b34837d",
> 	"ad319ca6", "e9ac0f9c", "9919d14c", "0cafb143",
> #else
> 	"be32db7b", "0a1893b2", "70bac424", "7d83349b",
> 	"a69c31ad", "9c0face9", "4cd11999", "43b1af0c",
> #endif
> };
> 
> 
> cheers
> 

^ permalink raw reply

* Re: [RFC PATCH kernel 0/5] powerpc/P9/vfio: Pass through NVIDIA Tesla V100
From: Alexey Kardashevskiy @ 2018-08-08  8:39 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Benjamin Herrenschmidt, linuxppc-dev, David Gibson, kvm-ppc,
	Ram Pai, kvm, Alistair Popple
In-Reply-To: <20180801101646.2de997c3@t450s.home>



On 02/08/2018 02:16, Alex Williamson wrote:
> On Wed, 1 Aug 2018 18:37:35 +1000
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> On 01/08/2018 00:29, Alex Williamson wrote:
>>> On Tue, 31 Jul 2018 14:03:35 +1000
>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>   
>>>> On 31/07/2018 02:29, Alex Williamson wrote:  
>>>>> On Mon, 30 Jul 2018 18:58:49 +1000
>>>>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:  
>>>>>> After some local discussions, it was pointed out that force disabling
>>>>>> nvlinks won't bring us much as for an nvlink to work, both sides need to
>>>>>> enable it so malicious guests cannot penetrate good ones (or a host)
>>>>>> unless a good guest enabled the link but won't happen with a well
>>>>>> behaving guest. And if two guests became malicious, then can still only
>>>>>> harm each other, and so can they via other ways such network. This is
>>>>>> different from PCIe as once PCIe link is unavoidably enabled, a well
>>>>>> behaving device cannot firewall itself from peers as it is up to the
>>>>>> upstream bridge(s) now to decide the routing; with nvlink2, a GPU still
>>>>>> has means to protect itself, just like a guest can run "firewalld" for
>>>>>> network.
>>>>>>
>>>>>> Although it would be a nice feature to have an extra barrier between
>>>>>> GPUs, is inability to block the links in hypervisor still a blocker for
>>>>>> V100 pass through?    
>>>>>
>>>>> How is the NVLink configured by the guest, is it 'on'/'off' or are
>>>>> specific routes configured?     
>>>>
>>>> The GPU-GPU links need not to be blocked and need to be enabled
>>>> (==trained) by a driver in the guest. There are no routes between GPUs
>>>> in NVLink fabric, these are direct links, it is just a switch on each
>>>> side, both switches need to be on for a link to work.  
>>>
>>> Ok, but there is at least the possibility of multiple direct links per
>>> GPU, the very first diagram I find of NVlink shows 8 interconnected
>>> GPUs:
>>>
>>> https://www.nvidia.com/en-us/data-center/nvlink/  
>>
>> Out design is like the left part of the picture but it is just a detail.
> 
> Unless we can specifically identify a direct link vs a mesh link, we
> shouldn't be making assumptions about the degree of interconnect.
>  
>>> So if each switch enables one direct, point to point link, how does the
>>> guest know which links to open for which peer device?  
>>
>> It uses PCI config space on GPUs to discover the topology.
> 
> So do we need to virtualize this config space if we're going to
> virtualize the topology?
> 
>>> And of course
>>> since we can't see the spec, a security audit is at best hearsay :-\  
>>
>> Yup, the exact discovery protocol is hidden.
> 
> It could be reverse engineered...
> 
>>>> The GPU-CPU links - the GPU bit is the same switch, the CPU NVlink state
>>>> is controlled via the emulated PCI bridges which I pass through together
>>>> with the GPU.  
>>>
>>> So there's a special emulated switch, is that how the guest knows which
>>> GPUs it can enable NVLinks to?  
>>
>> Since it only has PCI config space (there is nothing relevant in the
>> device tree at all), I assume (double checking with the NVIDIA folks
>> now) the guest driver enables them all, tests which pair works and
>> disables the ones which do not. This gives a malicious guest a tiny
>> window of opportunity to break into a good guest. Hm :-/
> 
> Let's not minimize that window, that seems like a prime candidate for
> an exploit.
> 
>>>>> If the former, then isn't a non-malicious
>>>>> guest still susceptible to a malicious guest?    
>>>>
>>>> A non-malicious guest needs to turn its switch on for a link to a GPU
>>>> which belongs to a malicious guest.  
>>>
>>> Actual security, or obfuscation, will we ever know...  
>>>>>> If the latter, how is  
>>>>> routing configured by the guest given that the guest view of the
>>>>> topology doesn't match physical hardware?  Are these routes
>>>>> deconfigured by device reset?  Are they part of the save/restore
>>>>> state?  Thanks,    
>>>
>>> Still curious what happens to these routes on reset.  Can a later user
>>> of a GPU inherit a device where the links are already enabled?  Thanks,  
>>
>> I am told that the GPU reset disables links. As a side effect, we get an
>> HMI (a hardware fault which reset the host machine) when trying
>> accessing the GPU RAM which indicates that the link is down as the
>> memory is only accessible via the nvlink. We have special fencing code
>> in our host firmware (skiboot) to fence this memory on PCI reset so
>> reading from it returns zeroes instead of HMIs.
> 
> What sort of reset is required for this?  Typically we rely on
> secondary bus reset for GPUs, but it would be a problem if GPUs were to
> start implementing FLR and nobody had a spec to learn that FLR maybe
> didn't disable the link.  The better approach to me still seems to be
> virtualizing these NVLink config registers to an extent that the user
> can only enabling links where they have ownership of both ends of the
> connection.  Thanks,


I re-read what I wrote and I owe some explanation.

The link state can be:
- disabled (or masked),
- enabled (or not-disabled? unmasked?),
- trained (configured).

At the moment no reset disables links, on sec bus reset they are
unconfigured and go to the initial enabled-and-not-trained state which
is the default config. The NVIDIA driver in the guest trains links to do
the topology discovery. We can disable links and this disabled status
remains until sec bus reset and there is no way to re-enable links other
than sec bus reset. This is what I get from NVIDIA. FLR should not be
able to change a thing here.



-- 
Alexey

^ permalink raw reply

* [PATCH v2] powerpc/topology: Check at boot for topology updates
From: Srikar Dronamraju @ 2018-08-08  9:00 UTC (permalink / raw)
  To: linuxppc-dev, Michael Ellerman, Michael Bringmann, Manjunatha H R,
	Srikar Dronamraju, Anshuman Khandual

On a shared lpar, Phyp will not update the cpu associativity at boot
time. Just after the boot system does recognize itself as a shared lpar and
trigger a request for correct cpu associativity. But by then the scheduler
would have already created/destroyed its sched domains.

This causes
- Broken load balance across Nodes causing islands of cores.
- Performance degradation esp if the system is lightly loaded
- dmesg to wrongly report all cpus to be in Node 0.
- Messages in dmesg saying borken topology.
- With commit 051f3ca02e46 ("sched/topology: Introduce NUMA identity node sched
  domain"), can cause rcu stalls at boot up.

>From a scheduler maintainer's perspective, moving cpus from one node to
another or creating more numa levels after boot is not appropriate
without some notification to the user space.
https://lore.kernel.org/lkml/20150406214558.GA38501@linux.vnet.ibm.com/T/#u

The sched_domains_numa_masks table which is used to generate cpumasks is
only created at boot time just before creating sched domains and never
updated.  Hence, its better to get the topology correct before the sched
domains are created.

For example on 64 core Power 8 shared lpar, dmesg reports

[    2.088360] Brought up 512 CPUs
[    2.088368] Node 0 CPUs: 0-511
[    2.088371] Node 1 CPUs:
[    2.088373] Node 2 CPUs:
[    2.088375] Node 3 CPUs:
[    2.088376] Node 4 CPUs:
[    2.088378] Node 5 CPUs:
[    2.088380] Node 6 CPUs:
[    2.088382] Node 7 CPUs:
[    2.088386] Node 8 CPUs:
[    2.088388] Node 9 CPUs:
[    2.088390] Node 10 CPUs:
[    2.088392] Node 11 CPUs:
...
[    3.916091] BUG: arch topology borken
[    3.916103]      the DIE domain not a subset of the NUMA domain
[    3.916105] BUG: arch topology borken
[    3.916106]      the DIE domain not a subset of the NUMA domain
...

numactl/lscpu output will still be correct with cores spreading across
all nodes.

Socket(s):             64
NUMA node(s):          12
Model:                 2.0 (pvr 004d 0200)
Model name:            POWER8 (architected), altivec supported
Hypervisor vendor:     pHyp
Virtualization type:   para
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s): 0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471
NUMA node1 CPU(s): 8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479
NUMA node2 CPU(s): 16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487
NUMA node3 CPU(s): 24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495
NUMA node4 CPU(s):     208-215,304-311,400-407,496-503
NUMA node5 CPU(s):     168-175,264-271,360-367,456-463
NUMA node6 CPU(s):     128-135,224-231,320-327,416-423
NUMA node7 CPU(s):     136-143,232-239,328-335,424-431
NUMA node8 CPU(s):     216-223,312-319,408-415,504-511
NUMA node9 CPU(s):     144-151,240-247,336-343,432-439
NUMA node10 CPU(s):    152-159,248-255,344-351,440-447
NUMA node11 CPU(s):    160-167,256-263,352-359,448-455

Currently on this lpar, the scheduler detects 2 levels of Numa and
created numa sched domains for all cpus, but it finds a single DIE
domain consisting of all cpus. Hence it deletes all numa sched domains.

To address this, split the topology update init, such that the first
part detects vphn/prrn soon after cpus are setup and force updates
topology just before scheduler creates sched domain.

With the fix, dmesg reports

[    0.491336] numa: Node 0 CPUs: 0-7 32-39 64-71 96-103 176-183 272-279 368-375 464-471
[    0.491351] numa: Node 1 CPUs: 8-15 40-47 72-79 104-111 184-191 280-287 376-383 472-479
[    0.491359] numa: Node 2 CPUs: 16-23 48-55 80-87 112-119 192-199 288-295 384-391 480-487
[    0.491366] numa: Node 3 CPUs: 24-31 56-63 88-95 120-127 200-207 296-303 392-399 488-495
[    0.491374] numa: Node 4 CPUs: 208-215 304-311 400-407 496-503
[    0.491379] numa: Node 5 CPUs: 168-175 264-271 360-367 456-463
[    0.491384] numa: Node 6 CPUs: 128-135 224-231 320-327 416-423
[    0.491389] numa: Node 7 CPUs: 136-143 232-239 328-335 424-431
[    0.491394] numa: Node 8 CPUs: 216-223 312-319 408-415 504-511
[    0.491399] numa: Node 9 CPUs: 144-151 240-247 336-343 432-439
[    0.491404] numa: Node 10 CPUs: 152-159 248-255 344-351 440-447
[    0.491409] numa: Node 11 CPUs: 160-167 256-263 352-359 448-455

and lscpu would also report

Socket(s):             64
NUMA node(s):          12
Model:                 2.0 (pvr 004d 0200)
Model name:            POWER8 (architected), altivec supported
Hypervisor vendor:     pHyp
Virtualization type:   para
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s): 0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471
NUMA node1 CPU(s): 8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479
NUMA node2 CPU(s): 16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487
NUMA node3 CPU(s): 24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495
NUMA node4 CPU(s):     208-215,304-311,400-407,496-503
NUMA node5 CPU(s):     168-175,264-271,360-367,456-463
NUMA node6 CPU(s):     128-135,224-231,320-327,416-423
NUMA node7 CPU(s):     136-143,232-239,328-335,424-431
NUMA node8 CPU(s):     216-223,312-319,408-415,504-511
NUMA node9 CPU(s):     144-151,240-247,336-343,432-439
NUMA node10 CPU(s):    152-159,248-255,344-351,440-447
NUMA node11 CPU(s):    160-167,256-263,352-359,448-455

Previous attempt to solve this problem
https://patchwork.ozlabs.org/patch/530090/

Reported-by: Manjunatha H R <manjuhr1@in.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog v1->v2:
Fix compile warnings and checkpatch issues.

 arch/powerpc/include/asm/topology.h |  4 ++++
 arch/powerpc/kernel/smp.c           |  6 ++++++
 arch/powerpc/mm/numa.c              | 22 ++++++++++++++--------
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index 16b077801a5f..1024f1587e18 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -43,6 +43,7 @@ extern void __init dump_numa_cpu_topology(void);
 extern int sysfs_add_device_to_node(struct device *dev, int nid);
 extern void sysfs_remove_device_from_node(struct device *dev, int nid);
 extern int numa_update_cpu_topology(bool cpus_locked);
+extern void check_topology_updates(void);
 
 static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node)
 {
@@ -84,6 +85,9 @@ static inline int numa_update_cpu_topology(bool cpus_locked)
 
 static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node) {}
 
+static void check_topology_updates(void)
+{
+}
 #endif /* CONFIG_NUMA */
 
 #if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR)
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 4794d6b4f4d2..2aa0ffd954c9 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1156,6 +1156,12 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	if (smp_ops && smp_ops->bringup_done)
 		smp_ops->bringup_done();
 
+	/*
+	 * On a shared LPAR, associativity needs to be requested.
+	 * Hence, check for numa topology updates before dumping
+	 * cpu topology
+	 */
+	check_topology_updates();
 	dump_numa_cpu_topology();
 
 	/*
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 0c7e05d89244..eab46a44436f 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1515,6 +1515,7 @@ int start_topology_update(void)
 		   lppaca_shared_proc(get_lppaca())) {
 		if (!vphn_enabled) {
 			vphn_enabled = 1;
+			topology_update_needed = 1;
 			setup_cpu_associativity_change_counters();
 			timer_setup(&topology_timer, topology_timer_fn,
 				    TIMER_DEFERRABLE);
@@ -1551,6 +1552,19 @@ int prrn_is_enabled(void)
 	return prrn_enabled;
 }
 
+void check_topology_updates(void)
+{
+	/* Do not poll for changes if disabled at boot */
+	if (topology_updates_enabled)
+		start_topology_update();
+
+	if (topology_update_needed) {
+		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
+			    nr_cpumask_bits);
+		numa_update_cpu_topology(false);
+	}
+}
+
 static int topology_read(struct seq_file *file, void *v)
 {
 	if (vphn_enabled || prrn_enabled)
@@ -1597,10 +1611,6 @@ static const struct file_operations topology_ops = {
 
 static int topology_update_init(void)
 {
-	/* Do not poll for changes if disabled at boot */
-	if (topology_updates_enabled)
-		start_topology_update();
-
 	if (vphn_enabled)
 		topology_schedule_update();
 
@@ -1608,10 +1618,6 @@ static int topology_update_init(void)
 		return -ENOMEM;
 
 	topology_inited = 1;
-	if (topology_update_needed)
-		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
-					nr_cpumask_bits);
-
 	return 0;
 }
 device_initcall(topology_update_init);
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH v2 1/2] powerpc/fadump: handle crash memory ranges array index overflow
From: Mahesh Jagannath Salgaonkar @ 2018-08-08  9:04 UTC (permalink / raw)
  To: Hari Bathini, Michael Ellerman; +Cc: Mahesh J Salgaonkar, stable, linuxppc-dev
In-Reply-To: <153358813908.15150.18359359970445648733.stgit@hbathini.in.ibm.com>

On 08/07/2018 02:12 AM, Hari Bathini wrote:
> Crash memory ranges is an array of memory ranges of the crashing kernel
> to be exported as a dump via /proc/vmcore file. The size of the array
> is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases
> where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS
> value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since
> commit 142b45a72e22 ("memblock: Add array resizing support").
> 
> On large memory systems with a few DLPAR operations, the memblock memory
> regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such
> systems, registering fadump results in crash or other system failures
> like below:
> 
>   task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000
>   NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180
>   REGS: c00000000b73b570 TRAP: 0300   Tainted: G          L   X  (4.4.140+)
>   MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22004484  XER: 20000000
>   CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0
>   GPR00: c0000000000f9e58 c00000000b73b7f0 c000000000f09a00 000000000000001a
>   GPR04: c00007f3bf774c90 0000000000000004 c000000000eb9a00 0000000000000800
>   GPR08: 0000000000000804 000007a450000000 c000000000fa9a00 c00007ffb169ca20
>   GPR12: 0000000022004482 c00000000fa12c00 c00007f3a0ea97a8 0000000000000000
>   GPR16: c00007f3a0ea9a50 c00000000b73bd60 0000000000000118 000000000001fe80
>   GPR20: 0000000000000118 0000000000000000 c000000000b8c980 00000000000000d0
>   GPR24: 000007ffb0b10000 c00007ffb169c980 0000000000000000 c000000000b8c980
>   GPR28: 0000000000000004 c00007ffb169c980 000000000000001a c00007ffb169c980
>   NIP [c000000000047df4] smp_send_reschedule+0x24/0x80
>   LR [c0000000000f9e58] resched_curr+0x138/0x160
>   Call Trace:
>   [c00000000b73b7f0] [c0000000000f9e58] resched_curr+0x138/0x160 (unreliable)
>   [c00000000b73b820] [c0000000000fb538] check_preempt_curr+0xc8/0xf0
>   [c00000000b73b850] [c0000000000fb598] ttwu_do_wakeup+0x38/0x150
>   [c00000000b73b890] [c0000000000fc9c4] try_to_wake_up+0x224/0x4d0
>   [c00000000b73b900] [c00000000011ef34] __wake_up_common+0x94/0x100
>   [c00000000b73b960] [c00000000034a78c] ep_poll_callback+0xac/0x1c0
>   [c00000000b73b9b0] [c00000000011ef34] __wake_up_common+0x94/0x100
>   [c00000000b73ba10] [c00000000011f810] __wake_up_sync_key+0x70/0xa0
>   [c00000000b73ba60] [c00000000067c3e8] sock_def_readable+0x58/0xa0
>   [c00000000b73ba90] [c0000000007848ac] unix_stream_sendmsg+0x2dc/0x4c0
>   [c00000000b73bb70] [c000000000675a38] sock_sendmsg+0x68/0xa0
>   [c00000000b73bba0] [c00000000067673c] ___sys_sendmsg+0x2cc/0x2e0
>   [c00000000b73bd30] [c000000000677dbc] __sys_sendmsg+0x5c/0xc0
>   [c00000000b73bdd0] [c0000000006789bc] SyS_socketcall+0x36c/0x3f0
>   [c00000000b73be30] [c000000000009488] system_call+0x3c/0x100
>   Instruction dump:
>   4e800020 60000000 60420000 3c4c00ec 38421c30 7c0802a6 f8010010 60000000
>   3d42000a e92ab420 2fa90000 4dde0020 <e9290000> 2fa90000 419e0044 7c0802a6
>   ---[ end trace a6d1dd4bab5f8253 ]---
> 
> as array index overflow is not checked for while setting up crash memory
> ranges causing memory corruption. To resolve this issue, dynamically
> allocate memory for crash memory ranges and resize it incrementally,
> in units of pagesize, on hitting array size limit.
> 
> Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.")
> Cc: stable@vger.kernel.org
> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>

Looks ok to me.

Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Thanks,
-Mahesh.

> ---
> 
> Changes in v2:
> * Allocating memory for crash ranges in pagesize unit.
> * freeing memory allocated while cleaning up.
> * Moved the changes to coalesce memory ranges into patch 2/2.
> 
> 
>  arch/powerpc/include/asm/fadump.h |    4 +-
>  arch/powerpc/kernel/fadump.c      |   91 +++++++++++++++++++++++++++++++------
>  2 files changed, 79 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
> index 5a23010..3abc738 100644
> --- a/arch/powerpc/include/asm/fadump.h
> +++ b/arch/powerpc/include/asm/fadump.h
> @@ -195,8 +195,8 @@ struct fadump_crash_info_header {
>  	struct cpumask	online_mask;
>  };
>  
> -/* Crash memory ranges */
> -#define INIT_CRASHMEM_RANGES	(INIT_MEMBLOCK_REGIONS + 2)
> +/* Crash memory ranges size unit (pagesize) */
> +#define CRASHMEM_RANGES_ALLOC_SIZE		PAGE_SIZE
>  
>  struct fad_crash_memory_ranges {
>  	unsigned long long	base;
> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> index 07e8396..2ec5704 100644
> --- a/arch/powerpc/kernel/fadump.c
> +++ b/arch/powerpc/kernel/fadump.c
> @@ -47,8 +47,10 @@ static struct fadump_mem_struct fdm;
>  static const struct fadump_mem_struct *fdm_active;
>  
>  static DEFINE_MUTEX(fadump_mutex);
> -struct fad_crash_memory_ranges crash_memory_ranges[INIT_CRASHMEM_RANGES];
> +struct fad_crash_memory_ranges *crash_memory_ranges;
> +int crash_memory_ranges_size;
>  int crash_mem_ranges;
> +int max_crash_mem_ranges;
>  
>  /* Scan the Firmware Assisted dump configuration details. */
>  int __init early_init_dt_scan_fw_dump(unsigned long node,
> @@ -868,22 +870,67 @@ static int __init process_fadump(const struct fadump_mem_struct *fdm_active)
>  	return 0;
>  }
>  
> -static inline void fadump_add_crash_memory(unsigned long long base,
> -					unsigned long long end)
> +static void free_crash_memory_ranges(void)
> +{
> +	kfree(crash_memory_ranges);
> +	crash_memory_ranges = NULL;
> +	crash_memory_ranges_size = 0;
> +	max_crash_mem_ranges = 0;
> +}
> +
> +/*
> + * Allocate or reallocate crash memory ranges array in incremental units
> + * of CRASHMEM_RANGES_ALLOC_SIZE.
> + */
> +static int allocate_crash_memory_ranges(void)
> +{
> +	u64 new_size;
> +	struct fad_crash_memory_ranges *new_array;
> +
> +	new_size = crash_memory_ranges_size + CRASHMEM_RANGES_ALLOC_SIZE;
> +	pr_debug("Allocating %llu bytes of memory for crash memory ranges\n",
> +		 new_size);
> +
> +	new_array = krealloc(crash_memory_ranges, new_size, GFP_KERNEL);
> +	if (new_array == NULL) {
> +		pr_err("Insufficient memory for setting up crash memory ranges\n");
> +		free_crash_memory_ranges();
> +		return -ENOMEM;
> +	}
> +
> +	crash_memory_ranges = new_array;
> +	crash_memory_ranges_size = new_size;
> +	max_crash_mem_ranges = (new_size /
> +				sizeof(struct fad_crash_memory_ranges));
> +	return 0;
> +}
> +
> +static inline int fadump_add_crash_memory(unsigned long long base,
> +					  unsigned long long end)
>  {
>  	if (base == end)
> -		return;
> +		return 0;
> +
> +	if (crash_mem_ranges == max_crash_mem_ranges) {
> +		int ret;
> +
> +		ret = allocate_crash_memory_ranges();
> +		if (ret)
> +			return ret;
> +	}
>  
>  	pr_debug("crash_memory_range[%d] [%#016llx-%#016llx], %#llx bytes\n",
>  		crash_mem_ranges, base, end - 1, (end - base));
>  	crash_memory_ranges[crash_mem_ranges].base = base;
>  	crash_memory_ranges[crash_mem_ranges].size = end - base;
>  	crash_mem_ranges++;
> +	return 0;
>  }
>  
> -static void fadump_exclude_reserved_area(unsigned long long start,
> +static int fadump_exclude_reserved_area(unsigned long long start,
>  					unsigned long long end)
>  {
> +	int ret = 0;
>  	unsigned long long ra_start, ra_end;
>  
>  	ra_start = fw_dump.reserve_dump_area_start;
> @@ -891,15 +938,20 @@ static void fadump_exclude_reserved_area(unsigned long long start,
>  
>  	if ((ra_start < end) && (ra_end > start)) {
>  		if ((start < ra_start) && (end > ra_end)) {
> -			fadump_add_crash_memory(start, ra_start);
> -			fadump_add_crash_memory(ra_end, end);
> +			ret = fadump_add_crash_memory(start, ra_start);
> +			if (ret)
> +				return ret;
> +
> +			ret = fadump_add_crash_memory(ra_end, end);
>  		} else if (start < ra_start) {
> -			fadump_add_crash_memory(start, ra_start);
> +			ret = fadump_add_crash_memory(start, ra_start);
>  		} else if (ra_end < end) {
> -			fadump_add_crash_memory(ra_end, end);
> +			ret = fadump_add_crash_memory(ra_end, end);
>  		}
>  	} else
> -		fadump_add_crash_memory(start, end);
> +		ret = fadump_add_crash_memory(start, end);
> +
> +	return ret;
>  }
>  
>  static int fadump_init_elfcore_header(char *bufp)
> @@ -939,8 +991,9 @@ static int fadump_init_elfcore_header(char *bufp)
>   * Traverse through memblock structure and setup crash memory ranges. These
>   * ranges will be used create PT_LOAD program headers in elfcore header.
>   */
> -static void fadump_setup_crash_memory_ranges(void)
> +static int fadump_setup_crash_memory_ranges(void)
>  {
> +	int ret;
>  	struct memblock_region *reg;
>  	unsigned long long start, end;
>  
> @@ -953,7 +1006,9 @@ static void fadump_setup_crash_memory_ranges(void)
>  	 * specified during fadump registration. We need to create a separate
>  	 * program header for this chunk with the correct offset.
>  	 */
> -	fadump_add_crash_memory(RMA_START, fw_dump.boot_memory_size);
> +	ret = fadump_add_crash_memory(RMA_START, fw_dump.boot_memory_size);
> +	if (ret)
> +		return ret;
>  
>  	for_each_memblock(memory, reg) {
>  		start = (unsigned long long)reg->base;
> @@ -973,8 +1028,12 @@ static void fadump_setup_crash_memory_ranges(void)
>  		}
>  
>  		/* add this range excluding the reserved dump area. */
> -		fadump_exclude_reserved_area(start, end);
> +		ret = fadump_exclude_reserved_area(start, end);
> +		if (ret)
> +			return ret;
>  	}
> +
> +	return 0;
>  }
>  
>  /*
> @@ -1095,6 +1154,7 @@ static unsigned long init_fadump_header(unsigned long addr)
>  
>  static int register_fadump(void)
>  {
> +	int ret;
>  	unsigned long addr;
>  	void *vaddr;
>  
> @@ -1105,7 +1165,9 @@ static int register_fadump(void)
>  	if (!fw_dump.reserve_dump_area_size)
>  		return -ENODEV;
>  
> -	fadump_setup_crash_memory_ranges();
> +	ret = fadump_setup_crash_memory_ranges();
> +	if (ret)
> +		return ret;
>  
>  	addr = be64_to_cpu(fdm.rmr_region.destination_address) + be64_to_cpu(fdm.rmr_region.source_len);
>  	/* Initialize fadump crash info header. */
> @@ -1183,6 +1245,7 @@ void fadump_cleanup(void)
>  	} else if (fw_dump.dump_registered) {
>  		/* Un-register Firmware-assisted dump if it was registered. */
>  		fadump_unregister_dump(&fdm);
> +		free_crash_memory_ranges();
>  	}
>  }
>  
> 

^ permalink raw reply

* Re: [PATCH v7 5/9] powerpc/pseries: flush SLB contents on SLB MCE errors.
From: Nicholas Piggin @ 2018-08-08  9:04 UTC (permalink / raw)
  To: Mahesh J Salgaonkar
  Cc: linuxppc-dev, Michal Suchanek, Aneesh Kumar K.V, Ananth Narayan,
	Laurent Dufour, Michael Ellerman
In-Reply-To: <153365142349.14256.9954484737438718329.stgit@jupiter.in.ibm.com>

On Tue, 07 Aug 2018 19:47:14 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:

> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> On pseries, as of today system crashes if we get a machine check
> exceptions due to SLB errors. These are soft errors and can be fixed by
> flushing the SLBs so the kernel can continue to function instead of
> system crash. We do this in real mode before turning on MMU. Otherwise
> we would run into nested machine checks. This patch now fetches the
> rtas error log in real mode and flushes the SLBs on SLB errors.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> Signed-off-by: Michal Suchanek <msuchanek@suse.com>
> ---
> 
> Changes in V7:
> - Fold Michal's patch into this patch.
> - Handle MSR_RI=0 and evil context case in MC handler.
> ---


> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index cb796724a6fc..e89f675f1b5e 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -145,6 +145,12 @@ void slb_flush_and_rebolt(void)
>  	get_paca()->slb_cache_ptr = 0;
>  }
>  
> +void slb_flush_and_rebolt_realmode(void)
> +{
> +	__slb_flush_and_rebolt();
> +	get_paca()->slb_cache_ptr = 0;
> +}
> +
>  void slb_vmalloc_update(void)
>  {
>  	unsigned long vflags;

Can you use this patch for the SLB flush?

https://patchwork.ozlabs.org/patch/953034/

Thanks,
Nick

^ permalink raw reply

* Re: [PATCH v2 2/2] powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements
From: Mahesh Jagannath Salgaonkar @ 2018-08-08  9:08 UTC (permalink / raw)
  To: Hari Bathini, Michael Ellerman; +Cc: linuxppc-dev, Mahesh J Salgaonkar
In-Reply-To: <153358817264.15150.1450644377959627061.stgit@hbathini.in.ibm.com>

On 08/07/2018 02:12 AM, Hari Bathini wrote:
> With dynamic memory allocation support for crash memory ranges array,
> there is no hard limit on the no. of crash memory ranges kernel could
> export, but program headers count could overflow in the /proc/vmcore
> ELF file while exporting each memory range as PT_LOAD segment. Reduce
> the likelihood of a such scenario, by folding adjacent crash memory
> ranges which minimizes the total number of PT_LOAD segments.
> 
> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
> ---
>  arch/powerpc/kernel/fadump.c |   45 ++++++++++++++++++++++++++++++++++--------
>  1 file changed, 36 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> index 2ec5704..cd0c555 100644
> --- a/arch/powerpc/kernel/fadump.c
> +++ b/arch/powerpc/kernel/fadump.c
> @@ -908,22 +908,41 @@ static int allocate_crash_memory_ranges(void)
>  static inline int fadump_add_crash_memory(unsigned long long base,
>  					  unsigned long long end)
>  {
> +	u64  start, size;
> +	bool is_adjacent = false;
> +
>  	if (base == end)
>  		return 0;
>  
> -	if (crash_mem_ranges == max_crash_mem_ranges) {
> -		int ret;
> +	/*
> +	 * Fold adjacent memory ranges to bring down the memory ranges/
> +	 * PT_LOAD segments count.
> +	 */
> +	if (crash_mem_ranges) {
> +		start = crash_memory_ranges[crash_mem_ranges-1].base;
> +		size = crash_memory_ranges[crash_mem_ranges-1].size;
>  
> -		ret = allocate_crash_memory_ranges();
> -		if (ret)
> -			return ret;
> +		if ((start + size) == base)
> +			is_adjacent = true;
> +	}
> +	if (!is_adjacent) {
> +		/* resize the array on reaching the limit */
> +		if (crash_mem_ranges == max_crash_mem_ranges) {
> +			int ret;
> +
> +			ret = allocate_crash_memory_ranges();
> +			if (ret)
> +				return ret;
> +		}
> +
> +		start = base;
> +		crash_memory_ranges[crash_mem_ranges].base = start;
> +		crash_mem_ranges++;
>  	}
>  
> +	crash_memory_ranges[crash_mem_ranges-1].size = (end - start);
>  	pr_debug("crash_memory_range[%d] [%#016llx-%#016llx], %#llx bytes\n",
> -		crash_mem_ranges, base, end - 1, (end - base));
> -	crash_memory_ranges[crash_mem_ranges].base = base;
> -	crash_memory_ranges[crash_mem_ranges].size = end - base;
> -	crash_mem_ranges++;
> +		(crash_mem_ranges - 1), start, end - 1, (end - start));
>  	return 0;
>  }
>  
> @@ -999,6 +1018,14 @@ static int fadump_setup_crash_memory_ranges(void)
>  
>  	pr_debug("Setup crash memory ranges.\n");
>  	crash_mem_ranges = 0;
> +
> +	/* allocate memory for crash memory ranges for the first time */
> +	if (!max_crash_mem_ranges) {
> +		ret = allocate_crash_memory_ranges();
> +		if (ret)
> +			return ret;
> +	}
> +

I see that the check for (!is_adjacent) in first hunk already handles
the first time allocation. Do we need this ?

Rest looks fine to me.

Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Thanks,
-Mahesh.

>  	/*
>  	 * add the first memory chunk (RMA_START through boot_memory_size) as
>  	 * a separate memory chunk. The reason is, at the time crash firmware
> 

^ permalink raw reply

* [PATCH NEXT 2/4] powerpc/pasemi: Add Nemo board IRQ init routine
From: Christian Zigotzky @ 2018-08-08  9:22 UTC (permalink / raw)
  To: Michael Ellerman, Darren Stevens, linuxppc-dev, Olof Johansson
In-Reply-To: <87h8noanxo.fsf@concordia.ellerman.id.au>

Hello Michael,

I haven't reached Darren yet but I try to help a little bit.

On 03 May 2018 at 3:06PM, Michael Ellerman wrote:
> Darren Stevens <darren@stevens-zone.net> writes:
>
>> diff --git a/arch/powerpc/platforms/pasemi/setup.c b/arch/powerpc/platforms/pasemi/setup.c
>> index c4a3e93..c583c17 100644
>> --- a/arch/powerpc/platforms/pasemi/setup.c
>> +++ b/arch/powerpc/platforms/pasemi/setup.c
>> @@ -183,6 +184,99 @@ static int __init pas_setup_mce_regs(void)
>>   }
>>   machine_device_initcall(pasemi, pas_setup_mce_regs);
>>   
>> +#ifdef CONFIG_PPC_PASEMI_NEMO
>> +static void sb600_8259_cascade(struct irq_desc *desc)
>> +{
>> +	struct irq_chip *chip = irq_desc_get_chip(desc);
>> +	unsigned int cascade_irq = i8259_irq();
>> +
>> +	if (cascade_irq)
>> +               generic_handle_irq(cascade_irq);
>> +
>> +	chip->irq_eoi(&desc->irq_data);
>> +}
>> +
>> +static __init void nemo_init_IRQ(void)
>> +{
>> +	struct device_node *np;
>> +	struct device_node *root, *mpic_node, *i8259_node;
>> +	unsigned long openpic_addr;
>> +	const unsigned int *opprop;
>> +	int naddr, opplen;
>> +	int mpic_flags;
>> +	const unsigned int *nmiprop;
>> +	struct mpic *mpic;
>> +	int gpio_virq;
>> +
>> +	mpic_node = NULL;
> This is basically a copy of the existing routine.
Yes, at the begin of 'nemo_init_IRQ' it is a copy of the existing 
routine. But the code before is new code. A similar code is in the file 
"arch/powerpc/platforms/pseries/setup.c" but for the pSeries.
>
>> +	for_each_node_by_type(np, "interrupt-controller")
>> +		if (of_device_is_compatible(np, "open-pic")) {
>> +			mpic_node = np;
>> +			break;
>> +		}
>> +	if (!mpic_node)
>> +		for_each_node_by_type(np, "open-pic") {
>> +			mpic_node = np;
>> +			break;
>> +		}
>> +	if (!mpic_node) {
>> +		printk(KERN_ERR
>> +			"Failed to locate the MPIC interrupt controller\n");
>> +		return;
>> +	}
>> +
>> +	/* Find address list in /platform-open-pic */
>> +	root = of_find_node_by_path("/");
>> +	naddr = of_n_addr_cells(root);
>> +	opprop = of_get_property(root, "platform-open-pic", &opplen);
>> +	if (!opprop) {
>> +		printk(KERN_ERR "No platform-open-pic property.\n");
>> +		of_node_put(root);
>> +		return;
>> +	}
>> +	openpic_addr = of_read_number(opprop, naddr);
>> +	printk(KERN_DEBUG "OpenPIC addr: %lx\n", openpic_addr);
>> +
>> +	mpic_flags = MPIC_LARGE_VECTORS | MPIC_NO_BIAS | MPIC_NO_RESET;
>> +
>> +	nmiprop = of_get_property(mpic_node, "nmi-source", NULL);
>> +	if (nmiprop)
>> +		mpic_flags |= MPIC_ENABLE_MCK;
>> +
>> +	mpic = mpic_alloc(mpic_node, openpic_addr,
>> +			  mpic_flags, 0, 0, "PASEMI-OPIC");
>> +	BUG_ON(!mpic);
>> +
>> +	mpic_assign_isu(mpic, 0, mpic->paddr + 0x10000);
>> +	mpic_init(mpic);
>> +	/* The NMI/MCK source needs to be prio 15 */
>> +	if (nmiprop) {
>> +		nmi_virq = irq_create_mapping(NULL, *nmiprop);
>> +		mpic_irq_set_priority(nmi_virq, 15);
>> +		irq_set_irq_type(nmi_virq, IRQ_TYPE_EDGE_RISING);
>> +		mpic_unmask_irq(irq_get_irq_data(nmi_virq));
>> +	}
> Except for this bit:
>
>> +	/* Connect the SB600's legacy i8259 controller */
>> +	i8259_node = of_find_node_by_path("/pxp@0,e0000000");
>> +	i8259_init(i8259_node, 0);
>> +	of_node_put(i8259_node);
>> +
>> +	gpio_virq = irq_create_mapping(NULL, 3);
>> +	irq_set_irq_type(gpio_virq, IRQ_TYPE_LEVEL_HIGH);
>> +	irq_set_chained_handler(gpio_virq, sb600_8259_cascade);
>> +	mpic_unmask_irq(irq_get_irq_data(gpio_virq));
>> +
>> +	irq_set_default_host(mpic->irqhost);
> So that should just go in a separate routine that is empty when NEMO=n.
>
> cheers
>
What do you think about the following code. This uses 'pas_init_IRQ':

static __init void pas_init_IRQ(void)
  {
      struct device_node *np;
-    struct device_node *root, *mpic_node;
+    struct device_node *root, *mpic_node, *i8259_node;
      unsigned long openpic_addr;
      const unsigned int *opprop;
      int naddr, opplen;
      int mpic_flags;
      const unsigned int *nmiprop;
      struct mpic *mpic;
+    int gpio_virq;

      mpic_node = NULL;

@@ -244,6 +270,22 @@ static __init void pas_init_IRQ(void)
          mpic_unmask_irq(irq_get_irq_data(nmi_virq));
      }

+
+#ifdef CONFIG_PPC_PASEMI_NEMO
+    /* Connect the SB600's legacy i8259 controller */
+    i8259_node = of_find_node_by_path("/pxp@0,e0000000");
+    i8259_init(i8259_node, 0);
+    of_node_put(i8259_node);
+
+    gpio_virq = irq_create_mapping(NULL, 3);
+    irq_set_irq_type(gpio_virq, IRQ_TYPE_LEVEL_HIGH);
+    irq_set_chained_handler(gpio_virq, sb600_8259_cascade);
+    mpic_unmask_irq(irq_get_irq_data(gpio_virq));
+
+    irq_set_default_host(mpic->irqhost);
+
+#endif
      of_node_put(mpic_node);
      of_node_put(root);
  }

----

Shall I add 'ifdef CONFIG_PPC_PASEMI_NEMO' for 'struct device_node 
*root, *mpic_node, *i8259_node;' and 'int gpio_virq;'?

Note: I am not a programmer. I work for the Linux first level support 
for A-EON but sometimes I try to help. This is only a suggestion.

Thanks,
Christian

^ permalink raw reply

* Re: [PATCH] of/fdt: Remove PPC32 longtrail hack in memory scan
From: Geert Uytterhoeven @ 2018-08-08  9:29 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Rob Herring, Frank Rowand, Paul Mackerras, Linuxppc-dev,
	Dominik Klein
In-Reply-To: <20180727053555.26596-1-mpe@ellerman.id.au>

Hi Michael,

On Fri, Jul 27, 2018 at 7:36 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
> When the OF code was originally made common by Grant in commit
> 51975db0b733 ("of/flattree: merge early_init_dt_scan_memory() common
> code") (Feb 2010), the common code inherited a hack to handle
> PPC "longtrail" machines, which had a "memory@0" node with no
> device_type.
>
> That check was then made to only apply to PPC32 in b44aa25d20e2 ("of:
> Handle memory@0 node on PPC32 only") (May 2014).
>
> But according to Paul Mackerras the "longtrail" machines are long
> dead, if they were ever seen in the wild at all. If someone does still
> have one, we can handle this firmware wart in powerpc platform code.
>
> So remove the hack once and for all.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Thanks for your patch!

My LongTrail died in 2004.  I haven't heard since even longer about active
use from any of the other people I know that had one.

So:

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

However, recently Dominik (CC) send me an enquiry about it, so perhaps
he is a happy user?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH v2 2/2] powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements
From: Hari Bathini @ 2018-08-08  9:35 UTC (permalink / raw)
  To: Mahesh Jagannath Salgaonkar, Hari Bathini, Michael Ellerman
  Cc: linuxppc-dev, Mahesh J Salgaonkar
In-Reply-To: <c73ee602-dc93-1462-cc5b-3695594705e1@linux.vnet.ibm.com>



On Wednesday 08 August 2018 02:38 PM, Mahesh Jagannath Salgaonkar wrote:
> On 08/07/2018 02:12 AM, Hari Bathini wrote:
>> With dynamic memory allocation support for crash memory ranges array,
>> there is no hard limit on the no. of crash memory ranges kernel could
>> export, but program headers count could overflow in the /proc/vmcore
>> ELF file while exporting each memory range as PT_LOAD segment. Reduce
>> the likelihood of a such scenario, by folding adjacent crash memory
>> ranges which minimizes the total number of PT_LOAD segments.
>>
>> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
>> ---
>>   arch/powerpc/kernel/fadump.c |   45 ++++++++++++++++++++++++++++++++++--------
>>   1 file changed, 36 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
>> index 2ec5704..cd0c555 100644
>> --- a/arch/powerpc/kernel/fadump.c
>> +++ b/arch/powerpc/kernel/fadump.c
>> @@ -908,22 +908,41 @@ static int allocate_crash_memory_ranges(void)
>>   static inline int fadump_add_crash_memory(unsigned long long base,
>>   					  unsigned long long end)
>>   {
>> +	u64  start, size;
>> +	bool is_adjacent = false;
>> +
>>   	if (base == end)
>>   		return 0;
>>   
>> -	if (crash_mem_ranges == max_crash_mem_ranges) {
>> -		int ret;
>> +	/*
>> +	 * Fold adjacent memory ranges to bring down the memory ranges/
>> +	 * PT_LOAD segments count.
>> +	 */
>> +	if (crash_mem_ranges) {
>> +		start = crash_memory_ranges[crash_mem_ranges-1].base;
>> +		size = crash_memory_ranges[crash_mem_ranges-1].size;
>>   
>> -		ret = allocate_crash_memory_ranges();
>> -		if (ret)
>> -			return ret;
>> +		if ((start + size) == base)
>> +			is_adjacent = true;
>> +	}
>> +	if (!is_adjacent) {
>> +		/* resize the array on reaching the limit */
>> +		if (crash_mem_ranges == max_crash_mem_ranges) {
>> +			int ret;
>> +
>> +			ret = allocate_crash_memory_ranges();
>> +			if (ret)
>> +				return ret;
>> +		}
>> +
>> +		start = base;
>> +		crash_memory_ranges[crash_mem_ranges].base = start;
>> +		crash_mem_ranges++;
>>   	}
>>   
>> +	crash_memory_ranges[crash_mem_ranges-1].size = (end - start);
>>   	pr_debug("crash_memory_range[%d] [%#016llx-%#016llx], %#llx bytes\n",
>> -		crash_mem_ranges, base, end - 1, (end - base));
>> -	crash_memory_ranges[crash_mem_ranges].base = base;
>> -	crash_memory_ranges[crash_mem_ranges].size = end - base;
>> -	crash_mem_ranges++;
>> +		(crash_mem_ranges - 1), start, end - 1, (end - start));
>>   	return 0;
>>   }
>>   
>> @@ -999,6 +1018,14 @@ static int fadump_setup_crash_memory_ranges(void)
>>   
>>   	pr_debug("Setup crash memory ranges.\n");
>>   	crash_mem_ranges = 0;
>> +
>> +	/* allocate memory for crash memory ranges for the first time */
>> +	if (!max_crash_mem_ranges) {
>> +		ret = allocate_crash_memory_ranges();
>> +		if (ret)
>> +			return ret;
>> +	}
>> +
> I see that the check for (!is_adjacent) in first hunk already handles
> the first time allocation. Do we need this ?

Right. This hunk in fadump_setup_crash_memory_ranges() is unnecessary. 
Can be dropped.
Also, I missed out on adding "#include <linux/slab.h>". Though it 
compiles fine with
upstream kernel, will add and post v3 just to be safe..

> Rest looks fine to me.
>
> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Thanks for the review

- Hari

^ permalink raw reply

* [PATCH v3 1/2] powerpc/fadump: handle crash memory ranges array index overflow
From: Hari Bathini @ 2018-08-08  9:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Mahesh Salgaonkar, Mahesh J Salgaonkar, stable, linuxppc-dev

Crash memory ranges is an array of memory ranges of the crashing kernel
to be exported as a dump via /proc/vmcore file. The size of the array
is set based on INIT_MEMBLOCK_REGIONS, which works alright in most cases
where memblock memory regions count is less than INIT_MEMBLOCK_REGIONS
value. But this count can grow beyond INIT_MEMBLOCK_REGIONS value since
commit 142b45a72e22 ("memblock: Add array resizing support").

On large memory systems with a few DLPAR operations, the memblock memory
regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such
systems, registering fadump results in crash or other system failures
like below:

  task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000
  NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180
  REGS: c00000000b73b570 TRAP: 0300   Tainted: G          L   X  (4.4.140+)
  MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22004484  XER: 20000000
  CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0
  GPR00: c0000000000f9e58 c00000000b73b7f0 c000000000f09a00 000000000000001a
  GPR04: c00007f3bf774c90 0000000000000004 c000000000eb9a00 0000000000000800
  GPR08: 0000000000000804 000007a450000000 c000000000fa9a00 c00007ffb169ca20
  GPR12: 0000000022004482 c00000000fa12c00 c00007f3a0ea97a8 0000000000000000
  GPR16: c00007f3a0ea9a50 c00000000b73bd60 0000000000000118 000000000001fe80
  GPR20: 0000000000000118 0000000000000000 c000000000b8c980 00000000000000d0
  GPR24: 000007ffb0b10000 c00007ffb169c980 0000000000000000 c000000000b8c980
  GPR28: 0000000000000004 c00007ffb169c980 000000000000001a c00007ffb169c980
  NIP [c000000000047df4] smp_send_reschedule+0x24/0x80
  LR [c0000000000f9e58] resched_curr+0x138/0x160
  Call Trace:
  [c00000000b73b7f0] [c0000000000f9e58] resched_curr+0x138/0x160 (unreliable)
  [c00000000b73b820] [c0000000000fb538] check_preempt_curr+0xc8/0xf0
  [c00000000b73b850] [c0000000000fb598] ttwu_do_wakeup+0x38/0x150
  [c00000000b73b890] [c0000000000fc9c4] try_to_wake_up+0x224/0x4d0
  [c00000000b73b900] [c00000000011ef34] __wake_up_common+0x94/0x100
  [c00000000b73b960] [c00000000034a78c] ep_poll_callback+0xac/0x1c0
  [c00000000b73b9b0] [c00000000011ef34] __wake_up_common+0x94/0x100
  [c00000000b73ba10] [c00000000011f810] __wake_up_sync_key+0x70/0xa0
  [c00000000b73ba60] [c00000000067c3e8] sock_def_readable+0x58/0xa0
  [c00000000b73ba90] [c0000000007848ac] unix_stream_sendmsg+0x2dc/0x4c0
  [c00000000b73bb70] [c000000000675a38] sock_sendmsg+0x68/0xa0
  [c00000000b73bba0] [c00000000067673c] ___sys_sendmsg+0x2cc/0x2e0
  [c00000000b73bd30] [c000000000677dbc] __sys_sendmsg+0x5c/0xc0
  [c00000000b73bdd0] [c0000000006789bc] SyS_socketcall+0x36c/0x3f0
  [c00000000b73be30] [c000000000009488] system_call+0x3c/0x100
  Instruction dump:
  4e800020 60000000 60420000 3c4c00ec 38421c30 7c0802a6 f8010010 60000000
  3d42000a e92ab420 2fa90000 4dde0020 <e9290000> 2fa90000 419e0044 7c0802a6
  ---[ end trace a6d1dd4bab5f8253 ]---

as array index overflow is not checked for while setting up crash memory
ranges causing memory corruption. To resolve this issue, dynamically
allocate memory for crash memory ranges and resize it incrementally,
in units of pagesize, on hitting array size limit.

Fixes: 2df173d9e85d ("fadump: Initialize elfcore header and add PT_LOAD program headers.")
Cc: stable@vger.kernel.org
Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---

Changes in v3:
* Included <linux/slab.h> for krelloc()


 arch/powerpc/include/asm/fadump.h |    4 +-
 arch/powerpc/kernel/fadump.c      |   92 +++++++++++++++++++++++++++++++------
 2 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/fadump.h b/arch/powerpc/include/asm/fadump.h
index 5a23010..3abc738 100644
--- a/arch/powerpc/include/asm/fadump.h
+++ b/arch/powerpc/include/asm/fadump.h
@@ -195,8 +195,8 @@ struct fadump_crash_info_header {
 	struct cpumask	online_mask;
 };
 
-/* Crash memory ranges */
-#define INIT_CRASHMEM_RANGES	(INIT_MEMBLOCK_REGIONS + 2)
+/* Crash memory ranges size unit (pagesize) */
+#define CRASHMEM_RANGES_ALLOC_SIZE		PAGE_SIZE
 
 struct fad_crash_memory_ranges {
 	unsigned long long	base;
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 07e8396..9f80a78 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -34,6 +34,7 @@
 #include <linux/crash_dump.h>
 #include <linux/kobject.h>
 #include <linux/sysfs.h>
+#include <linux/slab.h>
 
 #include <asm/debugfs.h>
 #include <asm/page.h>
@@ -47,8 +48,10 @@ static struct fadump_mem_struct fdm;
 static const struct fadump_mem_struct *fdm_active;
 
 static DEFINE_MUTEX(fadump_mutex);
-struct fad_crash_memory_ranges crash_memory_ranges[INIT_CRASHMEM_RANGES];
+struct fad_crash_memory_ranges *crash_memory_ranges;
+int crash_memory_ranges_size;
 int crash_mem_ranges;
+int max_crash_mem_ranges;
 
 /* Scan the Firmware Assisted dump configuration details. */
 int __init early_init_dt_scan_fw_dump(unsigned long node,
@@ -868,22 +871,67 @@ static int __init process_fadump(const struct fadump_mem_struct *fdm_active)
 	return 0;
 }
 
-static inline void fadump_add_crash_memory(unsigned long long base,
-					unsigned long long end)
+static void free_crash_memory_ranges(void)
+{
+	kfree(crash_memory_ranges);
+	crash_memory_ranges = NULL;
+	crash_memory_ranges_size = 0;
+	max_crash_mem_ranges = 0;
+}
+
+/*
+ * Allocate or reallocate crash memory ranges array in incremental units
+ * of CRASHMEM_RANGES_ALLOC_SIZE.
+ */
+static int allocate_crash_memory_ranges(void)
+{
+	u64 new_size;
+	struct fad_crash_memory_ranges *new_array;
+
+	new_size = crash_memory_ranges_size + CRASHMEM_RANGES_ALLOC_SIZE;
+	pr_debug("Allocating %llu bytes of memory for crash memory ranges\n",
+		 new_size);
+
+	new_array = krealloc(crash_memory_ranges, new_size, GFP_KERNEL);
+	if (new_array == NULL) {
+		pr_err("Insufficient memory for setting up crash memory ranges\n");
+		free_crash_memory_ranges();
+		return -ENOMEM;
+	}
+
+	crash_memory_ranges = new_array;
+	crash_memory_ranges_size = new_size;
+	max_crash_mem_ranges = (new_size /
+				sizeof(struct fad_crash_memory_ranges));
+	return 0;
+}
+
+static inline int fadump_add_crash_memory(unsigned long long base,
+					  unsigned long long end)
 {
 	if (base == end)
-		return;
+		return 0;
+
+	if (crash_mem_ranges == max_crash_mem_ranges) {
+		int ret;
+
+		ret = allocate_crash_memory_ranges();
+		if (ret)
+			return ret;
+	}
 
 	pr_debug("crash_memory_range[%d] [%#016llx-%#016llx], %#llx bytes\n",
 		crash_mem_ranges, base, end - 1, (end - base));
 	crash_memory_ranges[crash_mem_ranges].base = base;
 	crash_memory_ranges[crash_mem_ranges].size = end - base;
 	crash_mem_ranges++;
+	return 0;
 }
 
-static void fadump_exclude_reserved_area(unsigned long long start,
+static int fadump_exclude_reserved_area(unsigned long long start,
 					unsigned long long end)
 {
+	int ret = 0;
 	unsigned long long ra_start, ra_end;
 
 	ra_start = fw_dump.reserve_dump_area_start;
@@ -891,15 +939,20 @@ static void fadump_exclude_reserved_area(unsigned long long start,
 
 	if ((ra_start < end) && (ra_end > start)) {
 		if ((start < ra_start) && (end > ra_end)) {
-			fadump_add_crash_memory(start, ra_start);
-			fadump_add_crash_memory(ra_end, end);
+			ret = fadump_add_crash_memory(start, ra_start);
+			if (ret)
+				return ret;
+
+			ret = fadump_add_crash_memory(ra_end, end);
 		} else if (start < ra_start) {
-			fadump_add_crash_memory(start, ra_start);
+			ret = fadump_add_crash_memory(start, ra_start);
 		} else if (ra_end < end) {
-			fadump_add_crash_memory(ra_end, end);
+			ret = fadump_add_crash_memory(ra_end, end);
 		}
 	} else
-		fadump_add_crash_memory(start, end);
+		ret = fadump_add_crash_memory(start, end);
+
+	return ret;
 }
 
 static int fadump_init_elfcore_header(char *bufp)
@@ -939,8 +992,9 @@ static int fadump_init_elfcore_header(char *bufp)
  * Traverse through memblock structure and setup crash memory ranges. These
  * ranges will be used create PT_LOAD program headers in elfcore header.
  */
-static void fadump_setup_crash_memory_ranges(void)
+static int fadump_setup_crash_memory_ranges(void)
 {
+	int ret;
 	struct memblock_region *reg;
 	unsigned long long start, end;
 
@@ -953,7 +1007,9 @@ static void fadump_setup_crash_memory_ranges(void)
 	 * specified during fadump registration. We need to create a separate
 	 * program header for this chunk with the correct offset.
 	 */
-	fadump_add_crash_memory(RMA_START, fw_dump.boot_memory_size);
+	ret = fadump_add_crash_memory(RMA_START, fw_dump.boot_memory_size);
+	if (ret)
+		return ret;
 
 	for_each_memblock(memory, reg) {
 		start = (unsigned long long)reg->base;
@@ -973,8 +1029,12 @@ static void fadump_setup_crash_memory_ranges(void)
 		}
 
 		/* add this range excluding the reserved dump area. */
-		fadump_exclude_reserved_area(start, end);
+		ret = fadump_exclude_reserved_area(start, end);
+		if (ret)
+			return ret;
 	}
+
+	return 0;
 }
 
 /*
@@ -1095,6 +1155,7 @@ static unsigned long init_fadump_header(unsigned long addr)
 
 static int register_fadump(void)
 {
+	int ret;
 	unsigned long addr;
 	void *vaddr;
 
@@ -1105,7 +1166,9 @@ static int register_fadump(void)
 	if (!fw_dump.reserve_dump_area_size)
 		return -ENODEV;
 
-	fadump_setup_crash_memory_ranges();
+	ret = fadump_setup_crash_memory_ranges();
+	if (ret)
+		return ret;
 
 	addr = be64_to_cpu(fdm.rmr_region.destination_address) + be64_to_cpu(fdm.rmr_region.source_len);
 	/* Initialize fadump crash info header. */
@@ -1183,6 +1246,7 @@ void fadump_cleanup(void)
 	} else if (fw_dump.dump_registered) {
 		/* Un-register Firmware-assisted dump if it was registered. */
 		fadump_unregister_dump(&fdm);
+		free_crash_memory_ranges();
 	}
 }
 

^ permalink raw reply related

* [PATCH v3 2/2] powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements
From: Hari Bathini @ 2018-08-08  9:50 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Mahesh Salgaonkar, Mahesh J Salgaonkar, linuxppc-dev
In-Reply-To: <153372179892.3846.3752141557907430807.stgit@hbathini.in.ibm.com>

With dynamic memory allocation support for crash memory ranges array,
there is no hard limit on the no. of crash memory ranges kernel could
export, but program headers count could overflow in the /proc/vmcore
ELF file while exporting each memory range as PT_LOAD segment. Reduce
the likelihood of a such scenario, by folding adjacent crash memory
ranges which minimizes the total number of PT_LOAD segments.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---

Changes in v3:
* Dropped unnecessary memory allocation hunk in fadump_setup_crash_memory_ranges()


 arch/powerpc/kernel/fadump.c |   37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 9f80a78..5436600c 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -909,22 +909,41 @@ static int allocate_crash_memory_ranges(void)
 static inline int fadump_add_crash_memory(unsigned long long base,
 					  unsigned long long end)
 {
+	u64  start, size;
+	bool is_adjacent = false;
+
 	if (base == end)
 		return 0;
 
-	if (crash_mem_ranges == max_crash_mem_ranges) {
-		int ret;
+	/*
+	 * Fold adjacent memory ranges to bring down the memory ranges/
+	 * PT_LOAD segments count.
+	 */
+	if (crash_mem_ranges) {
+		start = crash_memory_ranges[crash_mem_ranges-1].base;
+		size = crash_memory_ranges[crash_mem_ranges-1].size;
 
-		ret = allocate_crash_memory_ranges();
-		if (ret)
-			return ret;
+		if ((start + size) == base)
+			is_adjacent = true;
+	}
+	if (!is_adjacent) {
+		/* resize the array on reaching the limit */
+		if (crash_mem_ranges == max_crash_mem_ranges) {
+			int ret;
+
+			ret = allocate_crash_memory_ranges();
+			if (ret)
+				return ret;
+		}
+
+		start = base;
+		crash_memory_ranges[crash_mem_ranges].base = start;
+		crash_mem_ranges++;
 	}
 
+	crash_memory_ranges[crash_mem_ranges-1].size = (end - start);
 	pr_debug("crash_memory_range[%d] [%#016llx-%#016llx], %#llx bytes\n",
-		crash_mem_ranges, base, end - 1, (end - base));
-	crash_memory_ranges[crash_mem_ranges].base = base;
-	crash_memory_ranges[crash_mem_ranges].size = end - base;
-	crash_mem_ranges++;
+		(crash_mem_ranges - 1), start, end - 1, (end - start));
 	return 0;
 }
 

^ permalink raw reply related

* Re: [RFC 0/4] Virtio uses DMA API for all devices
From: Benjamin Herrenschmidt @ 2018-08-08 10:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Michael S. Tsirkin, Will Deacon, Anshuman Khandual,
	virtualization, linux-kernel, linuxppc-dev, aik, robh, joe,
	elfring, david, jasowang, mpe, linuxram, haren, paulus, srikar,
	robin.murphy, jean-philippe.brucker, marc.zyngier
In-Reply-To: <20180808063158.GA2474@infradead.org>

On Tue, 2018-08-07 at 23:31 -0700, Christoph Hellwig wrote:
> 
> You don't need to set them the time you go secure.  You just need to
> set the flag from the beginning on any VM you might want to go secure.
> Or for simplicity just any VM - if the DT/ACPI tables exposed by
> qemu are good enough that will always exclude a iommu and not set a
> DMA offset, so nothing will change on the qemu side of he processing,
> and with the new direct calls for the direct dma ops performance in
> the guest won't change either.

So that's where I'm not sure things are "good enough" due to how
pseries works. (remember it's paravirtualized).

A pseries system starts with a default iommu on all devices, that uses
translation using 4k entires with a "pinhole" window (usually 2G with
qemu iirc). There's no "pass through" by default.

Qemu virtio bypasses that iommu when the VIRTIO_F_IOMMU_PLATFORM flag
is not set (default) but there's nothing in the device-tree to tell the
guest about this since it's a violation of our pseries architecture, so
we just rely on Linux virtio "knowing" that it happens. It's a bit
yucky but that's now history...

Essentially pseries "architecturally" does not have the concept of not
having an iommu in the way and qemu violates that architecture today.

(Remember it comes from pHyp, our priorietary HV, which we are somewhat
mimmicing here).

So if we always set VIRTIO_F_IOMMU_PLATFORM, it *will* force all virtio
through that iommu and performance will suffer (esp vhost I suspect),
especially since adding/removing translations in the iommu is a
hypercall.

Now, we do have HV APIs to create a second window that's "permanently
mapped" to the guest memory, thus avoiding dynamic map/unmaps, and
Linux can make use of this but I don't know if that works with qemu and
the performance impact with vhost.

So the situation isn't that great.... On the other hand, I think the
other approach works for us:

> > It's nicer if we have a way in the guest virtio driver to do something
> > along the lines of
> > 
> > 	if ((flags & VIRTIO_F_IOMMU_PLATFORM) || arch_virtio_wants_dma_ops())
> > 
> > Which would have the same effect and means the issue is entirely
> > contained in the guest.
> 
> It would not be the same effect.  The problem with that is that you must
> now assumes that your qemu knows that for example you might be passing
> a dma offset if the bus otherwise requires it. 

I would assume that arch_virtio_wants_dma_ops() only returns true when
no such offsets are involved, at least in our case that would be what
happens.

>  Or in other words:
> you potentially break the contract between qemu and the guest of always
> passing down physical addresses.  If we explicitly change that contract
> through using a flag that says you pass bus address everything is fine.

For us a "bus address" is behind the iommu so that's what
VIRTIO_F_IOMMU_PLATFORM does already. We don't have the concept of a
bus address that is different. I suppose it's an ARMism to have DMA
offsets that are separate from iommus ? 

> Note that in practice your scheme will probably just work for your
> initial prototype, but chances are it will get us in trouble later on.

Not on pseries, at least not in any way I can think of mind you... but
maybe other architectures would abuse it... We could add a WARN_ON if
that calls returns true on a bus with an offset I suppose.

Cheers,
Ben.

^ permalink raw reply

* Re: Build regressions/improvements in v4.17-rc1
From: Michael Ellerman @ 2018-08-08 10:32 UTC (permalink / raw)
  To: Andrew Morton, Geert Uytterhoeven
  Cc: Linux Kernel Mailing List, Dan Williams, linuxppc-dev, npiggin
In-Reply-To: <20180806155440.9dcb271a3b075bd964aec60f@linux-foundation.org>

Andrew Morton <akpm@linux-foundation.org> writes:
> On Mon, 6 Aug 2018 12:39:21 +0200 Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
>> CC Dan, Michael, AKPM, powerpc
>> 
>> On Mon, Apr 16, 2018 at 3:10 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>> > Below is the list of build error/warning regressions/improvements in
>> > v4.17-rc1[1] compared to v4.16[2].
>> 
>> I'd like to point your attention to:
>> 
>> >   + warning: vmlinux.o(.text+0x376518): Section mismatch in reference from the function .devm_memremap_pages() to the function .meminit.text:.arch_add_memory():  => N/A
>> >   + warning: vmlinux.o(.text+0x376d64): Section mismatch in reference from the function .devm_memremap_pages_release() to the function .meminit.text:.arch_remove_memory():  => N/A
>
> hm.  Dan isn't around at present so we're on our own with this one.
>
> x86 doesn't put arch_add_memory and arch_remove_memory into __meminit. 
> x86 does
>
> #ifdef CONFIG_MEMORY_HOTPLUG
> int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
> 		bool want_memblock)
> {
> 	...
>
>
> So I guess powerpc should do that as well?

But we only recently added it to fix a section mismatch warning:

  WARNING: vmlinux.o(.text+0x6da88): Section mismatch in reference from the function .arch_add_memory() to the function .meminit.text:.create_section_mapping()
  The function .arch_add_memory() references
  the function __meminit .create_section_mapping().
  This is often because .arch_add_memory lacks a __meminit 
  annotation or the annotation of .create_section_mapping is wrong.


I think the problem is that the section mismatch logic isn't able to
cope with __meminit's changing semantics.

When CONFIG_MEMORY_HOTPLUG=y references from .text to .meminit.text
should be allowed, because they're just folded in together in the linker
script.

When CONFIG_MEMORY_HOTPLUG=n references from .text to .meminit.text
should NOT be allowed, because .meminit.text becomes .init.text and will
be freed.

I don't see anything in the section mismatch logic to cope with that
difference.

It looks like __meminit is saving us about 1K on powerpc, so I'm
strongly inclined to just remove it entirely from arch/powerpc.

Also I haven't been seeing this in my local builds because I have:

CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y

So I guess we need to work out why that's interfering with section
mismatch analysis.

cheers

^ permalink raw reply

* Re: [PATCH] of/fdt: Remove PPC32 longtrail hack in memory scan
From: Dominik Klein @ 2018-08-08 11:09 UTC (permalink / raw)
  To: Geert Uytterhoeven, Michael Ellerman
  Cc: open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Rob Herring, Frank Rowand, Paul Mackerras, Linuxppc-dev
In-Reply-To: <CAMuHMdVPmDG5m4C5=Y8yJYdwnWmXJJqQO1Hxa-3oncou9WCPzw@mail.gmail.com>

Hi guys,

On 08.08.2018 11:29, Geert Uytterhoeven wrote:
> [...]
> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
> 
> However, recently Dominik (CC) send me an enquiry about it, so perhaps
> he is a happy user?

Nope - don't have one either, so go ahead :)


Regards,
Dominik

^ permalink raw reply

* Re: Build regressions/improvements in v4.17-rc1
From: Mathieu Malaterre @ 2018-08-08 11:46 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Andrew Morton, Geert Uytterhoeven, Dan Williams, linuxppc-dev,
	LKML, Nicholas Piggin
In-Reply-To: <87lg9hb12y.fsf@concordia.ellerman.id.au>

On Wed, Aug 8, 2018 at 12:34 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Andrew Morton <akpm@linux-foundation.org> writes:
> > On Mon, 6 Aug 2018 12:39:21 +0200 Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >
> >> CC Dan, Michael, AKPM, powerpc
> >>
> >> On Mon, Apr 16, 2018 at 3:10 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
> >> > Below is the list of build error/warning regressions/improvements in
> >> > v4.17-rc1[1] compared to v4.16[2].
> >>
> >> I'd like to point your attention to:
> >>
> >> >   + warning: vmlinux.o(.text+0x376518): Section mismatch in reference from the function .devm_memremap_pages() to the function .meminit.text:.arch_add_memory():  => N/A
> >> >   + warning: vmlinux.o(.text+0x376d64): Section mismatch in reference from the function .devm_memremap_pages_release() to the function .meminit.text:.arch_remove_memory():  => N/A
> >
> > hm.  Dan isn't around at present so we're on our own with this one.
> >
> > x86 doesn't put arch_add_memory and arch_remove_memory into __meminit.
> > x86 does
> >
> > #ifdef CONFIG_MEMORY_HOTPLUG
> > int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap,
> >               bool want_memblock)
> > {
> >       ...
> >
> >
> > So I guess powerpc should do that as well?
>
> But we only recently added it to fix a section mismatch warning:
>
>   WARNING: vmlinux.o(.text+0x6da88): Section mismatch in reference from the function .arch_add_memory() to the function .meminit.text:.create_section_mapping()
>   The function .arch_add_memory() references
>   the function __meminit .create_section_mapping().
>   This is often because .arch_add_memory lacks a __meminit
>   annotation or the annotation of .create_section_mapping is wrong.
>
>
> I think the problem is that the section mismatch logic isn't able to
> cope with __meminit's changing semantics.
>
> When CONFIG_MEMORY_HOTPLUG=y references from .text to .meminit.text
> should be allowed, because they're just folded in together in the linker
> script.
>
> When CONFIG_MEMORY_HOTPLUG=n references from .text to .meminit.text
> should NOT be allowed, because .meminit.text becomes .init.text and will
> be freed.
>
> I don't see anything in the section mismatch logic to cope with that
> difference.
>
> It looks like __meminit is saving us about 1K on powerpc, so I'm
> strongly inclined to just remove it entirely from arch/powerpc.
>
> Also I haven't been seeing this in my local builds because I have:
>
> CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
>
> So I guess we need to work out why that's interfering with section
> mismatch analysis.

...

Well that's a good question actually. Section mismatch
analysis is done on the throwaway vmlinux.o which is not linked
with --gc-sections (and is not a final link), so the via_pmu_driver
symbol should exist and be picked up.

I wonder if something about the  -ffunction-sections is breaking
the reference detection.
...



ref:
https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg135431.html

>
> cheers

^ permalink raw reply

* [PATCH] powerpc: fix size calculation using resource_size()
From: Dan Carpenter @ 2018-08-08 11:57 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Jia Hongtao
  Cc: Paul Mackerras, Michael Ellerman, Rob Herring, Tyrel Datwyler,
	linuxppc-dev, kernel-janitors

The problem is the the calculation should be "end - start + 1" but the
plus one is missing in this calculation.

Fixes: 8626816e905e ("powerpc: add support for MPIC message register API")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
Static analysis.  Not tested.

diff --git a/arch/powerpc/sysdev/mpic_msgr.c b/arch/powerpc/sysdev/mpic_msgr.c
index eb69a5186243..280e964e1aa8 100644
--- a/arch/powerpc/sysdev/mpic_msgr.c
+++ b/arch/powerpc/sysdev/mpic_msgr.c
@@ -196,7 +196,7 @@ static int mpic_msgr_probe(struct platform_device *dev)
 
 	/* IO map the message register block. */
 	of_address_to_resource(np, 0, &rsrc);
-	msgr_block_addr = ioremap(rsrc.start, rsrc.end - rsrc.start);
+	msgr_block_addr = ioremap(rsrc.start, resource_size(&rsrc));
 	if (!msgr_block_addr) {
 		dev_err(&dev->dev, "Failed to iomap MPIC message registers");
 		return -EFAULT;

^ permalink raw reply related

* Re: [PATCH] lib/test_hexdump: fix failure on big endian cpu
From: Andy Shevchenko @ 2018-08-08 12:13 UTC (permalink / raw)
  To: Christophe Leroy, Andrew Morton, Linus Torvalds
  Cc: linux-kernel, linuxppc-dev, Michael Ellerman
In-Reply-To: <f3112437f62c2f48300535510918e8be1dceacfb.1533610877.git.christophe.leroy@c-s.fr>

On Wed, 2018-08-08 at 06:37 +0000, Christophe Leroy wrote:
> On a big endian cpu, test_hexdump fails as follows. The logs show
> that bytes are expected in reversed order.
> 
> [...]
> [   16.643648] test_hexdump: Len: 24 buflen: 130 strlen: 97
> [   16.648681] test_hexdump: Result: 97 'be32db7b 0a1893b2 70bac424
> 7d83349b a69c31ad
> 9c0face9                    .2.{....p..$}.4...1.....'
> [   16.660951] test_hexdump: Expect: 97 '7bdb32be b293180a 24c4ba70
> 9b34837d ad319ca6
> e9ac0f9c                    .2.{....p..$}.4...1.....'
> [   16.673129] test_hexdump: Len: 8 buflen: 130 strlen: 77
> [   16.678113] test_hexdump: Result: 77
> 'be32db7b0a1893b2                                                     
> .2.{....'
> [   16.688660] test_hexdump: Expect: 77
> 'b293180a7bdb32be                                                     
> .2.{....'
> [   16.699170] test_hexdump: Len: 6 buflen: 131 strlen: 87
> [   16.704238] test_hexdump: Result: 87 'be32 db7b
> 0a18                                                                  
>  .2.{..'
> [   16.715511] test_hexdump: Expect: 87 '32be 7bdb
> 180a                                                                  
>  .2.{..'
> [   16.726864] test_hexdump: Len: 24 buflen: 131 strlen: 97
> [   16.731902] test_hexdump: Result: 97 'be32db7b 0a1893b2 70bac424
> 7d83349b a69c31ad
> 9c0face9                    .2.{....p..$}.4...1.....'
> [   16.744175] test_hexdump: Expect: 97 '7bdb32be b293180a 24c4ba70
> 9b34837d ad319ca6
> e9ac0f9c                    .2.{....p..$}.4...1.....'
> [   16.756379] test_hexdump: Len: 32 buflen: 131 strlen: 101
> [   16.761507] test_hexdump: Result: 101 'be32db7b0a1893b2
> 70bac4247d83349b a69c31ad9c0face9
> 4cd1199943b1af0c  .2.{....p..$}.4...1.....L...C...'
> [   16.774212] test_hexdump: Expect: 101 'b293180a7bdb32be
> 9b34837d24c4ba70 e9ac0f9cad319ca6
> 0cafb1439919d14c  .2.{....p..$}.4...1.....L...C...'
> [   16.786763] test_hexdump: failed 801 out of 1184 tests
> 
> This patch fixes it.

I like this approach because in the future we might introduce something
to print be data on le cpu or otherwise and thus test data will be used
independently of cpu endianess.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

Thanks for adding the BE support.

> 
> Fixes: 64d1d77a44697 ("hexdump: introduce test suite")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>  lib/test_hexdump.c | 28 +++++++++++++++++++++++-----
>  1 file changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/test_hexdump.c b/lib/test_hexdump.c
> index 3f415d8101f3..626f580b4ff7 100644
> --- a/lib/test_hexdump.c
> +++ b/lib/test_hexdump.c
> @@ -18,7 +18,7 @@ static const unsigned char data_b[] = {
>  
>  static const unsigned char data_a[] =
> ".2.{....p..$}.4...1.....L...C...";
>  
> -static const char * const test_data_1_le[] __initconst = {
> +static const char * const test_data_1[] __initconst = {
>  	"be", "32", "db", "7b", "0a", "18", "93", "b2",
>  	"70", "ba", "c4", "24", "7d", "83", "34", "9b",
>  	"a6", "9c", "31", "ad", "9c", "0f", "ac", "e9",
> @@ -32,16 +32,33 @@ static const char * const test_data_2_le[]
> __initconst = {
>  	"d14c", "9919", "b143", "0caf",
>  };
>  
> +static const char * const test_data_2_be[] __initconst = {
> +	"be32", "db7b", "0a18", "93b2",
> +	"70ba", "c424", "7d83", "349b",
> +	"a69c", "31ad", "9c0f", "ace9",
> +	"4cd1", "1999", "43b1", "af0c",
> +};
> +
>  static const char * const test_data_4_le[] __initconst = {
>  	"7bdb32be", "b293180a", "24c4ba70", "9b34837d",
>  	"ad319ca6", "e9ac0f9c", "9919d14c", "0cafb143",
>  };
>  
> +static const char * const test_data_4_be[] __initconst = {
> +	"be32db7b", "0a1893b2", "70bac424", "7d83349b",
> +	"a69c31ad", "9c0face9", "4cd11999", "43b1af0c",
> +};
> +
>  static const char * const test_data_8_le[] __initconst = {
>  	"b293180a7bdb32be", "9b34837d24c4ba70",
>  	"e9ac0f9cad319ca6", "0cafb1439919d14c",
>  };
>  
> +static const char * const test_data_8_be[] __initconst = {
> +	"be32db7b0a1893b2", "70bac4247d83349b",
> +	"a69c31ad9c0face9", "4cd1199943b1af0c",
> +};
> +
>  #define FILL_CHAR	'#'
>  
>  static unsigned total_tests __initdata;
> @@ -56,6 +73,7 @@ static void __init test_hexdump_prepare_test(size_t
> len, int rowsize,
>  	size_t l = len;
>  	int gs = groupsize, rs = rowsize;
>  	unsigned int i;
> +	const bool is_be = IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
>  
>  	if (rs != 16 && rs != 32)
>  		rs = 16;
> @@ -67,13 +85,13 @@ static void __init
> test_hexdump_prepare_test(size_t len, int rowsize,
>  		gs = 1;
>  
>  	if (gs == 8)
> -		result = test_data_8_le;
> +		result = is_be ? test_data_8_be : test_data_8_le;
>  	else if (gs == 4)
> -		result = test_data_4_le;
> +		result = is_be ? test_data_4_be : test_data_4_le;
>  	else if (gs == 2)
> -		result = test_data_2_le;
> +		result = is_be ? test_data_2_be : test_data_2_le;
>  	else
> -		result = test_data_1_le;
> +		result = test_data_1;
>  
>  	/* hex dump */
>  	p = test;

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy

^ permalink raw reply

* Re: Build regressions/improvements in v4.17-rc1
From: Geert Uytterhoeven @ 2018-08-08 12:26 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Andrew Morton, Linux Kernel Mailing List, Dan Williams,
	linuxppc-dev, Nicholas Piggin
In-Reply-To: <87lg9hb12y.fsf@concordia.ellerman.id.au>

On Wed, Aug 8, 2018 at 12:32 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> Also I haven't been seeing this in my local builds because I have:
>
> CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y
>
> So I guess we need to work out why that's interfering with section
> mismatch analysis.

One other common case of missing section mismatch warnings is when some
versions of the compiler inline the called function, while other versions don't.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox