* [PATCH 3/5] perf, tool: Conditional branch filter 'cond' added to perf record
From: Anshuman Khandual @ 2013-05-22 6:22 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, ak, peterz, eranian, mingo
In-Reply-To: <1369203761-12649-1-git-send-email-khandual@linux.vnet.ibm.com>
Add perf record support for the new branch stack filter criterion
PERF_SAMPLE_BRANCH_COND.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
tools/perf/builtin-record.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index cdf58ec..833743a 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -676,6 +676,7 @@ static const struct branch_mode branch_modes[] = {
BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+ BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
BRANCH_END
};
--
1.7.11.7
* [PATCH 5/5] perf, documentation: Description for conditional branch filter
From: Anshuman Khandual @ 2013-05-22 6:22 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, ak, peterz, eranian, mingo
In-Reply-To: <1369203761-12649-1-git-send-email-khandual@linux.vnet.ibm.com>
Adding documentation support for conditional branch filter.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
tools/perf/Documentation/perf-record.txt | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d4da111..8b5e1ed 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -169,12 +169,13 @@ following filters are defined:
- any_call: any function call or system call
- any_ret: any function return or system call return
- ind_call: any indirect branch
+ - cond: conditional branches
- u: only when the branch target is at the user level
- k: only when the branch target is in the kernel
- hv: only when the target is at the hypervisor level
+
-The option requires at least one branch type among any, any_call, any_ret, ind_call.
+The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
The privilege levels may be omitted, in which case, the privilege levels of the associated
event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege
levels are subject to permissions. When sampling on multiple events, branch stack sampling
--
1.7.11.7
* [PATCH 2/5] powerpc, perf: Enable conditional branch filter for POWER8
From: Anshuman Khandual @ 2013-05-22 6:22 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, ak, peterz, eranian, mingo
In-Reply-To: <1369203761-12649-1-git-send-email-khandual@linux.vnet.ibm.com>
Enable conditional branch filter support for POWER8 using the
MMCRA register based filter, and reject the BHRB filter
combination of any_call with cond, which the hardware does
not support.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/perf/power8-pmu.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 8ed323d..e60b38f 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -548,11 +548,21 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
return -1;
+ /* Invalid branch filter combination - HW does not support */
+ if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
+ (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
+ return -1;
+
if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
return pmu_bhrb_filter;
}
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
+ pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
+ return pmu_bhrb_filter;
+ }
+
/* Every thing else is unsupported */
return -1;
}
--
1.7.11.7
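For readers skimming the hunk above, here is a minimal standalone sketch of the decision order it implements. It is not the kernel code: the SKETCH_* names and every bit value are placeholders invented for illustration, and only the checks visible in this patch are modelled (the earlier checks in power8_bhrb_filter_map() are left out). Because the kernel function returns u64, its "return -1" shows up to the caller as an all-ones value, which the sketch mirrors with UINT64_MAX.

```c
#include <stdint.h>

/* Placeholder bit values, NOT the kernel's; for illustration only. */
#define SKETCH_BRANCH_ANY_CALL  (1u << 1)
#define SKETCH_BRANCH_IND_CALL  (1u << 3)
#define SKETCH_BRANCH_COND      (1u << 4)

/* Hypothetical stand-ins for POWER8_MMCRA_IFM1 / POWER8_MMCRA_IFM3. */
#define SKETCH_IFM1  0x1ull
#define SKETCH_IFM3  0x3ull

/* Plays the role of "return -1" from a function returning u64. */
#define SKETCH_UNSUPPORTED  UINT64_MAX

uint64_t sketch_bhrb_filter_map(uint64_t branch_sample_type)
{
	uint64_t pmu_bhrb_filter = 0;

	/* Indirect call filtering is rejected. */
	if (branch_sample_type & SKETCH_BRANCH_IND_CALL)
		return SKETCH_UNSUPPORTED;

	/* any_call combined with cond is rejected as a combination. */
	if ((branch_sample_type & SKETCH_BRANCH_ANY_CALL) &&
	    (branch_sample_type & SKETCH_BRANCH_COND))
		return SKETCH_UNSUPPORTED;

	if (branch_sample_type & SKETCH_BRANCH_ANY_CALL) {
		pmu_bhrb_filter |= SKETCH_IFM1;
		return pmu_bhrb_filter;
	}

	if (branch_sample_type & SKETCH_BRANCH_COND) {
		pmu_bhrb_filter |= SKETCH_IFM3;
		return pmu_bhrb_filter;
	}

	/* Everything else is unsupported. */
	return SKETCH_UNSUPPORTED;
}
```

The combination check has to come before the two single-filter branches, otherwise a request for both any_call and cond would silently be treated as any_call alone.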
* [PATCH 4/5] x86, perf: Add conditional branch filtering support
From: Anshuman Khandual @ 2013-05-22 6:22 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, ak, peterz, eranian, mingo
In-Reply-To: <1369203761-12649-1-git-send-email-khandual@linux.vnet.ibm.com>
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
This patch adds conditional branch filtering support,
enabling it for PERF_SAMPLE_BRANCH_COND in perf branch
stack sampling framework by utilizing an available
software filter X86_BR_JCC.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index d978353..a0d6387 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -337,6 +337,10 @@ static int intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
if (br_type & PERF_SAMPLE_BRANCH_IND_CALL)
mask |= X86_BR_IND_CALL;
+
+ if (br_type & PERF_SAMPLE_BRANCH_COND)
+ mask |= X86_BR_JCC;
+
/*
* stash actual user request into reg, it may
* be used by fixup code for some CPU
@@ -626,6 +630,7 @@ static const int nhm_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
* NHM/WSM erratum: must include IND_JMP to capture IND_CALL
*/
[PERF_SAMPLE_BRANCH_IND_CALL] = LBR_IND_CALL | LBR_IND_JMP,
+ [PERF_SAMPLE_BRANCH_COND] = LBR_JCC,
};
static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
@@ -637,6 +642,7 @@ static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
[PERF_SAMPLE_BRANCH_ANY_CALL] = LBR_REL_CALL | LBR_IND_CALL
| LBR_FAR,
[PERF_SAMPLE_BRANCH_IND_CALL] = LBR_IND_CALL,
+ [PERF_SAMPLE_BRANCH_COND] = LBR_JCC,
};
/* core */
--
1.7.11.7
* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Mike Qiu @ 2013-05-22 6:16 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: tglx, Alexander Gordeev, linuxppc-dev, linux-kernel
In-Reply-To: <1369181713.6387.79.camel@pasglop>
On 2013/5/22 8:15, Benjamin Herrenschmidt wrote:
> On Tue, 2013-05-21 at 16:45 +0200, Alexander Gordeev wrote:
>> On Tue, Jan 15, 2013 at 03:38:53PM +0800, Mike Qiu wrote:
>>> The test results are shown by 'cat /proc/interrupts':
>>> CPU0 CPU1 CPU2 CPU3
>>> 16: 240458 261601 226310 200425 XICS Level IPI
>>> 17: 0 0 0 0 XICS Level RAS_EPOW
>>> 18: 10 0 3 2 XICS Level hvc_console
>>> 19: 122182 28481 28527 28864 XICS Level ibmvscsi
>>> 20: 506 7388226 108 118 XICS Level eth0
>>> 21: 6 5 5 5 XICS Level host1-0
>>> 22: 817 814 816 813 XICS Level host1-1
>> Hi Mike,
>>
>> I am curious if pSeries firmware allows changing affinity masks independently
>> for multiple MSIs? I.e. in your example, would it be possible to assign IRQ21
>> and IRQ22 to different CPUs?
> Yes. Each interrupt has its own affinity, whether it's an MSI or not,
> the affinity is not driven by the address.
>
> Cheers,
> Ben.
Hi Ben,
Can this patch be accepted? If so, I will send out the 3.9 version.
Michael Ellerman says he wants to see performance data,
but that depends on the driver.
It is something like MSI, and the driver can use more than one MSI.
That is to say, the driver has more interrupt resources to use,
but whether the driver makes full use of those resources is outside
this patch's control.
I tested this patch with the ipr driver, to which others have added
multiple MSI support, and it works.
Thanks
Mike
>> Thanks!
>>
>>> LOC: 398077 316725 231882 203049 Local timer interrupts
>>> SPU: 1659 919 961 903 Spurious interrupts
>>> CNT: 0 0 0 0 Performance
>>> monitoring interrupts
>>> MCE: 0 0 0 0 Machine check exceptions
>
>
* RE: SATA hang on 8315E triggered by heavy flash write?
From: Xie Shaohui-B21989 @ 2013-05-22 6:15 UTC (permalink / raw)
To: Anthony Foiani; +Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <gppwj3ayu.fsf@dworkin.scrye.com>
Hi, Anthony Foiani,
Please confirm which operation is key to reproducing the error:
1. Only update NOR for a long enough time, e.g. tens of seconds, and see if
the error happens;
2. Only r/w the SSD without any NOR operation, and see if the error happens;
3. r/w the SSD first and keep it running, then start to read NOR; if there is
no error for a long time, then start to write NOR and see how long it takes
for the error to happen.
Best Regards,
Shaohui Xie
> -----Original Message-----
> From: Anthony Foiani [mailto:tkil@scrye.com]
> Sent: Wednesday, May 22, 2013 12:17 PM
> To: Wood Scott-B07421
> Cc: linuxppc-dev@lists.ozlabs.org; Xie Shaohui-B21989
> Subject: Re: SATA hang on 8315E triggered by heavy flash write?
>
>
> Scott --
>
> Scott Wood <scottwood@freescale.com> writes:
>
> > On 05/15/2013 03:12:21 AM, Anthony Foiani wrote:
> >> At this point, /dev/sda is pretty much unusable, and I have to do at
> >> least a reboot to recover. (I don't recall if I had to do a power
> >> cycle at this point, though.)
>
> For whatever it's worth, a hard boot (full power cycle) is indeed
> necessary at this point.
>
> >> I suspect that it is related to errata eLBC-A001 (from MPC8315E Chip
> >> Errata, Rev. 3, 09/2011):
> >> ...
> >> But it seems that erratum is already fixed:
> >>
> >> http://patchwork.ozlabs.org/patch/96339/
> >> (git patch d08e44570e)
> >>
> >> Am I reading that correctly?
> >
> > Yes, that erratum has been worked around.
>
> Ok, thanks for the confirmation.
>
> >> (I'm already writing only one flash sector at a time, but it might be
> >> that even a single 0x10000-byte sector takes long enough to trigger
> >> the issue.)
> >
> > I don't think this erratum is relevant. Unlike NAND, NOR flash does
> > not involve holding the localbus for extended periods of time.
>
> I wasn't sure about the mechanism of the erratum, and it seemed awfully
> close, so I thought I'd go fishing. Guess I missed. :(
>
> It is NOR writes, btw; I do both in my application, but the initial error
> always seems to occur during a NOR write. (In this device, kernel +
> devtree go into NOR flash, ramdisk goes into NAND flash, and data goes to
> SSD... stop laughing.)
>
> Here's the most recent hang. First, to compare the application log
> timestamps with the kernel log timestamps:
>
> # mix of kernel and application log, note that kernel is about +12s.
> +0.537506 main.0 [0]: rc: fork took 9.376ms
> [ 12.892323] PHY: mdio@e0024520:01 - Link is Up - 100/Full
> +1.603034 main.0 [0]: schs: ctor: done
>
> The console output is:
>
> # console log
> [318334.294126] ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x0 action
> 0xe frozen
> [318334.301515] ata2.00: PHY RDY changed
> [318334.305301] ata2.00: failed command: WRITE DMA
> [318334.309991] ata2.00: cmd ca/00:08:b0:00:18/00:00:00:00:00/e1 tag 0
> dma 4096 out
> [318334.310015] res 50/00:00:08:61:25/00:00:00:00:00/e1 Emask
> 0x10 (ATA bus error)
> [318334.325689] ata2.00: status: { DRDY }
> [318334.329717] ata2: hard resetting link
> [318334.836038] ata2: Hardreset failed, not off-lined 0
> [318334.848407] ata2: setting speed (in hard reset)
> [318344.456050] ata2: No Signature Update
> [318344.631916] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [318344.638354] ata2.00: link online but device misclassified
> [318349.643897] ata2.00: qc timeout (cmd 0xec)
> [318349.648268] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [318349.654562] ata2.00: revalidation failed (errno=-5)
> [318349.659667] ata2: hard resetting link
> [318350.163864] ata2: Hardreset failed, not off-lined 0
> [318350.175869] ata2: setting speed (in hard reset)
> [318359.771956] ata2: No Signature Update
> [318359.947901] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> [318359.954342] ata2.00: link online but device misclassified
> [318369.959921] ata2.00: qc timeout (cmd 0xec)
> [318369.964279] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> [318369.970567] ata2.00: revalidation failed (errno=-5)
> [318369.975658] ata2: hard resetting link
> [318370.479933] ata2: Hardreset failed, not off-lined 0
> [318370.491880] ata2: setting speed (in hard reset)
> [318380.083892] ata2: No Signature Update
>
> And my application log:
>
> # application log
> +318320.957019 sw-upd.0 [29]: fm: nor0: write: writing 0x10000
> @0x180000 from buf[0x80000]; attempt 1/3
> +318322.498346 sw-upd.0 [29]: fm: nor0: write: writing 0x10000
> @0x190000 from buf[0x90000]; attempt 1/3
> +318323.849995 sw-upd.0 [29]: fm: nor0: write: writing 0x10000
> @0x1a0000 from buf[0xa0000]; attempt 1/3
> +318325.262559 sw-upd.0 [29]: fm: nor0: write: writing 0x10000
> @0x1b0000 from buf[0xb0000]; attempt 1/3
> +318326.703213 sw-upd.0 [29]: fm: nor0: write: writing 0x10000
> @0x1c0000 from buf[0xc0000]; attempt 1/3
>
> > I also don't see how it would interact with SATA, which is separate
> > from the localbus.
>
> No idea. Is there some other shared resource that might be taxed by this
> type of load?
>
> I do get a few other errors, usually just once or twice per boot:
>
> [ 4231.619368] NOHZ: local_softirq_pending 100
> [ 4232.249935] NOHZ: local_softirq_pending 100
> [ 4232.312241] NOHZ: local_softirq_pending 100
> [ 4232.424523] NOHZ: local_softirq_pending 100
> [ 4233.139146] NOHZ: local_softirq_pending 100
> [ 4233.328540] NOHZ: local_softirq_pending 100
> [ 4233.655909] NOHZ: local_softirq_pending 100
> [ 4234.106578] NOHZ: local_softirq_pending 100
> [ 4234.853966] NOHZ: local_softirq_pending 100
> [ 4235.375208] NOHZ: local_softirq_pending 100
> [11072.027818] hrtimer: interrupt took 126210 ns
>
> They seem harmless, though, and (as the timestamps indicate) the machine
> happily ran for 3-4 days after those issues.
>
> > Are you seeing any errors on the localbus, or just on SATA?
>
> I'm not seeing any errors in the console log -- but I'm not using the LBC
> for anything other than flash writes, SFAIK. (Unless I2C is handled
> through the LBC, in which case, I have frequent (~50-100/s) small
> transactions all the time -- but the hangs always coincide with flash
> writes, and not with the I2C traffic that is going on all the
> time...)
>
> > Hopefully Shaohui (our SATA person) can answer these. If you don't
> > get an answer, go ahead and open an official support request.
>
> I have a (lousy) workaround in hand: don't touch the disk during flash
> updates. (The flash writes are software updates, which will hopefully be
> fairly rare once I'm done developing this thing. Until then, though, I'm
> updating it multiple times a day, and have hit this quite a few times by
> now.)
>
> So there's no great hurry. If Shaohui can find something in the next
> week or so, that'd be fantastic; otherwise, I'll open a request.
>
> Thanks again!
>
> Best regards,
> Anthony Foiani
* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Mike Qiu @ 2013-05-22 5:57 UTC (permalink / raw)
To: linuxppc-dev
In-Reply-To: <20130521144548.GB21632@dhcp-26-207.brq.redhat.com>
On 2013/5/21 22:45, Alexander Gordeev wrote:
> On Tue, Jan 15, 2013 at 03:38:53PM +0800, Mike Qiu wrote:
>> The test results are shown by 'cat /proc/interrupts':
>> CPU0 CPU1 CPU2 CPU3
>> 16: 240458 261601 226310 200425 XICS Level IPI
>> 17: 0 0 0 0 XICS Level RAS_EPOW
>> 18: 10 0 3 2 XICS Level hvc_console
>> 19: 122182 28481 28527 28864 XICS Level ibmvscsi
>> 20: 506 7388226 108 118 XICS Level eth0
>> 21: 6 5 5 5 XICS Level host1-0
>> 22: 817 814 816 813 XICS Level host1-1
> Hi Mike,
>
> I am curious if pSeries firmware allows changing affinity masks independently
> for multiple MSIs? I.e. in your example, would it be possible to assign IRQ21
> and IRQ22 to different CPUs?
Yes, as Ben says, this is very different from other firmware :)
Thanks
Mike
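To make the affinity question concrete: on Linux the usual way to steer one interrupt to one CPU is to write a hex CPU bitmask to /proc/irq/<N>/smp_affinity. The thread does not say which mechanism was used for testing, so the following is only a sketch of that generic interface. The function names are made up for illustration, and the mask helper is limited to CPUs 0 to 31 to keep it short.

```c
#include <stdio.h>

/*
 * Format the hex bitmask that selects a single CPU.
 * Sketch only: handles cpu 0..31. Returns 0 on success, -1 on error.
 */
int sketch_affinity_mask(unsigned int cpu, char *buf, size_t len)
{
	int n;

	if (cpu > 31)
		return -1;
	n = snprintf(buf, len, "%x", 1u << cpu);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write the mask for 'cpu' to /proc/irq/<irq>/smp_affinity.
 * Needs root. Returns 0 on success, -1 on error.
 */
int sketch_set_irq_affinity(unsigned int irq, unsigned int cpu)
{
	char path[64];
	char mask[16];
	FILE *f;
	int ok;

	if (sketch_affinity_mask(cpu, mask, sizeof(mask)) != 0)
		return -1;
	snprintf(path, sizeof(path), "/proc/irq/%u/smp_affinity", irq);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(mask, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

With the numbers from the quoted output, calling sketch_set_irq_affinity(21, 0) and sketch_set_irq_affinity(22, 1) would request different CPUs for the two MSIs, if the platform honours it as Ben describes.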
>
> Thanks!
>
>> LOC: 398077 316725 231882 203049 Local timer interrupts
>> SPU: 1659 919 961 903 Spurious interrupts
>> CNT: 0 0 0 0 Performance
>> monitoring interrupts
>> MCE: 0 0 0 0 Machine check exceptions
* [PATCH 1/2] powerpc, perf: Ignore separate BHRB privilege state filter request
From: Anshuman Khandual @ 2013-05-22 5:47 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey
In-Reply-To: <1369201667-9048-1-git-send-email-khandual@linux.vnet.ibm.com>
Completely ignore BHRB privilege state filter request as we are
already configuring MMCRA register with privilege state filtering
attribute for the accompanying PMU event. This would help achieve
cleaner user space interaction for BHRB.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/perf/power8-pmu.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index f7d1c4f..8ed323d 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -525,16 +525,17 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
u64 pmu_bhrb_filter = 0;
u64 br_privilege = branch_sample_type & ONLY_PLM;
- /* BHRB and regular PMU events share the same prvillege state
+ /* BHRB and regular PMU events share the same privilege state
* filter configuration. BHRB is always recorded along with a
- * regular PMU event. So privilege state filter criteria for BHRB
- * and the companion PMU events has to be the same. As a default
- * "perf record" tool sets all privillege bits ON when no filter
- * criteria is provided in the command line. So as along as all
- * privillege bits are ON or they are OFF, we are good to go.
+ * regular PMU event. So privilege state filter criteria for
+ * the BHRB and the companion PMU events has to be the same.
+ * Separate BHRB privilege state filter requests would be
+ * ignored.
*/
- if ((br_privilege != 7) && (br_privilege != 0))
- return -1;
+
+ if (br_privilege)
+ pr_info("BHRB privilege state filter request %llx ignored\n",
+ br_privilege);
/* No branch filter requested */
if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
--
1.7.11.7
* [PATCH 2/2] powerpc, perf: BHRB filter configuration should follow the task
From: Anshuman Khandual @ 2013-05-22 5:47 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey
In-Reply-To: <1369201667-9048-1-git-send-email-khandual@linux.vnet.ibm.com>
When the task moves around the system, the corresponding cpuhw
per-cpu structure should be populated with the BHRB filter
request value so that the PMU can be configured appropriately
during the next call into power_pmu_enable().
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/perf/core-book3s.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 426180b..48c68a8 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1122,8 +1122,11 @@ nocheck:
ret = 0;
out:
- if (has_branch_stack(event))
+ if (has_branch_stack(event)) {
power_pmu_bhrb_enable(event);
+ cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
+ event->attr.branch_sample_type);
+ }
perf_pmu_enable(event->pmu);
local_irq_restore(flags);
--
1.7.11.7
* [PATCH 0/2] Improvement and fixes for BHRB
From: Anshuman Khandual @ 2013-05-22 5:47 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey
(1) The first patch fixes a situation like this
Before patch:-
------------
./perf record -j any -e branch-misses:k ls
Error:
The sys_perf_event_open() syscall returned with 95 (Operation not supported) for event (branch-misses:k).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?
Here 'perf record' actually copies over ':k' filter request into BHRB
privilege state filter config and our previous check in kernel would
fail that.
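As a rough illustration of why ':k' tripped the old check, here is a standalone sketch of the before and after behaviour. It assumes the three privilege bits (user, kernel, hypervisor) occupy the low three bits of branch_sample_type, which matches the comparison against 7 in the old code; the SKETCH_* names and function names are invented for this example and are not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: u, k and hv are the three low bits (values 1, 2, 4). */
#define SKETCH_PLM_USER    0x1u
#define SKETCH_PLM_KERNEL  0x2u
#define SKETCH_PLM_HV      0x4u
#define SKETCH_ONLY_PLM    0x7u

/* Old rule: accept only if all privilege bits are set or none are. */
bool sketch_old_check_accepts(uint64_t branch_sample_type)
{
	uint64_t br_privilege = branch_sample_type & SKETCH_ONLY_PLM;

	return br_privilege == 7 || br_privilege == 0;
}

/*
 * New rule: never reject on privilege bits. Returns true when a
 * privilege request was present and is therefore ignored (the
 * patch logs this case with pr_info()).
 */
bool sketch_new_check_ignores(uint64_t branch_sample_type)
{
	return (branch_sample_type & SKETCH_ONLY_PLM) != 0;
}
```

A ':k' event sets only the kernel bit, so the old rule sees a value that is neither 0 nor 7 and fails the request; the new rule lets it through and notes that the separate BHRB privilege request is dropped.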
After patch:-
-------------
./perf record -j any -e branch-misses:k ls
perf perf.data perf.data.old test-mmap-ring
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~102 samples) ]
(2) The second patch fixes context migration for BHRB filter configuration
Anshuman Khandual (2):
powerpc, perf: Ignore separate BHRB privilege state filter request
powerpc, perf: BHRB filter configuration should follow the task
arch/powerpc/perf/core-book3s.c | 5 ++++-
arch/powerpc/perf/power8-pmu.c | 17 +++++++++--------
2 files changed, 13 insertions(+), 9 deletions(-)
--
1.7.11.7
* Re: [PATCH 1/1] powerpc: Force 32 bit MSIs on systems lacking firmware support
From: Benjamin Herrenschmidt @ 2013-05-22 4:44 UTC (permalink / raw)
To: Michael Ellerman; +Cc: Brian King, klebers, linuxppc-dev
In-Reply-To: <20130522043641.GB18345@concordia>
On Wed, 2013-05-22 at 14:36 +1000, Michael Ellerman wrote:
> This is basically baking knowledge of phyp's address layout into the
> kernel right? Which is OK, but it needs a big fat comment describing
> exactly what it's doing and why it's safe.
Not pHyp really but the HW, basically this should work with any IODA1
host bridge (P7IOC, Torrent, ...). The "assumption" here is that RTAS
MSI + PCIe Gen2 == IODA1 :-)
Cheers,
Ben.
* Re: [PATCH] powerpc/cell: Only iterate over online nodes in cbe_init_pm_irq()
From: Michael Ellerman @ 2013-05-22 4:39 UTC (permalink / raw)
To: Dennis Schridde; +Cc: linuxppc-dev
In-Reply-To: <5021824.e1KI2ptoeP@ernie>
On Fri, May 17, 2013 at 05:45:05PM +0200, Dennis Schridde wrote:
> Hello!
>
> Just wanted to remind you: The patch to fix cbe_init_pm_irq() that Michael and
> Grant sent me is still not included in Linux 3.8.12.
I didn't push that one to stable because it just fixes a warning. If you
want it you'll have to grab it yourself.
cheers
* Re: [PATCH 1/1] powerpc: Force 32 bit MSIs on systems lacking firmware support
From: Michael Ellerman @ 2013-05-22 4:36 UTC (permalink / raw)
To: Brian King; +Cc: klebers, linuxppc-dev
In-Reply-To: <201305212154.r4LLs4Zu026123@d01av03.pok.ibm.com>
On Tue, May 21, 2013 at 04:54:04PM -0500, Brian King wrote:
>
> Recent commit e61133dda480062d221f09e4fc18f66763f8ecd0 added support
> for a new firmware feature to force an adapter to use 32 bit MSIs.
> However, this firmware is not available for all systems. The hack below
> allows devices needing 32 bit MSIs to work on these systems as well.
> It is careful to only enable this on Gen2 slots, which should limit
> this to configurations where this hack is needed and tested to work.
Sorry I know you've already sent this to me once, but I didn't get time
to reply.
> diff -puN arch/powerpc/platforms/pseries/msi.c~powerpc_32bit_msi_hack_on_papr arch/powerpc/platforms/pseries/msi.c
> --- linux/arch/powerpc/platforms/pseries/msi.c~powerpc_32bit_msi_hack_on_papr 2013-05-15 10:44:46.000000000 -0500
> +++ linux-bjking1/arch/powerpc/platforms/pseries/msi.c 2013-05-20 15:24:52.000000000 -0500
> @@ -397,10 +397,11 @@ static int check_msix_entries(struct pci
> static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
> {
> struct pci_dn *pdn;
> - int hwirq, virq, i, rc;
> + int hwirq, virq, i, rc = -1;
I'd rather you didn't do a catch-all initialisation like this, it's too
easy to miss a return path.
> struct msi_desc *entry;
> struct msi_msg msg;
> int nvec = nvec_in;
> + int use_32bit_msi_hack = 0;
>
> pdn = get_pdn(pdev);
> if (!pdn)
> @@ -428,15 +429,37 @@ static int rtas_setup_msi_irqs(struct pc
> */
> again:
> if (type == PCI_CAP_ID_MSI) {
> - if (pdn->force_32bit_msi)
> + if (pdn->force_32bit_msi) {
> rc = rtas_change_msi(pdn, RTAS_CHANGE_32MSI_FN, nvec);
> - else
> + if (rc < 0) {
> + /* We only want to run the 32 bit MSI hack below if
> + the max bus speed is Gen2 speed. */
> + if (pdev->bus->max_bus_speed != PCIE_SPEED_5_0GT)
> + return rc;
> +
> + use_32bit_msi_hack = 1;
> + }
> + }
> +
> + if (rc < 0)
> rc = rtas_change_msi(pdn, RTAS_CHANGE_MSI_FN, nvec);
>
> - if (rc < 0 && !pdn->force_32bit_msi) {
> + if (rc < 0) {
> pr_debug("rtas_msi: trying the old firmware call.\n");
> rc = rtas_change_msi(pdn, RTAS_CHANGE_FN, nvec);
> }
> +
> + if (use_32bit_msi_hack && rc > 0) {
> + int pos;
> + u32 addr_hi, addr_lo;
> +
> + dev_info(&pdev->dev, "rtas_msi: No 32 bit MSI firmware support, forcing 32 bit MSI\n");
> + pos = pci_find_capability(pdev, PCI_CAP_ID_MSI);
> + pci_read_config_dword(pdev, pos + PCI_MSI_ADDRESS_HI, &addr_hi);
> + addr_lo = 0xffff0000 | ((addr_hi >> (48 - 32)) << 4);
> + pci_write_config_dword(pdev, pos + PCI_MSI_ADDRESS_LO, addr_lo);
> + pci_write_config_dword(pdev, pos + PCI_MSI_ADDRESS_HI, 0);
This is basically baking knowledge of phyp's address layout into the
kernel right? Which is OK, but it needs a big fat comment describing
exactly what it's doing and why it's safe.
cheers
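For anyone trying to follow the address rewrite in the quoted hunk, a small worked sketch of just the arithmetic may help. It says nothing about why the layout is valid on a given host bridge (that is the comment being asked for); it only shows what the expression computes. The function name is made up for illustration.

```c
#include <stdint.h>

/*
 * Same expression as in the quoted patch: take bits 16 and up of the
 * high address word (bits 48 and up of the 64-bit MSI address), move
 * them to bit 4 of the low word, and OR in 0xffff0000.
 */
uint32_t sketch_msi_addr_lo(uint32_t addr_hi)
{
	return 0xffff0000u | ((addr_hi >> (48 - 32)) << 4);
}
```

For example, addr_hi = 0x00030000 gives 0x3 after the right shift, 0x30 after the left shift, and 0xffff0030 as the new low word; the patch then writes 0 to the high word.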
* Re: SATA hang on 8315E triggered by heavy flash write?
From: Anthony Foiani @ 2013-05-22 4:16 UTC (permalink / raw)
To: Scott Wood; +Cc: linuxppc-dev, Shaohui.Xie
In-Reply-To: <1369172643.1374.15@scott-Lenovo-G560>
Scott --
Scott Wood <scottwood@freescale.com> writes:
> On 05/15/2013 03:12:21 AM, Anthony Foiani wrote:
>> At this point, /dev/sda is pretty much unusable, and I have to do
>> at least a reboot to recover. (I don't recall if I had to do a
>> power cycle at this point, though.)
For whatever it's worth, a hard boot (full power cycle) is indeed
necessary at this point.
>> I suspect that it is related to errata eLBC-A001 (from MPC8315E
>> Chip Errata, Rev. 3, 09/2011):
>> ...
>> But it seems that erratum is already fixed:
>>
>> http://patchwork.ozlabs.org/patch/96339/
>> (git patch d08e44570e)
>>
>> Am I reading that correctly?
>
> Yes, that erratum has been worked around.
Ok, thanks for the confirmation.
>> (I'm already writing only one flash sector at a time, but it might
>> be that even a single 0x10000-byte sector takes long enough to
>> trigger the issue.)
>
> I don't think this erratum is relevant. Unlike NAND, NOR flash does
> not involve holding the localbus for extended periods of time.
I wasn't sure about the mechanism of the erratum, and it seemed
awfully close, so I thought I'd go fishing. Guess I missed. :(
It is NOR writes, btw; I do both in my application, but the initial
error always seems to occur during a NOR write. (In this device,
kernel + devtree go into NOR flash, ramdisk goes into NAND flash, and
data goes to SSD... stop laughing.)
Here's the most recent hang. First, to compare the application log
timestamps with the kernel log timestamps:
# mix of kernel and application log, note that kernel is about +12s.
+0.537506 main.0 [0]: rc: fork took 9.376ms
[ 12.892323] PHY: mdio@e0024520:01 - Link is Up - 100/Full
+1.603034 main.0 [0]: schs: ctor: done
The console output is:
# console log
[318334.294126] ata2.00: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0xe frozen
[318334.301515] ata2.00: PHY RDY changed
[318334.305301] ata2.00: failed command: WRITE DMA
[318334.309991] ata2.00: cmd ca/00:08:b0:00:18/00:00:00:00:00/e1 tag 0 dma 4096 out
[318334.310015] res 50/00:00:08:61:25/00:00:00:00:00/e1 Emask 0x10 (ATA bus error)
[318334.325689] ata2.00: status: { DRDY }
[318334.329717] ata2: hard resetting link
[318334.836038] ata2: Hardreset failed, not off-lined 0
[318334.848407] ata2: setting speed (in hard reset)
[318344.456050] ata2: No Signature Update
[318344.631916] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[318344.638354] ata2.00: link online but device misclassified
[318349.643897] ata2.00: qc timeout (cmd 0xec)
[318349.648268] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[318349.654562] ata2.00: revalidation failed (errno=-5)
[318349.659667] ata2: hard resetting link
[318350.163864] ata2: Hardreset failed, not off-lined 0
[318350.175869] ata2: setting speed (in hard reset)
[318359.771956] ata2: No Signature Update
[318359.947901] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[318359.954342] ata2.00: link online but device misclassified
[318369.959921] ata2.00: qc timeout (cmd 0xec)
[318369.964279] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[318369.970567] ata2.00: revalidation failed (errno=-5)
[318369.975658] ata2: hard resetting link
[318370.479933] ata2: Hardreset failed, not off-lined 0
[318370.491880] ata2: setting speed (in hard reset)
[318380.083892] ata2: No Signature Update
And my application log:
# application log
+318320.957019 sw-upd.0 [29]: fm: nor0: write: writing 0x10000 @0x180000 from buf[0x80000]; attempt 1/3
+318322.498346 sw-upd.0 [29]: fm: nor0: write: writing 0x10000 @0x190000 from buf[0x90000]; attempt 1/3
+318323.849995 sw-upd.0 [29]: fm: nor0: write: writing 0x10000 @0x1a0000 from buf[0xa0000]; attempt 1/3
+318325.262559 sw-upd.0 [29]: fm: nor0: write: writing 0x10000 @0x1b0000 from buf[0xb0000]; attempt 1/3
+318326.703213 sw-upd.0 [29]: fm: nor0: write: writing 0x10000 @0x1c0000 from buf[0xc0000]; attempt 1/3
> I also don't see how it would interact with SATA, which is separate
> from the localbus.
No idea. Is there some other shared resource that might be taxed by
this type of load?
I do get a few other errors, usually just once or twice per boot:
[ 4231.619368] NOHZ: local_softirq_pending 100
[ 4232.249935] NOHZ: local_softirq_pending 100
[ 4232.312241] NOHZ: local_softirq_pending 100
[ 4232.424523] NOHZ: local_softirq_pending 100
[ 4233.139146] NOHZ: local_softirq_pending 100
[ 4233.328540] NOHZ: local_softirq_pending 100
[ 4233.655909] NOHZ: local_softirq_pending 100
[ 4234.106578] NOHZ: local_softirq_pending 100
[ 4234.853966] NOHZ: local_softirq_pending 100
[ 4235.375208] NOHZ: local_softirq_pending 100
[11072.027818] hrtimer: interrupt took 126210 ns
They seem harmless, though, and (as the timestamps indicate) the
machine happily ran for 3-4 days after those issues.
> Are you seeing any errors on the localbus, or just on SATA?
I'm not seeing any errors in the console log -- but I'm not using the
LBC for anything other than flash writes, SFAIK. (Unless I2C is
handled through the LBC, in which case, I have frequent (~50-100/s)
small transactions all the time -- but the hangs always coincide with
flash writes, and not with the I2C traffic that is going on all the
time...)
> Hopefully Shaohui (our SATA person) can answer these. If you don't
> get an answer, go ahead and open an official support request.
I have a (lousy) workaround in hand: don't touch the disk during flash
updates. (The flash writes are software updates, which will hopefully
be fairly rare once I'm done developing this thing. Until then,
though, I'm updating it multiple times a day, and have hit this quite
a few times by now.)
So there's no great hurry. If Shaohui can find something in the next
week or so, that'd be fantastic; otherwise, I'll open a request.
Thanks again!
Best regards,
Anthony Foiani
* [PATCH 2/2 v3] powerpc: restore dbcr0 on user space exit
From: Bharat Bhushan @ 2013-05-22 4:20 UTC (permalink / raw)
To: galak, benh, linuxppc-dev, scottwood, stuart.yoder, james.yang
Cc: Bharat Bhushan
In-Reply-To: <1369196459-17275-1-git-send-email-Bharat.Bhushan@freescale.com>
On BookE, (Branch Taken + Single Step) is the same as Branch Taken
on BookS, and in Linux we simulate BookS behavior for BookE as well.
When doing so, in Branch Taken handling we want to set DBCR0_IC, but
we update current->thread.dbcr0 and not DBCR0.
On 64-bit, current->thread.dbcr0 (and the other debug registers)
is synchronized ONLY in the context switch flow. So if we return to
user space after handling Branch Taken in the debug exception
without a context switch, the single stepping change (DBCR0_ICMP)
is not written to the h/w DBCR0 and the Instruction Complete
exception does not happen.
This fix makes ptrace work reliably on BookE PowerPC.
lmbench latency test (lat_syscall) results (they vary a little
on each run):
1) ./lat_syscall <action> /dev/shm/uImage
action:  Open    read    write   stat    fstat   null
Before:  3.8618  0.2017  0.2851  1.6789  0.2256  0.0856
After:   3.8580  0.2017  0.2851  1.6955  0.2255  0.0856
2) ./lat_syscall -P 2 -N 10 <action> /dev/shm/uImage
action:  Open    read    write   stat    fstat   null
Before:  4.1388  0.2238  0.3066  1.7106  0.2256  0.0856
After:   4.1413  0.2236  0.3062  1.7107  0.2256  0.0856
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v2->v3
- Load PACACURRENT immediately after _MSR(r1), and load DBCR0
just after "beq resume_kernel"
- Added lat_syscall results before and after the patch
v1->v2
- Subject line was missing 1/2
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/entry_64.S | 28 ++++++++++++++++++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index b51a97c..1e2f450 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -103,6 +103,7 @@ int main(void)
#endif /* CONFIG_VSX */
#ifdef CONFIG_PPC64
DEFINE(KSP_VSID, offsetof(struct thread_struct, ksp_vsid));
+ DEFINE(THREAD_DBCR0, offsetof(struct thread_struct, dbcr0));
#else /* CONFIG_PPC64 */
DEFINE(PGDIR, offsetof(struct thread_struct, pgdir));
#if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 794889b..5b91d27 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -622,21 +622,41 @@ _GLOBAL(ret_from_except_lite)
CURRENT_THREAD_INFO(r9, r1)
ld r3,_MSR(r1)
+#ifdef CONFIG_PPC_BOOK3E
+ ld r10,PACACURRENT(r13)
+#endif /* CONFIG_PPC_BOOK3E */
ld r4,TI_FLAGS(r9)
andi. r3,r3,MSR_PR
beq resume_kernel
+#ifdef CONFIG_PPC_BOOK3E
+ lwz r3,(THREAD+THREAD_DBCR0)(r10)
+#endif /* CONFIG_PPC_BOOK3E */
/* Check current_thread_info()->flags */
andi. r0,r4,_TIF_USER_WORK_MASK
+ bne 1f
+#ifdef CONFIG_PPC_BOOK3E
+ /*
+ * Check to see if the dbcr0 register is set up to debug.
+ * Use the internal debug mode bit to do this.
+ */
+ andis. r0,r3,DBCR0_IDM@h
beq restore
-
- andi. r0,r4,_TIF_NEED_RESCHED
- beq 1f
+ mfmsr r0
+ rlwinm r0,r0,0,~MSR_DE /* Clear MSR.DE */
+ mtmsr r0
+ mtspr SPRN_DBCR0,r3
+ li r10, -1
+ mtspr SPRN_DBSR,r10
+ b restore
+#endif
+1: andi. r0,r4,_TIF_NEED_RESCHED
+ beq 2f
bl .restore_interrupts
SCHEDULE_USER
b .ret_from_except_lite
-1: bl .save_nvgprs
+2: bl .save_nvgprs
bl .restore_interrupts
addi r3,r1,STACK_FRAME_OVERHEAD
bl .do_notify_resume
--
1.7.0.4
^ permalink raw reply related
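The exit-path decision described in the patch above can be modeled in plain C. This is only a sketch of the control flow, not the kernel code: the bit values, `struct hw_regs`, and the function name are assumptions made for illustration, and the hardware registers are replaced by ordinary variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values; assumptions for this sketch only. */
#define MSR_PR     0x00004000u  /* problem state: returning to user space */
#define MSR_DE     0x00000200u  /* debug interrupts enabled */
#define DBCR0_IDM  0x40000000u  /* internal debug mode */
#define DBCR0_IC   0x08000000u  /* instruction completion (single step) */

/* Hypothetical stand-in for the registers touched on the exit path. */
struct hw_regs {
    uint32_t msr;
    uint32_t dbcr0;
    uint32_t dbsr;
};

/*
 * Model of the fast exit path. Returns true when the thread's saved
 * dbcr0 was copied into the (simulated) hardware DBCR0.
 */
static bool restore_dbcr0_on_user_exit(struct hw_regs *hw, uint32_t saved_msr,
                                       uint32_t ti_flags, uint32_t work_mask,
                                       uint32_t thread_dbcr0)
{
    assert(hw != NULL);

    if (!(saved_msr & MSR_PR))
        return false;           /* returning to the kernel: nothing to do */
    if (ti_flags & work_mask)
        return false;           /* pending user work takes the other path */
    if (!(thread_dbcr0 & DBCR0_IDM))
        return false;           /* thread is not set up for debugging */

    hw->msr &= ~MSR_DE;         /* clear MSR.DE before touching DBCR0 */
    hw->dbcr0 = thread_dbcr0;   /* corresponds to mtspr SPRN_DBCR0,r3 */
    hw->dbsr = 0;               /* writing all ones clears the status bits */
    return true;
}
```

Calling the function with `saved_msr = MSR_PR`, no work flags, and `thread_dbcr0 = DBCR0_IDM | DBCR0_IC` leaves the single-step bit set in the simulated DBCR0, which is the case the commit message says was previously lost without a context switch.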
* [PATCH 1/2 v3] powerpc: debug control and status registers are 32bit
From: Bharat Bhushan @ 2013-05-22 4:20 UTC (permalink / raw)
To: galak, benh, linuxppc-dev, scottwood, stuart.yoder, james.yang
Cc: Bharat Bhushan
In-Reply-To: <1369196459-17275-1-git-send-email-Bharat.Bhushan@freescale.com>
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
---
v2->v3
- No change
v1->v2
- Subject line was missing 1/2
arch/powerpc/include/asm/processor.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index d7e67ca..5213577 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -168,10 +168,10 @@ struct thread_struct {
* The following help to manage the use of Debug Control Registers
* om the BookE platforms.
*/
- unsigned long dbcr0;
- unsigned long dbcr1;
+ uint32_t dbcr0;
+ uint32_t dbcr1;
#ifdef CONFIG_BOOKE
- unsigned long dbcr2;
+ uint32_t dbcr2;
#endif
/*
* The stored value of the DBSR register will be the value at the
@@ -179,7 +179,7 @@ struct thread_struct {
* user (will never be written to) and has value while helping to
* describe the reason for the last debug trap. Torez
*/
- unsigned long dbsr;
+ uint32_t dbsr;
/*
* The following will contain addresses used by debug applications
* to help trace and trap on particular address locations.
--
1.7.0.4
^ permalink raw reply related
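One way to read the width change above together with the `lwz` used in patch 2/2: on a big-endian 64-bit kernel, a 32-bit word load at the address of a 64-bit field returns the upper half of that field. The sketch below models this with a byte buffer so it runs on any host. The helper names are hypothetical, and the link to patch 2/2 is an interpretation, not something the cover letter states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value in big-endian byte order. */
static void store_be64(uint8_t *buf, uint64_t v)
{
    assert(buf != NULL);
    for (int i = 0; i < 8; i++)
        buf[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Store a 32-bit value in big-endian byte order. */
static void store_be32(uint8_t *buf, uint32_t v)
{
    assert(buf != NULL);
    for (int i = 0; i < 4; i++)
        buf[i] = (uint8_t)(v >> (24 - 8 * i));
}

/* Model of a 32-bit word load (like lwz) from a big-endian buffer. */
static uint32_t load_word_be(const uint8_t *buf)
{
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}
```

With a value such as 0x40000000 stored through `store_be64`, `load_word_be` at the same address yields 0; stored through `store_be32`, it yields 0x40000000.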
* [PATCH 0/2 v3] powerpc: Make ptrace work reliably
From: Bharat Bhushan @ 2013-05-22 4:20 UTC (permalink / raw)
To: galak, benh, linuxppc-dev, scottwood, stuart.yoder, james.yang
Cc: Bharat Bhushan
From: Bharat Bhushan <bharat.bhushan@freescale.com>
v2->v3
- Load PACACURRENT immediately after _MSR(r1), and load DBCR0
just after "beq resume_kernel"
- Added lat_syscall results before and after the patch
v1->v2
- Subject line was missing 0/2, 1/2, 2/2
Bharat Bhushan (2):
powerpc: debug control and status registers are 32bit
=> This patch makes the debug control and status registers 32-bit, which is
their actual width. This does not fix anything.
powerpc: restore dbcr0 on user space exit
=> This patch fixes the ptrace reliability issue. The description in the patch
describes one of the cases where it does not work reliably.
arch/powerpc/include/asm/processor.h | 8 ++++----
arch/powerpc/kernel/asm-offsets.c | 1 +
arch/powerpc/kernel/entry_64.S | 28 ++++++++++++++++++++++++----
3 files changed, 29 insertions(+), 8 deletions(-)
^ permalink raw reply
* Re: PROBLEM: Only 2 of 4 cores used on IBM Cell blades and no threads shown in spufs
From: Michael Ellerman @ 2013-05-22 3:37 UTC (permalink / raw)
To: Dennis Schridde; +Cc: cbe-oss-dev, linuxppc-dev, arnd
In-Reply-To: <1401780.1HKOhdHcrh@ernie>
On Fri, May 17, 2013 at 05:46:52PM +0200, Dennis Schridde wrote:
> Hello!
>
> Am Dienstag, 23. April 2013, 19:12:47 schrieb Michael Ellerman:
> > For me it is fixed by applying the following patch, it should be in v3.10:
> >
> > http://patchwork.ozlabs.org/patch/230103/
>
> Can you please also backport this to 3.8? It is still missing in 3.8.12.
It's in 3.8.13.
cheers
^ permalink raw reply
* [PATCH] powerpc: Context switch more PMU related SPRs
From: Michael Ellerman @ 2013-05-22 2:31 UTC (permalink / raw)
To: linuxppc-dev
In commit 9353374 "Context switch the new EBB SPRs" we added support for
context switching some new EBB SPRs. However despite four of us signing
off on that patch we missed some. To be fair these are not actually new
SPRs, but they are now potentially user accessible so need to be context
switched.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
---
arch/powerpc/include/asm/processor.h | 6 ++++++
arch/powerpc/kernel/asm-offsets.c | 6 ++++++
arch/powerpc/kernel/entry_64.S | 28 ++++++++++++++++++++++++++++
3 files changed, 40 insertions(+)
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index d7e67ca..594db6b 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -284,6 +284,12 @@ struct thread_struct {
unsigned long ebbrr;
unsigned long ebbhr;
unsigned long bescr;
+ unsigned long siar;
+ unsigned long sdar;
+ unsigned long sier;
+ unsigned long mmcr0;
+ unsigned long mmcr2;
+ unsigned long mmcra;
#endif
};
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index b51a97c..6f16ffa 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -127,6 +127,12 @@ int main(void)
DEFINE(THREAD_BESCR, offsetof(struct thread_struct, bescr));
DEFINE(THREAD_EBBHR, offsetof(struct thread_struct, ebbhr));
DEFINE(THREAD_EBBRR, offsetof(struct thread_struct, ebbrr));
+ DEFINE(THREAD_SIAR, offsetof(struct thread_struct, siar));
+ DEFINE(THREAD_SDAR, offsetof(struct thread_struct, sdar));
+ DEFINE(THREAD_SIER, offsetof(struct thread_struct, sier));
+ DEFINE(THREAD_MMCR0, offsetof(struct thread_struct, mmcr0));
+ DEFINE(THREAD_MMCR2, offsetof(struct thread_struct, mmcr2));
+ DEFINE(THREAD_MMCRA, offsetof(struct thread_struct, mmcra));
#endif
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
DEFINE(PACATMSCRATCH, offsetof(struct paca_struct, tm_scratch));
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 51cfb8f..0e9095e 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -465,6 +465,20 @@ BEGIN_FTR_SECTION
std r0, THREAD_EBBHR(r3)
mfspr r0, SPRN_EBBRR
std r0, THREAD_EBBRR(r3)
+
+ /* PMU registers made user read/(write) by EBB */
+ mfspr r0, SPRN_SIAR
+ std r0, THREAD_SIAR(r3)
+ mfspr r0, SPRN_SDAR
+ std r0, THREAD_SDAR(r3)
+ mfspr r0, SPRN_SIER
+ std r0, THREAD_SIER(r3)
+ mfspr r0, SPRN_MMCR0
+ std r0, THREAD_MMCR0(r3)
+ mfspr r0, SPRN_MMCR2
+ std r0, THREAD_MMCR2(r3)
+ mfspr r0, SPRN_MMCRA
+ std r0, THREAD_MMCRA(r3)
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
#endif
@@ -560,6 +574,20 @@ BEGIN_FTR_SECTION
ld r0, THREAD_EBBRR(r4)
mtspr SPRN_EBBRR, r0
+ /* PMU registers made user read/(write) by EBB */
+ ld r0, THREAD_SIAR(r4)
+ mtspr SPRN_SIAR, r0
+ ld r0, THREAD_SDAR(r4)
+ mtspr SPRN_SDAR, r0
+ ld r0, THREAD_SIER(r4)
+ mtspr SPRN_SIER, r0
+ ld r0, THREAD_MMCR0(r4)
+ mtspr SPRN_MMCR0, r0
+ ld r0, THREAD_MMCR2(r4)
+ mtspr SPRN_MMCR2, r0
+ ld r0, THREAD_MMCRA(r4)
+ mtspr SPRN_MMCRA, r0
+
ld r0,THREAD_TAR(r4)
mtspr SPRN_TAR,r0
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
--
1.7.10.4
^ permalink raw reply related
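The save and restore sequences added by the patch above amount to copying six registers out to the outgoing thread and loading them from the incoming one. A minimal C model follows, assuming a single simulated register file per CPU; `struct pmu_regs` and `pmu_context_switch` are hypothetical names, and the real code uses mfspr/mtspr on the SPRs rather than struct copies.

```c
#include <assert.h>
#include <stddef.h>

/* The six registers the patch adds to the context-switch path. */
struct pmu_regs {
    unsigned long siar, sdar, sier, mmcr0, mmcr2, mmcra;
};

/*
 * Hypothetical model: 'cpu' is the live register state, 'prev' is the
 * outgoing thread's saved copy, 'next' is the incoming thread's copy.
 */
static void pmu_context_switch(struct pmu_regs *cpu, struct pmu_regs *prev,
                               const struct pmu_regs *next)
{
    assert(cpu != NULL && prev != NULL && next != NULL);
    *prev = *cpu;   /* mfspr + std for each register */
    *cpu = *next;   /* ld + mtspr for each register */
}
```

Without the save step, whatever the outgoing thread wrote to these registers from user space would be visible to, or overwritten by, the next thread, which is the gap the commit message describes.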
* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Benjamin Herrenschmidt @ 2013-05-22 0:15 UTC (permalink / raw)
To: Alexander Gordeev; +Cc: linuxppc-dev, tglx, Mike Qiu, linux-kernel
In-Reply-To: <20130521144548.GB21632@dhcp-26-207.brq.redhat.com>
On Tue, 2013-05-21 at 16:45 +0200, Alexander Gordeev wrote:
> On Tue, Jan 15, 2013 at 03:38:53PM +0800, Mike Qiu wrote:
> > The test results are shown by 'cat /proc/interrupts':
> > CPU0 CPU1 CPU2 CPU3
> > 16: 240458 261601 226310 200425 XICS Level IPI
> > 17: 0 0 0 0 XICS Level RAS_EPOW
> > 18: 10 0 3 2 XICS Level hvc_console
> > 19: 122182 28481 28527 28864 XICS Level ibmvscsi
> > 20: 506 7388226 108 118 XICS Level eth0
> > 21: 6 5 5 5 XICS Level host1-0
> > 22: 817 814 816 813 XICS Level host1-1
>
> Hi Mike,
>
> I am curious if pSeries firmware allows changing affinity masks independently
> for multiple MSIs? I.e. in your example, would it be possible to assign IRQ21
> and IRQ22 to different CPUs?
Yes. Each interrupt has its own affinity, whether it's an MSI or not;
the affinity is not driven by the address.
Cheers,
Ben.
> Thanks!
>
> > LOC: 398077 316725 231882 203049 Local timer interrupts
> > SPU: 1659 919 961 903 Spurious interrupts
> > CNT: 0 0 0 0 Performance monitoring interrupts
> > MCE: 0 0 0 0 Machine check exceptions
>
^ permalink raw reply
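Since each interrupt carries its own affinity, IRQ21 and IRQ22 from the example could in principle be steered to different CPUs through the usual procfs interface, where /proc/irq/<n>/smp_affinity takes a hexadecimal CPU bitmask. The sketch below only formats the path and the mask; the helper names are hypothetical, nothing is written to procfs, and whether a given platform accepts the write is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the procfs path of an IRQ's affinity mask file. */
static int irq_affinity_path(char *buf, size_t len, unsigned int irq)
{
    assert(buf != NULL);
    return snprintf(buf, len, "/proc/irq/%u/smp_affinity", irq);
}

/* Format a hex bitmask selecting one CPU (CPU numbers below 32 only). */
static int cpu_mask_hex(char *buf, size_t len, unsigned int cpu)
{
    assert(buf != NULL);
    assert(cpu < 32);
    return snprintf(buf, len, "%x", 1u << cpu);
}
```

For instance, IRQ 21 with CPU 1 gives the path /proc/irq/21/smp_affinity and the mask 2, while IRQ 22 with CPU 2 gives /proc/irq/22/smp_affinity and the mask 4.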
* [PATCH 1/1] powerpc: Force 32 bit MSIs on systems lacking firmware support
From: Brian King @ 2013-05-21 21:54 UTC (permalink / raw)
To: benh; +Cc: klebers, brking, linuxppc-dev
Recent commit e61133dda480062d221f09e4fc18f66763f8ecd0 added support
for a new firmware feature to force an adapter to use 32 bit MSIs.
However, this firmware is not available for all systems. The hack below
allows devices needing 32 bit MSIs to work on these systems as well.
It is careful to only enable this on Gen2 slots, which should limit
this to configurations where this hack is needed and tested to work.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/msi.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
diff -puN arch/powerpc/platforms/pseries/msi.c~powerpc_32bit_msi_hack_on_papr arch/powerpc/platforms/pseries/msi.c
--- linux/arch/powerpc/platforms/pseries/msi.c~powerpc_32bit_msi_hack_on_papr 2013-05-15 10:44:46.000000000 -0500
+++ linux-bjking1/arch/powerpc/platforms/pseries/msi.c 2013-05-20 15:24:52.000000000 -0500
@@ -397,10 +397,11 @@ static int check_msix_entries(struct pci
static int rtas_setup_msi_irqs(struct pci_dev *pdev, int nvec_in, int type)
{
struct pci_dn *pdn;
- int hwirq, virq, i, rc;
+ int hwirq, virq, i, rc = -1;
struct msi_desc *entry;
struct msi_msg msg;
int nvec = nvec_in;
+ int use_32bit_msi_hack = 0;
pdn = get_pdn(pdev);
if (!pdn)
@@ -428,15 +429,37 @@ static int rtas_setup_msi_irqs(struct pc
*/
again:
if (type == PCI_CAP_ID_MSI) {
- if (pdn->force_32bit_msi)
+ if (pdn->force_32bit_msi) {
rc = rtas_change_msi(pdn, RTAS_CHANGE_32MSI_FN, nvec);
- else
+ if (rc < 0) {
+ /* We only want to run the 32 bit MSI hack below if
+ the max bus speed is Gen2 speed. */
+ if (pdev->bus->max_bus_speed != PCIE_SPEED_5_0GT)
+ return rc;
+
+ use_32bit_msi_hack = 1;
+ }
+ }
+
+ if (rc < 0)
rc = rtas_change_msi(pdn, RTAS_CHANGE_MSI_FN, nvec);
- if (rc < 0 && !pdn->force_32bit_msi) {
+ if (rc < 0) {
pr_debug("rtas_msi: trying the old firmware call.\n");
rc = rtas_change_msi(pdn, RTAS_CHANGE_FN, nvec);
}
+
+ if (use_32bit_msi_hack && rc > 0) {
+ int pos;
+ u32 addr_hi, addr_lo;
+
+ dev_info(&pdev->dev, "rtas_msi: No 32 bit MSI firmware support, forcing 32 bit MSI\n");
+ pos = pci_find_capability(pdev, PCI_CAP_ID_MSI);
+ pci_read_config_dword(pdev, pos + PCI_MSI_ADDRESS_HI, &addr_hi);
+ addr_lo = 0xffff0000 | ((addr_hi >> (48 - 32)) << 4);
+ pci_write_config_dword(pdev, pos + PCI_MSI_ADDRESS_LO, addr_lo);
+ pci_write_config_dword(pdev, pos + PCI_MSI_ADDRESS_HI, 0);
+ }
} else
rc = rtas_change_msi(pdn, RTAS_CHANGE_MSIX_FN, nvec);
_
^ permalink raw reply
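The address rewrite in the hack above is a single expression, and a small worked example makes its effect easier to see. The function below copies that expression from the patch; the function name is hypothetical and the sample inputs are arbitrary values chosen for illustration, not addresses observed on real hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Same expression as in the patch: take the bits of the upper address
 * word above bit 15, shift them left by 4, and place them under a
 * fixed 0xffff0000 prefix to form the new 32-bit MSI address.
 */
static uint32_t msi_addr_lo_from_hi(uint32_t addr_hi)
{
    return 0xffff0000u | ((addr_hi >> (48 - 32)) << 4);
}
```

An upper word of 0x00010000 produces 0xffff0010, and 0x00ff0000 produces 0xffff0ff0. The patch then writes this value to PCI_MSI_ADDRESS_LO and zeroes PCI_MSI_ADDRESS_HI.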
* Re: SATA hang on 8315E triggered by heavy flash write?
From: Scott Wood @ 2013-05-21 21:44 UTC (permalink / raw)
To: Anthony Foiani; +Cc: linuxppc-dev, Shaohui.Xie
In-Reply-To: <gsj1oirve.fsf@dworkin.scrye.com>
On 05/15/2013 03:12:21 AM, Anthony Foiani wrote:
> At this point, /dev/sda is pretty much unusable, and I have to do at
> least a reboot to recover. (I don't recall if I had to do a power
> cycle at this point, though.)
>
> I suspect that it is related to errata eLBC-A001 (from MPC8315E Chip
> Errata, Rev. 3, 09/2011):
>
> eLBC-A001:
>
> Simultaneous FCM and GPCM or UPM operation may erroneously trigger
> bus monitor timeout
>
> Description: Devices: MPC8315E, MPC8314E
> When the FCM is in the middle of a long transaction, such as NAND
> erase or write, another transaction on the GPCM or UPM triggers the
> bus monitor to start immediately for the GPCM or UPM, even though
> the GPCM or UPM is still waiting for the FCM to finish and has not
> yet started its transaction. If the bus monitor timeout value is not
> programmed for a sufficiently large value, the local bus monitor may
> time out. This timeout corrupts the current NAND Flash operation and
> terminate the GPCM or UPM operation.
>
> Impact: Local bus monitor may time out unexpectedly and corrupt the
> NAND transaction.
>
> Workaround: Set the local bus monitor timeout value to the maximum
> by setting LBCR[BMT] = 0 and LBCR[BMTPS] = 0xF.
>
> Fix plan: No plans to fix
>
> But it seems that erratum is already fixed:
>
> http://patchwork.ozlabs.org/patch/96339/
> (git patch d08e44570e)
>
> Am I reading that correctly?
Yes, that erratum has been worked around.
> (I'm already writing only one flash
> sector at a time, but it might be that even a single 0x10000-byte
> sector takes long enough to trigger the issue.)
I don't think this erratum is relevant. Unlike NAND, NOR flash does
not involve holding the localbus for extended periods of time. I also
don't see how it would interact with SATA, which is separate from the
localbus. Are you seeing any errors on the localbus, or just on SATA?
> I also verified that
> I have the relevant property in my device tree:
>
> localbus@e0005000 {
> ...
> compatible = "fsl,mpc8315-elbc", "fsl,elbc", "simple-bus";
>
> So, my questions are:
>
> 1. Is anyone else seeing something like this?
>
> 2. Is there an obvious way for our code to detect that we're in the
> middle of error recovery, so we can not write to the disk until
> recovery is complete?
>
> 3. Is there any chance that the 1.5Gbps limiting code might have
> exacerbated the problems?
>
> 4. Should I open a support request with Freescale, or if someone from
> Freescale is already reading this, could you look to see if anyone
> else has reported it?
Hopefully Shaohui (our SATA person) can answer these. If you don't get
an answer, go ahead and open an official support request.
-Scott
^ permalink raw reply
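The eLBC-A001 workaround quoted above (LBCR[BMT] = 0 and LBCR[BMTPS] = 0xF) is a read-modify-write of one register. A sketch of that bit manipulation follows, operating on a plain integer rather than the memory-mapped register. The mask positions (BMT in bits 8 to 15, BMTPS in bits 0 to 3) and the function name are assumptions for this example and should be checked against the chip's reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field masks for this sketch. */
#define LBCR_BMT    0x0000ff00u  /* bus monitor timing */
#define LBCR_BMTPS  0x0000000fu  /* bus monitor timer prescale */

/* Return the LBCR value with the bus monitor timeout at its maximum. */
static uint32_t lbcr_max_bus_monitor_timeout(uint32_t lbcr)
{
    lbcr &= ~LBCR_BMT;    /* BMT = 0 */
    lbcr |= LBCR_BMTPS;   /* BMTPS = 0xF */
    return lbcr;
}
```

Starting from 0x0004ab03, the result is 0x0004000f: the other bits of the register are preserved and only the two timeout fields change.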
* Re: Build failure with 3.9.3 (Regression 3.9.2->3.9.3)
From: Srivatsa S. Bhat @ 2013-05-21 19:32 UTC (permalink / raw)
To: Adam Lackorzynski; +Cc: Robert Jennings, linuxppc-dev, nfont
In-Reply-To: <20130520151411.GD27420@os.inf.tu-dresden.de>
On 05/20/2013 08:44 PM, Adam Lackorzynski wrote:
> Hi,
>
> 3.9.3 introduced the following build failure:
>
> CC arch/powerpc/kernel/rtas.o
> arch/powerpc/kernel/rtas.c: In function ‘rtas_cpu_state_change_mask’:
> arch/powerpc/kernel/rtas.c:843:4: error: implicit declaration of function ‘cpu_down’ [-Werror=implicit-function-declaration]
> cc1: all warnings being treated as errors
> make[1]: *** [arch/powerpc/kernel/rtas.o] Error 1
> make: *** [arch/powerpc/kernel] Error 2
>
> My kernel config has CONFIG_HOTPLUG_CPU off, that's why cpu_down is not
> defined. Shall CONFIG_HOTPLUG_CPU just be enabled or should the code in
> rtas.c be adapted?
>
I think we should just enable CONFIG_HOTPLUG_CPU. I don't see any other
solution to this problem. The changelog of the below (untested) patch
explains the reasoning. (BTW, I'm not sure if this is the best way to
alter the Kconfig in order to enable both HOTPLUG and HOTPLUG_CPU. If
there is a better way to do it, let's go for it).
Also, this patch applies on current mainline. We need a separate backport
for 3.9 (because current mainline has a new line - "select HAVE_CONTEXT_TRACKING"
which is not present in 3.9, and this interferes with the patch).
Regards,
Srivatsa S. Bhat
-------------------------------------------------------------------------
From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Subject: [PATCH] powerpc: Enable CONFIG_HOTPLUG_CPU on PPC_PSERIES SMP builds
Adam Lackorzynski reported the following build failure on
!CONFIG_HOTPLUG_CPU configuration:
CC arch/powerpc/kernel/rtas.o
arch/powerpc/kernel/rtas.c: In function ‘rtas_cpu_state_change_mask’:
arch/powerpc/kernel/rtas.c:843:4: error: implicit declaration of function ‘cpu_down’ [-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors
make[1]: *** [arch/powerpc/kernel/rtas.o] Error 1
make: *** [arch/powerpc/kernel] Error 2
The build fails because cpu_down() is defined only under CONFIG_HOTPLUG_CPU.
Looking further, the mobility code in pseries is one of the call-sites which
uses rtas_ibm_suspend_me(), which in turn calls rtas_cpu_state_change_mask().
And the mobility code is unconditionally compiled-in (it does not fall under
any Kconfig option). And commit 120496ac (powerpc: Bring all threads online
prior to migration/hibernation) which introduced this build regression is
critical for the proper functioning of the migration code. So it appears
that the only solution to this problem is to enable CONFIG_HOTPLUG_CPU if
SMP is enabled on PPC_PSERIES platforms. So make that change in the Kconfig.
Reported-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---
arch/powerpc/platforms/pseries/Kconfig | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 023b288..4459eff 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -19,6 +19,8 @@ config PPC_PSERIES
select ZLIB_DEFLATE
select PPC_DOORBELL
select HAVE_CONTEXT_TRACKING
+ select HOTPLUG if SMP
+ select HOTPLUG_CPU if SMP
default y
config PPC_SPLPAR
^ permalink raw reply related
* [PATCH 1/2] drivers/macintosh: Remove obsolete cleanup for clientdata
From: Wolfram Sang @ 2013-05-21 18:45 UTC (permalink / raw)
To: linux-i2c; +Cc: linuxppc-dev, Wolfram Sang
A few new i2c-drivers came into the kernel which clear the clientdata-pointer
on exit or error. This is obsolete now; the core will do it.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
---
drivers/macintosh/windfarm_smu_sat.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/macintosh/windfarm_smu_sat.c b/drivers/macintosh/windfarm_smu_sat.c
index d87f5ee..ad6223e 100644
--- a/drivers/macintosh/windfarm_smu_sat.c
+++ b/drivers/macintosh/windfarm_smu_sat.c
@@ -343,7 +343,6 @@ static int wf_sat_remove(struct i2c_client *client)
wf_unregister_sensor(&sens->sens);
}
sat->i2c = NULL;
- i2c_set_clientdata(client, NULL);
kref_put(&sat->ref, wf_sat_release);
return 0;
--
1.7.10.4
^ permalink raw reply related
* Re: [PATCH v4 06/12] ARM: dove: add gigabit ethernet and mvmdio device tree nodes
From: Andrew Lunn @ 2013-05-21 17:48 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Andrew Lunn, Jason Cooper, netdev, linux-kernel, linux-arm-kernel,
linuxppc-dev, David Miller, Lennert Buytenhek
In-Reply-To: <1369154510-4927-7-git-send-email-sebastian.hesselbarth@gmail.com>
On Tue, May 21, 2013 at 06:41:44PM +0200, Sebastian Hesselbarth wrote:
> This patch adds orion-eth and mvmdio device tree nodes for DT enabled
> Dove boards. As there is only one ethernet controller on Dove, a default
> phy node is also added with a note to set its reg property on a per-board
> basis.
>
> Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> ---
> Changelog:
> v3->v4:
> - convert to new device tree binding
>
> Cc: David Miller <davem@davemloft.net>
> Cc: Lennert Buytenhek <buytenh@wantstofly.org>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: netdev@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-kernel@vger.kernel.org
> ---
> arch/arm/boot/dts/dove-cubox.dts | 7 +++++++
> arch/arm/boot/dts/dove.dtsi | 35 +++++++++++++++++++++++++++++++++++
> 2 files changed, 42 insertions(+)
>
> diff --git a/arch/arm/boot/dts/dove-cubox.dts b/arch/arm/boot/dts/dove-cubox.dts
> index 7e3065a..02618fa 100644
> --- a/arch/arm/boot/dts/dove-cubox.dts
> +++ b/arch/arm/boot/dts/dove-cubox.dts
> @@ -49,6 +49,13 @@
> &uart0 { status = "okay"; };
> &sata0 { status = "okay"; };
> &i2c0 { status = "okay"; };
> +&mdio { status = "okay"; };
> +&eth { status = "okay"; };
> +
> +&ethphy {
> + compatible = "marvell,88e1310";
> + reg = <1>;
> +};
>
> &sdio0 {
> status = "okay";
> diff --git a/arch/arm/boot/dts/dove.dtsi b/arch/arm/boot/dts/dove.dtsi
> index 6cab468..8612658 100644
> --- a/arch/arm/boot/dts/dove.dtsi
> +++ b/arch/arm/boot/dts/dove.dtsi
> @@ -258,5 +258,40 @@
> dmacap,xor;
> };
> };
> +
> + mdio: mdio-bus@72004 {
> + compatible = "marvell,orion-mdio";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + reg = <0x72004 0x84>;
> + interrupts = <30>;
> + clocks = <&gate_clk 2>;
> + status = "disabled";
> +
> + ethphy: ethernet-phy {
> + device-type = "ethernet-phy";
> + /* set phy address in board file */
> + };
> + };
> +
> + eth: ethernet-controller@72000 {
> + compatible = "marvell,orion-eth";
> + #address-cells = <1>;
> + #size-cells = <0>;
> + reg = <0x72000 0x4000>;
> + clocks = <&gate_clk 2>;
> + marvell,tx-checksum-limit = <1600>;
> + status = "disabled";
> +
> + ethernet-port@0 {
> + device_type = "network";
> + compatible = "marvell,orion-eth-port";
> + reg = <0>;
> + interrupts = <29>;
> + /* overwrite MAC address in bootloader */
> + local-mac-address = [00 00 00 00 00 00];
Hi Sebastian
It's probably a good idea to set the locally administered bit in this
MAC address, i.e. make the first byte 02.
Andrew
^ permalink raw reply
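The locally administered bit mentioned in the reply above is bit 1 (value 0x02) of the first octet of a MAC address. A short sketch of setting and testing it follows; the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAC_LOCALLY_ADMINISTERED 0x02u  /* bit 1 of the first octet */

/* Mark a 6-byte MAC address as locally administered. */
static void mac_set_local_admin(uint8_t mac[6])
{
    assert(mac != NULL);
    mac[0] |= MAC_LOCALLY_ADMINISTERED;
}

/* Report whether the locally administered bit is set. */
static bool mac_is_local_admin(const uint8_t mac[6])
{
    assert(mac != NULL);
    return (mac[0] & MAC_LOCALLY_ADMINISTERED) != 0;
}
```

Applied to the all-zero placeholder 00:00:00:00:00:00, this gives 02:00:00:00:00:00, matching the suggestion that the first byte be 02.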