* Re: Invalid perf_branch_entry.to entries question
From: Michael Neuling @ 2013-05-08 22:45 UTC (permalink / raw)
To: Stephane Eranian; +Cc: Peter Zijlstra, Linux PPC dev, LKML, Anshuman Khandual
In-Reply-To: <CABPqkBRSMxQK8LJWhD39yo8EQZBd3-dRkeMqvwhWwLiHa6m66g@mail.gmail.com>
Stephane Eranian <eranian@google.com> wrote:
> On Wed, May 8, 2013 at 5:59 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
> >> Peter & Stephane,
> >>
> >> We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
> >> struct perf_branch_entry.
> >>
> >> Sometimes on POWER8 we may not be able to fill out the "to" address.
> >
> > Just because I'm curious.. however does that happen? Surely the CPU knows where
> > next to fetch instructions?
> >
> >> We
> >> initially thought of just making this 0, but it's feasible that this
> >> could be a valid address to branch to.
> >
> > Right, while highly unlikely, x86 actually has some cases where 0 address is
> > valid *shudder*..
> >
> >> The other logical value to indicate an invalid entry would be all 1s
> >> which is not possible (on POWER at least).
> >>
> >> Do you guys have a preference as to what we should use as an invalid
> >> entry? This would have some consequences for the userspace tool also.
> >>
> >> The alternative would be to add a flag alongside mispred/predicted to
> >> indicate the validity of the "to" address.
> >
> > Either would work with me I suppose.. Stephane do you have any preference?
>
> But if the 'to' is bogus, why not just drop the sample?
> That happens on x86 if the HW captured branches which do not correspond to
> user filter settings (due to bug).
We can, I guess, but it seems useful to log the from address when
possible.
Can we log it and let userspace tools ignore it if it's not useful?
Mikey
* Re: Invalid perf_branch_entry.to entries question
From: Michael Neuling @ 2013-05-08 22:39 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Linux PPC dev, linux-kernel, eranian, Anshuman Khandual
In-Reply-To: <20130508155929.GA8459@dyad.programming.kicks-ass.net>
Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
> > Peter & Stephane,
> >
> > We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
> > struct perf_branch_entry.
> >
> > Sometimes on POWER8 we may not be able to fill out the "to" address.
>
> Just because I'm curious.. however does that happen? Surely the CPU
> knows where next to fetch instructions?
For computed gotos (i.e. branch to a register value), the hardware gives
you the from and to address in the branch history buffer.
For branches where the branch target address is an immediate encoded in
the instruction, the hardware only logs the from address. It assumes
that software (the perf IRQ handler in this case) can read this branch
instruction, calculate the corresponding offset and hence the
to/target address.
It's entirely possible that when the perf IRQ handler runs, the
instruction in question is not readable or is no longer a branch
(self-modifying code). Hence we aren't able to calculate a valid to address.
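A rough sketch of the software side described here, assuming the standard PowerPC I-form (opcode 18) and B-form (opcode 16) branch encodings. The name branch_target_from_insn is made up for illustration and is not the kernel's helper; a false return corresponds to the "no valid to address" case.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): compute the target of a
 * PowerPC branch whose displacement is encoded in the instruction.
 * Returns false if the word is not such a branch, e.g. because the
 * code was modified after the branch was logged.
 */
static bool branch_target_from_insn(uint64_t from, uint32_t insn, uint64_t *to)
{
	uint32_t opcode = insn >> 26;
	int64_t disp;

	if (opcode == 18) {
		/* I-form (b, bl, ba, bla): 24-bit LI field, already scaled by 4 */
		disp = insn & 0x03fffffc;
		if (disp & 0x02000000)
			disp -= 0x04000000;	/* sign-extend the 26-bit value */
	} else if (opcode == 16) {
		/* B-form (bc and friends): 14-bit BD field, already scaled by 4 */
		disp = insn & 0x0000fffc;
		if (disp & 0x00008000)
			disp -= 0x00010000;	/* sign-extend the 16-bit value */
	} else {
		return false;			/* not an immediate branch */
	}

	/* AA bit (0x2) set: the displacement is an absolute address */
	*to = (insn & 0x2) ? (uint64_t)disp : from + (uint64_t)disp;
	return true;
}
```

For example, the word 0x48000008 (b .+8) at address 0x1000 gives a target of 0x1008, while 0x4e800020 (blr) is rejected because its target lives in a register.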
Mikey
>
> > We
> > initially thought of just making this 0, but it's feasible that this
> > could be a valid address to branch to.
>
> Right, while highly unlikely, x86 actually has some cases where 0 address is
> valid *shudder*..
>
> > The other logical value to indicate an invalid entry would be all 1s
> > which is not possible (on POWER at least).
> >
> > Do you guys have a preference as to what we should use as an invalid
> > entry? This would have some consequences for the userspace tool also.
> >
> > The alternative would be to add a flag alongside mispred/predicted to
> > indicate the validity of the "to" address.
>
> Either would work with me I suppose.. Stephane do you have any preference?
>
* Re: Invalid perf_branch_entry.to entries question
From: Stephane Eranian @ 2013-05-08 21:33 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Linux PPC dev, Michael Neuling, LKML, Anshuman Khandual
In-Reply-To: <20130508155929.GA8459@dyad.programming.kicks-ass.net>
On Wed, May 8, 2013 at 5:59 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
>> Peter & Stephane,
>>
>> We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
>> struct perf_branch_entry.
>>
>> Sometimes on POWER8 we may not be able to fill out the "to" address.
>
> Just because I'm curious.. however does that happen? Surely the CPU knows where
> next to fetch instructions?
>
>> We
>> initially thought of just making this 0, but it's feasible that this
>> could be a valid address to branch to.
>
> Right, while highly unlikely, x86 actually has some cases where 0 address is
> valid *shudder*..
>
>> The other logical value to indicate an invalid entry would be all 1s
>> which is not possible (on POWER at least).
>>
>> Do you guys have a preference as to what we should use as an invalid
>> entry? This would have some consequences for the userspace tool also.
>>
>> The alternative would be to add a flag alongside mispred/predicted to
>> indicate the validity of the "to" address.
>
> Either would work with me I suppose.. Stephane do you have any preference?
But if the 'to' is bogus, why not just drop the sample?
That happens on x86 if the HW captured branches which do not correspond to
user filter settings (due to bug).
* Re: [v1][KVM][PATCH 1/1] kvm:ppc:booehv: direct ISI exception to Guest
From: Scott Wood @ 2013-05-08 19:09 UTC (permalink / raw)
To: tiejun.chen; +Cc: linuxppc-dev, agraf, kvm-ppc, kvm
In-Reply-To: <5189B02E.3000109@windriver.com>
On 05/07/2013 08:53:50 PM, tiejun.chen wrote:
> On 05/08/2013 07:40 AM, Scott Wood wrote:
>> On 05/07/2013 06:06:30 AM, Tiejun Chen wrote:
>>> We also can direct ISI exception to Guest like DSI.
>>>
>>> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
>>> ---
>>> arch/powerpc/kvm/booke_emulate.c | 3 +++
>>> arch/powerpc/kvm/e500mc.c | 3 ++-
>>> 2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> Are you seeing a real performance improvement from this? This will
>> interfere
>
> No. But after we reduce the exit to host, shouldn't this improve
> performance?
Not if ISIs are too rare to matter.
-Scott
* Re: Invalid perf_branch_entry.to entries question
From: Peter Zijlstra @ 2013-05-08 15:59 UTC (permalink / raw)
To: Michael Neuling; +Cc: Linux PPC dev, linux-kernel, eranian, Anshuman Khandual
In-Reply-To: <25394.1367890528@ale.ozlabs.ibm.com>
On Tue, May 07, 2013 at 11:35:28AM +1000, Michael Neuling wrote:
> Peter & Stephane,
>
> We are plumbing the POWER8 Branch History Rolling Buffer (BHRB) into
> struct perf_branch_entry.
>
> Sometimes on POWER8 we may not be able to fill out the "to" address.
Just because I'm curious.. however does that happen? Surely the CPU knows where
next to fetch instructions?
> We
> initially thought of just making this 0, but it's feasible that this
> could be a valid address to branch to.
Right, while highly unlikely, x86 actually has some cases where 0 address is
valid *shudder*..
> The other logical value to indicate an invalid entry would be all 1s
> which is not possible (on POWER at least).
>
> Do you guys have a preference as to what we should use as an invalid
> entry? This would have some consequences for the userspace tool also.
>
> The alternative would be to add a flag alongside mispred/predicted to
> indicate the validity of the "to" address.
Either would work with me I suppose.. Stephane do you have any preference?
* [PATCH v5, part4 31/41] mm/ppc: prepare for removing num_physpages and simplify mem_init()
From: Jiang Liu @ 2013-05-08 15:51 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-arch, James Bottomley, David Howells, Jiang Liu,
Wen Congyang, linux-mm, Mark Salter, linux-kernel, Michal Hocko,
Minchan Kim, Paul Mackerras, Mel Gorman, David Rientjes,
linuxppc-dev, Sergei Shtylyov, KAMEZAWA Hiroyuki, Jianguo Wu
In-Reply-To: <1368028298-7401-1-git-send-email-jiang.liu@huawei.com>
Prepare for removing num_physpages and simplify mem_init().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
---
arch/powerpc/mm/mem.c | 56 +++++++++++--------------------------------------
1 file changed, 12 insertions(+), 44 deletions(-)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index b890245..4e24f1c 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -299,46 +299,27 @@ void __init paging_init(void)
void __init mem_init(void)
{
-#ifdef CONFIG_NEED_MULTIPLE_NODES
- int nid;
-#endif
- pg_data_t *pgdat;
- unsigned long i;
- struct page *page;
- unsigned long reservedpages = 0, codesize, initsize, datasize, bsssize;
-
#ifdef CONFIG_SWIOTLB
swiotlb_init(0);
#endif
- num_physpages = memblock_phys_mem_size() >> PAGE_SHIFT;
high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
#ifdef CONFIG_NEED_MULTIPLE_NODES
- for_each_online_node(nid) {
- if (NODE_DATA(nid)->node_spanned_pages != 0) {
- printk("freeing bootmem node %d\n", nid);
- free_all_bootmem_node(NODE_DATA(nid));
- }
+ {
+ pg_data_t *pgdat;
+
+ for_each_online_pgdat(pgdat)
+ if (pgdat->node_spanned_pages != 0) {
+ printk("freeing bootmem node %d\n",
+ pgdat->node_id);
+ free_all_bootmem_node(pgdat);
+ }
}
#else
max_mapnr = max_pfn;
free_all_bootmem();
#endif
- for_each_online_pgdat(pgdat) {
- for (i = 0; i < pgdat->node_spanned_pages; i++) {
- if (!pfn_valid(pgdat->node_start_pfn + i))
- continue;
- page = pgdat_page_nr(pgdat, i);
- if (PageReserved(page))
- reservedpages++;
- }
- }
-
- codesize = (unsigned long)&_sdata - (unsigned long)&_stext;
- datasize = (unsigned long)&_edata - (unsigned long)&_sdata;
- initsize = (unsigned long)&__init_end - (unsigned long)&__init_begin;
- bsssize = (unsigned long)&__bss_stop - (unsigned long)&__bss_start;
#ifdef CONFIG_HIGHMEM
{
@@ -348,13 +329,9 @@ void __init mem_init(void)
for (pfn = highmem_mapnr; pfn < max_mapnr; ++pfn) {
phys_addr_t paddr = (phys_addr_t)pfn << PAGE_SHIFT;
struct page *page = pfn_to_page(pfn);
- if (memblock_is_reserved(paddr))
- continue;
- free_highmem_page(page);
- reservedpages--;
+ if (!memblock_is_reserved(paddr))
+ free_highmem_page(page);
}
- printk(KERN_DEBUG "High memory: %luk\n",
- totalhigh_pages << (PAGE_SHIFT-10));
}
#endif /* CONFIG_HIGHMEM */
@@ -367,16 +344,7 @@ void __init mem_init(void)
(mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) - 1;
#endif
- printk(KERN_INFO "Memory: %luk/%luk available (%luk kernel code, "
- "%luk reserved, %luk data, %luk bss, %luk init)\n",
- nr_free_pages() << (PAGE_SHIFT-10),
- num_physpages << (PAGE_SHIFT-10),
- codesize >> 10,
- reservedpages << (PAGE_SHIFT-10),
- datasize >> 10,
- bsssize >> 10,
- initsize >> 10);
-
+ mem_init_print_info(NULL);
#ifdef CONFIG_PPC32
pr_info("Kernel virtual memory layout:\n");
pr_info(" * 0x%08lx..0x%08lx : fixmap\n", FIXADDR_START, FIXADDR_TOP);
--
1.7.9.5
* [PATCH] rapidio/tsi721: fix bug in MSI interrupt handling
From: Alexandre Bounine @ 2013-05-08 13:31 UTC (permalink / raw)
To: Andrew Morton, linux-kernel, linuxppc-dev; +Cc: Alexandre Bounine
Fix bug in MSI interrupt handling which causes loss of event notifications.
A typical indication of lost MSI interrupts is stalled message and doorbell
transfers between RapidIO endpoints. To avoid loss of MSI interrupts, all
interrupts from the device must be disabled on entering the interrupt handler
routine and re-enabled when exiting it. Re-enabling device interrupts will
trigger new MSI message(s) if Tsi721 registered new events since entering
the interrupt handler routine.
This patch is applicable to kernel versions starting from v3.2.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
---
drivers/rapidio/devices/tsi721.c | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/drivers/rapidio/devices/tsi721.c b/drivers/rapidio/devices/tsi721.c
index 6faba40..a8b2c23 100644
--- a/drivers/rapidio/devices/tsi721.c
+++ b/drivers/rapidio/devices/tsi721.c
@@ -471,6 +471,10 @@ static irqreturn_t tsi721_irqhandler(int irq, void *ptr)
u32 intval;
u32 ch_inte;
+ /* For MSI mode disable all device-level interrupts */
+ if (priv->flags & TSI721_USING_MSI)
+ iowrite32(0, priv->regs + TSI721_DEV_INTE);
+
dev_int = ioread32(priv->regs + TSI721_DEV_INT);
if (!dev_int)
return IRQ_NONE;
@@ -560,6 +564,14 @@ static irqreturn_t tsi721_irqhandler(int irq, void *ptr)
}
}
#endif
+
+ /* For MSI mode re-enable device-level interrupts */
+ if (priv->flags & TSI721_USING_MSI) {
+ dev_int = TSI721_DEV_INT_SR2PC_CH | TSI721_DEV_INT_SRIO |
+ TSI721_DEV_INT_SMSG_CH | TSI721_DEV_INT_BDMA_CH;
+ iowrite32(dev_int, priv->regs + TSI721_DEV_INTE);
+ }
+
return IRQ_HANDLED;
}
--
1.7.8.4
* RE: [RFC][KVM][PATCH 1/1] kvm:ppc:booke-64: soft-disable interrupts
From: Caraman Mihai Claudiu-B02008 @ 2013-05-08 13:14 UTC (permalink / raw)
To: Wood Scott-B07421, tiejun.chen
Cc: linuxppc-dev@lists.ozlabs.org, agraf@suse.de,
kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <1367892390.3398.12@snotra>
> > This only disable soft interrupt for kvmppc_restart_interrupt() that
> > restarts interrupts if they were meant for the host:
> >
> > a. SOFT_DISABLE_INTS() only for BOOKE_INTERRUPT_EXTERNAL |
> > BOOKE_INTERRUPT_DECREMENTER | BOOKE_INTERRUPT_DOORBELL
>=20
> Those aren't the only exceptions that can end up going to the host. We
> could get a TLB miss that results in a heavyweight MMIO exit, etc.
>
> > And shouldn't we handle kvmppc_restart_interrupt() like the original
> > HOST flow?
> >
> > #define MASKABLE_EXCEPTION(trapnum, intnum, label, hdlr,
> > ack) \
> >
> > START_EXCEPTION(label); \
> > NORMAL_EXCEPTION_PROLOG(trapnum, intnum,
> > PROLOG_ADDITION_MASKABLE)\
> > EXCEPTION_COMMON(trapnum, PACA_EXGEN,
> > *INTS_DISABLE*) \
> > ...
>=20
> Could you elaborate on what you mean?
I think Tiejun was saying that the host has flags and replays only EE/DEC/DBELL
interrupts. There is a special macro, masked_interrupt_book3e, in those exception
handlers that sets paca->irq_happened.
The list of replayed interrupts is limited to asynchronous noncritical
interrupts which can be masked by MSR[EE] (therefore no TLB miss). Now
on KVM book3e we don't want to put them in the irq_happened lazy state
but rather to execute them directly, so there is no reason for exception
handling symmetry between host and guest.
-Mike
* Re: ppc/sata-fsl: orphan config value: CONFIG_MPC8315_DS
From: Anthony Foiani @ 2013-05-08 12:04 UTC (permalink / raw)
To: Jeff Garzik
Cc: Scott Wood, Robert P.J.Day, linuxppc-dev@lists.ozlabs.org,
Li Yang-R58472, Adrian Bunk
In-Reply-To: <87a9odrinu.fsf@hum.int.foiani.com>
Anthony Foiani <tkil@scrye.com> writes:
> Maybe I need to call ata_set_sata_spd as well. Can I do that before
> discovery, or should it be a part of the port_start callback? And
> if the latter, shouldn't it be handled within the ata core, instead
> of expecting each host driver to do that call?
My final version calls sata_set_spd from within the hard reset
callback for the fsl sata driver.
If there's a better place to put it, please let me know.
With this patch (and an appropriate entry in the device tree), the
machine comes up and reports:
# cd /sys/devices/e0000000.immr/e0019000.sata
# find * -name '*_spd*' -print | xargs grep .
ata2/link2/ata_link/link2/sata_spd:1.5 Gbps
ata2/link2/ata_link/link2/hw_sata_spd_limit:1.5 Gbps
ata2/link2/ata_link/link2/sata_spd_limit:1.5 Gbps
Which is what I needed to see.
Thanks for the hints!
Best regards,
Anthony Foiani
--
From 357c96b4f31b457eca0b96147c749c21d0f4f086 Mon Sep 17 00:00:00 2001
From: Anthony Foiani <anthony.foiani@gmail.com>
Date: Wed, 8 May 2013 05:24:20 -0600
Subject: [PATCH] sata: fsl: allow device tree to limit sata speed.
There used to be an "orphan" config symbol (CONFIG_MPC8315_DS) that
would artificially limit SATA speed to generation 1 (1.5Gbps).
Since that config symbol got lost whenever any sort of configuration
was done, we instead extract the limitation from the device tree,
using a new name "sata-spd-limit".
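As a rough illustration of what a generation limit means at the register level, here is a standalone sketch. It assumes that bits 7:4 of SControl hold the speed limit, which is how the removed CONFIG_MPC8315_DS workaround treats them; both function names are hypothetical and exist only for this example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: restrict the speed field (bits 7:4) of an
 * SControl value. 0 means no limit, 1 limits to Gen 1, 2 to Gen 2.
 */
static uint32_t scontrol_limit_gen(uint32_t scontrol, unsigned int gen)
{
	scontrol &= ~(0xFu << 4);		/* clear the speed field */
	scontrol |= (gen & 0xFu) << 4;		/* install the new limit */
	return scontrol;
}

/* Map a generation number to the link speed it stands for. */
static const char *sata_gen_to_speed(unsigned int gen)
{
	switch (gen) {
	case 1:
		return "1.5 Gbps";
	case 2:
		return "3.0 Gbps";
	default:
		return "unknown";
	}
}
```

With these, scontrol_limit_gen(0x300, 1) yields 0x310, the same bit pattern the old #ifdef block wrote by hand.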
Signed-off-by: Anthony Foiani <anthony.foiani@gmail.com>
---
.../devicetree/bindings/powerpc/fsl/board.txt | 23 ++++++++++++++++++
drivers/ata/sata_fsl.c | 28 +++++++++++-----------
2 files changed, 37 insertions(+), 14 deletions(-)
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/board.txt b/Documentation/devicetree/bindings/powerpc/fsl/board.txt
index 380914e..9c9fed4 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/board.txt
+++ b/Documentation/devicetree/bindings/powerpc/fsl/board.txt
@@ -67,3 +67,26 @@ Example:
gpio-controller;
};
};
+
+* Maximum SATA Generation workaround
+
+Some boards advertise SATA speeds that they cannot actually achieve.
+Previously, this was dealt with via the orphaned config symbol
+CONFIG_MPC8315_DS. We now have a device tree property
+"sata-spd-limit" to control this. It should live within the "sata"
+block.
+
+Example:
+
+ sata@18000 {
+ compatible = "fsl,mpc8315-sata", "fsl,pq-sata";
+ reg = <0x18000 0x1000>;
+ cell-index = <1>;
+ interrupts = <44 0x8>;
+ interrupt-parent = <&ipic>;
+ sata-spd-limit = <1>;
+ };
+
+By default, there is no limitation; if a value is given, it indicates
+the maximum "generation" that should be negotiated. Gen 1 is 1.5Gbps,
+Gen 2 is 3.0Gbps.
diff --git a/drivers/ata/sata_fsl.c b/drivers/ata/sata_fsl.c
index d6577b9..9e3f3ec 100644
--- a/drivers/ata/sata_fsl.c
+++ b/drivers/ata/sata_fsl.c
@@ -726,20 +726,6 @@ static int sata_fsl_port_start(struct ata_port *ap)
VPRINTK("HControl = 0x%x\n", ioread32(hcr_base + HCONTROL));
VPRINTK("CHBA = 0x%x\n", ioread32(hcr_base + CHBA));
-#ifdef CONFIG_MPC8315_DS
- /*
- * Workaround for 8315DS board 3gbps link-up issue,
- * currently limit SATA port to GEN1 speed
- */
- sata_fsl_scr_read(&ap->link, SCR_CONTROL, &temp);
- temp &= ~(0xF << 4);
- temp |= (0x1 << 4);
- sata_fsl_scr_write(&ap->link, SCR_CONTROL, temp);
-
- sata_fsl_scr_read(&ap->link, SCR_CONTROL, &temp);
- dev_warn(dev, "scr_control, speed limited to %x\n", temp);
-#endif
-
return 0;
}
@@ -836,6 +822,11 @@ try_offline_again:
*/
ata_msleep(ap, 1);
+ /* if the device tree forces a speed limit, set it here. */
+ ata_link_info(link, "setting speed (in hard reset)\n");
+ DPRINTK("setting spd_limit\n");
+ sata_set_spd(link);
+
/*
* Now, bring the host controller online again, this can take time
* as PHY reset and communication establishment, 1st D2H FIS and
@@ -1444,6 +1435,15 @@ static int sata_fsl_probe(struct platform_device *ofdev)
goto error_exit_with_cleanup;
}
+ /* record speed limit if requested by device tree */
+ if (!of_property_read_u32(ofdev->dev.of_node, "sata-spd-limit",
+ &temp)) {
+ int i;
+ for (i = 0; i < SATA_FSL_MAX_PORTS; ++i)
+ host->ports[i]->link.hw_sata_spd_limit = temp;
+ dev_warn(&ofdev->dev, "speed limit set to gen %u\n", temp);
+ }
+
/* host->iomap is not used currently */
host->private_data = host_priv;
--
1.8.1.4
* Re: [PATCH] powerpc: fix numa distance for form0 device tree
From: Luis Henriques @ 2013-05-08 10:29 UTC (permalink / raw)
To: Michael Ellerman; +Cc: linuxppc-dev, Anton Blanchard, stable
In-Reply-To: <1367898574-20594-1-git-send-email-michael@ellerman.id.au>
On Tue, May 07, 2013 at 01:49:34PM +1000, Michael Ellerman wrote:
> From: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
>
> Commit 7122beeee7bc1757682049780179d7c216dd1c83 upstream.
Thanks, I'm queuing it for the 3.5 kernel.
Cheers,
--
Luis
* Re: [v1][KVM][PATCH 1/1] kvm:ppc:booehv: direct ISI exception to Guest
From: tiejun.chen @ 2013-05-08 9:28 UTC (permalink / raw)
To: Caraman Mihai Claudiu-B02008
Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org, agraf@suse.de,
kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <300B73AA675FCE4A93EB4FC1D42459FF3EFA26@039-SN2MPN1-013.039d.mgd.msft.net>
On 05/08/2013 05:20 PM, Caraman Mihai Claudiu-B02008 wrote:
>> -----Original Message-----
>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On
>> Behalf Of tiejun.chen
>> Sent: Wednesday, May 08, 2013 4:54 AM
>> To: Wood Scott-B07421
>> Cc: agraf@suse.de; kvm-ppc@vger.kernel.org; kvm@vger.kernel.org;
>> linuxppc-dev@lists.ozlabs.org
>> Subject: Re: [v1][KVM][PATCH 1/1] kvm:ppc:booehv: direct ISI exception to
>> Guest
>>
>> On 05/08/2013 07:40 AM, Scott Wood wrote:
>>> On 05/07/2013 06:06:30 AM, Tiejun Chen wrote:
>>>> We also can direct ISI exception to Guest like DSI.
>>>>
>>>> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
>>>> ---
>>>> arch/powerpc/kvm/booke_emulate.c | 3 +++
>>>> arch/powerpc/kvm/e500mc.c | 3 ++-
>>>> 2 files changed, 5 insertions(+), 1 deletion(-)
>>>
>>> Are you seeing a real performance improvement from this? This will
>> interfere
>>
>> No. But after we reduce the exit to host, shouldn't this improve
>> performance?
>
> We lose some flexibility for this so it make sense only if we gain
> measurable improvements.
Sounds like we have much more work to do.
>
>>
>>> somewhat with using the VF bit, if we were to ever do so, since VF only
>> affects
>>
>> Sorry, what is the VF you said?
>
> VF stands for virtualization fault see MAS8[VF] and we may use it for virtualized
I almost forgot this point :)
> MMIO. The hypervisor should deny execute access on pages marked with VF. Accordingly
> in this case guest ISI exceptions should be handled by the hypervisor.
Thanks for your information.
Tiejun
* RE: [v1][KVM][PATCH 1/1] kvm:ppc:booehv: direct ISI exception to Guest
From: Caraman Mihai Claudiu-B02008 @ 2013-05-08 9:20 UTC (permalink / raw)
To: tiejun.chen, Wood Scott-B07421
Cc: linuxppc-dev@lists.ozlabs.org, agraf@suse.de,
kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
In-Reply-To: <5189B02E.3000109@windriver.com>
> -----Original Message-----
> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On
> Behalf Of tiejun.chen
> Sent: Wednesday, May 08, 2013 4:54 AM
> To: Wood Scott-B07421
> Cc: agraf@suse.de; kvm-ppc@vger.kernel.org; kvm@vger.kernel.org;
> linuxppc-dev@lists.ozlabs.org
> Subject: Re: [v1][KVM][PATCH 1/1] kvm:ppc:booehv: direct ISI exception to
> Guest
>
> On 05/08/2013 07:40 AM, Scott Wood wrote:
> > On 05/07/2013 06:06:30 AM, Tiejun Chen wrote:
> >> We also can direct ISI exception to Guest like DSI.
> >>
> >> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
> >> ---
> >>  arch/powerpc/kvm/booke_emulate.c |    3 +++
> >>  arch/powerpc/kvm/e500mc.c        |    3 ++-
> >>  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > Are you seeing a real performance improvement from this?  This will
> interfere
>
> No. But after we reduce the exit to host, shouldn't this improve
> performance?

We lose some flexibility for this so it make sense only if we gain
measurable improvements.

>
> > somewhat with using the VF bit, if we were to ever do so, since VF only
> affects
>
> Sorry, what is the VF you said?

VF stands for virtualization fault see MAS8[VF] and we may use it for virtualized
MMIO. The hypervisor should deny execute access on pages marked with VF. Accordingly
in this case guest ISI exceptions should be handled by the hypervisor.

-Mike
* [PATCH] powerpc/powernv: Properly drop characters if console is closed
From: Benjamin Herrenschmidt @ 2013-05-08 4:15 UTC (permalink / raw)
To: linuxppc-dev list
If the firmware returns an error such as "closed" (or hardware
error), we should drop characters.
Currently we only do that when a firmware compatible with OPAL v2
APIs is detected, in the code that calls opal_console_write_buffer_space(),
which didn't exist with OPAL v1 (or didn't work).
However, when enabling early debug consoles, the flag indicating
that v2 is supported isn't set yet, causing us, in case of errors
or closed console, to spin forever.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
arch/powerpc/platforms/powernv/opal.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index ade4463..12d9846 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -144,6 +144,13 @@ int opal_put_chars(uint32_t vtermno, const char *data, int total_len)
rc == OPAL_BUSY_EVENT || rc == OPAL_SUCCESS)) {
len = total_len;
rc = opal_console_write(vtermno, &len, data);
+
+ /* Closed or other error drop */
+ if (rc != OPAL_SUCCESS && rc != OPAL_BUSY &&
+ rc != OPAL_BUSY_EVENT) {
+ written = total_len;
+ break;
+ }
if (rc == OPAL_SUCCESS) {
total_len -= len;
data += len;
* [PATCH] powerpc/rtas_flash: Fix validate_flash buffer overflow issue
From: Vasant Hegde @ 2013-05-08 2:54 UTC (permalink / raw)
To: benh, linuxppc-dev; +Cc: paulus, linux-kernel
The ibm,validate-flash-image RTAS call output buffer contains 150 - 200
bytes of data on the latest systems. Presently the local output
buffer is 64 bytes and we use sprintf to copy data from the
RTAS buffer to the local buffer. This causes a kernel oops (see the
call trace below).
This patch increases the local buffer size to 256 and also uses
snprintf instead of sprintf to copy data from the RTAS buffer.
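One detail that may be worth keeping in mind with this kind of change: snprintf returns the length the output would have had, so a running total built from its return value can still exceed the buffer size after truncation (in the kernel, scnprintf returns the number of characters actually stored). The userspace sketch below shows one way to clamp the count. append_bounded is a hypothetical name used only for this illustration and is not part of the patch.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: append formatted text to buf at offset 'used'.
 * Returns the new length of the string in buf, which never exceeds
 * size - 1 even when the output had to be truncated.
 */
static size_t append_bounded(char *buf, size_t size, size_t used,
			     const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0 || used >= size - 1)
		return used;			/* no room left */

	va_start(ap, fmt);
	n = vsnprintf(buf + used, size - used, fmt, ap);
	va_end(ap);

	if (n < 0)
		return used;			/* formatting error: count nothing */
	if ((size_t)n >= size - used)
		return size - 1;		/* truncated: buffer is full */
	return used + (size_t)n;
}
```

The returned length can then be handed on as the message length without ever pointing past the end of the local buffer.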
Kernel call trace :
-------------------
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: nfs fscache lockd auth_rpcgss nfs_acl sunrpc fuse loop dm_mod ipv6 ipv6_lib usb_storage ehea(X) sr_mod qlge ses cdrom enclosure st be2net sg ext3 jbd mbcache usbhid hid ohci_hcd ehci_hcd usbcore qla2xxx usb_common sd_mod crc_t10dif scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc scsi_dh lpfc scsi_transport_fc scsi_tgt ipr(X) libata scsi_mod
Supported: Yes
NIP: 4520323031333130 LR: 4520323031333130 CTR: 0000000000000000
REGS: c0000001b91779b0 TRAP: 0400 Tainted: G X (3.0.13-0.27-ppc64)
MSR: 8000000040009032 <EE,ME,IR,DR> CR: 44022488 XER: 20000018
TASK = c0000001bca1aba0[4736] 'cat' THREAD: c0000001b9174000 CPU: 36
GPR00: 4520323031333130 c0000001b9177c30 c000000000f87c98 000000000000009b
GPR04: c0000001b9177c4a 000000000000000b 3520323031333130 2032303133313031
GPR08: 3133313031350a4d 000000000000009b 0000000000000000 c0000000003664a4
GPR12: 0000000022022448 c000000003ee6c00 0000000000000002 00000000100e8a90
GPR16: 00000000100cb9d8 0000000010093370 000000001001d310 0000000000000000
GPR20: 0000000000008000 00000000100fae60 000000000000005e 0000000000000000
GPR24: 0000000010129350 46573738302e3030 2046573738302e30 300a4d4720323031
GPR28: 333130313520554e 4b4e4f574e0a4d47 2032303133313031 3520323031333130
NIP [4520323031333130] 0x4520323031333130
LR [4520323031333130] 0x4520323031333130
Call Trace:
[c0000001b9177c30] [4520323031333130] 0x4520323031333130 (unreliable)
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
---
arch/powerpc/kernel/rtas_flash.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
index 5b30224..2f3cdb0 100644
--- a/arch/powerpc/kernel/rtas_flash.c
+++ b/arch/powerpc/kernel/rtas_flash.c
@@ -89,6 +89,7 @@
/* Array sizes */
#define VALIDATE_BUF_SIZE 4096
+#define VALIDATE_MSG_LEN 256
#define RTAS_MSG_MAXLEN 64
/* Quirk - RTAS requires 4k list length and block size */
@@ -466,7 +467,7 @@ static void validate_flash(struct rtas_validate_flash_t *args_buf)
}
static int get_validate_flash_msg(struct rtas_validate_flash_t *args_buf,
- char *msg)
+ char *msg, int msglen)
{
int n;
@@ -474,7 +475,8 @@ static int get_validate_flash_msg(struct rtas_validate_flash_t *args_buf,
n = sprintf(msg, "%d\n", args_buf->update_results);
if ((args_buf->update_results >= VALIDATE_CUR_UNKNOWN) ||
(args_buf->update_results == VALIDATE_TMP_UPDATE))
- n += sprintf(msg + n, "%s\n", args_buf->buf);
+ n += snprintf(msg + n, msglen - n, "%s\n",
+ args_buf->buf);
} else {
n = sprintf(msg, "%d\n", args_buf->status);
}
@@ -486,11 +488,11 @@ static ssize_t validate_flash_read(struct file *file, char __user *buf,
{
struct rtas_validate_flash_t *const args_buf =
&rtas_validate_flash_data;
- char msg[RTAS_MSG_MAXLEN];
+ char msg[VALIDATE_MSG_LEN];
int msglen;
mutex_lock(&rtas_validate_flash_mutex);
- msglen = get_validate_flash_msg(args_buf, msg);
+ msglen = get_validate_flash_msg(args_buf, msg, VALIDATE_MSG_LEN);
mutex_unlock(&rtas_validate_flash_mutex);
return simple_read_from_buffer(buf, count, ppos, msg, msglen);
* Re: [v1][KVM][PATCH 1/1] kvm:ppc:booehv: direct ISI exception to Guest
From: tiejun.chen @ 2013-05-08 1:53 UTC (permalink / raw)
To: Scott Wood; +Cc: linuxppc-dev, agraf, kvm-ppc, kvm
In-Reply-To: <1367970043.3398.39@snotra>
On 05/08/2013 07:40 AM, Scott Wood wrote:
> On 05/07/2013 06:06:30 AM, Tiejun Chen wrote:
>> We also can direct ISI exception to Guest like DSI.
>>
>> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
>> ---
>> arch/powerpc/kvm/booke_emulate.c | 3 +++
>> arch/powerpc/kvm/e500mc.c | 3 ++-
>> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> Are you seeing a real performance improvement from this? This will interfere
No. But after we reduce the exit to host, shouldn't this improve performance?
> somewhat with using the VF bit, if we were to ever do so, since VF only affects
Sorry, what is the VF you said?
Tiejun
* Re: [v1][KVM][PATCH 1/1] kvm:ppc:booehv: direct ISI exception to Guest
From: Scott Wood @ 2013-05-07 23:40 UTC (permalink / raw)
To: Tiejun Chen; +Cc: linuxppc-dev, agraf, kvm-ppc, kvm
In-Reply-To: <1367924791-24394-1-git-send-email-tiejun.chen@windriver.com>
On 05/07/2013 06:06:30 AM, Tiejun Chen wrote:
> We also can direct ISI exception to Guest like DSI.
>
> Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
> ---
> arch/powerpc/kvm/booke_emulate.c | 3 +++
> arch/powerpc/kvm/e500mc.c | 3 ++-
> 2 files changed, 5 insertions(+), 1 deletion(-)
Are you seeing a real performance improvement from this? This will
interfere somewhat with using the VF bit, if we were to ever do so,
since VF only affects data accesses (and so the guest would see an ISI
storm rather than a machine check if it tries to execute from such an
address).
-Scott
* Re: [PATCH] arch/powerpc: advertise ISA2.07, HTM, DSCR, EBB and ISEL bits in HWCAP2
From: Nishanth Aravamudan @ 2013-05-07 21:11 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev, Michael R Meissner, Steve Munroe, Peter Bergner,
Ryan Arnold, Michael Neuling
In-Reply-To: <1367959778.25488.17.camel@pasglop>
On 08.05.2013 [06:49:38 +1000], Benjamin Herrenschmidt wrote:
> On Tue, 2013-05-07 at 13:33 -0700, Nishanth Aravamudan wrote:
> > > Similarily, Nish, you may need to check that we remove those bits if
> > > pHyp has the partition in a mode that doesn't support them (P7
> > > compatibility for example) for migration purposes.
> >
> > Yep, I'll need to talk with Mikey about this part. Will be a follow-on
> > patch if needed. Minimally, the bit defines will stay the same, which is
> > the important part to get going right now.
>
> Actually in such a mode we'd get a back-version architected PVR so we
> should be fine now that I think twice, but of course that needs to be
> tested.
True, I'll make sure it does get tested.
-Nish
^ permalink raw reply
* Re: [PATCH] arch/powerpc: advertise ISA2.07, HTM, DSCR, EBB and ISEL bits in HWCAP2
From: Benjamin Herrenschmidt @ 2013-05-07 20:49 UTC (permalink / raw)
To: Nishanth Aravamudan
Cc: linuxppc-dev, Michael R Meissner, Steve Munroe, Peter Bergner,
Ryan Arnold, Michael Neuling
In-Reply-To: <20130507203346.GA7307@linux.vnet.ibm.com>
On Tue, 2013-05-07 at 13:33 -0700, Nishanth Aravamudan wrote:
> > Similarly, Nish, you may need to check that we remove those bits if
> > pHyp has the partition in a mode that doesn't support them (P7
> > compatibility for example) for migration purposes.
>
> Yep, I'll need to talk with Mikey about this part. Will be a follow-on
> patch if needed. Minimally, the bit defines will stay the same, which is
> the important part to get going right now.
Actually in such a mode we'd get a back-version architected PVR so we
should be fine now that I think twice, but of course that needs to be
tested.
Cheers,
Ben.
^ permalink raw reply
* Re: [PATCH] arch/powerpc: advertise ISA2.07, HTM, DSCR, EBB and ISEL bits in HWCAP2
From: Nishanth Aravamudan @ 2013-05-07 20:33 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev, Michael R Meissner, Steve Munroe, Peter Bergner,
Ryan Arnold, Michael Neuling
In-Reply-To: <1367876228.15842.62.camel@pasglop>
On 07.05.2013 [07:37:08 +1000], Benjamin Herrenschmidt wrote:
> On Mon, 2013-05-06 at 09:38 -0500, Ryan Arnold wrote:
> > My understanding was that these bits being 'on' is an indication of
> > what features the hardware supports (or what the kernel emulates) and
> > not an indication of whether that facility is currently enabled or
> > not. If the hardware supports a particular feature but it is not
> > enabled I'd expect that user-space usage of that feature would cause
> > the kernel to trap on a facility availability exception (which is how
> > Altivec/VMX is implemented, being defaulted to turned off).
>
> Right but the discussion is about whether we should expose the bits
> when the kernel doesn't have the ability to handle the feature :-)
>
> IE. We need to remove the HTM feature if the kernel is compiled without
> transactional memory support.
>
> Similarly, Nish, you may need to check that we remove those bits if
> pHyp has the partition in a mode that doesn't support them (P7
> compatibility for example) for migration purposes.
Yep, I'll need to talk with Mikey about this part. Will be a follow-on
patch if needed. Minimally, the bit defines will stay the same, which is
the important part to get going right now.
Thanks,
Nish
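
For readers following the HWCAP2 discussion, here is a minimal userspace sketch of how such bits might be consumed. getauxval() and AT_HWCAP2 are the glibc interface for reading the auxiliary vector; the helper names and the bit value EXAMPLE_FEATURE2_HTM are assumptions for illustration only, and real code should take the bit definitions from the kernel's exported headers.

```c
#include <stdbool.h>
#ifdef __linux__
#include <sys/auxv.h>	/* getauxval(), AT_HWCAP2 (glibc) */
#endif

/* Assumed bit value, for illustration only. */
#define EXAMPLE_FEATURE2_HTM	0x40000000UL

/* True if every bit in 'bit' is set in the HWCAP2 word. */
static inline bool hwcap2_has(unsigned long hwcap2, unsigned long bit)
{
	return (hwcap2 & bit) == bit;
}

/* Read AT_HWCAP2 from the auxiliary vector; 0 where it is unavailable. */
static inline unsigned long read_hwcap2(void)
{
#if defined(__linux__) && defined(AT_HWCAP2)
	return getauxval(AT_HWCAP2);
#else
	return 0;
#endif
}
```

A caller would test hwcap2_has(read_hwcap2(), EXAMPLE_FEATURE2_HTM) before using the facility. As the thread notes, a set bit only says the feature is supported, not that the facility is currently enabled.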
^ permalink raw reply
* [PATCH v9 3/3] of/pci: microblaze: convert to common of_pci_range_parser
From: Andrew Murray @ 2013-05-07 15:31 UTC (permalink / raw)
To: robherring2
Cc: linux-mips, siva.kallam, linus.walleij, thierry.reding,
Liviu.Dudau, juhosg, paulus, linux-samsung-soc, linux, jg1.han,
jgunthorpe, thomas.abraham, linux-pci, grant.likely, arnd,
devicetree-discuss, kgene.kim, bhelgaas, linux-arm-kernel,
thomas.petazzoni, monstr, linux-kernel, suren.reddy,
Andrew Murray, linuxppc-dev
In-Reply-To: <1367940674-11987-1-git-send-email-Andrew.Murray@arm.com>
This patch converts the pci_process_bridge_OF_ranges function to use the new
common of_pci_range_parser.
Signed-off-by: Andrew Murray <Andrew.Murray@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
---
arch/microblaze/pci/pci-common.c | 106 ++++++++++++++------------------------
1 files changed, 38 insertions(+), 68 deletions(-)
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 9ea521e..ba9e4a1 100644
--- a/arch/microblaze/pci/pci-common.c
+++ b/arch/microblaze/pci/pci-common.c
@@ -658,67 +658,42 @@ void pci_resource_to_user(const struct pci_dev *dev, int bar,
void pci_process_bridge_OF_ranges(struct pci_controller *hose,
struct device_node *dev, int primary)
{
- const u32 *ranges;
- int rlen;
- int pna = of_n_addr_cells(dev);
- int np = pna + 5;
int memno = 0, isa_hole = -1;
- u32 pci_space;
- unsigned long long pci_addr, cpu_addr, pci_next, cpu_next, size;
unsigned long long isa_mb = 0;
struct resource *res;
+ struct of_pci_range range;
+ struct of_pci_range_parser parser;
pr_info("PCI host bridge %s %s ranges:\n",
dev->full_name, primary ? "(primary)" : "");
- /* Get ranges property */
- ranges = of_get_property(dev, "ranges", &rlen);
- if (ranges == NULL)
+ /* Check for ranges property */
+ if (of_pci_range_parser_init(&parser, dev))
return;
- /* Parse it */
pr_debug("Parsing ranges property...\n");
- while ((rlen -= np * 4) >= 0) {
+ for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
- pci_space = ranges[0];
- pci_addr = of_read_number(ranges + 1, 2);
- cpu_addr = of_translate_address(dev, ranges + 3);
- size = of_read_number(ranges + pna + 3, 2);
-
pr_debug("pci_space: 0x%08x pci_addr:0x%016llx ",
- pci_space, pci_addr);
+ range.pci_space, range.pci_addr);
pr_debug("cpu_addr:0x%016llx size:0x%016llx\n",
- cpu_addr, size);
-
- ranges += np;
+ range.cpu_addr, range.size);
/* If we failed translation or got a zero-sized region
* (some FW try to feed us with non sensical zero sized regions
* such as power3 which look like some kind of attempt
* at exposing the VGA memory hole)
*/
- if (cpu_addr == OF_BAD_ADDR || size == 0)
+ if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;
- /* Now consume following elements while they are contiguous */
- for (; rlen >= np * sizeof(u32);
- ranges += np, rlen -= np * 4) {
- if (ranges[0] != pci_space)
- break;
- pci_next = of_read_number(ranges + 1, 2);
- cpu_next = of_translate_address(dev, ranges + 3);
- if (pci_next != pci_addr + size ||
- cpu_next != cpu_addr + size)
- break;
- size += of_read_number(ranges + pna + 3, 2);
- }
-
/* Act based on address space type */
res = NULL;
- switch ((pci_space >> 24) & 0x3) {
- case 1: /* PCI IO space */
+ switch (range.flags & IORESOURCE_TYPE_BITS) {
+ case IORESOURCE_IO:
pr_info(" IO 0x%016llx..0x%016llx -> 0x%016llx\n",
- cpu_addr, cpu_addr + size - 1, pci_addr);
+ range.cpu_addr, range.cpu_addr + range.size - 1,
+ range.pci_addr);
/* We support only one IO range */
if (hose->pci_io_size) {
@@ -726,11 +701,12 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
continue;
}
/* On 32 bits, limit I/O space to 16MB */
- if (size > 0x01000000)
- size = 0x01000000;
+ if (range.size > 0x01000000)
+ range.size = 0x01000000;
/* 32 bits needs to map IOs here */
- hose->io_base_virt = ioremap(cpu_addr, size);
+ hose->io_base_virt = ioremap(range.cpu_addr,
+ range.size);
/* Expect trouble if pci_addr is not 0 */
if (primary)
@@ -739,19 +715,20 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
/* pci_io_size and io_base_phys always represent IO
* space starting at 0 so we factor in pci_addr
*/
- hose->pci_io_size = pci_addr + size;
- hose->io_base_phys = cpu_addr - pci_addr;
+ hose->pci_io_size = range.pci_addr + range.size;
+ hose->io_base_phys = range.cpu_addr - range.pci_addr;
/* Build resource */
res = &hose->io_resource;
- res->flags = IORESOURCE_IO;
- res->start = pci_addr;
+ range.cpu_addr = range.pci_addr;
+
break;
- case 2: /* PCI Memory space */
- case 3: /* PCI 64 bits Memory space */
+ case IORESOURCE_MEM:
pr_info(" MEM 0x%016llx..0x%016llx -> 0x%016llx %s\n",
- cpu_addr, cpu_addr + size - 1, pci_addr,
- (pci_space & 0x40000000) ? "Prefetch" : "");
+ range.cpu_addr, range.cpu_addr + range.size - 1,
+ range.pci_addr,
+ (range.pci_space & 0x40000000) ?
+ "Prefetch" : "");
/* We support only 3 memory ranges */
if (memno >= 3) {
@@ -759,13 +736,13 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
continue;
}
/* Handles ISA memory hole space here */
- if (pci_addr == 0) {
- isa_mb = cpu_addr;
+ if (range.pci_addr == 0) {
+ isa_mb = range.cpu_addr;
isa_hole = memno;
if (primary || isa_mem_base == 0)
- isa_mem_base = cpu_addr;
- hose->isa_mem_phys = cpu_addr;
- hose->isa_mem_size = size;
+ isa_mem_base = range.cpu_addr;
+ hose->isa_mem_phys = range.cpu_addr;
+ hose->isa_mem_size = range.size;
}
/* We get the PCI/Mem offset from the first range or
@@ -773,30 +750,23 @@ void pci_process_bridge_OF_ranges(struct pci_controller *hose,
* hole. If they don't match, bugger.
*/
if (memno == 0 ||
- (isa_hole >= 0 && pci_addr != 0 &&
+ (isa_hole >= 0 && range.pci_addr != 0 &&
hose->pci_mem_offset == isa_mb))
- hose->pci_mem_offset = cpu_addr - pci_addr;
- else if (pci_addr != 0 &&
- hose->pci_mem_offset != cpu_addr - pci_addr) {
+ hose->pci_mem_offset = range.cpu_addr -
+ range.pci_addr;
+ else if (range.pci_addr != 0 &&
+ hose->pci_mem_offset != range.cpu_addr -
+ range.pci_addr) {
pr_info(" \\--> Skipped (offset mismatch) !\n");
continue;
}
/* Build resource */
res = &hose->mem_resources[memno++];
- res->flags = IORESOURCE_MEM;
- if (pci_space & 0x40000000)
- res->flags |= IORESOURCE_PREFETCH;
- res->start = cpu_addr;
break;
}
- if (res != NULL) {
- res->name = dev->full_name;
- res->end = res->start + size - 1;
- res->parent = NULL;
- res->sibling = NULL;
- res->child = NULL;
- }
+ if (res != NULL)
+ of_pci_range_to_resource(&range, dev, res);
}
/* If there's an ISA hole and the pci_mem_offset is -not- matching
--
1.7.0.4
^ permalink raw reply related
* [PATCH v9 2/3] of/pci: mips: convert to common of_pci_range_parser
From: Andrew Murray @ 2013-05-07 15:31 UTC (permalink / raw)
To: robherring2
Cc: linux-mips, siva.kallam, linus.walleij, thierry.reding,
Liviu.Dudau, juhosg, paulus, linux-samsung-soc, linux, jg1.han,
jgunthorpe, thomas.abraham, linux-pci, grant.likely, arnd,
devicetree-discuss, kgene.kim, bhelgaas, linux-arm-kernel,
thomas.petazzoni, monstr, linux-kernel, suren.reddy,
Andrew Murray, linuxppc-dev
In-Reply-To: <1367940674-11987-1-git-send-email-Andrew.Murray@arm.com>
This patch converts the pci_load_of_ranges function to use the new common
of_pci_range_parser.
Signed-off-by: Andrew Murray <Andrew.Murray@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
---
arch/mips/pci/pci.c | 50 ++++++++++++++++++--------------------------------
1 files changed, 18 insertions(+), 32 deletions(-)
diff --git a/arch/mips/pci/pci.c b/arch/mips/pci/pci.c
index 0872f12..0d291e9 100644
--- a/arch/mips/pci/pci.c
+++ b/arch/mips/pci/pci.c
@@ -122,51 +122,37 @@ static void pcibios_scanbus(struct pci_controller *hose)
#ifdef CONFIG_OF
void pci_load_of_ranges(struct pci_controller *hose, struct device_node *node)
{
- const __be32 *ranges;
- int rlen;
- int pna = of_n_addr_cells(node);
- int np = pna + 5;
+ struct of_pci_range range;
+ struct of_pci_range_parser parser;
pr_info("PCI host bridge %s ranges:\n", node->full_name);
- ranges = of_get_property(node, "ranges", &rlen);
- if (ranges == NULL)
- return;
hose->of_node = node;
- while ((rlen -= np * 4) >= 0) {
- u32 pci_space;
+ if (of_pci_range_parser_init(&parser, node))
+ return;
+
+ for_each_of_pci_range(&parser, &range) {
struct resource *res = NULL;
- u64 addr, size;
-
- pci_space = be32_to_cpup(&ranges[0]);
- addr = of_translate_address(node, ranges + 3);
- size = of_read_number(ranges + pna + 3, 2);
- ranges += np;
- switch ((pci_space >> 24) & 0x3) {
- case 1: /* PCI IO space */
+
+ switch (range.flags & IORESOURCE_TYPE_BITS) {
+ case IORESOURCE_IO:
pr_info(" IO 0x%016llx..0x%016llx\n",
- addr, addr + size - 1);
+ range.cpu_addr,
+ range.cpu_addr + range.size - 1);
hose->io_map_base =
- (unsigned long)ioremap(addr, size);
+ (unsigned long)ioremap(range.cpu_addr,
+ range.size);
res = hose->io_resource;
- res->flags = IORESOURCE_IO;
break;
- case 2: /* PCI Memory space */
- case 3: /* PCI 64 bits Memory space */
+ case IORESOURCE_MEM:
pr_info(" MEM 0x%016llx..0x%016llx\n",
- addr, addr + size - 1);
+ range.cpu_addr,
+ range.cpu_addr + range.size - 1);
res = hose->mem_resource;
- res->flags = IORESOURCE_MEM;
break;
}
- if (res != NULL) {
- res->start = addr;
- res->name = node->full_name;
- res->end = res->start + size - 1;
- res->parent = NULL;
- res->sibling = NULL;
- res->child = NULL;
- }
+ if (res != NULL)
+ of_pci_range_to_resource(&range, node, res);
}
}
#endif
--
1.7.0.4
^ permalink raw reply related
* [PATCH v9 1/3] of/pci: Provide support for parsing PCI DT ranges property
From: Andrew Murray @ 2013-05-07 15:31 UTC (permalink / raw)
To: robherring2
Cc: linux-mips, siva.kallam, linus.walleij, thierry.reding,
Liviu.Dudau, juhosg, paulus, linux-samsung-soc, linux, jg1.han,
jgunthorpe, thomas.abraham, linux-pci, grant.likely, arnd,
devicetree-discuss, kgene.kim, bhelgaas, linux-arm-kernel,
thomas.petazzoni, monstr, linux-kernel, suren.reddy,
Andrew Murray, linuxppc-dev
In-Reply-To: <1367940674-11987-1-git-send-email-Andrew.Murray@arm.com>
This patch factors out common implementation patterns to reduce overall kernel
code and provide a means for host bridge drivers to directly obtain struct
resources from the DT's ranges property without relying on architecture specific
DT handling. This will make it easier to write architecture independent host bridge
drivers and mitigate against further duplication of DT parsing code.
This patch can be used in the following way:
    struct of_pci_range_parser parser;
    struct of_pci_range range;

    if (of_pci_range_parser_init(&parser, np))
        ; //no ranges property

    for_each_of_pci_range(&parser, &range) {
        /*
            directly access properties of the address range, e.g.:
            range.pci_space, range.pci_addr, range.cpu_addr,
            range.size, range.flags

            alternatively obtain a struct resource, e.g.:
            struct resource res;
            of_pci_range_to_resource(&range, np, &res);
        */
    }
Additionally the implementation takes care of adjacent ranges and merges them
into a single range (as was the case with powerpc and microblaze).
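
The merging of adjacent ranges can be modelled outside the kernel. The following is a minimal userspace sketch, not the patch's implementation: struct example_range and example_merge_ranges() are hypothetical names, and the entries are assumed to be already decoded from the device tree cells. Two entries are merged when their flags match and both the PCI and CPU addresses continue exactly where the previous entry ended.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct of_pci_range (hypothetical, userspace only). */
struct example_range {
	uint32_t flags;
	uint64_t pci_addr;
	uint64_t cpu_addr;
	uint64_t size;
};

/*
 * Merge runs of contiguous entries in place.
 * Returns the number of entries left after merging.
 */
static inline size_t example_merge_ranges(struct example_range *r, size_t n)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		if (out > 0) {
			struct example_range *prev = &r[out - 1];

			/* Same type and contiguous on both buses: extend. */
			if (r[i].flags == prev->flags &&
			    r[i].pci_addr == prev->pci_addr + prev->size &&
			    r[i].cpu_addr == prev->cpu_addr + prev->size) {
				prev->size += r[i].size;
				continue;
			}
		}
		/* Otherwise start a new output entry. */
		r[out++] = r[i];
	}

	return out;
}
```

Unlike this batch version, the parser in the patch does the same comparison lazily, one merged range per call to of_pci_range_parser_one().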
Signed-off-by: Andrew Murray <Andrew.Murray@arm.com>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Rob Herring <rob.herring@calxeda.com>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Jingoo Han <jg1.han@samsung.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
---
drivers/of/address.c | 67 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/of_address.h | 48 +++++++++++++++++++++++++++++++
2 files changed, 115 insertions(+), 0 deletions(-)
diff --git a/drivers/of/address.c b/drivers/of/address.c
index 04da786..fdd0636 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -227,6 +227,73 @@ int of_pci_address_to_resource(struct device_node *dev, int bar,
return __of_address_to_resource(dev, addrp, size, flags, NULL, r);
}
EXPORT_SYMBOL_GPL(of_pci_address_to_resource);
+
+int of_pci_range_parser_init(struct of_pci_range_parser *parser,
+ struct device_node *node)
+{
+ const int na = 3, ns = 2;
+ int rlen;
+
+ parser->node = node;
+ parser->pna = of_n_addr_cells(node);
+ parser->np = parser->pna + na + ns;
+
+ parser->range = of_get_property(node, "ranges", &rlen);
+ if (parser->range == NULL)
+ return -ENOENT;
+
+ parser->end = parser->range + rlen / sizeof(__be32);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(of_pci_range_parser_init);
+
+struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser *parser,
+ struct of_pci_range *range)
+{
+ const int na = 3, ns = 2;
+
+ if (!range)
+ return NULL;
+
+ if (!parser->range || parser->range + parser->np > parser->end)
+ return NULL;
+
+ range->pci_space = parser->range[0];
+ range->flags = of_bus_pci_get_flags(parser->range);
+ range->pci_addr = of_read_number(parser->range + 1, ns);
+ range->cpu_addr = of_translate_address(parser->node,
+ parser->range + na);
+ range->size = of_read_number(parser->range + parser->pna + na, ns);
+
+ parser->range += parser->np;
+
+ /* Now consume following elements while they are contiguous */
+ while (parser->range + parser->np <= parser->end) {
+ u32 flags, pci_space;
+ u64 pci_addr, cpu_addr, size;
+
+ pci_space = be32_to_cpup(parser->range);
+ flags = of_bus_pci_get_flags(parser->range);
+ pci_addr = of_read_number(parser->range + 1, ns);
+ cpu_addr = of_translate_address(parser->node,
+ parser->range + na);
+ size = of_read_number(parser->range + parser->pna + na, ns);
+
+ if (flags != range->flags)
+ break;
+ if (pci_addr != range->pci_addr + range->size ||
+ cpu_addr != range->cpu_addr + range->size)
+ break;
+
+ range->size += size;
+ parser->range += parser->np;
+ }
+
+ return range;
+}
+EXPORT_SYMBOL_GPL(of_pci_range_parser_one);
+
#endif /* CONFIG_PCI */
/*
diff --git a/include/linux/of_address.h b/include/linux/of_address.h
index 0506eb5..4c2e6f2 100644
--- a/include/linux/of_address.h
+++ b/include/linux/of_address.h
@@ -4,6 +4,36 @@
#include <linux/errno.h>
#include <linux/of.h>
+struct of_pci_range_parser {
+ struct device_node *node;
+ const __be32 *range;
+ const __be32 *end;
+ int np;
+ int pna;
+};
+
+struct of_pci_range {
+ u32 pci_space;
+ u64 pci_addr;
+ u64 cpu_addr;
+ u64 size;
+ u32 flags;
+};
+
+#define for_each_of_pci_range(parser, range) \
+ for (; of_pci_range_parser_one(parser, range);)
+
+static inline void of_pci_range_to_resource(struct of_pci_range *range,
+ struct device_node *np,
+ struct resource *res)
+{
+ res->flags = range->flags;
+ res->start = range->cpu_addr;
+ res->end = range->cpu_addr + range->size - 1;
+ res->parent = res->child = res->sibling = NULL;
+ res->name = np->full_name;
+}
+
#ifdef CONFIG_OF_ADDRESS
extern u64 of_translate_address(struct device_node *np, const __be32 *addr);
extern bool of_can_translate_address(struct device_node *dev);
@@ -27,6 +57,11 @@ static inline unsigned long pci_address_to_pio(phys_addr_t addr) { return -1; }
#define pci_address_to_pio pci_address_to_pio
#endif
+extern int of_pci_range_parser_init(struct of_pci_range_parser *parser,
+ struct device_node *node);
+extern struct of_pci_range *of_pci_range_parser_one(
+ struct of_pci_range_parser *parser,
+ struct of_pci_range *range);
#else /* CONFIG_OF_ADDRESS */
#ifndef of_address_to_resource
static inline int of_address_to_resource(struct device_node *dev, int index,
@@ -53,6 +88,19 @@ static inline const __be32 *of_get_address(struct device_node *dev, int index,
{
return NULL;
}
+
+static inline int of_pci_range_parser_init(struct of_pci_range_parser *parser,
+ struct device_node *node)
+{
+ return -1;
+}
+
+static inline struct of_pci_range *of_pci_range_parser_one(
+ struct of_pci_range_parser *parser,
+ struct of_pci_range *range)
+{
+ return NULL;
+}
#endif /* CONFIG_OF_ADDRESS */
--
1.7.0.4
^ permalink raw reply related
* [PATCH v9 0/3] of/pci: Provide common support for PCI DT parsing
From: Andrew Murray @ 2013-05-07 15:31 UTC (permalink / raw)
To: robherring2
Cc: linux-mips, siva.kallam, linus.walleij, thierry.reding,
Liviu.Dudau, juhosg, paulus, linux-samsung-soc, linux, jg1.han,
jgunthorpe, thomas.abraham, linux-pci, grant.likely, arnd,
devicetree-discuss, kgene.kim, bhelgaas, linux-arm-kernel,
thomas.petazzoni, monstr, linux-kernel, suren.reddy,
Andrew Murray, linuxppc-dev
This patchset factors out duplicated code associated with parsing PCI
DT "ranges" properties across the architectures and introduces a
"ranges" parser. This parser "of_pci_range_parser" can be used directly
by ARM host bridge drivers enabling them to obtain ranges from device
trees.
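
As background, the open-coded parsers replaced by this series classify each "ranges" entry from its first cell: bits 24-25 select the address space and bit 30 marks prefetchable memory. A minimal sketch of that decoding follows; the enum and function names are hypothetical, and only the shift and mask values are taken from the removed code in patches 2 and 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the space codes used by the removed switch. */
enum example_pci_space {
	EX_PCI_OTHER,	/* code 0, not handled by the removed code */
	EX_PCI_IO,	/* code 1: PCI IO space */
	EX_PCI_MEM32,	/* code 2: PCI Memory space */
	EX_PCI_MEM64,	/* code 3: PCI 64 bits Memory space */
};

/* Classify the first cell of a ranges entry (already in CPU byte order). */
static inline enum example_pci_space example_space_type(uint32_t pci_space)
{
	switch ((pci_space >> 24) & 0x3) {
	case 1:
		return EX_PCI_IO;
	case 2:
		return EX_PCI_MEM32;
	case 3:
		return EX_PCI_MEM64;
	default:
		return EX_PCI_OTHER;
	}
}

/* Bit 30 of the first cell flags a prefetchable memory range. */
static inline bool example_is_prefetch(uint32_t pci_space)
{
	return (pci_space & 0x40000000) != 0;
}
```

With the common parser, callers switch on range.flags & IORESOURCE_TYPE_BITS instead of decoding the cell themselves.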
I've included the Reviewed-by, Tested-by and Acked-by tags received from
v5/v6/v7/v8 in this patchset. Earlier versions of this patchset (v3) have been
tested by:
Thierry Reding <thierry.reding@avionic-design.de>
Jingoo Han <jg1.han@samsung.com>
I've tested that this patchset builds and runs on ARM and that it builds on
PowerPC, x86_64, MIPS and Microblaze.
Compared to the v8 sent by Andrew Murray, the following changes have been made
(please note that the MIPS patch is unchanged from v8):
* Remove the unification of pci_process_bridge_OF_ranges between PowerPC and
Microblaze. Feedback from Bjorn and Benjamin (along with a NAK) suggested
that this goes against their future direction (using more of struct
pci_host_bridge and less of arch specific struct pci_controller).
Compared to the v7 sent by Andrew Murray, the following changes have been made
(please note that the first patch is unchanged from v7):
* Rename of_pci_range_parser to of_pci_range_parser_init and
of_pci_process_ranges to of_pci_range_parser_one as suggested by Grant
Likely.
* Reverted back to using a switch statement instead of if/else in
pci_process_bridge_OF_ranges. Grant Likely highlighted this change from
the original code which was unnecessary.
* Squashed in a patch provided by Gabor Juhos which fixes build errors on
MIPS found in the last patchset.
Compared to the v6 sent by Andrew Murray, the following changes have
been made in response to build errors/warnings:
* Inclusion of linux/of_address.h in of_pci.c as suggested by Michal
Simek to prevent compilation failures on Microblaze (and others) and his
ack.
* Use of externs and static inlines, plus a typo fix, in linux/of_address.h in
response to linker errors (multiple definition) on x86_64 as spotted by a
kbuild test robot on (jcooper/linux.git mvebu/drivers)
* Add EXPORT_SYMBOL_GPL to of_pci_range_parser function to be consistent
with of_pci_process_ranges function
Compared to the v5 sent by Andrew Murray, the following changes have
been made:
* Use of CONFIG_64BIT instead of CONFIG_[a32bitarch] as suggested by
Rob Herring in drivers/of/of_pci.c
* Added forward declaration of struct pci_controller in linux/of_pci.h
to prevent compiler warning as suggested by Thomas Petazzoni
* Improved error checking (!range check), removal of unnecessary be32_to_cpup
call, improved formatting of struct of_pci_range_parser layout and
replacement of macro with a static inline. All suggested by Rob Herring.
Compared to the v4 (incorrectly labelled v3) sent by Andrew Murray,
the following changes have been made:
* Split the patch as suggested by Rob Herring
Compared to the v3 sent by Andrew Murray, the following changes have
been made:
* Unify and move duplicate pci_process_bridge_OF_ranges functions to
drivers/of/of_pci.c as suggested by Rob Herring
* Fix potential build errors with Microblaze/MIPS
Compared to "[PATCH v5 01/17] of/pci: Provide support for parsing PCI DT
ranges property", the following changes have been made:
* Correct use of IORESOURCE_* as suggested by Russell King
* Improved interface and naming as suggested by Thierry Reding
Compared to the v2 sent by Andrew Murray, Thomas Petazzoni did:
* Add a memset() on the struct of_pci_range_iter when starting the
for loop in for_each_pci_range(). Otherwise, with an uninitialized
of_pci_range_iter, of_pci_process_ranges() may crash.
* Add parentheses around 'res', 'np' and 'iter' in the
for_each_of_pci_range macro definitions. Otherwise, passing
something like &foobar as 'res' didn't work.
* Rebased on top of 3.9-rc2, which required fixing a few conflicts in
the Microblaze code.
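
The macro parenthesisation point in the list above can be illustrated with a small standalone sketch. The struct and macro names are hypothetical and unrelated to the actual for_each_of_pci_range definition; the point is only that each macro parameter is wrapped in parentheses before '->' is applied.

```c
/* Hypothetical iterator type, for illustration only. */
struct example_iter {
	int pos;
	int end;
};

/*
 * Each use of 'iter' is parenthesised, so the macro works for both a
 * pointer variable and an expression such as &obj.
 *
 * Without the parentheses, EXAMPLE_REMAINING(&obj) would expand to
 * '&obj->end - &obj->pos', which parses as '&(obj->end) - &(obj->pos)'
 * and fails to compile when obj is a struct rather than a pointer.
 */
#define EXAMPLE_REMAINING(iter)	((iter)->end - (iter)->pos)
```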
v2:
This follows on from suggestions made by Grant Likely
(marc.info/?l=linux-kernel&m=136079602806328)
Andrew Murray (3):
of/pci: Provide support for parsing PCI DT ranges property
of/pci: mips: convert to common of_pci_range_parser
of/pci: microblaze: convert to common of_pci_range_parser
arch/microblaze/pci/pci-common.c | 106 ++++++++++++++------------------------
arch/mips/pci/pci.c | 50 ++++++-----------
drivers/of/address.c | 67 ++++++++++++++++++++++++
include/linux/of_address.h | 48 +++++++++++++++++
4 files changed, 171 insertions(+), 100 deletions(-)
^ permalink raw reply
* Re: [PATCH] arch/powerpc: advertise ISA2.07, HTM, DSCR, EBB and ISEL bits in HWCAP2
From: Ryan Arnold @ 2013-05-07 15:11 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev, Michael R Meissner, Steve Munroe, Peter Bergner,
Michael Neuling, Nishanth Aravamudan
In-Reply-To: <1367876228.15842.62.camel@pasglop>
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/06/2013
04:37:08 PM:
> On Mon, 2013-05-06 at 09:38 -0500, Ryan Arnold wrote:
> > My understanding was that these bits being 'on' is an indication of
> > what features the hardware supports (or what the kernel emulates) and
> > not an indication of whether that facility is currently enabled or
> > not. If the hardware supports a particular feature but it is not
> > enabled I'd expect that user-space usage of that feature would cause
> > the kernel to trap on a facility availability exception (which is how
> > Altivec/VMX is implemented, being defaulted to turned off).
>
> Right but the discussion is about whether we should expose the bits
> when the kernel doesn't have the ability to handle the feature :-)
>
> IE. We need to remove the HTM feature if the kernel is compiled without
> transactional memory support.
Thanks for explaining. This is exactly how it should work.
Ryan
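
The rule agreed on here (do not advertise a bit the kernel cannot back) can be sketched as a simple mask. This is an illustrative model only, not the kernel's code: the function name, the kernel_has_tm parameter and the bit value are all assumptions.

```c
#include <stdbool.h>

/* Assumed bit value, for illustration only. */
#define EXAMPLE_FEATURE2_HTM	0x40000000UL

/*
 * Return the HWCAP2 word to expose to userspace: the hardware's bits,
 * minus HTM when the kernel was built without transactional memory
 * support (modelled here by the 'kernel_has_tm' flag).
 */
static inline unsigned long example_advertised_hwcap2(unsigned long hw_bits,
						      bool kernel_has_tm)
{
	if (!kernel_has_tm)
		hw_bits &= ~EXAMPLE_FEATURE2_HTM;

	return hw_bits;
}
```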
^ permalink raw reply
* Re: [PATCH v2 1/4] powerpc/cputable: reserve bits in HWCAP2 for new features
From: Ryan Arnold @ 2013-05-07 15:07 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Michael Neuling, Nishanth Aravamudan, Steve Munroe, Peter Bergner,
linuxppc-dev, Michael R Meissner
In-Reply-To: <1367876461.15842.66.camel@pasglop>
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/06/2013
04:41:01 PM:
> On Mon, 2013-05-06 at 14:07 -0500, Ryan Arnold wrote:
> > Notice that I changed DSCR to DSC. The 'R' wasn't descriptive.
>
> The "R" is the name of the register for which we are exposing the
> availability to userspace... it's also the name of the sysfs entry so
> I'd rather keep it for consistency.
I'm fine with keeping the 'R' in the name. Thanks for the input.
Ryan
^ permalink raw reply