LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 3/3] powerpc: Set DSCR bit in FSCR setup
From: Michael Neuling @ 2013-03-05  5:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Michael Neuling, linuxppc-dev, Ian Munsie
In-Reply-To: <1362462352-12009-1-git-send-email-mikey@neuling.org>

We support DSCR (Data Stream Control Register) so we should make sure we set it
in the FSCR (Facility Status & Control Register) incase some firmwares don't
set it.  If we don't set this, we'll take a facility unavailable exception when
using the DSCR.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/kernel/cpu_setup_power.S |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index bb2d203..ea847ab 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -116,7 +116,7 @@ __init_LPCR:
 
 __init_FSCR:
 	mfspr	r3,SPRN_FSCR
-	ori	r3,r3,FSCR_TAR
+	ori	r3,r3,FSCR_TAR|FSCR_DSCR
 	mtspr	SPRN_FSCR,r3
 	blr
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 2/3] powerpc: Add DSCR FSCR register bit definition
From: Michael Neuling @ 2013-03-05  5:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Michael Neuling, linuxppc-dev, Ian Munsie
In-Reply-To: <1362462352-12009-1-git-send-email-mikey@neuling.org>

This sets the DSCR (Data Stream Control Register) in the FSCR (Facility Status
& Control Register).

Also harmonise TAR (Target Address Register) FSCR bit definition too.

Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/include/asm/reg.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index e665861..c9c67fc 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -266,7 +266,8 @@
 #define SPRN_HSRR0	0x13A	/* Hypervisor Save/Restore 0 */
 #define SPRN_HSRR1	0x13B	/* Hypervisor Save/Restore 1 */
 #define SPRN_FSCR	0x099	/* Facility Status & Control Register */
-#define FSCR_TAR	(1<<8)	/* Enable Target Adress Register */
+#define   FSCR_TAR	(1 << (63-55)) /* Enable Target Address Register */
+#define   FSCR_DSCR	(1 << (63-61)) /* Enable Data Stream Control Register */
 #define SPRN_TAR	0x32f	/* Target Address Register */
 #define SPRN_LPCR	0x13E	/* LPAR Control Register */
 #define   LPCR_VPM0	(1ul << (63-0))
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 1/3] powerpc: Fix setting FSCR for HV=0 and on secondary CPUs
From: Michael Neuling @ 2013-03-05  5:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Michael Neuling, linuxppc-dev, Ian Munsie
In-Reply-To: <1362462352-12009-1-git-send-email-mikey@neuling.org>

Currently we only set the FSCR (Facility Status and Control Register) when HV=1
but this feature is available when HV=0 also.  This patch sets FSCR when HV=0.

Also, we currently only set the FSCR on the master CPU.  This patch also sets
the FSCR on secondary CPUs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: Ian Munsie <imunsie@au1.ibm.com>
---
 arch/powerpc/kernel/cpu_setup_power.S |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index d29facb..bb2d203 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -48,6 +48,7 @@ _GLOBAL(__restore_cpu_power7)
 
 _GLOBAL(__setup_cpu_power8)
 	mflr	r11
+	bl	__init_FSCR
 	bl	__init_hvmode_206
 	mtlr	r11
 	beqlr
@@ -56,13 +57,13 @@ _GLOBAL(__setup_cpu_power8)
 	mfspr	r3,SPRN_LPCR
 	oris	r3, r3, LPCR_AIL_3@h
 	bl	__init_LPCR
-	bl	__init_FSCR
 	bl	__init_TLB
 	mtlr	r11
 	blr
 
 _GLOBAL(__restore_cpu_power8)
 	mflr	r11
+	bl	__init_FSCR
 	mfmsr	r3
 	rldicl.	r0,r3,4,63
 	beqlr
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 0/3] powerpc: FSCR fixes for POWER8
From: Michael Neuling @ 2013-03-05  5:45 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Michael Neuling, linuxppc-dev, Ian Munsie
In-Reply-To: <1362459996.21357.46.camel@pasglop>

Benh, 

Here are a few fixes for the POWER8 (Performance Optimization With Enhanced
RISC Eight) FSCR (Facility Status & Control Register).

First patch changes the FSCR so that it's set on secondary CPUs as well as when
MSR HV=0.

Second two patches make sure that the FSCR DSCR (Data Stream Control
Register) bit is set so that we don't trap on DSCR usage.

These are aimed to fix issues in 3.9.

v2:
  Addressing comments from benh (Benjamin Herrenschmidt)

Mikey

^ permalink raw reply

* Re: [PATCH 2/3] powerpc: Add DSCR FSCR register bit definition
From: Benjamin Herrenschmidt @ 2013-03-05  5:09 UTC (permalink / raw)
  To: Michael Neuling; +Cc: linuxppc-dev, Ian Munsie
In-Reply-To: <1362390402-17725-3-git-send-email-mikey@neuling.org>

On Mon, 2013-03-04 at 20:46 +1100, Michael Neuling wrote:
> Also harmonise TAR bit definition too.

Same, expand accronyms, minimum blurb about what these are about (not a
long explanation, just so that when I come out of the blue I can at
least connect it to something that makes sense :-)

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH 1/3] powerpc: Fix setting FSCR for HV=0 and secondary CPUs
From: Benjamin Herrenschmidt @ 2013-03-05  5:06 UTC (permalink / raw)
  To: Michael Neuling; +Cc: linuxppc-dev, Ian Munsie
In-Reply-To: <1362390402-17725-2-git-send-email-mikey@neuling.org>

On Mon, 2013-03-04 at 20:46 +1100, Michael Neuling wrote:
> Currently we only set the FSCR when HV=1 but this feature is available when
> HV=0 also.  This patch sets FSCR when HV=0.
> 
> Also, we currently only set the FSCR on the master CPU.  This patch also sets
> the FSCR on secondary CPUs.

Please add a quick blurb/reminder of what FSCR is (at least expand the
accronym).

Cheers,
Ben.

> Signed-off-by: Michael Neuling <mikey@neuling.org>
> cc: Ian Munsie <imunsie@au1.ibm.com>
> ---
>  arch/powerpc/kernel/cpu_setup_power.S |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
> index d29facb..bb2d203 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.S
> +++ b/arch/powerpc/kernel/cpu_setup_power.S
> @@ -48,6 +48,7 @@ _GLOBAL(__restore_cpu_power7)
>  
>  _GLOBAL(__setup_cpu_power8)
>  	mflr	r11
> +	bl	__init_FSCR
>  	bl	__init_hvmode_206
>  	mtlr	r11
>  	beqlr
> @@ -56,13 +57,13 @@ _GLOBAL(__setup_cpu_power8)
>  	mfspr	r3,SPRN_LPCR
>  	oris	r3, r3, LPCR_AIL_3@h
>  	bl	__init_LPCR
> -	bl	__init_FSCR
>  	bl	__init_TLB
>  	mtlr	r11
>  	blr
>  
>  _GLOBAL(__restore_cpu_power8)
>  	mflr	r11
> +	bl	__init_FSCR
>  	mfmsr	r3
>  	rldicl.	r0,r3,4,63
>  	beqlr

^ permalink raw reply

* Re: [PATCH 2/3] irq: Add hw continuous IRQs map to virtual continuous IRQs support
From: Paul Mundt @ 2013-03-05  2:41 UTC (permalink / raw)
  To: Mike Qiu; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <1358235536-32741-3-git-send-email-qiudayu@linux.vnet.ibm.com>

On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
> Adding a function irq_create_mapping_many() which can associate
> multiple MSIs to a continous irq mapping.
> 
> This is needed to enable multiple MSI support for pSeries.
> 
> +int irq_create_mapping_many(struct irq_domain *domain,
> +		irq_hw_number_t hwirq_base, int count)
> +{

Other than the other review comments already made, I think you can
simplify this considerably by simply doing what irq_create_strict_mappings() does,
and relaxing the irq_base requirements.

In any event, as you are creating a new interface, I don't think you want
to carry around half of the legacy crap that irq_create_mapping() has to
deal with. We made the decision to avoid this with irq_create_strict_mappings()
intentionally, too.

^ permalink raw reply

* Re: [PATCH 2/3] irq: Add hw continuous IRQs map to virtual continuous IRQs support
From: Michael Ellerman @ 2013-03-05  2:23 UTC (permalink / raw)
  To: Mike Qiu; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <1358235536-32741-3-git-send-email-qiudayu@linux.vnet.ibm.com>

On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
> Adding a function irq_create_mapping_many() which can associate
> multiple MSIs to a continous irq mapping.
> 
> This is needed to enable multiple MSI support for pSeries.
> 
> Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
> ---
>  include/linux/irq.h       |    2 +
>  include/linux/irqdomain.h |    3 ++
>  kernel/irq/irqdomain.c    |   61 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 66 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/irq.h b/include/linux/irq.h
> index 60ef45b..e00a7ec 100644
> --- a/include/linux/irq.h
> +++ b/include/linux/irq.h
> @@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
>  #define irq_alloc_desc_from(from, node)		\
>  	irq_alloc_descs(-1, from, 1, node)
>  
> +#define irq_alloc_desc_n(nevc, node)		\
> +	irq_alloc_descs(-1, 0, nevc, node)

This has been superseeded by irq_alloc_descs_from(), which is the right
way to do it.

> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index 0d5b17b..831dded 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -168,6 +168,9 @@ extern int irq_create_strict_mappings(struct irq_domain *domain,
>  				      unsigned int irq_base,
>  				      irq_hw_number_t hwirq_base, int count);
>  
> +extern int irq_create_mapping_many(struct irq_domain *domain,
> +					irq_hw_number_t hwirq_base, int count);
> +
>  static inline int irq_create_identity_mapping(struct irq_domain *host,
>  					      irq_hw_number_t hwirq)
>  {
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 96f3a1d..38648e6 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
>  }
>  EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
>  
> +/**
> + * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
> + * @domain: domain owning the interrupt range
> + * @hwirq_base: beginning of continuous hardware IRQ range
> + * @count: Number of interrupts to map

For multiple-MSI the allocated interrupt numbers must be a power-of-2,
and must be naturally aligned. I don't /think/ that's a requirement for
the virtual numbers, but it's probably best that we do it anyway.

So this API needs to specify that it will give you back a power-of-2
block that is naturally aligned - otherwise you can't use it for MSI.

> + * This routine is used for allocating and mapping a range of hardware
> + * irqs to virtual IRQs where the virtual irq numbers are not at pre-defined
> + * locations.

This comment doesn't make sense to me.

> + *
> + * Greater than 0 is returned upon success, while any failure to establish a
> + * static mapping is treated as an error.
> + */
> +int irq_create_mapping_many(struct irq_domain *domain,
> +		irq_hw_number_t hwirq_base, int count)
> +{
> +	int ret, irq_base;
> +	int virq, i;
> +
> +	pr_debug("irq_create_mapping(0x%p, 0x%lx)\n", domain, hwirq_base);


I'd like to see this whole function rewritten to reduce the duplication
vs irq_create_mapping(). I don't see any reason why this can't be the
core routine, and irq_create_mapping() becomes a caller of it, passing a
count of 1 ?

> +	/* Look for default domain if nececssary */
> +	if (!domain)
> +		domain = irq_default_domain;
> +	if (!domain) {
> +		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
> +			, hwirq_base);
> +		WARN_ON(1);
> +		return 0;
> +	}
> +	pr_debug("-> using domain @%p\n", domain);
> +
> +	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
> +	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
> +		return irq_domain_legacy_revmap(domain, hwirq_base);

The above doesn't work.

> +	/* Check if mapping already exists */
> +	for (i = 0; i < count; i++) {
> +		virq = irq_find_mapping(domain, hwirq_base+i);
> +		if (virq) {
> +			pr_debug("existing mapping on virq %d,"
> +					" now dispose it first\n", virq);
> +			irq_dispose_mapping(virq);

You might have just disposed of someone elses mapping, we shouldn't do
that. It should be an error to the caller.

cheers

^ permalink raw reply

* Re: [PATCH -V1 06/24] powerpc: Reduce PTE table memory wastage
From: Paul Mackerras @ 2013-03-05  2:12 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, linux-mm
In-Reply-To: <874ngr2zz1.fsf@linux.vnet.ibm.com>

On Mon, Mar 04, 2013 at 04:28:42PM +0530, Aneesh Kumar K.V wrote:
> Paul Mackerras <paulus@samba.org> writes:
> 
> > The other general comment I have is that it's not really clear when a
> > page will be on the mm->context.pgtable_list and when it won't.  I
> > would like to see an invariant that says something like "the page is
> > on the pgtable_list if and only if (page->_mapcount & FRAG_MASK) is
> > neither 0 nor FRAG_MASK".  But that doesn't seem to be the case
> > exactly, and I can't see any consistent rule, which makes me think
> > there are going to be bugs in corner cases.
> >
> 
> 
> I added the below comment when initializing the list.
> 
> +#ifdef CONFIG_PPC_64K_PAGES
> +       /*
> +        * Used to support 4K PTE fragment. The pages are added to list,
> +        * when we have free framents in the page. We track the whether
> +        * a page frament is available using page._mapcount. A value of
> +        * zero indicate none of the fragments are used and page can be
> +        * freed. A value of FRAG_MASK indicate all the fragments are used
> +        * and hence the page will be removed from the below list.
> +        */
> +       INIT_LIST_HEAD(&init_mm.context.pgtable_list);
> +#endif
> 
> I am not sure about why you say there is no consistent rule. Can you
> elaborate on that ?

Well, sometimes you take things off the list when mask == 0, and
sometimes when (mask & FRAG_MASK) == 0.  So it's not clear whether the
page is supposed to be on the list when (mask & FRAG_MASK) == 0 but
mask != 0.  If you stated in a comment what the rule was supposed to
be then reviewers could check whether your code implemented that rule.
Also, if you had a consistent rule you could more easily add debug
code to check that the rule was being followed.

Also, that comment above doesn't say anything about the upper bits and
whether they have any influence on whether the page should be on the
list or not.

> > Consider, for example, the case where a page has two fragments still
> > in use, and one of them gets queued up by RCU for freeing via a call
> > to page_table_free_rcu, and then the other one gets freed through
> > page_table_free().  Neither the call to page_table_free_rcu nor the
> > call to page_table_free will take the page off the list AFAICS, and
> > then __page_table_free_rcu() will free the page while it's still on
> > the pgtable_list.
> 
> The last one that ends up doing atomic_xor_bits which cause the mapcount
> to go zero, will take the page off the list and free the page. 

No, look at the example again.  page_table_free_rcu() won't take it
off the list because it uses the (mask & FRAG_MASK) == 0 test, which
fails (one fragment is still in use).  page_table_free() won't take it
off the list because it uses the mask == 0 test, which also fails (one
fragment is still waiting for the RCU grace period).  Finally,
__page_table_free_rcu() doesn't take it off the list, it just frees
the page.  Oops. :)

Paul.

^ permalink raw reply

* Re: [PATCH -V1 09/24] powerpc: Decode the pte-lp-encoding bits correctly.
From: Paul Mackerras @ 2013-03-05  2:02 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, linux-mm
In-Reply-To: <87y5e31jem.fsf@linux.vnet.ibm.com>

On Mon, Mar 04, 2013 at 05:11:53PM +0530, Aneesh Kumar K.V wrote:
> Paul Mackerras <paulus@samba.org> writes:
> >> +static inline int hpte_actual_psize(struct hash_pte *hptep, int psize)
> >> +{
> >> +	unsigned int mask;
> >> +	int i, penc, shift;
> >> +	/* Look at the 8 bit LP value */
> >> +	unsigned int lp = (hptep->r >> LP_SHIFT) & ((1 << LP_BITS) - 1);
> >> +
> >> +	penc = 0;
> >> +	for (i = 0; i < MMU_PAGE_COUNT; i++) {
> >> +		/* valid entries have a shift value */
> >> +		if (!mmu_psize_defs[i].shift)
> >> +			continue;
> >> +
> >> +		/* encoding bits per actual page size */
> >> +		shift = mmu_psize_defs[i].shift - 11;
> >> +		if (shift > 9)
> >> +			shift = 9;
> >> +		mask = (1 << shift) - 1;
> >> +		if ((lp & mask) == mmu_psize_defs[psize].penc[i])
> >> +			return i;
> >> +	}
> >> +	return -1;
> >> +}
> >
> > This doesn't look right to me.  First, it's not clear what the 11 and
> > 9 refer to, and I think the 9 should be LP_BITS (i.e. 8).  Secondly,
> > the mask for the comparison needs to depend on the actual page size
> > not the base page size.
> 
> That 11 should be 12.That depends on the fact that we have below mapping

And the 12 should be LP_SHIFT, shouldn't it?

>  rrrr rrrz 	≥8KB
> 
> Yes, that 9 should be LP_BITs. 
> 
> We are generating mask based on actual page size above (variable i in
> the for loop).

OK, yes, you're right.

> > I don't see where in this function you set the penc[] elements for
> > invalid actual page sizes to -1.
> 
> We do the below
> 
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -125,7 +125,7 @@ static struct mmu_psize_def mmu_psize_defaults_old[] = {
>         [MMU_PAGE_4K] = {
>                 .shift  = 12,
>                 .sllp   = 0,
> -               .penc   = 0,
> +               .penc   = { [0 ... MMU_PAGE_COUNT - 1] = -1 },
>                 .avpnm  = 0,

Yes, which sets them for the entries you initialize, but not for the
others.  For example, the entry for MMU_PAGE_64K will initially be all
zeroes.  Then we find an entry in the ibm,segment-page-sizes property
for 64k pages, so we set mmu_psize_defs[MMU_PAGE_64K].shift to 16,
making that entry valid, but we never set any of the .penc[] entries
to -1, leading your other code to think that it can do (say) 1M pages
in a 64k segment using an encoding of 0.

Also, I noticed that the code in the if (base_idx < 0) statement is
wrong.  It needs to advance prop (and decrease size) by 2 * lpnum,
not just 2.

Paul.

^ permalink raw reply

* Re: [PATCH] drivers/tty/hvc: using strlcpy instead of strncpy
From: Chen Gang @ 2013-03-05  1:58 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Greg KH, tklauser, wfp5p, linuxppc-dev, alan
In-Reply-To: <512F5FDC.3060000@suse.cz>

于 2013年02月28日 21:47, Jiri Slaby 写道:
>>>   when strlen(&pi->location_code[0]) == HVCS_CLC_LENGTH + 2
> It cannot, pi->location_code is defined as char[HVCS_CLC_LENGTH + 1].
> 

  really, it is, I did not notice it.

  but I still prefer to modify it, but the patch should be changed
  such as:
    subject: beautify code: deleting useless judging code.
    comments: src buf len and dest buf len are the same, strcpy is better.
    contents: using strcpy instead of strncpy, and delete judging code.

  is it ok ?

BTW:
  sorry for my reply is too late, and did not notify it, originally before.
    I have to do some urgent things, during these days.
      my father had a serious heart disease, and is in hospital.
      during these days, most of my time has to be in hospital.
      (God Bless, and thank Jesus Christ, my father is safe, now).
    within my company (Asianux), I also have something to do.

  excuse me:
    within this week, maybe can not get my mail reply, too.




-- 
Chen Gang

Asianux Corporation

^ permalink raw reply

* [PATCH v2] powerpc: Wireup the kcmp syscall to sys_ni
From: Tony Breeds @ 2013-03-05  1:57 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, LinuxPPC-dev; +Cc: Stephen Rothwell
In-Reply-To: <1362443994-26580-1-git-send-email-tony@bakeyournoodle.com>

Since kmp takes 2 unsigned long args there should be a compat wrapper.
Since one isn't provided I think it's safer just to hook this up to not
implemented.  If we need it later we can do it properly then.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
Changes since v1:
 - Add comment to make finding the syscall later - thanks sfr

 arch/powerpc/include/asm/systbl.h      | 1 +
 arch/powerpc/include/asm/unistd.h      | 2 +-
 arch/powerpc/include/uapi/asm/unistd.h | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 535b6d8..ebbec52 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -358,3 +358,4 @@ SYSCALL_SPU(setns)
 COMPAT_SYS(process_vm_readv)
 COMPAT_SYS(process_vm_writev)
 SYSCALL(finit_module)
+SYSCALL(ni_syscall) /* sys_kcmp */
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index f25b5c4..1487f0f 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,7 +12,7 @@
 #include <uapi/asm/unistd.h>
 
 
-#define __NR_syscalls		354
+#define __NR_syscalls		355
 
 #define __NR__exit __NR_exit
 #define NR_syscalls	__NR_syscalls
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 8c478c6..74cb4d7 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -376,6 +376,7 @@
 #define __NR_process_vm_readv	351
 #define __NR_process_vm_writev	352
 #define __NR_finit_module	353
+#define __NR_kcmp		354
 
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.8.1.4

^ permalink raw reply related

* Re: [PATCH -V1 07/24] powerpc: Add size argument to pgtable_cache_add
From: Paul Mackerras @ 2013-03-05  1:50 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, linux-mm
In-Reply-To: <871ubv2zsv.fsf@linux.vnet.ibm.com>

On Mon, Mar 04, 2013 at 04:32:24PM +0530, Aneesh Kumar K.V wrote:
> 
> Now with table_size argument, the first arg is no more the shift value,
> rather it is index into the array. Hence i changed the variable name. I
> will split that patch to make it easy for review.

OK, so you're saying that the simple relation between index and the
size of the objects in PGT_CACHE(index) no longer holds.  That worries
me, because now, what guarantees that two callers won't use the same
index value with two different sizes?  And what guarantees that we
won't have two callers using different index values but the same size
(which wouldn't be a disaster but would be a waste of space)?

I think it would be preferable to keep the relation between shift and
the size of the objects and just arrange to use a different shift
value for the pmd objects when you need to.

Paul.

^ permalink raw reply

* Re: [PATCH] powerpc: Wireup the kcmp syscall to sys_ni
From: Stephen Rothwell @ 2013-03-05  1:47 UTC (permalink / raw)
  To: Tony Breeds; +Cc: LinuxPPC-dev
In-Reply-To: <1362443994-26580-1-git-send-email-tony@bakeyournoodle.com>

[-- Attachment #1: Type: text/plain, Size: 599 bytes --]

Hi Tony,

On Tue,  5 Mar 2013 11:39:54 +1100 Tony Breeds <tony@bakeyournoodle.com> wrote:
>
> diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
> index 535b6d8..289bb52 100644
> --- a/arch/powerpc/include/asm/systbl.h
> +++ b/arch/powerpc/include/asm/systbl.h
> @@ -358,3 +358,4 @@ SYSCALL_SPU(setns)
>  COMPAT_SYS(process_vm_readv)
>  COMPAT_SYS(process_vm_writev)
>  SYSCALL(finit_module)
> +SYSCALL(ni_syscall)

It is probably worth commenting what syscall should be there.

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH] kexec/ppc: Fix kernel program entry point while changing the load addr
From: Simon Horman @ 2013-03-05  1:40 UTC (permalink / raw)
  To: Suzuki K. Poulose
  Cc: Matthew McClintock, Sebastian Andrzej Siewior, kexec,
	linuxppc-dev
In-Reply-To: <51343788.7070503@in.ibm.com>

On Mon, Mar 04, 2013 at 11:26:24AM +0530, Suzuki K. Poulose wrote:
> On 03/04/2013 07:11 AM, Simon Horman wrote:
> >[ Cc: linuxppc-dev@lists.ozlabs.org ]
> >
> >On Sun, Mar 03, 2013 at 01:06:00PM +0530, Suzuki K. Poulose wrote:
> >>From: Suzuki K. Poulose <suzuki@in.ibm.com>
> >>
> >>uImage probe fills the entry point (ep) based on the load_addr
> >>from the uImage headers. If we change the load_addr, we should
> >>accordingly update the entry point.
> >>
> >>For ELF, calculate the offset of e_entry from the virtual start
> >>address and add it to the physical start address to find the
> >>physical address of kernel entry.
> >>
> >>i.e,
> >>   pa (e_entry) = pa(phdr[0].p_vaddr) + (e_entry - phdr[0].p_vaddr)
> >>                = kernel_addr + (e_entry - phdr[0].p_vaddr)
> >
> >Would it be possible for someone to provide a review of this change?
> To make it bit more clear :
> 
> The entry point of the kernel is usually at 0 offset from the first
> PT_LOAD section. The current code makes this assumption and uses the
> pa(phdr[0].p_vaddr) as the kernel entry.
> 
> But this *may* not be true always, in such a case the kexec would fail.
> While I fixed the uImage case, I thought it would be better to
> handle the same case in ELF.
> 
> Btw, this calculation is not specific to ppc32.

Thanks, I have applied the patch.

^ permalink raw reply

* Re: [PATCH V2] lglock: add read-preference local-global rwlock
From: Michel Lespinasse @ 2013-03-05  1:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Lai Jiangshan, linux-doc, peterz, fweisbec, mingo, linux-arch,
	linux, xiaoguangrong, wangyun, paulmck, nikunj, linux-pm, rusty,
	rostedt, rjw, namhyung, tglx, linux-arm-kernel, Lai Jiangshan,
	netdev, linux-kernel, vincent.guittot, sbw, Srivatsa S. Bhat, tj,
	akpm, linuxppc-dev
In-Reply-To: <20130303174056.GA30176@redhat.com>

On Sun, Mar 3, 2013 at 9:40 AM, Oleg Nesterov <oleg@redhat.com> wrote:
>> However, I still think that FALLBACK_BASE only adds the unnecessary
>> complications. But even if I am right this is subjective of course, please
>> feel free to ignore.

Would it help if I sent out that version (without FALLBACK_BASE) as a
formal proposal ?

> Hmm. But then I do not understand the lglock annotations. Obviously,
> rwlock_acquire_read() in lg_local_lock() can't even detect the simplest
> deadlock, say, lg_local_lock(LOCK) + lg_local_lock(LOCK). Not to mention
> spin_lock(X) + lg_local_lock(Y) vs lg_local_lock(Y) + spin_lock(X).
>
> OK, I understand that it is not easy to make these annotations correct...

I am going to send out a proposal to fix the existing lglock
annotations and detect the two cases you noticed. It's actually not
that hard :)

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

^ permalink raw reply

* Re: [PATCH] ppc32: Fix compile of sha1-powerpc-asm.S
From: Christian Kujau @ 2013-03-05  1:23 UTC (permalink / raw)
  To: Tony Breeds; +Cc: LinuxPPC-dev
In-Reply-To: <1361845205-30606-1-git-send-email-tony@bakeyournoodle.com>

On Tue, 26 Feb 2013 at 13:20, Tony Breeds wrote:
> When building with CRYPTO_SHA1_PPC enabled we fail with:
> ---
> powerpc/crypto/sha1-powerpc-asm.S: Assembler messages:
> powerpc/crypto/sha1-powerpc-asm.S:116: Error: can't resolve `0' {*ABS* section} - `STACKFRAMESIZE' {*UND* section}
> powerpc/crypto/sha1-powerpc-asm.S:116: Error: expression too complex
> powerpc/crypto/sha1-powerpc-asm.S:178: Error: unsupported relocation against STACKFRAMESIZE
> ---
> 
> Use INT_FRAME_SIZE instead.
> 
> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>

Thanks for the fix! Ran into this as well, with your patch 3.9-rc1 
compiles again (and it even boots :-))

Tested-by: Christian Kujau <lists@nerdbynature.de>

$ grep -A10 sha1 /proc/crypto
name         : sha1
driver       : sha1-powerpc
module       : kernel
priority     : 0
refcnt       : 1
selftest     : passed
type         : shash
blocksize    : 64
digestsize   : 20


> ---
>  arch/powerpc/crypto/sha1-powerpc-asm.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> FWIW the SHA1_PPC makes about a 20% difference on my 32bit board
> 
> diff --git a/arch/powerpc/crypto/sha1-powerpc-asm.S b/arch/powerpc/crypto/sha1-powerpc-asm.S
> index a5f8264..125e165 100644
> --- a/arch/powerpc/crypto/sha1-powerpc-asm.S
> +++ b/arch/powerpc/crypto/sha1-powerpc-asm.S
> @@ -113,7 +113,7 @@
>  	STEPUP4((t)+16, fn)
>  
>  _GLOBAL(powerpc_sha_transform)
> -	PPC_STLU r1,-STACKFRAMESIZE(r1)
> +	PPC_STLU r1,-INT_FRAME_SIZE(r1)
>  	SAVE_8GPRS(14, r1)
>  	SAVE_10GPRS(22, r1)
>  
> @@ -175,5 +175,5 @@ _GLOBAL(powerpc_sha_transform)
>  
>  	REST_8GPRS(14, r1)
>  	REST_10GPRS(22, r1)
> -	addi	r1,r1,STACKFRAMESIZE
> +	addi	r1,r1,INT_FRAME_SIZE
>  	blr
> -- 
> 1.8.1.2
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
> 

-- 
BOFH excuse #21:

POSIX compliance problem

^ permalink raw reply

* [PATCH] powerpc: Wireup the kcmp syscall to sys_ni
From: Tony Breeds @ 2013-03-05  0:39 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, LinuxPPC-dev; +Cc: Stephen Rothwell

Since kmp takes 2 unsigned long args there should be a compat wrapper.
Since one isn't provided I think it's safer just to hook this up to not
implemented.  If we need it later we can do it properly then.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---
 arch/powerpc/include/asm/systbl.h      | 1 +
 arch/powerpc/include/asm/unistd.h      | 2 +-
 arch/powerpc/include/uapi/asm/unistd.h | 1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 535b6d8..289bb52 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -358,3 +358,4 @@ SYSCALL_SPU(setns)
 COMPAT_SYS(process_vm_readv)
 COMPAT_SYS(process_vm_writev)
 SYSCALL(finit_module)
+SYSCALL(ni_syscall)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index f25b5c4..1487f0f 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,7 +12,7 @@
 #include <uapi/asm/unistd.h>
 
 
-#define __NR_syscalls		354
+#define __NR_syscalls		355
 
 #define __NR__exit __NR_exit
 #define NR_syscalls	__NR_syscalls
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 8c478c6..74cb4d7 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -376,6 +376,7 @@
 #define __NR_process_vm_readv	351
 #define __NR_process_vm_writev	352
 #define __NR_finit_module	353
+#define __NR_kcmp		354
 
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.8.1.4

^ permalink raw reply related

* Re: [PATCH 0/3] Enable multiple MSI feature in pSeries
From: Michael Ellerman @ 2013-03-05  0:28 UTC (permalink / raw)
  To: Mike Qiu; +Cc: tglx, linuxppc-dev, linux-kernel
In-Reply-To: <513411AD.5010408@linux.vnet.ibm.com>

On Mon, Mar 04, 2013 at 11:14:53AM +0800, Mike Qiu wrote:
> 于 2013/3/1 11:54, Michael Ellerman 写道:
> >On Fri, Mar 01, 2013 at 11:08:45AM +0800, Mike wrote:
> >>Hi all
> >>
> >>Any comments? or any questions about my patchset?
> >You were going to get some performance numbers that show a definite
> >benefit for using more than one MSI.

> Yes, but my patch just enable the kernel to support this feature, whether
> to use it depens on the device driver.

Sure, but we don't add code just for fun, so unless there's a good
reason to add the feature - like better performance - we won't bother.

> And this feature has been merged to the kernel for X86 for a long time.
> See commit: 5ca72c4f7c412c2002363218901eba5516c476b1
> 51906e779f2b13b38f8153774c4c7163d412ffd9

That commit was merged in 3.9-rc1, ie. a few days ago, so no it has not been
in x86 for a long time.

That code removes the need for your first patch, which is a good start.
Please send a new version using irq_set_msi_desc_off().

cheers

^ permalink raw reply

* Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx
From: Scott Wood @ 2013-03-04 23:45 UTC (permalink / raw)
  To: Stuart Yoder; +Cc: linuxppc-dev, Jia Hongtao
In-Reply-To: <CALRxmdBjb+TrqZ_XdNjH=oDcdYtVJ5=Q71dcCkKodpe7WCHDpA@mail.gmail.com>

On 03/04/2013 10:16:10 AM, Stuart Yoder wrote:
> On Mon, Mar 4, 2013 at 2:40 AM, Jia Hongtao <B38951@freescale.com> =20
> wrote:
> > A PCIe erratum of mpc85xx may causes a core hang when a link of PCIe
> > goes down. when the link goes down, Non-posted transactions issued
> > via the ATMU requiring completion result in an instruction stall.
> > At the same time a machine-check exception is generated to the core
> > to allow further processing by the handler. We implements the =20
> handler
> > which skips the instruction caused the stall.
>=20
> Can you explain at a high level how just skipping an instruction =20
> solves
> anything?   If you just skip a load/store and continue like nothing is
> wrong, isn't your system possibly in a really bad state.

If the instruction was a load, we probably at least want to fill the =20
destination register with 0xffffffff or similar.

> And if the core is already hung, due to the PCI link going down, isn't
> it too late?   How does skipping help?

Maybe the machine check unhangs the core?

Is there an erratum number for this?

-Scott=

^ permalink raw reply

* Re: [PATCH -V1 06/24] powerpc: Reduce PTE table memory wastage
From: Benjamin Herrenschmidt @ 2013-03-04 23:36 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, Paul Mackerras, linux-mm
In-Reply-To: <874ngr2zz1.fsf@linux.vnet.ibm.com>

On Mon, 2013-03-04 at 16:28 +0530, Aneesh Kumar K.V wrote:
> I added the below comment when initializing the list.
> 
> +#ifdef CONFIG_PPC_64K_PAGES
> +       /*
> +        * Used to support 4K PTE fragment. The pages are added to list,
> +        * when we have free framents in the page. We track the whether
> +        * a page frament is available using page._mapcount. A value of
> +        * zero indicate none of the fragments are used and page can be
> +        * freed. A value of FRAG_MASK indicate all the fragments are used
> +        * and hence the page will be removed from the below list.
> +        */
> +       INIT_LIST_HEAD(&init_mm.context.pgtable_list);
> +#endif
> 
> I am not sure about why you say there is no consistent rule. Can you
> elaborate on that ?

Do you really need that list ? I assume it's meant to allow you to find
free frags when allocating but my worry is that you'll end up losing
quite a bit of node locality of PTE pages....

It may or may not work but can you investigate doing things differently
here ? The idea I want you to consider is to always allocate a full
page, but make the relationship of the fragments to PTE pages fixed. IE.
the fragment in the page is a function of the VA.

Basically, the algorithm for allocation is roughly:

 - Walk the tree down to the PMD ptr (* that can be improved with a
generic change, see below)

 - Check if any of the neighbouring PMDs is populated. If yes, you have
your page and pick the appropriate fragment based on the VA

 - If not, allocate and populate

On free, similarly, you checked if all neighbouring PMDs have been
cleared, in which case you can fire off the page for RCU freeing.

(*) By changing pte_alloc_one to take the PMD ptr (which the call side
has right at hand) you can avoid the tree lookup.

Cheers,
Ben.

^ permalink raw reply

* Re: [PATCH -V1 09/24] powerpc: Decode the pte-lp-encoding bits correctly.
From: Aneesh Kumar K.V @ 2013-03-04 18:11 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, linux-mm
In-Reply-To: <87vc971iwd.fsf@linux.vnet.ibm.com>

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> Paul Mackerras <paulus@samba.org> writes:
>
>> On Tue, Feb 26, 2013 at 01:34:59PM +0530, Aneesh Kumar K.V wrote:
>>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>>=20
>>> +static inline int hpte_actual_psize(struct hash_pte *hptep, int psize)
>>> +{
>>> +	unsigned int mask;
>>> +	int i, penc, shift;
>>> +	/* Look at the 8 bit LP value */
>>> +	unsigned int lp =3D (hptep->r >> LP_SHIFT) & ((1 << LP_BITS) - 1);
>>> +
>>> +	penc =3D 0;
>>> +	for (i =3D 0; i < MMU_PAGE_COUNT; i++) {
>>> +		/* valid entries have a shift value */
>>> +		if (!mmu_psize_defs[i].shift)
>>> +			continue;
>>> +
>>> +		/* encoding bits per actual page size */
>>> +		shift =3D mmu_psize_defs[i].shift - 11;
>>> +		if (shift > 9)
>>> +			shift =3D 9;
>>> +		mask =3D (1 << shift) - 1;
>>> +		if ((lp & mask) =3D=3D mmu_psize_defs[psize].penc[i])
>>> +			return i;
>>> +	}
>>> +	return -1;
>>> +}
>>
>> This doesn't look right to me.  First, it's not clear what the 11 and
>> 9 refer to, and I think the 9 should be LP_BITS (i.e. 8).  Secondly,
>> the mask for the comparison needs to depend on the actual page size
>> not the base page size.
>
> How about the below. I am yet to test this in user space.=20

I needed to special case 4K case. This seems to work fine with the test.

static inline int hpte_actual_psize(struct hash_pte *hptep, int psize)
{
	unsigned int mask;
	int i, penc, shift;
	/* Look at the 8 bit LP value */
	unsigned int lp =3D (hptep->r >> LP_SHIFT) & ((1 << LP_BITS) - 1);

	/* First check if it is large page */
	if (!(hptep->v & HPTE_V_LARGE))
		return MMU_PAGE_4K;

	penc =3D 0;
	for (i =3D 1; i < MMU_PAGE_COUNT; i++) {
		/* valid entries have a shift value */
		if (!mmu_psize_defs[i].shift)
			continue;
		/*
		 * encoding bits per actual page size
		 *        PTE LP     actual page size
		 *    rrrr rrrz		=E2=89=A58KB
		 *    rrrr rrzz		=E2=89=A516KB
		 *    rrrr rzzz		=E2=89=A532KB
		 *    rrrr zzzz		=E2=89=A564KB
		 * .......
		 */
		shift =3D mmu_psize_defs[i].shift -
				mmu_psize_defs[MMU_PAGE_4K].shift;
		if (shift > LP_BITS)
			shift =3D LP_BITS;
		mask =3D (1 << shift) - 1;
		if ((lp & mask) =3D=3D mmu_psize_defs[psize].penc[i])
			return i;
	}
	return -1;
}

^ permalink raw reply

* RE: [PATCH V4] powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx
From: David Laight @ 2013-03-04 17:15 UTC (permalink / raw)
  To: Jia Hongtao, linuxppc-dev, galak; +Cc: B07421
In-Reply-To: <1362386425-20149-1-git-send-email-B38951@freescale.com>

> A PCIe erratum of mpc85xx may causes a core hang when a link of PCIe
> goes down. when the link goes down, Non-posted transactions issued
> via the ATMU requiring completion result in an instruction stall.
> At the same time a machine-check exception is generated to the core
> to allow further processing by the handler. We implements the handler
> which skips the instruction caused the stall.

Just skipping the instruction doesn't seem a good idea.
But I suspect that re-initialising the PCI interface is also
almost impossible.

Does the mpc83xx have the same errata?
We've seen machine-check faults using the CSB bridge on an 83xx
doing a 'pio' access after a PEX_DMA transfer to certain target
addresses stalls - software gives up waiting for the dma.
The target is an fpga, nothing is mapped at those
addesses - but we'd expect to get ~0u back as happens on other
slave windows.

I also remember some problems with single word DMA.

	David

^ permalink raw reply

* RE: [PATCH 6/6 v8] iommu/fsl: Freescale PAMU driver and IOMMU API implementation.
From: Yoder Stuart-B08248 @ 2013-03-04 16:55 UTC (permalink / raw)
  To: Sethi Varun-B16395
  Cc: Wood Scott-B07421, iommu@lists.linux-foundation.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	Joerg Roedel
In-Reply-To: <C5ECD7A89D1DC44195F34B25E172658D3D12D9@039-SN2MPN1-013.039d.mgd.msft.net>



> -----Original Message-----
> From: Sethi Varun-B16395
> Sent: Monday, March 04, 2013 5:31 AM
> To: Stuart Yoder
> Cc: iommu@lists.linux-foundation.org; linuxppc-dev@lists.ozlabs.org; linu=
x-kernel@vger.kernel.org; Wood
> Scott-B07421; Joerg Roedel; Yoder Stuart-B08248
> Subject: RE: [PATCH 6/6 v8] iommu/fsl: Freescale PAMU driver and IOMMU AP=
I implementation.
>=20
>=20
>=20
> > -----Original Message-----
> > From: Stuart Yoder [mailto:b08248@gmail.com]
> > Sent: Saturday, March 02, 2013 4:58 AM
> > To: Sethi Varun-B16395
> > Cc: iommu@lists.linux-foundation.org; linuxppc-dev@lists.ozlabs.org;
> > linux-kernel@vger.kernel.org; Wood Scott-B07421; Joerg Roedel; Yoder
> > Stuart-B08248
> > Subject: Re: [PATCH 6/6 v8] iommu/fsl: Freescale PAMU driver and IOMMU
> > API implementation.
> >
> > On Mon, Feb 18, 2013 at 6:52 AM, Varun Sethi <Varun.Sethi@freescale.com=
>
> > wrote:
> > [cut]
> > > +static phys_addr_t get_phys_addr(struct fsl_dma_domain *dma_domain,
> > > +unsigned long iova) {
> > > +       u32 win_cnt =3D dma_domain->win_cnt;
> > > +       struct dma_window *win_ptr =3D
> > > +                               &dma_domain->win_arr[0];
> > > +       struct iommu_domain_geometry *geom;
> > > +
> > > +       geom =3D &dma_domain->iommu_domain->geometry;
> > > +
> > > +       if (!win_cnt || !dma_domain->geom_size) {
> > > +               pr_err("Number of windows/geometry not configured for
> > the domain\n");
> > > +               return 0;
> > > +       }
> > > +
> > > +       if (win_cnt > 1) {
> > > +               u64 subwin_size;
> > > +               unsigned long subwin_iova;
> > > +               u32 wnd;
> > > +
> > > +               subwin_size =3D dma_domain->geom_size >> ilog2(win_cn=
t);
> >
> > Could it be just geom_size / win_cnt ??
> [Sethi Varun-B16395] You would run in to linking issues with respect to u=
64 division on 32 bit kernels.
>=20
> >
> > > +               subwin_iova =3D iova & ~(subwin_size - 1);
> > > +               wnd =3D (subwin_iova - geom->aperture_start) >>
> > ilog2(subwin_size);
> > > +               win_ptr =3D &dma_domain->win_arr[wnd];
> > > +       }
> > > +
> > > +       if (win_ptr->valid)
> > > +               return (win_ptr->paddr + (iova & (win_ptr->size -
> > > + 1)));
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +static int map_liodn_subwins(int liodn, struct fsl_dma_domain
> > > +*dma_domain)
> >
> > Just call it map_subwins().  They are just sub windows, not "liodn sub
> > windows".
> >
> [Sethi Varun-B16395] This would be called per liodn.
>
> > [cut]
> >
> > > +static int map_liodn_win(int liodn, struct fsl_dma_domain
> > > +*dma_domain)
> >
> > Call it map_win().
> [Sethi Varun-B16395] This would again be called per liodn.

On the function names-- function names are typically named
with a noun describing some object and a verb describing
the action...and sometimes a subsystem identifier:
    kmem_cache + zalloc
    spin + lock

I don't think inserting liodn in the function name to indicates that it=20
operates per liodn makes it more clear and reads a little awkward to me:

    map + liodn_win

...it's almost like there is a "liodn_win" object.

I think plain map_win() is more clear and the function prototype makes
it pretty clear that you are operating on an liodn.
=20
> > [cut]
> > > +static  bool check_pci_ctl_endpt_part(struct pci_controller *pci_ctl=
)
> > > +{
> > > +       u32 version;
> > > +
> > > +       /* Check the PCI controller version number by readding BRR1
> > register */
> > > +       version =3D in_be32(pci_ctl->cfg_addr + (PCI_FSL_BRR1 >> 2));
> > > +       version &=3D PCI_FSL_BRR1_VER;
> > > +       /* If PCI controller version is >=3D 0x204 we can partition
> > endpoints*/
> > > +       if (version >=3D 0x204)
> > > +               return 1;
> > > +
> > > +       return 0;
> > > +}
> >
> > Can we just use the compatible string here to identify the different
> > kinds of PCI
> > controllers?   Reading the actual device registers is ugly right now
> > because
> > you are #defining the BRR1 stuff in a generic powerpc header.
> >
> [Sethi Varun-B16395] hmmm......, I would have to check for various differ=
ent compatible strings in that
> case. May be I can move the defines to a different file. What if I move t=
hem to some other header?

Don't personally feel too strongly about version register or header...but i=
t should be some fsl
PCI header that you define those regs.

You'll either need to check for a hardware version number or a compatible,
so a compatible check may be just as easy...look for "fsl,qoriq-pcie-v2.4"?

Stuart

^ permalink raw reply

* Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx
From: Stuart Yoder @ 2013-03-04 16:16 UTC (permalink / raw)
  To: Jia Hongtao; +Cc: B07421, linuxppc-dev
In-Reply-To: <1362386425-20149-1-git-send-email-B38951@freescale.com>

On Mon, Mar 4, 2013 at 2:40 AM, Jia Hongtao <B38951@freescale.com> wrote:
> A PCIe erratum of mpc85xx may causes a core hang when a link of PCIe
> goes down. when the link goes down, Non-posted transactions issued
> via the ATMU requiring completion result in an instruction stall.
> At the same time a machine-check exception is generated to the core
> to allow further processing by the handler. We implements the handler
> which skips the instruction caused the stall.

Can you explain at a high level how just skipping an instruction solves
anything?   If you just skip a load/store and continue like nothing is
wrong, isn't your system possibly in a really bad state.

And if the core is already hung, due to the PCI link going down, isn't
it too late?   How does skipping help?

Stuart

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox