LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Paul E. McKenney @ 2013-11-04 16:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131104112254.GK28601@twins.programming.kicks-ass.net>

On Mon, Nov 04, 2013 at 12:22:54PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 04, 2013 at 02:51:00AM -0800, Paul E. McKenney wrote:
> > OK, something like this for the definitions (though PowerPC might want
> > to locally abstract the lwsync expansion):
> > 
> > 	#define smp_store_with_release_semantics(p, v) /* x86, s390, etc. */ \
> > 	do { \
> > 		barrier(); \
> > 		ACCESS_ONCE(p) = (v); \
> > 	} while (0)
> > 
> > 	#define smp_store_with_release_semantics(p, v) /* PowerPC. */ \
> > 	do { \
> > 		__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory"); \
> > 		ACCESS_ONCE(p) = (v); \
> > 	} while (0)
> > 
> > 	#define smp_load_with_acquire_semantics(p) /* x86, s390, etc. */ \
> > 	({ \
> > 		typeof(*p) *_________p1 = ACCESS_ONCE(p); \
> > 		barrier(); \
> > 		_________p1; \
> > 	})
> > 
> > 	#define smp_load_with_acquire_semantics(p) /* PowerPC. */ \
> > 	({ \
> > 		typeof(*p) *_________p1 = ACCESS_ONCE(p); \
> > 		__asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory"); \
> > 		_________p1; \
> > 	})
> > 
> > For ARM, smp_load_with_acquire_semantics() is a wrapper around the ARM
> > "ldar" instruction and smp_store_with_release_semantics() is a wrapper
> > around the ARM "stlr" instruction.
> 
> This still leaves me confused as to what to do with my case :/
> 
> Slightly modified since last time -- as the simplified version was maybe
> simplified too far.
> 
> To recap, I'd like to get rid of barrier A where possible, since that's
> now a full barrier for every event written.
> 
> However, there's no immediate store I can attach it to; the obvious one
> would be the kbuf->head store, but that's complicated by the
> local_cmpxchg() thing.
> 
> And we need that cmpxchg loop because a hardware NMI event can
> interleave with a software event.
> 
> And to be honest, I'm still totally confused about memory barriers vs
> control flow vs C/C++. The only way we're ever getting to that memcpy is
> if we've already observed ubuf->tail, so that LOAD has to be fully
> processes and completed.
> 
> I'm really not seeing how a STORE from the memcpy() could possibly go
> wrong; and if C/C++ can hoist the memcpy() over a compiler barrier()
> then I suppose we should all just go home.
> 
> /me who wants A to be a barrier() but is terminally confused.

Well, let's see...

> ---
> 
> 
> /*
>  * One important detail is that the kbuf part and the kbuf_writer() are
>  * strictly per cpu and we can thus rely on program order for those.
>  *
>  * Only the userspace consumer can possibly run on another cpu, and thus we
>  * need to ensure data consistency for those.
>  */
> 
> struct buffer {
>         u64 size;
>         u64 tail;
>         u64 head;
>         void *data;
> };
> 
> struct buffer *kbuf, *ubuf;
> 
> /*
>  * If there's space in the buffer; store the data @buf; otherwise
>  * discard it.
>  */
> void kbuf_write(int sz, void *buf)
> {
> 	u64 tail, head, offset;
> 
> 	do {
> 		tail = ACCESS_ONCE(ubuf->tail);

So the above load is the key load.  It determines whether or not we
have space in the buffer.  This of course assumes that only this CPU
writes to ->head.

If so, then:

		tail = smp_load_with_acquire_semantics(ubuf->tail); /* A -> D */

> 		offset = head = kbuf->head;
> 		if (CIRC_SPACE(head, tail, kbuf->size) < sz) {
> 			/* discard @buf */
> 			return;
> 		}
> 		head += sz;
> 	} while (local_cmpxchg(&kbuf->head, offset, head) != offset)

If there is an issue with kbuf->head, presumably local_cmpxchg() fails
and we retry.

But sheesh, do you think we could have buried the definitions of
local_cmpxchg() under a few more layers of macro expansion just to
keep things more obscure?  Anyway, griping aside...

o	__cmpxchg_local_generic() in include/asm-generic/cmpxchg-local.h
	doesn't seem to exclude NMIs, so is not safe for this usage.

o	__cmpxchg_local() in ARM handles NMI as long as the
	argument is 32 bits, otherwise, it uses the aforementionted
	__cmpxchg_local_generic(), which does not handle NMI.  Given your
	u64, this does not look good...

	And some ARM subarches (e.g., metag) seem to fail to handle NMI
	even in the 32-bit case.

o	FRV and M32r seem to act similar to ARM.

Or maybe these architectures don't do NMIs?  If they do, local_cmpxchg()
does not seem to be safe against NMIs in general.  :-/

That said, powerpc, 64-bit s390, sparc, and x86 seem to handle it.

Of course, x86's local_cmpxchg() has full memory barriers implicitly.

> 
>         /*
>          * Ensure that if we see the userspace tail (ubuf->tail) such
>          * that there is space to write @buf without overwriting data
>          * userspace hasn't seen yet, we won't in fact store data before
>          * that read completes.
>          */
> 
>         smp_mb(); /* A, matches with D */

Given a change to smp_load_with_acquire_semantics() above, you should not
need this smp_mb().

>         memcpy(kbuf->data + offset, buf, sz);
> 
>         /*
>          * Ensure that we write all the @buf data before we update the
>          * userspace visible ubuf->head pointer.
>          */
>         smp_wmb(); /* B, matches with C */
> 
>         ubuf->head = kbuf->head;

Replace the smp_wmb() and the assignment with:

	smp_store_with_release_semantics(ubuf->head, kbuf->head); /* B -> C */

> }
> 
> /*
>  * Consume the buffer data and update the tail pointer to indicate to
>  * kernel space there's 'free' space.
>  */
> void ubuf_read(void)
> {
>         u64 head, tail;
> 
>         tail = ACCESS_ONCE(ubuf->tail);

Does anyone else write tail?  Or is this defense against NMIs?

If no one else writes to tail and if NMIs cannot muck things up, then
the above ACCESS_ONCE() is not needed, though I would not object to its
staying.

>         head = ACCESS_ONCE(ubuf->head);

Make the above be:

	head = smp_load_with_acquire_semantics(ubuf->head);  /* C -> B */

>         /*
>          * Ensure we read the buffer boundaries before the actual buffer
>          * data...
>          */
>         smp_rmb(); /* C, matches with B */

And drop the above memory barrier.

>         while (tail != head) {
>                 obj = ubuf->data + tail;
>                 /* process obj */
>                 tail += obj->size;
>                 tail %= ubuf->size;
>         }
> 
>         /*
>          * Ensure all data reads are complete before we issue the
>          * ubuf->tail update; once that update hits, kbuf_write() can
>          * observe and overwrite data.
>          */
>         smp_mb(); /* D, matches with A */
> 
>         ubuf->tail = tail;

Replace the above barrier and the assignment with:

	smp_store_with_release_semantics(ubuf->tail, tail); /* D -> B. */

> }

All this is leading me to suggest the following shortenings of names:

	smp_load_with_acquire_semantics() -> smp_load_acquire()

	smp_store_with_release_semantics() -> smp_store_release()

But names aside, the above gets rid of explicit barriers on TSO architectures,
allows ARM to avoid full DMB, and allows PowerPC to use lwsync instead of
the heavier-weight sync.

								Thanx, Paul

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Peter Zijlstra @ 2013-11-04 16:48 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131104162732.GN3947@linux.vnet.ibm.com>

On Mon, Nov 04, 2013 at 08:27:32AM -0800, Paul E. McKenney wrote:
> > 
> > 
> > /*
> >  * One important detail is that the kbuf part and the kbuf_writer() are
> >  * strictly per cpu and we can thus rely on program order for those.
> >  *
> >  * Only the userspace consumer can possibly run on another cpu, and thus we
> >  * need to ensure data consistency for those.
> >  */
> > 
> > struct buffer {
> >         u64 size;
> >         u64 tail;
> >         u64 head;
> >         void *data;
> > };
> > 
> > struct buffer *kbuf, *ubuf;
> > 
> > /*
> >  * If there's space in the buffer; store the data @buf; otherwise
> >  * discard it.
> >  */
> > void kbuf_write(int sz, void *buf)
> > {
> > 	u64 tail, head, offset;
> > 
> > 	do {
> > 		tail = ACCESS_ONCE(ubuf->tail);
> 
> So the above load is the key load.  It determines whether or not we
> have space in the buffer.  This of course assumes that only this CPU
> writes to ->head.

This assumption is true.

> If so, then:
> 
> 		tail = smp_load_with_acquire_semantics(ubuf->tail); /* A -> D */
> 

OK, the way I understand ACQUIRE semantics are the semi-permeable LOCK
semantics from Documetnation/memory-barriers.txt. In which case the
relevant STORES below could be hoisted up here, but not across the READ,
which I suppose is sufficient.

> > 		offset = head = kbuf->head;
> > 		if (CIRC_SPACE(head, tail, kbuf->size) < sz) {
> > 			/* discard @buf */
> > 			return;
> > 		}
> > 		head += sz;
> > 	} while (local_cmpxchg(&kbuf->head, offset, head) != offset)
> 
> If there is an issue with kbuf->head, presumably local_cmpxchg() fails
> and we retry.
> 
> But sheesh, do you think we could have buried the definitions of
> local_cmpxchg() under a few more layers of macro expansion just to
> keep things more obscure?  Anyway, griping aside...
> 
> o	__cmpxchg_local_generic() in include/asm-generic/cmpxchg-local.h
> 	doesn't seem to exclude NMIs, so is not safe for this usage.
> 
> o	__cmpxchg_local() in ARM handles NMI as long as the
> 	argument is 32 bits, otherwise, it uses the aforementionted
> 	__cmpxchg_local_generic(), which does not handle NMI.  Given your
> 	u64, this does not look good...
> 
> 	And some ARM subarches (e.g., metag) seem to fail to handle NMI
> 	even in the 32-bit case.
> 
> o	FRV and M32r seem to act similar to ARM.
> 
> Or maybe these architectures don't do NMIs?  If they do, local_cmpxchg()
> does not seem to be safe against NMIs in general.  :-/
> 
> That said, powerpc, 64-bit s390, sparc, and x86 seem to handle it.

Ah my bad, so the in-kernel kbuf variant uses unsigned long, which on
all archs should be the native words size and cover the address space.

Only the public variant (ubuf) is u64 wide to not change data structure
layout on compat etc.

I suppose this was a victim on the simplification :/

And in case of 32bit the upper word will always be zero and the partial
reads should all work out just fine.

> Of course, x86's local_cmpxchg() has full memory barriers implicitly.

Not quite, the 'lock' in __raw_cmpxchg() expands to "" due to
__cmpxchg_local(), etc.. 

> > 
> >         /*
> >          * Ensure that if we see the userspace tail (ubuf->tail) such
> >          * that there is space to write @buf without overwriting data
> >          * userspace hasn't seen yet, we won't in fact store data before
> >          * that read completes.
> >          */
> > 
> >         smp_mb(); /* A, matches with D */
> 
> Given a change to smp_load_with_acquire_semantics() above, you should not
> need this smp_mb().

Because the STORES can not be hoisted across the ACQUIRE, indeed.

> 
> >         memcpy(kbuf->data + offset, buf, sz);
> > 
> >         /*
> >          * Ensure that we write all the @buf data before we update the
> >          * userspace visible ubuf->head pointer.
> >          */
> >         smp_wmb(); /* B, matches with C */
> > 
> >         ubuf->head = kbuf->head;
> 
> Replace the smp_wmb() and the assignment with:
> 
> 	smp_store_with_release_semantics(ubuf->head, kbuf->head); /* B -> C */

And here the RELEASE semantics I assume are the same as the
semi-permeable UNLOCK from _The_ document? In which case the above
STORES cannot be lowered across this store and all should, again, be
well.

> > }
> > 
> > /*
> >  * Consume the buffer data and update the tail pointer to indicate to
> >  * kernel space there's 'free' space.
> >  */
> > void ubuf_read(void)
> > {
> >         u64 head, tail;
> > 
> >         tail = ACCESS_ONCE(ubuf->tail);
> 
> Does anyone else write tail?  Or is this defense against NMIs?

No, we're the sole writer; just general paranoia. Not sure the actual
userspace does this; /me checks. Nope, distinct lack of ACCESS_ONCE()
there, just the rmb(), which including a barrier() should hopefully
accomplish similar things most of the time ;-)

I'll need to introduce ACCESS_ONCE to the userspace code.

> If no one else writes to tail and if NMIs cannot muck things up, then
> the above ACCESS_ONCE() is not needed, though I would not object to its
> staying.
> 
> >         head = ACCESS_ONCE(ubuf->head);
> 
> Make the above be:
> 
> 	head = smp_load_with_acquire_semantics(ubuf->head);  /* C -> B */
> 
> >         /*
> >          * Ensure we read the buffer boundaries before the actual buffer
> >          * data...
> >          */
> >         smp_rmb(); /* C, matches with B */
> 
> And drop the above memory barrier.
> 
> >         while (tail != head) {
> >                 obj = ubuf->data + tail;
> >                 /* process obj */
> >                 tail += obj->size;
> >                 tail %= ubuf->size;
> >         }
> > 
> >         /*
> >          * Ensure all data reads are complete before we issue the
> >          * ubuf->tail update; once that update hits, kbuf_write() can
> >          * observe and overwrite data.
> >          */
> >         smp_mb(); /* D, matches with A */
> > 
> >         ubuf->tail = tail;
> 
> Replace the above barrier and the assignment with:
> 
> 	smp_store_with_release_semantics(ubuf->tail, tail); /* D -> B. */
> 
> > }

Right, so this consumer side isn't called that often and the two
barriers are only per consume, not per event, so I don't care too much
about these.

It would also mean hoisting the implementation of the proposed
primitives into userspace -- which reminds me: should we make
include/asm/barrier.h a uapi header?

> All this is leading me to suggest the following shortenings of names:
> 
> 	smp_load_with_acquire_semantics() -> smp_load_acquire()
> 
> 	smp_store_with_release_semantics() -> smp_store_release()
> 
> But names aside, the above gets rid of explicit barriers on TSO architectures,
> allows ARM to avoid full DMB, and allows PowerPC to use lwsync instead of
> the heavier-weight sync.

Totally awesome! ;-) And full ack on the shorter names.

^ permalink raw reply

* [PATCH 0/6] powerpc/math-emu: e500 SPE float emulation fixes
From: Joseph S. Myers @ 2013-11-04 16:50 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Liu Yu, linux-kernel, Shan Hai

This patch series fixes various problems with the floating-point
emulation code for powerpc e500 SPE (some being issues with the
e500-specific emulation code, some with the generic math-emu headers).
All six patches were sent individually last month as the issues were
identified and fixed in the course of preparing the e500 glibc port,
and received no comments.  There are no substantive changes to the
patches in this version, but I've retested the glibc port (which is
now upstream, along with all the generic math-emu changes relevant to
the glibc soft-fp code, and various fixes to soft-fp corresponding to
fixes in the kernel code in the hope that at some point we can get the
kernel using the current soft-fp code again) with current kernel
sources with this patch series applied.

The only dependencies between patches in this series should be that
patch 5 (fix e500 SPE float to integer and fixed-point conversions)
depends on patch 2 (fix e500 SPE float rounding inexactness
detection).  Other than that, I think any subset of the patches can be
applied in any order, if some subset seems OK but there are concerns
about other patches in the series.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply

* [PATCH 1/6] powerpc: fix exception clearing in e500 SPE float emulation
From: Joseph S. Myers @ 2013-11-04 16:52 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Liu Yu, linux-kernel, Shan Hai
In-Reply-To: <Pine.LNX.4.64.1311041649250.4290@digraph.polyomino.org.uk>

From: Joseph Myers <joseph@codesourcery.com>

The e500 SPE floating-point emulation code clears existing exceptions
(__FPU_FPSCR &= ~FP_EX_MASK;) before ORing in the exceptions from the
emulated operation.  However, these exception bits are the "sticky",
cumulative exception bits, and should only be cleared by the user
program setting SPEFSCR, not implicitly by any floating-point
instruction (whether executed purely by the hardware or emulated).
The spurious clearing of these bits shows up as missing exceptions in
glibc testing.

Fixing this, however, is not as simple as just not clearing the bits,
because while the bits may be from previous floating-point operations
(in which case they should not be cleared), the processor can also set
the sticky bits itself before the interrupt for an exception occurs,
and this can happen in cases when IEEE 754 semantics are that the
sticky bit should not be set.  Specifically, the "invalid" sticky bit
is set in various cases with non-finite operands, where IEEE 754
semantics do not involve raising such an exception, and the
"underflow" sticky bit is set in cases of exact underflow, whereas
IEEE 754 semantics are that this flag is set only for inexact
underflow.  Thus, for correct emulation the kernel needs to know the
setting of these two sticky bits before the instruction being
emulated.

When a floating-point operation raises an exception, the kernel can
note the state of the sticky bits immediately afterwards.  Some
<fenv.h> functions that affect the state of these bits, such as
fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and
PR_SET_FPEXC anyway, and so it is natural to record the state of those
bits during that call into the kernel and so avoid any need for a
separate call into the kernel to inform it of a change to those bits.
Thus, the interface I chose to use (in this patch and the glibc port)
is that one of those prctl calls must be made after any userspace
change to those sticky bits, other than through a floating-point
operation that traps into the kernel anyway.  feclearexcept and
fesetexceptflag duly make those calls, which would not be required
were it not for this issue.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

Previous submission: <http://lkml.org/lkml/2013/10/4/495>.

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index ce4de5a..0b02e23 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -237,6 +237,8 @@ struct thread_struct {
 	unsigned long	evr[32];	/* upper 32-bits of SPE regs */
 	u64		acc;		/* Accumulator */
 	unsigned long	spefscr;	/* SPE & eFP status */
+	unsigned long	spefscr_last;	/* SPEFSCR value on last prctl
+					   call or trap return */
 	int		used_spe;	/* set if process has used spe */
 #endif /* CONFIG_SPE */
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -303,7 +305,9 @@ struct thread_struct {
 	(_ALIGN_UP(sizeof(init_thread_info), 16) + (unsigned long) &init_stack)
 
 #ifdef CONFIG_SPE
-#define SPEFSCR_INIT .spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE,
+#define SPEFSCR_INIT \
+	.spefscr = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE, \
+	.spefscr_last = SPEFSCR_FINVE | SPEFSCR_FDBZE | SPEFSCR_FUNFE | SPEFSCR_FOVFE,
 #else
 #define SPEFSCR_INIT
 #endif
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 96d2fdf..e3b91f1 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1151,6 +1151,7 @@ int set_fpexc_mode(struct task_struct *tsk, unsigned int val)
 	if (val & PR_FP_EXC_SW_ENABLE) {
 #ifdef CONFIG_SPE
 		if (cpu_has_feature(CPU_FTR_SPE)) {
+			tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
 			tsk->thread.fpexc_mode = val &
 				(PR_FP_EXC_SW_ENABLE | PR_FP_ALL_EXCEPT);
 			return 0;
@@ -1182,9 +1183,10 @@ int get_fpexc_mode(struct task_struct *tsk, unsigned long adr)
 
 	if (tsk->thread.fpexc_mode & PR_FP_EXC_SW_ENABLE)
 #ifdef CONFIG_SPE
-		if (cpu_has_feature(CPU_FTR_SPE))
+		if (cpu_has_feature(CPU_FTR_SPE)) {
+			tsk->thread.spefscr_last = mfspr(SPRN_SPEFSCR);
 			val = tsk->thread.fpexc_mode;
-		else
+		} else
 			return -EINVAL;
 #else
 		return -EINVAL;
diff --git a/arch/powerpc/math-emu/math_efp.c b/arch/powerpc/math-emu/math_efp.c
index a73f088..59835c6 100644
--- a/arch/powerpc/math-emu/math_efp.c
+++ b/arch/powerpc/math-emu/math_efp.c
@@ -630,9 +630,27 @@ update_ccr:
 	regs->ccr |= (IR << ((7 - ((speinsn >> 23) & 0x7)) << 2));
 
 update_regs:
-	__FPU_FPSCR &= ~FP_EX_MASK;
+	/*
+	 * If the "invalid" exception sticky bit was set by the
+	 * processor for non-finite input, but was not set before the
+	 * instruction being emulated, clear it.  Likewise for the
+	 * "underflow" bit, which may have been set by the processor
+	 * for exact underflow, not just inexact underflow when the
+	 * flag should be set for IEEE 754 semantics.  Other sticky
+	 * exceptions will only be set by the processor when they are
+	 * correct according to IEEE 754 semantics, and we must not
+	 * clear sticky bits that were already set before the emulated
+	 * instruction as they represent the user-visible sticky
+	 * exception status.  "inexact" traps to kernel are not
+	 * required for IEEE semantics and are not enabled by default,
+	 * so the "inexact" sticky bit may have been set by a previous
+	 * instruction without the kernel being aware of it.
+	 */
+	__FPU_FPSCR
+	  &= ~(FP_EX_INVALID | FP_EX_UNDERFLOW) | current->thread.spefscr_last;
 	__FPU_FPSCR |= (FP_CUR_EXCEPTIONS & FP_EX_MASK);
 	mtspr(SPRN_SPEFSCR, __FPU_FPSCR);
+	current->thread.spefscr_last = __FPU_FPSCR;
 
 	current->thread.evr[fc] = vc.wp[0];
 	regs->gpr[fc] = vc.wp[1];

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* [PATCH 2/6] powerpc: fix e500 SPE float rounding inexactness detection
From: Joseph S. Myers @ 2013-11-04 16:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Liu Yu, linux-kernel, Shan Hai
In-Reply-To: <Pine.LNX.4.64.1311041649250.4290@digraph.polyomino.org.uk>

From: Joseph Myers <joseph@codesourcery.com>

The e500 SPE floating-point emulation code for the rounding modes
rounding to positive or negative infinity (which may not be
implemented in hardware) tries to avoid emulating rounding if the
result was inexact.  However, it tests inexactness using the sticky
bit with the cumulative result of previous operations, rather than
with the non-sticky bits relating to the operation that generated the
interrupt.  Furthermore, when a vector operation generates the
interrupt, it's possible that only one of the low and high parts is
inexact, and so only that part should have rounding emulated.  This
results in incorrect rounding of exact results in these modes when the
sticky bit is set from a previous operation.

(I'm not sure why the rounding interrupts are generated at all when
the result is exact, but empirically the hardware does generate them.)

This patch checks for inexactness using the correct bits of SPEFSCR,
and ensures that rounding only occurs when the relevant part of the
result was actually inexact.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

Previous submission: <http://lkml.org/lkml/2013/10/4/497>.

diff --git a/arch/powerpc/math-emu/math_efp.c b/arch/powerpc/math-emu/math_efp.c
index 59835c6..ecdf35d 100644
--- a/arch/powerpc/math-emu/math_efp.c
+++ b/arch/powerpc/math-emu/math_efp.c
@@ -680,7 +680,8 @@ int speround_handler(struct pt_regs *regs)
 {
 	union dw_union fgpr;
 	int s_lo, s_hi;
-	unsigned long speinsn, type, fc;
+	int lo_inexact, hi_inexact;
+	unsigned long speinsn, type, fc, fptype;
 
 	if (get_user(speinsn, (unsigned int __user *) regs->nip))
 		return -EFAULT;
@@ -693,8 +694,12 @@ int speround_handler(struct pt_regs *regs)
 	__FPU_FPSCR = mfspr(SPRN_SPEFSCR);
 	pr_debug("speinsn:%08lx spefscr:%08lx\n", speinsn, __FPU_FPSCR);
 
+	fptype = (speinsn >> 5) & 0x7;
+
 	/* No need to round if the result is exact */
-	if (!(__FPU_FPSCR & FP_EX_INEXACT))
+	lo_inexact = __FPU_FPSCR & (SPEFSCR_FG | SPEFSCR_FX);
+	hi_inexact = __FPU_FPSCR & (SPEFSCR_FGH | SPEFSCR_FXH);
+	if (!(lo_inexact || (hi_inexact && fptype == VCT)))
 		return 0;
 
 	fc = (speinsn >> 21) & 0x1f;
@@ -705,7 +710,7 @@ int speround_handler(struct pt_regs *regs)
 
 	pr_debug("round fgpr: %08x  %08x\n", fgpr.wp[0], fgpr.wp[1]);
 
-	switch ((speinsn >> 5) & 0x7) {
+	switch (fptype) {
 	/* Since SPE instructions on E500 core can handle round to nearest
 	 * and round toward zero with IEEE-754 complied, we just need
 	 * to handle round toward +Inf and round toward -Inf by software.
@@ -728,11 +733,15 @@ int speround_handler(struct pt_regs *regs)
 
 	case VCT:
 		if (FP_ROUNDMODE == FP_RND_PINF) {
-			if (!s_lo) fgpr.wp[1]++; /* Z_low > 0, choose Z1 */
-			if (!s_hi) fgpr.wp[0]++; /* Z_high word > 0, choose Z1 */
+			if (lo_inexact && !s_lo)
+				fgpr.wp[1]++; /* Z_low > 0, choose Z1 */
+			if (hi_inexact && !s_hi)
+				fgpr.wp[0]++; /* Z_high word > 0, choose Z1 */
 		} else { /* round to -Inf */
-			if (s_lo) fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
-			if (s_hi) fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
+			if (lo_inexact && s_lo)
+				fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
+			if (hi_inexact && s_hi)
+				fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
 		}
 		break;
 

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* [PATCH 3/6] math-emu: fix floating-point to integer unsigned saturation
From: Joseph S. Myers @ 2013-11-04 16:53 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Liu Yu, linux-kernel, Shan Hai
In-Reply-To: <Pine.LNX.4.64.1311041649250.4290@digraph.polyomino.org.uk>

From: Joseph Myers <joseph@codesourcery.com>

The math-emu macros _FP_TO_INT and _FP_TO_INT_ROUND are supposed to
saturate their results for out-of-range arguments, except in the case
rsigned == 2 (when instead the low bits of the result are taken).
However, in the case rsigned == 0 (converting to unsigned integers),
they mistakenly produce 0 for positive results and the maximum
unsigned integer for negative results, the opposite of correct
unsigned saturation.  This patch fixes the logic.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

Previous submission: <http://lkml.org/lkml/2013/10/8/694>.

I have made the corresponding changes to the glibc/libgcc copy of this
code, given that it would be desirable to resync the Linux and
glibc/libgcc copies (the latter has had many enhancements and bug
fixes since it was copied into Linux), although strictly this
incorrect saturation is only a bug when trying to emulate particular
instruction semantics, not when used in userspace to implement C
operations where the results of out-of-range conversions are
unspecified or undefined.

diff --git a/include/math-emu/op-common.h b/include/math-emu/op-common.h
index 9696a5e..70fe5e9 100644
--- a/include/math-emu/op-common.h
+++ b/include/math-emu/op-common.h
@@ -685,7 +685,7 @@ do {									\
 	    else								\
 	      {									\
 		r = 0;								\
-		if (X##_s)							\
+		if (!X##_s)							\
 		  r = ~r;							\
 	      }									\
 	    FP_SET_EXCEPTION(FP_EX_INVALID);					\
@@ -762,7 +762,7 @@ do {									\
 	    if (!rsigned)							\
 	      {									\
 		r = 0;								\
-		if (X##_s)							\
+		if (!X##_s)							\
 		  r = ~r;							\
 	      }									\
 	    else if (rsigned != 2)						\

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* [PATCH 4/6] math-emu: fix floating-point to integer overflow detection
From: Joseph S. Myers @ 2013-11-04 16:54 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Liu Yu, linux-kernel, Shan Hai
In-Reply-To: <Pine.LNX.4.64.1311041649250.4290@digraph.polyomino.org.uk>

From: Joseph Myers <joseph@codesourcery.com>

On overflow, the math-emu macro _FP_TO_INT_ROUND tries to saturate its
result (subject to the value of rsigned specifying the desired
overflow semantics).  However, if the rounding step has the effect of
increasing the exponent so as to cause overflow (if the rounded result
is 1 larger than the largest positive value with the given number of
bits, allowing for signedness), the overflow does not get detected,
meaning that for unsigned results 0 is produced instead of the maximum
unsigned integer with the give number of bits, without an exception
being raised for overflow, and that for signed results the minimum
(negative) value is produced instead of the maximum (positive) value,
again without an exception.  This patch makes the code check for
rounding increasing the exponent and adjusts the exponent value as
needed for the overflow check.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

Previous submission: <http://lkml.org/lkml/2013/10/8/700>.

This macro is not present in the glibc/libgcc version of the code.  It
remains the case both before and after this patch that the conversions
wrongly treat a signed result of the most negative integer as an
overflow, when actually only that integer minus 1 or smaller should be
an overflow, although this only means an incorrect exception rather
than affecting the value returned; that was one of the bugs I fixed in
the glibc/libgcc version of this code in 2006 (as part of a major
overhaul of the code including various interface changes, so not
trivially backportable to the kernel version).

diff --git a/include/math-emu/op-common.h b/include/math-emu/op-common.h
index 70fe5e9..6bdf8c6 100644
--- a/include/math-emu/op-common.h
+++ b/include/math-emu/op-common.h
@@ -743,12 +743,17 @@ do {									\
 	  }									\
 	else									\
 	  {									\
+	    int _lz0, _lz1;							\
 	    if (X##_e <= -_FP_WORKBITS - 1)					\
 	      _FP_FRAC_SET_##wc(X, _FP_MINFRAC_##wc);				\
 	    else								\
 	      _FP_FRAC_SRS_##wc(X, _FP_FRACBITS_##fs - 1 - X##_e,		\
 				_FP_WFRACBITS_##fs);				\
+	    _FP_FRAC_CLZ_##wc(_lz0, X);						\
 	    _FP_ROUND(wc, X);							\
+	    _FP_FRAC_CLZ_##wc(_lz1, X);						\
+	    if (_lz1 < _lz0)							\
+	      X##_e++; /* For overflow detection.  */				\
 	    _FP_FRAC_SRL_##wc(X, _FP_WORKBITS);					\
 	    _FP_FRAC_ASSEMBLE_##wc(r, X, rsize);				\
 	  }									\

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* [PATCH 5/6] powerpc: fix e500 SPE float to integer and fixed-point conversions
From: Joseph S. Myers @ 2013-11-04 16:54 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Liu Yu, linux-kernel, Shan Hai
In-Reply-To: <Pine.LNX.4.64.1311041649250.4290@digraph.polyomino.org.uk>

From: Joseph Myers <joseph@codesourcery.com>

The e500 SPE floating-point emulation code has several problems in how
it handles conversions to integer and fixed-point fractional types.

There are the following 20 relevant instructions.  These can convert
to signed or unsigned 32-bit integers, either rounding towards zero
(as correct for C casts from floating-point to integer) or according
to the current rounding mode, or to signed or unsigned 32-bit
fixed-point values (values in the range [-1, 1) or [0, 1)).  For
conversion from double precision there are also instructions to
convert to 64-bit integers, rounding towards zero, although as far as
I know those instructions are completely theoretical (they are only
defined for implementations that support both SPE and classic 64-bit,
and I'm not aware of any such hardware even though the architecture
definition permits that combination).

#define EFSCTUI		0x2d4
#define EFSCTSI		0x2d5
#define EFSCTUF		0x2d6
#define EFSCTSF		0x2d7
#define EFSCTUIZ	0x2d8
#define EFSCTSIZ	0x2da

#define EVFSCTUI	0x294
#define EVFSCTSI	0x295
#define EVFSCTUF	0x296
#define EVFSCTSF	0x297
#define EVFSCTUIZ	0x298
#define EVFSCTSIZ	0x29a

#define EFDCTUIDZ	0x2ea
#define EFDCTSIDZ	0x2eb

#define EFDCTUI		0x2f4
#define EFDCTSI		0x2f5
#define EFDCTUF		0x2f6
#define EFDCTSF		0x2f7
#define EFDCTUIZ	0x2f8
#define EFDCTSIZ	0x2fa

The emulation code, for the instructions that come in variants
rounding either towards zero or according to the current rounding
direction, uses "if (func & 0x4)" as a condition for using _FP_ROUND
(otherwise _FP_ROUND_ZERO is used).  The condition is correct, but the
code it controls isn't.  Whether _FP_ROUND or _FP_ROUND_ZERO is used
makes no difference, as the effect of those soft-fp macros is to round
an intermediate floating-point result using the low three bits (the
last one sticky) of the working format.  As these operations are
dealing with a freshly unpacked floating-point input, those low bits
are zero and no rounding occurs.  The emulation code then uses the
FP_TO_INT_* macros for the actual integer conversion, with the effect
of always rounding towards zero; for rounding according to the current
rounding direction, it should be using FP_TO_INT_ROUND_*.

The instructions in question have semantics defined (in the Power ISA
documents) for out-of-range values and NaNs: out-of-range values
saturate and NaNs are converted to zero.  The emulation does nothing
to follow those semantics for NaNs (the soft-fp handling is to treat
them as infinities), and messes up the saturation semantics.  For
single-precision conversion to integers, (((func & 0x3) != 0) || SB_s)
is the condition used for doing a signed conversion.  The first part
is correct, but the second isn't: negative numbers should result in
saturation to 0 when converted to unsigned.  Double-precision
conversion to 64-bit integers correctly uses ((func & 0x1) == 0).
Double-precision conversion to 32-bit integers uses (((func & 0x3) !=
0) || DB_s), with correct first part and incorrect second part.  And
vector float conversion to integers uses (((func & 0x3) != 0) ||
SB0_s) (and similar for the other vector element), where the sign bit
check is again wrong.

The incorrect handling of negative numbers converted to unsigned was
introduced in commit afc0a07d4a283599ac3a6a31d7454e9baaeccca0.  The
rationale given there was a C testcase with cast from float to
unsigned int.  Conversion of out-of-range floating-point numbers to
integer types in C is undefined behavior in the base standard, defined
in Annex F to produce an unspecified value.  That is, the C testcase
used to justify that patch is incorrect - there is no ISO C
requirement for a particular value resulting from this conversion -
and in any case, the correct semantics for such emulation are the
semantics for the instruction (unsigned saturation, which is what it
does in hardware when the emulation is disabled).

The conversion to fixed-point values has its own problems.  That code
doesn't try to do a full emulation; it relies on the trap handler only
being called for arguments that are infinities, NaNs, subnormal or out
of range.  That's fine, but the logic ((vb.wp[1] >> 23) == 0xff &&
((vb.wp[1] & 0x7fffff) > 0)) for NaN detection won't detect negative
NaNs as being NaNs (the same applies for the double-precision case),
and subnormals are mapped to 0 rather than respecting the rounding
mode; the code should also explicitly raise the "invalid" exception.
The code for vectors works by executing the scalar float instruction
with the trapping disabled, meaning at least subnormals won't be
handled correctly.

As well as all those problems in the main emulation code, the rounding
handler - used to emulate rounding upward and downward when not
supported in hardware and when no higher priority exception occurred -
has its own problems.

* It gets called in some cases even for the instructions rounding to
  zero, and then acts according to the current rounding mode when it
  should just leave alone the truncated result provided by hardware.

* It presumes that the result is a single-precision, double-precision
  or single-precision vector as appropriate for the instruction type,
  determines the sign of the result accordingly, and then adjusts the
  result based on that sign and the rounding mode.

  - In the single-precision cases at least the sign determination for
    an integer result is the same as for a floating-point result; in
    the double-precision case, converted to 32-bit integer or fixed
    point, the sign of a double-precision value is in the high part of
    the register but it's the low part of the register that has the
    result of the conversion.

  - If the result is unsigned fixed-point, its sign may be wrongly
    determined as negative (does not actually cause problems, because
    inexact unsigned fixed-point results with the high bit set can
    only appear when converting from double, in which case the sign
    determination is instead wrongly using the high part of the
    register).

  - If the sign of the result is correctly determined as negative, any
    adjustment required to change the truncated result to one correct
    for the rounding mode should be in the opposite direction for
    two's-complement integers as for sign-magnitude floating-point
    values.

  - And if the integer result is zero, the correct sign can only be
    determined by examining the original operand, and not at all (as
    far as I can tell) if the operand and result are the same
    register.

This patch fixes all these problems (as far as possible, given the
inability to determine the correct sign in the rounding handler when
the truncated result is 0, the conversion is to a signed type and the
truncated result has overwritten the original operand).  Conversion to
fixed-point now uses full emulation, and does not use "asm" in the
vector case; the semantics are exactly those of converting to integer
according to the current rounding direction, once the exponent has
been adjusted, so the code makes such an adjustment then uses the
FP_TO_INT_ROUND macros.

The testcase I used for verifying that the instructions (other than
the theoretical conversions to 64-bit integers) produce the correct
results is at <http://lkml.org/lkml/2013/10/8/708>.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

Previous submission: <http://lkml.org/lkml/2013/10/8/705>.

diff --git a/arch/powerpc/math-emu/math_efp.c b/arch/powerpc/math-emu/math_efp.c
index ecdf35d..01a0abb 100644
--- a/arch/powerpc/math-emu/math_efp.c
+++ b/arch/powerpc/math-emu/math_efp.c
@@ -275,21 +275,13 @@ int do_spe_mathemu(struct pt_regs *regs)
 
 		case EFSCTSF:
 		case EFSCTUF:
-			if (!((vb.wp[1] >> 23) == 0xff && ((vb.wp[1] & 0x7fffff) > 0))) {
-				/* NaN */
-				if (((vb.wp[1] >> 23) & 0xff) == 0) {
-					/* denorm */
-					vc.wp[1] = 0x0;
-				} else if ((vb.wp[1] >> 31) == 0) {
-					/* positive normal */
-					vc.wp[1] = (func == EFSCTSF) ?
-						0x7fffffff : 0xffffffff;
-				} else { /* negative normal */
-					vc.wp[1] = (func == EFSCTSF) ?
-						0x80000000 : 0x0;
-				}
-			} else { /* rB is NaN */
-				vc.wp[1] = 0x0;
+			if (SB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				SB_e += (func == EFSCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_S(vc.wp[1], SB, 32,
+						(func == EFSCTSF));
 			}
 			goto update_regs;
 
@@ -306,16 +298,25 @@ int do_spe_mathemu(struct pt_regs *regs)
 		}
 
 		case EFSCTSI:
-		case EFSCTSIZ:
 		case EFSCTUI:
+			if (SB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_S(vc.wp[1], SB, 32,
+						((func & 0x3) != 0));
+			}
+			goto update_regs;
+
+		case EFSCTSIZ:
 		case EFSCTUIZ:
-			if (func & 0x4) {
-				_FP_ROUND(1, SB);
+			if (SB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
 			} else {
-				_FP_ROUND_ZERO(1, SB);
+				FP_TO_INT_S(vc.wp[1], SB, 32,
+						((func & 0x3) != 0));
 			}
-			FP_TO_INT_S(vc.wp[1], SB, 32,
-					(((func & 0x3) != 0) || SB_s));
 			goto update_regs;
 
 		default:
@@ -404,22 +405,13 @@ cmp_s:
 
 		case EFDCTSF:
 		case EFDCTUF:
-			if (!((vb.wp[0] >> 20) == 0x7ff &&
-			   ((vb.wp[0] & 0xfffff) > 0 || (vb.wp[1] > 0)))) {
-				/* not a NaN */
-				if (((vb.wp[0] >> 20) & 0x7ff) == 0) {
-					/* denorm */
-					vc.wp[1] = 0x0;
-				} else if ((vb.wp[0] >> 31) == 0) {
-					/* positive normal */
-					vc.wp[1] = (func == EFDCTSF) ?
-						0x7fffffff : 0xffffffff;
-				} else { /* negative normal */
-					vc.wp[1] = (func == EFDCTSF) ?
-						0x80000000 : 0x0;
-				}
-			} else { /* NaN */
-				vc.wp[1] = 0x0;
+			if (DB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				DB_e += (func == EFDCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_D(vc.wp[1], DB, 32,
+						(func == EFDCTSF));
 			}
 			goto update_regs;
 
@@ -437,21 +429,35 @@ cmp_s:
 
 		case EFDCTUIDZ:
 		case EFDCTSIDZ:
-			_FP_ROUND_ZERO(2, DB);
-			FP_TO_INT_D(vc.dp[0], DB, 64, ((func & 0x1) == 0));
+			if (DB_c == FP_CLS_NAN) {
+				vc.dp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_D(vc.dp[0], DB, 64,
+						((func & 0x1) == 0));
+			}
 			goto update_regs;
 
 		case EFDCTUI:
 		case EFDCTSI:
+			if (DB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_D(vc.wp[1], DB, 32,
+						((func & 0x3) != 0));
+			}
+			goto update_regs;
+
 		case EFDCTUIZ:
 		case EFDCTSIZ:
-			if (func & 0x4) {
-				_FP_ROUND(2, DB);
+			if (DB_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
 			} else {
-				_FP_ROUND_ZERO(2, DB);
+				FP_TO_INT_D(vc.wp[1], DB, 32,
+						((func & 0x3) != 0));
 			}
-			FP_TO_INT_D(vc.wp[1], DB, 32,
-					(((func & 0x3) != 0) || DB_s));
 			goto update_regs;
 
 		default:
@@ -556,37 +562,60 @@ cmp_d:
 			cmp = -1;
 			goto cmp_vs;
 
-		case EVFSCTSF:
-			__asm__ __volatile__ ("mtspr 512, %4\n"
-				"efsctsf %0, %2\n"
-				"efsctsf %1, %3\n"
-				: "=r" (vc.wp[0]), "=r" (vc.wp[1])
-				: "r" (vb.wp[0]), "r" (vb.wp[1]), "r" (0));
-			goto update_regs;
-
 		case EVFSCTUF:
-			__asm__ __volatile__ ("mtspr 512, %4\n"
-				"efsctuf %0, %2\n"
-				"efsctuf %1, %3\n"
-				: "=r" (vc.wp[0]), "=r" (vc.wp[1])
-				: "r" (vb.wp[0]), "r" (vb.wp[1]), "r" (0));
+		case EVFSCTSF:
+			if (SB0_c == FP_CLS_NAN) {
+				vc.wp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				SB0_e += (func == EVFSCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_S(vc.wp[0], SB0, 32,
+						(func == EVFSCTSF));
+			}
+			if (SB1_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				SB1_e += (func == EVFSCTSF ? 31 : 32);
+				FP_TO_INT_ROUND_S(vc.wp[1], SB1, 32,
+						(func == EVFSCTSF));
+			}
 			goto update_regs;
 
 		case EVFSCTUI:
 		case EVFSCTSI:
+			if (SB0_c == FP_CLS_NAN) {
+				vc.wp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_S(vc.wp[0], SB0, 32,
+						((func & 0x3) != 0));
+			}
+			if (SB1_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_ROUND_S(vc.wp[1], SB1, 32,
+						((func & 0x3) != 0));
+			}
+			goto update_regs;
+
 		case EVFSCTUIZ:
 		case EVFSCTSIZ:
-			if (func & 0x4) {
-				_FP_ROUND(1, SB0);
-				_FP_ROUND(1, SB1);
+			if (SB0_c == FP_CLS_NAN) {
+				vc.wp[0] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
 			} else {
-				_FP_ROUND_ZERO(1, SB0);
-				_FP_ROUND_ZERO(1, SB1);
+				FP_TO_INT_S(vc.wp[0], SB0, 32,
+						((func & 0x3) != 0));
+			}
+			if (SB1_c == FP_CLS_NAN) {
+				vc.wp[1] = 0;
+				FP_SET_EXCEPTION(FP_EX_INVALID);
+			} else {
+				FP_TO_INT_S(vc.wp[1], SB1, 32,
+						((func & 0x3) != 0));
 			}
-			FP_TO_INT_S(vc.wp[0], SB0, 32,
-					(((func & 0x3) != 0) || SB0_s));
-			FP_TO_INT_S(vc.wp[1], SB1, 32,
-					(((func & 0x3) != 0) || SB1_s));
 			goto update_regs;
 
 		default:
@@ -681,14 +710,16 @@ int speround_handler(struct pt_regs *regs)
 	union dw_union fgpr;
 	int s_lo, s_hi;
 	int lo_inexact, hi_inexact;
-	unsigned long speinsn, type, fc, fptype;
+	int fp_result;
+	unsigned long speinsn, type, fb, fc, fptype, func;
 
 	if (get_user(speinsn, (unsigned int __user *) regs->nip))
 		return -EFAULT;
 	if ((speinsn >> 26) != 4)
 		return -EINVAL;         /* not an spe instruction */
 
-	type = insn_type(speinsn & 0x7ff);
+	func = speinsn & 0x7ff;
+	type = insn_type(func);
 	if (type == XCR) return -ENOSYS;
 
 	__FPU_FPSCR = mfspr(SPRN_SPEFSCR);
@@ -708,6 +739,65 @@ int speround_handler(struct pt_regs *regs)
 	fgpr.wp[0] = current->thread.evr[fc];
 	fgpr.wp[1] = regs->gpr[fc];
 
+	fb = (speinsn >> 11) & 0x1f;
+	switch (func) {
+	case EFSCTUIZ:
+	case EFSCTSIZ:
+	case EVFSCTUIZ:
+	case EVFSCTSIZ:
+	case EFDCTUIDZ:
+	case EFDCTSIDZ:
+	case EFDCTUIZ:
+	case EFDCTSIZ:
+		/*
+		 * These instructions always round to zero,
+		 * independent of the rounding mode.
+		 */
+		return 0;
+
+	case EFSCTUI:
+	case EFSCTUF:
+	case EVFSCTUI:
+	case EVFSCTUF:
+	case EFDCTUI:
+	case EFDCTUF:
+		fp_result = 0;
+		s_lo = 0;
+		s_hi = 0;
+		break;
+
+	case EFSCTSI:
+	case EFSCTSF:
+		fp_result = 0;
+		/* Recover the sign of a zero result if possible.  */
+		if (fgpr.wp[1] == 0)
+			s_lo = regs->gpr[fb] & SIGN_BIT_S;
+		break;
+
+	case EVFSCTSI:
+	case EVFSCTSF:
+		fp_result = 0;
+		/* Recover the sign of a zero result if possible.  */
+		if (fgpr.wp[1] == 0)
+			s_lo = regs->gpr[fb] & SIGN_BIT_S;
+		if (fgpr.wp[0] == 0)
+			s_hi = current->thread.evr[fb] & SIGN_BIT_S;
+		break;
+
+	case EFDCTSI:
+	case EFDCTSF:
+		fp_result = 0;
+		s_hi = s_lo;
+		/* Recover the sign of a zero result if possible.  */
+		if (fgpr.wp[1] == 0)
+			s_hi = current->thread.evr[fb] & SIGN_BIT_S;
+		break;
+
+	default:
+		fp_result = 1;
+		break;
+	}
+
 	pr_debug("round fgpr: %08x  %08x\n", fgpr.wp[0], fgpr.wp[1]);
 
 	switch (fptype) {
@@ -719,15 +809,30 @@ int speround_handler(struct pt_regs *regs)
 		if ((FP_ROUNDMODE) == FP_RND_PINF) {
 			if (!s_lo) fgpr.wp[1]++; /* Z > 0, choose Z1 */
 		} else { /* round to -Inf */
-			if (s_lo) fgpr.wp[1]++; /* Z < 0, choose Z2 */
+			if (s_lo) {
+				if (fp_result)
+					fgpr.wp[1]++; /* Z < 0, choose Z2 */
+				else
+					fgpr.wp[1]--; /* Z < 0, choose Z2 */
+			}
 		}
 		break;
 
 	case DPFP:
 		if (FP_ROUNDMODE == FP_RND_PINF) {
-			if (!s_hi) fgpr.dp[0]++; /* Z > 0, choose Z1 */
+			if (!s_hi) {
+				if (fp_result)
+					fgpr.dp[0]++; /* Z > 0, choose Z1 */
+				else
+					fgpr.wp[1]++; /* Z > 0, choose Z1 */
+			}
 		} else { /* round to -Inf */
-			if (s_hi) fgpr.dp[0]++; /* Z < 0, choose Z2 */
+			if (s_hi) {
+				if (fp_result)
+					fgpr.dp[0]++; /* Z < 0, choose Z2 */
+				else
+					fgpr.wp[1]--; /* Z < 0, choose Z2 */
+			}
 		}
 		break;
 
@@ -738,10 +843,18 @@ int speround_handler(struct pt_regs *regs)
 			if (hi_inexact && !s_hi)
 				fgpr.wp[0]++; /* Z_high word > 0, choose Z1 */
 		} else { /* round to -Inf */
-			if (lo_inexact && s_lo)
-				fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
-			if (hi_inexact && s_hi)
-				fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
+			if (lo_inexact && s_lo) {
+				if (fp_result)
+					fgpr.wp[1]++; /* Z_low < 0, choose Z2 */
+				else
+					fgpr.wp[1]--; /* Z_low < 0, choose Z2 */
+			}
+			if (hi_inexact && s_hi) {
+				if (fp_result)
+					fgpr.wp[0]++; /* Z_high < 0, choose Z2 */
+				else
+					fgpr.wp[0]--; /* Z_high < 0, choose Z2 */
+			}
 		}
 		break;
 

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* [PATCH 6/6] powerpc: fix e500 SPE float SIGFPE generation
From: Joseph S. Myers @ 2013-11-04 16:55 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Liu Yu, linux-kernel, Shan Hai
In-Reply-To: <Pine.LNX.4.64.1311041649250.4290@digraph.polyomino.org.uk>

From: Joseph Myers <joseph@codesourcery.com>

The e500 SPE floating-point emulation code is called from
SPEFloatingPointException and SPEFloatingPointRoundException in
arch/powerpc/kernel/traps.c.  Those functions have support for
generating SIGFPE, but do_spe_mathemu and speround_handler don't
generate a return value to indicate that this should be done.  Such a
return value should depend on whether an exception is raised that has
been set via prctl to generate SIGFPE.  This patch adds the relevant
logic in these functions so that SIGFPE is generated as expected by
the glibc testsuite.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>

---

Previous submission: <http://lkml.org/lkml/2013/10/10/626>.

diff --git a/arch/powerpc/math-emu/math_efp.c b/arch/powerpc/math-emu/math_efp.c
index 01a0abb..28337c9 100644
--- a/arch/powerpc/math-emu/math_efp.c
+++ b/arch/powerpc/math-emu/math_efp.c
@@ -20,6 +20,7 @@
  */
 
 #include <linux/types.h>
+#include <linux/prctl.h>
 
 #include <asm/uaccess.h>
 #include <asm/reg.h>
@@ -691,6 +692,23 @@ update_regs:
 	pr_debug("va: %08x  %08x\n", va.wp[0], va.wp[1]);
 	pr_debug("vb: %08x  %08x\n", vb.wp[0], vb.wp[1]);
 
+	if (current->thread.fpexc_mode & PR_FP_EXC_SW_ENABLE) {
+		if ((FP_CUR_EXCEPTIONS & FP_EX_DIVZERO)
+		    && (current->thread.fpexc_mode & PR_FP_EXC_DIV))
+			return 1;
+		if ((FP_CUR_EXCEPTIONS & FP_EX_OVERFLOW)
+		    && (current->thread.fpexc_mode & PR_FP_EXC_OVF))
+			return 1;
+		if ((FP_CUR_EXCEPTIONS & FP_EX_UNDERFLOW)
+		    && (current->thread.fpexc_mode & PR_FP_EXC_UND))
+			return 1;
+		if ((FP_CUR_EXCEPTIONS & FP_EX_INEXACT)
+		    && (current->thread.fpexc_mode & PR_FP_EXC_RES))
+			return 1;
+		if ((FP_CUR_EXCEPTIONS & FP_EX_INVALID)
+		    && (current->thread.fpexc_mode & PR_FP_EXC_INV))
+			return 1;
+	}
 	return 0;
 
 illegal:
@@ -867,6 +885,8 @@ int speround_handler(struct pt_regs *regs)
 
 	pr_debug("  to fgpr: %08x  %08x\n", fgpr.wp[0], fgpr.wp[1]);
 
+	if (current->thread.fpexc_mode & PR_FP_EXC_SW_ENABLE)
+		return (current->thread.fpexc_mode & PR_FP_EXC_RES) ? 1 : 0;
 	return 0;
 }
 

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply related

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Paul E. McKenney @ 2013-11-04 16:34 UTC (permalink / raw)
  To: Will Deacon
  Cc: Michael Neuling, Mathieu Desnoyers, Peter Zijlstra, Oleg Nesterov,
	LKML, Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131104110553.GA8595@mudshark.cambridge.arm.com>

On Mon, Nov 04, 2013 at 11:05:53AM +0000, Will Deacon wrote:
> On Sun, Nov 03, 2013 at 11:34:00PM +0000, Linus Torvalds wrote:
> > So it would *kind* of act like a "smp_wmb() + smp_rmb()", but the
> > problem is that a "smp_rmb()" doesn't really "attach" to the preceding
> > write.
> 
> Agreed.
> 
> > This is analogous to a "acquire" operation: you cannot make an
> > "acquire" barrier, because it's not a barrier *between* two ops, it's
> > associated with one particular op.
> > 
> > So what I *think* you actually really really want is a "store with
> > release consistency, followed by a write barrier".
> 
> How does that order reads against reads? (Paul mentioned this as a
> requirement). I not clear about the use case for this, so perhaps there is a
> dependency that I'm not aware of.

An smp_store_with_release_semantics() orders against prior reads -and-
writes.  It maps to barrier() for x86, stlr for ARM, and lwsync for
PowerPC, as called out in my prototype definitions.

> > In TSO, afaik all stores have release consistency, and all writes are
> > ordered, which is why this is a no-op in TSO. And x86 also has that
> > "all stores have release consistency, and all writes are ordered"
> > model, even if TSO doesn't really describe the x86 model.
> > 
> > But on ARM64, for example, I think you'd really want the store itself
> > to be done with "stlr" (store with release), and then follow up with a
> > "dsb st" after that.
> 
> So a dsb is pretty heavyweight here (it prevents execution of *any* further
> instructions until all preceeding stores have completed, as well as
> ensuring completion of any ongoing cache flushes). In conjunction with the
> store-release, that's going to hold everything up until the store-release
> (and therefore any preceeding memory accesses) have completed. Granted, I
> think that gives Paul his read/read ordering, but it's a lot heavier than
> what's required.

I do not believe that we need the trailing "dsb st".

> > And notice how that requires you to mark the store itself. There is no
> > actual barrier *after* the store that does the optimized model.
> > 
> > Of course, it's entirely possible that it's not worth worrying about
> > this on ARM64, and that just doing it as a "normal store followed by a
> > full memory barrier" is good enough. But at least in *theory* a
> > microarchitecture might make it much cheaper to do a "store with
> > release consistency" followed by "write barrier".
> 
> I agree with the sentiment but, given that this stuff is so heavily
> microarchitecture-dependent (and not simple to probe), a simple dmb ish
> might be the best option after all. That's especially true if the
> microarchitecture decided to ignore the barrier options and treat everything
> as `all accesses, full system' in order to keep the hardware design simple.

I believe that we can do quite a bit better with current hardware
instructions (in the case of ARM, for a recent definition of "current")
and also simplify the memory ordering quite a bit.

								Thanx, Paul

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Peter Zijlstra @ 2013-11-04 19:11 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131104162732.GN3947@linux.vnet.ibm.com>

On Mon, Nov 04, 2013 at 08:27:32AM -0800, Paul E. McKenney wrote:
> All this is leading me to suggest the following shortenings of names:
> 
> 	smp_load_with_acquire_semantics() -> smp_load_acquire()
> 
> 	smp_store_with_release_semantics() -> smp_store_release()
> 
> But names aside, the above gets rid of explicit barriers on TSO architectures,
> allows ARM to avoid full DMB, and allows PowerPC to use lwsync instead of
> the heavier-weight sync.

A little something like this? Completely guessed at the arm/arm64/ia64
asm, but at least for those archs I found proper instructions (I hope),
for x86,sparc,s390 which are TSO we can do with a barrier and PPC like
said can do with the lwsync, all others fall back to using a smp_mb().

Should probably come with a proper changelog and an addition to _The_
document.

---
 arch/alpha/include/asm/barrier.h      | 13 +++++++++++
 arch/arc/include/asm/barrier.h        | 13 +++++++++++
 arch/arm/include/asm/barrier.h        | 26 +++++++++++++++++++++
 arch/arm64/include/asm/barrier.h      | 28 +++++++++++++++++++++++
 arch/avr32/include/asm/barrier.h      | 12 ++++++++++
 arch/blackfin/include/asm/barrier.h   | 13 +++++++++++
 arch/cris/include/asm/barrier.h       | 13 +++++++++++
 arch/frv/include/asm/barrier.h        | 13 +++++++++++
 arch/h8300/include/asm/barrier.h      | 13 +++++++++++
 arch/hexagon/include/asm/barrier.h    | 13 +++++++++++
 arch/ia64/include/asm/barrier.h       | 43 +++++++++++++++++++++++++++++++++++
 arch/m32r/include/asm/barrier.h       | 13 +++++++++++
 arch/m68k/include/asm/barrier.h       | 13 +++++++++++
 arch/metag/include/asm/barrier.h      | 13 +++++++++++
 arch/microblaze/include/asm/barrier.h | 13 +++++++++++
 arch/mips/include/asm/barrier.h       | 13 +++++++++++
 arch/mn10300/include/asm/barrier.h    | 13 +++++++++++
 arch/parisc/include/asm/barrier.h     | 13 +++++++++++
 arch/powerpc/include/asm/barrier.h    | 15 ++++++++++++
 arch/s390/include/asm/barrier.h       | 13 +++++++++++
 arch/score/include/asm/barrier.h      | 13 +++++++++++
 arch/sh/include/asm/barrier.h         | 13 +++++++++++
 arch/sparc/include/asm/barrier_32.h   | 13 +++++++++++
 arch/sparc/include/asm/barrier_64.h   | 13 +++++++++++
 arch/tile/include/asm/barrier.h       | 13 +++++++++++
 arch/unicore32/include/asm/barrier.h  | 13 +++++++++++
 arch/x86/include/asm/barrier.h        | 13 +++++++++++
 arch/xtensa/include/asm/barrier.h     | 13 +++++++++++
 28 files changed, 423 insertions(+)

diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
index ce8860a0b32d..464139feee97 100644
--- a/arch/alpha/include/asm/barrier.h
+++ b/arch/alpha/include/asm/barrier.h
@@ -29,6 +29,19 @@ __asm__ __volatile__("mb": : :"memory")
 #define smp_read_barrier_depends()	do { } while (0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #define set_mb(var, value) \
 do { var = value; mb(); } while (0)
 
diff --git a/arch/arc/include/asm/barrier.h b/arch/arc/include/asm/barrier.h
index f6cb7c4ffb35..a779da846fb5 100644
--- a/arch/arc/include/asm/barrier.h
+++ b/arch/arc/include/asm/barrier.h
@@ -30,6 +30,19 @@
 #define smp_wmb()       barrier()
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #define smp_mb__before_atomic_dec()	barrier()
 #define smp_mb__after_atomic_dec()	barrier()
 #define smp_mb__before_atomic_inc()	barrier()
diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 60f15e274e6d..a804093d6891 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -53,10 +53,36 @@
 #define smp_mb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
 #else
 #define smp_mb()	dmb(ish)
 #define smp_rmb()	smp_mb()
 #define smp_wmb()	dmb(ishst)
+
+#define smp_store_release(p, v)						\
+do {									\
+	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1;						\
+	asm volatile ("ldar %w0, [%1]"					\
+			: "=r" (___p1) : "r" (&p) : "memory");		\
+	return ___p1;							\
+} while (0)
 #endif
 
 #define read_barrier_depends()		do { } while(0)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index d4a63338a53c..0da2d4ebb9a8 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -35,10 +35,38 @@
 #define smp_mb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #else
+
 #define smp_mb()	asm volatile("dmb ish" : : : "memory")
 #define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
 #define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
+
+#define smp_store_release(p, v)						\
+do {									\
+	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1;						\
+	asm volatile ("ldar %w0, [%1]"					\
+			: "=r" (___p1) : "r" (&p) : "memory");		\
+	return ___p1;							\
+} while (0)
 #endif
 
 #define read_barrier_depends()		do { } while(0)
diff --git a/arch/avr32/include/asm/barrier.h b/arch/avr32/include/asm/barrier.h
index 0961275373db..a0c48ad684f8 100644
--- a/arch/avr32/include/asm/barrier.h
+++ b/arch/avr32/include/asm/barrier.h
@@ -25,5 +25,17 @@
 # define smp_read_barrier_depends() do { } while(0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
 
 #endif /* __ASM_AVR32_BARRIER_H */
diff --git a/arch/blackfin/include/asm/barrier.h b/arch/blackfin/include/asm/barrier.h
index ebb189507dd7..67889d9225d9 100644
--- a/arch/blackfin/include/asm/barrier.h
+++ b/arch/blackfin/include/asm/barrier.h
@@ -45,4 +45,17 @@
 #define set_mb(var, value) do { var = value; mb(); } while (0)
 #define smp_read_barrier_depends()	read_barrier_depends()
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _BLACKFIN_BARRIER_H */
diff --git a/arch/cris/include/asm/barrier.h b/arch/cris/include/asm/barrier.h
index 198ad7fa6b25..34243dc44ef1 100644
--- a/arch/cris/include/asm/barrier.h
+++ b/arch/cris/include/asm/barrier.h
@@ -22,4 +22,17 @@
 #define smp_read_barrier_depends()     do { } while(0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_CRIS_BARRIER_H */
diff --git a/arch/frv/include/asm/barrier.h b/arch/frv/include/asm/barrier.h
index 06776ad9f5e9..92f89934d4ed 100644
--- a/arch/frv/include/asm/barrier.h
+++ b/arch/frv/include/asm/barrier.h
@@ -26,4 +26,17 @@
 #define set_mb(var, value) \
 	do { var = (value); barrier(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_BARRIER_H */
diff --git a/arch/h8300/include/asm/barrier.h b/arch/h8300/include/asm/barrier.h
index 9e0aa9fc195d..516e9d379e25 100644
--- a/arch/h8300/include/asm/barrier.h
+++ b/arch/h8300/include/asm/barrier.h
@@ -26,4 +26,17 @@
 #define smp_read_barrier_depends()	do { } while(0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _H8300_BARRIER_H */
diff --git a/arch/hexagon/include/asm/barrier.h b/arch/hexagon/include/asm/barrier.h
index 1041a8e70ce8..838a2ebe07a5 100644
--- a/arch/hexagon/include/asm/barrier.h
+++ b/arch/hexagon/include/asm/barrier.h
@@ -38,4 +38,17 @@
 #define set_mb(var, value) \
 	do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_BARRIER_H */
diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index 60576e06b6fb..4598d390fabb 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -45,11 +45,54 @@
 # define smp_rmb()	rmb()
 # define smp_wmb()	wmb()
 # define smp_read_barrier_depends()	read_barrier_depends()
+
+#define smp_store_release(p, v)						\
+do {									\
+	switch (sizeof(p)) {						\
+	case 4:								\
+		asm volatile ("st4.acq [%0]=%1" 			\
+				:: "r" (&p), "r" (v) : "memory");	\
+		break;							\
+	case 8:								\
+		asm volatile ("st8.acq [%0]=%1" 			\
+				:: "r" (&p), "r" (v) : "memory"); 	\
+		break;							\
+	}								\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1;						\
+	switch (sizeof(p)) {						\
+	case 4:								\
+		asm volatile ("ld4.rel %0=[%1]" 			\
+				: "=r"(___p1) : "r" (&p) : "memory"); 	\
+		break;							\
+	case 8:								\
+		asm volatile ("ld8.rel %0=[%1]" 			\
+				: "=r"(___p1) : "r" (&p) : "memory"); 	\
+		break;							\
+	}								\
+	return ___p1;							\
+} while (0)
 #else
 # define smp_mb()	barrier()
 # define smp_rmb()	barrier()
 # define smp_wmb()	barrier()
 # define smp_read_barrier_depends()	do { } while(0)
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
 #endif
 
 /*
diff --git a/arch/m32r/include/asm/barrier.h b/arch/m32r/include/asm/barrier.h
index 6976621efd3f..e5d42bcf90c5 100644
--- a/arch/m32r/include/asm/barrier.h
+++ b/arch/m32r/include/asm/barrier.h
@@ -91,4 +91,17 @@
 #define set_mb(var, value) do { var = value; barrier(); } while (0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_M32R_BARRIER_H */
diff --git a/arch/m68k/include/asm/barrier.h b/arch/m68k/include/asm/barrier.h
index 445ce22c23cb..eeb9ecf713cc 100644
--- a/arch/m68k/include/asm/barrier.h
+++ b/arch/m68k/include/asm/barrier.h
@@ -17,4 +17,17 @@
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	((void)0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _M68K_BARRIER_H */
diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
index c90bfc6bf648..d8e6f2e4a27c 100644
--- a/arch/metag/include/asm/barrier.h
+++ b/arch/metag/include/asm/barrier.h
@@ -82,4 +82,17 @@ static inline void fence(void)
 #define smp_read_barrier_depends()     do { } while (0)
 #define set_mb(var, value) do { var = value; smp_mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_METAG_BARRIER_H */
diff --git a/arch/microblaze/include/asm/barrier.h b/arch/microblaze/include/asm/barrier.h
index df5be3e87044..a890702061c9 100644
--- a/arch/microblaze/include/asm/barrier.h
+++ b/arch/microblaze/include/asm/barrier.h
@@ -24,4 +24,17 @@
 #define smp_rmb()		rmb()
 #define smp_wmb()		wmb()
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_MICROBLAZE_BARRIER_H */
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 314ab5532019..e59bcd051f36 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -180,4 +180,17 @@
 #define nudge_writes() mb()
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_BARRIER_H */
diff --git a/arch/mn10300/include/asm/barrier.h b/arch/mn10300/include/asm/barrier.h
index 2bd97a5c8af7..0e6a0608d4a1 100644
--- a/arch/mn10300/include/asm/barrier.h
+++ b/arch/mn10300/include/asm/barrier.h
@@ -34,4 +34,17 @@
 #define read_barrier_depends()		do {} while (0)
 #define smp_read_barrier_depends()	do {} while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_BARRIER_H */
diff --git a/arch/parisc/include/asm/barrier.h b/arch/parisc/include/asm/barrier.h
index e77d834aa803..f1145a8594a0 100644
--- a/arch/parisc/include/asm/barrier.h
+++ b/arch/parisc/include/asm/barrier.h
@@ -32,4 +32,17 @@
 
 #define set_mb(var, value)		do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __PARISC_BARRIER_H */
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index ae782254e731..b5cc36791f42 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -65,4 +65,19 @@
 #define data_barrier(x)	\
 	asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
 
+/* use smp_rmb() as that is either lwsync or a barrier() depending on SMP */
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_rmb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_rmb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_POWERPC_BARRIER_H */
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index 16760eeb79b0..e8989c40e11c 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -32,4 +32,17 @@
 
 #define set_mb(var, value)		do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	barrier();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	barrier();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_BARRIER_H */
diff --git a/arch/score/include/asm/barrier.h b/arch/score/include/asm/barrier.h
index 0eacb6471e6d..5f101ef8ade9 100644
--- a/arch/score/include/asm/barrier.h
+++ b/arch/score/include/asm/barrier.h
@@ -13,4 +13,17 @@
 
 #define set_mb(var, value) 		do {var = value; wmb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_SCORE_BARRIER_H */
diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
index 72c103dae300..611128c2f636 100644
--- a/arch/sh/include/asm/barrier.h
+++ b/arch/sh/include/asm/barrier.h
@@ -51,4 +51,17 @@
 
 #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_SH_BARRIER_H */
diff --git a/arch/sparc/include/asm/barrier_32.h b/arch/sparc/include/asm/barrier_32.h
index c1b76654ee76..f47f9d51f326 100644
--- a/arch/sparc/include/asm/barrier_32.h
+++ b/arch/sparc/include/asm/barrier_32.h
@@ -12,4 +12,17 @@
 #define smp_wmb()	__asm__ __volatile__("":::"memory")
 #define smp_read_barrier_depends()	do { } while(0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* !(__SPARC_BARRIER_H) */
diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
index 95d45986f908..77cbe6982ca0 100644
--- a/arch/sparc/include/asm/barrier_64.h
+++ b/arch/sparc/include/asm/barrier_64.h
@@ -53,4 +53,17 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
 
 #define smp_read_barrier_depends()	do { } while(0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	barrier();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	barrier();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* !(__SPARC64_BARRIER_H) */
diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
index a9a73da5865d..4d5330d4fd31 100644
--- a/arch/tile/include/asm/barrier.h
+++ b/arch/tile/include/asm/barrier.h
@@ -140,5 +140,18 @@ mb_incoherent(void)
 #define set_mb(var, value) \
 	do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_TILE_BARRIER_H */
diff --git a/arch/unicore32/include/asm/barrier.h b/arch/unicore32/include/asm/barrier.h
index a6620e5336b6..5471ff6aae10 100644
--- a/arch/unicore32/include/asm/barrier.h
+++ b/arch/unicore32/include/asm/barrier.h
@@ -25,4 +25,17 @@
 
 #define set_mb(var, value)		do { var = value; smp_mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __UNICORE_BARRIER_H__ */
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index c6cd358a1eec..a7fd8201ab09 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -100,6 +100,19 @@
 #define set_mb(var, value) do { var = value; barrier(); } while (0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	barrier();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	barrier();							\
+	return ___p1;							\
+} while (0)
+
 /*
  * Stop RDTSC speculation. This is needed when you need to use RDTSC
  * (or get_cycles or vread that possibly accesses the TSC) in a defined
diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
index ef021677d536..703d511add49 100644
--- a/arch/xtensa/include/asm/barrier.h
+++ b/arch/xtensa/include/asm/barrier.h
@@ -26,4 +26,17 @@
 
 #define set_mb(var, value)	do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p, v)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _XTENSA_SYSTEM_H */

^ permalink raw reply related

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Peter Zijlstra @ 2013-11-04 19:18 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131104191127.GW16117@laptop.programming.kicks-ass.net>

On Mon, Nov 04, 2013 at 08:11:27PM +0100, Peter Zijlstra wrote:
> +#define smp_load_acquire(p, v)						\

I R idiot!! :-)

---
 arch/alpha/include/asm/barrier.h      | 13 +++++++++++
 arch/arc/include/asm/barrier.h        | 13 +++++++++++
 arch/arm/include/asm/barrier.h        | 26 +++++++++++++++++++++
 arch/arm64/include/asm/barrier.h      | 28 +++++++++++++++++++++++
 arch/avr32/include/asm/barrier.h      | 12 ++++++++++
 arch/blackfin/include/asm/barrier.h   | 13 +++++++++++
 arch/cris/include/asm/barrier.h       | 13 +++++++++++
 arch/frv/include/asm/barrier.h        | 13 +++++++++++
 arch/h8300/include/asm/barrier.h      | 13 +++++++++++
 arch/hexagon/include/asm/barrier.h    | 13 +++++++++++
 arch/ia64/include/asm/barrier.h       | 43 +++++++++++++++++++++++++++++++++++
 arch/m32r/include/asm/barrier.h       | 13 +++++++++++
 arch/m68k/include/asm/barrier.h       | 13 +++++++++++
 arch/metag/include/asm/barrier.h      | 13 +++++++++++
 arch/microblaze/include/asm/barrier.h | 13 +++++++++++
 arch/mips/include/asm/barrier.h       | 13 +++++++++++
 arch/mn10300/include/asm/barrier.h    | 13 +++++++++++
 arch/parisc/include/asm/barrier.h     | 13 +++++++++++
 arch/powerpc/include/asm/barrier.h    | 15 ++++++++++++
 arch/s390/include/asm/barrier.h       | 13 +++++++++++
 arch/score/include/asm/barrier.h      | 13 +++++++++++
 arch/sh/include/asm/barrier.h         | 13 +++++++++++
 arch/sparc/include/asm/barrier_32.h   | 13 +++++++++++
 arch/sparc/include/asm/barrier_64.h   | 13 +++++++++++
 arch/tile/include/asm/barrier.h       | 13 +++++++++++
 arch/unicore32/include/asm/barrier.h  | 13 +++++++++++
 arch/x86/include/asm/barrier.h        | 13 +++++++++++
 arch/xtensa/include/asm/barrier.h     | 13 +++++++++++
 28 files changed, 423 insertions(+)

diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
index ce8860a0b32d..464139feee97 100644
--- a/arch/alpha/include/asm/barrier.h
+++ b/arch/alpha/include/asm/barrier.h
@@ -29,6 +29,19 @@ __asm__ __volatile__("mb": : :"memory")
 #define smp_read_barrier_depends()	do { } while (0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #define set_mb(var, value) \
 do { var = value; mb(); } while (0)
 
diff --git a/arch/arc/include/asm/barrier.h b/arch/arc/include/asm/barrier.h
index f6cb7c4ffb35..a779da846fb5 100644
--- a/arch/arc/include/asm/barrier.h
+++ b/arch/arc/include/asm/barrier.h
@@ -30,6 +30,19 @@
 #define smp_wmb()       barrier()
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #define smp_mb__before_atomic_dec()	barrier()
 #define smp_mb__after_atomic_dec()	barrier()
 #define smp_mb__before_atomic_inc()	barrier()
diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 60f15e274e6d..4ada4720bdeb 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -53,10 +53,36 @@
 #define smp_mb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
 #else
 #define smp_mb()	dmb(ish)
 #define smp_rmb()	smp_mb()
 #define smp_wmb()	dmb(ishst)
+
+#define smp_store_release(p, v)						\
+do {									\
+	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1;						\
+	asm volatile ("ldar %w0, [%1]"					\
+			: "=r" (___p1) : "r" (&p) : "memory");		\
+	return ___p1;							\
+} while (0)
 #endif
 
 #define read_barrier_depends()		do { } while(0)
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index d4a63338a53c..3dfddc0416f6 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -35,10 +35,38 @@
 #define smp_mb()	barrier()
 #define smp_rmb()	barrier()
 #define smp_wmb()	barrier()
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #else
+
 #define smp_mb()	asm volatile("dmb ish" : : : "memory")
 #define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
 #define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
+
+#define smp_store_release(p, v)						\
+do {									\
+	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1;						\
+	asm volatile ("ldar %w0, [%1]" 					\
+			: "=r" (___p1) : "r" (&p) : "memory"); 		\
+	return ___p1;							\
+} while (0)
 #endif
 
 #define read_barrier_depends()		do { } while(0)
diff --git a/arch/avr32/include/asm/barrier.h b/arch/avr32/include/asm/barrier.h
index 0961275373db..8fd164648e71 100644
--- a/arch/avr32/include/asm/barrier.h
+++ b/arch/avr32/include/asm/barrier.h
@@ -25,5 +25,17 @@
 # define smp_read_barrier_depends() do { } while(0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
 
 #endif /* __ASM_AVR32_BARRIER_H */
diff --git a/arch/blackfin/include/asm/barrier.h b/arch/blackfin/include/asm/barrier.h
index ebb189507dd7..c8b85bba843f 100644
--- a/arch/blackfin/include/asm/barrier.h
+++ b/arch/blackfin/include/asm/barrier.h
@@ -45,4 +45,17 @@
 #define set_mb(var, value) do { var = value; mb(); } while (0)
 #define smp_read_barrier_depends()	read_barrier_depends()
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _BLACKFIN_BARRIER_H */
diff --git a/arch/cris/include/asm/barrier.h b/arch/cris/include/asm/barrier.h
index 198ad7fa6b25..26f21f5d1d15 100644
--- a/arch/cris/include/asm/barrier.h
+++ b/arch/cris/include/asm/barrier.h
@@ -22,4 +22,17 @@
 #define smp_read_barrier_depends()     do { } while(0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_CRIS_BARRIER_H */
diff --git a/arch/frv/include/asm/barrier.h b/arch/frv/include/asm/barrier.h
index 06776ad9f5e9..4569028382fa 100644
--- a/arch/frv/include/asm/barrier.h
+++ b/arch/frv/include/asm/barrier.h
@@ -26,4 +26,17 @@
 #define set_mb(var, value) \
 	do { var = (value); barrier(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_BARRIER_H */
diff --git a/arch/h8300/include/asm/barrier.h b/arch/h8300/include/asm/barrier.h
index 9e0aa9fc195d..45d36738814d 100644
--- a/arch/h8300/include/asm/barrier.h
+++ b/arch/h8300/include/asm/barrier.h
@@ -26,4 +26,17 @@
 #define smp_read_barrier_depends()	do { } while(0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _H8300_BARRIER_H */
diff --git a/arch/hexagon/include/asm/barrier.h b/arch/hexagon/include/asm/barrier.h
index 1041a8e70ce8..d88d54bd2e6e 100644
--- a/arch/hexagon/include/asm/barrier.h
+++ b/arch/hexagon/include/asm/barrier.h
@@ -38,4 +38,17 @@
 #define set_mb(var, value) \
 	do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_BARRIER_H */
diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
index 60576e06b6fb..b7f1a8aa03af 100644
--- a/arch/ia64/include/asm/barrier.h
+++ b/arch/ia64/include/asm/barrier.h
@@ -45,11 +45,54 @@
 # define smp_rmb()	rmb()
 # define smp_wmb()	wmb()
 # define smp_read_barrier_depends()	read_barrier_depends()
+
+#define smp_store_release(p, v)						\
+do {									\
+	switch (sizeof(p)) {						\
+	case 4:								\
+		asm volatile ("st4.acq [%0]=%1"				\
+				:: "r" (&p), "r" (v) : "memory");	\
+		break;							\
+	case 8:								\
+		asm volatile ("st8.acq [%0]=%1"				\
+				:: "r" (&p), "r" (v) : "memory");	\
+		break;							\
+	}								\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1;						\
+	switch (sizeof(p)) {						\
+	case 4:								\
+		asm volatile ("ld4.rel %0=[%1]"				\
+				: "=r"(___p1) : "r" (&p) : "memory");	\
+		break;							\
+	case 8:								\
+		asm volatile ("ld8.rel %0=[%1]"				\
+				: "=r"(___p1) : "r" (&p) : "memory");	\
+		break;							\
+	}								\
+	return ___p1;							\
+} while (0)
 #else
 # define smp_mb()	barrier()
 # define smp_rmb()	barrier()
 # define smp_wmb()	barrier()
 # define smp_read_barrier_depends()	do { } while(0)
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
 #endif
 
 /*
diff --git a/arch/m32r/include/asm/barrier.h b/arch/m32r/include/asm/barrier.h
index 6976621efd3f..d78612289cb2 100644
--- a/arch/m32r/include/asm/barrier.h
+++ b/arch/m32r/include/asm/barrier.h
@@ -91,4 +91,17 @@
 #define set_mb(var, value) do { var = value; barrier(); } while (0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_M32R_BARRIER_H */
diff --git a/arch/m68k/include/asm/barrier.h b/arch/m68k/include/asm/barrier.h
index 445ce22c23cb..1e63b11c424c 100644
--- a/arch/m68k/include/asm/barrier.h
+++ b/arch/m68k/include/asm/barrier.h
@@ -17,4 +17,17 @@
 #define smp_wmb()	barrier()
 #define smp_read_barrier_depends()	((void)0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _M68K_BARRIER_H */
diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
index c90bfc6bf648..9ffd0b167f07 100644
--- a/arch/metag/include/asm/barrier.h
+++ b/arch/metag/include/asm/barrier.h
@@ -82,4 +82,17 @@ static inline void fence(void)
 #define smp_read_barrier_depends()     do { } while (0)
 #define set_mb(var, value) do { var = value; smp_mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_METAG_BARRIER_H */
diff --git a/arch/microblaze/include/asm/barrier.h b/arch/microblaze/include/asm/barrier.h
index df5be3e87044..db0b5e205ce3 100644
--- a/arch/microblaze/include/asm/barrier.h
+++ b/arch/microblaze/include/asm/barrier.h
@@ -24,4 +24,17 @@
 #define smp_rmb()		rmb()
 #define smp_wmb()		wmb()
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_MICROBLAZE_BARRIER_H */
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 314ab5532019..8031afcc7f64 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -180,4 +180,17 @@
 #define nudge_writes() mb()
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_BARRIER_H */
diff --git a/arch/mn10300/include/asm/barrier.h b/arch/mn10300/include/asm/barrier.h
index 2bd97a5c8af7..e822ff76f498 100644
--- a/arch/mn10300/include/asm/barrier.h
+++ b/arch/mn10300/include/asm/barrier.h
@@ -34,4 +34,17 @@
 #define read_barrier_depends()		do {} while (0)
 #define smp_read_barrier_depends()	do {} while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_BARRIER_H */
diff --git a/arch/parisc/include/asm/barrier.h b/arch/parisc/include/asm/barrier.h
index e77d834aa803..58757747f873 100644
--- a/arch/parisc/include/asm/barrier.h
+++ b/arch/parisc/include/asm/barrier.h
@@ -32,4 +32,17 @@
 
 #define set_mb(var, value)		do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __PARISC_BARRIER_H */
diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
index ae782254e731..54922626b356 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -65,4 +65,19 @@
 #define data_barrier(x)	\
 	asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
 
+/* use smp_rmb() as that is either lwsync or a barrier() depending on SMP */
+
+#define smp_store_release(p, v)						\
+do {									\
+	smp_rmb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_rmb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_POWERPC_BARRIER_H */
diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
index 16760eeb79b0..babf928649a4 100644
--- a/arch/s390/include/asm/barrier.h
+++ b/arch/s390/include/asm/barrier.h
@@ -32,4 +32,17 @@
 
 #define set_mb(var, value)		do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	barrier();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	barrier();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_BARRIER_H */
diff --git a/arch/score/include/asm/barrier.h b/arch/score/include/asm/barrier.h
index 0eacb6471e6d..5905ea57a104 100644
--- a/arch/score/include/asm/barrier.h
+++ b/arch/score/include/asm/barrier.h
@@ -13,4 +13,17 @@
 
 #define set_mb(var, value) 		do {var = value; wmb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _ASM_SCORE_BARRIER_H */
diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
index 72c103dae300..379f500023b6 100644
--- a/arch/sh/include/asm/barrier.h
+++ b/arch/sh/include/asm/barrier.h
@@ -51,4 +51,17 @@
 
 #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __ASM_SH_BARRIER_H */
diff --git a/arch/sparc/include/asm/barrier_32.h b/arch/sparc/include/asm/barrier_32.h
index c1b76654ee76..1649081d1b86 100644
--- a/arch/sparc/include/asm/barrier_32.h
+++ b/arch/sparc/include/asm/barrier_32.h
@@ -12,4 +12,17 @@
 #define smp_wmb()	__asm__ __volatile__("":::"memory")
 #define smp_read_barrier_depends()	do { } while(0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* !(__SPARC_BARRIER_H) */
diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
index 95d45986f908..5e23ced0a29a 100644
--- a/arch/sparc/include/asm/barrier_64.h
+++ b/arch/sparc/include/asm/barrier_64.h
@@ -53,4 +53,17 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
 
 #define smp_read_barrier_depends()	do { } while(0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	barrier();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	barrier();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* !(__SPARC64_BARRIER_H) */
diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
index a9a73da5865d..1f08318db3c0 100644
--- a/arch/tile/include/asm/barrier.h
+++ b/arch/tile/include/asm/barrier.h
@@ -140,5 +140,18 @@ mb_incoherent(void)
 #define set_mb(var, value) \
 	do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_TILE_BARRIER_H */
diff --git a/arch/unicore32/include/asm/barrier.h b/arch/unicore32/include/asm/barrier.h
index a6620e5336b6..fa8bf69d9a09 100644
--- a/arch/unicore32/include/asm/barrier.h
+++ b/arch/unicore32/include/asm/barrier.h
@@ -25,4 +25,17 @@
 
 #define set_mb(var, value)		do { var = value; smp_mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* __UNICORE_BARRIER_H__ */
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index c6cd358a1eec..115ef72b3784 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -100,6 +100,19 @@
 #define set_mb(var, value) do { var = value; barrier(); } while (0)
 #endif
 
+#define smp_store_release(p, v)						\
+do {									\
+	barrier();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	barrier();							\
+	return ___p1;							\
+} while (0)
+
 /*
  * Stop RDTSC speculation. This is needed when you need to use RDTSC
  * (or get_cycles or vread that possibly accesses the TSC) in a defined
diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
index ef021677d536..e96a674c337a 100644
--- a/arch/xtensa/include/asm/barrier.h
+++ b/arch/xtensa/include/asm/barrier.h
@@ -26,4 +26,17 @@
 
 #define set_mb(var, value)	do { var = value; mb(); } while (0)
 
+#define smp_store_release(p, v)						\
+do {									\
+	smp_mb();							\
+	ACCESS_ONCE(p) = (v);						\
+} while (0)
+
+#define smp_load_acquire(p)						\
+do {									\
+	typeof(p) ___p1 = ACCESS_ONCE(p);				\
+	smp_mb();							\
+	return ___p1;							\
+} while (0)
+
 #endif /* _XTENSA_SYSTEM_H */

^ permalink raw reply related

* Re: [PATCH v3 1/1] powerpc/embedded6xx: Add support for Motorola/Emerson MVME5100
From: Stephen N Chivers @ 2013-11-04 19:18 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: geert.uytterhoeven, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <CAMuHMdUOUf7shC9A+enCt8qQygG=Qw_soeNJnTWQKr90o2gZhA@mail.gmail.com>

geert.uytterhoeven@gmail.com wrote on 11/04/2013 06:59:21 PM:

> From: Geert Uytterhoeven <geert@linux-m68k.org>
> To: Stephen N Chivers/AUS/CSC@CSC
> Cc: "linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
> Date: 11/04/2013 06:59 PM
> Subject: Re: [PATCH v3 1/1] powerpc/embedded6xx: Add support for 
> Motorola/Emerson MVME5100
> Sent by: geert.uytterhoeven@gmail.com
> 
> On Sun, Nov 3, 2013 at 10:07 PM, Stephen Chivers <schivers@csc.com> 
wrote:
> > +++ b/arch/powerpc/boot/dts/mvme5100.dts
> > @@ -0,0 +1,185 @@
> > +/*
> > + * Device Tree Souce for Motorola/Emerson MVME5100.
> 
> Source
Ok. Thanks, will be fixed.
> 
> (unless this expresses your personal appreciation for device trees ;-)
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- 
> geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. 
But
> when I'm talking to journalists I just say "programmer" or somethinglike 
that.
>                                 -- Linus Torvalds

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Paul E. McKenney @ 2013-11-04 20:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, heiko.carstens, Oleg Nesterov,
	LKML, Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, linux, Linus Torvalds, schwidefsky
In-Reply-To: <20131104191127.GW16117@laptop.programming.kicks-ass.net>

On Mon, Nov 04, 2013 at 08:11:27PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 04, 2013 at 08:27:32AM -0800, Paul E. McKenney wrote:
> > All this is leading me to suggest the following shortenings of names:
> > 
> > 	smp_load_with_acquire_semantics() -> smp_load_acquire()
> > 
> > 	smp_store_with_release_semantics() -> smp_store_release()
> > 
> > But names aside, the above gets rid of explicit barriers on TSO architectures,
> > allows ARM to avoid full DMB, and allows PowerPC to use lwsync instead of
> > the heavier-weight sync.
> 
> A little something like this? Completely guessed at the arm/arm64/ia64
> asm, but at least for those archs I found proper instructions (I hope),
> for x86,sparc,s390 which are TSO we can do with a barrier and PPC like
> said can do with the lwsync, all others fall back to using a smp_mb().
> 
> Should probably come with a proper changelog and an addition to _The_
> document.

Maybe something like this for the changelog?

	A number of situations currently require the heavyweight smp_mb(),
	even though there is no need to order prior stores against later
	loads.  Many architectures have much cheaper ways to handle these
	situations, but the Linux kernel currently has no portable way
	to make use of them.

	This commit therefore supplies smp_load_acquire() and
	smp_store_release() to remedy this situation.  The new
	smp_load_acquire() primitive orders the specified load against
	any subsequent reads or writes, while the new smp_store_release()
	primitive orders the specifed store against any prior reads or
	writes.  These primitives allow array-based circular FIFOs to be
	implemented without an smp_mb(), and also allow a theoretical
	hole in rcu_assign_pointer() to be closed at no additional
	expense on most architectures.

	In addition, the RCU experience transitioning from explicit
	smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
	and rcu_assign_pointer(), respectively resulted in substantial
	improvements in readability.  It therefore seems likely that
	replacing other explicit barriers with smp_load_acquire() and
	smp_store_release() will provide similar benefits.  It appears
	that roughly half of the explicit barriers in core kernel code
	might be so replaced.

Some comments below.  I believe that opcodes need to be fixed for IA64.
I am unsure of the ifdefs and opcodes for arm64, but the ARM folks should
be able to tell us.

Other than that, for the rest:

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/alpha/include/asm/barrier.h      | 13 +++++++++++
>  arch/arc/include/asm/barrier.h        | 13 +++++++++++
>  arch/arm/include/asm/barrier.h        | 26 +++++++++++++++++++++
>  arch/arm64/include/asm/barrier.h      | 28 +++++++++++++++++++++++
>  arch/avr32/include/asm/barrier.h      | 12 ++++++++++
>  arch/blackfin/include/asm/barrier.h   | 13 +++++++++++
>  arch/cris/include/asm/barrier.h       | 13 +++++++++++
>  arch/frv/include/asm/barrier.h        | 13 +++++++++++
>  arch/h8300/include/asm/barrier.h      | 13 +++++++++++
>  arch/hexagon/include/asm/barrier.h    | 13 +++++++++++
>  arch/ia64/include/asm/barrier.h       | 43 +++++++++++++++++++++++++++++++++++
>  arch/m32r/include/asm/barrier.h       | 13 +++++++++++
>  arch/m68k/include/asm/barrier.h       | 13 +++++++++++
>  arch/metag/include/asm/barrier.h      | 13 +++++++++++
>  arch/microblaze/include/asm/barrier.h | 13 +++++++++++
>  arch/mips/include/asm/barrier.h       | 13 +++++++++++
>  arch/mn10300/include/asm/barrier.h    | 13 +++++++++++
>  arch/parisc/include/asm/barrier.h     | 13 +++++++++++
>  arch/powerpc/include/asm/barrier.h    | 15 ++++++++++++
>  arch/s390/include/asm/barrier.h       | 13 +++++++++++
>  arch/score/include/asm/barrier.h      | 13 +++++++++++
>  arch/sh/include/asm/barrier.h         | 13 +++++++++++
>  arch/sparc/include/asm/barrier_32.h   | 13 +++++++++++
>  arch/sparc/include/asm/barrier_64.h   | 13 +++++++++++
>  arch/tile/include/asm/barrier.h       | 13 +++++++++++
>  arch/unicore32/include/asm/barrier.h  | 13 +++++++++++
>  arch/x86/include/asm/barrier.h        | 13 +++++++++++
>  arch/xtensa/include/asm/barrier.h     | 13 +++++++++++
>  28 files changed, 423 insertions(+)
> 
> diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> index ce8860a0b32d..464139feee97 100644
> --- a/arch/alpha/include/asm/barrier.h
> +++ b/arch/alpha/include/asm/barrier.h
> @@ -29,6 +29,19 @@ __asm__ __volatile__("mb": : :"memory")
>  #define smp_read_barrier_depends()	do { } while (0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Yep, not any alternative to smp_mb() here.

>  #define set_mb(var, value) \
>  do { var = value; mb(); } while (0)
> 
> diff --git a/arch/arc/include/asm/barrier.h b/arch/arc/include/asm/barrier.h
> index f6cb7c4ffb35..a779da846fb5 100644
> --- a/arch/arc/include/asm/barrier.h
> +++ b/arch/arc/include/asm/barrier.h
> @@ -30,6 +30,19 @@
>  #define smp_wmb()       barrier()
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Appears to be !SMP, so OK.

>  #define smp_mb__before_atomic_dec()	barrier()
>  #define smp_mb__after_atomic_dec()	barrier()
>  #define smp_mb__before_atomic_inc()	barrier()
> diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> index 60f15e274e6d..a804093d6891 100644
> --- a/arch/arm/include/asm/barrier.h
> +++ b/arch/arm/include/asm/barrier.h
> @@ -53,10 +53,36 @@
>  #define smp_mb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
>  #else
>  #define smp_mb()	dmb(ish)
>  #define smp_rmb()	smp_mb()
>  #define smp_wmb()	dmb(ishst)
> +

Seems like there should be some sort of #ifdef condition to distinguish
between these.  My guess is something like:

#if __LINUX_ARM_ARCH__ > 7

But I must defer to the ARM guys.  For all I know, they might prefer
arch/arm to stick with smp_mb() and have arch/arm64 do the ldar and stlr.

> +#define smp_store_release(p, v)						\
> +do {									\
> +	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1;						\
> +	asm volatile ("ldar %w0, [%1]"					\
> +			: "=r" (___p1) : "r" (&p) : "memory");		\
> +	return ___p1;							\
> +} while (0)
>  #endif
> 
>  #define read_barrier_depends()		do { } while(0)
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index d4a63338a53c..0da2d4ebb9a8 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -35,10 +35,38 @@
>  #define smp_mb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #else
> +
>  #define smp_mb()	asm volatile("dmb ish" : : : "memory")
>  #define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
>  #define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1;						\
> +	asm volatile ("ldar %w0, [%1]"					\
> +			: "=r" (___p1) : "r" (&p) : "memory");		\
> +	return ___p1;							\
> +} while (0)
>  #endif

Ditto on the instruction format.  The closest thing I see in the kernel
is "stlr %w1, %0" in arch_write_unlock() and arch_spin_unlock().

> 
>  #define read_barrier_depends()		do { } while(0)
> diff --git a/arch/avr32/include/asm/barrier.h b/arch/avr32/include/asm/barrier.h
> index 0961275373db..a0c48ad684f8 100644
> --- a/arch/avr32/include/asm/barrier.h
> +++ b/arch/avr32/include/asm/barrier.h
> @@ -25,5 +25,17 @@
>  # define smp_read_barrier_depends() do { } while(0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)

!SMP, so should be OK.

> 
>  #endif /* __ASM_AVR32_BARRIER_H */
> diff --git a/arch/blackfin/include/asm/barrier.h b/arch/blackfin/include/asm/barrier.h
> index ebb189507dd7..67889d9225d9 100644
> --- a/arch/blackfin/include/asm/barrier.h
> +++ b/arch/blackfin/include/asm/barrier.h
> @@ -45,4 +45,17 @@
>  #define set_mb(var, value) do { var = value; mb(); } while (0)
>  #define smp_read_barrier_depends()	read_barrier_depends()
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Ditto.

>  #endif /* _BLACKFIN_BARRIER_H */
> diff --git a/arch/cris/include/asm/barrier.h b/arch/cris/include/asm/barrier.h
> index 198ad7fa6b25..34243dc44ef1 100644
> --- a/arch/cris/include/asm/barrier.h
> +++ b/arch/cris/include/asm/barrier.h
> @@ -22,4 +22,17 @@
>  #define smp_read_barrier_depends()     do { } while(0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Ditto.

>  #endif /* __ASM_CRIS_BARRIER_H */
> diff --git a/arch/frv/include/asm/barrier.h b/arch/frv/include/asm/barrier.h
> index 06776ad9f5e9..92f89934d4ed 100644
> --- a/arch/frv/include/asm/barrier.h
> +++ b/arch/frv/include/asm/barrier.h
> @@ -26,4 +26,17 @@
>  #define set_mb(var, value) \
>  	do { var = (value); barrier(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Ditto.

>  #endif /* _ASM_BARRIER_H */
> diff --git a/arch/h8300/include/asm/barrier.h b/arch/h8300/include/asm/barrier.h
> index 9e0aa9fc195d..516e9d379e25 100644
> --- a/arch/h8300/include/asm/barrier.h
> +++ b/arch/h8300/include/asm/barrier.h
> @@ -26,4 +26,17 @@
>  #define smp_read_barrier_depends()	do { } while(0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

And ditto again...

>  #endif /* _H8300_BARRIER_H */
> diff --git a/arch/hexagon/include/asm/barrier.h b/arch/hexagon/include/asm/barrier.h
> index 1041a8e70ce8..838a2ebe07a5 100644
> --- a/arch/hexagon/include/asm/barrier.h
> +++ b/arch/hexagon/include/asm/barrier.h
> @@ -38,4 +38,17 @@
>  #define set_mb(var, value) \
>  	do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

And again...

>  #endif /* _ASM_BARRIER_H */
> diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
> index 60576e06b6fb..4598d390fabb 100644
> --- a/arch/ia64/include/asm/barrier.h
> +++ b/arch/ia64/include/asm/barrier.h
> @@ -45,11 +45,54 @@
>  # define smp_rmb()	rmb()
>  # define smp_wmb()	wmb()
>  # define smp_read_barrier_depends()	read_barrier_depends()
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	switch (sizeof(p)) {						\
> +	case 4:								\
> +		asm volatile ("st4.acq [%0]=%1" 			\

This should be "st4.rel".

> +				:: "r" (&p), "r" (v) : "memory");	\
> +		break;							\
> +	case 8:								\
> +		asm volatile ("st8.acq [%0]=%1" 			\

And this should be "st8.rel"

> +				:: "r" (&p), "r" (v) : "memory"); 	\
> +		break;							\
> +	}								\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1;						\
> +	switch (sizeof(p)) {						\
> +	case 4:								\
> +		asm volatile ("ld4.rel %0=[%1]" 			\

And this should be "ld4.acq".

> +				: "=r"(___p1) : "r" (&p) : "memory"); 	\
> +		break;							\
> +	case 8:								\
> +		asm volatile ("ld8.rel %0=[%1]" 			\

And this should be "ld8.acq".

> +				: "=r"(___p1) : "r" (&p) : "memory"); 	\
> +		break;							\
> +	}								\
> +	return ___p1;							\
> +} while (0)

It appears that sizes 2 and 1 are also available, but 4 and 8 seem like
good places to start.

>  #else
>  # define smp_mb()	barrier()
>  # define smp_rmb()	barrier()
>  # define smp_wmb()	barrier()
>  # define smp_read_barrier_depends()	do { } while(0)
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
>  #endif
> 
>  /*
> diff --git a/arch/m32r/include/asm/barrier.h b/arch/m32r/include/asm/barrier.h
> index 6976621efd3f..e5d42bcf90c5 100644
> --- a/arch/m32r/include/asm/barrier.h
> +++ b/arch/m32r/include/asm/barrier.h
> @@ -91,4 +91,17 @@
>  #define set_mb(var, value) do { var = value; barrier(); } while (0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Another !SMP architecture, so looks good.

>  #endif /* _ASM_M32R_BARRIER_H */
> diff --git a/arch/m68k/include/asm/barrier.h b/arch/m68k/include/asm/barrier.h
> index 445ce22c23cb..eeb9ecf713cc 100644
> --- a/arch/m68k/include/asm/barrier.h
> +++ b/arch/m68k/include/asm/barrier.h
> @@ -17,4 +17,17 @@
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	((void)0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Ditto.

>  #endif /* _M68K_BARRIER_H */
> diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
> index c90bfc6bf648..d8e6f2e4a27c 100644
> --- a/arch/metag/include/asm/barrier.h
> +++ b/arch/metag/include/asm/barrier.h
> @@ -82,4 +82,17 @@ static inline void fence(void)
>  #define smp_read_barrier_depends()     do { } while (0)
>  #define set_mb(var, value) do { var = value; smp_mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

This one is a bit unusual, but use of smp_mb() should be safe.

>  #endif /* _ASM_METAG_BARRIER_H */
> diff --git a/arch/microblaze/include/asm/barrier.h b/arch/microblaze/include/asm/barrier.h
> index df5be3e87044..a890702061c9 100644
> --- a/arch/microblaze/include/asm/barrier.h
> +++ b/arch/microblaze/include/asm/barrier.h
> @@ -24,4 +24,17 @@
>  #define smp_rmb()		rmb()
>  #define smp_wmb()		wmb()
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

!SMP only, so good.

>  #endif /* _ASM_MICROBLAZE_BARRIER_H */
> diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
> index 314ab5532019..e59bcd051f36 100644
> --- a/arch/mips/include/asm/barrier.h
> +++ b/arch/mips/include/asm/barrier.h
> @@ -180,4 +180,17 @@
>  #define nudge_writes() mb()
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Interesting variety here as well.  Again, smp_mb() should be safe.

>  #endif /* __ASM_BARRIER_H */
> diff --git a/arch/mn10300/include/asm/barrier.h b/arch/mn10300/include/asm/barrier.h
> index 2bd97a5c8af7..0e6a0608d4a1 100644
> --- a/arch/mn10300/include/asm/barrier.h
> +++ b/arch/mn10300/include/asm/barrier.h
> @@ -34,4 +34,17 @@
>  #define read_barrier_depends()		do {} while (0)
>  #define smp_read_barrier_depends()	do {} while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

!SMP, so good.

>  #endif /* _ASM_BARRIER_H */
> diff --git a/arch/parisc/include/asm/barrier.h b/arch/parisc/include/asm/barrier.h
> index e77d834aa803..f1145a8594a0 100644
> --- a/arch/parisc/include/asm/barrier.h
> +++ b/arch/parisc/include/asm/barrier.h
> @@ -32,4 +32,17 @@
> 
>  #define set_mb(var, value)		do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Ditto.

>  #endif /* __PARISC_BARRIER_H */
> diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> index ae782254e731..b5cc36791f42 100644
> --- a/arch/powerpc/include/asm/barrier.h
> +++ b/arch/powerpc/include/asm/barrier.h
> @@ -65,4 +65,19 @@
>  #define data_barrier(x)	\
>  	asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
> 
> +/* use smp_rmb() as that is either lwsync or a barrier() depending on SMP */
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_rmb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_rmb();							\
> +	return ___p1;							\
> +} while (0)
> +

I think that this actually does work, strange though it does look.

>  #endif /* _ASM_POWERPC_BARRIER_H */
> diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> index 16760eeb79b0..e8989c40e11c 100644
> --- a/arch/s390/include/asm/barrier.h
> +++ b/arch/s390/include/asm/barrier.h
> @@ -32,4 +32,17 @@
> 
>  #define set_mb(var, value)		do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	barrier();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	barrier();							\
> +	return ___p1;							\
> +} while (0)
> +

I believe that this is OK as well, but must defer to the s390
maintainers.

>  #endif /* __ASM_BARRIER_H */
> diff --git a/arch/score/include/asm/barrier.h b/arch/score/include/asm/barrier.h
> index 0eacb6471e6d..5f101ef8ade9 100644
> --- a/arch/score/include/asm/barrier.h
> +++ b/arch/score/include/asm/barrier.h
> @@ -13,4 +13,17 @@
> 
>  #define set_mb(var, value) 		do {var = value; wmb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

!SMP, so good.

>  #endif /* _ASM_SCORE_BARRIER_H */
> diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
> index 72c103dae300..611128c2f636 100644
> --- a/arch/sh/include/asm/barrier.h
> +++ b/arch/sh/include/asm/barrier.h
> @@ -51,4 +51,17 @@
> 
>  #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

Use of smp_mb() should be safe here.

>  #endif /* __ASM_SH_BARRIER_H */
> diff --git a/arch/sparc/include/asm/barrier_32.h b/arch/sparc/include/asm/barrier_32.h
> index c1b76654ee76..f47f9d51f326 100644
> --- a/arch/sparc/include/asm/barrier_32.h
> +++ b/arch/sparc/include/asm/barrier_32.h
> @@ -12,4 +12,17 @@
>  #define smp_wmb()	__asm__ __volatile__("":::"memory")
>  #define smp_read_barrier_depends()	do { } while(0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

The surrounding code looks to be set up for !SMP.  I -thought- that there
were SMP 32-bit SPARC systems, but either way, smp_mb() should be safe.

>  #endif /* !(__SPARC_BARRIER_H) */
> diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
> index 95d45986f908..77cbe6982ca0 100644
> --- a/arch/sparc/include/asm/barrier_64.h
> +++ b/arch/sparc/include/asm/barrier_64.h
> @@ -53,4 +53,17 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
> 
>  #define smp_read_barrier_depends()	do { } while(0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	barrier();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	barrier();							\
> +	return ___p1;							\
> +} while (0)
> +

SPARC64 is TSO, so looks good.

>  #endif /* !(__SPARC64_BARRIER_H) */
> diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
> index a9a73da5865d..4d5330d4fd31 100644
> --- a/arch/tile/include/asm/barrier.h
> +++ b/arch/tile/include/asm/barrier.h
> @@ -140,5 +140,18 @@ mb_incoherent(void)
>  #define set_mb(var, value) \
>  	do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

The __mb_incoherent() in the surrounding code looks scary, but smp_mb()
should suffice here as well as elsewhere.

>  #endif /* !__ASSEMBLY__ */
>  #endif /* _ASM_TILE_BARRIER_H */
> diff --git a/arch/unicore32/include/asm/barrier.h b/arch/unicore32/include/asm/barrier.h
> index a6620e5336b6..5471ff6aae10 100644
> --- a/arch/unicore32/include/asm/barrier.h
> +++ b/arch/unicore32/include/asm/barrier.h
> @@ -25,4 +25,17 @@
> 
>  #define set_mb(var, value)		do { var = value; smp_mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

!SMP, so good.

>  #endif /* __UNICORE_BARRIER_H__ */
> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> index c6cd358a1eec..a7fd8201ab09 100644
> --- a/arch/x86/include/asm/barrier.h
> +++ b/arch/x86/include/asm/barrier.h
> @@ -100,6 +100,19 @@
>  #define set_mb(var, value) do { var = value; barrier(); } while (0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	barrier();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	barrier();							\
> +	return ___p1;							\
> +} while (0)
> +

TSO, so good.

>  /*
>   * Stop RDTSC speculation. This is needed when you need to use RDTSC
>   * (or get_cycles or vread that possibly accesses the TSC) in a defined
> diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
> index ef021677d536..703d511add49 100644
> --- a/arch/xtensa/include/asm/barrier.h
> +++ b/arch/xtensa/include/asm/barrier.h
> @@ -26,4 +26,17 @@
> 
>  #define set_mb(var, value)	do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p, v)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +

The use of smp_mb() should be safe, so good.  Looks like xtensa orders
reads, but not writes -- interesting...

>  #endif /* _XTENSA_SYSTEM_H */
> 

^ permalink raw reply

* Re: [RFC] arch: Introduce new TSO memory barrier smp_tmb()
From: Paul E. McKenney @ 2013-11-04 20:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Neuling, Mathieu Desnoyers, Oleg Nesterov, LKML,
	Linux PPC dev, Anton Blanchard, Frederic Weisbecker,
	Victor Kaplansky, Linus Torvalds
In-Reply-To: <20131104191811.GS2490@laptop.programming.kicks-ass.net>

On Mon, Nov 04, 2013 at 08:18:11PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 04, 2013 at 08:11:27PM +0100, Peter Zijlstra wrote:
> > +#define smp_load_acquire(p, v)						\
> 
> I R idiot!! :-)

OK, I did miss this one as well...  :-/

							Thanx, Paul

> ---
>  arch/alpha/include/asm/barrier.h      | 13 +++++++++++
>  arch/arc/include/asm/barrier.h        | 13 +++++++++++
>  arch/arm/include/asm/barrier.h        | 26 +++++++++++++++++++++
>  arch/arm64/include/asm/barrier.h      | 28 +++++++++++++++++++++++
>  arch/avr32/include/asm/barrier.h      | 12 ++++++++++
>  arch/blackfin/include/asm/barrier.h   | 13 +++++++++++
>  arch/cris/include/asm/barrier.h       | 13 +++++++++++
>  arch/frv/include/asm/barrier.h        | 13 +++++++++++
>  arch/h8300/include/asm/barrier.h      | 13 +++++++++++
>  arch/hexagon/include/asm/barrier.h    | 13 +++++++++++
>  arch/ia64/include/asm/barrier.h       | 43 +++++++++++++++++++++++++++++++++++
>  arch/m32r/include/asm/barrier.h       | 13 +++++++++++
>  arch/m68k/include/asm/barrier.h       | 13 +++++++++++
>  arch/metag/include/asm/barrier.h      | 13 +++++++++++
>  arch/microblaze/include/asm/barrier.h | 13 +++++++++++
>  arch/mips/include/asm/barrier.h       | 13 +++++++++++
>  arch/mn10300/include/asm/barrier.h    | 13 +++++++++++
>  arch/parisc/include/asm/barrier.h     | 13 +++++++++++
>  arch/powerpc/include/asm/barrier.h    | 15 ++++++++++++
>  arch/s390/include/asm/barrier.h       | 13 +++++++++++
>  arch/score/include/asm/barrier.h      | 13 +++++++++++
>  arch/sh/include/asm/barrier.h         | 13 +++++++++++
>  arch/sparc/include/asm/barrier_32.h   | 13 +++++++++++
>  arch/sparc/include/asm/barrier_64.h   | 13 +++++++++++
>  arch/tile/include/asm/barrier.h       | 13 +++++++++++
>  arch/unicore32/include/asm/barrier.h  | 13 +++++++++++
>  arch/x86/include/asm/barrier.h        | 13 +++++++++++
>  arch/xtensa/include/asm/barrier.h     | 13 +++++++++++
>  28 files changed, 423 insertions(+)
> 
> diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> index ce8860a0b32d..464139feee97 100644
> --- a/arch/alpha/include/asm/barrier.h
> +++ b/arch/alpha/include/asm/barrier.h
> @@ -29,6 +29,19 @@ __asm__ __volatile__("mb": : :"memory")
>  #define smp_read_barrier_depends()	do { } while (0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #define set_mb(var, value) \
>  do { var = value; mb(); } while (0)
> 
> diff --git a/arch/arc/include/asm/barrier.h b/arch/arc/include/asm/barrier.h
> index f6cb7c4ffb35..a779da846fb5 100644
> --- a/arch/arc/include/asm/barrier.h
> +++ b/arch/arc/include/asm/barrier.h
> @@ -30,6 +30,19 @@
>  #define smp_wmb()       barrier()
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #define smp_mb__before_atomic_dec()	barrier()
>  #define smp_mb__after_atomic_dec()	barrier()
>  #define smp_mb__before_atomic_inc()	barrier()
> diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> index 60f15e274e6d..4ada4720bdeb 100644
> --- a/arch/arm/include/asm/barrier.h
> +++ b/arch/arm/include/asm/barrier.h
> @@ -53,10 +53,36 @@
>  #define smp_mb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
>  #else
>  #define smp_mb()	dmb(ish)
>  #define smp_rmb()	smp_mb()
>  #define smp_wmb()	dmb(ishst)
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1;						\
> +	asm volatile ("ldar %w0, [%1]"					\
> +			: "=r" (___p1) : "r" (&p) : "memory");		\
> +	return ___p1;							\
> +} while (0)
>  #endif
> 
>  #define read_barrier_depends()		do { } while(0)
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index d4a63338a53c..3dfddc0416f6 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -35,10 +35,38 @@
>  #define smp_mb()	barrier()
>  #define smp_rmb()	barrier()
>  #define smp_wmb()	barrier()
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #else
> +
>  #define smp_mb()	asm volatile("dmb ish" : : : "memory")
>  #define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
>  #define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	asm volatile ("stlr %w0 [%1]" : : "r" (v), "r" (&p) : "memory");\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1;						\
> +	asm volatile ("ldar %w0, [%1]" 					\
> +			: "=r" (___p1) : "r" (&p) : "memory"); 		\
> +	return ___p1;							\
> +} while (0)
>  #endif
> 
>  #define read_barrier_depends()		do { } while(0)
> diff --git a/arch/avr32/include/asm/barrier.h b/arch/avr32/include/asm/barrier.h
> index 0961275373db..8fd164648e71 100644
> --- a/arch/avr32/include/asm/barrier.h
> +++ b/arch/avr32/include/asm/barrier.h
> @@ -25,5 +25,17 @@
>  # define smp_read_barrier_depends() do { } while(0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> 
>  #endif /* __ASM_AVR32_BARRIER_H */
> diff --git a/arch/blackfin/include/asm/barrier.h b/arch/blackfin/include/asm/barrier.h
> index ebb189507dd7..c8b85bba843f 100644
> --- a/arch/blackfin/include/asm/barrier.h
> +++ b/arch/blackfin/include/asm/barrier.h
> @@ -45,4 +45,17 @@
>  #define set_mb(var, value) do { var = value; mb(); } while (0)
>  #define smp_read_barrier_depends()	read_barrier_depends()
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _BLACKFIN_BARRIER_H */
> diff --git a/arch/cris/include/asm/barrier.h b/arch/cris/include/asm/barrier.h
> index 198ad7fa6b25..26f21f5d1d15 100644
> --- a/arch/cris/include/asm/barrier.h
> +++ b/arch/cris/include/asm/barrier.h
> @@ -22,4 +22,17 @@
>  #define smp_read_barrier_depends()     do { } while(0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* __ASM_CRIS_BARRIER_H */
> diff --git a/arch/frv/include/asm/barrier.h b/arch/frv/include/asm/barrier.h
> index 06776ad9f5e9..4569028382fa 100644
> --- a/arch/frv/include/asm/barrier.h
> +++ b/arch/frv/include/asm/barrier.h
> @@ -26,4 +26,17 @@
>  #define set_mb(var, value) \
>  	do { var = (value); barrier(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_BARRIER_H */
> diff --git a/arch/h8300/include/asm/barrier.h b/arch/h8300/include/asm/barrier.h
> index 9e0aa9fc195d..45d36738814d 100644
> --- a/arch/h8300/include/asm/barrier.h
> +++ b/arch/h8300/include/asm/barrier.h
> @@ -26,4 +26,17 @@
>  #define smp_read_barrier_depends()	do { } while(0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _H8300_BARRIER_H */
> diff --git a/arch/hexagon/include/asm/barrier.h b/arch/hexagon/include/asm/barrier.h
> index 1041a8e70ce8..d88d54bd2e6e 100644
> --- a/arch/hexagon/include/asm/barrier.h
> +++ b/arch/hexagon/include/asm/barrier.h
> @@ -38,4 +38,17 @@
>  #define set_mb(var, value) \
>  	do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_BARRIER_H */
> diff --git a/arch/ia64/include/asm/barrier.h b/arch/ia64/include/asm/barrier.h
> index 60576e06b6fb..b7f1a8aa03af 100644
> --- a/arch/ia64/include/asm/barrier.h
> +++ b/arch/ia64/include/asm/barrier.h
> @@ -45,11 +45,54 @@
>  # define smp_rmb()	rmb()
>  # define smp_wmb()	wmb()
>  # define smp_read_barrier_depends()	read_barrier_depends()
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	switch (sizeof(p)) {						\
> +	case 4:								\
> +		asm volatile ("st4.acq [%0]=%1"				\
> +				:: "r" (&p), "r" (v) : "memory");	\
> +		break;							\
> +	case 8:								\
> +		asm volatile ("st8.acq [%0]=%1"				\
> +				:: "r" (&p), "r" (v) : "memory");	\
> +		break;							\
> +	}								\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1;						\
> +	switch (sizeof(p)) {						\
> +	case 4:								\
> +		asm volatile ("ld4.rel %0=[%1]"				\
> +				: "=r"(___p1) : "r" (&p) : "memory");	\
> +		break;							\
> +	case 8:								\
> +		asm volatile ("ld8.rel %0=[%1]"				\
> +				: "=r"(___p1) : "r" (&p) : "memory");	\
> +		break;							\
> +	}								\
> +	return ___p1;							\
> +} while (0)
>  #else
>  # define smp_mb()	barrier()
>  # define smp_rmb()	barrier()
>  # define smp_wmb()	barrier()
>  # define smp_read_barrier_depends()	do { } while(0)
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
>  #endif
> 
>  /*
> diff --git a/arch/m32r/include/asm/barrier.h b/arch/m32r/include/asm/barrier.h
> index 6976621efd3f..d78612289cb2 100644
> --- a/arch/m32r/include/asm/barrier.h
> +++ b/arch/m32r/include/asm/barrier.h
> @@ -91,4 +91,17 @@
>  #define set_mb(var, value) do { var = value; barrier(); } while (0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_M32R_BARRIER_H */
> diff --git a/arch/m68k/include/asm/barrier.h b/arch/m68k/include/asm/barrier.h
> index 445ce22c23cb..1e63b11c424c 100644
> --- a/arch/m68k/include/asm/barrier.h
> +++ b/arch/m68k/include/asm/barrier.h
> @@ -17,4 +17,17 @@
>  #define smp_wmb()	barrier()
>  #define smp_read_barrier_depends()	((void)0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _M68K_BARRIER_H */
> diff --git a/arch/metag/include/asm/barrier.h b/arch/metag/include/asm/barrier.h
> index c90bfc6bf648..9ffd0b167f07 100644
> --- a/arch/metag/include/asm/barrier.h
> +++ b/arch/metag/include/asm/barrier.h
> @@ -82,4 +82,17 @@ static inline void fence(void)
>  #define smp_read_barrier_depends()     do { } while (0)
>  #define set_mb(var, value) do { var = value; smp_mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_METAG_BARRIER_H */
> diff --git a/arch/microblaze/include/asm/barrier.h b/arch/microblaze/include/asm/barrier.h
> index df5be3e87044..db0b5e205ce3 100644
> --- a/arch/microblaze/include/asm/barrier.h
> +++ b/arch/microblaze/include/asm/barrier.h
> @@ -24,4 +24,17 @@
>  #define smp_rmb()		rmb()
>  #define smp_wmb()		wmb()
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_MICROBLAZE_BARRIER_H */
> diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
> index 314ab5532019..8031afcc7f64 100644
> --- a/arch/mips/include/asm/barrier.h
> +++ b/arch/mips/include/asm/barrier.h
> @@ -180,4 +180,17 @@
>  #define nudge_writes() mb()
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* __ASM_BARRIER_H */
> diff --git a/arch/mn10300/include/asm/barrier.h b/arch/mn10300/include/asm/barrier.h
> index 2bd97a5c8af7..e822ff76f498 100644
> --- a/arch/mn10300/include/asm/barrier.h
> +++ b/arch/mn10300/include/asm/barrier.h
> @@ -34,4 +34,17 @@
>  #define read_barrier_depends()		do {} while (0)
>  #define smp_read_barrier_depends()	do {} while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_BARRIER_H */
> diff --git a/arch/parisc/include/asm/barrier.h b/arch/parisc/include/asm/barrier.h
> index e77d834aa803..58757747f873 100644
> --- a/arch/parisc/include/asm/barrier.h
> +++ b/arch/parisc/include/asm/barrier.h
> @@ -32,4 +32,17 @@
> 
>  #define set_mb(var, value)		do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* __PARISC_BARRIER_H */
> diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> index ae782254e731..54922626b356 100644
> --- a/arch/powerpc/include/asm/barrier.h
> +++ b/arch/powerpc/include/asm/barrier.h
> @@ -65,4 +65,19 @@
>  #define data_barrier(x)	\
>  	asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
> 
> +/* use smp_rmb() as that is either lwsync or a barrier() depending on SMP */
> +
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_rmb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_rmb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_POWERPC_BARRIER_H */
> diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> index 16760eeb79b0..babf928649a4 100644
> --- a/arch/s390/include/asm/barrier.h
> +++ b/arch/s390/include/asm/barrier.h
> @@ -32,4 +32,17 @@
> 
>  #define set_mb(var, value)		do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	barrier();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	barrier();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* __ASM_BARRIER_H */
> diff --git a/arch/score/include/asm/barrier.h b/arch/score/include/asm/barrier.h
> index 0eacb6471e6d..5905ea57a104 100644
> --- a/arch/score/include/asm/barrier.h
> +++ b/arch/score/include/asm/barrier.h
> @@ -13,4 +13,17 @@
> 
>  #define set_mb(var, value) 		do {var = value; wmb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _ASM_SCORE_BARRIER_H */
> diff --git a/arch/sh/include/asm/barrier.h b/arch/sh/include/asm/barrier.h
> index 72c103dae300..379f500023b6 100644
> --- a/arch/sh/include/asm/barrier.h
> +++ b/arch/sh/include/asm/barrier.h
> @@ -51,4 +51,17 @@
> 
>  #define set_mb(var, value) do { (void)xchg(&var, value); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* __ASM_SH_BARRIER_H */
> diff --git a/arch/sparc/include/asm/barrier_32.h b/arch/sparc/include/asm/barrier_32.h
> index c1b76654ee76..1649081d1b86 100644
> --- a/arch/sparc/include/asm/barrier_32.h
> +++ b/arch/sparc/include/asm/barrier_32.h
> @@ -12,4 +12,17 @@
>  #define smp_wmb()	__asm__ __volatile__("":::"memory")
>  #define smp_read_barrier_depends()	do { } while(0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* !(__SPARC_BARRIER_H) */
> diff --git a/arch/sparc/include/asm/barrier_64.h b/arch/sparc/include/asm/barrier_64.h
> index 95d45986f908..5e23ced0a29a 100644
> --- a/arch/sparc/include/asm/barrier_64.h
> +++ b/arch/sparc/include/asm/barrier_64.h
> @@ -53,4 +53,17 @@ do {	__asm__ __volatile__("ba,pt	%%xcc, 1f\n\t" \
> 
>  #define smp_read_barrier_depends()	do { } while(0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	barrier();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	barrier();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* !(__SPARC64_BARRIER_H) */
> diff --git a/arch/tile/include/asm/barrier.h b/arch/tile/include/asm/barrier.h
> index a9a73da5865d..1f08318db3c0 100644
> --- a/arch/tile/include/asm/barrier.h
> +++ b/arch/tile/include/asm/barrier.h
> @@ -140,5 +140,18 @@ mb_incoherent(void)
>  #define set_mb(var, value) \
>  	do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* !__ASSEMBLY__ */
>  #endif /* _ASM_TILE_BARRIER_H */
> diff --git a/arch/unicore32/include/asm/barrier.h b/arch/unicore32/include/asm/barrier.h
> index a6620e5336b6..fa8bf69d9a09 100644
> --- a/arch/unicore32/include/asm/barrier.h
> +++ b/arch/unicore32/include/asm/barrier.h
> @@ -25,4 +25,17 @@
> 
>  #define set_mb(var, value)		do { var = value; smp_mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* __UNICORE_BARRIER_H__ */
> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> index c6cd358a1eec..115ef72b3784 100644
> --- a/arch/x86/include/asm/barrier.h
> +++ b/arch/x86/include/asm/barrier.h
> @@ -100,6 +100,19 @@
>  #define set_mb(var, value) do { var = value; barrier(); } while (0)
>  #endif
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	barrier();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	barrier();							\
> +	return ___p1;							\
> +} while (0)
> +
>  /*
>   * Stop RDTSC speculation. This is needed when you need to use RDTSC
>   * (or get_cycles or vread that possibly accesses the TSC) in a defined
> diff --git a/arch/xtensa/include/asm/barrier.h b/arch/xtensa/include/asm/barrier.h
> index ef021677d536..e96a674c337a 100644
> --- a/arch/xtensa/include/asm/barrier.h
> +++ b/arch/xtensa/include/asm/barrier.h
> @@ -26,4 +26,17 @@
> 
>  #define set_mb(var, value)	do { var = value; mb(); } while (0)
> 
> +#define smp_store_release(p, v)						\
> +do {									\
> +	smp_mb();							\
> +	ACCESS_ONCE(p) = (v);						\
> +} while (0)
> +
> +#define smp_load_acquire(p)						\
> +do {									\
> +	typeof(p) ___p1 = ACCESS_ONCE(p);				\
> +	smp_mb();							\
> +	return ___p1;							\
> +} while (0)
> +
>  #endif /* _XTENSA_SYSTEM_H */
> 

^ permalink raw reply

* Re: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and altivec idle
From: Scott Wood @ 2013-11-04 21:51 UTC (permalink / raw)
  To: Wang Dongsheng-B40534
  Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org,
	Bhushan Bharat-R65777
In-Reply-To: <ABB05CD9C9F68C46A5CEDC7F15439259010BBC86@039-SN2MPN1-021.039d.mgd.msft.net>

On Sun, 2013-11-03 at 22:04 -0600, Wang Dongsheng-B40534 wrote:
> > -----Original Message-----
> > From: Wang Dongsheng-B40534
> > Sent: Monday, October 21, 2013 11:11 AM
> > To: Wood Scott-B07421
> > Cc: Bhushan Bharat-R65777; linuxppc-dev@lists.ozlabs.org
> > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and
> > altivec idle
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Saturday, October 19, 2013 3:22 AM
> > > To: Wang Dongsheng-B40534
> > > Cc: Bhushan Bharat-R65777; Wood Scott-B07421; linuxppc-
> > > dev@lists.ozlabs.org
> > > Subject: Re: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and
> > > altivec idle
> > >
> > > On Thu, 2013-10-17 at 22:02 -0500, Wang Dongsheng-B40534 wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Bhushan Bharat-R65777
> > > > > Sent: Thursday, October 17, 2013 2:46 PM
> > > > > To: Wang Dongsheng-B40534; Wood Scott-B07421
> > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state
> > > > > and altivec idle
> > > > >
> > > > >
> > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Wang Dongsheng-B40534
> > > > > > > > Sent: Thursday, October 17, 2013 11:22 AM
> > > > > > > > To: Bhushan Bharat-R65777; Wood Scott-B07421
> > > > > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20
> > > > > > > > state and altivec idle
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Bhushan Bharat-R65777
> > > > > > > > > Sent: Thursday, October 17, 2013 11:20 AM
> > > > > > > > > To: Wang Dongsheng-B40534; Wood Scott-B07421
> > > > > > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for
> > > > > > > > > pw20 state and altivec idle
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > -----Original Message-----
> > > > > > > > > > From: Wang Dongsheng-B40534
> > > > > > > > > > Sent: Thursday, October 17, 2013 8:16 AM
> > > > > > > > > > To: Bhushan Bharat-R65777; Wood Scott-B07421
> > > > > > > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > > > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for
> > > > > > > > > > pw20 state and altivec idle
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > From: Bhushan Bharat-R65777
> > > > > > > > > > > Sent: Thursday, October 17, 2013 1:01 AM
> > > > > > > > > > > To: Wang Dongsheng-B40534; Wood Scott-B07421
> > > > > > > > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > > > > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs
> > > > > > > > > > > for
> > > > > > > > > > > pw20 state and altivec idle
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > -----Original Message-----
> > > > > > > > > > > > From: Wang Dongsheng-B40534
> > > > > > > > > > > > Sent: Tuesday, October 15, 2013 2:51 PM
> > > > > > > > > > > > To: Wood Scott-B07421
> > > > > > > > > > > > Cc: Bhushan Bharat-R65777;
> > > > > > > > > > > > linuxppc-dev@lists.ozlabs.org; Wang
> > > > > > > > > > > Dongsheng-B40534
> > > > > > > > > > > > Subject: [PATCH v5 4/4] powerpc/85xx: add sysfs for
> > > > > > > > > > > > pw20 state and
> > > > > > > > > > > altivec idle
> > > > > > > > > > > >
> > > > > > > > > > > > From: Wang Dongsheng <dongsheng.wang@freescale.com>
> > > > > > > > > > > >
> > > > > > > > > > > > Add a sys interface to enable/diable pw20 state or
> > > > > > > > > > > > altivec idle, and
> > > > > > > > > > > control the
> > > > > > > > > > > > wait entry time.
> > > > > > > > > > > >
> > > > > > > > > > > > Enable/Disable interface:
> > > > > > > > > > > > 0, disable. 1, enable.
> > > > > > > > > > > > /sys/devices/system/cpu/cpuX/pw20_state
> > > > > > > > > > > > /sys/devices/system/cpu/cpuX/altivec_idle
> > > > > > > > > > > >
> > > > > > > > > > > > Set wait time interface:(Nanosecond)
> > > > > > > > > > > > /sys/devices/system/cpu/cpuX/pw20_wait_time
> > > > > > > > > > > > /sys/devices/system/cpu/cpuX/altivec_idle_wait_time
> > > > > > > > > > > > Example: Base on TBfreq is 41MHZ.
> > > > > > > > > > > > 1~48(ns): TB[63]
> > > > > > > > > > > > 49~97(ns): TB[62]
> > > > > > > > > > > > 98~195(ns): TB[61]
> > > > > > > > > > > > 196~390(ns): TB[60]
> > > > > > > > > > > > 391~780(ns): TB[59]
> > > > > > > > > > > > 781~1560(ns): TB[58] ...
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Wang Dongsheng
> > > > > > > > > > > > <dongsheng.wang@freescale.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > > *v5:
> > > > > > > > > > > > Change get_idle_ticks_bit function implementation.
> > > > > > > > > > > >
> > > > > > > > > > > > *v4:
> > > > > > > > > > > > Move code from 85xx/common.c to kernel/sysfs.c.
> > > > > > > > > > > >
> > > > > > > > > > > > Remove has_pw20_altivec_idle function.
> > > > > > > > > > > >
> > > > > > > > > > > > Change wait "entry_bit" to wait time.
> > > > > > > > > > > >
> > > > > > > > > > > > diff --git a/arch/powerpc/kernel/sysfs.c
> > > > > > > > > > > > b/arch/powerpc/kernel/sysfs.c
> > > > > > > > > > > index
> > > > > > > > > > > > 27a90b9..10d1128 100644
> > > > > > > > > > > > --- a/arch/powerpc/kernel/sysfs.c
> > > > > > > > > > > > +++ b/arch/powerpc/kernel/sysfs.c
> > > > > > > > > > > > @@ -85,6 +85,284 @@ __setup("smt-snooze-delay=",
> > > > > > > > > > > setup_smt_snooze_delay);
> > > > > > > > > > > >
> > > > > > > > > > > >  #endif /* CONFIG_PPC64 */
> > > > > > > > > > > >
> > > > > > > > > > > > +#ifdef CONFIG_FSL_SOC
> > > > > > > > > > > > +#define MAX_BIT				63
> > > > > > > > > > > > +
> > > > > > > > > > > > +static u64 pw20_wt; static u64 altivec_idle_wt;
> > > > > > > > > > > > +
> > > > > > > > > > > > +static unsigned int get_idle_ticks_bit(u64 ns) {
> > > > > > > > > > > > +	u64 cycle;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (ns >= 10000)
> > > > > > > > > > > > +		cycle = div_u64(ns + 500, 1000) *
> > > > > tb_ticks_per_usec;
> > > > > > > > > > > > +	else
> > > > > > > > > > > > +		cycle = div_u64(ns * tb_ticks_per_usec,
> > > 1000);
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (!cycle)
> > > > > > > > > > > > +		return 0;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	return ilog2(cycle); }
> > > > > > > > > > > > +
> > > > > > > > > > > > +static void do_show_pwrmgtcr0(void *val) {
> > > > > > > > > > > > +	u32 *value = val;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	*value = mfspr(SPRN_PWRMGTCR0); }
> > > > > > > > > > > > +
> > > > > > > > > > > > +static ssize_t show_pw20_state(struct device *dev,
> > > > > > > > > > > > +				struct device_attribute *attr,
> > > char
> > > > > *buf) {
> > > > > > > > > > > > +	u32 value;
> > > > > > > > > > > > +	unsigned int cpu = dev->id;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	smp_call_function_single(cpu, do_show_pwrmgtcr0,
> > > > > > > > > > > > +&value, 1);
> > > > > > > > > > > > +
> > > > > > > > > > > > +	value &= PWRMGTCR0_PW20_WAIT;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	return sprintf(buf, "%u\n", value ? 1 : 0); }
> > > > > > > > > > > > +
> > > > > > > > > > > > +static void do_store_pw20_state(void *val) {
> > > > > > > > > > > > +	u32 *value = val;
> > > > > > > > > > > > +	u32 pw20_state;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	pw20_state = mfspr(SPRN_PWRMGTCR0);
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (*value)
> > > > > > > > > > > > +		pw20_state |= PWRMGTCR0_PW20_WAIT;
> > > > > > > > > > > > +	else
> > > > > > > > > > > > +		pw20_state &= ~PWRMGTCR0_PW20_WAIT;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	mtspr(SPRN_PWRMGTCR0, pw20_state); }
> > > > > > > > > > > > +
> > > > > > > > > > > > +static ssize_t store_pw20_state(struct device *dev,
> > > > > > > > > > > > +				struct device_attribute *attr,
> > > > > > > > > > > > +				const char *buf, size_t count)
> > > {
> > > > > > > > > > > > +	u32 value;
> > > > > > > > > > > > +	unsigned int cpu = dev->id;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (kstrtou32(buf, 0, &value))
> > > > > > > > > > > > +		return -EINVAL;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (value > 1)
> > > > > > > > > > > > +		return -EINVAL;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	smp_call_function_single(cpu, do_store_pw20_state,
> > > > > > > > > > > > +&value, 1);
> > > > > > > > > > > > +
> > > > > > > > > > > > +	return count;
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > > +static ssize_t show_pw20_wait_time(struct device
> > *dev,
> > > > > > > > > > > > +				struct device_attribute *attr,
> > > char
> > > > > *buf) {
> > > > > > > > > > > > +	u32 value;
> > > > > > > > > > > > +	u64 tb_cycle;
> > > > > > > > > > > > +	s64 time;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	unsigned int cpu = dev->id;
> > > > > > > > > > > > +
> > > > > > > > > > > > +	if (!pw20_wt) {
> > > > > > > > > > > > +		smp_call_function_single(cpu,
> > > do_show_pwrmgtcr0,
> > > > > > > > > > > > +&value,
> > > > > > > > > 1);
> > > > > > > > > > > > +		value = (value & PWRMGTCR0_PW20_ENT) >>
> > > > > > > > > > > > +					PWRMGTCR0_PW20_ENT_SHIFT;
> > > > > > > > > > > > +
> > > > > > > > > > > > +		tb_cycle = (1 << (MAX_BIT - value)) * 2;
> > > > > > > > > > >
> > > > > > > > > > > Is value = 0 and value = 1 legal? These will make
> > > > > > > > > > > tb_cycle = 0,
> > > > > > > > > > >
> > > > > > > > > > > > +		time = div_u64(tb_cycle * 1000,
> > > tb_ticks_per_usec)
> > > > > - 1;
> > > > > > > > > > >
> > > > > > > > > > > And time = -1;
> > > > > > > > > > >
> > > > > > > > > > Please look at the end of the function, :)
> > > > > > > > > >
> > > > > > > > > > "return sprintf(buf, "%llu\n", time > 0 ? time : 0);"
> > > > > > > > >
> > > > > > > > > I know you return 0 if value = 0/1, my question was that,
> > > > > > > > > is this correct as per specification?
> > > > > > > > >
> > > > > > > > > Ahh, also for "value" upto 7 you will return 0, no?
> > > > > > > > >
> > > > > > > > If value = 0, MAX_BIT - value = 63 tb_cycle =
> > > > > > > > 0xffffffff_ffffffff, tb_cycle * 1000 will overflow, but this
> > > situation is not possible.
> > > > > > > > Because if the "value = 0" means this feature will be
> > "disable".
> > > > > > > > Now The default wait bit is 50(MAX_BIT - value, value = 13),
> > > > > > > > the PW20/Altivec Idle wait entry time is about 1ms, this
> > > > > > > > time is very long for wait idle time, and it's cannot be
> > > > > > > > increased(means (MAX_BIT
> > > > > > > > - value)
> > > > > > > cannot greater than 50).
> > > > > > >
> > > > > > > What you said is not obvious from code and so at least write a
> > > > > > > comment that value will be always >= 13 or value will never be
> > > > > > > less than < 8 and below calculation will not overflow. may be
> > > > > > > error out if value is less than 8.
> > > > > > >
> > > > > > The "value" less than 10, this will overflow.
> > > > > > There is not error, The code I knew it could not be less than
> > > > > > 10, that's why I use the following code. :)
> > > > >
> > > > > I am sorry to persist but this is not about what you know, this is
> > > > > about how code is read and code does not say what you know, so add
> > > > > a comment at least and error out/warn when "value" is less than a
> > > certain number.
> > > > >
> > > > Sorry for the late to response the mail. If it caused confusion, we
> > > > can
> > > add a comment.
> > > >
> > > > How about the following comment?
> > > > /*
> > > >  * If the "value" less than 10, this will overflow.
> > > >  * From benchmark test, the default wait bit will not be set less
> > > > than
> > > 10bit.
> > > >  * Because 10 bit corresponds to the wait entry time is
> > > > 439375573401999609(ns),
> > > >  * for wait-entry-idle time this value looks too long, and we cannot
> > > > use those
> > > >  * "long" time as a default wait-entry time. So overflow could not
> > > > have happened
> > > >  * and we use this calculation method to get wait-entry-idle time.
> > > >  */
> > >
> > > If there's to be a limit on the times we accept, make it explicit.
> > > Check for it before doing any conversions, and return an error if
> > > userspace tries to set it.
> > >
> > The branch only use to read default wait-entry-time.
> > We have no limit the user's input, and we can't restrict. Once the user
> > set the wait-entry-time, the code will do another branch.
> > 
> 
> Hi scott,
> Do you have any comments about this patch?
> I will add the comment and send this patch again.

What do you mean by "and we can't restrict"?  Why not?

Why is it only used to read the default, and not the current value?

-Scott

^ permalink raw reply

* Re: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and altivec idle
From: Scott Wood @ 2013-11-04 23:47 UTC (permalink / raw)
  To: Wang Dongsheng-B40534
  Cc: Wood Scott-B07421, linuxppc-dev@lists.ozlabs.org,
	Bhushan Bharat-R65777
In-Reply-To: <ABB05CD9C9F68C46A5CEDC7F15439259010B3EA6@039-SN2MPN1-021.039d.mgd.msft.net>

On Sun, 2013-10-20 at 22:27 -0500, Wang Dongsheng-B40534 wrote:
> 
> > -----Original Message-----
> > From: Wood Scott-B07421
> > Sent: Saturday, October 19, 2013 3:23 AM
> > To: Wang Dongsheng-B40534
> > Cc: Wood Scott-B07421; Bhushan Bharat-R65777; linuxppc-
> > dev@lists.ozlabs.org
> > Subject: Re: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and
> > altivec idle
> > 
> > On Thu, 2013-10-17 at 21:36 -0500, Wang Dongsheng-B40534 wrote:
> > >
> > > > -----Original Message-----
> > > > From: Wood Scott-B07421
> > > > Sent: Friday, October 18, 2013 12:52 AM
> > > > To: Wang Dongsheng-B40534
> > > > Cc: Bhushan Bharat-R65777; Wood Scott-B07421; linuxppc-
> > > > dev@lists.ozlabs.org
> > > > Subject: Re: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state
> > > > and altivec idle
> > > >
> > > > On Thu, 2013-10-17 at 00:51 -0500, Wang Dongsheng-B40534 wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Bhushan Bharat-R65777
> > > > > > Sent: Thursday, October 17, 2013 11:20 AM
> > > > > > To: Wang Dongsheng-B40534; Wood Scott-B07421
> > > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20
> > > > > > state and altivec idle
> > > > > >
> > > > > >
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Wang Dongsheng-B40534
> > > > > > > Sent: Thursday, October 17, 2013 8:16 AM
> > > > > > > To: Bhushan Bharat-R65777; Wood Scott-B07421
> > > > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20
> > > > > > > state and altivec idle
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Bhushan Bharat-R65777
> > > > > > > > Sent: Thursday, October 17, 2013 1:01 AM
> > > > > > > > To: Wang Dongsheng-B40534; Wood Scott-B07421
> > > > > > > > Cc: linuxppc-dev@lists.ozlabs.org
> > > > > > > > Subject: RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20
> > > > > > > > state and altivec idle
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Wang Dongsheng-B40534
> > > > > > > > > Sent: Tuesday, October 15, 2013 2:51 PM
> > > > > > > > > To: Wood Scott-B07421
> > > > > > > > > Cc: Bhushan Bharat-R65777; linuxppc-dev@lists.ozlabs.org;
> > > > > > > > > Wang
> > > > > > > > Dongsheng-B40534
> > > > > > > > > Subject: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20
> > > > > > > > > state and
> > > > > > > > altivec idle
> > > > > > > > >
> > > > > > > > > +static ssize_t show_pw20_wait_time(struct device *dev,
> > > > > > > > > +				struct device_attribute *attr, char
> > *buf) {
> > > > > > > > > +	u32 value;
> > > > > > > > > +	u64 tb_cycle;
> > > > > > > > > +	s64 time;
> > > > > > > > > +
> > > > > > > > > +	unsigned int cpu = dev->id;
> > > > > > > > > +
> > > > > > > > > +	if (!pw20_wt) {
> > > > > > > > > +		smp_call_function_single(cpu, do_show_pwrmgtcr0,
> > > > > > > > > +&value,
> > > > > > 1);
> > > > > > > > > +		value = (value & PWRMGTCR0_PW20_ENT) >>
> > > > > > > > > +					PWRMGTCR0_PW20_ENT_SHIFT;
> > > > > > > > > +
> > > > > > > > > +		tb_cycle = (1 << (MAX_BIT - value)) * 2;
> > > > > > > >
> > > > > > > > Is value = 0 and value = 1 legal? These will make tb_cycle =
> > > > > > > > 0,
> > > > > > > >
> > > > > > > > > +		time = div_u64(tb_cycle * 1000, tb_ticks_per_usec)
> > - 1;
> > > > > > > >
> > > > > > > > And time = -1;
> > > > > > > >
> > > > > > > Please look at the end of the function, :)
> > > > > > >
> > > > > > > "return sprintf(buf, "%llu\n", time > 0 ? time : 0);"
> > > > > >
> > > > > > I know you return 0 if value = 0/1, my question was that, is
> > > > > > this correct as per specification?
> > > > > >
> > > > > > Ahh, also for "value" upto 7 you will return 0, no?
> > > > > >
> > > > > If value = 0, MAX_BIT - value = 63 tb_cycle = 0xffffffff_ffffffff,
> > > >
> > > > Actually, tb_cycle will be undefined because you shifted a 32-bit
> > > > value
> > > > (1) by more than 31 bits.  s/1/1ULL/
> > > >
> > > Actually, we have been discussing this situation that could not have
> > happened.
> > > See !pw20_wt branch, this branch is read default wait bit.
> > > The default wait bit is 50, the time is about 1ms.
> > > The default wait bit cannot less than 50, means the wait entry time
> > cannot greater than 1ms.
> > > We have already begun benchmark test, and we got a preliminary results.
> > > 55, 56, 57bit looks good, but we need more benchmark to get the default
> > bit.
> > 
> > What does the default have to do with it?  The user could have set a
> > different value, and then read it back.
> > 
> > Plus, how much time corresponds to bit 50 depends on the actual timebase
> > frequency which could vary.
> > 
> 	if (!pw20_wt) {
> 		smp_call_function_single(cpu, do_show_pwrmgtcr0, &value, 1);
> 		value = (value & PWRMGTCR0_PW20_ENT) >>
> 					PWRMGTCR0_PW20_ENT_SHIFT;
> 
> 		tb_cycle = (1 << (MAX_BIT - value)) * 2;
> 		time = tb_cycle * (1000 / tb_ticks_per_usec) - 1;
> 	} else {
> 		time = pw20_wt;
> 	}
> 
> As we have discussed before we need a variable to save To save the users to set wait-entry-time value.
> 
> See the code, if user have set a value, and the value will be set in pw20_wt. 
> When the user read it back, the code will do "time = pw20_wt" branch.
> 
> When the user not set the wait-entry-time value and read this sys interface, the code will do the following branch.

Oh, so it's not that you "need" this, you already have it.

-Scott

^ permalink raw reply

* [PATCH] net: mv643xx_eth: Add missing phy_addr_set in DT mode
From: Jason Gunthorpe @ 2013-11-05  0:27 UTC (permalink / raw)
  To: Sebastian Hesselbarth
  Cc: Andrew Lunn, Jason Cooper, netdev, linux-kernel, linux-arm-kernel,
	linuxppc-dev, David Miller, Lennert Buytenhek

Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node
present' made the call to phy_scan optional, if the DT has a link to
the phy node.

However phy_scan has the side effect of calling phy_addr_set, which
writes the phy MDIO address to the ethernet controller. If phy_addr_set
is not called, and the bootloader has not set the correct address then
the driver will fail to function.

Tested on Kirkwood.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
---
Cc: David Miller <davem@davemloft.net>
Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/net/ethernet/marvell/mv643xx_eth.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index 2c210ec..00e43b5 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -2890,6 +2890,7 @@ static int mv643xx_eth_probe(struct platform_device *pdev)
 					 PHY_INTERFACE_MODE_GMII);
 		if (!mp->phy)
 			err = -ENODEV;
+		phy_addr_set(mp, mp->phy->addr);
 	} else if (pd->phy_addr != MV643XX_ETH_PHY_NONE) {
 		mp->phy = phy_scan(mp, pd->phy_addr);
 
-- 
1.8.1.2

^ permalink raw reply related

* Re: [PATCH] net: mv643xx_eth: Add missing phy_addr_set in DT mode
From: Jason Cooper @ 2013-11-05  0:36 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrew Lunn, netdev, linux-kernel, linux-arm-kernel, linuxppc-dev,
	David Miller, Lennert Buytenhek, Sebastian Hesselbarth
In-Reply-To: <1383611239-14556-1-git-send-email-jgunthorpe@obsidianresearch.com>

Jason,

On Mon, Nov 04, 2013 at 05:27:19PM -0700, Jason Gunthorpe wrote:
> Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node

fyi: set core.abbrev = 12 in your git config, according to Linus, 7/8
was a bad decision...

> present' made the call to phy_scan optional, if the DT has a link to
> the phy node.
> 
> However phy_scan has the side effect of calling phy_addr_set, which
> writes the phy MDIO address to the ethernet controller. If phy_addr_set
> is not called, and the bootloader has not set the correct address then
> the driver will fail to function.
> 
> Tested on Kirkwood.
> 
> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>

Fixes: cc9d459894b0 "net: mv643xx_eth: use of_phy_connect if phy_node present"
Acked-by: Jason Cooper <jason@lakedaemon.net>

And it should be suitable for v3.11+

thx,

Jason.

^ permalink raw reply

* RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and altivec idle
From: Dongsheng Wang @ 2013-11-05  2:11 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev@lists.ozlabs.org, Bharat Bhushan
In-Reply-To: <1383608853.25829.38.camel@snotra.buserror.net>

DQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogV29vZCBTY290dC1CMDc0
MjENCj4gU2VudDogVHVlc2RheSwgTm92ZW1iZXIgMDUsIDIwMTMgNzo0OCBBTQ0KPiBUbzogV2Fu
ZyBEb25nc2hlbmctQjQwNTM0DQo+IENjOiBXb29kIFNjb3R0LUIwNzQyMTsgQmh1c2hhbiBCaGFy
YXQtUjY1Nzc3OyBsaW51eHBwYy0NCj4gZGV2QGxpc3RzLm96bGFicy5vcmcNCj4gU3ViamVjdDog
UmU6IFtQQVRDSCB2NSA0LzRdIHBvd2VycGMvODV4eDogYWRkIHN5c2ZzIGZvciBwdzIwIHN0YXRl
IGFuZA0KPiBhbHRpdmVjIGlkbGUNCj4gDQo+IE9uIFN1biwgMjAxMy0xMC0yMCBhdCAyMjoyNyAt
MDUwMCwgV2FuZyBEb25nc2hlbmctQjQwNTM0IHdyb3RlOg0KPiA+DQo+ID4gPiAtLS0tLU9yaWdp
bmFsIE1lc3NhZ2UtLS0tLQ0KPiA+ID4gRnJvbTogV29vZCBTY290dC1CMDc0MjENCj4gPiA+IFNl
bnQ6IFNhdHVyZGF5LCBPY3RvYmVyIDE5LCAyMDEzIDM6MjMgQU0NCj4gPiA+IFRvOiBXYW5nIERv
bmdzaGVuZy1CNDA1MzQNCj4gPiA+IENjOiBXb29kIFNjb3R0LUIwNzQyMTsgQmh1c2hhbiBCaGFy
YXQtUjY1Nzc3OyBsaW51eHBwYy0NCj4gPiA+IGRldkBsaXN0cy5vemxhYnMub3JnDQo+ID4gPiBT
dWJqZWN0OiBSZTogW1BBVENIIHY1IDQvNF0gcG93ZXJwYy84NXh4OiBhZGQgc3lzZnMgZm9yIHB3
MjAgc3RhdGUNCj4gPiA+IGFuZCBhbHRpdmVjIGlkbGUNCj4gPiA+DQo+ID4gPiBPbiBUaHUsIDIw
MTMtMTAtMTcgYXQgMjE6MzYgLTA1MDAsIFdhbmcgRG9uZ3NoZW5nLUI0MDUzNCB3cm90ZToNCj4g
PiA+ID4NCj4gPiA+ID4gPiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiA+ID4gPiA+IEZy
b206IFdvb2QgU2NvdHQtQjA3NDIxDQo+ID4gPiA+ID4gU2VudDogRnJpZGF5LCBPY3RvYmVyIDE4
LCAyMDEzIDEyOjUyIEFNDQo+ID4gPiA+ID4gVG86IFdhbmcgRG9uZ3NoZW5nLUI0MDUzNA0KPiA+
ID4gPiA+IENjOiBCaHVzaGFuIEJoYXJhdC1SNjU3Nzc7IFdvb2QgU2NvdHQtQjA3NDIxOyBsaW51
eHBwYy0NCj4gPiA+ID4gPiBkZXZAbGlzdHMub3psYWJzLm9yZw0KPiA+ID4gPiA+IFN1YmplY3Q6
IFJlOiBbUEFUQ0ggdjUgNC80XSBwb3dlcnBjLzg1eHg6IGFkZCBzeXNmcyBmb3IgcHcyMA0KPiA+
ID4gPiA+IHN0YXRlIGFuZCBhbHRpdmVjIGlkbGUNCj4gPiA+ID4gPg0KPiA+ID4gPiA+IE9uIFRo
dSwgMjAxMy0xMC0xNyBhdCAwMDo1MSAtMDUwMCwgV2FuZyBEb25nc2hlbmctQjQwNTM0IHdyb3Rl
Og0KPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0N
Cj4gPiA+ID4gPiA+ID4gRnJvbTogQmh1c2hhbiBCaGFyYXQtUjY1Nzc3DQo+ID4gPiA+ID4gPiA+
IFNlbnQ6IFRodXJzZGF5LCBPY3RvYmVyIDE3LCAyMDEzIDExOjIwIEFNDQo+ID4gPiA+ID4gPiA+
IFRvOiBXYW5nIERvbmdzaGVuZy1CNDA1MzQ7IFdvb2QgU2NvdHQtQjA3NDIxDQo+ID4gPiA+ID4g
PiA+IENjOiBsaW51eHBwYy1kZXZAbGlzdHMub3psYWJzLm9yZw0KPiA+ID4gPiA+ID4gPiBTdWJq
ZWN0OiBSRTogW1BBVENIIHY1IDQvNF0gcG93ZXJwYy84NXh4OiBhZGQgc3lzZnMgZm9yIHB3MjAN
Cj4gPiA+ID4gPiA+ID4gc3RhdGUgYW5kIGFsdGl2ZWMgaWRsZQ0KPiA+ID4gPiA+ID4gPg0KPiA+
ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiA+IC0tLS0tT3JpZ2luYWwg
TWVzc2FnZS0tLS0tDQo+ID4gPiA+ID4gPiA+ID4gRnJvbTogV2FuZyBEb25nc2hlbmctQjQwNTM0
DQo+ID4gPiA+ID4gPiA+ID4gU2VudDogVGh1cnNkYXksIE9jdG9iZXIgMTcsIDIwMTMgODoxNiBB
TQ0KPiA+ID4gPiA+ID4gPiA+IFRvOiBCaHVzaGFuIEJoYXJhdC1SNjU3Nzc7IFdvb2QgU2NvdHQt
QjA3NDIxDQo+ID4gPiA+ID4gPiA+ID4gQ2M6IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMub3Jn
DQo+ID4gPiA+ID4gPiA+ID4gU3ViamVjdDogUkU6IFtQQVRDSCB2NSA0LzRdIHBvd2VycGMvODV4
eDogYWRkIHN5c2ZzIGZvcg0KPiA+ID4gPiA+ID4gPiA+IHB3MjAgc3RhdGUgYW5kIGFsdGl2ZWMg
aWRsZQ0KPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPg0K
PiA+ID4gPiA+ID4gPiA+ID4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gPiA+ID4gPiA+
ID4gPiA+IEZyb206IEJodXNoYW4gQmhhcmF0LVI2NTc3Nw0KPiA+ID4gPiA+ID4gPiA+ID4gU2Vu
dDogVGh1cnNkYXksIE9jdG9iZXIgMTcsIDIwMTMgMTowMSBBTQ0KPiA+ID4gPiA+ID4gPiA+ID4g
VG86IFdhbmcgRG9uZ3NoZW5nLUI0MDUzNDsgV29vZCBTY290dC1CMDc0MjENCj4gPiA+ID4gPiA+
ID4gPiA+IENjOiBsaW51eHBwYy1kZXZAbGlzdHMub3psYWJzLm9yZw0KPiA+ID4gPiA+ID4gPiA+
ID4gU3ViamVjdDogUkU6IFtQQVRDSCB2NSA0LzRdIHBvd2VycGMvODV4eDogYWRkIHN5c2ZzIGZv
cg0KPiA+ID4gPiA+ID4gPiA+ID4gcHcyMCBzdGF0ZSBhbmQgYWx0aXZlYyBpZGxlDQo+ID4gPiA+
ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+
ID4gPiA+ID4gPiA+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+ID4gPiA+ID4gPiA+ID4g
PiA+IEZyb206IFdhbmcgRG9uZ3NoZW5nLUI0MDUzNA0KPiA+ID4gPiA+ID4gPiA+ID4gPiBTZW50
OiBUdWVzZGF5LCBPY3RvYmVyIDE1LCAyMDEzIDI6NTEgUE0NCj4gPiA+ID4gPiA+ID4gPiA+ID4g
VG86IFdvb2QgU2NvdHQtQjA3NDIxDQo+ID4gPiA+ID4gPiA+ID4gPiA+IENjOiBCaHVzaGFuIEJo
YXJhdC1SNjU3Nzc7DQo+ID4gPiA+ID4gPiA+ID4gPiA+IGxpbnV4cHBjLWRldkBsaXN0cy5vemxh
YnMub3JnOyBXYW5nDQo+ID4gPiA+ID4gPiA+ID4gPiBEb25nc2hlbmctQjQwNTM0DQo+ID4gPiA+
ID4gPiA+ID4gPiA+IFN1YmplY3Q6IFtQQVRDSCB2NSA0LzRdIHBvd2VycGMvODV4eDogYWRkIHN5
c2ZzIGZvcg0KPiA+ID4gPiA+ID4gPiA+ID4gPiBwdzIwIHN0YXRlIGFuZA0KPiA+ID4gPiA+ID4g
PiA+ID4gYWx0aXZlYyBpZGxlDQo+ID4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4g
PiA+ICtzdGF0aWMgc3NpemVfdCBzaG93X3B3MjBfd2FpdF90aW1lKHN0cnVjdCBkZXZpY2UgKmRl
diwNCj4gPiA+ID4gPiA+ID4gPiA+ID4gKwkJCQlzdHJ1Y3QgZGV2aWNlX2F0dHJpYnV0ZSAqYXR0
ciwNCj4gY2hhcg0KPiA+ID4gKmJ1Zikgew0KPiA+ID4gPiA+ID4gPiA+ID4gPiArCXUzMiB2YWx1
ZTsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gKwl1NjQgdGJfY3ljbGU7DQo+ID4gPiA+ID4gPiA+ID4g
PiA+ICsJczY0IHRpbWU7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+
ID4gKwl1bnNpZ25lZCBpbnQgY3B1ID0gZGV2LT5pZDsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gKw0K
PiA+ID4gPiA+ID4gPiA+ID4gPiArCWlmICghcHcyMF93dCkgew0KPiA+ID4gPiA+ID4gPiA+ID4g
PiArCQlzbXBfY2FsbF9mdW5jdGlvbl9zaW5nbGUoY3B1LA0KPiBkb19zaG93X3B3cm1ndGNyMCwN
Cj4gPiA+ID4gPiA+ID4gPiA+ID4gKyZ2YWx1ZSwNCj4gPiA+ID4gPiA+ID4gMSk7DQo+ID4gPiA+
ID4gPiA+ID4gPiA+ICsJCXZhbHVlID0gKHZhbHVlICYgUFdSTUdUQ1IwX1BXMjBfRU5UKSA+Pg0K
PiA+ID4gPiA+ID4gPiA+ID4gPiArCQkJCQlQV1JNR1RDUjBfUFcyMF9FTlRfU0hJRlQ7DQo+ID4g
PiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gKwkJdGJfY3ljbGUgPSAoMSA8
PCAoTUFYX0JJVCAtIHZhbHVlKSkgKiAyOw0KPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+
ID4gPiA+IElzIHZhbHVlID0gMCBhbmQgdmFsdWUgPSAxIGxlZ2FsPyBUaGVzZSB3aWxsIG1ha2UN
Cj4gPiA+ID4gPiA+ID4gPiA+IHRiX2N5Y2xlID0gMCwNCj4gPiA+ID4gPiA+ID4gPiA+DQo+ID4g
PiA+ID4gPiA+ID4gPiA+ICsJCXRpbWUgPSBkaXZfdTY0KHRiX2N5Y2xlICogMTAwMCwNCj4gdGJf
dGlja3NfcGVyX3VzZWMpDQo+ID4gPiAtIDE7DQo+ID4gPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+
ID4gPiA+ID4gQW5kIHRpbWUgPSAtMTsNCj4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+
ID4gUGxlYXNlIGxvb2sgYXQgdGhlIGVuZCBvZiB0aGUgZnVuY3Rpb24sIDopDQo+ID4gPiA+ID4g
PiA+ID4NCj4gPiA+ID4gPiA+ID4gPiAicmV0dXJuIHNwcmludGYoYnVmLCAiJWxsdVxuIiwgdGlt
ZSA+IDAgPyB0aW1lIDogMCk7Ig0KPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiBJIGtub3cg
eW91IHJldHVybiAwIGlmIHZhbHVlID0gMC8xLCBteSBxdWVzdGlvbiB3YXMgdGhhdCwgaXMNCj4g
PiA+ID4gPiA+ID4gdGhpcyBjb3JyZWN0IGFzIHBlciBzcGVjaWZpY2F0aW9uPw0KPiA+ID4gPiA+
ID4gPg0KPiA+ID4gPiA+ID4gPiBBaGgsIGFsc28gZm9yICJ2YWx1ZSIgdXB0byA3IHlvdSB3aWxs
IHJldHVybiAwLCBubz8NCj4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+IElmIHZhbHVlID0gMCwg
TUFYX0JJVCAtIHZhbHVlID0gNjMgdGJfY3ljbGUgPQ0KPiA+ID4gPiA+ID4gMHhmZmZmZmZmZl9m
ZmZmZmZmZiwNCj4gPiA+ID4gPg0KPiA+ID4gPiA+IEFjdHVhbGx5LCB0Yl9jeWNsZSB3aWxsIGJl
IHVuZGVmaW5lZCBiZWNhdXNlIHlvdSBzaGlmdGVkIGENCj4gPiA+ID4gPiAzMi1iaXQgdmFsdWUN
Cj4gPiA+ID4gPiAoMSkgYnkgbW9yZSB0aGFuIDMxIGJpdHMuICBzLzEvMVVMTC8NCj4gPiA+ID4g
Pg0KPiA+ID4gPiBBY3R1YWxseSwgd2UgaGF2ZSBiZWVuIGRpc2N1c3NpbmcgdGhpcyBzaXR1YXRp
b24gdGhhdCBjb3VsZCBub3QNCj4gPiA+ID4gaGF2ZQ0KPiA+ID4gaGFwcGVuZWQuDQo+ID4gPiA+
IFNlZSAhcHcyMF93dCBicmFuY2gsIHRoaXMgYnJhbmNoIGlzIHJlYWQgZGVmYXVsdCB3YWl0IGJp
dC4NCj4gPiA+ID4gVGhlIGRlZmF1bHQgd2FpdCBiaXQgaXMgNTAsIHRoZSB0aW1lIGlzIGFib3V0
IDFtcy4NCj4gPiA+ID4gVGhlIGRlZmF1bHQgd2FpdCBiaXQgY2Fubm90IGxlc3MgdGhhbiA1MCwg
bWVhbnMgdGhlIHdhaXQgZW50cnkNCj4gPiA+ID4gdGltZQ0KPiA+ID4gY2Fubm90IGdyZWF0ZXIg
dGhhbiAxbXMuDQo+ID4gPiA+IFdlIGhhdmUgYWxyZWFkeSBiZWd1biBiZW5jaG1hcmsgdGVzdCwg
YW5kIHdlIGdvdCBhIHByZWxpbWluYXJ5DQo+IHJlc3VsdHMuDQo+ID4gPiA+IDU1LCA1NiwgNTdi
aXQgbG9va3MgZ29vZCwgYnV0IHdlIG5lZWQgbW9yZSBiZW5jaG1hcmsgdG8gZ2V0IHRoZQ0KPiA+
ID4gPiBkZWZhdWx0DQo+ID4gPiBiaXQuDQo+ID4gPg0KPiA+ID4gV2hhdCBkb2VzIHRoZSBkZWZh
dWx0IGhhdmUgdG8gZG8gd2l0aCBpdD8gIFRoZSB1c2VyIGNvdWxkIGhhdmUgc2V0IGENCj4gPiA+
IGRpZmZlcmVudCB2YWx1ZSwgYW5kIHRoZW4gcmVhZCBpdCBiYWNrLg0KPiA+ID4NCj4gPiA+IFBs
dXMsIGhvdyBtdWNoIHRpbWUgY29ycmVzcG9uZHMgdG8gYml0IDUwIGRlcGVuZHMgb24gdGhlIGFj
dHVhbA0KPiA+ID4gdGltZWJhc2UgZnJlcXVlbmN5IHdoaWNoIGNvdWxkIHZhcnkuDQo+ID4gPg0K
PiA+IAlpZiAoIXB3MjBfd3QpIHsNCj4gPiAJCXNtcF9jYWxsX2Z1bmN0aW9uX3NpbmdsZShjcHUs
IGRvX3Nob3dfcHdybWd0Y3IwLCAmdmFsdWUsIDEpOw0KPiA+IAkJdmFsdWUgPSAodmFsdWUgJiBQ
V1JNR1RDUjBfUFcyMF9FTlQpID4+DQo+ID4gCQkJCQlQV1JNR1RDUjBfUFcyMF9FTlRfU0hJRlQ7
DQo+ID4NCj4gPiAJCXRiX2N5Y2xlID0gKDEgPDwgKE1BWF9CSVQgLSB2YWx1ZSkpICogMjsNCj4g
PiAJCXRpbWUgPSB0Yl9jeWNsZSAqICgxMDAwIC8gdGJfdGlja3NfcGVyX3VzZWMpIC0gMTsNCj4g
PiAJfSBlbHNlIHsNCj4gPiAJCXRpbWUgPSBwdzIwX3d0Ow0KPiA+IAl9DQo+ID4NCj4gPiBBcyB3
ZSBoYXZlIGRpc2N1c3NlZCBiZWZvcmUgd2UgbmVlZCBhIHZhcmlhYmxlIHRvIHNhdmUgVG8gc2F2
ZSB0aGUNCj4gdXNlcnMgdG8gc2V0IHdhaXQtZW50cnktdGltZSB2YWx1ZS4NCj4gPg0KPiA+IFNl
ZSB0aGUgY29kZSwgaWYgdXNlciBoYXZlIHNldCBhIHZhbHVlLCBhbmQgdGhlIHZhbHVlIHdpbGwg
YmUgc2V0IGluDQo+IHB3MjBfd3QuDQo+ID4gV2hlbiB0aGUgdXNlciByZWFkIGl0IGJhY2ssIHRo
ZSBjb2RlIHdpbGwgZG8gInRpbWUgPSBwdzIwX3d0IiBicmFuY2guDQo+ID4NCj4gPiBXaGVuIHRo
ZSB1c2VyIG5vdCBzZXQgdGhlIHdhaXQtZW50cnktdGltZSB2YWx1ZSBhbmQgcmVhZCB0aGlzIHN5
cw0KPiBpbnRlcmZhY2UsIHRoZSBjb2RlIHdpbGwgZG8gdGhlIGZvbGxvd2luZyBicmFuY2guDQo+
IA0KPiBPaCwgc28gaXQncyBub3QgdGhhdCB5b3UgIm5lZWQiIHRoaXMsIHlvdSBhbHJlYWR5IGhh
dmUgaXQuDQo+IA0KWWVzLCB3ZSBhbHJlYWR5IGhhdmUgYSB2YXJpYWJsZS4gOikNCg0K

^ permalink raw reply

* Re: [PATCH v11 3/3] DMA: Freescale: update driver to support 8-channel DMA engine
From: Hongbo Zhang @ 2013-11-05  2:31 UTC (permalink / raw)
  To: vinod.koul, djbw
  Cc: mark.rutland, devicetree, ian.campbell, pawel.moll, swarren,
	Hongbo Zhang, linux-kernel, rob.herring, linuxppc-dev
In-Reply-To: <525F7C18.3010409@freescale.com>

Hi Vinod Koul and Dan Williams,
Ping?


On 10/17/2013 01:56 PM, Hongbo Zhang wrote:
> Hi Vinod,
> I have gotten ACK from Mark for both the 1/3 and 2/3 patches.
> Thanks.
>
>
> On 09/26/2013 05:33 PM, hongbo.zhang@freescale.com wrote:
>> From: Hongbo Zhang <hongbo.zhang@freescale.com>
>>
>> This patch adds support to 8-channel DMA engine, thus the driver 
>> works for both
>> the new 8-channel and the legacy 4-channel DMA engines.
>>
>> Signed-off-by: Hongbo Zhang <hongbo.zhang@freescale.com>
>> ---
>>   drivers/dma/Kconfig  |    9 +++++----
>>   drivers/dma/fsldma.c |    9 ++++++---
>>   drivers/dma/fsldma.h |    2 +-
>>   3 files changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
>> index 6825957..3979c65 100644
>> --- a/drivers/dma/Kconfig
>> +++ b/drivers/dma/Kconfig
>> @@ -89,14 +89,15 @@ config AT_HDMAC
>>         Support the Atmel AHB DMA controller.
>>     config FSL_DMA
>> -    tristate "Freescale Elo and Elo Plus DMA support"
>> +    tristate "Freescale Elo series DMA support"
>>       depends on FSL_SOC
>>       select DMA_ENGINE
>>       select ASYNC_TX_ENABLE_CHANNEL_SWITCH
>>       ---help---
>> -      Enable support for the Freescale Elo and Elo Plus DMA 
>> controllers.
>> -      The Elo is the DMA controller on some 82xx and 83xx parts, and 
>> the
>> -      Elo Plus is the DMA controller on 85xx and 86xx parts.
>> +      Enable support for the Freescale Elo series DMA controllers.
>> +      The Elo is the DMA controller on some mpc82xx and mpc83xx 
>> parts, the
>> +      EloPlus is on mpc85xx and mpc86xx and Pxxx parts, and the Elo3 
>> is on
>> +      some Txxx and Bxxx parts.
>>     config MPC512X_DMA
>>       tristate "Freescale MPC512x built-in DMA engine support"
>> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
>> index 49e8fbd..16a9a48 100644
>> --- a/drivers/dma/fsldma.c
>> +++ b/drivers/dma/fsldma.c
>> @@ -1261,7 +1261,9 @@ static int fsl_dma_chan_probe(struct 
>> fsldma_device *fdev,
>>       WARN_ON(fdev->feature != chan->feature);
>>         chan->dev = fdev->dev;
>> -    chan->id = ((res.start - 0x100) & 0xfff) >> 7;
>> +    chan->id = (res.start & 0xfff) < 0x300 ?
>> +           ((res.start - 0x100) & 0xfff) >> 7 :
>> +           ((res.start - 0x200) & 0xfff) >> 7;
>>       if (chan->id >= FSL_DMA_MAX_CHANS_PER_DEVICE) {
>>           dev_err(fdev->dev, "too many channels for device\n");
>>           err = -EINVAL;
>> @@ -1434,6 +1436,7 @@ static int fsldma_of_remove(struct 
>> platform_device *op)
>>   }
>>     static const struct of_device_id fsldma_of_ids[] = {
>> +    { .compatible = "fsl,elo3-dma", },
>>       { .compatible = "fsl,eloplus-dma", },
>>       { .compatible = "fsl,elo-dma", },
>>       {}
>> @@ -1455,7 +1458,7 @@ static struct platform_driver fsldma_of_driver = {
>>     static __init int fsldma_init(void)
>>   {
>> -    pr_info("Freescale Elo / Elo Plus DMA driver\n");
>> +    pr_info("Freescale Elo series DMA driver\n");
>>       return platform_driver_register(&fsldma_of_driver);
>>   }
>>   @@ -1467,5 +1470,5 @@ static void __exit fsldma_exit(void)
>>   subsys_initcall(fsldma_init);
>>   module_exit(fsldma_exit);
>>   -MODULE_DESCRIPTION("Freescale Elo / Elo Plus DMA driver");
>> +MODULE_DESCRIPTION("Freescale Elo series DMA driver");
>>   MODULE_LICENSE("GPL");
>> diff --git a/drivers/dma/fsldma.h b/drivers/dma/fsldma.h
>> index f5c3879..1ffc244 100644
>> --- a/drivers/dma/fsldma.h
>> +++ b/drivers/dma/fsldma.h
>> @@ -112,7 +112,7 @@ struct fsldma_chan_regs {
>>   };
>>     struct fsldma_chan;
>> -#define FSL_DMA_MAX_CHANS_PER_DEVICE 4
>> +#define FSL_DMA_MAX_CHANS_PER_DEVICE 8
>>     struct fsldma_device {
>>       void __iomem *regs;    /* DGSR register base */
>
>

^ permalink raw reply

* RE: [PATCH v5 4/4] powerpc/85xx: add sysfs for pw20 state and altivec idle
From: Dongsheng Wang @ 2013-11-05  3:09 UTC (permalink / raw)
  To: Scott Wood; +Cc: linuxppc-dev@lists.ozlabs.org, Bharat Bhushan
In-Reply-To: <1383601901.25829.33.camel@snotra.buserror.net>

DQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogV29vZCBTY290dC1CMDc0
MjENCj4gU2VudDogVHVlc2RheSwgTm92ZW1iZXIgMDUsIDIwMTMgNTo1MiBBTQ0KPiBUbzogV2Fu
ZyBEb25nc2hlbmctQjQwNTM0DQo+IENjOiBXb29kIFNjb3R0LUIwNzQyMTsgQmh1c2hhbiBCaGFy
YXQtUjY1Nzc3OyBsaW51eHBwYy0NCj4gZGV2QGxpc3RzLm96bGFicy5vcmcNCj4gU3ViamVjdDog
UmU6IFtQQVRDSCB2NSA0LzRdIHBvd2VycGMvODV4eDogYWRkIHN5c2ZzIGZvciBwdzIwIHN0YXRl
IGFuZA0KPiBhbHRpdmVjIGlkbGUNCj4gDQo+IE9uIFN1biwgMjAxMy0xMS0wMyBhdCAyMjowNCAt
MDYwMCwgV2FuZyBEb25nc2hlbmctQjQwNTM0IHdyb3RlOg0KPiA+ID4gLS0tLS1PcmlnaW5hbCBN
ZXNzYWdlLS0tLS0NCj4gPiA+IEZyb206IFdhbmcgRG9uZ3NoZW5nLUI0MDUzNA0KPiA+ID4gU2Vu
dDogTW9uZGF5LCBPY3RvYmVyIDIxLCAyMDEzIDExOjExIEFNDQo+ID4gPiBUbzogV29vZCBTY290
dC1CMDc0MjENCj4gPiA+IENjOiBCaHVzaGFuIEJoYXJhdC1SNjU3Nzc7IGxpbnV4cHBjLWRldkBs
aXN0cy5vemxhYnMub3JnDQo+ID4gPiBTdWJqZWN0OiBSRTogW1BBVENIIHY1IDQvNF0gcG93ZXJw
Yy84NXh4OiBhZGQgc3lzZnMgZm9yIHB3MjAgc3RhdGUNCj4gPiA+IGFuZCBhbHRpdmVjIGlkbGUN
Cj4gPiA+DQo+ID4gPg0KPiA+ID4NCj4gPiA+ID4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0N
Cj4gPiA+ID4gRnJvbTogV29vZCBTY290dC1CMDc0MjENCj4gPiA+ID4gU2VudDogU2F0dXJkYXks
IE9jdG9iZXIgMTksIDIwMTMgMzoyMiBBTQ0KPiA+ID4gPiBUbzogV2FuZyBEb25nc2hlbmctQjQw
NTM0DQo+ID4gPiA+IENjOiBCaHVzaGFuIEJoYXJhdC1SNjU3Nzc7IFdvb2QgU2NvdHQtQjA3NDIx
OyBsaW51eHBwYy0NCj4gPiA+ID4gZGV2QGxpc3RzLm96bGFicy5vcmcNCj4gPiA+ID4gU3ViamVj
dDogUmU6IFtQQVRDSCB2NSA0LzRdIHBvd2VycGMvODV4eDogYWRkIHN5c2ZzIGZvciBwdzIwIHN0
YXRlDQo+ID4gPiA+IGFuZCBhbHRpdmVjIGlkbGUNCj4gPiA+ID4NCj4gPiA+ID4gT24gVGh1LCAy
MDEzLTEwLTE3IGF0IDIyOjAyIC0wNTAwLCBXYW5nIERvbmdzaGVuZy1CNDA1MzQgd3JvdGU6DQo+
ID4gPiA+ID4NCj4gPiA+ID4gPiA+IC0tLS0tT3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+ID4gPiA+
ID4gPiBGcm9tOiBCaHVzaGFuIEJoYXJhdC1SNjU3NzcNCj4gPiA+ID4gPiA+IFNlbnQ6IFRodXJz
ZGF5LCBPY3RvYmVyIDE3LCAyMDEzIDI6NDYgUE0NCj4gPiA+ID4gPiA+IFRvOiBXYW5nIERvbmdz
aGVuZy1CNDA1MzQ7IFdvb2QgU2NvdHQtQjA3NDIxDQo+ID4gPiA+ID4gPiBDYzogbGludXhwcGMt
ZGV2QGxpc3RzLm96bGFicy5vcmcNCj4gPiA+ID4gPiA+IFN1YmplY3Q6IFJFOiBbUEFUQ0ggdjUg
NC80XSBwb3dlcnBjLzg1eHg6IGFkZCBzeXNmcyBmb3IgcHcyMA0KPiA+ID4gPiA+ID4gc3RhdGUg
YW5kIGFsdGl2ZWMgaWRsZQ0KPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPg0K
PiA+ID4gPiA+ID4gPiA+ID4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gPiA+ID4gPiA+
ID4gPiA+IEZyb206IFdhbmcgRG9uZ3NoZW5nLUI0MDUzNA0KPiA+ID4gPiA+ID4gPiA+ID4gU2Vu
dDogVGh1cnNkYXksIE9jdG9iZXIgMTcsIDIwMTMgMTE6MjIgQU0NCj4gPiA+ID4gPiA+ID4gPiA+
IFRvOiBCaHVzaGFuIEJoYXJhdC1SNjU3Nzc7IFdvb2QgU2NvdHQtQjA3NDIxDQo+ID4gPiA+ID4g
PiA+ID4gPiBDYzogbGludXhwcGMtZGV2QGxpc3RzLm96bGFicy5vcmcNCj4gPiA+ID4gPiA+ID4g
PiA+IFN1YmplY3Q6IFJFOiBbUEFUQ0ggdjUgNC80XSBwb3dlcnBjLzg1eHg6IGFkZCBzeXNmcyBm
b3INCj4gPiA+ID4gPiA+ID4gPiA+IHB3MjAgc3RhdGUgYW5kIGFsdGl2ZWMgaWRsZQ0KPiA+ID4g
PiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4gPg0KPiA+ID4g
PiA+ID4gPiA+ID4gPiAtLS0tLU9yaWdpbmFsIE1lc3NhZ2UtLS0tLQ0KPiA+ID4gPiA+ID4gPiA+
ID4gPiBGcm9tOiBCaHVzaGFuIEJoYXJhdC1SNjU3NzcNCj4gPiA+ID4gPiA+ID4gPiA+ID4gU2Vu
dDogVGh1cnNkYXksIE9jdG9iZXIgMTcsIDIwMTMgMTE6MjAgQU0NCj4gPiA+ID4gPiA+ID4gPiA+
ID4gVG86IFdhbmcgRG9uZ3NoZW5nLUI0MDUzNDsgV29vZCBTY290dC1CMDc0MjENCj4gPiA+ID4g
PiA+ID4gPiA+ID4gQ2M6IGxpbnV4cHBjLWRldkBsaXN0cy5vemxhYnMub3JnDQo+ID4gPiA+ID4g
PiA+ID4gPiA+IFN1YmplY3Q6IFJFOiBbUEFUQ0ggdjUgNC80XSBwb3dlcnBjLzg1eHg6IGFkZCBz
eXNmcw0KPiA+ID4gPiA+ID4gPiA+ID4gPiBmb3INCj4gPiA+ID4gPiA+ID4gPiA+ID4gcHcyMCBz
dGF0ZSBhbmQgYWx0aXZlYyBpZGxlDQo+ID4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+
ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gLS0tLS1P
cmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiBGcm9tOiBXYW5nIERv
bmdzaGVuZy1CNDA1MzQNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiBTZW50OiBUaHVyc2RheSwgT2N0
b2JlciAxNywgMjAxMyA4OjE2IEFNDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gVG86IEJodXNoYW4g
QmhhcmF0LVI2NTc3NzsgV29vZCBTY290dC1CMDc0MjENCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiBD
YzogbGludXhwcGMtZGV2QGxpc3RzLm96bGFicy5vcmcNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiBT
dWJqZWN0OiBSRTogW1BBVENIIHY1IDQvNF0gcG93ZXJwYy84NXh4OiBhZGQgc3lzZnMNCj4gPiA+
ID4gPiA+ID4gPiA+ID4gPiBmb3INCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiBwdzIwIHN0YXRlIGFu
ZCBhbHRpdmVjIGlkbGUNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiA+ID4g
PiA+DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IC0tLS0t
T3JpZ2luYWwgTWVzc2FnZS0tLS0tDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiBGcm9tOiBCaHVz
aGFuIEJoYXJhdC1SNjU3NzcNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IFNlbnQ6IFRodXJzZGF5
LCBPY3RvYmVyIDE3LCAyMDEzIDE6MDEgQU0NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IFRvOiBX
YW5nIERvbmdzaGVuZy1CNDA1MzQ7IFdvb2QgU2NvdHQtQjA3NDIxDQo+ID4gPiA+ID4gPiA+ID4g
PiA+ID4gPiBDYzogbGludXhwcGMtZGV2QGxpc3RzLm96bGFicy5vcmcNCj4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+IFN1YmplY3Q6IFJFOiBbUEFUQ0ggdjUgNC80XSBwb3dlcnBjLzg1eHg6IGFkZA0K
PiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gc3lzZnMgZm9yDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4g
PiBwdzIwIHN0YXRlIGFuZCBhbHRpdmVjIGlkbGUNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+DQo+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ID4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gRnJvbTogV2FuZyBEb25nc2hlbmctQjQwNTM0DQo+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gPiA+IFNlbnQ6IFR1ZXNkYXksIE9jdG9iZXIgMTUsIDIwMTMgMjo1MSBQTQ0KPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiBUbzogV29vZCBTY290dC1CMDc0MjENCj4gPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gQ2M6IEJodXNoYW4gQmhhcmF0LVI2NTc3NzsNCj4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gbGludXhwcGMtZGV2QGxpc3RzLm96bGFicy5vcmc7IFdhbmcNCj4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+IERvbmdzaGVuZy1CNDA1MzQNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gU3ViamVjdDogW1BBVENIIHY1IDQvNF0gcG93ZXJwYy84NXh4OiBhZGQgc3lzZnMNCj4gPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gZm9yDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IHB3MjAg
c3RhdGUgYW5kDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiBhbHRpdmVjIGlkbGUNCj4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gRnJvbTogV2FuZyBE
b25nc2hlbmcNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPGRvbmdzaGVuZy53YW5nQGZyZWVz
Y2FsZS5jb20+DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+IEFkZCBhIHN5cyBpbnRlcmZhY2UgdG8gZW5hYmxlL2RpYWJsZSBwdzIwIHN0YXRlDQo+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IG9yIGFsdGl2ZWMgaWRsZSwgYW5kDQo+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gPiBjb250cm9sIHRoZQ0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiB3YWl0
IGVudHJ5IHRpbWUuDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+IEVuYWJsZS9EaXNhYmxlIGludGVyZmFjZToNCj4gPiA+ID4gPiA+ID4gPiA+ID4g
PiA+ID4gMCwgZGlzYWJsZS4gMSwgZW5hYmxlLg0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiAv
c3lzL2RldmljZXMvc3lzdGVtL2NwdS9jcHVYL3B3MjBfc3RhdGUNCj4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gL3N5cy9kZXZpY2VzL3N5c3RlbS9jcHUvY3B1WC9hbHRpdmVjX2lkbGUNCj4gPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gU2V0IHdhaXQg
dGltZSBpbnRlcmZhY2U6KE5hbm9zZWNvbmQpDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IC9z
eXMvZGV2aWNlcy9zeXN0ZW0vY3B1L2NwdVgvcHcyMF93YWl0X3RpbWUNCj4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gL3N5cy9kZXZpY2VzL3N5c3RlbS9jcHUvY3B1WC9hbHRpdmVjX2lkbGVfd2Fp
dF90DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IGltZQ0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPiBFeGFtcGxlOiBCYXNlIG9uIFRCZnJlcSBpcyA0MU1IWi4NCj4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gMX40OChucyk6IFRCWzYzXQ0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA0OX45
Nyhucyk6IFRCWzYyXQ0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA5OH4xOTUobnMpOiBUQls2
MV0NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gMTk2fjM5MChucyk6IFRCWzYwXQ0KPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ID4gPiAzOTF+NzgwKG5zKTogVEJbNTldDQo+ID4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+IDc4MX4xNTYwKG5zKTogVEJbNThdIC4uLg0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPg0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiBTaWduZWQtb2ZmLWJ5OiBXYW5nIERvbmdz
aGVuZw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA8ZG9uZ3NoZW5nLndhbmdAZnJlZXNjYWxl
LmNvbT4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gLS0tDQo+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ICp2NToNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gQ2hhbmdlIGdldF9pZGxlX3Rp
Y2tzX2JpdCBmdW5jdGlvbiBpbXBsZW1lbnRhdGlvbi4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+
ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKnY0Og0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPiBNb3ZlIGNvZGUgZnJvbSA4NXh4L2NvbW1vbi5jIHRvIGtlcm5lbC9zeXNmcy5jLg0KPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiBSZW1vdmUg
aGFzX3B3MjBfYWx0aXZlY19pZGxlIGZ1bmN0aW9uLg0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4g
Pg0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiBDaGFuZ2Ugd2FpdCAiZW50cnlfYml0IiB0byB3
YWl0IHRpbWUuDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+IGRpZmYgLS1naXQgYS9hcmNoL3Bvd2VycGMva2VybmVsL3N5c2ZzLmMNCj4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ID4gYi9hcmNoL3Bvd2VycGMva2VybmVsL3N5c2ZzLmMNCj4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+IGluZGV4DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IDI3YTkwYjku
LjEwZDExMjggMTAwNjQ0DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IC0tLSBhL2FyY2gvcG93
ZXJwYy9rZXJuZWwvc3lzZnMuYw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArKysgYi9hcmNo
L3Bvd2VycGMva2VybmVsL3N5c2ZzLmMNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gQEAgLTg1
LDYgKzg1LDI4NCBAQCBfX3NldHVwKCJzbXQtc25vb3plLWRlbGF5PSIsDQo+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gPiBzZXR1cF9zbXRfc25vb3plX2RlbGF5KTsNCj4gPiA+ID4gPiA+ID4gPiA+ID4g
PiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gICNlbmRpZiAvKiBDT05GSUdfUFBDNjQg
Ki8NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4g
KyNpZmRlZiBDT05GSUdfRlNMX1NPQw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArI2RlZmlu
ZSBNQVhfQklUCQkJCTYzDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gK3N0YXRpYyB1NjQgcHcyMF93dDsgc3RhdGljIHU2NCBhbHRpdmVjX2lk
bGVfd3Q7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+ID4g
PiA+ID4gK3N0YXRpYyB1bnNpZ25lZCBpbnQgZ2V0X2lkbGVfdGlja3NfYml0KHU2NCBucykgew0K
PiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCXU2NCBjeWNsZTsNCj4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gKw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCWlmIChucyA+PSAxMDAwMCkN
Cj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwkJY3ljbGUgPSBkaXZfdTY0KG5zICsgNTAwLCAx
MDAwKSAqDQo+ID4gPiA+ID4gPiB0Yl90aWNrc19wZXJfdXNlYzsNCj4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gKwllbHNlDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJCWN5Y2xlID0gZGl2
X3U2NChucyAqDQo+IHRiX3RpY2tzX3Blcl91c2VjLA0KPiA+ID4gPiAxMDAwKTsNCj4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ID4gKw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCWlmICghY3lj
bGUpDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJCXJldHVybiAwOw0KPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gPiArDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJcmV0dXJuIGlsb2cy
KGN5Y2xlKTsgfQ0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArDQo+ID4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ICtzdGF0aWMgdm9pZCBkb19zaG93X3B3cm1ndGNyMCh2b2lkICp2YWwpIHsNCj4g
PiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwl1MzIgKnZhbHVlID0gdmFsOw0KPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gPiArDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJKnZhbHVlID0gbWZz
cHIoU1BSTl9QV1JNR1RDUjApOyB9DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gK3N0YXRpYyBzc2l6ZV90IHNob3dfcHcyMF9zdGF0ZShzdHJ1
Y3QgZGV2aWNlDQo+ICpkZXYsDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJCQkJc3RydWN0
IGRldmljZV9hdHRyaWJ1dGUNCj4gKmF0dHIsDQo+ID4gPiA+IGNoYXINCj4gPiA+ID4gPiA+ICpi
dWYpIHsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwl1MzIgdmFsdWU7DQo+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ICsJdW5zaWduZWQgaW50IGNwdSA9IGRldi0+aWQ7DQo+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwlzbXBfY2FsbF9m
dW5jdGlvbl9zaW5nbGUoY3B1LA0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArZG9fc2hvd19w
d3JtZ3RjcjAsICZ2YWx1ZSwgMSk7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwl2YWx1ZSAmPSBQV1JNR1RDUjBfUFcyMF9XQUlUOw0KPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJcmV0
dXJuIHNwcmludGYoYnVmLCAiJXVcbiIsIHZhbHVlID8gMSA6DQo+IDApOyB9DQo+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gK3N0YXRpYyB2b2lk
IGRvX3N0b3JlX3B3MjBfc3RhdGUodm9pZCAqdmFsKSB7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4g
PiA+ICsJdTMyICp2YWx1ZSA9IHZhbDsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwl1MzIg
cHcyMF9zdGF0ZTsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKw0KPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gPiArCXB3MjBfc3RhdGUgPSBtZnNwcihTUFJOX1BXUk1HVENSMCk7DQo+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwlpZiAoKnZh
bHVlKQ0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCQlwdzIwX3N0YXRlIHw9IFBXUk1HVENS
MF9QVzIwX1dBSVQ7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJZWxzZQ0KPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gPiArCQlwdzIwX3N0YXRlICY9IH5QV1JNR1RDUjBfUFcyMF9XQUlUOw0K
PiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJ
bXRzcHIoU1BSTl9QV1JNR1RDUjAsIHB3MjBfc3RhdGUpOyB9DQo+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gK3N0YXRpYyBzc2l6ZV90IHN0b3Jl
X3B3MjBfc3RhdGUoc3RydWN0IGRldmljZQ0KPiAqZGV2LA0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+
ID4gPiArCQkJCXN0cnVjdCBkZXZpY2VfYXR0cmlidXRlDQo+ICphdHRyLA0KPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gPiArCQkJCWNvbnN0IGNoYXIgKmJ1Ziwgc2l6ZV90DQo+IGNvdW50KQ0KPiA+
ID4gPiB7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJdTMyIHZhbHVlOw0KPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gPiArCXVuc2lnbmVkIGludCBjcHUgPSBkZXYtPmlkOw0KPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gPiArDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJaWYgKGtzdHJ0
b3UzMihidWYsIDAsICZ2YWx1ZSkpDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJCXJldHVy
biAtRUlOVkFMOw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArDQo+ID4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ICsJaWYgKHZhbHVlID4gMSkNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwkJ
cmV0dXJuIC1FSU5WQUw7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gKwlzbXBfY2FsbF9mdW5jdGlvbl9zaW5nbGUoY3B1LA0KPiA+ID4gPiA+
ID4gPiA+ID4gPiA+ID4gPiArZG9fc3RvcmVfcHcyMF9zdGF0ZSwgJnZhbHVlLCAxKTsNCj4gPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gKw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCXJldHVy
biBjb3VudDsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gK30NCj4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gKw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArc3RhdGljIHNzaXplX3Qgc2hv
d19wdzIwX3dhaXRfdGltZShzdHJ1Y3QNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gK2Rldmlj
ZQ0KPiA+ID4gKmRldiwNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwkJCQlzdHJ1Y3QgZGV2
aWNlX2F0dHJpYnV0ZQ0KPiAqYXR0ciwNCj4gPiA+ID4gY2hhcg0KPiA+ID4gPiA+ID4gKmJ1Zikg
ew0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCXUzMiB2YWx1ZTsNCj4gPiA+ID4gPiA+ID4g
PiA+ID4gPiA+ID4gKwl1NjQgdGJfY3ljbGU7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsJ
czY0IHRpbWU7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gKwl1bnNpZ25lZCBpbnQgY3B1ID0gZGV2LT5pZDsNCj4gPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4gKw0KPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCWlmICghcHcyMF93dCkgew0K
PiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPiArCQlzbXBfY2FsbF9mdW5jdGlvbl9zaW5nbGUoY3B1
LA0KPiA+ID4gPiBkb19zaG93X3B3cm1ndGNyMCwNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4g
KyZ2YWx1ZSwNCj4gPiA+ID4gPiA+ID4gPiA+ID4gMSk7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4g
PiA+ICsJCXZhbHVlID0gKHZhbHVlICYNCj4gUFdSTUdUQ1IwX1BXMjBfRU5UKSA+Pg0KPiA+ID4g
PiA+ID4gPiA+ID4gPiA+ID4gPiArDQo+IAlQV1JNR1RDUjBfUFcyMF9FTlRfU0hJRlQ7DQo+ID4g
PiA+ID4gPiA+ID4gPiA+ID4gPiA+ICsNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+ID4gKwkJdGJf
Y3ljbGUgPSAoMSA8PCAoTUFYX0JJVCAtIHZhbHVlKSkgKg0KPiAyOw0KPiA+ID4gPiA+ID4gPiA+
ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IElzIHZhbHVlID0gMCBhbmQgdmFsdWUg
PSAxIGxlZ2FsPyBUaGVzZSB3aWxsIG1ha2UNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+IHRiX2N5
Y2xlID0gMCwNCj4gPiA+ID4gPiA+ID4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4g
PiA+ICsJCXRpbWUgPSBkaXZfdTY0KHRiX2N5Y2xlICogMTAwMCwNCj4gPiA+ID4gdGJfdGlja3Nf
cGVyX3VzZWMpDQo+ID4gPiA+ID4gPiAtIDE7DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gPg0KPiA+
ID4gPiA+ID4gPiA+ID4gPiA+ID4gQW5kIHRpbWUgPSAtMTsNCj4gPiA+ID4gPiA+ID4gPiA+ID4g
PiA+DQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4gUGxlYXNlIGxvb2sgYXQgdGhlIGVuZCBvZiB0aGUg
ZnVuY3Rpb24sIDopDQo+ID4gPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiA+ID4g
PiAicmV0dXJuIHNwcmludGYoYnVmLCAiJWxsdVxuIiwgdGltZSA+IDAgPyB0aW1lIDogMCk7Ig0K
PiA+ID4gPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiA+ID4gPiBJIGtub3cgeW91IHJldHVy
biAwIGlmIHZhbHVlID0gMC8xLCBteSBxdWVzdGlvbiB3YXMNCj4gPiA+ID4gPiA+ID4gPiA+ID4g
dGhhdCwgaXMgdGhpcyBjb3JyZWN0IGFzIHBlciBzcGVjaWZpY2F0aW9uPw0KPiA+ID4gPiA+ID4g
PiA+ID4gPg0KPiA+ID4gPiA+ID4gPiA+ID4gPiBBaGgsIGFsc28gZm9yICJ2YWx1ZSIgdXB0byA3
IHlvdSB3aWxsIHJldHVybiAwLCBubz8NCj4gPiA+ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+
ID4gPiA+IElmIHZhbHVlID0gMCwgTUFYX0JJVCAtIHZhbHVlID0gNjMgdGJfY3ljbGUgPQ0KPiA+
ID4gPiA+ID4gPiA+ID4gMHhmZmZmZmZmZl9mZmZmZmZmZiwgdGJfY3ljbGUgKiAxMDAwIHdpbGwg
b3ZlcmZsb3csIGJ1dA0KPiA+ID4gPiA+ID4gPiA+ID4gdGhpcw0KPiA+ID4gPiBzaXR1YXRpb24g
aXMgbm90IHBvc3NpYmxlLg0KPiA+ID4gPiA+ID4gPiA+ID4gQmVjYXVzZSBpZiB0aGUgInZhbHVl
ID0gMCIgbWVhbnMgdGhpcyBmZWF0dXJlIHdpbGwgYmUNCj4gPiA+ICJkaXNhYmxlIi4NCj4gPiA+
ID4gPiA+ID4gPiA+IE5vdyBUaGUgZGVmYXVsdCB3YWl0IGJpdCBpcyA1MChNQVhfQklUIC0gdmFs
dWUsIHZhbHVlID0NCj4gPiA+ID4gPiA+ID4gPiA+IDEzKSwgdGhlIFBXMjAvQWx0aXZlYyBJZGxl
IHdhaXQgZW50cnkgdGltZSBpcyBhYm91dCAxbXMsDQo+ID4gPiA+ID4gPiA+ID4gPiB0aGlzIHRp
bWUgaXMgdmVyeSBsb25nIGZvciB3YWl0IGlkbGUgdGltZSwgYW5kIGl0J3MNCj4gPiA+ID4gPiA+
ID4gPiA+IGNhbm5vdCBiZSBpbmNyZWFzZWQobWVhbnMgKE1BWF9CSVQNCj4gPiA+ID4gPiA+ID4g
PiA+IC0gdmFsdWUpDQo+ID4gPiA+ID4gPiA+ID4gY2Fubm90IGdyZWF0ZXIgdGhhbiA1MCkuDQo+
ID4gPiA+ID4gPiA+ID4NCj4gPiA+ID4gPiA+ID4gPiBXaGF0IHlvdSBzYWlkIGlzIG5vdCBvYnZp
b3VzIGZyb20gY29kZSBhbmQgc28gYXQgbGVhc3QNCj4gPiA+ID4gPiA+ID4gPiB3cml0ZSBhIGNv
bW1lbnQgdGhhdCB2YWx1ZSB3aWxsIGJlIGFsd2F5cyA+PSAxMyBvciB2YWx1ZQ0KPiA+ID4gPiA+
ID4gPiA+IHdpbGwgbmV2ZXIgYmUgbGVzcyB0aGFuIDwgOCBhbmQgYmVsb3cgY2FsY3VsYXRpb24g
d2lsbCBub3QNCj4gPiA+ID4gPiA+ID4gPiBvdmVyZmxvdy4gbWF5IGJlIGVycm9yIG91dCBpZiB2
YWx1ZSBpcyBsZXNzIHRoYW4gOC4NCj4gPiA+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gPiBUaGUg
InZhbHVlIiBsZXNzIHRoYW4gMTAsIHRoaXMgd2lsbCBvdmVyZmxvdy4NCj4gPiA+ID4gPiA+ID4g
VGhlcmUgaXMgbm90IGVycm9yLCBUaGUgY29kZSBJIGtuZXcgaXQgY291bGQgbm90IGJlIGxlc3MN
Cj4gPiA+ID4gPiA+ID4gdGhhbiAxMCwgdGhhdCdzIHdoeSBJIHVzZSB0aGUgZm9sbG93aW5nIGNv
ZGUuIDopDQo+ID4gPiA+ID4gPg0KPiA+ID4gPiA+ID4gSSBhbSBzb3JyeSB0byBwZXJzaXN0IGJ1
dCB0aGlzIGlzIG5vdCBhYm91dCB3aGF0IHlvdSBrbm93LA0KPiA+ID4gPiA+ID4gdGhpcyBpcyBh
Ym91dCBob3cgY29kZSBpcyByZWFkIGFuZCBjb2RlIGRvZXMgbm90IHNheSB3aGF0IHlvdQ0KPiA+
ID4gPiA+ID4ga25vdywgc28gYWRkIGEgY29tbWVudCBhdCBsZWFzdCBhbmQgZXJyb3Igb3V0L3dh
cm4gd2hlbg0KPiA+ID4gPiA+ID4gInZhbHVlIiBpcyBsZXNzIHRoYW4gYQ0KPiA+ID4gPiBjZXJ0
YWluIG51bWJlci4NCj4gPiA+ID4gPiA+DQo+ID4gPiA+ID4gU29ycnkgZm9yIHRoZSBsYXRlIHRv
IHJlc3BvbnNlIHRoZSBtYWlsLiBJZiBpdCBjYXVzZWQgY29uZnVzaW9uLA0KPiA+ID4gPiA+IHdl
IGNhbg0KPiA+ID4gPiBhZGQgYSBjb21tZW50Lg0KPiA+ID4gPiA+DQo+ID4gPiA+ID4gSG93IGFi
b3V0IHRoZSBmb2xsb3dpbmcgY29tbWVudD8NCj4gPiA+ID4gPiAvKg0KPiA+ID4gPiA+ICAqIElm
IHRoZSAidmFsdWUiIGxlc3MgdGhhbiAxMCwgdGhpcyB3aWxsIG92ZXJmbG93Lg0KPiA+ID4gPiA+
ICAqIEZyb20gYmVuY2htYXJrIHRlc3QsIHRoZSBkZWZhdWx0IHdhaXQgYml0IHdpbGwgbm90IGJl
IHNldA0KPiA+ID4gPiA+IGxlc3MgdGhhbg0KPiA+ID4gPiAxMGJpdC4NCj4gPiA+ID4gPiAgKiBC
ZWNhdXNlIDEwIGJpdCBjb3JyZXNwb25kcyB0byB0aGUgd2FpdCBlbnRyeSB0aW1lIGlzDQo+ID4g
PiA+ID4gNDM5Mzc1NTczNDAxOTk5NjA5KG5zKSwNCj4gPiA+ID4gPiAgKiBmb3Igd2FpdC1lbnRy
eS1pZGxlIHRpbWUgdGhpcyB2YWx1ZSBsb29rcyB0b28gbG9uZywgYW5kIHdlDQo+ID4gPiA+ID4g
Y2Fubm90IHVzZSB0aG9zZQ0KPiA+ID4gPiA+ICAqICJsb25nIiB0aW1lIGFzIGEgZGVmYXVsdCB3
YWl0LWVudHJ5IHRpbWUuIFNvIG92ZXJmbG93IGNvdWxkDQo+ID4gPiA+ID4gbm90IGhhdmUgaGFw
cGVuZWQNCj4gPiA+ID4gPiAgKiBhbmQgd2UgdXNlIHRoaXMgY2FsY3VsYXRpb24gbWV0aG9kIHRv
IGdldCB3YWl0LWVudHJ5LWlkbGUgdGltZS4NCj4gPiA+ID4gPiAgKi8NCj4gPiA+ID4NCj4gPiA+
ID4gSWYgdGhlcmUncyB0byBiZSBhIGxpbWl0IG9uIHRoZSB0aW1lcyB3ZSBhY2NlcHQsIG1ha2Ug
aXQgZXhwbGljaXQuDQo+ID4gPiA+IENoZWNrIGZvciBpdCBiZWZvcmUgZG9pbmcgYW55IGNvbnZl
cnNpb25zLCBhbmQgcmV0dXJuIGFuIGVycm9yIGlmDQo+ID4gPiA+IHVzZXJzcGFjZSB0cmllcyB0
byBzZXQgaXQuDQo+ID4gPiA+DQo+ID4gPiBUaGUgYnJhbmNoIG9ubHkgdXNlIHRvIHJlYWQgZGVm
YXVsdCB3YWl0LWVudHJ5LXRpbWUuDQo+ID4gPiBXZSBoYXZlIG5vIGxpbWl0IHRoZSB1c2VyJ3Mg
aW5wdXQsIGFuZCB3ZSBjYW4ndCByZXN0cmljdC4gT25jZSB0aGUNCj4gPiA+IHVzZXIgc2V0IHRo
ZSB3YWl0LWVudHJ5LXRpbWUsIHRoZSBjb2RlIHdpbGwgZG8gYW5vdGhlciBicmFuY2guDQo+ID4g
Pg0KPiA+DQo+ID4gSGkgc2NvdHQsDQo+ID4gRG8geW91IGhhdmUgYW55IGNvbW1lbnRzIGFib3V0
IHRoaXMgcGF0Y2g/DQo+ID4gSSB3aWxsIGFkZCB0aGUgY29tbWVudCBhbmQgc2VuZCB0aGlzIHBh
dGNoIGFnYWluLg0KPiANCj4gV2hhdCBkbyB5b3UgbWVhbiBieSAiYW5kIHdlIGNhbid0IHJlc3Ry
aWN0Ij8gIFdoeSBub3Q/DQo+IA0KPiBXaHkgaXMgaXQgb25seSB1c2VkIHRvIHJlYWQgdGhlIGRl
ZmF1bHQsIGFuZCBub3QgdGhlIGN1cnJlbnQgdmFsdWU/DQo+IA0KV2UgYWxyZWFkeSBoYXZlIGEg
dmFyaWFibGUgd2hpY2ggdmFsdWUgaXMgc2V0IGJ5IHRoZSB1c2VyLCBhcyB3ZSBoYXZlIGRpc2N1
c3NlZCBiZWZvcmUuDQoNCldoZW4gdGhlIHN5c3RlbSBib290LXVwLiBCZWZvcmUgdXNlciBzZXQg
dGhlIHdhaXQtZW50cnktdGltZSwgd2UgbmVlZCB0byByZXR1cm4gYSBkZWZhdWx0IHdhaXQtZW50
cnktdGltZSwgaWYgdGhlIHVzZXIgcmVhZCB0aGlzIHN5cy1pbnRlcmZhY2UuIFRoZSBkZWZhdWx0
IHdhaXQtZW50cnktdGltZSBpcyBjb252ZXJ0ZWQgYnkgd2FpdC1iaXQuDQoNCk9uY2UgdGhlIHVz
ZXIgc2V0IHRoZSBzeXMtaW50ZXJmYWNlLCBhIHZhcmlhYmxlIHdpbGwgYmUgdXNlZCB0byBzYXZl
IGl0LiBBbmQgd2hlbiB0aGUgdXNlciByZWFkIHN5cy1pbnRlcmZhY2Ugd2Ugd2lsbCByZXR1cm4g
YmFjayB0aGUgdmFyaWFibGUuDQoNCi1kb25nc2hlbmcNCg0K

^ permalink raw reply

* RE: [PATCHv2 1/8] ALSA: Add SAI SoC Digital Audio Interface driver.
From: Li Xiubo @ 2013-11-05  3:21 UTC (permalink / raw)
  To: Mark Brown
  Cc: mark.rutland@arm.com, alsa-devel@alsa-project.org,
	linux-doc@vger.kernel.org, tiwai@suse.de, Huan Wang,
	timur@tabi.org, perex@perex.cz, Shawn Guo, LW@KARO-electronics.de,
	linux@arm.linux.org.uk, Guangyu Chen,
	linux-arm-kernel@lists.infradead.org, grant.likely@linaro.org,
	devicetree@vger.kernel.org, ian.campbell@citrix.com,
	pawel.moll@arm.com, swarren@wwwdotorg.org,
	rob.herring@calxeda.com, oskar@scara.com, Fabio Estevam,
	lgirdwood@gmail.com, linux-kernel@vger.kernel.org,
	rob@landley.net, Zhengxiong Jin, shawn.guo@linaro.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20131104161537.GI2493@sirena.org.uk>

> > From the ASoC subsystem comments we can see that:
> > ++++++
> > Configures the clock dividers. This is used to derive the best DAI bit
> > and frame clocks from the system or master clock. It's best to set the
> > DAI bit and frame clocks as low as possible to save system power.
> > ------
>=20
> You should never use this unless you have to, there is no point in every
> single machine driver using your driver having to duplicate the same
> calculations.
>=20

Okey, I'll think it over, if not have to, I will revise this.

> >
> > > > +static int fsl_sai_dai_probe(struct snd_soc_dai *dai) {
> > > > +	int ret;
> > > > +	struct fsl_sai *sai =3D dev_get_drvdata(dai->dev);
> > > > +
> > > > +	ret =3D clk_prepare_enable(sai->clk);
> > > > +	if (ret)
> > > > +		return ret;
>=20
> > > It'd be nicer to only enable the clock while the device is in active
> use.
>=20
> > While if the module clock is not enabled here, the followed registers
> cannot read/write in the same function.
> > And this _probe function is the _dai_probe not the driver's module
> _probe.
>=20
> So you can enable the clock when you explicitly need to write to the
> registers...
>=20
> > If the clk_prepare_enable(sai->clk) is not here, where should it be
> will be nicer ?
> > One of the following functions ?
> >         .set_sysclk     =3D fsl_sai_set_dai_sysclk,
> >         .set_clkdiv     =3D fsl_sai_set_dai_clkdiv,
> >         .set_fmt        =3D fsl_sai_set_dai_fmt,
> >         .set_tdm_slot   =3D fsl_sai_set_dai_tdm_slot,
> >         .hw_params      =3D fsl_sai_hw_params,
> >         .trigger        =3D fsl_sai_trigger,
>=20
> It could be in any or all of them except trigger (where the core should
> hold a runtime PM reference anyway).  You can always take a reference for
> the duration of the function if you're concerned it may be called when
> the referent isn't otherwise held.
>=20

While in this _sai_dai_probe function just followed the clock enable senten=
ce, there are some register writing operations:
The PATCH:
+++++++++++++++++++++++
+static int fsl_sai_dai_probe(struct snd_soc_dai *dai) {
+	int ret;
+	struct fsl_sai *sai =3D dev_get_drvdata(dai->dev);
+
+	ret =3D clk_prepare_enable(sai->clk);                     =3D=3D=3D=3D=3D=
> clock enable here
+	if (ret)
+		return ret;
+
+	writel(0x0, sai->base + FSL_SAI_RCSR);			  =3D=3D=3D=3D=3D=3D>registers w=
riting here.
+	writel(0x0, sai->base + FSL_SAI_TCSR);			  =3D=3D=3D=3D=3D=3D> and here
+	writel(sai->dma_params_tx.maxburst, sai->base + FSL_SAI_TCR1); =3D=3D=3D=
=3D=3D=3D=3D>and here
+	writel(sai->dma_params_rx.maxburst, sai->base + FSL_SAI_RCR1); =3D=3D=3D=
=3D=3D=3D=3D>and here
+
+	dai->playback_dma_data =3D &sai->dma_params_tx;
+	dai->capture_dma_data =3D &sai->dma_params_rx;
+
+	snd_soc_dai_set_drvdata(dai, sai);
+
+	return 0;
+}
-------------------------

As your opinions, should I move the four register writing operations to .se=
t_sysclk/set_clkdiv/... functions too ?
Or just add a clk_disable_unprepare() after them here, and then add clk_pre=
pare_enable in one of .set_sysclk/set_clkdiv/...?

And the first two of this four registers must be initialize as early as pos=
sible, and if move them to one of the .set_sysclk/set_clkdiv/... functions,
How can I very ensure which one is the first to be called ?
Won't the calling sequence be changed in the feature ?

>From the debug logs, we can see that:
1, _sai_probe
This is called when the machine brings up and has one SAI device.

2, _sai_dai_probe
3, .set_sysclk
4, .set_fmt
Are called in order when the machine has Audio driver and is enabled, and a=
lso while the machine brings up.

The above four steps only be called one time in order.

When aplay/arecord is runs the following will be called in order:
5, .set_tdm_slot
6, .hw_param
7, .trigger -->begain=20
8, .trigger --> end

The 2,3,4 are always called almost the same time, and they are all have reg=
ister read/write operations.
Now the clk_prepare_enable() is in step 2, and it won't be any different mo=
ving to step 3 or 4.

So, only could move it to step 5 or 6, if so every time the aplay/arecord r=
uns, clk_prepare_enable() will be
called, and there has no chance to call clk_disable_unprepare().

Now from the code we can see that I have add clk_prepare_enable() in _sai_d=
ai_probe() and clk_disable_unprepare() in _sai_dai_remove().
Isn't this okey ?



> > > > +	ret =3D snd_dmaengine_pcm_register(&pdev->dev, NULL,
> > > > +			SND_DMAENGINE_PCM_FLAG_NO_RESIDUE);
> > > > +	if (ret)
> > > > +		return ret;
>=20
> > > We should have a devm_ version of this.
>=20
> > Sorry, is there one patch for adding the devm_ version of
> snd_dmaengine_pcm_register() already ?
> > In the -next and other topics branches I could not find it.
>=20
> No, there isn't one but there should be one.
>
And if it has existed then I will use it.




BRs,
Xiubo

^ permalink raw reply

* RE: [PATCHv2 6/8] ASoC: fsl: add SGTL5000 based audio machine driver.
From: Li Xiubo @ 2013-11-05  3:26 UTC (permalink / raw)
  To: Oskar Schirmer
  Cc: mark.rutland@arm.com, alsa-devel@alsa-project.org,
	linux-doc@vger.kernel.org, tiwai@suse.de, Huan Wang,
	timur@tabi.org, perex@perex.cz, Shawn Guo, LW@KARO-electronics.de,
	linux@arm.linux.org.uk, Guangyu Chen, grant.likely@linaro.org,
	devicetree@vger.kernel.org, ian.campbell@citrix.com,
	pawel.moll@arm.com, swarren@wwwdotorg.org,
	rob.herring@calxeda.com, broonie@kernel.org,
	linux-arm-kernel@lists.infradead.org, Fabio Estevam,
	lgirdwood@gmail.com, linux-kernel@vger.kernel.org,
	rob@landley.net, Zhengxiong Jin, shawn.guo@linaro.org,
	linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20131101101750.GB16492@curry>


=3D
> [...]
> > diff --git a/sound/soc/fsl/fsl-sgtl5000-vf610.c b/sound/soc/fsl/fsl-
> sgtl5000-vf610.c
> > new file mode 100644
> > index 0000000..f535b42
> > --- /dev/null
> > +++ b/sound/soc/fsl/fsl-sgtl5000-vf610.c
> > @@ -0,0 +1,208 @@
> > +/*
> > + * Freeacale ALSA SoC Audio using SGT1500 as codec.
>           ^                             ^   ^
>      "Freescale"                    "SGTL5000"
>=20
Yes. See the next version.

^ permalink raw reply

* RE: [PATCHv2 6/8] ASoC: fsl: add SGTL5000 based audio machine driver.
From: Li Xiubo @ 2013-11-05  3:50 UTC (permalink / raw)
  To: Guangyu Chen, shawn.guo@linaro.org
  Cc: mark.rutland@arm.com, alsa-devel@alsa-project.org,
	linux-doc@vger.kernel.org, tiwai@suse.de, Huan Wang,
	timur@tabi.org, linux-kernel@vger.kernel.org, Shawn Guo,
	LW@KARO-electronics.de, linux@arm.linux.org.uk,
	linux-arm-kernel@lists.infradead.org, grant.likely@linaro.org,
	devicetree@vger.kernel.org, ian.campbell@citrix.com,
	pawel.moll@arm.com, swarren@wwwdotorg.org,
	rob.herring@calxeda.com, broonie@kernel.org, perex@perex.cz,
	oskar@scara.com, Fabio Estevam, lgirdwood@gmail.com,
	rob@landley.net, Zhengxiong Jin, linuxppc-dev@lists.ozlabs.org
In-Reply-To: <20131101102805.GG28401@MrMyself>

Hi Nicolin,


> > This is the SGTL5000 codec based audio driver supported with both
> > playback and capture dai link implemention.
> >
> > This implementation is only compatible with device tree definition.
> >
> > Signed-off-by: Alison Wang <b18965@freescale.com
> > Signed-off-by: Xiubo Li <Li.Xiubo@freescale.com>
> >
> > Conflicts:
> > 	sound/soc/fsl/Makefile
> > ---
> >  sound/soc/fsl/Kconfig              |  10 ++
> >  sound/soc/fsl/Makefile             |   2 +
>=20
> >  sound/soc/fsl/fsl-sgtl5000-vf610.c | 208
> > +++++++++++++++++++++++++++++++++++++
>=20
> I just doubt if this file naming is appropriate. Even if we might not
> have rigor rule for the file names, according to existing ones, they are
> all in a same pattern: [SoC name]-[codec name].c
>=20
> "imx-sgtl5000.c" for example
>=20
> I think it would make user less confused about what this file exactly is
> if this machine driver also follow the pattern: vf610-sgtl5000.c
>=20

Yes, it looks nicer.

>=20
> @Shawn
>=20
> What do you think about the file name?
>=20
> >  3 files changed, 220 insertions(+)
> >  create mode 100644 sound/soc/fsl/fsl-sgtl5000-vf610.c
> >
> > diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig index
> > 9a8851e..1b835ba 100644
> > --- a/sound/soc/fsl/Kconfig
> > +++ b/sound/soc/fsl/Kconfig
> > @@ -228,4 +228,14 @@ config SND_SOC_FSL_SAI
> >  	tristate
> >  	select SND_SOC_GENERIC_DMAENGINE_PCM
> >
> > +config SND_SOC_FSL_SGTL5000_VF610
>=20
> Same problem with the this define.
>=20
> > +	tristate "SoC Audio support for FSL boards with sgtl5000"
>=20
> And 'FSL' here confuses me a lot. Because those boards based on i.MX
> series also could be called FSL boards.
>=20

Yes, this should be VF610.

> > +	depends on OF && I2C
> > +	select SND_SOC_FSL_SAI
> > +	select SND_SOC_FSL_PCM
> > +	select SND_SOC_SGTL5000
> > +	help
> > +	  Say Y if you want to add support for SoC audio on an FSL board
> with
> > +	  a sgtl5000 codec.
> > +
> >  endif # SND_FSL_SOC
> > diff --git a/sound/soc/fsl/Makefile b/sound/soc/fsl/Makefile index
> > e5acc03..26fc551 100644
> > --- a/sound/soc/fsl/Makefile
> > +++ b/sound/soc/fsl/Makefile
> > @@ -59,5 +59,7 @@ obj-$(CONFIG_SND_SOC_IMX_MC13783) +=3D
> > snd-soc-imx-mc13783.o
> >
> >  # FSL ARM SAI/SGT15000 Platform Support  snd-soc-fsl-sai-objs :=3D
> > fsl-sai.o
> > +snd-soc-fsl-sgtl5000-vf610-objs :=3D fsl-sgtl5000-vf610.o
> >
> >  obj-$(CONFIG_SND_SOC_FSL_SAI) +=3D snd-soc-fsl-sai.o
> > +obj-$(CONFIG_SND_SOC_FSL_SGTL5000_VF610) +=3D
> > +snd-soc-fsl-sgtl5000-vf610.o
> > diff --git a/sound/soc/fsl/fsl-sgtl5000-vf610.c
> > b/sound/soc/fsl/fsl-sgtl5000-vf610.c
> > new file mode 100644
> > index 0000000..f535b42
> > --- /dev/null
> > +++ b/sound/soc/fsl/fsl-sgtl5000-vf610.c
> > @@ -0,0 +1,208 @@
> > +/*
> > + * Freeacale ALSA SoC Audio using SGT1500 as codec.
> > + *
> > + * Copyright 2012-2013 Freescale Semiconductor, Inc.
> > + *
> > + * The code contained herein is licensed under the GNU General Public
> > + * License. You may obtain a copy of the GNU General Public License
> > + * Version 2 or later at the following locations:
> > + *
> > + */
> > +
> > +#include <linux/module.h>
> > +#include <linux/of.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/i2c.h>
> > +#include <linux/clk.h>
> > +
> > +#include "../codecs/sgtl5000.h"
> > +#include "fsl-sai.h"
> > +
> > +static unsigned int sysclk_rate;
> > +
> > +static int fsl_sgtl5000_dai_init(struct snd_soc_pcm_runtime *rtd)
>=20
> Naming issue here again.
>=20
> At least from my point of view, if you actually merged imx-sgtl5000 with
> vf610-sgtl5000 and made it also compatible to other freescale SoCs, you
> could then fairly call it fsl_sgtl5000_xxxx.
>=20
> Well, I might be a little picky here because it's a static function and
> won't conflict others. Just the name here doesn't look so explicit to me.
>=20
> Please reconsider about this whole file's naming.
>=20

Yes, I still not very sure the names of the functions and files, from your =
replies,
I have got many information about the rules and others, I'll think it over =
and do some
research, Please see the next version.



Best regards,
Xiubo

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox