LinuxPPC-Dev Archive on lore.kernel.org

* [PATCH] powerpc: rheap allocator should honour alignment parameter
From: Marcelo Tosatti @ 2006-01-31 22:22 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: Pantelis Antoniou, linux-ppc-embedded


From: Pantelis Antoniou <pantelis@embeddedalley.com>

Honour alignment parameter in the rheap allocator.

Signed-off-by: Pantelis Antoniou <pantelis@embeddedalley.com>

diff --git a/arch/ppc/lib/rheap.c b/arch/ppc/lib/rheap.c
--- a/arch/ppc/lib/rheap.c
+++ b/arch/ppc/lib/rheap.c
@@ -425,17 +425,21 @@ void *rh_detach_region(rh_info_t * info,
 	return (void *)s;
 }
 
-void *rh_alloc(rh_info_t * info, int size, const char *owner)
+void *rh_alloc_align(rh_info_t * info, int size, int alignment, const char *owner)
 {
 	struct list_head *l;
 	rh_block_t *blk;
 	rh_block_t *newblk;
 	void *start;
 
-	/* Validate size */
-	if (size <= 0)
+	/* Validate size, and alignment must be power of two */
+	if (size <= 0 || (alignment & (alignment - 1)) != 0)
 		return ERR_PTR(-EINVAL);
 
+	/* given alignment larger than default rheap alignment */
+	if (alignment > info->alignment)
+		size += alignment - 1;
+
 	/* Align to configured alignment */
 	size = (size + (info->alignment - 1)) & ~(info->alignment - 1);
 
@@ -478,9 +482,21 @@ void *rh_alloc(rh_info_t * info, int siz
 
 	attach_taken_block(info, newblk);
 
+	/* for larger alignment return fixed up pointer  */
+	/* this is no problem with the deallocator since */
+	/* we scan for pointers that lie in the blocks   */
+	if (alignment > info->alignment)
+		start = (void *)(((unsigned long)start + alignment - 1) &
+				~(alignment - 1));
+
 	return start;
 }
 
+void *rh_alloc(rh_info_t * info, int size, const char *owner)
+{
+	return rh_alloc_align(info, size, info->alignment, owner);
+}
+
 /* allocate at precisely the given address */
 void *rh_alloc_fixed(rh_info_t * info, void *start, int size, const char *owner)
 {
diff --git a/arch/ppc/syslib/cpm2_common.c b/arch/ppc/syslib/cpm2_common.c
--- a/arch/ppc/syslib/cpm2_common.c
+++ b/arch/ppc/syslib/cpm2_common.c
@@ -148,8 +148,7 @@ uint cpm_dpalloc(uint size, uint align)
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpm_dpmem_lock, flags);
-	cpm_dpmem_info.alignment = align;
-	start = rh_alloc(&cpm_dpmem_info, size, "commproc");
+	start = rh_alloc_align(&cpm_dpmem_info, size, align, "commproc");
 	spin_unlock_irqrestore(&cpm_dpmem_lock, flags);
 
 	return (uint)start;
@@ -170,13 +169,12 @@ int cpm_dpfree(uint offset)
 EXPORT_SYMBOL(cpm_dpfree);
 
 /* not sure if this is ever needed */
-uint cpm_dpalloc_fixed(uint offset, uint size, uint align)
+uint cpm_dpalloc_fixed(uint offset, uint size)
 {
 	void *start;
 	unsigned long flags;
 
 	spin_lock_irqsave(&cpm_dpmem_lock, flags);
-	cpm_dpmem_info.alignment = align;
 	start = rh_alloc_fixed(&cpm_dpmem_info, (void *)offset, size, "commproc");
 	spin_unlock_irqrestore(&cpm_dpmem_lock, flags);
 
diff --git a/include/asm-ppc/commproc.h b/include/asm-ppc/commproc.h
--- a/include/asm-ppc/commproc.h
+++ b/include/asm-ppc/commproc.h
@@ -74,7 +74,7 @@ static inline long IS_DPERR(const uint o
 extern	cpm8xx_t	*cpmp;		/* Pointer to comm processor */
 extern uint cpm_dpalloc(uint size, uint align);
 extern int cpm_dpfree(uint offset);
-extern uint cpm_dpalloc_fixed(uint offset, uint size, uint align);
+extern uint cpm_dpalloc_fixed(uint offset, uint size);
 extern void cpm_dpdump(void);
 extern void *cpm_dpram_addr(uint offset);
 extern void cpm_setbrg(uint brg, uint rate);
diff --git a/include/asm-ppc/cpm2.h b/include/asm-ppc/cpm2.h
--- a/include/asm-ppc/cpm2.h
+++ b/include/asm-ppc/cpm2.h
@@ -112,7 +112,7 @@ extern		cpm_cpm2_t	*cpmp;	 /* Pointer to
 
 extern uint cpm_dpalloc(uint size, uint align);
 extern int cpm_dpfree(uint offset);
-extern uint cpm_dpalloc_fixed(uint offset, uint size, uint align);
+extern uint cpm_dpalloc_fixed(uint offset, uint size);
 extern void cpm_dpdump(void);
 extern void *cpm_dpram_addr(uint offset);
 extern void cpm_setbrg(uint brg, uint rate);
diff --git a/include/asm-ppc/rheap.h b/include/asm-ppc/rheap.h
--- a/include/asm-ppc/rheap.h
+++ b/include/asm-ppc/rheap.h
@@ -62,6 +62,10 @@ extern int rh_attach_region(rh_info_t * 
 /* Detach a free region */
 extern void *rh_detach_region(rh_info_t * info, void *start, int size);
 
+/* Allocate the given size from the remote heap (with alignment) */
+extern void *rh_alloc_align(rh_info_t * info, int size, int alignment,
+		const char *owner);
+
 /* Allocate the given size from the remote heap */
 extern void *rh_alloc(rh_info_t * info, int size, const char *owner);
 

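The over-allocate-then-round-up idea used by rh_alloc_align() can be
sketched in isolation. This is a minimal user-space illustration under
stated assumptions, not the rheap code itself: is_pow2(), align_up() and
padded_size() are hypothetical helper names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true when a is a non-zero power of two.
 * Note that the bare test (a & (a - 1)) == 0 also accepts a == 0. */
static int is_pow2(unsigned long a)
{
	return a != 0 && (a & (a - 1)) == 0;
}

/* Round addr up to the next multiple of a power-of-two alignment. */
static uintptr_t align_up(uintptr_t addr, unsigned long alignment)
{
	assert(is_pow2(alignment));
	return (addr + alignment - 1) & ~((uintptr_t)alignment - 1);
}

/* Size to request from the heap so that an aligned block of `size`
 * bytes still fits wherever the heap happens to place the block. */
static unsigned long padded_size(unsigned long size, unsigned long alignment,
				 unsigned long heap_alignment)
{
	assert(is_pow2(heap_alignment));
	if (alignment > heap_alignment)
		size += alignment - 1;	/* room for the pointer fix-up */
	/* Then round to the heap's own granularity, as before. */
	return (size + heap_alignment - 1) & ~(heap_alignment - 1);
}
```

With a heap alignment of 8, a 100-byte request aligned to 64 is padded to
168 bytes; the returned start address is then rounded up with align_up(),
which moves it forward by at most 63 bytes.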
^ permalink raw reply

* Re: [PATCH] powerpc: Fix Kernel FP unavail exception for BookE
From: Kumar Gala @ 2006-02-01  0:15 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev
In-Reply-To: <Pine.LNX.4.61.0601311746250.18350@cde-tx32-ldt329.sps.mot.com>


Acked-by: Kumar Gala <galak@kernel.crashing.org>

On Tue, 31 Jan 2006, Becky Bruce wrote:

> powerpc: Correct BookE FP unavailable exception
> 
> Updated FP unavailable exception to refer to the correct
> function in traps.c. head_booke.h was using the old name, KernelFP,
> instead of kernel_fp_unavailable_exception.
> 
> Signed-off-by: Becky Bruce <becky.bruce@freescale.com>
> 
> ---
> commit 6e481b074a1cbe44dd5ccc29fe74857986a41e14
> tree 5e3d59136176a927dae2f42ce83b393b17fc86e6
> parent 2a68349345a9bf292d06a8baaa8182b946c7056c
> author Becky Bruce <becky.bruce@freescale.com> Tue, 31 Jan 2006 17:41:00 -0600
> committer Becky Bruce <becky.bruce@freescale.com> Tue, 31 Jan 2006 17:41:00 -0600
> 
>  arch/powerpc/kernel/head_booke.h |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
> index 5827c27..8536e76 100644
> --- a/arch/powerpc/kernel/head_booke.h
> +++ b/arch/powerpc/kernel/head_booke.h
> @@ -358,6 +358,6 @@ label:
>  	NORMAL_EXCEPTION_PROLOG;					      \
>  	bne	load_up_fpu;		/* if from user, just load it up */   \
>  	addi	r3,r1,STACK_FRAME_OVERHEAD;				      \
> -	EXC_XFER_EE_LITE(0x800, KernelFP)
> +	EXC_XFER_EE_LITE(0x800, kernel_fp_unavailable_exception)
>  
>  #endif /* __HEAD_BOOKE_H__ */
> 

^ permalink raw reply

* Linux 2.4 and UARTLITE
From: S. Egbert @ 2006-02-01  8:11 UTC (permalink / raw)
  To: linuxppc-embedded

I'm using XUARTLITE with a PPC405 and a special core.  U-Boot hasn't
supported this combination before (AFAICT, U-Boot's XUARTLITE support
is used only with the MicroBlaze CPU).

So, I took an Oct. 10 2005 snapshot and integrated XUARTLITE into
U-Boot.  That works so far when the Xilinx bootloader (via ISOCM/DSOCM)
loads the Linux 2.4.20 SREC-formatted file directly from flash into DDR
and jumps to 0x100.  So it has been established that Linux XUARTLITE
output works when Xilinx-booting directly from the flash-based
Linux.SREC (that was my earlier announcement).

But flashing an 8MB Linux.SREC every time I tweak things isn't my cup
of tea.  It is not reasonable to wait 45 minutes for a serial download
to finish.  So I thought, why not boot it with U-Boot?

My next step is to put a uImage (Linux XUARTLITE) into flash using

    => tftpboot 30000000 uImage
    => bootm 30000000

bootm confirms that the uImage checksum is good and jumps to 0x100.

The calling sequence within Linux in question is:

 + _start  (linux/arch/ppc/kernel/head_4xx.S)
   + initial_mmu (also in head_4xx.S)
   + start_here (head_4xx.S)
     + machine_init (
     + MMU_init (arch/ppc/mm/init.c)
         --> XUARTLITE register goes phantom here <--
     + start_kernel (init/main.c)

Via the ppc_md.progress() debug output in linux/arch/ppc/mm/init.c, I
see various "MMU:" debug lines, and then it falls silent (no further
output).

   id mach(): done
   MMU:enter
   MMU:hw init
   MMU:mapin
   MMU:setio
   MMU:exit

so, it finished mmu_init() and platform_init() just fine.

Using Xilinx XDM, one can repeatedly poke the XUARTLITE transmit
register and evoke character output just fine while Linux is in IDLE
(scheduler) mode.  So the XUARTLITE hardware is still working at this
point.

I noticed that in Linux PPC, to get 'early debug output' using
CONFIG_SERIAL_TEXT_DEBUG option, PPC4xx TLB Slot 0 is dedicated to
mapping of serial hardware registers/memory region.  Somehow that TLB
remapping fails.

At the moment, I'm about to re-review the TLB slot 0 and figure out why
poking the XUARTLITE transmit register isn't working from memory-mapped
mode.

Any insight is greatly appreciated, particularly on how to read PPC405
TLB registers.

Steve

^ permalink raw reply

* opening the serial port from userspace application
From: bharathi kandimalla @ 2006-02-01  8:24 UTC (permalink / raw)
  To: linuxppc-embedded

Hi,

I am able to use the serial port UART drivers available in kernel 2.6.13
on a custom board with an MPC860T.

I am getting boot messages like this:

ttyCPM0 at MMIO 0xfb000a80 (irq = 20) is a CPM UART
ttyCPM1 at MMIO 0xfb000a90 (irq = 19) is a CPM UART
ttyCPM2 at MMIO 0xfb000a00 (irq = 46) is a CPM UART
ttyCPM3 at MMIO 0xfb000a20 (irq = 45) is a CPM UART

After getting a prompt:

cd /proc/dev/
cat drivers
/dev/tty             /dev/tty        5       0 system:/dev/tty
/dev/console         /dev/console    5       1 system:console
/dev/ptmx            /dev/ptmx       5       2 system
ttyCPM               /dev/ttyCPM   204 46-49 serial
pty_slave            /dev/pts      136 0-1048575 pty:slave
pty_master           /dev/ptm      128 0-1048575 pty:master

I am then using the mknod command:

mknod /dev/ttyCPM0 c 204 46
ln -sf console ttyCPM0
mknod /dev/ttyCPM1 c 204 47
mknod /dev/ttyCPM2 c 204 48
mknod /dev/ttyCPM3 c 204 49

Even so, I am not able to open the device from a user-space application.
I want to write some data from user space, but I am not able to open
the device.

Please help me out.

Regards,
Bharathi

^ permalink raw reply

* Re: Getting started with Xilinx V4 PPC? (David Summers)
From: Jaap de Jong @ 2006-02-01  8:14 UTC (permalink / raw)
  To: linuxppc-embedded

Hi David,

I am working on the same board as you are, using the MontaVista Linux 2.4
distribution and the ELDK 3.1.1 environment.
With some minor patches I've got it working with the soft Ethernet core,
the UARTLITE and the flash (although I'm still having some trouble with
the flash).
In the FPGA design you have to change some settings to get this working
too; the default will fail.
I can give you the details I've found if you like.

Good luck!

	Jaap de Jong

^ permalink raw reply

* [patch 06/44] generic __{, test_and_}{set, clear, change}_bit() and test_bit()
From: Akinobu Mita @ 2006-02-01  9:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: Akinobu Mita, linux-mips, dev-etrax, linux-ia64, ultralinux,
	Ian Molton, Hirokazu Takata, linuxsh-shmedia-dev, linuxppc-dev,
	Ivan Kokshaysky, linuxsh-dev, sparclinux, Chris Zankel,
	parisc-linux, Russell King, Richard Henderson
In-Reply-To: <20060201090224.536581000@localhost.localdomain>

This patch introduces the C-language equivalents of the functions below:

void __set_bit(int nr, volatile unsigned long *addr);
void __clear_bit(int nr, volatile unsigned long *addr);
void __change_bit(int nr, volatile unsigned long *addr);
int __test_and_set_bit(int nr, volatile unsigned long *addr);
int __test_and_clear_bit(int nr, volatile unsigned long *addr);
int __test_and_change_bit(int nr, volatile unsigned long *addr);
int test_bit(int nr, const volatile unsigned long *addr);

In include/asm-generic/bitops/non-atomic.h

This code largely copied from:
asm-powerpc/bitops.h

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
 include/asm-generic/bitops/non-atomic.h |  111 ++++++++++++++++++++++++++++++++
 1 files changed, 111 insertions(+)

Index: 2.6-git/include/asm-generic/bitops/non-atomic.h
===================================================================
--- /dev/null
+++ 2.6-git/include/asm-generic/bitops/non-atomic.h
@@ -0,0 +1,111 @@
+#ifndef _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
+#define _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
+
+#include <asm/types.h>
+
+#define BITOP_MASK(nr)		(1UL << ((nr) % BITS_PER_LONG))
+#define BITOP_WORD(nr)		((nr) / BITS_PER_LONG)
+
+/**
+ * __set_bit - Set a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * Unlike set_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
+ */
+static __inline__ void __set_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BITOP_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
+
+	*p  |= mask;
+}
+
+static __inline__ void __clear_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BITOP_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
+
+	*p &= ~mask;
+}
+
+/**
+ * __change_bit - Toggle a bit in memory
+ * @nr: the bit to change
+ * @addr: the address to start counting from
+ *
+ * Unlike change_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
+ */
+static __inline__ void __change_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BITOP_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
+
+	*p ^= mask;
+}
+
+/**
+ * __test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.  
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static __inline__ int __test_and_set_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BITOP_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old | mask;
+	return (old & mask) != 0;
+}
+
+/**
+ * __test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.  
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static __inline__ int __test_and_clear_bit(int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BITOP_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old & ~mask;
+	return (old & mask) != 0;
+}
+
+/* WARNING: non atomic and it can be reordered! */
+static __inline__ int __test_and_change_bit(int nr,
+					    volatile unsigned long *addr)
+{
+	unsigned long mask = BITOP_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old ^ mask;
+	return (old & mask) != 0;
+}
+
+/**
+ * test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+static __inline__ int test_bit(int nr, __const__ volatile unsigned long *addr)
+{
+	return 1UL & (addr[BITOP_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
+}
+
+#endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
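
How BITOP_WORD() and BITOP_MASK() split a bit number into a word index
and a mask within that word can be tried out in user space. The sketch
below assumes BITS_PER_LONG can be derived from sizeof(unsigned long);
the *_sketch function names are hypothetical and only mirror the shape
of the helpers in the patch.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BITOP_MASK(nr)	(1UL << ((nr) % BITS_PER_LONG))
#define BITOP_WORD(nr)	((nr) / BITS_PER_LONG)

/* Set bit nr in a bitmap made of unsigned longs (non-atomic). */
static void set_bit_sketch(unsigned int nr, unsigned long *map)
{
	map[BITOP_WORD(nr)] |= BITOP_MASK(nr);
}

/* Return 1 if bit nr is set, 0 otherwise. */
static int test_bit_sketch(unsigned int nr, const unsigned long *map)
{
	return (int)((map[BITOP_WORD(nr)] >> (nr % BITS_PER_LONG)) & 1UL);
}

/* Clear bit nr and report whether it was set before (non-atomic). */
static int test_and_clear_bit_sketch(unsigned int nr, unsigned long *map)
{
	unsigned long mask = BITOP_MASK(nr);
	unsigned long *p = map + BITOP_WORD(nr);
	unsigned long old = *p;

	*p = old & ~mask;
	return (old & mask) != 0;
}
```

On a 64-bit build, bit 70 lands in word 1 at bit position 6; on a 32-bit
build it lands in word 2 at the same position. Because the read and the
write are separate steps, concurrent callers need a lock, as the
kernel-doc comments in the patch point out.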

--

^ permalink raw reply

* [patch 10/44] generic fls64()
From: Akinobu Mita @ 2006-02-01  9:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mips, linux-ia64, Ian Molton, David Howells, linuxppc-dev,
	Greg Ungerer, sparclinux, Miles Bader, Linus Torvalds,
	Yoshinori Sato, Hirokazu Takata, linuxsh-shmedia-dev, linux-m68k,
	Ivan Kokshaysky, Richard Henderson, Akinobu Mita, Chris Zankel,
	dev-etrax, ultralinux, Andi Kleen, linuxsh-dev, linux390,
	Russell King, parisc-linux
In-Reply-To: <20060201090224.536581000@localhost.localdomain>

This patch introduces the C-language equivalent of the function:
int fls64(__u64 x);

In include/asm-generic/bitops/fls64.h

This code largely copied from:
include/linux/bitops.h

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
 include/asm-generic/bitops/fls64.h |   12 ++++++++++++
 1 files changed, 12 insertions(+)

Index: 2.6-git/include/asm-generic/bitops/fls64.h
===================================================================
--- /dev/null
+++ 2.6-git/include/asm-generic/bitops/fls64.h
@@ -0,0 +1,12 @@
+#ifndef _ASM_GENERIC_BITOPS_FLS64_H_
+#define _ASM_GENERIC_BITOPS_FLS64_H_
+
+static inline int fls64(__u64 x)
+{
+	__u32 h = x >> 32;
+	if (h)
+		return fls(h) + 32;
+	return fls(x);
+}
+
+#endif /* _ASM_GENERIC_BITOPS_FLS64_H_ */
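
The split into a high and a low 32-bit half can be checked in user
space. This sketch assumes a portable loop in place of the kernel's
fls(); fls_sketch() and fls64_sketch() are hypothetical names. The point
to note is that the high half h, not the truncated x, must be handed to
the 32-bit helper when h is non-zero; otherwise 1ULL << 32 would be
reported as 32 rather than 33.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for fls(): 1-based index of the highest set bit, 0 if x == 0. */
static int fls_sketch(uint32_t x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* 64-bit variant built from two 32-bit halves. */
static int fls64_sketch(uint64_t x)
{
	uint32_t h = (uint32_t)(x >> 32);

	if (h)
		return fls_sketch(h) + 32;
	return fls_sketch((uint32_t)x);
}
```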

--

^ permalink raw reply

* [patch 12/44] generic sched_find_first_bit()
From: Akinobu Mita @ 2006-02-01  9:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mips, linux-ia64, Ian Molton, David Howells, linuxppc-dev,
	Greg Ungerer, sparclinux, Miles Bader, Linus Torvalds,
	Yoshinori Sato, Hirokazu Takata, linuxsh-dev, linux-m68k,
	Akinobu Mita, Chris Zankel, dev-etrax, ultralinux, Andi Kleen,
	linuxsh-shmedia-dev, linux390, Russell King, parisc-linux
In-Reply-To: <20060201090224.536581000@localhost.localdomain>

This patch introduces the C-language equivalent of the function:
int sched_find_first_bit(const unsigned long *b);

In include/asm-generic/bitops/sched.h

This code largely copied from:
include/asm-powerpc/bitops.h

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
 include/asm-generic/bitops/sched.h |   36 ++++++++++++++++++++++++++++++++++++
 1 files changed, 36 insertions(+)

Index: 2.6-git/include/asm-generic/bitops/sched.h
===================================================================
--- /dev/null
+++ 2.6-git/include/asm-generic/bitops/sched.h
@@ -0,0 +1,36 @@
+#ifndef _ASM_GENERIC_BITOPS_SCHED_H_
+#define _ASM_GENERIC_BITOPS_SCHED_H_
+
+#include <linux/compiler.h>	/* unlikely() */
+#include <asm/types.h>
+
+/*
+ * Every architecture must define this function. It's the fastest
+ * way of searching a 140-bit bitmap where the first 100 bits are
+ * unlikely to be set. It's guaranteed that at least one of the 140
+ * bits is cleared.
+ */
+static inline int sched_find_first_bit(const unsigned long *b)
+{
+#if BITS_PER_LONG == 64
+	if (unlikely(b[0]))
+		return __ffs(b[0]);
+	if (unlikely(b[1]))
+		return __ffs(b[1]) + 64;
+	return __ffs(b[2]) + 128;
+#elif BITS_PER_LONG == 32
+	if (unlikely(b[0]))
+		return __ffs(b[0]);
+	if (unlikely(b[1]))
+		return __ffs(b[1]) + 32;
+	if (unlikely(b[2]))
+		return __ffs(b[2]) + 64;
+	if (b[3])
+		return __ffs(b[3]) + 96;
+	return __ffs(b[4]) + 128;
+#else
+#error BITS_PER_LONG not defined
+#endif
+}
+
+#endif /* _ASM_GENERIC_BITOPS_SCHED_H_ */
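
The word-by-word search can be illustrated with a loop instead of the
unrolled form used in the patch. This is only a sketch under
assumptions: ffs_sketch() stands in for __ffs(), the PRIO_* macros and
find_first_bit_sketch() are hypothetical names, and, as in the patch,
the caller must guarantee that at least one bit is set.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define PRIO_BITS	140
#define PRIO_WORDS	((PRIO_BITS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Stand-in for __ffs(): zero-based index of the lowest set bit.
 * The result is undefined for word == 0, hence the assert. */
static int ffs_sketch(unsigned long word)
{
	int n = 0;

	assert(word != 0);
	while (!(word & 1UL)) {
		word >>= 1;
		n++;
	}
	return n;
}

/* Lowest set bit in a 140-bit bitmap; the last word is not tested
 * for zero because the caller guarantees a set bit exists. */
static int find_first_bit_sketch(const unsigned long *b)
{
	unsigned int i;

	for (i = 0; i < PRIO_WORDS - 1; i++)
		if (b[i])
			return ffs_sketch(b[i]) + (int)(i * BITS_PER_LONG);
	return ffs_sketch(b[PRIO_WORDS - 1]) +
	       (int)((PRIO_WORDS - 1) * BITS_PER_LONG);
}
```

The patch unrolls this loop by hand for 3 words (64-bit) or 5 words
(32-bit) and marks the early words unlikely(), since the first 100 bits
are rarely set.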

--

^ permalink raw reply

* [patch 14/44] generic hweight{64,32,16,8}()
From: Akinobu Mita @ 2006-02-01  9:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mips, linux-ia64, Ian Molton, David Howells, linuxppc-dev,
	Greg Ungerer, sparclinux, Miles Bader, Linus Torvalds,
	Yoshinori Sato, Hirokazu Takata, linuxsh-shmedia-dev, linux-m68k,
	Ivan Kokshaysky, Richard Henderson, Akinobu Mita, Chris Zankel,
	dev-etrax, ultralinux, Andi Kleen, linuxsh-dev, linux390,
	Russell King, parisc-linux
In-Reply-To: <20060201090224.536581000@localhost.localdomain>


This patch introduces the C-language equivalents of the functions below:

unsigned int hweight32(unsigned int w);
unsigned int hweight16(unsigned int w);
unsigned int hweight8(unsigned int w);
unsigned long hweight64(__u64 w);

In include/asm-generic/bitops/hweight.h

This code largely copied from:
include/linux/bitops.h

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
 include/asm-generic/bitops/hweight.h |   54 +++++++++++++++++++++++++++++++++++
 1 files changed, 54 insertions(+)

Index: 2.6-git/include/asm-generic/bitops/hweight.h
===================================================================
--- /dev/null
+++ 2.6-git/include/asm-generic/bitops/hweight.h
@@ -0,0 +1,54 @@
+#ifndef _ASM_GENERIC_BITOPS_HWEIGHT_H_
+#define _ASM_GENERIC_BITOPS_HWEIGHT_H_
+
+#include <asm/types.h>
+
+/**
+ * hweightN - returns the hamming weight of a N-bit word
+ * @x: the word to weigh
+ *
+ * The Hamming Weight of a number is the total number of bits set in it.
+ */
+
+static inline unsigned int hweight32(unsigned int w)
+{
+        unsigned int res = (w & 0x55555555) + ((w >> 1) & 0x55555555);
+        res = (res & 0x33333333) + ((res >> 2) & 0x33333333);
+        res = (res & 0x0F0F0F0F) + ((res >> 4) & 0x0F0F0F0F);
+        res = (res & 0x00FF00FF) + ((res >> 8) & 0x00FF00FF);
+        return (res & 0x0000FFFF) + ((res >> 16) & 0x0000FFFF);
+}
+
+static inline unsigned int hweight16(unsigned int w)
+{
+        unsigned int res = (w & 0x5555) + ((w >> 1) & 0x5555);
+        res = (res & 0x3333) + ((res >> 2) & 0x3333);
+        res = (res & 0x0F0F) + ((res >> 4) & 0x0F0F);
+        return (res & 0x00FF) + ((res >> 8) & 0x00FF);
+}
+
+static inline unsigned int hweight8(unsigned int w)
+{
+        unsigned int res = (w & 0x55) + ((w >> 1) & 0x55);
+        res = (res & 0x33) + ((res >> 2) & 0x33);
+        return (res & 0x0F) + ((res >> 4) & 0x0F);
+}
+
+static inline unsigned long hweight64(__u64 w)
+{
+#if BITS_PER_LONG == 32
+	return hweight32((unsigned int)(w >> 32)) + hweight32((unsigned int)w);
+#elif BITS_PER_LONG == 64
+	u64 res;
+	res = (w & 0x5555555555555555ul) + ((w >> 1) & 0x5555555555555555ul);
+	res = (res & 0x3333333333333333ul) + ((res >> 2) & 0x3333333333333333ul);
+	res = (res & 0x0F0F0F0F0F0F0F0Ful) + ((res >> 4) & 0x0F0F0F0F0F0F0F0Ful);
+	res = (res & 0x00FF00FF00FF00FFul) + ((res >> 8) & 0x00FF00FF00FF00FFul);
+	res = (res & 0x0000FFFF0000FFFFul) + ((res >> 16) & 0x0000FFFF0000FFFFul);
+	return (res & 0x00000000FFFFFFFFul) + ((res >> 32) & 0x00000000FFFFFFFFul);
+#else
+#error BITS_PER_LONG not defined
+#endif
+}
+
+#endif /* _ASM_GENERIC_BITOPS_HWEIGHT_H_ */
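
The pairwise-summing steps of hweight32() can be checked against a naive
bit-by-bit count. This is a user-space sketch; hweight32_sketch() and
popcount_naive() are hypothetical names, and uint32_t is used in place
of unsigned int to make the width explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Same steps as the patch: add neighbouring 1-bit fields, then 2-bit,
 * 4-bit, 8-bit and 16-bit fields, until one sum remains. */
static unsigned int hweight32_sketch(uint32_t w)
{
	uint32_t res = (w & 0x55555555u) + ((w >> 1) & 0x55555555u);

	res = (res & 0x33333333u) + ((res >> 2) & 0x33333333u);
	res = (res & 0x0F0F0F0Fu) + ((res >> 4) & 0x0F0F0F0Fu);
	res = (res & 0x00FF00FFu) + ((res >> 8) & 0x00FF00FFu);
	return (res & 0x0000FFFFu) + ((res >> 16) & 0x0000FFFFu);
}

/* Reference implementation: test one bit per iteration. */
static unsigned int popcount_naive(uint32_t w)
{
	unsigned int n = 0;

	while (w) {
		n += w & 1u;
		w >>= 1;
	}
	return n;
}
```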

--

^ permalink raw reply

* [patch 31/44] powerpc: use generic bitops
From: Akinobu Mita @ 2006-02-01  9:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Akinobu Mita, linuxppc-dev
In-Reply-To: <20060201090224.536581000@localhost.localdomain>

- remove __{,test_and_}{set,clear,change}_bit() and test_bit()
- remove generic_fls64()
- remove generic_hweight{64,32,16,8}()
- remove sched_find_first_bit()

Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
 include/asm-powerpc/bitops.h |  105 +------------------------------------------
 1 files changed, 4 insertions(+), 101 deletions(-)

Index: 2.6-git/include/asm-powerpc/bitops.h
===================================================================
--- 2.6-git.orig/include/asm-powerpc/bitops.h
+++ 2.6-git/include/asm-powerpc/bitops.h
@@ -184,72 +184,7 @@ static __inline__ void set_bits(unsigned
 	: "cc");
 }
 
-/* Non-atomic versions */
-static __inline__ int test_bit(unsigned long nr,
-			       __const__ volatile unsigned long *addr)
-{
-	return 1UL & (addr[BITOP_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
-}
-
-static __inline__ void __set_bit(unsigned long nr,
-				 volatile unsigned long *addr)
-{
-	unsigned long mask = BITOP_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
-
-	*p  |= mask;
-}
-
-static __inline__ void __clear_bit(unsigned long nr,
-				   volatile unsigned long *addr)
-{
-	unsigned long mask = BITOP_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
-
-	*p &= ~mask;
-}
-
-static __inline__ void __change_bit(unsigned long nr,
-				    volatile unsigned long *addr)
-{
-	unsigned long mask = BITOP_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
-
-	*p ^= mask;
-}
-
-static __inline__ int __test_and_set_bit(unsigned long nr,
-					 volatile unsigned long *addr)
-{
-	unsigned long mask = BITOP_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
-	unsigned long old = *p;
-
-	*p = old | mask;
-	return (old & mask) != 0;
-}
-
-static __inline__ int __test_and_clear_bit(unsigned long nr,
-					   volatile unsigned long *addr)
-{
-	unsigned long mask = BITOP_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
-	unsigned long old = *p;
-
-	*p = old & ~mask;
-	return (old & mask) != 0;
-}
-
-static __inline__ int __test_and_change_bit(unsigned long nr,
-					    volatile unsigned long *addr)
-{
-	unsigned long mask = BITOP_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BITOP_WORD(nr);
-	unsigned long old = *p;
-
-	*p = old ^ mask;
-	return (old & mask) != 0;
-}
+#include <asm-generic/bitops/non-atomic.h>
 
 /*
  * Return the zero-based bit position (LE, not IBM bit numbering) of
@@ -310,16 +245,9 @@ static __inline__ int fls(unsigned int x
 	asm ("cntlzw %0,%1" : "=r" (lz) : "r" (x));
 	return 32 - lz;
 }
-#define fls64(x)   generic_fls64(x)
+#include <asm-generic/bitops/fls64.h>
 
-/*
- * hweightN: returns the hamming weight (i.e. the number
- * of bits set) of a N-bit word
- */
-#define hweight64(x) generic_hweight64(x)
-#define hweight32(x) generic_hweight32(x)
-#define hweight16(x) generic_hweight16(x)
-#define hweight8(x) generic_hweight8(x)
+#include <asm-generic/bitops/hweight.h>
 
 #define find_first_zero_bit(addr, size) find_next_zero_bit((addr), (size), 0)
 unsigned long find_next_zero_bit(const unsigned long *addr,
@@ -397,32 +325,7 @@ unsigned long find_next_zero_le_bit(cons
 #define minix_find_first_zero_bit(addr,size) \
 	find_first_zero_le_bit((unsigned long *)addr, size)
 
-/*
- * Every architecture must define this function. It's the fastest
- * way of searching a 140-bit bitmap where the first 100 bits are
- * unlikely to be set. It's guaranteed that at least one of the 140
- * bits is cleared.
- */
-static inline int sched_find_first_bit(const unsigned long *b)
-{
-#ifdef CONFIG_PPC64
-	if (unlikely(b[0]))
-		return __ffs(b[0]);
-	if (unlikely(b[1]))
-		return __ffs(b[1]) + 64;
-	return __ffs(b[2]) + 128;
-#else
-	if (unlikely(b[0]))
-		return __ffs(b[0]);
-	if (unlikely(b[1]))
-		return __ffs(b[1]) + 32;
-	if (unlikely(b[2]))
-		return __ffs(b[2]) + 64;
-	if (b[3])
-		return __ffs(b[3]) + 96;
-	return __ffs(b[4]) + 128;
-#endif
-}
+#include <asm-generic/bitops/sched.h>
 
 #endif /* __KERNEL__ */
 

--

^ permalink raw reply

* Re: [patch 14/44] generic hweight{64,32,16,8}()
From: Andi Kleen @ 2006-02-01  9:06 UTC (permalink / raw)
  To: Akinobu Mita
  Cc: linux-mips, linux-ia64, Ian Molton, David Howells, linuxppc-dev,
	Greg Ungerer, sparclinux, Miles Bader, Yoshinori Sato,
	Hirokazu Takata, linuxsh-dev, Linus Torvalds, Ivan Kokshaysky,
	Richard Henderson, Chris Zankel, dev-etrax, ultralinux,
	linux-m68k, linux-kernel, linuxsh-shmedia-dev, linux390,
	Russell King, parisc-linux
In-Reply-To: <20060201090325.905071000@localhost.localdomain>

On Wednesday 01 February 2006 10:02, Akinobu Mita wrote:

> +static inline unsigned int hweight32(unsigned int w)
> +{
> +        unsigned int res = (w & 0x55555555) + ((w >> 1) & 0x55555555);
> +        res = (res & 0x33333333) + ((res >> 2) & 0x33333333);
> +        res = (res & 0x0F0F0F0F) + ((res >> 4) & 0x0F0F0F0F);
> +        res = (res & 0x00FF00FF) + ((res >> 8) & 0x00FF00FF);
> +        return (res & 0x0000FFFF) + ((res >> 16) & 0x0000FFFF);
> +}

How large are these functions on x86? Maybe it would be better to not inline them,
but put it into some C file out of line.

-Andi

^ permalink raw reply

* Re: [patch 14/44] generic hweight{64,32,16,8}()
From: Michael Tokarev @ 2006-02-01  9:26 UTC (permalink / raw)
  To: Andi Kleen
  Cc: linux-mips, linux-ia64, Ian Molton, David Howells, linuxppc-dev,
	Greg Ungerer, sparclinux, Miles Bader, Yoshinori Sato,
	Hirokazu Takata, linuxsh-dev, Linus Torvalds, Ivan Kokshaysky,
	Richard Henderson, Akinobu Mita, Chris Zankel, dev-etrax,
	ultralinux, linux-m68k, linux-kernel, linuxsh-shmedia-dev,
	linux390, Russell King, parisc-linux
In-Reply-To: <200602011006.09596.ak@suse.de>

Andi Kleen wrote:
> On Wednesday 01 February 2006 10:02, Akinobu Mita wrote:
> 
>>+static inline unsigned int hweight32(unsigned int w)
[]
> How large are these functions on x86? Maybe it would be better to not inline them,
> but put it into some C file out of line.

hweight8	47 bytes
hweight16	76 bytes
hweight32	97 bytes
hweight64	56 bytes (NOT inlining hweight32)
hweight64	197 bytes (inlining hweight32)

Those are when compiled as separate non-inlined functions,
with pushl %ebp and ret.

/mjt

^ permalink raw reply

* Re: [patch 14/44] generic hweight{64,32,16,8}()
From: Andi Kleen @ 2006-02-01 10:24 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: linux-mips, linux-ia64, Ian Molton, David Howells, linuxppc-dev,
	Greg Ungerer, sparclinux, Miles Bader, Yoshinori Sato,
	Hirokazu Takata, linuxsh-dev, Linus Torvalds, Ivan Kokshaysky,
	Richard Henderson, Akinobu Mita, Chris Zankel, dev-etrax,
	ultralinux, linux-m68k, linux-kernel, linuxsh-shmedia-dev,
	linux390, Russell King, parisc-linux
In-Reply-To: <43E07EB2.4020409@tls.msk.ru>

On Wednesday 01 February 2006 10:26, Michael Tokarev wrote:
> Andi Kleen wrote:
> > On Wednesday 01 February 2006 10:02, Akinobu Mita wrote:
> > 
> >>+static inline unsigned int hweight32(unsigned int w)
> []
> > How large are these functions on x86? Maybe it would be better to not inline them,
> > but put it into some C file out of line.
> 
> hweight8	47 bytes
> hweight16	76 bytes
> hweight32	97 bytes
> hweight64	56 bytes (NOT inlining hweight32)
> hweight64	197 bytes (inlining hweight32)
> 
> Those are when compiled as separate non-inlined functions,
> with pushl %ebp and ret.

This would argue for moving them out of line.

-Andi

^ permalink raw reply

* RE: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: Jenkins, Clive @ 2006-02-01 11:19 UTC (permalink / raw)
  To: David Hawkins, Stefan Roese; +Cc: linuxppc-embedded

> >>readl() and ioread32() read the registers in little-endian format!
> >
> > Correct. That's how it is implemented on all platforms. Think for
> > example of an pci device driver. Using these IO functions, the
> > driver will become platform independent, running without
> > modifications on little- and big-endian machines.
>
> I just stumbled across the section in Rubini 3rd Ed that mislead
> me into believing that the readl()/writel() were machine endianness
> dependent, i.e., LE on x86, BE on PPC.

> p453 of his book has a PCI DMA example, where he uses the
> cpu_to_le32() macros inside calls writel().

> However, since these functions are internally implemented to
> perform LE operations, this example appears to be incorrect.

> Would you agree?

No, I think it is correct.

On most architectures readl() and writel() assume you are
accessing MMIO on the PCI bus, which is little-endian.
So these macros convert between the endian-ness of the CPU
and the endian-ness of the PCI bus.
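
What "converting to the bus's byte order" amounts to can be sketched in
user space with a byte buffer standing in for a little-endian register.
put_le32() and get_le32() are hypothetical helpers introduced for
illustration, not kernel APIs; they produce the same byte layout on
little- and big-endian hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Store v so that its least significant byte is at the lowest
 * address, whatever the byte order of the host CPU. */
static void put_le32(uint8_t *reg, uint32_t v)
{
	reg[0] = (uint8_t)(v);
	reg[1] = (uint8_t)(v >> 8);
	reg[2] = (uint8_t)(v >> 16);
	reg[3] = (uint8_t)(v >> 24);
}

/* Read the value back from its little-endian byte layout. */
static uint32_t get_le32(const uint8_t *reg)
{
	return (uint32_t)reg[0] |
	       ((uint32_t)reg[1] << 8) |
	       ((uint32_t)reg[2] << 16) |
	       ((uint32_t)reg[3] << 24);
}
```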

Clive

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: David Hawkins @ 2006-02-01 17:02 UTC (permalink / raw)
  To: Jenkins, Clive; +Cc: linuxppc-embedded
In-Reply-To: <35786B99AB3FDC45A8215724617919736D9217@gbrwgceumf01.eu.xerox.net>

Jenkins, Clive wrote:
>>>>readl() and ioread32() read the registers in little-endian format!
>>>
>>>Correct. That's how it is implemented on all platforms. Think for
>>>example of an pci device driver. Using these IO functions, the
>>>driver will become platform independent, running without
>>>modifications on little- and big-endian machines.
>>
>>I just stumbled across the section in Rubini 3rd Ed that mislead
>>me into believing that the readl()/writel() were machine endianness
>>dependent, i.e., LE on x86, BE on PPC.
> 
> 
>>p453 of his book has a PCI DMA example, where he uses the
>>cpu_to_le32() macros inside calls writel().
> 
> 
>>However, since these functions are internally implemented to
>>perform LE operations, this example appears to be incorrect.
> 
> 
>>Would you agree?
> 
> 
> No, I think it is correct.
> 
> On most architectures readl() and writel() assume you are
> accessing MMIO on the PCI bus, which is little-endian.
> So these macros convert between the endian-ness of the CPU
> and the endian-ness of the PCI bus.
> 
> Clive

Hey Clive,

Right, but in the 440EP source, and probably on other PowerPC
ports, readl() and writel() perform the endian conversion for
you, so if you wrote an operation as

   writel(cpu_to_le32(data), register_address);

you would get *two* little endian conversions, i.e., you
would write the bytes in the wrong order.
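
To make the double conversion concrete, here is a small user-space
sketch in C. It is not kernel code: sim_swab32(), sim_cpu_to_le32_on_be()
and sim_writel() are hypothetical stand-ins that only model how a
big-endian CPU such as the 440EP is assumed to behave, and the bus is
modelled as a plain byte array.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
uint32_t sim_swab32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Hypothetical model of cpu_to_le32() on a big-endian CPU: a byte swap. */
uint32_t sim_cpu_to_le32_on_be(uint32_t x)
{
	return sim_swab32(x);
}

/*
 * Hypothetical model of writel(): the value reaches the bus least
 * significant byte first, whatever the endianness of the CPU.
 */
void sim_writel(uint32_t val, uint8_t bus[4])
{
	bus[0] = (uint8_t)(val & 0xff);
	bus[1] = (uint8_t)((val >> 8) & 0xff);
	bus[2] = (uint8_t)((val >> 16) & 0xff);
	bus[3] = (uint8_t)((val >> 24) & 0xff);
}

/* Compare the single conversion with the double conversion. */
void demo_double_conversion(void)
{
	uint8_t bus[4];

	/* writel() alone: little-endian byte order on the bus. */
	sim_writel(0x12345678u, bus);
	assert(bus[0] == 0x78 && bus[1] == 0x56 &&
	       bus[2] == 0x34 && bus[3] == 0x12);

	/* Extra swap on a big-endian CPU: it cancels the one in writel(). */
	sim_writel(sim_cpu_to_le32_on_be(0x12345678u), bus);
	assert(bus[0] == 0x12 && bus[1] == 0x34 &&
	       bus[2] == 0x56 && bus[3] == 0x78);
}
```

In this model the second call leaves the bytes on the bus in big-endian
order, which is the wrong order for a PCI device.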

I haven't looked in the source for other big-endian architectures
yet. I wonder if the ARM stuff is big, or little endian?

The original question came about while I was trying to read
back PLX-9054 registers (a PCI-to-local bus bridge on a PCI
adapter board).

A search through the other drivers for the 440EP peripherals
that are not on the PCI bus showed that the authors used
the in_be32() and out_be32() calls. The comment at the top
of this email indicates that the readl() and writel() are
effectively 'reserved' for PCI accesses. I didn't get this
impression from reading Rubini.

Thanks for your comments. Feel free to add more!

Cheers
Dave

^ permalink raw reply

* MPC8272ADS Hangs with gcc 4.0.0 and u-boot 1.1.4
From: Addison Baldwin @ 2006-02-01 17:26 UTC (permalink / raw)
  To: linuxppc-embedded

Hi developers

I have an MPC8272ADS board, from Freescale. I'm using the low-boot
configuration (HRCW is 0x0E_74_B2_0A).

I have ELDK 3.1 and ELDK 4.0 installed, and I'm using U-Boot 1.1.4. When I
compile U-Boot with GCC 3.3.3 the board works fine, but not with GCC
4.0.0: all the Ethernet LEDs and the Run LED stay lit, which makes me
think the board has entered some kind of loop. Nothing appears on the
serial port.

If anyone has had this problem, or has such a configuration
working (GCC 4 and U-Boot 1.1.4 with this board), any help would be
much appreciated.

Regards,
Addison Baldwin

^ permalink raw reply

* Re: MPC8272ADS Hangs with gcc 4.0.0 and u-boot 1.1.4
From: Addison Baldwin @ 2006-02-01 17:27 UTC (permalink / raw)
  To: linuxppc-embedded
In-Reply-To: <af313df20602010926o305b24b6p555843ed96bbf491@mail.gmail.com>

Really sorry!

I sent this message to the wrong list.

I'll send it to the U-Boot list.

Regards,
Addison Baldwin

On 2/1/06, Addison Baldwin <addison.baldwin@gmail.com> wrote:
> Hi developers
>
> I have an MPC8272ADS board, from Freescale. I'm using the low-boot
> configuration (HRCW is 0x0E_74_B2_0A).
>
> I have ELDK 3.1 and ELDK 4.0 installed, and I'm using U-Boot 1.1.4. When I
> compile U-Boot with GCC 3.3.3 the board works fine, but not with GCC
> 4.0.0: all the Ethernet LEDs and the Run LED stay lit, which makes me
> think the board has entered some kind of loop. Nothing appears on the
> serial port.
>
> If anyone has had this problem, or has such a configuration
> working (GCC 4 and U-Boot 1.1.4 with this board), any help would be
> much appreciated.
>
> Regards,
> Addison Baldwin
>

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: Matt Porter @ 2006-02-01 17:44 UTC (permalink / raw)
  To: David Hawkins; +Cc: linuxppc-embedded, Jenkins, Clive
In-Reply-To: <43E0E9A7.4040508@ovro.caltech.edu>

On Wed, Feb 01, 2006 at 09:02:31AM -0800, David Hawkins wrote:
> Jenkins, Clive wrote:
> >>>>readl() and ioread32() read the registers in little-endian format!
> >>>
> >>>Correct. That's how it is implemented on all platforms. Think for
> >>>example of a PCI device driver. Using these IO functions, the
> >>>driver will become platform independent, running without
> >>>modifications on little- and big-endian machines.
> >>
> >>I just stumbled across the section in Rubini 3rd Ed that misled
> >>me into believing that the readl()/writel() were machine endianness
> >>dependent, i.e., LE on x86, BE on PPC.
> > 
> > 
> >>p453 of his book has a PCI DMA example, where he uses the
> >>cpu_to_le32() macros inside calls writel().
> > 
> > 
> >>However, since these functions are internally implemented to
> >>perform LE operations, this example appears to be incorrect.
> > 
> > 
> >>Would you agree?
> > 
> > 
> > No, I think it is correct.
> > 
> > On most architectures readl() and writel() assume you are
> > accessing MMIO on the PCI bus, which is little-endian.
> > So these macros convert between the endian-ness of the CPU
> > and the endian-ness of the PCI bus.
> > 
> > Clive
> 
> Hey Clive,
> 
> Right, but in the 440EP source, and probably on other PowerPC
> ports, readl() and writel() perform the endian conversion for
> you, so if you wrote an operation as
> 
>    writel(cpu_to_le32(data), register_address);
> 
> you would get *two* little endian conversions, i.e., you
> would write the bytes in the wrong order.
> 
> I haven't looked in the source for other big-endian architectures
> yet. I wonder if the ARM stuff is big, or little endian?

Both

> The original question came about while I was trying to read
> back PLX-9054 registers (a PCI-to-local bus bridge on a PCI
> adapter board).

Those regs are little endian since it's a PCI device. You
use the old read*/write* or new ioread*/iowrite* since it's
a PCI device.

> A search through the other drivers for the 440EP peripherals
> that are not on the PCI bus showed that the authors used
> the in_be32() and out_be32() calls. The comment at the top
> of this email indicates that the readl() and writel() are
> effectively 'reserved' for PCI accesses. I didn't get this
> impression from reading Rubini.

The book implicitly focuses on x86 driver developers, that's
why you don't get an explicit statement about this...
"everything" is PCI in that world.

read*/write* and ioread*/iowrite* generate outbound little
endian cycles on ALL arches, period.  They are intended
only for PCI use and have generic names only because of
the assumption that "all the world is a PC".

Now, what it takes to generate outbound little endian cycles
varies. On some arches, it's just a store (native LE) on
other arches, it's a reversed store (PPC), others still configure
their PCI bridge hardware to do byte swapping in hardware (typically
if their arch doesn't have a simple byte-swapping store like PPC).
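
As a rough illustration of the difference between little-endian
accessors (the read*/write* family) and big-endian ones (in_be32() and
out_be32()), here is a user-space sketch in C. The helper names
le32_store(), be32_store(), le32_load() and be32_load() are made up for
this example, and a register is modelled as four bytes in memory. The
helpers work byte by byte, so they should give the same result on any
host.

```c
#include <assert.h>
#include <stdint.h>

/* Little-endian layout: least significant byte at the lowest address. */
void le32_store(uint8_t *reg, uint32_t val)
{
	reg[0] = (uint8_t)(val & 0xff);
	reg[1] = (uint8_t)((val >> 8) & 0xff);
	reg[2] = (uint8_t)((val >> 16) & 0xff);
	reg[3] = (uint8_t)((val >> 24) & 0xff);
}

/* Big-endian layout: most significant byte at the lowest address. */
void be32_store(uint8_t *reg, uint32_t val)
{
	reg[0] = (uint8_t)((val >> 24) & 0xff);
	reg[1] = (uint8_t)((val >> 16) & 0xff);
	reg[2] = (uint8_t)((val >> 8) & 0xff);
	reg[3] = (uint8_t)(val & 0xff);
}

/* Decode four bytes as a little-endian 32-bit value. */
uint32_t le32_load(const uint8_t *reg)
{
	return (uint32_t)reg[0] | ((uint32_t)reg[1] << 8) |
	       ((uint32_t)reg[2] << 16) | ((uint32_t)reg[3] << 24);
}

/* Decode four bytes as a big-endian 32-bit value. */
uint32_t be32_load(const uint8_t *reg)
{
	return ((uint32_t)reg[0] << 24) | ((uint32_t)reg[1] << 16) |
	       ((uint32_t)reg[2] << 8) | (uint32_t)reg[3];
}

/* Each store/load pair round-trips; mixing the pairs swaps the bytes. */
void demo_accessors(void)
{
	uint8_t reg[4];

	le32_store(reg, 0xAABBCCDDu);
	assert(reg[0] == 0xDD && le32_load(reg) == 0xAABBCCDDu);
	assert(be32_load(reg) == 0xDDCCBBAAu);

	be32_store(reg, 0xAABBCCDDu);
	assert(reg[0] == 0xAA && be32_load(reg) == 0xAABBCCDDu);
}
```

On a native little-endian CPU the first pair compiles down to an
ordinary load and store; on a big-endian CPU the second pair does.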

The example you cite on pg. 453 of Rubini looks broken for BE
systems. It works on LE systems since cpu_to_le32() does nothing
and writel is simply a dereference. That's pure luck. On PPC,
for example, that would write a big endian bus_addr to the fictitious
PCI device which is not what they want.

-Matt

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: David Hawkins @ 2006-02-01 17:53 UTC (permalink / raw)
  To: Matt Porter; +Cc: linuxppc-embedded, Jenkins, Clive
In-Reply-To: <20060201104405.C16064@cox.net>

> The book implicitly focuses on x86 driver developers, that's
> why you don't get an explicit statement about this...
> "everything" is PCI in that world.
> 
> read*/write* and ioread*/iowrite* generate outbound little
> endian cycles on ALL arches, period.  They are intended
> only for PCI use and have generic names only because of
> the assumption that "all the world is a PC".
> 
> Now, what it takes to generate outbound little endian cycles
> varies. On some arches, it's just a store (native LE) on
> other arches, it's a reversed store (PPC), others still configure
> their PCI bridge hardware to do byte swapping in hardware (typically
> if their arch doesn't have a simple byte-swapping store like PPC).
> 
> The example you cite on pg. 453 of Rubini looks broken for BE
> systems. It works on LE systems since cpu_to_le32() does nothing
> and writel is simply a dereference. That's pure luck. On PPC,
> for example, that would write a big endian bus_addr to the fictitious
> PCI device which is not what they want.

Great! An authoritative answer!

Re: endianness, even cooler on the 440EP, in your mmap()
implementation you can set the _PAGE_ENDIAN flag, and
user-space will see the PCI device in little endian
format. Fun stuff!

Thanks Matt.

Dave

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: David Hawkins @ 2006-02-01 18:04 UTC (permalink / raw)
  To: Matt Porter; +Cc: linuxppc-embedded
In-Reply-To: <20060201104405.C16064@cox.net>

Matt,

In the same vein as the readl()/writel() question, what
are the assumptions regarding memcpy_toio and memcpy_fromio?

If the memcpy_to/fromio operations are intended only
for access to PCI devices, then they should also inherently
perform little-endianness conversion. For the test driver
I was working on, I did *not* find this the case, eg.
I implemented the test driver read() and write() using the
memcpy_to/fromio calls, and the data transfers occur
in big-endian (well, 'native' mode, since I also test the
same test driver with the PCI adapter in an x86 system).

If memcpy_to/fromio can be used in a more general context,
then I can see why they operate in native mode.

Just looking for enlightenment.

Cheers
Dave

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: Eugene Surovegin @ 2006-02-01 18:11 UTC (permalink / raw)
  To: David Hawkins; +Cc: linuxppc-embedded
In-Reply-To: <43E0F81F.4050005@ovro.caltech.edu>

On Wed, Feb 01, 2006 at 10:04:15AM -0800, David Hawkins wrote:
> Matt,
> 
> In the same vein as the readl()/writel() question, what
> are the assumptions regarding memcpy_toio and memcpy_fromio?
> 
> If the memcpy_to/fromio operations are intended only
> for access to PCI devices, then they should also inherently
> perform little-endianness conversion. For the test driver
> I was working on, I did *not* find this the case, eg.
> I implemented the test driver read() and write() using the
> memcpy_to/fromio calls, and the data transfers occur
> in big-endian (well, 'native' mode, since I also test the
> same test driver with the PCI adapter in an x86 system).
> 
> If memcpy_to/fromio can be used in a more general context,
> then I can see why they operate in native mode.
> 
> Just looking for enlightenment.

These functions, IIRC, are intended for copying a chunk of _bytes_.
There are no endianness issues with bytes, i.e. they work just like an
ordinary memcpy.
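
A small user-space sketch in C of that point, with a plain memcpy()
standing in for memcpy_toio()/memcpy_fromio(). The helpers copy_bytes(),
decode_le32() and decode_be32() are hypothetical names for this example
only. The copy keeps every byte at its original offset; endianness only
comes into play when the bytes are later interpreted as a 32-bit word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte-for-byte copy, a stand-in for memcpy_toio()/memcpy_fromio(). */
void copy_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
	memcpy(dst, src, n);
}

/* Interpret four bytes as a little-endian 32-bit value. */
uint32_t decode_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Interpret the same four bytes as a big-endian 32-bit value. */
uint32_t decode_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* The copy preserves byte order; only the interpretation differs. */
void demo_byte_copy(void)
{
	const uint8_t dev[4] = { 0x78, 0x56, 0x34, 0x12 };
	uint8_t buf[4];

	copy_bytes(buf, dev, sizeof(buf));
	assert(memcmp(buf, dev, sizeof(buf)) == 0);
	assert(decode_le32(buf) == 0x12345678u);
	assert(decode_be32(buf) == 0x78563412u);
}
```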

-- 
Eugene

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: David Hawkins @ 2006-02-01 18:20 UTC (permalink / raw)
  To: Eugene Surovegin; +Cc: linuxppc-embedded
In-Reply-To: <20060201181106.GA24138@gate.ebshome.net>

Eugene Surovegin wrote:
> On Wed, Feb 01, 2006 at 10:04:15AM -0800, David Hawkins wrote:
> 
>>Matt,
>>
>>In the same vein as the readl()/writel() question, what
>>are the assumptions regarding memcpy_toio and memcpy_fromio?
>>
>>If the memcpy_to/fromio operations are intended only
>>for access to PCI devices, then they should also inherently
>>perform little-endianness conversion. For the test driver
>>I was working on, I did *not* find this the case, eg.
>>I implemented the test driver read() and write() using the
>>memcpy_to/fromio calls, and the data transfers occur
>>in big-endian (well, 'native' mode, since I also test the
>>same test driver with the PCI adapter in an x86 system).
>>
>>If memcpy_to/fromio can be used in a more general context,
>>then I can see why they operate in native mode.
>>
>>Just looking for enlightenment.
> 
> 
> These functions, IIRC, are intended for copying a chunk of _bytes_.
> There are no endianness issues with bytes, i.e. they work just like an
> ordinary memcpy.
> 

True, good point.

I quite often implement a 'control' device to read/write/mmap PCI
device registers. In that case, the registers are usually 32-bit, so
if I wanted endian neutrality, I could either let the user-space
app determine the endianness and act accordingly, or force the
user-space app to always see little-endian registers by replacing
memcpy_to/fromio calls with a loop over readl/writel, and in mmap
making sure to set the _PAGE_ENDIAN flag. Of course, making mmap
endian-neutral depends on the 440EP page flags, which, say, an
ARM might not have.
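
A user-space sketch in C of the loop idea. The register window is
modelled as a byte array laid out the way a PCI device would present it,
and sim_readl() is a hypothetical stand-in for readl(); in a real driver
the loop body would call readl() or ioread32() on an __iomem pointer
instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for readl(): decode one little-endian register. */
uint32_t sim_readl(const uint8_t *reg)
{
	return (uint32_t)reg[0] | ((uint32_t)reg[1] << 8) |
	       ((uint32_t)reg[2] << 16) | ((uint32_t)reg[3] << 24);
}

/* Read nregs consecutive 32-bit registers into CPU-order values. */
void read_regs(const uint8_t *window, uint32_t *out, size_t nregs)
{
	size_t i;

	for (i = 0; i < nregs; i++)
		out[i] = sim_readl(window + 4 * i);
}

/* Two registers holding 1 and 0x12345678, stored little-endian. */
void demo_read_regs(void)
{
	const uint8_t window[8] = {
		0x01, 0x00, 0x00, 0x00,
		0x78, 0x56, 0x34, 0x12,
	};
	uint32_t regs[2];

	read_regs(window, regs, 2);
	assert(regs[0] == 1u);
	assert(regs[1] == 0x12345678u);
}
```

Because each register is decoded individually, the caller gets the same
values whether the CPU is little- or big-endian.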

Thanks for the valuable feedback guys.

Dave

^ permalink raw reply

* RE: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: Jenkins, Clive @ 2006-02-01 18:35 UTC (permalink / raw)
  To: David Hawkins; +Cc: linuxppc-embedded

>>>p453 of his book has a PCI DMA example, where he uses the
>>>cpu_to_le32() macros inside calls writel().

>>>However, since these functions are internally implemented to
>>>perform LE operations, this example appears to be incorrect.

>>>Would you agree?

>> No, I think it is correct.

I said this without looking up the example you cited.
I agree now, the example is incorrect; and yes, file a bug
report at oreilly!

I think, from memory, that elsewhere in the book Rubini does
say that readl()... are for the PCI bus, but cross-arch issues
are only addressed in certain sections.

I always find out exactly what these macros do on the arch
I am using, then I know where I stand. I find LXR (Google it if
you don't know it) good for browsing the source of vanilla
kernels. After finding out how and where it is done, I then
double check the relevant files of the actual kernel I am using.

ppc implementations of readl, writel, cpu_to_le32 use the byte-
reversed load/store word instructions.

Clive

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: Eugene Surovegin @ 2006-02-01 18:23 UTC (permalink / raw)
  To: David Hawkins
In-Reply-To: <43E0FBFA.10103@ovro.caltech.edu>

On Wed, Feb 01, 2006 at 10:20:42AM -0800, David Hawkins wrote:
> Eugene Surovegin wrote:
> >On Wed, Feb 01, 2006 at 10:04:15AM -0800, David Hawkins wrote:
> >
> >>Matt,
> >>
> >>In the same vein as the readl()/writel() question, what
> >>are the assumptions regarding memcpy_toio and memcpy_fromio?
> >>
> >>If the memcpy_to/fromio operations are intended only
> >>for access to PCI devices, then they should also inherently
> >>perform little-endianness conversion. For the test driver
> >>I was working on, I did *not* find this the case, eg.
> >>I implemented the test driver read() and write() using the
> >>memcpy_to/fromio calls, and the data transfers occur
> >>in big-endian (well, 'native' mode, since I also test the
> >>same test driver with the PCI adapter in an x86 system).
> >>
> >>If memcpy_to/fromio can be used in a more general context,
> >>then I can see why they operate in native mode.
> >>
> >>Just looking for enlightenment.
> >
> >
> >These functions, IIRC, are intended for copying a chunk of _bytes_.
> >There are no endianness issues with bytes, i.e. they work just like an
> >ordinary memcpy.
> >
> 
> True, good point.
> 
> I quite often implement a 'control' device to read/write/mmap PCI
> device registers. In that case, the registers are usually 32-bit, so
> if I wanted endian neutrality, I could either let the user-space
> app determine the endianness and act accordingly, or force the
> user-space app to always see little-endian registers by replacing
> memcpy_to/fromio calls with a loop over readl/writel,

You seem to assume that memcpy should do 32-bit reads/writes. Why not 
16-bit ones? That's why memcpy cannot do any byte swapping, because it 
can "theoretically" do 2 different types of it (16-bit and 32-bit), 
which is obviously not specified in the memcpy interface.
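
To illustrate the ambiguity, a user-space sketch in C with two
hypothetical helpers, swap_each16() and swap_each32(). They byte-swap
the same buffer in 16-bit and in 32-bit units and give different
results, so a copy routine cannot pick one without knowing the register
width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every complete 16-bit unit in buf. */
void swap_each16(uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		uint8_t t = buf[i];

		buf[i] = buf[i + 1];
		buf[i + 1] = t;
	}
}

/* Reverse the four bytes of every complete 32-bit unit in buf. */
void swap_each32(uint8_t *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 3 < len; i += 4) {
		uint8_t t0 = buf[i];
		uint8_t t1 = buf[i + 1];

		buf[i] = buf[i + 3];
		buf[i + 1] = buf[i + 2];
		buf[i + 2] = t1;
		buf[i + 3] = t0;
	}
}

/* The same input bytes end up in a different order in each case. */
void demo_swap_widths(void)
{
	uint8_t a[4] = { 1, 2, 3, 4 };
	uint8_t b[4] = { 1, 2, 3, 4 };

	swap_each16(a, sizeof(a));
	assert(a[0] == 2 && a[1] == 1 && a[2] == 4 && a[3] == 3);

	swap_each32(b, sizeof(b));
	assert(b[0] == 4 && b[1] == 3 && b[2] == 2 && b[3] == 1);
}
```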

-- 
Eugene

^ permalink raw reply

* Re: Yosemite/440EP why are readl()/ioread32() setup to readlittle-endian?
From: David Hawkins @ 2006-02-01 20:35 UTC (permalink / raw)
  To: Jenkins, Clive; +Cc: linuxppc-embedded
In-Reply-To: <35786B99AB3FDC45A8215724617919736D921B@gbrwgceumf01.eu.xerox.net>

> I said this without looking up the example you cited.
> I agree now, the example is incorrect; and yes, file a bug
> report at oreilly!

Ok, I will.

> I think, from memory, that elsewhere in the book Rubini does
> say that readl()... are for the PCI bus, but cross-arch issues
> are only addressed in certain sections.

I didn't find anything that specifically mentioned their
use was for the PCI bus only. The endianness swapping
features of the pci_config_xxx functions are clearly
stated, but not the readl/writel. And of course the
example I refer to clearly uses those functions on the
PCI bus incorrectly.

But it's still a great book.

> I always find out exactly what these macros do on the arch
> I am using, then I know where I stand. I find LXR (Google it if
> you don't know it) good for browsing the source of vanilla
> kernels. After finding out how and where it is done, I then
> double check the relevant files of the actual kernel I am using.
> 
> ppc implementations of readl, writel, cpu_to_le32 use the byte-
> reversed load/store word instructions.

Ahh, very good advice. I think I read about LXR in one of
Freescale's app notes on porting Linux. I'll go take a look
on Google.

Thanks
Dave

^ permalink raw reply
