* __pv_phys_offset, inline assembly functions, and such like
@ 2014-03-28 15:28 Russell King - ARM Linux
2014-03-28 15:29 ` [PATCH] ARM: Better virt_to_page() handling Russell King
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2014-03-28 15:28 UTC (permalink / raw)
To: linux-arm-kernel
Having spent a while looking at the Freescale FEC driver, including
inspecting its assembly, I believe that we have made some pretty bad
errors of judgement concerning various "optimisations" which have
been done.
One of the most striking is the conversion from virtual to physical
addresses.
Drivers make calls to the DMA API, and one such call is dma_map_single().
This used to be a mere function call which didn't generate lots of code.
However, this has since changed: it is now an inline function, which
involves this badly written code:
addr = ops->map_page(dev, virt_to_page(ptr),
(unsigned long)ptr & ~PAGE_MASK, size,
dir, attrs);
debug_dma_map_page(dev, virt_to_page(ptr),
(unsigned long)ptr & ~PAGE_MASK, size,
dir, addr, true);
virt_to_page() is itself a macro which calls other macros, which in turn
call other functions. So, virt_to_page() itself is not a trivial operation.
We do this twice in the above code, the second time after a function
pointer has been called through, so the compiler has no knowledge of what
side effects that call may have. So the first thing is that this crappily
written generic function should be changed to:
struct page *page = virt_to_page(ptr);
addr = ops->map_page(dev, page, (unsigned long)ptr & ~PAGE_MASK, size,
dir, attrs);
debug_dma_map_page(dev, page, (unsigned long)ptr & ~PAGE_MASK, size,
dir, addr, true);
I'm not going to make the above change here though...
The second problem is virt_to_page() itself. Let's look at what that
involves:
#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
For flatmem (which is single-zImage's only supported memory model):
#define __pfn_to_page(pfn) (mem_map + ((pfn) - ARCH_PFN_OFFSET))
#define ARCH_PFN_OFFSET PHYS_PFN_OFFSET
#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
extern u64 __pv_phys_offset;
#define PHYS_OFFSET __pv_phys_offset
#define __pa(x) __virt_to_phys((unsigned long)(x))
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
phys_addr_t t;
if (sizeof(phys_addr_t) == 4) {
__pv_stub(x, t, "add", __PV_BITS_31_24);
} else {
__pv_stub_mov_hi(t);
__pv_add_carry_stub(x, t);
}
return t;
}
Now, let's look at the code generated by the above with DMA debugging
disabled, and marvel in awe:
; sl = virtual address
a38: e59331d4 ldr r3, [r3, #468] ; get_dma_ops
a3c: e3001000 movw r1, #0
a3c: R_ARM_MOVW_ABS_NC arm_dma_ops
a40: e3401000 movt r1, #0 ; r1 = &arm_dma_ops
a40: R_ARM_MOVT_ABS arm_dma_ops
a44: e3530000 cmp r3, #0 ; get_dma_ops
a48: 01a03001 moveq r3, r1 ; get_dma_ops
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
a68: e3401000 movt r1, #0 ; r1 = &mem_map
a68: R_ARM_MOVT_ABS mem_map
a6c: e1a02a0a lsl r2, sl, #20 ; r2 = part converted offset into page
a70: e50b904c str r9, [fp, #-76] ; save high word of __pv_phys_offset to stack
a74: e3a09000 mov r9, #0 ; attrs
a78: e591e000 ldr lr, [r1] ; lr = mem_map
a7c: e1a0162c lsr r1, ip, #12 ; r1 = __pv_phys_offset(low) >> 12
a80: e3a0c001 mov ip, #1 ; direction
a84: e58dc000 str ip, [sp] ; direction stacked for call
a88: e51bc04c ldr ip, [fp, #-76] ; get high word of __pv_phys_offset from stack
a8c: e1a02a22 lsr r2, r2, #20 ; r2 = offset into page
a90: e28aa481 add sl, sl, #-2130706432 ; assembly virt_to_phys
a94: e58d9004 str r9, [sp, #4] ; attrs stacked for call
a98: e1811a0c orr r1, r1, ip, lsl #20 ; r1 = __pv_phys_offset >> 12
a9c: e593c010 ldr ip, [r3, #16] ; ip = ops->map_page
aa0: e061162a rsb r1, r1, sl, lsr #12 ; r1 = asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12
aa4: e1a03006 mov r3, r6 ; length
aa8:	e08e1281 	add	r1, lr, r1, lsl #5	; r1 = &mem_map[r1]
aac: e12fff3c blx ip ; call map_page
So, what this is doing is:
offset = (addr & 0xfff);
page = &mem_map[(asm virt_to_phys(addr) >> 12) - (__pv_phys_offset >> 12)]
Now, realise that the asm virt_to_phys(addr) is hiding this calculation
from the compiler:
addr - PAGE_OFFSET + __pv_phys_offset
Therefore, the above becomes:
page = &mem_map[((addr - PAGE_OFFSET + __pv_phys_offset) >> 12) - (__pv_phys_offset >> 12)]
I'm sure it doesn't take many people long to realise that this is
equivalent to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
IOW, no reference to the physical stuff at all.
With the inline assembly eliminated from the macro, and __pv_phys_offset
converted to a PFN offset instead, the above assembly code shrinks
dramatically:
a20: e59331d4 ldr r3, [r3, #468] ; get_dma_ops
a24: e3001000 movw r1, #0
a24: R_ARM_MOVW_ABS_NC arm_dma_ops
a28: e3401000 movt r1, #0 ; &arm_dma_ops
a28: R_ARM_MOVT_ABS arm_dma_ops
a2c: e3530000 cmp r3, #0 ; get_dma_ops
a30: 01a03001 moveq r3, r1 ; get_dma_ops
a34: e28a1101 add r1, sl, #1073741824 ; r1 = addr - PAGE_OFFSET
a38: e3002000 movw r2, #0
a38: R_ARM_MOVW_ABS_NC mem_map
a3c: e3402000 movt r2, #0 ; r2 = &mem_map
a3c: R_ARM_MOVT_ABS mem_map
a40: e3a0e001 mov lr, #1 ; direction
a44: e1a01621 lsr r1, r1, #12 ; r1 = (addr - PAGE_OFFSET) >> 12 (pfn offset)
a48: e592c000 ldr ip, [r2] ; ip = &mem_map[0]
a4c: e1a02a0a lsl r2, sl, #20 ; r2 = part converted offset into page
a50: e58de000 str lr, [sp] ; stack direction
a54: e3a0a000 mov sl, #0 ; attr
a58: e08c1281 add r1, ip, r1, lsl #5 ; r1 = &mem_map[(addr - PAGE_OFFSET) >> 12]
a5c: e58da004 str sl, [sp, #4] ; stack attr
a60: e1a02a22 lsr r2, r2, #20 ; r2 = offset
a64: e593c010 ldr ip, [r3, #16] ; ops->map_page
a68: e1a03006 mov r3, r6 ; length
a6c: e12fff3c blx ip ; call map_page
With this in place, I see a reduction of 4220 bytes in the kernel image,
and that's from just fixing virt_to_page() and changing __pv_phys_offset
to be PFN-based (so it can be 32-bit in all cases.)
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] ARM: Better virt_to_page() handling
2014-03-28 15:28 __pv_phys_offset, inline assembly functions, and such like Russell King - ARM Linux
@ 2014-03-28 15:29 ` Russell King
2014-03-28 19:52 ` Nicolas Pitre
2014-04-22 19:55 ` Ezequiel Garcia
2014-03-28 16:44 ` __pv_phys_offset, inline assembly functions, and such like Arnd Bergmann
2014-04-04 22:32 ` [PATCH] ARM: Better virt_to_page() handling Russell King
2 siblings, 2 replies; 14+ messages in thread
From: Russell King @ 2014-03-28 15:29 UTC (permalink / raw)
To: linux-arm-kernel
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent to this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
arch/arm/include/asm/memory.h | 41 ++++++++++++++++++++++++-----------------
arch/arm/kernel/armksyms.c | 2 +-
arch/arm/kernel/head.S | 17 +++++++++--------
3 files changed, 34 insertions(+), 26 deletions(-)
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 4afb376d9c7c..02fa2558f662 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -166,9 +166,17 @@
* Physical vs virtual RAM address space conversion. These are
* private definitions which should NOT be used outside memory.h
* files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
+ *
+ * PFNs are used to describe any physical page; this means
+ * PFN 0 == physical address 0.
*/
-#ifndef __virt_to_phys
-#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
+#if defined(__virt_to_phys)
+#define PHYS_OFFSET PLAT_PHYS_OFFSET
+#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
+
+#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
+
+#elif defined(CONFIG_ARM_PATCH_PHYS_VIRT)
/*
* Constants used to force the right instruction encodings and shifts
@@ -177,12 +185,17 @@
#define __PV_BITS_31_24 0x81000000
#define __PV_BITS_7_0 0x81
-extern u64 __pv_phys_offset;
+extern unsigned long __pv_phys_pfn_offset;
extern u64 __pv_offset;
extern void fixup_pv_table(const void *, unsigned long);
extern const void *__pv_table_begin, *__pv_table_end;
-#define PHYS_OFFSET __pv_phys_offset
+#define PHYS_OFFSET ((phys_addr_t)__pv_phys_pfn_offset << PAGE_SHIFT)
+#define PHYS_PFN_OFFSET (__pv_phys_pfn_offset)
+
+#define virt_to_pfn(kaddr) \
+ ((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
+ PHYS_PFN_OFFSET)
#define __pv_stub(from,to,instr,type) \
__asm__("@ __pv_stub\n" \
@@ -243,6 +256,7 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
#else
#define PHYS_OFFSET PLAT_PHYS_OFFSET
+#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
@@ -254,18 +268,11 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
return x - PHYS_OFFSET + PAGE_OFFSET;
}
-#endif
-#endif
+#define virt_to_pfn(kaddr) \
+ ((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
+ PHYS_PFN_OFFSET)
-/*
- * PFNs are used to describe any physical page; this means
- * PFN 0 == physical address 0.
- *
- * This is the PFN of the first RAM page in the kernel
- * direct-mapped view. We assume this is the first page
- * of RAM in the mem_map as well.
- */
-#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
+#endif
/*
* These are *only* valid on the kernel direct mapped RAM memory.
@@ -343,9 +350,9 @@ static inline __deprecated void *bus_to_virt(unsigned long x)
*/
#define ARCH_PFN_OFFSET PHYS_PFN_OFFSET
-#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
+#define virt_to_page(kaddr) pfn_to_page(virt_to_pfn(kaddr))
#define virt_addr_valid(kaddr) (((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory) \
- && pfn_valid(__pa(kaddr) >> PAGE_SHIFT) )
+ && pfn_valid(virt_to_pfn(kaddr)))
#endif
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 85e664b6a5f1..f7b450f97e68 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -158,6 +158,6 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
#endif
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
-EXPORT_SYMBOL(__pv_phys_offset);
+EXPORT_SYMBOL(__pv_phys_pfn_offset);
EXPORT_SYMBOL(__pv_offset);
#endif
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index f5f381d91556..f8c08839edf3 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -584,9 +584,10 @@ __fixup_pv_table:
subs r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
add r4, r4, r3 @ adjust table start address
add r5, r5, r3 @ adjust table end address
- add r6, r6, r3 @ adjust __pv_phys_offset address
+ add r6, r6, r3 @ adjust __pv_phys_pfn_offset address
add r7, r7, r3 @ adjust __pv_offset address
- str r8, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_offset
+ mov r0, r8, lsr #12 @ convert to PFN
+ str r0, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_pfn_offset
strcc ip, [r7, #HIGH_OFFSET] @ save to __pv_offset high bits
mov r6, r3, lsr #24 @ constant for add/sub instructions
teq r3, r6, lsl #24 @ must be 16MiB aligned
@@ -600,7 +601,7 @@ ENDPROC(__fixup_pv_table)
1: .long .
.long __pv_table_begin
.long __pv_table_end
-2: .long __pv_phys_offset
+2: .long __pv_phys_pfn_offset
.long __pv_offset
.text
@@ -688,11 +689,11 @@ ENTRY(fixup_pv_table)
ENDPROC(fixup_pv_table)
.data
- .globl __pv_phys_offset
- .type __pv_phys_offset, %object
-__pv_phys_offset:
- .quad 0
- .size __pv_phys_offset, . -__pv_phys_offset
+ .globl __pv_phys_pfn_offset
+ .type __pv_phys_pfn_offset, %object
+__pv_phys_pfn_offset:
+ .word 0
+ .size __pv_phys_pfn_offset, . -__pv_phys_pfn_offset
.globl __pv_offset
.type __pv_offset, %object
--
1.8.3.1
* __pv_phys_offset, inline assembly functions, and such like
2014-03-28 15:28 __pv_phys_offset, inline assembly functions, and such like Russell King - ARM Linux
2014-03-28 15:29 ` [PATCH] ARM: Better virt_to_page() handling Russell King
@ 2014-03-28 16:44 ` Arnd Bergmann
2014-03-28 16:56 ` Russell King - ARM Linux
2014-04-04 22:32 ` [PATCH] ARM: Better virt_to_page() handling Russell King
2 siblings, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2014-03-28 16:44 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 28 March 2014 15:28:32 Russell King - ARM Linux wrote:
>
> The second problem is virt_to_page() itself. Let's look at what that
> involves:
>
> #define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
>
> For flatmem (which is single-zImage's only supported memory model):
A side note on this: I don't think there is a strong reason why we
can't also support sparsemem in multiplatform. I believe we have
so far avoided it because nobody really needed it to be in multiplatform,
and because the mach-realview Kconfig doesn't allow it in combination
with ARM_PATCH_PHYS_VIRT.
I would probably suggest changing two things here:
a) allow sparsemem with multiplatform, for more efficient support
of some platforms
b) change mach-realview to decouple sparsemem from the custom
__virt_to_phys implementation, at the same time as getting
realview ready for multiplatform.
I don't think any of these will fundamentally change your analysis,
but I thought it might help to mention this in case you see other
problems with sparsemem enabled.
> With the inline assembly eliminated from the macro, and __pv_phys_offset
> converted to a PFN offset instead, the above assembly code shrinks
> dramatically:
>
> a20: e59331d4 ldr r3, [r3, #468] ; get_dma_ops
> a24: e3001000 movw r1, #0
> a24: R_ARM_MOVW_ABS_NC arm_dma_ops
> a28: e3401000 movt r1, #0 ; &arm_dma_ops
> a28: R_ARM_MOVT_ABS arm_dma_ops
> a2c: e3530000 cmp r3, #0 ; get_dma_ops
> a30: 01a03001 moveq r3, r1 ; get_dma_ops
> a34: e28a1101 add r1, sl, #1073741824 ; r1 = addr - PAGE_OFFSET
> a38: e3002000 movw r2, #0
> a38: R_ARM_MOVW_ABS_NC mem_map
> a3c: e3402000 movt r2, #0 ; r2 = &mem_map
> a3c: R_ARM_MOVT_ABS mem_map
> a40: e3a0e001 mov lr, #1 ; direction
> a44: e1a01621 lsr r1, r1, #12 ; r1 = (addr - PAGE_OFFSET) >> 12 (pfn offset)
> a48: e592c000 ldr ip, [r2] ; ip = &mem_map[0]
> a4c: e1a02a0a lsl r2, sl, #20 ; r2 = part converted offset into page
> a50: e58de000 str lr, [sp] ; stack direction
> a54: e3a0a000 mov sl, #0 ; attr
> a58: e08c1281 add r1, ip, r1, lsl #5 ; r1 = &mem_map[(addr - PAGE_OFFSET) >> 12]
> a5c: e58da004 str sl, [sp, #4] ; stack attr
> a60: e1a02a22 lsr r2, r2, #20 ; r2 = offset
> a64: e593c010 ldr ip, [r3, #16] ; ops->map_page
> a68: e1a03006 mov r3, r6 ; length
> a6c: e12fff3c blx ip ; call map_page
>
> With this in place, I see a reduction of 4220 bytes in the kernel image,
> and that's from just fixing virt_to_page() and changing __pv_phys_offset
> to be PFN-based (so it can be 32-bit in all cases.)
Ah, nice. Another idea: have you considered making the entire function extern?
The only advantage I see in the inline version is saving one function call,
but even with your new version there seems to be a noticeable size overhead.
Arnd
* __pv_phys_offset, inline assembly functions, and such like
2014-03-28 16:44 ` __pv_phys_offset, inline assembly functions, and such like Arnd Bergmann
@ 2014-03-28 16:56 ` Russell King - ARM Linux
2014-03-28 17:39 ` Arnd Bergmann
0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2014-03-28 16:56 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Mar 28, 2014 at 05:44:42PM +0100, Arnd Bergmann wrote:
> On Friday 28 March 2014 15:28:32 Russell King - ARM Linux wrote:
> > With the inline assembly eliminated from the macro, and __pv_phys_offset
> > converted to a PFN offset instead, the above assembly code shrinks
> > dramatically:
> >
> > a20: e59331d4 ldr r3, [r3, #468] ; get_dma_ops
> > a24: e3001000 movw r1, #0
> > a24: R_ARM_MOVW_ABS_NC arm_dma_ops
> > a28: e3401000 movt r1, #0 ; &arm_dma_ops
> > a28: R_ARM_MOVT_ABS arm_dma_ops
> > a2c: e3530000 cmp r3, #0 ; get_dma_ops
> > a30: 01a03001 moveq r3, r1 ; get_dma_ops
> > a34: e28a1101 add r1, sl, #1073741824 ; r1 = addr - PAGE_OFFSET
> > a38: e3002000 movw r2, #0
> > a38: R_ARM_MOVW_ABS_NC mem_map
> > a3c: e3402000 movt r2, #0 ; r2 = &mem_map
> > a3c: R_ARM_MOVT_ABS mem_map
> > a40: e3a0e001 mov lr, #1 ; direction
> > a44: e1a01621 lsr r1, r1, #12 ; r1 = (addr - PAGE_OFFSET) >> 12 (pfn offset)
> > a48: e592c000 ldr ip, [r2] ; ip = &mem_map[0]
> > a4c: e1a02a0a lsl r2, sl, #20 ; r2 = part converted offset into page
> > a50: e58de000 str lr, [sp] ; stack direction
> > a54: e3a0a000 mov sl, #0 ; attr
> > a58: e08c1281 add r1, ip, r1, lsl #5 ; r1 = &mem_map[(addr - PAGE_OFFSET) >> 12]
> > a5c: e58da004 str sl, [sp, #4] ; stack attr
> > a60: e1a02a22 lsr r2, r2, #20 ; r2 = offset
> > a64: e593c010 ldr ip, [r3, #16] ; ops->map_page
> > a68: e1a03006 mov r3, r6 ; length
> > a6c: e12fff3c blx ip ; call map_page
> >
> > With this in place, I see a reduction of 4220 bytes in the kernel image,
> > and that's from just fixing virt_to_page() and changing __pv_phys_offset
> > to be PFN-based (so it can be 32-bit in all cases.)
>
> Ah, nice. Another idea: have you considered making the entire function extern?
> The only advantage I see in the inline version is saving one function call,
> but even with your new version there seems to be a noticeable size overhead.
The contribution above from virt_to_page() is just six instructions,
compared to 19 for the whole of dma_map_single().
So, there's much more advantage to killing the inline functions in
include/asm-generic/dma-mapping.h, and moving them out of line. That's
where much of the above assembly is coming from.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
* __pv_phys_offset, inline assembly functions, and such like
2014-03-28 16:56 ` Russell King - ARM Linux
@ 2014-03-28 17:39 ` Arnd Bergmann
0 siblings, 0 replies; 14+ messages in thread
From: Arnd Bergmann @ 2014-03-28 17:39 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 28 March 2014 16:56:43 Russell King - ARM Linux wrote:
> On Fri, Mar 28, 2014 at 05:44:42PM +0100, Arnd Bergmann wrote:
> > On Friday 28 March 2014 15:28:32 Russell King - ARM Linux wrote:
> > > With the inline assembly eliminated from the macro, and __pv_phys_offset
> > > converted to a PFN offset instead, the above assembly code shrinks
> > > dramatically:
> > >
> > > a20: e59331d4 ldr r3, [r3, #468] ; get_dma_ops
> > > a24: e3001000 movw r1, #0
> > > a24: R_ARM_MOVW_ABS_NC arm_dma_ops
> > > a28: e3401000 movt r1, #0 ; &arm_dma_ops
> > > a28: R_ARM_MOVT_ABS arm_dma_ops
> > > a2c: e3530000 cmp r3, #0 ; get_dma_ops
> > > a30: 01a03001 moveq r3, r1 ; get_dma_ops
> > > a34: e28a1101 add r1, sl, #1073741824 ; r1 = addr - PAGE_OFFSET
> > > a38: e3002000 movw r2, #0
> > > a38: R_ARM_MOVW_ABS_NC mem_map
> > > a3c: e3402000 movt r2, #0 ; r2 = &mem_map
> > > a3c: R_ARM_MOVT_ABS mem_map
> > > a40: e3a0e001 mov lr, #1 ; direction
> > > a44: e1a01621 lsr r1, r1, #12 ; r1 = (addr - PAGE_OFFSET) >> 12 (pfn offset)
> > > a48: e592c000 ldr ip, [r2] ; ip = &mem_map[0]
> > > a4c: e1a02a0a lsl r2, sl, #20 ; r2 = part converted offset into page
> > > a50: e58de000 str lr, [sp] ; stack direction
> > > a54: e3a0a000 mov sl, #0 ; attr
> > > a58: e08c1281 add r1, ip, r1, lsl #5 ; r1 = &mem_map[(addr - PAGE_OFFSET) >> 12]
> > > a5c: e58da004 str sl, [sp, #4] ; stack attr
> > > a60: e1a02a22 lsr r2, r2, #20 ; r2 = offset
> > > a64: e593c010 ldr ip, [r3, #16] ; ops->map_page
> > > a68: e1a03006 mov r3, r6 ; length
> > > a6c: e12fff3c blx ip ; call map_page
> > >
> > > With this in place, I see a reduction of 4220 bytes in the kernel image,
> > > and that's from just fixing virt_to_page() and changing __pv_phys_offset
> > > to be PFN-based (so it can be 32-bit in all cases.)
> >
> > Ah, nice. Another idea: have you considered making the entire function extern?
> > The only advantage I see in the inline version is saving one function call,
> > but even with your new version there seems to be a noticeable size overhead.
>
> The contribution above from virt_to_page() is just six instructions,
> compared to 19 for the whole of dma_map_single().
>
> So, there's much more advantage to killing the inline functions in
> include/asm-generic/dma-mapping.h, and moving them out of line. That's
> where much of the above assembly is coming from.
Yes, that's what I meant.
Arnd
* [PATCH] ARM: Better virt_to_page() handling
2014-03-28 15:29 ` [PATCH] ARM: Better virt_to_page() handling Russell King
@ 2014-03-28 19:52 ` Nicolas Pitre
2014-03-28 20:05 ` Russell King - ARM Linux
2014-04-22 19:55 ` Ezequiel Garcia
1 sibling, 1 reply; 14+ messages in thread
From: Nicolas Pitre @ 2014-03-28 19:52 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 28 Mar 2014, Russell King wrote:
> virt_to_page() is incredibly inefficient when virt-to-phys patching is
> enabled. This is because we end up with this calculation:
>
> page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
>
> in assembly. The asm virt_to_phys() is equivalent this this operation:
>
> addr - PAGE_OFFSET + __pv_phys_offset
>
> and we can see that because this is assembly, the compiler has no chance
> to optimise some of that away. This should reduce down to:
>
> page = &mem_map[(addr - PAGE_OFFSET) >> 12]
>
> for the common cases. Permit the compiler to make this optimisation by
> giving it more of the information it needs - do this by providing a
> virt_to_pfn() macro.
>
> Another issue which makes this more complex is that __pv_phys_offset is
> a 64-bit type on all platforms. This is needlessly wasteful - if we
> store the physical offset as a PFN, we can save a lot of work having
> to deal with 64-bit values, which sometimes ends up producing incredibly
> horrid code:
>
> a4c: e3009000 movw r9, #0
> a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
> a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
> a50: R_ARM_MOVT_ABS __pv_phys_offset
> a54: e3002000 movw r2, #0
> a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
> a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
> a58: R_ARM_MOVT_ABS __pv_phys_offset
> a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
> a60: e3001000 movw r1, #0
> a60: R_ARM_MOVW_ABS_NC mem_map
> a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
>
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Good catch!
Some observations below.
> ---
> arch/arm/include/asm/memory.h | 41 ++++++++++++++++++++++++-----------------
> arch/arm/kernel/armksyms.c | 2 +-
> arch/arm/kernel/head.S | 17 +++++++++--------
> 3 files changed, 34 insertions(+), 26 deletions(-)
>
> diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
> index 4afb376d9c7c..02fa2558f662 100644
> --- a/arch/arm/include/asm/memory.h
> +++ b/arch/arm/include/asm/memory.h
> @@ -166,9 +166,17 @@
> * Physical vs virtual RAM address space conversion. These are
> * private definitions which should NOT be used outside memory.h
> * files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
> + *
> + * PFNs are used to describe any physical page; this means
> + * PFN 0 == physical address 0.
> */
> -#ifndef __virt_to_phys
> -#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> +#if defined(__virt_to_phys)
> +#define PHYS_OFFSET PLAT_PHYS_OFFSET
> +#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
> +
> +#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
> +
> +#elif defined(CONFIG_ARM_PATCH_PHYS_VIRT)
>
> /*
> * Constants used to force the right instruction encodings and shifts
> @@ -177,12 +185,17 @@
> #define __PV_BITS_31_24 0x81000000
> #define __PV_BITS_7_0 0x81
>
> -extern u64 __pv_phys_offset;
> +extern unsigned long __pv_phys_pfn_offset;
> extern u64 __pv_offset;
> extern void fixup_pv_table(const void *, unsigned long);
> extern const void *__pv_table_begin, *__pv_table_end;
>
> -#define PHYS_OFFSET __pv_phys_offset
> +#define PHYS_OFFSET ((phys_addr_t)__pv_phys_pfn_offset << PAGE_SHIFT)
> +#define PHYS_PFN_OFFSET (__pv_phys_pfn_offset)
> +
> +#define virt_to_pfn(kaddr) \
> + ((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
> + PHYS_PFN_OFFSET)
This is the second definition for virt_to_pfn().
>
> #define __pv_stub(from,to,instr,type) \
> __asm__("@ __pv_stub\n" \
> @@ -243,6 +256,7 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
> #else
>
> #define PHYS_OFFSET PLAT_PHYS_OFFSET
> +#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
>
> static inline phys_addr_t __virt_to_phys(unsigned long x)
> {
> @@ -254,18 +268,11 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
> return x - PHYS_OFFSET + PAGE_OFFSET;
> }
>
> -#endif
> -#endif
> +#define virt_to_pfn(kaddr) \
> + ((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
> + PHYS_PFN_OFFSET)
This is a third definition, identical to the second one.
I see it might be hard to make the last two common, unless it is defined
up front and the odd case does a #undef virt_to_pfn before redefining
it. Which way is best I'm not sure.
Also this needs to take care of those machines overriding PHYS_OFFSET at
run time (see commit a77e0c7b2774f). However it looks like no code
relying on the LPAE version of early_paging_init() has been merged in
mainline yet. Maybe a note in the commit log to mention this case might
be all that is needed for now.
With that you may add:
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Nicolas
* [PATCH] ARM: Better virt_to_page() handling
2014-03-28 19:52 ` Nicolas Pitre
@ 2014-03-28 20:05 ` Russell King - ARM Linux
2014-03-29 2:51 ` Nicolas Pitre
0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2014-03-28 20:05 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Mar 28, 2014 at 03:52:16PM -0400, Nicolas Pitre wrote:
> I see it might be hard to make the last two common, unless it is defined
> up front and the odd case does a #undef virt_to_pfn before redefining
> it. Which way is best I'm not sure.
Given the problems we've had in this area with changes ending up with
build failures, I think it's far better to have simplicity even if it
means duplicating definitions, rather than trying to invent some clever
way to avoid duplications.
We've had so many build failures due to trying to be too clever that
I'm not playing the "try to be clever" game here anymore. We can be
more verbose instead.
> Also this needs to take care of those machines overriding PHYS_OFFSET at
> run time (see commit a77e0c7b2774f).
Platforms don't override PHYS_OFFSET. They override PLAT_PHYS_OFFSET
instead.
> However it looks like no code
> relying on the LPAE version of early_paging_init() has been merged in
> mainline yet.
I believe that's fully reliant on CONFIG_ARM_PATCH_PHYS_VIRT being set,
and the difference for that case would be setting __pv_phys_pfn_offset
instead of __pv_phys_offset, which is a simple modification.
However, that's something which should be done by core code, not
individually by platforms. Since we don't have any view of code doing
this yet, it's not relevant here. We can only deal with what is in
mainline and not in external trees.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
* [PATCH] ARM: Better virt_to_page() handling
2014-03-28 20:05 ` Russell King - ARM Linux
@ 2014-03-29 2:51 ` Nicolas Pitre
0 siblings, 0 replies; 14+ messages in thread
From: Nicolas Pitre @ 2014-03-29 2:51 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, 28 Mar 2014, Russell King - ARM Linux wrote:
> On Fri, Mar 28, 2014 at 03:52:16PM -0400, Nicolas Pitre wrote:
> > I see it might be hard to make the last two common, unless it is defined
> > up front and the odd case does a #undef virt_to_pfn before redefining
> > it. Which way is best I'm not sure.
>
> Given the problems we've had in this area with changes ending up with
> build failures, I think it's far better to have simplicity even if it
> means duplicating definitions, rather than trying to invent some clever
> way to avoid duplications.
OK, fair enough.
> > Also this needs to take care of those machines overriding PHYS_OFFSET at
> > run time (see commit a77e0c7b2774f).
>
> Platforms don't override PHYS_OFFSET. They override PLAT_PHYS_OFFSET
> instead.
>
> > However it looks like no code
> > relying on the LPAE version of early_paging_init() has been merged in
> > mainline yet.
>
> I believe that's fully reliant on CONFIG_ARM_PATCH_PHYS_VIRT being set,
> and the difference for that case would be setting __pv_phys_pfn_offset
> instead of __pv_phys_offset, which is a simple modification.
Indeed.
> However, that's something which should be done by core code, not
> individually by platforms. Since we don't have any view of code doing
> this yet, it's not relevant here. We can only deal with what is in
> mainline and not in external trees.
Hence my suggestion to simply add a note to the commit log for this
patch, or better would be a note added to the comment for
early_paging_init() to that effect.
Nicolas
* [PATCH] ARM: Better virt_to_page() handling
2014-03-28 15:28 __pv_phys_offset, inline assembly functions, and such like Russell King - ARM Linux
2014-03-28 15:29 ` [PATCH] ARM: Better virt_to_page() handling Russell King
2014-03-28 16:44 ` __pv_phys_offset, inline assembly functions, and such like Arnd Bergmann
@ 2014-04-04 22:32 ` Russell King
2014-04-04 22:44 ` Russell King - ARM Linux
2 siblings, 1 reply; 14+ messages in thread
From: Russell King @ 2014-04-04 22:32 UTC (permalink / raw)
To: linux-arm-kernel
virt_to_page() is incredibly inefficient when virt-to-phys patching is
enabled. This is because we end up with this calculation:
page = &mem_map[asm virt_to_phys(addr) >> 12 - __pv_phys_offset >> 12]
in assembly. The asm virt_to_phys() is equivalent to this operation:
addr - PAGE_OFFSET + __pv_phys_offset
and we can see that because this is assembly, the compiler has no chance
to optimise some of that away. This should reduce down to:
page = &mem_map[(addr - PAGE_OFFSET) >> 12]
for the common cases. Permit the compiler to make this optimisation by
giving it more of the information it needs - do this by providing a
virt_to_pfn() macro.
Another issue which makes this more complex is that __pv_phys_offset is
a 64-bit type on all platforms. This is needlessly wasteful - if we
store the physical offset as a PFN, we can save a lot of work having
to deal with 64-bit values, which sometimes ends up producing incredibly
horrid code:
a4c: e3009000 movw r9, #0
a4c: R_ARM_MOVW_ABS_NC __pv_phys_offset
a50: e3409000 movt r9, #0 ; r9 = &__pv_phys_offset
a50: R_ARM_MOVT_ABS __pv_phys_offset
a54: e3002000 movw r2, #0
a54: R_ARM_MOVW_ABS_NC __pv_phys_offset
a58: e3402000 movt r2, #0 ; r2 = &__pv_phys_offset
a58: R_ARM_MOVT_ABS __pv_phys_offset
a5c: e5999004 ldr r9, [r9, #4] ; r9 = high word of __pv_phys_offset
a60: e3001000 movw r1, #0
a60: R_ARM_MOVW_ABS_NC mem_map
a64: e592c000 ldr ip, [r2] ; ip = low word of __pv_phys_offset
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
arch/arm/include/asm/memory.h | 41 ++++++++++++++++++++++++-----------------
arch/arm/kernel/armksyms.c | 2 +-
arch/arm/kernel/head.S | 17 +++++++++--------
3 files changed, 34 insertions(+), 26 deletions(-)
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 4afb376d9c7c..02fa2558f662 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -166,9 +166,17 @@
* Physical vs virtual RAM address space conversion. These are
* private definitions which should NOT be used outside memory.h
* files. Use virt_to_phys/phys_to_virt/__pa/__va instead.
+ *
+ * PFNs are used to describe any physical page; this means
+ * PFN 0 == physical address 0.
*/
-#ifndef __virt_to_phys
-#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
+#if defined(__virt_to_phys)
+#define PHYS_OFFSET PLAT_PHYS_OFFSET
+#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
+
+#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
+
+#elif defined(CONFIG_ARM_PATCH_PHYS_VIRT)
/*
* Constants used to force the right instruction encodings and shifts
@@ -177,12 +185,17 @@
#define __PV_BITS_31_24 0x81000000
#define __PV_BITS_7_0 0x81
-extern u64 __pv_phys_offset;
+extern unsigned long __pv_phys_pfn_offset;
extern u64 __pv_offset;
extern void fixup_pv_table(const void *, unsigned long);
extern const void *__pv_table_begin, *__pv_table_end;
-#define PHYS_OFFSET __pv_phys_offset
+#define PHYS_OFFSET ((phys_addr_t)__pv_phys_pfn_offset << PAGE_SHIFT)
+#define PHYS_PFN_OFFSET (__pv_phys_pfn_offset)
+
+#define virt_to_pfn(kaddr) \
+ ((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
+ PHYS_PFN_OFFSET)
#define __pv_stub(from,to,instr,type) \
__asm__("@ __pv_stub\n" \
@@ -243,6 +256,7 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
#else
#define PHYS_OFFSET PLAT_PHYS_OFFSET
+#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
static inline phys_addr_t __virt_to_phys(unsigned long x)
{
@@ -254,18 +268,11 @@ static inline unsigned long __phys_to_virt(phys_addr_t x)
return x - PHYS_OFFSET + PAGE_OFFSET;
}
-#endif
-#endif
+#define virt_to_pfn(kaddr) \
+ ((((unsigned long)(kaddr) - PAGE_OFFSET) >> PAGE_SHIFT) + \
+ PHYS_PFN_OFFSET)
-/*
- * PFNs are used to describe any physical page; this means
- * PFN 0 == physical address 0.
- *
- * This is the PFN of the first RAM page in the kernel
- * direct-mapped view. We assume this is the first page
- * of RAM in the mem_map as well.
- */
-#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
+#endif
/*
* These are *only* valid on the kernel direct mapped RAM memory.
@@ -343,9 +350,9 @@ static inline __deprecated void *bus_to_virt(unsigned long x)
*/
#define ARCH_PFN_OFFSET PHYS_PFN_OFFSET
-#define virt_to_page(kaddr) pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
+#define virt_to_page(kaddr) pfn_to_page(virt_to_pfn(kaddr))
#define virt_addr_valid(kaddr) (((unsigned long)(kaddr) >= PAGE_OFFSET && (unsigned long)(kaddr) < (unsigned long)high_memory) \
- && pfn_valid(__pa(kaddr) >> PAGE_SHIFT) )
+ && pfn_valid(virt_to_pfn(kaddr)))
#endif
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 85e664b6a5f1..f7b450f97e68 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -158,6 +158,6 @@ EXPORT_SYMBOL(__gnu_mcount_nc);
#endif
#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
-EXPORT_SYMBOL(__pv_phys_offset);
+EXPORT_SYMBOL(__pv_phys_pfn_offset);
EXPORT_SYMBOL(__pv_offset);
#endif
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index f5f381d91556..f8c08839edf3 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -584,9 +584,10 @@ __fixup_pv_table:
subs r3, r0, r3 @ PHYS_OFFSET - PAGE_OFFSET
add r4, r4, r3 @ adjust table start address
add r5, r5, r3 @ adjust table end address
- add r6, r6, r3 @ adjust __pv_phys_offset address
+ add r6, r6, r3 @ adjust __pv_phys_pfn_offset address
add r7, r7, r3 @ adjust __pv_offset address
- str r8, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_offset
+ mov r0, r8, lsr #12 @ convert to PFN
+ str r0, [r6, #LOW_OFFSET] @ save computed PHYS_OFFSET to __pv_phys_pfn_offset
strcc ip, [r7, #HIGH_OFFSET] @ save to __pv_offset high bits
mov r6, r3, lsr #24 @ constant for add/sub instructions
teq r3, r6, lsl #24 @ must be 16MiB aligned
@@ -600,7 +601,7 @@ ENDPROC(__fixup_pv_table)
1: .long .
.long __pv_table_begin
.long __pv_table_end
-2: .long __pv_phys_offset
+2: .long __pv_phys_pfn_offset
.long __pv_offset
.text
@@ -688,11 +689,11 @@ ENTRY(fixup_pv_table)
ENDPROC(fixup_pv_table)
.data
- .globl __pv_phys_offset
- .type __pv_phys_offset, %object
-__pv_phys_offset:
- .quad 0
- .size __pv_phys_offset, . -__pv_phys_offset
+ .globl __pv_phys_pfn_offset
+ .type __pv_phys_pfn_offset, %object
+__pv_phys_pfn_offset:
+ .word 0
+ .size __pv_phys_pfn_offset, . -__pv_phys_pfn_offset
.globl __pv_offset
.type __pv_offset, %object
--
1.8.3.1
* [PATCH] ARM: Better virt_to_page() handling
2014-04-04 22:32 ` [PATCH] ARM: Better virt_to_page() handling Russell King
@ 2014-04-04 22:44 ` Russell King - ARM Linux
0 siblings, 0 replies; 14+ messages in thread
From: Russell King - ARM Linux @ 2014-04-04 22:44 UTC (permalink / raw)
To: linux-arm-kernel
Ignore this :)
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
* [PATCH] ARM: Better virt_to_page() handling
2014-03-28 15:29 ` [PATCH] ARM: Better virt_to_page() handling Russell King
2014-03-28 19:52 ` Nicolas Pitre
@ 2014-04-22 19:55 ` Ezequiel Garcia
2014-04-23 7:35 ` [PATCH] ARM: Fix double definition of virt_to_pfn() Peter Ujfalusi
1 sibling, 1 reply; 14+ messages in thread
From: Ezequiel Garcia @ 2014-04-22 19:55 UTC (permalink / raw)
To: linux-arm-kernel
Russell,
On Mar 28, Russell King wrote:
> -#ifndef __virt_to_phys
> -#ifdef CONFIG_ARM_PATCH_PHYS_VIRT
> +#if defined(__virt_to_phys)
> +#define PHYS_OFFSET PLAT_PHYS_OFFSET
> +#define PHYS_PFN_OFFSET ((unsigned long)(PHYS_OFFSET >> PAGE_SHIFT))
> +
> +#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
> +
Building 20140422 -next produces a myriad of these warnings:
[..]
CC fs/mpage.o
In file included from /home/zeta/linux/next/arch/arm/include/asm/page.h:163:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:5,
from include/linux/mm.h:9,
from fs/mpage.c:17:
/home/zeta/linux/next/arch/arm/include/asm/memory.h:297:0: warning: "virt_to_pfn" redefined [enabled by default]
#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
^
/home/zeta/linux/next/arch/arm/include/asm/memory.h:194:0: note: this is the location of the previous definition
#define virt_to_pfn(kaddr) \
^
I saw you already answered Nicolas about the double-definition, so I'm merely
reporting the build gets really spammy now. (Although I guess you already
noticed this)
--
Ezequiel García, Free Electrons
Embedded Linux, Kernel and Android Engineering
http://free-electrons.com
* [PATCH] ARM: Fix double definition of virt_to_pfn()
2014-04-22 19:55 ` Ezequiel Garcia
@ 2014-04-23 7:35 ` Peter Ujfalusi
2014-04-23 9:37 ` Russell King - ARM Linux
0 siblings, 1 reply; 14+ messages in thread
From: Peter Ujfalusi @ 2014-04-23 7:35 UTC (permalink / raw)
To: linux-arm-kernel
One definition of virt_to_pfn() was outside of the #if #elif #else section
causing constant redefinition of the macro.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
---
Hi,
the redefinition actually coming from different source. It is caused by the
leftover definition of virt_to_pfn() which would redefine the virt_to_pfn()
all the time.
The redefinition still exist in next-20140423.
Regards,
Peter
arch/arm/include/asm/memory.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index d3e91fa41b37..7a0dc0f9d1a1 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -293,8 +293,6 @@ static inline void *phys_to_virt(phys_addr_t x)
*/
#define __pa(x) __virt_to_phys((unsigned long)(x))
#define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x)))
-#define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT)
-#define virt_to_pfn(kaddr) (__pa(kaddr) >> PAGE_SHIFT)
extern phys_addr_t (*arch_virt_to_idmap)(unsigned long x);
--
1.9.2
* [PATCH] ARM: Fix double definition of virt_to_pfn()
2014-04-23 7:35 ` [PATCH] ARM: Fix double definition of virt_to_pfn() Peter Ujfalusi
@ 2014-04-23 9:37 ` Russell King - ARM Linux
2014-04-23 10:40 ` Peter Ujfalusi
0 siblings, 1 reply; 14+ messages in thread
From: Russell King - ARM Linux @ 2014-04-23 9:37 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Apr 23, 2014 at 10:35:47AM +0300, Peter Ujfalusi wrote:
> One definition of virt_to_pfn() was outside of the #if #elif #else section
> causing constant redefinition of the macro.
>
> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
> ---
> Hi,
>
> the redefinition is actually coming from a different source: it is caused
> by the leftover definition of virt_to_pfn(), which redefines virt_to_pfn()
> every time the header is included.
>
> The redefinition still exists in next-20140423.
Please see the thread "Re: [PATCH v7 2/2] ARM hibernation / suspend-to-disk"
this month. The fixed up commit will be pushed out shortly.
--
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
* [PATCH] ARM: Fix double definition of virt_to_pfn()
2014-04-23 9:37 ` Russell King - ARM Linux
@ 2014-04-23 10:40 ` Peter Ujfalusi
0 siblings, 0 replies; 14+ messages in thread
From: Peter Ujfalusi @ 2014-04-23 10:40 UTC (permalink / raw)
To: linux-arm-kernel
On 04/23/2014 12:37 PM, Russell King - ARM Linux wrote:
> On Wed, Apr 23, 2014 at 10:35:47AM +0300, Peter Ujfalusi wrote:
>> One definition of virt_to_pfn() was outside of the #if #elif #else section
>> causing constant redefinition of the macro.
>>
>> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
>> ---
>> Hi,
>>
>> the redefinition is actually coming from a different source: it is caused
>> by the leftover definition of virt_to_pfn(), which redefines virt_to_pfn()
>> every time the header is included.
>>
>> The redefinition still exists in next-20140423.
>
> Please see the thread "Re: [PATCH v7 2/2] ARM hibernation / suspend-to-disk"
> this month. The fixed up commit will be pushed out shortly.
Thanks, I found the thread.
--
Péter
end of thread, other threads:[~2014-04-23 10:40 UTC | newest]
Thread overview: 14+ messages
-- links below jump to the message on this page --
2014-03-28 15:28 __pv_phys_offset, inline assembly functions, and such like Russell King - ARM Linux
2014-03-28 15:29 ` [PATCH] ARM: Better virt_to_page() handling Russell King
2014-03-28 19:52 ` Nicolas Pitre
2014-03-28 20:05 ` Russell King - ARM Linux
2014-03-29 2:51 ` Nicolas Pitre
2014-04-22 19:55 ` Ezequiel Garcia
2014-04-23 7:35 ` [PATCH] ARM: Fix double definition of virt_to_pfn() Peter Ujfalusi
2014-04-23 9:37 ` Russell King - ARM Linux
2014-04-23 10:40 ` Peter Ujfalusi
2014-03-28 16:44 ` __pv_phys_offset, inline assembly functions, and such like Arnd Bergmann
2014-03-28 16:56 ` Russell King - ARM Linux
2014-03-28 17:39 ` Arnd Bergmann
2014-04-04 22:32 ` [PATCH] ARM: Better virt_to_page() handling Russell King
2014-04-04 22:44 ` Russell King - ARM Linux