public inbox for linux-kernel@vger.kernel.org
* [PATCH v3 0/8] Improve performance of VM translation on x86_64
@ 2012-11-05 19:03 Alexander Duyck
From: Alexander Duyck @ 2012-11-05 19:03 UTC (permalink / raw)
  To: tglx, mingo, hpa, andi; +Cc: linux-kernel, x86

This patch series is meant to address several issues I encountered with VM
translations on x86_64.  In my testing, I found that swiotlb was incurring up
to a 5% processing overhead due to calls to __phys_addr.  To address that, I
updated swiotlb to use physical addresses instead of virtual addresses,
which reduces the need to call __phys_addr.  However, those patches didn't
address the other callers.  With these patches applied, I am able to achieve
an additional 1% to 2% performance gain on top of the changes to swiotlb.

The first 2 patches are the performance optimizations that result in the 1% to
2% increase in overall performance.  The remaining patches are various
cleanups for a number of spots where __pa or virt_to_phys was being called
when it was not needed, or where __pa_symbol could have been used instead.
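
For readers unfamiliar with the idea named in the first patch title, the
following is a minimal userspace sketch of a carry-based translation.  It is an
illustration under stated assumptions, not the code from the patches: the
constants START_KERNEL_MAP and PAGE_OFFSET_BASE, the phys_base value, and the
function names are all stand-ins chosen for this example.  The point it shows
is that one unsigned subtraction both produces the offset and, through
wrap-around, tells us which mapping the address came from, so a separate range
comparison is not needed.  The second function shows why a symbol-only
translation can be cheaper: a kernel symbol always lives in the kernel image
mapping, so no selection is required at all.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout constants, for illustration only. */
#define START_KERNEL_MAP 0xffffffff80000000ULL /* base of kernel image mapping */
#define PAGE_OFFSET_BASE 0xffff880000000000ULL /* base of the direct mapping */

/* Hypothetical physical load offset of the kernel image. */
static uint64_t phys_base = 0x1000000ULL;

/*
 * Translate an address from either mapping.  If x is below
 * START_KERNEL_MAP the subtraction wraps, so y ends up larger than x;
 * otherwise y is smaller than x.  The single comparison (x > y)
 * therefore recovers the borrow/carry of the subtraction.
 */
static inline uint64_t phys_addr_sketch(uint64_t x)
{
	uint64_t y = x - START_KERNEL_MAP;

	/* no wrap: kernel image mapping; wrap: direct mapping */
	return y + ((x > y) ? phys_base
			    : (START_KERNEL_MAP - PAGE_OFFSET_BASE));
}

/* Translate an address known to be a kernel symbol: one fixed offset. */
static inline uint64_t pa_symbol_sketch(uint64_t x)
{
	return x - START_KERNEL_MAP + phys_base;
}

/* Small self-check covering one address from each mapping. */
static inline void phys_addr_sketch_selftest(void)
{
	assert(phys_addr_sketch(PAGE_OFFSET_BASE + 0x1234ULL) == 0x1234ULL);
	assert(phys_addr_sketch(START_KERNEL_MAP + 0x2000ULL) ==
	       0x2000ULL + phys_base);
	assert(pa_symbol_sketch(START_KERNEL_MAP + 0x2000ULL) ==
	       0x2000ULL + phys_base);
}
```

In the wrap-around case the result reduces to x - PAGE_OFFSET_BASE modulo
2^64, which is why adding the difference between the two bases is enough.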

It doesn't seem like the v2 patch set was accepted, so I am submitting an
updated v3 set that is rebased off of linux-next with a few additional
improvements to the existing patches.  Specifically, the first patch now also
updates __virt_addr_valid so that it is almost identical in layout to
__phys_addr.  I also found one additional spot in init_64.c that could use
__pa_symbol instead of virt_to_page calls, so I updated the first __pa_symbol
patch for the x86 init calls.

With this patch set applied I am noticing a 1-2% improvement in performance in
my routing tests.  Without my earlier swiotlb changes applied, the improvement
was as high as 6-7% because that code originally relied heavily on
virt_to_phys.

The overall effect on size varies depending on what kernel options are
enabled.  I have noticed that almost all of the network device drivers have
dropped in size by around 100 bytes.  I suspect this is due to the fact that
the virt_to_page call in dma_map_single is now less expensive.  However, the
default build for x86_64 increases the vmlinux size by 3.5K with this change
applied.

---

Alexander Duyck (8):
      x86/lguest: Use __pa_symbol instead of __pa on C visible symbols
      x86/acpi: Use __pa_symbol instead of __pa on C visible symbols
      x86/xen: Use __pa_symbol instead of __pa on C visible symbols
      x86/ftrace: Use __pa_symbol instead of __pa on C visible symbols
      x86: Use __pa_symbol instead of __pa on C visible symbols
      x86: Drop 4 unnecessary calls to __pa_symbol
      x86: Make it so that __pa_symbol can only process kernel symbols on x86_64
      x86: Improve __phys_addr performance by making use of carry flags and inlining


 arch/x86/include/asm/page.h          |    3 +-
 arch/x86/include/asm/page_32.h       |    1 +
 arch/x86/include/asm/page_64_types.h |   20 +++++++++++-
 arch/x86/kernel/acpi/sleep.c         |    2 +
 arch/x86/kernel/cpu/intel.c          |    2 +
 arch/x86/kernel/ftrace.c             |    4 +-
 arch/x86/kernel/head32.c             |    4 +-
 arch/x86/kernel/head64.c             |    4 +-
 arch/x86/kernel/setup.c              |   16 +++++-----
 arch/x86/kernel/x8664_ksyms_64.c     |    3 ++
 arch/x86/lguest/boot.c               |    3 +-
 arch/x86/mm/init_64.c                |   18 +++++------
 arch/x86/mm/pageattr.c               |    8 ++---
 arch/x86/mm/physaddr.c               |   55 +++++++++++++++++++++++++---------
 arch/x86/platform/efi/efi.c          |    4 +-
 arch/x86/realmode/init.c             |    8 ++---
 arch/x86/xen/mmu.c                   |   21 +++++++------
 17 files changed, 111 insertions(+), 65 deletions(-)


Thread overview: 15+ messages
2012-11-05 19:03 [PATCH v3 0/8] Improve performance of VM translation on x86_64 Alexander Duyck
2012-11-05 19:04 ` [PATCH v3 1/8] x86: Improve __phys_addr performance by making use of carry flags and inlining Alexander Duyck
2012-11-05 20:24   ` Kirill A. Shutemov
2012-11-05 21:56     ` Alexander Duyck
2012-11-05 22:08       ` Kirill A. Shutemov
2012-11-16 19:35         ` Alexander Duyck
2012-11-05 19:04 ` [PATCH v3 2/8] x86: Make it so that __pa_symbol can only process kernel symbols on x86_64 Alexander Duyck
2012-11-05 19:04 ` [PATCH v3 3/8] x86: Drop 4 unnecessary calls to __pa_symbol Alexander Duyck
2012-11-05 19:05 ` [PATCH v3 4/8] x86: Use __pa_symbol instead of __pa on C visible symbols Alexander Duyck
2012-11-05 19:05 ` [PATCH v3 5/8] x86/ftrace: " Alexander Duyck
2012-11-05 19:06 ` [PATCH v3 6/8] x86/xen: " Alexander Duyck
2012-11-06 15:45   ` Konrad Rzeszutek Wilk
2012-11-05 19:06 ` [PATCH v3 7/8] x86/acpi: " Alexander Duyck
2012-11-05 19:06 ` [PATCH v3 8/8] x86/lguest: " Alexander Duyck
2012-11-06  1:11   ` Rusty Russell
