linux-mm.kvack.org archive mirror
* [RFC v1 0/2] Per-arch page checksumming and comparison
@ 2017-09-25  8:46 Claudio Imbrenda
From: Claudio Imbrenda @ 2017-09-25  8:46 UTC
  To: linux-kernel
  Cc: borntraeger, kvm, linux-mm, nefelim4ag, akpm, aarcange, mingo,
	zhongjiang, kirill.shutemov, arvind.yadav.cs, solee, ak

Since we now have two different proposals on how to speed up KSM, I
thought I'd share what I had done too, so we can now have three :)

I have analysed the performance of KSM, and I have found out that both
the checksum and the memcmp take up a significant amount of time.
Depending on the content of the pages, either function can be the
"bottleneck".

I did some synthetic benchmarks, using different checksum functions and
different page content scenarios. Only in the best case for memcmp
(e.g. pages differing at the very beginning) did the checksum consume
more CPU time than the memcmps.
Using a simpler function (like CRC32 or even just a simple sum)
significantly reduced the CPU load.
In other scenarios, like when the pages differ in the middle or at the
end, the biggest offender is the memcmp. Still, using simpler checksums
lowers the overall CPU load.
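
To illustrate why the position of the first difference matters, here is
a rough userspace cost model (not the benchmark code; DEMO_PAGE_SIZE and
both function names are made up for this sketch). A real memcmp works on
wider units than bytes, but the early-exit behaviour is the same, while
a checksum always has to read the whole page.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Number of bytes a byte-wise, early-exit comparison reads from each
 * page before it can return (hypothetical helper, for illustration).
 */
static size_t compare_bytes_read(const unsigned char *a,
				 const unsigned char *b)
{
	assert(a && b);
	for (size_t i = 0; i < DEMO_PAGE_SIZE; i++)
		if (a[i] != b[i])
			return i + 1;	/* stops at the first mismatch */
	return DEMO_PAGE_SIZE;		/* identical pages: read it all */
}

/* A checksum cannot exit early: it always reads the whole page. */
static size_t checksum_bytes_read(void)
{
	return DEMO_PAGE_SIZE;
}
```

With pages that differ in the first byte the comparison reads one byte
and the checksum dominates; with pages that differ only in the last byte
both read the full page, and the comparison reads it from two pages.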

The idea behind this patch series is to provide arch-overridable
functions to checksum and compare whole pages.

Depending on the arch, the best memcmp/checksum for the specialized
case of comparing/checksumming one whole page is not necessarily the
one that is best in the general case. So I factored out the old code
and made it generic, and then provided an s390-specific implementation
of the checksum using the CKSM instruction, which is also used to
calculate the checksum of IP headers. The idea is that other
architectures can then follow and use their preferred checksum.
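
As a minimal userspace sketch of the override pattern (this is not the
code from the patches; demo_page_cmp, demo_page_csum, the generic_*
helpers and DEMO_PAGE_SIZE are hypothetical names, and the plain word
sum is only a stand-in for whatever the generic checksum really is):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * An architecture would define demo_page_cmp / demo_page_csum before
 * this point to supply its own versions; otherwise the generic
 * fallbacks below are used.
 */
#ifndef demo_page_cmp
static inline int generic_page_cmp(const void *a, const void *b)
{
	assert(a && b);
	return memcmp(a, b, DEMO_PAGE_SIZE);
}
#define demo_page_cmp generic_page_cmp
#endif

#ifndef demo_page_csum
static inline uint32_t generic_page_csum(const void *page)
{
	const unsigned char *p = page;
	uint32_t sum = 0, word;

	assert(page);
	/* Sum the page as 32-bit words; memcpy avoids aliasing issues. */
	for (size_t i = 0; i < DEMO_PAGE_SIZE; i += sizeof(word)) {
		memcpy(&word, p + i, sizeof(word));
		sum += word;
	}
	return sum;
}
#define demo_page_csum generic_page_csum
#endif
```

Callers only ever use demo_page_cmp() and demo_page_csum(), so an arch
can swap in a specialized whole-page routine without touching them.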


I like Sioh Lee's proposal of using the crypto API to choose a fast but
good checksum, since this can be made arch-dependent too, and CRC32 is
also almost as fast as the simple checksum. Also, I had underestimated
how many more collisions the simple checksum could potentially cause
(although I did not see any performance regressions in my tests).
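
A small userspace sketch of the collision issue (function names are
made up for illustration; the CRC here is a plain bitwise CRC-32 with
the reflected IEEE polynomial, which is not necessarily the variant the
crypto API would pick): an additive checksum does not change when bytes
are reordered, while a CRC does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simple additive checksum: insensitive to the order of the bytes. */
static uint32_t byte_sum(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;

	assert(buf);
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Bitwise CRC-32, reflected polynomial 0xEDB88320, no lookup table. */
static uint32_t crc32_bitwise(const unsigned char *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	assert(buf);
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}
```

Two otherwise zero pages that start with the bytes {1, 2} and {2, 1}
get the same byte_sum() but different crc32_bitwise() values, so with
the simple sum a change like this would go unnoticed by the checksum.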

While there is a crypto API to choose between different hash functions,
there is nothing like that for page comparison.


I think at this point we need to coordinate a little, to avoid
reinventing the wheel three times and in different ways.




Claudio Imbrenda (2):
  VS1544 KSM generic memory comparison functions
  VS1544 KSM s390-specific memory comparison functions

 arch/alpha/include/asm/Kbuild       |  1 +
 arch/arc/include/asm/Kbuild         |  1 +
 arch/arm/include/asm/Kbuild         |  1 +
 arch/arm64/include/asm/Kbuild       |  1 +
 arch/blackfin/include/asm/Kbuild    |  1 +
 arch/c6x/include/asm/Kbuild         |  1 +
 arch/cris/include/asm/Kbuild        |  1 +
 arch/frv/include/asm/Kbuild         |  1 +
 arch/h8300/include/asm/Kbuild       |  1 +
 arch/hexagon/include/asm/Kbuild     |  1 +
 arch/ia64/include/asm/Kbuild        |  1 +
 arch/m32r/include/asm/Kbuild        |  1 +
 arch/m68k/include/asm/Kbuild        |  1 +
 arch/metag/include/asm/Kbuild       |  1 +
 arch/microblaze/include/asm/Kbuild  |  1 +
 arch/mips/include/asm/Kbuild        |  1 +
 arch/mn10300/include/asm/Kbuild     |  1 +
 arch/nios2/include/asm/Kbuild       |  1 +
 arch/openrisc/include/asm/Kbuild    |  1 +
 arch/parisc/include/asm/Kbuild      |  1 +
 arch/powerpc/include/asm/Kbuild     |  1 +
 arch/s390/include/asm/page_memops.h | 18 ++++++++++++++++++
 arch/score/include/asm/Kbuild       |  1 +
 arch/sh/include/asm/Kbuild          |  1 +
 arch/sparc/include/asm/Kbuild       |  1 +
 arch/tile/include/asm/Kbuild        |  1 +
 arch/um/include/asm/Kbuild          |  1 +
 arch/unicore32/include/asm/Kbuild   |  1 +
 arch/x86/include/asm/Kbuild         |  1 +
 arch/xtensa/include/asm/Kbuild      |  1 +
 include/asm-generic/page_memops.h   | 31 +++++++++++++++++++++++++++++++
 mm/ksm.c                            | 27 +++------------------------
 32 files changed, 81 insertions(+), 24 deletions(-)
 create mode 100644 arch/s390/include/asm/page_memops.h
 create mode 100644 include/asm-generic/page_memops.h

-- 
2.7.4



Thread overview: 5+ messages
2017-09-25  8:46 [RFC v1 0/2] Per-arch page checksumming and comparison Claudio Imbrenda
2017-09-25  8:46 ` [RFC v1 1/2] VS1544 KSM generic memory comparison functions Claudio Imbrenda
2017-09-25 10:06   ` Geert Uytterhoeven
2017-09-25  8:46 ` [RFC v1 2/2] VS1544 KSM s390-specific " Claudio Imbrenda
2017-09-25 11:41 ` [RFC v1 0/2] Per-arch page checksumming and comparison Andrea Arcangeli
