linux-mm.kvack.org archive mirror
* [PATCH V7 0/2] KSM replace hash algo with xxhash
@ 2018-09-13 21:19 Timofey Titovets
  2018-09-13 21:19 ` [PATCH V7 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Timofey Titovets @ 2018-09-13 21:19 UTC
  To: linux-mm
  Cc: Pavel.Tatashin, rppt, Timofey Titovets, Andrea Arcangeli, kvm,
	leesioh

From: Timofey Titovets <nefelim4ag@gmail.com>

The currently used jhash is slow; replacing it allows us to make KSM
less CPU hungry.

About speed (in kernel):
        ksm: crc32c   hash() 12081 MB/s
        ksm: xxh64    hash()  8770 MB/s
        ksm: xxh32    hash()  4529 MB/s
        ksm: jhash2   hash()  1569 MB/s

Test results from Sioh Lee (copied from another mail):
Test platform: openstack cloud platform (NEWTON version)
Experiment node: openstack based cloud compute node (CPU: xeon E5-2620 v3, memory 64gb)
VM: (2 VCPU, RAM 4GB, DISK 20GB) * 4
Linux kernel: 4.14 (latest version)
KSM setup - sleep_millisecs: 200ms, pages_to_scan: 200

Experiment process
First, we turn off KSM and launch 4 VMs.
Then we turn KSM on and measure the checksum computation time until full_scans reaches 2.

The experimental results (the experimental value is the average of the measured values)
crc32c_intel: 1084.10ns
crc32c (no hardware acceleration): 7012.51ns
xxhash32: 2227.75ns
xxhash64: 1413.16ns
jhash2: 5128.30ns

In summary, the results show that crc32c_intel outperforms all of the other
hash functions used in the experiment (its time is lower by 84.54% compared to
crc32c without hardware acceleration, 78.86% compared to jhash2, 51.33%
compared to xxhash32, and 23.28% compared to xxhash64).
The results are similar to Timofey's.

However, use only xxhash for now: using crc32c requires the crypto API
to be initialized first, and making that work well in all situations
needs a tricky solution.

So:
  - The first patch implements compile-time selection of the fastest xxhash
    implementation for the target platform.
  - The second patch replaces jhash2 with xxhash.
  
Thanks.

CC: Andrea Arcangeli <aarcange@redhat.com>
CC: linux-mm@kvack.org
CC: kvm@vger.kernel.org
CC: leesioh <solee@os.korea.ac.kr>

Timofey Titovets (2):
  xxHash: create arch dependent 32/64-bit xxhash()
  ksm: replace jhash2 with xxhash

 include/linux/xxhash.h | 23 +++++++++++++
 mm/Kconfig             |  2 ++
 mm/ksm.c               | 93 +++++++++++++++++++++++++++++++++++++++++++++++---
 3 files changed, 114 insertions(+), 4 deletions(-)

-- 
2.14.1

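The percentages quoted in the summary above can be rechecked from the listed timings. The following is a minimal sketch, not part of the patch series: the struct, the function names and the table layout are illustrative assumptions, and only the nanosecond values come from the mail.

```c
#include <stdio.h>

/* Average checksum time per page in ns, copied from the cover letter. */
struct hash_timing {
	const char *name;
	double ns;
};

static const struct hash_timing timings[] = {
	{ "crc32c (no hardware acceleration)", 7012.51 },
	{ "jhash2",                            5128.30 },
	{ "xxhash32",                          2227.75 },
	{ "xxhash64",                          1413.16 },
};

static const double crc32c_intel_ns = 1084.10;

/* Percentage by which fast_ns is lower than slow_ns. */
static double reduction_percent(double fast_ns, double slow_ns)
{
	return (1.0 - fast_ns / slow_ns) * 100.0;
}

/* Print how much faster crc32c_intel is than each other hash. */
static void print_reductions(void)
{
	size_t i;

	for (i = 0; i < sizeof(timings) / sizeof(timings[0]); i++)
		printf("%-36s %.2f%%\n", timings[i].name,
		       reduction_percent(crc32c_intel_ns, timings[i].ns));
}
```

Calling print_reductions() from a main() prints one line per hash. Because printf rounds to two decimals, the last digit may differ by 0.01 from the figures in the mail, which appear to be truncated rather than rounded.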

* [PATCH V7 1/2] xxHash: create arch dependent 32/64-bit xxhash()
  2018-09-13 21:19 [PATCH V7 0/2] KSM replace hash algo with xxhash Timofey Titovets
@ 2018-09-13 21:19 ` Timofey Titovets
  2018-09-13 21:24   ` Pasha Tatashin
  2018-09-13 21:19 ` [PATCH V7 2/2] ksm: replace jhash2 with xxhash Timofey Titovets
  2018-09-13 21:26 ` [PATCH V7 0/2] KSM replace hash algo " Pasha Tatashin
  2 siblings, 1 reply; 8+ messages in thread
From: Timofey Titovets @ 2018-09-13 21:19 UTC
  To: linux-mm
  Cc: Pavel.Tatashin, rppt, Timofey Titovets, Andrea Arcangeli, kvm,
	leesioh

From: Timofey Titovets <nefelim4ag@gmail.com>

xxh32() - fast on both 32-bit and 64-bit platforms
xxh64() - fast only on 64-bit platforms

Create xxhash(), which picks the fastest version
at compile time.

As the result depends on the CPU word size,
the main purpose of this function is in-memory hashing.

Changes:
  v2:
    - Create this patch
  v3 -> v6:
    - Nothing, whole patchset version bump

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
CC: Andrea Arcangeli <aarcange@redhat.com>
CC: linux-mm@kvack.org
CC: kvm@vger.kernel.org
CC: leesioh <solee@os.korea.ac.kr>
---
 include/linux/xxhash.h | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/include/linux/xxhash.h b/include/linux/xxhash.h
index 9e1f42cb57e9..52b073fea17f 100644
--- a/include/linux/xxhash.h
+++ b/include/linux/xxhash.h
@@ -107,6 +107,29 @@ uint32_t xxh32(const void *input, size_t length, uint32_t seed);
  */
 uint64_t xxh64(const void *input, size_t length, uint64_t seed);
 
+/**
+ * xxhash() - calculate wordsize hash of the input with a given seed
+ * @input:  The data to hash.
+ * @length: The length of the data to hash.
+ * @seed:   The seed can be used to alter the result predictably.
+ *
+ * If the hash does not need to be comparable between machines with
+ * different word sizes, this function will call whichever of xxh32()
+ * or xxh64() is faster.
+ *
+ * Return:  wordsize hash of the data.
+ */
+
+static inline unsigned long xxhash(const void *input, size_t length,
+				   uint64_t seed)
+{
+#if BITS_PER_LONG == 64
+       return xxh64(input, length, seed);
+#else
+       return xxh32(input, length, seed);
+#endif
+}
+
 /*-****************************
  * Streaming Hash Functions
  *****************************/
-- 
2.19.0

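The compile-time selection in the patch above can be tried outside the kernel with a small userspace sketch. Everything named demo_* or DEMO_* here is a hypothetical stand-in: the two hash functions are FNV-1a, not xxHash, and the seed handling (XOR into the offset basis) is an assumption made only so the signatures resemble xxh32()/xxh64(). The ULONG_MAX test plays the role that BITS_PER_LONG plays in the kernel.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for xxh32(): 32-bit FNV-1a, NOT the real xxHash algorithm. */
static uint32_t demo_hash32(const void *input, size_t length, uint32_t seed)
{
	const unsigned char *p = input;
	uint32_t h = 2166136261u ^ seed;
	size_t i;

	for (i = 0; i < length; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/* Stand-in for xxh64(): 64-bit FNV-1a, NOT the real xxHash algorithm. */
static uint64_t demo_hash64(const void *input, size_t length, uint64_t seed)
{
	const unsigned char *p = input;
	uint64_t h = 14695981039346656037ull ^ seed;
	size_t i;

	for (i = 0; i < length; i++) {
		h ^= p[i];
		h *= 1099511628211ull;
	}
	return h;
}

/* Userspace analogue of the kernel's BITS_PER_LONG. */
#if ULONG_MAX > 0xffffffffUL
#define DEMO_BITS_PER_LONG 64
#else
#define DEMO_BITS_PER_LONG 32
#endif

/* Pick the hash width matching the native word size at compile time. */
static unsigned long demo_wordsize_hash(const void *input, size_t length,
					uint64_t seed)
{
#if DEMO_BITS_PER_LONG == 64
	return demo_hash64(input, length, seed);
#else
	return demo_hash32(input, length, (uint32_t)seed);
#endif
}
```

As with xxhash() in the patch, the value returned by demo_wordsize_hash() differs between 32-bit and 64-bit builds, so it is only suitable for hashes that never leave the machine.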

* [PATCH V7 2/2] ksm: replace jhash2 with xxhash
  2018-09-13 21:19 [PATCH V7 0/2] KSM replace hash algo with xxhash Timofey Titovets
  2018-09-13 21:19 ` [PATCH V7 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
@ 2018-09-13 21:19 ` Timofey Titovets
  2018-09-13 21:24   ` Pasha Tatashin
  2018-09-13 21:26   ` Pasha Tatashin
  2018-09-13 21:26 ` [PATCH V7 0/2] KSM replace hash algo " Pasha Tatashin
  2 siblings, 2 replies; 8+ messages in thread
From: Timofey Titovets @ 2018-09-13 21:19 UTC
  To: linux-mm
  Cc: Pavel.Tatashin, rppt, Timofey Titovets, leesioh, Andrea Arcangeli,
	kvm

From: Timofey Titovets <nefelim4ag@gmail.com>

Replace jhash2 with xxhash.

Perf numbers:
Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
ksm: crc32c   hash() 12081 MB/s
ksm: xxh64    hash()  8770 MB/s
ksm: xxh32    hash()  4529 MB/s
ksm: jhash2   hash()  1569 MB/s

Test results from Sioh Lee (copied from another mail):
Test platform: openstack cloud platform (NEWTON version)
Experiment node: openstack based cloud compute node (CPU: xeon E5-2620 v3, memory 64gb)
VM: (2 VCPU, RAM 4GB, DISK 20GB) * 4
Linux kernel: 4.14 (latest version)
KSM setup - sleep_millisecs: 200ms, pages_to_scan: 200

Experiment process
First, we turn off KSM and launch 4 VMs.
Then we turn KSM on and measure the checksum computation time until full_scans reaches 2.

The experimental results (the experimental value is the average of the measured values)
crc32c_intel: 1084.10ns
crc32c (no hardware acceleration): 7012.51ns
xxhash32: 2227.75ns
xxhash64: 1413.16ns
jhash2: 5128.30ns

jhash2 will always be slower for data of PAGE_SIZE, so don't use it
in KSM at all.

Use only xxhash for now: using crc32c requires the crypto API
to be initialized first, and making that work well in all situations
needs a tricky solution.

Thanks.

Changes:
  v1 -> v2:
    - Move xxhash() to xxhash.h/c and separate patches
  v2 -> v3:
    - Move xxhash() xxhash.c -> xxhash.h
    - replace xxhash_t with 'unsigned long'
    - update kerneldoc above xxhash()
  v3 -> v4:
    - Merge xxhash/crc32 patches
    - Replace crc32 with crc32c (crc32 has the same speed as jhash2)
    - Add auto speed test and auto choice of fastest hash function
  v4 -> v5:
    - Pick up the missed xxhash patch
    - Update code to use the compile-time chosen xxhash
    - Add more macros to make code more readable
    - As only xxhash or crc32c can be used now,
      on a crc32c allocation error, skip the speed test and fall back to xxhash
    - To work around the too-early init problem (crc32c not available),
      move zero_checksum init to the first call of fastcall()
    - Don't allocate a page for hash testing, use arch zero pages for that
  v5 -> v6:
    - Use libcrc32c instead of CRYPTO API, mainly for
      code/Kconfig dependency simplification
    - Add crc32c_available():
      libcrc32c will BUG_ON on crc32c problems,
      so test crc32c availability with crc32c_available()
    - Simplify choice_fastest_hash()
    - Simplify fasthash()
    - struct rmap_item && stable_node have sizeof == 64 on x86_64,
      that makes them cache friendly. As we don't suffer from hash collisions,
      change hash type from unsigned long back to u32.
    - Fix kbuild robot warning, make all local functions static
  v6 -> v7:
    - Drop crc32c for now and use only xxhash in ksm.

Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: leesioh <solee@os.korea.ac.kr>
CC: Andrea Arcangeli <aarcange@redhat.com>
CC: linux-mm@kvack.org
CC: kvm@vger.kernel.org
---
 mm/Kconfig | 1 +
 mm/ksm.c   | 6 ++++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index a550635ea5c3..b5f923081bce 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -297,6 +297,7 @@ config MMU_NOTIFIER
 config KSM
 	bool "Enable KSM for page merging"
 	depends on MMU
+	select XXHASH
 	help
 	  Enable Kernel Samepage Merging: KSM periodically scans those areas
 	  of an application's address space that an app has advised may be
diff --git a/mm/ksm.c b/mm/ksm.c
index 5b0894b45ee5..30c595dd5d87 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -25,7 +25,7 @@
 #include <linux/pagemap.h>
 #include <linux/rmap.h>
 #include <linux/spinlock.h>
-#include <linux/jhash.h>
+#include <linux/xxhash.h>
 #include <linux/delay.h>
 #include <linux/kthread.h>
 #include <linux/wait.h>
@@ -41,6 +41,7 @@
 #include <linux/numa.h>
 
 #include <asm/tlbflush.h>
+
 #include "internal.h"
 
 #ifdef CONFIG_NUMA
@@ -303,6 +304,7 @@ static DEFINE_SPINLOCK(ksm_mmlist_lock);
 		sizeof(struct __struct), __alignof__(struct __struct),\
 		(__flags), NULL)
 
+
 static int __init ksm_slab_init(void)
 {
 	rmap_item_cache = KSM_KMEM_CACHE(rmap_item, 0);
@@ -1009,7 +1011,7 @@ static u32 calc_checksum(struct page *page)
 {
 	u32 checksum;
 	void *addr = kmap_atomic(page);
-	checksum = jhash2(addr, PAGE_SIZE / 4, 17);
+	checksum = xxhash(addr, PAGE_SIZE, 0);
 	kunmap_atomic(addr);
 	return checksum;
 }
-- 
2.19.0

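Two details of the calc_checksum() hunk above are easy to miss: jhash2() takes its length in u32 words (hence PAGE_SIZE / 4), while xxhash() takes it in bytes, and on 64-bit builds the unsigned long result is narrowed to the u32 checksum. The sketch below illustrates both in userspace. It is not the kernel code: demo_page_hash64() is a hypothetical FNV-1a stand-in for xxh64(), and DEMO_PAGE_SIZE is an assumed 4096-byte page.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Stand-in 64-bit hash (FNV-1a), used only in place of xxh64(). */
static uint64_t demo_page_hash64(const void *input, size_t length)
{
	const unsigned char *p = input;
	uint64_t h = 14695981039346656037ull;
	size_t i;

	for (i = 0; i < length; i++) {
		h ^= p[i];
		h *= 1099511628211ull;
	}
	return h;
}

/*
 * Analogue of calc_checksum(): hash the whole page, passing the length
 * in bytes, and keep only the low 32 bits of the 64-bit result.
 */
static uint32_t demo_calc_checksum(const unsigned char *page)
{
	return (uint32_t)demo_page_hash64(page, DEMO_PAGE_SIZE);
}
```

Keeping the checksum at 32 bits matches the v5 -> v6 changelog entry, which switches the hash type back to u32 so that struct rmap_item and stable_node stay at 64 bytes on x86_64.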

* Re: [PATCH V7 1/2] xxHash: create arch dependent 32/64-bit xxhash()
  2018-09-13 21:19 ` [PATCH V7 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
@ 2018-09-13 21:24   ` Pasha Tatashin
  0 siblings, 0 replies; 8+ messages in thread
From: Pasha Tatashin @ 2018-09-13 21:24 UTC
  To: Timofey Titovets, linux-mm@kvack.org
  Cc: rppt@linux.vnet.ibm.com, Timofey Titovets, Andrea Arcangeli,
	kvm@vger.kernel.org, leesioh



On 9/13/18 5:19 PM, Timofey Titovets wrote:
> From: Timofey Titovets <nefelim4ag@gmail.com>
> 
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>


* Re: [PATCH V7 2/2] ksm: replace jhash2 with xxhash
  2018-09-13 21:19 ` [PATCH V7 2/2] ksm: replace jhash2 with xxhash Timofey Titovets
@ 2018-09-13 21:24   ` Pasha Tatashin
  2018-09-13 21:26   ` Pasha Tatashin
  1 sibling, 0 replies; 8+ messages in thread
From: Pasha Tatashin @ 2018-09-13 21:24 UTC
  To: Timofey Titovets, linux-mm@kvack.org
  Cc: rppt@linux.vnet.ibm.com, Timofey Titovets, leesioh,
	Andrea Arcangeli, kvm@vger.kernel.org



On 9/13/18 5:19 PM, Timofey Titovets wrote:
> From: Timofey Titovets <nefelim4ag@gmail.com>
> 
> Signed-off-by: Timofey Titovets <nefelim4ag@gmail.com>
> Signed-off-by: leesioh <solee@os.korea.ac.kr>

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>


* Re: [PATCH V7 0/2] KSM replace hash algo with xxhash
  2018-09-13 21:19 [PATCH V7 0/2] KSM replace hash algo with xxhash Timofey Titovets
  2018-09-13 21:19 ` [PATCH V7 1/2] xxHash: create arch dependent 32/64-bit xxhash() Timofey Titovets
  2018-09-13 21:19 ` [PATCH V7 2/2] ksm: replace jhash2 with xxhash Timofey Titovets
@ 2018-09-13 21:26 ` Pasha Tatashin
  2018-09-13 21:34   ` Timofey Titovets
  2 siblings, 1 reply; 8+ messages in thread
From: Pasha Tatashin @ 2018-09-13 21:26 UTC
  To: Timofey Titovets, linux-mm@kvack.org
  Cc: rppt@linux.vnet.ibm.com, Timofey Titovets, Andrea Arcangeli,
	kvm@vger.kernel.org, leesioh



On 9/13/18 5:19 PM, Timofey Titovets wrote:
> From: Timofey Titovets <nefelim4ag@gmail.com>
> 
> Timofey Titovets (2):
>   xxHash: create arch dependent 32/64-bit xxhash()
>   ksm: replace jhash2 with xxhash
> 
>  include/linux/xxhash.h | 23 +++++++++++++
>  mm/Kconfig             |  2 ++
>  mm/ksm.c               | 93 +++++++++++++++++++++++++++++++++++++++++++++++---
>  3 files changed, 114 insertions(+), 4 deletions(-)

This diffstat is wrong. ksm.c should not gain any new lines at all.


* Re: [PATCH V7 2/2] ksm: replace jhash2 with xxhash
  2018-09-13 21:19 ` [PATCH V7 2/2] ksm: replace jhash2 with xxhash Timofey Titovets
  2018-09-13 21:24   ` Pasha Tatashin
@ 2018-09-13 21:26   ` Pasha Tatashin
  1 sibling, 0 replies; 8+ messages in thread
From: Pasha Tatashin @ 2018-09-13 21:26 UTC
  To: Timofey Titovets, linux-mm@kvack.org
  Cc: rppt@linux.vnet.ibm.com, Timofey Titovets, leesioh,
	Andrea Arcangeli, kvm@vger.kernel.org



On 9/13/18 5:19 PM, Timofey Titovets wrote:
> From: Timofey Titovets <nefelim4ag@gmail.com>
> 
> ---
>  mm/Kconfig | 1 +
>  mm/ksm.c   | 6 ++++--
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index a550635ea5c3..b5f923081bce 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -297,6 +297,7 @@ config MMU_NOTIFIER
>  config KSM
>  	bool "Enable KSM for page merging"
>  	depends on MMU
> +	select XXHASH
>  	help
>  	  Enable Kernel Samepage Merging: KSM periodically scans those areas
>  	  of an application's address space that an app has advised may be
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 5b0894b45ee5..30c595dd5d87 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -25,7 +25,7 @@
>  #include <linux/pagemap.h>
>  #include <linux/rmap.h>
>  #include <linux/spinlock.h>
> -#include <linux/jhash.h>
> +#include <linux/xxhash.h>
>  #include <linux/delay.h>
>  #include <linux/kthread.h>
>  #include <linux/wait.h>
> @@ -41,6 +41,7 @@
>  #include <linux/numa.h>
>  
>  #include <asm/tlbflush.h>
> +
>  #include "internal.h"

Please remove this change

>  
>  #ifdef CONFIG_NUMA
> @@ -303,6 +304,7 @@ static DEFINE_SPINLOCK(ksm_mmlist_lock);
>  		sizeof(struct __struct), __alignof__(struct __struct),\
>  		(__flags), NULL)
>  
> +

And this one

>  static int __init ksm_slab_init(void)
>  {
>  	rmap_item_cache = KSM_KMEM_CACHE(rmap_item, 0);
> @@ -1009,7 +1011,7 @@ static u32 calc_checksum(struct page *page)
>  {
>  	u32 checksum;
>  	void *addr = kmap_atomic(page);
> -	checksum = jhash2(addr, PAGE_SIZE / 4, 17);
> +	checksum = xxhash(addr, PAGE_SIZE, 0);
>  	kunmap_atomic(addr);
>  	return checksum;
>  }
> 


* Re: [PATCH V7 0/2] KSM replace hash algo with xxhash
  2018-09-13 21:26 ` [PATCH V7 0/2] KSM replace hash algo " Pasha Tatashin
@ 2018-09-13 21:34   ` Timofey Titovets
  0 siblings, 0 replies; 8+ messages in thread
From: Timofey Titovets @ 2018-09-13 21:34 UTC
  To: Pavel.Tatashin; +Cc: linux-mm, rppt, Andrea Arcangeli, kvm, Sioh Lee

On Fri, 14 Sep 2018 at 0:26, Pasha Tatashin <Pavel.Tatashin@microsoft.com> wrote:
>
>
>
> On 9/13/18 5:19 PM, Timofey Titovets wrote:
> > From: Timofey Titovets <nefelim4ag@gmail.com>
> >
> > Timofey Titovets (2):
> >   xxHash: create arch dependent 32/64-bit xxhash()
> >   ksm: replace jhash2 with xxhash
> >
> >  include/linux/xxhash.h | 23 +++++++++++++
> >  mm/Kconfig             |  2 ++
> >  mm/ksm.c               | 93 +++++++++++++++++++++++++++++++++++++++++++++++---
> >  3 files changed, 114 insertions(+), 4 deletions(-)
>
> This diffstat is wrong. ksm.c should not gain any new lines at all.

Sorry, that was a copy-paste error from when I reworked the patchset.
It should be:
 include/linux/xxhash.h | 23 +++++++++++++++++++++++
 mm/Kconfig             |  1 +
 mm/ksm.c               |  4 ++--

I also left some useless blank lines in the second patch; I can drop them
myself and resend if needed.

Thanks.
