* [RFC PATCH 0/3] memtests for ioengines using mmap
@ 2018-01-18 23:53 Robert Elliott
2018-01-18 23:53 ` [PATCH 1/3] memcpytest: Add more sizes Robert Elliott
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Robert Elliott @ 2018-01-18 23:53 UTC (permalink / raw)
To: fio; +Cc: Robert Elliott
Add memtest workloads for ioengines that use mmap, running directly
within the memory-mapped region (not to/from a separate transfer buffer
in regular memory). Useful for persistent memory testing.
Tests include:
memcpy = copy with libc memcpy() (d = s)(one read, one write)
memscan = read memory to registers (one read)
memset = write memory from registers with libc memset() (one write)
wmemset = write memory from registers with libc wmemset() (one write)
streamcopy = STREAM copy (d = s)(one read, one write)
streamadd = STREAM add (d = s1 + s2)(two reads, add, one write)
streamscale = STREAM scale (d = 3 * s1)(one read, multiply, one write)
streamtriad = STREAM triad (d = s1 + 3 * s2)(two reads, add and
multiply, one write)
Open issues:
* make memscan architecture-independent (or make the test unavailable
on non-x86 architectures). The initial generic memcsum attempt still
results in the compiler generating memory writes.
* ensure fio is not allocating/filling unused xfer_buf
* ensure Read and Write statistics make sense for each memtest (e.g.
streamadd should count 2x reads and 1x writes)
* make use_glibc_nt functional (needs to be earlier, may not even
be possible)
* combine map_populate, glibc_nt, etc. to avoid creating too many
top-level fio options
* add to dev-dax and libpmem ioengines
Robert Elliott (3):
memcpytest: Add more sizes
memcpytest: add more memcpy tests
ioengines: add memtest workloads for ioengines using mmap
HOWTO | 37 +++++
debug.h | 1 +
engines/dev-dax.c | 12 +-
engines/libpmem.c | 18 +--
engines/mmap.c | 142 +++++++++++++++++--
fio.1 | 37 +++++
init.c | 4 +
io_ddir.h | 27 +++-
io_u.c | 3 +-
io_u.h | 9 +-
lib/memcpy.c | 411 ++++++++++++++++++++++++++++++++++++++++++++++++------
lib/memcpy.h | 4 +
options.c | 91 ++++++++++++
thread_options.h | 7 +
14 files changed, 733 insertions(+), 70 deletions(-)
--
2.14.3
* [PATCH 1/3] memcpytest: Add more sizes
2018-01-18 23:53 [RFC PATCH 0/3] memtests for ioengines using mmap Robert Elliott
@ 2018-01-18 23:53 ` Robert Elliott
2018-01-18 23:53 ` [PATCH 2/3] memcpytest: add more memcpy tests Robert Elliott
2018-01-18 23:53 ` [PATCH 3/3] ioengines: add memtest workloads for ioengines using mmap Robert Elliott
2 siblings, 0 replies; 5+ messages in thread
From: Robert Elliott @ 2018-01-18 23:53 UTC (permalink / raw)
To: fio; +Cc: Robert Elliott
From: Robert Elliott <elliott@hpe.com>
Run memcpy tests over much larger sizes (L3 cache size and larger),
and reduce the number of iterations.
---
lib/memcpy.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 78 insertions(+), 10 deletions(-)
diff --git a/lib/memcpy.c b/lib/memcpy.c
index 00e65aa7..a79d7c50 100644
--- a/lib/memcpy.c
+++ b/lib/memcpy.c
@@ -8,9 +8,17 @@
#include "../gettime.h"
#include "../fio.h"
-#define BUF_SIZE 32 * 1024 * 1024ULL
+/* largest last-level CPU cache size of an x86 in 2018 in bytes */
+#define LLC_SIZE (45 * 1024 * 1024ULL)
-#define NR_ITERS 64
+#define BUF_SIZE (LLC_SIZE * 4ULL)
+
+/* alignment in bytes for the buffers. Ensure that functions like
+ * libc memcpy can use their fastest paths (512 B for x86_64 AVX2).
+ */
+#define BUF_ALIGN 512
+
+#define NR_ITERS 8
struct memcpy_test {
const char *name;
@@ -21,15 +29,27 @@ struct memcpy_test {
static struct memcpy_test tests[] = {
{
- .name = "8 bytes",
+ .name = " 4 bytes",
+ .size = 4,
+ },
+ {
+ .name = " 8 bytes",
.size = 8,
},
{
- .name = "16 bytes",
+ .name = " 16 bytes",
.size = 16,
},
{
- .name = "96 bytes",
+ .name = " 32 bytes",
+ .size = 32,
+ },
+ {
+ .name = " 64 bytes",
+ .size = 64,
+ },
+ {
+ .name = " 96 bytes",
.size = 96,
},
{
@@ -45,25 +65,73 @@ static struct memcpy_test tests[] = {
.size = 512,
},
{
- .name = "2048 bytes",
+ .name = " 2 KiB",
.size = 2048,
},
{
- .name = "8192 bytes",
+ .name = " 4 KiB",
+ .size = 4096,
+ },
+ {
+ .name = " 8 KiB",
.size = 8192,
},
{
- .name = "131072 bytes",
+ .name = "128 KiB",
.size = 131072,
},
{
- .name = "262144 bytes",
+ .name = "256 KiB",
.size = 262144,
},
{
- .name = "524288 bytes",
+ .name = "512 KiB",
.size = 524288,
},
+ {
+ .name = " 8 MiB",
+ .size = 8 * 1024 * 1024,
+ },
+ {
+ .name = "6x 1.375 MiB",
+ .size = 8650752,
+ },
+ {
+ .name = " 9 MiB",
+ .size = 9 * 1024 * 1024,
+ },
+ {
+ .name = " 16 MiB",
+ .size = 16 * 1024 * 1024, /* 3/4 L3 size is 16.5 */
+ },
+ {
+ .name = " 17 MiB",
+ .size = 17 * 1024 * 1024, /* 3/4 L3 size is 16.5 */
+ },
+ {
+ .name = " 22 MiB",
+ .size = 22 * 1024 * 1024, /* L3 size */
+ },
+ {
+ .name = " 32 MiB",
+ .size = 32 * 1024 * 1024, /* >L3 size */
+ },
+ {
+ .name = " 40 MiB",
+ .size = 40 * 1024 * 1024,
+ },
+ {
+ .name = " 48 MiB",
+ .size = 48 * 1024 * 1024, /* larger than most L3 */
+ },
+ {
+ .name = "128 MiB",
+ .size = 128 * 1024 * 1024, /* much larger than L3 */
+ },
+ {
+ .name = "full buffer",
+ .size = BUF_SIZE,
+ },
{
.name = NULL,
},
--
2.14.3
* [PATCH 2/3] memcpytest: add more memcpy tests
2018-01-18 23:53 [RFC PATCH 0/3] memtests for ioengines using mmap Robert Elliott
2018-01-18 23:53 ` [PATCH 1/3] memcpytest: Add more sizes Robert Elliott
@ 2018-01-18 23:53 ` Robert Elliott
2018-01-25 21:22 ` Jens Axboe
2018-01-18 23:53 ` [PATCH 3/3] ioengines: add memtest workloads for ioengines using mmap Robert Elliott
2 siblings, 1 reply; 5+ messages in thread
From: Robert Elliott @ 2018-01-18 23:53 UTC (permalink / raw)
To: fio; +Cc: Robert Elliott
From: Robert Elliott <elliott@hpe.com>
Add more memcpy tests:
memcpy = copy with libc memcpy() (d = s)(one read, one write)
memcsum = read memory to registers (one read)
memset = write memory from registers with libc memset() (one write)
wmemset = write memory from registers with libc wmemset() (one write)
streamcopy = STREAM copy (d = s)(one read, one write)
streamadd = STREAM add (d = s1 + s2)(two reads, add, one write)
streamscale = STREAM scale (d = 3 * s1)(one read, multiply, one write)
streamtriad = STREAM triad (d = s1 + 3 * s2)(two reads, add and multiply, one write)
---
engines/dev-dax.c | 12 +-
engines/libpmem.c | 18 +--
engines/mmap.c | 13 ++-
lib/memcpy.c | 323 +++++++++++++++++++++++++++++++++++++++++++++++++-----
lib/memcpy.h | 4 +
5 files changed, 320 insertions(+), 50 deletions(-)
diff --git a/engines/dev-dax.c b/engines/dev-dax.c
index caae1e09..fc169450 100644
--- a/engines/dev-dax.c
+++ b/engines/dev-dax.c
@@ -73,19 +73,19 @@ static int fio_devdax_file(struct thread_data *td, struct fio_file *f,
size_t length, off_t off)
{
struct fio_devdax_data *fdd = FILE_ENG_DATA(f);
- int flags = 0;
+ int prot = 0;
if (td_rw(td))
- flags = PROT_READ | PROT_WRITE;
+ prot = PROT_READ | PROT_WRITE;
else if (td_write(td)) {
- flags = PROT_WRITE;
+ prot = PROT_WRITE;
if (td->o.verify != VERIFY_NONE)
- flags |= PROT_READ;
+ prot |= PROT_READ;
} else
- flags = PROT_READ;
+ prot = PROT_READ;
- fdd->devdax_ptr = mmap(NULL, length, flags, MAP_SHARED, f->fd, off);
+ fdd->devdax_ptr = mmap(NULL, length, prot, MAP_SHARED, f->fd, off);
if (fdd->devdax_ptr == MAP_FAILED) {
fdd->devdax_ptr = NULL;
td_verror(td, errno, "mmap");
diff --git a/engines/libpmem.c b/engines/libpmem.c
index aa0a36f9..a6fdf964 100644
--- a/engines/libpmem.c
+++ b/engines/libpmem.c
@@ -318,31 +318,31 @@ static int fio_libpmem_file(struct thread_data *td, struct fio_file *f,
size_t length, off_t off)
{
struct fio_libpmem_data *fdd = FILE_ENG_DATA(f);
- int flags = 0;
+ int prot = 0;
void *addr = NULL;
dprint(FD_IO, "DEBUG fio_libpmem_file\n");
if (td_rw(td))
- flags = PROT_READ | PROT_WRITE;
+ prot = PROT_READ | PROT_WRITE;
else if (td_write(td)) {
- flags = PROT_WRITE;
+ prot = PROT_WRITE;
if (td->o.verify != VERIFY_NONE)
- flags |= PROT_READ;
+ prot |= PROT_READ;
} else
- flags = PROT_READ;
+ prot = PROT_READ;
dprint(FD_IO, "f->file_name = %s td->o.verify = %d \n", f->file_name,
td->o.verify);
- dprint(FD_IO, "length = %ld flags = %d f->fd = %d off = %ld \n",
- length, flags, f->fd,off);
+ dprint(FD_IO, "length = %ld prot = %d f->fd = %d off = %ld \n",
+ length, prot, f->fd,off);
addr = util_map_hint(length, 0);
dprint(FD_IO, "DEBUG mmap addr=%p length=0x%lx prot=0x%x\n",
- addr, length, flags);
- fdd->libpmem_ptr = mmap(addr, length, flags, MAP_SHARED, f->fd, off);
+ addr, length, prot);
+ fdd->libpmem_ptr = mmap(addr, length, prot, MAP_SHARED, f->fd, off);
if (fdd->libpmem_ptr == MAP_FAILED) {
fdd->libpmem_ptr = NULL;
td_verror(td, errno, "mmap");
diff --git a/engines/mmap.c b/engines/mmap.c
index 77556588..54b5b11d 100644
--- a/engines/mmap.c
+++ b/engines/mmap.c
@@ -31,19 +31,20 @@ static int fio_mmap_file(struct thread_data *td, struct fio_file *f,
size_t length, off_t off)
{
struct fio_mmap_data *fmd = FILE_ENG_DATA(f);
- int flags = 0;
+ int prot = 0;
+ int flags = MAP_SHARED;
if (td_rw(td) && !td->o.verify_only)
- flags = PROT_READ | PROT_WRITE;
+ prot = PROT_READ | PROT_WRITE;
else if (td_write(td) && !td->o.verify_only) {
- flags = PROT_WRITE;
+ prot = PROT_WRITE;
if (td->o.verify != VERIFY_NONE)
- flags |= PROT_READ;
+ prot |= PROT_READ;
} else
- flags = PROT_READ;
+ prot = PROT_READ;
- fmd->mmap_ptr = mmap(NULL, length, flags, MAP_SHARED, f->fd, off);
+ fmd->mmap_ptr = mmap(NULL, length, prot, flags, f->fd, off);
if (fmd->mmap_ptr == MAP_FAILED) {
fmd->mmap_ptr = NULL;
td_verror(td, errno, "mmap");
diff --git a/lib/memcpy.c b/lib/memcpy.c
index a79d7c50..e52a08fd 100644
--- a/lib/memcpy.c
+++ b/lib/memcpy.c
@@ -1,7 +1,10 @@
+#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
+#include <wchar.h>
+#include "memalign.h"
#include "memcpy.h"
#include "rand.h"
#include "../fio_time.h"
@@ -23,6 +26,7 @@
struct memcpy_test {
const char *name;
void *src;
+ void *src2;
void *dst;
size_t size;
};
@@ -140,14 +144,22 @@ static struct memcpy_test tests[] = {
struct memcpy_type {
const char *name;
unsigned int mask;
- void (*fn)(struct memcpy_test *);
+ void (*fn)(struct memcpy_type *, struct memcpy_test *);
};
enum {
T_MEMCPY = 1U << 0,
T_MEMMOVE = 1U << 1,
- T_SIMPLE = 1U << 2,
+ T_SIMPLE_MEMCPY = 1U << 2,
T_HYBRID = 1U << 3,
+ T_MEMSET = 1U << 4,
+ T_WMEMSET = 1U << 5,
+ T_SIMPLE_MEMSET = 1U << 6,
+ T_MEMCSUM = 1U << 7,
+ T_STREAMCOPY = 1U << 8,
+ T_STREAMSCALE = 1U << 9,
+ T_STREAMADD = 1U << 10,
+ T_STREAMTRIAD = 1U << 11,
};
#define do_test(test, fn) do { \
@@ -171,31 +183,61 @@ enum {
} \
} while (0)
-static void t_memcpy(struct memcpy_test *test)
+#define do_test_twosources(t, test, fn) do { \
+ size_t left, this; \
+ void *src, *src2, *dst; \
+ int i; \
+ \
+ for (i = 0; i < NR_ITERS; i++) { \
+ left = BUF_SIZE; \
+ src = test->src; \
+ src2 = test->src2; \
+ dst = test->dst; \
+ while (left) { \
+ this = test->size; \
+ if (this > left) \
+ this = left; \
+ (fn)(dst, src, src2, this); \
+ left -= this; \
+ src += this; \
+ src2 += this; \
+ dst += this; \
+ } \
+ } \
+} while (0)
+
+static void flush_caches(struct memcpy_type *t, struct memcpy_test *test)
+{
+ __builtin___clear_cache(test->src, test->src + BUF_SIZE);
+ __builtin___clear_cache(test->src2, test->src2 + BUF_SIZE);
+ __builtin___clear_cache(test->dst, test->dst + BUF_SIZE);
+}
+
+static void t_memcpy(struct memcpy_type *t, struct memcpy_test *test)
{
do_test(test, memcpy);
}
-static void t_memmove(struct memcpy_test *test)
+static void t_memmove(struct memcpy_type *t, struct memcpy_test *test)
{
do_test(test, memmove);
}
static void simple_memcpy(void *dst, void const *src, size_t len)
{
- char *d = dst;
+ char *d = dst;
const char *s = src;
while (len--)
*d++ = *s++;
}
-static void t_simple(struct memcpy_test *test)
+static void t_simple_memcpy(struct memcpy_type *t, struct memcpy_test *test)
{
do_test(test, simple_memcpy);
}
-static void t_hybrid(struct memcpy_test *test)
+static void t_hybrid(struct memcpy_type *t, struct memcpy_test *test)
{
if (test->size >= 64)
do_test(test, simple_memcpy);
@@ -203,6 +245,186 @@ static void t_hybrid(struct memcpy_test *test)
do_test(test, memcpy);
}
+static void t_memset(struct memcpy_type *t, struct memcpy_test *test)
+{
+ size_t left, this;
+ void *dst;
+ int i;
+
+ for (i = 0; i < NR_ITERS; i++) {
+ left = BUF_SIZE;
+ dst = test->dst;
+ // NOTE: a final chunk shorter than test->size is clamped to the remaining bytes
+ while (left) {
+ this = test->size;
+ if (this > left)
+ this = left;
+ memset(dst, 0x00, this);
+ left -= this;
+ dst += this;
+ }
+ }
+}
+
+static void t_wmemset(struct memcpy_type *t, struct memcpy_test *test)
+{
+ size_t left, this;
+ void *dst;
+ int i;
+
+ for (i = 0; i < NR_ITERS; i++) {
+ left = BUF_SIZE;
+ dst = test->dst;
+ // NOTE: a final chunk shorter than test->size is clamped to the remaining bytes
+ while (left) {
+ this = test->size;
+ if (this > left)
+ this = left;
+ wmemset(dst, 0x0000, this / sizeof(wchar_t));
+ left -= this;
+ dst += this;
+ }
+ }
+}
+static void simple_memset(void *dst, uint8_t val, size_t len)
+{
+ uint8_t *d = dst;
+
+ // assert len is multiple of 8
+ while (len) {
+ *d++ = val + len;
+ len -= sizeof(uint8_t);
+ }
+}
+
+static void t_simple_memset(struct memcpy_type *t, struct memcpy_test *test)
+{
+ size_t left, this;
+ uint8_t *dst;
+ int i;
+
+ for (i = 0; i < NR_ITERS; i++) {
+ left = BUF_SIZE;
+ dst = test->dst;
+ // NOTE: a final chunk shorter than test->size is clamped to the remaining bytes
+ while (left) {
+ this = test->size;
+ if (this > left)
+ this = left;
+ simple_memset(dst, 0x00, this);
+ left -= this;
+ dst += this;
+ }
+ }
+}
+
+volatile uint64_t csum;
+static void simple_memcsum(void const *src, size_t len)
+{
+ const uint64_t *s = src;
+
+ // assert len is multiple of 8
+ while (len) {
+ csum += *s++;
+ len -= sizeof(uint64_t);
+ }
+}
+
+// read memory, but use all the results so it is not optimized away
+// to benchmark read performance
+static void t_memcsum(struct memcpy_type *t, struct memcpy_test *test)
+{
+ size_t left, this;
+ void *src;
+ int i;
+
+ if (test->size < sizeof csum)
+ return;
+ for (i = 0; i < NR_ITERS; i++) {
+ left = BUF_SIZE;
+ src = test->src;
+ while (left) {
+ this = test->size;
+ if (this > left)
+ this = left;
+ simple_memcsum(src, this);
+ left -= this;
+ src += this;
+ }
+ }
+}
+
+const double scalar = 3.0;
+void streamcopy(void *dst, void const *src, size_t len)
+{
+ double *d = dst;
+ const double *s = src;
+
+ while (len -= sizeof(double))
+ *d++ = *s++;
+}
+
+static void t_streamcopy(struct memcpy_type *t, struct memcpy_test *test)
+{
+ if (test->size < sizeof scalar)
+ return;
+ do_test(test, streamcopy);
+}
+
+void streamscale(void *dst, void const *src, size_t len)
+{
+ double *d = dst;
+ const double *s = src;
+
+ while (len -= sizeof(double))
+ *d++ = scalar * *s++;
+}
+
+static void t_streamscale(struct memcpy_type *t, struct memcpy_test *test)
+{
+ if (test->size < sizeof scalar)
+ return;
+ do_test(test, streamscale);
+}
+
+void streamadd(void *dst, void const *src, void const *src2, size_t len)
+{
+ double *d = dst;
+ const double *s = src;
+ const double *s2 = src2;
+
+ while (len) {
+ *d++ = *s++ + *s2++;
+ len -= sizeof(double);
+ }
+}
+
+static void t_streamadd(struct memcpy_type *t, struct memcpy_test *test)
+{
+ if (test->size < sizeof scalar)
+ return;
+ do_test_twosources(t, test, streamadd);
+}
+
+void streamtriad(void *dst, void const *src, void const *src2, size_t len)
+{
+ double *d = dst;
+ const double *s = src;
+ const double *s2 = src2;
+
+ while (len) {
+ *d++ = *s++ + scalar * *s2++;
+ len -= sizeof(double);
+ }
+}
+
+static void t_streamtriad(struct memcpy_type *t, struct memcpy_test *test)
+{
+ if (test->size < sizeof scalar)
+ return;
+ do_test_twosources(t, test, streamtriad);
+}
+
static struct memcpy_type t[] = {
{
.name = "memcpy",
@@ -215,9 +437,49 @@ static struct memcpy_type t[] = {
.fn = t_memmove,
},
{
- .name = "simple",
- .mask = T_SIMPLE,
- .fn = t_simple,
+ .name = "simple_memcpy",
+ .mask = T_SIMPLE_MEMCPY,
+ .fn = t_simple_memcpy,
+ },
+ {
+ .name = "memset",
+ .mask = T_MEMSET,
+ .fn = t_memset,
+ },
+ {
+ .name = "wmemset",
+ .mask = T_WMEMSET,
+ .fn = t_wmemset,
+ },
+ {
+ .name = "simple_memset",
+ .mask = T_SIMPLE_MEMSET,
+ .fn = t_simple_memset,
+ },
+ {
+ .name = "memcsum",
+ .mask = T_MEMCSUM,
+ .fn = t_memcsum,
+ },
+ {
+ .name = "streamcopy",
+ .mask = T_STREAMCOPY,
+ .fn = t_streamcopy,
+ },
+ {
+ .name = "streamscale",
+ .mask = T_STREAMSCALE,
+ .fn = t_streamscale,
+ },
+ {
+ .name = "streamadd",
+ .mask = T_STREAMADD,
+ .fn = t_streamadd,
+ },
+ {
+ .name = "streamtriad",
+ .mask = T_STREAMTRIAD,
+ .fn = t_streamtriad,
},
{
.name = "hybrid",
@@ -265,23 +527,27 @@ static int setup_tests(void)
{
struct memcpy_test *test;
struct frand_state state;
- void *src, *dst;
+ void *src, *src2, *dst;
int i;
- src = malloc(BUF_SIZE);
- dst = malloc(BUF_SIZE);
- if (!src || !dst) {
- free(src);
- free(dst);
+ // align to multiple of cache line size so library functions take the
+ // optimized paths
+ // e.g., __memmove_avx_erms rather than __memmove_avx_unaligned_erms
+ src = fio_memalign(BUF_ALIGN, BUF_SIZE);
+ src2 = fio_memalign(BUF_ALIGN, BUF_SIZE);
+ dst = fio_memalign(BUF_ALIGN, BUF_SIZE);
+ if (!src || !src2 || !dst)
+ // FIXFIX free too
return 1;
- }
init_rand_seed(&state, 0x8989, 0);
fill_random_buf(&state, src, BUF_SIZE);
+ fill_random_buf(&state, src2, BUF_SIZE);
for (i = 0; tests[i].name; i++) {
test = &tests[i];
test->src = src;
+ test->src2 = src2;
test->dst = dst;
}
@@ -290,8 +556,9 @@ static int setup_tests(void)
static void free_tests(void)
{
- free(tests[0].src);
- free(tests[0].dst);
+ fio_memfree(tests[0].src, BUF_SIZE);
+ fio_memfree(tests[0].src2, BUF_SIZE);
+ fio_memfree(tests[0].dst, BUF_SIZE);
}
int fio_memcpy_test(const char *type)
@@ -316,6 +583,9 @@ int fio_memcpy_test(const char *type)
return 1;
}
+ printf("memcpytest compile-time options: BUF_SIZE=%lld MiB, NR_INTERS=%d\n",
+ BUF_SIZE / 1024 / 1024, NR_ITERS);
+
for (i = 0; t[i].name; i++) {
struct timespec ts;
double mb_sec;
@@ -324,18 +594,13 @@ int fio_memcpy_test(const char *type)
if (!(t[i].mask & test_mask))
continue;
- /*
- * For first run, make sure CPUs are spun up and that
- * we've touched the data.
- */
- usec_spin(100000);
- t[i].fn(&tests[0]);
-
printf("%s\n", t[i].name);
for (j = 0; tests[j].name; j++) {
+ flush_caches(&t[i], &tests[j]);
fio_gettime(&ts, NULL);
- t[i].fn(&tests[j]);
+ t[i].fn(&t[i], &tests[j]);
+ flush_caches(&t[i], &tests[j]);
usec = utime_since_now(&ts);
if (usec) {
@@ -343,9 +608,9 @@ int fio_memcpy_test(const char *type)
mb_sec = (double) mb / (double) usec;
mb_sec /= (1.024 * 1.024);
- printf("\t%s:\t%8.2f MiB/sec\n", tests[j].name, mb_sec);
+ printf("\t%s:\t%8.2f MiB/s\n", tests[j].name, mb_sec);
} else
- printf("\t%s:inf MiB/sec\n", tests[j].name);
+ printf("\t%s:\tinf MiB/s\n", tests[j].name);
}
}
diff --git a/lib/memcpy.h b/lib/memcpy.h
index f61a4a09..86006e71 100644
--- a/lib/memcpy.h
+++ b/lib/memcpy.h
@@ -2,5 +2,9 @@
#define FIO_MEMCPY_H
int fio_memcpy_test(const char *type);
+void streamcopy(void *dst, void const *src, size_t len);
+void streamscale(void *dst, void const *src, size_t len);
+void streamadd(void *dst, void const *src, void const *src2, size_t len);
+void streamtriad(void *dst, void const *src, void const *src2, size_t len);
#endif
--
2.14.3
* [PATCH 3/3] ioengines: add memtest workloads for ioengines using mmap
2018-01-18 23:53 [RFC PATCH 0/3] memtests for ioengines using mmap Robert Elliott
2018-01-18 23:53 ` [PATCH 1/3] memcpytest: Add more sizes Robert Elliott
2018-01-18 23:53 ` [PATCH 2/3] memcpytest: add more memcpy tests Robert Elliott
@ 2018-01-18 23:53 ` Robert Elliott
2 siblings, 0 replies; 5+ messages in thread
From: Robert Elliott @ 2018-01-18 23:53 UTC (permalink / raw)
To: fio; +Cc: Robert Elliott
From: Robert Elliott <elliott@hpe.com>
Add memtest workloads for ioengines that use mmap, running directly
within the memory-mapped region (not to/from a separate transfer buffer
in regular memory). Useful for persistent memory testing.
Tests include:
memcpy = copy with libc memcpy() (d = s)(one read, one write)
memscan = read memory to registers (one read)
memset = write memory from registers with libc memset() (one write)
wmemset = write memory from registers with libc wmemset() (one write)
streamcopy = STREAM copy (d = s)(one read, one write)
streamadd = STREAM add (d = s1 + s2)(two reads, add, one write)
streamscale = STREAM scale (d = 3 * s1)(one read, multiply, one write)
streamtriad = STREAM triad (d = s1 + 3 * s2)(two reads, add and multiply, one write)
NOTE: memscan function is x86-specific, not ready for inclusion yet.
---
HOWTO | 37 ++++++++++++++++
debug.h | 1 +
engines/mmap.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++++--
fio.1 | 37 ++++++++++++++++
init.c | 4 ++
io_ddir.h | 27 ++++++++++--
io_u.c | 3 +-
io_u.h | 9 +++-
options.c | 91 +++++++++++++++++++++++++++++++++++++++
thread_options.h | 7 +++
10 files changed, 334 insertions(+), 9 deletions(-)
diff --git a/HOWTO b/HOWTO
index 78fa6ccf..b2d0c69e 100644
--- a/HOWTO
+++ b/HOWTO
@@ -992,6 +992,9 @@ I/O type
Sequential writes.
**trim**
Sequential trims (Linux block devices only).
+ **memtest**
+ Memory test (ioengines using mmap only).
+ Specified with memtest=.
**randread**
Random reads.
**randwrite**
@@ -1019,6 +1022,40 @@ I/O type
For instance, using ``rw=write:4k`` will skip 4k for every write. Also see
the :option:`rw_sequencer` option.
+.. option:: memtest=str
+
+ Type of memory test to perform if rw=memtest is specified.
+ For use with ioengines supporting mmap() - performs the tests within the
+ memory mapped region. Useful for persistent memory testing.
+
+ Accepted values are:
+
+ **memcpy**
+ copy with libc memcpy() (d = s)(one read, one write)
+ **memscan** (default)
+ read memory to registers (one read)
+ **memset**
+ write memory from registers with libc memset() (one write)
+ **wmemset**
+ write memory from registers with libc wmemset() (one write)
+ **streamcopy**
+ STREAM copy (d = s)(one read, one write)
+ **streamadd**
+ STREAM add (d = s1 + s2)(two reads, add, one write)
+ **streamscale**
+ STREAM scale (d = 3 * s1)(one read, multiply, one write)
+ **streamtriad**
+ STREAM triad (d = s1 + 3 * s2)(two reads, add and multiply, one write)
+
+ If library functions are provided by glibc, memcpy() honors this
+ environment variable:
+ export GLIBC_TUNABLES=glibc.tune.x86_non_temporal_threshold=131072
+ to select the threshold for choosing non-temporal stores (e.g., vmovnt)
+ rather than normal stores (e.g., rep movsb).
+
+ Additional tunables might also be needed:
+ export GLIBC_TUNABLES=glibc.tune.x86_non_temporal_threshold=131072:glibc.tune.hwcaps=AVX2_Usable,ERMS,-Prefer_No_VZEROUPPER,AVX_Fast_Unaligned_Load
+
.. option:: rw_sequencer=str
If an offset modifier is given by appending a number to the ``rw=<str>``
diff --git a/debug.h b/debug.h
index e3aa3f18..e7b176c6 100644
--- a/debug.h
+++ b/debug.h
@@ -23,6 +23,7 @@ enum {
FD_COMPRESS,
FD_STEADYSTATE,
FD_HELPERTHREAD,
+ FD_MEMTEST,
FD_DEBUG_MAX,
};
diff --git a/engines/mmap.c b/engines/mmap.c
index 54b5b11d..edc59f50 100644
--- a/engines/mmap.c
+++ b/engines/mmap.c
@@ -10,7 +10,9 @@
#include <unistd.h>
#include <errno.h>
#include <sys/mman.h>
+#include <wchar.h>
+#include "../lib/memcpy.h"
#include "../fio.h"
#include "../verify.h"
@@ -34,7 +36,9 @@ static int fio_mmap_file(struct thread_data *td, struct fio_file *f,
int prot = 0;
int flags = MAP_SHARED;
- if (td_rw(td) && !td->o.verify_only)
+ if (td->o.td_memtest)
+ prot = PROT_READ | PROT_WRITE;
+ else if (td_rw(td) && !td->o.verify_only)
prot = PROT_READ | PROT_WRITE;
else if (td_write(td) && !td->o.verify_only) {
prot = PROT_WRITE;
@@ -44,7 +48,12 @@ static int fio_mmap_file(struct thread_data *td, struct fio_file *f,
} else
prot = PROT_READ;
+ if (td->o.use_map_populate)
+ flags |= MAP_POPULATE;
fmd->mmap_ptr = mmap(NULL, length, prot, flags, f->fd, off);
+ dprint(FD_MEMTEST,
+ "mmap addr=%p len=0x%lx=%ld off=0x%lx=%ld prot=0x%x flags=0x%x\n",
+ fmd->mmap_ptr, length, length, off, off, prot, flags);
if (fmd->mmap_ptr == MAP_FAILED) {
fmd->mmap_ptr = NULL;
td_verror(td, errno, "mmap");
@@ -163,6 +172,30 @@ done:
return 0;
}
+/* read from memory to register (don't write to memory) */
+static void memtoreg(uint64_t const *p, size_t len)
+{
+ uint64_t localreg = 0;
+ uint64_t ptmp = (uint64_t)p;
+ uint64_t end = (uint64_t)p + len / 8;
+
+ /* read 0x8 bytes per pass */
+ __asm__ __volatile__(
+ "loop:\n\t"
+ "mov 0(%[ptmp]), %[localreg]\n\t"
+ "add $0x8, %[ptmp]\n\t"
+ "cmp %[ptmp], %[end]\n\t"
+ "jne loop"
+ /* Output operands */
+ :"=r" (localreg)
+ /* Input operands */
+ :[localreg] "0" (localreg),
+ [ptmp] "rp" (ptmp),
+ [end] "r" (end)
+ /* Clobbered registers after another : */
+ );
+}
+
static int fio_mmapio_queue(struct thread_data *td, struct io_u *io_u)
{
struct fio_file *f = io_u->file;
@@ -170,7 +203,95 @@ static int fio_mmapio_queue(struct thread_data *td, struct io_u *io_u)
fio_ro_check(td, io_u);
- if (io_u->ddir == DDIR_READ)
+ if (io_u->memtest == TD_MEMTEST_MEMSCAN) {
+ /* presence of this keeps the compiler from optimizing away memtoreg() */
+ uint32_t volatile result = 0;
+
+ dprint(FD_MEMTEST, "memscan %p len=0x%lx\n",
+ io_u->mmap_data, io_u->xfer_buflen);
+ memtoreg(io_u->mmap_data, io_u->xfer_buflen);
+ } else if (io_u->memtest == TD_MEMTEST_MEMSET) {
+ dprint(FD_MEMTEST, "memset %p len=0x%lx\n",
+ io_u->mmap_data, io_u->xfer_buflen);
+ memset(io_u->mmap_data, 0x00, io_u->xfer_buflen);
+ } else if (io_u->memtest == TD_MEMTEST_WMEMSET) {
+ dprint(FD_MEMTEST, "wmemset %p len=0x%lx\n",
+ io_u->mmap_data, io_u->xfer_buflen);
+ wmemset(io_u->mmap_data, 0x00, io_u->xfer_buflen / sizeof(wchar_t));
+
+// HACKHACK
+#define PAGE_SIZE 4096
+
+ } else if (io_u->memtest == TD_MEMTEST_MEMCPY) {
+ size_t len = io_u->xfer_buflen / 2;
+ void *dst = io_u->mmap_data;
+ void *src = io_u->mmap_data + len;
+
+ dprint(FD_MEMTEST, "memcpy dst=%p src=%p len=0x%lx\n", dst, src, len);
+
+ // FIXFIX this doesn't work here, must be done before the process makes
+ // any memcpy() calls (first call selects the function to use)
+ if (td->o.use_glibc_nt) {
+ char ntstr[96];
+ int err;
+
+ // 1 = off (huge threshold)
+ // 2 = on (low threshold)
+ snprintf(ntstr, sizeof ntstr,
+ "GLIBC_TUNABLES=glibc.tune.x86_non_temporal_threshold=%lu",
+ (td->o.use_glibc_nt == 1)? len * 2: 0);
+
+ err = putenv(ntstr);
+ if (err)
+ dprint(FD_MEMTEST, "error setting GLIBC_TUNABLES=%s\n", ntstr);
+ else
+ dprint(FD_MEMTEST, "setting GLIBC_TUNABLES=%s\n", ntstr);
+ }
+ memcpy(dst, src, io_u->xfer_buflen / 2);
+ unsetenv("GLIBC_TUNABLES");
+ } else if (io_u->memtest == TD_MEMTEST_STREAM_COPY) {
+ size_t len = io_u->xfer_buflen / 2;
+ void *dst = io_u->mmap_data;
+ void *src = io_u->mmap_data + len;
+
+ dprint(FD_MEMTEST, "streamcopy dst=%p src=%p len=0x%lx\n",
+ dst, src, len);
+ streamcopy(dst, src, io_u->xfer_buflen / 2);
+ } else if (io_u->memtest == TD_MEMTEST_STREAM_SCALE) {
+ size_t len = io_u->xfer_buflen / 2;
+ void *dst = io_u->mmap_data;
+ void *src = io_u->mmap_data + len;
+
+ dprint(FD_MEMTEST, "streamscale dst=%p src=%p len=0x%lx\n",
+ dst, src, len);
+ streamscale(dst, src, io_u->xfer_buflen / 2);
+ } else if (io_u->memtest == TD_MEMTEST_STREAM_ADD) {
+ size_t len = (io_u->xfer_buflen / 3) & ~(PAGE_SIZE - 1);
+ void *dst = io_u->mmap_data;
+ void *src1 = PTR_ALIGN(io_u->mmap_data + len, PAGE_SIZE);
+ void *src2 = PTR_ALIGN(io_u->mmap_data + 2 * len, PAGE_SIZE);
+
+ dprint(FD_MEMTEST,
+ "streamadd dst=%p src1=%p src2=%p len=0x%lx=%ld\n",
+ dst, src1, src2, len, len);
+ dprint(FD_MEMTEST,
+ "streamadd rel dst=0x%lx src1=0x%lx src2=0x%lx\n",
+ dst - dst, src1 - dst, src2 - dst);
+ streamadd(dst, src1, src2, len);
+ } else if (io_u->memtest == TD_MEMTEST_STREAM_TRIAD) {
+ size_t len = (io_u->xfer_buflen / 3) & ~(PAGE_SIZE - 1);
+ void *dst = io_u->mmap_data;
+ void *src1 = PTR_ALIGN(io_u->mmap_data + len, PAGE_SIZE);
+ void *src2 = PTR_ALIGN(io_u->mmap_data + 2 * len, PAGE_SIZE);
+
+ dprint(FD_MEMTEST,
+ "streamtriad dst=%p src1=%p src2=%p len=0x%lx=%ld\n",
+ dst, src1, src2, len, len);
+ dprint(FD_MEMTEST,
+ "streamtriad rel dst=0x%lx src1=0x%lx src2=0x%lx\n",
+ dst - dst, src1 - dst, src2 - dst);
+ streamtriad(dst, src1, src2, len);
+ } else if (io_u->ddir == DDIR_READ)
memcpy(io_u->xfer_buf, io_u->mmap_data, io_u->xfer_buflen);
else if (io_u->ddir == DDIR_WRITE)
memcpy(io_u->mmap_data, io_u->xfer_buf, io_u->xfer_buflen);
@@ -186,7 +307,6 @@ static int fio_mmapio_queue(struct thread_data *td, struct io_u *io_u)
td_verror(td, io_u->error, "trim");
}
-
/*
* not really direct, but should drop the pages from the cache
*/
@@ -216,6 +336,7 @@ static int fio_mmapio_init(struct thread_data *td)
}
mmap_map_size = MMAP_TOTAL_SZ / o->nr_files;
+
return 0;
}
diff --git a/fio.1 b/fio.1
index 70eeeb0f..7672e9e7 100644
--- a/fio.1
+++ b/fio.1
@@ -769,6 +769,9 @@ Random writes.
.B randtrim
Random trims (Linux block devices only).
.TP
+.B memtest
+Memory test (for ioengines using mmap only).
+.TP
.B rw,readwrite
Sequential mixed reads and writes.
.TP
@@ -818,6 +821,40 @@ behaves in a similar fashion, except it sends the same offset 8 number of
times before generating a new offset.
.RE
.TP
+.BI memtest \fR=\fPstr "\fR
+Type of memory test to perform if rw=memtest is specified.
+For use with ioengines supporting mmap() - performs the tests within the
+mapped region. Useful for persistent memory testing.
+Accepted values are:
+.RS
+.RS
+.TP
+.B memcpy
+copy with libc memcpy() (d = s)(one read, one write)
+.TP
+.B memscan (default)
+read memory to registers (one read)
+.TP
+.B memset
+write memory from registers with libc memset() (one write)
+.TP
+.B wmemset
+write memory from registers with libc wmemset() (one write)
+.TP
+.B streamcopy
+STREAM copy (d = s)(one read, one write)
+.TP
+.B streamadd
+STREAM add (d = s1 + s2)(two reads, add, one write)
+.TP
+.B streamscale
+STREAM scale (d = 3 * s1)(one read, multiply, one write)
+.TP
+.B streamtriad
+STREAM triad (d = s1 + 3 * s2)(two reads, add and multiply, one write)
+.RE
+.RE
+.TP
.BI unified_rw_reporting \fR=\fPbool
Fio normally reports statistics on a per data direction basis, meaning that
reads, writes, and trims are accounted and reported separately. If this
diff --git a/init.c b/init.c
index 8a801383..78167a47 100644
--- a/init.c
+++ b/init.c
@@ -2251,6 +2251,10 @@ struct debug_level debug_levels[] = {
.help = "Helper thread logging",
.shift = FD_HELPERTHREAD,
},
+ { .name = "mmap",
+ .help = "mmap-based memory test logging",
+ .shift = FD_MEMTEST,
+ },
{ .name = NULL, },
};
diff --git a/io_ddir.h b/io_ddir.h
index 613d5fbc..0b0a0139 100644
--- a/io_ddir.h
+++ b/io_ddir.h
@@ -37,6 +37,7 @@ enum td_ddir {
TD_DDIR_RANDRW = TD_DDIR_RW | TD_DDIR_RAND,
TD_DDIR_RANDTRIM = TD_DDIR_TRIM | TD_DDIR_RAND,
TD_DDIR_TRIMWRITE = TD_DDIR_TRIM | TD_DDIR_WRITE,
+ TD_DDIR_LAST = TD_DDIR_RANDTRIM + 1
};
#define td_read(td) ((td)->o.td_ddir & TD_DDIR_READ)
@@ -61,14 +62,32 @@ static inline int ddir_rw(enum fio_ddir ddir)
static inline const char *ddir_str(enum td_ddir ddir)
{
- static const char *__str[] = { NULL, "read", "write", "rw", "rand",
- "randread", "randwrite", "randrw",
- "trim", NULL, "trimwrite", NULL, "randtrim" };
+ static const char *__str[] = {
+ NULL, "read", "write", "rw", // 0x0 - 0x3
+ "rand", "randread", "randwrite", "randrw", // 0x4 - 0x7 RAND
+ NULL, NULL, "trimwrite", NULL, // 0x8 - 0xB TRIM
+ "randtrim", NULL, NULL, NULL, // 0xC - 0xF RAND, TRIM
+ };
- return __str[ddir];
+ if (ddir < TD_DDIR_LAST)
+ return __str[ddir];
+ else
+ return NULL;
}
#define ddir_rw_sum(arr) \
((arr)[DDIR_READ] + (arr)[DDIR_WRITE] + (arr)[DDIR_TRIM])
+enum td_memtest {
+ TD_MEMTEST_MEMCPY,
+ TD_MEMTEST_MEMSCAN,
+ TD_MEMTEST_MEMSET,
+ TD_MEMTEST_WMEMSET,
+ TD_MEMTEST_STREAM_COPY,
+ TD_MEMTEST_STREAM_ADD,
+ TD_MEMTEST_STREAM_SCALE,
+ TD_MEMTEST_STREAM_TRIAD,
+};
+
#endif
diff --git a/io_u.c b/io_u.c
index 1d6872ed..738801a1 100644
--- a/io_u.c
+++ b/io_u.c
@@ -968,6 +968,7 @@ static int fill_io_u(struct thread_data *td, struct io_u *io_u)
if (td_ioengine_flagged(td, FIO_NOIO))
goto out;
+ io_u->memtest = td->o.td_memtest;
set_rw_ddir(td, io_u);
/*
@@ -1791,7 +1792,7 @@ struct io_u *get_io_u(struct thread_data *td)
f->last_start[io_u->ddir] = io_u->offset;
f->last_pos[io_u->ddir] = io_u->offset + io_u->buflen;
- if (io_u->ddir == DDIR_WRITE) {
+ if (io_u->ddir == DDIR_WRITE && !io_u->memtest) {
if (td->flags & TD_F_REFILL_BUFFERS) {
io_u_fill_buffer(td, io_u,
td->o.min_bs[DDIR_WRITE],
diff --git a/io_u.h b/io_u.h
index da25efb9..4d39a10b 100644
--- a/io_u.h
+++ b/io_u.h
@@ -37,6 +37,7 @@ struct io_u {
struct fio_file *file;
unsigned int flags;
enum fio_ddir ddir;
+ unsigned int memtest;
/*
* For replay workloads, we may want to account as a different
@@ -152,7 +153,13 @@ static inline void dprint_io_u(struct io_u *io_u, const char *p)
{
struct fio_file *f = io_u->file;
- if (f)
+ if (f && io_u->memtest)
+ dprint(FD_IO, "%s: io_u %p: off=0x%llx,len=0x%lx,ddir=%d,memtest=%d,file=%s\n",
+ p, io_u,
+ (unsigned long long) io_u->offset,
+ io_u->buflen, io_u->ddir, io_u->memtest,
+ f->file_name);
+ else if (f)
dprint(FD_IO, "%s: io_u %p: off=0x%llx,len=0x%lx,ddir=%d,file=%s\n",
p, io_u,
(unsigned long long) io_u->offset,
diff --git a/options.c b/options.c
index 9a3431d8..e6b214e1 100644
--- a/options.c
+++ b/options.c
@@ -409,6 +409,14 @@ static int str_rw_cb(void *data, const char *str)
return 0;
}
+static int str_memtest_cb(void *data, const char *str)
+{
+ /* value is stored via off1; nothing further to do here yet */
+
+ return 0;
+}
+
static int str_mem_cb(void *data, const char *mem)
{
struct thread_data *td = cb_data_to_td(data);
@@ -1534,6 +1542,19 @@ static int rw_verify(struct fio_option *o, void *data)
return 0;
}
+/* FIXME: add more checks */
+static int memtest_verify(struct fio_option *o, void *data)
+{
+ struct thread_data *td = cb_data_to_td(data);
+
+ if (read_only && td_write(td)) {
+ log_err("fio: job <%s> has write bit set, but fio is in read-only mode\n", td->o.name);
+ return 1;
+ }
+
+ return 0;
+}
+
static int gtod_cpu_verify(struct fio_option *o, void *data)
{
#ifndef FIO_HAVE_CPU_AFFINITY
@@ -1685,6 +1706,10 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
.oval = TD_DDIR_TRIM,
.help = "Sequential trim",
},
+ { .ival = "memtest",
+ .oval = TD_DDIR_WRITE, /* assume both directions for accounting */
+ .help = "Memory test for mmap engines (specify with memtest option)",
+ },
{ .ival = "randread",
.oval = TD_DDIR_RANDREAD,
.help = "Random read",
@@ -1715,6 +1740,72 @@ struct fio_option fio_options[FIO_MAX_OPTS] = {
},
},
},
+ {
+ .name = "memtest",
+ .lname = "memory test for ioengines using mmap()",
+ .type = FIO_OPT_STR,
+ .cb = str_memtest_cb,
+ .off1 = offsetof(struct thread_options, td_memtest),
+ .help = "memory test within the mmap() region of the specified file or device",
+ .def = "memscan",
+ .verify = memtest_verify,
+ .category = FIO_OPT_C_IO,
+ .group = FIO_OPT_G_IO_BASIC,
+ .posval = {
+ { .ival = "memcpy",
+ .oval = TD_MEMTEST_MEMCPY,
+ .help = "copy with libc memcpy() (d = s)(one read, one write)",
+ },
+ { .ival = "memscan",
+ .oval = TD_MEMTEST_MEMSCAN,
+ .help = "read memory to registers (one read)",
+ },
+ { .ival = "memset",
+ .oval = TD_MEMTEST_MEMSET,
+ .help = "write memory from registers with libc memset() (one write)",
+ },
+ { .ival = "wmemset",
+ .oval = TD_MEMTEST_WMEMSET,
+ .help = "write memory from registers with libc wmemset() (one write)",
+ },
+ { .ival = "streamcopy",
+ .oval = TD_MEMTEST_STREAM_COPY,
+ .help = "STREAM copy (d = s)(one read, one write)",
+ },
+ { .ival = "streamadd",
+ .oval = TD_MEMTEST_STREAM_ADD,
+ .help = "STREAM add (d = s1 + s2)(two reads, add, one write)",
+ },
+ { .ival = "streamscale",
+ .oval = TD_MEMTEST_STREAM_SCALE,
+ .help = "STREAM scale (d = 3 * s1)(one read, multiply, one write)",
+ },
+ { .ival = "streamtriad",
+ .oval = TD_MEMTEST_STREAM_TRIAD,
+ .help = "STREAM triad (d = s1 + 3 * s2)(two reads, add and multiply, one write)",
+ },
+ },
+ },
+ {
+ .name = "mmap_populate",
+ .lname = "mmap MAP_POPULATE",
+ .type = FIO_OPT_STR_SET,
+ .off1 = offsetof(struct thread_options, use_map_populate),
+ .help = "Use MAP_POPULATE on mmap() calls",
+ .def = 0,
+ .category = FIO_OPT_C_GENERAL,
+ .group = FIO_OPT_G_IO_BASIC,
+ },
+ {
+ .name = "memtest_nt",
+ .lname = "memtest non-temporal GLIBC tunable",
+ .type = FIO_OPT_STR_SET,
+ .off1 = offsetof(struct thread_options, use_glibc_nt),
+ .help = "Set GLIBC_TUNABLES nontemporal threshold below the transfer size (0=natural, 1=force temporal, 2=force NT)",
+ .def = 0,
+ .category = FIO_OPT_C_GENERAL,
+ .group = FIO_OPT_G_IO_BASIC,
+ },
{
.name = "rw_sequencer",
.lname = "RW Sequencer",
diff --git a/thread_options.h b/thread_options.h
index dc290b0b..ee51e898 100644
--- a/thread_options.h
+++ b/thread_options.h
@@ -58,6 +58,7 @@ struct thread_options {
char *ioengine_so_path;
char *mmapfile;
enum td_ddir td_ddir;
+ enum td_memtest td_memtest;
unsigned int rw_seq;
unsigned int kb_base;
unsigned int unit_base;
@@ -191,6 +192,8 @@ struct thread_options {
unsigned long long lockmem;
enum fio_memtype mem_type;
unsigned int mem_align;
+ unsigned int use_map_populate;
+ unsigned int use_glibc_nt;
unsigned long long max_latency;
@@ -338,6 +341,8 @@ struct thread_options_pack {
uint8_t ioengine[FIO_TOP_STR_MAX];
uint8_t mmapfile[FIO_TOP_STR_MAX];
uint32_t td_ddir;
+ uint32_t td_memtest;
+ uint32_t reserved;
uint32_t rw_seq;
uint32_t kb_base;
uint32_t unit_base;
@@ -469,6 +474,8 @@ struct thread_options_pack {
uint64_t lockmem;
uint32_t mem_type;
uint32_t mem_align;
+ uint32_t use_map_populate;
+ uint32_t use_glibc_nt;
uint32_t stonewall;
uint32_t new_group;
--
2.14.3
* Re: [PATCH 2/3] memcpytest: add more memcpy tests
2018-01-18 23:53 ` [PATCH 2/3] memcpytest: add more memcpy tests Robert Elliott
@ 2018-01-25 21:22 ` Jens Axboe
0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2018-01-25 21:22 UTC (permalink / raw)
To: Robert Elliott, fio
On 1/18/18 4:53 PM, Robert Elliott wrote:
> From: Robert Elliott <elliott@hpe.com>
>
> Add more memcpy tests:
> memcpy = copy with libc memcpy() (d = s)(one read, one write)
> memcsum = read memory to registers (one read)
> memset = write memory from registers with libc memset() (one write)
> wmemset = write memory from registers with libc wmemset() (one write)
> streamcopy = STREAM copy (d = s)(one read, one write)
> streamadd = STREAM add (d = s1 + s2)(two reads, add, one write)
> streamscale = STREAM scale (d = 3 * s1)(one read, multiply, one write)
> streamtriad = STREAM triad (d = s1 + 3 * s2)(two reads, add and multiply, one write)
The engine changes in here don't seem related?
> +static void flush_caches(struct memcpy_type *t, struct memcpy_test *test)
> +{
> + __builtin___clear_cache(test->src, test->src + BUF_SIZE);
> + __builtin___clear_cache(test->src2, test->src2 + BUF_SIZE);
> + __builtin___clear_cache(test->dst, test->dst + BUF_SIZE);
> +}
Is this going to work on all platforms? I'm fine with adding it, but
we'll probably need a configure test to ensure we don't break various
builds.
--
Jens Axboe
Thread overview: 5+ messages
2018-01-18 23:53 [RFC PATCH 0/3] memtests for ioengines using mmap Robert Elliott
2018-01-18 23:53 ` [PATCH 1/3] memcpytest: Add more sizes Robert Elliott
2018-01-18 23:53 ` [PATCH 2/3] memcpytest: add more memcpy tests Robert Elliott
2018-01-25 21:22 ` Jens Axboe
2018-01-18 23:53 ` [PATCH 3/3] ioengines: add memtest workloads for ioengines using mmap Robert Elliott