From: Nirjhar Roy <nirjhar@linux.ibm.com>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: fstests@vger.kernel.org, linux-fsdevel@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH 1/2] fsx: support reads/writes from buffers backed by hugepages
Date: Tue, 24 Dec 2024 10:39:29 +0530 [thread overview]
Message-ID: <20ededbc-d915-4850-80f7-61585fdfd156@linux.ibm.com> (raw)
In-Reply-To: <CAJnrk1ad0KPYkLmW3sXimrJ52LL_quoxAYX6WUZ9jKnMTUa8-A@mail.gmail.com>
On 12/24/24 02:37, Joanne Koong wrote:
> On Thu, Dec 19, 2024 at 11:47 PM Nirjhar Roy <nirjhar@linux.ibm.com> wrote:
>> On Wed, 2024-12-18 at 13:01 -0800, Joanne Koong wrote:
>>> Add support for reads/writes from buffers backed by hugepages.
>>> This can be enabled through the '-h' flag. This flag should only be
>>> used
>>> on systems where THP capabilities are enabled.
>>>
>>> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
>>> ---
>>> ltp/fsx.c | 100 +++++++++++++++++++++++++++++++++++++++++++++++++---
>>> --
>>> 1 file changed, 92 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/ltp/fsx.c b/ltp/fsx.c
>>> index 41933354..3656fd9f 100644
>>> --- a/ltp/fsx.c
>>> +++ b/ltp/fsx.c
>>> @@ -190,6 +190,7 @@ int o_direct; /* -Z */
>>> +static long
>>> +get_hugepage_size(void)
>>> +{
>>> + const char *str = "Hugepagesize:";
>>> + long hugepage_size = -1;
>>> + char buffer[64];
>>> + FILE *file;
>>> +
>>> + file = fopen("/proc/meminfo", "r");
>>> + if (!file) {
>>> + prterr("get_hugepage_size: fopen /proc/meminfo");
>>> + return -1;
>>> + }
>>> + while (fgets(buffer, sizeof(buffer), file)) {
>>> + if (strncmp(buffer, str, strlen(str)) == 0) {
>> Extremely minor: Since str is a fixed string, why not calculate the
>> length outside the loop and not re-use strlen(str) multiple times?
> Thinking about this some more, maybe it'd be best to define it as
> const char str[] = "Hugepagesize:" as an array of chars and use sizeof
> which would be at compile-time instead of runtime.
> I'll do this for v2.
Yes, that is a good idea too. Thanks.
>
>>> + sscanf(buffer + strlen(str), "%ld",
>>> &hugepage_size);
>>> + break;
>>> + }
>>> + }
>>> + fclose(file);
>>> + if (hugepage_size == -1) {
>>> + prterr("get_hugepage_size: failed to find "
>>> + "hugepage size in /proc/meminfo\n");
>>> + return -1;
>>> + }
>>> +
>>> + /* convert from KiB to bytes */
>>> + return hugepage_size * 1024;
>> Minor: << 10 might be faster instead of '*' ?
> Will do for v2.
Thanks.
>
>>> +}
>>> +
>>> +static void *
>>> +init_hugepages_buf(unsigned len, long hugepage_size)
>>> +{
>>> + void *buf;
>>> + long buf_size = roundup(len, hugepage_size);
>>> +
>>> + if (posix_memalign(&buf, hugepage_size, buf_size)) {
>>> + prterr("posix_memalign for buf");
>>> + return NULL;
>>> + }
>>> + memset(buf, '\0', len);
>>> + if (madvise(buf, buf_size, MADV_COLLAPSE)) {
>>> + prterr("madvise collapse for buf");
>>> + free(buf);
>>> + return NULL;
>>> + }
>>> +
>>> + return buf;
>>> +}
>>> +
>>> static struct option longopts[] = {
>>> {"replay-ops", required_argument, 0, 256},
>>> {"record-ops", optional_argument, 0, 255},
>>> @@ -2883,7 +2935,7 @@ main(int argc, char **argv)
>>> setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout
>>> */
>>>
>>> while ((ch = getopt_long(argc, argv,
>>> - "0b:c:de:fg:i:j:kl:m:no:p:qr:s:t:uw:xy
>>> ABD:EFJKHzCILN:OP:RS:UWXZ",
>>> + "0b:c:de:fg:hi:j:kl:m:no:p:qr:s:t:uw:x
>>> yABD:EFJKHzCILN:OP:RS:UWXZ",
>>> longopts, NULL)) != EOF)
>>> switch (ch) {
>>> case 'b':
>>> @@ -2916,6 +2968,9 @@ main(int argc, char **argv)
>>> case 'g':
>>> filldata = *optarg;
>>> break;
>>> + case 'h':
>>> + hugepages = 1;
>>> + break;
>>> case 'i':
>>> integrity = 1;
>>> logdev = strdup(optarg);
>>> @@ -3232,12 +3287,41 @@ main(int argc, char **argv)
>>> original_buf = (char *) malloc(maxfilelen);
>>> for (i = 0; i < maxfilelen; i++)
>>> original_buf[i] = random() % 256;
>>> - good_buf = (char *) malloc(maxfilelen + writebdy);
>>> - good_buf = round_ptr_up(good_buf, writebdy, 0);
>>> - memset(good_buf, '\0', maxfilelen);
>>> - temp_buf = (char *) malloc(maxoplen + readbdy);
>>> - temp_buf = round_ptr_up(temp_buf, readbdy, 0);
>>> - memset(temp_buf, '\0', maxoplen);
>>> + if (hugepages) {
>>> + long hugepage_size;
>>> +
>>> + hugepage_size = get_hugepage_size();
>>> + if (hugepage_size == -1) {
>>> + prterr("get_hugepage_size()");
>>> + exit(99);
>>> + }
>>> +
>>> + if (writebdy != 1 && writebdy != hugepage_size)
>>> + prt("ignoring write alignment (since -h is
>>> enabled)");
>>> +
>>> + if (readbdy != 1 && readbdy != hugepage_size)
>>> + prt("ignoring read alignment (since -h is
>>> enabled)");
>>> +
>>> + good_buf = init_hugepages_buf(maxfilelen,
>>> hugepage_size);
>>> + if (!good_buf) {
>>> + prterr("init_hugepages_buf failed for
>>> good_buf");
>>> + exit(100);
>>> + }
>>> +
>>> + temp_buf = init_hugepages_buf(maxoplen, hugepage_size);
>>> + if (!temp_buf) {
>>> + prterr("init_hugepages_buf failed for
>>> temp_buf");
>>> + exit(101);
>>> + }
>>> + } else {
>>> + good_buf = (char *) malloc(maxfilelen + writebdy);
>>> + good_buf = round_ptr_up(good_buf, writebdy, 0);
>> Not sure if it would matter but aren't we seeing a small memory leak
>> here since good_buf's original will be lost after rounding up?
> This is inherited from the original code but AFAICT, it relies on the
> memory being cleaned up at exit time (eg free() is never called on
> good_buf and temp_buf either).
Okay, makes sense.
--
NR
>
>
> Thanks,
> Joanne
>
>>> + memset(good_buf, '\0', maxfilelen);
>>> +
>>> + temp_buf = (char *) malloc(maxoplen + readbdy);
>>> + temp_buf = round_ptr_up(temp_buf, readbdy, 0);
>>> + memset(temp_buf, '\0', maxoplen);
>>> + }
>>> if (lite) { /* zero entire existing file */
>>> ssize_t written;
>>>
--
---
Nirjhar Roy
Linux Kernel Developer
IBM, Bangalore
next prev parent reply other threads:[~2024-12-24 5:09 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-18 21:01 [PATCH 0/2] fstests: test reads/writes from hugepages-backed buffers Joanne Koong
2024-12-18 21:01 ` [PATCH 1/2] fsx: support reads/writes from buffers backed by hugepages Joanne Koong
2024-12-19 14:29 ` Brian Foster
2024-12-19 22:34 ` Joanne Koong
2024-12-20 11:52 ` Brian Foster
2024-12-23 20:59 ` Joanne Koong
2024-12-27 19:22 ` Joanne Koong
2024-12-19 17:51 ` Darrick J. Wong
2024-12-19 22:51 ` Joanne Koong
2024-12-20 7:47 ` Nirjhar Roy
2024-12-23 21:07 ` Joanne Koong
2024-12-24 5:09 ` Nirjhar Roy [this message]
2024-12-18 21:01 ` [PATCH 2/2] generic: add tests for read/writes from hugepages-backed buffers Joanne Koong
2024-12-19 17:52 ` Darrick J. Wong
2024-12-19 21:36 ` Joanne Koong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20ededbc-d915-4850-80f7-61585fdfd156@linux.ibm.com \
--to=nirjhar@linux.ibm.com \
--cc=fstests@vger.kernel.org \
--cc=joannelkoong@gmail.com \
--cc=kernel-team@meta.com \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.