Linux io-uring development
From: Gabriel Krisman Bertazi <gabriel@krisman.be>
To: Prateek <kprateek283@gmail.com>, io-uring@vger.kernel.org
Cc: Prateek <kprateek283@gmail.com>
Subject: Re: [PATCH] setup: dynamically detect default huge page size
Date: Mon, 22 Jun 2026 12:49:10 -0400
Message-ID: <87qzlyy0zd.fsf@mailhost.krisman.be>
In-Reply-To: <20260620113609.123575-1-kprateek283@gmail.com>

Prateek <kprateek283@gmail.com> writes:

>     Replaces the hardcoded 2MB huge page size with dynamic detection by
>     parsing /proc/meminfo. This fixes no-mmap allocation failures on
>     architectures with different default huge page sizes (like ARM64
>     which often uses 512MB) or x86 systems configured for 1GB pages.
>
>     - Safely parses /proc/meminfo without allocating memory.
>     - Uses raw syscalls and manual byte-by-byte matching to maintain
>       strict compatibility with CONFIG_NOLIBC builds (avoiding strstr).
>     - Drops the MAP_HUGE_2MB mmap flag to allow the kernel to correctly
>       apply the system's default huge page size.
>     - Falls back safely to 2MB if /proc/meminfo is unreadable.
>
> Signed-off-by: Prateek <kprateek283@gmail.com>
> ---
>  src/setup.c | 84 +++++++++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 68 insertions(+), 16 deletions(-)
>
> diff --git a/src/setup.c b/src/setup.c
> index ea6f11fd..46e20e0b 100644
> --- a/src/setup.c
> +++ b/src/setup.c
> @@ -220,15 +220,67 @@ __cold int io_uring_ring_dontfork(struct io_uring *ring)
>  	return 0;
>  }
>  
> -#ifndef MAP_HUGE_SHIFT
> -#define MAP_HUGE_SHIFT	26
> -#endif
> -#ifndef MAP_HUGE_2MB
> -#define MAP_HUGE_2MB	(21U << MAP_HUGE_SHIFT)
> -#endif
>  
> -/* FIXME */
> -static size_t huge_page_size = 2 * 1024 * 1024;
> +static size_t get_huge_page_size(void)
> +{
> +	static size_t hps;

Please initialize your static variables to make this readable, i.e.,
hps should be initialized to 2MB.

> +	size_t ret = 2 * 1024 * 1024; /* fallback: 2MB */

ret is redundant with hps and could go away.

> +	char buf[4096];
> +	char *p, *end;
> +	unsigned long val;
> +	ssize_t n;
> +	int fd;
> +
> +	if (hps)
> +		return hps;
> +
> +	fd = __sys_open("/proc/meminfo", O_RDONLY, 0);
> +	if (fd < 0)
> +		goto out;
> +
> +	n = __sys_read(fd, buf, sizeof(buf) - 1);
> +	__sys_close(fd);
> +	if (n <= 0)
> +		goto out;
> +	buf[n] = '\0';
> +
> +	/*
> +	 * Scan line-by-line for "Hugepagesize:". We avoid strstr() and
> +	 * memcmp() because they are not available in CONFIG_NOLIBC builds.
> +	 */
> +	p = buf;
> +	end = buf + n;
> +	while (p < end) {
> +		/* Check if this line starts with "Hugepagesize:" (13 chars) */
> +		if (p + 13 <= end &&
> +		    p[0]  == 'H' && p[1]  == 'u' && p[2]  == 'g' &&
> +		    p[3]  == 'e' && p[4]  == 'p' && p[5]  == 'a' &&
> +		    p[6]  == 'g' && p[7]  == 'e' && p[8]  == 's' &&
> +		    p[9]  == 'i' && p[10] == 'z' && p[11] == 'e' &&
> +		    p[12] == ':') {

This is unreadable.  It would be much better as a two-line loop
iterating over the two strings...  But then, why not create a
couple-line implementation of memcmp and atoi in arch/generic/lib.h
instead?
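
To make the suggestion concrete, a rough sketch of what such helpers
could look like follows.  The names uring_memcmp and uring_parse_ulong
are hypothetical, not existing liburing symbols, and the digit parser
is a bounded stand-in for atoi that ignores signs and overflow:

```c
#include <stddef.h>

/*
 * Hypothetical helper: byte-wise comparison with the same return
 * convention as memcmp() (negative, zero, or positive).
 */
static inline int uring_memcmp(const void *a, const void *b, size_t n)
{
	const unsigned char *pa = a, *pb = b;

	for (; n; n--, pa++, pb++) {
		if (*pa != *pb)
			return *pa - *pb;
	}
	return 0;
}

/*
 * Hypothetical helper: parse the leading decimal digits in [p, end).
 * Stops at the first non-digit or at end; returns 0 if there are no
 * digits.  No sign or overflow handling.
 */
static inline unsigned long uring_parse_ulong(const char *p, const char *end)
{
	unsigned long val = 0;

	while (p < end && *p >= '0' && *p <= '9')
		val = val * 10 + (unsigned long)(*p++ - '0');
	return val;
}
```

With something like this, the 13-character comparison above would
collapse to a bounds check plus uring_memcmp(p, "Hugepagesize:", 13).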


> +			p += 13;
> +			while (p < end && (*p == ' ' || *p == '\t'))
> +				p++;
> +			val = 0;
> +			while (p < end && *p >= '0' && *p <= '9') {
> +				val = val * 10 + (*p - '0');
> +				p++;
> +			}
> +			if (val)
> +				ret = val * 1024; /* kB -> bytes */
> +			break;
> +		}
> +		/* Advance to next line */
> +		while (p < end && *p != '\n')
> +			p++;
> +		if (p < end)
> +			p++;
> +	}
> +out:
> +	hps = ret;
> +	return hps;
> +}

This function should go in arch/generic/lib.h too.  A hint is that
get_page_size is already there.
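
As an illustration only, the parsing part could be split from the
open/read part so it is easy to reason about on its own.  In the sketch
below, parse_huge_page_size and HPS_FALLBACK are made-up names; it takes
an already-read /proc/meminfo buffer and keeps the 2MB fallback from the
patch:

```c
#include <stddef.h>

#define HPS_FALLBACK	(2UL * 1024 * 1024)	/* 2MB, as in the patch */

/*
 * Hypothetical: return the default huge page size in bytes from a
 * /proc/meminfo-style buffer, or HPS_FALLBACK if the "Hugepagesize:"
 * line is missing or its value is zero.
 */
static size_t parse_huge_page_size(const char *buf, size_t len)
{
	static const char key[] = "Hugepagesize:";
	const size_t klen = sizeof(key) - 1;
	const char *p = buf, *end = buf + len;

	while (p < end) {
		unsigned long val = 0;
		size_t i = 0;

		/* Match the key against the start of the current line. */
		while (i < klen && p + i < end && p[i] == key[i])
			i++;
		if (i == klen) {
			p += klen;
			while (p < end && (*p == ' ' || *p == '\t'))
				p++;
			while (p < end && *p >= '0' && *p <= '9')
				val = val * 10 + (unsigned long)(*p++ - '0');
			/* The value in /proc/meminfo is in kB. */
			return val ? (size_t)val * 1024 : HPS_FALLBACK;
		}
		/* Skip to the start of the next line. */
		while (p < end && *p != '\n')
			p++;
		if (p < end)
			p++;
	}
	return HPS_FALLBACK;
}
```

The caller would then only be responsible for reading the file into a
buffer and caching the result.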

That said, we should be looking into something like the kernel's nolibc
instead of reinventing libc.

-- 
Gabriel Krisman Bertazi

