public inbox for devicetree-compiler@vger.kernel.org
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: Yao Zi <ziyao@disroot.org>
Cc: devicetree-compiler@vger.kernel.org
Subject: Re: [PATCH v2] flattree: Optimize stringtable_insert() with memmem()
Date: Fri, 18 Apr 2025 17:58:06 +1000	[thread overview]
Message-ID: <aAIGDn9G8CSp5Gnz@zatzit> (raw)
In-Reply-To: <20250417085900.1134-1-ziyao@disroot.org>

[-- Attachment #1: Type: text/plain, Size: 2641 bytes --]

On Thu, Apr 17, 2025 at 08:59:00AM +0000, Yao Zi wrote:
> According to perf result, stringtable_insert() is one of the five hotest
> functions, which is obvious since it depends on a brute-force,
> quadratic-complexity method to deduplicate the string block and is
> called at creation of every property.
> 
> This patch replaces the brute-force deduplication with libc-provided
> memmem(), which is guaranteed to be in linear complexity on both glibc
> and musl libc.
> 
> On an i7-7200U Linux system with musl-libc, building the "dtbs" target
> in Linux 6.11's arm64 port takes 19.1s less in average, achieving 25.1%
> speed up.
> 
> Signed-off-by: Yao Zi <ziyao@disroot.org>

This looks great, except for one detail.  I went to apply this, but it
failed the github tests on Windows.  Looks like the Windows libc
doesn't include memmem(), so we'll need to have a fallback
implementation.  Trickiest part will be working out when to use the
alternative.

> ---
> 
> Changed from v1
> - Drop the hashtable for simplicity and update the commit message
> - Add a comment to explain restrictions on deduplication
> - Link to v1: https://lore.kernel.org/devicetree-compiler/20250412100351.9009-2-ziyao@disroot.org/
> 
>  flattree.c | 23 +++++++++++++++--------
>  1 file changed, 15 insertions(+), 8 deletions(-)
> 
> diff --git a/flattree.c b/flattree.c
> index 30e6de2044b2..d7328690e008 100644
> --- a/flattree.c
> +++ b/flattree.c
> @@ -220,17 +220,24 @@ static struct emitter asm_emitter = {
>  
>  static int stringtable_insert(struct data *d, const char *str)
>  {
> -	unsigned int i;
> +	const char *dup;
> +	size_t size;
>  
> -	/* FIXME: do this more efficiently? */
> +	/*
> +	 * Reuse (subsequence of) existing string if possible. E.g.
> +	 * "enable-method" could be reused when inserting "method".
> +	 *
> +	 * The terminating '\0' must be taken into account, or "foo\0" may be
> +	 * wrongly recognized as subsequence of "foobar\0".
> +	 */
> +	size = strlen(str) + 1;
> +	dup = memmem(d->val, d->len, str, size);
> +	if (dup)
> +		return dup - d->val;
>  
> -	for (i = 0; i < d->len; i++) {
> -		if (streq(str, d->val + i))
> -			return i;
> -	}
> +	*d = data_append_data(*d, str, size);
>  
> -	*d = data_append_data(*d, str, strlen(str)+1);
> -	return i;
> +	return d->len - size;
>  }
>  
>  static void flatten_tree(struct node *tree, struct emitter *emit,

-- 
David Gibson (he or they)	| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you, not the other way
				| around.
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

      reply	other threads:[~2025-04-18  7:58 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-17  8:59 [PATCH v2] flattree: Optimize stringtable_insert() with memmem() Yao Zi
2025-04-18  7:58 ` David Gibson [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aAIGDn9G8CSp5Gnz@zatzit \
    --to=david@gibson.dropbear.id.au \
    --cc=devicetree-compiler@vger.kernel.org \
    --cc=ziyao@disroot.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox