From: Yao Zi <ziyao@disroot.org>
To: Simon Glass <sjg@chromium.org>
Cc: devicetree-compiler@vger.kernel.org
Subject: Re: [RESEND PATCH] flattree: Optimize stringtable_insert()
Date: Mon, 14 Apr 2025 03:27:35 +0000 [thread overview]
Message-ID: <Z_s0VifEhOvtMz_8@pie> (raw)
In-Reply-To: <CAFLszThbo8BEVbPDLsNqUN4aacxgeH4crbMhi8MoZvVtHztb6Q@mail.gmail.com>
On Sat, Apr 12, 2025 at 06:45:07AM -0600, Simon Glass wrote:
> Hi Yao,
>
> On Sat, 12 Apr 2025 at 04:05, Yao Zi <ziyao@disroot.org> wrote:
...
> > diff --git a/flattree.c b/flattree.c
> > index 30e6de2..afca1f2 100644
> > --- a/flattree.c
> > +++ b/flattree.c
> > @@ -218,23 +218,134 @@ static struct emitter asm_emitter = {
> > .property = asm_emit_property,
> > };
> >
> > -static int stringtable_insert(struct data *d, const char *str)
> > +struct stringtable {
> > + unsigned int len, cap;
> > + struct data data;
> > + int *slots;
> > +};
>
> Could you comment these items, including that a slot of -1 means it is free?
Sure, I'll do it in v2.
> > +
> > +/*
> > + * Must be 2^n to ensure stringtable.cap - 1 correctly masks hash into the
> > + * index of slots
> > + */
> > +#define stringtable_initcap 256
>
> When building the kernel, does stringtable_grow() get called
> regularly? If so, and if it affects execution table, perhaps this
> value could be larger?
Among the 1161 devicetrees in Linux 6.11 (counted by
find arch/arm64/boot/dts/ -name '*.dtb' | wc -l), 821 of them trigger
stringtable_grow() for at least one time. But increasing the initial
capability doesn't help much for the overall performance since it's not
a hot function actually. Thus I'd like to keep the macro as is.
> > +
> > +static unsigned int stringtable_hash(const char *str, size_t len)
> > +{
> > + unsigned int hash = (unsigned int)len;
> > +
> > + for (; len > 0; len--)
> > + hash ^= (hash << 5) + (hash >> 2) + str[len - 1];
> > +
> > + return hash;
> > +}
...
> > +static void stringtable_grow(struct stringtable *strtab)
> > +{
> > + unsigned int newcap = strtab->cap * 2;
> > + int *newslots = xmalloc(newcap * sizeof(int));
> > + unsigned int i;
> > +
> > + for (i = 0; i < newcap; i++)
> > + newslots[i] = -1;
> > +
> > + for (i = 0; i < strtab->cap; i++) {
> > + int off = strtab->slots[i];
> > + const char *str = strtab->data.val + off;
> > + unsigned int hash = stringtable_hash(str, strlen(str));
> > + int newslot = stringtable_findslot(newslots, newcap, hash);
> > +
> > + newslots[newslot] = off;
> > + }
> > +
> > + strtab->cap = newcap;
> > + strtab->slots = newslots;
>
> Does strtab->slots get freed here?
Thanks for catching this! It's lost here, causing some leaking, I'll fix
it in v2.
> > +}
> > +
> > +static int stringtable_insert(struct stringtable *strtab, const char *str)
> > +{
> > + unsigned int hash, i, mask;
> > + int *slots, *slot;
> > + const char *dup;
> > + size_t len;
> > +
> > + if (strtab->cap < strtab->len * 2)
> > + stringtable_grow(strtab);
> > +
> > + len = strlen(str);
> > + mask = strtab->cap - 1;
> > + hash = stringtable_hash(str, len);
> > + slots = strtab->slots;
> > +
> > + for (i = hash & mask; *(slot = &slots[i]) != -1; i = (i + 1) & mask) {
> > + const char *oldstr = strtab->data.val + *slot;
> > +
> > + if (streq(str, oldstr))
> > + return *slot;
> > + }
> > +
> > + /* Try to match a subsequence */
> > + dup = memmem(strtab->data.val, strtab->data.len, str, len + 1);
>
> Is there a case early on where strtab->data.len < len + 1 ?
Yes, but I don't think it's worth a special branch, as it's very rare.
AFAIK the memmem implementation of musl libc is even able to handle such
cases,
/* Return immediately when needle is longer than haystack */
if (k<l) return 0;
> > + if (dup) {
> > + *slot = dup - strtab->data.val;
> > + } else {
> > + *slot = strtab->data.len;
> > + strtab->data = data_append_data(strtab->data, str, len + 1);
> > + }
>
> What case is this subsequence-code handling?
String block contains C-style strings (terminated by zero). If A is
a suffix of B, we could reuse the data of B by pointing to the middle of
B when emitting A. For example, we could reuse "enable-method" when
emitting "method", which is common in a DTS of ARM SoC.
> If you add "frederick"
> and then "fred", won't this end up using "frederick" for both, in the
> flattened tree? I suppose not since the tests pass, but I think a
> better comment would help.
This won't happen since we take the terminating '\0' into account when
searching for a subsequence, the two strings are actually "frederick\0"
and "fred\0", where the latter is obviously not a subsequence of the
former one.
At first I didn't add such sebsequence deduplication and then noted
small increasing of size of compiled devicetree blobs. Then I realized
the original code may reuse sebsequences as well -- sure I'll add a
comment to explain to avoid confusion in the future.
> > +
> > + strtab->len++;
> > +
> > + return *slot;
> > +}
> > +
...
> Regards,
> Simon
Regards,
Yao Zi
next prev parent reply other threads:[~2025-04-14 3:27 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-12 10:03 [RESEND PATCH] flattree: Optimize stringtable_insert() Yao Zi
2025-04-12 12:45 ` Simon Glass
2025-04-14 3:27 ` Yao Zi [this message]
2025-04-14 6:24 ` David Gibson
2025-04-14 12:10 ` Yao Zi
2025-04-15 1:39 ` David Gibson
2025-04-15 4:03 ` Yao Zi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z_s0VifEhOvtMz_8@pie \
--to=ziyao@disroot.org \
--cc=devicetree-compiler@vger.kernel.org \
--cc=sjg@chromium.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).