From: David Gibson <david@gibson.dropbear.id.au>
To: Wasim Nazir <quic_wasimn@quicinc.com>
Cc: devicetree-compiler@vger.kernel.org, kernel@quicinc.com,
kernel@oss.qualcomm.com
Subject: Re: [PATCH v3 0/4] Introduce fdt_overlay_merge() to allow merge of overlay blobs
Date: Sat, 28 Jun 2025 20:55:15 +1000 [thread overview]
Message-ID: <aF_KEwXLkQ2ff_69@zatzit> (raw)
In-Reply-To: <aF5vSAKbARuThVZR@hu-wasimn-hyd.qualcomm.com>
[-- Attachment #1: Type: text/plain, Size: 14669 bytes --]
On Fri, Jun 27, 2025 at 03:45:36PM +0530, Wasim Nazir wrote:
> On Tue, Jun 03, 2025 at 09:05:19PM +1000, David Gibson wrote:
> > On Fri, May 30, 2025 at 08:06:40PM +0530, Wasim Nazir wrote:
> > > On Wed, May 21, 2025 at 02:20:55PM +1000, David Gibson wrote:
> > > > On Mon, May 19, 2025 at 02:40:39PM +0530, Wasim Nazir wrote:
> > > > > Hello,
> > > > >
> > > > > This is follow-up attempt for fdtoverlaymerge tool.
> > > > >
> > > > > Currently all the device-tree (DT) code for a given soc is maintained in a
> > > > > common kernel repository. For example, this common DT code will have code for
> > > > > audio, video, fingerprint, bluetooth etc. Further this, DT code is typically
> > > > > split into a base (soc-common) code and board specific code, with the soc code
> > > > > being compiled as soc.dtb and board specific code being compiled as respective
> > > > > overlay blobs (board1.dtbo, board2.dtbo etc). soc.dtb represents hardware configuration
> > > > > of a given SOC while boardX.dtbo represents configuration of a board/platform
> > > > > designed using that soc.soc.dtb and boardX.dtbo files are flashed separately on
> > > >
> > > > So.. *build time* separation of the SoC and board pieces makes sense
> > > > to me, which is I think how this convention arose. *Boot time*
> > > > separation of the SoC and board seems kind of pointless. Almost by
> > > > definition of what a "board" is, you must know early in boot which
> > > > board it is. Still, I guess the convention is established, even if
> > > > it's stupid.
> > >
> > > Android Treble requires separation of soc and board DT bits and so we
> > > need to have soc.dtb & board.dtbo in separate images.
> > >
> > > In the above example I tried to simplify using 1 soc & multiple
> > > board-variants but we have setup with different combination of socX + boardY,
> > > where X & Y can vary.
> > > So we compile socX & boardY separately and combine socX dtb's in one image
> > > while boardY dtbo's in another image.
> > > Now at run-time based on SKU/HW config, we select particular socX and boardY
> > > to boot the system.
> >
> > Ok, sure. Still seems like a silly approach to me, but it sounds like
> > that's not within your control.
> >
> > > In our workspace each repository i.e kernel & tech-packs (viz. audio, video etc.)
> > > are independent (building in its own workspace) and can create dtb & dtbo.
> > > Kernel can have socX.dtb & boardY.dtbo; Tech-packs can have socX-featureZ.dtbo
> > > and boardY-featureZ.dtbo, Z can vary.
> >
> > I mean.. how you organise your repositories should serve the needs of
> > the problem, not the other way around. But I guess that's equally
> > true of dtc and associated tools.
> >
> > > At build-time, we parse sku-id mentioned in all dtb & dtbo and combine matching
> > > files i.e we overlay socX-featureZ.dtbo to socX.dtb & similarly
> > > merge (fdtoverlaymerge) boardY-featureZ.dtbo to boardY.dtbo.
> > >
> > > We can create single dtb as socX-boardY.dtb but Android Treble doesn't
> > > allow that. Moreover, this modularity also helps us to reduce dtb + dtbo
> > > image size by choosing N combinations with X+Y files instead of having
> > > X*Y files which can increase image size.
> > > If we don't merge boardY-featureZ.dtbo to boardY.dtbo then we need to
> > > overlay Z number of boardY-featureZ.dtbo at run-time and it increases
> > > boot-time.
> >
> > So, you clearly have a late-build stage where you combine the various
> > dtbos down to just two (soc & board). Why can't the output from the
> > earlier (single repo) build stages be a dts instead of a dtbo, then
> > you use dtc to combine those into the two dtbos you need at the late
> > build stage?
>
> Our repositores are following android structure based on android-treble
> where vendor subsystems needs to be modular for ease of OTA upgrade and
> development.
>
> As a result of which each feature/techpack (viz. audio, video etc.) have
> its own independent repositories where it can build its modules and
> overlay DT (soc & board).
> Kernel is separate and provides only the core images and base DT (soc &
> board).
The build system should serve the needs of the project, not the other
way around. If the feature repos can emit a dtbo, why can't they emit
a dts as well?
> So, dtbo is the option to build it independently and at last we have
> options to either "overlay all dtbo at boot-time" or "merge it at build-time and
> overlay final dtbo at boot-time".
>
> This build-time merge is done outside techpack build system.
> But if we want to have dts from each techpack, then techpack-build system needs
> to communicate somehow to know which dts to include with which base dts.
Why is that a harder problem than knowing which dtbo? The dtbo must
have been built from some sort of dts, no?
> This is not possible at build-time between techpacks.
The build system is not immutable - you're proposing changes to libfdt
and/or dtc; why not changes to the build system?
> > > > > target (besides improving the overall size of DT blobs flashed on target, Android
> > > > > Treble also requires separation of soc and board DT bits). Bootloader will pick
> > > > > one of the board overlay blobs and merge it with soc.dtb, before booting kernel
> > > > > which is presented a unified DT blob (soc + board overlay).
> > > > >
> > > > > For ease of code maintenance and better control over release management, we are
> > > > > exploring allowing some of the tech teams (audio/fingerprint sensor etc) to
> > > > > maintain their kernel code (including their DT code) outside a common kernel
> > > > > repository. In our experience, this simplifies number of branches maintained in
> > > > > core kernel repo. New/experimental features in fingerprint sensor driver for
> > > > > example that needs to be on a separate branch will not result in unnecessary
> > > > > branching in core kenrel repo, affecting all other drivers.
> > > > >
> > > > > In addition to compiling DT code outside core kernel tree, we also want to merge
> > > > > the blobs back to respective blobs found in kernel build tree at buildtime
> > > > > (soc.dtb or boardX.dtbo), as otherwise relying on bootloader to do all the
> > > > > overlay impacts boot-time.
> > > >
> > > > It's again unclear to me why you need a boot time separation of these
> > > > devices rather than merely boot time. What does using separate .dtbo
> > > > files give you that just /include/ing multiple pieces into a single
> > > > .dtbo at build time would not?
> > > >
> > >
> > > Since our workspace is split into multiple independent repositories we cannot
> > > include the pieces in one place.
> >
> > I still don't see why not. If you can emit dtbos from the single
> > repository stages, why can't you emit dts instead?
> >
> > > > > This brings up the need to merge two overlay blobs (fingerprint-overlay.dtbo +
> > > > > boardX.dtbo), which currently doesn't seem to be supported and which this patch
> > > > > series aims to support.
> > > >
> > > > Merging overlays is a logically sensible operation, but it's not clear
> > > > to me why the need for it follows from the premises above. It's also
> > > > unclear why you need to compile to .dtbo *then* merge, rather than
> > > > combine .dts files then compile into a single .dtbo.
> > >
> > > Due to splitted repository structure we cannot combine all .dts together.
> > > And due to standalone build system for kernel & tech-packs we are
> > > creating dtbo and merging together at end.
> > >
> > > > > fdt_overlay_apply() API currently allows for an overlay DT blob to be merged
> > > > > with a base blob. It assumes that all external symbols specified in overlay
> > > > > blob's __fixups__ section are found in base blob's __symbols__ section and
> > > > > aborts on the first instance where a symbol could not be found in base blob.
> > > > > This is mostly fine as the primary use of overlay is on a target for its
> > > > > bootloader to merge various overlay blobs based on h/w configuration detected.
> > > > > But when the number of overlays increased then bootloader takes lot of time to
> > > > > apply the overlays on base DT.
> > > > >
> > > > > So we need new API/tool to merge all the overlays into single overlay file
> > > > > at host (build machine) side,
> > > >
> > > > Merging into a single overlay at build time makes sense to me. But at
> > > > build time you'd expect to have access to the .dts files. Why do you
> > > > need to merge .dtbo rather than merge the .dts before compiling to
> > > > .dtbo? The latter should be possible already by /include/ing each of
> > > > the individual overlays in order then compiling with dtc.
> > >
> > > This is not possible with our current repository structure.
> > > Moreover, this splitting of repository is needed to work independently
> > > without slowing any teck-packs.
> >
> > I really don't know what you mean by that.
> >
> > A few other things bother me about the situation, but maybe I'm
> > misunderstanding.
> >
> > 1) You imply you need many various of the soc.dtb as well as the
> > board.dtbo. How does that come to be the case? Isn't there a fixed
> > set of SoCs with known features? Remember that device trees should -
> > as much as is possible - describe just the hardware, not how it's to
> > be configured or used.
> >
> > 2) To a certain extent the same concern applies to boards. What's
> > controlling when the extra features are needed? Are extre pieces
> > physically connected on? Is it controlled by on-board switches?
> > Something else?
> >
>
> We do have a fixed set of SoCs, but there can be multiple boards using same SoC.
> socX-featureZ.dtbo & boardY-featureZ.dtbo is describing one of the
> feature (viz. audio, video etc.) for each soc & board respectively.
But why does the set of features on a SoC vary? Generally, by its
nature, a SoC has the components it has, they're not variable.
> > 3) What exactly is costing the additional time when applying may
> > .dtbos at boot time. Combining many together at build time will
> > obviously result in a larger dtbo with more fragments that will itself
> > take longer to apply. I can certainly believe it's still faster
> > overall, but it's not obvious to me why, Understanding that will
> > allow us all to reason better about what's a good approach here.
>
> In our setup, we have 7 overlays (boardY-featureZ.dtbo) for 1 of the board
> (soc overlay (socX-featureZ.dtbo) are different so excluding that here).
> So, if we do overlay-merge at build-time we are saving ~60% boot-time spent on
> overlaying the board dtbo's.
Well, sure, but I'm trying to understand *why* doing it all at once
is so much faster. I know the code for overlay application, and
there's not an obvious large cost per-dtbo - I'd expect the time to be
dominated by things that are per-fragment or (roughly) proportional
to the total size of the applied dtbos. That leaves two possibilities
that I can see:
1) the later dtbos are frequently overwriting properties in the
earlier dtbos, resulting in a final dtbo substantially smaller than
the sum of the original dtbos. This seems implausible to me given
the structure you've described - I'd expect each dtbo to be
altering different parts of the tree.
2) There *is*, contrary to my intuition, a substantial once off cost
for each fdt_overlay_apply() call, independent of the dtbo's size.
If that's the case, we should understand why and see if it can be
mitigated. That would help all use cases, not just your one.
In short, adding a substantial new feature to dtc/libfdt seems
premature without better understanding the cause of the poor
performance you're seeing.
> > > > > so that on target side bootloader needs to only
> > > > > apply merged-overlay-dt to its base-dt. This saves lot of time due to reduced
> > > > > number file reading/loading & minimizing repeatative overlay apply.
> > > > > In our test setup we see an improvement of ~60% while applying merged-overlay
> > > > > at bootloader and the merged-overlay is product of 7 overlays.
> > > > >
> > > > > To serve this overlay-merge feature we have introduce fdtoverlaymerge tool
> > > > > which takes input as overlays and gives output to merged-overlay.
> > > > > The tool uses fdt_overlay_merge() API introduced in libfdt to do the actual work.
> > > > >
> > > > > Additional notes:
> > > > > If snprintf (in libc) may not available in some environments, then we will need
> > > > > to write our own snprintf() in libfdt.
> > > > >
> > > > > ---
> > > > > Changelog:
> > > > >
> > > > > v3:
> > > > > - Update copy_node & add copy_fragment_to_base to incorporate two cases i.e
> > > > > - Case1: When target is available and we merge fragments
> > > > > - Case2: When target is not available and we add new fragments
> > > > > - Change the logic to update fixups & local_fixups in case of overlay merge.
> > > > > - Few patches are squashed, reduced to 4 patches.
> > > > > - v2-link: https://lore.kernel.org/all/1599671882-310027-1-git-send-email-gurbaror@codeaurora.org/
> > > > >
> > > > >
> > > > > Srivatsa Vaddagiri (4):
> > > > > libfdt: overlay_merge: Introduce fdt_overlay_merge()
> > > > > libfdt: overlay_merge: Rename & copy overlay fragments and their
> > > > > properties
> > > > > libfdt: overlay_merge: Update phandles, symbols, fixups & local_fixups
> > > > > fdtoverlaymerge: A tool that merges overlays
> > > > >
> > > > > .gitignore | 1 +
> > > > > Makefile | 4 +
> > > > > Makefile.utils | 6 +
> > > > > fdtoverlaymerge.c | 223 +++++++++++
> > > > > libfdt/fdt_overlay.c | 901 ++++++++++++++++++++++++++++++++++++++++++-
> > > > > libfdt/fdt_rw.c | 14 +-
> > > > > libfdt/libfdt.h | 18 +
> > > > > libfdt/version.lds | 1 +
> > > > > meson.build | 2 +-
> > > > > 9 files changed, 1146 insertions(+), 24 deletions(-)
> > > > > create mode 100644 fdtoverlaymerge.c
> > > > >
> > > > >
> > > > > base-commit: f4c53f4ebf7809a07666bf728c823005e1f1a612
> > > >
> > >
> > > Regards,
> > > Wasim
> > >
> >
>
>
>
--
David Gibson (he or they) | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you, not the other way
| around.
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2025-06-29 2:21 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-19 9:10 [PATCH v3 0/4] Introduce fdt_overlay_merge() to allow merge of overlay blobs Wasim Nazir
2025-05-19 9:10 ` [PATCH v3 1/4] libfdt: overlay_merge: Introduce fdt_overlay_merge() Wasim Nazir
2025-05-19 17:17 ` Trilok Soni
2025-05-30 14:38 ` Wasim Nazir
2025-05-21 4:23 ` David Gibson
2025-05-30 12:28 ` Wasim Nazir
2025-06-03 11:09 ` David Gibson
2025-06-27 10:31 ` Wasim Nazir
2025-06-28 11:02 ` David Gibson
2025-05-19 9:10 ` [PATCH v3 2/4] libfdt: overlay_merge: Rename & copy overlay fragments and their properties Wasim Nazir
2025-05-19 9:10 ` [PATCH v3 3/4] libfdt: overlay_merge: Update phandles, symbols, fixups & local_fixups Wasim Nazir
2025-05-19 9:10 ` [PATCH v3 4/4] fdtoverlaymerge: A tool that merges overlays Wasim Nazir
2025-05-23 13:44 ` Simon Glass
2025-05-30 14:55 ` Wasim Nazir
2025-05-19 17:15 ` [PATCH v3 0/4] Introduce fdt_overlay_merge() to allow merge of overlay blobs Trilok Soni
2025-05-30 14:42 ` Wasim Nazir
2025-05-21 4:20 ` David Gibson
2025-05-30 14:36 ` Wasim Nazir
2025-06-03 11:05 ` David Gibson
2025-06-27 10:15 ` Wasim Nazir
2025-06-28 10:55 ` David Gibson [this message]
2025-09-02 10:35 ` Wasim Nazir
2025-09-02 12:49 ` Konrad Dybcio
2025-09-02 12:51 ` Konrad Dybcio
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aF_KEwXLkQ2ff_69@zatzit \
--to=david@gibson.dropbear.id.au \
--cc=devicetree-compiler@vger.kernel.org \
--cc=kernel@oss.qualcomm.com \
--cc=kernel@quicinc.com \
--cc=quic_wasimn@quicinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).