From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v3 1/1] scripts: Add clean-hashserver-database script
To: openembedded-core@lists.openembedded.org
From: "Alexandre Marques"
Date: Thu, 13 Mar 2025 03:21:48 -0700
Message-ID: <7915.1741861308881806700@lists.openembedded.org>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="U5rgWhu6Swoq4mpTgrDv"
X-Groupsio-URL: https://lists.openembedded.org/g/openembedded-core/message/212757

--U5rgWhu6Swoq4mpTgrDv
Content-Type: text/plain; charset="utf-8"

On Wed, Mar 12, 2025 at 09:27 AM, Joshua Watt wrote:
On Wed, Mar 12, 2025 at 5:24 AM Alexandre Marques via lists.openembedded.org <c137.marques=gmail.com@lists.openembedded.org> wrote:
From: Alexandre Marques <c137.marques@gmail.com>
Auxiliary script to clean the hashserver database based on the files
available in the sstate directory.

It makes use of the new "hashclient gc-mark-stream" command to mark all
sstate-relevant hashes as "alive" and removes everything else from the
database.

Usage example:
```
./scripts/clean-hashserver-database \
    --sstate-dir ~/build/sstate-cache \
    --hashclient ./bitbake/bin/bitbake-hashclient \
    --hashserver-address "ws://localhost:8688/ws" \
    --mark "alive" \
    --clean-db
```
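
To illustrate the mark step described above: the script turns each sstate archive name into a `unihash <hash>` line and streams those lines to `gc-mark-stream`. The following is a minimal sketch of that transformation only, reusing the `sed` expression from the script. The function name `sstate_path_to_mark_line` and the archive path are made up for illustration.

```shell
#!/bin/bash
set -euo pipefail

# Same sed expression as in the script: keep the text between the last ':'
# and the '_' that follows it, and prefix it with "unihash".
# The function name is hypothetical; the script uses the expression inline.
function sstate_path_to_mark_line() {
    printf '%s\n' "$1" | sed 's/.*:\([^_]*\)_.*/unihash \1/'
}

# Hypothetical sstate archive path with a shortened hash
sstate_path_to_mark_line "sstate-cache/ab/cd/sstate:zlib:core2-64-poky-linux:1.3.1:r0:core2-64:12:abcd1234ef_populate_sysroot.tar.zst"
# prints: unihash abcd1234ef
```

Every hash emitted this way is marked with the chosen mark, so a later sweep should only drop entries that no file in the sstate directory refers to.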

Signed-off-by: Alexander Marques <c137.marques@gmail.com>
---
 scripts/clean-hashserver-database | 73 +++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100755 scripts/clean-hashserver-database

diff --git a/scripts/clean-hashserver-database b/scripts/clean-hashserver-database
new file mode 100755
index 0000000000..6eb006758e
--- /dev/null
+++ b/scripts/clean-hashserver-database
@@ -0,0 +1,73 @@
+#!/bin/bash
+set -euo pipefail
+
+SSTATE_DIR=""
+BB_HASHCLIENT=""
+BB_HASHSERVER=""
+
+ALIVE_DB_MARK="alive"
+CLEAN_DB="false"
+
+function help() {
+    cat <<HELP_TEXT
+Usage: $0 --sstate-dir path --hashclient path --hashserver-address address [--mark value] [--clean-db]
+
+Auxiliary script to remove unused or no longer relevant entries from the hash equivalence database, based
+on the files available in the sstate directory.
+
+  -h | --help)               Show this help message and exit
+  -s | --sstate-dir)         Path to the sstate dir
+  -c | --hashclient)         Path to bitbake-hashclient
+  -a | --hashserver-address) bitbake-hashserver address
+  -m | --mark)               Marker string to mark database entries
+  --clean-db)                Remove all unmarked and unused entries from the database
+HELP_TEXT
+}
+
+function argument_parser() {
+    while [ $# -gt 0 ]; do
+        case "$1" in
+            -h | --help) help; exit 0 ;;
+            -s | --sstate-dir) SSTATE_DIR="$2"; shift ;;
+            -c | --hashclient) BB_HASHCLIENT="$2"; shift ;;
+            -a | --hashserver-address) BB_HASHSERVER="$2"; shift ;;
+            -m | --mark) ALIVE_DB_MARK="$2"; shift ;;
+            --clean-db) CLEAN_DB="true";;
+            *)
+                echo "Argument '$1' is not supported" >&2
+                help >&2
+                exit 1
+                ;;
+        esac
+        shift
+    done
+
+    function validate_mandatory_argument() {
+        local var_value="$1"
+        local error_message="$2"
+
+        if [ -z "$var_value" ]; then
+            echo "$error_message" >&2
+            help >&2
+            exit 1
+        fi
+    }
+
+    validate_mandatory_argument "$SSTATE_DIR" "Please provide the path to the sstate dir."
+    validate_mandatory_argument "$BB_HASHCLIENT" "Please provide the path to bitbake-hashclient."
+    validate_mandatory_argument "$BB_HASHSERVER" "Please provide the address of bitbake-hashserver."
+}
+
+# -- main code --
+argument_parser "$@"
+
+# Mark all db sstate hashes
+find "$SSTATE_DIR" -name "*.tar.zst" | \
+sed 's/.*:\([^_]*\)_.*/unihash \1/' | \
+$BB_HASHCLIENT --address "$BB_HASHSERVER" gc-mark-stream "${ALIVE_DB_MARK}"
+
+# Remove unmarked and unused entries
+if [ "$CLEAN_DB" = "true" ]; then
+    $BB_HASHCLIENT --address "$BB_HASHSERVER" gc-sweep "${ALIVE_DB_MARK}"
+    $BB_HASHCLIENT --address "$BB_HASHSERVER" clean-unused 0

> The reason for the time is that entries can appear to be unused if
> they are created while a build is in progress and you don't
> necessarily want to remove them. Ideally, this is longer than your
> longest build time. Either way, 0 is probably too aggressive and/or it
> should be configurable on the command line.

Makes sense. Thanks for the review.
I pushed a new version in the meantime.
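
For illustration, a minimal sketch of how the age passed to `clean-unused` could be made configurable, as suggested in the review. This is not necessarily how the new version does it: the `--unused-age` option, the `UNUSED_AGE` variable, its one-day default and the helper names are all assumptions, and the value is assumed to be an age in seconds. The sketch prints the command instead of running it.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical default of one day, in seconds. Ideally this is longer than
# the longest expected build.
UNUSED_AGE="86400"

# Parse only the hypothetical --unused-age option; all other arguments are
# skipped to keep the sketch short.
function parse_unused_age() {
    while [ $# -gt 0 ]; do
        case "$1" in
            --unused-age)
                if [ $# -lt 2 ]; then
                    echo "--unused-age requires a value in seconds" >&2
                    return 1
                fi
                UNUSED_AGE="$2"
                shift
                ;;
        esac
        shift
    done

    # Accept only non-negative integers
    case "$UNUSED_AGE" in
        '' | *[!0-9]*)
            echo "Invalid age '$UNUSED_AGE': expected a non-negative integer" >&2
            return 1
            ;;
    esac
}

# Print the command rather than executing it, so the sketch has no side effects.
function clean_unused_command() {
    echo "bitbake-hashclient --address $1 clean-unused $UNUSED_AGE"
}

parse_unused_age --unused-age 7200
clean_unused_command "ws://localhost:8688/ws"
# prints: bitbake-hashclient --address ws://localhost:8688/ws clean-unused 7200
```

A non-zero default keeps entries created during a running build from being removed, while still letting the caller pass 0 explicitly.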

+fi
--
2.34.1


--U5rgWhu6Swoq4mpTgrDv--