From: Joshua Watt <jpewhacker@gmail.com>
To: bitbake-devel@lists.openembedded.org
Cc: Joshua Watt <JPEWhacker@gmail.com>
Subject: [bitbake-devel][PATCH v2 7/8] siggen: Add parallel unihash exist API
Date: Sun, 18 Feb 2024 15:59:52 -0700
Message-ID: <20240218225953.2997239-8-JPEWhacker@gmail.com>
In-Reply-To: <20240218225953.2997239-1-JPEWhacker@gmail.com>

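Adds an API to query, in parallel, whether unihashes are known to the
server. Hashes found to exist are cached locally; hashes that do not
exist are always re-queried, since they may appear later.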

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 bitbake/lib/bb/siggen.py | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
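
For readers skimming the diff, here is a minimal, serial sketch of the
positive-only cache that unihashes_exist() relies on. FakeClient and
ExistsCache are hypothetical stand-ins written for this note; they are not
part of bitbake or hashserv, and the parallel client pool path is left out.

```python
class FakeClient:
    """Hypothetical stand-in for a hash server client (not the real API)."""

    def __init__(self, known):
        self.known = set(known)
        self.calls = []  # records every unihash sent to the "server"

    def unihash_exists(self, unihash):
        self.calls.append(unihash)
        return unihash in self.known


class ExistsCache:
    """Caches only positive answers; misses are always asked again."""

    def __init__(self, client):
        self.client = client
        self.unihash_exists_cache = set()

    def unihashes_exist(self, query):
        result = {}
        for key, unihash in query.items():
            if unihash in self.unihash_exists_cache:
                # Known to exist from an earlier query; skip the server.
                result[key] = True
                continue
            exists = self.client.unihash_exists(unihash)
            if exists:
                self.unihash_exists_cache.add(unihash)
            result[key] = exists
        return result


client = FakeClient(known={"aaa"})
cache = ExistsCache(client)

first = cache.unihashes_exist({"task1": "aaa", "task2": "bbb"})
client.known.add("bbb")  # the hash appears on the server later
second = cache.unihashes_exist({"task1": "aaa", "task2": "bbb"})
```

On the second call only "bbb" reaches the fake server: "aaa" is answered
from the cache, while the earlier miss for "bbb" was never cached and so
picks up the newly added hash.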

diff --git a/bitbake/lib/bb/siggen.py b/bitbake/lib/bb/siggen.py
index e1a4fa2cdd1..3ab8431a089 100644
--- a/bitbake/lib/bb/siggen.py
+++ b/bitbake/lib/bb/siggen.py
@@ -530,6 +530,11 @@ class SignatureGeneratorBasicHash(SignatureGeneratorBasic):
 class SignatureGeneratorUniHashMixIn(object):
     def __init__(self, data):
         self.extramethod = {}
+        # NOTE: The cache only tracks hashes that exist. Hashes that don't
+        # exist are always queried from the server since it is possible for
+        # hashes to appear over time, but much less likely for them to
+        # disappear
+        self.unihash_exists_cache = set()
         super().__init__(data)
 
     def get_taskdata(self):
@@ -620,6 +625,33 @@ class SignatureGeneratorUniHashMixIn(object):
 
         return method
 
+    def unihashes_exist(self, query):
+        if len(query) == 0:
+            return {}
+
+        uncached_query = {}
+        result = {}
+        for key, unihash in query.items():
+            if unihash in self.unihash_exists_cache:
+                result[key] = True
+            else:
+                uncached_query[key] = unihash
+
+        if self.max_parallel <= 1 or len(uncached_query) <= 1:
+            # No parallelism required. Make the query serially with the single client
+            uncached_result = {
+                key: self.client().unihash_exists(value) for key, value in uncached_query.items()
+            }
+        else:
+            uncached_result = self.client_pool().unihashes_exist(uncached_query)
+
+        for key, exists in uncached_result.items():
+            if exists:
+                self.unihash_exists_cache.add(query[key])
+            result[key] = exists
+
+        return result
+
     def get_unihash(self, tid):
         return self.get_unihashes([tid])[tid]
 
-- 
2.34.1



