On Wed, Feb 26, 2025 at 08:08 AM, Joshua Watt wrote:
On Wed, Feb 26, 2025 at 2:51 AM Alexandre Marques
<Alexandre.Marques@criticaltechworks.com> wrote:
It already does this. If you specify multiple KEY VALUE pairs on the
command line, it will send all of them to the server in a single
message.

Well yes, but as far as I understand, keys get overwritten, and the key is the actual db column, meaning we can't really pass multiple hashes, just "refine" the query.

For example:
Client Side:
$ ./bitbake-hashclient --address localhost:8688 gc-mark alive --where taskhash 9cd45da2fb6aa303a7828ec3cad7709bde2882422792e696016663f390aeece0
New hashes marked: 1

$ ./bitbake-hashclient --address localhost:8688 gc-mark alive --where taskhash f1c4cca2ea1fc1181c40afc8518d75db42d5c5e841fc4a4dbdcba30e1a9879e0
New hashes marked: 1

$ ./bitbake-hashclient --address localhost:8688 gc-mark alive --where taskhash 9cd45da2fb6aa303a7828ec3cad7709bde2882422792e696016663f390aeece0 --where taskhash f1c4cca2ea1fc1181c40afc8518d75db42d5c5e841fc4a4dbdcba30e1a9879e0
New hashes marked: 1

$ ./bitbake-hashclient --address localhost:8688 gc-mark alive --where taskhash 9cd45da2fb6aa303a7828ec3cad7709bde2882422792e696016663f390aeece0 --where taskhash2 f1c4cca2ea1fc1181c40afc8518d75db42d5c5e841fc4a4dbdcba30e1a9879e0
New hashes marked: 1

Server Side:
{'mark': 'alive', 'where': {'taskhash': '9cd45da2fb6aa303a7828ec3cad7709bde2882422792e696016663f390aeece0'}}
{'mark': 'alive', 'where': {'taskhash': 'f1c4cca2ea1fc1181c40afc8518d75db42d5c5e841fc4a4dbdcba30e1a9879e0'}}
{'mark': 'alive', 'where': {'taskhash': 'f1c4cca2ea1fc1181c40afc8518d75db42d5c5e841fc4a4dbdcba30e1a9879e0'}}
{'mark': 'alive', 'where': {'taskhash': '9cd45da2fb6aa303a7828ec3cad7709bde2882422792e696016663f390aeece0', 'taskhash2': 'f1c4cca2ea1fc1181c40afc8518d75db42d5c5e841fc4a4dbdcba30e1a9879e0'}}
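
The overwrite in the third command is what one would expect if the KEY VALUE pairs are collected into a plain dictionary. A minimal sketch, assuming that is how the pairs are gathered (`build_where` is a hypothetical helper, not the actual bitbake-hashclient code, and the hashes are shortened for readability):

```python
def build_where(pairs):
    """Collect (key, value) pairs into a dict; a repeated key keeps only the last value."""
    where = {}
    for key, value in pairs:
        where[key] = value
    return where


# Same key twice: the second value replaces the first.
same_key = build_where([("taskhash", "9cd45da"), ("taskhash", "f1c4cca")])
# Different keys: both survive, but they refine one query instead of matching two rows.
other_key = build_where([("taskhash", "9cd45da"), ("taskhash2", "f1c4cca")])

print(same_key)   # {'taskhash': 'f1c4cca'}
print(other_key)  # {'taskhash': '9cd45da', 'taskhash2': 'f1c4cca'}
```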

Perhaps I'm missing something.
What I was proposing would be supporting something similar to this:
Client Side:
$ ./bitbake-hashclient --address localhost:8688 gc-mark alive --where 9cd45da f1c4cca

Server Side:
{'mark': 'alive', 'where': {'taskhash': ['9cd45da', 'f1c4cca']}}
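
A rough sketch of how the client could build such a message, assuming repeated values for one column are grouped into a list (`build_where_lists` is a hypothetical helper, and the server would also have to learn to accept list values):

```python
from collections import defaultdict


def build_where_lists(pairs):
    """Group (key, value) pairs so that every key maps to a list of all its values."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


message = {
    "mark": "alive",
    "where": build_where_lists([("taskhash", "9cd45da"), ("taskhash", "f1c4cca")]),
}
print(message)  # {'mark': 'alive', 'where': {'taskhash': ['9cd45da', 'f1c4cca']}}
```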

No, you aren't missing anything, I was incorrect :)

There are 2 approaches here.

The first would be to improve bitbake-hashclient to add the input
stream command, but keep the existing API with the server. Each mark
sent through the file/pipe would result in a separate `mark` command
being sent to the server. This should still be faster since it will
reuse the same connection to the server as long as bitbake-hashclient
is running, which will save the overhead of establishing and
negotiating a connection. This also has the advantage that it doesn't
require server-side changes, but it does mean a round-trip for each
`mark` command.

I have a simple POC for this first approach, and it seems to be working fine.
I still need to test with the remote server to have a better sense of how
much faster it is, but based on my tests with a local server, I expect it to
still be in the "minutes".
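
For reference, the first approach can be sketched roughly as follows. This is only an illustration: `RecordingClient` is a hypothetical stand-in for an already connected client, and both the `gc_mark(mark, where)` method name and the `{"count": ...}` reply shape are assumptions, not the real API.

```python
import io


class RecordingClient:
    """Hypothetical stand-in for a client that holds one open server connection."""

    def __init__(self):
        self.sent = []

    def gc_mark(self, mark, where):
        # A real client would send one message here and wait for the reply
        # (one round-trip per call). The reply shape is assumed.
        self.sent.append({"mark": mark, "where": where})
        return {"count": 1}


def mark_from_stream(client, mark, column, stream):
    """Send one mark command per non-empty input line, reusing the same client."""
    total = 0
    for line in stream:
        value = line.strip()
        if not value:
            continue  # skip blank lines
        total += client.gc_mark(mark, {column: value})["count"]
    return total


client = RecordingClient()
hashes = io.StringIO("9cd45da\nf1c4cca\n\n")
marked = mark_from_stream(client, "alive", "taskhash", hashes)
print(marked, len(client.sent))  # 2 2
```

Passing `sys.stdin` as `stream` would cover the pipe case.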

The second is to make a new server API to allow streaming of marks.
The protocol between the client and server allows commands to go into
a "stream" mode (which to be clear is distinct from the "input stream"
discussed above, the name overlap is unfortunate). This mode allows
the client to send newline-delimited messages as fast as it wants
(usually in some large batch size), and asynchronously read the
responses from the server (see send_stream_batch() in the client
code), allowing multiple messages to be in-flight at once.
Implementing a new mark API on the server using this mechanism would
be the fastest possible way of communicating the marks. Of course the
disadvantage here is that it would require a new API on the server, so a
server upgrade would be required to use it. That said, it may be
possible to make bitbake-hashclient intelligent enough to know if this
new API exists and if so use it for the "input stream" and if not
fall back to the older messages as described above.

I haven't tried the "stream mode" yet, but had a quick look at the code and I
don't see any obvious reason for it not to work. :) so thanks!!
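
To illustrate why having multiple messages in flight helps, here is a toy sketch of the general idea using plain asyncio streams. It is not the hashserv protocol and not `send_stream_batch()` itself: the one-JSON-object-per-line format, the `handle` server and the `send_batch` helper are all made up for the example.

```python
import asyncio
import json


async def handle(reader, writer):
    # Toy server: answer every newline-delimited JSON message with one reply line.
    while True:
        line = await reader.readline()
        if not line:
            break  # client closed the connection
        msg = json.loads(line)
        writer.write((json.dumps({"marked": msg["taskhash"]}) + "\n").encode())
        await writer.drain()
    writer.close()


async def send_batch(reader, writer, messages):
    """Write the whole batch without waiting per message; read replies concurrently."""

    async def send():
        for msg in messages:
            writer.write((json.dumps(msg) + "\n").encode())
        await writer.drain()

    async def receive():
        return [json.loads(await reader.readline()) for _ in messages]

    _, replies = await asyncio.gather(send(), receive())
    return replies


async def main():
    # Port 0 lets the OS pick a free port for the toy server.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    batch = [{"taskhash": h} for h in ("9cd45da", "f1c4cca")]
    replies = await send_batch(reader, writer, batch)
    writer.close()
    await writer.wait_closed()
    server.close()
    return replies


replies = asyncio.run(main())
print(replies)  # [{'marked': '9cd45da'}, {'marked': 'f1c4cca'}]
```

Sending and receiving run concurrently, so the client never waits for one reply before writing the next message, and a large batch cannot stall on a full socket buffer.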
 
I've been trying to tackle this "make bitbake-hashclient intelligent enough to know
if this new API exists" and tbh I'm struggling a bit..
 
Whenever I use a command that isn't available on the server I just get "Error talking
to server: Connection closed". So far I'm not really seeing a way of getting more
info on the particular error, and blindly assuming that a "connection closed" means the
new API is not available does not seem right.
 
I was thinking of adding a new command to the server to "request the API", or check
if a particular command is available, but that in itself changes the API :,)
so it's a bit chicken and egg, which makes me think this might not be the way to go..