public inbox for buildroot@busybox.net
 help / color / mirror / Atom feed
* [Buildroot] [PATCH 1/1] support/testing: test_aichat: improve test reliability
@ 2026-03-18 21:38 Julien Olivain via buildroot
  2026-03-19 20:55 ` Julien Olivain via buildroot
  0 siblings, 1 reply; 2+ messages in thread
From: Julien Olivain via buildroot @ 2026-03-18 21:38 UTC (permalink / raw)
  To: buildroot; +Cc: Julien Olivain

Since the llama.cpp update in Buildroot commit [1], test_aichat can
fail for several reasons:

The loop checking for llama-server availability can fail if curl
succeeds but the returned JSON data is not formatted as expected.
This can happen when the server is ready but the model is not
completely loaded. In that case, the server returns:

    {"error":{"message":"Loading model","type":"unavailable_error","code":503}}

This commit ignores Python KeyError exceptions during the server
check, to avoid failing when this message is received.

Also, this new llama-server version introduced prompt caching, which
uses too much memory. This commit completely disables prompt caching
by adding "--cache-ram 0" to the llama-server options.

[1] https://gitlab.com/buildroot.org/buildroot/-/commit/05c36d5d875713521f99b7bad48be316dcde2510

Signed-off-by: Julien Olivain <ju.o@free.fr>
---
 support/testing/tests/package/test_aichat.py | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/support/testing/tests/package/test_aichat.py b/support/testing/tests/package/test_aichat.py
index 5fc554bbb5..50c07f87dc 100644
--- a/support/testing/tests/package/test_aichat.py
+++ b/support/testing/tests/package/test_aichat.py
@@ -70,6 +70,8 @@ class TestAiChat(infra.basetest.BRTest):
         llama_opts = "--log-file /tmp/llama-server.log"
         # We set a fixed seed, to reduce variability of the test
         llama_opts += " --seed 123456789"
+        # We disable prompt caching to reduce RAM usage
+        llama_opts += " --cache-ram 0"
         llama_opts += f" --hf-repo {hf_model}"
 
         # We start a llama-server in background, which will expose an
@@ -91,9 +93,12 @@ class TestAiChat(infra.basetest.BRTest):
             if ret == 0:
                 models_json = "".join(out)
                 models = json.loads(models_json)
-                model_name = models['models'][0]['name']
-                if model_name == hf_model:
-                    break
+                try:
+                    model_name = models['models'][0]['name']
+                    if model_name == hf_model:
+                        break
+                except KeyError:
+                    pass
         else:
             self.fail("Timeout while waiting for llama-server.")
 
-- 
2.53.0

_______________________________________________
buildroot mailing list
buildroot@buildroot.org
https://lists.buildroot.org/mailman/listinfo/buildroot

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [Buildroot] [PATCH 1/1] support/testing: test_aichat: improve test reliability
  2026-03-18 21:38 [Buildroot] [PATCH 1/1] support/testing: test_aichat: improve test reliability Julien Olivain via buildroot
@ 2026-03-19 20:55 ` Julien Olivain via buildroot
  0 siblings, 0 replies; 2+ messages in thread
From: Julien Olivain via buildroot @ 2026-03-19 20:55 UTC (permalink / raw)
  To: Julien Olivain; +Cc: buildroot

On 18/03/2026 22:38, Julien Olivain via buildroot wrote:
> Since the llama.cpp update in Buildroot commit [1], test_aichat can
> fail for several reasons:
> 
> The loop checking for llama-server availability can fail if curl
> succeeds but the returned JSON data is not formatted as expected.
> This can happen when the server is ready but the model is not
> completely loaded. In that case, the server returns:
> 
>     {"error":{"message":"Loading model","type":"unavailable_error","code":503}}
> 
> This commit ignores Python KeyError exceptions during the server
> check, to avoid failing when this message is received.
> 
> Also, this new llama-server version introduced prompt caching, which
> uses too much memory. This commit completely disables prompt caching
> by adding "--cache-ram 0" to the llama-server options.
> 
> [1] https://gitlab.com/buildroot.org/buildroot/-/commit/05c36d5d875713521f99b7bad48be316dcde2510
> 
> Signed-off-by: Julien Olivain <ju.o@free.fr>

Applied to master.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-03-19 20:55 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-18 21:38 [Buildroot] [PATCH 1/1] support/testing: test_aichat: improve test reliability Julien Olivain via buildroot
2026-03-19 20:55 ` Julien Olivain via buildroot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox