All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Alex Bennée" <alex.bennee@linaro.org>
To: qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
	"Thomas Huth" <thuth@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Eduardo Habkost" <ehabkost@redhat.com>,
	"Mark Cave-Ayland" <mark.cave-ayland@ilande.co.uk>,
	"Philippe Mathieu-Daudé" <pmathieu@redhat.com>,
	"Li Feng" <fengli@smartx.com>,
	"Marc-André Lureau" <marcandre.lureau@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: tests/test-char hangs in CI (Centos7 with check-crypto-old-nettle)
Date: Wed, 16 Dec 2020 15:46:52 +0000	[thread overview]
Message-ID: <87wnxhsqba.fsf@linaro.org> (raw)
In-Reply-To: <87eejqu6y9.fsf@linaro.org>


Alex Bennée <alex.bennee@linaro.org> writes:

> Hi,
>
> Over the last few weeks we've seen a steadily more common failure in
> test-char on CI. It seems to only manifest itself on the
> check-crypto-old-nettle build set and fails with a message like:
>
>   Running test test-char
>   Unexpected error in object_property_try_add() at ../qom/object.c:1219:
>   attempt to add duplicate property 'serial-id' to object (type 'container')
>   ERROR test-char - too few tests run (expected 38, got 9)
>   make: *** [run-test-87] Error 1

This is fixed by Eduardo's patch:

  Subject: [PATCH 0/2] Fix test-char reference counting bug
  Date: Tue, 15 Dec 2020 17:41:31 -0500
  Message-Id: <20201215224133.3545901-1-ehabkost@redhat.com>

which also fixes:
>
> passed with flying colours. It seems there must be an interaction
> between the tests to cause the failure. Attempts to valgrind out any
> strangeness were met with valgrind hanging on exit but otherwise
> passing. You can avoid the hang in valgrind by commenting out:
>
> --8<---------------cut here---------------start------------->8---
> modified   tests/test-char.c
> @@ -1434,11 +1434,11 @@ static void char_hotswap_test(void)
>      g_free(chr_args);
>  }
>  
> -static SocketAddress tcpaddr = {
> -    .type = SOCKET_ADDRESS_TYPE_INET,
> -    .u.inet.host = (char *)"127.0.0.1",
> -    .u.inet.port = (char *)"0",
> -};
> +/* static SocketAddress tcpaddr = { */
> +/*     .type = SOCKET_ADDRESS_TYPE_INET, */
> +/*     .u.inet.host = (char *)"127.0.0.1", */
> +/*     .u.inet.port = (char *)"0", */
> +/* }; */
>  #ifndef WIN32
>  static SocketAddress unixaddr = {
>      .type = SOCKET_ADDRESS_TYPE_UNIX,
> @@ -1534,12 +1534,12 @@ int main(int argc, char **argv)
>      g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
>                           &client8 ##name, char_socket_client_dupid_test)
>  
> -    if (has_ipv4) {
> -        SOCKET_SERVER_TEST(tcp, &tcpaddr);
> -        SOCKET_CLIENT_TEST(tcp, &tcpaddr);
> -        g_test_add_data_func("/char/socket/server/two-clients/tcp", &tcpaddr,
> -                             char_socket_server_two_clients_test);
> -    }
> +    /* if (has_ipv4) { */
> +    /*     SOCKET_SERVER_TEST(tcp, &tcpaddr); */
> +    /*     SOCKET_CLIENT_TEST(tcp, &tcpaddr); */
> +    /*     g_test_add_data_func("/char/socket/server/two-clients/tcp", &tcpaddr, */
> +    /*                          char_socket_server_two_clients_test); */
> +    /* } */
>  #ifndef WIN32
>      SOCKET_SERVER_TEST(unix, &unixaddr);
>      SOCKET_CLIENT_TEST(unix, &unixaddr);
> --8<---------------cut here---------------end--------------->8---

the need for the above, however we still see the "leaked" fd's and I
also noticed you can't run the same test in parallel in the same
directory (soak test and valgrind) because of a non-unique socket
address in:

#ifndef WIN32
static SocketAddress unixaddr = {
    .type = SOCKET_ADDRESS_TYPE_UNIX,
    .u.q_unix.path = (char *)"test-char.sock",
};
#endif

>
> At that point valgrind doesn't have any smoking guns of corruption but
> minor leaks and also it seems a bunch of fd's being used up:
>
>   [alex@centos7 crypto-old-nettle]$ valgrind --tool=memcheck --leak-check=full --track-origins=yes --track-fds=yes ./tests/test-char
>   ==28263== Memcheck, a memory error detector
>   ==28263== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>   ==28263== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
>   ==28263== Command: ./tests/test-char
>   ==28263==
>   /char/null: OK
>   /char/invalid: OK
>   /char/ringbuf: OK
>   /char/mux: OK
>   /char/stdio: OK
>   /char/pipe: OK
>   /char/file: OK
>   /char/file-fifo: OK
>   /char/udp: OK
>   /char/serial: OK
>   /char/hotswap: OK
>   /char/websocket: OK
>   /char/socket/server/mainloop/unix: OK
>   /char/socket/server/wait-conn/unix: OK
>   /char/socket/server/mainloop-fdpass/unix: OK
>   /char/socket/server/wait-conn-fdpass/unix: OK
>   /char/socket/server/two-clients/unix: OK
>   /char/socket/client/mainloop/unix: OK
>   /char/socket/client/wait-conn/unix: OK
>   /char/socket/client/mainloop-reconnect/unix: OK
>   /char/socket/client/wait-conn-reconnect/unix: OK
>   /char/socket/client/mainloop-fdpass/unix: OK
>   /char/socket/client/wait-conn-fdpass/unix: OK
>   /char/socket/client/reconnect-error/unix: OK
>   ==28263==
>   ==28263== FILE DESCRIPTORS: 12 open at exit.
>   ==28263== Open AF_UNIX socket 12: <unknown>
>   ==28263==    at 0x5B9EAE7: socket (syscall-template.S:81)
>   ==28263==    by 0x14B1AC: qemu_socket (osdep.c:472)
>   ==28263==    by 0x1562FB: unix_connect_saddr (qemu-sockets.c:985)
>   ==28263==    by 0x1575E8: socket_connect (qemu-sockets.c:1158)
>   ==28263==    by 0x12A1D2: qio_channel_socket_connect_sync (channel-socket.c:145)
>   ==28263==    by 0x13A7B8: tcp_chr_connect_client_task (char-socket.c:1123)
>   ==28263==    by 0x125640: qio_task_thread_worker (task.c:124)
>   ==28263==    by 0x14F245: qemu_thread_start (qemu-thread-posix.c:521)
>   ==28263==    by 0x588AEA4: start_thread (pthread_create.c:307)
>   ==28263==    by 0x5B9D96C: clone (clone.S:111)
>   ==28263==
>   ==28263== Open AF_UNIX socket 11: <unknown>
>   ==28263==    at 0x5B9EAE7: socket (syscall-template.S:81)
>   ==28263==    by 0x14B1AC: qemu_socket (osdep.c:472)
>   ==28263==    by 0x1562FB: unix_connect_saddr (qemu-sockets.c:985)
>   ==28263==    by 0x1575E8: socket_connect (qemu-sockets.c:1158)
>   ==28263==    by 0x12A1D2: qio_channel_socket_connect_sync (channel-socket.c:145)
>   ==28263==    by 0x13A7B8: tcp_chr_connect_client_task (char-socket.c:1123)
>   ==28263==    by 0x125640: qio_task_thread_worker (task.c:124)
>   ==28263==    by 0x14F245: qemu_thread_start (qemu-thread-posix.c:521)
>   ==28263==    by 0x588AEA4: start_thread (pthread_create.c:307)
>   ==28263==    by 0x5B9D96C: clone (clone.S:111)
>   ==28263==
>   ==28263== Open file descriptor 10:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x4EC6B06: g_wakeup_new (gwakeup.c:146)
>   ==28263==    by 0x4E808B3: g_main_context_new (gmain.c:656)
>   ==28263==    by 0x4E842C4: g_get_worker_context (gmain.c:5797)
>   ==28263==    by 0x4E84332: ref_unix_signal_handler_unlocked (gmain.c:5205)
>   ==28263==    by 0x4E84458: g_child_watch_source_new (gmain.c:5384)
>   ==28263==    by 0x4EA5E64: wait_for_child (gtestutils.c:2814)
>   ==28263==    by 0x4EA9993: g_test_trap_subprocess (gtestutils.c:3100)
>   ==28263==    by 0x11BCB3: char_stdio_test (test-char.c:125)
>   ==28263==    by 0x4EA92E0: test_case_run (gtestutils.c:2255)
>   ==28263==    by 0x4EA92E0: g_test_run_suite_internal (gtestutils.c:2339)
>   ==28263==    by 0x4EA919A: g_test_run_suite_internal (gtestutils.c:2351)
>   ==28263==    by 0x4EA947D: g_test_run_suite (gtestutils.c:2426)
>   ==28263==
>   ==28263== Open file descriptor 8:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x4EC6B06: g_wakeup_new (gwakeup.c:146)
>   ==28263==    by 0x4E808B3: g_main_context_new (gmain.c:656)
>   ==28263==    by 0x4E8098C: g_main_context_default (gmain.c:692)
>   ==28263==    by 0x4E818F4: g_source_attach (gmain.c:1194)
>   ==28263==    by 0x158D5B: qemu_init_main_loop (main-loop.c:167)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 7:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x16C76B: event_notifier_init (event_notifier-posix.c:41)
>   ==28263==    by 0x1626A7: aio_context_new (async.c:508)
>   ==28263==    by 0x158CF9: qemu_init_main_loop (main-loop.c:159)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 6:
>   ==28263==    at 0x5B9DEB7: epoll_create1 (syscall-template.S:81)
>   ==28263==    by 0x165B5D: fdmon_epoll_setup (fdmon-epoll.c:151)
>   ==28263==    by 0x16269D: aio_context_new (async.c:506)
>   ==28263==    by 0x158CF9: qemu_init_main_loop (main-loop.c:159)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 5:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x16C76B: event_notifier_init (event_notifier-posix.c:41)
>   ==28263==    by 0x1626A7: aio_context_new (async.c:508)
>   ==28263==    by 0x1588AF: iohandler_init.part.0 (main-loop.c:551)
>   ==28263==    by 0x158C23: iohandler_init (main-loop.c:575)
>   ==28263==    by 0x158C23: qemu_set_fd_handler (main-loop.c:572)
>   ==28263==    by 0x158CF1: qemu_signal_init (main-loop.c:109)
>   ==28263==    by 0x158CF1: qemu_init_main_loop (main-loop.c:154)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 4:
>   ==28263==    at 0x5B9DEB7: epoll_create1 (syscall-template.S:81)
>   ==28263==    by 0x165B5D: fdmon_epoll_setup (fdmon-epoll.c:151)
>   ==28263==    by 0x16269D: aio_context_new (async.c:506)
>   ==28263==    by 0x1588AF: iohandler_init.part.0 (main-loop.c:551)
>   ==28263==    by 0x158C23: iohandler_init (main-loop.c:575)
>   ==28263==    by 0x158C23: qemu_set_fd_handler (main-loop.c:572)
>   ==28263==    by 0x158CF1: qemu_signal_init (main-loop.c:109)
>   ==28263==    by 0x158CF1: qemu_init_main_loop (main-loop.c:154)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 3:
>   ==28263==    at 0x5B97C89: syscall (syscall.S:38)
>   ==28263==    by 0x16C0C1: qemu_signalfd (compatfd.c:103)
>   ==28263==    by 0x158CC7: qemu_signal_init (main-loop.c:101)
>   ==28263==    by 0x158CC7: qemu_init_main_loop (main-loop.c:154)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 2: /dev/pts/1
>   ==28263==    <inherited from parent>
>   ==28263==
>   ==28263== Open file descriptor 1: /dev/pts/1
>   ==28263==    <inherited from parent>
>   ==28263==
>   ==28263== Open file descriptor 0: /dev/pts/1
>   ==28263==    <inherited from parent>
>   ==28263==
>   ==28263==
>   ==28263== HEAP SUMMARY:
>   ==28263==     in use at exit: 62,568 bytes in 537 blocks
>   ==28263==   total heap usage: 5,803 allocs, 5,266 frees, 2,400,984 bytes allocated
>   ==28263==
>   ==28263== 576 bytes in 1 blocks are possibly lost in loss record 352 of 359
>   ==28263==    at 0x4C2C089: calloc (vg_replace_malloc.c:762)
>   ==28263==    by 0x4012784: allocate_dtv (dl-tls.c:317)
>   ==28263==    by 0x4012784: _dl_allocate_tls (dl-tls.c:533)
>   ==28263==    by 0x588B87B: allocate_stack (allocatestack.c:539)
>   ==28263==    by 0x588B87B: pthread_create@@GLIBC_2.2.5 (pthread_create.c:447)
>   ==28263==    by 0x1500E5: qemu_thread_create (qemu-thread-posix.c:558)
>   ==28263==    by 0x1619D8: rcu_init_complete (rcu.c:379)
>   ==28263==    by 0x16FC9C: __libc_csu_init (in /home/alex/lsrc/qemu.git/builds/crypto-old-nettle/tests/test-char)
>   ==28263==    by 0x5AC14E4: (below main) (libc-start.c:225)
>   ==28263==
>   ==28263== 576 bytes in 1 blocks are possibly lost in loss record 353 of 359
>   ==28263==    at 0x4C2C089: calloc (vg_replace_malloc.c:762)
>   ==28263==    by 0x4012784: allocate_dtv (dl-tls.c:317)
>   ==28263==    by 0x4012784: _dl_allocate_tls (dl-tls.c:533)
>   ==28263==    by 0x588B87B: allocate_stack (allocatestack.c:539)
>   ==28263==    by 0x588B87B: pthread_create@@GLIBC_2.2.5 (pthread_create.c:447)
>   ==28263==    by 0x4EC7C81: g_system_thread_new (gthread-posix.c:1177)
>   ==28263==    by 0x4EAA82E: g_thread_new_internal (gthread.c:874)
>   ==28263==    by 0x4EAA8E7: g_thread_new (gthread.c:827)
>   ==28263==    by 0x4E842E0: g_get_worker_context (gmain.c:5798)
>   ==28263==    by 0x4E84332: ref_unix_signal_handler_unlocked (gmain.c:5205)
>   ==28263==    by 0x4E84458: g_child_watch_source_new (gmain.c:5384)
>   ==28263==    by 0x4EA5E64: wait_for_child (gtestutils.c:2814)
>   ==28263==    by 0x4EA9993: g_test_trap_subprocess (gtestutils.c:3100)
>   ==28263==    by 0x11BCB3: char_stdio_test (test-char.c:125)
>   ==28263==
>   ==28263== LEAK SUMMARY:
>   ==28263==    definitely lost: 0 bytes in 0 blocks
>   ==28263==    indirectly lost: 0 bytes in 0 blocks
>   ==28263==      possibly lost: 1,152 bytes in 2 blocks
>   ==28263==    still reachable: 61,416 bytes in 535 blocks
>   ==28263==         suppressed: 0 bytes in 0 blocks
>   ==28263== Reachable blocks (those to which a pointer was found) are not shown.
>   ==28263== To see them, rerun with: --leak-check=full --show-leak-kinds=all
>   ==28263==
>   ==28263== For lists of detected and suppressed errors, rerun with: -s
>   ==28263== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
>
> I also have a core dump of the binary when it fails but I suspect it
> requires someone with a bit more deftness of QOM to see if anything can
> be devised from its guts.
>
> So anyone want to hazard a guess at what's going on?


-- 
Alex Bennée


      reply	other threads:[~2020-12-16 15:52 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-15 20:31 tests/test-char hangs in CI (Centos7 with check-crypto-old-nettle) Alex Bennée
2020-12-16 15:46 ` Alex Bennée [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wnxhsqba.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=berrange@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=fengli@smartx.com \
    --cc=marcandre.lureau@redhat.com \
    --cc=mark.cave-ayland@ilande.co.uk \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=pmathieu@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.