qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] Remaining CI failures
@ 2019-01-11 19:10 Alex Bennée
  2019-01-14 10:20 ` Peter Maydell
  2019-01-14 15:02 ` Emilio G. Cota
  0 siblings, 2 replies; 5+ messages in thread
From: Alex Bennée @ 2019-01-11 19:10 UTC (permalink / raw)
  To: qemu-devel@nongnu.org
  Cc: Paolo Bonzini, Philippe Mathieu-Daudé, cota@braap.org,
	Peter Maydell, Daniel P. Berrangé


Hi,

So trying to narrow down the remaining failures in the CI system. There
is one with a patch in flight (use g_usleep instead of sleep) but there
remains two failure modes, both erratic.

tests/qht-par:

I can trigger this on my dev machine with a gprof enabled build:

  # QEMU configure log Fri Jan 11 14:10:45 GMT 2019
  # Configured with: './configure' '--disable-tools' '--disable-docs' '--enable-gprof' '--enable-gcov'

I only seem to be able to trigger it when running via the wrapper in the
make system:

  retry.py -n 30 --invert make check-tests/test-qht-par

Eventually this crashes with:

  ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)

Leaving a core dump for the child:

  Core was generated by `tests/qht-bench -R -S0.1 -D10000 -N1 -n 2 -u 20 -d 1'
  (gdb) info thread
    Id   Target Id         Frame
  * 1    Thread 0x7ffff6a7e700 (LWP 15473) 0x000055555557c306 in call_rcu_thread (opaque=0x0) at util/rcu.c:255
    2    Thread 0x7ffff7fbe780 (LWP 15472) 0x00005555555b8d50 in gcov_read_words ()
  (gdb) bt
  #0  __mcount_internal (frompc=<optimised out>, selfpc=93824992383630) at mcount.c:98
  #1  0x00007ffff6e15e24 in mcount () at ../sysdeps/x86_64/_mcount.S:51
  #2  0x000055555557928e in qemu_event_reset (ev=0x3cc692b8d452f400) at util/qemu-thread-posix.c:408
  #3  0x000055555557c306 in call_rcu_thread (opaque=0x0) at util/rcu.c:255
  #4  0x0000555555579630 in qemu_thread_start (args=0x555555808080) at util/qemu-thread-posix.c:502
  #5  0x00007ffff70e96db in start_thread (arg=0x7ffff6a7e700) at pthread_create.c:463
  #6  0x00007ffff6e1288f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
  (gdb) thread 2
  [Switching to thread 2 (Thread 0x7ffff7fbe780 (LWP 15472))]
  #0  0x00005555555b8d50 in gcov_read_words ()
  (gdb) bt
  #0  0x00005555555b8d50 in gcov_read_words ()
  #1  0x00005555555b9453 in __gcov_read_summary ()
  #2  0x00005555555ba461 in gcov_do_dump ()
  #3  0x00005555555bab62 in __gcov_exit ()
  #4  0x00005555555b8c22 in _GLOBAL__sub_D_00100_1_json_lexer_init () at qobject/json-lexer.c:365
  #5  0x00007ffff7de5b73 in _dl_fini () at dl-fini.c:138
  #6  0x00007ffff6d34041 in __run_exit_handlers (status=0, listp=0x7ffff70dc718 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at
  exit.c:108
  #7  0x00007ffff6d3413a in __GI_exit (status=<optimised out>) at exit.c:139
  #8  0x00007ffff6d12b9e in __libc_start_main (main=0x555555575d50 <main>, argc=11, argv=0x7fffffffdf78, init=<optimised out>, fini=<optimised out>, rtld_fini=<optimised out>, stack_end=0x7fffffffdf68) at ../csu/libc-start.c:344
  #9  0x0000555555573c1a in _start ()

To trigger the second failure I had to run on a limited Xenial machine
(16.04, 2 cores, 8Gb RAM) again with gprof build:

  # QEMU configure log Thu 10 Jan 22:22:52 GMT 2019
  # Configured with: './configure' '--enable-gprof' '--enable-gcov' '--disable-pie' '--target-list=aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu'

Running it like Travis does:

  retry.py -n 40 --invert -- make -j 3 check V=1

It eventually fails with:

  PASS: tests/test-hmp
  make: write error: stdout

It's hard to tell from the output what was running that failed. So far
I've managed to get the following information out of execsnoop:

  qemu-system-x86  1345   1332     0 x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display none -S -M pc-i440fx-4.0
  sh               1350   1332     0 /bin/sh -c exec x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/q
  t
  qemu-system-x86  1350   1332     0 x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display none -S -M pc-q35-3.1
  sh               1356   1332     0 /bin/sh -c exec x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/q
  t
  qemu-system-x86  1356   1332     0 x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display none -S -M pc-i440fx-3.1
  sh               1361   1332     0 /bin/sh -c exec x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/q
  t
  qemu-system-x86  1361   1332     0 x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display none -S -M pc-q35-4.0
  sh               1366   1332     0 /bin/sh -c exec x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/q
  t
  qemu-system-x86  1366   1332     0 x86_64-softmmu/qemu-system-x86_64 -qtest unix:/tmp/qtest-1332.sock,nowait -qtest-log /dev/null -chardev socket,path=/tmp/qtest-1332.qmp,no
  wait,id=char0 -mon chardev=char0,mode=control -machine accel=qtest -display none -S -M none -m 2
  tset             1370   2129     0 /usr/bin/tset -c
  git              1374   1373     0 /usr/bin/git rev-parse --git-dir
  git              1376   1375     0 /usr/bin/git symbolic-ref HEAD
  git              1378   1377     0 /usr/bin/git config branch.testing/next.remote

So even though tests/test-hmp has nominally passed I think it has to be
one of the x86_64 tests unless there is something that was triggered a
lot longer ago finally crashing out somehow.

I did run the make under strace to see who is using O_NONBLOCK but even
after filtering out a bunch of stuff it seems to be quite embedded:

 ag "O_NONBLOCK" check.strace  | ag -v "/sys" | ag -v "/dev" | grep -v ".git" | wc -l
 2449

Anyway I thought it would be worth dumping my current debug state to the
list in case anyone has any bright ideas about whats going on or fancies
some weekend debugging.

The retry.py is just a hacky script I use, I'm sure everyone else has
something similar. See https://github.com/stsquad/retry

--
Alex Bennée

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-01-14 15:29 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-01-11 19:10 [Qemu-devel] Remaining CI failures Alex Bennée
2019-01-14 10:20 ` Peter Maydell
2019-01-14 15:02 ` Emilio G. Cota
2019-01-14 15:12   ` Daniel P. Berrangé
2019-01-14 15:29   ` Alex Bennée

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).