From: Philippe Mathieu-Daudé
Date: Thu, 31 Jan 2019 11:23:07 +0100
Subject: Re: [Qemu-devel] [PATCH 14/18] Boot Linux Console Test: add a test for ppc64 + pseries
Message-ID: <209afb25-10e9-ced3-e6d3-d6324d571c89@redhat.com>
References: <20190117185628.21862-1-crosa@redhat.com> <20190117185628.21862-15-crosa@redhat.com> <8736pk1xs8.fsf@linaro.org>
To: Cleber Rosa, Alex Bennée
Cc: qemu-devel@nongnu.org, Stefan Markovic, Aleksandar Markovic, Eduardo Habkost, Caio Carrara, qemu-s390x@nongnu.org, Aurelien Jarno, Cornelia Huck, Fam Zheng, Wainer dos Santos Moschetta, Aleksandar Rikalo, Peter Maydell

On 1/31/19 3:37 AM, Cleber Rosa wrote:
> On 1/22/19 11:07 AM, Alex Bennée wrote:
>> Cleber Rosa writes:
>>
>>> Just like the previous tests, boots a Linux kernel on a ppc64 target
>>> using the pseries machine.
>>
>> So running this on my rather slow SynQuacer I get:
>>
>>   (04/16) /home/alex/lsrc/qemu.git/tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_ppc64_pseries:
>>   INTERRUPTED: Test reported status but did not finish\nRunner error
>>   occurred: Timeout reached\nOriginal status: ERROR\n{'name':
>>   '04-/home/alex/lsrc/qemu.git/tests/acceptance/boot_linux_console.py:BootLinuxConsole.test_ppc64_pseries',
>>   'logdir': '/home/alex/lsrc/qemu.git/te... (60.93 s)
>>
>> which I'm guessing is a timeout occurring.
>>
>
> Yes, that's what's happening. It's hard to pinpoint, and control, the
> points of sluggishness in such a test running on a different
> environment. For this one execution, I do trust your assessment, and
> it's most certainly caused by your "slow SynQuacer" spending too much
> time running emulation code.
>
> But I'd like to mention that there are other possibilities. One is
> that you're hitting an "asset fetcher bug" that I recently fixed in
> Avocado [1] (the fix will be available in 68.0, to be released next
> Monday, Feb 4th).
>
> Even with that bug fixed, I feel it's unfair for test code to spend
> its time waiting for a file to download when it's not testing *the
> file download itself*. Because of that, there are plans to add an
> (optional) job pre-processing step that will make sure the needed
> assets are in the cache ahead of time [2][3].
>
>> I wonder if that means we should:
>>
>> a) set timeouts longer for when running on TCG

I hit the same problem with the VM tests, and suggested a poor
"increase timeout" patch [1].
Then Peter sent a different patch [2] which happened to inadvertently
resolve my problem, since the longest a VM took to boot on the Cavium
ThunderX I have access to was 288 seconds, which is just below the
300-second limit =)
I understood that nobody seemed to really care about testing the x86
TCG backend this way, so I didn't worry much.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2018-07/msg03416.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2018-08/msg04977.html
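For reference, here is a rough, untested sketch of what option "a)"
could look like (assuming Avocado honours the per-class "timeout"
attribute, in seconds, on tests derived from avocado_qemu.Test; the
600-second value and the asset URL are only placeholders, not what the
actual test uses):

  from avocado_qemu import Test

  class BootLinuxConsole(Test):
      # Placeholder value: give slow TCG-only hosts more headroom than
      # the runner's default timeout.
      timeout = 600

      def test_ppc64_pseries(self):
          # Placeholder URL: the real test fetches a known kernel image
          # (with a hash) through the Avocado asset cache.
          kernel_path = self.fetch_asset('https://example.org/vmlinuz-ppc64')
          self.vm.add_args('-machine', 'pseries',
                           '-kernel', kernel_path,
                           '-append', 'console=hvc0')
          self.vm.launch()
          # ... wait for the expected console output here ...
          self.vm.shutdown()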
>> or
>> b) split tests into TCG and KVM tests and select KVM tests on
>>    appropriate HW
>>
>
> I wonder the same, and I believe this falls into a similar scenario
> we've seen with the setup of console devices in the QEMUMachine
> class. I started by setting the device types at the framework level,
> and then reverted to the machine's default devices (using '-serial'),
> because the "default" behavior of QEMU is usually what a test writer
> wants when not setting something else explicitly.
>
>> The qemu.py code has (slightly flawed) logic for detecting KVM and
>> passing --enable-kvm. Maybe we should be doing that here?
>>
>
> I'm not sure. IMO, the common question is: should we magically (at a
> framework level) configure tests based on probed host environment
> characteristics? I feel like we should attempt to minimize that for
> the sake of tests being more obvious and more easily reproducible.

I agree we shouldn't randomly test different features, but rather
explicitly add two tests (TCG/KVM), and if it is not possible to run a
test, mark it as SKIPPED (a rough sketch of what I mean is at the end
of this mail).
A user with KVM available would then have to run with
--filter-out=tcg, or build QEMU with --disable-tcg.

> And because of that, I'd go, *initially*, with an approach more
> similar to your option "b".
>
> Having said that, we don't want to rewrite most tests just to be able
> to test with either KVM or TCG, if the tests are not explicitly
> testing KVM or TCG. At this point, using KVM or TCG is test/framework
> *configuration*, and in Avocado we hope to solve this by having the
> executed tests easily identifiable and reproducible (a test ID will
> contain information about the options passed, and a replay of the job
> will apply the same configuration).
>
> For now, I think the best approach is to increase the timeout,
> because I think it's much worse to have to deal with false negatives
> (longer execution times that don't really mean a failure) than to
> have a test possibly take some more time to finish.
>
> And sorry for the extremely long answer!
> - Cleber.
>
> [1] - https://github.com/avocado-framework/avocado/pull/2996
> [2] - https://trello.com/c/WPd4FrIy/1479-add-support-to-specify-assets-in-test-docstring
> [3] - https://trello.com/c/CKP7YS6G/1481-on-cache-check-for-asset-fetcher
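PS: the rough sketch I mentioned above, with explicit TCG and KVM
flavours of the same test (untested; the class and helper names, the
bare /dev/kvm check and the '-enable-kvm' switch are only meant to
illustrate the idea, assuming Avocado exposes the skipUnless
decorator):

  import os

  from avocado import skipUnless
  from avocado_qemu import Test

  class BootLinuxConsolePPC64(Test):

      def do_boot_pseries(self):
          # Common part: fetch the kernel asset, boot the pseries
          # machine and check the console output (elided here).
          ...

      def test_pseries_tcg(self):
          # Always runnable, possibly slow on modest hosts.
          self.do_boot_pseries()

      @skipUnless(os.access('/dev/kvm', os.R_OK | os.W_OK),
                  'KVM not available on this host')
      def test_pseries_kvm(self):
          # Reported as SKIPPED (not failed) when /dev/kvm is missing.
          # A real test would also need to check that the host
          # architecture matches the guest: KVM cannot accelerate a
          # ppc64 pseries guest on an x86 host.
          self.vm.add_args('-enable-kvm')
          self.do_boot_pseries()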