From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Bennée
To: qemu-devel@nongnu.org
Cc: John Snow, Eduardo Habkost, Philippe Mathieu-Daudé, Paolo Bonzini,
 Wainer dos Santos Moschetta, Cleber Rosa, Marc-André Lureau,
 Beraldo Leal, Richard Henderson, Pavel Dovgalyuk, Alex Bennée
Subject: [PATCH v2 11/16] replay: stop us hanging in rr_wait_io_event
Date: Mon, 11 Dec 2023 09:13:40 +0000
Message-Id: <20231211091346.14616-12-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231211091346.14616-1-alex.bennee@linaro.org>
References: <20231211091346.14616-1-alex.bennee@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

A lot of the hangs I see are when we end up spinning in
rr_wait_io_event for an event that will never come in playback. Add a
new check function which can see if we are in PLAY mode and kick us
out of the wait function so the event can be processed.

This fixes most of the failures in replay_kernel.py

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2013
Signed-off-by: Alex Bennée
Cc: Pavel Dovgalyuk

---
v2
  - report failure with replay_sync_error
---
 include/sysemu/replay.h      |  5 +++++
 accel/tcg/tcg-accel-ops-rr.c |  2 +-
 replay/replay.c              | 21 +++++++++++++++++++++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index 08aae5869f..83995ae4bd 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -70,6 +70,11 @@ int replay_get_instructions(void);
 /*! Updates instructions counter in replay mode. */
 void replay_account_executed_instructions(void);
 
+/**
+ * replay_can_wait: check if we should pause for wait-io
+ */
+bool replay_can_wait(void);
+
 /* Processing clocks and other time sources */
 
 /*! Save the specified clock */
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index 611932f3c3..825e35b3dc 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -109,7 +109,7 @@ static void rr_wait_io_event(void)
 {
     CPUState *cpu;
 
-    while (all_cpu_threads_idle()) {
+    while (all_cpu_threads_idle() && replay_can_wait()) {
         rr_stop_kick_timer();
         qemu_cond_wait_iothread(first_cpu->halt_cond);
     }
diff --git a/replay/replay.c b/replay/replay.c
index 3ab6360cfa..665dbb34fb 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -449,6 +449,27 @@ void replay_start(void)
     replay_enable_events();
 }
 
+/*
+ * For none/record the answer is yes.
+ */
+bool replay_can_wait(void)
+{
+    if (replay_mode == REPLAY_MODE_PLAY) {
+        /*
+         * For playback we shouldn't ever be at a point we wait. If
+         * the instruction count has reached zero and we have an
+         * unconsumed event we should go around again and consume it.
+         */
+        if (replay_state.instruction_count == 0 && replay_state.has_unread_data) {
+            return false;
+        } else {
+            replay_sync_error("Playback shouldn't have to iowait");
+        }
+    }
+    return true;
+}
+
+
 void replay_finish(void)
 {
     if (replay_mode == REPLAY_MODE_NONE) {
-- 
2.39.2