From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from fout-b1-smtp.messagingengine.com (fout-b1-smtp.messagingengine.com [202.12.124.144]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 966934C9563; Wed, 3 Jun 2026 22:20:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=202.12.124.144 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780525223; cv=none; b=Z5Fyiyr2wxzre+lka+9GWa3W/HEuEfTkI6ZKjttFgWP3nd8a4AsftrIz4adXYhnnH4TIZIySlfCQlF3UiyQOWCp23J6cfGE6QjlYD/44FHAzS99kPLw1g0CLpxm7uDa2yb1p89exASBqywgkQnOXKpdNw0NGt1EeRMQ33fdqPuQ= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780525223; c=relaxed/simple; bh=EhlBlQFfgXncPqsu9f3zgRymJFfQVj5WLn7oHMFBbr4=; h=Date:From:To:Cc:Subject:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MS5AeaYK4TIBmbeZWb2Kotj4PZwwSRbnzvI85i7g3R9z1nksq88YwJz6G6UVHBJgPuxxJFMtE2gpKLswt5R3LP+zDkKBpTj32Z7+0kXQOdUfFf06kKh18jJYC4FJ8B6X3zTD1Rcdw8HR5trF3zxwPUsvGQiGiTOR8R0Ybp/FwME= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=shazbot.org; spf=pass smtp.mailfrom=shazbot.org; dkim=pass (2048-bit key) header.d=shazbot.org header.i=@shazbot.org header.b=YqS5V9JK; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=Nc6dHaN3; arc=none smtp.client-ip=202.12.124.144 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=shazbot.org Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shazbot.org Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=shazbot.org header.i=@shazbot.org header.b="YqS5V9JK"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="Nc6dHaN3" Received: from phl-compute-02.internal (phl-compute-02.internal [10.202.2.42]) by mailfout.stl.internal (Postfix) with ESMTP id 8B12B1D000FB; Wed, 3 Jun 2026 18:20:09 -0400 (EDT) Received: from phl-frontend-03 ([10.202.2.162]) by phl-compute-02.internal (MEProxy); Wed, 03 Jun 2026 18:20:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=shazbot.org; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm3; t=1780525209; x=1780611609; bh=9kZhKZGJz0v4KTv+sA6NCtH1KkRlsAFv9a7Yznk7BYU=; b= YqS5V9JKO7OB67qFV127l7bRjJ0WMiiGfiwXvOejk+Hbz9ybMgWD4xGTB7g5Hqx2 OISg+eZS1GpKhH4+FVCxu2aKT786f2+kq2EX0trMDhnywnLaJ+F/FHHyYck0DFfE AScsW6ZnDBqNopgVLSiXfYI9rIY3p1nOqrZhINefD6qkOGMCCuwrNYTGM5UEL+h7 m98jdqkBuL3Fjs7j/vkXGHPvyFsjioMFujBJ9c6hpmNn6hwZagJDWAA3joMMiJia hF+05nlb/yx/JwFAItpKvs8hWQwmKtdUuc9m0eXs5z2DDuSF6eZQWvF50Cl2DeOE 6iDlw+nRaPo/lqxB8Ne+Ng== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1780525209; x= 1780611609; bh=9kZhKZGJz0v4KTv+sA6NCtH1KkRlsAFv9a7Yznk7BYU=; b=N c6dHaN34QMQ+VQ2ivmgOiBdOAWYOzDX5fXZSXgRZzSTfRQaKqMJoGeGJ0k1IBVeX lfsFJx5Q1Zl9RrQ2XxWGwG3y4nPv3gCxAgOJBL5bXK2xwyDC95ywgPvUPWtcPo13 RemqKormSyvM/LvM5AIQZ4DRNbfQmuhUJ/gGEY7FhaichIn5kNSNpZZHGrxuTTw0 OcUhrZ9gtW0/AyCAmcKSW/qVlP9hp0VCaFdy/s+nDSxxi5hS26ExGzt2ROzY20pN CNYeKPv1vKhk88sXDiv+3AuucbfastkouMsHr367NBrVUF6iJPCvvLQtGRLip6z1 /gKcCcv3ONCQumrDRo9rA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: dmFkZTFl48SMsTNMd/jGCpgDHRd9W6JwBWzYhz/u5NgrllOUiKDSu/jLbzVVJxAxkxsy+H WOGl97jH4VSfLNdbHN6iFYwqsSywnmLCPv5AHo7gSBanjdPXPDQDvhrgfkkRWmzxHbI5g6 LtZLq+K5UHmO8jNgjTSuquPSQXzTRQR35DRU5WI8X51j45RoTTEfnSchk5NLPMDo9BFdYU hBgJMIQbCNdqV3Czu4JwC1b2yryst0rUIXeaGXWDBnhkm+hVi7hHPDHsOCBjVKWZ0/m5H6 y1nO+RF4lh2NIvr5ojdHyXCG6we4DVdzWuBcefasuOZHwByJ2qqyLy1ncsiPi8zj4I2zec tsKD3WEBF6cfWfQvMD9vOcPcyxZwauiLEyGTF9YdEh0GlCTY+mPLyAJ0QctpJWtigOcQYw fjUR9xB87HoPszJJuB3+7a4YlEpzdjrMjbiFvua/QNXHKngmSBn0324rpe6waYD15jng8f g5bJXHi+eYaTF6oJsLkcgwkv4ka02ecjnntwmT4HCEeiz5qidpqxbv4hkqUmzJx0egdNCB vwLheARx46jPgJupLG2hscG7pI1A8xi2oF20M8UcP6tGeM2wlrG7Gpbh1KqXppD/ysIhSt HUlW2IV+1ZvGXoEWnpFMwmUILoce3mFFaTYemJvd+vipXhcRUtsxXd6rsGOQ X-ME-Proxy: Feedback-ID: i03f14258:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Jun 2026 18:20:07 -0400 (EDT) Date: Wed, 3 Jun 2026 16:20:06 -0600 From: Alex Williamson To: Farhan Ali Cc: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, helgaas@kernel.org, schnelle@linux.ibm.com, mjrosato@linux.ibm.com, alex@shazbot.org Subject: Re: [PATCH v18 1/4] s390/pci: Store PCI error information for passthrough devices Message-ID: <20260603162006.27b78de5@shazbot.org> In-Reply-To: <20260603182415.2324-2-alifm@linux.ibm.com> References: <20260603182415.2324-1-alifm@linux.ibm.com> <20260603182415.2324-2-alifm@linux.ibm.com> X-Mailer: Claws Mail 4.4.0 (GTK 3.24.52; x86_64-pc-linux-gnu) Precedence: bulk X-Mailing-List: linux-s390@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit On Wed, 3 Jun 2026 11:24:12 -0700 Farhan Ali wrote: > @@ -81,6 +52,47 @@ static bool is_driver_supported(struct pci_driver *driver) > return true; > } > > +static int zpci_store_pci_error(struct pci_dev *pdev, > + struct zpci_ccdf_err *ccdf) > +{ > + struct zpci_dev *zdev = to_zpci(pdev); > + int i; > + > + guard(mutex)(&zdev->pending_errs_lock); > + if (!zdev->pending_errs.mediated_recovery) > + return -EINVAL; > + > + if (zdev->pending_errs.count >= ZPCI_ERR_PENDING_MAX) { > + pr_err("%s: Maximum number (%d) of pending error events queued\n", > + pci_name(pdev), ZPCI_ERR_PENDING_MAX); Is this really an err condition or just a warn? Nothing is fundamentally broken here, the queue is just full and we're losing errors. Maybe this should be a warn? Can this create a DoS if a device continues to generate errors and nobody is consuming them? Userspace could ignore the error. This should probably be _ratelimited. pr_err + pci_name suggests this should be a pci_ or dev_ call and since the pci variant doesn't exist, use dev_warn_ratelimited(). > + return -ENOMEM; > + } > + > + i = zdev->pending_errs.tail % ZPCI_ERR_PENDING_MAX; > + memcpy(&zdev->pending_errs.err[i], ccdf, sizeof(struct zpci_ccdf_err)); > + zdev->pending_errs.tail++; > + zdev->pending_errs.count++; > + return 0; > +} > + > +void zpci_start_mediated_recovery(struct zpci_dev *zdev) > +{ > + guard(mutex)(&zdev->pending_errs_lock); > + zdev->pending_errs.mediated_recovery = true; > +} > +EXPORT_SYMBOL_GPL(zpci_start_mediated_recovery); > + > +void zpci_stop_mediated_recovery(struct zpci_dev *zdev) > +{ > + guard(mutex)(&zdev->pending_errs_lock); > + zdev->pending_errs.mediated_recovery = false; > + if (zdev->pending_errs.count) > + pr_info("Unhandled PCI error events count=%d for PCI function 0x%x\n", > + zdev->pending_errs.count, zdev->fid); It seems like there's always a race that an error could occur as the user is closing the device. Is this really worth logging at anything more than a dbg level, pci_dbg() in this case? > + memset(&zdev->pending_errs, 0, sizeof(struct zpci_ccdf_pending)); > +} > +EXPORT_SYMBOL_GPL(zpci_stop_mediated_recovery); > + > static pci_ers_result_t zpci_event_notify_error_detected(struct pci_dev *pdev, > struct pci_driver *driver) > { > @@ -175,12 +187,15 @@ static pci_ers_result_t zpci_event_do_reset(struct pci_dev *pdev, > * and the platform determines which functions are affected for > * multi-function devices. > */ > -static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev) > +static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev, > + struct zpci_ccdf_err *ccdf) > { > pci_ers_result_t ers_res = PCI_ERS_RESULT_DISCONNECT; > struct zpci_dev *zdev = to_zpci(pdev); > + bool mediated_recovery = false; > char *status_str = "success"; > struct pci_driver *driver; > + int rc; > > /* > * Ensure that the PCI function is not removed concurrently, no driver > @@ -194,13 +209,6 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev) > } > pdev->error_state = pci_channel_io_frozen; > > - if (is_passed_through(pdev)) { > - pr_info("%s: Cannot be recovered in the host because it is a pass-through device\n", > - pci_name(pdev)); > - status_str = "failed (pass-through)"; > - goto out_unlock; > - } > - > driver = to_pci_driver(pdev->dev.driver); > if (!is_driver_supported(driver)) { > if (!driver) { > @@ -216,12 +224,24 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev) > goto out_unlock; > } > > + rc = zpci_store_pci_error(pdev, ccdf); > + if (!rc || rc == -ENOMEM) > + mediated_recovery = true; This is a convoluted way to get the state of zdev->pending_errs.mediated_recovery, which becomes invalid out of pending_errs_lock anyway. > + > ers_res = zpci_event_notify_error_detected(pdev, driver); > if (ers_result_indicates_abort(ers_res)) { > status_str = "failed (abort on detection)"; > goto out_unlock; > } > > + if (mediated_recovery) { > + pr_info("%s: Leaving recovery of pass-through device to user-space\n", > + pci_name(pdev)); > + ers_res = PCI_ERS_RESULT_RECOVERED; > + status_str = "in progress"; > + goto out_unlock; > + } Since zdev->pending_errs.mediated_recovery is only valid while holding pending_errs_lock, this is really no better than the is_passed_through() test. > + > if (ers_res != PCI_ERS_RESULT_NEED_RESET) { > ers_res = zpci_event_do_error_state_clear(pdev, driver); > if (ers_result_indicates_abort(ers_res)) { > @@ -266,25 +286,19 @@ static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *pdev) > * @pdev: PCI function for which to report > * @es: PCI channel failure state to report > */ > -static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_t es) > +static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_t es, > + struct zpci_ccdf_err *ccdf) > { > struct pci_driver *driver; > > pci_dev_lock(pdev); > pdev->error_state = es; > - /** > - * While vfio-pci's error_detected callback notifies user-space QEMU > - * reacts to this by freezing the guest. In an s390 environment PCI > - * errors are rarely fatal so this is overkill. Instead in the future > - * we will inject the error event and let the guest recover the device > - * itself. > - */ > - if (is_passed_through(pdev)) > - goto out; > + > + zpci_store_pci_error(pdev, ccdf); > driver = to_pci_driver(pdev->dev.driver); > if (driver && driver->err_handler && driver->err_handler->error_detected) > driver->err_handler->error_detected(pdev, pdev->error_state); How do you intend to stage this versus QEMU changes? This seems like a big regression if we're suddenly triggering the eventfd that causes QEMU to halt. Do you need userspace to opt-in to mediated recovery rather than automatically enabling it on open? Thanks, Alex