From: Luis Chamberlain <mcgrof@kernel.org>
To: jeyu@kernel.org, davem@davemloft.net, kuba@kernel.org
Cc: linux-wireless@vger.kernel.org, aquini@redhat.com,
	linux-doc@vger.kernel.org, peterz@infradead.org,
	daniel.vetter@ffwll.ch, linux@dominikbrodowski.net,
	linux-kernel@vger.kernel.org, yamada.masahiro@socionext.com,
	glider@google.com, GR-everest-linux-l2@marvell.com,
	mchehab+samsung@kernel.org, will@kernel.org,
	michael.chan@broadcom.com, robh@kernel.org, paulmck@kernel.org,
	bhe@redhat.com, corbet@lwn.net, mchehab+huawei@kernel.org,
	Luis Chamberlain <mcgrof@kernel.org>,
	ath10k@lists.infradead.org, derosier@gmail.com, tiwai@suse.de,
	mingo@redhat.com, dvyukov@google.com, samitolvanen@google.com,
	yzaikin@google.com, dyoung@redhat.com, pmladek@suse.com,
	elver@google.com, sburla@marvell.com, aelior@marvell.com,
	keescook@chromium.org, arnd@arndb.de, sfr@canb.auug.org.au,
	gpiccoli@canonical.com, rostedt@goodmis.org,
	fmanlunas@marvell.com, cai@lca.pw, tglx@linutronix.de,
	andriy.shevchenko@linux.intel.com, johannes@sipsolutions.net,
	kvalo@codeaurora.org, netdev@vger.kernel.org,
	rdunlap@infradead.org, schlad@suse.de, dianders@chromium.org,
	vkoul@kernel.org, mhiramat@kernel.org, akpm@linux-foundation.org,
	dchickles@marvell.com, bauerman@linux.ibm.com
Subject: [PATCH v3 5/8] ath10k: use new taint_firmware_crashed()
Date: Tue, 26 May 2020 14:58:12 +0000
Message-ID: <20200526145815.6415-6-mcgrof@kernel.org>
In-Reply-To: <20200526145815.6415-1-mcgrof@kernel.org>

This makes use of the new taint_firmware_crashed() to help
annotate when firmware for device drivers crashes. When firmware
crashes, devices can sometimes become unresponsive, and recovery
sometimes requires a driver unload / reload and, in the worst cases,
a reboot.

Using a taint flag allows us to annotate when this happens clearly.

I have run into this situation with this driver with the latest
firmware as of today, May 21, 2020, using v5.6.0, leaving me in
a state in which my only option is to reboot. Removing and re-adding
the driver does not fix the situation. This is reported on kernel.org
bugzilla as korg#207851 [0]. This isn't the first firmware crash reported,
though: others have been filed before, and none of these bugs have yet
been addressed [1] [2] [3]. Including my own, I see these firmware crash
reports:

  * korg#207851 [0]
  * korg#197013 [1]
  * korg#201237 [2]
  * korg#195987 [3]

[0] https://bugzilla.kernel.org/show_bug.cgi?id=207851
[1] https://bugzilla.kernel.org/show_bug.cgi?id=197013
[2] https://bugzilla.kernel.org/show_bug.cgi?id=201237
[3] https://bugzilla.kernel.org/show_bug.cgi?id=195987

Cc: linux-wireless@vger.kernel.org
Cc: ath10k@lists.infradead.org
Cc: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 drivers/net/wireless/ath/ath10k/pci.c  | 2 ++
 drivers/net/wireless/ath/ath10k/sdio.c | 2 ++
 drivers/net/wireless/ath/ath10k/snoc.c | 1 +
 3 files changed, 5 insertions(+)
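
Note for testers (not part of the commit message): the taint described
above should be visible from userspace in the kernel's taint mask, which
/proc/sys/kernel/tainted reports as a decimal number. The following is a
minimal userspace sketch for checking one bit of that mask. The bit
position used here (EXAMPLE_FW_CRASH_TAINT_BIT) is a placeholder
assumption; the real value is whatever the taint patch earlier in this
series defines. The helper names are hypothetical and exist only for
this illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * ASSUMPTION: placeholder bit position for the firmware-crash taint.
 * Check the taint patch in this series for the actual value.
 */
#define EXAMPLE_FW_CRASH_TAINT_BIT 18

/* Parse the decimal text found in /proc/sys/kernel/tainted. */
bool parse_taint_mask(const char *text, unsigned long long *mask)
{
	char *end;
	unsigned long long value;

	errno = 0;
	value = strtoull(text, &end, 10);
	if (errno != 0 || end == text)
		return false;	/* overflow, or no digits at all */

	*mask = value;
	return true;
}

/* Return true if the given bit is set in the taint mask. */
bool taint_bit_set(unsigned long long mask, unsigned int bit)
{
	return bit < 64 && ((mask >> bit) & 1ULL) != 0;
}

/* Read and parse the taint mask from a file such as /proc/sys/kernel/tainted. */
bool read_taint_mask(const char *path, unsigned long long *mask)
{
	char buf[64];
	bool ok = false;
	FILE *f = fopen(path, "r");

	if (!f)
		return false;
	if (fgets(buf, sizeof(buf), f))
		ok = parse_taint_mask(buf, mask);
	fclose(f);
	return ok;
}
```

A caller would pass "/proc/sys/kernel/tainted" to read_taint_mask() and
then test the result with taint_bit_set(mask, EXAMPLE_FW_CRASH_TAINT_BIT).
Parsing is kept separate from file I/O so the bit test can be exercised
without a running kernel that carries this series.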

diff --git a/drivers/net/wireless/ath/ath10k/pci.c b/drivers/net/wireless/ath/ath10k/pci.c
index 1d941d53fdc9..818c3acc2468 100644
--- a/drivers/net/wireless/ath/ath10k/pci.c
+++ b/drivers/net/wireless/ath/ath10k/pci.c
@@ -1767,6 +1767,7 @@ static void ath10k_pci_fw_dump_work(struct work_struct *work)
 		scnprintf(guid, sizeof(guid), "n/a");
 
 	ath10k_err(ar, "firmware crashed! (guid %s)\n", guid);
+	taint_firmware_crashed();
 	ath10k_print_driver_info(ar);
 	ath10k_pci_dump_registers(ar, crash_data);
 	ath10k_ce_dump_registers(ar, crash_data);
@@ -2837,6 +2838,7 @@ static int ath10k_pci_hif_power_up(struct ath10k *ar,
 	if (ret) {
 		if (ath10k_pci_has_fw_crashed(ar)) {
 			ath10k_warn(ar, "firmware crashed during chip reset\n");
+			taint_firmware_crashed();
 			ath10k_pci_fw_crashed_clear(ar);
 			ath10k_pci_fw_crashed_dump(ar);
 		}
diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index e2aff2254a40..8b2fc0b89be4 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -794,6 +794,7 @@ static int ath10k_sdio_mbox_proc_dbg_intr(struct ath10k *ar)
 
 	/* TODO: Add firmware crash handling */
 	ath10k_warn(ar, "firmware crashed\n");
+	taint_firmware_crashed();
 
 	/* read counter to clear the interrupt, the debug error interrupt is
 	 * counter 0.
@@ -915,6 +916,7 @@ static int ath10k_sdio_mbox_proc_cpu_intr(struct ath10k *ar)
 	if (cpu_int_status & MBOX_CPU_STATUS_ENABLE_ASSERT_MASK) {
 		ath10k_err(ar, "firmware crashed!\n");
 		queue_work(ar->workqueue, &ar->restart_work);
+		taint_firmware_crashed();
 	}
 	return ret;
 }
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index 354d49b1cd45..071ee7607a4c 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -1451,6 +1451,7 @@ void ath10k_snoc_fw_crashed_dump(struct ath10k *ar)
 		scnprintf(guid, sizeof(guid), "n/a");
 
 	ath10k_err(ar, "firmware crashed! (guid %s)\n", guid);
+	taint_firmware_crashed();
 	ath10k_print_driver_info(ar);
 	ath10k_msa_dump_memory(ar, crash_data);
 	mutex_unlock(&ar->dump_mutex);
-- 
2.26.2


Thread overview: 23+ messages
2020-05-26 14:58 [PATCH v3 0/8] kernel: taint when the driver firmware crashes Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 1/8] kernel.h: move taint and system state flags to uapi Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 2/8] panic: add uevent support Luis Chamberlain
2020-05-31  4:46   ` kbuild test robot
2020-06-03 17:55   ` kernel test robot
2020-05-26 14:58 ` [PATCH v3 3/8] taint: add firmware crash taint support Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 4/8] panic: make taint data type clearer Luis Chamberlain
2020-05-26 14:58 ` Luis Chamberlain [this message]
2020-05-26 14:58   ` [PATCH v3 5/8] ath10k: use new taint_firmware_crashed() Luis Chamberlain
2020-06-02 21:01   ` Brian Norris
2020-06-02 21:01     ` Brian Norris
2020-05-26 14:58 ` [PATCH v3 6/8] bnxt_en: " Luis Chamberlain
2020-05-26 18:09   ` Michael Chan
2020-05-26 14:58 ` [PATCH v3 7/8] liquidio: " Luis Chamberlain
2020-05-26 14:58 ` [PATCH v3 8/8] qed: " Luis Chamberlain
2020-05-26 22:46 ` [PATCH v3 0/8] kernel: taint when the driver firmware crashes Jakub Kicinski
2020-05-26 23:07   ` Luis Chamberlain
2020-05-26 23:30     ` Jakub Kicinski
2020-05-27  3:19       ` Luis Chamberlain
2020-05-27 21:36         ` Jakub Kicinski
2020-05-28 14:27           ` Luis Chamberlain
2020-05-28 15:04             ` Ben Greear
2020-05-28 16:33               ` Luis Chamberlain
