From: Cezary Rojewski <cezary.rojewski@intel.com>
To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>,
<alsa-devel@alsa-project.org>, <broonie@kernel.org>
Cc: upstream@semihalf.com, harshapriya.n@intel.com, rad@semihalf.com,
tiwai@suse.com, hdegoede@redhat.com,
amadeuszx.slawinski@linux.intel.com, cujomalainey@chromium.org,
lma@semihalf.com
Subject: Re: [PATCH 06/14] ASoC: Intel: avs: Coredump and recovery flow
Date: Sun, 1 May 2022 17:32:39 +0200
Message-ID: <f20f3d72-8f5a-1878-c1fa-49dafce784d7@intel.com>
In-Reply-To: <d80075c7-3658-52e0-b09f-35182961d5df@linux.intel.com>

On 2022-04-26 11:53 PM, Pierre-Louis Bossart wrote:
> On 4/26/22 12:23, Cezary Rojewski wrote:
>> In rare occassions, under stress conditions or hardware malfunction, DSP
>
> occasions
Ack.
>> firmware may fail. Software is notified about such situation with
>> EXCEPTION_CAUGHT notification. IPC timeout is also counted as critical
>> device failure. More often than not, driver can recover from such
>> situations by performing full reset: killing and restarting ADSP.
>>
>> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
>> Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
>> ---
>> sound/soc/intel/Kconfig | 1 +
>> sound/soc/intel/avs/avs.h | 4 ++
>> sound/soc/intel/avs/ipc.c | 95 +++++++++++++++++++++++++++++++++-
>> sound/soc/intel/avs/messages.h | 5 ++
>> 4 files changed, 103 insertions(+), 2 deletions(-)
>>
>> diff --git a/sound/soc/intel/Kconfig b/sound/soc/intel/Kconfig
>> index c364ddf22267..05ad6bdecfc5 100644
>> --- a/sound/soc/intel/Kconfig
>> +++ b/sound/soc/intel/Kconfig
>> @@ -218,6 +218,7 @@ config SND_SOC_INTEL_AVS
>> select SND_HDA_EXT_CORE
>> select SND_HDA_DSP_LOADER
>> select SND_INTEL_NHLT
>> + select WANT_DEV_COREDUMP
>> help
>> Enable support for Intel(R) cAVS 1.5 platforms with DSP
>> capabilities. This includes Skylake, Kabylake, Amberlake and
>> diff --git a/sound/soc/intel/avs/avs.h b/sound/soc/intel/avs/avs.h
>> index e628f78d1864..02c2aa1bcd5c 100644
>> --- a/sound/soc/intel/avs/avs.h
>> +++ b/sound/soc/intel/avs/avs.h
>> @@ -42,6 +42,7 @@ struct avs_dsp_ops {
>>  	int (* const load_basefw)(struct avs_dev *, struct firmware *);
>>  	int (* const load_lib)(struct avs_dev *, struct firmware *, u32);
>>  	int (* const transfer_mods)(struct avs_dev *, bool, struct avs_module_entry *, u32);
>> +	int (* const coredump)(struct avs_dev *, union avs_notify_msg *);
>>  };
>>
>> #define avs_dsp_op(adev, op, ...) \
>> @@ -164,12 +165,15 @@ struct avs_ipc {
>>  	struct avs_ipc_msg rx;
>>  	u32 default_timeout_ms;
>>  	bool ready;
>> +	bool recovering;
>>
>>  	bool rx_completed;
>>  	spinlock_t rx_lock;
>>  	struct mutex msg_mutex;
>>  	struct completion done_completion;
>>  	struct completion busy_completion;
>> +
>> +	struct work_struct recovery_work;
>>  };
>>
>> #define AVS_EIPC EREMOTEIO
>> diff --git a/sound/soc/intel/avs/ipc.c b/sound/soc/intel/avs/ipc.c
>> index 68aaf01edbf2..84cb411c82fa 100644
>> --- a/sound/soc/intel/avs/ipc.c
>> +++ b/sound/soc/intel/avs/ipc.c
>> @@ -14,6 +14,87 @@
>>
>> #define AVS_IPC_TIMEOUT_MS 300
>>
>> +static void avs_dsp_recovery(struct avs_dev *adev)
>> +{
>> +	struct avs_soc_component *acomp;
>> +	unsigned int core_mask;
>> +	int ret;
>> +
>> +	if (adev->ipc->recovering)
>> +		return;
>> +	adev->ipc->recovering = true;
>
> don't you need some sort of lock to test/clear this flag?
Our stress tests have not exposed a race here. I won't ignore this
warning though; I'll recheck with my team next week.
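
For illustration only, the concern about testing and setting the flag
without a lock can be sketched in user-space C11. This is not the driver's
code: struct recovery_guard, recovery_try_begin() and recovery_end() are
hypothetical names, and the atomic compare-and-exchange merely stands in
for whatever primitive the driver might choose (in-kernel candidates would
be things like test_and_set_bit() or atomic_cmpxchg()). The point is that
"check, then set" becomes a single indivisible step, so only one caller
can enter recovery.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the 'recovering' flag kept in struct avs_ipc. */
struct recovery_guard {
	atomic_bool recovering;
};

/*
 * Atomically flip the flag from false to true.
 * Returns true only for the one caller that performed the flip;
 * every concurrent or later caller sees false and should back off.
 */
static bool recovery_try_begin(struct recovery_guard *g)
{
	bool expected = false;

	return atomic_compare_exchange_strong(&g->recovering, &expected, true);
}

/* Clear the flag once recovery is finished. */
static void recovery_end(struct recovery_guard *g)
{
	/* Ending a recovery that was never begun would be a caller bug. */
	assert(atomic_load(&g->recovering));
	atomic_store(&g->recovering, false);
}
```

With a plain bool, two contexts can both read "false" before either writes
"true"; with the exchange above, the second caller is turned away.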
>> +
>> +	mutex_lock(&adev->comp_list_mutex);
>> +	/* disconnect all running streams */
>> +	list_for_each_entry(acomp, &adev->comp_list, node) {
>> +		struct snd_soc_pcm_runtime *rtd;
>> +		struct snd_soc_card *card;
>> +
>> +		card = acomp->base.card;
>> +		if (!card)
>> +			continue;
>> +
>> +		for_each_card_rtds(card, rtd) {
>> +			struct snd_pcm *pcm;
>> +			int dir;
>> +
>> +			pcm = rtd->pcm;
>> +			if (!pcm || rtd->dai_link->no_pcm)
>> +				continue;
>> +
>> +			for_each_pcm_streams(dir) {
>> +				struct snd_pcm_substream *substream;
>> +
>> +				substream = pcm->streams[dir].substream;
>> +				if (!substream || !substream->runtime)
>> +					continue;
>> +
>> +				snd_pcm_stop(substream, SNDRV_PCM_STATE_DISCONNECTED);
>> +			}
>> +		}
>> +	}
>> +	mutex_unlock(&adev->comp_list_mutex);
>> +
>> +	/* forcibly shutdown all cores */
>> +	core_mask = GENMASK(adev->hw_cfg.dsp_cores - 1, 0);
>> +	avs_dsp_core_disable(adev, core_mask);
>> +
>> +	/* attempt dsp reboot */
>> +	ret = avs_dsp_boot_firmware(adev, true);
>> +	if (ret < 0)
>> +		dev_err(adev->dev, "dsp reboot failed: %d\n", ret);
>> +
>> +	pm_runtime_mark_last_busy(adev->dev);
>> +	pm_runtime_enable(adev->dev);
>> +	pm_request_autosuspend(adev->dev);
>
> there are zero users of this routine in the entire sound/ tree, can you clarify why this is needed or what you are trying to do?
I'm unsure which routine you are asking about; I'll assume it's
pm_request_autosuspend().

pm_request_autosuspend() is used to queue a suspend once recovery
completes. Recovery takes time, and during that time all attempts to
communicate with the DSP yield -EPERM. Runtime PM is also blocked for the
device by pm_runtime_disable(), called before the recovery work is
scheduled. Once recovery completes we do not simply unblock PM, as that
would cause an immediate suspend. Instead, we "refresh" the *last busy*
status and queue the suspend operation.
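
The ordering described above can be illustrated with a toy user-space
model. It is a sketch under stated assumptions, not the kernel's runtime
PM implementation: struct toy_pm_dev and the toy_*() helpers are
hypothetical, and the model only captures the idea that an autosuspend
becomes due once "last busy" plus the autosuspend delay has passed and PM
is not blocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a runtime-PM device; every name here is hypothetical. */
struct toy_pm_dev {
	unsigned int disable_depth;    /* > 0 means runtime PM is blocked */
	uint64_t last_busy_ms;         /* time of the last recorded activity */
	uint64_t autosuspend_delay_ms; /* idle time required before suspend */
};

/* Earliest point in time at which an autosuspend may take effect. */
static uint64_t toy_suspend_due_ms(const struct toy_pm_dev *d)
{
	return d->last_busy_ms + d->autosuspend_delay_ms;
}

/* Would an autosuspend request issued at 'now_ms' suspend right away? */
static bool toy_would_suspend_now(const struct toy_pm_dev *d, uint64_t now_ms)
{
	return d->disable_depth == 0 && now_ms >= toy_suspend_due_ms(d);
}

/* End of recovery: refresh "last busy" first, then unblock PM. */
static void toy_recovery_done(struct toy_pm_dev *d, uint64_t now_ms)
{
	/* PM is expected to have been blocked before recovery started. */
	assert(d->disable_depth > 0);
	d->last_busy_ms = now_ms;
	d->disable_depth--;
}
```

In this model, unblocking PM with a stale "last busy" value makes the
device eligible for suspend at once, whereas refreshing it first pushes
the suspend out by a full autosuspend delay.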
>> +
>> +	adev->ipc->recovering = false;
>> +}