From: Takashi Iwai <tiwai@suse.de>
To: Chancel Liu <chancel.liu@nxp.com>
Cc: "alsa-devel@alsa-project.org" <alsa-devel@alsa-project.org>,
	Jaroslav Kysela <perex@perex.cz>,
	"S.J. Wang" <shengjiu.wang@nxp.com>
Subject: Re: [EXT] Re: Suspend/resume Issue on pcm_dmix.c in alsa-lib
Date: Thu, 05 Sep 2024 15:36:34 +0200
Message-ID: <877cbqgr31.wl-tiwai@suse.de>
In-Reply-To: <DB9PR04MB9498BE3E297E22281C0E6914E39D2@DB9PR04MB9498.eurprd04.prod.outlook.com>

On Thu, 05 Sep 2024 13:01:11 +0200,
Chancel Liu wrote:
> 
> > > > > Hi Takashi,
> > > > >
> > > > > Thanks for your reply and suggestions. Finally we have found the root
> > > > > cause. It seems to be related to both the drivers and alsa-lib.
> > > > >
> > > > > When two dmix clients run in parallel we get two direct dmix instances.
> > > > > 1st dmix instance:
> > > > > snd_pcm_dmix_open()
> > > > >       snd_pcm_direct_initialize_slave()
> > > > >               save_slave_setting()
> > > > > Since the driver we are using has the SND_PCM_INFO_RESUME flag,
> > > > > dmix->spcm->info has this flag. Then this flag is cleared in
> > > > > dmix->shmptr->s.info.
> > > > >
> > > > > 2nd dmix instance:
> > > > > snd_pcm_dmix_open()
> > > > >       snd_pcm_direct_open_secondary_client()
> > > > >               copy_slave_setting()
> > > > > The 2nd dmix->spcm->info is copied from dmix->shmptr->s.info, so it
> > > > > doesn't have this flag.
> > > > >
> > > > > If the 1st dmix instance resumes first, it performs the recovery of
> > > > > the slave PCM in snd_pcm_direct_slave_recover(). Because the 1st
> > > > > dmix->spcm->info has SND_PCM_INFO_RESUME,
> > > > > snd_pcm_resume(direct->spcm) can be called correctly to resume the
> > > > > slave PCM.
> > > >
> > > > ... and immediately stop the stream, then prepare and restart as a usual
> > > > restart.
> > > >
> > > > > However, if the 2nd dmix instance resumes first,
> > > > > snd_pcm_resume(direct->spcm) will not be called because its
> > > > > spcm->info doesn't have the SND_PCM_INFO_RESUME flag. The 1st dmix
> > > > > instance assumes someone else already did the recovery, so
> > > > > snd_pcm_resume(direct->spcm) won't be called either. As a result,
> > > > > the slave PCM fails to resume.
> > > >
> > > > Something wrong happening here, then.
> > > >
> > > > In dmix, there is no hardware resume at all; it's always a restart of the
> > > > stream.  The call of snd_pcm_resume() is only a temporary step for
> > > > inconsistencies that can be a problem on some drivers (IIRC dmaengine
> > > > stuff).  That said, dmix does a kind of fake resume, stops and restarts
> > > > the stream cleanly on the first instance.  On the second instance, it's
> > > > already recovered, hence it bails out.
> > > >
> > > > If poll() hangs on the second instance, there can be some other problem.
> > > > Maybe the resume -> stop -> restart sequence doesn't work well with your
> > > > driver?
> > > >
> > >
> > > Our DMA driver does PAUSE in system suspend and requires RESUME in
> > > system resume. The current problem is that snd_pcm_resume() is called by
> > > neither the 1st instance nor the 2nd instance.
> > 
> > That's weird.  Are you really testing with the latest alsa-lib code?
> > 
> > If application doesn't call snd_pcm_resume(), it means that the PCM
> > state isn't set to SUSPENDED, so it pretends as if still running.
> > 
> > Or if you mean that snd_pcm_resume() to the slave PCM isn't called
> > (even though snd_pcm_resume() is called for the dmix PCM), check
> > whether snd_pcm_direct_slave_recover() gets called, especially at the
> > point:
> > 
> >         /* some buggy drivers require the device resumed before prepared;
> >          * when a device has RESUME flag and is in SUSPENDED state, resume
> >          * here but immediately drop to bring it to a sane active state.
> >          */
> >         if (state == SND_PCM_STATE_SUSPENDED &&
> >             (direct->spcm->info & SND_PCM_INFO_RESUME)) {
> >                 snd_pcm_resume(direct->spcm);
> >                 snd_pcm_drop(direct->spcm);
> >                 snd_pcm_direct_timer_stop(direct);
> >                 snd_pcm_direct_clear_timer_queue(direct);
> >         }
> > 
> > Try to put debug prints or catch via breakpoint whether this code path
> > is executed.
> > 
> > Also, does the issue happen with the latest 6.11-rc kernel, too?
> > If yes, what if you drop SNDRV_PCM_INFO_RESUME bit flag in the driver
> > side?  Does the problem persist, or it works?
> > 
> 
> I'm working on kernel 6.6 and alsa-lib v1.2.11. I think that's not too
> outdated, but I will try switching to the latest version.
> 
> Indeed I did some debug on this part. Please see my comments inline.
> 
> int snd_pcm_direct_slave_recover(snd_pcm_direct_t *direct)
> {
> 	...
> 	
> 	/* [Chancel]
> 	 * When two dmix clients run in parallel we get two direct dmix instances.
> 	 * 1st dmix->spcm->info has SND_PCM_INFO_RESUME flag but 2nd dmix doesn't.

OK, that must be the cause.  It's because the second open copies the
saved shmptr->s.info into spcm->info at its open time, when we have
already dropped the INFO_RESUME bit.  All the rest of the behavior is a
side effect of this inconsistency.
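
To make the inconsistency concrete, here is a minimal stand-alone sketch
of the mechanism.  It is not alsa-lib code: the demo_* names, the structs
and the bit values are all made up for illustration, and only the order of
operations (save, clear the bit in the saved copy, copy on secondary open)
follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit values only; not taken from the ALSA headers. */
#define DEMO_INFO_MMAP   (1u << 0)
#define DEMO_INFO_RESUME (1u << 18)

/* Hypothetical stand-in for the shared-memory area all dmix clients map. */
struct demo_shm { unsigned int info; };

/* Hypothetical stand-in for one client's view of the slave (spcm->info). */
struct demo_client { unsigned int spcm_info; };

/* First open: spcm_info comes straight from the driver, then the saved
 * copy gets the RESUME bit cleared. */
void demo_open_first(struct demo_client *c, struct demo_shm *shm,
                     unsigned int driver_info)
{
    assert(c != NULL && shm != NULL);
    c->spcm_info = driver_info;
    shm->info = driver_info;
    shm->info &= ~DEMO_INFO_RESUME;
}

/* Secondary open: spcm_info is taken from the already-modified copy. */
void demo_open_secondary(struct demo_client *c, const struct demo_shm *shm)
{
    assert(c != NULL && shm != NULL);
    c->spcm_info = shm->info;
}

/* The recovery path resumes the slave only when this check passes. */
bool demo_would_resume_slave(const struct demo_client *c)
{
    assert(c != NULL);
    return (c->spcm_info & DEMO_INFO_RESUME) != 0;
}
```

With a driver that advertises the RESUME bit, demo_would_resume_slave()
is true for a client opened via demo_open_first() and false for one opened
via demo_open_secondary(), so the outcome depends on which client happens
to run the recovery first.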

I guess dropping the INFO_RESUME bit at hw_params and hw_refine should
work instead.  A totally untested fix is below.

(And I believe the drop of INFO_PAUSE should be handled similarly,
 too, instead of dropping spcm->info bit there.)


Takashi

--- a/src/pcm/pcm_direct.c
+++ b/src/pcm/pcm_direct.c
@@ -1018,6 +1018,7 @@ int snd_pcm_direct_hw_refine(snd_pcm_t *pcm, snd_pcm_hw_params_t *params)
 	}
 	dshare->timer_ticks = hw_param_interval(params, SND_PCM_HW_PARAM_PERIOD_SIZE)->max / dshare->slave_period_size;
 	params->info = dshare->shmptr->s.info;
+	params->info &= ~SND_PCM_INFO_RESUME;
 #ifdef REFINE_DEBUG
 	snd_output_puts(log, "DMIX REFINE (end):\n");
 	snd_pcm_hw_params_dump(params, log);
@@ -1031,6 +1032,7 @@ int snd_pcm_direct_hw_params(snd_pcm_t *pcm, snd_pcm_hw_params_t * params)
 	snd_pcm_direct_t *dmix = pcm->private_data;
 
 	params->info = dmix->shmptr->s.info;
+	params->info &= ~SND_PCM_INFO_RESUME;
 	params->rate_num = dmix->shmptr->s.rate;
 	params->rate_den = 1;
 	params->fifo_size = 0;
@@ -1183,8 +1185,6 @@ static void save_slave_setting(snd_pcm_direct_t *dmix, snd_pcm_t *spcm)
 	COPY_SLAVE(buffer_time);
 	COPY_SLAVE(sample_bits);
 	COPY_SLAVE(frame_bits);
-
-	dmix->shmptr->s.info &= ~SND_PCM_INFO_RESUME;
 }
 
 #undef COPY_SLAVE
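
For illustration only, the intended effect of the change can be sketched
in a stand-alone form.  This is not alsa-lib code and is as untested
against real hardware as the patch itself: the sketch_* names, the structs
and the bit values are hypothetical.  The idea shown is that the saved
info stays unmodified, so every client sees the same slave flags, while
the RESUME bit is masked only in what is reported to the application.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; not taken from the ALSA headers. */
#define SKETCH_INFO_MMAP   (1u << 0)
#define SKETCH_INFO_RESUME (1u << 18)

/* Hypothetical stand-ins for the shared area and a client's slave view. */
struct sketch_shm { unsigned int info; };
struct sketch_client { unsigned int spcm_info; };

/* First open: save the slave's info without touching any bit. */
void sketch_save_slave(struct sketch_client *c, struct sketch_shm *shm,
                       unsigned int driver_info)
{
    assert(c != NULL && shm != NULL);
    c->spcm_info = driver_info;
    shm->info = driver_info;
}

/* Secondary open: copy the saved, unmodified info. */
void sketch_copy_slave(struct sketch_client *c, const struct sketch_shm *shm)
{
    assert(c != NULL && shm != NULL);
    c->spcm_info = shm->info;
}

/* What the refine/params step reports to the application: the saved
 * info with the RESUME bit masked out. */
unsigned int sketch_params_info(const struct sketch_shm *shm)
{
    assert(shm != NULL);
    return shm->info & ~SKETCH_INFO_RESUME;
}
```

In this model both clients keep the RESUME bit in spcm_info regardless of
open order, and sketch_params_info() still hides it from the application.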

