Date: Wed, 13 Nov 2019 10:47:51 +0100
From: Takashi Iwai
To: Chih-Yang Hsia
Cc: alsa-devel@alsa-project.org, Mark Brown, linux-kernel@vger.kernel.org, Takashi Iwai
Subject: Re: [alsa-devel] [PATCH 0/2] ALSA: pcm: Fix race condition in runtime access
References: <20191112171715.128727-1-paulhsia@chromium.org>

On Wed, 13 Nov 2019 08:24:41 +0100,
Chih-Yang Hsia wrote:
>
> On Wed, Nov 13, 2019 at 2:16 AM Takashi Iwai wrote:
> >
> > On Tue, 12 Nov 2019 18:17:13 +0100,
> > paulhsia wrote:
> > >
> > > Since
> > > - snd_pcm_detach_substream sets runtime to null without stream lock and
> > > - snd_pcm_period_elapsed checks the nullity of the runtime outside of
> > >   stream lock.
> > >
> > > This will trigger null memory access in snd_pcm_running() call in
> > > snd_pcm_period_elapsed.
> >
> > Well, if a stream is detached, it means that the stream must have been
> > already closed; i.e. it's already a clear bug in the driver that
> > snd_pcm_period_elapsed() is called against such a stream.
> >
> > Or am I missing other possible case?
> >
> >
> > thanks,
> >
> > Takashi
> >
>
> In a multithreaded environment, it is possible for both
> `interrupt_handler` (from irq) and `substream close` (from
> snd_pcm_release) to run at the same time.
> Therefore, in a driver implementation, if the "substream close function"
> and the code section that calls snd_pcm_period_elapsed() do not hold the
> same lock, then the following things can happen:
>
> 1. interrupt_handler -> goes into snd_pcm_period_elapsed with a valid
>    substream pointer
> 2. snd_pcm_release_substream: call close without blocking
> 3. snd_pcm_release_substream: call snd_pcm_detach_substream and set
>    substream->runtime to NULL
> 4. interrupt_handler -> call snd_pcm_running() and crash while
>    accessing fields in `substream->runtime`
>
> e.g. in the intel8x0.c driver for AC97 devices,
> `snd_pcm_period_elapsed` is called after checking
> `ichdev->substream` in `snd_intel8x0_update`.
> If a `snd_pcm_release` call from alsa-lib passes through close() and
> reaches snd_pcm_detach_substream() in another thread, it's possible
> to trigger a crash.
> I can reproduce the issue within a multithreaded VM easily.
>
> My patches are trying to provide a basic protection for this situation
> (an internal pcm lock between detach and elapsed), since
> - the usage of `snd_pcm_period_elapsed` does not warn callers about
>   the possible race if the driver does not force the order for `calling
>   snd_pcm_period_elapsed` and `close` by lock and
> - lots of drivers already have this hidden issue and I can't fix them
>   one by one (You can check the "snd_pcm_period_elapsed usage" and the
>   "close implementation" within all the drivers). The most common
>   mistake is that
>   - Checking if the substream is null and calling into
>     snd_pcm_period_elapsed
>   - But `close` can happen anytime, pass without blocking, and
>     snd_pcm_detach_substream will be triggered right after it

Thanks, point taken.

While this argument is valid and it's good to harden the PCM core
side, the concurrent calls are basically a bug, and we'd need another
fix anyway.  Also, patch 2 makes little sense; there can't be
multiple close calls racing with each other.  So I'll go for taking
your fix, but only the first patch.

Back to this race: the surfaced issue is, as you pointed out, the race
between snd_pcm_period_elapsed() and the close call.  However, the
fundamental problem is the pending action after the PCM trigger-stop
call.  Since the PCM trigger neither blocks nor waits until the
hardware actually stops, the driver may go on to the next step even
after this "supposed-to-be-stopped" point.  In your case, it goes up
to close, and crashes.

If we had a sync-stop operation, the interrupt handler would have
finished before moving to the close stage, hence such a race could be
avoided.

It's been a long-known problem, and some drivers have their own
implementation of sync-stop.  I think it's time to investigate and
start implementing the fundamental solution.


thanks,

Takashi
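
For illustration, the core-side hardening discussed above (checking
the runtime pointer under the same lock that detach takes) can be
modelled in user space roughly as follows.  This is only a sketch
under stated assumptions: the `model_*` names and structures are
hypothetical stand-ins, not the kernel's `snd_pcm_*` API or the patch
under review, and a pthread mutex stands in for the PCM stream lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; NOT the kernel's structures. */
struct model_runtime {
	int running;            /* state a "running" check would read */
	unsigned long periods;  /* number of elapsed periods handled */
};

struct model_substream {
	pthread_mutex_t lock;          /* stands in for the stream lock */
	struct model_runtime *runtime; /* set to NULL on detach */
};

/* Open path: allocate a runtime and mark the stream as running. */
int model_attach(struct model_substream *s)
{
	struct model_runtime *rt = calloc(1, sizeof(*rt));

	if (!rt)
		return -1;
	rt->running = 1;
	pthread_mutex_init(&s->lock, NULL);
	s->runtime = rt;
	return 0;
}

/* Close path: clear the pointer while holding the lock, then free. */
void model_detach(struct model_substream *s)
{
	struct model_runtime *rt;

	pthread_mutex_lock(&s->lock);
	rt = s->runtime;
	s->runtime = NULL;
	pthread_mutex_unlock(&s->lock);
	free(rt);
}

/*
 * Interrupt path: the NULL check and the access both happen under
 * the lock, so a concurrent detach cannot slip in between them.
 * Returns 0 if a period was accounted, -1 if the stream is gone or
 * not running.
 */
int model_period_elapsed(struct model_substream *s)
{
	int ret = -1;

	pthread_mutex_lock(&s->lock);
	if (s->runtime && s->runtime->running) {
		s->runtime->periods++;
		ret = 0;
	}
	pthread_mutex_unlock(&s->lock);
	return ret;
}
```

In this model, a call to `model_period_elapsed()` that arrives after
`model_detach()` returns -1 instead of dereferencing a NULL runtime.
It only closes the NULL access; it does not address the pending
interrupt after trigger-stop, which is what a sync-stop operation
would be for.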