From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail-vk1-f174.google.com (mail-vk1-f174.google.com [209.85.221.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9E75F38DD5 for ; Tue, 13 Feb 2024 12:08:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.174 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707826131; cv=none; b=Xs3iepE6EjVY5Y8iw/bLqFcU034JqpwlT8wat70I0Os+dRzSuMHL9GU5Hn4KWamomDd5bOc5fTpc7Yb0bTysBBmJHnzfInqeTW8T2dUVLhCm3eLMehJWRj4YPcXG3GidsGdkOz6yyamOjKB8bPfHPeGDK1baLbnLKUOMZ49FVB8= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707826131; c=relaxed/simple; bh=RD2gOE43WjYVxdi1ngCyfW/4evEA9UOVCJgWZA9RArA=; h=MIME-Version:References:In-Reply-To:From:Date:Message-ID:Subject: To:Cc:Content-Type; b=dkdop5+/uKpYoefUIk9DKJKcKsZlrNqHxhuQa1JqqOvGf8vWL0BILcL5gUOYATV2PwD8O+IQ4YOQk4SBE3eewZSqbOO2slYupnLBwEdFR7eKvyFueXlFFa+JW1DELM6XfAyNR+NNACVSPJz/Z4b+sP5SZtABnYzMRxSv7cMbR6I= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=bgdev.pl; spf=none smtp.mailfrom=bgdev.pl; dkim=pass (2048-bit key) header.d=bgdev-pl.20230601.gappssmtp.com header.i=@bgdev-pl.20230601.gappssmtp.com header.b=VLdxOGll; arc=none smtp.client-ip=209.85.221.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=bgdev.pl Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=bgdev.pl Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bgdev-pl.20230601.gappssmtp.com header.i=@bgdev-pl.20230601.gappssmtp.com header.b="VLdxOGll" Received: by mail-vk1-f174.google.com with SMTP id 71dfb90a1353d-4c033928deaso3401831e0c.0 for ; Tue, 13 Feb 2024 04:08:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bgdev-pl.20230601.gappssmtp.com; s=20230601; t=1707826128; x=1708430928; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=RD2gOE43WjYVxdi1ngCyfW/4evEA9UOVCJgWZA9RArA=; b=VLdxOGll2aXINr6xr0F90BKX8l56IcVS7Rtxl0Lk91kxpM8gccBgLH0vx3s4NNtszF lsdTX28Iqmcv+HsCQ1aEJQzbS2uByIHP3v5rf6di84+BqrEbjSZzGhtKS3kuzYMNxUv7 iHm6NQ0clQzjH7xvNV7mLv8002zC4fRdAXXWJwb0dKkQBMzNRHv1Z5zOiL/B7e6HPHPl vbYNUnYxBOOvzrp1f4pNMd29GmH2Ri7k1hB3jpYnezJdoAKPpTz1Yv3ixeatKLlvJG68 ATDatEpUmRq8uSwZ/oMW+uqKjz4DkCcUn0f9I/BhNXsKZ/+YrTITOASh+hVwfMth/hCn Tusw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707826128; x=1708430928; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RD2gOE43WjYVxdi1ngCyfW/4evEA9UOVCJgWZA9RArA=; b=HlN/kiVv6nzJPqYP9NlQ3uekn9NVXwzKrisUNJ4akcAGlie/UBiZXTmHmj+U8RsDkq BbuYQqJVoKvclBBz4DAGfn6ozBxw4B/4Iz7/wbArZKP/Ijsj4uucETpJOQjTfAW2YwaK zgos1raLSvGwHUyOsPej4OdzPh/p21sA/kvShjkKxzIgGGhwu25xRA5EnNUY+/NNzYeS 1mlsahG1LVjRyf8o9En1mHdXi+T0oBNVch2UGAXZulZCJxf3aIKve8BTOenxJhGjfPQu e9njm04zft2NRztYx+ZABSRZT+6SdlOPk4ie/6Q+Ez9p4bNrxag6hq7B7pSsNQAWle1S 3v3g== X-Gm-Message-State: AOJu0YwjvVFvTq3Y/2aaFLPB3AQPr7esbH8CwlmLZRMRY/LKofKq44oq nc9/dy/5vHL8ByCbSeDbmUTXI84t/SrvWLupmROi80DzwHJNeP4k/cnkly4eAwyMYGwS8vcVBjr zaledF9d+z+NeX8UJmmRv73woY36ZiwHUCP9IHA== X-Google-Smtp-Source: AGHT+IGlnW4jJFYXgcuBWKQX+kpeByQrzmcNz+1+GaJF6Zm0adomc/SghGDfF//5mW2qfRM2I1XGer2HTXyrIynQ7ck= X-Received: by 2002:a1f:6281:0:b0:4bd:54d0:e6df with SMTP id w123-20020a1f6281000000b004bd54d0e6dfmr1945875vkb.1.1707826128494; Tue, 13 Feb 2024 04:08:48 -0800 (PST) Precedence: bulk X-Mailing-List: linux-gpio@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 References: <20240208095920.8035-1-brgl@bgdev.pl> In-Reply-To: From: Bartosz Golaszewski Date: Tue, 13 Feb 2024 13:08:37 +0100 Message-ID: Subject: Re: [PATCH v3 00/24] gpio: rework locking and object life-time control To: Marek Szyprowski Cc: Linus Walleij , Kent Gibson , Alex Elder , Geert Uytterhoeven , "Paul E . McKenney" , Andy Shevchenko , Wolfram Sang , linux-gpio@vger.kernel.org, linux-kernel@vger.kernel.org, Bartosz Golaszewski Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Feb 13, 2024 at 1:05=E2=80=AFPM Marek Szyprowski wrote: > > On 08.02.2024 10:58, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski > > > > This is a big rework of locking in GPIOLIB. The current serialization i= s > > pretty much useless. There is one big spinlock (gpio_lock) that "protec= ts" > > both the GPIO device list, GPIO descriptor access and who knows what el= se. > > > > I'm putting "protects" in quotes as in several places the lock is > > taken, released whenever a sleeping function is called and re-taken > > without regards for the "protected" state that may have changed. > > > > First a little background on what we're dealing with in GPIOLIB. We hav= e > > consumer API functions that can be called from any context explicitly > > (get/set value, set direction) as well as many others which will get > > called in atomic context implicitly (e.g. set config called in certain > > situations from gpiod_direction_output()). > > > > On the other side: we have GPIO provider drivers whose callbacks may or > > may not sleep depending on the underlying protocol. > > > > This makes any attempts at serialization quite complex. We typically > > cannot use sleeping locks - we may be called from atomic - but we also > > often cannot use spinlocks - provider callbacks may sleep. Moreover: we > > have close ties with the interrupt and pinctrl subsystems, often either > > calling into them or getting called from them. They use their own locki= ng > > schemes which are at odds with ours (pinctrl uses mutexes, the interrup= t > > subsystem can call GPIO helpers with spinlock taken). > > > > There is also another significant issue: the GPIO device object contain= s > > a pointer to gpio_chip which is the implementation of the GPIO provider= . > > This object can be removed at any point - as GPIOLIB officially support= s > > hotplugging with all the dynamic expanders that we provide drivers for = - > > and leave the GPIO API callbacks with a suddenly NULL pointer. This is > > a problem that allowed user-space processes to easily crash the kernel > > until we patched it with a read-write semaphore in the user-space facin= g > > code (but the problem still exists for in-kernel users). This was > > recognized before as evidenced by the implementation of validate_desc() > > but without proper serialization, simple checking for a NULL pointer is > > pointless and we do need a generic solution for that issue as well. > > > > If we want to get it right - the more lockless we go, the better. This = is > > why SRCU seems to be the right candidate for the mechanism to use. In f= act > > it's the only mechanism we can use our read-only critical sections to b= e > > called from atomic and protecc contexts as well as call driver callback= s > > that may sleep (for the latter case). > > > > We're going to use it in three places: to protect the global list of GP= IO > > devices, to ensure consistency when dereferencing the chip pointer in G= PIO > > device struct and finally to ensure that users can access GPIO descript= ors > > and always see a consistent state. > > > > We do NOT serialize all API callbacks. This means that provider callbac= ks > > may be called simultaneously and GPIO drivers need to provide their own > > locking if needed. This is on purpose. First: we only support exclusive > > GPIO usage* so there's no risk of two drivers getting in each other's w= ay > > over the same GPIO. Second: with this series, we ensure enough consiste= ncy > > to limit the chance of drivers or user-space users crashing the kernel. > > With additional improvements in handling the flags field in GPIO > > descriptors there's very little to gain, while bitbanging drivers may c= are > > about the increased performance of going lockless. > > > > This series brings in one somewhat significant functional change for > > in-kernel users, namely: GPIO API calls, for which the underlying GPIO > > chip is gone, will no longer return 0 and emit a log message but instea= d > > will return -ENODEV. > > > > I know this is a lot of code to go through but the more eyes we get on = it > > the better. > > I've noticed that this patchset landed in today's linux-next. It causes > a lots of warning during boot on my test boards when LOCKDEP is enabled > in kernel configs. Do you want me to report all of them? Some can be > easily reproduced even with QEMU's virt ARM and ARM64 machines. > Marek, Thanks for the report. I've already sent out patches that fix these problems. Sorry for this. Bartosz [snip]