From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from the.earth.li (the.earth.li [93.93.131.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EB1A129A308; Fri, 4 Jul 2025 09:33:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=93.93.131.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751621616; cv=none; b=OFtaXp7K8h50B8dHQasuSacSBAB1qTL7WUlAxWdeNUv/gEVZH//xUQxhLRAalR4iekZCsnfXouEbJqQfNzilmHTbXcWS3QyGTwNosrx+qJUqibmQoPLTxXCxuTN9IVUwYbGZqBz3UGFeQmMt7j8F5agk9DSUctBjr5Np2ZS+NUc= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751621616; c=relaxed/simple; bh=fHcJeLxd3czLO7BU7brQAuvBtgPi8QsgFNuxt0mkEqM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=J74H/hneYxGuaHDGzNv1SBRy7BfhWJKjcjxuQE775VAd4H/wwnBQ+/5uObSr9awtEpjQeLZuJbt10azvY7lNU6+P0rJg1k4BKRiCr2xqwHwWjH/ksTgQgxYo/UYauUCvR0LBJIr0NpR3bZvfGUkp054T8DS56BzHV5ReciUCsJM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=earth.li; spf=pass smtp.mailfrom=earth.li; dkim=pass (2048-bit key) header.d=earth.li header.i=@earth.li header.b=uKxsXHdj; arc=none smtp.client-ip=93.93.131.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=earth.li Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=earth.li Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=earth.li header.i=@earth.li header.b="uKxsXHdj" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=earth.li; s=the; h=In-Reply-To:Content-Type:MIME-Version:References:Message-ID:Subject: Cc:To:From:Date:Sender:Reply-To:Content-Transfer-Encoding:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=+ugYoCt4pJwQ2pOWir6Edta+6ErzUvYTIgWTgVQ/3MQ=; b=uKxsXHdj6Ne/ZaRoOuewi5ciYi 8fwgDW1KbupmKAcb4xGGFmTeQi1OzdjLFFhRKbQWak0cxynu3yV6kBA3QjNIfX9xZ8o2QYnvNfLZK Ih1F/sNvLjZzuq4/q1FpTrL/6Yn4t/vk7Kmy2k5k4U9u+PWKdCPr5DsxRWp0mm8wQ2Jp/LpKW4Zbl PEfWzem+TeDVHfbNdf8ak+Hdxua2egOzivAl28wvaWi2mWDdMwMvgQYMwj4byMzq5ZcxdMNqGzt3p 6+8/OvF5dV8Za+/THlftghSXRMuR6U7kHkUd/yRS0xUdqvvnCCwS5QhjZblkSbMTQpWHfpCDXIPq2 NG8bCTrg==; Received: from noodles by the.earth.li with local (Exim 4.96) (envelope-from ) id 1uXcJV-00F2KC-2R; Fri, 04 Jul 2025 10:02:33 +0100 Date: Fri, 4 Jul 2025 10:02:33 +0100 From: Jonathan McDowell To: Jarkko Sakkinen Cc: "Orlov, Ivan" , "peterhuewe@gmx.de" , "jgg@ziepe.ca" , "linux-integrity@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "Woodhouse, David" Subject: Re: [PATCH v2] tpm: Fix the timeout & use ktime Message-ID: References: <20250620180828.98413-1-iorlov@amazon.com> Precedence: bulk X-Mailing-List: linux-integrity@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: On Wed, Jun 25, 2025 at 07:43:07PM +0300, Jarkko Sakkinen wrote: >On Sun, Jun 22, 2025 at 09:52:58PM +0100, Jonathan McDowell wrote: >> On Fri, Jun 20, 2025 at 06:08:31PM +0000, Orlov, Ivan wrote: >> > The current implementation of timeout detection works in the following >> > way: >> > >> > 1. Read completion status. If completed, return the data >> > 2. Sleep for some time (usleep_range) >> > 3. Check for timeout using current jiffies value. Return an error if >> > timed out >> > 4. Goto 1 >> > >> > usleep_range doesn't guarantee it's always going to wake up strictly in >> > (min, max) range, so such a situation is possible: >> > >> > 1. Driver reads completion status. No completion yet >> > 2. Process sleeps indefinitely. In the meantime, TPM responds >> > 3. We check for timeout without checking for the completion again. >> > Result is lost. >> > >> > Such a situation also happens for the guest VMs: if vCPU goes to sleep >> > and doesn't get scheduled for some time, the guest TPM driver will >> > timeout instantly after waking up without checking for the completion >> > (which may already be in place). >> > >> > Perform the completion check once again after exiting the busy loop in >> > order to give the device the last chance to send us some data. >> > >> > Since now we check for completion in two places, extract this check into >> > a separate function. >> > >> > Signed-off-by: Ivan Orlov >> > --- >> > V1 -> V2: >> > - Exclude the jiffies -> ktime change from the patch >> > - Instead of recording the time before checking for completion, check >> > for completion once again after leaving the loop >> > >> > drivers/char/tpm/tpm-interface.c | 17 +++++++++++++++-- >> > 1 file changed, 15 insertions(+), 2 deletions(-) >> > >> > diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c >> > index 8d7e4da6ed53..6960ee2798e1 100644 >> > --- a/drivers/char/tpm/tpm-interface.c >> > +++ b/drivers/char/tpm/tpm-interface.c >> > @@ -82,6 +82,13 @@ static bool tpm_chip_req_canceled(struct tpm_chip *chip, u8 status) >> > return chip->ops->req_canceled(chip, status); >> > } >> > >> > +static bool tpm_transmit_completed(struct tpm_chip *chip) >> > +{ >> > + u8 status_masked = tpm_chip_status(chip) & chip->ops->req_complete_mask; >> > + >> > + return status_masked == chip->ops->req_complete_val; >> > +} >> > + >> > static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz) >> > { >> > struct tpm_header *header = buf; >> > @@ -129,8 +136,7 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz) >> > stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal); >> > do { >> > u8 status = tpm_chip_status(chip); >> > - if ((status & chip->ops->req_complete_mask) == >> > - chip->ops->req_complete_val) >> > + if (tpm_transmit_completed(chip)) >> > goto out_recv; >> >> The only thing I'd point out here is we end up doing a double status read >> one after the other (once here, once in tpm_transmit_completed), and I'm >> pretty sure I've seen instances where that caused a problem. > >It would be easy to to prevent at least double reads after completion >e.g., in tpm_chip_status(): Or just take the simple approach and make the check after the while loop: if ((tpm_chip_status(chip) & chip->ops->req_complete_mask) == chip->ops->req_complete_val) goto out_recv; There might be potential for a longer term cleanup using chip->status to cache things, but I'm little concerned that's going to open paths where we might not correctly update it, so I think it should be a separate piece. (I'm motivated by the fact we've started to see the "Operation Canceled" error and I'd like us to close on the best way to fix it. :) ) J. -- I am afraid of the dark.