linux-integrity.vger.kernel.org archive mirror
* [PATCH v3] tpm: Check for completion after timeout
@ 2025-07-19 20:13 Ivan Orlov
  2025-07-22 16:02 ` Jonathan McDowell
  2025-07-22 23:18 ` Jarkko Sakkinen
  0 siblings, 2 replies; 4+ messages in thread
From: Ivan Orlov @ 2025-07-19 20:13 UTC (permalink / raw)
  To: peterhuewe, jarkko
  Cc: Ivan Orlov, iorlov, jgg, linux-integrity, linux-kernel, dwmw,
	noodles

The current implementation of timeout detection works in the following
way:

1. Read completion status. If completed, return the data
2. Sleep for some time (usleep_range)
3. Check for timeout using current jiffies value. Return an error if
   timed out
4. Goto 1

usleep_range() does not guarantee that the task wakes up within the
(min, max) range, so the following situation is possible:

1. Driver reads completion status. No completion yet
2. Process sleeps for much longer than requested, past the timeout.
   In the meantime, TPM responds
3. We check for timeout without checking for the completion again.
   Result is lost.

The same situation occurs in guest VMs: if a vCPU goes to sleep and
doesn't get scheduled for some time, the guest TPM driver will time out
immediately after waking up, without checking for a completion that may
already be in place.

Perform the completion check once more after exiting the loop in order
to give the device a last chance to deliver its response.

Since we now check for completion in two places, extract this check into
a separate function.

Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
---
V1 -> V2:
- Exclude the jiffies -> ktime change from the patch
- Instead of recording the time before checking for completion, check
  for completion once again after leaving the loop
V2 -> V3:
- Avoid reading the chip status twice in the inner loop by passing
  status into tpm_transmit_completed
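
For illustration only (not part of the patch): a minimal user-space
sketch of the race described in the commit message, assuming a simulated
clock and a simulated device. struct fake_tpm, fake_status(),
transmit_completed() and poll_device() are hypothetical names made up
for this sketch, not kernel APIs; the real code uses jiffies,
tpm_chip_status() and the completion mask/value from chip->ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the chip plus a simulated clock (in ticks). */
struct fake_tpm {
	uint64_t now;		/* simulated current time */
	uint64_t ready_at;	/* tick at which the device reports completion */
	uint8_t complete_mask;
	uint8_t complete_val;
};

/* Simulated status register: completion bits appear once ready_at is reached. */
static uint8_t fake_status(const struct fake_tpm *dev)
{
	return dev->now >= dev->ready_at ? dev->complete_val : 0;
}

/* Same shape as the helper in the patch: mask the status, compare to the value. */
static bool transmit_completed(uint8_t status, const struct fake_tpm *dev)
{
	return (status & dev->complete_mask) == dev->complete_val;
}

/*
 * Poll until completion or until `timeout` ticks have elapsed. Each sleep
 * normally lasts one tick, but the sleep in iteration `stall_iter` lasts
 * `stall_len` ticks, which models an oversleeping task or a descheduled
 * vCPU. Returns 0 on completion and -1 on timeout.
 */
static int poll_device(struct fake_tpm *dev, uint64_t timeout,
		       int stall_iter, uint64_t stall_len, bool final_check)
{
	uint64_t stop;
	int iter = 0;

	assert(dev != NULL);
	stop = dev->now + timeout;

	do {
		if (transmit_completed(fake_status(dev), dev))
			return 0;
		/* Simulated sleep: one tick, or stall_len ticks when stalled. */
		dev->now += (iter++ == stall_iter) ? stall_len : 1;
	} while (dev->now < stop);

	/* The extra check: the device may have completed while we slept. */
	if (final_check && transmit_completed(fake_status(dev), dev))
		return 0;

	return -1;
}
```

With a device that becomes ready at tick 5, a timeout of 10 ticks and a
100-tick stall in iteration 2, poll_device() returns -1 when final_check
is false and 0 when it is true. A device that never becomes ready still
times out in both cases.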

 drivers/char/tpm/tpm-interface.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 8d7e4da6ed53..8d18b33aa62d 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -82,6 +82,13 @@ static bool tpm_chip_req_canceled(struct tpm_chip *chip, u8 status)
 	return chip->ops->req_canceled(chip, status);
 }
 
+static bool tpm_transmit_completed(u8 status, struct tpm_chip *chip)
+{
+	u8 status_masked = status & chip->ops->req_complete_mask;
+
+	return status_masked == chip->ops->req_complete_val;
+}
+
 static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
 {
 	struct tpm_header *header = buf;
@@ -129,8 +136,7 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
 	stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
 	do {
 		u8 status = tpm_chip_status(chip);
-		if ((status & chip->ops->req_complete_mask) ==
-		    chip->ops->req_complete_val)
+		if (tpm_transmit_completed(status, chip))
 			goto out_recv;
 
 		if (tpm_chip_req_canceled(chip, status)) {
@@ -142,6 +148,13 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
 		rmb();
 	} while (time_before(jiffies, stop));
 
+	/*
+	 * Check for completion one more time, just in case the device reported
+	 * it while the driver was sleeping in the busy loop above.
+	 */
+	if (tpm_transmit_completed(tpm_chip_status(chip), chip))
+		goto out_recv;
+
 	tpm_chip_cancel(chip);
 	dev_err(&chip->dev, "Operation Timed out\n");
 	return -ETIME;
-- 
2.43.0



* Re: [PATCH v3] tpm: Check for completion after timeout
  2025-07-19 20:13 [PATCH v3] tpm: Check for completion after timeout Ivan Orlov
@ 2025-07-22 16:02 ` Jonathan McDowell
  2025-07-22 23:18 ` Jarkko Sakkinen
  1 sibling, 0 replies; 4+ messages in thread
From: Jonathan McDowell @ 2025-07-22 16:02 UTC (permalink / raw)
  To: Ivan Orlov
  Cc: peterhuewe, jarkko, iorlov, jgg, linux-integrity, linux-kernel,
	dwmw

On Sat, Jul 19, 2025 at 08:13:39PM +0000, Ivan Orlov wrote:
>The current implementation of timeout detection works in the following
>way:
>
>1. Read completion status. If completed, return the data
>2. Sleep for some time (usleep_range)
>3. Check for timeout using current jiffies value. Return an error if
>   timed out
>4. Goto 1
>
>usleep_range doesn't guarantee it's always going to wake up strictly in
>(min, max) range, so such a situation is possible:
>
>1. Driver reads completion status. No completion yet
>2. Process sleeps indefinitely. In the meantime, TPM responds
>3. We check for timeout without checking for the completion again.
>   Result is lost.
>
>Such a situation also happens for the guest VMs: if vCPU goes to sleep
>and doesn't get scheduled for some time, the guest TPM driver will
>timeout instantly after waking up without checking for the completion
>(which may already be in place).
>
>Perform the completion check once again after exiting the busy loop in
>order to give the device the last chance to send us some data.
>
>Since now we check for completion in two places, extract this check into
>a separate function.
>
>Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>

I'm running this (though with an open coded repeated check, instead of 
the check function) in our fleet now. It hasn't rolled out to enough 
machines for me to confirm it definitely fixes the problem we see, but:

Reviewed-By: Jonathan McDowell <noodles@meta.com>

>---
>V1 -> V2:
>- Exclude the jiffies -> ktime change from the patch
>- Instead of recording the time before checking for completion, check
>  for completion once again after leaving the loop
>V2 -> V3:
>- Avoid reading the chip status twice in the inner loop by passing
>  status into tpm_transmit_completed
>
> drivers/char/tpm/tpm-interface.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
>index 8d7e4da6ed53..8d18b33aa62d 100644
>--- a/drivers/char/tpm/tpm-interface.c
>+++ b/drivers/char/tpm/tpm-interface.c
>@@ -82,6 +82,13 @@ static bool tpm_chip_req_canceled(struct tpm_chip *chip, u8 status)
> 	return chip->ops->req_canceled(chip, status);
> }
>
>+static bool tpm_transmit_completed(u8 status, struct tpm_chip *chip)
>+{
>+	u8 status_masked = status & chip->ops->req_complete_mask;
>+
>+	return status_masked == chip->ops->req_complete_val;
>+}
>+
> static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
> {
> 	struct tpm_header *header = buf;
>@@ -129,8 +136,7 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
> 	stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
> 	do {
> 		u8 status = tpm_chip_status(chip);
>-		if ((status & chip->ops->req_complete_mask) ==
>-		    chip->ops->req_complete_val)
>+		if (tpm_transmit_completed(status, chip))
> 			goto out_recv;
>
> 		if (tpm_chip_req_canceled(chip, status)) {
>@@ -142,6 +148,13 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
> 		rmb();
> 	} while (time_before(jiffies, stop));
>
>+	/*
>+	 * Check for completion one more time, just in case the device reported
>+	 * it while the driver was sleeping in the busy loop above.
>+	 */
>+	if (tpm_transmit_completed(tpm_chip_status(chip), chip))
>+		goto out_recv;
>+
> 	tpm_chip_cancel(chip);
> 	dev_err(&chip->dev, "Operation Timed out\n");
> 	return -ETIME;
>-- 
>2.43.0
>

J.

-- 
/-\                             |   <fledermaus> you should see the
|@/  Debian GNU/Linux Developer |  damage a bear on fire can do to a
\-                              |          rack of switches.


* Re: [PATCH v3] tpm: Check for completion after timeout
  2025-07-19 20:13 [PATCH v3] tpm: Check for completion after timeout Ivan Orlov
  2025-07-22 16:02 ` Jonathan McDowell
@ 2025-07-22 23:18 ` Jarkko Sakkinen
  2025-07-22 23:22   ` Jarkko Sakkinen
  1 sibling, 1 reply; 4+ messages in thread
From: Jarkko Sakkinen @ 2025-07-22 23:18 UTC (permalink / raw)
  To: Ivan Orlov
  Cc: peterhuewe, iorlov, jgg, linux-integrity, linux-kernel, dwmw,
	noodles

On Sat, Jul 19, 2025 at 08:13:39PM +0000, Ivan Orlov wrote:
> The current implementation of timeout detection works in the following
> way:
> 
> 1. Read completion status. If completed, return the data
> 2. Sleep for some time (usleep_range)
> 3. Check for timeout using current jiffies value. Return an error if
>    timed out
> 4. Goto 1
> 
> usleep_range doesn't guarantee it's always going to wake up strictly in
> (min, max) range, so such a situation is possible:
> 
> 1. Driver reads completion status. No completion yet
> 2. Process sleeps indefinitely. In the meantime, TPM responds
> 3. We check for timeout without checking for the completion again.
>    Result is lost.
> 
> Such a situation also happens for the guest VMs: if vCPU goes to sleep
> and doesn't get scheduled for some time, the guest TPM driver will
> timeout instantly after waking up without checking for the completion
> (which may already be in place).
> 
> Perform the completion check once again after exiting the busy loop in
> order to give the device the last chance to send us some data.
> 
> Since now we check for completion in two places, extract this check into
> a separate function.
> 
> Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
> ---
> V1 -> V2:
> - Exclude the jiffies -> ktime change from the patch
> - Instead of recording the time before checking for completion, check
>   for completion once again after leaving the loop
> V2 -> V3:
> - Avoid reading the chip status twice in the inner loop by passing
>   status into tpm_transmit_completed
> 
>  drivers/char/tpm/tpm-interface.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
> index 8d7e4da6ed53..8d18b33aa62d 100644
> --- a/drivers/char/tpm/tpm-interface.c
> +++ b/drivers/char/tpm/tpm-interface.c
> @@ -82,6 +82,13 @@ static bool tpm_chip_req_canceled(struct tpm_chip *chip, u8 status)
>  	return chip->ops->req_canceled(chip, status);
>  }
>  
> +static bool tpm_transmit_completed(u8 status, struct tpm_chip *chip)
> +{
> +	u8 status_masked = status & chip->ops->req_complete_mask;
> +
> +	return status_masked == chip->ops->req_complete_val;
> +}
> +
>  static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
>  {
>  	struct tpm_header *header = buf;
> @@ -129,8 +136,7 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
>  	stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
>  	do {
>  		u8 status = tpm_chip_status(chip);
> -		if ((status & chip->ops->req_complete_mask) ==
> -		    chip->ops->req_complete_val)
> +		if (tpm_transmit_completed(status, chip))
>  			goto out_recv;
>  
>  		if (tpm_chip_req_canceled(chip, status)) {
> @@ -142,6 +148,13 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
>  		rmb();
>  	} while (time_before(jiffies, stop));
>  
> +	/*
> +	 * Check for completion one more time, just in case the device reported
> +	 * it while the driver was sleeping in the busy loop above.
> +	 */
> +	if (tpm_transmit_completed(tpm_chip_status(chip), chip))
> +		goto out_recv;
> +
>  	tpm_chip_cancel(chip);
>  	dev_err(&chip->dev, "Operation Timed out\n");
>  	return -ETIME;
> -- 
> 2.43.0
> 

I guess this is completed too by now ...

Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

BR, Jarkko


* Re: [PATCH v3] tpm: Check for completion after timeout
  2025-07-22 23:18 ` Jarkko Sakkinen
@ 2025-07-22 23:22   ` Jarkko Sakkinen
  0 siblings, 0 replies; 4+ messages in thread
From: Jarkko Sakkinen @ 2025-07-22 23:22 UTC (permalink / raw)
  To: Ivan Orlov
  Cc: peterhuewe, iorlov, jgg, linux-integrity, linux-kernel, dwmw,
	noodles

On Wed, Jul 23, 2025 at 02:18:52AM +0300, Jarkko Sakkinen wrote:
> On Sat, Jul 19, 2025 at 08:13:39PM +0000, Ivan Orlov wrote:
> > The current implementation of timeout detection works in the following
> > way:
> > 
> > 1. Read completion status. If completed, return the data
> > 2. Sleep for some time (usleep_range)
> > 3. Check for timeout using current jiffies value. Return an error if
> >    timed out
> > 4. Goto 1
> > 
> > usleep_range doesn't guarantee it's always going to wake up strictly in
> > (min, max) range, so such a situation is possible:
> > 
> > 1. Driver reads completion status. No completion yet
> > 2. Process sleeps indefinitely. In the meantime, TPM responds
> > 3. We check for timeout without checking for the completion again.
> >    Result is lost.
> > 
> > Such a situation also happens for the guest VMs: if vCPU goes to sleep
> > and doesn't get scheduled for some time, the guest TPM driver will
> > timeout instantly after waking up without checking for the completion
> > (which may already be in place).
> > 
> > Perform the completion check once again after exiting the busy loop in
> > order to give the device the last chance to send us some data.
> > 
> > Since now we check for completion in two places, extract this check into
> > a separate function.
> > 
> > Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
> > ---
> > V1 -> V2:
> > - Exclude the jiffies -> ktime change from the patch
> > - Instead of recording the time before checking for completion, check
> >   for completion once again after leaving the loop
> > V2 -> V3:
> > - Avoid reading the chip status twice in the inner loop by passing
> >   status into tpm_transmit_completed
> > 
> >  drivers/char/tpm/tpm-interface.c | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
> > index 8d7e4da6ed53..8d18b33aa62d 100644
> > --- a/drivers/char/tpm/tpm-interface.c
> > +++ b/drivers/char/tpm/tpm-interface.c
> > @@ -82,6 +82,13 @@ static bool tpm_chip_req_canceled(struct tpm_chip *chip, u8 status)
> >  	return chip->ops->req_canceled(chip, status);
> >  }
> >  
> > +static bool tpm_transmit_completed(u8 status, struct tpm_chip *chip)
> > +{
> > +	u8 status_masked = status & chip->ops->req_complete_mask;
> > +
> > +	return status_masked == chip->ops->req_complete_val;
> > +}
> > +
> >  static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
> >  {
> >  	struct tpm_header *header = buf;
> > @@ -129,8 +136,7 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
> >  	stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
> >  	do {
> >  		u8 status = tpm_chip_status(chip);
> > -		if ((status & chip->ops->req_complete_mask) ==
> > -		    chip->ops->req_complete_val)
> > +		if (tpm_transmit_completed(status, chip))
> >  			goto out_recv;
> >  
> >  		if (tpm_chip_req_canceled(chip, status)) {
> > @@ -142,6 +148,13 @@ static ssize_t tpm_try_transmit(struct tpm_chip *chip, void *buf, size_t bufsiz)
> >  		rmb();
> >  	} while (time_before(jiffies, stop));
> >  
> > +	/*
> > +	 * Check for completion one more time, just in case the device reported
> > +	 * it while the driver was sleeping in the busy loop above.
> > +	 */
> > +	if (tpm_transmit_completed(tpm_chip_status(chip), chip))
> > +		goto out_recv;
> > +
> >  	tpm_chip_cancel(chip);
> >  	dev_err(&chip->dev, "Operation Timed out\n");
> >  	return -ETIME;
> > -- 
> > 2.43.0
> > 
> 
> I guess this is completed too by now ...
> 
> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>

Just saying (i.e. I will fix it up): s/Reviewed-By/Reviewed-by/g ;-)

checkpatch.pl does scream about this but yeah not a huge deal!

BR, Jarkko

