All of lore.kernel.org
 help / color / mirror / Atom feed
From: Benjamin Marzinski <bmarzins@redhat.com>
To: Mikulas Patocka <mpatocka@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>,
	dm-devel@lists.linux.dev, Damien Le Moal <dlemoal@kernel.org>,
	Christian Loehle <christian.loehle@arm.com>
Subject: Re: [PATCH] dm-delay: don't busy-wait in kthread
Date: Mon, 14 Apr 2025 13:06:14 -0400	[thread overview]
Message-ID: <Z_1Ahi59V3YjG208@redhat.com> (raw)
In-Reply-To: <Z_09e7Zddckboca9@redhat.com>

On Mon, Apr 14, 2025 at 12:53:15PM -0400, Benjamin Marzinski wrote:
> On Mon, Apr 14, 2025 at 03:52:18PM +0200, Mikulas Patocka wrote:
> > 
> > 
> > On Fri, 11 Apr 2025, Benjamin Marzinski wrote:
> > 
> > > When using a kthread to delay the IOs, dm-delay would continuously loop,
> > > checking if IOs were ready to submit. It had a cond_resched() call in
> > > the loop, but might still loop hundreds of millions of times waiting for
> > > an IO that was scheduled to be submitted 10s of ms in the future. With
> > > the change to make dm-delay over zoned devices always use kthreads
> > > regardless of the length of the delay, this wasted work only gets worse.
> > > 
> > > To solve this and still keep roughly the same precision for very short
> > > delays, dm-delay now calls fsleep() for 1/8th of the smallest non-zero
> > > delay it will place on IOs, or 1 ms, whichever is smaller. The reason
> > > that dm-delay doesn't just use the actual expiration time of the next
> > > delayed IO to calculated the sleep time is that delay_dtr() must wait
> > > for the kthread to finish before deleting the table. If a zoned device
> > > with a long delay queued an IO shortly before being suspended and
> > > removed, the IO would be flushed in delay_presuspend(), but the removing
> > > the device would still have to wait for the remainder of the long delay.
> > > This time is now capped at 1 ms.
> > > 
> > > Fixes: 70bbeb29fab09 ("dm delay: for short delays, use kthread instead of timers and wq")
> > > Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> > > ---
> > > This patch is meant to apply on top of Damien Le Moal's "dm-delay:
> > > Prevent zoned write reordering on suspend" patch. If people think it's
> > > important to avoid either this much smaller amount of looping or the
> > > possible 1 ms delay on deleting a table, I can send a patch that uses
> > > usleep_range_state() and msleep_interruptible() to do an interruptible
> > > sleep with a duration based on the expiration time of the next delayed
> > > IO.
> > 
> > Hi
> > 
> > worker_sleep_ns should be worker_sleep_us - as the value is in 
> > microseconds.
> 
> Oops.
> 
> > fsleep in flush_worker_fn should be called unconditionally, to not consume 
> > 100% CPU when suspending.
> 
> This is fine, but flush_worker_fn() won't busy-wait while suspending.
> Since it is calling flush_delayed_bios() with flush_all equal to true,
> it will handle all the bios on the list. As long as bios keep getting
> added to the list while it's flushing the last batch, it will keep
> looping to flush them. But it will be doing necessary work on each loop,
> and flush_delayed_bios() has a cond_resched(), so even if there are a
> flood of bios, it will still take breaks. As soon as bios stop
> continuously arriving, flush_worker_fn() will see the empty list and the
> kthread will stop.
> 
> Unconditionally sleeping here makes it more likely that dm-delay will
> end up sleeping unnecessarily while a there are just a few remaining
> bios on the list. Like I said in my commit message, this sleep will be
> capped at 1 ms, so it's not that big of a deal. But it's a trade-off.
> I'm o.k. with your version. I just not sure that it is the better
> trade-off. Perhaps I'm overlooking something.

And the thing that I overlooked was you NAK'ing Damien's patch. Oops.
Please ignore this.

-Ben
 
> > cond_resched() shouldn't be removed because fsleep may fall back to 
> > udelay.
> 
> Again, your version is fine, but I'm not sure that cond_resched() was
> ever necessary, since there already is one in flush_delayed_bios().
> Also, at least the way it's currently coded, fsleep() will only resort
> to busy-waiting when the delay is 10 us or less, and the shortest it can
> be with this code is 62 us, so I don't think this cond_resched() will
> ever do anything.
> 
> > The patch should increase target version.
> > 
> > I fixed the patch so that it applies on the top Linus' tree and applied 
> > it to the linux-dm tree.
> > 
> > BTW. do we need to backport this to the stable kernels? I think not, but 
> > if you have some reason why should we backport it, explain it.
> 
> dm-delay is basically a testing target, so I agree that it seems
> unnecessary to backport this.
> 
> -Ben
> 
> > 
> > Mikulas
> 


  reply	other threads:[~2025-04-14 17:06 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-11 20:56 [PATCH] dm-delay: don't busy-wait in kthread Benjamin Marzinski
2025-04-11 21:12 ` Benjamin Marzinski
2025-04-14  0:00 ` Damien Le Moal
2025-04-14  0:16 ` Damien Le Moal
2025-04-14 13:52 ` Mikulas Patocka
2025-04-14 16:53   ` Benjamin Marzinski
2025-04-14 17:06     ` Benjamin Marzinski [this message]
2025-04-14 20:09     ` Mikulas Patocka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_1Ahi59V3YjG208@redhat.com \
    --to=bmarzins@redhat.com \
    --cc=christian.loehle@arm.com \
    --cc=dlemoal@kernel.org \
    --cc=dm-devel@lists.linux.dev \
    --cc=mpatocka@redhat.com \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.