linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* fs/dcache: Resolve the last RT woes.
@ 2022-06-13 14:07 Sebastian Andrzej Siewior
  2022-06-13 14:07 ` [PATCH 1/4] fs/dcache: Disable preemption on i_dir_seq write side on PREEMPT_RT Sebastian Andrzej Siewior
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Sebastian Andrzej Siewior @ 2022-06-13 14:07 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Alexander Viro, Matthew Wilcox, Thomas Gleixner

PREEMPT_RT has two issues with the dcache code:

   1) i_dir_seq is a special cased sequence counter which uses the lowest
      bit as writer lock. This requires that preemption is disabled.

      On !RT kernels preemption is implictly disabled by spin_lock(), but
      that's not the case on RT kernels.

      Replacing i_dir_seq on RT with a seqlock_t comes with its own
      problems due to arbitrary lock nesting. Using a seqcount with an
      associated lock is not possible either because the locks held by the
      callers are not necessarily related.

      Explicitly disabling preemption on RT kernels across the i_dir_seq
      write held critical section is the obvious and simplest solution. The
      critical section is small, so latency is not really a concern.

   2) The wake up of dentry::d_wait waiters is in a preemption disabled
      section, which violates the RT constraints as wake_up_all() has
      to acquire the wait queue head lock which is a 'sleeping' spinlock
      on RT.

      There are two reasons for the non-preemtible section:

         A) The wake up happens under the hash list bit lock

         B) The wake up happens inside the i_dir_seq write side
            critical section

       #A is solvable by moving it outside of the hash list bit lock held                                                                                                                      
       section.

       Making #B preemptible on RT is hard or even impossible due to lock
       nesting constraints.

       A possible solution would be to replace the waitqueue by a simple
       waitqueue which can be woken up inside atomic sections on RT.

       But aside of Linus not being a fan of simple waitqueues, there is
       another observation vs. this wake up. It's likely for the woken up
       waiter to immediately contend on dentry::lock.

       It turns out that there is no reason to do the wake up within the
       i_dir_seq write held region. The only requirement is to do the wake
       up within the dentry::lock held region. Further details in the
       individual patches.

       That allows to move the wake up out of the non-preemptible section
       on RT, which also reduces the dentry::lock held time after wake up.

Thanks,

Sebastian




^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-07-27  3:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-13 14:07 fs/dcache: Resolve the last RT woes Sebastian Andrzej Siewior
2022-06-13 14:07 ` [PATCH 1/4] fs/dcache: Disable preemption on i_dir_seq write side on PREEMPT_RT Sebastian Andrzej Siewior
2022-06-13 14:07 ` [PATCH 2/4] fs/dcache: Split __d_lookup_done() Sebastian Andrzej Siewior
2022-07-27  3:31   ` Al Viro
2022-07-27  3:36     ` Al Viro
2022-06-13 14:07 ` [PATCH 3/4] fs/dcache: Use __d_lookup_unhash() in __d_add/move() Sebastian Andrzej Siewior
2022-07-26  3:00   ` Al Viro
2022-07-26  7:47     ` [PATCH 3/4 v2] " Sebastian Andrzej Siewior
2022-06-13 14:07 ` [PATCH 4/4] fs/dcache: Move wakeup out of i_seq_dir write held region Sebastian Andrzej Siewior
2022-07-08 15:52 ` fs/dcache: Resolve the last RT woes Thomas Gleixner
2022-07-25 15:49 ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).