* Wait for mutex to become unlocked
@ 2022-05-04 21:44 Matthew Wilcox
  2022-05-05  0:22 ` Thomas Gleixner
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Matthew Wilcox @ 2022-05-04 21:44 UTC
  To: Peter Zijlstra, Ingo Molnar, Will Deacon, Waiman Long
  Cc: Paul E. McKenney, Thomas Gleixner, Liam R. Howlett, linux-kernel

Paul, Liam and I were talking about some code we intend to write soon
and realised there's a missing function in the mutex & rwsem API.
We're intending to use it for an rwsem, but I think it applies equally
to mutexes.

The customer has a low priority task which wants to read /proc/pid/smaps
of a higher priority task.  Today, everything is awful; smaps acquires
mmap_sem read-only, is preempted, then the high-pri task calls mmap()
and the down_write(mmap_sem) blocks on the low-pri task.  Then all the
other threads in the high-pri task block on the mmap_sem as they take
page faults because we don't want writers to starve.

The approach we're looking at is to allow RCU lookup of VMAs, and then
take a per-VMA rwsem for read.  Because we're under RCU protection,
that looks a bit like this:

	rcu_read_lock();
	vma = vma_lookup();
	if (down_read_trylock(&vma->sem)) {
		rcu_read_unlock();
	} else {
		rcu_read_unlock();
		down_read(&mm->mmap_sem);
		vma = vma_lookup();
		down_read(&vma->sem);
		up_read(&mm->mmap_sem);
	}

(for clarity, I've skipped the !vma checks; don't take this too literally)

So this is Good.  For the vast majority of cases, we avoid taking the
mmap read lock and the problem will appear much less often.  But we can
do Better with a new API.  You see, for this case, we don't actually
want to acquire the mmap_sem; we're happy to spin a bit, but there's no
point in spinning waiting for the writer to finish when we can sleep.
I'd like to write this code:

again:
	rcu_read_lock();
	vma = vma_lookup();
	if (down_read_trylock(&vma->sem)) {
		rcu_read_unlock();
	} else {
		rcu_read_unlock();
		rwsem_wait_read(&mm->mmap_sem);
		goto again;
	}

That is, rwsem_wait_read() puts the thread on the rwsem's wait queue
and wakes it once the rwsem could be taken for read, without actually
giving it the lock.  Because this thread never holds mmap_sem, it can
never block a thread that tries to acquire mmap_sem for write.
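
To make the intended semantics concrete, here is a rough userspace sketch
using pthreads.  None of this is kernel code: struct toy_rwsem and all the
toy_*() helpers are hypothetical names invented for illustration, there is
no writer fairness or optimistic spinning, and a real rwsem_wait_read()
would presumably live in the rwsem slow path.  The only point is that the
waiter sleeps until no writer holds the semaphore and then returns without
ever counting as a reader.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an rwsem; not the kernel's. */
struct toy_rwsem {
	pthread_mutex_t lock;	/* protects readers and writer */
	pthread_cond_t cond;	/* signalled whenever the state changes */
	int readers;		/* number of active readers */
	bool writer;		/* true while a writer holds the sem */
};

#define TOY_RWSEM_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false }

/* Take the sem for read if no writer holds it; never sleeps on a writer. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	bool ok;

	pthread_mutex_lock(&sem->lock);
	ok = !sem->writer;
	if (ok)
		sem->readers++;
	pthread_mutex_unlock(&sem->lock);
	return ok;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	assert(sem->readers > 0);
	if (--sem->readers == 0)
		pthread_cond_broadcast(&sem->cond);	/* let a writer in */
	pthread_mutex_unlock(&sem->lock);
}

/* Sleep until there is no writer and no reader, then take the sem. */
static void toy_down_write(struct toy_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	while (sem->writer || sem->readers > 0)
		pthread_cond_wait(&sem->cond, &sem->lock);
	sem->writer = true;
	pthread_mutex_unlock(&sem->lock);
}

static void toy_up_write(struct toy_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	sem->writer = false;
	pthread_cond_broadcast(&sem->cond);	/* wake readers and waiters */
	pthread_mutex_unlock(&sem->lock);
}

/*
 * The proposed operation: sleep until no writer holds the sem, then
 * return WITHOUT becoming a reader.  A later toy_down_write() is
 * therefore never delayed by a thread that only called this.
 */
static void toy_rwsem_wait_read(struct toy_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	while (sem->writer)
		pthread_cond_wait(&sem->cond, &sem->lock);
	pthread_mutex_unlock(&sem->lock);
}
```

In this sketch the caller would loop on toy_down_read_trylock() for the
per-VMA sem and call toy_rwsem_wait_read() on the mmap_sem stand-in when
the trylock fails, mirroring the "goto again" loop above.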

Similarly, it may make sense to add rwsem_wait_write() and mutex_wait().
Perhaps also mutex_wait_killable() and mutex_wait_interruptible()
(the combinatoric explosion is a bit messy; I don't know that it makes
sense to do the _nested, _io variants).

Does any of this make sense?


Thread overview: 8+ messages
2022-05-04 21:44 Wait for mutex to become unlocked Matthew Wilcox
2022-05-05  0:22 ` Thomas Gleixner
2022-05-05  0:38   ` Matthew Wilcox
2022-05-05  1:14     ` Thomas Gleixner
2022-05-05  5:04       ` Paul E. McKenney
2022-05-05  5:21         ` Paul E. McKenney
2022-05-05  1:11 ` Waiman Long
     [not found] ` <20220505015223.5132-1-hdanton@sina.com>
2022-05-05  4:12   ` Matthew Wilcox
