From: Oleg Nesterov <oleg@redhat.com>
To: Al Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Daniel Wagner <daniel.wagner@bmw-carit.de>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Ingo Molnar <mingo@redhat.com>, Tejun Heo <tj@kernel.org>,
	linux-kernel@vger.kernel.org
Subject: [PATCH RFC 0/4] change sb_writers to use percpu_rw_semaphore
Date: Mon, 13 Jul 2015 23:25:36 +0200
Message-ID: <20150713212536.GA13855@redhat.com>

Hello,

Al, Jan, could you comment? I mean the intent, the patches are
obviously not for inclusion yet.

We can remove everything from struct sb_writers except frozen
(which can become a boolean, it seems) and add the array of
percpu_rw_semaphore's instead.

__sb_start/end_write() can use percpu_down/up_read(), and
freeze/thaw_super() can use percpu_down/up_write().
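
A rough userspace sketch of that mapping, not the actual patch. Everything
prefixed toy_ is hypothetical: the "semaphore" here is a single-threaded
stand-in with trylock semantics only, whereas the real percpu_rw_semaphore
blocks and keeps per-CPU state.

```c
#include <stdbool.h>

/* Freeze levels, loosely mirroring SB_FREEZE_WRITE/PAGEFAULT/FS. */
enum { LEVEL_WRITE, LEVEL_PAGEFAULT, LEVEL_FS, LEVELS };

/* Toy stand-in for percpu_rw_semaphore: no blocking, no per-CPU state. */
struct toy_rwsem {
	int readers;
	bool writer;
};

/* Assumed shape of the slimmed-down sb_writers: a flag plus the array. */
struct toy_sb_writers {
	bool frozen;
	struct toy_rwsem rw_sem[LEVELS];
};

bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

void toy_up_read(struct toy_rwsem *sem)
{
	sem->readers--;
}

bool toy_down_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

void toy_up_write(struct toy_rwsem *sem)
{
	sem->writer = false;
}

/* Writer side: take the level's semaphore for read. */
bool toy_sb_start_write(struct toy_sb_writers *w, int level)
{
	return toy_down_read_trylock(&w->rw_sem[level]);
}

void toy_sb_end_write(struct toy_sb_writers *w, int level)
{
	toy_up_read(&w->rw_sem[level]);
}

/*
 * Freezer side: take every level for write, in order. The toy version
 * backs out and fails where the real code would sleep.
 */
bool toy_freeze_super(struct toy_sb_writers *w)
{
	for (int i = 0; i < LEVELS; i++) {
		if (!toy_down_write_trylock(&w->rw_sem[i])) {
			while (i-- > 0)
				toy_up_write(&w->rw_sem[i]);
			return false;
		}
	}
	w->frozen = true;
	return true;
}

void toy_thaw_super(struct toy_sb_writers *w)
{
	for (int i = LEVELS - 1; i >= 0; i--)
		toy_up_write(&w->rw_sem[i]);
	w->frozen = false;
}
```

With a zero-initialized struct toy_sb_writers, toy_freeze_super() fails
while any toy_sb_start_write() is outstanding, and toy_sb_start_write()
fails between a successful toy_freeze_super() and toy_thaw_super().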

Why:

	- Firstly, __sb_start_write() looks simply buggy. It does
	  __sb_end_write() if it sees ->frozen, but if it migrates
	  to another CPU before percpu_counter_dec(), sb_wait_write()
	  can wrongly succeed if there is another task which holds
	  the same "semaphore": sb_wait_write() can miss the result
	  of the previous percpu_counter_inc() but see the result
	  of this percpu_counter_dec().

	- This code doesn't look simple. It would be better to rely
	  on the generic locking code.

	- __sb_start_write() will be a little bit faster, but this
	  is minor.
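
The race in the first point can be replayed deterministically in userspace.
This is only an illustration under assumptions: the toy_ names, the 3-CPU
layout, and the fixed interleaving are made up for the sketch, and the
per-CPU counter is reduced to a plain array that the "freezer" sums slot
by slot.

```c
#define TOY_NR_CPUS 3

/* Toy per-CPU counter: one slot per CPU, summed without any locking. */
struct toy_percpu_counter {
	long count[TOY_NR_CPUS];
};

/*
 * Replay one bad interleaving and return the sum the freezer observes.
 *
 * Task H is a legitimate writer that incremented on CPU 2.
 * Task T increments on CPU 0, sees ->frozen, migrates to CPU 1 and
 * decrements there.
 */
long toy_racy_sum(struct toy_percpu_counter *c)
{
	long sum = 0;

	/* Task H took the "semaphore" earlier on CPU 2 and still holds it. */
	c->count[2]++;

	/* Freezer starts summing: it reads CPU 0 before T touches it. */
	sum += c->count[0];

	/* Task T: increment on CPU 0, then notice ->frozen. */
	c->count[0]++;

	/* Task T migrates to CPU 1 and undoes its increment there. */
	c->count[1]--;

	/* Freezer continues: it sees T's decrement and H's increment. */
	sum += c->count[1];
	sum += c->count[2];

	return sum;
}

/* The real number of holders once the dust settles. */
long toy_true_sum(const struct toy_percpu_counter *c)
{
	long sum = 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		sum += c->count[cpu];
	return sum;
}
```

Starting from a zeroed counter, the freezer's pass adds up to zero, so it
concludes there are no writers, while the settled total is still one
because task H has not released.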

Todo:

	- __sb_start_write(wait => false) always fails.

	  Trivial, we already have percpu_down_read_trylock(), the
	  patch just hasn't been merged yet.

	- sb_lockdep_release() and sb_lockdep_acquire() play with
	  percpu_rw_semaphore's internals.

	  Trivial, we need a couple of new helpers in percpu-rwsem.c.

	- Fix get_super_thawed(), it will spin if MS_RDONLY...

	  It is not clear to me what exactly we should do, but this
	  doesn't look hard. Perhaps it can just return if MS_RDONLY.

	- Most probably I missed something else, and I do not know
	  how to test this.

Finally. freeze_super() calls synchronize_sched_expedited() 3 times in
a row. This is bad and just stupid. But if we change percpu_rw_semaphore
to use rcu_sync (see https://lkml.org/lkml/2015/7/11/211) we can avoid
this and do synchronize_sched() only once. We just need a few more simple
changes in percpu-rwsem.c, so that all sb_writers->rw_sem[] semaphores
can share a single sb_writers->rss.
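
To see why sharing one rcu_sync state would save grace periods, here is a
toy counting model. It is a sketch under assumptions, not the rcu_sync
implementation: the toy_ names are hypothetical, a "grace period" is just a
counter bump, and the only behavior modeled is that the first writer to
enter an idle state has to wait while nested entries do not.

```c
#define TOY_LEVELS 3

/* Toy stand-in for rcu_sync: counts how many grace periods were needed. */
struct toy_rcu_sync {
	int gp_count;	/* writers currently inside */
	int syncs;	/* simulated synchronize_sched() calls */
};

void toy_rcu_sync_enter(struct toy_rcu_sync *rss)
{
	/* Only the first writer entering an idle state has to wait. */
	if (rss->gp_count++ == 0)
		rss->syncs++;
}

void toy_rcu_sync_exit(struct toy_rcu_sync *rss)
{
	rss->gp_count--;
}

/* Each semaphore has its own state: one grace period per freeze level. */
int toy_freeze_private_rss(void)
{
	struct toy_rcu_sync rss[TOY_LEVELS] = {{0, 0}};
	int total = 0;

	for (int i = 0; i < TOY_LEVELS; i++)
		toy_rcu_sync_enter(&rss[i]);
	for (int i = 0; i < TOY_LEVELS; i++)
		total += rss[i].syncs;
	return total;
}

/* All semaphores share one state: only the first level pays. */
int toy_freeze_shared_rss(void)
{
	struct toy_rcu_sync rss = {0, 0};

	for (int i = 0; i < TOY_LEVELS; i++)
		toy_rcu_sync_enter(&rss);
	return rss.syncs;
}
```

In this model the per-semaphore variant counts three grace periods for a
freeze and the shared variant counts one.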

In this case destroy_super() needs some modifications too, because
percpu_free_rwsem() will become might_sleep(). But this looks simple too.

Oleg.

 fs/super.c         |  147 +++++++++++++++++++--------------------------------
 include/linux/fs.h |   14 +----
 2 files changed, 58 insertions(+), 103 deletions(-)


