From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Waiman Long <Waiman.Long@hp.com>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
"Chandramouleeswaran, Aswin" <aswin@hp.com>,
"Norton, Scott J" <scott.norton@hp.com>,
George Spelvin <linux@horizon.com>,
John Stoffel <john@stoffel.org>
Subject: Re: [PATCH v3 1/1] dcache: Translating dentry into pathname without taking rename_lock
Date: Sat, 7 Sep 2013 19:07:24 +0100 [thread overview]
Message-ID: <20130907180724.GE13318@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CA+55aFyJ8CvfYzh8LFRY_6tx+GgzVfps1hSwJeSUz=G=FrsgkQ@mail.gmail.com>
On Sat, Sep 07, 2013 at 10:52:02AM -0700, Linus Torvalds wrote:
> So I think we could make a more complicated data structure that looks
> something like this:
>
> struct seqlock_retry {
> unsigned int seq_no;
> int state;
> };
>
> and pass that around. Gcc should do pretty well, especially if we
> inline things (but even if not, small structures that fit in 64 bytes
> generate reasonable code even on 32-bit targets, because gcc knows
> about using two registers for passing data around)..
>
> Then you can make "state" have a retry counter in it, and have a
> negative value mean "I hold the lock for writing". Add a couple of
> helper functions, and you can fairly easily handle the mixed "try for
> reading first, then fall back to writing".
>
> That said, __d_lookup() still shows up as very performance-critical on
> some loads (symlinks in particular cause us to fall out of the RCU
> cases) so I'd like to keep that using the simple pure read case. I
> don't believe you can livelock it, as mentioned. But the other ones
> might well be worth moving to a "fall back to write-locking after <n>
> tries" model. They might all traverse user-specified paths of fairly
> arbitrary depth, no?
>
> So this "seqlock_retry" thing wouldn't _replace_ bare seqlocks, it
> would just be a helper thing for this kind of behavior where we want
> to normally do things with just the read-lock, but want to guarantee
> that we don't live-lock.
>
> Sounds reasonable?
More or less; I just wonder if we are overdesigning here - if we don't
do "repeat more than once", we can simply use the lower bit of seq -
read_seqlock() always returns an even value. So we could do something
like seqretry_and_lock(lock, &seq):
if ((*seq & 1) || !read_seqretry(lock, *seq))
return true;
*seq |= 1;
write_seqlock(lock);
return false;
and seqretry_done(lock, seq):
if (seq & 1)
write_sequnlock(lock);
with these loops turning into
seq = read_seqlock(&rename_lock);
...
if (!seqretry_and_lock(&rename_lock, &seq))
goto again;
...
seqretry_done(&rename_lock);
But I'd really like to understand the existing zoo - in particular, ceph and
cifs users can't be converted to anything of that kind (blocking kmalloc()
can't live under write_seqlock()) and they are _easier_ to livelock than
d_path(), due to the same kmalloc() widening the window. Guys, do we really
care about precisely-sized allocations there?
next prev parent reply other threads:[~2013-09-07 18:07 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-09-06 16:08 [PATCH v3 0/1] dcache: Translating dentry into pathname without taking rename_lock Waiman Long
2013-09-06 16:08 ` [PATCH v3 1/1] " Waiman Long
2013-09-06 20:52 ` Linus Torvalds
2013-09-06 21:05 ` Al Viro
2013-09-06 21:48 ` Linus Torvalds
2013-09-07 0:00 ` Al Viro
2013-09-07 0:19 ` Linus Torvalds
2013-09-07 0:58 ` Linus Torvalds
2013-09-07 3:01 ` Al Viro
2013-09-07 17:32 ` Al Viro
2013-09-08 4:15 ` Ian Kent
2013-09-08 4:58 ` Al Viro
2013-09-08 8:51 ` Ian Kent
2013-09-07 17:52 ` Linus Torvalds
2013-09-07 18:07 ` Al Viro [this message]
2013-09-07 18:53 ` Al Viro
2013-09-09 14:31 ` Waiman Long
2013-09-07 2:24 ` Waiman Long
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130907180724.GE13318@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=Waiman.Long@hp.com \
--cc=aswin@hp.com \
--cc=john@stoffel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@horizon.com \
--cc=scott.norton@hp.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox