From: Vlad Zolotarov <vlad@scalemp.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, Shai@scalemp.com
Subject: Re: [PATCH RESEND] fs: Move bh_cachep to the __read_mostly section
Date: Sun, 10 Jun 2012 12:36:52 +0300 [thread overview]
Message-ID: <55807110.YgdjxJAxpX@vlad> (raw)
In-Reply-To: <20120607170721.474bd8d8.akpm@linux-foundation.org>
On Thursday 07 June 2012 17:07:21 Andrew Morton wrote:
> On Mon, 28 May 2012 14:58:42 +0300
>
> Vlad Zolotarov <vlad@scalemp.com> wrote:
> > From: Shai Fultheim <shai@scalemp.com>
> >
> > bh_cachep is only written to once on initialization, so move it to the
> > __read_mostly section.
> >
> > Signed-off-by: Shai Fultheim <shai@scalemp.com>
> > Signed-off-by: Vlad Zolotarov <vlad@scalemp.com>
> > ---
> >
> > fs/buffer.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/buffer.c b/fs/buffer.c
> > index ad5938c..838a9cf 100644
> > --- a/fs/buffer.c
> > +++ b/fs/buffer.c
> > @@ -3152,7 +3152,7 @@ SYSCALL_DEFINE2(bdflush, int, func, long, data)
> >
> > /*
> >
> > * Buffer-head allocation
> > */
> >
> > -static struct kmem_cache *bh_cachep;
> > +static struct kmem_cache *bh_cachep __read_mostly;
>
> hm, I thought I replied to this earlier, but I can't find that email.
>
> Yes, bh_cachep is read-mostly. In fact it's write-once. But the same
> is true of all kmem_cache*'s. I don't see any particular reason for
> singling out bh_cachep.
>
>
> Alas, I don't see a smart way of addressing this. It's either a
> patchset which adds __read_mostly to all kmem_cache*'s, or a patchset
> which converts all the definitions to use some nasty macro which
> inserts the __read_mostly.
Well, it may be done. However my ability to properly check it is limited as I
have only a certain number of systems to check it on. I can create the patch,
test it in our labs and post it on this mailing list with request to test it
on other platforms (like non-x86 platforms). However we may also hit the
problem u describe below if do so...
>
> And I still have theoretical concerns with __read_mostly. As we
> further sort the storage into read-mostly and write-often sections, the
> density of writes in the write-mostly section increases. IOW, removing
> the read-mostly padding *increase* cross-CPU traffic in the write-often
> scction. IOW2, leaving the read-mostly stuff where it is provides
> beneficial padding to the write-often fields. I don't think it has
> been shown that there will be net gains.
Great explanation! The above actually nicely concludes (maybe u haven't
actually meant that ;)) why defining write-mostly section(s) is pointless. ;)
This is a main topic of this (http://markmail.org/thread/wl4lnjluroqxgabf)
thread between me and Ingo.
However there is a clear motivation to define a read-mostly section(s) just
the same way there is a motivation to put constants separately from non-
constant variables which I don't think anybody argues about. ;)
On the other hand, generally speaking, if we "complete the task" and put ALL
read-mostly variables into a separate section all the variables that would be
left will actually represent the write-mostly section, which we would prefer
not to have (according to u). Yet we are still far from it today... ;)
Unfortunately, we can't consider all types of bad C-code then we define
something like "const" or "__read_mostly". We do our best. And if someone
haven't defined a per-CPU write-mostly variable in order to prevent heavy
cross-CPU traffic in his/her code (like in your example above) we can only try
to fix this code. But I don't think that the existence of such code shell
imply that the whole idea of __read_mostly section is actually bad or useless.
It's this bad C-code that should be fixed and IMHO padding the variables with
constants is not the proper way to fix it...
That's why I think it could be dangerous to go ahead and patch all variables
of a certain sort (like kmem_cache*'s) with __read_mostly as we may mess the
things up in some places (like in your example above) where there should be
done a deeper code analysis than just pattern matching.
So, getting back to the first section of my reply, do u still think we want to
patch all kmem_cache*'s with __read_mostly this time or we would prefer this
to be done incrementally in order to have better regression-ability?
Pls., comment.
thanks in advance,
vlad
next prev parent reply other threads:[~2012-06-10 9:36 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-28 11:58 [PATCH RESEND] fs: Move bh_cachep to the __read_mostly section Vlad Zolotarov
2012-06-07 8:23 ` Vlad Zolotarov
2012-06-08 0:07 ` Andrew Morton
2012-06-10 9:36 ` Vlad Zolotarov [this message]
2012-07-01 11:34 ` Vlad Zolotarov
[not found] ` <CAK-9PRBiqtyiQahunPex9FT2dtKA0k0gv9PR+=sNCXVvn5Bn9Q@mail.gmail.com>
2012-07-02 12:13 ` Vlad Zolotarov
2012-07-02 16:00 ` Chinmay V S
2012-07-02 17:21 ` Vlad Zolotarov
2012-07-02 18:24 ` Chinmay V S
2012-07-03 9:05 ` Vlad Zolotarov
2012-07-03 21:23 ` Andrew Morton
2012-07-04 9:08 ` Vlad Zolotarov
2012-07-03 9:08 ` Vlad Zolotarov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55807110.YgdjxJAxpX@vlad \
--to=vlad@scalemp.com \
--cc=Shai@scalemp.com \
--cc=akpm@linux-foundation.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).