From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: dsterba@suse.cz, Nikolay Borisov <nborisov@suse.com>,
Josef Bacik <josef@toxicpanda.com>, Qu Wenruo <wqu@suse.com>,
linux-btrfs@vger.kernel.org,
Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
David Sterba <dsterba@suse.com>
Subject: Re: [PATCH v2] btrfs: relocation: Fix KASAN reports caused by extended reloc tree lifespan
Date: Thu, 9 Jan 2020 15:37:42 +0100 [thread overview]
Message-ID: <20200109143742.GN3929@twin.jikos.cz> (raw)
In-Reply-To: <85422cb2-e140-563b-fadd-f820354ed156@gmx.com>
On Thu, Jan 09, 2020 at 01:54:34PM +0800, Qu Wenruo wrote:
> >> The way I read it is more like smp_rmb/smp_wmb, but for bits in this
> >> case, so the smp_mb__before/after_atomic was only a syntactic sugar to
> >> match that it's atomic bitops. I realize this could have caused some
> >> confusion, however I still think that some sort of barrier is needed.
> >
> > There's an existing pattern used for serializing set/clear of
> > BTRFS_ROOT_IN_TRANS_SETUP (record_root_in_trans,
> > btrfs_record_root_in_trans).
> >
> > Once upon a time there were barriers like smp_mb__before_clear_bit but
> > they got unified to just smp_mb__before_atomic for all set/clear
> > operations, so I believe I was not all wrong with using them.
> >
> I used to believe the same fairy tail, that mb() works like a flush(),
> but we're not living in a fairy tail.
This sounds strange, by flush I was refering to internal CPU mechanism
that makes all temporary changes visible to other cpus, so you're saying
that this does not work as everybody expects?
> What mb really does is keep certain ordering from happening.
> And ordering means, we have at least *2* different memory accesses.
Sorry but I think you're missing some pieces here. There are 2 memory
accesses: set_bit/clear_bit (writer) and test_bit (reader).
The same could be achieved by a plain variable, that it's a bit brings
more confusion about what barrier should be really used.
The pattern we want to use here is pretty standard. Read barrier before
read and write barrier that separates the data change from the indicator
of the change. If you line up the barriers, there's a clear line between
the data changes and the indicator value.
reloc_root = NULL
smp_wmb smp_rmb()
test_bit()
... here
set_bit(DEAD)
... or here, it'll be always
reloc_root == NULL, but it still could
be before or after set_bit
You should also distinguish between mb() and smp_mb() (and the rmb/wmb).
mb is a unconditional barrier, used to synchronize access to hardware io
ports, suitable in drivers.
We use smp_mb() because this serializes memory among multipe CPUs, when
one changes memory but stores it to some temporary structures, while
other CPUs don't see the effects. I'm sure you've read about that in the
memory barrier docs.
> It's not to ensure every reader get the same result, as there is no way
> to do that. Read can happen whenever they want.
That is true about the 'whenever', but what we need to guarantee here is
that when the read happens the following condition will have view of the
changes implied by the pairing barrier.
> So before we talk about mb, we first need to know which 2 memory
> accesses we're talking about.
> If there is even no 2 memory access, then there is no need for mb().
>
> E.g. for the mb implied by spinlock(), it's not to ensure the spinlock
> counter reader, but to ensure the memory ordering between the spinlock
> counter itself and the memory it's protecting.
>
> So for memory barrier around test_bit(), as long as the compiler is not
> doing reordering, we don't need extra mb.
This is not about compiler.
> And if the compiler is really doing re-ordering for the
> have_reloc_root(), then the compiler is doing something wrong, as that
> would makes the test_bit() meaningless.
next prev parent reply other threads:[~2020-01-09 14:37 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-08 5:12 [PATCH v2] btrfs: relocation: Fix KASAN reports caused by extended reloc tree lifespan Qu Wenruo
2020-01-08 12:28 ` Nikolay Borisov
2020-01-08 12:36 ` Qu WenRuo
2020-01-08 14:55 ` Josef Bacik
2020-01-08 15:03 ` Nikolay Borisov
2020-01-08 15:08 ` David Sterba
2020-01-08 15:11 ` David Sterba
2020-01-09 5:54 ` Qu Wenruo
2020-01-09 14:37 ` David Sterba [this message]
2020-01-10 0:21 ` Qu Wenruo
2020-01-10 0:58 ` Qu Wenruo
2020-01-13 4:41 ` Qu Wenruo
2020-01-13 17:19 ` David Sterba
2020-01-13 19:15 ` Nikolay Borisov
2020-01-08 15:19 ` Josef Bacik
2020-01-09 0:11 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200109143742.GN3929@twin.jikos.cz \
--to=dsterba@suse.cz \
--cc=ce3g8jdj@umail.furryterror.org \
--cc=dsterba@suse.com \
--cc=josef@toxicpanda.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=nborisov@suse.com \
--cc=quwenruo.btrfs@gmx.com \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox