From: Jens Axboe <axboe@kernel.dk>
To: Michael Ellerman <mpe@ellerman.id.au>,
Nicholas Piggin <npiggin@gmail.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: daniel@mariadb.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: Memory coherency issue with IO thread offloading?
Date: Tue, 28 Mar 2023 10:38:03 -0600
Message-ID: <91477fb8-c9d8-53e7-e657-f5d6ba2e276f@kernel.dk>
In-Reply-To: <87a5zxca3t.fsf@mpe.ellerman.id.au>
On 3/28/23 6:51 AM, Michael Ellerman wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>>>> Can the queueing cause the creation of an IO thread (if one does not
>>>> exist, or all blocked?)
>>>
>>> Yep
>>>
>>> Since writing this email, I've gone through a lot of different tests.
>>> Here's a rough listing of what I found:
>>>
>>> - Like using the hack patch, if I just limit the number of IO thread
>>> workers to 1, it seems to pass. At least it runs longer than before,
>>> completing 1000 iterations.
>>>
>>> - If I pin each IO worker to a single CPU, it also passes.
>>>
>>> - If I liberally sprinkle smp_mb() on the io-wq side, the test still
>>> fails. I've added one before queueing the work item and one after, plus
>>> one before the io-wq worker grabs a work item and one after. The full
>>> hammer approach, in other words. This still fails.
>>>
>>> Puzzling... For the "pin each IO worker to a single CPU" case I added
>>> some basic code to try to ensure that a work item queued on CPU X
>>> would be processed by a worker on CPU X, and to a large degree this
>>> does happen. But since the work list is a normal list, it's quite
>>> possible that some other worker finishes its work on CPU Y just in time
>>> to grab the one from CPU X. I checked and this does happen in the test
>>> case, yet it still passes. This may be because I got a bit lucky, but
>>> that seems unlikely with thousands of passes of the test case.
>>>
>>> Another theory is that it's perhaps related to an io-wq worker
>>> being rescheduled on a different CPU. Though again I'm puzzled as to
>>> why the smp_mb() sprinkling didn't fix that, then. I'm going to try
>>> running the test case with JUST the io-wq worker pinning, not caring
>>> about where the work is processed, to see if that does anything.
>>
>> Just pinning each worker to whatever CPU they got created on seemingly
>> fixes the issue too. This does not mean that each worker will process
>> work on the CPU on which it was queued, just that each worker will
>> remain on whatever CPU it originally got created on.
>>
>> Puzzling...
>>
>> Note that it is indeed quite possible that this isn't a ppc issue at
>> all, just shows on ppc. It could be page cache related, or it could even
>> be a bug in mariadb itself.
>
> I tried binary patching every lwsync to hwsync (read/write to full
> barrier) in mariadbd and all the libraries it links. It didn't fix the
> problem.
>
> I also tried switching all the kernel barriers/spin locks to using a
> hwsync, but that also didn't fix it.
>
> It's still possible there's somewhere that currently has no barrier at
> all that needs one, the above would only fix the problem if we have a
> read/write barrier that actually needs to be a full barrier.
>
>
> I also looked at making all TLB invalidates broadcast, regardless of
> whether we think the thread has only been on a single CPU. That didn't
> help, but I'm not sure I got all places where we do TLB invalidates, so
> I'll look at that some more tomorrow.
Thanks, appreciate your testing! I have no new data points since
yesterday, but the key point from then still seems to be that if an io
worker never reschedules onto a different CPU, then the problem doesn't
occur. This could very well be a page cache issue, if it isn't an issue
on the powerpc side...
--
Jens Axboe