From: Brian Foster <bfoster@redhat.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: linux-bcachefs@vger.kernel.org
Subject: Re: [PATCH RFC] bcachefs: use inode as write point index instead of task
Date: Mon, 19 Dec 2022 10:42:09 -0500
Message-ID: <Y6CGUZhqs5ABBIbO@bfoster>
In-Reply-To: <Y5wYjUhTAzhSIvM3@moria.home.lan>

On Fri, Dec 16, 2022 at 02:04:45AM -0500, Kent Overstreet wrote:
> On Thu, Dec 15, 2022 at 07:09:58PM -0500, Kent Overstreet wrote:
> > I do think we could probably be doing something more than using the pid for the
> > writepoint, I've just been waiting until we see specific workloads where the
> > current behaviour falls over or have a specific complaint before designing
> > something new.
>
> Random late night thoughts:
>
> Say we introduce a new object, 'writepoint_handle' or somesuch.
>
> Allocate them when opening a file for write, close them when closing the file.
>
> Then we'd be explicitly picking which writepoint to use when allocating the
> writepoint_handle; it would be easy to add logic for "if there's a writepoint
> which was last used by this process and doesn't currently have any handles
> pointing to it, use that".
>
Ok, but if we alloc the handle at open (or first write or whatever),
we'd still need to potentially keep it around after ->release() (i.e.
userspace close()) while the mapping is dirty and thus still needs to be
written back, right?

If so, perhaps this would need some additional state to track an
"active" writepoint, explicitly defined as a "writepoint with currently
open files" as opposed to simply a handle pointer? IOW, if the task is
no longer writing to the previous file, it's probably Ok to reuse that
writepoint even though the handle might still have a reference..?

But generally I think I get the idea: preserve the current ability for a
single sequential writer to use the same writepoint across N files, but
fall back to a separate writepoint where we otherwise detect multi-file
activity. I think that makes sense, though I'd probably have to think a
bit more about an explicit open() -> close() handle lifecycle and
whether that's robust enough for fileserver-like use cases. I.e., I'd be
a little concerned about whether that workload might make interspersed
sub-file writes look a bit too much like the single user open -> write
-> close -> repeat use case..
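
To make the "active" writepoint idea above a bit more concrete, here is a
rough userspace sketch. Every name in it (wp_sketch, wp_handle_sketch,
open_files, handle_refs, last_owner) is made up for illustration and none of
it is existing bcachefs code. It only shows the distinction between "a handle
still references this writepoint" (e.g. dirty pagecache after ->release())
and "a file using this writepoint is still open", with reuse keyed off the
latter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the bcachefs structures. */
struct wp_sketch {
	unsigned	handle_refs;	/* live handles, incl. ones kept for dirty mappings */
	unsigned	open_files;	/* handles whose file is still open */
	int		last_owner;	/* id of the task that last attached a handle */
};

struct wp_handle_sketch {
	struct wp_sketch	*wp;
	bool			file_open;
};

/* open-for-write (or first write): attach a handle to a writepoint */
static void wp_handle_open(struct wp_handle_sketch *h,
			   struct wp_sketch *wp, int task)
{
	h->wp		= wp;
	h->file_open	= true;
	wp->handle_refs++;
	wp->open_files++;
	wp->last_owner	= task;
}

/* ->release(): the file is closed, but dirty pages may remain, keep the ref */
static void wp_handle_release(struct wp_handle_sketch *h)
{
	assert(h->file_open);
	h->file_open = false;
	h->wp->open_files--;
}

/* file closed and writeback done: drop the reference */
static void wp_handle_put(struct wp_handle_sketch *h)
{
	assert(!h->file_open);
	h->wp->handle_refs--;
	h->wp = NULL;
}

/* Reuse depends on open files, not on outstanding handle references. */
static bool wp_reusable_by(const struct wp_sketch *wp, int task)
{
	return wp->last_owner == task && wp->open_files == 0;
}
```

With this split, a task doing open -> write -> close -> repeat gets the same
writepoint back on each open, even if writeback of the previous file hasn't
finished yet. It does nothing about the fileserver-like concern, since
interspersed short-lived opens would look the same to wp_reusable_by().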
> One of the things that needs to be considered is - what do we do when there's
> more writepoint_handles than writepoints?
>
Does bcachefs have to deal with something like that today? For example
if there is some max number of writepoints, what happens if there might
be some greater number of tasks doing allocations at the same time?

Brian

> bcache has some logic for this by tracking when a writepoint was used, and if we
> don't find a writepoint that matches up with the IO being issued - pick the
> oldest one off an LRU queue. Was dropped in bcachefs because the straight hash
> table seemed to work just as well and was faster - or maybe I'm thinking of the
> sequential bypass data structure?
>
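
For reference, the LRU fallback described in the quoted text above could look
roughly like the following. This is only a sketch under assumptions: the
struct, the logical clock and the function name are invented here and are not
taken from bcache or bcachefs. It looks for a writepoint matching the key of
the IO being issued and, if none matches, steals the least recently used one:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical writepoint slot; not the bcache/bcachefs structure. */
struct wp_lru_sketch {
	uint64_t	key;		/* what this writepoint is tied to, 0 = unused */
	uint64_t	last_used;	/* logical clock value of the last use */
};

/*
 * Return the writepoint matching @key, or repurpose the least recently
 * used one if nothing matches. @now is a monotonically increasing clock.
 */
static struct wp_lru_sketch *
wp_find_or_steal(struct wp_lru_sketch *wps, size_t nr,
		 uint64_t key, uint64_t now)
{
	struct wp_lru_sketch *oldest = &wps[0];

	assert(nr > 0 && key != 0);

	for (size_t i = 0; i < nr; i++) {
		if (wps[i].key == key) {
			wps[i].last_used = now;
			return &wps[i];
		}
		if (wps[i].last_used < oldest->last_used)
			oldest = &wps[i];
	}

	/* No match: the oldest writepoint is reassigned to this key. */
	oldest->key		= key;
	oldest->last_used	= now;
	return oldest;
}
```

A linear scan keeps the sketch short; with more handles (or tasks) than
writepoints, the least recently used slot simply gets reassigned on a miss.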