From: Kundan Kumar <kundan.kumar@samsung.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@lst.de>,
lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
anuj20.g@samsung.com, mcgrof@kernel.org, joshi.k@samsung.com,
axboe@kernel.dk, clm@meta.com, willy@infradead.org,
gost.dev@samsung.com
Subject: Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Parallelizing filesystem writeback
Date: Thu, 20 Feb 2025 19:49:22 +0530 [thread overview]
Message-ID: <20250220141824.ju5va75s3xp472cd@green245> (raw)
In-Reply-To: <Z6qkLjSj1K047yPt@dread.disaster.area>
> Well, that's currently selected by __inode_attach_wb() based on
> whether there is a memcg attached to the folio/task being dirtied or
> not. If there isn't a cgroup based writeback task, then it uses the
> bdi->wb as the wb context.
We have created a proof of concept for per-AG context-based writeback, as
described in [1]. Each AG is mapped to a writeback context (wb_ctx), and
__mark_inode_dirty() uses a filesystem-provided handler to select the
writeback context corresponding to the inode.
We attempted to handle memcg- and bdi-based writeback in a similar manner.
This approach aims to preserve the original writeback semantics while
adding parallelism, pushing more data to the device earlier and easing
write pressure faster.
[1] https://lore.kernel.org/all/20250212103634.448437-1-kundan.kumar@samsung.com/
> Then selecting inodes for writeback becomes a list_lru_walk()
> variant depending on what needs to be written back (e.g. physical
> node, memcg, both, everything that is dirty everywhere, etc).
We considered using list_lru to track inodes within a writeback context.
This can be implemented as:
struct bdi_writeback {
	struct list_lru	b_dirty_inodes_lru;	/* instead of a single b_dirty list */
	struct list_lru	b_io_dirty_inodes_lru;
	...
};
By doing this, we would obtain a sharded list of inodes per NUMA node.
However, we would also need per-NUMA writeback contexts; otherwise,
even if inodes are NUMA-sharded, a single writeback context would still
process them sequentially, limiting parallelism. But there is a concern:
NUMA-based writeback contexts are not aligned with filesystem geometry,
which could hurt delayed allocation and writeback efficiency, as you
pointed out in your previous reply [2].
Would it be better to let the filesystem dictate the number of writeback
threads, rather than enforcing a per-NUMA model?
Do you see it differently?
[2] https://lore.kernel.org/all/Z5qw_1BOqiFum5Dn@dread.disaster.area/