* [PATCH 0/3] filemap_add_folio_nocharge()
@ 2025-08-06 0:11 Boris Burkov
2025-08-06 0:11 ` [PATCH 1/3] mm/filemap: add filemap_add_folio_nocharge() Boris Burkov
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Boris Burkov @ 2025-08-06 0:11 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, kernel-team; +Cc: shakeel.butt, hch, wqu
I would like to revisit Qu's proposal to not charge btrfs extent_buffer
allocations to the user's cgroup.
https://lore.kernel.org/linux-mm/b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@suse.com/
I believe it is detrimental to randomly account these global pages to
the cgroup using them, basically at random. A bit more justification
and explanation in the patches themselves.
Three meta-considerations/questions:
1. Which tree should this go through, assuming it is acceptable?
For now, I have based it off btrfs/for-next as that is what I am
used to doing, but I am happy to re-send it based off the appropriate mm
branch.
2. Christoph wrote the first patch as-is in his suggestion to Qu. I am happy
to replace it with his authorship/s-o-b, I just didn't want to do that
without asking. For now, I put his "Suggested-by".
3. The previous suggestion also requested "proper" documentation. I don't
know what that entails in this case, and was unable to find corresponding
documentation for filemap_add_folio() in the code or in Documentation/.
Please let me know what I should be doing there, as well.
Boris Burkov (3):
mm/filemap: add filemap_add_folio_nocharge()
btrfs: use filemap_add_folio_nocharge() for extent_buffers
mm: add vmstat for cgroup uncharged pages
fs/btrfs/extent_io.c | 4 ++--
include/linux/mmzone.h | 3 +++
include/linux/pagemap.h | 2 ++
mm/filemap.c | 41 +++++++++++++++++++++++++++++++++++------
mm/vmstat.c | 3 +++
5 files changed, 45 insertions(+), 8 deletions(-)
--
2.50.1
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/3] mm/filemap: add filemap_add_folio_nocharge()
2025-08-06 0:11 [PATCH 0/3] filemap_add_folio_nocharge() Boris Burkov
@ 2025-08-06 0:11 ` Boris Burkov
2025-08-06 0:11 ` [PATCH 2/3] btrfs: use filemap_add_folio_nocharge() for extent_buffers Boris Burkov
` (2 subsequent siblings)
3 siblings, 0 replies; 9+ messages in thread
From: Boris Burkov @ 2025-08-06 0:11 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, kernel-team; +Cc: shakeel.butt, hch, wqu
Btrfs currently tracks its metadata pages in the page cache, using a
fake inode (fs_info->btree_inode) with offsets corresponding to where
the metadata is stored in the filesystem's full logical address space.
A consequence of this is that when btrfs uses filemap_add_folio(), this
usage is charged to the cgroup of whichever task happens to be running
at the time. These folios don't belong to any particular user cgroup, so
I don't think it makes much sense for them to be charged in that way.
Some negative consequences as a result:
- A task can be holding some important btrfs locks, then need to lookup
some metadata and go into reclaim, extending the duration it holds
that lock for, and unfairly pushing its own reclaim pain onto other
cgroups.
- If that cgroup goes into reclaim, it might reclaim these folios a
different non-reclaiming cgroup might need soon. This is naturally
offset by LRU reclaim, but still.
A very similar proposal to use the root cgroup was previously made by
Qu, but he was advised by Christoph to instead introduce a _nocharge
variant of filemap_add_folio.
Link: https://lore.kernel.org/linux-mm/b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@suse.com/
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Boris Burkov <boris@bur.io>
---
include/linux/pagemap.h | 2 ++
mm/filemap.c | 23 +++++++++++++++++------
2 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e63fbfbd5b0f..acc8d390ecbb 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -1237,6 +1237,8 @@ size_t fault_in_readable(const char __user *uaddr, size_t size);
int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
pgoff_t index, gfp_t gfp);
+int filemap_add_folio_nocharge(struct address_space *mapping,
+ struct folio *folio, pgoff_t index, gfp_t gfp);
int filemap_add_folio(struct address_space *mapping, struct folio *folio,
pgoff_t index, gfp_t gfp);
void filemap_remove_folio(struct folio *folio);
diff --git a/mm/filemap.c b/mm/filemap.c
index bada249b9fb7..ccc9cfb4d418 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -955,20 +955,15 @@ noinline int __filemap_add_folio(struct address_space *mapping,
}
ALLOW_ERROR_INJECTION(__filemap_add_folio, ERRNO);
-int filemap_add_folio(struct address_space *mapping, struct folio *folio,
+int filemap_add_folio_nocharge(struct address_space *mapping, struct folio *folio,
pgoff_t index, gfp_t gfp)
{
void *shadow = NULL;
int ret;
- ret = mem_cgroup_charge(folio, NULL, gfp);
- if (ret)
- return ret;
-
__folio_set_locked(folio);
ret = __filemap_add_folio(mapping, folio, index, gfp, &shadow);
if (unlikely(ret)) {
- mem_cgroup_uncharge(folio);
__folio_clear_locked(folio);
} else {
/*
@@ -986,6 +981,22 @@ int filemap_add_folio(struct address_space *mapping, struct folio *folio,
}
return ret;
}
+EXPORT_SYMBOL_GPL(filemap_add_folio_nocharge);
+
+int filemap_add_folio(struct address_space *mapping, struct folio *folio,
+ pgoff_t index, gfp_t gfp)
+{
+ int ret;
+
+ ret = mem_cgroup_charge(folio, NULL, gfp);
+ if (ret)
+ return ret;
+
+ ret = filemap_add_folio_nocharge(mapping, folio, index, gfp);
+ if (ret)
+ mem_cgroup_uncharge(folio);
+ return ret;
+}
EXPORT_SYMBOL_GPL(filemap_add_folio);
#ifdef CONFIG_NUMA
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] btrfs: use filemap_add_folio_nocharge() for extent_buffers
2025-08-06 0:11 [PATCH 0/3] filemap_add_folio_nocharge() Boris Burkov
2025-08-06 0:11 ` [PATCH 1/3] mm/filemap: add filemap_add_folio_nocharge() Boris Burkov
@ 2025-08-06 0:11 ` Boris Burkov
2025-08-06 0:11 ` [PATCH 3/3] mm: add vmstat for cgroup uncharged pages Boris Burkov
2025-08-06 14:00 ` [PATCH 0/3] filemap_add_folio_nocharge() Matthew Wilcox
3 siblings, 0 replies; 9+ messages in thread
From: Boris Burkov @ 2025-08-06 0:11 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, kernel-team; +Cc: shakeel.butt, hch, wqu
Don't block on cgroup reclaim or subject shared extent_buffers to cgroup
reclaim.
Signed-off-by: Boris Burkov <boris@bur.io>
---
fs/btrfs/extent_io.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f9ccbe5a203a..1f25447f1e39 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3258,8 +3258,8 @@ static int attach_eb_folio_to_filemap(struct extent_buffer *eb, int i,
retry:
existing_folio = NULL;
- ret = filemap_add_folio(mapping, eb->folios[i], index + i,
- GFP_NOFS | __GFP_NOFAIL);
+ ret = filemap_add_folio_nocharge(mapping, eb->folios[i], index + i,
+ GFP_NOFS | __GFP_NOFAIL);
if (!ret)
goto finish;
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] mm: add vmstat for cgroup uncharged pages
2025-08-06 0:11 [PATCH 0/3] filemap_add_folio_nocharge() Boris Burkov
2025-08-06 0:11 ` [PATCH 1/3] mm/filemap: add filemap_add_folio_nocharge() Boris Burkov
2025-08-06 0:11 ` [PATCH 2/3] btrfs: use filemap_add_folio_nocharge() for extent_buffers Boris Burkov
@ 2025-08-06 0:11 ` Boris Burkov
2025-08-06 20:40 ` kernel test robot
` (2 more replies)
2025-08-06 14:00 ` [PATCH 0/3] filemap_add_folio_nocharge() Matthew Wilcox
3 siblings, 3 replies; 9+ messages in thread
From: Boris Burkov @ 2025-08-06 0:11 UTC (permalink / raw)
To: linux-btrfs, linux-mm, linux-fsdevel, kernel-team; +Cc: shakeel.butt, hch, wqu
If cgroups are configured into the kernel, then uncharged pages can only
come from filemap_add_folio_nocharge. Track such uncharged folios in
vmstat so that they are accounted for.
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Signed-off-by: Boris Burkov <boris@bur.io>
---
include/linux/mmzone.h | 3 +++
mm/filemap.c | 18 ++++++++++++++++++
mm/vmstat.c | 3 +++
3 files changed, 24 insertions(+)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 283913d42d7b..a945dec65371 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -241,6 +241,9 @@ enum node_stat_item {
NR_HUGETLB,
#endif
NR_BALLOON_PAGES,
+#ifdef CONFIG_MEMCG
+ NR_UNCHARGED_FILE_PAGES,
+#endif
NR_VM_NODE_STAT_ITEMS
};
diff --git a/mm/filemap.c b/mm/filemap.c
index ccc9cfb4d418..0a258b4a9246 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -146,6 +146,22 @@ static void page_cache_delete(struct address_space *mapping,
mapping->nrpages -= nr;
}
+#ifdef CONFIG_MEMCG
+static void filemap_mod_uncharged_vmstat(struct folio *folio, int sign)
+{
+ long nr = folio_nr_pages(folio) * sign;
+
+ if (!folio_memcg(folio))
+ __lruvec_stat_mod_folio(folio, NR_UNCHARGED_FILE_PAGES, nr);
+}
+#else
+static void filemap_mod_uncharged_cgroup_vmstat(struct folio *folio, int sign)
+{
+ return;
+}
+#endif
+
+
static void filemap_unaccount_folio(struct address_space *mapping,
struct folio *folio)
{
@@ -190,6 +206,7 @@ static void filemap_unaccount_folio(struct address_space *mapping,
__lruvec_stat_mod_folio(folio, NR_FILE_THPS, -nr);
filemap_nr_thps_dec(mapping);
}
+ filemap_mod_uncharged_vmstat(folio, -1);
/*
* At this point folio must be either written or cleaned by
@@ -978,6 +995,7 @@ int filemap_add_folio_nocharge(struct address_space *mapping, struct folio *foli
if (!(gfp & __GFP_WRITE) && shadow)
workingset_refault(folio, shadow);
folio_add_lru(folio);
+ filemap_mod_uncharged_vmstat(folio, 1);
}
return ret;
}
diff --git a/mm/vmstat.c b/mm/vmstat.c
index a78d70ddeacd..63318742ae5a 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1281,6 +1281,9 @@ const char * const vmstat_text[] = {
"nr_hugetlb",
#endif
"nr_balloon_pages",
+#ifdef CONFIG_MEMCG
+ "nr_uncharged_file_pages",
+#endif
/* system-wide enum vm_stat_item counters */
"nr_dirty_threshold",
"nr_dirty_background_threshold",
--
2.50.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 0/3] filemap_add_folio_nocharge()
2025-08-06 0:11 [PATCH 0/3] filemap_add_folio_nocharge() Boris Burkov
` (2 preceding siblings ...)
2025-08-06 0:11 ` [PATCH 3/3] mm: add vmstat for cgroup uncharged pages Boris Burkov
@ 2025-08-06 14:00 ` Matthew Wilcox
2025-08-06 23:19 ` Shakeel Butt
3 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2025-08-06 14:00 UTC (permalink / raw)
To: Boris Burkov
Cc: linux-btrfs, linux-mm, linux-fsdevel, kernel-team, shakeel.butt,
hch, wqu
On Tue, Aug 05, 2025 at 05:11:46PM -0700, Boris Burkov wrote:
> I would like to revisit Qu's proposal to not charge btrfs extent_buffer
> allocations to the user's cgroup.
>
> https://lore.kernel.org/linux-mm/b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@suse.com/
I prefer Qu's suggestion to add a flag to the address_space. This really
is a property of the address_space, not a property of the call-site.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] mm: add vmstat for cgroup uncharged pages
2025-08-06 0:11 ` [PATCH 3/3] mm: add vmstat for cgroup uncharged pages Boris Burkov
@ 2025-08-06 20:40 ` kernel test robot
2025-08-06 20:51 ` kernel test robot
2025-08-07 17:23 ` Shakeel Butt
2 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-08-06 20:40 UTC (permalink / raw)
To: Boris Burkov, linux-btrfs, linux-mm, linux-fsdevel, kernel-team
Cc: oe-kbuild-all, shakeel.butt, hch, wqu
Hi Boris,
kernel test robot noticed the following build warnings:
[auto build test WARNING on kdave/for-next]
[also build test WARNING on v6.16]
[cannot apply to akpm-mm/mm-everything linus/master next-20250806]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Boris-Burkov/mm-filemap-add-filemap_add_folio_nocharge/20250806-130147
base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link: https://lore.kernel.org/r/eae30d630ba07de8966d09a3e1700f53715980c2.1754438418.git.boris%40bur.io
patch subject: [PATCH 3/3] mm: add vmstat for cgroup uncharged pages
config: i386-buildonly-randconfig-003-20250807 (https://download.01.org/0day-ci/archive/20250807/202508070434.6EpRsRF3-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14+deb12u1) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250807/202508070434.6EpRsRF3-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508070434.6EpRsRF3-lkp@intel.com/
All warnings (new ones prefixed by >>):
mm/filemap.c: In function 'filemap_unaccount_folio':
mm/filemap.c:209:9: error: implicit declaration of function 'filemap_mod_uncharged_vmstat'; did you mean 'filemap_mod_uncharged_cgroup_vmstat'? [-Werror=implicit-function-declaration]
209 | filemap_mod_uncharged_vmstat(folio, -1);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| filemap_mod_uncharged_cgroup_vmstat
mm/filemap.c: At top level:
>> mm/filemap.c:158:13: warning: 'filemap_mod_uncharged_cgroup_vmstat' defined but not used [-Wunused-function]
158 | static void filemap_mod_uncharged_cgroup_vmstat(struct folio *folio, int sign)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/filemap_mod_uncharged_cgroup_vmstat +158 mm/filemap.c
148
149 #ifdef CONFIG_MEMCG
150 static void filemap_mod_uncharged_vmstat(struct folio *folio, int sign)
151 {
152 long nr = folio_nr_pages(folio) * sign;
153
154 if (!folio_memcg(folio))
155 __lruvec_stat_mod_folio(folio, NR_UNCHARGED_FILE_PAGES, nr);
156 }
157 #else
> 158 static void filemap_mod_uncharged_cgroup_vmstat(struct folio *folio, int sign)
159 {
160 return;
161 }
162 #endif
163
164
165 static void filemap_unaccount_folio(struct address_space *mapping,
166 struct folio *folio)
167 {
168 long nr;
169
170 VM_BUG_ON_FOLIO(folio_mapped(folio), folio);
171 if (!IS_ENABLED(CONFIG_DEBUG_VM) && unlikely(folio_mapped(folio))) {
172 pr_alert("BUG: Bad page cache in process %s pfn:%05lx\n",
173 current->comm, folio_pfn(folio));
174 dump_page(&folio->page, "still mapped when deleted");
175 dump_stack();
176 add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
177
178 if (mapping_exiting(mapping) && !folio_test_large(folio)) {
179 int mapcount = folio_mapcount(folio);
180
181 if (folio_ref_count(folio) >= mapcount + 2) {
182 /*
183 * All vmas have already been torn down, so it's
184 * a good bet that actually the page is unmapped
185 * and we'd rather not leak it: if we're wrong,
186 * another bad page check should catch it later.
187 */
188 atomic_set(&folio->_mapcount, -1);
189 folio_ref_sub(folio, mapcount);
190 }
191 }
192 }
193
194 /* hugetlb folios do not participate in page cache accounting. */
195 if (folio_test_hugetlb(folio))
196 return;
197
198 nr = folio_nr_pages(folio);
199
200 __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, -nr);
201 if (folio_test_swapbacked(folio)) {
202 __lruvec_stat_mod_folio(folio, NR_SHMEM, -nr);
203 if (folio_test_pmd_mappable(folio))
204 __lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, -nr);
205 } else if (folio_test_pmd_mappable(folio)) {
206 __lruvec_stat_mod_folio(folio, NR_FILE_THPS, -nr);
207 filemap_nr_thps_dec(mapping);
208 }
> 209 filemap_mod_uncharged_vmstat(folio, -1);
210
211 /*
212 * At this point folio must be either written or cleaned by
213 * truncate. Dirty folio here signals a bug and loss of
214 * unwritten data - on ordinary filesystems.
215 *
216 * But it's harmless on in-memory filesystems like tmpfs; and can
217 * occur when a driver which did get_user_pages() sets page dirty
218 * before putting it, while the inode is being finally evicted.
219 *
220 * Below fixes dirty accounting after removing the folio entirely
221 * but leaves the dirty flag set: it has no effect for truncated
222 * folio and anyway will be cleared before returning folio to
223 * buddy allocator.
224 */
225 if (WARN_ON_ONCE(folio_test_dirty(folio) &&
226 mapping_can_writeback(mapping)))
227 folio_account_cleaned(folio, inode_to_wb(mapping->host));
228 }
229
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] mm: add vmstat for cgroup uncharged pages
2025-08-06 0:11 ` [PATCH 3/3] mm: add vmstat for cgroup uncharged pages Boris Burkov
2025-08-06 20:40 ` kernel test robot
@ 2025-08-06 20:51 ` kernel test robot
2025-08-07 17:23 ` Shakeel Butt
2 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-08-06 20:51 UTC (permalink / raw)
To: Boris Burkov, linux-btrfs, linux-mm, linux-fsdevel, kernel-team
Cc: llvm, oe-kbuild-all, shakeel.butt, hch, wqu
Hi Boris,
kernel test robot noticed the following build errors:
[auto build test ERROR on kdave/for-next]
[also build test ERROR on v6.16]
[cannot apply to akpm-mm/mm-everything linus/master next-20250806]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Boris-Burkov/mm-filemap-add-filemap_add_folio_nocharge/20250806-130147
base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link: https://lore.kernel.org/r/eae30d630ba07de8966d09a3e1700f53715980c2.1754438418.git.boris%40bur.io
patch subject: [PATCH 3/3] mm: add vmstat for cgroup uncharged pages
config: i386-buildonly-randconfig-002-20250807 (https://download.01.org/0day-ci/archive/20250807/202508070450.bjCshoYI-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250807/202508070450.bjCshoYI-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202508070450.bjCshoYI-lkp@intel.com/
All errors (new ones prefixed by >>):
>> mm/filemap.c:209:2: error: call to undeclared function 'filemap_mod_uncharged_vmstat'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
209 | filemap_mod_uncharged_vmstat(folio, -1);
| ^
mm/filemap.c:209:2: note: did you mean 'filemap_mod_uncharged_cgroup_vmstat'?
mm/filemap.c:158:13: note: 'filemap_mod_uncharged_cgroup_vmstat' declared here
158 | static void filemap_mod_uncharged_cgroup_vmstat(struct folio *folio, int sign)
| ^
mm/filemap.c:998:3: error: call to undeclared function 'filemap_mod_uncharged_vmstat'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
998 | filemap_mod_uncharged_vmstat(folio, 1);
| ^
2 errors generated.
vim +/filemap_mod_uncharged_vmstat +209 mm/filemap.c
163
164
165 static void filemap_unaccount_folio(struct address_space *mapping,
166 struct folio *folio)
167 {
168 long nr;
169
170 VM_BUG_ON_FOLIO(folio_mapped(folio), folio);
171 if (!IS_ENABLED(CONFIG_DEBUG_VM) && unlikely(folio_mapped(folio))) {
172 pr_alert("BUG: Bad page cache in process %s pfn:%05lx\n",
173 current->comm, folio_pfn(folio));
174 dump_page(&folio->page, "still mapped when deleted");
175 dump_stack();
176 add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
177
178 if (mapping_exiting(mapping) && !folio_test_large(folio)) {
179 int mapcount = folio_mapcount(folio);
180
181 if (folio_ref_count(folio) >= mapcount + 2) {
182 /*
183 * All vmas have already been torn down, so it's
184 * a good bet that actually the page is unmapped
185 * and we'd rather not leak it: if we're wrong,
186 * another bad page check should catch it later.
187 */
188 atomic_set(&folio->_mapcount, -1);
189 folio_ref_sub(folio, mapcount);
190 }
191 }
192 }
193
194 /* hugetlb folios do not participate in page cache accounting. */
195 if (folio_test_hugetlb(folio))
196 return;
197
198 nr = folio_nr_pages(folio);
199
200 __lruvec_stat_mod_folio(folio, NR_FILE_PAGES, -nr);
201 if (folio_test_swapbacked(folio)) {
202 __lruvec_stat_mod_folio(folio, NR_SHMEM, -nr);
203 if (folio_test_pmd_mappable(folio))
204 __lruvec_stat_mod_folio(folio, NR_SHMEM_THPS, -nr);
205 } else if (folio_test_pmd_mappable(folio)) {
206 __lruvec_stat_mod_folio(folio, NR_FILE_THPS, -nr);
207 filemap_nr_thps_dec(mapping);
208 }
> 209 filemap_mod_uncharged_vmstat(folio, -1);
210
211 /*
212 * At this point folio must be either written or cleaned by
213 * truncate. Dirty folio here signals a bug and loss of
214 * unwritten data - on ordinary filesystems.
215 *
216 * But it's harmless on in-memory filesystems like tmpfs; and can
217 * occur when a driver which did get_user_pages() sets page dirty
218 * before putting it, while the inode is being finally evicted.
219 *
220 * Below fixes dirty accounting after removing the folio entirely
221 * but leaves the dirty flag set: it has no effect for truncated
222 * folio and anyway will be cleared before returning folio to
223 * buddy allocator.
224 */
225 if (WARN_ON_ONCE(folio_test_dirty(folio) &&
226 mapping_can_writeback(mapping)))
227 folio_account_cleaned(folio, inode_to_wb(mapping->host));
228 }
229
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 0/3] filemap_add_folio_nocharge()
2025-08-06 14:00 ` [PATCH 0/3] filemap_add_folio_nocharge() Matthew Wilcox
@ 2025-08-06 23:19 ` Shakeel Butt
0 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-08-06 23:19 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Boris Burkov, linux-btrfs, linux-mm, linux-fsdevel, kernel-team,
hch, wqu, hannes, mhocko, roman.gushchin, muchun.song, cgroups
CCing memcg maintainers.
On Wed, Aug 06, 2025 at 03:00:43PM +0100, Matthew Wilcox wrote:
> On Tue, Aug 05, 2025 at 05:11:46PM -0700, Boris Burkov wrote:
> > I would like to revisit Qu's proposal to not charge btrfs extent_buffer
> > allocations to the user's cgroup.
> >
> > https://lore.kernel.org/linux-mm/b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@suse.com/
>
> I prefer Qu's suggestion to add a flag to the address_space. This really
> is a property of the address_space, not a property of the call-site.
>
I think Michal wanted call-site (explicit interface) for easy
searchability. I don't have a strong opinion either way. However I
wonder if having this information in address_space might be more useful.
At the moment, this series is using !folio_memcg(folio) to detect if the
folio has skipped memcg charging on the free path to decrement the stat.
Having information in address_space would be another way to extract that
information. I need to think more on this.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 3/3] mm: add vmstat for cgroup uncharged pages
2025-08-06 0:11 ` [PATCH 3/3] mm: add vmstat for cgroup uncharged pages Boris Burkov
2025-08-06 20:40 ` kernel test robot
2025-08-06 20:51 ` kernel test robot
@ 2025-08-07 17:23 ` Shakeel Butt
2 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2025-08-07 17:23 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs, linux-mm, linux-fsdevel, kernel-team, hch, wqu
On Tue, Aug 05, 2025 at 05:11:49PM -0700, Boris Burkov wrote:
> If cgroups are configured into the kernel, then uncharged pages can only
> come from filemap_add_folio_nocharge. Track such uncharged folios in
> vmstat so that they are accounted for.
>
> Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
> include/linux/mmzone.h | 3 +++
> mm/filemap.c | 18 ++++++++++++++++++
> mm/vmstat.c | 3 +++
> 3 files changed, 24 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 283913d42d7b..a945dec65371 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -241,6 +241,9 @@ enum node_stat_item {
> NR_HUGETLB,
> #endif
> NR_BALLOON_PAGES,
> +#ifdef CONFIG_MEMCG
> + NR_UNCHARGED_FILE_PAGES,
> +#endif
> NR_VM_NODE_STAT_ITEMS
> };
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index ccc9cfb4d418..0a258b4a9246 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -146,6 +146,22 @@ static void page_cache_delete(struct address_space *mapping,
> mapping->nrpages -= nr;
> }
>
> +#ifdef CONFIG_MEMCG
> +static void filemap_mod_uncharged_vmstat(struct folio *folio, int sign)
> +{
> + long nr = folio_nr_pages(folio) * sign;
> +
> + if (!folio_memcg(folio))
> + __lruvec_stat_mod_folio(folio, NR_UNCHARGED_FILE_PAGES, nr);
From filemap_add_folio_nocharge(), this function is called without
preemption disabled, so you will get lockdep warning from following
chain:
__lruvec_stat_mod_folio -> __mod_node_page_state ->
preempt_disable_nested -> lockdep_assert_preemption_disabled().
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-08-07 17:23 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-06 0:11 [PATCH 0/3] filemap_add_folio_nocharge() Boris Burkov
2025-08-06 0:11 ` [PATCH 1/3] mm/filemap: add filemap_add_folio_nocharge() Boris Burkov
2025-08-06 0:11 ` [PATCH 2/3] btrfs: use filemap_add_folio_nocharge() for extent_buffers Boris Burkov
2025-08-06 0:11 ` [PATCH 3/3] mm: add vmstat for cgroup uncharged pages Boris Burkov
2025-08-06 20:40 ` kernel test robot
2025-08-06 20:51 ` kernel test robot
2025-08-07 17:23 ` Shakeel Butt
2025-08-06 14:00 ` [PATCH 0/3] filemap_add_folio_nocharge() Matthew Wilcox
2025-08-06 23:19 ` Shakeel Butt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).