linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v4 0/3] protect page cache from freeing inode
@ 2020-02-23  9:31 Yafang Shao
From: Yafang Shao @ 2020-02-23  9:31 UTC
  To: dchinner, hannes, mhocko, vdavydov.dev, guro, akpm, viro
  Cc: linux-mm, linux-fsdevel, Yafang Shao

On my server there are some running MEMCGs protected by memory.{min, low},
but I found that the usage of these MEMCGs abruptly dropped far below
their protection limits. It confused me at first, and I finally found
that the cause was inode stealing.
Once an inode is freed, all of its page cache is dropped as well, no
matter how many pages it holds. So if we intend to protect the page
cache in a memcg, we must protect its host (the inode) first. Otherwise
the memcg protection can easily be bypassed by freeing the inode,
especially if there are big files in this memcg.
The inherent mismatch between memcg and inode is troublesome. One inode
can be shared by different MEMCGs, though that is a very rare case. If an
inode is shared, its page cache may be charged to different MEMCGs.
Currently there is no perfect solution for this kind of issue, but the
inode majority-writer ownership switching can mitigate it to some extent.
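
To illustrate the decision described above, here is a minimal user-space
sketch. It is a simplified model under stated assumptions, not the code in
this patchset: every name in it (model_memcg, model_inode, model_isolate)
is hypothetical, and the kernel's real effective-protection calculation is
more involved than a plain comparison against min and low.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
struct model_memcg {
	unsigned long usage;	/* pages currently charged to the memcg */
	unsigned long min;	/* memory.min, in pages */
	unsigned long low;	/* memory.low, in pages */
};

struct model_inode {
	unsigned long nrpages;	/* page cache pages attached to the inode */
};

enum model_decision { MODEL_FREE, MODEL_SKIP };

/*
 * Decide whether an unused inode may be freed.
 *
 * memcg is the cgroup whose LRU list is being walked (NULL when the walk
 * is not memcg aware). low_reclaim models the case where reclaim has been
 * allowed to dip below memory.low.
 */
static enum model_decision
model_isolate(const struct model_inode *inode,
	      const struct model_memcg *memcg, bool low_reclaim)
{
	assert(inode != NULL);

	/* No memcg context, or nothing cached: freeing drops no page cache. */
	if (memcg == NULL || inode->nrpages == 0)
		return MODEL_FREE;

	/* At or below memory.min: the page cache must not be dropped. */
	if (memcg->usage <= memcg->min)
		return MODEL_SKIP;

	/* At or below memory.low: skip unless low reclaim is permitted. */
	if (!low_reclaim && memcg->usage <= memcg->low)
		return MODEL_SKIP;

	return MODEL_FREE;
}
```

In this model an inode that still holds page cache is skipped while its
memcg is within its protection, so freeing the inode can no longer drop
protected pages as a side effect.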

- Changes against v3:
Fix the possible risk pointed out by Johannes in another patchset [1].
Per the discussion with Johannes in that mail thread, I found that the
issue Johannes is trying to fix is different from the issue I'm trying to
fix. That's why I updated this patchset and am posting it again. This
specific memcg protection issue should be addressed.

- Changes against v2:
    1. Separate the memcg patches from this patchset, suggested by Roman.
    2. Improve the code around the usage of for_each_mem_cgroup(),
       suggested by Dave.
    3. Use memcg_low_reclaim passed from scan_control, instead of
       introducing a new member in struct mem_cgroup.
    4. Some other code improvements suggested by Dave.

- Changes against v1:
Use the memcg passed from the shrink_control, instead of getting it from
the inode itself, suggested by Dave. That makes the layering better.
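
The layering point can be sketched as follows: the walker hands the memcg
context to the isolation callback, so the callback never has to look it up
from the object being walked. This is only an illustration under my own
assumptions; walk_ctx, walk_item, walk_list and isolate_unless_protected
are hypothetical names, not the list_lru or shrinker API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not the kernel's list_lru API. */
struct walk_memcg { int id; };

struct walk_item { struct walk_item *next; };

/* Context supplied by the walker, not derived from the item itself. */
struct walk_ctx {
	const struct walk_memcg *memcg;	/* list owner, NULL for the root list */
	bool low_reclaim;		/* reclaim may go below the low limit */
};

typedef bool (*walk_isolate_fn)(struct walk_item *item,
				const struct walk_ctx *ctx);

/* Walk a singly linked list and count the items the callback isolates. */
static size_t walk_list(struct walk_item *head, const struct walk_ctx *ctx,
			walk_isolate_fn isolate)
{
	size_t isolated = 0;

	assert(ctx != NULL && isolate != NULL);
	for (struct walk_item *it = head; it != NULL; it = it->next)
		if (isolate(it, ctx))
			isolated++;
	return isolated;
}

/*
 * Example callback: treat every memcg as protected, and isolate only when
 * there is no memcg context or low reclaim has been permitted.
 */
static bool isolate_unless_protected(struct walk_item *item,
				     const struct walk_ctx *ctx)
{
	(void)item;	/* the decision uses only the walker's context */
	return ctx->memcg == NULL || ctx->low_reclaim;
}
```

Because the context travels with the walk, the callback's decision depends
on which list is being scanned rather than on a back pointer stored in
each object.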

[1]. https://lore.kernel.org/linux-mm/20200211175507.178100-1-hannes@cmpxchg.org/

Yafang Shao (3):
  mm, list_lru: make memcg visible to lru walker isolation function
  mm, shrinker: make memcg low reclaim visible to lru walker isolation
    function
  inode: protect page cache from freeing inode

 fs/inode.c                 | 76 ++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/memcontrol.h | 21 +++++++++++++
 include/linux/shrinker.h   |  3 ++
 mm/list_lru.c              | 47 ++++++++++++++++------------
 mm/memcontrol.c            | 15 ---------
 mm/vmscan.c                | 27 +++++++++-------
 6 files changed, 141 insertions(+), 48 deletions(-)

-- 
Yafang Shao



Thread overview: 6+ messages
2020-02-23  9:31 [PATCH v4 0/3] protect page cache from freeing inode Yafang Shao
2020-02-23  9:31 ` [PATCH v4 1/3] mm, list_lru: make memcg visible to lru walker isolation function Yafang Shao
2020-02-23  9:31 ` [PATCH v4 2/3] mm, shrinker: make memcg low reclaim " Yafang Shao
2020-02-23  9:31 ` [PATCH v4 3/3] inode: protect page cache from freeing inode Yafang Shao
2020-02-24  3:17 ` [PATCH v4 0/3] " Andrew Morton
2020-02-26 14:16   ` Yafang Shao
