From: Yafang Shao
Date: Wed, 18 Dec 2019 09:17:11 +0800
Subject: Re: [PATCH 0/4] memcg, inode: protect page cache from freeing inode
To: Johannes Weiner
Cc: Michal Hocko, Vladimir Davydov, Andrew Morton, Dave Chinner, Al Viro, Linux MM, linux-fsdevel@vger.kernel.org
References: <1576582159-5198-1-git-send-email-laoar.shao@gmail.com> <20191217115603.GA10016@dhcp22.suse.cz> <20191217165422.GA213613@cmpxchg.org>
In-Reply-To: <20191217165422.GA213613@cmpxchg.org>

On Wed, Dec 18, 2019 at 12:54 AM Johannes Weiner wrote:
>
> CCing Dave
>
> On Tue, Dec 17, 2019 at 08:19:08PM +0800, Yafang Shao wrote:
> > On Tue, Dec 17, 2019 at 7:56 PM Michal Hocko wrote:
> > > What do you mean by this exactly? Are those inodes reclaimed by the
> > > regular memory reclaim or by other means? Because shrink_node does
> > > exclude shrinking slab for protected memcgs.
> >
> > By the regular memory reclaim, kswapd, direct reclaimer or memcg reclaimer.
> > IOW, the current->reclaim_state is set.
> >
> > Here is an example.
> >
> > kswapd
> >     balance_pgdat
> >         shrink_node_memcgs
> >             switch (mem_cgroup_protected) <<<< memory.current = 1024M,
> >                                                memory.min = 512M, a file has 800M page caches
> >                 case MEMCG_PROT_NONE: <<<< hard limit is not reached.
> >                     break;
> >             shrink_lruvec
> >             shrink_slab <<< it may free the inode and then free all its
> >                             page caches (800M)
>
> This problem exists independent of cgroup protection.
>
> The inode shrinker may take down an inode that's still holding a ton
> of (potentially active) page cache pages when the inode hasn't been
> referenced recently.
>
> IMO we shouldn't be dropping data that the VM still considers hot
> compared to other data, just because the inode object hasn't been used
> as recently as other inode objects (e.g. drowned in a stream of
> one-off inode accesses).
>
> I've carried the below patch in my private tree for testing cache
> aging decisions that the shrinker interfered with.
> (It would be nicer
> if page cache pages could pin the inode of course, but reclaim cannot
> easily participate in the inode refcounting scheme.)
>
> Thoughts?
>

I have already thought about this solution.
But I found there is a similar revert by Dave - see 69056ee6a8a3 ("Revert
"mm: don't reclaim inodes with many attached pages"").
That's why I CCed Dave in patch-4.
So I only fix it for the memcg protection case, because a fix limited to
protected memcgs will not have much impact elsewhere.

> diff --git a/fs/inode.c b/fs/inode.c
> index fef457a42882..bfcaaaf6314f 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -753,7 +753,13 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
>  		return LRU_ROTATE;
>  	}
>
> -	if (inode_has_buffers(inode) || inode->i_data.nrpages) {
> +	/* Leave the pages to page reclaim */
> +	if (inode->i_data.nrpages) {
> +		spin_unlock(&inode->i_lock);
> +		return LRU_ROTATE;
> +	}
> +
> +	if (inode_has_buffers(inode)) {
>  		__iget(inode);
>  		spin_unlock(&inode->i_lock);
>  		spin_unlock(lru_lock);
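
For readers following the thread, the behavioural difference in the quoted hunk can be sketched as a small userspace C model. This is only a toy under stated assumptions, not kernel code: struct toy_inode, enum toy_lru_status and both toy_isolate_*() helpers are hypothetical names, locking and refcounting are left out, and the old path is assumed to manage to drop every attached page, which the real code does not guarantee.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the LRU walk results used in the hunk. */
enum toy_lru_status {
	TOY_LRU_ROTATE,		/* inode stays on the LRU for another pass */
	TOY_LRU_RETRY,		/* caller revisits the inode after cleanup */
	TOY_LRU_REMOVED,	/* inode is isolated and will be freed */
};

/* Hypothetical, heavily reduced model of an inode on the LRU. */
struct toy_inode {
	unsigned long nrpages;	/* models inode->i_data.nrpages */
	bool has_buffers;	/* models inode_has_buffers(inode) */
};

/*
 * Model of the check before the quoted patch: pages and buffers are
 * handled in one branch, so the inode shrinker itself drops the page
 * cache. ASSUMPTION: all pages can be dropped in a single pass.
 */
static enum toy_lru_status toy_isolate_old(struct toy_inode *inode)
{
	if (inode->has_buffers || inode->nrpages) {
		inode->has_buffers = false;
		inode->nrpages = 0;
		return TOY_LRU_RETRY;
	}
	return TOY_LRU_REMOVED;
}

/*
 * Model of the check after the quoted patch: an inode with page cache
 * is rotated and its pages are left to page reclaim.
 */
static enum toy_lru_status toy_isolate_new(struct toy_inode *inode)
{
	if (inode->nrpages)
		return TOY_LRU_ROTATE;

	if (inode->has_buffers) {
		inode->has_buffers = false;
		return TOY_LRU_RETRY;
	}
	return TOY_LRU_REMOVED;
}
```

With the 800M file from the example above (204800 pages, assuming 4K pages), toy_isolate_old() empties the mapping and a second call frees the inode, while toy_isolate_new() returns TOY_LRU_ROTATE and leaves nrpages untouched.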