Date: Thu, 8 Oct 2020 14:48:49 -0400
From: Jerome Glisse
To: Matthew Wilcox
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Andrew Morton, Alexander Viro, Tejun Heo, Jan Kara, Josef Bacik
Subject: Re: [PATCH 00/14] Small step toward KSM for file back page.
Message-ID: <20201008184849.GA3514601@redhat.com>
References: <20201007010603.3452458-1-jglisse@redhat.com>
 <20201007032013.GS20115@casper.infradead.org>
 <20201007144835.GA3471400@redhat.com>
 <20201007170558.GU20115@casper.infradead.org>
 <20201007175419.GA3478056@redhat.com>
 <20201007220916.GX20115@casper.infradead.org>
 <20201008153028.GA3508856@redhat.com>
 <20201008154341.GJ20115@casper.infradead.org>
In-Reply-To: <20201008154341.GJ20115@casper.infradead.org>

On Thu, Oct 08, 2020 at 04:43:41PM +0100, Matthew Wilcox wrote:
> On Thu, Oct 08, 2020 at 11:30:28AM -0400, Jerome Glisse wrote:
> > On Wed, Oct 07, 2020 at 11:09:16PM +0100, Matthew Wilcox wrote:
> > > So ... why don't you put a PageKsm page in the page cache? That way
> > > you can share code with the current KSM implementation. You'd need
> > > something like this:
> >
> > I do just that, but there is no need to change anything in the page cache.
>
> That's clearly untrue. If you just put a PageKsm page in the page
> cache today, here's what will happen on a truncate:
>
> void truncate_inode_pages_range(struct address_space *mapping,
>                                 loff_t lstart, loff_t lend)
> {
> ...
>         struct page *page = find_lock_page(mapping, start - 1);
>
> find_lock_page() does this:
>         return pagecache_get_page(mapping, offset, FGP_LOCK, 0);
>
> pagecache_get_page():
>
> repeat:
>         page = find_get_entry(mapping, index);
> ...
>         if (fgp_flags & FGP_LOCK) {
> ...
>                 if (unlikely(compound_head(page)->mapping != mapping)) {
>                         unlock_page(page);
>                         put_page(page);
>                         goto repeat;
>
> so it's just going to spin. There are plenty of other codepaths that
> would need to be checked. If you haven't found them, that shows you
> don't understand the problem deeply enough yet.
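Agreed that an unmodified lookup would livelock on such a page: a KSM
page's page->mapping encodes a stable-tree node pointer tagged with
PAGE_MAPPING_KSM, as for anonymous KSM pages today, so it can never
compare equal to the file's address_space. Condensed to its core
(an illustrative sketch only, not the actual code; shadow-entry
handling omitted), the loop you quote is:

	struct page *lookup_locked(struct address_space *mapping, pgoff_t index)
	{
		struct page *page;

	repeat:
		page = find_get_entry(mapping, index);
		if (!page)
			return NULL;
		lock_page(page);
		/* A PageKsm() page carries a stable-node pointer in
		 * ->mapping, so this test fails every time ... */
		if (compound_head(page)->mapping != mapping) {
			unlock_page(page);
			put_page(page);
			goto repeat;	/* ... and we spin here forever. */
		}
		return page;
	}

Which is why those paths do have to learn about file-backed KSM one
way or another.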
I also changed truncate, splice, and a few other special cases that do
not go through GUP/page fault/mkwrite (memory debug too, but that's a
different beast).

> I believe we should solve this problem, but I don't think you're going
> about it the right way.

I have done much more than what I posted, but there is a bug I need to
hammer down before posting everything, and I wanted to get the
discussion started. I guess I will finish tracking that one down and
post the whole thing.

> > So the flow is:
> >
> > Same as before:
> >     1 - write fault (address, vma)
> >     2 - regular write fault handler -> find page in page cache
> >
> > New to common page fault code:
> >     3 - ksm check in write fault common code (same as ksm today
> >         for the anonymous page fault code path).
> >     4 - break ksm (address, vma) -> (file offset, mapping)
> >     4.a - use mapping and file offset to look up the proper
> >           fs-specific information that was saved when the
> >           page was made ksm.
> >     4.b - allocate a new page and initialize it with that
> >           information (and the page content), update the page
> >           cache and the mappings, ie all the ptes that were
> >           pointing to the ksm page for that mapping at that
> >           offset now use the new page (like KSM for anonymous
> >           pages today).
>
> But by putting that logic in the page fault path, you've missed
> the truncate path. And maybe other places. Putting the logic
> down in pagecache_get_page() means you _don't_ need to find
> all the places that call pagecache_get_page().

There are cases where the page cache is not even in the loop, ie you
already have the page and do not need to look it up (page fault, some
fs common code, anything that goes through GUP, memory reclaim, ...).
Making all those places go through the page cache every time would
slow them down, and many of them are hot code paths that I do not
believe we want to slow down even when the feature is not in use.
(A rough sketch of the fault-path flow above is in the P.S. below.)

Cheers,
Jérôme
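P.S. To make steps 3, 4.a, and 4.b concrete, here is a rough sketch of
the break-ksm write-fault path described above. The ksm_fs_* helpers
and struct ksm_fs_info are placeholders for illustration, not the
actual patch:

	/* Called from the common write-fault code when the page found
	 * in the page cache is a shared ksm page.  Returns the private
	 * copy to continue the fault with, or NULL on OOM. */
	static struct page *break_fs_ksm(struct vm_fault *vmf, struct page *ksm_page)
	{
		struct vm_area_struct *vma = vmf->vma;
		/* 4: translate (address, vma) -> (file offset, mapping) */
		struct address_space *mapping = vma->vm_file->f_mapping;
		pgoff_t offset = vmf->pgoff;
		struct ksm_fs_info *info;
		struct page *new_page;

		/* 4.a: fs-specific state saved when the page was made ksm */
		info = ksm_fs_lookup_info(mapping, offset);

		/* 4.b: allocate and initialize a private copy */
		new_page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma, vmf->address);
		if (!new_page)
			return NULL;
		copy_highpage(new_page, ksm_page);
		ksm_fs_restore_info(new_page, info);

		/* 4.b: swap the page cache slot and rewrite every pte
		 * mapping (mapping, offset) to point at new_page (like
		 * KSM for anonymous pages today) */
		ksm_fs_replace_page(mapping, offset, ksm_page, new_page);
		return new_page;
	}

The write fault then proceeds as usual, making the pte writable over
the private copy.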