From mboxrd@z Thu Jan 1 00:00:00 1970
From: Florian Weimer
Subject: Re: [PATCH 2/3] mm/filemap: initiate readahead even if IOCB_NOWAIT is set for the I/O
Date: Thu, 31 Jan 2019 11:47:24 +0100
Message-ID: <87imy5f6ir.fsf@oldenburg2.str.redhat.com>
References: <20190130124420.1834-1-vbabka@suse.cz>
 <20190130124420.1834-3-vbabka@suse.cz>
 <87munii3uj.fsf@oldenburg2.str.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain
Return-path:
In-Reply-To: (Jiri Kosina's message of "Wed, 30 Jan 2019 16:15:55 +0100 (CET)")
Sender: linux-kernel-owner@vger.kernel.org
To: Jiri Kosina
Cc: Vlastimil Babka, Andrew Morton, Linus Torvalds,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-api@vger.kernel.org, Peter Zijlstra, Greg KH, Jann Horn,
 Dominique Martinet, Andy Lutomirski, Dave Chinner, Kevin Easton,
 Matthew Wilcox, Cyril Hrubis, Tejun Heo, "Kirill A . Shutemov",
 Daniel Gruss
List-Id: linux-api@vger.kernel.org

* Jiri Kosina:

> On Wed, 30 Jan 2019, Florian Weimer wrote:
>
>> > preadv2(RWF_NOWAIT) can be used to open a side-channel to pagecache
>> > contents, as it reveals metadata about residency of pages in
>> > pagecache.
>> >
>> > If preadv2(RWF_NOWAIT) returns immediately, it provides a clear "page
>> > not resident" information, and vice versa.
>> >
>> > Close that sidechannel by always initiating readahead on the cache if
>> > we encounter a cache miss for preadv2(RWF_NOWAIT); with that in place,
>> > probing the pagecache residency itself will actually populate the
>> > cache, making the sidechannel useless.
>>
>> I think this needs to use a different flag because the semantics are so
>> much different.  If I understand this change correctly, previously,
>> RWF_NOWAIT essentially avoided any I/O, and now it does not.
>
> It still avoid synchronous I/O, due to this code still being in place:
>
>                 if (!PageUptodate(page)) {
>                         if (iocb->ki_flags & IOCB_NOWAIT) {
>                                 put_page(page);
>                                 goto would_block;
>                         }
>
> but goes the would_block path only after initiating asynchronous
> readahead.

But it wouldn't schedule asynchronous readahead before?

I'm worried that something, say PostgreSQL doing a sequential scan,
would implement a two-pass approach, first using RWF_NOWAIT to process
what's in the kernel page cache, and then read the rest without it.  If
RWF_NOWAIT is treated as a prefetch hint, there could be much more read
activity, and a lot of it would be pointless because the data might have
to be evicted before userspace can use it.

Thanks,
Florian
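
To make the two-pass concern concrete, here is a rough userspace sketch
of the pattern described above.  It is an illustration only, not code
taken from PostgreSQL or any other application: two_pass_scan(),
struct scan_stats, consume() and read_fully() are hypothetical names,
the byte-sum "processing" is a stand-in, and treating every pass-1
failure or short read as "defer to pass 2" is a simplifying assumption.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008 /* value from the Linux UAPI headers */
#endif

/* Hypothetical bookkeeping: how much each pass handled. */
struct scan_stats {
	size_t cached_bytes;    /* pass 1: served without blocking */
	size_t blocking_bytes;  /* pass 2: ordinary blocking reads */
	unsigned long checksum; /* stand-in for real processing */
};

static void consume(struct scan_stats *st, const unsigned char *buf, size_t n)
{
	for (size_t i = 0; i < n; i++)
		st->checksum += buf[i];
}

/* Read exactly len bytes at off with blocking pread(). */
static bool read_fully(int fd, unsigned char *buf, size_t len, off_t off)
{
	size_t done = 0;

	while (done < len) {
		ssize_t n = pread(fd, buf + done, len - done, off + (off_t)done);
		if (n <= 0)
			return false; /* error or unexpected EOF */
		done += (size_t)n;
	}
	return true;
}

/* Returns 0 on success, -1 on failure. */
int two_pass_scan(int fd, size_t block_size, struct scan_stats *st)
{
	struct stat sb;

	assert(block_size > 0);
	if (fstat(fd, &sb) != 0)
		return -1;

	size_t size = (size_t)sb.st_size;
	size_t nblocks = (size + block_size - 1) / block_size;
	unsigned char *buf = malloc(block_size);
	bool *pending = calloc(nblocks ? nblocks : 1, sizeof *pending);
	int rc = 0;

	if (!buf || !pending) {
		free(buf);
		free(pending);
		return -1;
	}
	*st = (struct scan_stats){0};

	/* Pass 1: process only blocks that can be read without waiting. */
	for (size_t i = 0; i < nblocks; i++) {
		size_t left = size - i * block_size;
		size_t want = left < block_size ? left : block_size;
		struct iovec iov = { .iov_base = buf, .iov_len = want };
		ssize_t n = preadv2(fd, &iov, 1, (off_t)(i * block_size), RWF_NOWAIT);

		if (n == (ssize_t)want) {
			consume(st, buf, want);
			st->cached_bytes += want;
		} else {
			/* EAGAIN, other error, or short read: defer. */
			pending[i] = true;
		}
	}

	/* Pass 2: read the deferred blocks with normal blocking I/O. */
	for (size_t i = 0; i < nblocks && rc == 0; i++) {
		size_t left = size - i * block_size;
		size_t want = left < block_size ? left : block_size;

		if (!pending[i])
			continue;
		if (!read_fully(fd, buf, want, (off_t)(i * block_size))) {
			rc = -1;
			break;
		}
		consume(st, buf, want);
		st->blocking_bytes += want;
	}

	free(buf);
	free(pending);
	return rc;
}
```

With the old semantics, pass 1 causes no I/O at all.  If a miss in
pass 1 instead starts readahead, every deferred block kicks off a read
during pass 1, and on a large file that data may already have been
evicted by the time pass 2 gets to it.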