From: Jens Axboe
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org
Cc: willy@infradead.org, clm@fb.com, Jens Axboe
Subject: [PATCH 3/5] mm: make buffered writes work with RWF_UNCACHED
Date: Tue, 10 Dec 2019 13:43:02 -0700
Message-Id: <20191210204304.12266-4-axboe@kernel.dk>
X-Mailer: git-send-email 2.24.0
In-Reply-To: <20191210204304.12266-1-axboe@kernel.dk>
References: <20191210204304.12266-1-axboe@kernel.dk>

If RWF_UNCACHED is set for io_uring (or pwritev2(2)), we'll drop the
cache instantiated for buffered writes. If new pages aren't
instantiated, we leave them alone. This provides similar semantics to
reads with RWF_UNCACHED set.

Signed-off-by: Jens Axboe
---
 include/linux/fs.h |  5 +++
 mm/filemap.c       | 85 +++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 85 insertions(+), 5 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index bf58db1bc032..7ea3dfdd9aa5 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -285,6 +285,7 @@ enum positive_aop_returns {
 #define AOP_FLAG_NOFS			0x0002 /* used by filesystem to direct
 						* helper code (eg buffer layer)
 						* to clear GFP_FS from alloc */
+#define AOP_FLAG_UNCACHED		0x0004
 
 /*
  * oh the beauties of C type declarations.
@@ -3106,6 +3107,10 @@ extern ssize_t generic_file_direct_write(struct kiocb *, struct iov_iter *);
 extern ssize_t generic_perform_write(struct file *, struct iov_iter *,
 				     struct kiocb *);
 
+struct pagevec;
+extern void write_drop_cached_pages(struct pagevec *pvec,
+				    struct address_space *mapping);
+
 ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
 		rwf_t flags);
 ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
diff --git a/mm/filemap.c b/mm/filemap.c
index fe37bd2b2630..2e36129ebe38 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3287,10 +3287,12 @@ struct page *grab_cache_page_write_begin(struct address_space *mapping,
 					pgoff_t index, unsigned flags)
 {
 	struct page *page;
-	int fgp_flags = FGP_LOCK|FGP_WRITE|FGP_CREAT;
+	int fgp_flags = FGP_LOCK|FGP_WRITE;
 
 	if (flags & AOP_FLAG_NOFS)
 		fgp_flags |= FGP_NOFS;
+	if (!(flags & AOP_FLAG_UNCACHED))
+		fgp_flags |= FGP_CREAT;
 
 	page = pagecache_get_page(mapping, index, fgp_flags,
 			mapping_gfp_mask(mapping));
@@ -3301,21 +3303,65 @@ struct page *grab_cache_page_write_begin(struct address_space *mapping,
 }
 EXPORT_SYMBOL(grab_cache_page_write_begin);
 
+/*
+ * Start writeback on the pages in pvec, and then try and remove those pages
+ * from the page cache. Used with RWF_UNCACHED.
+ */
+void write_drop_cached_pages(struct pagevec *pvec,
+			     struct address_space *mapping)
+{
+	loff_t start, end;
+	int i;
+
+	end = 0;
+	start = LLONG_MAX;
+	for (i = 0; i < pagevec_count(pvec); i++) {
+		loff_t off = page_offset(pvec->pages[i]);
+		if (off < start)
+			start = off;
+		if (off > end)
+			end = off;
+	}
+
+	__filemap_fdatawrite_range(mapping, start, end, WB_SYNC_NONE);
+
+	for (i = 0; i < pagevec_count(pvec); i++) {
+		struct page *page = pvec->pages[i];
+
+		lock_page(page);
+		if (page->mapping == mapping) {
+			wait_on_page_writeback(page);
+			if (!page_has_private(page) ||
+			    try_to_release_page(page, 0))
+				remove_mapping(mapping, page);
+		}
+		unlock_page(page);
+	}
+	pagevec_release(pvec);
+}
+EXPORT_SYMBOL_GPL(write_drop_cached_pages);
+
+#define GPW_PAGE_BATCH		16
+
 ssize_t generic_perform_write(struct file *file,
 				struct iov_iter *i, struct kiocb *iocb)
 {
 	struct address_space *mapping = file->f_mapping;
 	const struct address_space_operations *a_ops = mapping->a_ops;
 	loff_t pos = iocb->ki_pos;
+	struct pagevec pvec;
 	long status = 0;
 	ssize_t written = 0;
 	unsigned int flags = 0;
 
+	pagevec_init(&pvec);
+
 	do {
 		struct page *page;
 		unsigned long offset;	/* Offset into pagecache page */
 		unsigned long bytes;	/* Bytes to write to page */
 		size_t copied;		/* Bytes copied from user */
+		bool drop_page = false;	/* drop page after IO */
 		void *fsdata;
 
 		offset = (pos & (PAGE_SIZE - 1));
@@ -3323,6 +3369,9 @@ ssize_t generic_perform_write(struct file *file,
 						iov_iter_count(i));
 
 again:
+		if (iocb->ki_flags & IOCB_UNCACHED)
+			flags |= AOP_FLAG_UNCACHED;
+
 		/*
 		 * Bring in the user page that we will copy from _first_.
 		 * Otherwise there's a nasty deadlock on copying from the
@@ -3343,10 +3392,17 @@ ssize_t generic_perform_write(struct file *file,
 			break;
 		}
 
+retry:
 		status = a_ops->write_begin(file, mapping, pos, bytes, flags,
 						&page, &fsdata);
-		if (unlikely(status < 0))
+		if (unlikely(status < 0)) {
+			if (status == -ENOMEM && (flags & AOP_FLAG_UNCACHED)) {
+				drop_page = true;
+				flags &= ~AOP_FLAG_UNCACHED;
+				goto retry;
+			}
 			break;
+		}
 
 		if (mapping_writably_mapped(mapping))
 			flush_dcache_page(page);
@@ -3354,10 +3410,16 @@ ssize_t generic_perform_write(struct file *file,
 		copied = iov_iter_copy_from_user_atomic(page, i, offset, bytes);
 		flush_dcache_page(page);
 
+		if (drop_page)
+			get_page(page);
+
 		status = a_ops->write_end(file, mapping, pos, bytes, copied,
 						page, fsdata);
-		if (unlikely(status < 0))
+		if (unlikely(status < 0)) {
+			if (drop_page)
+				put_page(page);
 			break;
+		}
 		copied = status;
 
 		cond_resched();
@@ -3374,14 +3436,27 @@ ssize_t generic_perform_write(struct file *file,
 			 */
 			bytes = min_t(unsigned long, PAGE_SIZE - offset,
 						iov_iter_single_seg_count(i));
+			if (drop_page)
+				put_page(page);
 			goto again;
 		}
+		if (drop_page &&
+		    ((pos >> PAGE_SHIFT) != ((pos + copied) >> PAGE_SHIFT))) {
+			if (!pagevec_add(&pvec, page))
+				write_drop_cached_pages(&pvec, mapping);
+		} else {
+			if (drop_page)
+				put_page(page);
+			balance_dirty_pages_ratelimited(mapping);
+		}
+
 		pos += copied;
 		written += copied;
-
-		balance_dirty_pages_ratelimited(mapping);
 	} while (iov_iter_count(i));
 
+	if (pagevec_count(&pvec))
+		write_drop_cached_pages(&pvec, mapping);
+
 	return written ? written : status;
 }
 EXPORT_SYMBOL(generic_perform_write);
-- 
2.24.0
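
For readers following the final hunk: a page that had to be instantiated
for an uncached write is only queued for dropping once the write reaches
past the end of that page. Below is a minimal userspace sketch of that
boundary test, assuming 4 KiB pages. SKETCH_PAGE_SHIFT and
sketch_write_leaves_page() are hypothetical names used for illustration
only; they are not part of the patch or of any kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT depends on the architecture. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper mirroring the test in generic_perform_write():
 * returns true when a write of 'copied' bytes starting at byte offset
 * 'pos' ends in a different page than the one it started in.
 */
static bool sketch_write_leaves_page(uint64_t pos, uint64_t copied)
{
	return (pos >> SKETCH_PAGE_SHIFT) !=
	       ((pos + copied) >> SKETCH_PAGE_SHIFT);
}
```

Under these assumptions, a 100-byte write at offset 0 stays inside page 0
and the helper returns false, while a write that fills the rest of the
page (for example 96 bytes at offset 4000) returns true.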