From: David Herrmann <dh.herrmann@gmail.com>
To: linux-kernel@vger.kernel.org
Cc: Michael Kerrisk <mtk.manpages@gmail.com>,
Ryan Lortie <desrt@desrt.ca>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
linux-api@vger.kernel.org, Greg Kroah-Hartman <greg@kroah.com>,
john.stultz@linaro.org,
Lennart Poettering <lennart@poettering.net>,
Daniel Mack <zonque@gmail.com>, Kay Sievers <kay@vrfy.org>,
Hugh Dickins <hughd@google.com>,
Tony Battersby <tonyb@cybernetics.com>,
Andy Lutomirski <luto@amacapital.net>,
David Herrmann <dh.herrmann@gmail.com>
Subject: [RFC v3 6/7] shm: wait for pins to be released when sealing
Date: Fri, 13 Jun 2014 12:36:58 +0200
Message-ID: <1402655819-14325-7-git-send-email-dh.herrmann@gmail.com>
In-Reply-To: <1402655819-14325-1-git-send-email-dh.herrmann@gmail.com>
Setting SEAL_WRITE currently fails if there are pending page references.
This patch extends the pin tests to wait up to about 150ms for all
references to be dropped. This is still not perfect, since it does not
account for harmless read-only pins, but it is much better than a hard
failure.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
---
mm/shmem.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 82 insertions(+), 15 deletions(-)
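Note on the "about 150ms" figure: the sketch below is only an illustration
and not part of the patch. It assumes that the sleep before rescan number
`scan` is (HZ << scan) / 200 jiffies, as in the hunk below, and that no fatal
signal cuts a sleep short. The helper names scan_timeout_jiffies() and
total_wait_ms() are made up for this note.

```c
#include <assert.h>

#define LAST_SCAN 4 /* same value as in the patch */

/* Jiffies passed to schedule_timeout_killable() before rescan number `scan`. */
static unsigned long scan_timeout_jiffies(unsigned long hz, int scan)
{
	/* Scan 0 only drains the LRU caches; sleeps start at scan 1. */
	assert(scan >= 1 && scan <= LAST_SCAN);
	return (hz << scan) / 200;
}

/* Worst-case total sleep in milliseconds, assuming no signal cuts it short. */
static unsigned long total_wait_ms(unsigned long hz)
{
	unsigned long ms = 0;

	for (int scan = 1; scan <= LAST_SCAN; scan++)
		ms += scan_timeout_jiffies(hz, scan) * 1000 / hz;
	return ms;
}
```

With HZ=1000 the four sleeps are 10, 20, 40 and 80 jiffies, which adds up to
150ms. With HZ=250 integer truncation gives 2, 5, 10 and 20 jiffies, or 148ms.
If schedule_timeout_killable() returns non-zero, the loop in the patch sets
scan to LAST_SCAN, so the next pass is the final one.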
diff --git a/mm/shmem.c b/mm/shmem.c
index e7c5fe1..ddc3998 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1735,25 +1735,19 @@ static loff_t shmem_file_llseek(struct file *file, loff_t offset, int whence)
}
/*
- * Setting SEAL_WRITE requires us to verify there's no pending writer. However,
- * via get_user_pages(), drivers might have some pending I/O without any active
- * user-space mappings (eg., direct-IO, AIO). Therefore, we look at all pages
- * and see whether it has an elevated ref-count. If so, we abort.
- * The caller must guarantee that no new user will acquire writable references
- * to those pages to avoid races.
+ * We need a tag: a new tag would expand every radix_tree_node by 8 bytes,
+ * so reuse a tag which we firmly believe is never set or cleared on shmem.
*/
-static int shmem_test_for_pins(struct address_space *mapping)
+#define SHMEM_TAG_PINNED PAGECACHE_TAG_TOWRITE
+#define LAST_SCAN 4 /* about 150ms max */
+
+static void shmem_tag_pins(struct address_space *mapping)
{
struct radix_tree_iter iter;
void **slot;
pgoff_t start;
struct page *page;
- int error;
-
- /* flush additional refs in lru_add early */
- lru_add_drain_all();
- error = 0;
start = 0;
rcu_read_lock();
@@ -1764,8 +1758,10 @@ restart:
if (radix_tree_deref_retry(page))
goto restart;
} else if (page_count(page) - page_mapcount(page) > 1) {
- error = -EBUSY;
- break;
+ spin_lock_irq(&mapping->tree_lock);
+ radix_tree_tag_set(&mapping->page_tree, iter.index,
+ SHMEM_TAG_PINNED);
+ spin_unlock_irq(&mapping->tree_lock);
}
if (need_resched()) {
@@ -1775,6 +1771,77 @@ restart:
}
}
rcu_read_unlock();
+}
+
+/*
+ * Setting SEAL_WRITE requires us to verify there's no pending writer. However,
+ * via get_user_pages(), drivers might have some pending I/O without any active
+ * user-space mappings (e.g., direct-IO, AIO). Therefore, we look at all pages
+ * and check whether any has an elevated ref-count. If so, we tag those pages
+ * and wait for the extra references to be dropped.
+ * The caller must guarantee that no new user will acquire writable references
+ * to those pages to avoid races.
+ */
+static int shmem_wait_for_pins(struct address_space *mapping)
+{
+ struct radix_tree_iter iter;
+ void **slot;
+ pgoff_t start;
+ struct page *page;
+ int error, scan;
+
+ shmem_tag_pins(mapping);
+
+ error = 0;
+ for (scan = 0; scan <= LAST_SCAN; scan++) {
+ if (!radix_tree_tagged(&mapping->page_tree, SHMEM_TAG_PINNED))
+ break;
+
+ if (!scan)
+ lru_add_drain_all();
+ else if (schedule_timeout_killable((HZ << scan) / 200))
+ scan = LAST_SCAN;
+
+ start = 0;
+ rcu_read_lock();
+restart:
+ radix_tree_for_each_tagged(slot, &mapping->page_tree, &iter,
+ start, SHMEM_TAG_PINNED) {
+
+ page = radix_tree_deref_slot(slot);
+ if (radix_tree_exception(page)) {
+ if (radix_tree_deref_retry(page))
+ goto restart;
+
+ page = NULL;
+ }
+
+ if (page &&
+ page_count(page) - page_mapcount(page) != 1) {
+ if (scan < LAST_SCAN)
+ goto continue_resched;
+
+ /*
+ * On the last scan, we clean up all those tags
+ * we inserted; but make a note that we still
+ * found pages pinned.
+ */
+ error = -EBUSY;
+ }
+
+ spin_lock_irq(&mapping->tree_lock);
+ radix_tree_tag_clear(&mapping->page_tree,
+ iter.index, SHMEM_TAG_PINNED);
+ spin_unlock_irq(&mapping->tree_lock);
+continue_resched:
+ if (need_resched()) {
+ cond_resched_rcu();
+ start = iter.index + 1;
+ goto restart;
+ }
+ }
+ rcu_read_unlock();
+ }
return error;
}
@@ -1840,7 +1907,7 @@ int shmem_add_seals(struct file *file, unsigned int seals)
if (error)
goto unlock;
- error = shmem_test_for_pins(file->f_mapping);
+ error = shmem_wait_for_pins(file->f_mapping);
if (error) {
mapping_allow_writable(file->f_mapping);
goto unlock;
--
2.0.0
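For readers who want to follow the control flow outside the kernel, here is a
rough user-space sketch of the tag-then-rescan scheme. It is an illustration
under stated assumptions, not the kernel code: struct fake_page, the plain
array in place of the radix tree, and the between_scans() hook (which stands
in for the sleep between scans) are all hypothetical. RCU, tree_lock and
exceptional entries are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define LAST_SCAN 4 /* same bound as the patch */

/* Hypothetical stand-in for struct page: only the two counters matter. */
struct fake_page {
	int count;    /* total references, like page_count() */
	int mapcount; /* user-space mappings, like page_mapcount() */
	bool tagged;  /* stands in for SHMEM_TAG_PINNED */
};

/* Pass 1: tag every page holding more than the single page-cache reference. */
static void tag_pins(struct fake_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (pages[i].count - pages[i].mapcount > 1)
			pages[i].tagged = true;
}

static bool any_tagged(const struct fake_page *pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (pages[i].tagged)
			return true;
	return false;
}

/*
 * Pass 2: revisit only tagged pages. A page that is still pinned keeps its
 * tag until the last scan; on the last scan every tag is cleared and -EBUSY
 * is reported if a pin remained.
 */
static int wait_for_pins(struct fake_page *pages, size_t n,
			 void (*between_scans)(struct fake_page *, size_t, int))
{
	int error = 0;

	assert(pages != NULL || n == 0);
	tag_pins(pages, n);
	for (int scan = 0; scan <= LAST_SCAN; scan++) {
		if (!any_tagged(pages, n))
			break;
		if (scan && between_scans)
			between_scans(pages, n, scan); /* "time passes" */
		for (size_t i = 0; i < n; i++) {
			if (!pages[i].tagged)
				continue;
			if (pages[i].count - pages[i].mapcount != 1) {
				if (scan < LAST_SCAN)
					continue; /* keep the tag, retry later */
				error = -EBUSY; /* last scan: give up */
			}
			pages[i].tagged = false;
		}
	}
	return error;
}

/* Example hook: the pin on page 1 goes away before the second rescan. */
static void drop_pin_on_scan_two(struct fake_page *pages, size_t n, int scan)
{
	if (scan == 2 && n > 1)
		pages[1].count = 1;
}
```

Calling wait_for_pins() with drop_pin_on_scan_two on an array whose second
page starts with count 2 and mapcount 0 returns 0, because the pin disappears
in time. Passing NULL as the hook leaves the pin in place and yields -EBUSY
after the last scan. In both cases no tag is left behind.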