From: Jan Kara <jack@suse.cz>
To: linux-ext4@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH 0/4] Block reservation on page fault time for ext3
Date: Mon, 2 May 2011 22:56:52 +0200
Message-ID: <1304369816-14545-1-git-send-email-jack@suse.cz>
Hi,

ext3 has a problem: mmap writes end up allocating blocks only in the
writepage() callback. This effectively defeats quota checking, because
writepage() is called from the flusher thread and thus runs with root
privileges. So any user can exceed their quota limit arbitrarily by writing
through mmap.
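
For reference, the access pattern in question looks roughly like the sketch below: a userspace program extends a file to create a hole and then dirties it through a shared mapping. The helper name dirty_hole_via_mmap() is made up for illustration and is not part of the patches; the comments describe the ext3 behaviour as stated above.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Dirty `len` bytes of a sparse file through a shared mapping.
 * Returns 0 on success, -1 on any failure.
 * Hypothetical helper for illustration only.
 */
int dirty_hole_via_mmap(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* Extend the file without writing data: this leaves a hole. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping keeps the file referenced */
    if (map == MAP_FAILED)
        return -1;

    /*
     * These stores take a write fault and dirty the pages. On ext3 as
     * described above, the data blocks behind them are allocated only
     * later, when the flusher thread calls writepage().
     */
    memset(map, 'x', len);

    /* Force writeback so the blocks actually get allocated. */
    int ret = msync(map, len, MS_SYNC);
    if (munmap(map, len) < 0)
        ret = -1;
    return ret;
}
```

Run by an unprivileged user who is already at their quota limit, this is the kind of write that would be expected to slip past the quota check.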

The following four patches try to address this problem. They implement a
page_mkwrite() callback which allocates all necessary metadata and reserves
space for the data block (this is the main difference from the patches I was
sending last autumn, which did not allocate metadata). Then at writepage()
(or write()) time the reservation gets converted into a real block allocation.
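
To make the two phases concrete, here is a minimal userspace sketch of the accounting idea. It is not the code from the patches: struct fs_space and both function names are invented for illustration, and it models only free space (no quota, no metadata, no locking or per-cpu counters).

```c
#include <errno.h>

/* Hypothetical, simplified stand-in for per-filesystem space accounting. */
struct fs_space {
    unsigned long free_blocks;     /* blocks not yet allocated on disk */
    unsigned long reserved_blocks; /* promised at fault time, not yet allocated */
};

/*
 * Fault time (page_mkwrite() analogue): promise one data block or fail.
 * The error is returned in the context of the faulting task, which is
 * where a limit check can still be attributed to the right user.
 */
int reserve_data_block(struct fs_space *fs)
{
    if (fs->reserved_blocks >= fs->free_blocks)
        return -ENOSPC;
    fs->reserved_blocks++;
    return 0;
}

/*
 * Writeback time (writepage() analogue): turn one reservation into a
 * real allocation. In this model it cannot run out of space, because
 * the block was already set aside when the page was first dirtied.
 */
void convert_reservation(struct fs_space *fs)
{
    fs->reserved_blocks--;
    fs->free_blocks--;
}
```

The point of the split is that the check which can fail happens at fault time, while the later conversion only consumes what was already promised.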

With this implementation I don't see any performance difference under a heavy
Berkeley DB load compared to ext3 without these patches. Simply allocating
blocks in page_mkwrite() ends up being about 3x slower than this reservation
scheme because of fragmentation.

I've tested the patches on both x86_64 (1K and 4K blocksize) and ppc with 64k
pages (1K and 4K blocksize) to catch possible bugs. I've also run tests in
ENOSPC conditions and in conditions where quota is being exceeded. All these
tests run fine with this version of the patches (actually, I've triggered two
genuine ext3 bugs during this testing, fixes for which I'm going to merge
separately).
So I'd like to merge these patches but before I do that I'd like another
pair of eyes to have a look at these changes... So comments are welcome.

One more note: as we discussed at LSF, we plan to remove the ext3 driver
from the kernel. But that is still going to take a significant amount of time
(more than a year), so I'd like to have this serious issue fixed in ext3.
Honza