linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kara <jack@suse.cz>
To: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: Jan Kara <jack@suse.cz>, Eric Sandeen <sandeen@redhat.com>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: [PATCH 0/3] ext4: don't use quota reservation for speculative metadata blocks
Date: Mon, 12 Apr 2010 16:24:05 +0200	[thread overview]
Message-ID: <20100412142404.GC3404@quack.suse.cz> (raw)
In-Reply-To: <871vekhigl.fsf@openvz.org>

On Mon 12-04-10 18:08:10, Dmitry Monakhov wrote:
> Jan Kara <jack@suse.cz> writes:
> 
> >> Jan Kara <jack@suse.cz> writes:
> >> 
> >> >> Eric Sandeen <sandeen@redhat.com> writes:
> >> >> 
> >> >> > Dmitry Monakhov wrote:
> >> >> >> Eric Sandeen <sandeen@redhat.com> writes:
> >> >> >> 
> >> >> >>> Because we can badly over-reserve metadata when we
> >> >> >>> calculate worst-case, it complicates things for quota, since
> >> >> >>> we must reserve and then claim later, retry on EDQUOT, etc.
> >> >> >>> Quota is also a generally smaller pool than fs free blocks,
> >> >> >>> so this over-reservation hurts more, and more often.
> >> >> >>>
> >> >> >>> I'm of the opinion that it's not the worst thing to allow
> >> >> >>> metadata to push a user slightly over quota.  This simplifies
> >> >> >>> the code and avoids the false quota rejections that result
> >> >> >>> from worst-case speculation.
> >> >> >> Hm.. Totally agree with issue description. And seem there is no another
> >> >> >> solution except yours.
> >> >> >> ASAIU alloc_nofail is called from places where it is impossible to fail
> >> >> >> an allocation even if something goes wrong.
> >> >> >> I ask because currently i'm working on EIO handling in alloc/free calls.
> >> >> >> I've found that it is useless to fail claim/free procedures because
> >> >> >> caller is unable to handle it properly.
> >> >> >> It is impossible to fail following operation
> >> >> >> ->writepage
> >> >> >>  ->dquot_claim_space (what to do if EIO happens?)
> >> >> >
> >> >> > Hm, if these start returning EIO then maybe my patch should be modified
> >> >> > to treat EDQUOT differently than EIO ... assuming callers can handle
> >> >> > the return at all.
> >> >> >
> >> >> > In other words, make NOFAIL really just mean "don't fail for EDQUOT"
> >> >> Yes. agree So we have two types of errors
> >> >> 1) expected errors: EDQUOT
> >> >> 2) fatal errors: (EIO/ENOSPC/ENOMEM)
> >> >> So we need two types of flags:
> >> >> 1)FORCE (IMHO it is better name than you proposed) to allow exceed a
> >> >>   quota limit
> >> >> 2)NOFAIL to allow ignore fatal errors.
> >> >> 
> >> >> We still need NOFAIL, because for example if something is happens in
> >> >> ->write_page()
> >> >>  ->dquot_claim()
> >> >>      update_quota() -> EIO  /* update disk quota */
> >> >>      update_bytes() /* update i_bytes count */
> >> >> It is obvious that write_page should fail because it is too late to
> >> >> return the error to userspace, so data will probably lost which
> >> >> is much more dramatic bug than quota inconsistency.
> >> >> So the only options we have is to:
> >> >> 1) Do not modify inode->i_bytes and return error which caller will
> >> >>    probably ignore. IMHO this is not good because result in
> >> >>    incorrect stat()
> >> >> 
> >> >> 2) do as much as we can (as it happens for now), modify inode->i_bytes
> >> >>    and return positive error code to caller.(which signal what error
> >> >>    result in quota inconsystency only)
> >> >   Yes, agreed that 2) is a better solution.
> >> >
> >> >> This fatal errors handling logic i'll post on top of your patch-set.
> >> >> But please change flag name from NOFAIL to FORCE.
> >> >   Hmm, do we really need to distinguish between your NOFAIL and FORCE?
> >> > I mean there are places where we can handle quota failures (both EDQUOT
> >> > or others) and places where we cannot and then we just want to go on as
> >> > seamlessly as possible. So NOFAIL flag seems to be enough...
> >> >   Now I agree that in theory there can be some caller which might wish
> >> > to seamlessly continue on EDQUOT and bail out on EIO but I'm not aware
> >> > of such callsite currently so there's no immediate need for the flag.
> >> > So Eric's patches seem to be fine to me as they are. What do you think?
> >> FORCE but without NOFAIL will be useful for example in
> >> write to fallocated area
> >> extent split/convert (uninitialized=>initialized conversion)
> >> It is not good idea to return EDQUOT from write to reserved area
> >> due to metadata overhead, but it is easy to handle EIO from that method.
> >> So IMHO two flags is not a fancy option, but reasonable design solution.
> >> This design confirms to right for openvz's userbean counters.
> >> The only thing i ask is to rename NOFAIL => FORCE.
> >   But as I wrote in my other email that callsite in ext4 really needs NOFAIL,
> > not only FORCE as far as I can see. So Eric's patch is actually right, isn't
> > it?
> Ohh. Seems we are talking about the same thing but my explanation is
> not clear (sorry for my crappy English)
> The whole patch is definitely right. All places changed has NOFAIL
> (it must succeed in 100% of cases)semantics. And it is reasonable to
> accept it as is.
> Later i'll add new FORCE flag and:
> 1) convert all NOFAIL callers to (FORCE|NOFAIL)
> 2) add new callers of FORCE.
  OK :) This is fine with me. I'd just slightly favour "NOLIMIT" instead
of "FORCE" since it explains better what the flag does.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

  reply	other threads:[~2010-04-12 14:23 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-07 21:45 [PATCH 0/3] ext4: don't use quota reservation for speculative metadata blocks Eric Sandeen
2010-04-07 21:49 ` [PATCH 1/3] quota: use flags interface for dquot alloc/free space Eric Sandeen
2010-04-12 13:22   ` Jan Kara
2010-05-20 22:29   ` tytso
2010-04-07 21:52 ` [PATCH 2/3] quota: add the option to not fail with EDQUOT in block allocation Eric Sandeen
2010-04-12 13:23   ` Jan Kara
2010-05-20 22:32   ` tytso
2010-04-07 21:55 ` [PATCH 3/3] ext4: don't use quota reservation for speculative metadata blocks Eric Sandeen
2010-04-12 13:36   ` Jan Kara
2010-05-20 23:10   ` tytso
2010-04-08  8:20 ` [PATCH 0/3] " Dmitry Monakhov
2010-04-08 15:28   ` Eric Sandeen
2010-04-11  9:37     ` Dmitry Monakhov
2010-04-12 13:15       ` Jan Kara
2010-04-12 13:22         ` Jan Kara
2010-04-12 13:40           ` Dmitry Monakhov
2010-04-12 13:33         ` Dmitry Monakhov
2010-04-12 13:37           ` Dmitry Monakhov
2010-04-12 13:42           ` tytso
2010-04-12 13:52             ` Dmitry Monakhov
2010-04-12 13:55           ` Jan Kara
2010-04-12 14:08             ` Dmitry Monakhov
2010-04-12 14:24               ` Jan Kara [this message]
2010-04-12 15:50                 ` Eric Sandeen
2010-04-12  2:20 ` tytso
2010-04-12 14:03   ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100412142404.GC3404@quack.suse.cz \
    --to=jack@suse.cz \
    --cc=dmonakhov@openvz.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=sandeen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).