From: Jeff Bonwick <Jeff.Bonwick@sun.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] Moving forward on Quotas
Date: Sat, 31 May 2008 21:53:02 -0700
Message-ID: <20080601045302.GA29979@eng.sun.com>
In-Reply-To: <C46829E1.5610%peter.braam@sun.com>

I'd suggest working with Matt Ahrens on this.

Jeff

On Sun, Jun 01, 2008 at 10:26:41AM +0800, Peter Braam wrote:
> Jeff -
>
> could you get in touch with Nikita and Ricardo and assist them with a draft
> of quota design for the DMU. Nikita has some interesting API proposals, but
> there are some pretty deep ZFS issues involved where help would be welcome,
> as far as I can see.
>
> Just as a heads up, quota in systems like Lustre is quite a difficult issue,
> as many servers contribute to quota usage, and this requires "acquire" and
> "release" of quota in reasonable chunks to keep the server-to-server protocol
> from getting too chatty.
>
> Thank you for your help!
>
> Peter
>
>
> On 5/28/08 10:54 PM, "Nikita Danilov" <Nikita.Danilov@Sun.COM> wrote:
>
> > Ricardo M. Correia writes:
> >> On Ter, 2008-05-27 at 07:28 +0800, Peter Braam wrote:
> >>
> >>>> As an aside, if I were designing quota from scratch right now, I
> >>>> would implement it completely inside of Lustre. All that is needed for
> >>>> such an implementation is a set of call-backs that the local file system
> >>>> invokes when it allocates/frees blocks (or inodes) for a given
> >>>> object. Lustre would use these call-backs to transactionally update
> >>>> local quota in its own format. That would save us a lot of the hassle we
> >>>> have dealing with the changing kernel quota interfaces, uid re-mappings,
> >>>> and subtle differences between quota implementations on different file
> >>>> systems.
> >>>
> >>> ======> IMPORTANT: get in touch with Jeff Bonwick now, let's get quota
> >>> implemented in this way in DMU then.
> >>
> >>
> >> I think this was proposed by Alex before, but AFAIU the conclusion is
> >> that this was not possible to do with ZFS (or at least, not easy to do).
> >>
> >> The problem is that ZFS uses delayed allocations, i.e., allocations
> >> occur long after a transaction group has been closed, and therefore we
> >> can't transactionally keep track of allocated space because by the time
> >> the callbacks were called we are not allowed to write to the transaction
> >> group anymore, since another 2 txgs could have been opened already.
> >
> > But that problem has to be solved anyway to implement per-user quotas
> > for ZFS, correct?
> >
> > One possible solution I see is to use something like ZIL to log
> > operations in the context of current transaction group. This log can be
> > replayed during mount to update quota file.
> >
> >>
> >> Since this couldn't be done transactionally, if the node crashes, there
> >> would be no way of knowing how many blocks had been allocated on the
> >> latest (actually, the latest 2) committed transaction groups.
> >>
> >> Regards,
> >> Ricardo
> >
> > Nikita.
>
>
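
For illustration only, here is a minimal sketch of the chunked "acquire" and "release" idea Peter describes in the quoted text. It assumes a single hypothetical quota master and per-server slaves that cache grants locally; the class names, the chunk size, and the release threshold are all invented for this example and are not part of Lustre or the DMU.

```python
class QuotaMaster:
    """Hypothetical holder of one user's global quota (not a Lustre API)."""

    def __init__(self, limit):
        self.limit = limit      # total blocks the user may consume
        self.granted = 0        # blocks currently handed out to servers
        self.rpcs = 0           # counts server-to-server round trips

    def acquire(self, amount):
        # Grant as much of the request as the global limit still allows.
        self.rpcs += 1
        grant = min(amount, self.limit - self.granted)
        self.granted += grant
        return grant

    def release(self, amount):
        self.rpcs += 1
        self.granted -= amount


class QuotaSlave:
    """Hypothetical per-server cache of quota granted by the master."""

    def __init__(self, master, chunk):
        self.master = master
        self.chunk = chunk      # assumed grant granularity, in blocks
        self.local = 0          # granted but not yet used
        self.used = 0

    def allocate(self, blocks):
        if blocks > self.local:
            # Ask for the shortfall rounded up to whole chunks, so that
            # many small allocations share a single round trip.
            need = blocks - self.local
            want = -(-need // self.chunk) * self.chunk
            self.local += self.master.acquire(want)
        if blocks > self.local:
            return False        # over quota
        self.local -= blocks
        self.used += blocks
        return True

    def free(self, blocks):
        self.used -= blocks
        self.local += blocks
        # Hand back the surplus once two chunks sit idle (assumed policy).
        if self.local >= 2 * self.chunk:
            surplus = self.local - self.chunk
            self.master.release(surplus)
            self.local -= surplus
```

With a chunk of 100 blocks, fifty single-block allocations on one server cost one round trip to the master instead of fifty. The trade-off is that quota cached on one server is unavailable to the others until it is released.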