linux-ext4.vger.kernel.org archive mirror
* Updated ext4 quota design document
@ 2010-06-21 12:29 Theodore Ts'o
  2010-06-22 14:20 ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: Theodore Ts'o @ 2010-06-21 12:29 UTC
  To: linux-ext4, Jan Kara


I've created an updated quota design document here:

https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4

No major changes from last time.

One new thing is a proposed (optional) change to the quota format,
to use the 32-bit dqb_pad field in the v2r1 on-disk quota structure as
a 32-bit CRC of the quota entry.  This would allow the quota system to
detect corrupted quota entries.  Jan, what do you think?
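
A minimal userspace sketch of the per-entry idea, for illustration only.
The struct mirrors the kernel's struct v2r1_disk_dqblk (where the pad field
is named dqb_pad) but uses host-endian integers for brevity. The thread does
not say which CRC polynomial or coverage is intended, so the CRC-32 (IEEE)
routine, the choice to checksum the entry with the pad field zeroed, and the
names crc32_ieee, dqblk_csum, dqblk_seal and dqblk_verify are all
assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the v2r1 on-disk entry (host-endian, illustration only). */
struct disk_dqblk {
	uint32_t dqb_id;
	uint32_t dqb_pad;		/* proposed home for the 32-bit CRC */
	uint64_t dqb_ihardlimit;
	uint64_t dqb_isoftlimit;
	uint64_t dqb_curinodes;
	uint64_t dqb_bhardlimit;
	uint64_t dqb_bsoftlimit;
	uint64_t dqb_curspace;
	uint64_t dqb_btime;
	uint64_t dqb_itime;
};

static_assert(sizeof(struct disk_dqblk) == 72, "v2r1 entries are 72 bytes");

/* Bitwise CRC-32 (IEEE, reflected). The polynomial is an assumption here. */
uint32_t crc32_ieee(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return ~crc;
}

/* Checksum the entry as if dqb_pad were zero, so the CRC does not cover itself. */
uint32_t dqblk_csum(const struct disk_dqblk *dq)
{
	struct disk_dqblk tmp = *dq;

	tmp.dqb_pad = 0;
	return crc32_ieee(&tmp, sizeof(tmp));
}

/* Store the checksum in the pad field before the entry is written out. */
void dqblk_seal(struct disk_dqblk *dq)
{
	dq->dqb_pad = dqblk_csum(dq);
}

/* Returns nonzero if the stored checksum matches the entry contents. */
int dqblk_verify(const struct disk_dqblk *dq)
{
	return dq->dqb_pad == dqblk_csum(dq);
}
```

Only the entry being updated has to be re-sealed; neighbouring entries in
the same block are not read or touched.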

       		       		      	      - Ted




* Re: Updated ext4 quota design document
  2010-06-21 12:29 Updated ext4 quota design document Theodore Ts'o
@ 2010-06-22 14:20 ` Jan Kara
  2010-06-22 20:08   ` tytso
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2010-06-22 14:20 UTC
  To: Theodore Ts'o; +Cc: linux-ext4, Jan Kara

  Hi,

On Mon 21-06-10 08:29:06, Theodore Ts'o wrote:
> I've created an updated quota design document here:
> 
> https://ext4.wiki.kernel.org/index.php/Design_For_1st_Class_Quota_in_Ext4
> 
> No major changes from last time.
> 
> One new thing is a proposed (optional) change to the quota format,
> to use the 32-bit dqb_pad field in the v2r1 on-disk quota structure as
> a 32-bit CRC of the quota entry.  This would allow the quota system to
> detect corrupted quota entries.  Jan, what do you think?
  It might be reasonable to checksum dquots so that we get closer to
all-metadata-are-checksummed state. I'm just thinking whether checksumming
each dquot is so useful. For example OCFS2 checksums each quota block. That
has an advantage that also quota file tree blocks and headers are
protected. Also it's possible to use the generic block checksumming
framework in JBD2 for this case. OTOH ext4 seems to have chosen to checksum
each group descriptor individually so checksumming each dquot structure
would seem more consistent.
  So I don't have a strong opinion which checksumming scheme should be
chosen. I just wanted to point out that there's another reasonable option.
Generic quota code can easily handle both (including leaving some bytes at
the end of each block for checksums as it does for OCFS2 now).

									Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: Updated ext4 quota design document
  2010-06-22 14:20 ` Jan Kara
@ 2010-06-22 20:08   ` tytso
  2010-06-22 20:29     ` Jan Kara
  0 siblings, 1 reply; 7+ messages in thread
From: tytso @ 2010-06-22 20:08 UTC
  To: Jan Kara; +Cc: linux-ext4

On Tue, Jun 22, 2010 at 04:20:47PM +0200, Jan Kara wrote:
>   It might be reasonable to checksum dquots so that we get closer to
> all-metadata-are-checksummed state. I'm just thinking whether checksumming
> each dquot is so useful. For example OCFS2 checksums each quota block. That
> has an advantage that also quota file tree blocks and headers are
> protected. Also it's possible to use the generic block checksumming
> framework in JBD2 for this case. OTOH ext4 seems to have chosen to checksum
> each group descriptor individually so checksumming each dquot structure
> would seem more consistent.

Well, the reason why I suggested just checksumming each quota entry
is that it was the simplest thing to do, and wouldn't require making
huge changes to the rest of the quota_tree code.  It also means we
don't need to do any kind of special locking to make sure there isn't
another process modifying another quota entry in the same block at the
same time that we are calculating the per-block checksum --- i.e.,
some of the headaches that we're seeing with the DIF code.

>   So I don't have a strong opinion which checksumming scheme should be
> chosen. I just wanted to point out that there's another reasonable option.
> Generic quota code can easily handle both (including leaving some bytes at
> the end of each block for checksums as it does for OCFS2 now).

I assume OCFS2 is just using dqdh_pad2 or dqdh_pad1 for its checksum?

  	       	       	     	       	  	    - Ted


* Re: Updated ext4 quota design document
  2010-06-22 20:08   ` tytso
@ 2010-06-22 20:29     ` Jan Kara
  2010-06-22 21:52       ` tytso
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Kara @ 2010-06-22 20:29 UTC
  To: tytso; +Cc: Jan Kara, linux-ext4

On Tue 22-06-10 16:08:53, tytso@mit.edu wrote:
> On Tue, Jun 22, 2010 at 04:20:47PM +0200, Jan Kara wrote:
> >   It might be reasonable to checksum dquots so that we get closer to
> > all-metadata-are-checksummed state. I'm just thinking whether checksumming
> > each dquot is so useful. For example OCFS2 checksums each quota block. That
> > has an advantage that also quota file tree blocks and headers are
> > protected. Also it's possible to use the generic block checksumming
> > framework in JBD2 for this case. OTOH ext4 seems to have chosen to checksum
> > each group descriptor individually so checksumming each dquot structure
> > would seem more consistent.
> 
> Well, the reason why I suggested just checksumming each quota entry
> is that it was the simplest thing to do, and wouldn't require making
> huge changes to the rest of the quota_tree code.  It also means we
> don't need to do any kind of special locking to make sure there isn't
> another process modifying another quota entry in the same block at the
> same time that we are calculating the per-block checksum --- i.e.,
> some of the headaches that we're seeing with the DIF code.
  With metadata which get journaled it should be quite easy. JBD already
must know before you go and modify buffer contents - that's why
journal_get_write_access and friends exist. It also makes sure that your
data cannot be modified from the moment the buffer enters commit up to the
moment the commit is finished. So you can use buffer commit hook to compute
and store block checksum safely.

> >   So I don't have a strong opinion which checksumming scheme should be
> > chosen. I just wanted to point out that there's another reasonable option.
> > Generic quota code can easily handle both (including leaving some bytes at
> > the end of each block for checksums as it does for OCFS2 now).
> 
> I assume OCFS2 is just using dqdh_pad2 or dqdh_pad1 for its checksum?
  No. OCFS2 sets info->dqi_usable_bs to something smaller than the block
size, 1 << info->dqi_blocksize_bits. Thus quota code leaves a few bytes in
each block unused and ocfs2 stores the checksum there.
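
A rough userspace sketch of the reserved-tail layout described above. The
block size, the size of the reserved tail, the CRC-32 (IEEE) routine and the
names qblock_seal and qblock_verify are assumptions made for illustration;
this is not the OCFS2 code, which has its own block-check structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
	BLOCKSIZE_BITS = 10,				/* hypothetical 1 KiB quota block */
	BLOCK_SIZE     = 1 << BLOCKSIZE_BITS,
	RESERVED_TAIL  = 8,				/* hypothetical reserved bytes */
	USABLE_BS      = BLOCK_SIZE - RESERVED_TAIL,	/* plays the role of dqi_usable_bs */
};

static_assert(RESERVED_TAIL >= sizeof(uint32_t), "tail must hold the checksum");

/* Bitwise CRC-32 (IEEE, reflected). The polynomial is an assumption here. */
uint32_t crc32_ieee(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t crc = 0xFFFFFFFFu;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1u));
	}
	return ~crc;
}

/* Checksum the usable area and store the result (host-endian) in the tail. */
void qblock_seal(unsigned char *block)
{
	uint32_t crc = crc32_ieee(block, USABLE_BS);

	memcpy(block + USABLE_BS, &crc, sizeof(crc));
}

/* Returns nonzero if the tail checksum matches the usable area. */
int qblock_verify(const unsigned char *block)
{
	uint32_t stored;

	memcpy(&stored, block + USABLE_BS, sizeof(stored));
	return stored == crc32_ieee(block, USABLE_BS);
}
```

Because the checksum covers the whole usable area, tree blocks and headers
are protected too, but any change to any entry in the block requires the
block to be stable while it is re-sealed.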

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: Updated ext4 quota design document
  2010-06-22 20:29     ` Jan Kara
@ 2010-06-22 21:52       ` tytso
  2010-06-23 12:30         ` Jan Kara
  2010-07-02  7:41         ` Dmitry Monakhov
  0 siblings, 2 replies; 7+ messages in thread
From: tytso @ 2010-06-22 21:52 UTC
  To: Jan Kara; +Cc: linux-ext4

On Tue, Jun 22, 2010 at 10:29:27PM +0200, Jan Kara wrote:
>   With metadata which get journaled it should be quite easy. JBD already
> must know before you go and modify buffer contents - that's why
> journal_get_write_access and friends exist. It also makes sure that your
> data cannot be modified from the moment the buffer enters commit up to the
> moment the commit is finished. So you can use buffer commit hook to compute
> and store block checksum safely.

True, but we're also interested in making sure this feature can be
used in the non-journal case as well....

     						- Ted


* Re: Updated ext4 quota design document
  2010-06-22 21:52       ` tytso
@ 2010-06-23 12:30         ` Jan Kara
  2010-07-02  7:41         ` Dmitry Monakhov
  1 sibling, 0 replies; 7+ messages in thread
From: Jan Kara @ 2010-06-23 12:30 UTC
  To: tytso; +Cc: Jan Kara, linux-ext4

On Tue 22-06-10 17:52:00, tytso@mit.edu wrote:
> On Tue, Jun 22, 2010 at 10:29:27PM +0200, Jan Kara wrote:
> >   With metadata which get journaled it should be quite easy. JBD already
> > must know before you go and modify buffer contents - that's why
> > journal_get_write_access and friends exist. It also makes sure that your
> > data cannot be modified from the moment the buffer enters commit up to the
> > moment the commit is finished. So you can use buffer commit hook to compute
> > and store block checksum safely.
> 
> True, but we're also interested in making sure this feature can be
> used in the non-journal case as well....
  Ah, I forgot about that... Doing per-block checksums in that case would
indeed be more complicated. So computing a checksum for each dquot is
probably simpler.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR


* Re: Updated ext4 quota design document
  2010-06-22 21:52       ` tytso
  2010-06-23 12:30         ` Jan Kara
@ 2010-07-02  7:41         ` Dmitry Monakhov
  1 sibling, 0 replies; 7+ messages in thread
From: Dmitry Monakhov @ 2010-07-02  7:41 UTC
  To: tytso; +Cc: Jan Kara, linux-ext4

tytso@mit.edu writes:

> On Tue, Jun 22, 2010 at 10:29:27PM +0200, Jan Kara wrote:
>>   With metadata which get journaled it should be quite easy. JBD already
>> must know before you go and modify buffer contents - that's why
>> journal_get_write_access and friends exist. It also makes sure that your
>> data cannot be modified from the moment the buffer enters commit up to the
>> moment the commit is finished. So you can use buffer commit hook to compute
>> and store block checksum safely.
>
> True, but we're also interested in making sure this feature can be
> used in the non-journal case as well....
Q: What is the use case for non-journal quota?
A: AFAIU the answer will be "GFS chunkservers".
Is there any chance that quota will be consistent with real space usage
after a failure? Currently the difference may be huge.
BTW: AFAIU it is not safe to use an unclean fs in nojournal mode
without an explicit e2fsck.  And AFAIU that is the reason why nojournal
users use replication or some other redundancy mechanism to protect
data and just throw away broken data after any failure on a single node.

