From: Edward Shishkin <edward@namesys.com>
To: Reiserfs List <reiserfs-list@namesys.com>
Subject: Handling holes by reiser4 cryptcompress (crc) object plugin
Date: Tue, 08 Feb 2005 18:30:02 +0300
Message-ID: <4208DAFA.5090007@namesys.com>
Hi everyone.
The holes design described below looks pretty reasonable, and Hans asked
me to bring it to the list to discuss possible hidden issues.
First, a short report on how the crc plugin works.
All crc-file operations are performed per cluster. The cluster size is an
attribute of the crc-file, assigned by the user as N*PAGE_SIZE (N == 1, 2,
4, 8, 16). So each crc-file is considered as a set of chunks, and the crc
object plugin manages the following objects:
.page cluster of index I (a set of N pages, the first one has index
I*PAGE_SIZE),
.disk cluster of key K (a set of mergeable items (powered by the reiser4
ctail item plugin), the first of which has key K).
Disk clusters contain processed (compressed, encrypted) data, whereas
page clusters contain plain text.
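
To make the per-cluster addressing concrete, here is a minimal sketch (in
Python, not reiser4 code) of how a byte range maps onto clusters. It
assumes that a disk cluster's key can be modelled as the byte offset at
which the cluster starts, which matches the keys shown in the example
further down (64K, 128K). PAGE_SIZE == 4096 and the names cluster_size()
and cluster_keys() are assumptions made for illustration only.

```python
PAGE_SIZE = 4096  # assumed page size; the real value depends on the architecture


def cluster_size(n_pages):
    """Cluster size in bytes for N pages; N must be 1, 2, 4, 8 or 16."""
    if n_pages not in (1, 2, 4, 8, 16):
        raise ValueError("N must be one of 1, 2, 4, 8, 16")
    return n_pages * PAGE_SIZE


def cluster_keys(offset, length, clsize):
    """Keys (modelled as starting byte offsets) of the clusters that
    overlap the byte range [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // clsize                  # index of the first cluster touched
    last = (offset + length - 1) // clsize    # index of the last cluster touched
    return [index * clsize for index in range(first, last + 1)]
```

For instance, with 64K clusters (N == 16 under the assumed page size), a
100K write starting at offset 70K touches the clusters that start at 64K
and 128K.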
The file can contain holes. Currently, if we create a hole which occupies
more than one cluster, we don't represent it by any items. We say that the
corresponding disk cluster is _fake_ (it exists neither in memory nor on
disk). Partial holes are represented by real disk clusters (a hole is
partial if it starts at a non-zero offset within the cluster).
The crc-specific file_read(), mmap(), etc. call readpage() and readpages(),
which fill pages with plain text prepared from the data contained in the
corresponding disk cluster; if that disk cluster is not found (the dc is
fake), the pages are filled with zeroes. Note that we take care that this
is "not found" in the good sense: the search routine should return
NOTFOUND (not an error).
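
The read path can be sketched roughly as follows. This is a toy model in
Python, not the plugin's code: the tree is modelled as a dict from key to
processed data, zlib stands in for whatever compression the file uses
(encryption is left out), and the names NOTFOUND, find_disk_cluster() and
read_page_cluster() are hypothetical.

```python
import zlib

NOTFOUND = object()  # hypothetical sentinel: "no such disk cluster", not an error


def find_disk_cluster(tree, key):
    """Look up a disk cluster by key; a missing cluster yields NOTFOUND
    instead of raising, so callers can tell a hole from a failure."""
    return tree.get(key, NOTFOUND)


def read_page_cluster(tree, key, clsize):
    """Return the plain text of the page cluster that corresponds to the
    disk cluster with the given key."""
    dc = find_disk_cluster(tree, key)
    if dc is NOTFOUND:
        # Fake disk cluster: the whole page cluster reads back as zeroes.
        return bytes(clsize)
    plain = zlib.decompress(dc)
    # Model assumption: a short cluster is zero-padded to the full size.
    return plain.ljust(clsize, b"\0")
```

The point of the sentinel is that a fake disk cluster is an expected
result of the search and is handled on the normal path, while a genuine
lookup failure would be reported separately.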
If we modify a hole page cluster (by crc-specific file_write or an mmapped
write) and the corresponding disk cluster is fake, then a real disk cluster
will be created. If we truncate a hole page cluster up or down and the
corresponding dc is fake, we don't create a real one (we just update the
stat-data).
Example of a possible crc-file evolution (cluster size == 64K):
(Operation / Resulting disk structure)
1. create                  / disk stat-data (i_size == 0)
2. truncate up to 10G      / disk stat-data (i_size == 10G)
3. truncate down to 20M    / disk stat-data (i_size == 20M)
4. write 100K from off 70K / disk stat-data (i_size == 20M) + 2 disk
                             clusters (key1 == 64K, key2 == 128K)
5. truncate down to 100K   / disk stat-data (i_size == 100K) + 1 disk
                             cluster (key1 == 64K)
6. truncate down to 50K    / disk stat-data (i_size == 50K)
etc...
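
The evolution above can be replayed with a small toy model. This is only a
sketch in Python under stated assumptions, not the plugin's logic: real
disk clusters are tracked as a set of keys, a key is modelled as the byte
offset where its cluster starts, and truncating down is assumed to drop
exactly those clusters whose key is at or beyond the new size. The class
name CrcFileModel and its methods are hypothetical.

```python
K, M, G = 1024, 1024 ** 2, 1024 ** 3


class CrcFileModel:
    """Toy model: stat-data size plus the set of real disk cluster keys."""

    def __init__(self, clsize=64 * K):
        self.clsize = clsize
        self.i_size = 0        # kept in the stat-data
        self.clusters = set()  # keys of real (non-fake) disk clusters

    def truncate(self, new_size):
        # Truncating never creates clusters. Going down drops every real
        # cluster that starts at or beyond the new size (model assumption).
        if new_size < self.i_size:
            self.clusters = {k for k in self.clusters if k < new_size}
        self.i_size = new_size

    def write(self, offset, length):
        # Writing makes every touched cluster real, even inside a hole.
        if length <= 0:
            return
        first = offset // self.clsize
        last = (offset + length - 1) // self.clsize
        for index in range(first, last + 1):
            self.clusters.add(index * self.clsize)
        self.i_size = max(self.i_size, offset + length)


f = CrcFileModel()           # 1. create
f.truncate(10 * G)           # 2. truncate up to 10G: stat-data only
f.truncate(20 * M)           # 3. truncate down to 20M: stat-data only
f.write(70 * K, 100 * K)     # 4. two real clusters appear
after_write = sorted(f.clusters)
f.truncate(100 * K)          # 5. only the first of them survives
after_trunc = sorted(f.clusters)
f.truncate(50 * K)           # 6. no disk clusters remain
```

In this model the 10G hole created in step 2 never costs anything beyond
the stat-data update, which is the property the design is after.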
This design has been implemented and seems to work on various benchmarks
that create intensive hole fragmentation, but ... so all questions and
suggestions are welcome.
Thanks,
Edward.