* [FEATURE][PATCH 0/2] reiser4: Auto-punching holes on commit
From: Edward Shishkin @ 2015-07-19 15:42 UTC
To: ReiserFS development mailing list
Auto-punching holes on commit
Storing zeros on disk is a rather stupid business. Indeed, right before
writing data to disk we can convert zeros to holes (these are abstract
objects described in POSIX) and hence save a lot of disk space.

Compressing zeros before storing them on disk is an even more stupid
business: checking for zeros is a less expensive procedure than a
compression transform, so in addition we can save a lot of CPU
resources.
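To illustrate the cost argument, here is a minimal Python sketch (not reiser4 code; the helper names are hypothetical and zlib only stands in for the real compression transform). The zero check can stop at the first non-zero byte, whereas a compressor has to process the whole chunk.

```python
import zlib

CLUSTER_SIZE = 65536  # assumed logical chunk size, same as bs=65536 used with dd


def is_zero_chunk(chunk: bytes) -> bool:
    """Return True if every byte of the chunk is zero.

    any() stops at the first non-zero byte, so mixed data is rejected early.
    """
    return not any(chunk)


def store_chunk(chunk: bytes):
    """Return None for a hole, otherwise the compressed chunk."""
    if is_zero_chunk(chunk):
        return None              # nothing to store: the chunk becomes a hole
    return zlib.compress(chunk)  # stand-in for the compression transform
```

With this ordering the compressor never sees a chunk of zeros at all.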
Let me recall how reiser4 implements holes.
The unix file plugin represents them via extent pointers marked in a
special way. The situation with the cryptcompress file plugin is
simpler: it represents holes as literal holes (that is, the absence of
any items with specific keys). It means that we can simply check for and
remove all items that represent a logical chunk filled with zeros. This
is exactly what we now do at flush time, right before commit.
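The "literal hole" representation can be modelled roughly as follows. This is only a sketch under stated assumptions: a Python dict stands in for the tree, the (object id, cluster index) key is hypothetical, and the real keys and items look different.

```python
CLUSTER_SIZE = 65536  # assumed logical cluster size

# Hypothetical stand-in for the tree: (object_id, cluster_index) -> stored bytes.
tree = {}


def read_cluster(object_id: int, index: int) -> bytes:
    """A missing item is a hole and reads back as zeros."""
    data = tree.get((object_id, index))
    if data is None:
        return bytes(CLUSTER_SIZE)
    return data.ljust(CLUSTER_SIZE, b"\0")


def punch_if_zero(object_id: int, index: int) -> bool:
    """Remove the item of a cluster filled with zeros; return True if removed."""
    data = tree.get((object_id, index))
    if data is not None and not any(data):
        del tree[(object_id, index)]
        return True
    return False
```

Readers see the same bytes before and after the punch; only the stored representation changes.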
The best time for such a check is the atom's flush, whose job is to
complete all delayed actions. Specifically, it calls a static machine
->convert_node() for all dirty formatted nodes. This machine scans all
items of a node and calls the ->convert() method of every such item.
We used this framework for transparent compression on commit
(specifically, to replace the old fragments that compose a compressed
file's body with the new ones). Now we also use it to punch holes at
logical chunks filled with zeros. That is, instead of replacing the old
items, we just remove them from the tree. Think of hole punching as one
more delayed action.
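The shape of such a flush-time pass can be sketched as below. This is a simplified Python model, not the kernel code: the Item and Node classes, the dirty map and the use of zlib are assumptions made for illustration; only the names convert_node and convert are taken from the description above.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Item:
    key: int      # hypothetical key: index of the logical chunk
    body: bytes   # stored fragment

    def convert(self, new_data: Optional[bytes]) -> Optional["Item"]:
        """Return the replacement item, or None if the item must be removed."""
        if new_data is None:
            return self                  # chunk not modified: keep the item
        if not any(new_data):
            return None                  # chunk of zeros: punch a hole
        return Item(self.key, zlib.compress(new_data))


@dataclass
class Node:
    items: List[Item] = field(default_factory=list)


def convert_node(node: Node, dirty: Dict[int, bytes]) -> None:
    """Scan all items of a node and apply convert() to each of them."""
    converted = (item.convert(dirty.get(item.key)) for item in node.items)
    node.items = [item for item in converted if item is not None]
```

Replacing and removing go through the same scan, which is why hole punching fits in as just one more delayed action.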
I have implemented hole punching only for the cryptcompress plugin. It
can also be implemented for the "classic" unix-file plugin, which doesn't
compress data. However, that would be more complicated because of the
more complicated format of holes. In the end, I think that having such a
feature for only one file plugin is enough.
Solved Problems:
When flushing modified dirty pages, the process should be able to find
in the tree the respective item group to be replaced with new data. So we
have to handle possible races where one process checks or creates the
items while the flushing process deletes those items during the hole
punching procedure. To avoid this situation we maintain a special
"economical" counter of checked-in modifications for every logical
cluster in struct jnode. If the counter is greater than 1, then we simply
don't punch a hole.
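The decision rule can be pictured with a small Python model. This is one reading of the paragraph above, not the actual implementation: the ClusterState class and its method names are hypothetical, and the model ignores locking and the exact moment when the counter is reset.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical per-cluster state; in reiser4 the counter lives in struct jnode."""
    data: bytes = b""
    checkins: int = 0  # modifications checked in for this logical cluster

    def checkin(self, data: bytes) -> None:
        """A writer checks in a modification of this logical cluster."""
        self.data = data
        self.checkins += 1

    def may_punch_hole(self) -> bool:
        """Punch only a cluster of zeros with at most one checked-in modification."""
        return self.checkins <= 1 and not any(self.data)
```

When a second modification has been checked in, the flusher keeps the items, so a concurrent process still finds them in the tree.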
Mount option "dont_punch_holes"
Since hole punching is a useful feature for both HDD and SSD, I enabled
it by default. To turn it off, use the mount option "dont_punch_holes".
The changes are backward and forward compatible, so no new format is
needed.
How it looks in practice:
# mkfs.reiser4 -f -y /dev/sdaX
# mount /dev/sdaX /mnt
# dd if=/dev/zero of=/mnt/foo bs=65536 count=1000
# umount /mnt
Now dump the tree:
# debugfs.reiser4 -t /dev/sdaX | less
As we can see (attachment 1), the file foo doesn't have a body, only
stat-data (the on-disk inode): we removed its body at flush time because
it is composed of zeros (see my remark above about holes). Let's now
append non-zero data to our file "foo":
# mount /dev/sdaX /mnt
# echo "This is not zeros" >> /mnt/foo
# umount /mnt
# debugfs.reiser4 -t /dev/sdaX | less
As we can see (attachment 2), the body of the file "foo" now consists of
only one item of length 59, which has offset 0x3e80000 (=65536000). This
is exactly the string "This is not zeros" supplemented with zeros up to
the page size (4096) and compressed by the LZO1 algorithm.
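The numbers in this paragraph can be rechecked with a few lines of Python. The constants come from the dd and echo commands above; the variable names are only illustrative.

```python
BLOCK_SIZE = 65536    # bs= argument of dd
BLOCK_COUNT = 1000    # count= argument of dd
PAGE_SIZE = 4096      # page size quoted in the text

# Size of the file of zeros, hence the offset at which the appended data starts.
append_offset = BLOCK_SIZE * BLOCK_COUNT

# echo appends a newline to the string.
appended = b"This is not zeros\n"

# Zero padding up to the page size, as described in the text.
padded = appended.ljust(PAGE_SIZE, b"\0")

print(hex(append_offset), append_offset, len(appended), len(padded))
# prints: 0x3e80000 65536000 18 4096
```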
*******************************************************************************
NOTE: with the feature of hole auto-punching some benchmarks won't produce
any visible IO load.
********************************************************************************
WARNING WARNING WARNING:
This is only for testing. Don't use it for important data for now!
********************************************************************************
If something goes wrong, then please let me know.
Thanks,
Edward.
[-- Attachment #2: sda7.1 --]
[-- Type: application/x-troff-man, Size: 993 bytes --]
[-- Attachment #3: sda7.2 --]
[-- Type: application/x-troff-man, Size: 1311 bytes --]
* Re: [FEATURE][PATCH 0/2] reiser4: Auto-punching holes on commit
From: Edward Shishkin @ 2015-07-19 17:47 UTC
To: ReiserFS development mailing list
On 07/19/2015 11:42 PM, Edward Shishkin wrote:
> As we can see (attachment 2) the body of the file "foo" now consists of
> only one item of length 59, which has offset 0x3e80000 (=65536000). This
> is exactly the string "This is not zeros" supplemented with zeros up to
> page size (4096) and compressed by LZO1 algorithm.
Sorry, I attached the wrong file as attachment 2. It should be the
following:

#3 CTAIL (ctail40): [2a:4(FB):666f6f00000000:10001:3e80000] OFF=340, LEN=19, flags=0x0 shift=16

That is, the body of the file "foo" consists of only one item of length 19
(= the length of the string "This is not zeros" with the newline added by
echo, 18 bytes, plus one byte in which the size of the logical cluster is
stored). Zeros at the end of the in-memory file are not stored on disk!
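The fields of the item line quoted above can be cross-checked with a short Python sketch. The interpretation of the fields is an assumption: the last key component is read as the hexadecimal offset, shift as log2 of the logical cluster size, and the third key component as the start of the file name.

```python
import re

# Item line quoted above, joined into one line.
LINE = ("#3 CTAIL (ctail40): [2a:4(FB):666f6f00000000:10001:3e80000] "
        "OFF=340, LEN=19, flags=0x0 shift=16")

match = re.search(r"\[([^\]]+)\].*LEN=(\d+).*shift=(\d+)", LINE)
key_fields = match.group(1).split(":")
item_len = int(match.group(2))
shift = int(match.group(3))

key_offset = int(key_fields[-1], 16)   # assumption: last key field is the hex offset
cluster_size = 1 << shift              # assumption: shift is log2 of the cluster size
name = bytes.fromhex(key_fields[2]).rstrip(b"\0")  # assumption: start of the file name
payload = b"This is not zeros\n"       # echo appends a newline

print(name, key_offset, cluster_size, key_offset // cluster_size,
      item_len - len(payload))
# prints: b'foo' 65536000 65536 1000 1
```

Under these assumptions the item belongs to logical cluster 1000 of "foo" and carries the 18 appended bytes plus one extra byte.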
Thanks,
Edward.