reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes

* reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes
@ 2014-09-10 19:00 Ivan Shapovalov
  2014-09-10 20:17 ` Edward Shishkin
  0 siblings, 1 reply; 10+ messages in thread
From: Ivan Shapovalov @ 2014-09-10 19:00 UTC (permalink / raw)
  To: reiserfs-devel


Hi!

The preamble: recently I had to force-change my configuration (the old laptop
was stolen). What I have now is a combination of a tiny 16 GiB SSD and a huge
1 TiB HDD.

...So I've placed my /home on HDD. Partition size is 800 GiB, formatting
options are "create=ccreg40,compress=gzip1,compressMode=latt" and I have a few
questions.

1. What is the recommended compression mode?
More specifically, what is the default "conv" mode? What is its purpose, why is
it the default?
I'm asking, because I wasn't able to understand its purpose from code, and the
code itself looks hackish in some places (hardcoded fallback to extent-only
files, hardcoded policy, hardcoded fallback to "latt" in many cases, etc).

2. The mount time of an 800-GiB partition is >20 seconds, while with
dont_load_bitmap it's around 1-2 seconds. Why so long? Why do other filesystems
have drastically shorter mount times? If they have an equivalent of
dont_load_bitmap enabled by default, why don't we do the same?

3. Given a directory tree with ~20k files of total size around 20 GiB,
its removal takes forever. From strace I see that a single unlink takes
~1 second. Again, why so much? Is it related to my choice of "latt" compression
mode over the default "conv"?

3a. I can reproduce the "directory not empty" bug :) Interestingly, it is
always the same directory under the aforementioned huge hierarchy. (I've
done the unpack-remove cycle a few times.)

Thanks,
-- 
Ivan Shapovalov / intelfx /



* Re: reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes
  2014-09-10 19:00 reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes Ivan Shapovalov
@ 2014-09-10 20:17 ` Edward Shishkin
  2014-09-10 21:26   ` Edward Shishkin
  2014-09-10 21:39   ` Ivan Shapovalov
  0 siblings, 2 replies; 10+ messages in thread
From: Edward Shishkin @ 2014-09-10 20:17 UTC (permalink / raw)
  To: Ivan Shapovalov, ReiserFS Development mailing list

On 09/10/2014 09:00 PM, Ivan Shapovalov wrote:
> Hi!
>
> The preamble: recently I had to force-change my configuration (the old laptop
> was stolen). What I have now is a combination of a tiny 16 GiB SSD and a huge
> 1 TiB HDD.
>
> ...So I've placed my /home on HDD. Partition size is 800 GiB, formatting
> options are "create=ccreg40,compress=gzip1,compressMode=latt" and I have a few
> questions.
>
> 1. What is the recommended compression mode?


The default one (conv).


> More specifically, what is the default "conv" mode? What is its purpose, why is
> it the default?


In this mode, intelligent switches take place in two interfaces:
1) in the FILE interface (if the first 64K of the file are incompressible,
     management is passed to the unix-file plugin forever);
2) in the COMPRESSION interface (the compression transform is turned on/off
     on a dynamic lattice).

In the other compression modes, switches take place only in the COMPRESSION
interface.


> I'm asking, because I wasn't able to understand its purpose from code, and the
> code itself looks hackish in some places (hardcoded fallback to extent-only
> files,


Actually, this is the implementation of a compression mode, not a hardcoded
fallback.


>   hardcoded policy, hardcoded fallback to "latt" in many cases, etc).


ditto


>
> 2. The mount time of a 800-GiB partition is >20 seconds. And with
> dont_load_bitmap it's around 1-2 seconds. Why so much?


By default, all bitmap blocks are loaded into memory at mount time.
Now calculate the number of bitmap blocks that have to be read from disk
for an 800-GiB partition.



>   Why other filesystems
> have drastically less mount times? If they have an equivalent of
> dont_load_bitmap enabled by default, why don't we do it?


For historical reasons. I recommended not using large partitions
with reiser4, so there was no need for this option.


>
> 3. Given a directory tree with ~20k files of total size around 20 GiB,
> its removal takes forever. From strace I see that a single unlink takes
> ~1 second. Again, why so much? Is it related to my choice of "latt" compression
> mode over the default "conv"?


Yes, among other things.
"latt" means that all file bodies are represented by fragments in
formatted nodes.


>
> 3a. I can reproduce the "directory not empty" bug :) Interestingly, it is
> always the same directory under the aforementioned huge hierarchy. (I've
> done the unpack-remove cycle a few times.)


I've concluded that this is caused by the unexpected disappearance
of a record that represents a directory entry in the directory item
(currently directory items are managed by the cde ITEM plugin, aka "compound
directory entries"). In the error path (ENOENT) the size of the directory is
not decremented, which makes the directory undeletable. I still don't know
what kills the entries. Special debugging info is needed to find/fix it.
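
The failure mode described above can be illustrated with a toy model. This is
only a sketch: the class and method names are hypothetical and do not
correspond to reiser4 code. It assumes a directory whose "size" counts its
entries and must reach zero before the directory can be removed.

```python
import errno


class ToyDirectory:
    """Toy model: 'size' counts entries and must be 0 before rmdir succeeds."""

    def __init__(self, names):
        self.entries = set(names)
        self.size = len(self.entries)

    def lose_entry(self, name):
        # Simulates the unexplained disappearance of a record:
        # the entry vanishes, but nobody updates the size counter.
        self.entries.discard(name)

    def unlink(self, name):
        if name not in self.entries:
            # Error path: return early, the size is NOT decremented.
            return -errno.ENOENT
        self.entries.remove(name)
        self.size -= 1
        return 0

    def rmdir(self):
        # The directory counts as empty only if its size is zero.
        return 0 if self.size == 0 else -errno.ENOTEMPTY


d = ToyDirectory(["a", "b"])
d.lose_entry("b")
results = [d.unlink("a"), d.unlink("b"), d.rmdir()]
print(results, sorted(d.entries), d.size)
```

In this model the directory ends up with no entries but a non-zero size, so
every later rmdir fails with "directory not empty".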

Thanks,
Edward.


* Re: reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes
  2014-09-10 20:17 ` Edward Shishkin
@ 2014-09-10 21:26   ` Edward Shishkin
  2014-09-10 21:39   ` Ivan Shapovalov
  1 sibling, 0 replies; 10+ messages in thread
From: Edward Shishkin @ 2014-09-10 21:26 UTC (permalink / raw)
  To: Ivan Shapovalov, ReiserFS Development mailing list


On 09/10/2014 10:17 PM, Edward Shishkin wrote:
> On 09/10/2014 09:00 PM, Ivan Shapovalov wrote:
>> Hi!
>>
>> The preamble: recently I had to force-change my configuration (the 
>> old laptop
>> was stolen). What I have now is a combination of a tiny 16 GiB SSD 
>> and a huge
>> 1 TiB HDD.
>>
>> ...So I've placed my /home on HDD. Partition size is 800 GiB, formatting
>> options are "create=ccreg40,compress=gzip1,compressMode=latt" and I 
>> have a few
>> questions.
>>
>> 1. What is the recommended compression mode?
>
>
> The default one (conv).
>
>
>> More specifically, what is the default "conv" mode? What is its 
>> purpose, why is
>> it the default?
>
>
> In this mode intelligent switches take place in 2 interfaces:
> 1) in FILE interface (if the first 64K of the file are incompressible, 
> then
>     management is passed to unix-file plugin forever);
> 2) in COMPRESSION interface (turn on/off compression transform
>     on a dynamic lattice).
>
> In other compression modes switches take place only in COMPRESSION
> interface.
>
>
>> I'm asking, because I wasn't able to understand its purpose from 
>> code, and the
>> code itself looks hackish in some places (hardcoded fallback to 
>> extent-only
>> files,
>
>
> Actually, this is implementation of a compression mode, not a hardcoded
> fallback.
>
>
>>   hardcoded policy, hardcoded fallback to "latt" in many cases, etc).
>
>
> ditto
>
>
>>
>> 2. The mount time of a 800-GiB partition is >20 seconds. And with
>> dont_load_bitmap it's around 1-2 seconds. Why so much?
>
>
> By default all bitmap blocks are loaded to memory at mount time.
> Now calculate a number of bitmap blocks for 800-GiB partition that
> should be read from disk.
>
>
>
>>   Why other filesystems
>> have drastically less mount times? If they have an equivalent of
>> dont_load_bitmap enabled by default, why don't we do it?
>
>
> For historical reasons. I recommended to not use large partitions
> for reiser4, so there wasn't any need in this option.
>
>
>>
>> 3. Given a directory tree with ~20k files of total size around 20 GiB,
>> its removal takes forever. From strace I see that a single unlink takes
>> ~1 second. Again, why so much? Is it related to my choice of "latt" 
>> compression
>> mode over the default "conv"?
>
>
> Yes, in particular.
> "latt" means that all file bodies are represented by fragments in 
> formatted nodes.


Also make sure that debug mode is off.


>
>
>>
>> 3a. I can reproduce the "directory not empty" bug :) Interestingly, 
>> it is
>> always the same directory under the aforementioned huge hierarchy. (I've
>> done the unpack-remove cycle a few times.)
>
>
> I've made a conclusion that this is caused by unexpected disappearing
> of a record, which represents a directory entry in the directory item
> (currently directory items are managed by cde ITEM plugin, aka "compound
> directory entries"). In the error path (ENOENT) the size of the 
> directory is
> not decremented, which makes the directory undeletable. I still don't 
> know
> who kills the entries. Special debugging info is needed to find/fix it.
>
> Thanks,
> Edward.



* Re: reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes
  2014-09-10 20:17 ` Edward Shishkin
  2014-09-10 21:26   ` Edward Shishkin
@ 2014-09-10 21:39   ` Ivan Shapovalov
  2014-09-11 17:16     ` Edward Shishkin
  1 sibling, 1 reply; 10+ messages in thread
From: Ivan Shapovalov @ 2014-09-10 21:39 UTC (permalink / raw)
  To: Edward Shishkin; +Cc: ReiserFS Development mailing list


On Wednesday 10 September 2014 at 22:17:15, Edward Shishkin wrote:	
> On 09/10/2014 09:00 PM, Ivan Shapovalov wrote:
> > Hi!
> >
> > The preamble: recently I had to force-change my configuration (the old laptop
> > was stolen). What I have now is a combination of a tiny 16 GiB SSD and a huge
> > 1 TiB HDD.
> >
> > ...So I've placed my /home on HDD. Partition size is 800 GiB, formatting
> > options are "create=ccreg40,compress=gzip1,compressMode=latt" and I have a few
> > questions.
> >
> > 1. What is the recommended compression mode?
> 
> 
> The default one (conv).

OK, thanks.

> > More specifically, what is the default "conv" mode? What is its purpose, why is
> > it the default?
> 
> 
> In this mode intelligent switches take place in 2 interfaces:
> 1) in FILE interface (if the first 64K of the file are incompressible, then
>      management is passed to unix-file plugin forever);
> 2) in COMPRESSION interface (turn on/off compression transform
>      on a dynamic lattice).
> 
> In other compression modes switches take place only in COMPRESSION
> interface.
> 
> 
> > I'm asking, because I wasn't able to understand its purpose from code, and the
> > code itself looks hackish in some places (hardcoded fallback to extent-only
> > files,
> 
> 
> Actually, this is implementation of a compression mode, not a hardcoded
> fallback.
> 
> 
> >   hardcoded policy, hardcoded fallback to "latt" in many cases, etc).
> 
> 
> ditto

Yes, I understand that this is the implementation and that it has no obligation
to be configurable in every aspect... but it still feels somewhat strange.
E.g., why is "extents only" formatting forced when a file is deemed
incompressible? Why is the heuristic in the FILE interface check (compressible
only if the size can be reduced by a factor of two) different from the one in
the COMPRESSION interface (compressible if the size can be reduced at all)?

(I'm sorry for too many questions. I'm just curious.)

> > 2. The mount time of a 800-GiB partition is >20 seconds. And with
> > dont_load_bitmap it's around 1-2 seconds. Why so much?
> 
> 
> By default all bitmap blocks are loaded to memory at mount time.
> Now calculate a number of bitmap blocks for 800-GiB partition that
> should be read from disk.

25 MiB of bitmaps. 20 seconds still looks strange...
Are the blocks processed in some special way? I don't see anything.
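
The 25 MiB figure can be checked with a little arithmetic. A minimal sketch,
assuming 4 KiB blocks and one bitmap bit per block (both are assumptions; any
per-block checksum overhead is ignored, and the function name is made up):

```python
GIB = 1 << 30
BLOCK_SIZE = 4096  # assumed block size in bytes


def bitmap_footprint(partition_bytes, block_size=BLOCK_SIZE):
    """Return (data blocks, bitmap bytes, bitmap blocks) at one bit per block."""
    blocks = partition_bytes // block_size
    bitmap_bytes = (blocks + 7) // 8                # round up to whole bytes
    bitmap_blocks = -(-bitmap_bytes // block_size)  # ceiling division
    return blocks, bitmap_bytes, bitmap_blocks


blocks, bitmap_bytes, bitmap_blocks = bitmap_footprint(800 * GIB)
print(blocks, bitmap_bytes / (1 << 20), bitmap_blocks)
```

Under these assumptions the bitmap occupies 6400 blocks. If each of them had
to be fetched with a separate seek of a few milliseconds (a guess, not
something established in this thread), that alone could add up to tens of
seconds on an HDD.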

> >   Why other filesystems
> > have drastically less mount times? If they have an equivalent of
> > dont_load_bitmap enabled by default, why don't we do it?
> 
> 
> For historical reasons. I recommended to not use large partitions
> for reiser4, so there wasn't any need in this option.

OK...

> > 3. Given a directory tree with ~20k files of total size around 20 GiB,
> > its removal takes forever. From strace I see that a single unlink takes
> > ~1 second. Again, why so much? Is it related to my choice of "latt" compression
> > mode over the default "conv"?
> 
> 
> Yes, in particular.
> "latt" means that all file bodies are represented by fragments in 
> formatted nodes.

So... are all cryptcompress files stored in formatted nodes, without
any equivalent of extents?

> > 3a. I can reproduce the "directory not empty" bug :) Interestingly, it is
> > always the same directory under the aforementioned huge hierarchy. (I've
> > done the unpack-remove cycle a few times.)
> 
> 
> I've made a conclusion that this is caused by unexpected disappearing
> of a record, which represents a directory entry in the directory item
> (currently directory items are managed by cde ITEM plugin, aka "compound
> directory entries"). In the error path (ENOENT) the size of the directory is
> not decremented, which makes the directory undeletable. I still don't know
> who kills the entries. Special debugging info is needed to find/fix it.

What kind of information is needed?

Thanks for explanations and hints,
-- 
Ivan Shapovalov / intelfx /



* Re: reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes
  2014-09-10 21:39   ` Ivan Shapovalov
@ 2014-09-11 17:16     ` Edward Shishkin
  2014-09-24 19:51       ` Non-deleted directories (Was Re: reiser4 (ccreg40)...) Edward Shishkin
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Shishkin @ 2014-09-11 17:16 UTC (permalink / raw)
  To: Ivan Shapovalov, ReiserFS Development mailing list


On 09/10/2014 11:39 PM, Ivan Shapovalov wrote:
> On Wednesday 10 September 2014 at 22:17:15, Edward Shishkin wrote:	
>> On 09/10/2014 09:00 PM, Ivan Shapovalov wrote:
>>> Hi!
>>>
>>> The preamble: recently I had to force-change my configuration (the old laptop
>>> was stolen). What I have now is a combination of a tiny 16 GiB SSD and a huge
>>> 1 TiB HDD.
>>>
>>> ...So I've placed my /home on HDD. Partition size is 800 GiB, formatting
>>> options are "create=ccreg40,compress=gzip1,compressMode=latt" and I have a few
>>> questions.
>>>
>>> 1. What is the recommended compression mode?
>> The default one (conv).
> OK, thanks.
>
>>> More specifically, what is the default "conv" mode? What is its purpose, why is
>>> it the default?
>> In this mode intelligent switches take place in 2 interfaces:
>> 1) in FILE interface (if the first 64K of the file are incompressible, then
>>       management is passed to unix-file plugin forever);
>> 2) in COMPRESSION interface (turn on/off compression transform
>>       on a dynamic lattice).
>>
>> In other compression modes switches take place only in COMPRESSION
>> interface.
>>
>>
>>> I'm asking, because I wasn't able to understand its purpose from code, and the
>>> code itself looks hackish in some places (hardcoded fallback to extent-only
>>> files,
>> Actually, this is implementation of a compression mode, not a hardcoded
>> fallback.
>>
>>
>>>    hardcoded policy, hardcoded fallback to "latt" in many cases, etc).
>> ditto
> Yes, I understand that this is implementation and it doesn't have an obligation
> to be configurable in every aspect... but still it feels somewhat strange.
> E. g. why "extents only" formatting is forced when a file is decided to be
> incompressible?


The "extents only" formatting policy was set to facilitate debugging
while the "conv" compression mode was being implemented.

When "conv" is set, the cryptcompress plugin "sends a signal" to the upper
dispatcher to switch to the unix-file plugin, which, in turn, performs
switches in the ITEM interface if the "smart" formatting policy is installed
(this is the "classic" tail conversion: tails to extents if the file size
is >= 20K, and back).

Setting "extents only" or "tails only" disables those switches.
Why "extents only" instead of "tails only"? When "conv" makes a decision
about the switch, the file is 64K long, so extents are better than tails.

I think that now we can set "smart" instead of "extents only": those
switches won't step on each other.
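
The three formatting policies mentioned above can be summarized in a small
sketch. The function name and the string labels are hypothetical; only the
20K threshold and the policy names come from the explanation above.

```python
SMART_THRESHOLD = 20 * 1024  # "tails to extents if the file size is >= 20K"


def item_kind(policy, file_size):
    """Pick the item type a unix-file body would use under a given policy."""
    if policy == "extents only":
        return "extents"
    if policy == "tails only":
        return "tails"
    if policy == "smart":
        # Classic tail conversion: small files in tails, larger ones in extents.
        return "extents" if file_size >= SMART_THRESHOLD else "tails"
    raise ValueError(f"unknown formatting policy: {policy!r}")


# At the moment "conv" decides to switch, the file is 64K long.
size_at_switch = 64 * 1024
print(item_kind("smart", size_at_switch), item_kind("extents only", size_at_switch))
```

For a 64K file both "smart" and "extents only" choose extents; they differ
only if the file later shrinks below the threshold.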


>   Why the heuristic in FILE interface check (compressible only if
> size can be reduced twice) is different from the one in COMPRESSION interface
> (compressible if size can be reduced at all)?


I wanted to increase the portion of unix-files on the partition. It showed
better performance than the heuristic that performs switches in the
COMPRESSION interface. I still don't have a satisfactory explanation for
this.
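
The difference between the two checks can be sketched as follows. This is an
illustration under assumptions, not the reiser4 code: zlib at level 1 stands
in for "gzip1", the function names are made up, and "reduced twice" is read as
"reduced by a factor of two".

```python
import random
import zlib

CLUSTER = 64 * 1024  # the first 64K examined by the FILE interface check


def compressed_size(data):
    # zlib level 1 is used here as a stand-in for the "gzip1" transform.
    return len(zlib.compress(data, 1))


def file_check_says_compressible(data):
    # Stricter test: the size must shrink by at least a factor of two.
    return compressed_size(data) * 2 <= len(data)


def compression_check_says_compressible(data):
    # Looser test: any reduction at all counts.
    return compressed_size(data) < len(data)


def choose_file_plugin(first_cluster):
    # In "conv" mode an incompressible first cluster hands the file over
    # to the unix-file plugin.
    if file_check_says_compressible(first_cluster):
        return "cryptcompress"
    return "unix-file"


rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(CLUSTER))  # incompressible
mixed = bytes(rng.getrandbits(6) for _ in range(CLUSTER))  # mildly compressible
zeros = bytes(CLUSTER)                                     # highly compressible

for name, data in (("noise", noise), ("mixed", mixed), ("zeros", zeros)):
    print(name, compressed_size(data), choose_file_plugin(data),
          compression_check_says_compressible(data))
```

Mildly compressible data such as "mixed" passes the looser check but fails the
stricter one, so the stricter check sends more files to the unix-file plugin.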


> (I'm sorry for too many questions. I'm just curious.)
>
>>> 2. The mount time of a 800-GiB partition is >20 seconds. And with
>>> dont_load_bitmap it's around 1-2 seconds. Why so much?
>> By default all bitmap blocks are loaded to memory at mount time.
>> Now calculate a number of bitmap blocks for 800-GiB partition that
>> should be read from disk.
> 25 MiB of bitmaps. 20 seconds still looks strange...
> Are the blocks specially processed? Don't see anything.
>
>>>    Why other filesystems
>>> have drastically less mount times? If they have an equivalent of
>>> dont_load_bitmap enabled by default, why don't we do it?
>> For historical reasons. I recommended to not use large partitions
>> for reiser4, so there wasn't any need in this option.
> OK...
>
>>> 3. Given a directory tree with ~20k files of total size around 20 GiB,
>>> its removal takes forever. From strace I see that a single unlink takes
>>> ~1 second. Again, why so much? Is it related to my choice of "latt" compression
>>> mode over the default "conv"?
>> Yes, in particular.
>> "latt" means that all file bodies are represented by fragments in
>> formatted nodes.
> So... are all cryptcompress files stored in formatted nodes, without
> any equivalent of extents?


Yes, cryptcompress files are composed of items of only one type, so-called
"ctails" (they resemble tails, but have a 1-byte header that contains the
size of the file's logical cluster). Unlike the unix-file plugin, the
cryptcompress plugin doesn't perform switches in the ITEM interface.


>>> 3a. I can reproduce the "directory not empty" bug :) Interestingly, it is
>>> always the same directory under the aforementioned huge hierarchy. (I've
>>> done the unpack-remove cycle a few times.)
>> I've made a conclusion that this is caused by unexpected disappearing
>> of a record, which represents a directory entry in the directory item
>> (currently directory items are managed by cde ITEM plugin, aka "compound
>> directory entries"). In the error path (ENOENT) the size of the directory is
>> not decremented, which makes the directory undeletable. I still don't know
>> who kills the entries. Special debugging info is needed to find/fix it.
> What kind of information is needed?


We need to find all the places where the records are created / killed
and insert a hook that prints such events for the entry that
unexpectedly disappears. This will give us a chance to find the culprit.
I have to say, this is not much fun...

Thanks,
Edward.


* Non-deleted directories (Was Re: reiser4 (ccreg40)...)
  2014-09-11 17:16     ` Edward Shishkin
@ 2014-09-24 19:51       ` Edward Shishkin
  2014-09-26 17:27         ` Ivan Shapovalov
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Shishkin @ 2014-09-24 19:51 UTC (permalink / raw)
  To: Ivan Shapovalov, ReiserFS Development mailing list

On 09/11/2014 07:16 PM, Edward Shishkin wrote:
>
> On 09/10/2014 11:39 PM, Ivan Shapovalov wrote:
>

[...]

>>>> 3a. I can reproduce the "directory not empty" bug :) Interestingly, 
>>>> it is
>>>> always the same directory under the aforementioned huge hierarchy. 
>>>> (I've
>>>> done the unpack-remove cycle a few times.)
>>> I've made a conclusion that this is caused by unexpected disappearing
>>> of a record, which represents a directory entry in the directory item
>>> (currently directory items are managed by cde ITEM plugin, aka 
>>> "compound
>>> directory entries"). In the error path (ENOENT) the size of the 
>>> directory is
>>> not decremented, which makes the directory undeletable. I still 
>>> don't know
>>> who kills the entries. Special debugging info is needed to find/fix it.
>> What kind of information is needed?
>
>
> We need to find all places, where the records are created / killed
> and insert a hook, which prints such events for the entry which
> unexpectedly disappears. This will get us a chance to find the culprit.
> I have to say: this is not a big fun...


Ughhh, parse_cut (node40.c) is the culprit.
If the region to cut contains objects with non-unique keys (the case of
hash collisions), then this function evaluates the cut mode incorrectly.

I think that this bug was introduced implicitly ~11 years ago
by the design change in reiser4 that introduced non-unique keys.

I'll provide the fixup later.

Edward.


* Re: Non-deleted directories (Was Re: reiser4 (ccreg40)...)
  2014-09-24 19:51       ` Non-deleted directories (Was Re: reiser4 (ccreg40)...) Edward Shishkin
@ 2014-09-26 17:27         ` Ivan Shapovalov
  2014-09-26 19:57           ` Edward Shishkin
  0 siblings, 1 reply; 10+ messages in thread
From: Ivan Shapovalov @ 2014-09-26 17:27 UTC (permalink / raw)
  To: Edward Shishkin; +Cc: ReiserFS Development mailing list


On Wednesday 24 September 2014 at 21:51:53, Edward Shishkin wrote:	
> On 09/11/2014 07:16 PM, Edward Shishkin wrote:
> >
> > On 09/10/2014 11:39 PM, Ivan Shapovalov wrote:
> >
> 
> [...]
> 
> >>>> 3a. I can reproduce the "directory not empty" bug :) Interestingly, 
> >>>> it is
> >>>> always the same directory under the aforementioned huge hierarchy. 
> >>>> (I've
> >>>> done the unpack-remove cycle a few times.)
> >>> I've made a conclusion that this is caused by unexpected disappearing
> >>> of a record, which represents a directory entry in the directory item
> >>> (currently directory items are managed by cde ITEM plugin, aka 
> >>> "compound
> >>> directory entries"). In the error path (ENOENT) the size of the 
> >>> directory is
> >>> not decremented, which makes the directory undeletable. I still 
> >>> don't know
> >>> who kills the entries. Special debugging info is needed to find/fix it.
> >> What kind of information is needed?
> >
> >
> > We need to find all places, where the records are created / killed
> > and insert a hook, which prints such events for the entry which
> > unexpectedly disappears. This will get us a chance to find the culprit.
> > I have to say: this is not a big fun...
> 
> 
> Ughhh, parse_cut (node40.c) is the culprit.
> If region to cut contains objects with non-unique keys (the case of
> hash collisions), then this function evaluates the cut mode incorrectly.

> non-unique keys

Wow.
(I wonder how many other "what the..."-style things there are in reiser4...)

So this gets re-classified as a kernel-version-agnostic bug, right? Then why
do you see it in 3.16 only?

> 
> I think that this bug has been introduced implicitly ~11 years ago
> after the design change in reiser4 (introducing non-unique keys).
> 
> I'll provide the fixup later..

Great, I'll be waiting for it. /* Also, what about the batch discard code and
the second space allocation patchset? Do you have any plans to review it? */

Thanks,
-- 
Ivan Shapovalov / intelfx /



* Re: Non-deleted directories (Was Re: reiser4 (ccreg40)...)
  2014-09-26 17:27         ` Ivan Shapovalov
@ 2014-09-26 19:57           ` Edward Shishkin
  2014-09-26 20:09             ` Ivan Shapovalov
  0 siblings, 1 reply; 10+ messages in thread
From: Edward Shishkin @ 2014-09-26 19:57 UTC (permalink / raw)
  To: Ivan Shapovalov; +Cc: ReiserFS Development mailing list


On 09/26/2014 07:27 PM, Ivan Shapovalov wrote:
> On Wednesday 24 September 2014 at 21:51:53, Edward Shishkin wrote:	
>> On 09/11/2014 07:16 PM, Edward Shishkin wrote:
>>> On 09/10/2014 11:39 PM, Ivan Shapovalov wrote:
>>>
>> [...]
>>
>>>>>> 3a. I can reproduce the "directory not empty" bug :) Interestingly,
>>>>>> it is
>>>>>> always the same directory under the aforementioned huge hierarchy.
>>>>>> (I've
>>>>>> done the unpack-remove cycle a few times.)
>>>>> I've made a conclusion that this is caused by unexpected disappearing
>>>>> of a record, which represents a directory entry in the directory item
>>>>> (currently directory items are managed by cde ITEM plugin, aka
>>>>> "compound
>>>>> directory entries"). In the error path (ENOENT) the size of the
>>>>> directory is
>>>>> not decremented, which makes the directory undeletable. I still
>>>>> don't know
>>>>> who kills the entries. Special debugging info is needed to find/fix it.
>>>> What kind of information is needed?
>>>
>>> We need to find all places, where the records are created / killed
>>> and insert a hook, which prints such events for the entry which
>>> unexpectedly disappears. This will get us a chance to find the culprit.
>>> I have to say: this is not a big fun...
>>
>> Ughhh, parse_cut (node40.c) is the culprit.
>> If region to cut contains objects with non-unique keys (the case of
>> hash collisions), then this function evaluates the cut mode incorrectly.
>> non-unique keys
> Wow.
> (I wonder, how many else "what the..."-style things are there in reiser4?...)


Currently I don't know of any open issues that lead to data corruption.

There are a number of failed assertions when debug mode is on and the
partition is formatted with "create=reg40". Specifically, they appear
in the tail conversion paths. I believe they are false positives; however,
anything is possible.


> So this becomes re-classified as kernel version agnostic bug, right? Then why
> do you see it in 3.16 only?..


Not really. The issue of non-deletable directories is very old.
Now we know that it was caused by non-unique keys (because of
hash collisions). However, having non-unique keys on the partition
is not enough to reproduce this problem: the cut offset has to fall
between objects with identical keys. Now let's assume that the tree
layout depends on the kernel version...
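
Why a cut boundary between two objects with identical keys is a problem can
be shown with a toy example. This is not the logic of parse_cut; it is a
hypothetical sketch in which a boundary is located by key lookup alone, which
cannot tell two colliding entries apart.

```python
from bisect import bisect_left

# Entries of one node, sorted by key; "k2" appears twice (a hash collision).
node = [("k1", "alpha"), ("k2", "bravo"), ("k2", "charlie"), ("k3", "delta")]


def cut_from_position(items, pos):
    """Reference behaviour: remove exactly items[pos:]."""
    return items[:pos]


def cut_from_key(items, key):
    """Key-based variant: locate the start of the cut by key lookup only."""
    keys = [k for k, _ in items]
    return items[:bisect_left(keys, key)]


intended = cut_from_position(node, 2)    # cut starts at the second "k2" entry
by_key = cut_from_key(node, node[2][0])  # same boundary, expressed as a key
print(intended)
print(by_key)
```

The position-based cut keeps "bravo", while the key-based cut silently drops
it. If the boundary does not fall between two equal keys, both variants agree,
which fits the observation that collisions alone do not trigger the problem.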


>
>> I think that this bug has been introduced implicitly ~11 years ago
>> after the design change in reiser4 (introducing non-unique keys).
>>
>> I'll provide the fixup later..
> Great. Will be waiting for. /* also, what's with batch discard code and the
> second space allocation patchset? do you have any plans for reviewing it? */


Discard support v8 looks OK; we'll include it in 3.6.X.
As for the FITRIM ioctl: I'll try to review it at the end of my vacation
(the weekend of 11-12 Oct).

Thanks,
Edward.


* Re: Non-deleted directories (Was Re: reiser4 (ccreg40)...)
  2014-09-26 19:57           ` Edward Shishkin
@ 2014-09-26 20:09             ` Ivan Shapovalov
  2014-09-26 20:46               ` Edward Shishkin
  0 siblings, 1 reply; 10+ messages in thread
From: Ivan Shapovalov @ 2014-09-26 20:09 UTC (permalink / raw)
  To: Edward Shishkin; +Cc: ReiserFS Development mailing list


On Friday 26 September 2014 at 21:57:13, Edward Shishkin wrote:	
> 
> On 09/26/2014 07:27 PM, Ivan Shapovalov wrote:
> > On Wednesday 24 September 2014 at 21:51:53, Edward Shishkin wrote:	
> >> On 09/11/2014 07:16 PM, Edward Shishkin wrote:
> >>> On 09/10/2014 11:39 PM, Ivan Shapovalov wrote:
> >>>
> >> [...]
> >>
> >>>>>> 3a. I can reproduce the "directory not empty" bug :) Interestingly,
> >>>>>> it is
> >>>>>> always the same directory under the aforementioned huge hierarchy.
> >>>>>> (I've
> >>>>>> done the unpack-remove cycle a few times.)
> >>>>> I've made a conclusion that this is caused by unexpected disappearing
> >>>>> of a record, which represents a directory entry in the directory item
> >>>>> (currently directory items are managed by cde ITEM plugin, aka
> >>>>> "compound
> >>>>> directory entries"). In the error path (ENOENT) the size of the
> >>>>> directory is
> >>>>> not decremented, which makes the directory undeletable. I still
> >>>>> don't know
> >>>>> who kills the entries. Special debugging info is needed to find/fix it.
> >>>> What kind of information is needed?
> >>>
> >>> We need to find all places, where the records are created / killed
> >>> and insert a hook, which prints such events for the entry which
> >>> unexpectedly disappears. This will get us a chance to find the culprit.
> >>> I have to say: this is not a big fun...
> >>
> >> Ughhh, parse_cut (node40.c) is the culprit.
> >> If region to cut contains objects with non-unique keys (the case of
> >> hash collisions), then this function evaluates the cut mode incorrectly.
> >> non-unique keys
> > Wow.
> > (I wonder, how many else "what the..."-style things are there in reiser4?...)
> 
> 
> Currently I don't know open issues, which lead to data corruptions.
> 
> There is a number of failed assertions when debug mode is on and
> partition is formatted with "create=reg40". Specifically, they appear
> in paths of tail conversion. I believe they are false positives, however,
> everything is possible..
> 
> 
> > So this becomes re-classified as kernel version agnostic bug, right? Then why
> > do you see it in 3.16 only?..
> 
> 
> Not really. The issue of non-deletable directories is very old.
> Now we know that it was caused by non-unique keys (because of
> hash collisions). However, having non-unique keys on the partition
> is not enough to reproduce this problem: the cut offset should be
> between objects with identical keys. Now let's assume that tree
> layout depends on the kernel version...
> 
> 
> >
> >> I think that this bug has been introduced implicitly ~11 years ago
> >> after the design change in reiser4 (introducing non-unique keys).
> >>
> >> I'll provide the fixup later..
> > Great. Will be waiting for. /* also, what's with batch discard code and the
> > second space allocation patchset? do you have any plans for reviewing it? */
> 
> 
> Discard support v8 looks OK, we'll include it to 3.6.X.

What is v8? I don't see any PATCHv8 on the mailing list...
Actually, by "second space allocation patchset" I meant this:
http://www.spinics.net/lists/reiserfs-devel/msg04180.html

BTW, it should be marked [RFC]: I'm not at all sure that it's OK...

> As to FITRIM ioctl: I'll try to review it at the end of my vacations
> (weekends 11, 12 Oct).

OK, will be waiting, thanks.
-- 
Ivan Shapovalov / intelfx /



* Re: Non-deleted directories (Was Re: reiser4 (ccreg40)...)
  2014-09-26 20:09             ` Ivan Shapovalov
@ 2014-09-26 20:46               ` Edward Shishkin
  0 siblings, 0 replies; 10+ messages in thread
From: Edward Shishkin @ 2014-09-26 20:46 UTC (permalink / raw)
  To: Ivan Shapovalov; +Cc: ReiserFS Development mailing list


On 09/26/2014 10:09 PM, Ivan Shapovalov wrote:
> On Friday 26 September 2014 at 21:57:13, Edward Shishkin wrote:	
>> On 09/26/2014 07:27 PM, Ivan Shapovalov wrote:
>>> On Wednesday 24 September 2014 at 21:51:53, Edward Shishkin wrote:	
>>>> On 09/11/2014 07:16 PM, Edward Shishkin wrote:
>>>>> On 09/10/2014 11:39 PM, Ivan Shapovalov wrote:
>>>>>
>>>> [...]
>>>>
>>>>>>>> 3a. I can reproduce the "directory not empty" bug :) Interestingly, it is
>>>>>>>> always the same directory under the aforementioned huge hierarchy. (I've
>>>>>>>> done the unpack-remove cycle a few times.)
>>>>>>> I've concluded that this is caused by the unexpected disappearance
>>>>>>> of a record that represents a directory entry in the directory item
>>>>>>> (currently directory items are managed by the cde ITEM plugin, aka
>>>>>>> "compound directory entries"). In the error path (ENOENT) the size of
>>>>>>> the directory is not decremented, which makes the directory undeletable.
>>>>>>> I still don't know who kills the entries. Special debugging info is
>>>>>>> needed to find/fix it.
>>>>>> What kind of information is needed?
>>>>> We need to find all the places where records are created / killed
>>>>> and insert a hook that prints such events for the entry which
>>>>> unexpectedly disappears. This will give us a chance to find the culprit.
>>>>> I have to say: this is not much fun...
>>>> Ughhh, parse_cut (node40.c) is the culprit.
>>>> If the region to cut contains objects with non-unique keys (the case of
>>>> hash collisions), then this function evaluates the cut mode incorrectly.
>>> Wow.
>>> (I wonder how many more "what the..."-style things there are in reiser4?...)
>>
>> Currently I don't know of any open issues that lead to data corruption.
>>
>> There are a number of failed assertions when debug mode is on and the
>> partition is formatted with "create=reg40". Specifically, they appear
>> in the tail conversion paths. I believe they are false positives; however,
>> anything is possible..
>>
>>
>>> So this is reclassified as a kernel-version-agnostic bug, right? Then why
>>> do you see it in 3.16 only?..
>>
>> Not really. The issue of non-deletable directories is very old.
>> Now we know that it was caused by non-unique keys (because of
>> hash collisions). However, having non-unique keys on the partition
>> is not enough to reproduce this problem: the cut offset has to fall
>> between objects with identical keys. Now let's assume that the tree
>> layout depends on the kernel version...
>>
>>
>>>> I think that this bug was introduced implicitly ~11 years ago
>>>> after the design change in reiser4 (introducing non-unique keys).
>>>>
>>>> I'll provide the fixup later..
>>> Great, I'll be waiting for it. /* Also, what about the batch discard code and
>>> the second space allocation patchset? Do you have any plans to review them? */
>>
>> Discard support v8 looks OK, we'll include it in 3.6.X.
> What is v8? I don't see any PATCHv8 on the mailing list...


v8 means (v7 + unconditionally delayed de-allocation)


> Actually, by "second space allocation patchset" I meant this:
> http://www.spinics.net/lists/reiserfs-devel/msg04180.html


Ah, sorry, I forgot about this patchset.
OK, I'll take a look..

Edward.


>
> BTW, it should be [RFC]: I'm completely unsure if it's OK...
>
>> As to the FITRIM ioctl: I'll try to review it at the end of my vacation
>> (the weekend of 11-12 Oct).
> OK, will be waiting, thanks.
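
[Illustration, not part of the original exchange.] The failure chain
discussed in this thread (hash collision -> non-unique keys -> a cut that
removes the wrong record -> ENOENT on a later unlink -> directory size never
decremented -> "directory not empty") can be modelled with a small toy. This
is only a sketch under loud assumptions: ToyDir, cut_by_key and the
length-based hash are hypothetical names made up for the example, and the
real parse_cut logic in node40.c is NOT reproduced here. The toy only shows
how a cut that identifies its range by key alone can take a colliding
neighbour with it.

```python
import errno


class ToyDir:
    """Toy directory whose entries are ordered by a (possibly colliding) key."""

    def __init__(self, hash_fn):
        self.hash_fn = hash_fn
        self.entries = []  # list of (key, name), kept sorted by key
        self.size = 0      # entry counter consulted by rmdir()

    def add(self, name):
        self.entries.append((self.hash_fn(name), name))
        self.entries.sort(key=lambda e: e[0])  # stable sort, keys may repeat
        self.size += 1

    def cut_by_key(self, name):
        # Assumed bug model: the cut is described by key only, so every
        # entry that shares the key is removed, not just the requested one.
        key = self.hash_fn(name)
        self.entries = [e for e in self.entries if e[0] != key]

    def unlink(self, name):
        if (self.hash_fn(name), name) not in self.entries:
            return -errno.ENOENT  # error path: size is NOT decremented
        self.cut_by_key(name)
        self.size -= 1
        return 0

    def rmdir(self):
        return 0 if self.size == 0 else -errno.ENOTEMPTY


def run_demo():
    # len() as the hash makes "ab" and "cd" collide on purpose.
    d = ToyDir(hash_fn=len)
    for name in ("ab", "cd", "xyz"):
        d.add(name)
    results = [d.unlink(name) for name in ("ab", "cd", "xyz")]
    return {"unlink": results, "entries": d.entries,
            "size": d.size, "rmdir": d.rmdir()}


if __name__ == "__main__":
    print(run_demo())
```

In this toy the directory ends up with no entries but a non-zero size, so
rmdir keeps failing; without the collision (or with a cut that compares the
whole record, not just the key) the counter would reach zero.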



end of thread, other threads:[~2014-09-26 20:46 UTC | newest]

Thread overview: 10+ messages
2014-09-10 19:00 reiser4 (ccreg40): very slow mount, poor unlink performance, questions about compression modes Ivan Shapovalov
2014-09-10 20:17 ` Edward Shishkin
2014-09-10 21:26   ` Edward Shishkin
2014-09-10 21:39   ` Ivan Shapovalov
2014-09-11 17:16     ` Edward Shishkin
2014-09-24 19:51       ` Non-deleted directories (Was Re: reiser4 (ccreg40)...) Edward Shishkin
2014-09-26 17:27         ` Ivan Shapovalov
2014-09-26 19:57           ` Edward Shishkin
2014-09-26 20:09             ` Ivan Shapovalov
2014-09-26 20:46               ` Edward Shishkin
