* ReiserFS v3 choking when free space falls below 10%?
@ 2006-06-29 17:41 Mike Benoit
2006-06-29 19:12 ` Vladimir V. Saveliev
2006-07-24 22:26 ` ReiserFS v3 choking when free space falls below 10% - FIXED Mike Benoit
0 siblings, 2 replies; 49+ messages in thread
From: Mike Benoit @ 2006-06-29 17:41 UTC (permalink / raw)
To: reiserfs-list
My MythTV box recently started showing odd behavior during recordings:
at certain times the load of the box would spike to 10+ and recordings
would start losing frames and become unwatchable. top would show
mythbackend using 90+% SYS CPU, whereas under normal circumstances it
uses about 5% USR.

So I finally got around to profiling mythbackend when the load starts to
spike. To my surprise, it appears that once I have less than 10% (30 GB)
free on the drive, reiserfs can't keep up; even just writing at 1 MB/sec
is too much for it.

Is there something that can be done to fix this? 30 GB seems like a lot
of wasted space.
#opreport
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
TIMER:0|
samples| %|
------------------
77863 78.7856 reiserfs
18183 18.3984 vmlinux
695 0.7032 mysqld
452 0.4574 libc-2.4.so
360 0.3643 libmythtv-0.19.so.0.19.0
324 0.3278 ivtv
323 0.3268 nvidia
242 0.2449 libqt-mt.so.3.3.6
110 0.1113 libpthread-2.4.so
53 0.0536 libstdc++.so.6.0.8
35 0.0354 ld-2.4.so
23 0.0233 libperl.so
22 0.0223 libz.so.1.2.3
<snip>
#opreport -l /usr/src/linux/vmlinux
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples % symbol name
9607 52.8351 default_idle
7694 42.3142 find_next_zero_bit
183 1.0064 __copy_from_user_ll
57 0.3135 handle_IRQ_event
37 0.2035 __copy_to_user_ll
34 0.1870 ide_outb
30 0.1650 ide_end_request
22 0.1210 ioread8
22 0.1210 schedule
21 0.1155 get_page_from_freelist
17 0.0935 mmx_clear_page
<snip>
System Details:
-----------------------------------------------
Kernel v2.6.16.21 (custom compiled)
- This issue also happened with 2.6.14.
Filesystem Size Used Avail Use% Mounted on
/dev/hda1 280G 269G 12G 97% /
[root@mythtv]# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev /dev tmpfs rw 0 0
/dev/root / reiserfs rw,noatime,nodiratime 0 0
[root@mythtv]# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 6
model name : AMD Athlon(tm) XP 2100+
stepping : 2
cpu MHz : 1759.680
cache size : 256 KB
[root@mythtv]# free
             total       used       free     shared    buffers     cached
Mem:        515992     496256      19736          0      36256     271728
-/+ buffers/cache:     188272     327720
Swap:       262136        408     261728
[root@mythtv ~]# hdparm -i /dev/hda
/dev/hda:
Model=ST3300622A, FwRev=3.AND, SerialNo=3NF1GAGW
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=16
CurCHS=4047/16/255, CurSects=16511760, LBA=yes, LBAsects=268435455
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
AdvancedPM=no WriteCache=enabled
Drive conforms to: Unspecified: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3
ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7
* signifies the current active mode
[root@mythtv ~]# hdparm -tT /dev/hda
/dev/hda:
Timing cached reads: 1296 MB in 2.00 seconds = 646.99 MB/sec
Timing buffered disk reads: 166 MB in 3.02 seconds = 55.05 MB/sec
vmstat 1 output:
--------------------------------------------------------------
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
8 0 408 5800 29308 248604 0 0 0 1036 406 132 2 98 0 0
4 0 408 5644 29396 248608 0 0 0 1128 437 184 2 92 0 6
7 0 408 6316 29428 248020 0 0 0 1316 539 287 0 86 0 14
5 0 408 6104 29480 248180 0 0 0 588 415 187 0 99 0 1
4 0 408 5764 29536 248364 0 0 0 1092 421 172 2 97 1 0
6 0 408 6528 29592 247684 0 0 0 1092 425 161 2 98 0 1
2 1 408 6372 29676 247724 0 0 0 2304 385 170 2 97 1 0
5 0 408 6400 29676 247616 0 0 0 48 383 122 0 100 0 0
7 0 408 6192 29704 247872 0 0 0 1080 409 162 1 98 0 1
6 0 408 5720 29732 248304 0 0 0 1076 414 178 1 98 0 1
7 0 408 6348 29800 247552 0 0 0 1656 460 300 2 87 1 11
5 0 408 6628 29848 247248 0 0 0 1164 407 207 1 94 0 5
5 0 408 5884 29896 247996 0 0 4 1116 453 353 1 76 0 23
6 0 408 5640 29868 248204 0 0 0 1052 416 132 1 99 0 0
4 0 408 5772 29940 248104 0 0 0 648 490 314 1 84 1 14
6 1 408 6328 30036 247464 0 0 0 1928 488 305 2 85 0 13
4 0 408 6184 30076 247472 0 0 4 860 404 201 1 94 0 5
4 0 408 6332 30044 247328 0 0 0 1312 429 156 1 99 0 0
9 0 408 6120 30100 247580 0 0 0 604 494 305 3 81 1 16
2 1 408 6460 30140 247116 0 0 0 1372 436 315 1 79 0 20
10 0 408 6252 30176 247372 0 0 0 456 412 126 1 99 0 0
6 0 408 6432 30164 247276 0 0 4 1268 425 255 1 88 1 10
3 0 408 5688 30220 247948 0 0 0 1332 454 352 0 78 0 22
2 1 408 6352 30284 247124 0 0 0 1140 362 156 2 96 1 1
5 0 408 6564 30284 246908 0 0 0 92 472 316 2 83 0 15
5 0 408 6348 30352 247056 0 0 0 1168 506 350 0 83 0 17
4 0 408 5604 30404 247828 0 0 4 1124 448 262 2 87 0 11
3 0 408 5880 30444 247500 0 0 0 1104 426 315 2 77 1 20
2 1 408 5916 30496 247352 0 0 0 1064 365 152 1 97 0 2
7 0 408 6072 30496 247204 0 0 0 440 489 307 1 82 0 17
6 0 408 5936 30528 247288 0 0 0 816 434 130 2 98 0 0
4 0 408 5944 30588 247300 0 0 0 1108 359 172 0 98 0 2
4 0 408 5664 30624 247508 0 0 0 1444 426 161 0 99 1 0
5 0 408 6656 30608 246572 0 0 0 1220 425 163 2 98 0 1
6 0 408 6316 30656 246848 0 0 0 1552 441 180 1 98 0 1
4 0 408 6408 30632 246776 0 0 0 644 403 140 1 99 0 0
9 0 408 6072 30696 247060 0 0 4 744 496 351 2 82 1 16
5 0 408 5864 30708 247240 0 0 0 1680 509 335 1 83 1 15
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
3 1 408 6284 30768 246768 0 0 0 1132 434 328 1 76 1 22
6 0 408 6352 30772 246692 0 0 0 576 373 170 0 93 0 7
4 0 408 6008 30820 246932 0 0 4 612 496 322 1 83 0 16
4 0 408 6288 30836 246600 0 0 0 1484 480 304 1 85 0 14
4 0 408 6064 30896 246844 0 0 0 1136 504 337 1 84 1 15
5 0 408 5728 30900 247116 0 0 4 1188 426 156 1 99 0 0
6 0 408 5696 30968 247144 0 0 0 1104 367 123 3 97 0 0
4 0 408 5608 31016 247144 0 0 0 1152 445 378 2 74 1 23
7 0 408 5576 31008 247088 0 0 0 964 402 115 1 99 0 0
4 0 408 6328 31052 246396 0 0 0 628 355 152 1 98 0 1
5 0 408 6116 31112 246524 0 0 0 1620 472 299 2 85 1 12
2 1 408 6336 31204 246176 0 0 0 1112 367 156 2 96 0 2
7 0 408 6388 31176 246192 0 0 0 76 457 272 0 86 0 14
5 0 408 6268 31232 246284 0 0 0 1136 466 267 1 85 1 13
2 1 408 5932 31304 246616 0 0 4 2068 374 173 1 99 0 0
6 0 408 5960 31224 246564 0 0 0 104 472 273 1 84 0 15
6 0 408 5692 31308 246716 0 0 0 1160 412 206 2 94 0 4
5 0 408 5600 31336 246892 0 0 4 1660 480 289 2 86 0 12
7 0 408 6400 31336 245964 0 0 0 1052 418 160 3 97 0 0
6 0 408 6316 31292 246136 0 0 0 512 432 127 1 99 0 0
5 0 408 5856 31372 246528 0 0 0 1824 404 159 2 96 0 2
3 0 408 5880 31424 246412 0 0 0 1156 454 174 1 97 1 1
3 0 408 6024 31372 246336 0 0 0 896 399 130 0 100 0 0
5 0 408 5812 31432 246492 0 0 0 708 413 160 1 97 0 2
5 0 408 6396 31424 246024 0 0 0 1604 436 163 1 97 1 1
6 1 408 6276 31492 245924 0 0 216 1176 511 409 3 82 0 15
4 0 408 6312 31528 245944 0 0 0 1116 468 263 1 86 0 13
1 2 408 6592 31576 245628 0 0 56 1044 343 126 0 97 0 3
5 0 408 6312 31576 245904 0 0 32 48 427 155 0 97 0 3
1 0 408 5816 31624 246360 0 0 72 1796 590 834 2 40 35 24
1 1 408 16872 31704 247564 0 0 1232 248 513 1185 28 4 11 57
1 1 408 31240 31768 248520 0 0 932 92 403 996 32 4 10 54
1 0 408 29576 31880 248704 0 0 188 248 372 997 7 6 61 26
1 1 408 28284 31952 249852 0 0 316 344 402 842 20 21 45 13
0 1 408 27188 32008 250940 0 0 112 976 393 465 33 58 0 9
5 1 408 24748 32100 253228 0 0 1212 1424 571 949 31 31 0 37
2 0 408 23052 32156 255032 0 0 544 1036 415 351 16 80 0 4
0 1 408 21148 32232 256808 0 0 516 1480 454 692 33 41 0 25
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 1 408 19616 32288 258308 0 0 576 1352 414 478 33 59 0 8
4 0 408 18084 32348 259816 0 0 496 1344 423 524 29 56 0 15
5 0 408 17016 32428 260844 0 0 192 812 518 574 24 63 0 13
2 0 408 15348 32488 262444 0 0 208 1064 416 295 14 85 0 1
5 0 408 13616 32552 264104 0 0 84 1684 497 615 32 66 0 2
5 1 408 13496 32612 263992 0 0 92 1148 530 526 14 71 0 14
0 1 408 13000 32784 264556 0 0 80 1240 506 504 1 59 0 40
3 1 408 12132 32864 265324 0 0 36 612 431 438 2 65 0 34
1 1 408 10196 33048 266960 0 0 216 4 440 565 1 60 0 39
1 1 408 9252 33284 267768 0 0 168 2444 463 617 1 56 0 43
0 3 408 7208 33376 269680 0 0 32 3460 459 497 1 59 0 40
2 1 408 6416 33444 270392 0 0 24 748 448 423 0 71 0 29
0 1 408 5976 33664 270568 0 0 220 1436 481 654 2 55 0 43
1 0 408 6100 33700 270356 0 0 8 844 406 389 9 70 16 5
0 0 408 5848 33732 270568 0 0 0 1128 435 401 0 72 27 1
1 0 408 5720 33772 270664 0 0 0 852 398 350 1 73 25 1
1 0 408 6100 33780 270320 0 0 0 1216 446 522 0 54 45 1
3 0 408 5736 33780 270644 0 0 0 1092 475 736 0 32 67 1
1 0 408 6372 33952 269720 0 0 0 1040 462 522 4 69 26 1
2 0 408 6436 33944 269592 0 0 0 864 433 287 0 83 16 1
0 0 408 5848 34024 270140 0 0 4 1232 480 701 3 39 53 5
2 0 408 9196 33936 266612 0 0 104 212 596 1035 10 43 40 8
3 0 408 8824 33936 267380 0 0 0 512 388 90 0 100 0 0
4 0 408 7956 33968 268148 0 0 0 548 400 114 1 98 0 1
2 0 408 6492 34000 269604 0 0 0 892 432 629 0 38 61 1
2 0 408 6416 34084 269648 0 0 0 1712 403 591 0 40 58 2
5 0 408 6612 34120 269376 0 0 0 844 447 557 1 49 49 1
4 0 408 6424 34148 269548 0 0 0 880 465 493 0 65 35 0
1 0 408 6336 34196 269596 0 0 0 1112 475 552 3 59 36 2
4 1 408 6304 34340 269404 0 0 0 1668 378 316 0 78 22 0
3 0 408 6096 34368 269608 0 0 0 308 411 625 1 38 59 2
3 0 408 6268 34412 269372 0 0 0 1148 398 583 0 39 60 1
5 0 408 6400 34444 269264 0 0 0 824 431 414 0 67 33 0
--
Mike Benoit <ipso@snappymail.ca>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-06-29 17:41 ReiserFS v3 choking when free space falls below 10%? Mike Benoit
@ 2006-06-29 19:12 ` Vladimir V. Saveliev
2006-06-29 20:15 ` Mike Benoit
2006-07-24 22:26 ` ReiserFS v3 choking when free space falls below 10% - FIXED Mike Benoit
1 sibling, 1 reply; 49+ messages in thread
From: Vladimir V. Saveliev @ 2006-06-29 19:12 UTC (permalink / raw)
To: Mike Benoit; +Cc: reiserfs-list
Hello

On Thu, 2006-06-29 at 10:41 -0700, Mike Benoit wrote:
> My MythTV box recently started showing odd behavior during recordings:
> at certain times the load of the box would spike to 10+ and recordings
> would start losing frames and become unwatchable. top would show
> mythbackend using 90+% SYS CPU, whereas under normal circumstances it
> uses about 5% USR.
>
> So I finally got around to profiling mythbackend when the load starts to
> spike. To my surprise, it appears that once I have less than 10% (30 GB)
> free on the drive, reiserfs can't keep up; even just writing at 1 MB/sec
> is too much for it.
>
> Is there something that can be done to fix this? 30 GB seems like a lot
> of wasted space.
>
> #opreport
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> TIMER:0|
> samples| %|
> ------------------
> 77863 78.7856 reiserfs
> 18183 18.3984 vmlinux
> 695 0.7032 mysqld
> 452 0.4574 libc-2.4.so
> 360 0.3643 libmythtv-0.19.so.0.19.0
> 324 0.3278 ivtv
> 323 0.3268 nvidia
> 242 0.2449 libqt-mt.so.3.3.6
> 110 0.1113 libpthread-2.4.so
> 53 0.0536 libstdc++.so.6.0.8
> 35 0.0354 ld-2.4.so
> 23 0.0233 libperl.so
> 22 0.0223 libz.so.1.2.3
> <snip>
>
> #opreport -l /usr/src/linux/vmlinux
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples % symbol name
> 9607 52.8351 default_idle
> 7694 42.3142 find_next_zero_bit

It looks like the problem is high fragmentation of free space.
find_next_zero_bit is the function used to scan the bitmaps in order
to find blocks for allocation.

> 183 1.0064 __copy_from_user_ll
> 57 0.3135 handle_IRQ_event
> 37 0.2035 __copy_to_user_ll
> 34 0.1870 ide_outb
> 30 0.1650 ide_end_request
> 22 0.1210 ioread8
> 22 0.1210 schedule
> 21 0.1155 get_page_from_freelist
> 17 0.0935 mmx_clear_page
> <snip>
^ permalink raw reply [flat|nested] 49+ messages in thread
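A minimal sketch of the effect Vladimir describes, written in Python rather than the kernel's C. The list-of-bits bitmap and the helper name `find_free_window` are illustrative assumptions, not ReiserFS code. It scans a bitmap for a run of contiguous free (0) blocks and reports how many bits had to be examined, which is roughly the work that shows up as find_next_zero_bit time in the profile.

```python
def find_free_window(bitmap, want):
    """Scan for `want` contiguous free (0) blocks.

    Returns (start_index or None, number_of_bits_examined).
    Hypothetical helper for illustration only.
    """
    examined = 0
    run_start = None
    for i, bit in enumerate(bitmap):
        examined += 1
        if bit == 0:
            if run_start is None:
                run_start = i          # a new free run begins here
            if i - run_start + 1 == want:
                return run_start, examined
        else:
            run_start = None           # a used block ends the current run
    return None, examined


# Same amount of free space (8 blocks), laid out differently.
compact = [1] * 8 + [0] * 8
fragmented = [1, 0] * 8

print(find_free_window(compact, 4))
print(find_free_window(fragmented, 4))
```

With the free blocks in one piece the scan stops as soon as the window is found; with the same number of free blocks scattered one at a time, the whole bitmap is walked and no window of four is found at all.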
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-06-29 19:12 ` Vladimir V. Saveliev
@ 2006-06-29 20:15 ` Mike Benoit
2006-06-29 20:22 ` Vladimir V. Saveliev
` (2 more replies)
0 siblings, 3 replies; 49+ messages in thread
From: Mike Benoit @ 2006-06-29 20:15 UTC (permalink / raw)
To: Vladimir V. Saveliev; +Cc: reiserfs-list
On Thu, 2006-06-29 at 23:12 +0400, Vladimir V. Saveliev wrote:
> Hello
>
> On Thu, 2006-06-29 at 10:41 -0700, Mike Benoit wrote:
> > So I finally got around to profiling mythbackend when the load starts to
> > spike. To my surprise, it appears that once I have less than 10% (30 GB)
> > free on the drive, reiserfs can't keep up; even just writing at 1 MB/sec
> > is too much for it.
> >
> > Is there something that can be done to fix this? 30 GB seems like a lot
> > of wasted space.
> >
> > #opreport
> > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > Profiling through timer interrupt
> > TIMER:0|
> > samples| %|
> > ------------------
> > 77863 78.7856 reiserfs
> > 18183 18.3984 vmlinux
> > 695 0.7032 mysqld
> > 452 0.4574 libc-2.4.so
> > 360 0.3643 libmythtv-0.19.so.0.19.0
> > 324 0.3278 ivtv
> > 323 0.3268 nvidia
> > 242 0.2449 libqt-mt.so.3.3.6
> > 110 0.1113 libpthread-2.4.so
> > 53 0.0536 libstdc++.so.6.0.8
> > 35 0.0354 ld-2.4.so
> > 23 0.0233 libperl.so
> > 22 0.0223 libz.so.1.2.3
> > <snip>
> >
> > #opreport -l /usr/src/linux/vmlinux
> > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > Profiling through timer interrupt
> > samples % symbol name
> > 9607 52.8351 default_idle
> > 7694 42.3142 find_next_zero_bit
>
> It looks like the problem is high fragmentation of free space.
> find_next_zero_bit is a function which is used to scan bitmaps in order
> to find blocks for allocation.
>

This seems strange, because to me this type of workload would lend
itself to being less fragmented than most workloads. All the box does is
record TV programs, so over the course of 30-60 min periods I would
guess 95+% of the writes are sequential.

Why would the fragmentation be so bad? Is there a way to tell what the
fragmentation rate is?

Thanks.

--
Mike Benoit <ipso@snappymail.ca>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-06-29 20:15 ` Mike Benoit
@ 2006-06-29 20:22 ` Vladimir V. Saveliev
2006-06-29 21:01 ` Mike Benoit
2006-06-29 20:36 ` Nate Diller
2006-06-30 16:33 ` Hans Reiser
2 siblings, 1 reply; 49+ messages in thread
From: Vladimir V. Saveliev @ 2006-06-29 20:22 UTC (permalink / raw)
To: Mike Benoit; +Cc: reiserfs-list
Hello

On Thu, 2006-06-29 at 13:15 -0700, Mike Benoit wrote:
> On Thu, 2006-06-29 at 23:12 +0400, Vladimir V. Saveliev wrote:
> > Hello
> >
> > On Thu, 2006-06-29 at 10:41 -0700, Mike Benoit wrote:
> > > So I finally got around to profiling mythbackend when the load starts to
> > > spike. To my surprise, it appears that once I have less than 10% (30 GB)
> > > free on the drive, reiserfs can't keep up; even just writing at 1 MB/sec
> > > is too much for it.
> > >
> > > Is there something that can be done to fix this? 30 GB seems like a lot
> > > of wasted space.
> > >
> > > #opreport
> > > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > > Profiling through timer interrupt
> > > TIMER:0|
> > > samples| %|
> > > ------------------
> > > 77863 78.7856 reiserfs
> > > 18183 18.3984 vmlinux
> > > 695 0.7032 mysqld
> > > 452 0.4574 libc-2.4.so
> > > 360 0.3643 libmythtv-0.19.so.0.19.0
> > > 324 0.3278 ivtv
> > > 323 0.3268 nvidia
> > > 242 0.2449 libqt-mt.so.3.3.6
> > > 110 0.1113 libpthread-2.4.so
> > > 53 0.0536 libstdc++.so.6.0.8
> > > 35 0.0354 ld-2.4.so
> > > 23 0.0233 libperl.so
> > > 22 0.0223 libz.so.1.2.3
> > > <snip>
> > >
> > > #opreport -l /usr/src/linux/vmlinux
> > > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > > Profiling through timer interrupt
> > > samples % symbol name
> > > 9607 52.8351 default_idle
> > > 7694 42.3142 find_next_zero_bit
> >
> > It looks like the problem is high fragmentation of free space.
> > find_next_zero_bit is a function which is used to scan bitmaps in order
> > to find blocks for allocation.
> >
>
> This seems strange, because to me this type of workload would lend
> itself to being less fragmented than most workloads. All the box does is
> record TV programs, so over the course of 30-60 min periods I would
> guess 95+% of the writes are sequential.
>

Do you ever remove files?

> Why would the fragmentation be so bad? Is there a way to tell what the
> fragmentation rate is?
>

Can you please run debugreiserfs -m /dev/hda1 > bitmap and send me that
file?
The bitmap file should contain a dump of free and used blocks. If most
of the bitmap blocks contain many interleaved free/used sections, free
space is highly fragmented and allocating new free blocks can be
CPU-expensive.

> Thanks.
>
^ permalink raw reply [flat|nested] 49+ messages in thread
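As a rough illustration of what "interleaved free/used sections" means in numbers, the sketch below summarizes free-space fragmentation from a bitmap. It assumes the bitmap has already been turned into a plain string of '0' (free) and '1' (used) characters; that representation and the function names are assumptions made for this example and are not the output format of debugreiserfs.

```python
from itertools import groupby


def free_extents(bitmap):
    """Return the length of every run of free ('0') blocks, in order."""
    return [len(list(run)) for bit, run in groupby(bitmap) if bit == "0"]


def fragmentation_summary(bitmap):
    """Summarize free space: many short extents means high fragmentation."""
    extents = free_extents(bitmap)
    free = sum(extents)
    return {
        "free_blocks": free,
        "free_extents": len(extents),
        "avg_extent": free / len(extents) if extents else 0.0,
        "largest_extent": max(extents, default=0),
    }


# Hypothetical 13-block bitmap: one run of 4 free blocks, then 3 isolated ones.
print(fragmentation_summary("1100001101010"))
```

A low average extent length relative to the size of a typical allocation request suggests that the allocator will spend most of its time skipping over runs that are too short.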
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-06-29 20:22 ` Vladimir V. Saveliev
@ 2006-06-29 21:01 ` Mike Benoit
0 siblings, 0 replies; 49+ messages in thread
From: Mike Benoit @ 2006-06-29 21:01 UTC (permalink / raw)
To: Vladimir V. Saveliev; +Cc: reiserfs-list
On Fri, 2006-06-30 at 00:22 +0400, Vladimir V. Saveliev wrote:
> > This seems strange, because to me this type of workload would lend
> > itself to being less fragmented than most workloads. All the box does is
> > record TV programs, so over the course of 30-60 min periods I would
> > guess 95+% of the writes are sequential.
> >
>
> do you ever remove files?

Yes, files are deleted when the drive starts to fill up, which is how I
discovered this issue in the first place. I always kept a minimum of
10 GB free, and the load would spike whenever I got close to that limit.
I have since set the limit to 40 GB and I haven't seen the problem
since, but I can't use that 40 GB of space either.

> > Why would the fragmentation be so bad? Is there a way to tell what the
> > fragmentation rate is?
> >
>
> can you please run debugreiserfs -m /dev/hda1 > bitmap and send me that
> file?
> bitmap should contain dump of free and used blocks. If most of bitmap
> blocks contain a lot of interleaving free/used sections - free space is
> highly fragmented and allocating new free blocks can be CPU expensive.

I do record two programs at once from time to time, so I can understand
how that would cause fragmentation. However, after each program finishes
I also transcode it to a different format, one at a time. So I would
think that would reduce any fragmentation that may have occurred from
recording two programs at once? Although I suppose if I was transcoding
and recording at the same time, it would just make things worse.

I will send Vladimir the debugreiserfs output privately.

--
Mike Benoit <ipso@snappymail.ca>
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-06-29 20:15 ` Mike Benoit
2006-06-29 20:22 ` Vladimir V. Saveliev
@ 2006-06-29 20:36 ` Nate Diller
2006-06-30 16:33 ` Hans Reiser
2 siblings, 0 replies; 49+ messages in thread
From: Nate Diller @ 2006-06-29 20:36 UTC (permalink / raw)
To: Mike Benoit; +Cc: Vladimir V. Saveliev, reiserfs-list
On 6/29/06, Mike Benoit <ipso@snappymail.ca> wrote:
> On Thu, 2006-06-29 at 23:12 +0400, Vladimir V. Saveliev wrote:
> > Hello
> >
> > On Thu, 2006-06-29 at 10:41 -0700, Mike Benoit wrote:
> > > So I finally got around to profiling mythbackend when the load starts to
> > > spike. To my surprise, it appears that once I have less than 10% (30 GB)
> > > free on the drive, reiserfs can't keep up; even just writing at 1 MB/sec
> > > is too much for it.
> > >
> > > Is there something that can be done to fix this? 30 GB seems like a lot
> > > of wasted space.
> > >
> > > #opreport
> > > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > > Profiling through timer interrupt
> > > TIMER:0|
> > > samples| %|
> > > ------------------
> > > 77863 78.7856 reiserfs
> > > 18183 18.3984 vmlinux
> > > 695 0.7032 mysqld
> > > 452 0.4574 libc-2.4.so
> > > 360 0.3643 libmythtv-0.19.so.0.19.0
> > > 324 0.3278 ivtv
> > > 323 0.3268 nvidia
> > > 242 0.2449 libqt-mt.so.3.3.6
> > > 110 0.1113 libpthread-2.4.so
> > > 53 0.0536 libstdc++.so.6.0.8
> > > 35 0.0354 ld-2.4.so
> > > 23 0.0233 libperl.so
> > > 22 0.0223 libz.so.1.2.3
> > > <snip>
> > >
> > > #opreport -l /usr/src/linux/vmlinux
> > > CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> > > Profiling through timer interrupt
> > > samples % symbol name
> > > 9607 52.8351 default_idle
> > > 7694 42.3142 find_next_zero_bit
> >
> > It looks like the problem is high fragmentation of free space.
> > find_next_zero_bit is a function which is used to scan bitmaps in order
> > to find blocks for allocation.
> >
>
> This seems strange, because to me this type of workload would lend
> itself to being less fragmented than most workloads. All the box does is
> record TV programs, so over the course of 30-60 min periods I would
> guess 95+% of the writes are sequential.

Do you frequently record more than one program at once?

> Why would the fragmentation be so bad? Is there a way to tell what the
> fragmentation rate is?

reiserfs does not use delayed allocation, and AFAIK cannot do
reservations either. This means that if you are recording two things at
once, they will get interleaved on disk in very small increments. Then,
when you delete one file, the free space is fragmented too.

Your best bet would be to use reiser4, which has delayed allocation, and
to set /proc/sys/vm/dirty_background_ratio to 40 and dirty_ratio to 45
or 50.

NATE
^ permalink raw reply [flat|nested] 49+ messages in thread
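Nate's explanation can be sketched with a toy model. This is an assumption-laden illustration, not how either filesystem actually allocates: two files, "A" and "B", are written concurrently, and each write takes the next `chunk` free blocks. A `chunk` of 1 stands in for allocating blocks immediately as data arrives; a larger `chunk` stands in for delayed allocation, where more data is buffered before blocks are assigned. All names here are hypothetical.

```python
def allocate_interleaved(n_blocks, chunk):
    """Lay out two concurrently written files, `chunk` blocks at a time."""
    disk = []
    owner = "A"
    while len(disk) < n_blocks:
        take = min(chunk, n_blocks - len(disk))
        disk.extend([owner] * take)
        owner = "B" if owner == "A" else "A"   # the other writer goes next
    return disk


def delete_file(disk, owner):
    """Mark every block belonging to `owner` as free ('.')."""
    return ["." if block == owner else block for block in disk]


def free_extent_count(disk):
    """Count runs of contiguous free blocks."""
    count = 0
    prev_free = False
    for block in disk:
        is_free = block == "."
        if is_free and not prev_free:
            count += 1
        prev_free = is_free
    return count


immediate = delete_file(allocate_interleaved(16, 1), "B")
batched = delete_file(allocate_interleaved(16, 8), "B")
print("".join(immediate), free_extent_count(immediate))
print("".join(batched), free_extent_count(batched))
```

In this model, deleting one of two finely interleaved files leaves as many free extents as the file had blocks, while the batched layout leaves a single contiguous hole.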
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-06-29 20:15 ` Mike Benoit
2006-06-29 20:22 ` Vladimir V. Saveliev
2006-06-29 20:36 ` Nate Diller
@ 2006-06-30 16:33 ` Hans Reiser
2006-06-30 16:47 ` Jeff Mahoney
2 siblings, 1 reply; 49+ messages in thread
From: Hans Reiser @ 2006-06-30 16:33 UTC (permalink / raw)
To: Mike Benoit, Jeff Mahoney; +Cc: Vladimir V. Saveliev, reiserfs-list
Mike Benoit wrote:

>
>This seems strange, because to me this type of workload would lend
>itself to being less fragmented than most workloads. All the box does is
>record TV programs, so over the course of 30-60 min periods I would
>guess 95+% of the writes are sequential.
>
>Why would the fragmentation be so bad? Is there a way to tell what the
>fragmentation rate is?
>
>Thanks.
>

I wonder how the bitmap optimizations that Jeff added handle this usage
pattern. Jeff?
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-06-30 16:33 ` Hans Reiser @ 2006-06-30 16:47 ` Jeff Mahoney 2006-06-30 17:04 ` Hans Reiser 2006-07-05 0:37 ` Mike Benoit 0 siblings, 2 replies; 49+ messages in thread From: Jeff Mahoney @ 2006-06-30 16:47 UTC (permalink / raw) To: Hans Reiser; +Cc: Mike Benoit, Vladimir V. Saveliev, reiserfs-list [-- Attachment #1: Type: text/plain, Size: 1312 bytes --] -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hans Reiser wrote: > Mike Benoit wrote: > >> This seems strange, because to me this type of workload would lend >> itself to being less fragmented then most workloads. All the box does is >> records TV programs, so over the course of 30-60min periods I would >> guess 95+% of the writes are sequential. >> >> Why would the fragmentation be so bad? Is there a way to tell what the >> fragmentation rate is? >> >> Thanks. >> >> >> > I wonder how the bitmap optimizations that Jeff added handle this usage > pattern. Jeff? That's certainly interesting. The bitmap hinting code should skip bitmap blocks with fewer blocks than are being asked for. The first zero hint patch was never applied to mainline. I have that in my queue as well. Try using the attached patch. It directs the block allocator to start the search at the first known 0 bit rather than scanning the entire block to find it. I'm not sure if it will have a meaningful performance impact, but it's worth a try. 
- -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iD8DBQFEpVWRLPWxlyuTD7IRAm1uAJwIExdMY1ju2VjnVFmbweEluNUi+QCgqZWL rNWRcVDW0KqBCrvUl1L4veE= =Cuir -----END PGP SIGNATURE----- [-- Attachment #2: reiserfs-05-bitmap-use-first-zero-hint.diff --] [-- Type: text/x-patch, Size: 1349 bytes --] From: Jeff Mahoney <jeffm@suse.com> Subject: [PATCH 5/5] reiserfs: make bitmap use cached first zero bit Currently, the bitmap code uses half of the hinting data gathered and cached and wastes the other half. We'll skip completely full bitmaps, but start scanning in bitmaps at locations where if we consulted the zero bit hint, we'd know there aren't any free bits available. This patch uses the first zero hint to bump the beginning of the search window to where we know there is at least one zero bit. fs/reiserfs/bitmap.c | 5 ++++- 1 files changed, 4 insertions(+), 1 deletion(-) Signed-off-by: Jeff Mahoney <jeffm@suse.com> diff -ruNpX ../dontdiff linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c --- linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c 2006-01-16 16:53:35.663319136 -0500 +++ linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c 2006-01-16 16:53:35.673317616 -0500 @@ -187,7 +187,10 @@ static int scan_bitmap_block(struct reis return 0; // No free blocks in this bitmap } - /* search for a first zero bit -- beggining of a window */ + if (*beg < bi->first_zero_hint) + *beg = bi->first_zero_hint; + + /* search for a first zero bit -- beginning of a window */ *beg = reiserfs_find_next_zero_le_bit ((unsigned long *)(bh->b_data), boundary, *beg); ^ permalink raw reply [flat|nested] 49+ messages in thread
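[Editorial note] The idea behind the patch above can be sketched in plain userspace C. This is a simplified illustration, not the kernel implementation: find_next_zero() here is a naive bit-by-bit search (the kernel's reiserfs_find_next_zero_le_bit works on whole words), struct bitmap_hint and scan_with_hint() are hypothetical names, and the probe counter exists only to show how much scanning the hint avoids.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return 1 if 'bit' is set (block in use), 0 if it is clear (free). */
static int sketch_test_bit(const unsigned long *map, size_t bit)
{
    return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/* Mark 'bit' as in use. */
static void sketch_set_bit(unsigned long *map, size_t bit)
{
    map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/*
 * Naive search for the first zero bit in [start, size).
 * Returns 'size' when no zero bit exists. Every examined bit is
 * counted in *probes so the cost of a scan can be compared.
 */
static size_t find_next_zero(const unsigned long *map, size_t size,
                             size_t start, size_t *probes)
{
    for (size_t bit = start; bit < size; bit++) {
        (*probes)++;
        if (!sketch_test_bit(map, bit))
            return bit;
    }
    return size;
}

/* Hypothetical stand-in for the cached per-bitmap information. */
struct bitmap_hint {
    size_t first_zero_hint; /* no zero bit exists below this index */
};

/*
 * Same search, but the start of the window is first bumped up to the
 * cached hint, mirroring the two lines the patch adds.
 */
static size_t scan_with_hint(const unsigned long *map, size_t size,
                             size_t beg, const struct bitmap_hint *bi,
                             size_t *probes)
{
    assert(bi != NULL);
    if (beg < bi->first_zero_hint)
        beg = bi->first_zero_hint;
    return find_next_zero(map, size, beg, probes);
}
```

With a 256-bit map whose first 200 bits are set, the unhinted search from bit 0 examines 201 bits before it finds bit 200, whereas the hinted search with first_zero_hint = 200 examines one. The hint only helps when the search window starts below it, which may be why the patch did not change the behaviour reported later in the thread.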
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-06-30 16:47 ` Jeff Mahoney @ 2006-06-30 17:04 ` Hans Reiser 2006-06-30 17:46 ` Mike Benoit 2006-07-05 0:37 ` Mike Benoit 1 sibling, 1 reply; 49+ messages in thread From: Hans Reiser @ 2006-06-30 17:04 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Mike Benoit, Vladimir V. Saveliev, reiserfs-list I have to apologize, I just now read the part where find_next_zero is using 42% of CPU. Jeff, this has the "feel" of a bug (affecting performance not correctness). I am skeptical that we are searching only bitmap blocks in which we successfully find and use a free block. Could you look at the code and his results with some care? This profiling result is what I would have expected to see BEFORE your optimizations occurred, and I would not expect it now. Thanks Mike for bringing this to our attention. Hans ------------------------- From: Jeff Mahoney <jeffm@suse.com> Subject: [PATCH 5/5] reiserfs: make bitmap use cached first zero bit Currently, the bitmap code uses half of the hinting data gathered and cached and wastes the other half. We'll skip completely full bitmaps, but start scanning in bitmaps at locations where if we consulted the zero bit hint, we'd know there aren't any free bits available. This patch uses the first zero hint to bump the beginning of the search window to where we know there is at least one zero bit. 
fs/reiserfs/bitmap.c | 5 ++++- 1 files changed, 4 insertions(+), 1 deletion(-) Signed-off-by: Jeff Mahoney <jeffm@suse.com> diff -ruNpX ../dontdiff linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c --- linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c 2006-01-16 16:53:35.663319136 -0500 +++ linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c 2006-01-16 16:53:35.673317616 -0500 @@ -187,7 +187,10 @@ static int scan_bitmap_block(struct reis return 0; // No free blocks in this bitmap } - /* search for a first zero bit -- beggining of a window */ + if (*beg < bi->first_zero_hint) + *beg = bi->first_zero_hint; + + /* search for a first zero bit -- beginning of a window */ *beg = reiserfs_find_next_zero_le_bit ((unsigned long *)(bh->b_data), boundary, *beg); ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-06-30 17:04 ` Hans Reiser @ 2006-06-30 17:46 ` Mike Benoit 2006-06-30 18:18 ` Hans Reiser 0 siblings, 1 reply; 49+ messages in thread From: Mike Benoit @ 2006-06-30 17:46 UTC (permalink / raw) To: Hans Reiser; +Cc: Jeff Mahoney, Vladimir V. Saveliev, reiserfs-list [-- Attachment #1: Type: text/plain, Size: 3057 bytes --] I have emailed the results of debugreiserfs -m /dev/hda1 to Vladimir already. Jeff, I can email it to you as well if you want. Just so you know, there is a thread about this on the MythTV mailing list (http://www.gossamer-threads.com/lists/mythtv/users/208573) and it looks like there might be as many as 5 other people experiencing the same issue. We originally thought it was caused by IVTV drivers, but one other person and I (after profiling) discovered it was related to how much disk space we have available on our reiserfs partitions. I also haven't _noticed_ a gradual increase in load as the free space decreases, as I would expect if reiserfs had to spend longer searching for unused blocks; it seems the box all of a sudden gets hit at a certain point and the load spikes to 10 and writing slows to a crawl. On Fri, 2006-06-30 at 10:04 -0700, Hans Reiser wrote: > I have to apologize, I just now read the part where find_next_zero is > using 42% of CPU. > > Jeff, this has the "feel" of a bug (affecting performance not > correctness). I am skeptical that we are searching only bitmap blocks > in which we successfully find and use a free block. Could you look at > the code and his results with some care? This profiling result is what > I would have expected to see BEFORE your optimizations occurred, and I > would not expect it now. Thanks Mike for bringing this to our attention. 
> > Hans > > ------------------------- > > From: Jeff Mahoney <jeffm@suse.com> > Subject: [PATCH 5/5] reiserfs: make bitmap use cached first zero bit > > Currently, the bitmap code uses half of the hinting data gathered and > cached > and wastes the other half. We'll skip completely full bitmaps, but start > scanning in bitmaps at locations where if we consulted the zero bit hint, > we'd know there aren't any free bits available. > > This patch uses the first zero hint to bump the beginning of the search > window to where we know there is at least one zero bit. > > fs/reiserfs/bitmap.c | 5 ++++- > 1 files changed, 4 insertions(+), 1 deletion(-) > > Signed-off-by: Jeff Mahoney <jeffm@suse.com> > > diff -ruNpX ../dontdiff linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c > linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c > --- linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c 2006-01-16 > 16:53:35.663319136 -0500 > +++ linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c 2006-01-16 > 16:53:35.673317616 -0500 > @@ -187,7 +187,10 @@ static int scan_bitmap_block(struct reis > return 0; // No free blocks in this bitmap > } > > - /* search for a first zero bit -- beggining of a window */ > + if (*beg < bi->first_zero_hint) > + *beg = bi->first_zero_hint; > + > + /* search for a first zero bit -- beginning of a window */ > *beg = reiserfs_find_next_zero_le_bit > ((unsigned long *)(bh->b_data), boundary, *beg); > > > -- Mike Benoit <ipso@snappymail.ca> [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-06-30 17:46 ` Mike Benoit @ 2006-06-30 18:18 ` Hans Reiser 0 siblings, 0 replies; 49+ messages in thread From: Hans Reiser @ 2006-06-30 18:18 UTC (permalink / raw) To: Mike Benoit; +Cc: Jeff Mahoney, Vladimir V. Saveliev, reiserfs-list Jeff, does the code do anything funny when crossing the 90% point? You have special heuristics for that, yes? Maybe a bug is hiding in them? ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-06-30 16:47 ` Jeff Mahoney 2006-06-30 17:04 ` Hans Reiser @ 2006-07-05 0:37 ` Mike Benoit 2006-07-05 2:37 ` Hans Reiser 1 sibling, 1 reply; 49+ messages in thread From: Mike Benoit @ 2006-07-05 0:37 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Hans Reiser, Vladimir V. Saveliev, reiserfs-list [-- Attachment #1: Type: text/plain, Size: 1691 bytes --] Hi Jeff, I just tried the patch you suggested and it didn't make a difference. The load still spikes as soon as the free space falls below ~10%. On Fri, 2006-06-30 at 12:47 -0400, Jeff Mahoney wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hans Reiser wrote: > > Mike Benoit wrote: > > > >> This seems strange, because to me this type of workload would lend > >> itself to being less fragmented then most workloads. All the box does is > >> records TV programs, so over the course of 30-60min periods I would > >> guess 95+% of the writes are sequential. > >> > >> Why would the fragmentation be so bad? Is there a way to tell what the > >> fragmentation rate is? > >> > >> Thanks. > >> > >> > >> > > I wonder how the bitmap optimizations that Jeff added handle this usage > > pattern. Jeff? > > That's certainly interesting. The bitmap hinting code should skip bitmap > blocks with fewer blocks that are being asked for. The first zero hint > patch was never applied to mainline. I have that in my queue as well. > > Try using the attached patch. It directs the block allocator to start > the search at the first known 0 bit rather than scanning the entire > block to find it. I'm not sure if will have a meaningful performance > impact, but it's worth a try. 
> > - -Jeff > > - -- > Jeff Mahoney > SUSE Labs > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.2 (GNU/Linux) > Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org > > iD8DBQFEpVWRLPWxlyuTD7IRAm1uAJwIExdMY1ju2VjnVFmbweEluNUi+QCgqZWL > rNWRcVDW0KqBCrvUl1L4veE= > =Cuir > -----END PGP SIGNATURE----- -- Mike Benoit <ipso@snappymail.ca> [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-05 0:37 ` Mike Benoit @ 2006-07-05 2:37 ` Hans Reiser 2006-07-05 14:42 ` Tom Vier ` (2 more replies) 0 siblings, 3 replies; 49+ messages in thread From: Hans Reiser @ 2006-07-05 2:37 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Mike Benoit, Vladimir V. Saveliev, reiserfs-list Mike Benoit wrote: >Hi Jeff, > > I just tried the patch you suggested and it didn't make a difference. >The load still spikes as soon as the free space falls below ~10%. > > Jeff, please audit your code for what happens when all the bitmap blocks reach 90% full. Could you discuss your design and code in that regard for our benefit? Mike, thanks so much for going to this much effort. It is rather likely this is a problem affecting many users. Hans ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-05 2:37 ` Hans Reiser @ 2006-07-05 14:42 ` Tom Vier 2006-07-05 19:12 ` Jeff Mahoney [not found] ` <20060706125856.fdac1d16.pegasus@nerv.eu.org> 2 siblings, 0 replies; 49+ messages in thread From: Tom Vier @ 2006-07-05 14:42 UTC (permalink / raw) To: reiserfs-list On Tue, Jul 04, 2006 at 07:37:34PM -0700, Hans Reiser wrote: > Mike, thanks so much for going to this much effort. It is rather likely > this is a problem affecting many users. Last weekend, i accidentally filled my /. I noticed when i heard the drives (it's a 2 drive raid 0) thrashing. I didn't watch the cpu load, which may've been high, but it seemed to be io bound. -- Tom Vier <tmv@comcast.net> DSA Key ID 0x15741ECE ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-05 2:37 ` Hans Reiser 2006-07-05 14:42 ` Tom Vier @ 2006-07-05 19:12 ` Jeff Mahoney [not found] ` <20060706125856.fdac1d16.pegasus@nerv.eu.org> 2 siblings, 0 replies; 49+ messages in thread From: Jeff Mahoney @ 2006-07-05 19:12 UTC (permalink / raw) To: Hans Reiser; +Cc: Mike Benoit, Vladimir V. Saveliev, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hans Reiser wrote: > Mike Benoit wrote: > >> Hi Jeff, >> >> I just tried the patch you suggested and it didn't make a difference. >> The load still spikes as soon as the free space falls below ~10%. >> >> > Jeff, please audit your code for what happens when all the bitmap blocks > reach 90% full. Could you discuss your design and code in that regard > for our benefit? > > Mike, thanks so much for going to this much effort. It is rather likely > this is a problem affecting many users. Mike - Can you post a copy of debugreiserfs -p <dev> |gzip -c > somefile.gz somewhere? I can't reproduce that behavior locally and it would help quite a bit if I had a test case where I could. Thanks. - -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iD8DBQFErA8jLPWxlyuTD7IRAl03AJ4wPthmJ2/SSIJPux5waXGdaEoDeACfV2gK g12ngw/mzsZUYC3Kj8uuIdE= =1qtg -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 49+ messages in thread
[parent not found: <20060706125856.fdac1d16.pegasus@nerv.eu.org>]
* Re: ReiserFS v3 choking when free space falls below 10%? [not found] ` <20060706125856.fdac1d16.pegasus@nerv.eu.org> @ 2006-07-06 15:43 ` Mike Benoit 2006-07-06 16:01 ` Jonathan Briggs ` (2 more replies) 0 siblings, 3 replies; 49+ messages in thread From: Mike Benoit @ 2006-07-06 15:43 UTC (permalink / raw) To: Jure Pečar; +Cc: reiserfs-list [-- Attachment #1: Type: text/plain, Size: 1352 bytes --] On Thu, 2006-07-06 at 12:58 +0200, Jure Pečar wrote: > On Tue, 04 Jul 2006 19:37:34 -0700 > Hans Reiser <reiser@namesys.com> wrote: > > > Mike Benoit wrote: > > > > >Hi Jeff, > > > > > > I just tried the patch you suggested and it didn't make a > > > difference. > > >The load still spikes as soon as the free space falls below ~10%. > > > > > > > > Jeff, please audit your code for what happens when all the bitmap > > blocks reach 90% full. Could you discuss your design and code in > > that regard for our benefit? > > > > Mike, thanks so much for going to this much effort. It is rather > > likely this is a problem affecting many users. > > I run my busy mailservers with 0.5-2% free space (that's still a couple of gigabytes) and have no problems. It's true that I haven't touched the kernel & reiserfs there (2.4.21), so it does not have any additions to the reiserfs v3 code since then. It just works, so I don't have any desire to fix it :) > > My desktop machine (v2.6.16, same as my MythTV box) is running with 9% free space right now and it is not experiencing any slow down. I think the problem is caused by the usage pattern of MythTV and how it simultaneously streams one or more large files to the HD in relatively small chunks over a long period of time. -- Mike Benoit <ipso@snappymail.ca> [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 15:43 ` Mike Benoit @ 2006-07-06 16:01 ` Jonathan Briggs 2006-07-06 17:26 ` Toby Thain 2006-07-06 18:02 ` Jeff Mahoney 2 siblings, 0 replies; 49+ messages in thread From: Jonathan Briggs @ 2006-07-06 16:01 UTC (permalink / raw) To: Mike Benoit; +Cc: Reiserfs mail-list [-- Attachment #1: Type: text/plain, Size: 600 bytes --] On Thu, 2006-07-06 at 08:43 -0700, Mike Benoit wrote: [snip] > My desktop machine (v2.6.16, same as my MythTV box) is running with 9% > free space right now and it is not experiencing any slow down. I think > the problem is caused by the usage pattern of MythTV and how it > simultaneously streams one or more large files to the HD in relatively > small chunks over a long period of time. Hasn't someone patched MythTV to pre-allocate (zero-write) the video files to the expected sizes? I was sure I'd read about that somewhere... -- Jonathan Briggs <jbriggs@esoft.com> eSoft, Inc. [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? @ 2006-07-06 17:26 ` Toby Thain 0 siblings, 0 replies; 49+ messages in thread From: Toby Thain @ 2006-07-06 17:26 UTC (permalink / raw) To: reiserfs-list [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="windows-1254"; delsp="yes"; format="flowed", Size: 1583 bytes --] On 6-Jul-06, at 11:43 AM, Mike Benoit wrote: > On Thu, 2006-07-06 at 12:58 +0200, Jure Pečar wrote: >> On Tue, 04 Jul 2006 19:37:34 -0700 >> Hans Reiser <reiser@namesys.com> wrote: >> >>> Mike Benoit wrote: >>> >>>> Hi Jeff, >>>> >>>> I just tried the patch you suggested and it didn't make a >>>> difference. >>>> The load still spikes as soon as the free space falls below ~10%. >>>> >>>> >>> Jeff, please audit your code for what happens when all the bitmap >>> blocks reach 90% full. Could you discuss your design and code in >>> that regard for our benefit? >>> >>> Mike, thanks so much for going to this much effort. It is rather >>> likely this is a problem affecting many users. >> >> I run my busy mailservers with 0.5-2% free space (that's still a >> couple of gigabytes) and have no problems. It's true that I >> haven't touched the kernel & reiserfs there (2.4.21), so it does >> not have any additions to the reiserfs v3 code since then. It just >> works, so I don't have any desire to fix it :) >> >> > > My desktop machine (v2.6.16, same as my MythTV box) is running with 9% > free space right now and it is not experiencing any slow down. I think > the problem is caused by the usage pattern of MythTV and how it > simultaneously streams one or more large files to the HD in relatively > small chunks over a long period of time. ...And then has a hard timing requirement when reusing the free space, which a desktop/server doesn't have, exposing the issue. --T > > -- > Mike Benoit <ipso@snappymail.ca> ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 15:43 ` Mike Benoit 2006-07-06 16:01 ` Jonathan Briggs 2006-07-06 17:26 ` Toby Thain @ 2006-07-06 18:02 ` Jeff Mahoney 2006-07-06 18:12 ` Hans Reiser 2006-07-06 18:27 ` Mike Benoit 2 siblings, 2 replies; 49+ messages in thread From: Jeff Mahoney @ 2006-07-06 18:02 UTC (permalink / raw) To: Mike Benoit; +Cc: Jure Pečar, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Mike Benoit wrote: > My desktop machine (v2.6.16, same as my MythTV box) is running with 9% > free space right now and it is not experiencing any slow down. I think > the problem is caused by the usage pattern of MythTV and how it > simultaneously streams one or more large files to the HD in relatively > small chunks over a long period of time. Ok, if you run into the problem again, can you dump the metadata before freeing the space? The code itself looks sound, and I'm wondering if you've managed to create pathological fragmentation that's mucking things up. Being able to see the fs metadata would confirm or disprove that theory, and help in fixing it. - -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iD8DBQFErVAgLPWxlyuTD7IRAnXvAJ9gpOT9PR0ndGhmtDOgKsEtcuZB6wCfRkYR WMPwT7Tn8hW/Y/HFs8g6TrU= =2lCS -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 18:02 ` Jeff Mahoney @ 2006-07-06 18:12 ` Hans Reiser 2006-07-06 18:19 ` Jeff Mahoney 2006-07-06 18:27 ` Mike Benoit 1 sibling, 1 reply; 49+ messages in thread From: Hans Reiser @ 2006-07-06 18:12 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Mike Benoit, Jure Pečar, reiserfs-list Jeff Mahoney wrote: > > > > Ok, if you run into the problem again, can you dump the metadata before > freeing the space? The code itself looks sound, and I'm wondering if > you've managed to create pathological fragmentation that's mucking > things up. There should be no possible fragmentation that would increase CPU usage like that. With the current algorithms, in which you check one field in the bitmap to see if it has any free blocks, it should not be possible for scanning bitmaps to take so much time...... There must be a bug in there. ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 18:12 ` Hans Reiser @ 2006-07-06 18:19 ` Jeff Mahoney 2006-07-06 18:47 ` Mike Benoit 0 siblings, 1 reply; 49+ messages in thread From: Jeff Mahoney @ 2006-07-06 18:19 UTC (permalink / raw) To: Hans Reiser; +Cc: Mike Benoit, Jure Pečar, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hans Reiser wrote: > There should be no possible fragmentation that would increase CPU usage > like that. With the current algorithms, in which you check one field in > the bitmap to see if it has any free blocks, it should not be possible > for scanning bitmaps to take so much time...... > > There must be a bug in there. I'm sure there is, but it's a bug that others don't seem to be seeing, including myself, and Mike reported he's not experiencing the problem anymore. Given my current workload, unless I can readily reproduce it locally, this takes a low priority for me. - -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iD8DBQFErVQwLPWxlyuTD7IRArvSAJ9pXBTGPzJjHYXQFHBQhYz5CTqQXwCeM4G4 zUqhLF9xWk1XInebVRevTVo= =qzQB -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 18:19 ` Jeff Mahoney @ 2006-07-06 18:47 ` Mike Benoit 2006-07-06 19:17 ` Hans Reiser 0 siblings, 1 reply; 49+ messages in thread From: Mike Benoit @ 2006-07-06 18:47 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Hans Reiser, Jure Pečar, reiserfs-list [-- Attachment #1: Type: text/plain, Size: 1930 bytes --] On Thu, 2006-07-06 at 14:19 -0400, Jeff Mahoney wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hans Reiser wrote: > > There should be no possible fragmentation that would increase CPU usage > > like that. With the current algorithms, in which you check one field in > > the bitmap to see if it has any free blocks, it should not be possible > > for scanning bitmaps to take so much time...... > > > > There must be a bug in there. > > I'm sure there is, but it's a bug that others don't seem to be seeing, > including myself, and Mike reported he's not experiencing the problem > anymore. Given my current workload, unless I can readily reproduce it > locally, this takes a low priority for me. Jeff, I'm sure there are at least 5 other people seeing the same or similar symptoms of the problem on the MythTV mailing list: http://www.gossamer-threads.com/lists/mythtv/users/208573?do=post_view_threaded The common factors are high load and ReiserFS. So far I can re-create the problem at will, I just need to record enough programs so my free space falls below 10% for it to happen. However since the box that is experiencing the issue does recordings for other people I have to clear off enough space so the problem goes away between trying to track down the bug. Unfortunately I got a little delete happy this last round and deleted 70gb worth of data. So I've setup 40hrs of recording to be done in 20hrs (two tuners) which should trigger the problem again by tonight or tomorrow morning at the latest. Once that happens I will take a metadata snapshot and email it off to you. 
I'll be sure to not free up so much space from now on so I can re-create the problem in just a couple hours in case you need to me to collect additional data. If I get a chance I'll attempt to make a script that re-creates the problem too. -- Mike Benoit <ipso@snappymail.ca> [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 18:47 ` Mike Benoit @ 2006-07-06 19:17 ` Hans Reiser 0 siblings, 0 replies; 49+ messages in thread From: Hans Reiser @ 2006-07-06 19:17 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Mike Benoit, Jure Pečar, reiserfs-list Jeff, I am suspicious, because I know that 90% is a magic number in your code. ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 18:02 ` Jeff Mahoney 2006-07-06 18:12 ` Hans Reiser @ 2006-07-06 18:27 ` Mike Benoit 2006-07-06 18:39 ` Jeff Mahoney 1 sibling, 1 reply; 49+ messages in thread From: Mike Benoit @ 2006-07-06 18:27 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Jure Pečar, reiserfs-list [-- Attachment #1: Type: text/plain, Size: 1146 bytes --] On Thu, 2006-07-06 at 14:02 -0400, Jeff Mahoney wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Mike Benoit wrote: > > My desktop machine (v2.6.16, same as my MythTV box) is running with 9% > > free space right now and it is not experiencing any slow down. I think > > the problem is caused by the usage pattern of MythTV and how it > > simultaneously streams one or more large files to the HD in relatively > > small chunks over a long period of time. > > Ok, if you run into the problem again, can you dump the metadata before > freeing the space? The code itself looks sound, and I'm wondering if > you've managed to create pathological fragmentation that's mucking > things up. Being able to see the fs metadata would confirm or disprove > that theory, and help in fixing it. Will do, I've started a bunch of recordings so I should start seeing the problem again by tonight or tomorrow morning. Is there any other data you would like me to collect? Additional oprofile reports, vmstat information before the problem occurs and/or after? Let me know. Thanks. -- Mike Benoit <ipso@snappymail.ca> [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 18:27 ` Mike Benoit @ 2006-07-06 18:39 ` Jeff Mahoney 2006-07-07 7:29 ` Mike Benoit 0 siblings, 1 reply; 49+ messages in thread From: Jeff Mahoney @ 2006-07-06 18:39 UTC (permalink / raw) To: Mike Benoit; +Cc: Jure Pečar, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Mike Benoit wrote: > On Thu, 2006-07-06 at 14:02 -0400, Jeff Mahoney wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Mike Benoit wrote: >>> My desktop machine (v2.6.16, same as my MythTV box) is running with 9% >>> free space right now and it is not experiencing any slow down. I think >>> the problem is caused by the usage pattern of MythTV and how it >>> simultaneously streams one or more large files to the HD in relatively >>> small chunks over a long period of time. >> Ok, if you run into the problem again, can you dump the metadata before >> freeing the space? The code itself looks sound, and I'm wondering if >> you've managed to create pathological fragmentation that's mucking >> things up. Being able to see the fs metadata would confirm or disprove >> that theory, and help in fixing it. > > Will do, I've started a bunch of recordings so I should start seeing the > problem again by tonight or tomorrow morning. Is there any other data > you would like me to collect? Additional oprofile reports, vmstat > information before the problem occurs and/or after? Great. Any information you can provide would help quite a bit. oprofile would be useful, as would vmstat information. - -Jeff - -- Jeff Mahoney SUSE Labs -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iD8DBQFErVjRLPWxlyuTD7IRAlgoAKCRqtHLk6Uq9Bp3yZq/18tHt8l2mwCfT206 UMSE1Om/pvg+svHImWkwLT8= =I5Oj -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-06 18:39 ` Jeff Mahoney @ 2006-07-07 7:29 ` Mike Benoit 2006-07-07 17:49 ` Jan Kara 0 siblings, 1 reply; 49+ messages in thread From: Mike Benoit @ 2006-07-07 7:29 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Jure Pečar, Hans Reiser, reiserfs-list [-- Attachment #1.1: Type: text/plain, Size: 6061 bytes --]

Hi Jeff,

Like clockwork, the problem showed up pretty much exactly when I expected. This time, however, I discovered a few other interesting tidbits along the way.

I tried to re-create the problem much faster by writing a little script that would append data to a file at about 5mb/s. I ran two instances of this script simultaneously, so each script was writing to two separate files at the same time. After the files reached 2gb (close to the size of a recording) both scripts would start writing to new files. Two recordings were also going on at the same time as all of this. I filled the drive up so about 10gb (5%) was free, and although the write speed dropped off significantly as the drive filled up, the SYS CPU time never increased. This method obviously failed to re-create the problem, so I deleted all the files the script created (~60gb worth) and let MythTV do its thing until the drive filled up that way. The pattern in which MythTV writes data out must have something to do with this.

The other interesting thing that happened was that at about 11pm tonight I noticed the problem started occurring (SYS CPU was high), but MythTV was transcoding MPEG2 recordings to MPEG4, which was using USR CPU time, so I tried to stop this process so the oprofile data wouldn't be so cluttered. When I stopped the transcoding, MythTV deleted the temp file it had been creating, and this caused the problem to stop almost immediately! I didn't realize at first that MythTV had deleted this temp file, so I was disappointed and started the transcoding again in hopes of re-creating the problem.
Within about 45mins, it started happening again, but this time I renamed the temp file (so it wouldn't get deleted) MythTV was trancoding to before I killed that process, and was able to keep the problem happening long enough to get a clean oprofile and better vmstat data. The really interesting thing is how much free space was available each time the problem hit: Time #1: Thu Jul 6 22:53:26 PDT 2006 /dev/hda1 293024652 269089456 23935196 92% / Time #2 (45mins later): Thu Jul 6 23:35:38 PDT 2006 /dev/hda1 293024652 269227580 23797072 92% / It seems like once the free space hits a very specific point, the problem is triggered. As you will notice in the vmstat logs, within about 30seconds the SYS CPU time goes from 4% to 75+% and hovers there. Attached are the vmstat logs and oprofile report and I'm sending you the output of debugreiserfs -p /dev/hda1 when the drive is 92% full privately (when it finishes, could be morning). Just so you know, I had vmstat set to output 6 times every 10 seconds, then I ran date/df, rinse and repeat. This is the script I used to collect the data: { while [ "1" != "0" ] ; do date df vmstat 10 6 done } 2>>/tmp/monitor.log 1>>/tmp/monitor.log vmstat_1.txt is the first time the problem occurred. vmstat_2.txt is the second time the problem occurred. Hopefully this helps you track down the issue. If not, let me know if you want me to collect more data. I'll try to keep the drive as full as possible so I can re-create the problem much faster. Also if you need access to the box, that can be arranged. Thanks. PS. I'm running kernel v2.6.16.21-rfsfix, "rfsfix" is the following patch you sent me. I experienced the problem on kernels as old as 2.6.14. 
diff -ruNpX ../dontdiff linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c
--- linux-2.6.15.orig.staging1/fs/reiserfs/bitmap.c 2006-01-16 16:53:35.663319136 -0500
+++ linux-2.6.15.orig.staging2/fs/reiserfs/bitmap.c 2006-01-16 16:53:35.673317616 -0500
@@ -187,7 +187,10 @@ static int scan_bitmap_block(struct reis
 		return 0;	// No free blocks in this bitmap
 	}
 
-	/* search for a first zero bit -- beggining of a window */
+	if (*beg < bi->first_zero_hint)
+		*beg = bi->first_zero_hint;
+
+	/* search for a first zero bit -- beginning of a window */
 	*beg = reiserfs_find_next_zero_le_bit
 	    ((unsigned long *)(bh->b_data), boundary, *beg);

On Thu, 2006-07-06 at 14:39 -0400, Jeff Mahoney wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Mike Benoit wrote:
> > On Thu, 2006-07-06 at 14:02 -0400, Jeff Mahoney wrote:
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >> Mike Benoit wrote:
> >>> My desktop machine (v2.6.16, same as my MythTV box) is running with 9%
> >>> free space right now and it is not experiencing any slow down. I think
> >>> the problem is caused by the usage pattern of MythTV and how it
> >>> simultaneously streams one or more large files to the HD in relatively
> >>> small chunks over a long period of time.
> >> Ok, if you run into the problem again, can you dump the metadata before
> >> freeing the space? The code itself looks sound, and I'm wondering if
> >> you've managed to create pathological fragmentation that's mucking
> >> things up. Being able to see the fs metadata would confirm or disprove
> >> that theory, and help in fixing it.
> >
> > Will do, I've started a bunch of recordings so I should start seeing the
> > problem again by tonight or tomorrow morning. Is there any other data
> > you would like me to collect? Additional oprofile reports, vmstat
> > information before the problem occurs and/or after?
>
> Great. Any information you can provide would help quite a bit. oprofile
> would be useful, as would vmstat information.
>
> - -Jeff
>
> - --
> Jeff Mahoney
> SUSE Labs
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2 (GNU/Linux)
> Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org
>
> iD8DBQFErVjRLPWxlyuTD7IRAlgoAKCRqtHLk6Uq9Bp3yZq/18tHt8l2mwCfT206
> UMSE1Om/pvg+svHImWkwLT8=
> =I5Oj
> -----END PGP SIGNATURE-----
--
Mike Benoit <ipso@snappymail.ca>

[-- Attachment #1.2: oprofile.txt --]
[-- Type: text/plain, Size: 1163 bytes --]

[root@mythtv tmp]# opreport
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
          TIMER:0|
  samples|      %|
------------------
    49983 87.9595 reiserfs
     6146 10.8157 vmlinux
      162  0.2851 libc-2.4.so
      111  0.1953 nvidia
       99  0.1742 libmythtv-0.19.so.0.19.0
       90  0.1584 ivtv
       42  0.0739 mysqld
       35  0.0616 libstdc++.so.6.0.8
       33  0.0581 opreport
       26  0.0458 libqt-mt.so.3.3.6
       14  0.0246 ld-2.4.so
       14  0.0246 libperl.so
       13  0.0229 libbfd-2.16.91.0.6.so
<snip>

[root@mythtv tmp]# opreport -l /usr/src/linux-2.6.16.21-rfsfix/vmlinux
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        symbol name
3393     55.2066  find_next_zero_bit
2256     36.7068  default_idle
114       1.8549  __do_softirq
56        0.9112  __copy_from_user_ll
16        0.2603  __copy_to_user_ll
13        0.2115  __find_get_block
13        0.2115  mmx_clear_page
12        0.1952  ide_outb
10        0.1627  number
8         0.1302  ioread8
7         0.1139  find_get_page
<snip>

[-- Attachment #1.3: vmstat_1.txt --]
[-- Type: text/plain, Size: 11572 bytes --]

Thu Jul 6 22:51:45 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 268876336  24148316  92% /
ipso:/home/ipso/backup
                     116130272 103483184  12647088  90% /mnt/backup
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  0  80144   6100  12596 108080    6    7    15    62    8    82 46  2 42 10
 1  1  80144   6236  12592 107936    0    0  1577  2174  571  1567 97  3  0  0
 1  1  80144   6076  12608 107896    0    0  1607  2215  558  1556 97  3  0  0
 4  0
80144 6192 12500 108128 0 0 1591 2456 562 1558 96 4 0 0 1 1 80144 6096 12524 108032 0 0 1578 2224 570 1523 93 7 0 0 2 1 80144 5436 12816 108400 0 0 1437 2256 558 1428 89 10 0 1 Thu Jul 6 22:52:35 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 268983824 24040828 92% / ipso:/home/ipso/backup 116130272 103483184 12647088 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 1 80144 5980 13012 107732 6 7 16 63 8 83 46 2 42 10 2 0 80144 6016 13120 107632 0 0 1526 2063 575 1496 91 9 0 1 2 1 80144 6116 13004 107392 0 0 1439 2348 571 1453 89 10 0 1 4 0 80144 6252 13092 107388 0 0 1526 2354 563 1542 94 5 0 1 3 0 80144 6268 13156 107280 0 0 1553 2080 563 1499 92 8 0 0 3 1 80144 6288 13032 107264 0 0 1540 2383 576 1504 92 6 0 2 Thu Jul 6 22:53:26 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269089456 23935196 92% / ipso:/home/ipso/backup 116130272 103483184 12647088 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 1 80144 5900 13112 107640 6 7 16 63 8 83 46 2 41 10 1 0 80144 6356 13196 107128 0 0 1512 2381 566 1492 90 8 0 2 2 1 80144 6484 13224 106820 0 0 1501 2089 571 1518 86 9 0 5 1 0 80144 6600 13456 106460 0 0 1514 2458 580 1564 92 7 0 1 2 1 80144 5704 13820 107024 0 0 1577 2208 562 1459 83 15 0 2 1 0 80144 6308 13612 106788 0 0 1476 2124 561 1397 82 17 0 1 Thu Jul 6 22:54:17 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269194040 23830612 92% / ipso:/home/ipso/backup 116130272 103483064 12647208 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 0 80144 6440 13640 106588 6 7 16 64 8 83 46 2 41 10 6 1 80144 6656 13160 106636 0 0 1362 2432 567 1265 71 26 0 2 11 0 80144 5756 
13112 107684 0 0 669 368 438 298 17 83 0 0 5 1 80144 6716 13192 106556 0 0 835 372 464 469 30 70 0 0 9 0 80144 5656 13268 107676 0 0 860 308 467 492 31 69 0 0 13 0 80144 6028 13196 107448 0 0 769 302 476 417 24 76 0 0 Thu Jul 6 22:55:09 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269227876 23796776 92% / ipso:/home/ipso/backup 116130272 103483064 12647208 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 4 0 80144 6628 13224 106688 6 7 17 64 8 84 46 2 41 10 2 0 80144 16668 13124 97352 0 0 629 248 441 377 22 78 0 0 9 0 80144 15988 13520 100960 0 0 94 282 378 158 11 89 0 0 9 0 80144 12272 13768 104540 0 0 51 415 359 102 6 94 0 0 6 0 80144 7672 14112 108780 0 0 78 400 379 141 8 92 0 0 9 0 80144 5920 14424 110016 0 0 92 409 378 150 10 90 0 0 Thu Jul 6 22:56:00 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269243268 23781384 92% / ipso:/home/ipso/backup 116130272 103483120 12647152 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 5 0 80144 6544 14456 109372 6 7 17 64 9 84 46 2 41 10 4 0 80144 5856 14644 109900 0 0 64 342 356 116 8 92 0 0 11 1 80140 6224 14900 109088 1 0 108 424 369 140 8 92 0 0 7 2 80140 5744 15052 109084 0 0 261 312 387 233 22 78 0 0 7 1 80132 6068 15404 107160 84 0 424 396 394 204 19 81 0 0 7 1 80128 6424 15476 107440 1 0 169 411 378 148 10 90 0 0 Thu Jul 6 22:56:52 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269259644 23765008 92% / ipso:/home/ipso/backup 116130272 103483128 12647144 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 4 0 80128 6336 15536 107708 6 7 17 64 9 84 46 2 41 10 6 1 80124 6164 15728 107732 0 0 100 296 366 138 9 91 0 0 7 0 
80124 5880 15836 107864 6 0 280 388 417 285 26 72 0 2 4 3 80120 5884 16080 106664 0 0 120 449 378 156 10 90 0 0 5 0 80116 5776 16192 107476 1 0 55 346 361 93 5 95 0 0 8 0 80112 6236 16488 106688 0 0 120 389 399 182 12 88 0 0 Thu Jul 6 22:57:43 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269276516 23748136 92% / ipso:/home/ipso/backup 116130272 103483128 12647144 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 6 0 80112 6236 16488 106692 6 7 17 64 9 84 46 2 41 10 3 0 80112 6428 16648 106336 0 0 77 353 366 128 9 91 0 0 8 1 80108 6224 16872 105456 0 0 116 417 371 154 9 91 0 0 4 0 80108 6396 17084 105804 0 0 121 366 373 172 12 87 0 1 13 0 80108 6532 17320 105392 0 0 128 412 384 172 13 87 0 0 5 0 80104 5720 17684 105000 0 0 132 424 381 179 13 87 0 0 Thu Jul 6 22:58:34 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269293472 23731180 92% / ipso:/home/ipso/backup 116130272 103483128 12647144 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 9 0 80104 5484 17684 105136 6 7 17 64 9 84 46 2 41 10 7 0 80104 6572 17896 104072 3 0 100 381 360 124 8 92 0 0 15 1 80104 6468 18332 103100 0 0 265 393 422 291 23 76 0 1 7 1 80104 6360 18816 103324 0 0 164 369 377 174 11 89 0 0 8 0 80076 6656 19040 102456 6 0 210 451 399 201 14 86 0 0 5 0 80076 5724 19220 103624 0 0 206 360 402 254 21 79 0 0 Thu Jul 6 22:59:26 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269309984 23714668 92% / ipso:/home/ipso/backup 116130272 103483128 12647144 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 3 0 80076 5788 19220 103624 6 7 17 64 9 84 46 2 41 10 0 2 80052 7976 21904 121856 1 0 328 2766 466 391 3 59 
0 37 2 0 80052 6352 21508 104400 0 0 728 2993 546 1090 60 6 1 33 1 0 80056 5732 21476 105328 0 0 770 2494 493 923 86 2 0 12 1 2 80052 6116 18996 95456 0 0 1622 1520 480 1023 93 4 0 3 4 0 80052 6392 19060 107020 0 0 1028 2472 512 1004 91 7 0 2 Thu Jul 6 23:00:16 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 268256840 24767812 92% / ipso:/home/ipso/backup 116130272 103483128 12647144 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 2 0 80052 6824 19192 106672 6 7 17 65 9 84 46 3 41 10 2 0 80052 8016 17464 109484 0 0 799 2246 486 977 95 3 0 2 2 0 80052 6468 15016 113108 0 0 1027 2664 518 1066 96 3 0 1 1 0 80052 6000 15268 113620 0 0 808 2368 475 959 97 2 0 0 1 0 80052 6452 14864 113308 0 0 804 2577 497 984 96 2 0 2 0 1 80052 25848 16052 111012 18 0 478 1777 517 775 19 2 23 56 Thu Jul 6 23:01:07 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 268326500 24698152 92% / ipso:/home/ipso/backup 116130272 103483128 12647144 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 2 1 80052 25788 16088 111052 6 7 17 66 9 84 46 3 41 10 0 0 80048 5644 17124 131912 1 0 178 2403 492 698 3 2 63 32 0 0 80048 5732 18144 130416 0 0 100 2128 512 727 1 2 66 31 0 0 80048 5784 18884 129856 0 0 8 1929 497 717 1 2 85 11 0 0 80048 6604 18592 129432 0 0 0 1841 453 647 0 1 95 5 0 0 80048 5960 18648 129904 0 0 2 1912 477 696 1 1 95 4 Thu Jul 6 23:01:57 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 268421444 24603208 92% / ipso:/home/ipso/backup 116130272 103483136 12647136 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 0 80048 5704 18656 130184 6 7 17 66 9 84 46 3 41 10 0 0 80048 5596 18796 130104 0 
0 2 2048 480 698 1 1 94 5 0 0 80048 6244 19416 128764 0 0 5 1982 477 678 1 2 85 12 0 0 80048 6052 19232 129272 0 0 6 1874 481 697 1 1 90 9 0 0 80048 5928 19604 128920 0 0 15 2039 488 716 1 2 93 5 0 0 80048 5560 19668 129188 0 0 22 1775 466 667 0 1 96 4 [-- Attachment #1.4: vmstat_2.txt --] [-- Type: text/plain, Size: 18700 bytes --] Thu Jul 6 23:31:25 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 268695440 24329212 92% / ipso:/home/ipso/backup 116130272 103483600 12646672 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 3 1 53388 5600 11960 98968 6 7 28 87 15 96 46 3 41 10 1 1 53388 5496 11980 99376 0 0 1552 2160 555 1546 95 4 0 1 1 1 53388 6608 11612 98364 0 0 1552 2310 576 1575 96 3 0 0 2 1 53388 5288 11720 99584 0 0 1553 2129 577 1600 96 4 0 0 3 0 53388 5944 11920 98960 0 0 1541 2321 557 1566 96 4 0 1 2 1 53388 6444 12476 97612 0 0 1592 2315 587 1580 94 4 0 2 Thu Jul 6 23:32:15 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 268800636 24224016 92% / ipso:/home/ipso/backup 116130272 103483600 12646672 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 1 53388 6388 12564 97604 6 7 29 87 15 96 46 3 41 10 1 0 53388 6576 12364 97692 0 0 1526 2268 581 1579 95 4 0 2 1 1 53388 5288 12312 98976 0 0 1552 2187 559 1561 96 4 0 0 2 1 53388 5308 12044 99084 0 0 1540 2144 574 1573 94 5 0 1 1 0 53388 6144 12052 98480 0 0 1529 2486 582 1566 96 3 0 0 1 1 53388 5604 11644 99240 0 0 1538 2150 550 1531 96 3 0 0 Thu Jul 6 23:33:06 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 268908104 24116548 92% / ipso:/home/ipso/backup 116130272 103483600 12646672 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in 
cs us sy id wa 2 1 53388 6448 11800 98196 6 7 29 88 15 97 46 3 41 10 1 1 53388 4964 11952 99604 0 0 1488 2306 575 1540 93 4 0 3 2 1 53388 5624 12276 98516 0 0 1491 2370 586 1570 95 4 0 1 2 0 53388 6012 12480 97808 0 0 1514 2235 577 1552 96 4 0 0 1 1 53388 5308 12712 98296 0 0 1564 2327 590 1624 95 4 0 1 1 0 53388 5836 12536 98120 0 0 1565 2404 599 1596 96 4 0 0 Thu Jul 6 23:33:57 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269015876 24008776 92% / ipso:/home/ipso/backup 116130272 103483648 12646624 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 4 1 53388 4960 12572 98752 6 7 30 89 15 97 46 3 41 10 1 0 53388 5856 12652 98060 0 0 1542 2270 586 1576 96 3 0 1 1 1 53388 5452 12372 98472 0 0 1564 2261 572 1615 97 3 0 0 1 1 53388 5188 12136 98964 0 0 1554 2206 586 1558 95 4 0 1 4 0 53388 6292 12120 97976 0 0 1567 2288 582 1562 94 4 0 3 1 0 53388 6380 12184 97688 0 0 1553 2417 572 1562 95 3 0 2 Thu Jul 6 23:34:47 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269121368 23903284 92% / ipso:/home/ipso/backup 116130272 103483648 12646624 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 1 53388 5328 12316 98556 6 7 30 89 15 98 46 3 41 10 1 1 53388 5116 12568 98580 0 0 1514 2254 578 1571 96 4 0 0 1 2 53388 6280 12264 97592 0 0 1554 2137 574 1562 95 4 0 1 1 1 53388 4980 12220 99000 0 0 1564 2381 560 1552 96 4 0 0 2 0 53388 6484 12156 97656 0 0 1528 2440 576 1553 94 6 0 0 2 1 53388 6080 12104 97992 0 0 1475 2201 572 1397 82 17 0 1 Thu Jul 6 23:35:38 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269227580 23797072 92% / ipso:/home/ipso/backup 116130272 103483648 12646624 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd 
free buff cache si so bi bo in cs us sy id wa 4 1 53388 5392 12140 98460 6 7 30 90 15 98 46 3 41 10 2 2 53388 5040 12340 98800 0 0 1398 2354 556 1345 82 18 0 1 10 0 53388 5820 12464 97976 0 0 1349 2136 574 1318 77 22 0 1 1 0 53388 6348 12268 97352 0 0 1207 1725 543 1029 61 39 0 1 10 1 53388 18020 12472 90884 0 0 366 1730 445 490 27 73 0 0 8 0 53388 13616 12744 95320 0 0 39 478 348 93 3 97 0 0 Thu Jul 6 23:36:30 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269306828 23717824 92% / ipso:/home/ipso/backup 116130272 103483648 12646624 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 4 0 53388 12564 12796 96276 6 7 31 90 16 98 46 3 41 10 6 1 53388 6844 13148 100976 1 0 115 392 374 135 7 93 0 0 9 1 53388 6376 13368 101052 0 0 316 365 427 304 29 70 0 1 8 1 53384 6464 13960 100628 3 0 262 446 435 263 18 80 0 2 8 1 53384 6564 14556 99936 0 0 274 407 398 289 14 85 0 1 11 0 53384 9348 14768 96812 0 0 246 399 395 229 11 89 0 0 Thu Jul 6 23:37:24 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269323968 23700684 92% / ipso:/home/ipso/backup 116130272 103483656 12646616 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 3 0 53384 9332 14812 97072 6 7 31 90 16 98 46 3 41 10 6 0 53384 5636 15028 100700 0 0 90 407 377 146 10 90 0 0 10 0 53384 6600 15944 98740 0 0 106 495 390 396 10 90 0 0 11 1 53380 6656 16168 98304 0 0 86 436 392 199 9 91 0 0 6 1 53376 26724 16540 100164 1 0 72 465 385 171 8 86 0 6 0 2 52636 20920 17076 104348 1 0 49 497 383 149 1 88 0 11 Thu Jul 6 23:38:16 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269342424 23682228 92% / ipso:/home/ipso/backup 116130272 103483656 12646616 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- 
----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 2 52636 20920 17076 104348 6 7 31 91 16 98 46 3 41 10 8 1 52636 16932 17996 107840 0 0 75 476 439 258 1 68 0 30 5 2 52636 8120 18392 114748 0 0 206 570 379 124 1 91 0 8 7 1 52636 6448 18636 115504 0 0 11 470 350 66 1 98 0 1 0 2 52636 5784 18972 115464 0 0 47 518 344 76 1 97 0 2 7 0 52636 5960 18528 98312 0 0 213 372 370 187 13 86 0 1 Thu Jul 6 23:39:07 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269362396 23662256 92% / ipso:/home/ipso/backup 116130272 103483656 12646616 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 10 0 52636 5812 18528 98452 6 7 31 91 16 98 46 3 41 10 3 1 52636 8996 18800 94600 0 0 5 583 394 180 11 89 0 0 4 0 52636 6140 18876 97804 0 0 110 404 376 149 11 88 0 1 8 0 52636 6020 18932 97824 0 0 103 407 369 150 10 90 0 0 8 0 52636 6324 19176 97244 0 0 52 470 364 128 7 93 0 0 5 0 52636 6552 19272 97024 0 0 97 379 405 214 11 89 0 0 Thu Jul 6 23:39:58 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269381820 23642832 92% / ipso:/home/ipso/backup 116130272 103483656 12646616 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 7 0 52636 6528 19272 97024 6 7 31 91 16 98 46 3 41 10 9 0 52636 6692 19464 98536 0 0 62 339 386 181 13 87 0 0 7 0 52636 6320 19512 99264 0 0 90 440 386 169 12 88 0 0 8 0 52636 6280 19812 98972 0 0 90 459 373 140 9 91 0 0 4 0 52636 6100 20084 98796 0 0 102 347 392 214 14 86 0 0 7 0 52636 17288 20376 99256 0 0 46 514 368 124 6 92 0 1 Thu Jul 6 23:40:50 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269399972 23624680 92% / ipso:/home/ipso/backup 116130272 103483664 12646608 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- 
r b swpd free buff cache si so bi bo in cs us sy id wa 10 0 52636 17044 20376 99516 6 7 31 91 16 98 46 3 41 10 7 0 52636 25408 20900 98952 0 0 68 486 392 159 1 87 6 6 2 0 52636 20736 21580 102904 0 0 21 547 404 164 1 88 0 10 3 0 52636 16760 21812 106748 0 0 0 482 358 93 1 93 0 6 3 1 52636 11900 22012 110892 0 0 8 392 346 73 1 98 0 1 2 0 52636 7336 22988 114776 0 0 6 501 344 87 1 98 0 1 Thu Jul 6 23:41:41 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269414460 23610192 92% / ipso:/home/ipso/backup 116130272 103483664 12646608 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 3 0 52636 7336 22988 114776 6 7 31 91 16 98 46 3 41 10 3 1 52636 5840 23260 115876 0 0 0 521 366 107 1 90 0 9 4 0 52636 6508 23600 114952 0 0 0 464 368 104 0 93 0 7 0 1 52636 9292 24336 111212 0 0 2 525 386 171 2 87 3 8 3 0 52636 6136 24540 114204 0 0 0 400 345 76 1 97 0 3 6 0 52636 6428 24408 114164 0 0 0 436 358 84 1 93 0 6 Thu Jul 6 23:42:32 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269433368 23591284 92% / ipso:/home/ipso/backup 116130272 103483664 12646608 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 7 0 52636 6420 24408 114176 6 7 31 91 16 98 46 3 41 10 4 1 52636 9080 23816 112132 0 0 3 486 361 90 2 98 0 0 4 0 52636 5832 24008 115140 0 0 0 417 330 53 0 99 0 0 6 0 52636 6100 23812 115112 0 0 0 416 346 74 0 97 0 3 5 0 52636 6336 23504 115224 0 0 1 455 364 87 0 95 0 4 3 0 52636 6316 23400 115240 0 0 1 525 379 141 1 90 3 7 Thu Jul 6 23:43:23 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269453124 23571528 92% / ipso:/home/ipso/backup 116130272 103483664 12646608 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so 
bi bo in cs us sy id wa 6 0 52636 8492 23212 112768 6 7 31 91 16 98 46 3 41 10 9 0 52636 5968 23452 114660 0 0 0 538 346 74 1 97 0 2 6 0 52636 8756 23268 112824 0 0 7 535 375 102 1 97 0 2 5 1 52636 5988 22920 116024 0 0 0 484 345 58 1 99 0 0 3 0 52636 6396 22988 115488 0 0 0 507 339 58 1 99 0 0 0 0 52636 6436 23324 115120 0 0 0 533 362 111 1 92 2 5 Thu Jul 6 23:44:16 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269476608 23548044 92% / ipso:/home/ipso/backup 116130272 103483704 12646568 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 0 52636 6420 23324 115120 6 7 31 91 17 98 46 3 41 10 2 1 52636 6160 23528 115164 0 0 2 511 370 139 2 93 5 0 5 0 52636 5956 23740 115184 0 0 0 502 339 56 1 99 0 1 6 0 52636 6000 24144 114028 0 0 4 450 347 85 1 98 0 1 6 0 52636 5956 24328 114596 0 0 14 518 373 94 1 98 0 0 4 0 52636 6248 23716 114932 0 0 0 549 347 75 1 99 0 0 Thu Jul 6 23:45:07 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269499692 23524960 92% / ipso:/home/ipso/backup 116130272 103483704 12646568 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 1 52636 5876 23800 115156 6 7 31 91 17 98 46 3 41 10 4 0 52636 5724 23088 116072 0 0 3 490 382 105 1 94 0 5 9 1 52632 5640 23468 115632 0 0 5 564 387 156 1 90 5 4 5 0 52632 5844 23608 115392 0 0 0 460 368 89 1 95 0 4 5 0 52632 9172 23528 112108 0 0 3 442 385 120 1 98 0 0 4 0 52632 5628 23824 115468 0 0 0 447 337 56 1 99 0 0 Thu Jul 6 23:45:58 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269520312 23504340 92% / ipso:/home/ipso/backup 116130272 103483704 12646568 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 3 0 52632 
5620 23824 115480 6 7 31 92 17 98 46 3 41 10 4 0 52632 5856 24092 114976 0 0 0 452 339 56 1 99 0 0 10 2 52632 6916 24400 111148 0 0 9 469 368 109 2 93 0 5 1 0 52632 6232 24704 110184 0 0 6 434 386 167 5 88 3 5 10 1 52632 5744 24736 110920 0 0 36 457 376 153 1 93 0 6 6 1 52632 6288 25004 110036 0 0 121 561 367 112 5 94 0 1 Thu Jul 6 23:46:50 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269540344 23484308 92% / ipso:/home/ipso/backup 116130272 103483712 12646560 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 6 1 52632 6148 25004 110184 6 7 31 92 17 98 46 3 41 10 5 1 52632 16564 24844 100752 0 0 375 486 405 224 10 81 0 9 5 1 52632 12296 25336 105868 0 0 82 514 370 125 8 89 0 3 10 0 52632 8568 25696 109268 0 0 74 453 383 198 12 81 2 5 5 0 52632 6044 25880 111568 0 0 0 529 362 221 25 74 0 0 0 0 52616 17924 25336 102048 0 0 272 352 356 333 11 36 48 5 Thu Jul 6 23:47:41 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269555780 23468872 92% / ipso:/home/ipso/backup 116130272 103483712 12646560 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 0 52616 17924 25336 102048 6 7 31 92 17 98 46 3 41 10 0 0 52616 17992 25348 102048 0 0 0 2 312 322 0 0 100 0 0 0 52616 17996 25360 102048 0 0 0 2 312 324 0 0 100 0 0 0 52616 17996 25600 102112 0 0 0 52 320 325 0 0 100 0 0 0 52616 17996 25612 102112 0 0 0 2 312 324 0 0 99 0 0 0 52616 17996 25624 102112 0 0 0 2 312 323 0 0 100 0 Thu Jul 6 23:48:31 PDT 2006 Filesystem 1K-blocks Used Available Use% Mounted on /dev/hda1 293024652 269555860 23468792 92% / ipso:/home/ipso/backup 116130272 103483712 12646560 90% /mnt/backup procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 0 52616 17972 
25624 102112 6 7 31 92 17 99 46 3 41 10 0 0 52616 17988 25664 102116 0 0 0 11 314 324 0 1 100 0 0 0 52616 17872 25688 102116 0 0 0 4 312 322 0 0 100 0 0 0 52616 17872 25716 102120 0 0 0 6 312 324 0 0 99 0 0 0 52616 17872 25732 102120 0 0 0 8 314 323 0 0 100 0 0 0 52616 17748 25896 102120 0 0 0 33 315 322 0 1 100 0 [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
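The effect of the first_zero_hint change in the rfsfix patch quoted in Mike's message can be illustrated in userspace. What follows is a minimal sketch, not the kernel code: it assumes a bitmap in which a set bit marks a used block, it scans one bit at a time (the kernel's find_next_zero_bit is more optimized), and the names bitmap_sketch, find_next_zero and find_free_with_hint are made up for the illustration. It only shows why starting the search at a remembered first-zero position shortens the scan of a nearly full bitmap.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for one bitmap block: a set bit means "in use". */
struct bitmap_sketch {
    const unsigned long *words; /* the bitmap itself */
    size_t nbits;               /* number of valid bits */
    size_t first_zero_hint;     /* assumed: no clear bit below this index */
};

/* Linear scan for the first clear bit at or after `start`.
 * Returns nbits when there is none. If `steps` is non-NULL it is
 * incremented once per bit examined. */
static size_t find_next_zero(const unsigned long *words, size_t nbits,
                             size_t start, size_t *steps)
{
    for (size_t i = start; i < nbits; i++) {
        if (steps)
            (*steps)++;
        if (!(words[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD))))
            return i;
    }
    return nbits;
}

/* Same scan, but never start below the hint, mirroring what the patch
 * does with *beg before calling the search routine. */
static size_t find_free_with_hint(const struct bitmap_sketch *bm,
                                  size_t beg, size_t *steps)
{
    assert(bm != NULL && bm->words != NULL);
    if (beg < bm->first_zero_hint)
        beg = bm->first_zero_hint;
    return find_next_zero(bm->words, bm->nbits, beg, steps);
}
```

With a bitmap whose only free bit sits near the end, a search from bit 0 examines every bit before it, while a search that honours an accurate hint examines a single bit. The sketch says nothing about how the hint is kept up to date, which the quoted hunk does not show either.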
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-07 7:29 ` Mike Benoit
@ 2006-07-07 17:49 ` Jan Kara
2006-07-07 17:50 ` Jeff Mahoney
` (2 more replies)
0 siblings, 3 replies; 49+ messages in thread
From: Jan Kara @ 2006-07-07 17:49 UTC (permalink / raw)
To: Mike Benoit; +Cc: Jeff Mahoney, Jure Pečar, Hans Reiser, reiserfs-list

Hi,

just one note: I've looked at the code in scan_bitmap() in bitmap.c.
There is:

	/* When the bitmap is more than 10% free, anyone can allocate.
	 * When it's less than 10% free, only files that already use the
	 * bitmap are allowed. Once we pass 80% full, this restriction
	 * is lifted.
	 *
	 * We do this so that files that grow later still have space close to
	 * their original allocation. This improves locality, and presumably
	 * performance as a result.
	 *
	 * This is only an allocation policy and does not make up for getting a
	 * bad hint. Decent hinting must be implemented for this to work well.
	 */
	if (TEST_OPTION(skip_busy, s)
	    && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) {

So the comment suggests we should lift the restriction when we are 80%
full, but if you look at the condition, it checks whether we are 95%
full! I guess that is really asking for trouble and could explain the
behaviour...
Mike, could you try changing that 20 in the test to 5? IMHO that could
fix your problem.

								Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs

^ permalink raw reply [flat|nested] 49+ messages in thread
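The arithmetic behind Jan's observation can be checked in userspace. The sketch below is hypothetical: the function name and the divisor parameter are not from bitmap.c, and it uses the df figures from Mike's first report as a stand-in for the superblock counters. That is an approximation, since df reports 1K blocks and its Available column need not equal SB_FREE_BLOCKS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True while more than 1/divisor of the blocks are free. This has the
 * same shape as the quoted test
 *     SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20
 * with the constant turned into a parameter for comparison. */
static bool more_free_than_fraction(uint64_t free_blocks,
                                    uint64_t total_blocks,
                                    uint64_t divisor)
{
    assert(divisor != 0);
    return free_blocks > total_blocks / divisor;
}
```

With 23935196 of 293024652 1K-blocks available (about 8% free), the test is still true for a divisor of 20, so the branch it guards is still taken, whereas a divisor of 10 or 5 makes it false. Whether that accounts for the CPU spike is a separate question, which the replies that follow discuss.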
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-07 17:49 ` Jan Kara
@ 2006-07-07 17:50 ` Jeff Mahoney
2006-07-07 18:07 ` Jan Kara
2006-07-07 20:18 ` Mike Benoit
2006-07-07 21:04 ` Mike Benoit
2 siblings, 1 reply; 49+ messages in thread
From: Jeff Mahoney @ 2006-07-07 17:50 UTC (permalink / raw)
To: Jan Kara; +Cc: Mike Benoit, Jure Pečar, Hans Reiser, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jan Kara wrote:
> Hi,
>
> just one note: I've looked at the code in scan_bitmap() in bitmap.c.
> There is:
>	/* When the bitmap is more than 10% free, anyone can allocate.
>	 * When it's less than 10% free, only files that already use the
>	 * bitmap are allowed. Once we pass 80% full, this restriction
>	 * is lifted.
>	 *
>	 * We do this so that files that grow later still have space close to
>	 * their original allocation. This improves locality, and presumably
>	 * performance as a result.
>	 *
>	 * This is only an allocation policy and does not make up for getting a
>	 * bad hint. Decent hinting must be implemented for this to work well.
>	 */
>	if (TEST_OPTION(skip_busy, s)
>	    && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) {
>
> So the comment suggests we should lift the restriction when we are 80%
> full, but if you look at the condition, it checks whether we are 95%
> full! I guess that is really asking for trouble and could explain the
> behaviour...
> Mike, could you try changing that 20 in the test to 5? IMHO that could
> fix your problem.

Shoot. I guess I never sent that mail out last night. I had discovered
the same thing. The thing is, I don't think it will cause the kind of
performance problem we're seeing here. Once it sees the 90% check it
will bail out. Minor slowdown, not anything like we're seeing.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFErp76LPWxlyuTD7IRAqJ1AJ9ce8HTFNauhcriJzUlKJ1p68u4MwCdE4W/
IA09T6t/46TD+PSAQs/MHkk=
=/9Xa
-----END PGP SIGNATURE-----

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-07 17:50 ` Jeff Mahoney
@ 2006-07-07 18:07 ` Jan Kara
2006-07-07 18:08 ` Jeff Mahoney
2006-07-07 19:05 ` Hans Reiser
0 siblings, 2 replies; 49+ messages in thread
From: Jan Kara @ 2006-07-07 18:07 UTC (permalink / raw)
To: Jeff Mahoney; +Cc: Mike Benoit, Jure Pečar, Hans Reiser, reiserfs-list

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Jan Kara wrote:
> > Hi,
> >
> > just one note: I've looked at the code in scan_bitmap() in bitmap.c.
> > There is:
> >	/* When the bitmap is more than 10% free, anyone can allocate.
> >	 * When it's less than 10% free, only files that already use the
> >	 * bitmap are allowed. Once we pass 80% full, this restriction
> >	 * is lifted.
> >	 *
> >	 * We do this so that files that grow later still have space close to
> >	 * their original allocation. This improves locality, and presumably
> >	 * performance as a result.
> >	 *
> >	 * This is only an allocation policy and does not make up for getting a
> >	 * bad hint. Decent hinting must be implemented for this to work well.
> >	 */
> >	if (TEST_OPTION(skip_busy, s)
> >	    && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) {
> >
> > So the comment suggests we should lift the restriction when we are 80%
> > full, but if you look at the condition, it checks whether we are 95%
> > full! I guess that is really asking for trouble and could explain the
> > behaviour...
> > Mike, could you try changing that 20 in the test to 5? IMHO that could
> > fix your problem.
>
> Shoot. I guess I never sent that mail out last night. I had discovered
> the same thing. The thing is, I don't think it will cause the kind of
> performance problem we're seeing here. Once it sees the 90% check it
> will bail out. Minor slowdown, not anything like we're seeing.

  Hmm, right. You'll only scan that one bitmap the file is in, won't
you? That can still take some time, so maybe it's worth trying this fix
anyway.

								Honza
--
Jan Kara <jack@suse.cz>
SuSE CR Labs

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-07 18:07 ` Jan Kara @ 2006-07-07 18:08 ` Jeff Mahoney 2006-07-07 19:05 ` Hans Reiser 1 sibling, 0 replies; 49+ messages in thread From: Jeff Mahoney @ 2006-07-07 18:08 UTC (permalink / raw) To: Jan Kara; +Cc: Mike Benoit, Jure Pečar, Hans Reiser, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Jan Kara wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Jan Kara wrote: >>> Hi, >>> >>> just one note: I've looked to the in scan_bitmap() in bitmap.c. There is: >>> /* When the bitmap is more than 10% free, anyone can allocate. >>> * When it's less than 10% free, only files that already use the >>> * bitmap are allowed. Once we pass 80% full, this restriction >>> * is lifted. >>> * >>> * We do this so that files that grow later still have space >>> * close to >>> * their original allocation. This improves locality, and >>> * presumably >>> * performance as a result. >>> * >>> * This is only an allocation policy and does not make up for >>> * getting a >>> * bad hint. Decent hinting must be implemented for this to work >>> * well. >>> */ >>> if (TEST_OPTION(skip_busy, s) >>> && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) { >>> >>> So the comment suggests we should lift the restriction when we are 80% >>> full but if you see the condition, it checks wherher we are 95% full! I >>> guess that is really asking for trouble and could explain the >>> behaviour... >>> Mike could you try changing that 20 in the test to 5? IMHO that could >>> fix your problem. >> Shoot. I guess I never sent that mail out last night. I had discovered >> the same thing. The thing is, I don't think it will cause the kind of >> performance problem we're seeing here. Once it sees the 90% check it >> will bail out. Minor slowdown, not anything like we're seeing. > Hmm, right. You'll only scan that one bitmap the file is in, won't > you? That can still take some time so maybe it's worth trying this fix > anyway. 
Oh, I agree that it's a bug that needs to be fixed. I just don't think
it's causing 90% CPU usage. :)

-Jeff

--
Jeff Mahoney
SUSE Labs
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-07 18:07 ` Jan Kara 2006-07-07 18:08 ` Jeff Mahoney @ 2006-07-07 19:05 ` Hans Reiser 2006-07-07 19:18 ` Jan Kara 1 sibling, 1 reply; 49+ messages in thread From: Hans Reiser @ 2006-07-07 19:05 UTC (permalink / raw) To: Jan Kara; +Cc: Jeff Mahoney, Mike Benoit, Jure Pečar, reiserfs-list Jan Kara wrote: >>-----BEGIN PGP SIGNED MESSAGE----- >>Hash: SHA1 >> >>Jan Kara wrote: >> >> >>> Hi, >>> >>> just one note: I've looked to the in scan_bitmap() in bitmap.c. There is: >>> /* When the bitmap is more than 10% free, anyone can allocate. >>> * When it's less than 10% free, only files that already use the >>> * bitmap are allowed. Once we pass 80% full, this restriction >>> * is lifted. >>> * >>> * We do this so that files that grow later still have space >>> * close to >>> * their original allocation. This improves locality, and >>> * presumably >>> * performance as a result. >>> * >>> * This is only an allocation policy and does not make up for >>> * getting a >>> * bad hint. Decent hinting must be implemented for this to work >>> * well. >>> */ >>> if (TEST_OPTION(skip_busy, s) >>> && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) { >>> >>> How about eliminating this feature entirely. It seems rather dubious. >>> So the comment suggests we should lift the restriction when we are 80% >>>full but if you see the condition, it checks wherher we are 95% full! I >>>guess that is really asking for trouble and could explain the >>>behaviour... >>> Mike could you try changing that 20 in the test to 5? IMHO that could >>>fix your problem. >>> >>> >>Shoot. I guess I never sent that mail out last night. I had discovered >>the same thing. The thing is, I don't think it will cause the kind of >>performance problem we're seeing here. Once it sees the 90% check it >>will bail out. Minor slowdown, not anything like we're seeing. >> >> > Hmm, right. 
You'll only scan that one bitmap the file is in, won't > > I don't understand your remark. These files are in many many bitmaps.... Can you quote more of the code? >you? That can still take some time so maybe it's worth trying this fix >anyway. > > Honza > > ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-07 19:05 ` Hans Reiser @ 2006-07-07 19:18 ` Jan Kara 2006-07-07 19:38 ` Hans Reiser 0 siblings, 1 reply; 49+ messages in thread From: Jan Kara @ 2006-07-07 19:18 UTC (permalink / raw) To: Hans Reiser; +Cc: Jeff Mahoney, Mike Benoit, Jure Pečar, reiserfs-list > Jan Kara wrote: > > >>-----BEGIN PGP SIGNED MESSAGE----- > >>Hash: SHA1 > >> > >>Jan Kara wrote: > >> > >> > >>> Hi, > >>> > >>> just one note: I've looked to the in scan_bitmap() in bitmap.c. There is: > >>> /* When the bitmap is more than 10% free, anyone can allocate. > >>> * When it's less than 10% free, only files that already use the > >>> * bitmap are allowed. Once we pass 80% full, this restriction > >>> * is lifted. > >>> * > >>> * We do this so that files that grow later still have space > >>> * close to > >>> * their original allocation. This improves locality, and > >>> * presumably > >>> * performance as a result. > >>> * > >>> * This is only an allocation policy and does not make up for > >>> * getting a > >>> * bad hint. Decent hinting must be implemented for this to work > >>> * well. > >>> */ > >>> if (TEST_OPTION(skip_busy, s) > >>> && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) { > >>> > >>> > How about eliminating this feature entirely. It seems rather dubious. Yes, but it may help reduce fragmentation as it leaves some free space in bitmaps for the files already ending in those bitmaps. I'm not sure if it really helps though... > >>> So the comment suggests we should lift the restriction when we are 80% > >>>full but if you see the condition, it checks wherher we are 95% full! I > >>>guess that is really asking for trouble and could explain the > >>>behaviour... > >>> Mike could you try changing that 20 in the test to 5? IMHO that could > >>>fix your problem. > >>> > >>> > >>Shoot. I guess I never sent that mail out last night. I had discovered > >>the same thing. 
The thing is, I don't think it will cause the kind of > >>performance problem we're seeing here. Once it sees the 90% check it > >>will bail out. Minor slowdown, not anything like we're seeing. > >> > >> > > Hmm, right. You'll only scan that one bitmap the file is in, won't > > > > > I don't understand your remark. These files are in many many > bitmaps.... Can you quote more of the code? The condition really is: if ((off && (!unfm || (file_block != 0))) || SB_AP_BITMAP(s)[bm].free_count > (s->s_blocksize << 3) / 10) and we reset 'off' after the first test so the first part of || can be true only once (when we are scanning the bitmap containing the last file block). Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 49+ messages in thread
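As a worked example of the thresholds discussed in this message, the two tests can be written as standalone C. This is only a sketch, not the kernel code: `skip_busy_active` and `bitmap_not_busy` are made-up helper names, and only the divisors (20, 5, 10) and the `s_blocksize << 3` expression come from the quoted source. The sample numbers in the usage note are the 1K-block `df` figures Mike posted, used here purely as a ratio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the filesystem-wide test
 *   SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / divisor
 * With divisor 20 the skip_busy policy stays active until less than
 * 5% of the filesystem is free (95% full). With divisor 5 it is
 * lifted once less than 20% is free (80% full), as the comment says. */
static bool skip_busy_active(uint64_t free_blocks, uint64_t block_count,
                             unsigned divisor)
{
    assert(divisor != 0);
    return free_blocks > block_count / divisor;
}

/* Per-bitmap test: one bitmap block of `blocksize` bytes tracks
 * blocksize * 8 blocks, and the bitmap counts as "not busy" while
 * more than a tenth of those blocks are free. */
static bool bitmap_not_busy(uint32_t free_count, uint32_t blocksize)
{
    return free_count > (blocksize << 3) / 10;
}
```

With 21567140 of 293024652 units free (about 7%), `skip_busy_active(..., 20)` is still true while `skip_busy_active(..., 5)` is false, which is the difference the proposed one-character change makes. For a 4096-byte block size the per-bitmap cutoff works out to 3276 free blocks out of 32768.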
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-07 19:18 ` Jan Kara @ 2006-07-07 19:38 ` Hans Reiser 0 siblings, 0 replies; 49+ messages in thread From: Hans Reiser @ 2006-07-07 19:38 UTC (permalink / raw) To: Jan Kara; +Cc: Jeff Mahoney, Mike Benoit, Jure Pečar, reiserfs-list Jan Kara wrote: >>Jan Kara wrote: >> >> >> >>>>-----BEGIN PGP SIGNED MESSAGE----- >>>>Hash: SHA1 >>>> >>>>Jan Kara wrote: >>>> >>>> >>>> >>>> >>>>> Hi, >>>>> >>>>> just one note: I've looked to the in scan_bitmap() in bitmap.c. There is: >>>>> /* When the bitmap is more than 10% free, anyone can allocate. >>>>> * When it's less than 10% free, only files that already use the >>>>> * bitmap are allowed. Once we pass 80% full, this restriction >>>>> * is lifted. >>>>> * >>>>> * We do this so that files that grow later still have space >>>>> * close to >>>>> * their original allocation. This improves locality, and >>>>> * presumably >>>>> * performance as a result. >>>>> * >>>>> * This is only an allocation policy and does not make up for >>>>> * getting a >>>>> * bad hint. Decent hinting must be implemented for this to work >>>>> * well. >>>>> */ >>>>> if (TEST_OPTION(skip_busy, s) >>>>> && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) { >>>>> >>>>> >>>>> >>>>> >>How about eliminating this feature entirely. It seems rather dubious. >> >> > Yes, but it may help reducing fragmentation as it leaves some free >space in bitmaps for the files already ending in that bitmaps. I'm not >sure if it really helps through... > > I think I was wrong, and retract my remark. ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-07 17:49 ` Jan Kara 2006-07-07 17:50 ` Jeff Mahoney @ 2006-07-07 20:18 ` Mike Benoit 2006-07-07 21:04 ` Mike Benoit 2 siblings, 0 replies; 49+ messages in thread From: Mike Benoit @ 2006-07-07 20:18 UTC (permalink / raw) To: Jan Kara; +Cc: Jeff Mahoney, Jure Pečar, Hans Reiser, reiserfs-list [-- Attachment #1: Type: text/plain, Size: 1601 bytes --] On Fri, 2006-07-07 at 19:49 +0200, Jan Kara wrote: > Hi, > > just one note: I've looked to the in scan_bitmap() in bitmap.c. There is: > /* When the bitmap is more than 10% free, anyone can allocate. > * When it's less than 10% free, only files that already use the > * bitmap are allowed. Once we pass 80% full, this restriction > * is lifted. > * > * We do this so that files that grow later still have space > * close to > * their original allocation. This improves locality, and > * presumably > * performance as a result. > * > * This is only an allocation policy and does not make up for > * getting a > * bad hint. Decent hinting must be implemented for this to work > * well. > */ > if (TEST_OPTION(skip_busy, s) > && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) { > > So the comment suggests we should lift the restriction when we are 80% > full but if you see the condition, it checks wherher we are 95% full! I > guess that is really asking for trouble and could explain the > behaviour... > Mike could you try changing that 20 in the test to 5? IMHO that could > fix your problem. I've recompiled my kernel with this suggested change, and so far I have surpassed (just barely) the free space "trigger" point that occurred twice yesterday. I'll keep the recordings going so I can give you guys more conclusive results in a couple hours. -- Mike Benoit <ipso@snappymail.ca> [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-07 17:49 ` Jan Kara 2006-07-07 17:50 ` Jeff Mahoney 2006-07-07 20:18 ` Mike Benoit @ 2006-07-07 21:04 ` Mike Benoit 2006-07-07 21:20 ` Hans Reiser 2 siblings, 1 reply; 49+ messages in thread From: Mike Benoit @ 2006-07-07 21:04 UTC (permalink / raw) To: Jan Kara; +Cc: Jeff Mahoney, Jure Pečar, Hans Reiser, reiserfs-list [-- Attachment #1.1: Type: text/plain, Size: 2360 bytes --] On Fri, 2006-07-07 at 19:49 +0200, Jan Kara wrote: > Hi, > > just one note: I've looked to the in scan_bitmap() in bitmap.c. There is: > /* When the bitmap is more than 10% free, anyone can allocate. > * When it's less than 10% free, only files that already use the > * bitmap are allowed. Once we pass 80% full, this restriction > * is lifted. > * > * We do this so that files that grow later still have space > * close to > * their original allocation. This improves locality, and > * presumably > * performance as a result. > * > * This is only an allocation policy and does not make up for > * getting a > * bad hint. Decent hinting must be implemented for this to work > * well. > */ > if (TEST_OPTION(skip_busy, s) > && SB_FREE_BLOCKS(s) > SB_BLOCK_COUNT(s) / 20) { > > So the comment suggests we should lift the restriction when we are 80% > full but if you see the condition, it checks wherher we are 95% full! I > guess that is really asking for trouble and could explain the > behaviour... > Mike could you try changing that 20 in the test to 5? IMHO that could > fix your problem. It looks like it lasted a little longer, but probably not enough to determine that this change made the difference or not. /dev/hda1 293024652 271457512 21567140 93% / Attached is the vmstat output of the problem occurring. 
[root@mythtv tmp]# opreport -l /usr/src/linux/vmlinux | head -n20
CPU: CPU with timer interrupt, speed 0 MHz (estimated)
Profiling through timer interrupt
samples  %        symbol name
3945     53.9082  default_idle
3031     41.4184  find_next_zero_bit
50        0.6832  __copy_from_user_ll
30        0.4099  handle_IRQ_event
16        0.2186  ide_outb
16        0.2186  ioread8
10        0.1366  ide_end_request
10        0.1366  mmx_clear_page
10        0.1366  number
9         0.1230  __copy_to_user_ll
7         0.0957  get_page_from_freelist
5         0.0683  __find_get_block
5         0.0683  __link_path_walk
5         0.0683  kmem_cache_alloc
5         0.0683  mmx_copy_page
5         0.0683  sysenter_past_esp
4         0.0547  __make_request

--
Mike Benoit <ipso@snappymail.ca>

[-- Attachment #1.2: vmstat_3.txt --]
[-- Type: text/plain, Size: 5581 bytes --]

Fri Jul 7 13:48:49 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 271457512  21567140  93% /
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  0    572   6416  13472 108712    0    0   836  2086  520   966 57  3 31  9
 0  0    572   6112  13444 108996    0    0     1  1830  461   468  0  5 60 35
 0  0    572   6788  13404 108368    0    0     2  1916  477   484  1  7 61 31
 0  0    572   5760  13424 109288    0    0     1  1884  461   470  0  5 58 37
 0  0    572   6104  13536 108884    0    0     1  1950  462   462  0 10 55 34
 0  1    572   6652  13468 108460    0    0     2  1901  478   483  1  9 57 34

Fri Jul 7 13:49:39 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 271546292  21478360  93% /
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  0    572   6636  13480 108492    0    0   831  2085  520   962 57  3 31  9
 0  1    572   6012  13692 108912    0    0     8  1900  470   476  0  5 60 35
 0  0    572   6148  13584 108792    0    0     1  1844  464   468  0  8 58 33
 0  1    572   5740  13832 108980    0    0     2  1949  479   506  1  1 71 27
 0  1    572   6140  13632 108712    0    0     1  1828  467   481  0  2 60 38
 0  0    572   5984  13532 109072    0    0     2  2012  466   481  0  5 62 34

Fri Jul 7 13:50:29 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 271636228  21388424  93% /
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  2    572   6172  13576 108736    0    0   825  2084  520   959 56  3 31  9
 3  1    572   6288  13664 108584    0    0     2  1844  484   481  1  9 56 34
 3  0    572   6352  13548 108580    0    0     2  1568  435   372  0 32 46 21
 6  0    572   6092  13808 108600    0    0     0   560  360   161  2 94  0  3
 2  2    572   5648  13968 108812    0    0     2   626  399   149  1 88  2  9
 6  1    572   6344  14096 108092    0    0     1   464  343    62  0 99  0  0

Fri Jul 7 13:51:20 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 271682836  21341816  93% /
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 7  0    572   6328  14096 108100    0    0   820  2077  519   954 56  3 31  9
 3  0    572   5920  14352 108284    0    0     0   416  338    67  1 99  0  0
 5  0    572   6160  14536 107800    0    0     0   476  366    90  1 95  0  4
 7  1    572   6164  14784 107556    0    0     1   428  372   106  0 93  0  6
 2  0    572   6284  14968 107232    0    0     1   541  377   121  1 92  0  7
 5  0    572   5756  15124 107600    0    0     0   474  354    75  1 96  0  3

Fri Jul 7 13:52:12 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 271702980  21321672  93% /
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 5  0    572   5652  15124 107600    0    0   814  2066  518   949 56  4 31  9
 4  0    572   5868  15096 107628    0    0     0   444  332    56  1 99  0  0
 7  0    572   5608  15324 107632    0    0     0   462  338    68  1 98  0  1
 5  1    572   5924  15408 107220    0    0     0   513  358    80  1 96  0  3
 4  0    572   6368  15568 106676    0    0     1   432  364    94  1 94  0  5
 7  1    572   6476  15816 106292    0    0    10   514  365    88  1 94  0  5

Fri Jul 7 13:53:02 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 271723752  21300900  93% /
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 6  1    572   6412  15816 106300    0    0   809  2056  517   943 55  5 31  9
 4  0    572   6456  15968 106152    0    0     0   462  367    95  1 94  0  6
 4  0    572   5956  15992 106500    0    0     0   544  349    78  1 97  0  3
 5  1    572   6616  16116 105856    0    0     0   400  338    69  0 99  0  1
 3  1    572   6456  16252 105868    0    0     0   495  353    88  1 94  0  5
 4  0    572   5524  16460 106448    0    0     0   470  370    95  1 94  0  5

Fri Jul 7 13:53:53 PDT 2006
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda1            293024652 271744908  21279744  93% /
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  1    572   6304  16432 105708    0    0   804  2045  515   937 55  5 31  9
 5  0    572   5792  16596 106112    0    0     1   469  353   129  1 95  0  4
 2  0    572   5592  16872 106028    0    0     1   492  382   112  1 89  0 10
 4  1    572   6240  16864 105492    0    0     2   554  347    75  1 97  0  2
 1  1    572   5588  17124 105744    0    0     0   462  350    70  0 98  0  1
 4  0    572   5876  17228 105432    0    0     0   476  341    61  0 99  0  0

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-07 21:04 ` Mike Benoit @ 2006-07-07 21:20 ` Hans Reiser 2006-07-08 18:45 ` Jeff Mahoney 0 siblings, 1 reply; 49+ messages in thread From: Hans Reiser @ 2006-07-07 21:20 UTC (permalink / raw) To: Mike Benoit; +Cc: Jan Kara, Jeff Mahoney, Jure Pečar, reiserfs-list Guys, if you run the kernel under a debugger, and get it to where you see the excessive CPU usage, and then start stepping through the bitmap code, I am sure it will be very obvious what the error is. Can anyone do that for us? Jeff? ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-07 21:20 ` Hans Reiser
@ 2006-07-08 18:45 ` Jeff Mahoney
2006-07-09 0:01 ` Hans Reiser
` (2 more replies)
0 siblings, 3 replies; 49+ messages in thread
From: Jeff Mahoney @ 2006-07-08 18:45 UTC (permalink / raw)
To: Hans Reiser; +Cc: Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list

Hans Reiser wrote:
> Guys, if you run the kernel under a debugger, and get it to where you
> see the excessive CPU usage, and then start stepping through the bitmap
> code, I am sure it will be very obvious what the error is. Can anyone
> do that for us? Jeff?

Apologies to everyone CC'd who've already seen this message. It was
bounced from the namesys servers and I wanted to preserve the CC list.

***

Mike sent me a copy of the metadata and I am now able to reproduce
locally. My profiling looks like this:

samples  %        image name   app name  symbol name
148596   17.8573  reiserfs.ko  reiserfs  reiserfs_in_journal
58194     6.9934  reiserfs.ko  reiserfs  search_by_key
38937     4.6792  vmlinux      vmlinux   memmove
38783     4.6607  reiserfs.ko  reiserfs  scan_bitmap_block
38466     4.6226  jbd          jbd       (no symbols)
23249     2.7939  vmlinux      vmlinux   __find_get_block
18196     2.1867  vmlinux      vmlinux   tty_write
17734     2.1312  vmlinux      vmlinux   do_ioctl
17293     2.0782  loop         loop      (no symbols)
15400     1.8507  vmlinux      vmlinux   cond_resched_lock
14836     1.7829  vmlinux      vmlinux   copy_user_generic_c
14143     1.6996  reiserfs.ko  reiserfs  do_journal_end
13638     1.6389  vmlinux      vmlinux   find_next_zero_bit
13236     1.5906  vmlinux      vmlinux   default_llseek
12925     1.5532  vmlinux      vmlinux   bit_waitqueue
8921      1.0721  vmlinux      vmlinux   __delay

Hans -

My speculation about the bitmaps being fragmented was right on. I wrote
a quick little script to parse the output of debugreiserfs -m and report
on the frequency of different window sizes. Windows of 1-31 blocks are
extremely common, accounting for 99.8% of all free windows. The problem
is that in my testing, where I made the allocator report the size of
allocation requests, the most common request was for a window of 32 blocks.

What's happening is that we keep finding windows that are too small,
which results in a lot of wasted effort. The cycle goes like this:

	if (unfm && is_block_in_journal(s, bmap_n, *beg, beg))
		continue;
	/* first zero bit found; we check next bits */
	for (end = *beg + 1;; end++) {
		if (end >= *beg + max || end >= boundary
		    || reiserfs_test_le_bit(end, bi->bh->b_data)) {
			next = end;
			break;
		}
		/* finding the other end of zero bit window requires
		 * looking into journal structures (in
		 * case of searching for free blocks for unformatted nodes) */
		if (unfm && is_block_in_journal(s, bmap_n, end, &next))
			break;
	}

If the window is too small, we end up looping up to the top and try to
find another one. Since the overwhelming majority of the windows are too
small, we go through just about all the bitmaps without backing off the
window size.

-Jeff

--
Jeff Mahoney
SUSE Labs
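The cycle Jeff describes can be illustrated with a small standalone sketch. It is deliberately simplified and is not the reiserfs code: it ignores the journal lookups (`is_block_in_journal`) and the `boundary` limit, and the names `bit_is_set` and `find_free_window` are invented for this example. What it does show is how every free window shorter than the requested size costs a scan and is then thrown away.

```c
#include <assert.h>
#include <stddef.h>

/* Test bit n of a little-endian bitmap; 1 means the block is in use. */
static int bit_is_set(const unsigned char *map, size_t n)
{
    return (map[n / 8] >> (n % 8)) & 1;
}

/* Scan [0, nbits) for the first run of at least `min` free (zero) bits.
 * Returns the start of that run, or nbits if there is none.
 * *rejected counts the free windows that were found but were too small. */
static size_t find_free_window(const unsigned char *map, size_t nbits,
                               size_t min, size_t *rejected)
{
    size_t beg = 0;

    assert(min > 0);
    *rejected = 0;
    while (beg < nbits) {
        size_t end;

        if (bit_is_set(map, beg)) {     /* skip blocks that are in use */
            beg++;
            continue;
        }
        /* first zero bit found; check the following bits */
        for (end = beg + 1; end < nbits && end < beg + min; end++)
            if (bit_is_set(map, end))
                break;
        if (end - beg >= min)
            return beg;                 /* window is large enough */
        (*rejected)++;                  /* too small: go around again */
        beg = end;
    }
    return nbits;
}
```

On a bitmap where every fourth block is in use, a request for 4 contiguous blocks rejects every window in the bitmap and finds nothing, while a request for 3 succeeds immediately. Multiply the first case by every bitmap on a fragmented disk and by every write, and the profile above is the expected result.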
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-08 18:45 ` Jeff Mahoney @ 2006-07-09 0:01 ` Hans Reiser 2006-07-09 0:02 ` Hans Reiser 2006-07-12 0:54 ` Jeffrey Mahoney 2 siblings, 0 replies; 49+ messages in thread From: Hans Reiser @ 2006-07-09 0:01 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list So limit the number of iterations of rejecting windows that are too small. Say, 8. Hans Jeff Mahoney wrote: > > What's happening is that we keep finding windows that are too small, > which results in a lot of wasted effort. The cycle goes like this: > > if (unfm && is_block_in_journal(s, bmap_n, *beg, beg)) > continue; > /* first zero bit found; we check next bits */ > for (end = *beg + 1;; end++) { > if (end >= *beg + max || end >= boundary > || reiserfs_test_le_bit(end, bi->bh->b_data)) { > next = end; > break; > } > /* finding the other end of zero bit window requires > * looking into journal structures (in > * case of searching for free blocks for unformatted nodes) */ > if (unfm && is_block_in_journal(s, bmap_n, end, &next)) > break; > } > > If the window is too small, we end up looping up to the top and try to > find another one. Since the overwhelming majority of the windows are too > small, we go through just about all the bitmaps without backing off the > window size. > > -Jeff > > -- > Jeff Mahoney > SUSE Labs ^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-08 18:45 ` Jeff Mahoney 2006-07-09 0:01 ` Hans Reiser @ 2006-07-09 0:02 ` Hans Reiser 2006-07-12 0:54 ` Jeffrey Mahoney 2 siblings, 0 replies; 49+ messages in thread From: Hans Reiser @ 2006-07-09 0:02 UTC (permalink / raw) To: Jeff Mahoney; +Cc: Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list By 8 iterations, I mean 8 bitmaps scanned. ^ permalink raw reply [flat|nested] 49+ messages in thread
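Hans's suggestion (stop insisting on a full-size window after a fixed number of bitmaps) might look roughly like the sketch below. Everything here is an assumption made for illustration: the per-bitmap "largest free window" array stands in for a real bit scan, and `MAX_SIZED_SCANS`, `struct alloc_result` and `alloc_bounded` are made-up names, not anything from the reiserfs source. Only the cap of 8 bitmaps comes from the messages above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SIZED_SCANS 8   /* cap suggested in the thread; name is hypothetical */

struct alloc_result {
    size_t bitmap;   /* index of the bitmap chosen, or nbitmaps if none */
    size_t window;   /* window size actually granted (0 if none) */
    size_t scanned;  /* number of bitmap visits this request cost */
};

/* largest[i] is the largest free window in bitmap i (0 if it is full).
 * A real allocator would have to discover this by scanning the bits. */
static struct alloc_result alloc_bounded(const size_t *largest,
                                         size_t nbitmaps, size_t want)
{
    struct alloc_result r = { nbitmaps, 0, 0 };
    size_t i, limit;

    assert(want > 0);
    limit = nbitmaps < MAX_SIZED_SCANS ? nbitmaps : MAX_SIZED_SCANS;

    /* Phase 1: insist on the full window, but only for a few bitmaps. */
    for (i = 0; i < limit; i++) {
        r.scanned++;
        if (largest[i] >= want) {
            r.bitmap = i;
            r.window = want;
            return r;
        }
    }
    /* Phase 2: stop size matching and take the first free block. */
    for (i = 0; i < nbitmaps; i++) {
        r.scanned++;
        if (largest[i] >= 1) {
            r.bitmap = i;
            r.window = 1;
            return r;
        }
    }
    return r;
}
```

With 100 bitmaps whose largest free window is 5 blocks and a request for 32, this gives up on size matching after 8 visits and settles for a single block on the 9th, instead of walking all 100 bitmaps first. The trade-off is the one discussed in the thread: the resulting allocation may be more fragmented.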
* Re: ReiserFS v3 choking when free space falls below 10%? 2006-07-08 18:45 ` Jeff Mahoney 2006-07-09 0:01 ` Hans Reiser 2006-07-09 0:02 ` Hans Reiser @ 2006-07-12 0:54 ` Jeffrey Mahoney 2006-07-12 5:42 ` Hans Reiser 2 siblings, 1 reply; 49+ messages in thread From: Jeffrey Mahoney @ 2006-07-12 0:54 UTC (permalink / raw) To: Jeff Mahoney Cc: Hans Reiser, Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Jeff Mahoney wrote: > Hans Reiser wrote: >>> Guys, if you run the kernel under a debugger, and get it to where you >>> see the excessive CPU usage, and then start stepping through the bitmap >>> code, I am sure it will be very obvious what the error is. Can anyone >>> do that for us? Jeff? > > Apologies to everyone CC'd who've already seen this message. It was > bounced from the namesys servers and I wanted to preserve the CC list. > > *** > > Mike sent me a copy of the metadata and I am now able to reproduce > locally. My profiling looks like this: > > samples % image name app name symbol name > 148596 17.8573 reiserfs.ko reiserfs reiserfs_in_journal > 58194 6.9934 reiserfs.ko reiserfs search_by_key > 38937 4.6792 vmlinux vmlinux memmove > 38783 4.6607 reiserfs.ko reiserfs scan_bitmap_block > 38466 4.6226 jbd jbd (no symbols) > 23249 2.7939 vmlinux vmlinux __find_get_block > 18196 2.1867 vmlinux vmlinux tty_write > 17734 2.1312 vmlinux vmlinux do_ioctl > 17293 2.0782 loop loop (no symbols) > 15400 1.8507 vmlinux vmlinux cond_resched_lock > 14836 1.7829 vmlinux vmlinux copy_user_generic_c > 14143 1.6996 reiserfs.ko reiserfs do_journal_end > 13638 1.6389 vmlinux vmlinux find_next_zero_bit > 13236 1.5906 vmlinux vmlinux default_llseek > 12925 1.5532 vmlinux vmlinux bit_waitqueue > 8921 1.0721 vmlinux vmlinux __delay > > > Hans - > > My speculation about the bitmaps being fragmented was right on. I wrote > a quick little script to parse the output of debugreiserfs -m and report > on the frequency of different window sizes. 
Windows of 1-31 blocks are > extremely common, accounting for 99.8% of all free windows. The problem > is that in my testing, where I made the allocator report the size of > allocation requests, the most common request was for a window of 32 blocks. > > What's happening is that we keep finding windows that are too small, > which results in a lot of wasted effort. The cycle goes like this: > > if (unfm && is_block_in_journal(s, bmap_n, *beg, beg)) > continue; > /* first zero bit found; we check next bits */ > for (end = *beg + 1;; end++) { > if (end >= *beg + max || end >= boundary > || reiserfs_test_le_bit(end, bi->bh->b_data)) { > next = end; > break; > } > /* finding the other end of zero bit window requires > * looking into journal structures (in > * case of searching for free blocks for unformatted nodes) */ > if (unfm && is_block_in_journal(s, bmap_n, end, &next)) > break; > } > > If the window is too small, we end up looping up to the top and try to > find another one. Since the overwhelming majority of the windows are too > small, we go through just about all the bitmaps without backing off the > window size. To be clear, eventually the allocations are honored, but only after *all* of the bitmaps are searched. On the third pass, we drop the window to a single block and restart the scan, eventually building a 32-block set that is probably quite fragmented. This occurs on every write, hence the huge performance hit. It appears as though ext3 doesn't have this problem because they don't batch writes the way reiserfs does. They'll start a search at a decent hint the same way we do, but the window is always one block. So, we're stuck between a rock and a hard place. We can have the better allocation performance at lower usage and sacrifice performance later or we can have stable allocation performance at an overall reduction in performance. I have an idea that may get around both problems, but I'm not sure how well it would be received. 
We currently do some very basic caching of
bitmap metadata such as the first zero bit and how many free blocks
there are. What if we constructed an extent map of the free windows in
each bitmap when we cache the metadata and adjust the map when we

There's a third option, but I'm not sure how well it would be received

Right now, the allocator keeps track of things like how full a bitmap is
and where the first zero bit is. It would also be possible to cache a
list of windows in each bitmap to accelerate performance. This would
have to be a shrinkable cache, since the pathological case could mean
occupying

-Jeff

--
Jeff Mahoney
SUSE Labs
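To make the "list of free windows per bitmap" idea concrete, here is a minimal sketch of building such a list from one bitmap. It is an illustration under stated assumptions, not a proposal for the actual cache: `struct free_extent`, `bit_is_set` and `build_extent_map` are hypothetical names, the bitmap is a plain byte array, and nothing here handles updates or shrinking.

```c
#include <assert.h>
#include <stddef.h>

struct free_extent {
    size_t start;   /* first free block of the window */
    size_t len;     /* number of contiguous free blocks */
};

/* Test bit n of a little-endian bitmap; 1 means the block is in use. */
static int bit_is_set(const unsigned char *map, size_t n)
{
    return (map[n / 8] >> (n % 8)) & 1;
}

/* Record up to `cap` free windows of the bitmap in out[].
 * Returns the total number of windows found, which may exceed cap,
 * so the caller can tell how large a complete map would have been. */
static size_t build_extent_map(const unsigned char *map, size_t nbits,
                               struct free_extent *out, size_t cap)
{
    size_t n = 0, i = 0;

    assert(out != NULL || cap == 0);
    while (i < nbits) {
        size_t start;

        if (bit_is_set(map, i)) {
            i++;
            continue;
        }
        start = i;
        while (i < nbits && !bit_is_set(map, i))
            i++;
        if (n < cap) {
            out[n].start = start;
            out[n].len = i - start;
        }
        n++;
    }
    return n;
}
```

One property of this sketch worth noting: a bitmap that alternates used and free blocks produces one extent for every two bits, so the map for such a bitmap is far larger than the bitmap itself.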
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-12 0:54 ` Jeffrey Mahoney
@ 2006-07-12 5:42 ` Hans Reiser
2006-07-12 5:52 ` Jeffrey Mahoney
0 siblings, 1 reply; 49+ messages in thread
From: Hans Reiser @ 2006-07-12 5:42 UTC (permalink / raw)
To: Jeffrey Mahoney; +Cc: Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list

You make this way too complicated because you are trying to be way too
perfect. If you scan 3 bitmap blocks and find nothing, stop trying to
size match.

Hans

Jeffrey Mahoney wrote:

> Jeff Mahoney wrote:
>
> >Hans Reiser wrote:
>
> >>>Guys, if you run the kernel under a debugger, and get it to where you
> >>>see the excessive CPU usage, and then start stepping through the bitmap
> >>>code, I am sure it will be very obvious what the error is. Can anyone
> >>>do that for us? Jeff?
>
> >Apologies to everyone CC'd who've already seen this message. It was
> >bounced from the namesys servers and I wanted to preserve the CC list.
>
> >***
>
> >Mike sent me a copy of the metadata and I am now able to reproduce
> >locally. My profiling looks like this:
>
> >samples  %        image name   app name  symbol name
> >148596   17.8573  reiserfs.ko  reiserfs  reiserfs_in_journal
> >58194    6.9934   reiserfs.ko  reiserfs  search_by_key
> >38937    4.6792   vmlinux      vmlinux   memmove
> >38783    4.6607   reiserfs.ko  reiserfs  scan_bitmap_block
> >38466    4.6226   jbd          jbd       (no symbols)
> >23249    2.7939   vmlinux      vmlinux   __find_get_block
> >18196    2.1867   vmlinux      vmlinux   tty_write
> >17734    2.1312   vmlinux      vmlinux   do_ioctl
> >17293    2.0782   loop         loop      (no symbols)
> >15400    1.8507   vmlinux      vmlinux   cond_resched_lock
> >14836    1.7829   vmlinux      vmlinux   copy_user_generic_c
> >14143    1.6996   reiserfs.ko  reiserfs  do_journal_end
> >13638    1.6389   vmlinux      vmlinux   find_next_zero_bit
> >13236    1.5906   vmlinux      vmlinux   default_llseek
> >12925    1.5532   vmlinux      vmlinux   bit_waitqueue
> >8921     1.0721   vmlinux      vmlinux   __delay
> >
> >Hans -
>
> >My speculation about the bitmaps being fragmented was right on. I wrote
> >a quick little script to parse the output of debugreiserfs -m and report
> >on the frequency of different window sizes. Windows of 1-31 blocks are
> >extremely common, accounting for 99.8% of all free windows. The problem
> >is that in my testing, where I made the allocator report the size of
> >allocation requests, the most common request was for a window of 32
> >blocks.
>
> >What's happening is that we keep finding windows that are too small,
> >which results in a lot of wasted effort. The cycle goes like this:
>
> >if (unfm && is_block_in_journal(s, bmap_n, *beg, beg))
> >        continue;
> >/* first zero bit found; we check next bits */
> >for (end = *beg + 1;; end++) {
> >        if (end >= *beg + max || end >= boundary
> >            || reiserfs_test_le_bit(end, bi->bh->b_data)) {
> >                next = end;
> >                break;
> >        }
> >        /* finding the other end of zero bit window requires
> >         * looking into journal structures (in
> >         * case of searching for free blocks for unformatted nodes) */
> >        if (unfm && is_block_in_journal(s, bmap_n, end, &next))
> >                break;
> >}
>
> >If the window is too small, we end up looping up to the top and try to
> >find another one. Since the overwhelming majority of the windows are too
> >small, we go through just about all the bitmaps without backing off the
> >window size.
>
>
> To be clear, eventually the allocations are honored, but only after
> *all* of the bitmaps are searched. On the third pass, we drop the window
> to a single block and restart the scan, eventually building a 32-block
> set that is probably quite fragmented. This occurs on every write, hence
> the huge performance hit.
>
> It appears as though ext3 doesn't have this problem because they don't
> batch writes the way reiserfs does. They'll start a search at a decent
> hint the same way we do, but the window is always one block.
>
> So, we're stuck between a rock and a hard place. We can have the better
> allocation performance at lower usage and sacrifice performance later or
> we can have stable allocation performance at an overall reduction in
> performance.
>
> I have an idea that may get around both problems, but I'm not sure how
> well it would be received. We currently do some very basic caching of
> bitmap metadata such as the first zero bit and how many free blocks
> there are. What if we constructed an extent map of the free windows in
> each bitmaps when we cache the metadata and adjust the map when we
>
> There's a third option, but I'm not sure how well it would be received
>
> Right now, the allocator keeps track of things like how full a bitmap is
> and where the first zero bit is. It would also be possible to cache a
> list of windows in each bitmap to accelerate performance. This would
> have to be a shrinkable cache, since the pathlogical case could mean
> occupying
>
> -Jeff
>
> --
> Jeff Mahoney
> SUSE Labs

^ permalink raw reply [flat|nested] 49+ messages in thread
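As a rough illustration of the heuristic suggested at the top of this message (give up on size matching once three bitmap blocks have been scanned without success), here is a small user-space sketch in C. It is not the reiserfs allocator: the one-byte-per-block map, BITS_PER_BITMAP, MAX_SIZED_TRIES, scan_bitmap() and alloc_with_backoff() are all hypothetical names invented for the example, and restarting from the first bitmap after backing off is an assumption rather than anything stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_BITMAP 64	/* toy size; a 4 KiB bitmap block would cover 32768 blocks */
#define MAX_SIZED_TRIES 3	/* hypothetical cutoff: stop size matching after 3 bitmaps */

/* Scan one bitmap (1 = used, 0 = free, one byte per block for readability).
 * Returns the offset of the first free window of at least `min` blocks,
 * capped at `max` blocks, or -1 if this bitmap has no such window. */
static long scan_bitmap(const uint8_t *map, size_t min, size_t max, size_t *len)
{
	size_t beg = 0;

	assert(min >= 1 && min <= max);
	while (beg < BITS_PER_BITMAP) {
		size_t end;

		if (map[beg]) {		/* block in use, keep looking */
			beg++;
			continue;
		}
		/* first free block found; look for the other end of the window */
		for (end = beg + 1; end < BITS_PER_BITMAP && end < beg + max; end++)
			if (map[end])
				break;
		if (end - beg >= min) {
			*len = end - beg;
			return (long)beg;
		}
		beg = end;		/* window too small, skip past it */
	}
	return -1;
}

/* Try to find `want` contiguous blocks, but fall back to "any free block"
 * once MAX_SIZED_TRIES bitmaps (or all of them) have come up empty.
 * Returns a global block number or -1; *scanned counts bitmap scans. */
static long alloc_with_backoff(const uint8_t *maps, size_t nmaps, size_t want,
			       size_t *len, size_t *scanned)
{
	size_t min = want;
	size_t b = 0;

	*scanned = 0;
	while (b < nmaps) {
		long off = scan_bitmap(maps + b * BITS_PER_BITMAP, min, want, len);

		(*scanned)++;
		if (off >= 0)
			return (long)(b * BITS_PER_BITMAP) + off;
		b++;
		if (min > 1 && (b == MAX_SIZED_TRIES || b == nmaps)) {
			min = 1;	/* stop trying to size match */
			b = 0;		/* assumption: restart at the first bitmap */
		}
	}
	return -1;
}
```

On a map where every free window is a single block, a request for 32 blocks costs three failed bitmap scans plus one successful one in this sketch, instead of a pass over every bitmap on the disk.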
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-12 5:42 ` Hans Reiser
@ 2006-07-12 5:52 ` Jeffrey Mahoney
2006-07-12 8:18 ` Hans Reiser
0 siblings, 1 reply; 49+ messages in thread
From: Jeffrey Mahoney @ 2006-07-12 5:52 UTC (permalink / raw)
To: Hans Reiser; +Cc: Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hans Reiser wrote:
> You make this way too complicated because you are trying to be way too
> perfect. If you scan 3 bitmap blocks and find nothing, stop trying to
> size match.

Agreed on the trying too hard. I think we can find a better, less
"perfect" solution. I wrote that email on Friday on my notebook and it
couldn't connect. It managed to do so this evening. I spent the weekend
experimenting with the idea, and while I came up with something that
worked, it wasn't really usable. The memory footprint was much too large
to be worthwhile. For some fragmentation patterns, it would work. The
worst case scenario was totally intolerable.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEtI4xLPWxlyuTD7IRAh0uAJ4xqU2JFRUqgyQYDDQBr0oGBJBCXgCcCXD7
et36eQ8yUt3CD7e6+thPZvU=
=iFAe
-----END PGP SIGNATURE-----

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-12 5:52 ` Jeffrey Mahoney
@ 2006-07-12 8:18 ` Hans Reiser
2006-07-12 16:06 ` Jeff Mahoney
0 siblings, 1 reply; 49+ messages in thread
From: Hans Reiser @ 2006-07-12 8:18 UTC (permalink / raw)
To: Jeffrey Mahoney; +Cc: Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list

Jeffrey Mahoney wrote:

> Hans Reiser wrote:
>
> >You make this way too complicated because you are trying to be way too
> >perfect. If you scan 3 bitmap blocks and find nothing, stop trying to
> >size match.
>
>
> Agreed on the trying too hard..

What about the actual algorithm suggested?

> I think we can find a better, less
> "perfect" solution. I wrote that email on Friday on my notebook and it
> couldn't connect. It managed to do so this evening. I spent the weekend
> experimenting with the idea, and while I came up with something that
> worked, it wasn't really usable. The memory footprint was much too large
> to be worthwhile. For some fragmentation patterns, it would work. The
> worst case scenario was totally intolerable.
>
> -Jeff
>
> --
> Jeff Mahoney
> SUSE Labs
>

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10%?
2006-07-12 8:18 ` Hans Reiser
@ 2006-07-12 16:06 ` Jeff Mahoney
0 siblings, 0 replies; 49+ messages in thread
From: Jeff Mahoney @ 2006-07-12 16:06 UTC (permalink / raw)
To: Hans Reiser; +Cc: Mike Benoit, Jan Kara, Jure Pečar, reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hans Reiser wrote:
> Jeffrey Mahoney wrote:
>
>> Hans Reiser wrote:
>>
>>> You make this way too complicated because you are trying to be way too
>>> perfect. If you scan 3 bitmap blocks and find nothing, stop trying to
>>> size match.
>>
>> Agreed on the trying too hard..
>
> What about the actual algorithm suggested?

"Worked" really just meant I made the initial window scanning/tracking
code work and experimented with several ways of organizing it. I focused
on memory usage at first and my results forced me to abandon it. More on
that below. I didn't get as far as modifying the allocator to use this
code.

The initial idea was to keep track of the largest window available
within a bitmap. Since any write could split a window, that necessitates
tracking all the windows for each bitmap. Once we have that tracking, we
can make smarter decisions involving placement and reservation windows
if we so desired.

Another advantage of maintaining a separate tree outside of the bitmap
is that the reiserfs_in_journal() check can go away. When blocks are
freed, we can make the journal responsible for freeing them in the
window tree, so that the view of the bitmap is always "current."

The window tree could be built on the first read of the bitmap, just
like the bitmap metadata is generated now when the dynamic bitmap
patches are applied.

Seems like a good idea so far, except that the memory cost is
unreasonable. I tried several ways of organizing the data, but none were
small enough to be usable. The basic idea is a tree sorted by window
size.

My first attempt used lists of windows of a particular size, sorted by
position. The memory usage was horrendous. On a worst-case scenario of
16k 1-bit windows, memory usage was just under 400k/bitmap on a 64-bit
system.

The second attempt still used the tree but used arrays of shorts to
track the beginning of each window. The worst case memory usage is
approximately 16k/bitmap if kept trimmed, but that involves quite a bit
of overhead. The kernel has no realloc, so every bit allocation/free
would require an allocation/copy/free cycle, which isn't exactly nice.

The next incarnation managed the arrays in chunks, so that the array is
expanded by n blocks when space is required. Shrinks would occur when
the usage meant there were ~1.25 chunks free. The performance problems
of using the arrays still existed though. I guess in contrast with
scanning the bitmap blocks themselves, it's not so bad. Unfortunately,
this means that up to ~64M of RAM is pinned when used on a 500 GB file
system.

I suppose that since the data is only a summary of what is on disk and
is easily regenerated, it would be possible to make the cache shrinkable
and regenerated when needed. When the cache is shrunk, we could keep the
largest window as part of the bitmap metadata structure and regenerate
it to update it.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFEtR4hLPWxlyuTD7IRAp0pAKCejAb+AlMWnCa/zaZcEEYON2PIAgCff3u5
9zDTmuFS2BKZ/tA1SBuorD4=
=xDS+
-----END PGP SIGNATURE-----

^ permalink raw reply [flat|nested] 49+ messages in thread
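The memory figures in this message can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation under stated assumptions: 4 KiB blocks, one bitmap block covering block_size * 8 blocks, and "500 GB" read as 500 GiB. The helper names are made up for the example; the 16k and 400k per-bitmap costs are the figures quoted above, not measurements.

```c
#include <assert.h>
#include <stdint.h>

/* Blocks covered by one bitmap block: one bit per block (assumption). */
static uint64_t blocks_per_bitmap(uint64_t block_size)
{
	assert(block_size > 0);
	return block_size * 8;
}

/* Number of bitmap blocks for a file system of fs_bytes, rounded up. */
static uint64_t bitmap_count(uint64_t fs_bytes, uint64_t block_size)
{
	uint64_t blocks = fs_bytes / block_size;
	uint64_t per = blocks_per_bitmap(block_size);

	return (blocks + per - 1) / per;
}

/* Worst case: used and free blocks alternate, so every other bit is a
 * separate 1-bit window. */
static uint64_t worst_case_windows(uint64_t block_size)
{
	return blocks_per_bitmap(block_size) / 2;
}

/* Total cache size if every bitmap needs bytes_per_bitmap of tracking data. */
static uint64_t cache_bytes(uint64_t nbitmaps, uint64_t bytes_per_bitmap)
{
	return nbitmaps * bytes_per_bitmap;
}
```

Under these assumptions a 4 KiB bitmap block holds at most 16384 one-bit windows (the "16k" worst case), a 500 GiB file system has 4000 bitmap blocks, and 16 KiB per bitmap comes to 65,536,000 bytes, about 62.5 MiB, which is in line with the ~64M quoted. The 400k-per-bitmap list layout would need roughly 1.5 GiB on the same file system.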
* Re: ReiserFS v3 choking when free space falls below 10% - FIXED
2006-06-29 17:41 ReiserFS v3 choking when free space falls below 10%? Mike Benoit
2006-06-29 19:12 ` Vladimir V. Saveliev
@ 2006-07-24 22:26 ` Mike Benoit
2006-07-24 22:32 ` Jeff Mahoney
2006-07-26 0:10 ` David Masover
1 sibling, 2 replies; 49+ messages in thread
From: Mike Benoit @ 2006-07-24 22:26 UTC (permalink / raw)
To: reiserfs-list; +Cc: Jeff Mahoney
[-- Attachment #1.1: Type: text/plain, Size: 14408 bytes --]

I applied the attached patch that Jeff supplied me and so far it is
working flawlessly. I currently have less than 4% free space on my drive
and the CPU usage is less than 3% with two recordings going. I'll let it
run until about 2% free space just to test further.

It also _appears_ that overall CPU usage is down slightly based on the
vmstat output from when we were trying to diagnose the problem before
compared to now. The SYS CPU time was hovering between 3-10% before, and
now it seems to be between 0-2%. I haven't done any actual performance
tests though.

Jeff, what drawbacks does this patch have?

Thanks for all your hard work, I'm sure many other MythTV users will
appreciate it.

On Thu, 2006-06-29 at 10:41 -0700, Mike Benoit wrote:
> My MythTV box recently started showing odd behavior during recordings,
> at certain times the load of the box would spike to 10+ and recordings
> would start losing frames and become unwatchable. TOP would show
> mythbackend as using 90+% SYS CPU usage, which under normal
> circumstances it normally uses about 5% USR.
>
> So I finally got around to profiling mythbackend when the load starts to
> spike. To my surprise it appears that once I have less then 10% (30GB)
> free on the drive reiserfs can't up, even just writing at 1mb/sec is too
> much for it.
>
> Is there something that can be done to fix this, 30gb seems like a lot
> of wasted space.
>
> #opreport
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
>           TIMER:0|
>   samples|      %|
> ------------------
>     77863 78.7856 reiserfs
>     18183 18.3984 vmlinux
>       695  0.7032 mysqld
>       452  0.4574 libc-2.4.so
>       360  0.3643 libmythtv-0.19.so.0.19.0
>       324  0.3278 ivtv
>       323  0.3268 nvidia
>       242  0.2449 libqt-mt.so.3.3.6
>       110  0.1113 libpthread-2.4.so
>        53  0.0536 libstdc++.so.6.0.8
>        35  0.0354 ld-2.4.so
>        23  0.0233 libperl.so
>        22  0.0223 libz.so.1.2.3
> <snip>
>
> #opreport -l /usr/src/linux/vmlinux
> CPU: CPU with timer interrupt, speed 0 MHz (estimated)
> Profiling through timer interrupt
> samples  %        symbol name
> 9607     52.8351  default_idle
> 7694     42.3142  find_next_zero_bit
> 183       1.0064  __copy_from_user_ll
> 57        0.3135  handle_IRQ_event
> 37        0.2035  __copy_to_user_ll
> 34        0.1870  ide_outb
> 30        0.1650  ide_end_request
> 22        0.1210  ioread8
> 22        0.1210  schedule
> 21        0.1155  get_page_from_freelist
> 17        0.0935  mmx_clear_page
> <snip>
>
> System Details:
> -----------------------------------------------
> Kernel v2.6.16.21 (custom compiled)
> - This issue also happened with 2.6.14 too though.
>
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/hda1             280G  269G   12G  97% /
>
> [root@mythtv]# cat /proc/mounts
> rootfs / rootfs rw 0 0
> /dev /dev tmpfs rw 0 0
> /dev/root / reiserfs rw,noatime,nodiratime 0 0
>
> [root@mythtv]# cat /proc/cpuinfo
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 6
> model           : 6
> model name      : AMD Athlon(tm) XP 2100+
> stepping        : 2
> cpu MHz         : 1759.680
> cache size      : 256 KB
>
> [root@mythtv]# free
>              total       used       free     shared    buffers     cached
> Mem:        515992     496256      19736          0      36256     271728
> -/+ buffers/cache:     188272     327720
> Swap:       262136        408     261728
>
> [root@mythtv ~]# hdparm -i /dev/hda
> /dev/hda:
> Model=ST3300622A, FwRev=3.AND, SerialNo=3NF1GAGW
> Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
> RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
> BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=16
> CurCHS=4047/16/255, CurSects=16511760, LBA=yes, LBAsects=268435455
> IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
> PIO modes: pio0 pio1 pio2 pio3 pio4
> DMA modes: mdma0 mdma1 mdma2
> UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
> AdvancedPM=no WriteCache=enabled
> Drive conforms to: Unspecified: ATA/ATAPI-1 ATA/ATAPI-2 ATA/ATAPI-3
> ATA/ATAPI-4 ATA/ATAPI-5 ATA/ATAPI-6 ATA/ATAPI-7
> * signifies the current active mode
>
> [root@mythtv ~]# hdparm -tT /dev/hda
> /dev/hda:
> Timing cached reads: 1296 MB in 2.00 seconds = 646.99 MB/sec
> Timing buffered disk reads: 166 MB in 3.02 seconds = 55.05 MB/sec
>
> vmstat 1 output:
> --------------------------------------------------------------
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r b swpd  free  buff  cache si so   bi   bo  in   cs us  sy id wa
>  8 0  408  5800 29308 248604  0  0    0 1036 406  132  2  98  0  0
>  4 0  408  5644 29396 248608  0  0    0 1128 437  184  2  92  0  6
>  7 0  408  6316 29428 248020  0  0    0 1316 539  287  0  86  0 14
>  5 0  408  6104 29480 248180  0  0    0  588 415  187  0  99  0  1
>  4 0  408  5764 29536 248364  0  0    0 1092 421  172  2  97  1  0
>  6 0  408  6528 29592 247684  0  0    0 1092 425  161  2  98  0  1
>  2 1  408  6372 29676 247724  0  0    0 2304 385  170  2  97  1  0
>  5 0  408  6400 29676 247616  0  0    0   48 383  122  0 100  0  0
>  7 0  408  6192 29704 247872  0  0    0 1080 409  162  1  98  0  1
>  6 0  408  5720 29732 248304  0  0    0 1076 414  178  1  98  0  1
>  7 0  408  6348 29800 247552  0  0    0 1656 460  300  2  87  1 11
>  5 0  408  6628 29848 247248  0  0    0 1164 407  207  1  94  0  5
>  5 0  408  5884 29896 247996  0  0    4 1116 453  353  1  76  0 23
>  6 0  408  5640 29868 248204  0  0    0 1052 416  132  1  99  0  0
>  4 0  408  5772 29940 248104  0  0    0  648 490  314  1  84  1 14
>  6 1  408  6328 30036 247464  0  0    0 1928 488  305  2  85  0 13
>  4 0  408  6184 30076 247472  0  0    4  860 404  201  1  94  0  5
>  4 0  408  6332 30044 247328  0  0    0 1312 429  156  1  99  0  0
>  9 0  408  6120 30100 247580  0  0    0  604 494  305  3  81  1 16
>  2 1  408  6460 30140 247116  0  0    0 1372 436  315  1  79  0 20
> 10 0  408  6252 30176 247372  0  0    0  456 412  126  1  99  0  0
>  6 0  408  6432 30164 247276  0  0    4 1268 425  255  1  88  1 10
>  3 0  408  5688 30220 247948  0  0    0 1332 454  352  0  78  0 22
>  2 1  408  6352 30284 247124  0  0    0 1140 362  156  2  96  1  1
>  5 0  408  6564 30284 246908  0  0    0   92 472  316  2  83  0 15
>  5 0  408  6348 30352 247056  0  0    0 1168 506  350  0  83  0 17
>  4 0  408  5604 30404 247828  0  0    4 1124 448  262  2  87  0 11
>  3 0  408  5880 30444 247500  0  0    0 1104 426  315  2  77  1 20
>  2 1  408  5916 30496 247352  0  0    0 1064 365  152  1  97  0  2
>  7 0  408  6072 30496 247204  0  0    0  440 489  307  1  82  0 17
>  6 0  408  5936 30528 247288  0  0    0  816 434  130  2  98  0  0
>  4 0  408  5944 30588 247300  0  0    0 1108 359  172  0  98  0  2
>  4 0  408  5664 30624 247508  0  0    0 1444 426  161  0  99  1  0
>  5 0  408  6656 30608 246572  0  0    0 1220 425  163  2  98  0  1
>  6 0  408  6316 30656 246848  0  0    0 1552 441  180  1  98  0  1
>  4 0  408  6408 30632 246776  0  0    0  644 403  140  1  99  0  0
>  9 0  408  6072 30696 247060  0  0    4  744 496  351  2  82  1 16
>  5 0  408  5864 30708 247240  0  0    0 1680 509  335  1  83  1 15
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r b swpd  free  buff  cache si so   bi   bo  in   cs us  sy id wa
>  3 1  408  6284 30768 246768  0  0    0 1132 434  328  1  76  1 22
>  6 0  408  6352 30772 246692  0  0    0  576 373  170  0  93  0  7
>  4 0  408  6008 30820 246932  0  0    4  612 496  322  1  83  0 16
>  4 0  408  6288 30836 246600  0  0    0 1484 480  304  1  85  0 14
>  4 0  408  6064 30896 246844  0  0    0 1136 504  337  1  84  1 15
>  5 0  408  5728 30900 247116  0  0    4 1188 426  156  1  99  0  0
>  6 0  408  5696 30968 247144  0  0    0 1104 367  123  3  97  0  0
>  4 0  408  5608 31016 247144  0  0    0 1152 445  378  2  74  1 23
>  7 0  408  5576 31008 247088  0  0    0  964 402  115  1  99  0  0
>  4 0  408  6328 31052 246396  0  0    0  628 355  152  1  98  0  1
>  5 0  408  6116 31112 246524  0  0    0 1620 472  299  2  85  1 12
>  2 1  408  6336 31204 246176  0  0    0 1112 367  156  2  96  0  2
>  7 0  408  6388 31176 246192  0  0    0   76 457  272  0  86  0 14
>  5 0  408  6268 31232 246284  0  0    0 1136 466  267  1  85  1 13
>  2 1  408  5932 31304 246616  0  0    4 2068 374  173  1  99  0  0
>  6 0  408  5960 31224 246564  0  0    0  104 472  273  1  84  0 15
>  6 0  408  5692 31308 246716  0  0    0 1160 412  206  2  94  0  4
>  5 0  408  5600 31336 246892  0  0    4 1660 480  289  2  86  0 12
>  7 0  408  6400 31336 245964  0  0    0 1052 418  160  3  97  0  0
>  6 0  408  6316 31292 246136  0  0    0  512 432  127  1  99  0  0
>  5 0  408  5856 31372 246528  0  0    0 1824 404  159  2  96  0  2
>  3 0  408  5880 31424 246412  0  0    0 1156 454  174  1  97  1  1
>  3 0  408  6024 31372 246336  0  0    0  896 399  130  0 100  0  0
>  5 0  408  5812 31432 246492  0  0    0  708 413  160  1  97  0  2
>  5 0  408  6396 31424 246024  0  0    0 1604 436  163  1  97  1  1
>  6 1  408  6276 31492 245924  0  0  216 1176 511  409  3  82  0 15
>  4 0  408  6312 31528 245944  0  0    0 1116 468  263  1  86  0 13
>  1 2  408  6592 31576 245628  0  0   56 1044 343  126  0  97  0  3
>  5 0  408  6312 31576 245904  0  0   32   48 427  155  0  97  0  3
>  1 0  408  5816 31624 246360  0  0   72 1796 590  834  2  40 35 24
>  1 1  408 16872 31704 247564  0  0 1232  248 513 1185 28   4 11 57
>  1 1  408 31240 31768 248520  0  0  932   92 403  996 32   4 10 54
>  1 0  408 29576 31880 248704  0  0  188  248 372  997  7   6 61 26
>  1 1  408 28284 31952 249852  0  0  316  344 402  842 20  21 45 13
>  0 1  408 27188 32008 250940  0  0  112  976 393  465 33  58  0  9
>  5 1  408 24748 32100 253228  0  0 1212 1424 571  949 31  31  0 37
>  2 0  408 23052 32156 255032  0  0  544 1036 415  351 16  80  0  4
>  0 1  408 21148 32232 256808  0  0  516 1480 454  692 33  41  0 25
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r b swpd  free  buff  cache si so   bi   bo  in   cs us  sy id wa
>  2 1  408 19616 32288 258308  0  0  576 1352 414  478 33  59  0  8
>  4 0  408 18084 32348 259816  0  0  496 1344 423  524 29  56  0 15
>  5 0  408 17016 32428 260844  0  0  192  812 518  574 24  63  0 13
>  2 0  408 15348 32488 262444  0  0  208 1064 416  295 14  85  0  1
>  5 0  408 13616 32552 264104  0  0   84 1684 497  615 32  66  0  2
>  5 1  408 13496 32612 263992  0  0   92 1148 530  526 14  71  0 14
>  0 1  408 13000 32784 264556  0  0   80 1240 506  504  1  59  0 40
>  3 1  408 12132 32864 265324  0  0   36  612 431  438  2  65  0 34
>  1 1  408 10196 33048 266960  0  0  216    4 440  565  1  60  0 39
>  1 1  408  9252 33284 267768  0  0  168 2444 463  617  1  56  0 43
>  0 3  408  7208 33376 269680  0  0   32 3460 459  497  1  59  0 40
>  2 1  408  6416 33444 270392  0  0   24  748 448  423  0  71  0 29
>  0 1  408  5976 33664 270568  0  0  220 1436 481  654  2  55  0 43
>  1 0  408  6100 33700 270356  0  0    8  844 406  389  9  70 16  5
>  0 0  408  5848 33732 270568  0  0    0 1128 435  401  0  72 27  1
>  1 0  408  5720 33772 270664  0  0    0  852 398  350  1  73 25  1
>  1 0  408  6100 33780 270320  0  0    0 1216 446  522  0  54 45  1
>  3 0  408  5736 33780 270644  0  0    0 1092 475  736  0  32 67  1
>  1 0  408  6372 33952 269720  0  0    0 1040 462  522  4  69 26  1
>  2 0  408  6436 33944 269592  0  0    0  864 433  287  0  83 16  1
>  0 0  408  5848 34024 270140  0  0    4 1232 480  701  3  39 53  5
>  2 0  408  9196 33936 266612  0  0  104  212 596 1035 10  43 40  8
>  3 0  408  8824 33936 267380  0  0    0  512 388   90  0 100  0  0
>  4 0  408  7956 33968 268148  0  0    0  548 400  114  1  98  0  1
>  2 0  408  6492 34000 269604  0  0    0  892 432  629  0  38 61  1
>  2 0  408  6416 34084 269648  0  0    0 1712 403  591  0  40 58  2
>  5 0  408  6612 34120 269376  0  0    0  844 447  557  1  49 49  1
>  4 0  408  6424 34148 269548  0  0    0  880 465  493  0  65 35  0
>  1 0  408  6336 34196 269596  0  0    0 1112 475  552  3  59 36  2
>  4 1  408  6304 34340 269404  0  0    0 1668 378  316  0  78 22  0
>  3 0  408  6096 34368 269608  0  0    0  308 411  625  1  38 59  2
>  3 0  408  6268 34412 269372  0  0    0 1148 398  583  0  39 60  1
>  5 0  408  6400 34444 269264  0  0    0  824 431  414  0  67 33  0
>
--
Mike Benoit <ipso@snappymail.ca>

[-- Attachment #1.2: reiserfs-bitmap-no-minimum-window-size.diff --]
[-- Type: text/x-patch, Size: 1476 bytes --]

--- linux-2.6.17.orig/fs/reiserfs/bitmap.c	2006-01-02 22:21:10.000000000 -0500
+++ linux-2.6.17.orig.devel/fs/reiserfs/bitmap.c	2006-07-23 19:10:57.000000000 -0400
@@ -1020,7 +1020,6 @@
 	b_blocknr_t finish = SB_BLOCK_COUNT(s) - 1;
 	int passno = 0;
 	int nr_allocated = 0;
-	int bigalloc = 0;
 
 	determine_prealloc_size(hint);
 	if (!hint->formatted_node) {
@@ -1047,28 +1046,9 @@
 			hint->preallocate = hint->prealloc_size = 0;
 		}
 		/* for unformatted nodes, force large allocations */
-		bigalloc = amount_needed;
 	}
 
 	do {
-		/* in bigalloc mode, nr_allocated should stay zero until
-		 * the entire allocation is filled
-		 */
-		if (unlikely(bigalloc && nr_allocated)) {
-			reiserfs_warning(s, "bigalloc is %d, nr_allocated %d\n",
-					 bigalloc, nr_allocated);
-			/* reset things to a sane value */
-			bigalloc = amount_needed - nr_allocated;
-		}
-		/*
-		 * try pass 0 and pass 1 looking for a nice big
-		 * contiguous allocation. Then reset and look
-		 * for anything you can find.
-		 */
-		if (passno == 2 && bigalloc) {
-			passno = 0;
-			bigalloc = 0;
-		}
 		switch (passno++) {
 		case 0:	/* Search from hint->search_start to end of disk */
 			start = hint->search_start;
@@ -1106,8 +1086,7 @@
 							       new_blocknrs +
 							       nr_allocated,
 							       start, finish,
-							       bigalloc ?
-							       bigalloc : 1,
+							       1,
 							       amount_needed -
 							       nr_allocated,
 							       hint->

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10% - FIXED
2006-07-24 22:26 ` ReiserFS v3 choking when free space falls below 10% - FIXED Mike Benoit
@ 2006-07-24 22:32 ` Jeff Mahoney
2006-07-26 0:10 ` David Masover
1 sibling, 0 replies; 49+ messages in thread
From: Jeff Mahoney @ 2006-07-24 22:32 UTC (permalink / raw)
To: Mike Benoit; +Cc: reiserfs-list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mike Benoit wrote:
> I applied the attached patch that Jeff supplied me and so far it is
> working flawlessly. I currently have less than 4% free space on my drive
> and the CPU usage is less than 3% with two recordings going. I'll let it
> run until about 2% free space just to test further.
>
> It also _appears_ that overall CPU usage is down slightly based on the
> vmstat output from when we were trying to diagnose the problem before
> compared to now. The SYS CPU time was hovering between 3-10% before, and
> now it seems to be between 0-2%. I haven't done any actual performance
> tests though.
>
> Jeff, what drawbacks does this patch have?
>
> Thanks for all your hard work, I'm sure many other MythTV users will
> appreciate it.

Hi Mike -

There really shouldn't be any. I suspect that the window searching was
actually causing more problems than it was solving. The original goal
would have been to try to keep chunks of blocks contiguous for better
access patterns, but if those chunks end up getting spread out all over
the disk, that's hardly the outcome we were looking for.

So, what will now happen is that the allocator will allocate the next n
blocks it can find, regardless of the window size. If there happens to
be a window of the size we needed, it will automatically find it through
the normal process of allocating one block at a time.

- -Jeff

- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFExUqOLPWxlyuTD7IRAuaFAJ47W+zr2ZwIs//vMgm3RNHuw4dpwACdECdv
ueI91PGuCLQdeKipY5G9kqk=
=vk6Z
-----END PGP SIGNATURE-----

^ permalink raw reply [flat|nested] 49+ messages in thread
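The behaviour described in this reply (take the next n free blocks from the hint onward, with no minimum window size) can be pictured with a toy allocator like the one below. It is a user-space sketch only: alloc_next_free() and the one-byte-per-block map are hypothetical, and the single wrap-around scan is a simplification assumed for the example, not the multi-pass logic in fs/reiserfs/bitmap.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Take up to `n` free blocks, scanning forward from `hint` and wrapping once.
 * map: 1 = used, 0 = free (one byte per block for readability).
 * Chosen blocks are marked used and stored in out[]; returns how many
 * blocks were actually found. */
static size_t alloc_next_free(uint8_t *map, size_t nblocks, size_t hint,
			      size_t n, size_t *out)
{
	size_t found = 0;

	assert(nblocks > 0 && hint < nblocks);
	for (size_t k = 0; k < nblocks && found < n; k++) {
		size_t blk = (hint + k) % nblocks;

		if (map[blk])
			continue;	/* in use; note there is no window-size check */
		map[blk] = 1;
		out[found++] = blk;
	}
	return found;
}
```

If a contiguous run of n free blocks sits just past the hint, the loop returns exactly that run, which matches the point made above that a suitably sized window is found "through the normal process of allocating one block at a time"; on a fragmented map it simply returns whatever free blocks come next.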
* Re: ReiserFS v3 choking when free space falls below 10% - FIXED
2006-07-24 22:26 ` ReiserFS v3 choking when free space falls below 10% - FIXED Mike Benoit
2006-07-24 22:32 ` Jeff Mahoney
@ 2006-07-26 0:10 ` David Masover
2006-07-26 2:25 ` Mike Benoit
2006-07-26 14:29 ` Hans Reiser
1 sibling, 2 replies; 49+ messages in thread
From: David Masover @ 2006-07-26 0:10 UTC (permalink / raw)
To: Mike Benoit; +Cc: reiserfs-list, Jeff Mahoney
[-- Attachment #1: Type: text/plain, Size: 351 bytes --]

Mike Benoit wrote:

> Thanks for all your hard work, I'm sure many other MythTV users will
> appreciate it.

As a future MythTV user a bit late to this discussion, I'm curious --
was this Reiser3 or 4? Are there any known MythTV issues with v4? I
say this because the box with my capture card is running on a Reiser4
root right now...

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 890 bytes --]

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10% - FIXED
2006-07-26 0:10 ` David Masover
@ 2006-07-26 2:25 ` Mike Benoit
2006-07-26 14:29 ` Hans Reiser
1 sibling, 0 replies; 49+ messages in thread
From: Mike Benoit @ 2006-07-26 2:25 UTC (permalink / raw)
To: David Masover; +Cc: reiserfs-list, Jeff Mahoney
[-- Attachment #1: Type: text/plain, Size: 683 bytes --]

On Tue, 2006-07-25 at 19:10 -0500, David Masover wrote:
> Mike Benoit wrote:
>
> > Thanks for all your hard work, I'm sure many other MythTV users will
> > appreciate it.
>
> As a future MythTV user a bit late to this discussion, I'm curious --
> was this Reiser3 or 4? Are there any known MythTV issues with v4? I
> say this because the box with my capture card is running on a Reiser4
> root right now...
>

It was Reiser3. I personally don't know of any issues with Reiser4 and
MythTV, but if Reiser4 has the "pauses" or "hangs" during flush that I
have heard so much about, I could see that posing a problem for MythTV.

--
Mike Benoit <ipso@snappymail.ca>

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: ReiserFS v3 choking when free space falls below 10% - FIXED
2006-07-26 0:10 ` David Masover
2006-07-26 2:25 ` Mike Benoit
@ 2006-07-26 14:29 ` Hans Reiser
1 sibling, 0 replies; 49+ messages in thread
From: Hans Reiser @ 2006-07-26 14:29 UTC (permalink / raw)
To: David Masover; +Cc: Mike Benoit, reiserfs-list, Jeff Mahoney

David Masover wrote:

>
>As a future MythTV user a bit late to this discussion, I'm curious --
>was this Reiser3 or 4? Are there any known MythTV issues with v4? I
>say this because the box with my capture card is running on a Reiser4
>root right now...
>

I think you get to be the one to tell us....

^ permalink raw reply [flat|nested] 49+ messages in thread
end of thread, other threads:[~2006-07-26 14:29 UTC | newest]
Thread overview: 49+ messages
2006-06-29 17:41 ReiserFS v3 choking when free space falls below 10%? Mike Benoit
2006-06-29 19:12 ` Vladimir V. Saveliev
2006-06-29 20:15 ` Mike Benoit
2006-06-29 20:22 ` Vladimir V. Saveliev
2006-06-29 21:01 ` Mike Benoit
2006-06-29 20:36 ` Nate Diller
2006-06-30 16:33 ` Hans Reiser
2006-06-30 16:47 ` Jeff Mahoney
2006-06-30 17:04 ` Hans Reiser
2006-06-30 17:46 ` Mike Benoit
2006-06-30 18:18 ` Hans Reiser
2006-07-05 0:37 ` Mike Benoit
2006-07-05 2:37 ` Hans Reiser
2006-07-05 14:42 ` Tom Vier
2006-07-05 19:12 ` Jeff Mahoney
[not found] ` <20060706125856.fdac1d16.pegasus@nerv.eu.org>
2006-07-06 15:43 ` Mike Benoit
2006-07-06 16:01 ` Jonathan Briggs
2006-07-06 17:26 ` Toby Thain
2006-07-06 17:26 ` Toby Thain
2006-07-06 18:02 ` Jeff Mahoney
2006-07-06 18:12 ` Hans Reiser
2006-07-06 18:19 ` Jeff Mahoney
2006-07-06 18:47 ` Mike Benoit
2006-07-06 19:17 ` Hans Reiser
2006-07-06 18:27 ` Mike Benoit
2006-07-06 18:39 ` Jeff Mahoney
2006-07-07 7:29 ` Mike Benoit
2006-07-07 17:49 ` Jan Kara
2006-07-07 17:50 ` Jeff Mahoney
2006-07-07 18:07 ` Jan Kara
2006-07-07 18:08 ` Jeff Mahoney
2006-07-07 19:05 ` Hans Reiser
2006-07-07 19:18 ` Jan Kara
2006-07-07 19:38 ` Hans Reiser
2006-07-07 20:18 ` Mike Benoit
2006-07-07 21:04 ` Mike Benoit
2006-07-07 21:20 ` Hans Reiser
2006-07-08 18:45 ` Jeff Mahoney
2006-07-09 0:01 ` Hans Reiser
2006-07-09 0:02 ` Hans Reiser
2006-07-12 0:54 ` Jeffrey Mahoney
2006-07-12 5:42 ` Hans Reiser
2006-07-12 5:52 ` Jeffrey Mahoney
2006-07-12 8:18 ` Hans Reiser
2006-07-12 16:06 ` Jeff Mahoney
2006-07-24 22:26 ` ReiserFS v3 choking when free space falls below 10% - FIXED Mike Benoit
2006-07-24 22:32 ` Jeff Mahoney
2006-07-26 0:10 ` David Masover
2006-07-26 2:25 ` Mike Benoit
2006-07-26 14:29 ` Hans Reiser