* ARGH MORE BUGS!!!
@ 2005-02-21 20:04 Christian Schmid
2005-02-21 20:19 ` Matthias-Christian Ott
2005-02-21 20:28 ` Francois Romieu
0 siblings, 2 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 20:04 UTC (permalink / raw)
To: netdev
Hi.
Another bug hit me today HARD! I have been experimenting with lowering the socket buffer to see if
the slowdown reappears at the same position. Result: with a 128 KB send buffer, the
slowdown appears at around 3000 sockets. With 64 KB, it didn't appear up to 4500 sockets, where a
small slowdown appeared, but I think that was a disk issue. So it's definitely something with TCP memory.
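For anyone trying to reproduce this, the kernel-wide limits that govern the behaviour described above can be read straight from /proc; a quick sketch (paths assume a 2.6-era kernel, and the interpretation is the usual one, not something confirmed in this thread):

```shell
# Global TCP memory thresholds, in pages (low / pressure / high): once
# total TCP memory crosses "pressure", the kernel starts squeezing every
# socket's send buffer, so the effect scales with the socket count.
cat /proc/sys/net/ipv4/tcp_mem
# Per-socket send-buffer bounds in bytes (min / default / max), and the
# hard cap that SO_SNDBUF requests are clipped to.
cat /proc/sys/net/ipv4/tcp_wmem
cat /proc/sys/net/core/wmem_max
```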
But now I hit another BUG: after I had managed to create 4500 sockets, 10 minutes later an
interesting phenomenon appeared: the system locks up for 5 seconds every 60 seconds. I first thought this was
something in my program, but whatever I tried, I wasn't able to fix it. Even a restart of my
program didn't help. It even appears with 400 connections. Then, in despair, I just restarted the
system and: it was gone. So what is THIS?
Sorry if I am a bit angry. I know you are doing a really good job. Maybe I can donate some money
somewhere, but PLEASE!!!!! help me fix these bugs... Thank you.
Chris
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 20:04 ARGH MORE BUGS!!! Christian Schmid
@ 2005-02-21 20:19 ` Matthias-Christian Ott
2005-02-21 20:25 ` Christian Schmid
2005-02-21 20:28 ` Francois Romieu
1 sibling, 1 reply; 14+ messages in thread
From: Matthias-Christian Ott @ 2005-02-21 20:19 UTC (permalink / raw)
To: Christian Schmid; +Cc: netdev
Christian Schmid wrote:
> Hi.
>
> Another bug hit me today HARD! I have been experiencing with lowering
> the socket-buffer to see if the behaviour of a slowdown reappears at
> the same position. Result: With a 128 KB send-buffer, the slowdown
> appears at around 3000 sockets. With 64 KB, it didnt appear up to 4500
> sockets where a small slow-down appeared but I think this was a
> disk-issue. So its definetly something with TCP-memory.
>
> But now I hit another BUG: After I have managed to create 4500
> sockets, 10 minutes later an interesting phenomenon appeared: It locks
> for 5 seconds every 60 seconds. I first thought this was something in
> my program but I can do what I want, I wasn't able to fix this. Even a
> restart of my program didnt help. It even appears with 400
> connections. Then I despairedly just restarted the system and: It was
> gone. So what is THIS?
>
> Sorry if I am a bit angry. I know you are doing a really good job.
> Maybe I can donate some money somewhere but PLEASE!!!!! help me fix
> this bugs.... Thank you.
>
> Chris
>
>
Hi!
I'm not a Perl coder or socket specialist, but did you try an implementation
of your program in C (maybe it's a Perl "bug")? And as mentioned in the
other thread, try using send() instead of sendfile(). Anyway, it's
strange. Try contacting some of the maintainers and developers of this
part of the IPv4 implementation in Linux.
Matthias-Christian Ott
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 20:19 ` Matthias-Christian Ott
@ 2005-02-21 20:25 ` Christian Schmid
0 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 20:25 UTC (permalink / raw)
To: Matthias-Christian Ott; +Cc: netdev
netdev@oss.sgi.com is listed in MAINTAINERS in the ipv4/ipv6 section ;)
Matthias-Christian Ott wrote:
> Christian Schmid wrote:
>
>> Hi.
>>
>> Another bug hit me today HARD! I have been experiencing with lowering
>> the socket-buffer to see if the behaviour of a slowdown reappears at
>> the same position. Result: With a 128 KB send-buffer, the slowdown
>> appears at around 3000 sockets. With 64 KB, it didnt appear up to 4500
>> sockets where a small slow-down appeared but I think this was a
>> disk-issue. So its definetly something with TCP-memory.
>>
>> But now I hit another BUG: After I have managed to create 4500
>> sockets, 10 minutes later an interesting phenomenon appeared: It locks
>> for 5 seconds every 60 seconds. I first thought this was something in
>> my program but I can do what I want, I wasn't able to fix this. Even a
>> restart of my program didnt help. It even appears with 400
>> connections. Then I despairedly just restarted the system and: It was
>> gone. So what is THIS?
>>
>> Sorry if I am a bit angry. I know you are doing a really good job.
>> Maybe I can donate some money somewhere but PLEASE!!!!! help me fix
>> this bugs.... Thank you.
>>
>> Chris
>>
>>
> Hi!
> I'm not a Perl Coder or Socket Specialist, but did try an implementation
> of your program in C (maybe it's a perl "bug"?)? And as mentioned in the
> other Thread, try to use send () instead of sendfile (). Anyway it's
> strange. Try to contact some of the Maintainers and Developers of this
> part of the IP v4 implementation in Linux.
>
> Matthias-Christian Ott
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 20:04 ARGH MORE BUGS!!! Christian Schmid
2005-02-21 20:19 ` Matthias-Christian Ott
@ 2005-02-21 20:28 ` Francois Romieu
2005-02-21 20:34 ` Christian Schmid
1 sibling, 1 reply; 14+ messages in thread
From: Francois Romieu @ 2005-02-21 20:28 UTC (permalink / raw)
To: Christian Schmid; +Cc: netdev
Christian Schmid <webmaster@rapidforum.com> :
[...]
> But now I hit another BUG: After I have managed to create 4500 sockets, 10
> minutes later an interesting phenomenon appeared: It locks for 5 seconds
> every 60 seconds. I first thought this was something in my program but I
> can do what I want, I wasn't able to fix this. Even a restart of my program
> didnt help. It even appears with 400 connections. Then I despairedly just
> restarted the system and: It was gone. So what is THIS?
What about monitoring /proc/slabinfo and vmstat output (with heavy renicing)?
--
Ueimor
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 20:28 ` Francois Romieu
@ 2005-02-21 20:34 ` Christian Schmid
2005-02-21 20:56 ` Francois Romieu
0 siblings, 1 reply; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 20:34 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
(s02) [21:05:13] root:~# cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ip_fib_alias 14 226 16 226 1 : tunables 120 60 8 : slabdata 1 1 0
ip_fib_hash 14 119 32 119 1 : tunables 120 60 8 : slabdata 1 1 0
rpc_buffers 8 8 2048 2 1 : tunables 24 12 8 : slabdata 4 4 0
rpc_tasks 8 15 256 15 1 : tunables 120 60 8 : slabdata 1 1 0
rpc_inode_cache 0 0 512 7 1 : tunables 54 27 8 : slabdata 0 0 0
unix_sock 15 28 512 7 1 : tunables 54 27 8 : slabdata 4 4 0
ipt_hashlimit 0 0 64 61 1 : tunables 120 60 8 : slabdata 0 0 0
tcp_tw_bucket 3474 4464 128 31 1 : tunables 120 60 8 : slabdata 144 144 480
tcp_bind_bucket 11 226 16 226 1 : tunables 120 60 8 : slabdata 1 1 0
tcp_open_request 310 427 64 61 1 : tunables 120 60 8 : slabdata 7 7 0
inet_peer_cache 430 488 64 61 1 : tunables 120 60 8 : slabdata 8 8 0
secpath_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
xfrm_dst_cache 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
ip_dst_cache 10434 25410 256 15 1 : tunables 120 60 8 : slabdata 1694 1694 0
arp_cache 3 15 256 15 1 : tunables 120 60 8 : slabdata 1 1 0
raw_sock 5 7 512 7 1 : tunables 54 27 8 : slabdata 1 1 0
udp_sock 0 0 512 7 1 : tunables 54 27 8 : slabdata 0 0 0
tcp_sock 3218 3374 1152 7 2 : tunables 24 12 8 : slabdata 482 482 12
flow_cache 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
dm-snapshot-in 128 140 56 70 1 : tunables 120 60 8 : slabdata 2 2 0
dm-snapshot-ex 0 0 24 156 1 : tunables 120 60 8 : slabdata 0 0 0
dm-crypt_io 0 0 76 52 1 : tunables 120 60 8 : slabdata 0 0 0
dm_tio 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0 0
dm_io 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0 0
scsi_cmd_cache 261 390 384 10 1 : tunables 54 27 8 : slabdata 39 39 184
cfq_ioc_pool 0 0 24 156 1 : tunables 120 60 8 : slabdata 0 0 0
cfq_pool 0 0 104 38 1 : tunables 120 60 8 : slabdata 0 0 0
crq_pool 0 0 56 70 1 : tunables 120 60 8 : slabdata 0 0 0
deadline_drq 0 0 52 75 1 : tunables 120 60 8 : slabdata 0 0 0
as_arq 665 793 64 61 1 : tunables 120 60 8 : slabdata 13 13 420
mqueue_inode_cache 1 7 512 7 1 : tunables 54 27 8 : slabdata 1 1 0
udf_inode_cache 0 0 368 11 1 : tunables 54 27 8 : slabdata 0 0 0
nfs_write_data 36 42 512 7 1 : tunables 54 27 8 : slabdata 6 6 0
nfs_read_data 32 35 512 7 1 : tunables 54 27 8 : slabdata 5 5 0
nfs_inode_cache 0 0 572 7 1 : tunables 54 27 8 : slabdata 0 0 0
nfs_page 0 0 64 61 1 : tunables 120 60 8 : slabdata 0 0 0
isofs_inode_cache 0 0 340 11 1 : tunables 54 27 8 : slabdata 0 0 0
fat_inode_cache 0 0 372 10 1 : tunables 54 27 8 : slabdata 0 0 0
fat_cache 0 0 20 185 1 : tunables 120 60 8 : slabdata 0 0 0
ext2_inode_cache 2687 4446 424 9 1 : tunables 54 27 8 : slabdata 494 494 10
journal_handle 0 0 20 185 1 : tunables 120 60 8 : slabdata 0 0 0
journal_head 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0 0
revoke_table 0 0 12 290 1 : tunables 120 60 8 : slabdata 0 0 0
revoke_record 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0 0
ext3_inode_cache 0 0 500 8 1 : tunables 54 27 8 : slabdata 0 0 0
ext3_xattr 0 0 48 81 1 : tunables 120 60 8 : slabdata 0 0 0
reiser_inode_cache 483 840 392 10 1 : tunables 54 27 8 : slabdata 84 84 0
dnotify_cache 0 0 20 185 1 : tunables 120 60 8 : slabdata 0 0 0
eventpoll_pwq 0 0 36 107 1 : tunables 120 60 8 : slabdata 0 0 0
eventpoll_epi 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
kioctx 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
kiocb 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
fasync_cache 0 0 16 226 1 : tunables 120 60 8 : slabdata 0 0 0
shmem_inode_cache 85 117 408 9 1 : tunables 54 27 8 : slabdata 13 13 0
posix_timers_cache 0 0 104 38 1 : tunables 120 60 8 : slabdata 0 0 0
uid_cache 5 61 64 61 1 : tunables 120 60 8 : slabdata 1 1 0
sgpool-128 52 57 2560 3 2 : tunables 24 12 8 : slabdata 19 19 0
sgpool-64 35 42 1280 3 1 : tunables 24 12 8 : slabdata 14 14 0
sgpool-32 292 294 640 6 1 : tunables 54 27 8 : slabdata 49 49 173
sgpool-16 247 280 384 10 1 : tunables 54 27 8 : slabdata 28 28 114
sgpool-8 321 345 256 15 1 : tunables 120 60 8 : slabdata 23 23 104
blkdev_ioc 71 135 28 135 1 : tunables 120 60 8 : slabdata 1 1 0
blkdev_queue 29 40 372 10 1 : tunables 54 27 8 : slabdata 4 4 0
blkdev_requests 662 783 148 27 1 : tunables 120 60 8 : slabdata 29 29 420
biovec-(256) 256 256 3072 2 2 : tunables 24 12 8 : slabdata 128 128 0
biovec-128 256 260 1536 5 2 : tunables 24 12 8 : slabdata 52 52 0
biovec-64 503 580 768 5 1 : tunables 54 27 8 : slabdata 116 116 158
biovec-16 478 495 256 15 1 : tunables 120 60 8 : slabdata 33 33 104
biovec-4 402 488 64 61 1 : tunables 120 60 8 : slabdata 8 8 44
biovec-1 857 4068 16 226 1 : tunables 120 60 8 : slabdata 18 18 404
bio 791 1829 128 31 1 : tunables 120 60 8 : slabdata 59 59 300
file_lock_cache 20 43 92 43 1 : tunables 120 60 8 : slabdata 1 1 0
sock_inode_cache 3190 3360 384 10 1 : tunables 54 27 8 : slabdata 336 336 54
skbuff_head_cache 98736 109950 256 15 1 : tunables 120 60 8 : slabdata 7330 7330 420
sock 7 20 384 10 1 : tunables 54 27 8 : slabdata 2 2 0
proc_inode_cache 19 48 328 12 1 : tunables 54 27 8 : slabdata 4 4 0
sigqueue 149 216 148 27 1 : tunables 120 60 8 : slabdata 8 8 0
radix_tree_node 55690 56182 276 14 1 : tunables 54 27 8 : slabdata 4013 4013 27
bdev_cache 33 42 512 7 1 : tunables 54 27 8 : slabdata 6 6 0
mnt_cache 20 31 128 31 1 : tunables 120 60 8 : slabdata 1 1 0
inode_cache 1099 1104 312 12 1 : tunables 54 27 8 : slabdata 92 92 0
dentry_cache 7375 10388 140 28 1 : tunables 120 60 8 : slabdata 371 371 0
filp 6532 6945 256 15 1 : tunables 120 60 8 : slabdata 463 463 0
names_cache 40 40 4096 1 1 : tunables 24 12 8 : slabdata 40 40 12
idr_layer_cache 77 116 136 29 1 : tunables 120 60 8 : slabdata 4 4 0
buffer_head 68550 68550 52 75 1 : tunables 120 60 8 : slabdata 914 914 0
mm_struct 61 72 640 6 1 : tunables 54 27 8 : slabdata 12 12 0
vm_area_struct 1980 2745 88 45 1 : tunables 120 60 8 : slabdata 61 61 120
fs_cache 46 183 64 61 1 : tunables 120 60 8 : slabdata 3 3 0
files_cache 46 63 512 7 1 : tunables 54 27 8 : slabdata 9 9 0
signal_cache 115 150 256 15 1 : tunables 120 60 8 : slabdata 10 10 0
sighand_cache 90 95 1408 5 2 : tunables 24 12 8 : slabdata 19 19 0
task_struct 106 108 1280 3 1 : tunables 24 12 8 : slabdata 36 36 0
anon_vma 499 870 12 290 1 : tunables 120 60 8 : slabdata 3 3 0
pgd 90 238 32 119 1 : tunables 120 60 8 : slabdata 2 2 0
pmd 62 62 4096 1 1 : tunables 24 12 8 : slabdata 62 62 0
size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-65536 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-32768 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0 0
size-16384 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0 0
size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-8192 8 8 8192 1 2 : tunables 8 4 0 : slabdata 8 8 0
size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 8 : slabdata 0 0 0
size-4096 2509 2509 4096 1 1 : tunables 24 12 8 : slabdata 2509 2509 0
size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 8 : slabdata 0 0 0
size-2048 290 326 2048 2 1 : tunables 24 12 8 : slabdata 160 163 60
size-1024(DMA) 0 0 1024 4 1 : tunables 54 27 8 : slabdata 0 0 0
size-1024 308 324 1024 4 1 : tunables 54 27 8 : slabdata 81 81 27
size-512(DMA) 0 0 512 8 1 : tunables 54 27 8 : slabdata 0 0 0
size-512 95390 97064 512 8 1 : tunables 54 27 8 : slabdata 12133 12133 216
size-256(DMA) 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
size-256 272 300 256 15 1 : tunables 120 60 8 : slabdata 20 20 0
size-128(DMA) 0 0 128 31 1 : tunables 120 60 8 : slabdata 0 0 0
size-128 1601 1674 128 31 1 : tunables 120 60 8 : slabdata 54 54 0
size-64(DMA) 0 0 64 61 1 : tunables 120 60 8 : slabdata 0 0 0
size-64 18303 23058 64 61 1 : tunables 120 60 8 : slabdata 378 378 60
size-32(DMA) 0 0 32 119 1 : tunables 120 60 8 : slabdata 0 0 0
size-32 7614 10591 32 119 1 : tunables 120 60 8 : slabdata 89 89 0
kmem_cache 135 135 256 15 1 : tunables 120 60 8 : slabdata 9 9 0
Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> [...]
>
>>But now I hit another BUG: After I have managed to create 4500 sockets, 10
>>minutes later an interesting phenomenon appeared: It locks for 5 seconds
>>every 60 seconds. I first thought this was something in my program but I
>>can do what I want, I wasn't able to fix this. Even a restart of my program
>>didnt help. It even appears with 400 connections. Then I despairedly just
>>restarted the system and: It was gone. So what is THIS?
>
>
> What about monitoring /proc/slabinfo and vmstat output (with heavy renicing) ?
>
> --
> Ueimor
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 20:34 ` Christian Schmid
@ 2005-02-21 20:56 ` Francois Romieu
2005-02-21 21:17 ` Christian Schmid
2005-02-21 21:18 ` Christian Schmid
0 siblings, 2 replies; 14+ messages in thread
From: Francois Romieu @ 2005-02-21 20:56 UTC (permalink / raw)
To: Christian Schmid; +Cc: netdev
Christian Schmid <webmaster@rapidforum.com> :
[...]
Let's calm down, please.
# cat>/tmp/momo<<EOD
i=0
while : ; do
cat /proc/slabinfo > /tmp/\$i
i=$[ $i + 1 ]
sleep 1
done
EOD
# nice -n -19 vmstat 1 > /tmp/v & nice -n -19 sh /tmp/momo
Wait a few minutes or more until the 5s pauses happen several times
and post the result somewhere.
I doubt I'll fix it but it should give nifty graphics.
--
Ueimor
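One caveat with the script above: the EOD delimiter is unquoted, so the shell expands `$[ $i + 1 ]` while writing the here-document, and only a constant ends up in /tmp/momo (which would explain the "your script didnt work" downthread). A sketch with a quoted delimiter and POSIX arithmetic:

```shell
# Quoting the delimiter ('EOD') suppresses expansion while the
# here-document is written, so both $i and the counter arithmetic reach
# the script file verbatim; $((...)) is the POSIX spelling of the
# obsolete $[...] form.
cat > /tmp/momo <<'EOD'
i=0
while : ; do
    cat /proc/slabinfo > /tmp/$i
    i=$((i + 1))
    sleep 1
done
EOD
```

Run it the same way as before: `nice -n -19 sh /tmp/momo`.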
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 20:56 ` Francois Romieu
@ 2005-02-21 21:17 ` Christian Schmid
2005-02-21 21:36 ` Francois Romieu
2005-02-21 21:18 ` Christian Schmid
1 sibling, 1 reply; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 21:17 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
It's attached, since these are several files. Your script didn't work, so I used this one:
# nice -n -19 vmstat 1 > /tmp/v & nice -n -19 perl -e 'for(1..1000){sleep(1);system("cat /proc/slabinfo > /tmp/sl$_")}'
I started it after one break cycle and stopped it immediately after the next break cycle ended.
Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> [...]
>
> Let's calm down, please.
>
> # cat>/tmp/momo<<EOD
> i=0
> while : ; do
> cat /proc/slabinfo > /tmp/\$i
> i=$[ $i + 1 ]
> sleep 1
> done
> EOD
> # nice -n -19 vmstat 1 > /tmp/v & nice -n -19 sh /tmp/momo
>
> Wait a few minutes or more until the 5s pauses happen several times
> and post the result somewhere.
>
> I doubt I'll fix it but it should give nifty graphics.
>
> --
> Ueimor
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 21:17 ` Christian Schmid
@ 2005-02-21 21:36 ` Francois Romieu
2005-02-21 22:10 ` Christian Schmid
` (2 more replies)
0 siblings, 3 replies; 14+ messages in thread
From: Francois Romieu @ 2005-02-21 21:36 UTC (permalink / raw)
To: Christian Schmid; +Cc: netdev
Christian Schmid <webmaster@rapidforum.com> :
> I started it after one break-cyclus and stopped it immedately after the
> next break-cyclus ended.
I'd welcome several cycles to feed the data through gnuplot.
A sample based on 15 minutes or more would not hurt: I speak perl too.
--
Ueimor
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 21:36 ` Francois Romieu
@ 2005-02-21 22:10 ` Christian Schmid
2005-02-21 23:03 ` Christian Schmid
2005-02-22 0:10 ` ARGH MORE BUGS!!! Christian Schmid
2 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 22:10 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
>
>>I started it after one break-cyclus and stopped it immedately after the
>>next break-cyclus ended.
>
>
> I'd welcome several cycles to feed the data through gnuplot.
> A sample based on 15 minutes or more would not hurt: I speak perl too.
Hm, I have already restarted the server. There is no other way to make this behaviour go away...
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 21:36 ` Francois Romieu
2005-02-21 22:10 ` Christian Schmid
@ 2005-02-21 23:03 ` Christian Schmid
2005-02-22 0:23 ` argh more bugs!!! Francois Romieu
2005-02-22 0:10 ` ARGH MORE BUGS!!! Christian Schmid
2 siblings, 1 reply; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 23:03 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
[-- Attachment #1: Type: text/plain, Size: 393 bytes --]
It suddenly appeared again. There you go...
Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
>
>>I started it after one break-cyclus and stopped it immedately after the
>>next break-cyclus ended.
>
>
> I'd welcome several cycles to feed the data through gnuplot.
> A sample based on 15 minutes or more would not hurt: I speak perl too.
>
> --
> Ueimor
>
>
[-- Attachment #2: log.tgz --]
[-- Type: application/x-compressed, Size: 377644 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: argh more bugs!!!
2005-02-21 23:03 ` Christian Schmid
@ 2005-02-22 0:23 ` Francois Romieu
2005-02-22 0:37 ` Christian Schmid
0 siblings, 1 reply; 14+ messages in thread
From: Francois Romieu @ 2005-02-22 0:23 UTC (permalink / raw)
To: Christian Schmid; +Cc: netdev
Christian Schmid <webmaster@rapidforum.com> :
> It suddenly appeared again. there you go..........
Thanks. I'll do some graphics tomorrow to be sure but the slabs do not seem
wrong. vmstat output looks weird:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
2 3 0 8496 25236 7941848 0 0 37920 0 7563 2788 13 19 34 33
2 2 0 9268 25172 7941300 0 0 36688 0 7424 2814 15 19 40 26
1 0 0 19576 25264 7928356 0 0 9468 13080 8072 607 22 13 59 6
1 0 0 18052 25264 7928356 0 0 0 0 7975 40 18 7 75 0
1 0 0 17660 25264 7928356 0 0 0 0 7487 38 21 4 75 0
1 0 0 18560 25264 7928356 0 0 0 0 6500 44 22 3 75 0
1 0 0 20072 25264 7928356 0 0 0 0 5834 44 23 2 75 0
1 3 0 21516 25320 7928300 0 0 0 3408 6796 153 24 3 58 15
0 4 0 10596 25436 7942056 0 0 44084 2220 11226 4282 12 10 31 47
2 2 0 9324 25240 7943952 0 0 39292 0 8433 3212 9 13 39 38
4 1 0 11596 25300 7941580 0 0 35820 0 7945 4306 17 21 30 32
0 5 0 13208 25560 7939280 0 0 40684 6456 7920 4081 19 18 32 31
4 1 0 12620 24944 7859724 0 0 32204 272 7306 2304 12 28 27 34
1 3 0 64964 24852 7888240 0 0 44944 96 7314 2631 19 31 24 27
???
Since you have a lot of CPU, could you "strace -f -T -o /tmp/nitz -p xyz" one
or two of your perl processes when they hang?
If you do not have too many processes, monitoring "echo t > /proc/sysrq-trigger"
for some time could tell what the system is waiting for.
--
Ueimor
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: argh more bugs!!!
2005-02-22 0:23 ` argh more bugs!!! Francois Romieu
@ 2005-02-22 0:37 ` Christian Schmid
0 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-22 0:37 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
OK, the problem with the break is solved. I am REALLY sorry, but it was not the net code in Linux. The
SQL server hit an index-key collision when I added a second multi-column key to it. It seems the
sort buffer overflowed and suddenly drove the CPU time very high. I will contact the
MySQL developers and ask them about it. The break was due to the table lock MySQL takes for every update.
The problem with the slowdown at many sockets still exists; that isn't solved yet. I hope that isn't
my fault as well, else I'll feel forced to donate 1000 dollars to some open-source foundation. *grin*
Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
>
>>It suddenly appeared again. there you go..........
>
>
> Thanks. I'll do some graphics tomorrow to be sure but the slabs do not seem
> wrong. vmstat output looks weird:
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 2 3 0 8496 25236 7941848 0 0 37920 0 7563 2788 13 19 34 33
> 2 2 0 9268 25172 7941300 0 0 36688 0 7424 2814 15 19 40 26
> 1 0 0 19576 25264 7928356 0 0 9468 13080 8072 607 22 13 59 6
> 1 0 0 18052 25264 7928356 0 0 0 0 7975 40 18 7 75 0
> 1 0 0 17660 25264 7928356 0 0 0 0 7487 38 21 4 75 0
> 1 0 0 18560 25264 7928356 0 0 0 0 6500 44 22 3 75 0
> 1 0 0 20072 25264 7928356 0 0 0 0 5834 44 23 2 75 0
> 1 3 0 21516 25320 7928300 0 0 0 3408 6796 153 24 3 58 15
> 0 4 0 10596 25436 7942056 0 0 44084 2220 11226 4282 12 10 31 47
> 2 2 0 9324 25240 7943952 0 0 39292 0 8433 3212 9 13 39 38
> 4 1 0 11596 25300 7941580 0 0 35820 0 7945 4306 17 21 30 32
> 0 5 0 13208 25560 7939280 0 0 40684 6456 7920 4081 19 18 32 31
> 4 1 0 12620 24944 7859724 0 0 32204 272 7306 2304 12 28 27 34
> 1 3 0 64964 24852 7888240 0 0 44944 96 7314 2631 19 31 24 27
>
> ???
>
> Since you have a lot of cpu, could you "strace -f -T -o /tmp/nitz -p xyz" one
> or two of your perl processes when they hang ?
>
> If you do not have too many processes, monitoring "echo t > /proc/sysrq-trigger"
> for some time could tell what the system is waiting for.
>
> --
> Ueimor
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 21:36 ` Francois Romieu
2005-02-21 22:10 ` Christian Schmid
2005-02-21 23:03 ` Christian Schmid
@ 2005-02-22 0:10 ` Christian Schmid
2 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-22 0:10 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
I FOUND IT!!!
At least I was able to track this down quite far. It seems it only appears on port 80. So might
there be a problem when too many sockets share the same port, in this case port 80?
What can I do now? The problem is still there: every 60 seconds a break for 5 seconds. I am not really
willing to reboot anymore. So bug-hunters, please help.
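A quick way to check the port-80 theory without extra tools is to count the sockets on that port straight from /proc (a sketch; 0050 is port 80 in the hex form /proc/net/tcp uses):

```shell
# Field 2 of /proc/net/tcp is the local address as hex-ip:hex-port, so
# sockets on local port 80 end in ":0050"; TIME_WAIT entries appear in
# this table too and count against the same port.
awk 'NR > 1 && $2 ~ /:0050$/ { n++ } END { print n + 0 }' /proc/net/tcp
```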
Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
>
>>I started it after one break-cyclus and stopped it immedately after the
>>next break-cyclus ended.
>
>
> I'd welcome several cycles to feed the data through gnuplot.
> A sample based on 15 minutes or more would not hurt: I speak perl too.
>
> --
> Ueimor
>
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: ARGH MORE BUGS!!!
2005-02-21 20:56 ` Francois Romieu
2005-02-21 21:17 ` Christian Schmid
@ 2005-02-21 21:18 ` Christian Schmid
1 sibling, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 21:18 UTC (permalink / raw)
To: Francois Romieu; +Cc: netdev
[-- Attachment #1: Type: text/plain, Size: 841 bytes --]
OOPS, forgot the file... *grin*
It's attached, since these are several files. Your script didn't work, so I used this one:
# nice -n -19 vmstat 1 > /tmp/v & nice -n -19 perl -e 'for(1..1000){sleep(1);system("cat /proc/slabinfo > /tmp/sl$_")}'
I started it after one break cycle and stopped it immediately after the next break cycle ended.
Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> [...]
>
> Let's calm down, please.
>
> # cat>/tmp/momo<<EOD
> i=0
> while : ; do
> cat /proc/slabinfo > /tmp/\$i
> i=$[ $i + 1 ]
> sleep 1
> done
> EOD
> # nice -n -19 vmstat 1 > /tmp/v & nice -n -19 sh /tmp/momo
>
> Wait a few minutes or more until the 5s pauses happen several times
> and post the result somewhere.
>
> I doubt I'll fix it but it should give nifty graphics.
>
> --
> Ueimor
>
>
[-- Attachment #2: log.tgz --]
[-- Type: application/x-compressed, Size: 23187 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2005-02-22 0:37 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-02-21 20:04 ARGH MORE BUGS!!! Christian Schmid
2005-02-21 20:19 ` Matthias-Christian Ott
2005-02-21 20:25 ` Christian Schmid
2005-02-21 20:28 ` Francois Romieu
2005-02-21 20:34 ` Christian Schmid
2005-02-21 20:56 ` Francois Romieu
2005-02-21 21:17 ` Christian Schmid
2005-02-21 21:36 ` Francois Romieu
2005-02-21 22:10 ` Christian Schmid
2005-02-21 23:03 ` Christian Schmid
2005-02-22 0:23 ` argh more bugs!!! Francois Romieu
2005-02-22 0:37 ` Christian Schmid
2005-02-22 0:10 ` ARGH MORE BUGS!!! Christian Schmid
2005-02-21 21:18 ` Christian Schmid
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).