netdev.vger.kernel.org archive mirror
* ARGH MORE BUGS!!!
@ 2005-02-21 20:04 Christian Schmid
  2005-02-21 20:19 ` Matthias-Christian Ott
  2005-02-21 20:28 ` Francois Romieu
  0 siblings, 2 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 20:04 UTC (permalink / raw)
  To: netdev

Hi.

Another bug hit me today HARD! I have been experimenting with lowering the socket buffer to see if
the slowdown reappears at the same point. Result: with a 128 KB send buffer, the slowdown appears at
around 3000 sockets. With 64 KB, it didn't appear up to 4500 sockets, where a small slowdown
appeared, but I think that was a disk issue. So it's definitely something with TCP memory.
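
For reference, this is roughly what the per-connection buffer cap looks like (a minimal C sketch of
the idea only; my actual program is Perl, and the names here are illustrative):

#include <stdio.h>
#include <sys/socket.h>

/* Cap the kernel send buffer for one connection before streaming to it.
 * Note: Linux stores twice the requested value internally (see socket(7)). */
static int set_send_buffer(int fd, int bytes)
{
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0) {
		perror("setsockopt(SO_SNDBUF)");
		return -1;
	}
	return 0;
}

/* e.g. set_send_buffer(client_fd, 128 * 1024) for the first test,
 * set_send_buffer(client_fd, 64 * 1024) for the second */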

But now I hit another BUG: after I had managed to create 4500 sockets, 10 minutes later an
interesting phenomenon appeared: everything locks for 5 seconds every 60 seconds. I first thought
this was something in my program, but no matter what I did, I wasn't able to fix it. Even a restart
of my program didn't help, and it even appears with only 400 connections. Then in despair I just
restarted the system and: it was gone. So what is THIS?

Sorry if I am a bit angry. I know you are doing a really good job. Maybe I can donate some money
somewhere, but PLEASE!!!!! help me fix these bugs.... Thank you.

Chris

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 20:04 ARGH MORE BUGS!!! Christian Schmid
@ 2005-02-21 20:19 ` Matthias-Christian Ott
  2005-02-21 20:25   ` Christian Schmid
  2005-02-21 20:28 ` Francois Romieu
  1 sibling, 1 reply; 14+ messages in thread
From: Matthias-Christian Ott @ 2005-02-21 20:19 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

Christian Schmid wrote:

> Hi.
>
> Another bug hit me today HARD! I have been experimenting with lowering
> the socket buffer to see if the slowdown reappears at the same point.
> Result: with a 128 KB send buffer, the slowdown appears at around 3000
> sockets. With 64 KB, it didn't appear up to 4500 sockets, where a small
> slowdown appeared, but I think that was a disk issue. So it's
> definitely something with TCP memory.
>
> But now I hit another BUG: after I had managed to create 4500 sockets,
> 10 minutes later an interesting phenomenon appeared: everything locks
> for 5 seconds every 60 seconds. I first thought this was something in
> my program, but no matter what I did, I wasn't able to fix it. Even a
> restart of my program didn't help, and it even appears with only 400
> connections. Then in despair I just restarted the system and: it was
> gone. So what is THIS?
>
> Sorry if I am a bit angry. I know you are doing a really good job.
> Maybe I can donate some money somewhere, but PLEASE!!!!! help me fix
> these bugs.... Thank you.
>
> Chris
>
>
Hi!
I'm not a Perl coder or socket specialist, but did you try an
implementation of your program in C (maybe it's a Perl "bug")? And as
mentioned in the other thread, try using send() instead of sendfile().
Anyway, it's strange. Try contacting some of the maintainers and
developers of this part of the IPv4 implementation in Linux.
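
Something like this plain read()+send() loop is what I mean; just an
untested sketch, with the chunk size and error handling as placeholders:

#include <unistd.h>
#include <sys/socket.h>

/* Stream one file over a socket with read()+send() instead of sendfile().
 * in_fd is a regular file, sock is a connected TCP socket.
 * Returns bytes sent, or -1 on error (caller checks errno). */
static ssize_t copy_with_send(int sock, int in_fd)
{
	char buf[65536];		/* placeholder chunk size */
	ssize_t n, total = 0;

	while ((n = read(in_fd, buf, sizeof(buf))) > 0) {
		ssize_t off = 0;

		while (off < n) {	/* send() may write less than asked */
			ssize_t sent = send(sock, buf + off, n - off, 0);
			if (sent < 0)
				return -1;
			off += sent;
		}
		total += off;
	}
	return n < 0 ? -1 : total;
}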

Matthias-Christian Ott

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 20:19 ` Matthias-Christian Ott
@ 2005-02-21 20:25   ` Christian Schmid
  0 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 20:25 UTC (permalink / raw)
  To: Matthias-Christian Ott; +Cc: netdev

netdev@oss.sgi.com is listed in MAINTAINERS in the ipv4/ipv6 section ;)

Matthias-Christian Ott wrote:
> Christian Schmid wrote:
> 
>> Hi.
>>
>> Another bug hit me today HARD! I have been experimenting with
>> lowering the socket buffer to see if the slowdown reappears at the
>> same point. Result: with a 128 KB send buffer, the slowdown appears
>> at around 3000 sockets. With 64 KB, it didn't appear up to 4500
>> sockets, where a small slowdown appeared, but I think that was a disk
>> issue. So it's definitely something with TCP memory.
>>
>> But now I hit another BUG: after I had managed to create 4500
>> sockets, 10 minutes later an interesting phenomenon appeared:
>> everything locks for 5 seconds every 60 seconds. I first thought this
>> was something in my program, but no matter what I did, I wasn't able
>> to fix it. Even a restart of my program didn't help, and it even
>> appears with only 400 connections. Then in despair I just restarted
>> the system and: it was gone. So what is THIS?
>>
>> Sorry if I am a bit angry. I know you are doing a really good job.
>> Maybe I can donate some money somewhere, but PLEASE!!!!! help me fix
>> these bugs.... Thank you.
>>
>> Chris
>>
>>
> Hi!
> I'm not a Perl coder or socket specialist, but did you try an
> implementation of your program in C (maybe it's a Perl "bug")? And as
> mentioned in the other thread, try using send() instead of sendfile().
> Anyway, it's strange. Try contacting some of the maintainers and
> developers of this part of the IPv4 implementation in Linux.
> 
> Matthias-Christian Ott
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 20:04 ARGH MORE BUGS!!! Christian Schmid
  2005-02-21 20:19 ` Matthias-Christian Ott
@ 2005-02-21 20:28 ` Francois Romieu
  2005-02-21 20:34   ` Christian Schmid
  1 sibling, 1 reply; 14+ messages in thread
From: Francois Romieu @ 2005-02-21 20:28 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

Christian Schmid <webmaster@rapidforum.com> :
[...]
> But now I hit another BUG: After I have managed to create 4500 sockets, 10 
> minutes later an interesting phenomenon appeared: It locks for 5 seconds 
> every 60 seconds. I first thought this was something in my program but I 
> can do what I want, I wasn't able to fix this. Even a restart of my program 
> didnt help. It even appears with 400 connections. Then I despairedly just 
> restarted the system and: It was gone. So what is THIS?

What about monitoring /proc/slabinfo and vmstat output (with heavy renicing) ?

--
Ueimor

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 20:28 ` Francois Romieu
@ 2005-02-21 20:34   ` Christian Schmid
  2005-02-21 20:56     ` Francois Romieu
  0 siblings, 1 reply; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 20:34 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

(s02) [21:05:13] root:~# cat /proc/slabinfo
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <batchcount> <limit> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ip_fib_alias          14    226     16  226    1 : tunables  120   60    8 : slabdata      1      1      0
ip_fib_hash           14    119     32  119    1 : tunables  120   60    8 : slabdata      1      1      0
rpc_buffers            8      8   2048    2    1 : tunables   24   12    8 : slabdata      4      4      0
rpc_tasks              8     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
rpc_inode_cache        0      0    512    7    1 : tunables   54   27    8 : slabdata      0      0      0
unix_sock             15     28    512    7    1 : tunables   54   27    8 : slabdata      4      4      0
ipt_hashlimit          0      0     64   61    1 : tunables  120   60    8 : slabdata      0      0      0
tcp_tw_bucket       3474   4464    128   31    1 : tunables  120   60    8 : slabdata    144    144    480
tcp_bind_bucket       11    226     16  226    1 : tunables  120   60    8 : slabdata      1      1      0
tcp_open_request     310    427     64   61    1 : tunables  120   60    8 : slabdata      7      7      0
inet_peer_cache      430    488     64   61    1 : tunables  120   60    8 : slabdata      8      8      0
secpath_cache          0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
xfrm_dst_cache         0      0    256   15    1 : tunables  120   60    8 : slabdata      0      0      0
ip_dst_cache       10434  25410    256   15    1 : tunables  120   60    8 : slabdata   1694   1694      0
arp_cache              3     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
raw_sock               5      7    512    7    1 : tunables   54   27    8 : slabdata      1      1      0
udp_sock               0      0    512    7    1 : tunables   54   27    8 : slabdata      0      0      0
tcp_sock            3218   3374   1152    7    2 : tunables   24   12    8 : slabdata    482    482     12
flow_cache             0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
dm-snapshot-in       128    140     56   70    1 : tunables  120   60    8 : slabdata      2      2      0
dm-snapshot-ex         0      0     24  156    1 : tunables  120   60    8 : slabdata      0      0      0
dm-crypt_io            0      0     76   52    1 : tunables  120   60    8 : slabdata      0      0      0
dm_tio                 0      0     16  226    1 : tunables  120   60    8 : slabdata      0      0      0
dm_io                  0      0     16  226    1 : tunables  120   60    8 : slabdata      0      0      0
scsi_cmd_cache       261    390    384   10    1 : tunables   54   27    8 : slabdata     39     39    184
cfq_ioc_pool           0      0     24  156    1 : tunables  120   60    8 : slabdata      0      0      0
cfq_pool               0      0    104   38    1 : tunables  120   60    8 : slabdata      0      0      0
crq_pool               0      0     56   70    1 : tunables  120   60    8 : slabdata      0      0      0
deadline_drq           0      0     52   75    1 : tunables  120   60    8 : slabdata      0      0      0
as_arq               665    793     64   61    1 : tunables  120   60    8 : slabdata     13     13    420
mqueue_inode_cache      1      7    512    7    1 : tunables   54   27    8 : slabdata      1      1      0
udf_inode_cache        0      0    368   11    1 : tunables   54   27    8 : slabdata      0      0      0
nfs_write_data        36     42    512    7    1 : tunables   54   27    8 : slabdata      6      6      0
nfs_read_data         32     35    512    7    1 : tunables   54   27    8 : slabdata      5      5      0
nfs_inode_cache        0      0    572    7    1 : tunables   54   27    8 : slabdata      0      0      0
nfs_page               0      0     64   61    1 : tunables  120   60    8 : slabdata      0      0      0
isofs_inode_cache      0      0    340   11    1 : tunables   54   27    8 : slabdata      0      0      0
fat_inode_cache        0      0    372   10    1 : tunables   54   27    8 : slabdata      0      0      0
fat_cache              0      0     20  185    1 : tunables  120   60    8 : slabdata      0      0      0
ext2_inode_cache    2687   4446    424    9    1 : tunables   54   27    8 : slabdata    494    494     10
journal_handle         0      0     20  185    1 : tunables  120   60    8 : slabdata      0      0      0
journal_head           0      0     48   81    1 : tunables  120   60    8 : slabdata      0      0      0
revoke_table           0      0     12  290    1 : tunables  120   60    8 : slabdata      0      0      0
revoke_record          0      0     16  226    1 : tunables  120   60    8 : slabdata      0      0      0
ext3_inode_cache       0      0    500    8    1 : tunables   54   27    8 : slabdata      0      0      0
ext3_xattr             0      0     48   81    1 : tunables  120   60    8 : slabdata      0      0      0
reiser_inode_cache    483    840    392   10    1 : tunables   54   27    8 : slabdata     84     84      0
dnotify_cache          0      0     20  185    1 : tunables  120   60    8 : slabdata      0      0      0
eventpoll_pwq          0      0     36  107    1 : tunables  120   60    8 : slabdata      0      0      0
eventpoll_epi          0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
kioctx                 0      0    256   15    1 : tunables  120   60    8 : slabdata      0      0      0
kiocb                  0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
fasync_cache           0      0     16  226    1 : tunables  120   60    8 : slabdata      0      0      0
shmem_inode_cache     85    117    408    9    1 : tunables   54   27    8 : slabdata     13     13      0
posix_timers_cache      0      0    104   38    1 : tunables  120   60    8 : slabdata      0      0      0
uid_cache              5     61     64   61    1 : tunables  120   60    8 : slabdata      1      1      0
sgpool-128            52     57   2560    3    2 : tunables   24   12    8 : slabdata     19     19      0
sgpool-64             35     42   1280    3    1 : tunables   24   12    8 : slabdata     14     14      0
sgpool-32            292    294    640    6    1 : tunables   54   27    8 : slabdata     49     49    173
sgpool-16            247    280    384   10    1 : tunables   54   27    8 : slabdata     28     28    114
sgpool-8             321    345    256   15    1 : tunables  120   60    8 : slabdata     23     23    104
blkdev_ioc            71    135     28  135    1 : tunables  120   60    8 : slabdata      1      1      0
blkdev_queue          29     40    372   10    1 : tunables   54   27    8 : slabdata      4      4      0
blkdev_requests      662    783    148   27    1 : tunables  120   60    8 : slabdata     29     29    420
biovec-(256)         256    256   3072    2    2 : tunables   24   12    8 : slabdata    128    128      0
biovec-128           256    260   1536    5    2 : tunables   24   12    8 : slabdata     52     52      0
biovec-64            503    580    768    5    1 : tunables   54   27    8 : slabdata    116    116    158
biovec-16            478    495    256   15    1 : tunables  120   60    8 : slabdata     33     33    104
biovec-4             402    488     64   61    1 : tunables  120   60    8 : slabdata      8      8     44
biovec-1             857   4068     16  226    1 : tunables  120   60    8 : slabdata     18     18    404
bio                  791   1829    128   31    1 : tunables  120   60    8 : slabdata     59     59    300
file_lock_cache       20     43     92   43    1 : tunables  120   60    8 : slabdata      1      1      0
sock_inode_cache    3190   3360    384   10    1 : tunables   54   27    8 : slabdata    336    336     54
skbuff_head_cache  98736 109950    256   15    1 : tunables  120   60    8 : slabdata   7330   7330    420
sock                   7     20    384   10    1 : tunables   54   27    8 : slabdata      2      2      0
proc_inode_cache      19     48    328   12    1 : tunables   54   27    8 : slabdata      4      4      0
sigqueue             149    216    148   27    1 : tunables  120   60    8 : slabdata      8      8      0
radix_tree_node    55690  56182    276   14    1 : tunables   54   27    8 : slabdata   4013   4013     27
bdev_cache            33     42    512    7    1 : tunables   54   27    8 : slabdata      6      6      0
mnt_cache             20     31    128   31    1 : tunables  120   60    8 : slabdata      1      1      0
inode_cache         1099   1104    312   12    1 : tunables   54   27    8 : slabdata     92     92      0
dentry_cache        7375  10388    140   28    1 : tunables  120   60    8 : slabdata    371    371      0
filp                6532   6945    256   15    1 : tunables  120   60    8 : slabdata    463    463      0
names_cache           40     40   4096    1    1 : tunables   24   12    8 : slabdata     40     40     12
idr_layer_cache       77    116    136   29    1 : tunables  120   60    8 : slabdata      4      4      0
buffer_head        68550  68550     52   75    1 : tunables  120   60    8 : slabdata    914    914      0
mm_struct             61     72    640    6    1 : tunables   54   27    8 : slabdata     12     12      0
vm_area_struct      1980   2745     88   45    1 : tunables  120   60    8 : slabdata     61     61    120
fs_cache              46    183     64   61    1 : tunables  120   60    8 : slabdata      3      3      0
files_cache           46     63    512    7    1 : tunables   54   27    8 : slabdata      9      9      0
signal_cache         115    150    256   15    1 : tunables  120   60    8 : slabdata     10     10      0
sighand_cache         90     95   1408    5    2 : tunables   24   12    8 : slabdata     19     19      0
task_struct          106    108   1280    3    1 : tunables   24   12    8 : slabdata     36     36      0
anon_vma             499    870     12  290    1 : tunables  120   60    8 : slabdata      3      3      0
pgd                   90    238     32  119    1 : tunables  120   60    8 : slabdata      2      2      0
pmd                   62     62   4096    1    1 : tunables   24   12    8 : slabdata     62     62      0
size-131072(DMA)       0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-131072            0      0 131072    1   32 : tunables    8    4    0 : slabdata      0      0      0
size-65536(DMA)        0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-65536             0      0  65536    1   16 : tunables    8    4    0 : slabdata      0      0      0
size-32768(DMA)        0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-32768             0      0  32768    1    8 : tunables    8    4    0 : slabdata      0      0      0
size-16384(DMA)        0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-16384             0      0  16384    1    4 : tunables    8    4    0 : slabdata      0      0      0
size-8192(DMA)         0      0   8192    1    2 : tunables    8    4    0 : slabdata      0      0      0
size-8192              8      8   8192    1    2 : tunables    8    4    0 : slabdata      8      8      0
size-4096(DMA)         0      0   4096    1    1 : tunables   24   12    8 : slabdata      0      0      0
size-4096           2509   2509   4096    1    1 : tunables   24   12    8 : slabdata   2509   2509      0
size-2048(DMA)         0      0   2048    2    1 : tunables   24   12    8 : slabdata      0      0      0
size-2048            290    326   2048    2    1 : tunables   24   12    8 : slabdata    160    163     60
size-1024(DMA)         0      0   1024    4    1 : tunables   54   27    8 : slabdata      0      0      0
size-1024            308    324   1024    4    1 : tunables   54   27    8 : slabdata     81     81     27
size-512(DMA)          0      0    512    8    1 : tunables   54   27    8 : slabdata      0      0      0
size-512           95390  97064    512    8    1 : tunables   54   27    8 : slabdata  12133  12133    216
size-256(DMA)          0      0    256   15    1 : tunables  120   60    8 : slabdata      0      0      0
size-256             272    300    256   15    1 : tunables  120   60    8 : slabdata     20     20      0
size-128(DMA)          0      0    128   31    1 : tunables  120   60    8 : slabdata      0      0      0
size-128            1601   1674    128   31    1 : tunables  120   60    8 : slabdata     54     54      0
size-64(DMA)           0      0     64   61    1 : tunables  120   60    8 : slabdata      0      0      0
size-64            18303  23058     64   61    1 : tunables  120   60    8 : slabdata    378    378     60
size-32(DMA)           0      0     32  119    1 : tunables  120   60    8 : slabdata      0      0      0
size-32             7614  10591     32  119    1 : tunables  120   60    8 : slabdata     89     89      0
kmem_cache           135    135    256   15    1 : tunables  120   60    8 : slabdata      9      9      0


Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> [...]
> 
>>But now I hit another BUG: After I have managed to create 4500 sockets, 10 
>>minutes later an interesting phenomenon appeared: It locks for 5 seconds 
>>every 60 seconds. I first thought this was something in my program but I 
>>can do what I want, I wasn't able to fix this. Even a restart of my program 
>>didnt help. It even appears with 400 connections. Then I despairedly just 
>>restarted the system and: It was gone. So what is THIS?
> 
> 
> What about monitoring /proc/slabinfo and vmstat output (with heavy renicing) ?
> 
> --
> Ueimor
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 20:34   ` Christian Schmid
@ 2005-02-21 20:56     ` Francois Romieu
  2005-02-21 21:17       ` Christian Schmid
  2005-02-21 21:18       ` Christian Schmid
  0 siblings, 2 replies; 14+ messages in thread
From: Francois Romieu @ 2005-02-21 20:56 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

Christian Schmid <webmaster@rapidforum.com> :
[...]

Let's calm down, please.

# cat > /tmp/momo <<'EOD'
i=0
while : ; do
        cat /proc/slabinfo > /tmp/$i
        i=$[ $i + 1 ]
        sleep 1
done
EOD
# nice -n -19 vmstat 1 > /tmp/v & nice -n -19 sh /tmp/momo

Wait a few minutes or more until the 5s pauses happen several times
and post the result somewhere.

I doubt I'll fix it but it should give nifty graphics.

--
Ueimor

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 20:56     ` Francois Romieu
@ 2005-02-21 21:17       ` Christian Schmid
  2005-02-21 21:36         ` Francois Romieu
  2005-02-21 21:18       ` Christian Schmid
  1 sibling, 1 reply; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 21:17 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

It's attached, since there are several files. Your script didn't work, so I used this one instead:

# nice -n -19 vmstat 1 > /tmp/v & nice -n -19 perl -e 'for(1..1000){sleep(1);system("cat /proc/slabinfo > /tmp/sl$_")}'

I started it after one break cycle and stopped it immediately after the next break cycle ended.

Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> [...]
> 
> Let's calm down, please.
> 
> # cat > /tmp/momo <<'EOD'
> i=0
> while : ; do
>         cat /proc/slabinfo > /tmp/$i
>         i=$[ $i + 1 ]
>         sleep 1
> done
> EOD
> # nice -n -19 vmstat 1 > /tmp/v & nice -n -19 sh /tmp/momo
> 
> Wait a few minutes or more until the 5s pauses happen several times
> and post the result somewhere.
> 
> I doubt I'll fix it but it should give nifty graphics.
> 
> --
> Ueimor
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 20:56     ` Francois Romieu
  2005-02-21 21:17       ` Christian Schmid
@ 2005-02-21 21:18       ` Christian Schmid
  1 sibling, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 21:18 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 841 bytes --]

OOPS, forgot the file.... *grin*

It's attached, since there are several files. Your script didn't work, so I used this one instead:

# nice -n -19 vmstat 1 > /tmp/v & nice -n -19 perl -e 'for(1..1000){sleep(1);system("cat /proc/slabinfo > /tmp/sl$_")}'

I started it after one break cycle and stopped it immediately after the next break cycle ended.

Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> [...]
> 
> Let's calm down, please.
> 
> # cat > /tmp/momo <<'EOD'
> i=0
> while : ; do
>         cat /proc/slabinfo > /tmp/$i
>         i=$[ $i + 1 ]
>         sleep 1
> done
> EOD
> # nice -n -19 vmstat 1 > /tmp/v & nice -n -19 sh /tmp/momo
> 
> Wait a few minutes or more until the 5s pauses happen several times
> and post the result somewhere.
> 
> I doubt I'll fix it but it should give nifty graphics.
> 
> --
> Ueimor
> 
> 

[-- Attachment #2: log.tgz --]
[-- Type: application/x-compressed, Size: 23187 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 21:17       ` Christian Schmid
@ 2005-02-21 21:36         ` Francois Romieu
  2005-02-21 22:10           ` Christian Schmid
                             ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Francois Romieu @ 2005-02-21 21:36 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

Christian Schmid <webmaster@rapidforum.com> :
> I started it after one break cycle and stopped it immediately after the
> next break cycle ended.

I'd welcome several cycles to feed the data through gnuplot.
A sample based on 15 minutes or more would not hurt: I speak perl too.

--
Ueimor

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 21:36         ` Francois Romieu
@ 2005-02-21 22:10           ` Christian Schmid
  2005-02-21 23:03           ` Christian Schmid
  2005-02-22  0:10           ` ARGH MORE BUGS!!! Christian Schmid
  2 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 22:10 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> 
>>I started it after one break cycle and stopped it immediately after the
>>next break cycle ended.
> 
> 
> I'd welcome several cycles to feed the data through gnuplot.
> A sample based on 15 minutes or more would not hurt: I speak perl too.

Hm, I have already restarted the server. There is no other way to make this behaviour go away....

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 21:36         ` Francois Romieu
  2005-02-21 22:10           ` Christian Schmid
@ 2005-02-21 23:03           ` Christian Schmid
  2005-02-22  0:23             ` argh more bugs!!! Francois Romieu
  2005-02-22  0:10           ` ARGH MORE BUGS!!! Christian Schmid
  2 siblings, 1 reply; 14+ messages in thread
From: Christian Schmid @ 2005-02-21 23:03 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

[-- Attachment #1: Type: text/plain, Size: 393 bytes --]

It suddenly appeared again. There you go..........

Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> 
>>I started it after one break cycle and stopped it immediately after the
>>next break cycle ended.
> 
> 
> I'd welcome several cycles to feed the data through gnuplot.
> A sample based on 15 minutes or more would not hurt: I speak perl too.
> 
> --
> Ueimor
> 
> 

[-- Attachment #2: log.tgz --]
[-- Type: application/x-compressed, Size: 377644 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: ARGH MORE BUGS!!!
  2005-02-21 21:36         ` Francois Romieu
  2005-02-21 22:10           ` Christian Schmid
  2005-02-21 23:03           ` Christian Schmid
@ 2005-02-22  0:10           ` Christian Schmid
  2 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-22  0:10 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

I FOUND IT!!!

At least I was able to track this down quite far. It seems it only appears on port 80. So might
there be a problem when too many sockets share the same port, in this case port 80?

What can I do now? The problem is still there: every 60 seconds, a pause of 5 seconds. I am not
really willing to reboot anymore. So bug-hunters, please help.

Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> 
>>I started it after one break cycle and stopped it immediately after the
>>next break cycle ended.
> 
> 
> I'd welcome several cycles to feed the data through gnuplot.
> A sample based on 15 minutes or more would not hurt: I speak perl too.
> 
> --
> Ueimor
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: argh more bugs!!!
  2005-02-21 23:03           ` Christian Schmid
@ 2005-02-22  0:23             ` Francois Romieu
  2005-02-22  0:37               ` Christian Schmid
  0 siblings, 1 reply; 14+ messages in thread
From: Francois Romieu @ 2005-02-22  0:23 UTC (permalink / raw)
  To: Christian Schmid; +Cc: netdev

Christian Schmid <webmaster@rapidforum.com> :
> It suddenly appeared again. There you go..........

Thanks. I'll do some graphics tomorrow to be sure but the slabs do not seem
wrong. vmstat output looks weird:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  3      0   8496  25236 7941848    0    0 37920     0 7563  2788 13 19 34 33
 2  2      0   9268  25172 7941300    0    0 36688     0 7424  2814 15 19 40 26
 1  0      0  19576  25264 7928356    0    0  9468 13080 8072   607 22 13 59  6
 1  0      0  18052  25264 7928356    0    0     0     0 7975    40 18  7 75  0
 1  0      0  17660  25264 7928356    0    0     0     0 7487    38 21  4 75  0
 1  0      0  18560  25264 7928356    0    0     0     0 6500    44 22  3 75  0
 1  0      0  20072  25264 7928356    0    0     0     0 5834    44 23  2 75  0
 1  3      0  21516  25320 7928300    0    0     0  3408 6796   153 24  3 58 15
 0  4      0  10596  25436 7942056    0    0 44084  2220 11226  4282 12 10 31 47
 2  2      0   9324  25240 7943952    0    0 39292     0 8433  3212  9 13 39 38
 4  1      0  11596  25300 7941580    0    0 35820     0 7945  4306 17 21 30 32
 0  5      0  13208  25560 7939280    0    0 40684  6456 7920  4081 19 18 32 31
 4  1      0  12620  24944 7859724    0    0 32204   272 7306  2304 12 28 27 34
 1  3      0  64964  24852 7888240    0    0 44944    96 7314  2631 19 31 24 27

???

Since you have a lot of cpu, could you "strace -f -T -o /tmp/nitz -p xyz" one
or two of your perl processes when they hang ?

If you do not have too many processes, monitoring "echo t > /proc/sysrq-trigger"
for some time could tell what the system is waiting for.

--
Ueimor

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: argh more bugs!!!
  2005-02-22  0:23             ` argh more bugs!!! Francois Romieu
@ 2005-02-22  0:37               ` Christian Schmid
  0 siblings, 0 replies; 14+ messages in thread
From: Christian Schmid @ 2005-02-22  0:37 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev

OK, the problem with the pauses is solved. I am REALLY sorry, but it was not the net code in Linux.
The SQL server hit an index-key collision when I added a second multi-column key. It seems the sort
buffer overflowed, which suddenly drove the CPU time very high. I will contact the MySQL developers
and ask them about it. The pause was due to the table lock MySQL takes for every update.

The problem with the slowdown at many sockets still exists; that one isn't solved yet. I hope it
isn't my fault as well, or else I'll feel forced to donate 1000 dollars to some open-source
foundation. *grin*

Francois Romieu wrote:
> Christian Schmid <webmaster@rapidforum.com> :
> 
>>It suddenly appeared again. There you go..........
> 
> 
> Thanks. I'll do some graphics tomorrow to be sure but the slabs do not seem
> wrong. vmstat output looks weird:
> 
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>  2  3      0   8496  25236 7941848    0    0 37920     0 7563  2788 13 19 34 33
>  2  2      0   9268  25172 7941300    0    0 36688     0 7424  2814 15 19 40 26
>  1  0      0  19576  25264 7928356    0    0  9468 13080 8072   607 22 13 59  6
>  1  0      0  18052  25264 7928356    0    0     0     0 7975    40 18  7 75  0
>  1  0      0  17660  25264 7928356    0    0     0     0 7487    38 21  4 75  0
>  1  0      0  18560  25264 7928356    0    0     0     0 6500    44 22  3 75  0
>  1  0      0  20072  25264 7928356    0    0     0     0 5834    44 23  2 75  0
>  1  3      0  21516  25320 7928300    0    0     0  3408 6796   153 24  3 58 15
>  0  4      0  10596  25436 7942056    0    0 44084  2220 11226  4282 12 10 31 47
>  2  2      0   9324  25240 7943952    0    0 39292     0 8433  3212  9 13 39 38
>  4  1      0  11596  25300 7941580    0    0 35820     0 7945  4306 17 21 30 32
>  0  5      0  13208  25560 7939280    0    0 40684  6456 7920  4081 19 18 32 31
>  4  1      0  12620  24944 7859724    0    0 32204   272 7306  2304 12 28 27 34
>  1  3      0  64964  24852 7888240    0    0 44944    96 7314  2631 19 31 24 27
> 
> ???
> 
> Since you have a lot of cpu, could you "strace -f -T -o /tmp/nitz -p xyz" one
> or two of your perl processes when they hang ?
> 
> If you do not have too many processes, monitoring "echo t > /proc/sysrq-trigger"
> for some time could tell what the system is waiting for.
> 
> --
> Ueimor
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2005-02-22  0:37 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-02-21 20:04 ARGH MORE BUGS!!! Christian Schmid
2005-02-21 20:19 ` Matthias-Christian Ott
2005-02-21 20:25   ` Christian Schmid
2005-02-21 20:28 ` Francois Romieu
2005-02-21 20:34   ` Christian Schmid
2005-02-21 20:56     ` Francois Romieu
2005-02-21 21:17       ` Christian Schmid
2005-02-21 21:36         ` Francois Romieu
2005-02-21 22:10           ` Christian Schmid
2005-02-21 23:03           ` Christian Schmid
2005-02-22  0:23             ` argh more bugs!!! Francois Romieu
2005-02-22  0:37               ` Christian Schmid
2005-02-22  0:10           ` ARGH MORE BUGS!!! Christian Schmid
2005-02-21 21:18       ` Christian Schmid
