public inbox for linux-rt-users@vger.kernel.org
* Problem with function select on kernel  2.6.29.6-rt23
@ 2009-09-10 10:47 Jürgen Mell
  2009-09-10 11:33 ` Sujit K M
  2009-09-18 19:40 ` Jürgen Mell
  0 siblings, 2 replies; 8+ messages in thread
From: Jürgen Mell @ 2009-09-10 10:47 UTC (permalink / raw)
  To: linux-rt-users

Hi,

I have an application which connects via a network socket to a server
running on the same machine (IP 127.0.0.1). This application uses the
function 'select' to wait for new data from the server or for a
two-second timeout. This works well until there is network traffic on the
external network interfaces (eth* or WLAN). When there is such traffic,
select no longer waits but returns with a return code of zero, indicating
that no data is available on the socket. This happens almost immediately
(after 8 to 9 microseconds) and not after the specified two-second
interval. The timeout parameter of select is updated accordingly (it
shows e.g. 1 s 999991 us).
So far I have not been able to test this with another kernel, but I will
try to do so this afternoon. Are there any known problems with select? Is
there any way to work around this?

Any help would be greatly appreciated!

        Jürgen

-- 
Jürgen Mell (Software Development)       mell@hedrich-winders.com
Tel.:  +49-511-762-18226                 http://www.hedrich-winding.com
FAX :  +49-511-762-18225
Mobil: +49-160-7428156
----------------------------------------------------------------------------
HEDRICH winding systems GmbH
An der Universität 2 (im PZH)
D-30823 Garbsen (GERMANY)
----------------------------------------------------------------------------
Managing Director: Karsten Adam
Commercial register: Wetzlar, HRB 4768
Tax no.: 020/235/20110                   VAT ID no.: DE 258258279
---------------------------------------------------------------------------- 


--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Problem with function select on kernel 2.6.29.6-rt23
  2009-09-10 10:47 Problem with function select on kernel 2.6.29.6-rt23 Jürgen Mell
@ 2009-09-10 11:33 ` Sujit K M
  2009-09-10 11:50   ` Jürgen Mell
  2009-09-18 19:40 ` Jürgen Mell
  1 sibling, 1 reply; 8+ messages in thread
From: Sujit K M @ 2009-09-10 11:33 UTC (permalink / raw)
  To: mell, linux-rt-users

Could you check what kind of load your server machine is able to take?
A site worth looking at is
http://www.petefreitag.com/item/689.cfm.
I think the network may be rejecting the select call because the socket
is not getting created. Otherwise it would wait for the two-second limit
you set.
Also, are you sure about the bind part of the socket creation? Making it
threaded is an option.

Thanks,
Sujit

-- 
-- Sujit K M

blog(http://kmsujit.blogspot.com/)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Problem with function select on kernel 2.6.29.6-rt23
  2009-09-10 11:33 ` Sujit K M
@ 2009-09-10 11:50   ` Jürgen Mell
  0 siblings, 0 replies; 8+ messages in thread
From: Jürgen Mell @ 2009-09-10 11:50 UTC (permalink / raw)
  To: Sujit K M, linux-rt-users

Thanks for your reply! The machine is not heavily loaded (CPU load
below 30%). ANY network traffic triggers the problem, even traffic that
is not directed to the server process on this machine, e.g. a copy
operation which copies the Linux kernel source from an SMB server to the
local disk.
The application program has established the socket connection, and 'bind'
completed without problems. It then runs in a loop where it calls
'select' to wait for new data, reads the data if any is available,
processes it, and then repeats these steps.
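
The wait/read/process loop described above can be sketched roughly as follows. This is a minimal illustration, not the actual application code: the helper names wait_readable and receive_loop, the buffer size, and the choice to restart the full timeout after EINTR are all assumptions. It rebuilds the fd_set and the timeout before every call, because select() on Linux may modify both.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error (errno is set). */
static int wait_readable(int fd, long sec, long usec)
{
    fd_set readfds;
    struct timeval timeout;
    int ret;

    do {
        /* select() may modify both arguments, so rebuild them each time. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        timeout.tv_sec = sec;
        timeout.tv_usec = usec;
        ret = select(fd + 1, &readfds, NULL, NULL, &timeout);
    } while (ret < 0 && errno == EINTR);   /* assumption: restart on signals */

    if (ret < 0)
        return -1;
    return ret > 0 ? 1 : 0;
}

/* Hypothetical receive loop: wait up to two seconds, read, process, repeat.
 * Returns the number of bytes read in total, or -1 on a select() error. */
static long receive_loop(int fd, int max_rounds)
{
    char buf[4096];
    long total = 0;

    for (int i = 0; i < max_rounds; i++) {
        int ready = wait_readable(fd, 2, 0);
        if (ready < 0)
            return -1;
        if (ready == 0)
            continue;               /* timeout: no data within two seconds */
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            break;                  /* peer closed the connection or read failed */
        total += n;                 /* placeholder for the real processing */
    }
    return total;
}
```

With a loop like this, a return value of 0 from select() should only be seen after the full two seconds, which is why a return after a few microseconds looks wrong.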

Bye,
          Jürgen



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Problem with function select on kernel  2.6.29.6-rt23
  2009-09-10 10:47 Problem with function select on kernel 2.6.29.6-rt23 Jürgen Mell
  2009-09-10 11:33 ` Sujit K M
@ 2009-09-18 19:40 ` Jürgen Mell
  2009-09-20 10:20   ` Sujit K M
  1 sibling, 1 reply; 8+ messages in thread
From: Jürgen Mell @ 2009-09-18 19:40 UTC (permalink / raw)
  To: linux-rt-users

In the meantime I have dug a little deeper into this problem. It
occurs under the following conditions:
- the BIOS clock must be slow
- the NTP daemon is used to adjust the system time
The problem can be reproduced on real hardware as well as in a virtual
machine running under VMware. Set the BIOS clock about ten minutes
behind the 'real' time, start the NTP daemon, and then run this
little test program:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <errno.h>
#include <sys/select.h>

int main(void)
{
   time_t t;
   struct timeval timeout;
   int i;
   int ret;

   /* Wall-clock time before the 20 sleeps */
   t = time (NULL);
   printf ("Current time before = %s", ctime (&t));

   for (i = 0; i < 20; i++)
   {
      /* Reset the timeout on every pass: Linux select() may modify it */
      timeout.tv_sec  = 1;
      timeout.tv_usec = 0;

      /* No file descriptors are watched, so this is a pure one-second sleep */
      if ((ret = select (FD_SETSIZE, NULL, NULL, NULL, &timeout)) < 0)
      {
         printf ("select returned %d, errno = %d\n", ret, errno);
         return EXIT_FAILURE;
      }
   }
   /* Wall-clock time afterwards; about 20 seconds should have passed */
   t = time (NULL);
   printf ("Current time after = %s", ctime (&t));
   return EXIT_SUCCESS;
}

On a virtual machine under VMware I got the following result after some
minutes of system run time:

hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:05:51 2009
Current time after = Fri Sep 18 20:06:11 2009
hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:14:29 2009
Current time after = Fri Sep 18 20:14:33 2009
hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:14:57 2009
Current time after = Fri Sep 18 20:14:57 2009
hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:15:20 2009
Current time after = Fri Sep 18 20:15:40 2009
hws@cwc-vmware:/home/hws >

Normally, the time difference between 'before' and 'after' should be 20
seconds, as in the first and last run of the program. For the second run
the difference is only 4 seconds, and for the third run it is even zero.

On the real hardware I also see some other time-related issues when the
problem occurs. Keyboard input will often 'bounce' (key presses are
detected two or more times), and some delay times are prolonged (!). I
have not yet been able to reproduce this in the virtual machine.

The problem will not always occur immediately after the system is
started but it may take several minutes until the first effects occur. I
have not tested this issue with other kernels yet but I will do so
during the weekend.

Are there any ideas what to do about this (besides buying a better BIOS
clock)? I would really like to have the NTP daemon running to keep the
system time accurate, but somehow it seems to affect wait queues in the
kernel pretty badly.

Bye,
           Jürgen


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Problem with function select on kernel 2.6.29.6-rt23
  2009-09-18 19:40 ` Jürgen Mell
@ 2009-09-20 10:20   ` Sujit K M
  2009-09-21  9:23     ` Sujit K M
  0 siblings, 1 reply; 8+ messages in thread
From: Sujit K M @ 2009-09-20 10:20 UTC (permalink / raw)
  To: mell; +Cc: linux-rt-users

Hi,

First, I would like you to check what happens to the program when the
loop count is raised to something like 1,000, 10,000, 100,000, or even
1 million to 10 million. Does the measured time grow accordingly?
Try plotting the difference against the actual start time. A scripting
language such as Tcl/Tk could be used for that.

There is some information about the select system call that I think
pertains to this problem: http://linux.die.net/man/2/syscalls.
Basically, it is an optimization that current kernels look into.

Thanks,
Sujit


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Problem with function select on kernel 2.6.29.6-rt23
  2009-09-20 10:20   ` Sujit K M
@ 2009-09-21  9:23     ` Sujit K M
  2009-09-21  9:58       ` Jürgen Mell
  0 siblings, 1 reply; 8+ messages in thread
From: Sujit K M @ 2009-09-21  9:23 UTC (permalink / raw)
  To: mell; +Cc: linux-rt-users

This seems to be normal functionality.

As quoted from

http://linux.die.net/man/2/select

(ii)
select() may update the timeout argument to indicate how much time was
left. pselect() does not change this argument.
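
To see what the quoted passage means in practice, the following sketch calls select() once on a descriptor and reports what is left in the timeout structure afterwards. The function name select_and_report is hypothetical. On Linux the structure is updated to the time not slept, so it holds a positive remainder when data arrives early and zero after a genuine timeout; other systems may leave it unchanged.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Call select() once on fd with the given timeout in microseconds.
 * Returns select()'s result; *left_usec receives whatever remains in the
 * timeout structure afterwards (time not slept on Linux, unspecified
 * by POSIX on other systems). */
static int select_and_report(int fd, long timeout_usec, long *left_usec)
{
    fd_set readfds;
    struct timeval timeout;
    int ret;

    timeout.tv_sec = timeout_usec / 1000000L;
    timeout.tv_usec = timeout_usec % 1000000L;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    ret = select(fd + 1, &readfds, NULL, NULL, &timeout);
    *left_usec = (long)timeout.tv_sec * 1000000L + (long)timeout.tv_usec;
    return ret;
}
```

A return value of 0 together with almost the whole timeout still remaining, as reported at the start of this thread, is not what this update mechanism alone would produce.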




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Problem with function select on kernel 2.6.29.6-rt23
  2009-09-21  9:23     ` Sujit K M
@ 2009-09-21  9:58       ` Jürgen Mell
  2009-09-21 10:34         ` Jürgen Mell
  0 siblings, 1 reply; 8+ messages in thread
From: Jürgen Mell @ 2009-09-21  9:58 UTC (permalink / raw)
  To: Sujit K M; +Cc: linux-rt-users

No, I do not think that this is intentional. Some lines later, you will find

"Some code calls select() with all three sets empty, n zero, and a
non-NULL timeout as a fairly portable way to sleep with subsecond
precision."

That usage would make no sense if I had to call select several times to
get the full delay period. The overhead of calling the function several
times is significant. I have modified the test program according to your
proposal to run the loop 2000 times with a 10000 us delay and get,
depending on the speed of the computer, total times between 22 and 24
seconds.

I understand that the timeout argument of select is updated when select 
returns after one of the monitored file descriptors is ready for the 
selected operation.

I have now tested this issue with kernel 2.6.31-rt11 and got a new
problem: select no longer returns prematurely, but each second of
computer time now takes about three seconds in reality (the computer
clock is extremely slow). NTP is running.

Somehow fiddling with NTP causes very strange side effects...

Bye,
           Jürgen

Sujit K M schrieb:
> this seems to be normal functionality.
>
> As quoted from
>
> http://linux.die.net/man/2/select
>
> (ii)
> select() may update the timeout argument to indicate how much time was
> left. pselect() does not change this argument.
>
>
>
> On Sun, Sep 20, 2009 at 3:50 PM, Sujit K M <sjt.kar@gmail.com> wrote:
>   
>> Hi,
>>
>> One thing at the onset I would like you to check is that what happens
>> to the program when the loop
>> count is made more like 1000/10,000/100000 - 1 Million/10 Million.
>> Does the Time Graph Increase.
>> Try Plotting the Difference with actual time start. Try Making Use of
>> Some scripting language like TCL/TK.
>>
>> There is some info regarding the select system call. I think it is
>> pertaining to this problem.
>> http://linux.die.net/man/2/syscalls. Basically It is an Optimization
>> that the Current Kernels Look Into.
>>
>> Thanks,
>> Sujit
>>
>> On Sat, Sep 19, 2009 at 1:10 AM, Jürgen Mell <mell@hedrich-winders.com> wrote:
>>     
>>> Meanwhile I have dug a little deeper into this problem. The problem
>>> occurs under the following conditions:
>>> - the BIOS clock must be slow
>>> - the NTP daemon is used to adjust the system time
>>> The problem can be reproduced on real hardware as well as on a virtual
>>> machine running under VMware. Set the BIOS clock back about ten minutes
>>> against the 'real' time. Then start the NTP daemon and then run the
>>> little test program:
>>>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <time.h>
>>> #include <errno.h>
>>> #include <sys/select.h>
>>>
>>> int main(int argc, char *argv[])
>>> {
>>>   time_t t;
>>>   struct timeval timeout;
>>>   int i;
>>>   int ret;
>>>
>>>   t = time (NULL);
>>>   printf ("Current time before = %s", ctime (&t));
>>>
>>>   for (i = 0; i < 20; i++)
>>>   {
>>>      timeout.tv_sec  = 1;
>>>      timeout.tv_usec = 0;
>>>
>>>      if ((ret = select (FD_SETSIZE, NULL, NULL, NULL, &timeout)) < 0)
>>>      {
>>>         printf ("select returned %d, errno = %d\n", ret, errno);
>>>         return EXIT_FAILURE;
>>>      }
>>>   }
>>>   t = time (NULL);
>>>   printf ("Current time after = %s", ctime (&t));
>>>   return EXIT_SUCCESS;
>>> }
>>>
>>> On a virtual machine under VMware I got the following result after some
>>> minutes of system run time:
>>>
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:05:51 2009
>>> Current time after = Fri Sep 18 20:06:11 2009
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:14:29 2009
>>> Current time after = Fri Sep 18 20:14:33 2009
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:14:57 2009
>>> Current time after = Fri Sep 18 20:14:57 2009
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:15:20 2009
>>> Current time after = Fri Sep 18 20:15:40 2009
>>> hws@cwc-vmware:/home/hws >
>>>
>>> Normally, the time distance between 'before' and 'after' should be 20
>>> seconds as in the first and last run of the program. For the second run
>>> the time difference is only 4 seconds and for the third run it is even zero.
>>>
>>> On the real hardware I also see some other time-related issues when the
>>> problem occurs. Keyboard input often 'bounces' (key presses are detected
>>> two or more times) and some delay times are prolonged (!). I have not yet
>>> been able to reproduce this in the virtual machine.
>>>
>>> The problem will not always occur immediately after the system is
>>> started but it may take several minutes until the first effects occur. I
>>> have not tested this issue with other kernels yet but I will do so
>>> during the weekend.
>>>
>>> Are there any ideas what to do about this (besides buying a better BIOS
>>> clock)? I would really like to have the NTP daemon running to keep the
>>> system time accurate, but somehow it seems to affect wait queues in the
>>> kernel pretty badly.
>>>
>>> Bye,
>>>           Jürgen
>>>
>>> Jürgen Mell wrote:
>>>       
>>>> Hi,
>>>>
>>>> I have an application which connects via a network socket to a server
>>>> running on the same machine (IP 127.0.0.1). This application uses the
>>>> function 'select' to wait for new data from the server or until a
>>>> two-second timeout expires. This works well until there is network
>>>> traffic on the external network interfaces (eth* or WLAN). When there is
>>>> network traffic on the external interfaces, the select function no longer
>>>> waits but returns with a return code of zero, indicating that no data is
>>>> available on the socket. This happens almost immediately (after 8 to 9
>>>> microseconds) and not after the specified two-second interval. The
>>>> timeout parameter of select is updated accordingly (it shows e.g. 1 s
>>>> 999991 us).
>>>> Up to now I could not test this with another kernel but I will try to
>>>> do it this afternoon. Are there any known problems with select? Is
>>>> there any way to circumvent this?
>>>>
>>>> Any help would be greatly appreciated!
>>>>
>>>>        Jürgen
>>>>
>>>>         
>>
>> --
>> -- Sujit K M
>>
>> blog(http://kmsujit.blogspot.com/)
>>
>>     
>
>
>
>   
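
The quoted test program measures with time(), which has only one-second resolution and follows the wall clock that NTP adjusts. A minimal sketch, assuming a POSIX system that provides clock_gettime(), of timing a single select() sleep against CLOCK_MONOTONIC instead. The function names elapsed_us and timed_select_sleep are made up for illustration and are not part of the original test program:

```c
#define _POSIX_C_SOURCE 200112L

#include <stdio.h>
#include <sys/select.h>
#include <time.h>

/* Elapsed time between two clock readings, in microseconds. */
long long elapsed_us(const struct timespec *start, const struct timespec *end)
{
    long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
                 + (end->tv_nsec - start->tv_nsec);
    return ns / 1000;
}

/* Sleep for `usec` microseconds using select() without any descriptors.
 * Returns the measured duration in microseconds, or -1 on error
 * (including interruption by a signal). */
long long timed_select_sleep(long usec)
{
    struct timespec start, end;
    struct timeval timeout;

    /* Linux may modify the timeout, so it is set up on every call. */
    timeout.tv_sec  = usec / 1000000L;
    timeout.tv_usec = usec % 1000000L;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    if (select(0, NULL, NULL, NULL, &timeout) < 0)
        return -1;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    return elapsed_us(&start, &end);
}
```

Calling timed_select_sleep(1000000) inside the loop of the test program and printing the result would show, per iteration, whether select() returned early. Note that CLOCK_MONOTONIC is driven by the same kernel clocksource, so a clocksource that itself runs slow would not show up in this measurement.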






* Re: Problem with function select on kernel 2.6.29.6-rt23
  2009-09-21  9:58       ` Jürgen Mell
@ 2009-09-21 10:34         ` Jürgen Mell
  0 siblings, 0 replies; 8+ messages in thread
From: Jürgen Mell @ 2009-09-21 10:34 UTC (permalink / raw)
  To: Sujit K M; +Cc: linux-rt-users

The slow clock was caused by the kernel suspecting a defective ACPI PM 
timer. With that fixed, 2.6.31-rt11 has been running without problems so far.

       Jürgen
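
To see which clocksource the kernel has selected (for example, whether it moved away from acpi_pm), the sysfs files under /sys/devices/system/clocksource/clocksource0/ can be read. A minimal sketch, assuming a kernel that exposes these sysfs files. The helper names read_first_line and print_clocksources are made up for illustration, and this is not necessarily how the problem above was diagnosed:

```c
#include <stdio.h>
#include <string.h>

#define CLOCKSOURCE_DIR "/sys/devices/system/clocksource/clocksource0/"

/* Read the first line of a text file into buf, without the newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty. */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Print the clocksource in use and the ones the kernel offers.
 * Returns 0 on success, -1 if either sysfs file is unreadable. */
int print_clocksources(void)
{
    char line[256];

    if (read_first_line(CLOCKSOURCE_DIR "current_clocksource",
                        line, sizeof line) != 0)
        return -1;
    printf("current clocksource:    %s\n", line);

    if (read_first_line(CLOCKSOURCE_DIR "available_clocksource",
                        line, sizeof line) != 0)
        return -1;
    printf("available clocksources: %s\n", line);
    return 0;
}
```

Calling print_clocksources() from main() prints both lines. The kernel also accepts a clocksource= boot parameter to select one explicitly.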

Jürgen Mell wrote:
> No, I do not think that this is intentional. A few lines further down, 
> you will find:
>
> "Some code calls select() with all three sets empty, n zero, and a 
> non-NULL timeout as a fairly portable way to sleep with subsecond 
> precision."
>
> This would make no sense if I had to call select several times to get 
> the full delay period. The overhead of calling the function several 
> times is significant. I have modified the test program according to 
> your proposal to run the loop 2000 times with a 10000 us delay and 
> get, depending on the speed of the computer, total times between 22 
> and 24 seconds.
>
> I understand that the timeout argument of select is updated when 
> select returns after one of the monitored file descriptors is ready 
> for the selected operation.
>
> I have now tested this issue with kernel 2.6.31-rt11 and got a new 
> problem: this time select no longer returns prematurely, but now each 
> second of computer time takes about three seconds in reality (the 
> computer clock is extremely slow). NTP is running.
>
> Somehow fiddling with NTP causes very strange side effects...
>
> Bye,
>           Jürgen
>
> Sujit K M wrote:
>> This seems to be normal functionality.
>>
>> As quoted from
>>
>> http://linux.die.net/man/2/select
>>
>> (ii)
>> select() may update the timeout argument to indicate how much time was
>> left. pselect() does not change this argument.
>>
>>
>>
>> On Sun, Sep 20, 2009 at 3:50 PM, Sujit K M <sjt.kar@gmail.com> wrote:
>>  
>>> Hi,
>>>
>>> One thing I would like you to check at the outset is what happens to
>>> the program when the loop count is increased, e.g. to 1,000, 10,000,
>>> 100,000, 1 million or 10 million. Does the time graph increase? Try
>>> plotting the difference against the actual start time. Try making use
>>> of a scripting language such as Tcl/Tk.
>>>
>>> There is some information about the select system call which I think
>>> pertains to this problem: http://linux.die.net/man/2/syscalls.
>>> Basically, it is an optimization that current kernels look into.
>>>
>>> Thanks,
>>> Sujit
>
>






end of thread, other threads:[~2009-09-21 10:34 UTC | newest]

Thread overview: 8+ messages
2009-09-10 10:47 Problem with function select on kernel 2.6.29.6-rt23 Jürgen Mell
2009-09-10 11:33 ` Sujit K M
2009-09-10 11:50   ` Jürgen Mell
2009-09-18 19:40 ` Jürgen Mell
2009-09-20 10:20   ` Sujit K M
2009-09-21  9:23     ` Sujit K M
2009-09-21  9:58       ` Jürgen Mell
2009-09-21 10:34         ` Jürgen Mell
