public inbox for linux-rt-users@vger.kernel.org
From: "Jürgen Mell" <mell@hedrich-winders.com>
To: Sujit K M <sjt.kar@gmail.com>
Cc: linux-rt-users@vger.kernel.org
Subject: Re: Problem with function select on kernel 2.6.29.6-rt23
Date: Mon, 21 Sep 2009 11:58:00 +0200
Message-ID: <4AB74E28.7020901@hedrich-winders.com>
In-Reply-To: <921ca19c0909210223x74d371a9g720986748b9a4ffc@mail.gmail.com>

No, I do not think that this is intentional. A few lines further down,
the same man page says:

"Some code calls select() with all three sets empty, n zero, and a
non-NULL timeout as a fairly portable way to sleep with subsecond
precision."

That would make no sense if I had to call select several times to get
the full delay period; the overhead of calling the function repeatedly
is significant. Following your proposal, I modified the test program to
run the loop 2000 times with a delay of 10000 us (nominally 20 seconds
in total). Depending on the speed of the computer, the total run time
is between 22 and 24 seconds.

As I understand it, the timeout argument of select is updated when
select returns because one of the monitored file descriptors is ready
for the selected operation.

I have now tested this with kernel 2.6.31-rt11 and ran into a new
problem: select no longer returns prematurely, but each second of
computer time now takes about three seconds in reality (the computer
clock runs extremely slowly). NTP is running.

Somehow fiddling with NTP causes very strange side effects...

Bye,
           Jürgen

Sujit K M schrieb:
> This seems to be normal behavior.
>
> As quoted from
>
> http://linux.die.net/man/2/select
>
> (ii)
> select() may update the timeout argument to indicate how much time was
> left. pselect() does not change this argument.
>
>
>
> On Sun, Sep 20, 2009 at 3:50 PM, Sujit K M <sjt.kar@gmail.com> wrote:
>   
>> Hi,
>>
>> First, I would like you to check what happens to the program when
>> the loop count is increased, for example to 1,000, 10,000, 100,000,
>> 1 million, or 10 million. Does the elapsed time grow accordingly?
>> Try plotting the difference against the actual start time, using a
>> scripting language such as Tcl/Tk.
>>
>> There is some information about the select system call that I think
>> pertains to this problem: http://linux.die.net/man/2/syscalls
>> Basically, it is an optimization that the current kernels look into.
>>
>> Thanks,
>> Sujit
>>
>> On Sat, Sep 19, 2009 at 1:10 AM, Jürgen Mell <mell@hedrich-winders.com> wrote:
>>     
>>> Meanwhile I have dug a little deeper into this problem. The problem
>>> occurs under the following conditions:
>>> - the BIOS clock must be slow
>>> - the NTP daemon is used to adjust the system time
>>> The problem can be reproduced on real hardware as well as on a virtual
>>> machine running under VMware. Set the BIOS clock back about ten minutes
>>> against the 'real' time. Then start the NTP daemon and then run the
>>> little test program:
>>>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <time.h>
>>> #include <errno.h>
>>> #include <sys/select.h>
>>>
>>> int main(int argc, char *argv[])
>>> {
>>>   time_t t;
>>>   struct timeval timeout;
>>>   int i;
>>>   int ret;
>>>
>>>   t = time (NULL);
>>>   printf ("Current time before = %s", ctime (&t));
>>>
>>>   for (i = 0; i < 20; i++)
>>>   {
>>>      timeout.tv_sec  = 1;
>>>      timeout.tv_usec = 0;
>>>
>>>      if ((ret = select (FD_SETSIZE, NULL, NULL, NULL, &timeout)) < 0)
>>>      {
>>>         printf ("select returned %d, errno = %d\n", ret, errno);
>>>         return EXIT_FAILURE;
>>>      }
>>>   }
>>>   t = time (NULL);
>>>   printf ("Current time after = %s", ctime (&t));
>>>   return EXIT_SUCCESS;
>>> }
>>>
>>> On a virtual machine under VMware I got the following result after some
>>> minutes of system run time:
>>>
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:05:51 2009
>>> Current time after = Fri Sep 18 20:06:11 2009
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:14:29 2009
>>> Current time after = Fri Sep 18 20:14:33 2009
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:14:57 2009
>>> Current time after = Fri Sep 18 20:14:57 2009
>>> hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
>>> Current time before = Fri Sep 18 20:15:20 2009
>>> Current time after = Fri Sep 18 20:15:40 2009
>>> hws@cwc-vmware:/home/hws >
>>>
>>> Normally, the time distance between 'before' and 'after' should be 20
>>> seconds as in the first and last run of the program. For the second run
>>> the time difference is only 4 seconds and for the third run it is even zero.
>>>
>>> On the real hardware I have also some other time-related issues when the
>>> problem occurs. Keyboard input will often 'bounce' - key presses are
>>> detected two or more times and some delay times are prolonged (!). I
>>> could not yet reproduce this in the virtual machine.
>>>
>>> The problem will not always occur immediately after the system is
>>> started but it may take several minutes until the first effects occur. I
>>> have not tested this issue with other kernels yet but I will do so
>>> during the weekend.
>>>
>>> Are there any ideas what to do about this (besides buying a better BIOS
>>> clock)? I would really like to have the NTP daemon running to keep the
>>> system time accurate, but somehow it seems to affect wait queues in the
>>> kernel pretty badly.
>>>
>>> Bye,
>>>           Jürgen
>>>
>>> Jürgen Mell schrieb:
>>>       
>>>> Hi,
>>>>
>>>> I have an application which connects via a network socket to a server
>>>> running on the same machine (IP 127.0.0.1) This application uses the
>>>> function 'select' to wait for new data from the server or until a two
>>>> seconds timeout. This works well until there is network traffic on the
>>>> external network interfaces (eth* or WLAN). When there is network
>>>> traffic on the external interfaces, the select function does not wait
>>>> anymore but returns with a return code of zero, indicating that no
>>>> data is available on the socket. This happens nearly immediately
>>>> (after 8 to 9 microseconds) and not after the specified two-second
>>>> interval. The timeout parameter of select is updated accordingly (it
>>>> shows e.g. 1 s 999991 us).
>>>> Up to now I could not test this with another kernel but I will try to
>>>> do it this afternoon. Are there any known problems with select? Is
>>>> there any way to circumvent this?
>>>>
>>>> Any help would be greatly appreciated!
>>>>
>>>>        Jürgen
>>>>
>>>>         
>>> --
>>> Jürgen Mell (Software-Entwicklung)       mell@hedrich-winders.com
>>> Tel.:  +49-511-762-18226                 http://www.hedrich-winding.com
>>> FAX :  +49-511-762-18225
>>> Mobil: +49-160-7428156
>>> ----------------------------------------------------------------------------
>>> HEDRICH winding systems GmbH
>>> An der Universität 2 (im PZH)
>>> D-30823 Garbsen (GERMANY)
>>> ----------------------------------------------------------------------------
>>> Geschäftsführer: Karsten Adam
>>> Handelsregister: Wetzlar, HRB 4768
>>> Steuernr.: 020/235/20110                 USt-IdNr.: DE 258258279
>>> ----------------------------------------------------------------------------
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>       
>>
>> --
>> -- Sujit K M
>>
>> blog(http://kmsujit.blogspot.com/)
>>


-- 
Jürgen Mell (Software-Entwicklung)       mell@hedrich-winders.com
Tel.:  +49-511-762-18226                 http://www.hedrich-winding.com
FAX :  +49-511-762-18225
Mobil: +49-160-7428156
----------------------------------------------------------------------------
HEDRICH winding systems GmbH
An der Universität 2 (im PZH)
D-30823 Garbsen (GERMANY)
----------------------------------------------------------------------------
Geschäftsführer: Karsten Adam
Handelsregister: Wetzlar, HRB 4768
Steuernr.: 020/235/20110                 USt-IdNr.: DE 258258279
---------------------------------------------------------------------------- 



Thread overview: 8+ messages
2009-09-10 10:47 Problem with function select on kernel 2.6.29.6-rt23 Jürgen Mell
2009-09-10 11:33 ` Sujit K M
2009-09-10 11:50   ` Jürgen Mell
2009-09-18 19:40 ` Jürgen Mell
2009-09-20 10:20   ` Sujit K M
2009-09-21  9:23     ` Sujit K M
2009-09-21  9:58       ` Jürgen Mell [this message]
2009-09-21 10:34         ` Jürgen Mell
