public inbox for linux-kernel@vger.kernel.org
From: Steffen Persvold <sp@scali.com>
To: mingo@elte.hu
Cc: Jens Axboe <axboe@suse.de>, lkml <linux-kernel@vger.kernel.org>
Subject: Re: Short question regarding generic_make_request()
Date: Mon, 04 Feb 2002 09:57:02 +0100	[thread overview]
Message-ID: <3C5E4CDE.9B60F077@scali.com> (raw)
In-Reply-To: <Pine.LNX.4.33.0202040130200.19055-100000@localhost.localdomain>

Ingo Molnar wrote:
> 
> On Sun, 3 Feb 2002, Steffen Persvold wrote:
> 
> > Ok, the reason I'm asking is that I receive a request from a remote
> > machine on interrupt level (tasklet) and want to submit this to the
> > local device. The reason I'm using a tasklet instead of a kernel
> > thread is that somewhere between RedHat's 2.4.3-12 and 2.4.9-12
> > kernels the latency of waking up a kernel thread increased (using a
> > semaphore method similar to the one used in loop.c). I don't know why
> > this happened, but I guess that if I still could use a kernel thread
> > there wouldn't be any problems using generic_make_request().
> 
> you really want a kernel thread for this. The wakeup latency of a kernel
> thread is on the order of 2-3 usecs (context switch overhead included),
> nothing compared to usual block IO costs.
> 
> you say that the latency of waking up a kernel thread has increased - by
> how much?
> 

Well, I might be analyzing it wrong, but the same driver that I'm going to use for this shared-disk
stuff can also be enabled for ethernet emulation. The reason I say that the latency of waking
up a kernel thread increased somewhere between RedHat's 2.4.3 and 2.4.9 kernels is that with the
packet receive handler running in a tasklet I get these ping-pong numbers (measured with /bin/ping):

[root@damd1 root]# ping sci4
PING sci4 (192.168.4.4) from 192.168.4.3 : 56(84) bytes of data.
64 bytes from sci4 (192.168.4.4): icmp_seq=0 ttl=255 time=238 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=1 ttl=255 time=176 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=2 ttl=255 time=200 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=3 ttl=255 time=177 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=4 ttl=255 time=172 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=5 ttl=255 time=156 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=6 ttl=255 time=160 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=7 ttl=255 time=177 usec
64 bytes from sci4 (192.168.4.4): icmp_seq=8 ttl=255 time=173 usec

For simplicity I'll call it ~200 usec. When I change the receive handler from a tasklet to a kernel
thread, the numbers look like this:

[root@damd1 root]# ping sci4
PING sci4 (192.168.4.4) from 192.168.4.3 : 56(84) bytes of data.
64 bytes from sci4 (192.168.4.4): icmp_seq=0 ttl=255 time=4.215 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=1 ttl=255 time=5.728 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=2 ttl=255 time=3.825 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=3 ttl=255 time=6.521 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=4 ttl=255 time=5.700 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=5 ttl=255 time=5.666 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=6 ttl=255 time=6.495 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=7 ttl=255 time=5.631 msec
64 bytes from sci4 (192.168.4.4): icmp_seq=8 ttl=255 time=6.480 msec


A bit up and down, but all of them are in the msec range, which means the ping-pong latency has
increased by a factor of 20-30.

This might of course be related to the network stack (i.e. it doesn't like netif_rx() being called
from process context, or there's just a high turnaround time when sending the response), but 2.4.3-12
didn't behave like this. The strange thing is that when I run a network benchmark such as netperf, I
get nice numbers for UDP bandwidth (one-way traffic):

(with kernel thread):
UDP UNIDIRECTIONAL SEND TEST to sci4
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   MBytes/sec

262144   32768   10.00       55958      0     174.92
262144           10.00       55925            174.82

(with tasklet):
UDP UNIDIRECTIONAL SEND TEST to sci4
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   MBytes/sec

262144   32768   10.01       53648      0     167.55
262144           10.01       53648            167.55



The TCP stack, however, seems to like tasklets better:

(with kernel thread):
TCP STREAM TEST to sci4
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    MBytes/sec  

262144 262144 262144    10.02      16.24   

(with tasklet):
TCP STREAM TEST to sci4
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    MBytes/sec  

262144 262144 262144    10.00     116.00   

Regards,
-- 
  Steffen Persvold   | Scalable Linux Systems |   Try out the world's best
 mailto:sp@scali.com |  http://www.scali.com  | performing MPI implementation:
Tel: (+47) 2262 8950 |   Olaf Helsets vei 6   |      - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 |   N0621 Oslo, NORWAY   | >320MBytes/s and <4uS latency

Thread overview: 11+ messages
2002-02-03 13:31 Short question regarding generic_make_request() Steffen Persvold
2002-02-03 15:33 ` Ingo Molnar
2002-02-03 13:39   ` Jens Axboe
2002-02-03 22:27     ` Steffen Persvold
2002-02-04  0:32       ` Ingo Molnar
2002-02-04  8:57         ` Steffen Persvold [this message]
2002-02-04 12:16           ` Ingo Molnar
2002-02-04 11:56             ` Steffen Persvold
2002-02-04 12:01             ` Steffen Persvold
2002-02-03 22:18   ` Steffen Persvold
2002-02-04  0:24     ` Ingo Molnar
