From: Eric Dumazet
To: David Miller
Cc: andi@firstfloor.org, netdev@vger.kernel.org
Subject: Re: [PATCH] net: release skb->dst in sock_queue_rcv_skb()
Date: Wed, 26 Nov 2008 08:39:30 +0100
Message-ID: <492CFD32.7020109@cosmosbay.com>
In-Reply-To: <20081125.180404.157859705.davem@davemloft.net>
References: <492B8274.6080609@cosmosbay.com> <20081124.210038.90767194.davem@davemloft.net> <492C919E.3050108@cosmosbay.com> <20081125.180404.157859705.davem@davemloft.net>

David Miller wrote:
> From: Eric Dumazet
> Date: Wed, 26 Nov 2008 01:00:30 +0100
>
>> In the meantime, what do you think of the following patch?
>>
>> [PATCH] net: release skb->dst in sock_queue_rcv_skb()
>>
>> When queuing an skb to sk->sk_receive_queue, we can release its dst,
>> which is no longer needed.
>> Since the current cpu did the dst_hold(), the refcount is probably
>> still hot in this cpu's caches.
>>
>> This avoids readers having to access the original dst to decrement its
>> refcount, possibly a long time after packet reception. This should
>> speed up the UDP and RAW receive paths.
>>
>> Signed-off-by: Eric Dumazet
>
> I guess the idea is that if we release quickly we'll not have
> to reget the cacheline in owned state.

Yes, this is the idea.

> I wonder if this might actually slightly hurt loads like tbench where
> we are banging on the refcnt constantly on every cpu anyways.

Yes, the only way to reduce the load in that case is to avoid moving the
increments/decrements around: it could help on some machines and hurt on
others. In the long run, only reducing the number of cache-line dirtyings
can help the average case.

> Can you do a quick check?

Sure I can do a check :)

No impact at all on tbench, unless this bench can run in UDP mode :)

CPU: Core 2, speed 3000.1 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
unit mask of 0x00 (Unhalted core cycles) count 100000
samples  cum. samples  %        cum. %   symbol name
599983   599983        11.2948  11.2948  copy_from_user
549349   1149332       10.3416  21.6364  ipt_do_table
245697   1395029        4.6253  26.2617  copy_to_user
221663   1616692        4.1729  30.4346  schedule
144888   1761580        2.7275  33.1621  tcp_sendmsg
136967   1898547        2.5784  35.7406  tcp_ack
115513   2014060        2.1746  37.9151  tcp_transmit_skb
99517    2113577        1.8734  39.7885  sysenter_past_esp
99438    2213015        1.8719  41.6605  ip_queue_xmit
98171    2311186        1.8481  43.5086  tcp_recvmsg
80763    2391949        1.5204  45.0290  __switch_to
79621    2471570        1.4989  46.5278  tcp_v4_rcv
77210    2548780        1.4535  47.9813  dst_release
69387    2618167        1.3062  49.2876  tcp_rcv_established
64709    2682876        1.2182  50.5057  __tcp_push_pending_frames
55223    2738099        1.0396  51.5453  lock_sock_nested
53754    2791853        1.0119  52.5572  sys_socketcall
50092    2841945        0.9430  53.5002  netif_receive_skb
49499    2891444        0.9318  54.4321  release_sock
47796    2939240        0.8998  55.3318  __inet_lookup_established
45162    2984402        0.8502  56.1820  update_curr
44895    3029297        0.8452  57.0272  ip_rcv
42945    3072242        0.8084  57.8356  dev_queue_xmit
42892    3115134        0.8075  58.6431  tcp_event_data_recv
42768    3157902        0.8051  59.4482  local_bh_enable
41555    3199457        0.7823  60.2305  netif_rx
38613    3238070        0.7269  60.9574  __alloc_skb
38016    3276086        0.7157  61.6730  ip_finish_output
36867    3312953        0.6940  62.3671  tcp_current_mss
36759    3349712        0.6920  63.0591  skb_release_data
35560    3385272        0.6694  63.7285  local_bh_enable_ip
34100    3419372        0.6419  64.3704  sock_recvmsg
33829    3453201        0.6368  65.0073  __kfree_skb
32949    3486150        0.6203  65.6275  sched_clock_cpu
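
For readers following the thread, the change being discussed amounts to
dropping the dst reference at queue time, while its refcount cache line is
still hot on the cpu that took the reference. Below is a minimal sketch of
that idea against the 2.6.27-era core, where skb->dst is still a plain
pointer on struct sk_buff. The rcvbuf accounting and sk_filter() checks at
the top of sock_queue_rcv_skb() are elided, and this is an illustration of
the idea, not the exact hunk posted in the referenced message.

#include <net/sock.h>
#include <net/dst.h>

int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
{
	int skb_len;

	/* ... rcvbuf accounting, sk_filter() and sk_rmem_schedule()
	 * checks unchanged and omitted from this sketch ... */

	skb->dev = NULL;
	skb_set_owner_r(skb, sk);

	/*
	 * Release the dst right now: the current cpu did the dst_hold(),
	 * so the refcount is probably still hot in its caches.  Without
	 * this, the socket reader pays for the atomic decrement much
	 * later, on a possibly cold (or remote) cache line.
	 */
	dst_release(skb->dst);
	skb->dst = NULL;

	/* Cache the length before queuing: once on sk_receive_queue the
	 * skb may be freed at any time by a concurrent reader. */
	skb_len = skb->len;

	skb_queue_tail(&sk->sk_receive_queue, skb);

	if (!sock_flag(sk, SOCK_DEAD))
		sk->sk_data_ready(sk, skb_len);

	return 0;
}

The profile above shows dst_release() at 1.45% even on a pure TCP load, so
moving the release to the queuing side only changes who pays for it; the
benefit is expected on UDP/RAW receive, where the reader otherwise touches
a dst that went cold long after packet reception.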