From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Miller
Subject: inetpeer with create==0
Date: Wed, 02 Mar 2011 20:45:45 -0800 (PST)
Message-ID: <20110302.204545.193730647.davem@davemloft.net>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org
To: eric.dumazet@gmail.com
Received: from 74-93-104-97-Washington.hfc.comcastbusiness.net ([74.93.104.97]:49786
	"EHLO sunset.davemloft.net" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S932144Ab1CCEpI (ORCPT );
	Wed, 2 Mar 2011 23:45:08 -0500
Sender: netdev-owner@vger.kernel.org

Eric,

I was profiling the non-routing-cache case and something that stuck
out is the case of calling inet_getpeer() with create==0.

If an entry is not found, we have to redo the lookup under a spinlock
to make certain that a concurrent writer rebalancing the tree does
not "hide" an existing entry from us.

This makes the case of a create==0 lookup for a not-present entry
really expensive. It is on the order of 600 cpu cycles on my
Niagara2.

I added a hack to not do the relookup under the lock when create==0
and it now costs less than 300 cycles.

This is now a pretty common operation with the way we handle COW'd
metrics, so I think it's definitely worth optimizing.

I looked at the generic radix tree implementation, and it supports
full RCU lookups in parallel with insert/delete. It handles the race
case without the relookup under lock because it creates fixed paths
to "slots" where nodes live using shifts and masks. So if a path
to a slot ever existed, it will always exist.

Take a look at lib/radix-tree.c and include/linux/radix-tree.h if
you are curious.

I think we should do something similar for inetpeer.

Currently we cannot just use the existing generic radix-tree code
because it only supports indexes as large as "unsigned long" and we
need to handle 128-bit ipv6 addresses.
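
The fixed-path idea above can be illustrated with a small userspace
sketch. It is not the lib/radix-tree.c code and not a proposed inetpeer
patch: the names trie_node, key_index, trie_lookup and trie_insert are
hypothetical, the 4-bit fanout is an arbitrary choice, and there is no
locking, deletion or RCU here. The point is only that the slot for a
128-bit key is derived from the key with shifts and masks, so an insert
never moves an existing node, unlike a rebalancing tree.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KEY_BYTES      16                       /* 128-bit IPv6 address */
#define BITS_PER_LEVEL 4                        /* assumed fanout: 16 slots */
#define FANOUT         (1u << BITS_PER_LEVEL)
#define LEVELS         (KEY_BYTES * 8 / BITS_PER_LEVEL)   /* 32 levels */

/* Interior levels hold child nodes; the last level holds the items. */
struct trie_node {
    void *slot[FANOUT];
};

/* Pick the 4-bit index for a level: level 0 is the high nibble of byte 0. */
static unsigned int key_index(const uint8_t key[KEY_BYTES], unsigned int level)
{
    unsigned int shift = (level % 2 == 0) ? 4 : 0;

    return (key[level / 2] >> shift) & (FANOUT - 1);
}

/*
 * Walk the fixed path for this key. A NULL slot anywhere on the path
 * means "not present". In kernel code each load would presumably go
 * through rcu_dereference(); that is an assumption, not shown here.
 */
void *trie_lookup(const struct trie_node *root, const uint8_t key[KEY_BYTES])
{
    const void *p = root;
    unsigned int level;

    for (level = 0; level < LEVELS && p != NULL; level++)
        p = ((const struct trie_node *)p)->slot[key_index(key, level)];

    return (void *)p;
}

/*
 * Create any missing nodes on the path, then store the item in the
 * final slot. Existing nodes are never moved or relinked, so a path
 * that existed before the insert still exists after it.
 * Returns 0 on success, -1 if allocation fails.
 */
int trie_insert(struct trie_node *root, const uint8_t key[KEY_BYTES], void *item)
{
    struct trie_node *node = root;
    unsigned int level;

    for (level = 0; level < LEVELS - 1; level++) {
        unsigned int idx = key_index(key, level);

        if (node->slot[idx] == NULL) {
            struct trie_node *child = calloc(1, sizeof(*child));

            if (child == NULL)
                return -1;
            /* Kernel code would publish with rcu_assign_pointer(). */
            node->slot[idx] = child;
        }
        node = node->slot[idx];
    }
    node->slot[key_index(key, LEVELS - 1)] = item;
    return 0;
}
```

With a zeroed root node, a lookup for an absent address returns NULL
after following at most 32 slots, with no second pass under a lock. The
cost of this naive layout is memory: an isolated key allocates a node
at every interior level, so a real implementation would likely want a
wider fanout or some form of path compression.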