From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.1 required=3.0 tests=DKIM_INVALID,DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3CD69C52D49 for ; Thu, 27 Feb 2020 13:58:37 +0000 (UTC) Received: from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F396820801 for ; Thu, 27 Feb 2020 13:58:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=joelfernandes.org header.i=@joelfernandes.org header.b="QQgxfMHm" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F396820801 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=joelfernandes.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=linux-kernel-mentees-bounces@lists.linuxfoundation.org Received: from localhost (localhost [127.0.0.1]) by fraxinus.osuosl.org (Postfix) with ESMTP id D28B685F09; Thu, 27 Feb 2020 13:58:36 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from fraxinus.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id gaNAnoq03K2U; Thu, 27 Feb 2020 13:58:35 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by fraxinus.osuosl.org (Postfix) with ESMTP id D44E385DF2; Thu, 27 Feb 2020 13:58:35 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id BE4F6C1D80; Thu, 27 Feb 2020 13:58:35 +0000 (UTC) Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 199C2C0177 for ; Thu, 27 Feb 2020 13:58:34 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id 1644D87647 for ; Thu, 27 Feb 2020 13:58:34 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Yg8F-2y43n5l for ; Thu, 27 Feb 2020 13:58:32 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 Received: from mail-pf1-f195.google.com (mail-pf1-f195.google.com [209.85.210.195]) by whitealder.osuosl.org (Postfix) with ESMTPS id 87F4C871F4 for ; Thu, 27 Feb 2020 13:58:32 +0000 (UTC) Received: by mail-pf1-f195.google.com with SMTP id 15so1272610pfo.9 for ; Thu, 27 Feb 2020 05:58:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=iGBwlRPOz0RFCrlO5EPuWAgJWFzfpEoFMahebMifJ5o=; b=QQgxfMHmsxqGnYaUMtIeeRwYnSCEYuE25sLHZ//sKgE57AEzj0NzxYE0zgGvUwUxZo W4lqe04uKi0TkwrCASE+i4dnfMWXThYFRLm+DwjoP/Bw2oUXOiIcurNdI4YZN0xLu5u4 S6CIgHarbdg6MHidTwrOFK4xWBUYJgT5VDinc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=iGBwlRPOz0RFCrlO5EPuWAgJWFzfpEoFMahebMifJ5o=; b=guUf2pUBJVbuT1jDWJ2H6LQ4mcUQTUs9mEkPz1fm8Q1cdDIS615i2xuSvMhMSiLE66 0mqa/m4mL4eVRwHjR64MzTpBshZN0cgorD655SwxY+RLDp5Ci9EZEL3Pad0u2TLuhDcB mwKXUhagVowEQHIhwIzLgiBGczuDNUJsLJUt688V84vtL5r/z7RZT3/VBU11uLllQVie YosgYrPcBL0uT4EREQ+AHOBr1llRR33z8mwS1NAIM+tg3L7/01gH7XYw0iifm2Hb+lwu MGTTXiciBdSoYJHWfHBNmalKavXAaZsFRRacM5g/2g+EPA7ofRS+Xp+AKzFltejPbGME /c1Q== X-Gm-Message-State: APjAAAW72GFfCE9EEtiOruS8WKFkbz36H2xRyU5NwVRE4ulHqj1ZI96v pprMCFrs4P8LSF83AsbtpiniyPU4vE0= X-Google-Smtp-Source: APXvYqyYFbj9++y6OBfemDdW3+kGbXme9jJAPfOYIdIKHQGblyUKP0b9fU/TrwEAONG2Q3LOmQqf7g== X-Received: by 2002:a0c:eed2:: with SMTP id h18mr5111311qvs.184.1582810117485; Thu, 27 Feb 2020 05:28:37 -0800 (PST) Received: from localhost ([2620:15c:6:12:9c46:e0da:efbf:69cc]) by smtp.gmail.com with ESMTPSA id v10sm3141066qtj.26.2020.02.27.05.28.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 27 Feb 2020 05:28:36 -0800 (PST) Date: Thu, 27 Feb 2020 08:28:36 -0500 From: Joel Fernandes To: Aman Sharma Message-ID: <20200227132836.GB161459@google.com> References: <20200227104448.10154-1-amanharitsh123@gmail.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20200227104448.10154-1-amanharitsh123@gmail.com> User-Agent: Mutt/1.12.2 (2019-09-21) Cc: "Paul E. McKenney" , Jonathan Corbet , linux-doc@vger.kernel.org, Lai Jiangshan , Josh Triplett , Steven Rostedt , linux-kernel@vger.kernel.org, rcu@vger.kernel.org, Mathieu Desnoyers , linux-kernel-mentees@lists.linuxfoundation.org Subject: Re: [Linux-kernel-mentees] [PATCH] doc: Convert rculist_nulls.txt to rculist_nulls.rst X-BeenThere: linux-kernel-mentees@lists.linuxfoundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: linux-kernel-mentees-bounces@lists.linuxfoundation.org Sender: "Linux-kernel-mentees" On Thu, Feb 27, 2020 at 04:14:46PM +0530, Aman Sharma wrote: > This patch converts rculist_nulls from .txt to .rst format, and also adds > it to the index.rst file. When posting such patches, could you clarify what has changed during the conversion when you write the commit message? thanks, - Joel > Signed-off-by: Aman Sharma > --- > Documentation/RCU/index.rst | 1 + > Documentation/RCU/rculist_nulls.rst | 179 ++++++++++++++++++++++++++++ > Documentation/RCU/rculist_nulls.txt | 172 -------------------------- > 3 files changed, 180 insertions(+), 172 deletions(-) > create mode 100644 Documentation/RCU/rculist_nulls.rst > delete mode 100644 Documentation/RCU/rculist_nulls.txt > > diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst > index d60eb4ba2cd0..2cf55bd141b3 100644 > --- a/Documentation/RCU/index.rst > +++ b/Documentation/RCU/index.rst > @@ -10,6 +10,7 @@ RCU concepts > arrayRCU > rcubarrier > rcu_dereference > + rculist_nulls > checklist > whatisRCU > rcu > diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst > new file mode 100644 > index 000000000000..8e04de44fe3d > --- /dev/null > +++ b/Documentation/RCU/rculist_nulls.rst > @@ -0,0 +1,179 @@ > +.. _rculist_nulls_doc: > + > +Using hlist_nulls to protect read-mostly linked lists and objects using SLAB_TYPESAFE_BY_RCU allocations. > +========================================================================================================= > + > +Please read the basics in Documentation/RCU/listRCU.rst > + > +Using special makers (called 'nulls') is a convenient way > +to solve following problem : > + > +A typical RCU linked list managing objects which are > +allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can > +use following algos : > + > +1. Lookup algo > +-------------- > +:: > + > + rcu_read_lock() > + begin: > + obj = lockless_lookup(key); > + if (obj) { > + if (!try_get_ref(obj)) // might fail for free objects > + goto begin; > + /* > + * Because a writer could delete object, and a writer could > + * reuse these object before the RCU grace period, we > + * must check key after getting the reference on object > + */ > + if (obj->key != key) { // not the object we expected > + put_ref(obj); > + goto begin; > + } > + } > + rcu_read_unlock(); > + > +Beware that lockless_lookup(key) cannot use traditional hlist_for_each_entry_rcu() > +but a version with an additional memory barrier (smp_rmb()) :: > + > + lockless_lookup(key) > + { > + struct hlist_node *node, *next; > + for (pos = rcu_dereference((head)->first); > + pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) && > + ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); > + pos = rcu_dereference(next)) > + if (obj->key == key) > + return obj; > + return NULL; > + > +And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb() :: > + > + struct hlist_node *node; > + for (pos = rcu_dereference((head)->first); > + pos && ({ prefetch(pos->next); 1; }) && > + ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); > + pos = rcu_dereference(pos->next)) > + if (obj->key == key) > + return obj; > + return NULL; > +} > + > +Quoting Corey Minyard : > + > +"If the object is moved from one list to another list in-between the > + time the hash is calculated and the next field is accessed, and the > + object has moved to the end of a new list, the traversal will not > + complete properly on the list it should have, since the object will > + be on the end of the new list and there's not a way to tell it's on a > + new list and restart the list traversal. I think that this can be > + solved by pre-fetching the "next" field (with proper barriers) before > + checking the key." > + > +2. Insert algo > +---------------- > + > +We need to make sure a reader cannot read the new 'obj->obj_next' value > +and previous value of 'obj->key'. Or else, an item could be deleted > +from a chain, and inserted into another chain. If new chain was empty > +before the move, 'next' pointer is NULL, and lockless reader can > +not detect it missed following items in original chain :: > + > + /* > + * Please note that new inserts are done at the head of list, > + * not in the middle or end. > + */ > + obj = kmem_cache_alloc(...); > + lock_chain(); // typically a spin_lock() > + obj->key = key; > + /* > + * we need to make sure obj->key is updated before obj->next > + * or obj->refcnt > + */ > + smp_wmb(); > + atomic_set(&obj->refcnt, 1); > + hlist_add_head_rcu(&obj->obj_node, list); > + unlock_chain(); // typically a spin_unlock() > + > + > +3. Remove algo > +-------------- > +Nothing special here, we can use a standard RCU hlist deletion. > +But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused > +very very fast (before the end of RCU grace period) :: > + > + if (put_last_reference_on(obj) { > + lock_chain(); // typically a spin_lock() > + hlist_del_init_rcu(&obj->obj_node); > + unlock_chain(); // typically a spin_unlock() > + kmem_cache_free(cachep, obj); > + } > + > + > + > +With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup() > +and extra smp_wmb() in insert function. > + > +For example, if we choose to store the slot number as the 'nulls' > +end-of-list marker for each slot of the hash table, we can detect > +a race (some writer did a delete and/or a move of an object > +to another chain) checking the final 'nulls' value if > +the lookup met the end of chain. If final 'nulls' value > +is not the slot number, then we must restart the lookup at > +the beginning. If the object was moved to the same chain, > +then the reader doesn't care : It might eventually > +scan the list again without harm. > + > + > +1. lookup algo > +-------------- > +:: > + > + head = &table[slot]; > + rcu_read_lock(); > + begin: > + hlist_nulls_for_each_entry_rcu(obj, node, head, member) { > + if (obj->key == key) { > + if (!try_get_ref(obj)) // might fail for free objects > + goto begin; > + if (obj->key != key) { // not the object we expected > + put_ref(obj); > + goto begin; > + } > + goto out; > + } > + /* > + * if the nulls value we got at the end of this lookup is > + * not the expected one, we must restart lookup. > + * We probably met an item that was moved to another chain. > + */ > + if (get_nulls_value(node) != slot) > + goto begin; > + obj = NULL; > + > + out: > + rcu_read_unlock(); > + > +2. Insert function > +------------------ > + > +:: > + > + /* > + * Please note that new inserts are done at the head of list, > + * not in the middle or end. > + */ > + obj = kmem_cache_alloc(cachep); > + lock_chain(); // typically a spin_lock() > + obj->key = key; > + /* > + * changes to obj->key must be visible before refcnt one > + */ > + smp_wmb(); > + atomic_set(&obj->refcnt, 1); > + /* > + * insert obj in RCU way (readers might be traversing chain) > + */ > + hlist_nulls_add_head_rcu(&obj->obj_node, list); > + unlock_chain(); // typically a spin_unlock() > diff --git a/Documentation/RCU/rculist_nulls.txt b/Documentation/RCU/rculist_nulls.txt > deleted file mode 100644 > index 23f115dc87cf..000000000000 > --- a/Documentation/RCU/rculist_nulls.txt > +++ /dev/null > @@ -1,172 +0,0 @@ > -Using hlist_nulls to protect read-mostly linked lists and > -objects using SLAB_TYPESAFE_BY_RCU allocations. > - > -Please read the basics in Documentation/RCU/listRCU.rst > - > -Using special makers (called 'nulls') is a convenient way > -to solve following problem : > - > -A typical RCU linked list managing objects which are > -allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can > -use following algos : > - > -1) Lookup algo > --------------- > -rcu_read_lock() > -begin: > -obj = lockless_lookup(key); > -if (obj) { > - if (!try_get_ref(obj)) // might fail for free objects > - goto begin; > - /* > - * Because a writer could delete object, and a writer could > - * reuse these object before the RCU grace period, we > - * must check key after getting the reference on object > - */ > - if (obj->key != key) { // not the object we expected > - put_ref(obj); > - goto begin; > - } > -} > -rcu_read_unlock(); > - > -Beware that lockless_lookup(key) cannot use traditional hlist_for_each_entry_rcu() > -but a version with an additional memory barrier (smp_rmb()) > - > -lockless_lookup(key) > -{ > - struct hlist_node *node, *next; > - for (pos = rcu_dereference((head)->first); > - pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) && > - ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); > - pos = rcu_dereference(next)) > - if (obj->key == key) > - return obj; > - return NULL; > - > -And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb() : > - > - struct hlist_node *node; > - for (pos = rcu_dereference((head)->first); > - pos && ({ prefetch(pos->next); 1; }) && > - ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); > - pos = rcu_dereference(pos->next)) > - if (obj->key == key) > - return obj; > - return NULL; > -} > - > -Quoting Corey Minyard : > - > -"If the object is moved from one list to another list in-between the > - time the hash is calculated and the next field is accessed, and the > - object has moved to the end of a new list, the traversal will not > - complete properly on the list it should have, since the object will > - be on the end of the new list and there's not a way to tell it's on a > - new list and restart the list traversal. I think that this can be > - solved by pre-fetching the "next" field (with proper barriers) before > - checking the key." > - > -2) Insert algo : > ----------------- > - > -We need to make sure a reader cannot read the new 'obj->obj_next' value > -and previous value of 'obj->key'. Or else, an item could be deleted > -from a chain, and inserted into another chain. If new chain was empty > -before the move, 'next' pointer is NULL, and lockless reader can > -not detect it missed following items in original chain. > - > -/* > - * Please note that new inserts are done at the head of list, > - * not in the middle or end. > - */ > -obj = kmem_cache_alloc(...); > -lock_chain(); // typically a spin_lock() > -obj->key = key; > -/* > - * we need to make sure obj->key is updated before obj->next > - * or obj->refcnt > - */ > -smp_wmb(); > -atomic_set(&obj->refcnt, 1); > -hlist_add_head_rcu(&obj->obj_node, list); > -unlock_chain(); // typically a spin_unlock() > - > - > -3) Remove algo > --------------- > -Nothing special here, we can use a standard RCU hlist deletion. > -But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused > -very very fast (before the end of RCU grace period) > - > -if (put_last_reference_on(obj) { > - lock_chain(); // typically a spin_lock() > - hlist_del_init_rcu(&obj->obj_node); > - unlock_chain(); // typically a spin_unlock() > - kmem_cache_free(cachep, obj); > -} > - > - > - > --------------------------------------------------------------------------- > -With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup() > -and extra smp_wmb() in insert function. > - > -For example, if we choose to store the slot number as the 'nulls' > -end-of-list marker for each slot of the hash table, we can detect > -a race (some writer did a delete and/or a move of an object > -to another chain) checking the final 'nulls' value if > -the lookup met the end of chain. If final 'nulls' value > -is not the slot number, then we must restart the lookup at > -the beginning. If the object was moved to the same chain, > -then the reader doesn't care : It might eventually > -scan the list again without harm. > - > - > -1) lookup algo > - > - head = &table[slot]; > - rcu_read_lock(); > -begin: > - hlist_nulls_for_each_entry_rcu(obj, node, head, member) { > - if (obj->key == key) { > - if (!try_get_ref(obj)) // might fail for free objects > - goto begin; > - if (obj->key != key) { // not the object we expected > - put_ref(obj); > - goto begin; > - } > - goto out; > - } > -/* > - * if the nulls value we got at the end of this lookup is > - * not the expected one, we must restart lookup. > - * We probably met an item that was moved to another chain. > - */ > - if (get_nulls_value(node) != slot) > - goto begin; > - obj = NULL; > - > -out: > - rcu_read_unlock(); > - > -2) Insert function : > --------------------- > - > -/* > - * Please note that new inserts are done at the head of list, > - * not in the middle or end. > - */ > -obj = kmem_cache_alloc(cachep); > -lock_chain(); // typically a spin_lock() > -obj->key = key; > -/* > - * changes to obj->key must be visible before refcnt one > - */ > -smp_wmb(); > -atomic_set(&obj->refcnt, 1); > -/* > - * insert obj in RCU way (readers might be traversing chain) > - */ > -hlist_nulls_add_head_rcu(&obj->obj_node, list); > -unlock_chain(); // typically a spin_unlock() > -- > 2.20.1 > _______________________________________________ Linux-kernel-mentees mailing list Linux-kernel-mentees@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/linux-kernel-mentees