From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753342AbcGTTyO (ORCPT ); Wed, 20 Jul 2016 15:54:14 -0400 Received: from mail-sn1nam01on0102.outbound.protection.outlook.com ([104.47.32.102]:62435 "EHLO NAM01-SN1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751244AbcGTTyL (ORCPT ); Wed, 20 Jul 2016 15:54:11 -0400 Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=waiman.long@hpe.com; Message-ID: <578FD6D4.4010005@hpe.com> Date: Wed, 20 Jul 2016 15:53:56 -0400 From: Waiman Long User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12 MIME-Version: 1.0 To: Tejun Heo CC: Alexander Viro , Jan Kara , Jeff Layton , "J. Bruce Fields" , Christoph Lameter , , , Ingo Molnar , Peter Zijlstra , Andi Kleen , Dave Chinner , Boqun Feng , Scott J Norton , Douglas Hatch Subject: Re: [PATCH v3 1/4] lib/dlock-list: Distributed and lock-protected lists References: <1468604383-40362-1-git-send-email-Waiman.Long@hpe.com> <1468604383-40362-2-git-send-email-Waiman.Long@hpe.com> <20160718233803.GN3078@mtj.duckdns.org> <578E7497.30602@hpe.com> <20160719192333.GP3078@mtj.duckdns.org> In-Reply-To: <20160719192333.GP3078@mtj.duckdns.org> Content-Type: text/plain; charset="ISO-8859-1"; format=flowed Content-Transfer-Encoding: 7bit X-Originating-IP: [72.71.243.182] X-ClientProxiedBy: BY1PR15CA0008.namprd15.prod.outlook.com (10.162.17.146) To CS1PR84MB0312.NAMPRD84.PROD.OUTLOOK.COM (10.162.190.30) X-MS-Office365-Filtering-Correlation-Id: a77b780a-520c-4733-4a70-08d3b0d79cd2 X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0312;2:265NZI8wU8hKFGuhus6BJWs3KYVtu4XPGC9iKNw+SnBwXFbCDRQOrVLuG2ZzcLGI+DvFY6qOlCCBeHKUUmzrlYmpGNq9vDIIRl/Yy9pEuukbxKYnHeeXs/5BnzG2PLmWMdSZRNRtFcky3w8QUa1/+YOBnYl488XT0xBp5D2jxFSKDGEuPzM6drQsPTzJTbb/;3:yq/OGwiuc3sxZOt/TWqMazoP358g1JpkS5lFfgp5V/UqsT9b0yg5OXWK1HDWZ5W7QUgDXPHNQuHF+p28rIGuoWGwTv3RW0TPKuMiprm4UnD9iND6Atn3UgAh8pS5biae X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:CS1PR84MB0312; X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0312;25:fP2pBSTkGGm+JsCCK207097MLmvge8L2YiYhiZJhcuwsb649B3TAXK84OXjDNiYAaI08NKwmQxRxT5QRH/IBm+L8ZN3bY6VkI7Cn63nmNLquttwo9TKOMTeWO67Uf/Wi9BScx1Zg852fQfzfPZb7DGMWgeuhSfMK37SuH0R8Sa5Y0eyMOAlsscqvR/XxFLEQ1tuB+ZL4XieimrqIrSJApkqZW/YbtV/MyZ7hXYnbiTV9VVrl/IRUt2UcOcAdCNZwJraABJq5xdm+nl6bTwqurm23qoYdsslZzJLOu8uZJtZrZLy2tmaeEtIBp6R5zf6IIAL8FeuW0+QFOF9P93PZuTKGmsSckZHIYl+JDMvR8ZudJzuuxXodtUJmYaEuejXxX1SF9Ne0x4Z8aa4bUqOLNGTYK39nvIakQbhqKw7ZXxiDG5+7DbM1gyWL/+CA70Q1NyKRUpSEhOg4p8DllBgVK0Sw89Bwmwh2cfd1IbBCn3UIZiKHH3VA567grWretO09pWHzXB4QOwDkvIfse/9kDK8y9JxCtKprrIdSkDTeoXn4OeLZQtYr/JNHkV6YK8DsCpxyYzByxlr3alXSF5zO8mmCaccMF+vJkAn2G56NmPoyczOucLktaejKF4B2EbOmRst9PeL9H2nxQVtvJ36bWy94AdpNHTIxbu9DKES1DtN6jbujynfFtY6U0wmsgeTkjLTcyGamnsW43kW0DQPMzqrBE9Ot/YupjTbDT5eDAWI=;31:AsJginGXPFV2+2SfJKSeubqbsRUxlHVSYOZEomYgWf57nhAAxr82gz8kPoBtAKjgSYzyCrzq6g8V52dXmPTwPX3SsDSboc+r0DUTcTxWz6xpCVSndLrVZT/wBy4ccwov3GmMnKsYJUWo5Tt0o+IL/ers7z8o/O+tbbEmIfTfwpwk8TSzwee5IzgNvwW4ccdKsIYbTf9mLYNqTSd7N4VdAA== X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0312;20:flO0chZTVqllmPXB4OBCPJUV+SHb+Gj1awnY1jMyHM9JQka2ftDdbmL+SBtxWkLBvmcY+OWXZaXcLdQdr7/ULZuuJGolRjYTJ2ocv7De9mend1eTL8aWF6nT8at4AGvq0od4W8oJT9bdO4FbaugDTHkXcuY/eveir5us1ox6nbXlidvJi/dB2c5A6/HTuvIoQ1Widfz3bXp2QcIFXIIeZmQWomTDyCkIuBdvSncZucjCAe4FHGcRjh5MClS3NgoUXQaiQk2OD11Wxan9y6BLaFh+uHoGY5OviDuN1lU75w/i16G+1vnPL80AcX7qvgtchIbokBFMXCMeLDDrOGoQ85CDaSN1CBk1nwtd1MZfWoHNka5vpVEsflpUdcNj8n+pXMaZPg0TRxOBPA7e/BcfD1Gka9ev+wdrL0zcE6/lasEAJ/ThPB72uZlexNQRF7hNVEyqj4rV8SDyts6o7DmKQjMb6WmHRss6PaorDRwx0eL5dGNP6gbb9PDQyHEK/Kry;4:lf9Q08D4kLb4DXbl9NPrCRQrE3p1cXdBG6N+iveYGA1yNY4t0xgfAgOdWMxg+uUHa7hOXHak70OivPkGxsHXwPIFEO9ZL7H/+60gq+o2g7XYaupOxHe02mjzAH99iz3C5QrgcBvUSw1ZvxBAKhFc59zADBy7WAXkj03+V5C2NyggFKCcl7qReThkNBYkhc+yGfl3Crub7HMwxbwEwGtCwQxMbZwu6b4EC1S95KToq4mkt9W02o6wzqtW2iRmpL+v4ncHJA/A6wTG8iduWgqJMY6ZmFdaMsz6vOw65VRW7SEeF3cxndA1Qw6ixWg/ApU4ht83gDdyyXeOWseb86vXikVBie9VohiVr07octKheEVU4gX2bmK26ls1YlcTn7+mJ36umMF5oHdfLT+NCzATzA== X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(601004)(2401047)(8121501046)(5005006)(10201501046)(3002001)(6055026);SRVR:CS1PR84MB0312;BCL:0;PCL:0;RULEID:;SRVR:CS1PR84MB0312; X-Forefront-PRVS: 000947967F X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10019020)(4630300001)(6049001)(6009001)(7916002)(199003)(24454002)(377454003)(189002)(50466002)(105586002)(230700001)(93886004)(110136002)(106356001)(36756003)(6116002)(3846002)(64126003)(586003)(23756003)(33656002)(47776003)(66066001)(65806001)(77096005)(65956001)(83506001)(7846002)(54356999)(80316001)(87266999)(50986999)(65816999)(76176999)(19580395003)(2950100001)(42186005)(101416001)(92566002)(81156014)(97736004)(8676002)(2906002)(4001350100001)(68736007)(189998001)(7736002)(86362001)(305945005)(59896002)(4326007)(81166006)(117156001);DIR:OUT;SFP:1102;SCL:1;SRVR:CS1PR84MB0312;H:[192.168.142.174];FPR:;SPF:None;PTR:InfoNoRecords;A:1;MX:1;LANG:en; X-Microsoft-Exchange-Diagnostics: =?iso-8859-1?Q?1;CS1PR84MB0312;23:+DF1ZVXeKIddb6fodbrAgbOWvJlPaXetiPx2z8X?= =?iso-8859-1?Q?hkrLwZ0Xv+v/4V9kO5de9YJSEGuzMT4ulMo++tyKrXtmBYQ8HOS2O8M4Uk?= =?iso-8859-1?Q?IHaltV+ELP6Jqq1vn6c604wvA1lUXncES7ERT10/9/LgPM13y3wzWUvUY0?= =?iso-8859-1?Q?mYtqcNOlEmfyfn2We4/w3fo3F+qgz96mswuovjQ18OXJBD51VfJf2Iz5ig?= =?iso-8859-1?Q?wjDoaZ6JLBWoeOB9K+iyCs6PDsU/UAgtj4gNzwlI5goFzbw31kqzIZ5+n3?= =?iso-8859-1?Q?EpNYJPiCVa4l/VyiW0/bbx8vyoGaAUay6oUsBRRZ5QeckPXB4a2WHvJQMl?= =?iso-8859-1?Q?siANekfpNtnv4eYTjUTzK7hy9K+lTvp85imwZIpEDSuYSBjuUHS2r8l+CH?= =?iso-8859-1?Q?HkC9arTkbCxtB9ln4JRY/EpFEoKZOmBpfoSXnwddwDao9mIQyPPhEXhwIU?= =?iso-8859-1?Q?7clEbOfX9jBiPwekHGo4bVeJNKrm/DSkBQOMo2QUbCySLQSQ2myZFOciqN?= =?iso-8859-1?Q?b06IdfhNQfXN49tiNwxaskTI1lijbGMR137VhdbEWToGaFUiG1OkrMcVas?= =?iso-8859-1?Q?Qm9kt+er53JrpWU7Le10Yq31ZDknon6E2+nBEb2LWwF4IjI0QOU8ZbITAP?= =?iso-8859-1?Q?Zx/jyS0BfdiR1XWmSHJfIxYFgbHHqo3/nBgNLmNigGLoLZGs+2pnkideEB?= =?iso-8859-1?Q?nP6CpfkbW4XDXMwVAUvBoIjPBVxbxN1uJnGQ27KmMyO35+9eH3AbM6ylwo?= =?iso-8859-1?Q?y0qeb136oHjX8b5kc2NBJ64aJassEz6efEVaZljjKHf9lXk+FCb4E9/vA3?= =?iso-8859-1?Q?zThXj83cCwoyx6xChze1SscBsOoefBpo7+fof4Njx9i9h8wXhjdyZPKxqt?= =?iso-8859-1?Q?PowTPbJuYppc0LHTAC8dlrKQrgGhM5urAX/Hj4reP8CFX7c9A7nyI3vsud?= =?iso-8859-1?Q?IG2zBde0O2ZTSdXPvd/+e8ZiV3odj8cLMAWf5KM3odMbcMy6+ANa6UEFev?= =?iso-8859-1?Q?CYEc9GDA3ul2p6r12WJXqcR7UB3pj/+vCERvZmOtJMSnVQHs47BSFdfzoy?= =?iso-8859-1?Q?W9d/3GKXVYTgxkpwveB9bL7rNHwyWQYPr0VDUaw/zPyIh6MojX/5j9k0Tt?= =?iso-8859-1?Q?hZGez/JL5gX3MJi3EsXZL/VoKGG2S2/vC2UXuVfq6yBd3rmYWmcjAPG/gW?= =?iso-8859-1?Q?OibhdvUD88BcCgoHUohfyiUzTKfWlmWiDfN2vcXf4YSkhh3wqxhyQs5SWe?= =?iso-8859-1?Q?4nIJoSk13NG5C14++nvXPM+63o63G6LG7d2e596fhrnEfJv+ZYVTwWDlEw?= =?iso-8859-1?Q?mw4GuZRMNSYW5NWkhXMTHJ/8R6hmTWAPNWeBCxhoBnfU/GmjJ369RQy/Pl?= =?iso-8859-1?Q?Tnt44D+U=3D?= X-Microsoft-Exchange-Diagnostics: 1;CS1PR84MB0312;6:j9yQEiqnL5IHXq4L320KESWDypIDnHm4SLUbz9y+NS7SZRNH8Aa2fADaByoSRHoi7rxZyt/t4RdmBeLEHtM9jFVvFuPThAvvb6SmkcKMEEdVG8FqOQujnOqgfPCPBNUdBfVA0B5YW/zyUidK0YlLHWmvLjL/L1oLLv99kuO28CifK9RLwL09n8Ni+eAgLHOjDTNrBthIf2GtWjOUpVIF85RIxoqvVPM9SAk9ncY/nXKJ61JzibVZa2xaZaZeXBSxDsf1lPTHmnx9rZlNC5F/Rc6MmbavGIDdeSjdBNwuJhTqNaKITAGotkwYaFGax7jKoOmP5lnBZ2gtw7+yC+XkjQ==;5:RhqO5A2eutXeFzn6RumQRNcFPNprZCr65SaevVeX0FOwOlJLX6NW62iqgEg+F8oRfOg/PgGJLmxvB2DzBcXFj+2VhKoPxIwuX9HmsW0wcw86ggYyI45be5jB8XzQd0IvAZ1YoNHQmc7GlRiylgavSA==;24:6J9dydNWRJdY+zoOLka1EmuJFIfA1i/vhbjJlGX7DGC7F0bGUtre4zmW0UtXXo5SY/KPw2CPXpyWGALGjjR0zgOcwvHdhNbjuuikrbb87lo=;7:rknZy1qdc59yz7y8ECfySjd/ZILLYkSOzfAonZ0ArKxLYQvBGXydmexNY1UqdV0Z7Pp0qCcfevpqHFeOgCQh/JU2uTbvCSVLrcOkeY/yI56HOEZYEowu1ZMC2v8Ybaxnsz7TQI4tRj8dlFHABLOdq7PLoL//2fFYwyvnXNkAgPvSuNV4sdKTLXSYpMHaOGH119E3idEMG+O0F7sU8vVAe77b+D2JF+pqyqlq/W4LgUL2Tq3ANwejgplgs2zVdHuX SpamDiagnosticOutput: 1:99 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: hpe.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Jul 2016 19:54:06.3166 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: CS1PR84MB0312 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 07/19/2016 03:23 PM, Tejun Heo wrote: > Hello, > > On Tue, Jul 19, 2016 at 02:42:31PM -0400, Waiman Long wrote: >> On 07/18/2016 07:38 PM, Tejun Heo wrote: >>>> +struct dlock_list_node { >>>> + struct list_head list; >>>> + spinlock_t *lockptr; >>>> +}; >>> Wouldn't it be better to point to dlock_list_percpu? >> I could. However, the only thing that matter is the spinlock that protects >> the list entry. > Yeah, we can get back to this when it's actually necessary. It just > looked a bit weird to me. > >>>> +/* >>>> + * The dlock list iteration functions which return true if iteration has >>>> + * to be continued. >>>> + */ >>>> +extern bool dlock_list_next(struct dlock_list_head *dlist, >>>> + struct dlock_list_iter *iter); >>>> +extern bool dlock_list_next_safe(struct dlock_list_head *dlist, >>>> + struct dlock_list_iter *iter); >>> Why not return dlock_list_node * for the current node? That'd more >>> conventional and allows dlock_list_iter to be opaque. >> Yes, I can make it return dlock_list_node *. >> >> However, to make dlock_list_iter opaque, I will have to dynamically allocate >> the structure. That will add an extra memory allocation and free calls as >> well as handling the error case of running out of memory. I don't think that >> is worth doing at this point. > Sure, keep it defined in the header file. Just don't require users to > reach into it and add a comment saying that the struct is opaque to > its users. Will do so. >>>> +int alloc_dlock_list_head(struct dlock_list_head *dlist) >>>> +{ >>>> + struct dlock_list_head dlist_tmp; >>>> + int cpu; >>>> + >>>> + dlist_tmp.head = alloc_percpu(struct dlock_list_head_percpu); >>>> + if (!dlist_tmp.head) >>>> + return -ENOMEM; >>>> + >>>> + for_each_possible_cpu(cpu) { >>>> + struct dlock_list_head_percpu *head; >>>> + >>>> + head = per_cpu_ptr(dlist_tmp.head, cpu); >>>> + INIT_LIST_HEAD(&head->list); >>>> + head->lock = __SPIN_LOCK_UNLOCKED(&head->lock); >>>> + lockdep_set_class(&head->lock,&dlock_list_key); >>>> + } >>>> + >>>> + dlist->head = dlist_tmp.head; >>> Just use dlist->head directly or use local __perpcu head pointer? >> I just don't want to expose the structure to world until it is fully >> initialized. If you think I am over-cautious, I can use dlist->head as >> suggested. > I don't think it makes any actual difference. No strong opinion > either way. Just use local __percpu head pointer then? > >>>> + return 0; >>>> +} >>>> +EXPORT_SYMBOL(alloc_dlock_list_head); >>> Does this actually need to be exported? If so, it might be a better >>> idea to start with EXPORT_SYMBOL_GPL(). >> For the current use case, we probably don't need to export the symbols. >> Other use cases may require that. I will change it to use the version >> instead. > If it's not immediately necessary, it's best to not export at all. I will remove the EXPORT statements. >>>> +void dlock_list_del(struct dlock_list_node *node) >>>> +{ >>>> + spinlock_t *lock = READ_ONCE(node->lockptr); >>>> + >>>> + if (unlikely(!lock)) { >>>> + WARN_ONCE(1, >>>> + "dlock_list_del: node 0x%lx has no associated lock\n", >>>> + (unsigned long)node); >>> Maybe "if (WARN_ONCE(!lock...)"? WARN_ONCE implies unlikely. >> OK, will do that. >> >>>> + return; >>>> + } >>>> + >>>> + spin_lock(lock); >>>> + if (likely(lock == node->lockptr)) { >>>> + list_del_init(&node->list); >>>> + node->lockptr = NULL; >>>> + } else { >>>> + /* >>>> + * This path should never be executed. >>>> + */ >>>> + WARN_ON_ONCE(1); >>>> + } >>> This still kinda bothers me because this pretty much requires the >>> users to have strong synchronization around the operations and makes >>> it unusable in situations where opportunistic behaviors are >>> acceptable. It negates the usefulness quite a bit. >> I understand your concern. I will make it retry again with the new lock. > It doesn't necessarily have to retry but shouldn't break down when > used in an opportunistic racy way - e.g. if adds and removes race, the > order of operations isn't clearly defined as such any outcome is fine > as long as the list maintains its integrity. I am going to retry it if the lock changes to different one and ignore it if it becomes NULL. >>>> +/** >>>> + * dlock_list_next_safe - Removal-safe iterator of dlock list >>>> + * @dlist: Pointer to the dlock_list_head structure >>>> + * @iter : Pointer to the dlock list iterator structure >>>> + * Return: true if the next entry is found, false if all the entries iterated >>>> + * >>>> + * The iterator has to be properly initialized before calling this function. >>>> + * This iteration function is safe with respect to list entry removal. >>>> + * However, it cannot correctly iterate newly added entries right after the >>>> + * current one. >>>> + */ >>> This still looks wrong to me. If you want to provide the two variants >>> of iterations, can't you just implement one next function and build >>> the two types of iterations on top of it? >> I have been thinking about making dlock_list_next_cpu() the real external >> function and have 2 inline functions that implement dlock_list_next() and >> dlock_list_next_safe(). That may strike a better balance between performance >> and code abstraction. I will do so if you have no objection to that. > Yeah, please give it a try. As mentioned in another reply, it'd > probably be best to provide an iteration macro which encapsulates the > whole thing. > > Thanks. > Yes, I am go to make the dlist_for_each_entry macro as suggested by Al. Cheers, Longman