From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <578E7497.30602@hpe.com>
Date: Tue, 19 Jul 2016 14:42:31 -0400
From: Waiman Long <waiman.long@hpe.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12
MIME-Version: 1.0
To: Tejun Heo
CC: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields",
 Christoph Lameter, Ingo Molnar, Peter Zijlstra, Andi Kleen,
 Dave Chinner, Boqun Feng, Scott J Norton, Douglas Hatch
Subject: Re: [PATCH v3 1/4] lib/dlock-list: Distributed and lock-protected lists
References: <1468604383-40362-1-git-send-email-Waiman.Long@hpe.com>
 <1468604383-40362-2-git-send-email-Waiman.Long@hpe.com>
 <20160718233803.GN3078@mtj.duckdns.org>
In-Reply-To: <20160718233803.GN3078@mtj.duckdns.org>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/18/2016 07:38 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Fri, Jul 15, 2016 at 01:39:40PM -0400, Waiman Long wrote:
>> Suggested-by: Tejun Heo
> Not sure I should be on suggested-by given that this wasn't my idea at
> all.

I put the tag there because of your suggestion to restructure the data
structure, which did make the patch better. I can remove that tag if you
think it is not appropriate.

>> +/*
>> + * include/linux/dlock-list.h
>> + *
>> + * A distributed (per-cpu) set of lists each of which is protected by its
>> + * own spinlock, but acts like a single consolidated list to the callers.
>> + *
>> + * The dlock_list_head_percpu structure contains the spinlock, the other
>> + * dlock_list_node structures only contains a pointer to the spinlock in
>> + * dlock_list_head_percpu.
>> + */
> The more I think about it, the more bothered I'm about the dlock_list
> name. For the most part, this isn't different from other percpu data
> structures in the kernel. Sure, it might benefit from doing Nth cpu,
> but so are other percpu data structures and it's not just "distributed
> lock" list either. The list itself is percpu, not just locking. Can
> we please go back to percpu_list? Christoph, what do you think?
>

As I said before, I don't mind reverting the name back to percpu_list. I
am just waiting for a final agreement.

>> +struct dlock_list_node {
>> +	struct list_head list;
>> +	spinlock_t *lockptr;
>> +};
> Wouldn't it be better to point to dlock_list_percpu?

I could. However, the only thing that matters is the spinlock that
protects the list entry.

>> +#define DLOCK_LIST_HEAD_PERCPU_INIT(name)		\
>> +	{						\
>> +		.list.prev = &name.list,		\
>> +		.list.next = &name.list,		\
> Use LIST_HEAD_INIT()? Also, why do we even need the initializers if
> the data structure can only be dynamically allocated. In fact, do
> the definitions even need to be exposed in the header?

I put it there for completeness sake. You are right. That macro isn't
currently used. I will remove it in the next iteration of the patch.

>> +		.list.lock = __SPIN_LOCK_UNLOCKED(name),	\
>> +	}
>> +
>> +/*
>> + * dlock list iteration state
>> + */
>> +struct dlock_list_iter {
>> +	int		cpu;
>			^
> I'm not sure lining up with space here is common in kernel.

OK, I will remove the extra spaces to make it more conformant to the
kernel style.

>> +	spinlock_t	*lock;
>> +	struct list_head	*head;	/* List head of current per-cpu list */
>> +	struct dlock_list_node	*curr;
>> +	struct dlock_list_node	*next;
>> +};
>> +
>> +#define DLOCK_LIST_ITER_INIT()			\
>			      ^
> Don't we usually omit () in these cases?

Good point.
Will remove the unneeded ().

>> +	{					\
>> +		.cpu = -1,			\
>> +	}
>> +
>> +#define DEFINE_DLOCK_LIST_ITER(s)		\
>> +	struct dlock_list_iter s = DLOCK_LIST_ITER_INIT()
>> +
>> +static inline void init_dlock_list_iter(struct dlock_list_iter *iter)
>> +{
>> +	*iter = (struct dlock_list_iter)DLOCK_LIST_ITER_INIT();
>> +}
>> +
>> +#define DLOCK_LIST_NODE_INIT(name)		\
>> +	{					\
>> +		.list.prev = &name.list,	\
>> +		.list.next = &name.list,	\
>			^
> LIST_HEAD_INIT()?

Will make the change.

>> +	}
>> +
>> +static inline void init_dlock_list_node(struct dlock_list_node *node)
>> +{
>> +	INIT_LIST_HEAD(&node->list);
>> +	node->lockptr = NULL;
> Why not use DLOCK_LIST_NODE_INIT()?
>

Yes, I can make the change.

>> +}
>> +
>> +/*
>> + * Check if all the dlock lists are empty
>> + *
>> + * This can be a pretty expensive function call. If this function is required
>> + * in a performance critical path, we may have to maintain a global count
>> + * of the list entries in the global dlock_list_head structure instead.
>> + */
> /** function comment please.

Sure.

>> +static inline bool dlock_list_empty(struct dlock_list_head *dlist)
>> +{
>> +	int cpu;
>> +
>> +	for_each_possible_cpu(cpu)
>> +		if (!list_empty(&per_cpu_ptr(dlist->head, cpu)->list))
>> +			return false;
>> +	return true;
>> +}
>> +
>> +/*
>> + * Allocation and freeing of dlock list
>> + */
>> +extern int  alloc_dlock_list_head(struct dlock_list_head *dlist);
>	    ^
> ditto with alignment

Will remove the extra space.

>> +extern void free_dlock_list_head(struct dlock_list_head *dlist);
>> +
>> +/*
>> + * The dlock list iteration functions which return true if iteration has
>> + * to be continued.
>> + */
>> +extern bool dlock_list_next(struct dlock_list_head *dlist,
>> +			    struct dlock_list_iter *iter);
>> +extern bool dlock_list_next_safe(struct dlock_list_head *dlist,
>> +				 struct dlock_list_iter *iter);
> Why not return dlock_list_node * for the current node? That'd more
> conventional and allows dlock_list_iter to be opaque.
Yes, I can make it return dlock_list_node *. However, to make
dlock_list_iter opaque, I will have to dynamically allocate the
structure. That will add an extra memory allocation and free call, as
well as handling of the error case of running out of memory. I don't
think that is worth doing at this point.

>> diff --git a/lib/dlock-list.c b/lib/dlock-list.c
>> new file mode 100644
>> index 0000000..af4a9f3
>> --- /dev/null
>> +++ b/lib/dlock-list.c
> ...
>> +int alloc_dlock_list_head(struct dlock_list_head *dlist)
>> +{
>> +	struct dlock_list_head dlist_tmp;
>> +	int cpu;
>> +
>> +	dlist_tmp.head = alloc_percpu(struct dlock_list_head_percpu);
>> +	if (!dlist_tmp.head)
>> +		return -ENOMEM;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		struct dlock_list_head_percpu *head;
>> +
>> +		head = per_cpu_ptr(dlist_tmp.head, cpu);
>> +		INIT_LIST_HEAD(&head->list);
>> +		head->lock = __SPIN_LOCK_UNLOCKED(&head->lock);
>> +		lockdep_set_class(&head->lock, &dlock_list_key);
>> +	}
>> +
>> +	dlist->head = dlist_tmp.head;
> Just use dlist->head directly or use local __percpu head pointer?

I just don't want to expose the structure to the world until it is fully
initialized. If you think I am over-cautious, I can use dlist->head as
suggested.

>> +	return 0;
>> +}
>> +EXPORT_SYMBOL(alloc_dlock_list_head);
> Does this actually need to be exported? If so, it might be a better
> idea to start with EXPORT_SYMBOL_GPL().

For the current use case, we probably don't need to export the symbols.
Other use cases may require that. I will change it to use the _GPL
version instead.

>> +void dlock_list_add(struct dlock_list_node *node,
>> +		    struct dlock_list_head *dlist)
>> +{
>> +	struct dlock_list_head_percpu *head;
>				      ^
> This probably requires __percpu annotation. Have you run it through
> sparse and checked for address space warnings?

You are right. I probably missed the __percpu annotation. I haven't run
it through sparse. I will do that to fix any warnings found.
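(For reference, the initialize-into-a-temporary-then-publish pattern
discussed above can be sketched in plain userspace C. All names below
are made up for illustration; this is an analog of the idea, not the
kernel code itself:)

```c
#include <stdlib.h>

/* Hypothetical userspace analog of alloc_dlock_list_head(): build the
 * structure in a local temporary and publish the pointer only after
 * every slot is fully initialized, so a concurrent reader can never
 * observe a half-initialized structure. */
struct fake_pcpu_head { int nr_entries; };
struct fake_pcpu_list { struct fake_pcpu_head *head; };

static int fake_pcpu_list_alloc(struct fake_pcpu_list *list, int ncpus)
{
	struct fake_pcpu_list tmp;
	int cpu;

	tmp.head = malloc(sizeof(*tmp.head) * (size_t)ncpus);
	if (!tmp.head)
		return -1;

	for (cpu = 0; cpu < ncpus; cpu++)
		tmp.head[cpu].nr_entries = 0;	/* per-slot init */

	list->head = tmp.head;			/* publish last */
	return 0;
}
```

(In the kernel the "publish" step would additionally need the right
memory-ordering guarantees if readers can race with allocation.)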
>> +
>> +	/*
>> +	 * Disable preemption to make sure that CPU won't gets changed.
>> +	 */
>> +	head = get_cpu_ptr(dlist->head);
>> +	spin_lock(&head->lock);
>> +	node->lockptr = &head->lock;
>> +	list_add(&node->list, &head->list);
>> +	spin_unlock(&head->lock);
>> +	put_cpu_ptr(dlist->head);
>> +}
>> +EXPORT_SYMBOL(dlock_list_add);
>> +
>> +/**
>> + * dlock_list_del - Delete a node from a dlock list
>> + * @node : The node to be deleted
>> + *
>> + * We need to check the lock pointer again after taking the lock to guard
>> + * against concurrent deletion of the same node. If the lock pointer changes
>> + * (becomes NULL or to a different one), we assume that the deletion was done
>> + * elsewhere. A warning will be printed if this happens as it is likely to be
>> + * a bug.
>> + */
>> +void dlock_list_del(struct dlock_list_node *node)
>> +{
>> +	spinlock_t *lock = READ_ONCE(node->lockptr);
>> +
>> +	if (unlikely(!lock)) {
>> +		WARN_ONCE(1,
>> +			"dlock_list_del: node 0x%lx has no associated lock\n",
>> +			(unsigned long)node);
> Maybe "if (WARN_ONCE(!lock...)"? WARN_ONCE implies unlikely.

OK, will do that.

>> +		return;
>> +	}
>> +
>> +	spin_lock(lock);
>> +	if (likely(lock == node->lockptr)) {
>> +		list_del_init(&node->list);
>> +		node->lockptr = NULL;
>> +	} else {
>> +		/*
>> +		 * This path should never be executed.
>> +		 */
>> +		WARN_ON_ONCE(1);
>> +	}
> This still kinda bothers me because this pretty much requires the
> users to have strong synchronization around the operations and makes
> it unusable in situations where opportunistic behaviors are
> acceptable. It negates the usefulness quite a bit.

I understand your concern. I will make it retry again with the new lock.

>> +	spin_unlock(lock);
>> +}
>> +EXPORT_SYMBOL(dlock_list_del);
>> +
>> +/*
>> + * Helper function to find the first entry of the next per-cpu list.
>> + * It works somewhat like for_each_possible_cpu(cpu).
>> + *
>> + * Return: true if the entry is found, false if all the lists exhausted
>> + *
>> + */
>> +static inline bool dlock_list_next_cpu(struct dlock_list_head *dlist,
>		   ^
> Just let the compiler figure it out.

Sure. Will remove the inline tag.

>> +				       struct dlock_list_iter *iter)
>> +{
>> +	if (iter->lock)
>> +		spin_unlock(iter->lock);
>> +next_cpu:
>> +	/*
>> +	 * for_each_possible_cpu(cpu)
>> +	 */
>> +	iter->cpu = cpumask_next(iter->cpu, cpu_possible_mask);
>> +	if (iter->cpu >= nr_cpu_ids)
>> +		return false;	/* All the per-cpu lists iterated */
>> +
>> +	iter->head = &per_cpu_ptr(dlist->head, iter->cpu)->list;
>> +	if (list_empty(iter->head))
>> +		goto next_cpu;
>> +
>> +	iter->lock = &per_cpu_ptr(dlist->head, iter->cpu)->lock;
>> +	spin_lock(iter->lock);
>> +	/*
>> +	 * There is a slight chance that the list may become empty just
>> +	 * before the lock is acquired. So an additional check is
>> +	 * needed to make sure that iter->curr points to a valid entry.
>> +	 */
>> +	if (list_empty(iter->head)) {
>> +		spin_unlock(iter->lock);
>> +		goto next_cpu;
>> +	}
>> +	iter->curr = list_entry(iter->head->next,
>> +				struct dlock_list_node, list);
>> +	return true;
>> +}
> ...
>> +/**
>> + * dlock_list_next_safe - Removal-safe iterator of dlock list
>> + * @dlist: Pointer to the dlock_list_head structure
>> + * @iter : Pointer to the dlock list iterator structure
>> + * Return: true if the next entry is found, false if all the entries iterated
>> + *
>> + * The iterator has to be properly initialized before calling this function.
>> + * This iteration function is safe with respect to list entry removal.
>> + * However, it cannot correctly iterate newly added entries right after the
>> + * current one.
>> + */
> This still looks wrong to me. If you want to provide the two variants
> of iterations, can't you just implement one next function and build
> the two types of iterations on top of it?
I have been thinking about making dlock_list_next_cpu() the real
external function, with two inline functions implementing
dlock_list_next() and dlock_list_next_safe() on top of it. That may
strike a better balance between performance and code abstraction. I
will do so if you have no objection.

Cheers,
Longman
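P.S. A rough userspace sketch of that split, in case it helps the
discussion (simplified to singly-linked sublists with no locking; all
names here are made up for illustration, not the actual patch code):

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_SUBLISTS 4

struct node { struct node *next; };
struct multilist { struct node *heads[NR_SUBLISTS]; };

struct iter {
	int idx;		/* current sublist, -1 before the first call */
	struct node *curr;	/* current entry */
	struct node *next;	/* prefetched entry (removal-safe walk) */
};

/* The one "real" next function, analogous to dlock_list_next_cpu():
 * advance to the head of the next non-empty sublist. */
static bool iter_next_sublist(struct multilist *ml, struct iter *it)
{
	while (++it->idx < NR_SUBLISTS) {
		if (ml->heads[it->idx]) {
			it->curr = ml->heads[it->idx];
			return true;
		}
	}
	return false;
}

/* Plain iterator: step within the current sublist, else move on. */
static bool ml_next(struct multilist *ml, struct iter *it)
{
	if (it->curr && it->curr->next) {
		it->curr = it->curr->next;
		return true;
	}
	return iter_next_sublist(ml, it);
}

/* Removal-safe iterator: prefetch the following entry before handing
 * the current one to the caller, so the caller may delete it->curr. */
static bool ml_next_safe(struct multilist *ml, struct iter *it)
{
	if (it->next) {
		it->curr = it->next;
	} else {
		it->curr = NULL;	/* never touch a possibly-freed node */
		if (!iter_next_sublist(ml, it))
			return false;
	}
	it->next = it->curr->next;
	return true;
}
```

The real dlock_list_next()/dlock_list_next_safe() would of course also
have to take and drop the per-cpu spinlock as they cross sublists, which
this sketch deliberately leaves out.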