From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <56CE09B5.2000602@hpe.com>
Date: Wed, 24 Feb 2016 14:51:17 -0500
From: Waiman Long
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12
MIME-Version: 1.0
To: Jan Kara
CC: Alexander Viro, Jan Kara, Jeff Layton, "J. Bruce Fields", Tejun Heo,
 Christoph Lameter, Ingo Molnar, Peter Zijlstra, Andi Kleen, Dave Chinner,
 Scott J Norton, Douglas Hatch
Subject: Re: [PATCH v3 1/3] lib/percpu-list: Per-cpu list with associated per-cpu locks
References: <1456254272-42313-1-git-send-email-Waiman.Long@hpe.com>
 <1456254272-42313-2-git-send-email-Waiman.Long@hpe.com>
 <20160224075622.GA10096@quack.suse.cz>
In-Reply-To: <20160224075622.GA10096@quack.suse.cz>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/24/2016 02:56 AM, Jan Kara wrote:
> On Tue 23-02-16 14:04:30, Waiman Long wrote:
>> Linked lists are used everywhere in the Linux kernel. However, if many
>> threads are trying to add or delete entries in the same linked list,
>> the list can become a performance bottleneck.
>>
>> This patch introduces a new per-cpu list subsystem with associated
>> per-cpu locks for protecting each of the lists individually. This
>> allows list entry insertion and deletion operations to happen in
>> parallel instead of being serialized with a global list and lock.
>>
>> List entry insertion is strictly per-cpu. List deletion, however, can
>> happen on a cpu other than the one that did the insertion, so we still
>> need a lock to protect the list. Because of that, there may still be
>> a small amount of contention when deletion is being done.
>>
>> A new header file include/linux/percpu-list.h will be added with the
>> associated pcpu_list_head and pcpu_list_node structures. The following
>> functions are provided to manage the per-cpu list:
>>
>> 1. int init_pcpu_list_head(struct pcpu_list_head **ppcpu_head)
>> 2. void pcpu_list_add(struct pcpu_list_node *node,
>>                       struct pcpu_list_head *head)
>> 3. void pcpu_list_del(struct pcpu_list *node)
>>
>> Iteration over all the list entries within a group of per-cpu
>> lists is done by calling either the pcpu_list_iterate() or
>> pcpu_list_iterate_safe() function in a while loop. They correspond
>> to the list_for_each_entry() and list_for_each_entry_safe() macros
>> respectively. The iteration state is kept in a pcpu_list_state
>> structure that is passed to the iteration functions.
>>
>> Signed-off-by: Waiman Long

> Two comments below.
>
>> +/*
>> + * Helper function to find the first entry of the next per-cpu list.
>> + * It works somewhat like for_each_possible_cpu(cpu).
>> + *
>> + * Return: true if an entry is found, false if all the lists are exhausted
>> + */
>> +static __always_inline bool
>> +__pcpu_list_next_cpu(struct pcpu_list_head *head, struct pcpu_list_state *state)
>> +{
>> +	if (state->lock)
>> +		spin_unlock(state->lock);
>> +next_cpu:
>> +	/*
>> +	 * for_each_possible_cpu(cpu)
>> +	 */
>> +	state->cpu = cpumask_next(state->cpu, cpu_possible_mask);
>> +	if (state->cpu >= nr_cpu_ids)
>> +		return false;	/* All the per-cpu lists iterated */
>> +
>> +	state->head = &per_cpu_ptr(head, state->cpu)->list;
>> +	state->lock = &per_cpu_ptr(head, state->cpu)->lock;
>> +	state->curr = list_entry(state->head->next,
>> +				 struct pcpu_list_node, list);
>> +	if (&state->curr->list == state->head)
>> +		goto next_cpu;
>
> This might be more comprehensible as:
>
> 	if (list_empty(state->head))
> 		goto next_cpu;
>
> and you can do it just after updating state->head (no need to init
> state->lock & state->curr if the list is empty).

Thanks for the suggestion.
Will change the code accordingly.

> Another note: Initialization of state->curr is IMO racy - you need to hold
> state->lock to grab state->curr reliably, don't you? Otherwise someone can
> remove the entry while you are working with it. So you need to move that
> down just before returning.

Right. I will move the initialization of state->curr after the spin_lock().

>> +
>> +	spin_lock(state->lock);
>> +	return true;
>> +}
>> +#endif /* NR_CPUS == 1 */
> ...
>
>> +/*
>> + * Delete a node from a percpu list
>> + *
>> + * We need to check the lock pointer again after taking the lock to guard
>> + * against a concurrent delete of the same node. If the lock pointer changes
>> + * (becomes NULL or points to a different lock), we assume that the deletion
>> + * was done elsewhere.
>> + */
>> +void pcpu_list_del(struct pcpu_list_node *node)
>> +{
>> +	spinlock_t *lock = READ_ONCE(node->lockptr);
>> +
>> +	if (unlikely(!lock))
>> +		return;
>> +
>> +	spin_lock(lock);
>> +	if (likely(lock == node->lockptr)) {
>> +		list_del_init(&node->list);
>> +		node->lockptr = NULL;
>> +	}
>
> But someone changing lockptr under your hands would mean that there are
> two processes racing to remove entries, and that would generally point to
> a problem (and likely a use-after-free) in the caller, wouldn't it? Or do
> you have some particular use case in mind?
>
> 								Honza

This is just defensive programming to guard against unforeseen cases. I
don't have any particular use case in mind that would make that happen.
Maybe I should add a WARN_ON if this really happens.

Cheers,
Longman