* Bit spinlocks in DPDK @ 2013-12-06 21:04 Pashupati Kumar [not found] ` <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E6B-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org> 0 siblings, 1 reply; 10+ messages in thread From: Pashupati Kumar @ 2013-12-06 21:04 UTC (permalink / raw) To: dev-VfR2kkLFssw@public.gmane.org Hi, We use bit spinlocks extensively to have compact data structures. Are there any plans for adding them to DPDK in some future release? Thanks Pash ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E6B-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org>]
* Re: Bit spinlocks in DPDK [not found] ` <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E6B-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org> @ 2013-12-06 22:02 ` Thomas Monjalon 2013-12-06 22:12 ` Pashupati Kumar 0 siblings, 1 reply; 10+ messages in thread From: Thomas Monjalon @ 2013-12-06 22:02 UTC (permalink / raw) To: Pashupati Kumar; +Cc: dev-VfR2kkLFssw Hello, 06/12/2013 13:04, Pashupati Kumar : > We use bit spinlocks extensively to have compact data structures. Are there > any plans for adding them to DPDK in some future release? Not sure to understand your request. Are you looking for that? http://dpdk.org/doc/api/rte__spinlock_8h.html -- Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bit spinlocks in DPDK 2013-12-06 22:02 ` Thomas Monjalon @ 2013-12-06 22:12 ` Pashupati Kumar [not found] ` <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E7A-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org> 0 siblings, 1 reply; 10+ messages in thread From: Pashupati Kumar @ 2013-12-06 22:12 UTC (permalink / raw) To: Thomas Monjalon; +Cc: dev-VfR2kkLFssw@public.gmane.org I am looking for spinlocks that use a single bit (bit 31) of a 32 bit word for locking. The rest of the bits in the word are left undisturbed. This enables more compact data structures as only 1 bit is consumed for the lock. Thanks Pash -----Original Message----- From: Thomas Monjalon [mailto:thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org] Sent: Friday, December 06, 2013 2:02 PM To: Pashupati Kumar Cc: dev-VfR2kkLFssw@public.gmane.org Subject: Re: [dpdk-dev] Bit spinlocks in DPDK Hello, 06/12/2013 13:04, Pashupati Kumar : > We use bit spinlocks extensively to have compact data structures. Are > there any plans for adding them to DPDK in some future release? Not sure to understand your request. Are you looking for that? http://dpdk.org/doc/api/rte__spinlock_8h.html -- Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E7A-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org>]
* Re: Bit spinlocks in DPDK [not found] ` <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E7A-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org> @ 2013-12-06 22:24 ` Thomas Monjalon 2013-12-06 22:54 ` Pashupati Kumar 2013-12-07 17:54 ` François-Frédéric Ozog 0 siblings, 2 replies; 10+ messages in thread From: Thomas Monjalon @ 2013-12-06 22:24 UTC (permalink / raw) To: Pashupati Kumar; +Cc: dev-VfR2kkLFssw@public.gmane.org 06/12/2013 14:12, Pashupati Kumar : > From: Thomas Monjalon > > 06/12/2013 13:04, Pashupati Kumar : > > > We use bit spinlocks extensively to have compact data structures. Are > > > there any plans for adding them to DPDK in some future release? > > > > Not sure to understand your request. > > Are you looking for that? > > http://dpdk.org/doc/api/rte__spinlock_8h.html > > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit word > for locking. The rest of the bits in the word are left undisturbed. This > enables more compact data structures as only 1 bit is consumed for the > lock. Oh yes, like test_and_set_bit_lock() in Linux: http://lxr.free-electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205 I think that a patch would be appreciated :) PS: please try to answer below the question. It's far easier to read. -- Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bit spinlocks in DPDK 2013-12-06 22:24 ` Thomas Monjalon @ 2013-12-06 22:54 ` Pashupati Kumar 2013-12-07 17:54 ` François-Frédéric Ozog 1 sibling, 0 replies; 10+ messages in thread From: Pashupati Kumar @ 2013-12-06 22:54 UTC (permalink / raw) To: Thomas Monjalon; +Cc: dev-VfR2kkLFssw@public.gmane.org > 06/12/2013 14:12, Pashupati Kumar : > > From: Thomas Monjalon > > > 06/12/2013 13:04, Pashupati Kumar : > > > > We use bit spinlocks extensively to have compact data structures. > > > > Are there any plans for adding them to DPDK in some future release? > > > > > > Not sure to understand your request. > > > Are you looking for that? > > > http://dpdk.org/doc/api/rte__spinlock_8h.html > > > > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit > > word for locking. The rest of the bits in the word are left > > undisturbed. This enables more compact data structures as only 1 bit > > is consumed for the lock. > > Oh yes, like test_and_set_bit_lock() in Linux: > http://lxr.free- > electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205 > > I think that a patch would be appreciated :) > > PS: please try to answer below the question. It's far easier to read. > -- > Thomas Yes. Thank you. Pash ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bit spinlocks in DPDK 2013-12-06 22:24 ` Thomas Monjalon 2013-12-06 22:54 ` Pashupati Kumar @ 2013-12-07 17:54 ` François-Frédéric Ozog 2013-12-19 16:41 ` Thomas Monjalon 2013-12-20 15:39 ` Thomas Monjalon 1 sibling, 2 replies; 10+ messages in thread From: François-Frédéric Ozog @ 2013-12-07 17:54 UTC (permalink / raw) To: 'Thomas Monjalon', 'Pashupati Kumar'; +Cc: dev-VfR2kkLFssw > -----Message d'origine----- > De : dev [mailto:dev-bounces-VfR2kkLFssw@public.gmane.org] De la part de Thomas Monjalon > Envoyé : vendredi 6 décembre 2013 23:24 > À : Pashupati Kumar > Cc : dev-VfR2kkLFssw@public.gmane.org > Objet : Re: [dpdk-dev] Bit spinlocks in DPDK > > 06/12/2013 14:12, Pashupati Kumar : > > From: Thomas Monjalon > > > 06/12/2013 13:04, Pashupati Kumar : > > > > We use bit spinlocks extensively to have compact data structures. > > > > Are there any plans for adding them to DPDK in some future release? > > > > > > Not sure to understand your request. > > > Are you looking for that? > > > http://dpdk.org/doc/api/rte__spinlock_8h.html > > > > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit > > word for locking. The rest of the bits in the word are left > > undisturbed. This enables more compact data structures as only 1 bit > > is consumed for the lock. > > Oh yes, like test_and_set_bit_lock() in Linux: > http://lxr.free- > electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205 > > I think that a patch would be appreciated :) > > PS: please try to answer below the question. It's far easier to read. > -- > Thomas Hi, I assume you mean the x64 version, not the ia64: http://lxr.free-electrons.com/source/arch/x86/include/asm/bitops.h. I agree with Pash, the goal of compacting data structures is of the highest importance for high performance networking (HPN). Last week I gave a training on some aspects of multiprocessing, and in particular locking. So I got the mood to check DPDK implementation and I have a few remarks: 1) If the critical section deals with weakly ordered loads then explicit fencing MUST be used: if not, out of order execution will just kill your idea of critical section. Weakly order loads occur on a number of situations involving Write Combining memory (say memory mapped IO). see Intel programming guide 3, 8.1.2.2 Software Controlled Bus Locking and 12.10.3 Streaming Load Hint Instruction. So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I recommend the rte_unlock code and documentation explains the out of order execution issues and the conditions they have to be mitigated with rte*mb(). I wonder if having an explicit mfence in rte_sinlock_unlock wouldn't be just necessary to avoid "hairy" bugs. In addition, we would have rte_sinlock_unlock_no_mb used internally for performance reasons, and usable externally by advanced users. 2) code that use rte spinlocks are subject to starvation. A simple code with one producer and one consumer demonstrate the issue: I saw either the producer not giving a single chance to a consumer after 1M loops or the consumer grabs the lock right after the first production and loops for ever testing if the producer had produced something (the cycle of unlock/lock is too fast)! Introducing "random" rte_pause() (actually lcore_id number of pauses) solves the problem in a ugly unpredictable manner. It may not create any visible problem on light loads, but may end up being a real issue with several million packets per second. There are a number of scenarios were we don't care if lock algorithm create starvation, but sometimes we do. So I will provide the community with a ticketlock implementation that is starvation free. What is the policy to share a sample program: inline in a mail or as a attachment? What is the policy to submit a patch? François-Frédéric ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bit spinlocks in DPDK 2013-12-07 17:54 ` François-Frédéric Ozog @ 2013-12-19 16:41 ` Thomas Monjalon 2013-12-20 15:39 ` Thomas Monjalon 1 sibling, 0 replies; 10+ messages in thread From: Thomas Monjalon @ 2013-12-19 16:41 UTC (permalink / raw) To: François-Frédéric Ozog; +Cc: dev-VfR2kkLFssw 07/12/2013 18:54, François-Frédéric Ozog : > > De Thomas Monjalon > > 06/12/2013 14:12, Pashupati Kumar : > > > I am looking for spinlocks that use a single bit (bit 31) of a 32 bit > > > word for locking. The rest of the bits in the word are left > > > undisturbed. This enables more compact data structures as only 1 bit > > > is consumed for the lock. > > > > Oh yes, like test_and_set_bit_lock() in Linux: > > http://lxr.free-electrons.com/source/arch/ia64/include/asm/bitops.h?v=3.12#L205 > > > > I think that a patch would be appreciated :) > > I assume you mean the x64 version, not the ia64: > http://lxr.free-electrons.com/source/arch/x86/include/asm/bitops.h. Yes you're right. > What is the policy to share a sample program: inline in a mail or as a > attachment? Inline should be fine. It allows to comment. > What is the policy to submit a patch? It's now explained here: http://dpdk.org/dev#send -- Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bit spinlocks in DPDK 2013-12-07 17:54 ` François-Frédéric Ozog 2013-12-19 16:41 ` Thomas Monjalon @ 2013-12-20 15:39 ` Thomas Monjalon [not found] ` <201312201639.05277.thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org> 1 sibling, 1 reply; 10+ messages in thread From: Thomas Monjalon @ 2013-12-20 15:39 UTC (permalink / raw) To: François-Frédéric Ozog; +Cc: dev-VfR2kkLFssw Hello, 07/12/2013 18:54, François-Frédéric Ozog : > 1) If the critical section deals with weakly ordered loads then explicit > fencing MUST be used: if not, out of order execution will just kill your > idea of critical section. [...] > So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I recommend > the rte_unlock code and documentation explains the out of order execution > issues and the conditions they have to be mitigated with rte*mb(). I > wonder if having an explicit mfence in rte_sinlock_unlock wouldn't be just > necessary to avoid "hairy" bugs. In addition, we would have > rte_sinlock_unlock_no_mb used internally for performance reasons, and > usable externally by advanced users. Using lock prefix is lighter than using memory barrier and have the same effects. But you're right about the bug in spinlocks. I am going to send a patch for this. -- Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
[parent not found: <201312201639.05277.thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org>]
* Re: Bit spinlocks in DPDK [not found] ` <201312201639.05277.thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org> @ 2013-12-20 16:00 ` François-Frédéric Ozog 2013-12-20 16:36 ` Stephen Hemminger 0 siblings, 1 reply; 10+ messages in thread From: François-Frédéric Ozog @ 2013-12-20 16:00 UTC (permalink / raw) To: 'Thomas Monjalon'; +Cc: dev-VfR2kkLFssw > -----Message d'origine----- > De : Thomas Monjalon [mailto:thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org] > Envoyé : vendredi 20 décembre 2013 16:39 > À : François-Frédéric Ozog > Cc : dev-VfR2kkLFssw@public.gmane.org > Objet : Re: [dpdk-dev] Bit spinlocks in DPDK > > Hello, > > 07/12/2013 18:54, François-Frédéric Ozog : > > 1) If the critical section deals with weakly ordered loads then > > explicit fencing MUST be used: if not, out of order execution will > > just kill your idea of critical section. > [...] > > So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I > > recommend the rte_unlock code and documentation explains the out of > > order execution issues and the conditions they have to be mitigated > > with rte*mb(). I wonder if having an explicit mfence in > > rte_sinlock_unlock wouldn't be just necessary to avoid "hairy" bugs. > > In addition, we would have rte_sinlock_unlock_no_mb used internally > > for performance reasons, and usable externally by advanced users. > > Using lock prefix is lighter than using memory barrier and have the same > effects. Well, in general yes BUT Intel states "../.. locked operations serialize all outstanding load and store operations ../.. with one exception. Load operations that reference weakly ordered memory types (such as the WC memory type) may *not* be serialized" in 8.1.2.2 Software Controlled Bus Locking; particularly if streaming loads are used (may happen on certain devices memory mapped I/O accesses and the compiler generating streaming loads). So this comment is essentially for the PMD writers: use the fencing where appropriate, even if the lock prefix is there. As I will be the one forgetting the rule, I like to have that in the documentation/comments as reminders to keep things neat. François-Frédéric > But you're right about the bug in spinlocks. > I am going to send a patch for this. > > -- > Thomas ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Bit spinlocks in DPDK 2013-12-20 16:00 ` François-Frédéric Ozog @ 2013-12-20 16:36 ` Stephen Hemminger 0 siblings, 0 replies; 10+ messages in thread From: Stephen Hemminger @ 2013-12-20 16:36 UTC (permalink / raw) To: François-Frédéric Ozog; +Cc: dev-VfR2kkLFssw On Fri, 20 Dec 2013 17:00:43 +0100 François-Frédéric Ozog <ff-SpBG2i6TTxU@public.gmane.org> wrote: > > > > -----Message d'origine----- > > De : Thomas Monjalon [mailto:thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org] > > Envoyé : vendredi 20 décembre 2013 16:39 > > À : François-Frédéric Ozog > > Cc : dev-VfR2kkLFssw@public.gmane.org > > Objet : Re: [dpdk-dev] Bit spinlocks in DPDK > > > > Hello, > > > > 07/12/2013 18:54, François-Frédéric Ozog : > > > 1) If the critical section deals with weakly ordered loads then > > > explicit fencing MUST be used: if not, out of order execution will > > > just kill your idea of critical section. > > [...] > > > So use rte_mb() or rte_wmb() or rte_rmb() where appropriate. I > > > recommend the rte_unlock code and documentation explains the out of > > > order execution issues and the conditions they have to be mitigated > > > with rte*mb(). I wonder if having an explicit mfence in > > > rte_sinlock_unlock wouldn't be just necessary to avoid "hairy" bugs. > > > In addition, we would have rte_sinlock_unlock_no_mb used internally > > > for performance reasons, and usable externally by advanced users. > > > > Using lock prefix is lighter than using memory barrier and have the same > > effects. > > Well, in general yes BUT Intel states "../.. locked operations serialize all > outstanding load and store operations ../.. with one exception. Load > operations that reference weakly ordered memory types (such as the WC memory > type) may *not* be serialized" in 8.1.2.2 Software Controlled Bus Locking; > particularly if streaming loads are used (may happen on certain devices > memory mapped I/O accesses and the compiler generating streaming loads). > > So this comment is essentially for the PMD writers: use the fencing where > appropriate, even if the lock prefix is there. As I will be the one > forgetting the rule, I like to have that in the documentation/comments as > reminders to keep things neat. > > François-Frédéric I recommend anyone who needs more information read: Documentation/memory-barriers.txt in the Linux kernel source. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2013-12-20 16:36 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-12-06 21:04 Bit spinlocks in DPDK Pashupati Kumar [not found] ` <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E6B-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org> 2013-12-06 22:02 ` Thomas Monjalon 2013-12-06 22:12 ` Pashupati Kumar [not found] ` <6895EAE0CA8DEE40B92D7CA88BB521F332BA572E7A-7bu+sH2UtuFYKLnPCdnYV+N5nnCVAcvGQQ4Iyu8u01E@public.gmane.org> 2013-12-06 22:24 ` Thomas Monjalon 2013-12-06 22:54 ` Pashupati Kumar 2013-12-07 17:54 ` François-Frédéric Ozog 2013-12-19 16:41 ` Thomas Monjalon 2013-12-20 15:39 ` Thomas Monjalon [not found] ` <201312201639.05277.thomas.monjalon-pdR9zngts4EAvxtiuMwx3w@public.gmane.org> 2013-12-20 16:00 ` François-Frédéric Ozog 2013-12-20 16:36 ` Stephen Hemminger
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).