From: Jerin Jacob <jerin.jacob@caviumnetworks.com>
To: Jia He <hejianet@gmail.com>
Cc: dev@dpdk.org, olivier.matz@6wind.com,
konstantin.ananyev@intel.com, bruce.richardson@intel.com,
jianbo.liu@arm.com, hemant.agrawal@nxp.com,
jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com,
jia.he@hxt-semitech.com
Subject: Re: [PATCH v2] ring: guarantee ordering of cons/prod loading when doing
Date: Thu, 2 Nov 2017 22:53:38 +0530 [thread overview]
Message-ID: <20171102172337.GB1478@jerin> (raw)
In-Reply-To: <1509612210-5499-1-git-send-email-hejianet@gmail.com>
-----Original Message-----
> Date: Thu, 2 Nov 2017 08:43:30 +0000
> From: Jia He <hejianet@gmail.com>
> To: jerin.jacob@caviumnetworks.com, dev@dpdk.org, olivier.matz@6wind.com
> Cc: konstantin.ananyev@intel.com, bruce.richardson@intel.com,
> jianbo.liu@arm.com, hemant.agrawal@nxp.com, Jia He <hejianet@gmail.com>,
> jie2.liu@hxt-semitech.com, bing.zhao@hxt-semitech.com,
> jia.he@hxt-semitech.com
> Subject: [PATCH v2] ring: guarantee ordering of cons/prod loading when doing
> X-Mailer: git-send-email 2.7.4
>
> We watched a rte panic of mbuf_autotest in our qualcomm arm64 server.
> As for the possible race condition, please refer to [1].
Hi Jia,
In addition to Konstantin comments. Please find below some review
comments.
>
> Furthermore, there are 2 options as suggested by Jerin:
> 1. use rte_smp_rmb
> 2. use load_acquire/store_release(refer to [2]).
> CONFIG_RTE_ATOMIC_ACQUIRE_RELEASE_BARRIER_PREFER is provided, and by
I think, The better name would be CONFIG_RTE_RING_USE_C11_MEM_MODEL
or something like that.
> default it is n;
>
> ---
> Changelog:
> V2: let users choose whether using load_acquire/store_release
> V1: rte_smp_rmb() between 2 loads
>
> Signed-off-by: Jia He <hejianet@gmail.com>
> Signed-off-by: jie2.liu@hxt-semitech.com
> Signed-off-by: bing.zhao@hxt-semitech.com
> Signed-off-by: jia.he@hxt-semitech.com
> Suggested-by: jerin.jacob@caviumnetworks.com
> ---
> lib/librte_ring/Makefile | 4 +++-
> lib/librte_ring/rte_ring.h | 38 ++++++++++++++++++++++++------
> lib/librte_ring/rte_ring_arm64.h | 48 ++++++++++++++++++++++++++++++++++++++
> lib/librte_ring/rte_ring_generic.h | 45 +++++++++++++++++++++++++++++++++++
> 4 files changed, 127 insertions(+), 8 deletions(-)
> create mode 100644 lib/librte_ring/rte_ring_arm64.h
> create mode 100644 lib/librte_ring/rte_ring_generic.h
>
> diff --git a/lib/librte_ring/Makefile b/lib/librte_ring/Makefile
> index e34d9d9..fa57a86 100644
> --- a/lib/librte_ring/Makefile
> +++ b/lib/librte_ring/Makefile
> @@ -45,6 +45,8 @@ LIBABIVER := 1
> SRCS-$(CONFIG_RTE_LIBRTE_RING) := rte_ring.c
>
> # install includes
> -SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h
> +SYMLINK-$(CONFIG_RTE_LIBRTE_RING)-include := rte_ring.h \
> + rte_ring_arm64.h \
It is really not specific to arm64. We could rename it to rte_ring_c11_mem.h or
something like that to reflect the implementation based on c11 memory
model.
> + rte_ring_generic.h
>
> include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> index 5e9b3b7..943b1f9 100644
> --- a/lib/librte_ring/rte_ring.h
> +++ b/lib/librte_ring/rte_ring.h
> @@ -103,6 +103,18 @@ extern "C" {
> #include <rte_memzone.h>
> #include <rte_pause.h>
>
> +/* In those strong memory models (e.g. x86), there is no need to add rmb()
> + * between load and load.
> + * In those weak models(powerpc/arm), there are 2 choices for the users
> + * 1.use rmb() memory barrier
> + * 2.use one-direcion load_acquire/store_release barrier
> + * It depends on performance test results. */
> +#ifdef RTE_ARCH_ARM64
s/RTE_ARCH_ARM64/RTE_RING_USE_C11_MEM_MODEL and update the generic arm64 config.
By that way it can used by another architecture like ppc if they choose to do so.
> +#include "rte_ring_arm64.h"
> +#else
> +#include "rte_ring_generic.h"
> +#endif
> +
> #define RTE_TAILQ_RING_NAME "RTE_RING"
>
> enum rte_ring_queue_behavior {
> @@ -368,7 +380,7 @@ update_tail(struct rte_ring_headtail *ht, uint32_t old_val, uint32_t new_val,
> while (unlikely(ht->tail != old_val))
> rte_pause();
>
> - ht->tail = new_val;
> + arch_rte_atomic_store(&ht->tail, new_val, __ATOMIC_RELEASE);
> }
>
> /**
> @@ -408,7 +420,8 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
> /* Reset n to the initial burst count */
> n = max;
>
> - *old_head = r->prod.head;
> + *old_head = arch_rte_atomic_load(&r->prod.head,
> + __ATOMIC_ACQUIRE);
Same as Konstantin comments, i.e move to this function to c11 memory model
header file
> const uint32_t cons_tail = r->cons.tail;
> /*
> * The subtraction is done between two unsigned 32bits value
> @@ -430,8 +443,10 @@ __rte_ring_move_prod_head(struct rte_ring *r, int is_sp,
> if (is_sp)
> r->prod.head = *new_head, success = 1;
> else
> - success = rte_atomic32_cmpset(&r->prod.head,
> - *old_head, *new_head);
> + success = arch_rte_atomic32_cmpset(&r->prod.head,
> + old_head, *new_head,
> + 0, __ATOMIC_ACQUIRE,
> + __ATOMIC_RELAXED);
> } while (unlikely(success == 0));
> return n;
> }
> @@ -470,7 +485,10 @@ __rte_ring_do_enqueue(struct rte_ring *r, void * const *obj_table,
> goto end;
>
> ENQUEUE_PTRS(r, &r[1], prod_head, obj_table, n, void *);
> +
> +#ifndef RTE_ATOMIC_ACQUIRE_RELEASE_BARRIER_PREFER
This #define will be removed if the function moves.
> rte_smp_wmb();
> +#endif
>
> update_tail(&r->prod, prod_head, prod_next, is_sp);
> end:
> @@ -516,7 +534,8 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> /* Restore n as it may change every loop */
> n = max;
>
> - *old_head = r->cons.head;
> + *old_head = arch_rte_atomic_load(&r->cons.head,
> + __ATOMIC_ACQUIRE);
> const uint32_t prod_tail = r->prod.tail;
> /* The subtraction is done between two unsigned 32bits value
> * (the result is always modulo 32 bits even if we have
> @@ -535,8 +554,10 @@ __rte_ring_move_cons_head(struct rte_ring *r, int is_sc,
> if (is_sc)
> r->cons.head = *new_head, success = 1;
> else
> - success = rte_atomic32_cmpset(&r->cons.head, *old_head,
> - *new_head);
> + success = arch_rte_atomic32_cmpset(&r->cons.head,
> + old_head, *new_head,
> + 0, __ATOMIC_ACQUIRE,
> + __ATOMIC_RELAXED);
> } while (unlikely(success == 0));
> return n;
> }
> @@ -575,7 +596,10 @@ __rte_ring_do_dequeue(struct rte_ring *r, void **obj_table,
> goto end;
>
> DEQUEUE_PTRS(r, &r[1], cons_head, obj_table, n, void *);
> +
> +#ifndef RTE_ATOMIC_ACQUIRE_RELEASE_BARRIER_PREFER
This #define will be removed if the function moves.
> rte_smp_rmb();
> +#endif
next prev parent reply other threads:[~2017-11-02 17:23 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-02 8:43 [PATCH v2] ring: guarantee ordering of cons/prod loading when doing Jia He
2017-11-02 13:26 ` Ananyev, Konstantin
2017-11-02 15:42 ` Jia He
2017-11-02 16:16 ` Ananyev, Konstantin
2017-11-02 17:00 ` Jerin Jacob
2017-11-02 17:23 ` Jerin Jacob [this message]
2017-11-03 1:46 ` Jia He
2017-11-03 12:56 ` Jerin Jacob
2017-11-06 7:25 ` Jia He
2017-11-07 4:36 ` Jerin Jacob
2017-11-07 8:34 ` Jia He
2017-11-07 9:57 ` Jerin Jacob
2017-11-08 2:31 ` Jia He
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171102172337.GB1478@jerin \
--to=jerin.jacob@caviumnetworks.com \
--cc=bing.zhao@hxt-semitech.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=hejianet@gmail.com \
--cc=hemant.agrawal@nxp.com \
--cc=jia.he@hxt-semitech.com \
--cc=jianbo.liu@arm.com \
--cc=jie2.liu@hxt-semitech.com \
--cc=konstantin.ananyev@intel.com \
--cc=olivier.matz@6wind.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.