All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Steve Wise" <swise@opengridcomputing.com>
To: 'Sinan Kaya' <okaya@codeaurora.org>,
	netdev@vger.kernel.org, timur@codeaurora.org,
	sulrich@codeaurora.org
Cc: linux-arm-msm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	'Steve Wise' <swise@chelsio.com>,
	'Doug Ledford' <dledford@redhat.com>,
	'Jason Gunthorpe' <jgg@ziepe.ca>,
	linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org,
	'Michael Werner' <werner@chelsio.com>,
	'Casey Leedom' <leedom@chelsio.com>
Subject: RE: [PATCH v3 18/18] infiniband: cxgb4: Eliminate duplicate barriers on weakly-ordered archs
Date: Sat, 17 Mar 2018 08:23:13 -0500	[thread overview]
Message-ID: <001d01d3bdf3$1a607cf0$4f2176d0$@opengridcomputing.com> (raw)
In-Reply-To: <1f5e3b14-05a1-08d0-c0cb-00805526448d@codeaurora.org>

> 
> On 3/17/2018 12:03 AM, Sinan Kaya wrote:
> > On 3/16/2018 11:40 PM, Sinan Kaya wrote:
> >> I'll change writel_relaxed() with __raw_writel() in the series like you
> suggested
> >> and also look at your other comments.
> >
> > I spoke too soon.
> >
> > Now that I realized, code needs to follow one of the following patterns
> for correctness
> >
> > 1)
> > wmb()
> > writel()/writel_relaxed()
> >
> > or
> >
> > 2)
> > wmb()
> > __raw_wrltel()
> > mmiowb()
> >
> > but definitely not
> >
> > wmb()
> > __raw_wrltel()
> >
> > Since #1 == #2, I'll stick to my current implementation of writel_relaxed()
> >
> > Changing writel() to writel_relaxed() or __raw_writel() isn't enough.
> PowerPC needs mmiowb()
> > for correctness. ARM's mmiowb() implementation is empty.
> >
> > So, there is no one size fits all solution with the current state of affairs.
> >
> >
> 
> I think I finally got what you mean.
> 
> Code seems to have
> 
> wmb()
> writel()/writeq()
> wmb()
> 
> this can be safely replaced with
> 
> wmb()
> __raw_writel()/__raw_writeq()
> wmb()
> 
> This will work on all arches. Below is the new version. Let me know if this is
> OK.
> 
> +++ b/drivers/infiniband/hw/cxgb4/t4.h
> @@ -457,7 +457,7 @@ static inline void pio_copy(u64 __iomem *dst, u64
> *src)
>         int count = 8;
> 
>         while (count) {
> -               writeq(*src, dst);
> +               __raw_writeq(*src, dst);
>                 src++;
>                 dst++;
>                 count--;
> @@ -477,15 +477,16 @@ static inline void t4_ring_sq_db(struct t4_wq *wq,
> u16 inc, union t4_wr *wqe)
>                                  (u64 *)wqe);
>                 } else {
>                         pr_debug("DB wq->sq.pidx = %d\n", wq->sq.pidx);
> -                       writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid),
> -                              wq->sq.bar2_va + SGE_UDB_KDOORBELL);
> +                       __raw_writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid),
> +                                    wq->sq.bar2_va + SGE_UDB_KDOORBELL);
>                 }
> 
>                 /* Flush user doorbell area writes. */
>                 wmb();
>                 return;
>         }
> -       writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db);
> +       __raw_writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db);
> +       mmiowmb()
>  }
> 
>  static inline void t4_ring_rq_db(struct t4_wq *wq, u16 inc,
> @@ -502,15 +503,16 @@ static inline void t4_ring_rq_db(struct t4_wq *wq,
> u16 inc,
>                                  (void *)wqe);
>                 } else {
>                         pr_debug("DB wq->rq.pidx = %d\n", wq->rq.pidx);
> -                       writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid),
> -                              wq->rq.bar2_va + SGE_UDB_KDOORBELL);
> +                       __raw_writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid),
> +                                    wq->rq.bar2_va + SGE_UDB_KDOORBELL);
>                 }
> 
>                 /* Flush user doorbell area writes. */
>                 wmb();
>                 return;
>         }
> -       writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db);
> +       __raw_writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db);
> +       mmiowmb();
>  }
> 
> 

Yes, this is what chelsio recommended to me.  

Reviewed-by: Steve Wise <swise@opengridcomputing.com>

WARNING: multiple messages have this Message-ID (diff)
From: swise@opengridcomputing.com (Steve Wise)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v3 18/18] infiniband: cxgb4: Eliminate duplicate barriers on weakly-ordered archs
Date: Sat, 17 Mar 2018 08:23:13 -0500	[thread overview]
Message-ID: <001d01d3bdf3$1a607cf0$4f2176d0$@opengridcomputing.com> (raw)
In-Reply-To: <1f5e3b14-05a1-08d0-c0cb-00805526448d@codeaurora.org>

> 
> On 3/17/2018 12:03 AM, Sinan Kaya wrote:
> > On 3/16/2018 11:40 PM, Sinan Kaya wrote:
> >> I'll change writel_relaxed() with __raw_writel() in the series like you
> suggested
> >> and also look at your other comments.
> >
> > I spoke too soon.
> >
> > Now that I realized, code needs to follow one of the following patterns
> for correctness
> >
> > 1)
> > wmb()
> > writel()/writel_relaxed()
> >
> > or
> >
> > 2)
> > wmb()
> > __raw_wrltel()
> > mmiowb()
> >
> > but definitely not
> >
> > wmb()
> > __raw_wrltel()
> >
> > Since #1 == #2, I'll stick to my current implementation of writel_relaxed()
> >
> > Changing writel() to writel_relaxed() or __raw_writel() isn't enough.
> PowerPC needs mmiowb()
> > for correctness. ARM's mmiowb() implementation is empty.
> >
> > So, there is no one size fits all solution with the current state of affairs.
> >
> >
> 
> I think I finally got what you mean.
> 
> Code seems to have
> 
> wmb()
> writel()/writeq()
> wmb()
> 
> this can be safely replaced with
> 
> wmb()
> __raw_writel()/__raw_writeq()
> wmb()
> 
> This will work on all arches. Below is the new version. Let me know if this is
> OK.
> 
> +++ b/drivers/infiniband/hw/cxgb4/t4.h
> @@ -457,7 +457,7 @@ static inline void pio_copy(u64 __iomem *dst, u64
> *src)
>         int count = 8;
> 
>         while (count) {
> -               writeq(*src, dst);
> +               __raw_writeq(*src, dst);
>                 src++;
>                 dst++;
>                 count--;
> @@ -477,15 +477,16 @@ static inline void t4_ring_sq_db(struct t4_wq *wq,
> u16 inc, union t4_wr *wqe)
>                                  (u64 *)wqe);
>                 } else {
>                         pr_debug("DB wq->sq.pidx = %d\n", wq->sq.pidx);
> -                       writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid),
> -                              wq->sq.bar2_va + SGE_UDB_KDOORBELL);
> +                       __raw_writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid),
> +                                    wq->sq.bar2_va + SGE_UDB_KDOORBELL);
>                 }
> 
>                 /* Flush user doorbell area writes. */
>                 wmb();
>                 return;
>         }
> -       writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db);
> +       __raw_writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db);
> +       mmiowmb()
>  }
> 
>  static inline void t4_ring_rq_db(struct t4_wq *wq, u16 inc,
> @@ -502,15 +503,16 @@ static inline void t4_ring_rq_db(struct t4_wq *wq,
> u16 inc,
>                                  (void *)wqe);
>                 } else {
>                         pr_debug("DB wq->rq.pidx = %d\n", wq->rq.pidx);
> -                       writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid),
> -                              wq->rq.bar2_va + SGE_UDB_KDOORBELL);
> +                       __raw_writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid),
> +                                    wq->rq.bar2_va + SGE_UDB_KDOORBELL);
>                 }
> 
>                 /* Flush user doorbell area writes. */
>                 wmb();
>                 return;
>         }
> -       writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db);
> +       __raw_writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db);
> +       mmiowmb();
>  }
> 
> 

Yes, this is what chelsio recommended to me.  

Reviewed-by: Steve Wise <swise@opengridcomputing.com>

WARNING: multiple messages have this Message-ID (diff)
From: "Steve Wise" <swise@opengridcomputing.com>
To: "'Sinan Kaya'" <okaya@codeaurora.org>, <netdev@vger.kernel.org>,
	<timur@codeaurora.org>, <sulrich@codeaurora.org>
Cc: <linux-arm-msm@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	"'Steve Wise'" <swise@chelsio.com>,
	"'Doug Ledford'" <dledford@redhat.com>,
	"'Jason Gunthorpe'" <jgg@ziepe.ca>, <linux-rdma@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	"'Michael Werner'" <werner@chelsio.com>,
	"'Casey Leedom'" <leedom@chelsio.com>
Subject: RE: [PATCH v3 18/18] infiniband: cxgb4: Eliminate duplicate barriers on weakly-ordered archs
Date: Sat, 17 Mar 2018 08:23:13 -0500	[thread overview]
Message-ID: <001d01d3bdf3$1a607cf0$4f2176d0$@opengridcomputing.com> (raw)
In-Reply-To: <1f5e3b14-05a1-08d0-c0cb-00805526448d@codeaurora.org>

> 
> On 3/17/2018 12:03 AM, Sinan Kaya wrote:
> > On 3/16/2018 11:40 PM, Sinan Kaya wrote:
> >> I'll change writel_relaxed() with __raw_writel() in the series like you
> suggested
> >> and also look at your other comments.
> >
> > I spoke too soon.
> >
> > Now that I realized, code needs to follow one of the following patterns
> for correctness
> >
> > 1)
> > wmb()
> > writel()/writel_relaxed()
> >
> > or
> >
> > 2)
> > wmb()
> > __raw_wrltel()
> > mmiowb()
> >
> > but definitely not
> >
> > wmb()
> > __raw_wrltel()
> >
> > Since #1 == #2, I'll stick to my current implementation of writel_relaxed()
> >
> > Changing writel() to writel_relaxed() or __raw_writel() isn't enough.
> PowerPC needs mmiowb()
> > for correctness. ARM's mmiowb() implementation is empty.
> >
> > So, there is no one size fits all solution with the current state of affairs.
> >
> >
> 
> I think I finally got what you mean.
> 
> Code seems to have
> 
> wmb()
> writel()/writeq()
> wmb()
> 
> this can be safely replaced with
> 
> wmb()
> __raw_writel()/__raw_writeq()
> wmb()
> 
> This will work on all arches. Below is the new version. Let me know if this is
> OK.
> 
> +++ b/drivers/infiniband/hw/cxgb4/t4.h
> @@ -457,7 +457,7 @@ static inline void pio_copy(u64 __iomem *dst, u64
> *src)
>         int count = 8;
> 
>         while (count) {
> -               writeq(*src, dst);
> +               __raw_writeq(*src, dst);
>                 src++;
>                 dst++;
>                 count--;
> @@ -477,15 +477,16 @@ static inline void t4_ring_sq_db(struct t4_wq *wq,
> u16 inc, union t4_wr *wqe)
>                                  (u64 *)wqe);
>                 } else {
>                         pr_debug("DB wq->sq.pidx = %d\n", wq->sq.pidx);
> -                       writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid),
> -                              wq->sq.bar2_va + SGE_UDB_KDOORBELL);
> +                       __raw_writel(PIDX_T5_V(inc) | QID_V(wq->sq.bar2_qid),
> +                                    wq->sq.bar2_va + SGE_UDB_KDOORBELL);
>                 }
> 
>                 /* Flush user doorbell area writes. */
>                 wmb();
>                 return;
>         }
> -       writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db);
> +       __raw_writel(QID_V(wq->sq.qid) | PIDX_V(inc), wq->db);
> +       mmiowmb()
>  }
> 
>  static inline void t4_ring_rq_db(struct t4_wq *wq, u16 inc,
> @@ -502,15 +503,16 @@ static inline void t4_ring_rq_db(struct t4_wq *wq,
> u16 inc,
>                                  (void *)wqe);
>                 } else {
>                         pr_debug("DB wq->rq.pidx = %d\n", wq->rq.pidx);
> -                       writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid),
> -                              wq->rq.bar2_va + SGE_UDB_KDOORBELL);
> +                       __raw_writel(PIDX_T5_V(inc) | QID_V(wq->rq.bar2_qid),
> +                                    wq->rq.bar2_va + SGE_UDB_KDOORBELL);
>                 }
> 
>                 /* Flush user doorbell area writes. */
>                 wmb();
>                 return;
>         }
> -       writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db);
> +       __raw_writel(QID_V(wq->rq.qid) | PIDX_V(inc), wq->db);
> +       mmiowmb();
>  }
> 
> 

Yes, this is what chelsio recommended to me.  

Reviewed-by: Steve Wise <swise@opengridcomputing.com>

  parent reply	other threads:[~2018-03-17 13:23 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-16 16:16 [PATCH v3 00/18] Eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
2018-03-16 16:16 ` Sinan Kaya
2018-03-16 16:16 ` [Intel-wired-lan] [PATCH v3 01/18] i40e/i40evf: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [Intel-wired-lan] [PATCH v3 02/18] ixgbe: eliminate " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [Intel-wired-lan] [PATCH v3 03/18] igbvf: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [Intel-wired-lan] [PATCH v3 04/18] igb: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [Intel-wired-lan] [PATCH v3 05/18] ixgbevf: keep writel() closer to wmb() Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [Intel-wired-lan] [PATCH v3 06/18] ixgbevf: eliminate duplicate barriers on weakly-ordered archs Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 07/18] drivers: net: cxgb: Eliminate " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 08/18] scsi: hpsa: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [Intel-wired-lan] [PATCH v3 09/18] fm10k: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:30   ` [Intel-wired-lan] " Alexander Duyck
2018-03-16 16:30     ` Alexander Duyck
2018-03-16 16:30     ` Alexander Duyck
2018-03-16 16:33     ` Sinan Kaya
2018-03-16 16:33       ` Sinan Kaya
2018-03-16 16:33       ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 10/18] net: qla3xxx: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 11/18] qlcnic: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-19 20:10   ` Chopra, Manish
2018-03-19 20:10     ` Chopra, Manish
2018-03-16 16:16 ` [PATCH v3 12/18] bnx2x: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 13/18] net: cxgb4/cxgb4vf: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 14/18] net: cxgb3: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 15/18] RDMA/bnxt_re: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 16/18] IB/mlx4: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 17/18] RDMA/i40iw: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 16:16 ` [PATCH v3 18/18] infiniband: cxgb4: " Sinan Kaya
2018-03-16 16:16   ` Sinan Kaya
2018-03-16 21:05   ` Steve Wise
2018-03-16 21:05     ` Steve Wise
2018-03-16 21:05     ` Steve Wise
2018-03-16 21:46     ` Sinan Kaya
2018-03-16 21:46       ` Sinan Kaya
2018-03-16 23:05       ` Steve Wise
2018-03-16 23:05         ` Steve Wise
2018-03-16 23:05         ` Steve Wise
2018-03-17  3:40         ` Sinan Kaya
2018-03-17  3:40           ` Sinan Kaya
2018-03-17  4:03           ` Sinan Kaya
2018-03-17  4:03             ` Sinan Kaya
2018-03-17  4:25             ` Sinan Kaya
2018-03-17  4:25               ` Sinan Kaya
2018-03-17  4:30               ` Timur Tabi
2018-03-17  4:30                 ` Timur Tabi
2018-03-17 13:23               ` Steve Wise [this message]
2018-03-17 13:23                 ` Steve Wise
2018-03-17 13:23                 ` Steve Wise
2018-03-17 13:27               ` David Miller
2018-03-17 13:27                 ` David Miller
2018-03-17 15:05               ` Jason Gunthorpe
2018-03-17 15:05                 ` Jason Gunthorpe
2018-03-17 18:30                 ` Sinan Kaya
2018-03-17 18:30                   ` Sinan Kaya
2018-03-19  1:48                   ` Jason Gunthorpe
2018-03-19  1:48                     ` Jason Gunthorpe
2018-03-16 22:13     ` Jason Gunthorpe
2018-03-16 22:13       ` Jason Gunthorpe
2018-03-16 23:04       ` Steve Wise
2018-03-16 23:04         ` Steve Wise
2018-03-16 23:04         ` Steve Wise
2018-03-17  4:08         ` Timur Tabi
2018-03-17  4:08           ` Timur Tabi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='001d01d3bdf3$1a607cf0$4f2176d0$@opengridcomputing.com' \
    --to=swise@opengridcomputing.com \
    --cc=dledford@redhat.com \
    --cc=jgg@ziepe.ca \
    --cc=leedom@chelsio.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=okaya@codeaurora.org \
    --cc=sulrich@codeaurora.org \
    --cc=swise@chelsio.com \
    --cc=timur@codeaurora.org \
    --cc=werner@chelsio.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.