Re: [PATCH v4 05/14] net-next/yunsilicon: Add eq and alloc

From: Simon Horman <horms@kernel.org>
To: Xin Tian <tianx@yunsilicon.com>
Cc: netdev@vger.kernel.org, leon@kernel.org, andrew+netdev@lunn.ch,
	kuba@kernel.org, pabeni@redhat.com, edumazet@google.com,
	davem@davemloft.net, jeff.johnson@oss.qualcomm.com,
	przemyslaw.kitszel@intel.com, weihg@yunsilicon.com,
	wanry@yunsilicon.com, parthiban.veerasooran@microchip.com,
	masahiroy@kernel.org
Subject: Re: [PATCH v4 05/14] net-next/yunsilicon: Add eq and alloc
Date: Tue, 18 Feb 2025 17:10:36 +0000
Message-ID: <20250218171036.GB1615191@kernel.org>
In-Reply-To: <20250213091412.2067626-6-tianx@yunsilicon.com>

On Thu, Feb 13, 2025 at 05:14:14PM +0800, Xin Tian wrote:
> Add eq management and buffer alloc apis
> 
> Signed-off-by: Xin Tian <tianx@yunsilicon.com>
> Signed-off-by: Honggang Wei <weihg@yunsilicon.com>

...

> diff --git a/drivers/net/ethernet/yunsilicon/xsc/common/xsc_core.h b/drivers/net/ethernet/yunsilicon/xsc/common/xsc_core.h

...

> +struct xsc_eq_table {
> +	void __iomem	       *update_ci;
> +	void __iomem	       *update_arm_ci;
> +	struct list_head       comp_eqs_list;

nit: The indentation of the member names above seems inconsistent
     with what is below.

> +	struct xsc_eq		pages_eq;
> +	struct xsc_eq		async_eq;
> +	struct xsc_eq		cmd_eq;
> +	int			num_comp_vectors;
> +	int			eq_vec_comp_base;
> +	/* protect EQs list
> +	 */
> +	spinlock_t		lock;
> +};

...

> diff --git a/drivers/net/ethernet/yunsilicon/xsc/pci/alloc.c b/drivers/net/ethernet/yunsilicon/xsc/pci/alloc.c

...

> +/* Handling for queue buffers -- we allocate a bunch of memory and
> + * register it in a memory region at HCA virtual address 0.  If the
> + * requested size is > max_direct, we split the allocation into
> + * multiple pages, so we don't require too much contiguous memory.
> + */

I can't help but think there is an existing API to handle this.

> +int xsc_buf_alloc(struct xsc_core_device *xdev, int size, int max_direct,

I think unsigned long would be a slightly better type for size and max_direct.

> +		  struct xsc_buf *buf)
> +{
> +	dma_addr_t t;
> +
> +	buf->size = size;
> +	if (size <= max_direct) {
> +		buf->nbufs        = 1;
> +		buf->npages       = 1;
> +		buf->page_shift   = get_order(size) + PAGE_SHIFT;
> +		buf->direct.buf   = dma_alloc_coherent(&xdev->pdev->dev,
> +						       size,
> +						       &t,
> +						       GFP_KERNEL | __GFP_ZERO);
> +		if (!buf->direct.buf)
> +			return -ENOMEM;
> +
> +		buf->direct.map = t;
> +
> +		while (t & ((1 << buf->page_shift) - 1)) {

I think GENMASK() can be used here.
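
For illustration, a minimal userspace sketch of how the alignment test
might read with a GENMASK()-style mask. In the kernel, GENMASK() comes
from <linux/bits.h>; SKETCH_GENMASK(), low_bits_set() and
sketch_fit_page_shift() are stand-ins invented for this example, not
part of the patch, and untested against the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(h, l): bits l..h set,
 * valid for 0 <= l <= h <= 63.
 */
#define SKETCH_GENMASK(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* Hypothetical helper: nonzero if addr is not aligned to 1 << page_shift. */
static inline int low_bits_set(uint64_t addr, unsigned int page_shift)
{
	assert(page_shift >= 1 && page_shift <= 64);
	return (addr & SKETCH_GENMASK(page_shift - 1, 0)) != 0;
}

/* Mirrors the loop in the patch: shrink page_shift until addr is aligned,
 * doubling the page count each time. The page_shift > 0 guard keeps the
 * mask well defined; with a shift of 0 the original mask is 0 as well.
 */
unsigned int sketch_fit_page_shift(uint64_t addr, unsigned int page_shift,
				   unsigned int *npages)
{
	while (page_shift > 0 && low_bits_set(addr, page_shift)) {
		--page_shift;
		*npages *= 2;
	}
	return page_shift;
}
```

The point is only that the mask names the bit range it covers, rather
than relying on the reader to decode (1 << n) - 1.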

> +			--buf->page_shift;
> +			buf->npages *= 2;
> +		}
> +	} else {
> +		int i;
> +
> +		buf->direct.buf  = NULL;
> +		buf->nbufs       = (size + PAGE_SIZE - 1) / PAGE_SIZE;

I think this is open-coding DIV_ROUND_UP
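
A small userspace sketch of the equivalence, for reference. The macro
body matches the kernel's DIV_ROUND_UP(); SKETCH_PAGE_SIZE and
sketch_nbufs() are names made up for this example, and 4096 is only an
assumed page size.

```c
#include <assert.h>
#include <limits.h>

/* Same expression as the kernel's DIV_ROUND_UP(); repeated here so the
 * sketch builds outside the kernel tree.
 */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed page size for the example only; the driver would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* Number of page-sized buffers needed to hold size bytes. */
unsigned long sketch_nbufs(unsigned long size)
{
	/* Guard against wrap-around in the rounding addition. */
	assert(size <= ULONG_MAX - (SKETCH_PAGE_SIZE - 1));
	return DIV_ROUND_UP(size, SKETCH_PAGE_SIZE);
}
```

With that, the assignment in the patch could presumably become
buf->nbufs = DIV_ROUND_UP(size, PAGE_SIZE).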

> +		buf->npages      = buf->nbufs;
> +		buf->page_shift  = PAGE_SHIFT;
> +		buf->page_list   = kcalloc(buf->nbufs, sizeof(*buf->page_list),
> +					   GFP_KERNEL);
> +		if (!buf->page_list)
> +			return -ENOMEM;
> +
> +		for (i = 0; i < buf->nbufs; i++) {
> +			buf->page_list[i].buf =
> +				dma_alloc_coherent(&xdev->pdev->dev, PAGE_SIZE,
> +						   &t, GFP_KERNEL | __GFP_ZERO);
> +			if (!buf->page_list[i].buf)
> +				goto err_free;
> +
> +			buf->page_list[i].map = t;
> +		}
> +
> +		if (BITS_PER_LONG == 64) {
> +			struct page **pages;
> +
> +			pages = kmalloc_array(buf->nbufs, sizeof(*pages),
> +					      GFP_KERNEL);
> +			if (!pages)
> +				goto err_free;
> +			for (i = 0; i < buf->nbufs; i++) {
> +				void *addr = buf->page_list[i].buf;
> +
> +				if (is_vmalloc_addr(addr))
> +					pages[i] = vmalloc_to_page(addr);
> +				else
> +					pages[i] = virt_to_page(addr);
> +			}
> +			buf->direct.buf = vmap(pages, buf->nbufs,
> +					       VM_MAP, PAGE_KERNEL);
> +			kfree(pages);
> +			if (!buf->direct.buf)
> +				goto err_free;
> +		}

I think some explanation is warranted of why the above is relevant
only when BITS_PER_LONG == 64.

> +	}
> +
> +	return 0;
> +
> +err_free:
> +	xsc_buf_free(xdev, buf);
> +
> +	return -ENOMEM;
> +}

...

> +void xsc_fill_page_array(struct xsc_buf *buf, __be64 *pas, int npages)

As per my comment on unsigned long in my response to another patch,
I think npages can be unsigned long.

> +{
> +	int shift = PAGE_SHIFT - PAGE_SHIFT_4K;
> +	int mask = (1 << shift) - 1;

Likewise, I think that mask should be an unsigned long.
Or, both shift and mask could be #defines, as they are compile-time
constants.

Also, mask can be generated using GENMASK, e.g.

#define XSC_PAGE_ARRAY_MASK GENMASK(PAGE_SHIFT, PAGE_SHIFT_4K)
#define XSC_PAGE_ARRAY_SHIFT (PAGE_SHIFT - PAGE_SHIFT_4K)

And I note, in the (common) case of 4k pages, that both shift and mask are 0.
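
To make the index arithmetic concrete, here is a userspace sketch of
the multi-page branch. It keeps the patch's (1 << shift) - 1 form for
the mask. sketch_chunk_addr(), page_map and SKETCH_PAGE_SHIFT_4K are
hypothetical names for this example; page_shift is passed in so both
the 4k and a larger page size can be exercised.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT_4K 12

/* Hypothetical helper: DMA address of the i-th 4k chunk of a buffer made
 * of pages of size 1 << page_shift. page_map[] holds each page's DMA
 * address, as buf->page_list[].map does in the patch.
 */
uint64_t sketch_chunk_addr(const uint64_t *page_map, unsigned int page_shift,
			   unsigned long i)
{
	unsigned int shift;
	unsigned long mask;

	assert(page_shift >= SKETCH_PAGE_SHIFT_4K);
	shift = page_shift - SKETCH_PAGE_SHIFT_4K;
	mask = (1UL << shift) - 1;

	/* High bits of i select the page, low bits the 4k chunk within it. */
	return page_map[i >> shift] +
	       ((uint64_t)(i & mask) << SKETCH_PAGE_SHIFT_4K);
}
```

With 4k pages both shift and mask are 0, so the expression reduces to
page_map[i]. With 64k pages shift is 4 and mask is 15, so sixteen
consecutive values of i land in the same page.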

> +	u64 addr;
> +	int i;
> +
> +	for (i = 0; i < npages; i++) {
> +		if (buf->nbufs == 1)
> +			addr = buf->direct.map + (i << PAGE_SHIFT_4K);
> +		else
> +			addr = buf->page_list[i >> shift].map
> +			       + ((i & mask) << PAGE_SHIFT_4K);

The line above is open-coding FIELD_PREP().
However, I don't think it can be used here, as
the compiler complains very loudly because the mask is 0.

> +
> +		pas[i] = cpu_to_be64(addr);
> +	}
> +}
> diff --git a/drivers/net/ethernet/yunsilicon/xsc/pci/alloc.h b/drivers/net/ethernet/yunsilicon/xsc/pci/alloc.h

...

> +static void eq_update_ci(struct xsc_eq *eq, int arm)
> +{
> +	struct xsc_eq_doorbell db = {0};
> +
> +	db.data0 = XSC_SET_FIELD(cpu_to_le32(eq->cons_index),
> +				 XSC_EQ_DB_NEXT_CID) |
> +		   XSC_SET_FIELD(cpu_to_le32(eq->eqn), XSC_EQ_DB_EQ_ID);

Each of the two uses of XSC_SET_FIELD() is passed a little-endian value
and a host byte-order mask. This does not seem correct, as it seems
the byte order should be consistent.

> +	if (arm)
> +		db.data0 |= XSC_EQ_DB_ARM;

Likewise, here data0 is little-endian while XSC_EQ_DB_ARM is host
byte-order.

> +	writel(db.data0, XSC_REG_ADDR(eq->dev, eq->doorbell));

And here, db.data0 is little-endian, but writel expects a host-byte order
value (which it converts to little-endian).

I didn't dig deeper but it seems to me that it would be easier to change
the type of data0 to host byte-order and drop the use of cpu_to_le32()
above.

These issues were flagged by Sparse.
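
A userspace sketch of the host byte-order approach (completely
untested against the driver). The field layout below is invented for
illustration; the real XSC_EQ_DB_* definitions live elsewhere in the
series. sketch_store_le32() only stands in for the little-endian store
that writel() performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical doorbell layout, for illustration only. */
#define SKETCH_DB_NEXT_CID_SHIFT 0
#define SKETCH_DB_NEXT_CID_MASK  UINT32_C(0x0000ffff)
#define SKETCH_DB_EQ_ID_SHIFT    16
#define SKETCH_DB_EQ_ID_MASK     UINT32_C(0x07ff0000)
#define SKETCH_DB_ARM            UINT32_C(0x08000000)

/* Build the doorbell word entirely in host byte order: values, masks and
 * flags all share one byte order, so no conversion is needed here.
 */
uint32_t sketch_eq_db(uint32_t cons_index, uint32_t eqn, int arm)
{
	uint32_t db;

	assert(eqn <= (SKETCH_DB_EQ_ID_MASK >> SKETCH_DB_EQ_ID_SHIFT));
	db = ((cons_index << SKETCH_DB_NEXT_CID_SHIFT) & SKETCH_DB_NEXT_CID_MASK) |
	     ((eqn << SKETCH_DB_EQ_ID_SHIFT) & SKETCH_DB_EQ_ID_MASK);
	if (arm)
		db |= SKETCH_DB_ARM;
	return db;
}

/* Stand-in for writel(): the single place where the host value is laid
 * out as little-endian bytes.
 */
void sketch_store_le32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}
```

The byte-order conversion then happens exactly once, at the register
write, which is the property Sparse is checking for.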

> +	/* We still want ordering, just not swabbing, so add a barrier */
> +	mb();
> +}

...

> +static int xsc_eq_int(struct xsc_core_device *xdev, struct xsc_eq *eq)
> +{
> +	u32 cqn, qpn, queue_id;
> +	struct xsc_eqe *eqe;
> +	int eqes_found = 0;
> +	int set_ci = 0;
> +
> +	while ((eqe = next_eqe_sw(eq))) {
> +		/* Make sure we read EQ entry contents after we've
> +		 * checked the ownership bit.
> +		 */
> +		rmb();
> +		switch (eqe->type) {
> +		case XSC_EVENT_TYPE_COMP:
> +		case XSC_EVENT_TYPE_INTERNAL_ERROR:
> +			/* eqe is changing */
> +			queue_id = le16_to_cpu(XSC_GET_FIELD(eqe->queue_id_data,
> +							     XSC_EQE_QUEUE_ID));

Similarly, here XSC_GET_FIELD() is passed a little-endian value and a host
byte-order mask, which is inconsistent.

Perhaps this should be (completely untested!):

			queue_id = XSC_GET_FIELD(le16_to_cpu(eqe->queue_id_data),
						 XSC_EQE_QUEUE_ID);

Likewise for the two uses of XSC_GET_FIELD below.

And perhaps queue_id could be renamed, say to q_id, to make things a bit
more succinct.
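
A userspace sketch of the convert-then-extract order suggested above.
The mask value is a guess made for the example and may not match the
real XSC_EQE_QUEUE_ID; sketch_le16_to_cpu() stands in for the kernel's
le16_to_cpu() by assembling the value from raw little-endian bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask: assume the low 15 bits carry the queue id. */
#define SKETCH_EQE_QUEUE_ID_MASK 0x7fffu

/* Userspace stand-in for le16_to_cpu(): build the host-order value from
 * two little-endian bytes, independent of the host's own endianness.
 */
static inline uint16_t sketch_le16_to_cpu(const uint8_t raw[2])
{
	return (uint16_t)(raw[0] | ((unsigned int)raw[1] << 8));
}

/* Convert to host byte order first, then apply the host-order mask. */
uint16_t sketch_eqe_queue_id(const uint8_t raw[2])
{
	assert(raw);
	return (uint16_t)(sketch_le16_to_cpu(raw) & SKETCH_EQE_QUEUE_ID_MASK);
}
```

Applying the mask before the conversion, as the patch does, would pick
the wrong bits on a big-endian host.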


> +			cqn = queue_id;

I'm unsure why both cqn and queue_id are needed.

> +			xsc_cq_completion(xdev, cqn);
> +			break;
> +
> +		case XSC_EVENT_TYPE_CQ_ERROR:
> +			queue_id = le16_to_cpu(XSC_GET_FIELD(eqe->queue_id_data,
> +							     XSC_EQE_QUEUE_ID));
> +			cqn = queue_id;
> +			xsc_eq_cq_event(xdev, cqn, eqe->type);
> +			break;
> +		case XSC_EVENT_TYPE_WQ_CATAS_ERROR:
> +		case XSC_EVENT_TYPE_WQ_INVAL_REQ_ERROR:
> +		case XSC_EVENT_TYPE_WQ_ACCESS_ERROR:
> +			queue_id = le16_to_cpu(XSC_GET_FIELD(eqe->queue_id_data,
> +							     XSC_EQE_QUEUE_ID));
> +			qpn = queue_id;
> +			xsc_qp_event(xdev, qpn, eqe->type);
> +			break;
> +		default:
> +			break;
> +		}
> +
> +		++eq->cons_index;
> +		eqes_found = 1;
> +		++set_ci;
> +
> +		/* The HCA will think the queue has overflowed if we
> +		 * don't tell it we've been processing events.  We
> +		 * create our EQs with XSC_NUM_SPARE_EQE extra
> +		 * entries, so we must update our consumer index at
> +		 * least that often.
> +		 */
> +		if (unlikely(set_ci >= XSC_NUM_SPARE_EQE)) {
> +			eq_update_ci(eq, 0);
> +			set_ci = 0;
> +		}
> +	}
> +
> +	eq_update_ci(eq, 1);
> +
> +	return eqes_found;
> +}

...

Thread overview: 26+ messages
2025-02-13  9:14 [PATCH v4 00/14] net-next/yunsilicon: ADD Yunsilicon XSC Ethernet Driver Xin Tian
2025-02-13  9:14 ` [PATCH v4 01/14] net-next/yunsilicon: Add xsc driver basic framework Xin Tian
2025-02-13  9:14 ` [PATCH v4 02/14] net-next/yunsilicon: Enable CMDQ Xin Tian
2025-02-13  9:14 ` [PATCH v4 03/14] net-next/yunsilicon: Add hardware setup APIs Xin Tian
2025-02-13  9:14 ` [PATCH v4 04/14] net-next/yunsilicon: Add qp and cq management Xin Tian
2025-02-18 16:31   ` Simon Horman
2025-02-20  8:58     ` tianx
2025-02-13  9:14 ` [PATCH v4 05/14] net-next/yunsilicon: Add eq and alloc Xin Tian
2025-02-18 17:10   ` Simon Horman [this message]
2025-02-20 15:35     ` tianx
2025-02-24 18:58       ` Simon Horman
2025-02-25  2:34         ` Xin Tian
2025-02-25 10:22           ` Simon Horman
2025-02-13  9:14 ` [PATCH v4 06/14] net-next/yunsilicon: Add pci irq Xin Tian
2025-02-13  9:14 ` [PATCH v4 07/14] net-next/yunsilicon: Init auxiliary device Xin Tian
2025-02-13 14:37   ` Leon Romanovsky
2025-02-14  3:14     ` tianx
2025-02-16  9:59       ` Leon Romanovsky
2025-02-17  2:16         ` tianx
2025-02-13  9:14 ` [PATCH v4 08/14] net-next/yunsilicon: Add ethernet interface Xin Tian
2025-02-13  9:14 ` [PATCH v4 09/14] net-next/yunsilicon: Init net device Xin Tian
2025-02-13  9:14 ` [PATCH v4 10/14] net-next/yunsilicon: Add eth needed qp and cq apis Xin Tian
2025-02-13  9:14 ` [PATCH v4 11/14] net-next/yunsilicon: ndo_open and ndo_stop Xin Tian
2025-02-13  9:14 ` [PATCH v4 12/14] net-next/yunsilicon: Add ndo_start_xmit Xin Tian
2025-02-13  9:14 ` [PATCH v4 13/14] net-next/yunsilicon: Add eth rx Xin Tian
2025-02-13  9:14 ` [PATCH v4 14/14] net-next/yunsilicon: add ndo_get_stats64 Xin Tian