* [RFC PATCH 03/12] soc: qcom: ipa: generic software interface
From: Alex Elder @ 2018-11-07  0:32 UTC
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: netdev, devicetree, linux-arm-msm, linux-soc, linux-arm-kernel,
	linux-kernel, syadagir, mjavid, robh+dt, mark.rutland
In-Reply-To: <20181107003250.5832-1-elder@linaro.org>

This patch contains the code supporting the Generic Software
Interface (GSI) used by the IPA.  Although the GSI is an integral
part of the IPA, it provides a well-defined layer between the AP
subsystem (or, for that matter, the modem) and the IPA core.

The GSI code presents an abstract interface through which commands
and data transfers can be queued for execution on a channel.  A
hardware independent gsi_xfer_elem structure describes a single
transfer, and an array of these can be queued on a channel.  The
information in the gsi_xfer_elem is converted by the GSI layer into
the specific layout required by the hardware.
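
As a rough user-space illustration of that conversion (not code from
this patch), the sketch below packs a hardware-independent description
into a 16-byte element following the struct gsi_tre field order in
gsi.c.  The names toy_xfer, toy_put_le() and toy_tre_pack() are
hypothetical, and the bit positions assume bitfields are allocated
from the least significant bit on a little-endian target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_TRE_SIZE	16	/* bytes, as GSI_RING_ELEMENT_SIZE */

/* Hypothetical hardware-independent transfer description */
struct toy_xfer {
	uint64_t addr;		/* buffer address */
	uint16_t len;		/* buffer length in bytes */
	int chain;		/* chained to the next element? */
	int ieot;		/* interrupt on end of transfer? */
	uint8_t re_type;	/* e.g. 0x2, the GSI_RE_XFER value */
};

/* Store the low 'bytes' bytes of v at dst, least significant first */
static void toy_put_le(uint8_t *dst, uint64_t v, unsigned int bytes)
{
	assert(bytes <= 8);
	for (unsigned int i = 0; i < bytes; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/* Pack a description into the assumed 16-byte element layout */
static void toy_tre_pack(uint8_t tre[TOY_TRE_SIZE], const struct toy_xfer *x)
{
	memset(tre, 0, TOY_TRE_SIZE);	/* reserved fields stay zero */
	toy_put_le(&tre[0], x->addr, 8);	/* buffer_ptr */
	toy_put_le(&tre[8], x->len, 2);		/* buf_len */
	tre[12] = x->chain ? 0x01 : 0x00;	/* chain: assumed bit 0 */
	tre[13] = x->ieot ? 0x02 : 0x00;	/* ieot: assumed bit 1 */
	tre[14] = x->re_type;			/* re_type */
}
```

Packing with explicit shifts rather than C bitfields keeps the sketch
independent of the compiler's bitfield ordering.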

A channel has an associated event ring, through which completion of
channel commands can be signaled.  GSI channel commands are completed
in order, and may optionally generate an interrupt on completion.
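
The in-order rule is what lets one event stand for many completions.
A minimal user-space sketch of the idea follows; toy_ring,
toy_ring_next() and toy_ring_completed() are hypothetical names, and
only the 16-byte element size is taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ELEMENT_SIZE	16u	/* bytes, as GSI_RING_ELEMENT_SIZE */

/* Hypothetical ring occupying the addresses [base, end) */
struct toy_ring {
	uint64_t base;	/* address of the first element */
	uint64_t end;	/* address just past the last element */
	uint64_t rp;	/* oldest element not yet known to be complete */
};

/* Advance an element address by one slot, wrapping at the end */
static uint64_t toy_ring_next(const struct toy_ring *ring, uint64_t addr)
{
	addr += TOY_ELEMENT_SIZE;
	return addr == ring->end ? ring->base : addr;
}

/* An event names the last completed element.  Because completion is
 * strictly in order, every element from ring->rp up to and including
 * that one is complete.  Returns the number of completed elements and
 * moves rp past them.
 */
static uint32_t toy_ring_completed(struct toy_ring *ring, uint64_t last)
{
	uint32_t count = 0;

	assert(last >= ring->base && last < ring->end);
	assert((last - ring->base) % TOY_ELEMENT_SIZE == 0);

	for (;;) {
		uint64_t cur = ring->rp;

		ring->rp = toy_ring_next(ring, cur);
		count++;
		if (cur == last)
			break;
	}

	return count;
}
```

With a four-element ring whose rp sits at element 2, an event naming
element 0 accounts for elements 2, 3 and 0 in a single step.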

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/gsi.c     | 1685 +++++++++++++++++++++++++++++++++++++
 drivers/net/ipa/gsi.h     |  195 +++++
 drivers/net/ipa/gsi_reg.h |  563 +++++++++++++
 3 files changed, 2443 insertions(+)
 create mode 100644 drivers/net/ipa/gsi.c
 create mode 100644 drivers/net/ipa/gsi.h
 create mode 100644 drivers/net/ipa/gsi_reg.h
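
Note for reviewers: event ring ids in gsi.c are handed out from a
bitmap in which a zero bit means "available" (gsi_evt_bmap_init() plus
an ffz() lookup).  The user-space model below is only a sketch of that
scheme; the toy_* names are hypothetical and a fixed 64-bit map is
assumed in place of unsigned long.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MHI_ER_START	10	/* as GSI_MHI_ER_START */
#define TOY_MHI_ER_END		16	/* as GSI_MHI_ER_END */

/* Set bits lo..hi inclusive, in the spirit of GENMASK_ULL(hi, lo) */
static uint64_t toy_genmask(unsigned int hi, unsigned int lo)
{
	assert(lo <= hi && hi < 64);
	return (~UINT64_C(0) >> (63 - hi)) & (~UINT64_C(0) << lo);
}

/* Mark ids the hardware lacks, and the reserved range, as in use */
static uint64_t toy_evt_bmap_init(unsigned int evt_ring_max)
{
	uint64_t bmap = toy_genmask(63, evt_ring_max);

	return bmap | toy_genmask(TOY_MHI_ER_END, TOY_MHI_ER_START);
}

/* Claim the lowest free id; returns -1 if every id is taken */
static int toy_evt_id_alloc(uint64_t *bmap)
{
	for (unsigned int id = 0; id < 64; id++) {
		if (!(*bmap & (UINT64_C(1) << id))) {
			*bmap |= UINT64_C(1) << id;
			return (int)id;
		}
	}

	return -1;
}

/* Release a previously claimed id */
static void toy_evt_id_free(uint64_t *bmap, unsigned int id)
{
	assert(id < 64 && (*bmap & (UINT64_C(1) << id)));
	*bmap &= ~(UINT64_C(1) << id);
}
```

Unlike the driver, which asserts that a free id exists, this model
reports exhaustion with -1 so the caller can see it.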

diff --git a/drivers/net/ipa/gsi.c b/drivers/net/ipa/gsi.c
new file mode 100644
index 000000000000..348ee1fc1bf5
--- /dev/null
+++ b/drivers/net/ipa/gsi.c
@@ -0,0 +1,1685 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/log2.h>
+#include <linux/bitfield.h>
+#include <linux/atomic.h>
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/completion.h>
+#include <linux/jiffies.h>
+#include <linux/string.h>
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/bug.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/platform_device.h>
+
+#include "gsi.h"
+#include "gsi_reg.h"
+#include "ipa_dma.h"
+#include "ipa_i.h"	/* ipa_err() */
+
+/**
+ * DOC: The Role of GSI in IPA Operation
+ *
+ * The generic software interface (GSI) is an integral component of
+ * the IPA, providing a well-defined layer between the AP subsystem
+ * (or, for that matter, the modem) and the IPA core::
+ *
+ *  ----------   -------------   ---------
+ *  |        |   |G|       |G|   |       |
+ *  |  APSS  |===|S|  IPA  |S|===| Modem |
+ *  |        |   |I|       |I|   |       |
+ *  ----------   -------------   ---------
+ *
+ * In the above diagram, the APSS and Modem represent "execution
+ * environments" (EEs), which are independent operating environments
+ * that use the IPA for data transfer.
+ *
+ * Each EE uses a set of unidirectional GSI "channels," which allow
+ * transfer of data to or from the IPA.  A channel is implemented as a
+ * ring buffer, with a DRAM-resident array of "transfer elements" (TREs)
+ * available to describe transfers to or from other EEs through the IPA.
+ * A transfer element can also contain an immediate command, requesting
+ * the IPA perform actions other than data transfer.
+ *
+ * Each transfer element refers to a block of data--also located in DRAM.
+ * After writing one or more TREs to a channel, the writer (either the
+ * IPA or an EE) writes a doorbell register to inform the receiving side
+ * how many elements have been written.  Writing to a doorbell register
+ * triggers an interrupt on the receiver.
+ *
+ * Each channel has a GSI "event ring" associated with it.  An event
+ * ring is implemented very much like a channel ring, but is always
+ * directed from the IPA to an EE.  The IPA notifies an EE (such as
+ * the AP) about channel events by adding an entry to the event ring
+ * associated with the channel; when it writes the event ring's
+ * doorbell register the EE will be interrupted.
+ *
+ * A transfer element has a set of flags.  One flag indicates whether
+ * the completion of the transfer operation generates a channel event.
+ * Another flag allows transfer elements to be chained together,
+ * forming a single logical transaction.  These flags are used to
+ * control whether and when interrupts are generated to signal
+ * completion of a channel transfer.
+ *
+ * Elements in channel and event rings are completed (or consumed)
+ * strictly in order.  Completion of one entry implies the completion
+ * of all preceding entries.  A single completion interrupt can
+ * therefore be used to communicate the completion of many transfers.
+ */
+
+#define GSI_RING_ELEMENT_SIZE	16	/* bytes (channel or event ring) */
+
+#define GSI_CHAN_MAX		14
+#define GSI_EVT_RING_MAX	10
+
+/* Delay period if interrupt moderation is in effect */
+#define IPA_GSI_EVT_RING_INT_MODT	(32 * 1) /* 1ms under 32KHz clock */
+
+#define GSI_CMD_TIMEOUT		msecs_to_jiffies(5 * MSEC_PER_SEC)
+
+#define GSI_MHI_ER_START	10	/* First reserved event number */
+#define GSI_MHI_ER_END		16	/* Last reserved event number */
+
+#define GSI_RESET_WA_MIN_SLEEP	1000	/* microseconds */
+#define GSI_RESET_WA_MAX_SLEEP	2000	/* microseconds */
+
+#define GSI_MAX_PREFETCH	0	/* 0 means 1 segment; 1 means 2 */
+
+#define GSI_ISR_MAX_ITER	50
+
+/* Hardware values from the error log register code field */
+enum gsi_err_code {
+	GSI_INVALID_TRE_ERR			= 0x1,
+	GSI_OUT_OF_BUFFERS_ERR			= 0x2,
+	GSI_OUT_OF_RESOURCES_ERR		= 0x3,
+	GSI_UNSUPPORTED_INTER_EE_OP_ERR		= 0x4,
+	GSI_EVT_RING_EMPTY_ERR			= 0x5,
+	GSI_NON_ALLOCATED_EVT_ACCESS_ERR	= 0x6,
+	GSI_HWO_1_ERR				= 0x8,
+};
+
+/* Hardware values used when programming an event ring context */
+enum gsi_evt_chtype {
+	GSI_EVT_CHTYPE_MHI_EV	= 0x0,
+	GSI_EVT_CHTYPE_XHCI_EV	= 0x1,
+	GSI_EVT_CHTYPE_GPI_EV	= 0x2,
+	GSI_EVT_CHTYPE_XDCI_EV	= 0x3,
+};
+
+/* Hardware values used when programming a channel context */
+enum gsi_channel_protocol {
+	GSI_CHANNEL_PROTOCOL_MHI	= 0x0,
+	GSI_CHANNEL_PROTOCOL_XHCI	= 0x1,
+	GSI_CHANNEL_PROTOCOL_GPI	= 0x2,
+	GSI_CHANNEL_PROTOCOL_XDCI	= 0x3,
+};
+
+/* Hardware values returned in a transfer completion event structure */
+enum gsi_channel_evt {
+	GSI_CHANNEL_EVT_INVALID		= 0x0,
+	GSI_CHANNEL_EVT_SUCCESS		= 0x1,
+	GSI_CHANNEL_EVT_EOT		= 0x2,
+	GSI_CHANNEL_EVT_OVERFLOW	= 0x3,
+	GSI_CHANNEL_EVT_EOB		= 0x4,
+	GSI_CHANNEL_EVT_OOB		= 0x5,
+	GSI_CHANNEL_EVT_DB_MODE		= 0x6,
+	GSI_CHANNEL_EVT_UNDEFINED	= 0x10,
+	GSI_CHANNEL_EVT_RE_ERROR	= 0x11,
+};
+
+/* Hardware values signifying the state of an event ring */
+enum gsi_evt_ring_state {
+	GSI_EVT_RING_STATE_NOT_ALLOCATED	= 0x0,
+	GSI_EVT_RING_STATE_ALLOCATED		= 0x1,
+	GSI_EVT_RING_STATE_ERROR		= 0xf,
+};
+
+/* Hardware values signifying the state of a channel */
+enum gsi_channel_state {
+	GSI_CHANNEL_STATE_NOT_ALLOCATED	= 0x0,
+	GSI_CHANNEL_STATE_ALLOCATED	= 0x1,
+	GSI_CHANNEL_STATE_STARTED	= 0x2,
+	GSI_CHANNEL_STATE_STOPPED	= 0x3,
+	GSI_CHANNEL_STATE_STOP_IN_PROC	= 0x4,
+	GSI_CHANNEL_STATE_ERROR		= 0xf,
+};
+
+struct gsi_ring {
+	spinlock_t slock;		/* protects wp, rp updates */
+	struct ipa_dma_mem mem;
+	u64 wp;
+	u64 rp;
+	u64 wp_local;
+	u64 rp_local;
+	u64 end;			/* physical addr past last element */
+};
+
+struct gsi_channel {
+	bool from_ipa;			/* true: IPA->AP; false: AP->IPA */
+	bool priority;		/* Does hardware give this channel priority? */
+	enum gsi_channel_state state;
+	struct gsi_ring ring;
+	void *notify_data;
+	void **user_data;
+	struct gsi_evt_ring *evt_ring;
+	struct mutex mutex;		/* protects channel_scratch updates */
+	struct completion compl;
+	atomic_t poll_mode;
+	u32 tlv_count;			/* # slots in TLV */
+};
+
+struct gsi_evt_ring {
+	bool moderation;
+	enum gsi_evt_ring_state state;
+	struct gsi_ring ring;
+	struct completion compl;
+	struct gsi_channel *channel;
+};
+
+struct ch_debug_stats {
+	unsigned long ch_allocate;
+	unsigned long ch_start;
+	unsigned long ch_stop;
+	unsigned long ch_reset;
+	unsigned long ch_de_alloc;
+	unsigned long ch_db_stop;
+	unsigned long cmd_completed;
+};
+
+struct gsi {
+	void __iomem *base;
+	struct device *dev;
+	u32 phys;
+	unsigned int irq;
+	bool irq_wake_enabled;
+	spinlock_t slock;	/* protects global register updates */
+	struct mutex mutex;	/* protects 1-at-a-time commands, evt_bmap */
+	atomic_t channel_count;
+	atomic_t evt_ring_count;
+	struct gsi_channel channel[GSI_CHAN_MAX];
+	struct ch_debug_stats ch_dbg[GSI_CHAN_MAX];
+	struct gsi_evt_ring evt_ring[GSI_EVT_RING_MAX];
+	unsigned long evt_bmap;
+	u32 channel_max;
+	u32 evt_ring_max;
+};
+
+/* Hardware values representing a transfer element type */
+enum gsi_re_type {
+	GSI_RE_XFER	= 0x2,
+	GSI_RE_IMMD_CMD	= 0x3,
+	GSI_RE_NOP	= 0x4,
+};
+
+struct gsi_tre {
+	u64 buffer_ptr;
+	u16 buf_len;
+	u16 rsvd1;
+	u8  chain	: 1,
+	    rsvd4	: 7;
+	u8  ieob	: 1,
+	    ieot	: 1,
+	    bei		: 1,
+	    rsvd3	: 5;
+	u8 re_type;
+	u8 rsvd2;
+} __packed;
+
+struct gsi_xfer_compl_evt {
+	u64 xfer_ptr;
+	u16 len;
+	u8 rsvd1;
+	u8 code;  /* see gsi_channel_evt */
+	u16 rsvd;
+	u8 type;
+	u8 chid;
+} __packed;
+
+/* Hardware values from the error log register error type field */
+enum gsi_err_type {
+	GSI_ERR_TYPE_GLOB	= 0x1,
+	GSI_ERR_TYPE_CHAN	= 0x2,
+	GSI_ERR_TYPE_EVT	= 0x3,
+};
+
+struct gsi_log_err {
+	u8  arg3	: 4,
+	    arg2	: 4;
+	u8  arg1	: 4,
+	    code	: 4;
+	u8  rsvd	: 3,
+	    virt_idx	: 5;
+	u8  err_type	: 4,
+	    ee		: 4;
+} __packed;
+
+/* Hardware values representing a channel immediate command opcode */
+enum gsi_ch_cmd_opcode {
+	GSI_CH_ALLOCATE	= 0x0,
+	GSI_CH_START	= 0x1,
+	GSI_CH_STOP	= 0x2,
+	GSI_CH_RESET	= 0x9,
+	GSI_CH_DE_ALLOC	= 0xa,
+	GSI_CH_DB_STOP	= 0xb,
+};
+
+/* Hardware values representing an event ring immediate command opcode */
+enum gsi_evt_ch_cmd_opcode {
+	GSI_EVT_ALLOCATE	= 0x0,
+	GSI_EVT_RESET		= 0x9,
+	GSI_EVT_DE_ALLOC	= 0xa,
+};
+
+/** gsi_gpi_channel_scratch - GPI protocol SW config area of channel scratch
+ *
+ * @max_outstanding_tre: Used for the prefetch management sequence by the
+ *			 sequencer. Defines the maximum number of allowed
+ *			 outstanding TREs in IPA/GSI (in Bytes). RE engine
+ *			 prefetch will be limited by this configuration. It
+ *			 is suggested to configure this value to IPA_IF
+ *			 channel TLV queue size times element size. To disable
+ *			 the feature in doorbell mode (DB Mode=1), the maximum
+ *			 outstanding TREs should be set to 64KB (or any value
+ *			 greater than or equal to the ring length, RLEN).
+ * @outstanding_threshold: Used for the prefetch management sequence by the
+ *			 sequencer. Defines the threshold (in Bytes) as to when
+ *			 to update the channel doorbell. Should be smaller than
+ *			 the maximum outstanding TREs value. It is suggested to
+ *			 configure this value to 2 * element size.
+ */
+struct gsi_gpi_channel_scratch {
+	u64 rsvd1;
+	u16 rsvd2;
+	u16 max_outstanding_tre;
+	u16 rsvd3;
+	u16 outstanding_threshold;
+} __packed;
+
+/** gsi_channel_scratch - channel scratch SW config area */
+union gsi_channel_scratch {
+	struct gsi_gpi_channel_scratch gpi;
+	struct {
+		u32 word1;
+		u32 word2;
+		u32 word3;
+		u32 word4;
+	} data;
+} __packed;
+
+/* Read a value from the given offset into the I/O space defined in
+ * the GSI context.
+ */
+static u32 gsi_readl(struct gsi *gsi, u32 offset)
+{
+	return readl(gsi->base + offset);
+}
+
+/* Write the provided value to the given offset into the I/O space
+ * defined in the GSI context.
+ */
+static void gsi_writel(struct gsi *gsi, u32 v, u32 offset)
+{
+	writel(v, gsi->base + offset);
+}
+
+static void
+_gsi_irq_control_event(struct gsi *gsi, u32 evt_ring_id, bool enable)
+{
+	u32 mask = BIT(evt_ring_id);
+	u32 val;
+
+	val = gsi_readl(gsi, GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFS);
+	if (enable)
+		val |= mask;
+	else
+		val &= ~mask;
+	gsi_writel(gsi, val, GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFS);
+}
+
+static void gsi_irq_disable_event(struct gsi *gsi, u32 evt_ring_id)
+{
+	_gsi_irq_control_event(gsi, evt_ring_id, false);
+}
+
+static void gsi_irq_enable_event(struct gsi *gsi, u32 evt_ring_id)
+{
+	_gsi_irq_control_event(gsi, evt_ring_id, true);
+}
+
+static void _gsi_irq_control_all(struct gsi *gsi, bool enable)
+{
+	u32 val = enable ? ~0 : 0;
+
+	/* Inter-EE commands and interrupts are not supported. */
+	gsi_writel(gsi, val, GSI_CNTXT_TYPE_IRQ_MSK_OFFS);
+	gsi_writel(gsi, val, GSI_CNTXT_SRC_CH_IRQ_MSK_OFFS);
+	gsi_writel(gsi, val, GSI_CNTXT_SRC_EV_CH_IRQ_MSK_OFFS);
+	gsi_writel(gsi, val, GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFS);
+	gsi_writel(gsi, val, GSI_CNTXT_GLOB_IRQ_EN_OFFS);
+	/* Never enable GSI_BREAK_POINT */
+	val &= ~FIELD_PREP(EN_BREAK_POINT_FMASK, 1);
+	gsi_writel(gsi, val, GSI_CNTXT_GSI_IRQ_EN_OFFS);
+}
+
+static void gsi_irq_disable_all(struct gsi *gsi)
+{
+	_gsi_irq_control_all(gsi, false);
+}
+
+static void gsi_irq_enable_all(struct gsi *gsi)
+{
+	_gsi_irq_control_all(gsi, true);
+}
+
+static u32 gsi_channel_id(struct gsi *gsi, struct gsi_channel *channel)
+{
+	return (u32)(channel - &gsi->channel[0]);
+}
+
+static u32 gsi_evt_ring_id(struct gsi *gsi, struct gsi_evt_ring *evt_ring)
+{
+	return (u32)(evt_ring - &gsi->evt_ring[0]);
+}
+
+static enum gsi_channel_state gsi_channel_state(struct gsi *gsi, u32 channel_id)
+{
+	u32 val = gsi_readl(gsi, GSI_CH_C_CNTXT_0_OFFS(channel_id));
+
+	return (enum gsi_channel_state)FIELD_GET(CHSTATE_FMASK, val);
+}
+
+static enum gsi_evt_ring_state
+gsi_evt_ring_state(struct gsi *gsi, u32 evt_ring_id)
+{
+	u32 val = gsi_readl(gsi, GSI_EV_CH_E_CNTXT_0_OFFS(evt_ring_id));
+
+	return (enum gsi_evt_ring_state)FIELD_GET(EV_CHSTATE_FMASK, val);
+}
+
+static void gsi_isr_chan_ctrl(struct gsi *gsi)
+{
+	u32 channel_mask;
+
+	channel_mask = gsi_readl(gsi, GSI_CNTXT_SRC_CH_IRQ_OFFS);
+	gsi_writel(gsi, channel_mask, GSI_CNTXT_SRC_CH_IRQ_CLR_OFFS);
+
+	ipa_assert(!(channel_mask & ~GENMASK(gsi->channel_max - 1, 0)));
+
+	while (channel_mask) {
+		struct gsi_channel *channel;
+		int i = __ffs(channel_mask);
+
+		channel = &gsi->channel[i];
+		channel->state = gsi_channel_state(gsi, i);
+
+		complete(&channel->compl);
+
+		channel_mask ^= BIT(i);
+	}
+}
+
+static void gsi_isr_evt_ctrl(struct gsi *gsi)
+{
+	u32 evt_mask;
+
+	evt_mask = gsi_readl(gsi, GSI_CNTXT_SRC_EV_CH_IRQ_OFFS);
+	gsi_writel(gsi, evt_mask, GSI_CNTXT_SRC_EV_CH_IRQ_CLR_OFFS);
+
+	ipa_assert(!(evt_mask & ~GENMASK(gsi->evt_ring_max - 1, 0)));
+
+	while (evt_mask) {
+		struct gsi_evt_ring *evt_ring;
+		int i = __ffs(evt_mask);
+
+		evt_ring = &gsi->evt_ring[i];
+		evt_ring->state = gsi_evt_ring_state(gsi, i);
+
+		complete(&evt_ring->compl);
+
+		evt_mask ^= BIT(i);
+	}
+}
+
+static void
+gsi_isr_glob_chan_err(struct gsi *gsi, u32 err_ee, u32 channel_id, u32 code)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	if (err_ee != IPA_EE_AP)
+		ipa_bug_on(code != GSI_UNSUPPORTED_INTER_EE_OP_ERR);
+
+	if (WARN_ON(channel_id >= gsi->channel_max)) {
+		ipa_err("unexpected channel_id %u\n", channel_id);
+		return;
+	}
+
+	switch (code) {
+	case GSI_INVALID_TRE_ERR:
+		ipa_err("got INVALID_TRE_ERR\n");
+		channel->state = gsi_channel_state(gsi, channel_id);
+		ipa_bug_on(channel->state != GSI_CHANNEL_STATE_ERROR);
+		break;
+	case GSI_OUT_OF_BUFFERS_ERR:
+		ipa_err("got OUT_OF_BUFFERS_ERR\n");
+		break;
+	case GSI_OUT_OF_RESOURCES_ERR:
+		ipa_err("got OUT_OF_RESOURCES_ERR\n");
+		complete(&channel->compl);
+		break;
+	case GSI_UNSUPPORTED_INTER_EE_OP_ERR:
+		ipa_err("got UNSUPPORTED_INTER_EE_OP_ERR\n");
+		break;
+	case GSI_NON_ALLOCATED_EVT_ACCESS_ERR:
+		ipa_err("got NON_ALLOCATED_EVT_ACCESS_ERR\n");
+		break;
+	case GSI_HWO_1_ERR:
+		ipa_err("got HWO_1_ERR\n");
+		break;
+	default:
+		ipa_err("unexpected channel error code %u\n", code);
+		ipa_bug();
+	}
+}
+
+static void
+gsi_isr_glob_evt_err(struct gsi *gsi, u32 err_ee, u32 evt_ring_id, u32 code)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+
+	if (err_ee != IPA_EE_AP)
+		ipa_bug_on(code != GSI_UNSUPPORTED_INTER_EE_OP_ERR);
+
+	if (WARN_ON(evt_ring_id >= gsi->evt_ring_max)) {
+		ipa_err("unexpected evt_ring_id %u\n", evt_ring_id);
+		return;
+	}
+
+	switch (code) {
+	case GSI_OUT_OF_BUFFERS_ERR:
+		ipa_err("got OUT_OF_BUFFERS_ERR\n");
+		break;
+	case GSI_OUT_OF_RESOURCES_ERR:
+		ipa_err("got OUT_OF_RESOURCES_ERR\n");
+		complete(&evt_ring->compl);
+		break;
+	case GSI_UNSUPPORTED_INTER_EE_OP_ERR:
+		ipa_err("got UNSUPPORTED_INTER_EE_OP_ERR\n");
+		break;
+	case GSI_EVT_RING_EMPTY_ERR:
+		ipa_err("got EVT_RING_EMPTY_ERR\n");
+		break;
+	default:
+		ipa_err("unexpected event error code %u\n", code);
+		ipa_bug();
+	}
+}
+
+static void gsi_isr_glob_err(struct gsi *gsi, u32 err)
+{
+	struct gsi_log_err *log = (struct gsi_log_err *)&err;
+
+	ipa_err("log err_type %u ee %u idx %u\n", log->err_type, log->ee,
+		log->virt_idx);
+	ipa_err("log code 0x%1x arg1 0x%1x arg2 0x%1x arg3 0x%1x\n", log->code,
+		log->arg1, log->arg2, log->arg3);
+
+	ipa_bug_on(log->err_type == GSI_ERR_TYPE_GLOB);
+
+	switch (log->err_type) {
+	case GSI_ERR_TYPE_CHAN:
+		gsi_isr_glob_chan_err(gsi, log->ee, log->virt_idx, log->code);
+		break;
+	case GSI_ERR_TYPE_EVT:
+		gsi_isr_glob_evt_err(gsi, log->ee, log->virt_idx, log->code);
+		break;
+	default:
+		WARN_ON(1);
+	}
+}
+
+static void gsi_isr_glob_ee(struct gsi *gsi)
+{
+	u32 val;
+
+	val = gsi_readl(gsi, GSI_CNTXT_GLOB_IRQ_STTS_OFFS);
+
+	if (val & ERROR_INT_FMASK) {
+		u32 err = gsi_readl(gsi, GSI_ERROR_LOG_OFFS);
+
+		gsi_writel(gsi, 0, GSI_ERROR_LOG_OFFS);
+		gsi_writel(gsi, ~0, GSI_ERROR_LOG_CLR_OFFS);
+
+		gsi_isr_glob_err(gsi, err);
+	}
+
+	if (val & EN_GP_INT1_FMASK)
+		ipa_err("unexpected GP INT1 received\n");
+
+	ipa_bug_on(val & EN_GP_INT2_FMASK);
+	ipa_bug_on(val & EN_GP_INT3_FMASK);
+
+	gsi_writel(gsi, val, GSI_CNTXT_GLOB_IRQ_CLR_OFFS);
+}
+
+static void ring_wp_local_inc(struct gsi_ring *ring)
+{
+	ring->wp_local += GSI_RING_ELEMENT_SIZE;
+	if (ring->wp_local == ring->end)
+		ring->wp_local = ring->mem.phys;
+}
+
+static void ring_rp_local_inc(struct gsi_ring *ring)
+{
+	ring->rp_local += GSI_RING_ELEMENT_SIZE;
+	if (ring->rp_local == ring->end)
+		ring->rp_local = ring->mem.phys;
+}
+
+static u16 ring_rp_local_index(struct gsi_ring *ring)
+{
+	return (ring->rp_local - ring->mem.phys) / GSI_RING_ELEMENT_SIZE;
+}
+
+static u16 ring_wp_local_index(struct gsi_ring *ring)
+{
+	return (ring->wp_local - ring->mem.phys) / GSI_RING_ELEMENT_SIZE;
+}
+
+static void channel_xfer_cb(struct gsi_channel *channel, u16 count)
+{
+	void *xfer_data;
+
+	if (!channel->from_ipa) {
+		u16 ring_rp_local = ring_rp_local_index(&channel->ring);
+
+		xfer_data = channel->user_data[ring_rp_local];
+		ipa_gsi_irq_tx_notify_cb(xfer_data);
+	} else {
+		ipa_gsi_irq_rx_notify_cb(channel->notify_data, count);
+	}
+}
+
+static u16 gsi_channel_process(struct gsi *gsi, struct gsi_xfer_compl_evt *evt,
+			       bool callback)
+{
+	struct gsi_channel *channel;
+	u32 channel_id = (u32)evt->chid;
+
+	ipa_assert(channel_id < gsi->channel_max);
+
+	/* Event tells us the last completed channel ring element */
+	channel = &gsi->channel[channel_id];
+	channel->ring.rp_local = evt->xfer_ptr;
+
+	if (callback) {
+		if (evt->code == GSI_CHANNEL_EVT_EOT)
+			channel_xfer_cb(channel, evt->len);
+		else
+			ipa_err("ch %u unexpected %sX event id %hhu\n",
+				channel_id, channel->from_ipa ? "R" : "T",
+				evt->code);
+	}
+
+	/* Record that we've processed this channel ring element. */
+	ring_rp_local_inc(&channel->ring);
+	channel->ring.rp = channel->ring.rp_local;
+
+	return evt->len;
+}
+
+static void
+gsi_evt_ring_doorbell(struct gsi *gsi, struct gsi_evt_ring *evt_ring)
+{
+	u32 evt_ring_id = gsi_evt_ring_id(gsi, evt_ring);
+	u32 val;
+
+	/* The doorbell 0 and 1 registers store the low-order and
+	 * high-order 32 bits of the event ring doorbell register,
+	 * respectively.  LSB (doorbell 0) must be written last.
+	 */
+	val = evt_ring->ring.wp_local >> 32;
+	gsi_writel(gsi, val, GSI_EV_CH_E_DOORBELL_1_OFFS(evt_ring_id));
+
+	val = evt_ring->ring.wp_local & GENMASK(31, 0);
+	gsi_writel(gsi, val, GSI_EV_CH_E_DOORBELL_0_OFFS(evt_ring_id));
+}
+
+static void gsi_channel_doorbell(struct gsi *gsi, struct gsi_channel *channel)
+{
+	u32 channel_id = gsi_channel_id(gsi, channel);
+	u32 val;
+
+	/* Allocate new events for this channel first, before
+	 * submitting the new TREs.
+	 * For TO_GSI channels the event ring doorbell is rung as part
+	 * of interrupt handling.
+	 */
+	if (channel->from_ipa)
+		gsi_evt_ring_doorbell(gsi, channel->evt_ring);
+	channel->ring.wp = channel->ring.wp_local;
+
+	/* The doorbell 0 and 1 registers store the low-order and
+	 * high-order 32 bits of the channel ring doorbell register,
+	 * respectively.  LSB (doorbell 0) must be written last.
+	 */
+	val = channel->ring.wp_local >> 32;
+	gsi_writel(gsi, val, GSI_CH_C_DOORBELL_1_OFFS(channel_id));
+	val = channel->ring.wp_local & GENMASK(31, 0);
+	gsi_writel(gsi, val, GSI_CH_C_DOORBELL_0_OFFS(channel_id));
+}
+
+static void gsi_event_handle(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	unsigned long flags;
+	bool check_again;
+
+	spin_lock_irqsave(&evt_ring->ring.slock, flags);
+
+	do {
+		u32 val = gsi_readl(gsi, GSI_EV_CH_E_CNTXT_4_OFFS(evt_ring_id));
+
+		evt_ring->ring.rp = evt_ring->ring.rp & GENMASK_ULL(63, 32);
+		evt_ring->ring.rp |= val;
+
+		check_again = false;
+		while (evt_ring->ring.rp_local != evt_ring->ring.rp) {
+			struct gsi_xfer_compl_evt *evt;
+
+			if (atomic_read(&evt_ring->channel->poll_mode)) {
+				check_again = false;
+				break;
+			}
+			check_again = true;
+
+			evt = ipa_dma_phys_to_virt(&evt_ring->ring.mem,
+						   evt_ring->ring.rp_local);
+			(void)gsi_channel_process(gsi, evt, true);
+
+			ring_rp_local_inc(&evt_ring->ring);
+			ring_wp_local_inc(&evt_ring->ring); /* recycle */
+		}
+
+		gsi_evt_ring_doorbell(gsi, evt_ring);
+	} while (check_again);
+
+	spin_unlock_irqrestore(&evt_ring->ring.slock, flags);
+}
+
+static void gsi_isr_ioeb(struct gsi *gsi)
+{
+	u32 evt_mask;
+
+	evt_mask = gsi_readl(gsi, GSI_CNTXT_SRC_IEOB_IRQ_OFFS);
+	evt_mask &= gsi_readl(gsi, GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFS);
+	gsi_writel(gsi, evt_mask, GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFS);
+
+	ipa_assert(!(evt_mask & ~GENMASK(gsi->evt_ring_max - 1, 0)));
+
+	while (evt_mask) {
+		u32 i = (u32)__ffs(evt_mask);
+
+		gsi_event_handle(gsi, i);
+
+		evt_mask ^= BIT(i);
+	}
+}
+
+static void gsi_isr_inter_ee_chan_ctrl(struct gsi *gsi)
+{
+	u32 channel_mask;
+
+	channel_mask = gsi_readl(gsi, GSI_INTER_EE_SRC_CH_IRQ_OFFS);
+	gsi_writel(gsi, channel_mask, GSI_INTER_EE_SRC_CH_IRQ_CLR_OFFS);
+
+	ipa_assert(!(channel_mask & ~GENMASK(gsi->channel_max - 1, 0)));
+
+	while (channel_mask) {
+		int i = __ffs(channel_mask);
+
+		/* not currently expected */
+		ipa_err("ch %d was inter-EE changed\n", i);
+		channel_mask ^= BIT(i);
+	}
+}
+
+static void gsi_isr_inter_ee_evt_ctrl(struct gsi *gsi)
+{
+	u32 evt_mask;
+
+	evt_mask = gsi_readl(gsi, GSI_INTER_EE_SRC_EV_CH_IRQ_OFFS);
+	gsi_writel(gsi, evt_mask, GSI_INTER_EE_SRC_EV_CH_IRQ_CLR_OFFS);
+
+	ipa_assert(!(evt_mask & ~GENMASK(gsi->evt_ring_max - 1, 0)));
+
+	while (evt_mask) {
+		u32 i = (u32)__ffs(evt_mask);
+
+		/* not currently expected */
+		ipa_err("evt %d was inter-EE changed\n", i);
+		evt_mask ^= BIT(i);
+	}
+}
+
+static void gsi_isr_general(struct gsi *gsi)
+{
+	u32 val;
+
+	val = gsi_readl(gsi, GSI_CNTXT_GSI_IRQ_STTS_OFFS);
+
+	ipa_bug_on(val & CLR_MCS_STACK_OVRFLOW_FMASK);
+	ipa_bug_on(val & CLR_CMD_FIFO_OVRFLOW_FMASK);
+	ipa_bug_on(val & CLR_BUS_ERROR_FMASK);
+
+	if (val & CLR_BREAK_POINT_FMASK)
+		ipa_err("got breakpoint\n");
+
+	gsi_writel(gsi, val, GSI_CNTXT_GSI_IRQ_CLR_OFFS);
+}
+
+/* Returns a bitmask of pending GSI interrupts */
+static u32 gsi_isr_type(struct gsi *gsi)
+{
+	return gsi_readl(gsi, GSI_CNTXT_TYPE_IRQ_OFFS);
+}
+
+static irqreturn_t gsi_isr(int irq, void *dev_id)
+{
+	struct gsi *gsi = dev_id;
+	u32 type;
+	u32 cnt;
+
+	cnt = 0;
+	while ((type = gsi_isr_type(gsi))) {
+		do {
+			u32 single = BIT(__ffs(type));
+
+			switch (single) {
+			case CH_CTRL_FMASK:
+				gsi_isr_chan_ctrl(gsi);
+				break;
+			case EV_CTRL_FMASK:
+				gsi_isr_evt_ctrl(gsi);
+				break;
+			case GLOB_EE_FMASK:
+				gsi_isr_glob_ee(gsi);
+				break;
+			case IEOB_FMASK:
+				gsi_isr_ioeb(gsi);
+				break;
+			case INTER_EE_CH_CTRL_FMASK:
+				gsi_isr_inter_ee_chan_ctrl(gsi);
+				break;
+			case INTER_EE_EV_CTRL_FMASK:
+				gsi_isr_inter_ee_evt_ctrl(gsi);
+				break;
+			case GENERAL_FMASK:
+				gsi_isr_general(gsi);
+				break;
+			default:
+				WARN(true, "%s: unrecognized type 0x%08x\n",
+				     __func__, single);
+				break;
+			}
+			type ^= single;
+		} while (type);
+
+		ipa_bug_on(++cnt > GSI_ISR_MAX_ITER);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static u32 gsi_channel_max(struct gsi *gsi)
+{
+	u32 val = gsi_readl(gsi, GSI_GSI_HW_PARAM_2_OFFS);
+
+	return FIELD_GET(NUM_CH_PER_EE_FMASK, val);
+}
+
+static u32 gsi_evt_ring_max(struct gsi *gsi)
+{
+	u32 val = gsi_readl(gsi, GSI_GSI_HW_PARAM_2_OFFS);
+
+	return FIELD_GET(NUM_EV_PER_EE_FMASK, val);
+}
+
+/* Zero bits in an event bitmap represent event numbers available
+ * for allocation.  Initialize the map so all events supported by
+ * the hardware are available; then preclude any reserved events
+ * from allocation.
+ */
+static unsigned long gsi_evt_bmap_init(u32 evt_ring_max)
+{
+	unsigned long evt_bmap = GENMASK(BITS_PER_LONG - 1, evt_ring_max);
+
+	return evt_bmap | GENMASK(GSI_MHI_ER_END, GSI_MHI_ER_START);
+}
+
+int gsi_device_init(struct gsi *gsi)
+{
+	u32 evt_ring_max;
+	u32 channel_max;
+	u32 val;
+	int ret;
+
+	val = gsi_readl(gsi, GSI_GSI_STATUS_OFFS);
+	if (!(val & ENABLED_FMASK)) {
+		ipa_err("manager EE has not enabled GSI, GSI un-usable\n");
+		return -EIO;
+	}
+
+	channel_max = gsi_channel_max(gsi);
+	ipa_debug("channel_max %u\n", channel_max);
+	ipa_assert(channel_max <= GSI_CHAN_MAX);
+
+	evt_ring_max = gsi_evt_ring_max(gsi);
+	ipa_debug("evt_ring_max %u\n", evt_ring_max);
+	ipa_assert(evt_ring_max <= GSI_EVT_RING_MAX);
+
+	ret = request_irq(gsi->irq, gsi_isr, IRQF_TRIGGER_HIGH, "gsi", gsi);
+	if (ret) {
+		ipa_err("failed to register isr for %u\n", gsi->irq);
+		return -EIO;
+	}
+
+	ret = enable_irq_wake(gsi->irq);
+	if (ret)
+		ipa_err("error %d enabling gsi wake irq\n", ret);
+	gsi->irq_wake_enabled = !ret;
+	gsi->channel_max = channel_max;
+	gsi->evt_ring_max = evt_ring_max;
+	gsi->evt_bmap = gsi_evt_bmap_init(evt_ring_max);
+
+	/* Enable all IPA interrupts */
+	gsi_irq_enable_all(gsi);
+
+	/* Writing 1 indicates IRQ interrupts; 0 would be MSI */
+	gsi_writel(gsi, 1, GSI_CNTXT_INTSET_OFFS);
+
+	/* Initialize the error log */
+	gsi_writel(gsi, 0, GSI_ERROR_LOG_OFFS);
+
+	return 0;
+}
+
+void gsi_device_exit(struct gsi *gsi)
+{
+	ipa_assert(!atomic_read(&gsi->channel_count));
+	ipa_assert(!atomic_read(&gsi->evt_ring_count));
+
+	/* Don't bother clearing the error log again (ERROR_LOG) or
+	 * setting the interrupt type again (INTSET).
+	 */
+	gsi_irq_disable_all(gsi);
+
+	/* Clean up everything else set up by gsi_device_init() */
+	gsi->evt_bmap = 0;
+	gsi->evt_ring_max = 0;
+	gsi->channel_max = 0;
+	if (gsi->irq_wake_enabled) {
+		(void)disable_irq_wake(gsi->irq);
+		gsi->irq_wake_enabled = false;
+	}
+	free_irq(gsi->irq, gsi);
+	gsi->irq = 0;
+}
+
+static void gsi_evt_ring_program(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	u32 int_modt;
+	u32 int_modc;
+	u64 phys;
+	u32 val;
+
+	phys = evt_ring->ring.mem.phys;
+	int_modt = evt_ring->moderation ? IPA_GSI_EVT_RING_INT_MODT : 0;
+	int_modc = 1;	/* moderation always comes from channel */
+
+	val = FIELD_PREP(EV_CHTYPE_FMASK, GSI_EVT_CHTYPE_GPI_EV);
+	val |= FIELD_PREP(EV_INTYPE_FMASK, 1);
+	val |= FIELD_PREP(EV_ELEMENT_SIZE_FMASK, GSI_RING_ELEMENT_SIZE);
+	gsi_writel(gsi, val, GSI_EV_CH_E_CNTXT_0_OFFS(evt_ring_id));
+
+	val = FIELD_PREP(EV_R_LENGTH_FMASK, (u32)evt_ring->ring.mem.size);
+	gsi_writel(gsi, val, GSI_EV_CH_E_CNTXT_1_OFFS(evt_ring_id));
+
+	/* The context 2 and 3 registers store the low-order and
+	 * high-order 32 bits of the address of the event ring,
+	 * respectively.
+	 */
+	val = phys & GENMASK(31, 0);
+	gsi_writel(gsi, val, GSI_EV_CH_E_CNTXT_2_OFFS(evt_ring_id));
+
+	val = phys >> 32;
+	gsi_writel(gsi, val, GSI_EV_CH_E_CNTXT_3_OFFS(evt_ring_id));
+
+	val = FIELD_PREP(MODT_FMASK, int_modt);
+	val |= FIELD_PREP(MODC_FMASK, int_modc);
+	gsi_writel(gsi, val, GSI_EV_CH_E_CNTXT_8_OFFS(evt_ring_id));
+
+	/* No MSI write data, and both halves of the MSI address are 0 */
+	gsi_writel(gsi, 0, GSI_EV_CH_E_CNTXT_9_OFFS(evt_ring_id));
+	gsi_writel(gsi, 0, GSI_EV_CH_E_CNTXT_10_OFFS(evt_ring_id));
+	gsi_writel(gsi, 0, GSI_EV_CH_E_CNTXT_11_OFFS(evt_ring_id));
+
+	/* We don't need to get event read pointer updates */
+	gsi_writel(gsi, 0, GSI_EV_CH_E_CNTXT_12_OFFS(evt_ring_id));
+	gsi_writel(gsi, 0, GSI_EV_CH_E_CNTXT_13_OFFS(evt_ring_id));
+}
+
+static void gsi_ring_init(struct gsi_ring *ring)
+{
+	ring->wp_local = ring->wp = ring->mem.phys;
+	ring->rp_local = ring->rp = ring->mem.phys;
+}
+
+static int gsi_ring_alloc(struct gsi_ring *ring, u32 count)
+{
+	size_t size = roundup_pow_of_two(count * GSI_RING_ELEMENT_SIZE);
+
+	/* Hardware requires a power-of-2 ring size (and alignment) */
+	if (ipa_dma_alloc(&ring->mem, size, GFP_KERNEL))
+		return -ENOMEM;
+	ipa_assert(!(ring->mem.phys % size));
+
+	ring->end = ring->mem.phys + size;
+	spin_lock_init(&ring->slock);
+
+	return 0;
+}
+
+static void gsi_ring_free(struct gsi_ring *ring)
+{
+	ipa_dma_free(&ring->mem);
+	memset(ring, 0, sizeof(*ring));
+}
+
+static void gsi_evt_ring_prime(struct gsi *gsi, struct gsi_evt_ring *evt_ring)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&evt_ring->ring.slock, flags);
+	memset(evt_ring->ring.mem.virt, 0, evt_ring->ring.mem.size);
+	evt_ring->ring.wp_local = evt_ring->ring.end - GSI_RING_ELEMENT_SIZE;
+	gsi_evt_ring_doorbell(gsi, evt_ring);
+	spin_unlock_irqrestore(&evt_ring->ring.slock, flags);
+}
+
+/* Issue a GSI command by writing a value to a register, then wait
+ * for completion to be signaled.  Returns true if successful or
+ * false if a timeout occurred.  Note that the register offset is
+ * first, value to write is second (reverse of writel() order).
+ */
+static bool command(struct gsi *gsi, u32 reg, u32 val, struct completion *compl)
+{
+	bool ret;
+
+	gsi_writel(gsi, val, reg);
+	ret = !!wait_for_completion_timeout(compl, GSI_CMD_TIMEOUT);
+	if (!ret)
+		ipa_err("command timeout\n");
+
+	return ret;
+}
+
+/* Issue an event ring command and wait for it to complete */
+static bool evt_ring_command(struct gsi *gsi, u32 evt_ring_id,
+			     enum gsi_evt_ch_cmd_opcode op)
+{
+	struct completion *compl = &gsi->evt_ring[evt_ring_id].compl;
+	u32 val;
+
+	reinit_completion(compl);
+
+	val = FIELD_PREP(EV_CHID_FMASK, evt_ring_id);
+	val |= FIELD_PREP(EV_OPCODE_FMASK, (u32)op);
+
+	return command(gsi, GSI_EV_CH_CMD_OFFS, val, compl);
+}
+
+/* Issue a channel command and wait for it to complete */
+static bool
+channel_command(struct gsi *gsi, u32 channel_id, enum gsi_ch_cmd_opcode op)
+{
+	struct completion *compl = &gsi->channel[channel_id].compl;
+	u32 val;
+
+	reinit_completion(compl);
+
+	val = FIELD_PREP(CH_CHID_FMASK, channel_id);
+	val |= FIELD_PREP(CH_OPCODE_FMASK, (u32)op);
+
+	return command(gsi, GSI_CH_CMD_OFFS, val, compl);
+}
+
+/* Note: only GPI interfaces, IRQ interrupts are currently supported */
+static int gsi_evt_ring_alloc(struct gsi *gsi, u32 ring_count, bool moderation)
+{
+	struct gsi_evt_ring *evt_ring;
+	unsigned long flags;
+	u32 evt_ring_id;
+	u32 val;
+	int ret;
+
+	/* Get the mutex to allocate from the bitmap and issue a command */
+	mutex_lock(&gsi->mutex);
+
+	/* Start by allocating the event id to use */
+	ipa_assert(gsi->evt_bmap != ~0UL);
+	evt_ring_id = (u32)ffz(gsi->evt_bmap);
+	gsi->evt_bmap |= BIT(evt_ring_id);
+
+	evt_ring = &gsi->evt_ring[evt_ring_id];
+
+	ret = gsi_ring_alloc(&evt_ring->ring, ring_count);
+	if (ret)
+		goto err_free_bmap;
+
+	init_completion(&evt_ring->compl);
+
+	if (!evt_ring_command(gsi, evt_ring_id, GSI_EVT_ALLOCATE)) {
+		ret = -ETIMEDOUT;
+		goto err_free_ring;
+	}
+
+	if (evt_ring->state != GSI_EVT_RING_STATE_ALLOCATED) {
+		ipa_err("evt_ring_id %u allocation failed state %u\n",
+			evt_ring_id, evt_ring->state);
+		ret = -ENOMEM;
+		goto err_free_ring;
+	}
+	atomic_inc(&gsi->evt_ring_count);
+
+	evt_ring->moderation = moderation;
+
+	gsi_evt_ring_program(gsi, evt_ring_id);
+	gsi_ring_init(&evt_ring->ring);
+	gsi_evt_ring_prime(gsi, evt_ring);
+
+	mutex_unlock(&gsi->mutex);
+
+	spin_lock_irqsave(&gsi->slock, flags);
+
+	/* Enable the event interrupt (clear it first in case pending) */
+	val = BIT(evt_ring_id);
+	gsi_writel(gsi, val, GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFS);
+	gsi_irq_enable_event(gsi, evt_ring_id);
+
+	spin_unlock_irqrestore(&gsi->slock, flags);
+
+	return evt_ring_id;
+
+err_free_ring:
+	gsi_ring_free(&evt_ring->ring);
+	memset(evt_ring, 0, sizeof(*evt_ring));
+err_free_bmap:
+	ipa_assert(gsi->evt_bmap & BIT(evt_ring_id));
+	gsi->evt_bmap &= ~BIT(evt_ring_id);
+
+	mutex_unlock(&gsi->mutex);
+
+	return ret;
+}
+
+static void gsi_evt_ring_scratch_zero(struct gsi *gsi, u32 evt_ring_id)
+{
+	gsi_writel(gsi, 0, GSI_EV_CH_E_SCRATCH_0_OFFS(evt_ring_id));
+	gsi_writel(gsi, 0, GSI_EV_CH_E_SCRATCH_1_OFFS(evt_ring_id));
+}
+
+static void gsi_evt_ring_dealloc(struct gsi *gsi, u32 evt_ring_id)
+{
+	struct gsi_evt_ring *evt_ring = &gsi->evt_ring[evt_ring_id];
+	bool completed;
+
+	ipa_bug_on(evt_ring->state != GSI_EVT_RING_STATE_ALLOCATED);
+
+	mutex_lock(&gsi->mutex);
+
+	completed = evt_ring_command(gsi, evt_ring_id, GSI_EVT_RESET);
+	ipa_bug_on(!completed);
+	ipa_bug_on(evt_ring->state != GSI_EVT_RING_STATE_ALLOCATED);
+
+	gsi_evt_ring_program(gsi, evt_ring_id);
+	gsi_ring_init(&evt_ring->ring);
+	gsi_evt_ring_scratch_zero(gsi, evt_ring_id);
+	gsi_evt_ring_prime(gsi, evt_ring);
+
+	completed = evt_ring_command(gsi, evt_ring_id, GSI_EVT_DE_ALLOC);
+	ipa_bug_on(!completed);
+
+	ipa_bug_on(evt_ring->state != GSI_EVT_RING_STATE_NOT_ALLOCATED);
+
+	ipa_assert(gsi->evt_bmap & BIT(evt_ring_id));
+	gsi->evt_bmap &= ~BIT(evt_ring_id);
+
+	mutex_unlock(&gsi->mutex);
+
+	evt_ring->moderation = false;
+	gsi_ring_free(&evt_ring->ring);
+	memset(evt_ring, 0, sizeof(*evt_ring));
+
+	atomic_dec(&gsi->evt_ring_count);
+}
+
+static void gsi_channel_program(struct gsi *gsi, u32 channel_id,
+				u32 evt_ring_id, bool doorbell_enable)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	u32 low_weight;
+	u32 val;
+
+	val = FIELD_PREP(CHTYPE_PROTOCOL_FMASK, GSI_CHANNEL_PROTOCOL_GPI);
+	val |= FIELD_PREP(CHTYPE_DIR_FMASK, channel->from_ipa ? 0 : 1);
+	val |= FIELD_PREP(ERINDEX_FMASK, evt_ring_id);
+	val |= FIELD_PREP(ELEMENT_SIZE_FMASK, GSI_RING_ELEMENT_SIZE);
+	gsi_writel(gsi, val, GSI_CH_C_CNTXT_0_OFFS(channel_id));
+
+	val = FIELD_PREP(R_LENGTH_FMASK, channel->ring.mem.size);
+	gsi_writel(gsi, val, GSI_CH_C_CNTXT_1_OFFS(channel_id));
+
+	/* The context 2 and 3 registers store the low-order and
+	 * high-order 32 bits of the address of the channel ring,
+	 * respectively.
+	 */
+	val = channel->ring.mem.phys & GENMASK(31, 0);
+	gsi_writel(gsi, val, GSI_CH_C_CNTXT_2_OFFS(channel_id));
+
+	val = channel->ring.mem.phys >> 32;
+	gsi_writel(gsi, val, GSI_CH_C_CNTXT_3_OFFS(channel_id));
+
+	low_weight = channel->priority ? FIELD_MAX(WRR_WEIGHT_FMASK) : 0;
+	val = FIELD_PREP(WRR_WEIGHT_FMASK, low_weight);
+	val |= FIELD_PREP(MAX_PREFETCH_FMASK, GSI_MAX_PREFETCH);
+	val |= FIELD_PREP(USE_DB_ENG_FMASK, doorbell_enable ? 1 : 0);
+	gsi_writel(gsi, val, GSI_CH_C_QOS_OFFS(channel_id));
+}
+
+int gsi_channel_alloc(struct gsi *gsi, u32 channel_id, u32 channel_count,
+		      bool from_ipa, bool priority, u32 evt_ring_mult,
+		      bool moderation, void *notify_data)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	u32 evt_ring_count;
+	u32 evt_ring_id;
+	void **user_data;
+	int ret;
+
+	evt_ring_count = channel_count * evt_ring_mult;
+	ret = gsi_evt_ring_alloc(gsi, evt_ring_count, moderation);
+	if (ret < 0)
+		return ret;
+	evt_ring_id = (u32)ret;
+
+	ret = gsi_ring_alloc(&channel->ring, channel_count);
+	if (ret)
+		goto err_evt_ring_free;
+
+	user_data = kcalloc(channel_count, sizeof(void *), GFP_KERNEL);
+	if (!user_data) {
+		ret = -ENOMEM;
+		goto err_ring_free;
+	}
+
+	mutex_init(&channel->mutex);
+	init_completion(&channel->compl);
+	atomic_set(&channel->poll_mode, 0);	/* Initially in callback mode */
+	channel->from_ipa = from_ipa;
+	channel->notify_data = notify_data;
+
+	mutex_lock(&gsi->mutex);
+
+	if (!channel_command(gsi, channel_id, GSI_CH_ALLOCATE)) {
+		ret = -ETIMEDOUT;
+		goto err_mutex_unlock;
+	}
+	if (channel->state != GSI_CHANNEL_STATE_ALLOCATED) {
+		ret = -EIO;
+		goto err_mutex_unlock;
+	}
+
+	gsi->ch_dbg[channel_id].ch_allocate++;
+
+	mutex_unlock(&gsi->mutex);
+
+	channel->evt_ring = &gsi->evt_ring[evt_ring_id];
+	channel->evt_ring->channel = channel;
+	channel->priority = priority;
+
+	gsi_channel_program(gsi, channel_id, evt_ring_id, true);
+	gsi_ring_init(&channel->ring);
+
+	channel->user_data = user_data;
+	atomic_inc(&gsi->channel_count);
+
+	return 0;
+
+err_mutex_unlock:
+	mutex_unlock(&gsi->mutex);
+	kfree(user_data);
+err_ring_free:
+	gsi_ring_free(&channel->ring);
+err_evt_ring_free:
+	gsi_evt_ring_dealloc(gsi, evt_ring_id);
+
+	return ret;
+}
+
+static void __gsi_channel_scratch_write(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	struct gsi_gpi_channel_scratch *gpi;
+	union gsi_channel_scratch scr = { };
+	u32 val;
+
+	gpi = &scr.gpi;
+	/* See comments above definition of gsi_gpi_channel_scratch */
+	gpi->max_outstanding_tre = channel->tlv_count * GSI_RING_ELEMENT_SIZE;
+	gpi->outstanding_threshold = 2 * GSI_RING_ELEMENT_SIZE;
+
+	val = scr.data.word1;
+	gsi_writel(gsi, val, GSI_CH_C_SCRATCH_0_OFFS(channel_id));
+
+	val = scr.data.word2;
+	gsi_writel(gsi, val, GSI_CH_C_SCRATCH_1_OFFS(channel_id));
+
+	val = scr.data.word3;
+	gsi_writel(gsi, val, GSI_CH_C_SCRATCH_2_OFFS(channel_id));
+
+	/* We must preserve the lower 16 bits of the last scratch
+	 * register.  The next sequence assumes those bits remain
+	 * unchanged between the read and the write.
+	 */
+	val = gsi_readl(gsi, GSI_CH_C_SCRATCH_3_OFFS(channel_id));
+	val = (scr.data.word4 & GENMASK(31, 16)) | (val & GENMASK(15, 0));
+	gsi_writel(gsi, val, GSI_CH_C_SCRATCH_3_OFFS(channel_id));
+}
+
+void gsi_channel_scratch_write(struct gsi *gsi, u32 channel_id, u32 tlv_count)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	channel->tlv_count = tlv_count;
+
+	mutex_lock(&channel->mutex);
+
+	__gsi_channel_scratch_write(gsi, channel_id);
+
+	mutex_unlock(&channel->mutex);
+}
+
+int gsi_channel_start(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+
+	if (channel->state != GSI_CHANNEL_STATE_ALLOCATED &&
+	    channel->state != GSI_CHANNEL_STATE_STOP_IN_PROC &&
+	    channel->state != GSI_CHANNEL_STATE_STOPPED) {
+		ipa_err("bad state %d\n", channel->state);
+		return -ENOTSUPP;
+	}
+
+	mutex_lock(&gsi->mutex);
+
+	gsi->ch_dbg[channel_id].ch_start++;
+
+	if (!channel_command(gsi, channel_id, GSI_CH_START)) {
+		mutex_unlock(&gsi->mutex);
+		return -ETIMEDOUT;
+	}
+	if (channel->state != GSI_CHANNEL_STATE_STARTED) {
+		ipa_err("channel %u unexpected state %u\n", channel_id,
+			channel->state);
+		ipa_bug();
+	}
+
+	mutex_unlock(&gsi->mutex);
+
+	return 0;
+}
+
+int gsi_channel_stop(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	int ret;
+
+	if (channel->state == GSI_CHANNEL_STATE_STOPPED)
+		return 0;
+
+	if (channel->state != GSI_CHANNEL_STATE_STARTED &&
+	    channel->state != GSI_CHANNEL_STATE_STOP_IN_PROC &&
+	    channel->state != GSI_CHANNEL_STATE_ERROR) {
+		ipa_err("bad state %d\n", channel->state);
+		return -ENOTSUPP;
+	}
+
+	mutex_lock(&gsi->mutex);
+
+	gsi->ch_dbg[channel_id].ch_stop++;
+
+	if (!channel_command(gsi, channel_id, GSI_CH_STOP)) {
+		/* Check the channel state here in case the channel has
+		 * stopped but the interrupt has not been handled yet.
+		 */
+		channel->state = gsi_channel_state(gsi, channel_id);
+		if (channel->state == GSI_CHANNEL_STATE_STOPPED) {
+			ret = 0;
+			goto free_lock;
+		}
+		ret = -ETIMEDOUT;
+		goto free_lock;
+	}
+
+	if (channel->state != GSI_CHANNEL_STATE_STOPPED &&
+	    channel->state != GSI_CHANNEL_STATE_STOP_IN_PROC) {
+		ipa_err("channel %u unexpected state %u\n", channel_id,
+			channel->state);
+		ret = -EBUSY;
+		goto free_lock;
+	}
+
+	if (channel->state == GSI_CHANNEL_STATE_STOP_IN_PROC) {
+		ipa_err("channel %u busy try again\n", channel_id);
+		ret = -EAGAIN;
+		goto free_lock;
+	}
+
+	ret = 0;
+
+free_lock:
+	mutex_unlock(&gsi->mutex);
+
+	return ret;
+}
+
+int gsi_channel_reset(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	u32 evt_ring_id;
+	bool reset_done;
+
+	if (channel->state != GSI_CHANNEL_STATE_STOPPED) {
+		ipa_err("bad state %d\n", channel->state);
+		return -ENOTSUPP;
+	}
+
+	evt_ring_id = gsi_evt_ring_id(gsi, channel->evt_ring);
+	reset_done = false;
+	mutex_lock(&gsi->mutex);
+reset:
+
+	gsi->ch_dbg[channel_id].ch_reset++;
+
+	if (!channel_command(gsi, channel_id, GSI_CH_RESET)) {
+		mutex_unlock(&gsi->mutex);
+		return -ETIMEDOUT;
+	}
+
+	if (channel->state != GSI_CHANNEL_STATE_ALLOCATED) {
+		ipa_err("channel_id %u unexpected state %u\n", channel_id,
+			channel->state);
+		ipa_bug();
+	}
+
+	/* workaround: reset GSI producers again */
+	if (channel->from_ipa && !reset_done) {
+		usleep_range(GSI_RESET_WA_MIN_SLEEP, GSI_RESET_WA_MAX_SLEEP);
+		reset_done = true;
+		goto reset;
+	}
+
+	gsi_channel_program(gsi, channel_id, evt_ring_id, true);
+	gsi_ring_init(&channel->ring);
+
+	/* restore scratch */
+	__gsi_channel_scratch_write(gsi, channel_id);
+
+	mutex_unlock(&gsi->mutex);
+
+	return 0;
+}
+
+void gsi_channel_free(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	u32 evt_ring_id;
+	bool completed;
+
+	ipa_bug_on(channel->state != GSI_CHANNEL_STATE_ALLOCATED);
+
+	evt_ring_id = gsi_evt_ring_id(gsi, channel->evt_ring);
+	mutex_lock(&gsi->mutex);
+
+	gsi->ch_dbg[channel_id].ch_de_alloc++;
+
+	completed = channel_command(gsi, channel_id, GSI_CH_DE_ALLOC);
+	ipa_bug_on(!completed);
+
+	ipa_bug_on(channel->state != GSI_CHANNEL_STATE_NOT_ALLOCATED);
+
+	mutex_unlock(&gsi->mutex);
+
+	kfree(channel->user_data);
+	gsi_ring_free(&channel->ring);
+
+	gsi_evt_ring_dealloc(gsi, evt_ring_id);
+
+	memset(channel, 0, sizeof(*channel));
+
+	atomic_dec(&gsi->channel_count);
+}
+
+static u16 __gsi_query_ring_free_re(struct gsi_ring *ring)
+{
+	u64 delta;
+
+	if (ring->wp_local < ring->rp_local)
+		delta = ring->rp_local - ring->wp_local;
+	else
+		delta = ring->end - ring->wp_local + ring->rp_local;
+
+	return (u16)(delta / GSI_RING_ELEMENT_SIZE - 1);
+}
+
+int gsi_channel_queue(struct gsi *gsi, u32 channel_id, u16 num_xfers,
+		      struct gsi_xfer_elem *xfer, bool ring_db)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	unsigned long flags;
+	u32 i;
+
+	spin_lock_irqsave(&channel->evt_ring->ring.slock, flags);
+
+	if (num_xfers > __gsi_query_ring_free_re(&channel->ring)) {
+		spin_unlock_irqrestore(&channel->evt_ring->ring.slock, flags);
+		ipa_err("no space for %u-element transfer on ch %u\n",
+			num_xfers, channel_id);
+
+		return -ENOSPC;
+	}
+
+	for (i = 0; i < num_xfers; i++) {
+		struct gsi_tre *tre_ptr;
+		u16 idx = ring_wp_local_index(&channel->ring);
+
+		channel->user_data[idx] = xfer[i].user_data;
+
+		tre_ptr = ipa_dma_phys_to_virt(&channel->ring.mem,
+						  channel->ring.wp_local);
+
+		tre_ptr->buffer_ptr = xfer[i].addr;
+		tre_ptr->buf_len = xfer[i].len_opcode;
+		tre_ptr->bei = xfer[i].flags & GSI_XFER_FLAG_BEI ? 1 : 0;
+		tre_ptr->ieot = xfer[i].flags & GSI_XFER_FLAG_EOT ? 1 : 0;
+		tre_ptr->ieob = xfer[i].flags & GSI_XFER_FLAG_EOB ? 1 : 0;
+		tre_ptr->chain = xfer[i].flags & GSI_XFER_FLAG_CHAIN ? 1 : 0;
+
+		if (xfer[i].type == GSI_XFER_ELEM_DATA)
+			tre_ptr->re_type = GSI_RE_XFER;
+		else if (xfer[i].type == GSI_XFER_ELEM_IMME_CMD)
+			tre_ptr->re_type = GSI_RE_IMMD_CMD;
+		else if (xfer[i].type == GSI_XFER_ELEM_NOP)
+			tre_ptr->re_type = GSI_RE_NOP;
+		else
+			ipa_bug_on("invalid xfer type");
+
+		ring_wp_local_inc(&channel->ring);
+	}
+
+	wmb();	/* Ensure TRE is set before ringing doorbell */
+
+	if (ring_db)
+		gsi_channel_doorbell(gsi, channel);
+
+	spin_unlock_irqrestore(&channel->evt_ring->ring.slock, flags);
+
+	return 0;
+}
+
+int gsi_channel_poll(struct gsi *gsi, u32 channel_id)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	struct gsi_evt_ring *evt_ring;
+	unsigned long flags;
+	u32 evt_ring_id;
+	int size;
+
+	evt_ring = channel->evt_ring;
+	evt_ring_id = gsi_evt_ring_id(gsi, evt_ring);
+
+	spin_lock_irqsave(&evt_ring->ring.slock, flags);
+
+	/* Update rp to see if we have anything new to process */
+	if (evt_ring->ring.rp == evt_ring->ring.rp_local) {
+		u32 val;
+
+		val = gsi_readl(gsi, GSI_EV_CH_E_CNTXT_4_OFFS(evt_ring_id));
+		evt_ring->ring.rp = channel->ring.rp & GENMASK_ULL(63, 32);
+		evt_ring->ring.rp |= val;
+	}
+
+	if (evt_ring->ring.rp != evt_ring->ring.rp_local) {
+		struct gsi_xfer_compl_evt *evt;
+
+		evt = ipa_dma_phys_to_virt(&evt_ring->ring.mem,
+					   evt_ring->ring.rp_local);
+		size = gsi_channel_process(gsi, evt, false);
+
+		ring_rp_local_inc(&evt_ring->ring);
+		ring_wp_local_inc(&evt_ring->ring); /* recycle element */
+	} else {
+		size = -ENOENT;
+	}
+
+	spin_unlock_irqrestore(&evt_ring->ring.slock, flags);
+
+	return size;
+}
+
+static void gsi_channel_mode_set(struct gsi *gsi, u32 channel_id, bool polling)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	unsigned long flags;
+	u32 evt_ring_id;
+
+	evt_ring_id = gsi_evt_ring_id(gsi, channel->evt_ring);
+
+	spin_lock_irqsave(&gsi->slock, flags);
+
+	if (polling)
+		gsi_irq_disable_event(gsi, evt_ring_id);
+	else
+		gsi_irq_enable_event(gsi, evt_ring_id);
+	atomic_set(&channel->poll_mode, polling ? 1 : 0);
+
+	spin_unlock_irqrestore(&gsi->slock, flags);
+}
+
+void gsi_channel_intr_enable(struct gsi *gsi, u32 channel_id)
+{
+	gsi_channel_mode_set(gsi, channel_id, false);
+}
+
+void gsi_channel_intr_disable(struct gsi *gsi, u32 channel_id)
+{
+	gsi_channel_mode_set(gsi, channel_id, true);
+}
+
+void gsi_channel_config(struct gsi *gsi, u32 channel_id, bool doorbell_enable)
+{
+	struct gsi_channel *channel = &gsi->channel[channel_id];
+	u32 evt_ring_id;
+
+	evt_ring_id = gsi_evt_ring_id(gsi, channel->evt_ring);
+
+	mutex_lock(&channel->mutex);
+
+	gsi_channel_program(gsi, channel_id, evt_ring_id, doorbell_enable);
+	gsi_ring_init(&channel->ring);
+
+	/* restore scratch */
+	__gsi_channel_scratch_write(gsi, channel_id);
+	mutex_unlock(&channel->mutex);
+}
+
+/* Initialize GSI driver */
+struct gsi *gsi_init(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	resource_size_t size;
+	struct gsi *gsi;
+	int irq;
+
+	/* Get GSI memory range and map it */
+	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "gsi");
+	if (!res) {
+		ipa_err("missing \"gsi\" property in DTB\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	size = resource_size(res);
+	if (res->start > U32_MAX || size > U32_MAX) {
+		ipa_err("\"gsi\" values out of range\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	/* Get IPA GSI IRQ number */
+	irq = platform_get_irq_byname(pdev, "gsi");
+	if (irq < 0) {
+		ipa_err("failed to get gsi IRQ!\n");
+		return ERR_PTR(irq);
+	}
+
+	gsi = kzalloc(sizeof(*gsi), GFP_KERNEL);
+	if (!gsi)
+		return ERR_PTR(-ENOMEM);
+
+	gsi->base = devm_ioremap_nocache(dev, res->start, size);
+	if (!gsi->base) {
+		kfree(gsi);
+
+		return ERR_PTR(-ENOMEM);
+	}
+	gsi->dev = dev;
+	gsi->phys = (u32)res->start;
+	gsi->irq = irq;
+	spin_lock_init(&gsi->slock);
+	mutex_init(&gsi->mutex);
+	atomic_set(&gsi->channel_count, 0);
+	atomic_set(&gsi->evt_ring_count, 0);
+
+	return gsi;
+}
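
Editorial sketch (not part of the patch): __gsi_query_ring_free_re() above
computes how many ring elements remain free, keeping one slot unused so that
a full ring can be told apart from an empty one.  The stand-alone example
below illustrates that arithmetic.  It assumes the read and write pointers
are byte offsets from the ring base rather than the addresses kept in the
patch's struct gsi_ring; struct ring_sketch and ring_free_elements() are
hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RING_ELEMENT_SIZE	16u	/* bytes per element, as in the patch */

/* Hypothetical stand-in for struct gsi_ring.  All fields are byte
 * offsets from the start of the ring (an assumption of this sketch).
 */
struct ring_sketch {
	uint32_t size;		/* total ring size in bytes */
	uint32_t rp_local;	/* read offset */
	uint32_t wp_local;	/* write offset */
};

/* Return the number of free elements.  One element is always held
 * back, so wp_local == rp_local means "empty", never "full".
 */
static uint16_t ring_free_elements(const struct ring_sketch *ring)
{
	uint32_t delta;

	if (ring->wp_local < ring->rp_local)
		delta = ring->rp_local - ring->wp_local;
	else
		delta = ring->size - ring->wp_local + ring->rp_local;

	return (uint16_t)(delta / RING_ELEMENT_SIZE - 1);
}
```

With an 8-element (128-byte) ring, an empty ring reports 7 free elements,
and a ring whose write offset sits one element behind the read offset
reports 0.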
diff --git a/drivers/net/ipa/gsi.h b/drivers/net/ipa/gsi.h
new file mode 100644
index 000000000000..497f67cc6f80
--- /dev/null
+++ b/drivers/net/ipa/gsi.h
@@ -0,0 +1,195 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018 Linaro Ltd.
+ */
+#ifndef _GSI_H_
+#define _GSI_H_
+
+#include <linux/types.h>
+#include <linux/platform_device.h>
+
+#define GSI_RING_ELEMENT_SIZE	16	/* bytes (channel or event ring) */
+
+/**
+ * enum gsi_xfer_flag - Transfer element flag values.
+ * @GSI_XFER_FLAG_CHAIN:	Not the last element in a transaction.
+ * @GSI_XFER_FLAG_EOB:		Generate event interrupt when complete.
+ * @GSI_XFER_FLAG_EOT:		Interrupt on end of transfer condition.
+ * @GSI_XFER_FLAG_BEI:		Block (do not generate) event interrupt.
+ *
+ * Normally an event generated by completion of a transfer will cause
+ * the AP to be interrupted; the BEI flag prevents that.
+ */
+enum gsi_xfer_flag {
+	GSI_XFER_FLAG_CHAIN	= BIT(1),
+	GSI_XFER_FLAG_EOB	= BIT(2),
+	GSI_XFER_FLAG_EOT	= BIT(3),
+	GSI_XFER_FLAG_BEI	= BIT(4),
+};
+
+/**
+ * enum gsi_xfer_elem_type - Transfer element type.
+ * @GSI_XFER_ELEM_DATA:		Element represents a data transfer.
+ * @GSI_XFER_ELEM_IMME_CMD:	Element contains an immediate command.
+ * @GSI_XFER_ELEM_NOP:		Element contains a no-op command.
+ */
+enum gsi_xfer_elem_type {
+	GSI_XFER_ELEM_DATA,
+	GSI_XFER_ELEM_IMME_CMD,
+	GSI_XFER_ELEM_NOP,
+};
+
+/**
+ * struct gsi_xfer_elem - Description of a single transfer.
+ * @addr:	Physical address of a buffer for data or immediate commands.
+ * @len_opcode:	Length of the data buffer, or enum ipahal_imm_cmd opcode
+ * @flags:	Flags for the transfer
+ * @type:	Element type (data transfer, immediate command, or NOP)
+ * @user_data:	Data maintained for (but unused by) the transfer element.
+ */
+struct gsi_xfer_elem {
+	u64 addr;
+	u16 len_opcode;
+	enum gsi_xfer_flag flags;
+	enum gsi_xfer_elem_type type;
+	void *user_data;
+};
+
+struct gsi;
+
+/**
+ * gsi_init() - Initialize GSI subsystem
+ * @pdev:	IPA platform device, to look up resources
+ *
+ * This stage of initialization can occur before the GSI firmware
+ * has been loaded.
+ *
+ * Return:	GSI pointer to provide to other GSI functions.
+ */
+struct gsi *gsi_init(struct platform_device *pdev);
+
+/**
+ * gsi_device_init() - Initialize a GSI device
+ * @gsi:	GSI pointer returned by gsi_init()
+ *
+ * Initialize a GSI device.
+ *
+ * Return:	0 if successful or a negative error code otherwise.
+ */
+int gsi_device_init(struct gsi *gsi);
+
+/**
+ * gsi_device_exit() - De-initialize a GSI device
+ * @gsi:	GSI pointer returned by gsi_init()
+ *
+ * This is the inverse of gsi_device_init()
+ */
+void gsi_device_exit(struct gsi *gsi);
+
+/**
+ * gsi_channel_alloc() - Allocate a GSI channel
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel to allocate
+ * @channel_count: Number of transfer element slots in the channel
+ * @from_ipa:	Direction of data transfer (true: IPA->AP; false: AP->IPA)
+ * @priority:	Whether this channel will be given priority
+ * @evt_ring_mult: Factor to use to get the number of elements in the
+ *		event ring associated with this channel
+ * @moderation:	Whether interrupt moderation should be enabled
+ * @notify_data: Pointer value to supply with notifications that
+ *		occur because of events on this channel
+ *
+ * Return:	0 if successful, or a negative error code.
+ */
+int gsi_channel_alloc(struct gsi *gsi, u32 channel_id, u32 channel_count,
+		      bool from_ipa, bool priority, u32 evt_ring_mult,
+		      bool moderation, void *notify_data);
+
+/**
+ * gsi_channel_scratch_write() - Write channel scratch area
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel whose scratch area should be written
+ * @tlv_count:	Number of type-length-value (TLV) entries the channel uses
+ */
+void gsi_channel_scratch_write(struct gsi *gsi, u32 channel_id, u32 tlv_count);
+
+/**
+ * gsi_channel_start() - Make a channel operational
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel to start
+ *
+ * Return:	0 if successful, or a negative error code.
+ */
+int gsi_channel_start(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_stop() - Stop an operational channel
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel to stop
+ *
+ * Return:	0 if successful, or a negative error code.
+ */
+int gsi_channel_stop(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_reset() - Reset a channel, to recover from error state
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel to be reset
+ *
+ * Return:	0 if successful, or a negative error code.
+ */
+int gsi_channel_reset(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_free() - Release a previously-allocated channel
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel to be freed
+ */
+void gsi_channel_free(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_config() - Configure a channel
+ * @gsi:		GSI pointer returned by gsi_init()
+ * @channel_id:		Channel to be configured
+ * @doorbell_enable:	Whether to enable hardware doorbell engine
+ */
+void gsi_channel_config(struct gsi *gsi, u32 channel_id, bool doorbell_enable);
+
+/**
+ * gsi_channel_poll() - Poll for a single completion on a channel
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel to be polled
+ *
+ * Return:	Byte transfer count if successful, or a negative error code
+ */
+int gsi_channel_poll(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_intr_enable() - Enable interrupts on a channel
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel whose interrupts should be enabled
+ */
+void gsi_channel_intr_enable(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_intr_disable() - Disable interrupts on a channel
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel whose interrupts should be disabled
+ */
+void gsi_channel_intr_disable(struct gsi *gsi, u32 channel_id);
+
+/**
+ * gsi_channel_queue() - Queue transfer requests on a channel
+ * @gsi:	GSI pointer returned by gsi_init()
+ * @channel_id:	Channel on which transfers should be queued
+ * @num_xfers:	Number of transfer descriptors in the @xfer array
+ * @xfer:	Array of transfer descriptors
+ * @ring_db:	Whether to tell the hardware about these queued transfers
+ *
+ * Return:	0 if successful, or a negative error code
+ */
+int gsi_channel_queue(struct gsi *gsi, u32 channel_id, u16 num_xfers,
+		      struct gsi_xfer_elem *xfer, bool ring_db);
+
+#endif /* _GSI_H_ */
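
Editorial sketch (not part of the patch): the gsi_xfer_flag values declared
in gsi.h are translated by gsi_channel_queue() into single-bit fields of the
hardware transfer ring element.  The stand-alone example below mirrors that
translation outside the kernel.  The flag values match the enum in gsi.h;
struct tre_bits_sketch and flags_to_tre_bits() are hypothetical names, and
the real struct gsi_tre layout is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)	(1u << (n))

/* Same bit assignments as enum gsi_xfer_flag in gsi.h */
enum xfer_flag_sketch {
	XFER_FLAG_CHAIN	= BIT(1),	/* not the last element */
	XFER_FLAG_EOB	= BIT(2),	/* event interrupt when complete */
	XFER_FLAG_EOT	= BIT(3),	/* interrupt on end of transfer */
	XFER_FLAG_BEI	= BIT(4),	/* block the event interrupt */
};

/* Hypothetical holder for the four per-element control bits */
struct tre_bits_sketch {
	unsigned int chain : 1;
	unsigned int ieob : 1;
	unsigned int ieot : 1;
	unsigned int bei : 1;
};

/* Convert a flag mask into the individual control bits */
static struct tre_bits_sketch flags_to_tre_bits(uint32_t flags)
{
	struct tre_bits_sketch bits = { 0 };

	bits.chain = flags & XFER_FLAG_CHAIN ? 1 : 0;
	bits.ieob = flags & XFER_FLAG_EOB ? 1 : 0;
	bits.ieot = flags & XFER_FLAG_EOT ? 1 : 0;
	bits.bei = flags & XFER_FLAG_BEI ? 1 : 0;

	return bits;
}
```

For example, a mask of XFER_FLAG_EOT | XFER_FLAG_BEI sets only the ieot and
bei bits, which per the gsi.h comments requests an end-of-transfer event
without interrupting the AP.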
diff --git a/drivers/net/ipa/gsi_reg.h b/drivers/net/ipa/gsi_reg.h
new file mode 100644
index 000000000000..fe5f98ef3840
--- /dev/null
+++ b/drivers/net/ipa/gsi_reg.h
@@ -0,0 +1,563 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2015-2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018 Linaro Ltd.
+ */
+#ifndef __GSI_REG_H__
+#define __GSI_REG_H__
+
+/* The maximum allowed value of "n" for any N-parameterized macro below
+ * is 3.  The N value comes from the ipa_ees enumerated type.
+ *
+ * For GSI_INST_RAM_I_OFFS(), the "i" value supplied is an instruction
+ * offset (where each instruction is 32 bits wide).  The maximum offset
+ * value is 4095.
+ *
+ * Macros parameterized by (data) channel number supply a parameter "c".
+ * The maximum value of "c" is 30 (but the limit is hardware-dependent).
+ *
+ * Macros parameterized by event channel number supply a parameter "e".
+ * The maximum value of "e" is 15 (but the limit is hardware-dependent).
+ *
+ * For any K-parameterized macros, the "k" value will represent either an
+ * event ring id or a (data) channel id.  15 is the maximum value of
+ * "k" for event rings; otherwise the maximum is 30.
+ */
+#define GSI_CFG_OFFS				0x00000000
+#define GSI_ENABLE_FMASK			0x00000001
+#define MCS_ENABLE_FMASK			0x00000002
+#define DOUBLE_MCS_CLK_FREQ_FMASK		0x00000004
+#define UC_IS_MCS_FMASK				0x00000008
+#define PWR_CLPS_FMASK				0x00000010
+#define BP_MTRIX_DISABLE_FMASK			0x00000020
+
+#define GSI_MCS_CFG_OFFS			0x0000b000
+#define MCS_CFG_ENABLE_FMASK			0x00000001
+
+#define GSI_PERIPH_BASE_ADDR_LSB_OFFS		0x00000018
+
+#define GSI_PERIPH_BASE_ADDR_MSB_OFFS		0x0000001c
+
+#define GSI_IC_DISABLE_CHNL_BCK_PRS_LSB_OFFS	0x000000a0
+#define CHNL_REE_INT_FMASK			0x00000007
+#define CHNL_EV_ENG_INT_FMASK			0x00000040
+#define CHNL_INT_END_INT_FMASK			0x00001000
+#define CHNL_CSR_INT_FMASK			0x00fc0000
+#define CHNL_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_DISABLE_CHNL_BCK_PRS_MSB_OFFS	0x000000a4
+#define CHNL_TIMER_INT_FMASK			0x00000001
+#define CHNL_DB_ENG_INT_FMASK			0x00000040
+#define CHNL_RD_WR_INT_FMASK			0x00003000
+#define CHNL_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_GEN_EVNT_BCK_PRS_LSB_OFFS	0x000000a8
+#define EVT_REE_INT_FMASK			0x00000007
+#define EVT_EV_ENG_INT_FMASK			0x00000040
+#define EVT_INT_END_INT_FMASK			0x00001000
+#define EVT_CSR_INT_FMASK			0x00fc0000
+#define EVT_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_GEN_EVNT_BCK_PRS_MSB_OFFS	0x000000ac
+#define EVT_TIMER_INT_FMASK			0x00000001
+#define EVT_DB_ENG_INT_FMASK			0x00000040
+#define EVT_RD_WR_INT_FMASK			0x00003000
+#define EVT_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_GEN_INT_BCK_PRS_LSB_OFFS		0x000000b0
+#define INT_REE_INT_FMASK			0x00000007
+#define INT_EV_ENG_INT_FMASK			0x00000040
+#define INT_INT_END_INT_FMASK			0x00001000
+#define INT_CSR_INT_FMASK			0x00fc0000
+#define INT_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_GEN_INT_BCK_PRS_MSB_OFFS		0x000000b4
+#define INT_TIMER_INT_FMASK			0x00000001
+#define INT_DB_ENG_INT_FMASK			0x00000040
+#define INT_RD_WR_INT_FMASK			0x00003000
+#define INT_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_STOP_INT_MOD_BCK_PRS_LSB_OFFS	0x000000b8
+#define REE_INT_FMASK				0x00000007
+#define EV_ENG_INT_FMASK			0x00000040
+#define INT_END_INT_FMASK			0x00001000
+#define CSR_INT_FMASK				0x00fc0000
+#define TLV_INT_FMASK				0x3f000000
+
+#define GSI_IC_STOP_INT_MOD_BCK_PRS_MSB_OFFS	0x000000bc
+#define TIMER_INT_FMASK				0x00000001
+#define DB_ENG_INT_FMASK			0x00000040
+#define RD_WR_INT_FMASK				0x00003000
+#define UCONTROLLER_INT_FMASK			0x00fc0000
+
+#define GSI_IC_PROCESS_DESC_BCK_PRS_LSB_OFFS	0x000000c0
+#define DESC_REE_INT_FMASK			0x00000007
+#define DESC_EV_ENG_INT_FMASK			0x00000040
+#define DESC_INT_END_INT_FMASK			0x00001000
+#define DESC_CSR_INT_FMASK			0x00fc0000
+#define DESC_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_PROCESS_DESC_BCK_PRS_MSB_OFFS	0x000000c4
+#define DESC_TIMER_INT_FMASK			0x00000001
+#define DESC_DB_ENG_INT_FMASK			0x00000040
+#define DESC_RD_WR_INT_FMASK			0x00003000
+#define DESC_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_TLV_STOP_BCK_PRS_LSB_OFFS	0x000000c8
+#define STOP_REE_INT_FMASK			0x00000007
+#define STOP_EV_ENG_INT_FMASK			0x00000040
+#define STOP_INT_END_INT_FMASK			0x00001000
+#define STOP_CSR_INT_FMASK			0x00fc0000
+#define STOP_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_TLV_STOP_BCK_PRS_MSB_OFFS	0x000000cc
+#define STOP_TIMER_INT_FMASK			0x00000001
+#define STOP_DB_ENG_INT_FMASK			0x00000040
+#define STOP_RD_WR_INT_FMASK			0x00003000
+#define STOP_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_TLV_RESET_BCK_PRS_LSB_OFFS	0x000000d0
+#define RST_REE_INT_FMASK			0x00000007
+#define RST_EV_ENG_INT_FMASK			0x00000040
+#define RST_INT_END_INT_FMASK			0x00001000
+#define RST_CSR_INT_FMASK			0x00fc0000
+#define RST_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_TLV_RESET_BCK_PRS_MSB_OFFS	0x000000d4
+#define RST_TIMER_INT_FMASK			0x00000001
+#define RST_DB_ENG_INT_FMASK			0x00000040
+#define RST_RD_WR_INT_FMASK			0x00003000
+#define RST_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_RGSTR_TIMER_BCK_PRS_LSB_OFFS	0x000000d8
+#define TMR_REE_INT_FMASK			0x00000007
+#define TMR_EV_ENG_INT_FMASK			0x00000040
+#define TMR_INT_END_INT_FMASK			0x00001000
+#define TMR_CSR_INT_FMASK			0x00fc0000
+#define TMR_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_RGSTR_TIMER_BCK_PRS_MSB_OFFS	0x000000dc
+#define TMR_TIMER_INT_FMASK			0x00000001
+#define TMR_DB_ENG_INT_FMASK			0x00000040
+#define TMR_RD_WR_INT_FMASK			0x00003000
+#define TMR_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_READ_BCK_PRS_LSB_OFFS		0x000000e0
+#define RD_REE_INT_FMASK			0x00000007
+#define RD_EV_ENG_INT_FMASK			0x00000040
+#define RD_INT_END_INT_FMASK			0x00001000
+#define RD_CSR_INT_FMASK			0x00fc0000
+#define RD_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_READ_BCK_PRS_MSB_OFFS		0x000000e4
+#define RD_TIMER_INT_FMASK			0x00000001
+#define RD_DB_ENG_INT_FMASK			0x00000040
+#define RD_RD_WR_INT_FMASK			0x00003000
+#define RD_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_WRITE_BCK_PRS_LSB_OFFS		0x000000e8
+#define WR_REE_INT_FMASK			0x00000007
+#define WR_EV_ENG_INT_FMASK			0x00000040
+#define WR_INT_END_INT_FMASK			0x00001000
+#define WR_CSR_INT_FMASK			0x00fc0000
+#define WR_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_WRITE_BCK_PRS_MSB_OFFS		0x000000ec
+#define WR_TIMER_INT_FMASK			0x00000001
+#define WR_DB_ENG_INT_FMASK			0x00000040
+#define WR_RD_WR_INT_FMASK			0x00003000
+#define WR_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IC_UCONTROLLER_GPR_BCK_PRS_LSB_OFFS	0x000000f0
+#define UC_REE_INT_FMASK			0x00000007
+#define UC_EV_ENG_INT_FMASK			0x00000040
+#define UC_INT_END_INT_FMASK			0x00001000
+#define UC_CSR_INT_FMASK			0x00fc0000
+#define UC_TLV_INT_FMASK			0x3f000000
+
+#define GSI_IC_UCONTROLLER_GPR_BCK_PRS_MSB_OFFS	0x000000f4
+#define UC_TIMER_INT_FMASK			0x00000001
+#define UC_DB_ENG_INT_FMASK			0x00000040
+#define UC_RD_WR_INT_FMASK			0x00003000
+#define UC_UCONTROLLER_INT_FMASK		0x00fc0000
+
+#define GSI_IRAM_PTR_CH_CMD_OFFS		0x00000400
+#define CMD_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_EE_GENERIC_CMD_OFFS	0x00000404
+#define EE_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_CH_DB_OFFS			0x00000418
+#define CH_DB_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_EV_DB_OFFS			0x0000041c
+#define EV_DB_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_NEW_RE_OFFS		0x00000420
+#define NEW_RE_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_CH_DIS_COMP_OFFS		0x00000424
+#define DIS_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_CH_EMPTY_OFFS		0x00000428
+#define EMPTY_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_EVENT_GEN_COMP_OFFS	0x0000042c
+#define EVT_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_PERIPH_IF_TLV_IN_0_OFFS	0x00000430
+#define IN_0_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_PERIPH_IF_TLV_IN_2_OFFS	0x00000434
+#define IN_2_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_PERIPH_IF_TLV_IN_1_OFFS	0x00000438
+#define IN_1_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_TIMER_EXPIRED_OFFS		0x0000043c
+#define TMR_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_WRITE_ENG_COMP_OFFS	0x00000440
+#define WR_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_READ_ENG_COMP_OFFS		0x00000444
+#define RD_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_UC_GP_INT_OFFS		0x00000448
+#define UC_IRAM_PTR_FMASK			0x00000fff
+
+#define GSI_IRAM_PTR_INT_MOD_STOPPED_OFFS	0x0000044c
+#define STOP_IRAM_PTR_FMASK			0x00000fff
+
+/* Max value of I for the GSI_INST_RAM_I_OFFS() is 4095 */
+#define GSI_INST_RAM_I_OFFS(i)			(0x00004000 + 0x0004 * (i))
+#define INST_BYTE_0_FMASK			0x000000ff
+#define INST_BYTE_1_FMASK			0x0000ff00
+#define INST_BYTE_2_FMASK			0x00ff0000
+#define INST_BYTE_3_FMASK			0xff000000
+
+#define GSI_CH_C_CNTXT_0_OFFS(c) \
+				GSI_EE_N_CH_C_CNTXT_0_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_0_OFFS(c, n) \
+					(0x0001c000 + 0x4000 * (n) + 0x80 * (c))
+#define CHTYPE_PROTOCOL_FMASK			0x00000007
+#define CHTYPE_DIR_FMASK			0x00000008
+#define EE_FMASK				0x000000f0
+#define CHID_FMASK				0x00001f00
+#define ERINDEX_FMASK				0x0007c000
+#define CHSTATE_FMASK				0x00f00000
+#define ELEMENT_SIZE_FMASK			0xff000000
+
+#define GSI_CH_C_CNTXT_1_OFFS(c) \
+				GSI_EE_N_CH_C_CNTXT_1_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_1_OFFS(c, n) \
+					(0x0001c004 + 0x4000 * (n) + 0x80 * (c))
+#define R_LENGTH_FMASK				0x0000ffff
+
+#define GSI_CH_C_CNTXT_2_OFFS(c) \
+				GSI_EE_N_CH_C_CNTXT_2_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_2_OFFS(c, n) \
+					(0x0001c008 + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_CH_C_CNTXT_3_OFFS(c) \
+				GSI_EE_N_CH_C_CNTXT_3_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_3_OFFS(c, n) \
+					(0x0001c00c + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_CH_C_CNTXT_4_OFFS(c) \
+				GSI_EE_N_CH_C_CNTXT_4_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_4_OFFS(c, n) \
+					(0x0001c010 + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_CH_C_CNTXT_6_OFFS(c) \
+				GSI_EE_N_CH_C_CNTXT_6_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_CNTXT_6_OFFS(c, n) \
+					(0x0001c018 + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_CH_C_QOS_OFFS(c)	GSI_EE_N_CH_C_QOS_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_QOS_OFFS(c, n)	(0x0001c05c + 0x4000 * (n) + 0x80 * (c))
+#define WRR_WEIGHT_FMASK			0x0000000f
+#define MAX_PREFETCH_FMASK			0x00000100
+#define USE_DB_ENG_FMASK			0x00000200
+
+#define GSI_CH_C_SCRATCH_0_OFFS(c) \
+				GSI_EE_N_CH_C_SCRATCH_0_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_0_OFFS(c, n) \
+					(0x0001c060 + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_CH_C_SCRATCH_1_OFFS(c) \
+				GSI_EE_N_CH_C_SCRATCH_1_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_1_OFFS(c, n) \
+					(0x0001c064 + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_CH_C_SCRATCH_2_OFFS(c) \
+				GSI_EE_N_CH_C_SCRATCH_2_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_2_OFFS(c, n) \
+					(0x0001c068 + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_CH_C_SCRATCH_3_OFFS(c) \
+				GSI_EE_N_CH_C_SCRATCH_3_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_SCRATCH_3_OFFS(c, n) \
+					(0x0001c06c + 0x4000 * (n) + 0x80 * (c))
+
+#define GSI_EV_CH_E_CNTXT_0_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_0_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_0_OFFS(e, n) \
+					(0x0001d000 + 0x4000 * (n) + 0x80 * (e))
+#define EV_CHTYPE_FMASK				0x0000000f
+#define EV_EE_FMASK				0x000000f0
+#define EV_EVCHID_FMASK				0x0000ff00
+#define EV_INTYPE_FMASK				0x00010000
+#define EV_CHSTATE_FMASK			0x00f00000
+#define EV_ELEMENT_SIZE_FMASK			0xff000000
+
+#define GSI_EV_CH_E_CNTXT_1_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_1_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_1_OFFS(e, n) \
+					(0x0001d004 + 0x4000 * (n) + 0x80 * (e))
+#define EV_R_LENGTH_FMASK			0x0000ffff
+
+#define GSI_EV_CH_E_CNTXT_2_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_2_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_2_OFFS(e, n) \
+					(0x0001d008 + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_CNTXT_3_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_3_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_3_OFFS(e, n) \
+					(0x0001d00c + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_CNTXT_4_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_4_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_4_OFFS(e, n) \
+					(0x0001d010 + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_CNTXT_8_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_8_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_8_OFFS(e, n) \
+					(0x0001d020 + 0x4000 * (n) + 0x80 * (e))
+#define MODT_FMASK				0x0000ffff
+#define MODC_FMASK				0x00ff0000
+#define MOD_CNT_FMASK				0xff000000
+
+#define GSI_EV_CH_E_CNTXT_9_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_9_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_9_OFFS(e, n) \
+					(0x0001d024 + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_CNTXT_10_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_10_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_10_OFFS(e, n) \
+					(0x0001d028 + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_CNTXT_11_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_11_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_11_OFFS(e, n) \
+					(0x0001d02c + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_CNTXT_12_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_12_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_12_OFFS(e, n) \
+					(0x0001d030 + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_CNTXT_13_OFFS(e) \
+				GSI_EE_N_EV_CH_E_CNTXT_13_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_CNTXT_13_OFFS(e, n) \
+					(0x0001d034 + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_SCRATCH_0_OFFS(e) \
+				GSI_EE_N_EV_CH_E_SCRATCH_0_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_SCRATCH_0_OFFS(e, n) \
+					(0x0001d048 + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_EV_CH_E_SCRATCH_1_OFFS(e) \
+				GSI_EE_N_EV_CH_E_SCRATCH_1_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_SCRATCH_1_OFFS(e, n) \
+					(0x0001d04c + 0x4000 * (n) + 0x80 * (e))
+
+#define GSI_CH_C_DOORBELL_0_OFFS(c) \
+				GSI_EE_N_CH_C_DOORBELL_0_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_DOORBELL_0_OFFS(c, n) \
+					(0x0001e000 + 0x4000 * (n) + 0x08 * (c))
+
+#define GSI_CH_C_DOORBELL_1_OFFS(c) \
+				GSI_EE_N_CH_C_DOORBELL_1_OFFS(c, IPA_EE_AP)
+#define GSI_EE_N_CH_C_DOORBELL_1_OFFS(c, n) \
+					(0x0001e004 + 0x4000 * (n) + 0x08 * (c))
+
+#define GSI_EV_CH_E_DOORBELL_0_OFFS(e) \
+				GSI_EE_N_EV_CH_E_DOORBELL_0_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_DOORBELL_0_OFFS(e, n) \
+					(0x0001e100 + 0x4000 * (n) + 0x08 * (e))
+
+#define GSI_EV_CH_E_DOORBELL_1_OFFS(e) \
+				GSI_EE_N_EV_CH_E_DOORBELL_1_OFFS(e, IPA_EE_AP)
+#define GSI_EE_N_EV_CH_E_DOORBELL_1_OFFS(e, n) \
+					(0x0001e104 + 0x4000 * (n) + 0x08 * (e))
+
+#define GSI_GSI_STATUS_OFFS	GSI_EE_N_GSI_STATUS_OFFS(IPA_EE_AP)
+#define GSI_EE_N_GSI_STATUS_OFFS(n)		(0x0001f000 + 0x4000 * (n))
+#define ENABLED_FMASK				0x00000001
+
+#define GSI_CH_CMD_OFFS		GSI_EE_N_CH_CMD_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CH_CMD_OFFS(n)			(0x0001f008 + 0x4000 * (n))
+#define CH_CHID_FMASK				0x000000ff
+#define CH_OPCODE_FMASK				0xff000000
+
+#define GSI_EV_CH_CMD_OFFS	GSI_EE_N_EV_CH_CMD_OFFS(IPA_EE_AP)
+#define GSI_EE_N_EV_CH_CMD_OFFS(n)		(0x0001f010 + 0x4000 * (n))
+#define EV_CHID_FMASK				0x000000ff
+#define EV_OPCODE_FMASK				0xff000000
+
+#define GSI_GSI_HW_PARAM_2_OFFS	GSI_EE_N_GSI_HW_PARAM_2_OFFS(IPA_EE_AP)
+#define GSI_EE_N_GSI_HW_PARAM_2_OFFS(n)		(0x0001f040 + 0x4000 * (n))
+#define IRAM_SIZE_FMASK				0x00000007
+#define NUM_CH_PER_EE_FMASK			0x000000f8
+#define NUM_EV_PER_EE_FMASK			0x00001f00
+#define GSI_CH_PEND_TRANSLATE_FMASK		0x00002000
+#define GSI_CH_FULL_LOGIC_FMASK			0x00004000
+#define IRAM_SIZE_ONE_KB_FVAL			0
+#define IRAM_SIZE_TWO_KB_FVAL			1
+
+#define GSI_GSI_SW_VERSION_OFFS	GSI_EE_N_GSI_SW_VERSION_OFFS(IPA_EE_AP)
+#define GSI_EE_N_GSI_SW_VERSION_OFFS(n)		(0x0001f044 + 0x4000 * (n))
+#define STEP_FMASK				0x0000ffff
+#define MINOR_FMASK				0x0fff0000
+#define MAJOR_FMASK				0xf0000000
+
+#define GSI_GSI_MCS_CODE_VER_OFFS \
+				GSI_EE_N_GSI_MCS_CODE_VER_OFFS(IPA_EE_AP)
+#define GSI_EE_N_GSI_MCS_CODE_VER_OFFS(n)	(0x0001f048 + 0x4000 * (n))
+
+#define GSI_CNTXT_TYPE_IRQ_OFFS	GSI_EE_N_CNTXT_TYPE_IRQ_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_TYPE_IRQ_OFFS(n)		(0x0001f080 + 0x4000 * (n))
+#define CH_CTRL_FMASK				0x00000001
+#define EV_CTRL_FMASK				0x00000002
+#define GLOB_EE_FMASK				0x00000004
+#define IEOB_FMASK				0x00000008
+#define INTER_EE_CH_CTRL_FMASK			0x00000010
+#define INTER_EE_EV_CTRL_FMASK			0x00000020
+#define GENERAL_FMASK				0x00000040
+
+#define GSI_CNTXT_TYPE_IRQ_MSK_OFFS \
+				GSI_EE_N_CNTXT_TYPE_IRQ_MSK_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_TYPE_IRQ_MSK_OFFS(n)	(0x0001f088 + 0x4000 * (n))
+#define MSK_CH_CTRL_FMASK			0x00000001
+#define MSK_EV_CTRL_FMASK			0x00000002
+#define MSK_GLOB_EE_FMASK			0x00000004
+#define MSK_IEOB_FMASK				0x00000008
+#define MSK_INTER_EE_CH_CTRL_FMASK		0x00000010
+#define MSK_INTER_EE_EV_CTRL_FMASK		0x00000020
+#define MSK_GENERAL_FMASK			0x00000040
+
+#define GSI_CNTXT_SRC_CH_IRQ_OFFS \
+				GSI_EE_N_CNTXT_SRC_CH_IRQ_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_CH_IRQ_OFFS(n)	(0x0001f090 + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_EV_CH_IRQ_OFFS \
+				GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_OFFS(n)	(0x0001f094 + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_CH_IRQ_MSK_OFFS \
+				GSI_EE_N_CNTXT_SRC_CH_IRQ_MSK_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_CH_IRQ_MSK_OFFS(n)	(0x0001f098 + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_EV_CH_IRQ_MSK_OFFS \
+				GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_MSK_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_MSK_OFFS(n) (0x0001f09c + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_CH_IRQ_CLR_OFFS \
+				GSI_EE_N_CNTXT_SRC_CH_IRQ_CLR_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_CH_IRQ_CLR_OFFS(n)	(0x0001f0a0 + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_EV_CH_IRQ_CLR_OFFS \
+				GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_CLR_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_EV_CH_IRQ_CLR_OFFS(n) (0x0001f0a4 + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_IEOB_IRQ_OFFS \
+				GSI_EE_N_CNTXT_SRC_IEOB_IRQ_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_IEOB_IRQ_OFFS(n)	(0x0001f0b0 + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_IEOB_IRQ_MSK_OFFS \
+				GSI_EE_N_CNTXT_SRC_IEOB_IRQ_MSK_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_IEOB_IRQ_MSK_OFFS(n)	(0x0001f0b8 + 0x4000 * (n))
+
+#define GSI_CNTXT_SRC_IEOB_IRQ_CLR_OFFS \
+				GSI_EE_N_CNTXT_SRC_IEOB_IRQ_CLR_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_SRC_IEOB_IRQ_CLR_OFFS(n)	(0x0001f0c0 + 0x4000 * (n))
+
+#define GSI_CNTXT_GLOB_IRQ_STTS_OFFS \
+				GSI_EE_N_CNTXT_GLOB_IRQ_STTS_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_GLOB_IRQ_STTS_OFFS(n)	(0x0001f100 + 0x4000 * (n))
+#define ERROR_INT_FMASK				0x00000001
+#define GP_INT1_FMASK				0x00000002
+#define GP_INT2_FMASK				0x00000004
+#define GP_INT3_FMASK				0x00000008
+
+#define GSI_CNTXT_GLOB_IRQ_EN_OFFS \
+				GSI_EE_N_CNTXT_GLOB_IRQ_EN_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_GLOB_IRQ_EN_OFFS(n)	(0x0001f108 + 0x4000 * (n))
+#define EN_ERROR_INT_FMASK			0x00000001
+#define EN_GP_INT1_FMASK			0x00000002
+#define EN_GP_INT2_FMASK			0x00000004
+#define EN_GP_INT3_FMASK			0x00000008
+
+#define GSI_CNTXT_GLOB_IRQ_CLR_OFFS \
+				GSI_EE_N_CNTXT_GLOB_IRQ_CLR_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_GLOB_IRQ_CLR_OFFS(n)	(0x0001f110 + 0x4000 * (n))
+#define CLR_ERROR_INT_FMASK			0x00000001
+#define CLR_GP_INT1_FMASK			0x00000002
+#define CLR_GP_INT2_FMASK			0x00000004
+#define CLR_GP_INT3_FMASK			0x00000008
+
+#define GSI_CNTXT_GSI_IRQ_STTS_OFFS \
+				GSI_EE_N_CNTXT_GSI_IRQ_STTS_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_GSI_IRQ_STTS_OFFS(n)	(0x0001f118 + 0x4000 * (n))
+#define BREAK_POINT_FMASK			0x00000001
+#define BUS_ERROR_FMASK				0x00000002
+#define CMD_FIFO_OVRFLOW_FMASK			0x00000004
+#define MCS_STACK_OVRFLOW_FMASK			0x00000008
+
+#define GSI_CNTXT_GSI_IRQ_EN_OFFS \
+				GSI_EE_N_CNTXT_GSI_IRQ_EN_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_GSI_IRQ_EN_OFFS(n)	(0x0001f120 + 0x4000 * (n))
+#define EN_BREAK_POINT_FMASK			0x00000001
+#define EN_BUS_ERROR_FMASK			0x00000002
+#define EN_CMD_FIFO_OVRFLOW_FMASK		0x00000004
+#define EN_MCS_STACK_OVRFLOW_FMASK		0x00000008
+
+#define GSI_CNTXT_GSI_IRQ_CLR_OFFS \
+				GSI_EE_N_CNTXT_GSI_IRQ_CLR_OFFS(IPA_EE_AP)
+#define GSI_EE_N_CNTXT_GSI_IRQ_CLR_OFFS(n)	(0x0001f128 + 0x4000 * (n))
+#define CLR_BREAK_POINT_FMASK			0x00000001
+#define CLR_BUS_ERROR_FMASK			0x00000002
+#define CLR_CMD_FIFO_OVRFLOW_FMASK		0x00000004
+#define CLR_MCS_STACK_OVRFLOW_FMASK		0x00000008
+
+#define GSI_EE_N_CNTXT_INTSET_OFFS(n)		(0x0001f180 + 0x4000 * (n))
+#define INTYPE_FMASK				0x00000001
+#define GSI_CNTXT_INTSET_OFFS	GSI_EE_N_CNTXT_INTSET_OFFS(IPA_EE_AP)
+
+#define GSI_ERROR_LOG_OFFS	GSI_EE_N_ERROR_LOG_OFFS(IPA_EE_AP)
+#define GSI_EE_N_ERROR_LOG_OFFS(n)		(0x0001f200 + 0x4000 * (n))
+
+#define GSI_ERROR_LOG_CLR_OFFS	GSI_EE_N_ERROR_LOG_CLR_OFFS(IPA_EE_AP)
+#define GSI_EE_N_ERROR_LOG_CLR_OFFS(n)		(0x0001f210 + 0x4000 * (n))
+
+#define GSI_INTER_EE_SRC_CH_IRQ_OFFS \
+				GSI_INTER_EE_N_SRC_CH_IRQ_OFFS(IPA_EE_AP)
+#define GSI_INTER_EE_N_SRC_CH_IRQ_OFFS(n)	(0x0000c018 + 0x1000 * (n))
+
+#define GSI_INTER_EE_SRC_EV_CH_IRQ_OFFS \
+				GSI_INTER_EE_N_SRC_EV_CH_IRQ_OFFS(IPA_EE_AP)
+#define GSI_INTER_EE_N_SRC_EV_CH_IRQ_OFFS(n)	(0x0000c01c + 0x1000 * (n))
+
+#define GSI_INTER_EE_SRC_CH_IRQ_CLR_OFFS \
+				GSI_INTER_EE_N_SRC_CH_IRQ_CLR_OFFS(IPA_EE_AP)
+#define GSI_INTER_EE_N_SRC_CH_IRQ_CLR_OFFS(n)	(0x0000c028 + 0x1000 * (n))
+
+#define GSI_INTER_EE_SRC_EV_CH_IRQ_CLR_OFFS \
+				GSI_INTER_EE_N_SRC_EV_CH_IRQ_CLR_OFFS(IPA_EE_AP)
+#define GSI_INTER_EE_N_SRC_EV_CH_IRQ_CLR_OFFS(n) (0x0000c02c + 0x1000 * (n))
+
+#endif	/* _GSI_REG_H__ */
-- 
2.17.1

^ permalink raw reply related
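
A note on the register definitions in gsi_reg.h above: each per-EE
register offset is a base plus 0x4000 per execution environment plus a
per-channel stride, and each *_FMASK names a contiguous bit field in
that register.  The following is only a minimal userspace sketch of that
arithmetic, not code from the patch.  The example_field_get() and
example_field_prep() helpers and EXAMPLE_EE_AP are hypothetical names
introduced for illustration, and the value 0 for the AP execution
environment is an assumption (IPA_EE_AP is defined elsewhere in the
series).

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gsi_reg.h hunk above */
#define CHSTATE_FMASK		0x00f00000
#define ERINDEX_FMASK		0x0007c000

#define GSI_EE_N_CH_C_CNTXT_0_OFFS(c, n) \
					(0x0001c000 + 0x4000 * (n) + 0x80 * (c))

/* Assumption: the AP execution environment is numbered 0 */
#define EXAMPLE_EE_AP		0

/* Hypothetical helper: value of the lowest set bit of a field mask */
static inline uint32_t example_field_lsb(uint32_t mask)
{
	assert(mask != 0);

	return mask & (~mask + 1);
}

/* Hypothetical helper: extract the field selected by a contiguous mask */
static inline uint32_t example_field_get(uint32_t val, uint32_t mask)
{
	/* Dividing by the lowest set bit shifts the field down to bit 0 */
	return (val & mask) / example_field_lsb(mask);
}

/* Hypothetical helper: position a field value under a contiguous mask */
static inline uint32_t example_field_prep(uint32_t field, uint32_t mask)
{
	/* Multiplying by the lowest set bit shifts the value into place */
	return (field * example_field_lsb(mask)) & mask;
}
```

With these, GSI_EE_N_CH_C_CNTXT_0_OFFS(2, EXAMPLE_EE_AP) is the offset
of the context 0 register for channel 2, and example_field_get() applied
to a value read from it with CHSTATE_FMASK gives the channel state.  In
the kernel itself the <linux/bitfield.h> helpers would be the natural
choice for this kind of field access.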

* [RFC PATCH 02/12] soc: qcom: ipa: DMA helpers
From: Alex Elder @ 2018-11-07  0:32 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas
  Cc: netdev, devicetree, linux-arm-msm, linux-soc, linux-arm-kernel,
	linux-kernel, syadagir, mjavid, robh+dt, mark.rutland
In-Reply-To: <20181107003250.5832-1-elder@linaro.org>

This patch includes code implementing the IPA DMA module, which
defines a structure to represent a DMA allocation for the IPA device.
It's used throughout the IPA code.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/ipa_dma.c | 61 +++++++++++++++++++++++++++++++++++++++
 drivers/net/ipa/ipa_dma.h | 61 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 122 insertions(+)
 create mode 100644 drivers/net/ipa/ipa_dma.c
 create mode 100644 drivers/net/ipa/ipa_dma.h

diff --git a/drivers/net/ipa/ipa_dma.c b/drivers/net/ipa/ipa_dma.c
new file mode 100644
index 000000000000..dfde59e5072a
--- /dev/null
+++ b/drivers/net/ipa/ipa_dma.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018 Linaro Ltd.
+ */
+
+#include <linux/types.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/string.h>
+
+#include "ipa_dma.h"
+
+static struct device *ipa_dma_dev;
+
+int ipa_dma_init(struct device *dev, u32 align)
+{
+	int ret;
+
+	/* Ensure DMA addresses will have the alignment we require */
+	if (dma_get_cache_alignment() % align)
+		return -ENOTSUPP;
+
+	ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
+	if (!ret)
+		ipa_dma_dev = dev;
+
+	return ret;
+}
+
+void ipa_dma_exit(void)
+{
+	ipa_dma_dev = NULL;
+}
+
+int ipa_dma_alloc(struct ipa_dma_mem *mem, size_t size, gfp_t gfp)
+{
+	dma_addr_t phys;
+	void *virt;
+
+	virt = dma_zalloc_coherent(ipa_dma_dev, size, &phys, gfp);
+	if (!virt)
+		return -ENOMEM;
+
+	mem->virt = virt;
+	mem->phys = phys;
+	mem->size = size;
+
+	return 0;
+}
+
+void ipa_dma_free(struct ipa_dma_mem *mem)
+{
+	dma_free_coherent(ipa_dma_dev, mem->size, mem->virt, mem->phys);
+	memset(mem, 0, sizeof(*mem));
+}
+
+void *ipa_dma_phys_to_virt(struct ipa_dma_mem *mem, dma_addr_t phys)
+{
+	return mem->virt + (phys - mem->phys);
+}
diff --git a/drivers/net/ipa/ipa_dma.h b/drivers/net/ipa/ipa_dma.h
new file mode 100644
index 000000000000..e211dbd9d4ec
--- /dev/null
+++ b/drivers/net/ipa/ipa_dma.h
@@ -0,0 +1,61 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/* Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ * Copyright (C) 2018 Linaro Ltd.
+ */
+#ifndef _IPA_DMA_H_
+#define _IPA_DMA_H_
+
+#include <linux/types.h>
+#include <linux/device.h>
+
+/**
+ * struct ipa_dma_mem - IPA allocated DMA memory descriptor
+ * @virt: host virtual base address of allocated DMA memory
+ * @phys: bus physical base address of DMA memory
+ * @size: size (bytes) of DMA memory
+ */
+struct ipa_dma_mem {
+	void *virt;
+	dma_addr_t phys;
+	size_t size;
+};
+
+/**
+ * ipa_dma_init() - Initialize IPA DMA system.
+ * @dev:	IPA device structure
+ * @align:	Hardware required alignment for DMA memory
+ *
+ * Return:	0 if successful, or a negative error code.
+ */
+int ipa_dma_init(struct device *dev, u32 align);
+
+/**
+ * ipa_dma_exit() - shut down/clean up IPA DMA system
+ */
+void ipa_dma_exit(void);
+
+/**
+ * ipa_dma_alloc() - allocate a DMA buffer, describe it in mem struct
+ * @mem:	Memory structure to fill with allocation information.
+ * @size:	Size of DMA buffer to allocate.
+ * @gfp:	Allocation mode.
+ */
+int ipa_dma_alloc(struct ipa_dma_mem *mem, size_t size, gfp_t gfp);
+
+/**
+ * ipa_dma_free() - free a previously-allocated DMA buffer
+ * @mem:	Information about DMA allocation to free
+ */
+void ipa_dma_free(struct ipa_dma_mem *mem);
+
+/**
+ * ipa_dma_phys_to_virt() - return the virtual equivalent of a DMA address
+ * @mem:	DMA allocation information
+ * @phys:	Physical address to convert
+ *
+ * Return:	Virtual address corresponding to the given physical address
+ */
+void *ipa_dma_phys_to_virt(struct ipa_dma_mem *mem, dma_addr_t phys);
+
+#endif /* !_IPA_DMA_H_ */
-- 
2.17.1

^ permalink raw reply related
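
For readers following ipa_dma_phys_to_virt() above: the conversion is
plain offset arithmetic within a single allocation.  Below is a minimal
userspace sketch of the same idea, not the patch's code.  The example_*
names are hypothetical, uint64_t stands in for dma_addr_t (an
assumption), and the range check is an addition that the posted helper
does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's dma_addr_t (assumption: 64-bit addresses) */
typedef uint64_t example_dma_addr_t;

/* Mirrors the layout of struct ipa_dma_mem from the patch */
struct example_dma_mem {
	void *virt;			/* virtual base address */
	example_dma_addr_t phys;	/* bus address of the same base */
	size_t size;			/* size in bytes */
};

/* Return the virtual address for phys, or NULL if it is out of range */
static void *example_phys_to_virt(const struct example_dma_mem *mem,
				  example_dma_addr_t phys)
{
	assert(mem != NULL);

	/* Range check added for illustration; the patch omits it */
	if (phys < mem->phys || phys - mem->phys >= mem->size)
		return NULL;

	/* Same offset from the virtual base as from the bus base */
	return (char *)mem->virt + (phys - mem->phys);
}
```

The cast to char * is there only because standard C does not define
arithmetic on void *; kernel code relies on the GNU extension instead.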

* [RFC PATCH 01/12] dt-bindings: soc: qcom: add IPA bindings
From: Alex Elder @ 2018-11-07  0:32 UTC (permalink / raw)
  To: robh+dt, mark.rutland, davem, arnd, bjorn.andersson,
	ilias.apalodimas
  Cc: netdev, devicetree, linux-arm-msm, linux-soc, linux-arm-kernel,
	linux-kernel, syadagir, mjavid
In-Reply-To: <20181107003250.5832-1-elder@linaro.org>

Add the binding definitions for the "qcom,ipa" and "qcom,rmnet-ipa"
device tree nodes.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 .../devicetree/bindings/soc/qcom/qcom,ipa.txt | 136 ++++++++++++++++++
 .../bindings/soc/qcom/qcom,rmnet-ipa.txt      |  15 ++
 2 files changed, 151 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ipa.txt
 create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,rmnet-ipa.txt

diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,ipa.txt b/Documentation/devicetree/bindings/soc/qcom/qcom,ipa.txt
new file mode 100644
index 000000000000..d4d3d37df029
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,ipa.txt
@@ -0,0 +1,136 @@
+Qualcomm IPA (IP Accelerator) Driver
+
+This binding describes the Qualcomm IPA.  The IPA is capable of offloading
+certain network processing tasks (e.g. filtering, routing, and NAT) from
+the main processor.  The IPA currently serves only as a network interface,
+providing access to an LTE network available via a modem.
+
+The IPA sits between multiple independent "execution environments,"
+including the AP subsystem (APSS) and the modem.  The IPA presents
+a Generic Software Interface (GSI) to each execution environment.
+The GSI is an integral part of the IPA, but it is logically isolated
+and has a distinct interrupt and a separately-defined address space.
+
+    ----------   -------------   ---------
+    |        |   |G|       |G|   |       |
+    |  APSS  |===|S|  IPA  |S|===| Modem |
+    |        |   |I|       |I|   |       |
+    ----------   -------------   ---------
+
+See also:
+  bindings/interrupt-controller/interrupts.txt
+  bindings/interconnect/interconnect.txt
+  bindings/soc/qcom/qcom,smp2p.txt
+  bindings/reserved-memory/reserved-memory.txt
+  bindings/clock/clock-bindings.txt
+
+All properties defined below are required.
+
+- compatible:
+	Must be one of the following compatible strings:
+		"qcom,ipa-sdm845-modem_init"
+		"qcom,ipa-sdm845-tz_init"
+
+- reg:
+	Resources specifying the physical address spaces of the IPA and GSI.
+
+- reg-names:
+	The names of the address space ranges defined by the "reg" property.
+	Must be "ipa" and "gsi".
+
+- interrupts-extended:
+	Specifies the IRQs used by the IPA.  Four cells are required,
+	specifying: the IPA IRQ; the GSI IRQ; the clock query interrupt
+	from the modem; and the "ready for stage 2 initialization"
+	interrupt from the modem.  The first two are hardware IRQs; the
+	third and fourth are SMP2P input interrupts.
+
+- interrupt-names:
+	The names of the interrupts defined by the "interrupts-extended"
+	property.  Must be "ipa", "gsi", "ipa-clock-query", and
+	"ipa-post-init".
+
+- clocks:
+	Resource that defines the IPA core clock.
+
+- clock-names:
+	The name used for the IPA core clock.  Must be "core".
+
+- interconnects:
+	Specifies the interconnects used by the IPA.  Three cells are
+	required, specifying:  the path from the IPA to memory; from
+	IPA to internal (SoC resident) memory; and between the AP
+	subsystem and IPA for register access.
+
+- interconnect-names:
+	The names of the interconnects defined by the "interconnects"
+	property.  Must be "memory", "imem", and "config".
+
+- qcom,smem-states:
+	The state bits used for SMP2P output.  Two cells must be specified.
+	The first indicates whether the value in the second bit is valid
+	(1 means valid).  The second, if valid, defines whether the IPA
+	clock is enabled (1 means enabled).
+
+- qcom,smem-state-names:
+	The names of the state bits used for SMP2P output.  These must be
+	"ipa-clock-enabled-valid" and "ipa-clock-enabled".
+
+- memory-region:
+	A phandle for a reserved memory area that holds the firmware passed
+	to TrustZone for authentication.  (Note, this is required
+	only for "qcom,ipa-sdm845-tz_init".)
+
+= EXAMPLE
+
+The following example represents the IPA present in the SDM845 SoC.  It
+shows portions of the "modem-smp2p" node to indicate its relationship
+with the interrupts and SMEM states used by the IPA.
+
+	modem-smp2p {
+		compatible = "qcom,smp2p";
+		. . .
+		ipa_smp2p_out: ipa-ap-to-modem {
+			qcom,entry-name = "ipa";
+			#qcom,smem-state-cells = <1>;
+		};
+
+		ipa_smp2p_in: ipa-modem-to-ap {
+			qcom,entry-name = "ipa";
+			interrupt-controller;
+			#interrupt-cells = <2>;
+		};
+	};
+
+	ipa@1e00000 {
+		compatible = "qcom,ipa-sdm845-modem_init";
+
+		reg = <0x1e40000 0x34000>,
+		      <0x1e04000 0x2c000>;
+		reg-names = "ipa",
+			    "gsi";
+
+		interrupts-extended = <&intc 0 311 IRQ_TYPE_LEVEL_HIGH>,
+				      <&intc 0 432 IRQ_TYPE_LEVEL_HIGH>,
+				      <&ipa_smp2p_in 0 IRQ_TYPE_EDGE_RISING>,
+				      <&ipa_smp2p_in 1 IRQ_TYPE_EDGE_RISING>;
+		interrupt-names = "ipa",
+				  "gsi",
+				  "ipa-clock-query",
+				  "ipa-post-init";
+
+		clocks = <&rpmhcc RPMH_IPA_CLK>;
+		clock-names = "core";
+
+		interconnects = <&qnoc MASTER_IPA &qnoc SLAVE_EBI1>,
+			        <&qnoc MASTER_IPA &qnoc SLAVE_IMEM>,
+			        <&qnoc MASTER_APPSS_PROC &qnoc SLAVE_IPA_CFG>;
+		interconnect-names = "memory",
+				     "imem",
+				     "config";
+
+		qcom,smem-states = <&ipa_smp2p_out 0>,
+				   <&ipa_smp2p_out 1>;
+		qcom,smem-state-names = "ipa-clock-enabled-valid",
+					"ipa-clock-enabled";
+	};
diff --git a/Documentation/devicetree/bindings/soc/qcom/qcom,rmnet-ipa.txt b/Documentation/devicetree/bindings/soc/qcom/qcom,rmnet-ipa.txt
new file mode 100644
index 000000000000..3d0b2aabefc7
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/qcom/qcom,rmnet-ipa.txt
@@ -0,0 +1,15 @@
+Qualcomm IPA RMNet Driver
+
+This binding describes the IPA RMNet driver, which is used to
+represent virtual interfaces available on the modem accessed via
+the IPA.  Other than the compatible string there are no properties
+associated with this device.
+
+- compatible:
+	Must be "qcom,rmnet-ipa".
+
+= EXAMPLE
+
+	qcom,rmnet-ipa {
+		compatible = "qcom,rmnet-ipa";
+	};
-- 
2.17.1

^ permalink raw reply related
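
The "qcom,smem-states" description in the binding above defines a
two-bit protocol: one bit says whether the other is meaningful, and the
other carries the clock state.  Here is a minimal sketch of how a
consumer might interpret that pair.  The bit positions are an assumption
taken from the example node (<&ipa_smp2p_out 0> and <&ipa_smp2p_out 1>),
and all example_* and EXAMPLE_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, following the example node in the binding */
#define EXAMPLE_CLOCK_ENABLED_VALID	(1u << 0)
#define EXAMPLE_CLOCK_ENABLED		(1u << 1)

enum example_clock_state {
	EXAMPLE_CLOCK_UNKNOWN = -1,	/* valid bit clear; ignore the other */
	EXAMPLE_CLOCK_OFF = 0,
	EXAMPLE_CLOCK_ON = 1,
};

/* Build the state word advertising whether the clock is enabled (0 or 1) */
static uint32_t example_clock_state_encode(int enabled)
{
	uint32_t state = EXAMPLE_CLOCK_ENABLED_VALID;

	assert(enabled == 0 || enabled == 1);
	if (enabled)
		state |= EXAMPLE_CLOCK_ENABLED;

	return state;
}

/* Interpret a state word; the enabled bit only counts when valid is set */
static enum example_clock_state example_clock_state_decode(uint32_t state)
{
	if (!(state & EXAMPLE_CLOCK_ENABLED_VALID))
		return EXAMPLE_CLOCK_UNKNOWN;

	return (state & EXAMPLE_CLOCK_ENABLED) ? EXAMPLE_CLOCK_ON
					       : EXAMPLE_CLOCK_OFF;
}
```

The point of the separate valid bit is that a reader can tell "clock is
off" apart from "nothing has been reported yet".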

* [RFC PATCH 00/12] net: introduce Qualcomm IPA driver
From: Alex Elder @ 2018-11-07  0:32 UTC (permalink / raw)
  To: davem, arnd, bjorn.andersson, ilias.apalodimas, robh+dt,
	mark.rutland
  Cc: netdev, devicetree, linux-arm-msm, linux-soc, linux-arm-kernel,
	linux-kernel, syadagir, mjavid

This series presents the driver for the Qualcomm IP Accelerator (IPA).
The IPA is a hardware component present in some Qualcomm SoCs that
allows network functions--such as routing, filtering, network address
translation and aggregation--to be performed without active involvement
of the main application processor (AP).

Initially, these advanced features are not supported; the IPA driver
simply provides a network interface that makes the modem's LTE
network available to the AP.  In addition, support is only provided
for the IPA found in the Qualcomm SDM845 SoC.

This code is derived from a driver developed internally by Qualcomm.
A version of the original source can be seen here:
    https://source.codeaurora.org/quic/la/kernel/msm-4.9/tree
in the "drivers/platform/msm/ipa" directory.  Many were involved in
developing this, but the following individuals deserve explicit
acknowledgement for their substantial contributions:

    Abhishek Choubey
    Ady Abraham
    Chaitanya Pratapa
    David Arinzon
    Ghanim Fodi
    Gidon Studinski
    Ravi Gummadidala
    Shihuan Liu
    Skylar Chang

The code has undergone considerable rework to prepare it for
incorporation into upstream Linux.  Parts of it bear little
resemblance to the original driver.  Still, some work remains
to be done.  The current code and its design had a preliminary
review, and some changes to the data path implementation were
recommended.   These have not yet been addressed:
- Use NAPI for all interfaces, not just RX (and WAN data) endpoints.
- Do more work in the NAPI poll function, including collecting
  completed TX requests and posting buffers for RX.
- Do not use periodic NOP requests as a way to avoid TX interrupts.
- The NAPI context should be associated with the hardware interrupt
  (it is now associated with something abstracted from the hardware).
- Use threaded interrupts, to avoid the need for using spinlocks and
  atomic variables for synchronizing between workqueue and interrupt
  context.
- Have runtime power management enable and disable IPA clock and
  interconnects.
Many thanks to Arnd Bergmann, Ilias Apalodimas, and Bjorn Andersson
for their early feedback.

While there clearly remains work to do on this, we felt that things
are far enough along that it would be helpful to solicit broader
input on the code.  Major issues are best addressed as soon as
possible, and even minor issues when identified help in setting
priorities.

This code is dependent on the following two sets of code, which have
been posted for review but are not yet accepted upstream:
- Interconnect framework:  https://lkml.org/lkml/2018/8/31/444
- SDM845 interconnect provider driver:  https://lkml.org/lkml/2018/8/24/25

In addition, it depends on four more bits of code that have not yet
been posted for upstream review, but are expected to be available soon:
- clk-rpmh support for IPA from David Dai <daidavid1@codeaurora.org>
- SDM845 reserved memory from Bjorn Andersson <bjorn.andersson@linaro.org>
- list_cut_end() from Alex Elder <elder@linaro.org>
- FIELD_MAX() in "bitfield.h" from Alex Elder <elder@linaro.org>

This code (including its dependencies) is available in buildable
form here, based on kernel v4.19:
    remote: ssh://git@git.linaro.org/people/alex.elder/linux.git
    branch: qualcomm_ipa-v1
	    59562facd61a arm64: dts: sdm845: add IPA information

					-Alex

Alex Elder (12):
  dt-bindings: soc: qcom: add IPA bindings
  soc: qcom: ipa: DMA helpers
  soc: qcom: ipa: generic software interface
  soc: qcom: ipa: immediate commands
  soc: qcom: ipa: IPA interrupts and the microcontroller
  soc: qcom: ipa: QMI modem communication
  soc: qcom: ipa: IPA register abstraction
  soc: qcom: ipa: utility functions
  soc: qcom: ipa: main IPA source file
  soc: qcom: ipa: data path
  soc: qcom: ipa: IPA rmnet interface
  soc: qcom: ipa: build and "ipa_i.h"

 .../devicetree/bindings/soc/qcom/qcom,ipa.txt |  136 ++
 .../bindings/soc/qcom/qcom,rmnet-ipa.txt      |   15 +
 drivers/net/ipa/Kconfig                       |   30 +
 drivers/net/ipa/Makefile                      |    7 +
 drivers/net/ipa/gsi.c                         | 1685 ++++++++++++++
 drivers/net/ipa/gsi.h                         |  195 ++
 drivers/net/ipa/gsi_reg.h                     |  563 +++++
 drivers/net/ipa/ipa_dma.c                     |   61 +
 drivers/net/ipa/ipa_dma.h                     |   61 +
 drivers/net/ipa/ipa_dp.c                      | 1994 +++++++++++++++++
 drivers/net/ipa/ipa_i.h                       |  573 +++++
 drivers/net/ipa/ipa_interrupts.c              |  307 +++
 drivers/net/ipa/ipa_main.c                    | 1400 ++++++++++++
 drivers/net/ipa/ipa_qmi.c                     |  406 ++++
 drivers/net/ipa/ipa_qmi.h                     |   12 +
 drivers/net/ipa/ipa_qmi_msg.c                 |  587 +++++
 drivers/net/ipa/ipa_qmi_msg.h                 |  233 ++
 drivers/net/ipa/ipa_reg.c                     |  972 ++++++++
 drivers/net/ipa/ipa_reg.h                     |  614 +++++
 drivers/net/ipa/ipa_uc.c                      |  336 +++
 drivers/net/ipa/ipa_utils.c                   | 1035 +++++++++
 drivers/net/ipa/ipahal.c                      |  541 +++++
 drivers/net/ipa/ipahal.h                      |  253 +++
 drivers/net/ipa/msm_rmnet.h                   |  120 +
 drivers/net/ipa/rmnet_config.h                |   31 +
 drivers/net/ipa/rmnet_ipa.c                   |  805 +++++++
 26 files changed, 12972 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,ipa.txt
 create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,rmnet-ipa.txt
 create mode 100644 drivers/net/ipa/Kconfig
 create mode 100644 drivers/net/ipa/Makefile
 create mode 100644 drivers/net/ipa/gsi.c
 create mode 100644 drivers/net/ipa/gsi.h
 create mode 100644 drivers/net/ipa/gsi_reg.h
 create mode 100644 drivers/net/ipa/ipa_dma.c
 create mode 100644 drivers/net/ipa/ipa_dma.h
 create mode 100644 drivers/net/ipa/ipa_dp.c
 create mode 100644 drivers/net/ipa/ipa_i.h
 create mode 100644 drivers/net/ipa/ipa_interrupts.c
 create mode 100644 drivers/net/ipa/ipa_main.c
 create mode 100644 drivers/net/ipa/ipa_qmi.c
 create mode 100644 drivers/net/ipa/ipa_qmi.h
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.c
 create mode 100644 drivers/net/ipa/ipa_qmi_msg.h
 create mode 100644 drivers/net/ipa/ipa_reg.c
 create mode 100644 drivers/net/ipa/ipa_reg.h
 create mode 100644 drivers/net/ipa/ipa_uc.c
 create mode 100644 drivers/net/ipa/ipa_utils.c
 create mode 100644 drivers/net/ipa/ipahal.c
 create mode 100644 drivers/net/ipa/ipahal.h
 create mode 100644 drivers/net/ipa/msm_rmnet.h
 create mode 100644 drivers/net/ipa/rmnet_config.h
 create mode 100644 drivers/net/ipa/rmnet_ipa.c

-- 
2.17.1

^ permalink raw reply

* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Alexei Starovoitov @ 2018-11-07  0:26 UTC (permalink / raw)
  To: David Ahern, David Miller
  Cc: Song Liu, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Kernel Team, ast@kernel.org, daniel@iogearbox.net,
	peterz@infradead.org, acme@kernel.org
In-Reply-To: <0a05a14c-3a79-e894-ae48-cbe1df4feb91@gmail.com>

On 11/6/18 4:23 PM, David Ahern wrote:
> On 11/6/18 5:13 PM, Alexei Starovoitov wrote:
>> On 11/6/18 3:36 PM, David Miller wrote:
>>> From: Alexei Starovoitov <ast@fb.com>
>>> Date: Tue, 6 Nov 2018 23:29:07 +0000
>>>
>>>> I think concerns with perf overhead from collecting bpf events
>>>> are unfounded.
>>>> I would prefer for this flag to be on by default.
>>>
>>> I will sit in userspace looping over bpf load/unload and see how the
>>> person trying to monitor something else with perf feels about that.
>>>
>>> Really, it is inappropriate to turn this on by default, I completely
>>> agree with David Ahern.
>>>
>>> It's hard enough, _AS IS_, for me to fight back all of the bloat that
>>> is in perf right now and get it back to being able to handle simple
>>> full workloads without dropping events..
>>
>> It's a separate perf thread and separate event with its own epoll.
>> I don't see how it can affect main event collection.
>> Let's put it this way. If it does affect somehow, then yes,
>> it should not be on. If it is not, there is no downside to keep it on.
>> Typical user expects to type 'perf record' and see everything that
>> is happening on the system. Right now short lived bpf programs
>> will not be seen. How user suppose to even know when to use the flag?
> 
> The default is profiling where perf record collects task events and
> periodic samples. So for the default record/report, the bpf load /
> unload events are not relevant.

Exactly the opposite.
It's for default 'perf record' collection of periodic samples.
It can be off for -e collection. That's easy.


* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Alexei Starovoitov @ 2018-11-07  0:13 UTC (permalink / raw)
  To: David Miller
  Cc: Song Liu, dsahern@gmail.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, Kernel Team, ast@kernel.org,
	daniel@iogearbox.net, peterz@infradead.org, acme@kernel.org
In-Reply-To: <20181106.153647.1701013551426767213.davem@davemloft.net>

On 11/6/18 3:36 PM, David Miller wrote:
> From: Alexei Starovoitov <ast@fb.com>
> Date: Tue, 6 Nov 2018 23:29:07 +0000
> 
>> I think concerns with perf overhead from collecting bpf events
>> are unfounded.
>> I would prefer for this flag to be on by default.
> 
> I will sit in userspace looping over bpf load/unload and see how the
> person trying to monitor something else with perf feels about that.
> 
> Really, it is inappropriate to turn this on by default, I completely
> agree with David Ahern.
> 
> It's hard enough, _AS IS_, for me to fight back all of the bloat that
> is in perf right now and get it back to being able to handle simple
> full workloads without dropping events..

It's a separate perf thread and separate event with its own epoll.
I don't see how it can affect main event collection.
Let's put it this way. If it does affect it somehow, then yes,
it should not be on. If it does not, there is no downside to keeping it on.
A typical user expects to type 'perf record' and see everything that
is happening on the system. Right now short-lived bpf programs
will not be seen. How is the user supposed to even know when to use the flag?
The only option is to always pass the flag 'just in case',
which is an unnecessary burden.
The problem of dropped events is certainly valid, but it's
a separate issue. The aio stuff that Alexey Budankov is working on
is supposed to address that.


* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: David Miller @ 2018-11-06 23:36 UTC (permalink / raw)
  To: ast
  Cc: songliubraving, dsahern, netdev, linux-kernel, Kernel-team, ast,
	daniel, peterz, acme
In-Reply-To: <27fc8327-3390-ba5a-6063-89c9e7165e7b@fb.com>

From: Alexei Starovoitov <ast@fb.com>
Date: Tue, 6 Nov 2018 23:29:07 +0000

> I think concerns with perf overhead from collecting bpf events
> are unfounded.
> I would prefer for this flag to be on by default.

I will sit in userspace looping over bpf load/unload and see how the
person trying to monitor something else with perf feels about that.

Really, it is inappropriate to turn this on by default, I completely
agree with David Ahern.

It's hard enough, _AS IS_, for me to fight back all of the bloat that
is in perf right now and get it back to being able to handle simple
full workloads without dropping events..

Every new event type like this sets us back.

If people want to monitor new things, or have new functionality, fine.

But not by default, please.


* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Alexei Starovoitov @ 2018-11-06 23:29 UTC (permalink / raw)
  To: Song Liu, David Ahern
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Kernel Team,
	ast@kernel.org, daniel@iogearbox.net, peterz@infradead.org,
	acme@kernel.org
In-Reply-To: <6C5A9FBD-F50D-444C-9038-E9557EC850D2@fb.com>

On 11/6/18 3:17 PM, Song Liu wrote:
> 
> 
>> On Nov 6, 2018, at 1:54 PM, David Ahern <dsahern@gmail.com> wrote:
>>
>> On 11/6/18 1:52 PM, Song Liu wrote:
>>> +
>>> static int record__mmap_read_all(struct record *rec)
>>> {
>>> 	int err;
>>>
>>> +	err = record__mmap_process_vip_events(rec);
>>> +	if (err)
>>> +		return err;
>>> +
>>> 	err = record__mmap_read_evlist(rec, rec->evlist, false);
>>> 	if (err)
>>> 		return err;
>>
>> Seems to me that is going to increase the overhead of perf on any system
>> doing BPF updates. The BPF events cause a wakeup every load and unload,
>> and perf processes not only the VIP events but then walks all of the
>> other maps.
> 
> BPF prog load/unload events should be rare events in real world use cases.
> So I think the overhead is OK. Also, I don't see an easy way to improve
> this.
> 
>>
>>> @@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
>>> 			  "signal"),
>>> 	OPT_BOOLEAN(0, "dry-run", &dry_run,
>>> 		    "Parse options then exit"),
>>> +	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
>>> +		    "do not record event on bpf program load/unload"),
>>
>> Why should this default to on? If I am recording FIB events, I don't care
>> about BPF events.
>>
> 
> I am OK with default off if that's the preferred way.

I think concerns with perf overhead from collecting bpf events
are unfounded.
I would prefer for this flag to be on by default.


* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Song Liu @ 2018-11-06 23:17 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Kernel Team,
	ast@kernel.org, daniel@iogearbox.net, peterz@infradead.org,
	acme@kernel.org
In-Reply-To: <b984776a-ce69-ab16-ed87-1cea89c9d79a@gmail.com>



> On Nov 6, 2018, at 1:54 PM, David Ahern <dsahern@gmail.com> wrote:
> 
> On 11/6/18 1:52 PM, Song Liu wrote:
>> +
>> static int record__mmap_read_all(struct record *rec)
>> {
>> 	int err;
>> 
>> +	err = record__mmap_process_vip_events(rec);
>> +	if (err)
>> +		return err;
>> +
>> 	err = record__mmap_read_evlist(rec, rec->evlist, false);
>> 	if (err)
>> 		return err;
> 
> Seems to me that is going to increase the overhead of perf on any system
> doing BPF updates. The BPF events cause a wakeup every load and unload,
> and perf processes not only the VIP events but then walks all of the
> other maps.

BPF prog load/unload events should be rare events in real world use cases. 
So I think the overhead is OK. Also, I don't see an easy way to improve 
this. 

> 
>> @@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
>> 			  "signal"),
>> 	OPT_BOOLEAN(0, "dry-run", &dry_run,
>> 		    "Parse options then exit"),
>> +	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
>> +		    "do not record event on bpf program load/unload"),
> 
> Why should this default to on? If I am recording FIB events, I don't care
> about BPF events.
> 

I am OK with default off if that's the preferred way. 

Thanks,
Song


* Re: [PATCH v3 1/2] kretprobe: produce sane stack traces
From: Steven Rostedt @ 2018-11-06 22:15 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jonathan Corbet, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Shuah Khan, Alexei Starovoitov, Daniel Borkmann,
	Brendan Gregg, Christian Brauner, Aleksa Sarai, netdev, linux-doc,
	linux-kernel
In-Reply-To: <20181104115913.74l4yzecisvtt2j5@yavin>

On Sun, 4 Nov 2018 22:59:13 +1100
Aleksa Sarai <cyphar@cyphar.com> wrote:

> The same issue is present in __save_stack_trace
> (arch/x86/kernel/stacktrace.c). This is likely the only reason that --
> as Steven said -- stacktraces wouldn't work with ftrace-graph (and thus
> with the refactor both of you are discussing).

By the way, I was playing with the ORC unwinder and stack traces
from the function graph tracer return code, and got it working with the
below patch. Caution, that patch also has a stack trace hardcoded in
the return path of the function graph tracer, so you don't want to run
function graph tracing without filtering.

You can apply the patch and do:

 # cd /sys/kernel/debug/tracing
 # echo schedule > set_ftrace_filter
 # echo function_graph > current_tracer
 # cat trace

 3)               |  schedule() {
     rcu_preempt-10    [003] ....    91.160297: <stack trace>
 => ftrace_return_to_handler
 => return_to_handler
 => schedule_timeout
 => rcu_gp_kthread
 => kthread
 => ret_from_fork
 3) # 4009.085 us |  }
 3)               |  schedule() {
     kworker/1:0-17    [001] ....    91.163288: <stack trace>
 => ftrace_return_to_handler
 => return_to_handler
 => worker_thread
 => kthread
 => ret_from_fork
 1) # 7000.070 us |  }
 1)               |  schedule() {
     rcu_preempt-10    [003] ....    91.164311: <stack trace>
 => ftrace_return_to_handler
 => return_to_handler
 => schedule_timeout
 => rcu_gp_kthread
 => kthread
 => ret_from_fork
 3) # 4006.540 us |  }


With just the stack trace added and none of the other code, these traces
ended at "return_to_handler".

This patch is not for inclusion, it was just a test to see how to make
this work.

-- Steve

diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 91b2cff4b79a..4bcd646ae1f4 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -320,13 +320,15 @@ ENTRY(ftrace_graph_caller)
 ENDPROC(ftrace_graph_caller)
 
 ENTRY(return_to_handler)
-	UNWIND_HINT_EMPTY
-	subq  $24, %rsp
+	subq $8, %rsp
+	UNWIND_HINT_FUNC
+	subq  $16, %rsp
 
 	/* Save the return values */
 	movq %rax, (%rsp)
 	movq %rdx, 8(%rsp)
 	movq %rbp, %rdi
+	leaq 16(%rsp), %rsi
 
 	call ftrace_return_to_handler
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 169b3c44ee97..aaeca73218cc 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -242,13 +242,16 @@ ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret,
 	trace->calltime = current->ret_stack[index].calltime;
 	trace->overrun = atomic_read(&current->trace_overrun);
 	trace->depth = index;
+
+	trace_dump_stack(0);
 }
 
 /*
  * Send the trace to the ring-buffer.
  * @return the original return address.
  */
-unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
+unsigned long ftrace_return_to_handler(unsigned long frame_pointer,
+	unsigned long *ptr)
 {
 	struct ftrace_graph_ret trace;
 	unsigned long ret;
@@ -257,6 +260,8 @@ unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
 	trace.rettime = trace_clock_local();
 	barrier();
 	current->curr_ret_stack--;
+	*ptr = ret;
+
 	/*
 	 * The curr_ret_stack can be less than -1 only if it was
 	 * filtered out and it's about to return from the function.


* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: David Ahern @ 2018-11-06 21:54 UTC (permalink / raw)
  To: Song Liu, netdev, linux-kernel; +Cc: kernel-team, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-6-songliubraving@fb.com>

On 11/6/18 1:52 PM, Song Liu wrote:
> +
>  static int record__mmap_read_all(struct record *rec)
>  {
>  	int err;
>  
> +	err = record__mmap_process_vip_events(rec);
> +	if (err)
> +		return err;
> +
>  	err = record__mmap_read_evlist(rec, rec->evlist, false);
>  	if (err)
>  		return err;

Seems to me that is going to increase the overhead of perf on any system
doing BPF updates. The BPF events cause a wakeup every load and unload,
and perf processes not only the VIP events but then walks all of the
other maps.

> @@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
>  			  "signal"),
>  	OPT_BOOLEAN(0, "dry-run", &dry_run,
>  		    "Parse options then exit"),
> +	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
> +		    "do not record event on bpf program load/unload"),

Why should this default to on? If I am recording FIB events, I don't care
about BPF events.


* Re: [PATCH V2] mlx5: Fix formats with line continuation whitespace
From: Doug Ledford @ 2018-11-06 21:34 UTC (permalink / raw)
  To: Leon Romanovsky, Joe Perches
  Cc: Saeed Mahameed, David S. Miller, netdev, linux-rdma, linux-kernel
In-Reply-To: <20181101073412.GQ3974@mtr-leonro.mtl.com>


On Thu, 2018-11-01 at 09:34 +0200, Leon Romanovsky wrote:
> On Thu, Nov 01, 2018 at 12:24:08AM -0700, Joe Perches wrote:
> > The line continuations unintentionally add whitespace so
> > instead use coalesced formats to remove the whitespace.
> > 
> > Signed-off-by: Joe Perches <joe@perches.com>
> > ---
> > 
> > v2: Remove excess space after %u
> > 
> >  drivers/net/ethernet/mellanox/mlx5/core/rl.c | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> 
> Thanks,
> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>

Applied, thanks.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



* Re: [RFC perf,bpf 4/5] perf util: introduce bpf_prog_info_event
From: Alexei Starovoitov @ 2018-11-06 21:11 UTC (permalink / raw)
  To: Song Liu, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
  Cc: Kernel Team, ast@kernel.org, daniel@iogearbox.net,
	peterz@infradead.org, acme@kernel.org
In-Reply-To: <20181106205246.567448-5-songliubraving@fb.com>

On 11/6/18 12:52 PM, Song Liu wrote:
> +	/* fill in fake symbol name for now, add real name after BTF */
> +	if (info.nr_jited_func_lens == 1 && info.name) {  /* only main prog */
> +		size_t l;
> +
> +		assert(info.nr_jited_ksyms == 1);
> +		l = snprintf(ptr, KSYM_NAME_LEN, "bpf_prog_%s", info.name);

Please include the prog tag here, just as the kernel's kallsyms does.
Other than this small nit the patch set looks great to me.
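
For illustration only, a minimal userspace sketch of the naming asked for
above, assuming the kernel's kallsyms convention of
bpf_prog_<tag>_<name>, with the 8-byte tag printed as 16 hex digits. The
helper name format_bpf_ksym() and the sample tag in the usage note are
hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define KSYM_NAME_LEN 128
#define BPF_TAG_SIZE 8	/* same value as in the uapi bpf header */

/*
 * Hypothetical helper: build "bpf_prog_<tag>_<name>", or "bpf_prog_<tag>"
 * when the program has no name.  Returns the snprintf() result.
 */
static int format_bpf_ksym(char *buf, size_t len,
			   const unsigned char tag[BPF_TAG_SIZE],
			   const char *name)
{
	char hex[2 * BPF_TAG_SIZE + 1];
	int i;

	assert(buf != NULL && tag != NULL);

	/* Two lowercase hex digits per tag byte, NUL-terminated. */
	for (i = 0; i < BPF_TAG_SIZE; i++)
		snprintf(hex + 2 * i, 3, "%02x", tag[i]);

	if (name && strlen(name) > 0)
		return snprintf(buf, len, "bpf_prog_%s_%s", hex, name);
	return snprintf(buf, len, "bpf_prog_%s", hex);
}
```

With a made-up tag of 6d eb 2f 1a 00 9c 4e 55 and the name "xdp_drop",
this yields "bpf_prog_6deb2f1a009c4e55_xdp_drop", which stays well under
KSYM_NAME_LEN.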


* Re: [PATCH 2/6] dt-bindings: phy: phy-of-simple: Document new binding
From: Rob Herring @ 2018-11-06 20:55 UTC (permalink / raw)
  To: Faiz Abbas
  Cc: linux-kernel@vger.kernel.org, devicetree, netdev, linux-can,
	Wolfgang Grandegger, Marc Kleine-Budde, Mark Rutland,
	Kishon Vijay Abraham I
In-Reply-To: <20181102192616.28291-3-faiz_abbas@ti.com>

On Fri, Nov 2, 2018 at 2:23 PM Faiz Abbas <faiz_abbas@ti.com> wrote:
>
> Add documentation for the generic simple phy implementation.

We don't do 'simple' or 'generic' bindings.

> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> ---
>  .../devicetree/bindings/phy/phy-of-simple.txt | 29 +++++++++++++++++++
>  1 file changed, 29 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/phy/phy-of-simple.txt
>
> diff --git a/Documentation/devicetree/bindings/phy/phy-of-simple.txt b/Documentation/devicetree/bindings/phy/phy-of-simple.txt
> new file mode 100644
> index 000000000000..696f2763395c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/phy/phy-of-simple.txt
> @@ -0,0 +1,29 @@
> +Generic simple phy device tree binding
> +--------------------------------------
> +
> +A good number of phy implementations merely read dts properties,
> +enable clocks, regulators or do resets without having a dedicated register
> +map. This binding implements a generic phy driver which can be used for
> +such simple implementations and avoid boilerplate code duplication.

Sure, but then later something needs certain timing/ordering of those
controls or some other DT additions. 'generic' or 'simple' never work
out for bindings. By all means though, write a simple/generic phy
driver. Just make it understand an explicit list of compatible
strings. Then when a phy turns out to be not so simple, we can write a
driver for it without changing the DT.
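
As a rough illustration of the "explicit list of compatible strings"
idea, here is a self-contained userspace sketch rather than real kernel
code. The compatible strings and the names phy_compatibles and
phy_is_supported() are hypothetical; an actual driver would use an
of_device_id match table instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compatible strings, one per supported device. */
static const char * const phy_compatibles[] = {
	"vendor-a,can-xcvr-1",
	"vendor-b,can-xcvr-2",
};

/* Return true only for devices the shared driver explicitly lists. */
static bool phy_is_supported(const char *compatible)
{
	size_t i;

	assert(compatible != NULL);
	for (i = 0; i < sizeof(phy_compatibles) / sizeof(phy_compatibles[0]); i++)
		if (strcmp(phy_compatibles[i], compatible) == 0)
			return true;
	return false;
}
```

The point of the design is that each device keeps its own string in the
DT, so a device can later move to a dedicated driver by dropping its
entry from the shared list, with no DT change.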

> +Required Properties:
> +-  compatible  : must be "simple-phy"
> +-  phy-cells    : must be 0

#phy-cells

> +
> +Optional Properties:
> +-  bus-width   : generic bus-width. Must be positive.
> +-  max-bitrate : generic max-bitrate. Must be positive.
> +-  pwr         : phandle to phy pwr regulator node.

That's not the regulator binding.

> +
> +Example:
> +
> +The following example is a can transceiver implemented as a generic phy.
> +It has a max-bitrate property and a pwr regulator.
> +
> +
> +transceiver1: can-transceiver {
> +       compatible = "simple-phy";
> +       max-bitrate = <5000000>;
> +       pwr-supply = <&transceiver1_fixed>;
> +       #phy-cells = <0>;
> +};
> --
> 2.18.0
>


* [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

This patch enables perf-record to listen to bpf_event and generate
bpf_prog_info_event for bpf programs loaded and unloaded during
perf-record run.

To minimize latency between a bpf_event and the bpf calls that follow
it, a separate mmap with a watermark of 1 is created to process these
vip events. A separate dummy event is then attached to this special mmap.
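
As a sketch of what "watermark of 1" means at the perf_event_attr level
(the helper name init_wakeup_asap_attr() is hypothetical, and the
bpf_event attr bit proposed by this series is deliberately left out, on
the assumption that installed uapi headers may not have it):

```c
#include <assert.h>
#include <string.h>
#include <linux/perf_event.h>

/*
 * Fill attr for a dummy software event whose ring buffer wakes the
 * reader as soon as a single byte is available.
 */
static void init_wakeup_asap_attr(struct perf_event_attr *attr)
{
	assert(attr != NULL);

	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;
	attr->watermark = 1;		/* use wakeup_watermark, not wakeup_events */
	attr->wakeup_watermark = 1;	/* threshold in bytes */
}
```

With watermark set, the wakeup threshold is counted in bytes rather than
in samples, so a value of 1 asks for a wakeup on every record written to
that buffer.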

By default, perf-record will listen to bpf_event. The no-bpf-event option
is added in case the user wants to opt out.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/builtin-record.c | 50 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/evlist.c    | 42 ++++++++++++++++++++++++++++---
 tools/perf/util/evlist.h    |  4 +++
 tools/perf/util/evsel.c     |  8 ++++++
 tools/perf/util/evsel.h     |  3 +++
 5 files changed, 104 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 73b02bde1ebc..1036a64eb9f7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -80,6 +80,7 @@ struct record {
 	bool			buildid_all;
 	bool			timestamp_filename;
 	bool			timestamp_boundary;
+	bool			no_bpf_event;
 	struct switch_output	switch_output;
 	unsigned long long	samples;
 };
@@ -381,6 +382,8 @@ static int record__open(struct record *rec)
 		pos->tracking = 1;
 		pos->attr.enable_on_exec = 1;
 	}
+	if (!rec->no_bpf_event)
+		perf_evlist__add_bpf_tracker(evlist);
 
 	perf_evlist__config(evlist, opts, &callchain_param);
 
@@ -562,10 +565,55 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	return rc;
 }
 
+static int record__mmap_process_vip_events(struct record *rec)
+{
+	int i;
+
+	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+		struct perf_mmap *map = &rec->evlist->vip_mmap[i];
+		union perf_event *event;
+
+		perf_mmap__read_init(map);
+		while ((event = perf_mmap__read_event(map)) != NULL) {
+			pr_debug("processing vip event of type %d\n",
+				 event->header.type);
+			switch (event->header.type) {
+			case PERF_RECORD_BPF_EVENT:
+				switch (event->bpf_event.type) {
+				case PERF_BPF_EVENT_PROG_LOAD:
+					perf_event__synthesize_one_bpf_prog_info(
+						&rec->tool,
+						process_synthesized_event,
+						&rec->session->machines.host,
+						event->bpf_event.id);
+					/* fall through */
+				case PERF_BPF_EVENT_PROG_UNLOAD:
+					record__write(rec, NULL, event,
+						      event->header.size);
+					break;
+				default:
+					break;
+				}
+				break;
+			default:
+				break;
+			}
+			perf_mmap__consume(map);
+		}
+		perf_mmap__read_done(map);
+	}
+
+	return 0;
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	int err;
 
+	err = record__mmap_process_vip_events(rec);
+	if (err)
+		return err;
+
 	err = record__mmap_read_evlist(rec, rec->evlist, false);
 	if (err)
 		return err;
@@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
 			  "signal"),
 	OPT_BOOLEAN(0, "dry-run", &dry_run,
 		    "Parse options then exit"),
+	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
+		    "do not record event on bpf program load/unload"),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index be440df29615..466a9f7b1e93 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -45,6 +45,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 	for (i = 0; i < PERF_EVLIST__HLIST_SIZE; ++i)
 		INIT_HLIST_HEAD(&evlist->heads[i]);
 	INIT_LIST_HEAD(&evlist->entries);
+	INIT_LIST_HEAD(&evlist->vip_entries);
 	perf_evlist__set_maps(evlist, cpus, threads);
 	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
@@ -177,6 +178,8 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
 {
 	entry->evlist = evlist;
 	list_add_tail(&entry->node, &evlist->entries);
+	if (entry->vip)
+		list_add_tail(&entry->vip_node, &evlist->vip_entries);
 	entry->idx = evlist->nr_entries;
 	entry->tracking = !entry->idx;
 
@@ -267,6 +270,27 @@ int perf_evlist__add_dummy(struct perf_evlist *evlist)
 	return 0;
 }
 
+int perf_evlist__add_bpf_tracker(struct perf_evlist *evlist)
+{
+	struct perf_event_attr attr = {
+		.type	          = PERF_TYPE_SOFTWARE,
+		.config           = PERF_COUNT_SW_DUMMY,
+		.watermark        = 1,
+		.bpf_event        = 1,
+		.wakeup_watermark = 1,
+		.size	   = sizeof(attr), /* to capture ABI version */
+	};
+	struct perf_evsel *evsel = perf_evsel__new_idx(&attr,
+						       evlist->nr_entries);
+
+	if (evsel == NULL)
+		return -ENOMEM;
+
+	evsel->vip = true;
+	perf_evlist__add(evlist, evsel);
+	return 0;
+}
+
 static int perf_evlist__add_attrs(struct perf_evlist *evlist,
 				  struct perf_event_attr *attrs, size_t nr_attrs)
 {
@@ -770,6 +794,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
 
 	evlist__for_each_entry(evlist, evsel) {
+		struct perf_mmap *vip_maps = evlist->vip_mmap;
 		struct perf_mmap *maps = evlist->mmap;
 		int *output = _output;
 		int fd;
@@ -800,7 +825,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
+		if (evsel->vip) {
+			if (perf_mmap__mmap(&vip_maps[idx], mp,
+					    fd, evlist_cpu) < 0)
+				return -1;
+		} else if (*output == -1) {
 			*output = fd;
 
 			if (perf_mmap__mmap(&maps[idx], mp, *output, evlist_cpu) < 0)
@@ -822,8 +851,12 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, &maps[idx], revent) < 0) {
-			perf_mmap__put(&maps[idx]);
+		    __perf_evlist__add_pollfd(
+			    evlist, fd,
+			    evsel->vip ? &vip_maps[idx] : &maps[idx],
+			    revent) < 0) {
+			perf_mmap__put(evsel->vip ?
+				       &vip_maps[idx] : &maps[idx]);
 			return -1;
 		}
 
@@ -1035,6 +1068,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (!evlist->mmap)
 		return -ENOMEM;
 
+	if (!evlist->vip_mmap)
+		evlist->vip_mmap = perf_evlist__alloc_mmap(evlist, false);
+
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index dc66436add98..6d99e8dab570 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -26,6 +26,7 @@ struct record_opts;
 
 struct perf_evlist {
 	struct list_head entries;
+	struct list_head vip_entries;
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
 	int		 nr_entries;
 	int		 nr_groups;
@@ -43,6 +44,7 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct perf_mmap *vip_mmap;
 	struct perf_mmap *overwrite_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -84,6 +86,8 @@ int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
 
 int perf_evlist__add_dummy(struct perf_evlist *evlist);
 
+int perf_evlist__add_bpf_tracker(struct perf_evlist *evlist);
+
 int perf_evlist__add_newtp(struct perf_evlist *evlist,
 			   const char *sys, const char *name, void *handler);
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index af9d539e4b6a..94456a493607 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -235,6 +235,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
 	evsel->evlist	   = NULL;
 	evsel->bpf_fd	   = -1;
 	INIT_LIST_HEAD(&evsel->node);
+	INIT_LIST_HEAD(&evsel->vip_node);
 	INIT_LIST_HEAD(&evsel->config_terms);
 	perf_evsel__object.init(evsel);
 	evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
@@ -1795,6 +1796,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 				     PERF_SAMPLE_BRANCH_NO_CYCLES);
 	if (perf_missing_features.group_read && evsel->attr.inherit)
 		evsel->attr.read_format &= ~(PERF_FORMAT_GROUP|PERF_FORMAT_ID);
+	if (perf_missing_features.bpf_event)
+		evsel->attr.bpf_event = 0;
 retry_sample_id:
 	if (perf_missing_features.sample_id_all)
 		evsel->attr.sample_id_all = 0;
@@ -1939,6 +1942,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		perf_missing_features.exclude_guest = true;
 		pr_debug2("switching off exclude_guest, exclude_host\n");
 		goto fallback_missing_features;
+	} else if (!perf_missing_features.bpf_event &&
+		   evsel->attr.bpf_event) {
+		perf_missing_features.bpf_event = true;
+		pr_debug2("switching off bpf_event\n");
+		goto fallback_missing_features;
 	} else if (!perf_missing_features.sample_id_all) {
 		perf_missing_features.sample_id_all = true;
 		pr_debug2("switching off sample_id_all\n");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 4107c39f4a54..82b1d3e42603 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -89,6 +89,7 @@ struct perf_stat_evsel;
  */
 struct perf_evsel {
 	struct list_head	node;
+	struct list_head	vip_node;
 	struct perf_evlist	*evlist;
 	struct perf_event_attr	attr;
 	char			*filter;
@@ -128,6 +129,7 @@ struct perf_evsel {
 	bool			ignore_missing_thread;
 	bool			forced_leader;
 	bool			use_uncore_alias;
+	bool			vip;  /* vip events have their own mmap */
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
@@ -163,6 +165,7 @@ struct perf_missing_features {
 	bool lbr_flags;
 	bool write_backward;
 	bool group_read;
+	bool bpf_event;
 };
 
 extern struct perf_missing_features perf_missing_features;
-- 
2.17.1


* [RFC perf,bpf 4/5] perf util: introduce bpf_prog_info_event
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

This patch introduces struct bpf_prog_info_event to union perf_event.

struct bpf_prog_info_event {
       struct perf_event_header        header;
       u32                             prog_info_len;
       u32                             ksym_table_len;
       u64                             ksym_table;
       struct bpf_prog_info            prog_info;
       char                            data[];
};

struct bpf_prog_info_event contains information about a bpf program.
These events are written to perf.data by perf-record, and processed by
perf-report.

struct bpf_prog_info_event uses arrays for some data (ksym_table, and
arrays in struct bpf_prog_info). To make these arrays easy to serialize,
we allocate contiguous memory (data). These array pointers are translated
to offsets in bpf_prog_info_event before being written to file, and vice
versa when the event is read from file.
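
A minimal sketch of that pointer/offset translation, using a cut-down
stand-in structure. struct demo_event and the demo_event_*() helpers are
hypothetical names for illustration, not the code in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct bpf_prog_info_event. */
struct demo_event {
	uint32_t nr_ids;
	uint64_t ids;		/* pointer in memory, offset in the file */
	char data[];		/* contiguous backing store for the array */
};

/* Allocate an event whose ids array lives in the trailing data[]. */
static struct demo_event *demo_event_new(const uint32_t *ids, uint32_t nr)
{
	struct demo_event *ev = malloc(sizeof(*ev) + nr * sizeof(*ids));

	if (!ev)
		return NULL;
	ev->nr_ids = nr;
	memcpy(ev->data, ids, nr * sizeof(*ids));
	ev->ids = (uint64_t)(uintptr_t)ev->data;
	return ev;
}

/* Before writing: replace the pointer by its offset from the event start. */
static void demo_event_ptr_to_offset(struct demo_event *ev)
{
	ev->ids -= (uint64_t)(uintptr_t)ev;
}

/* After reading: turn the offset back into a pointer into this buffer. */
static void demo_event_offset_to_ptr(struct demo_event *ev)
{
	/* A valid offset can only point into the trailing data[]. */
	assert(ev->ids >= offsetof(struct demo_event, data));
	ev->ids += (uint64_t)(uintptr_t)ev;
}
```

Because the stored value is relative to the start of the event, the
record can be copied or read back at any address and the array is still
found after demo_event_offset_to_ptr().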

This patch enables synthesizing these events at the beginning of a
perf-record run. The next patch will process short-lived bpf programs
that are created during perf-record.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/builtin-record.c |   5 +
 tools/perf/builtin-report.c |   2 +
 tools/perf/util/Build       |   2 +
 tools/perf/util/bpf-info.c  | 287 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-info.h  |  29 ++++
 tools/perf/util/event.c     |   1 +
 tools/perf/util/event.h     |  14 ++
 tools/perf/util/session.c   |   4 +
 tools/perf/util/tool.h      |   3 +-
 9 files changed, 346 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/util/bpf-info.c
 create mode 100644 tools/perf/util/bpf-info.h

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0980dfe3396b..73b02bde1ebc 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -41,6 +41,7 @@
 #include "util/perf-hooks.h"
 #include "util/time-utils.h"
 #include "util/units.h"
+#include "util/bpf-info.h"
 #include "asm/bug.h"
 
 #include <errno.h>
@@ -850,6 +851,9 @@ static int record__synthesize(struct record *rec, bool tail)
 	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
 					    process_synthesized_event, opts->sample_address,
 					    opts->proc_map_timeout, 1);
+
+	err = perf_event__synthesize_bpf_prog_info(
+		&rec->tool, process_synthesized_event, machine);
 out:
 	return err;
 }
@@ -1531,6 +1535,7 @@ static struct record record = {
 		.namespaces	= perf_event__process_namespaces,
 		.mmap		= perf_event__process_mmap,
 		.mmap2		= perf_event__process_mmap2,
+		.bpf_prog_info	= perf_event__process_bpf_prog_info,
 		.ordered_events	= true,
 	},
 };
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c0703979c51d..4a9a3e8da4e0 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -41,6 +41,7 @@
 #include "util/auxtrace.h"
 #include "util/units.h"
 #include "util/branch.h"
+#include "util/bpf-info.h"
 
 #include <dlfcn.h>
 #include <errno.h>
@@ -981,6 +982,7 @@ int cmd_report(int argc, const char **argv)
 			.auxtrace_info	 = perf_event__process_auxtrace_info,
 			.auxtrace	 = perf_event__process_auxtrace,
 			.feature	 = process_feature_event,
+			.bpf_prog_info	 = perf_event__process_bpf_prog_info,
 			.ordered_events	 = true,
 			.ordering_requires_timestamps = true,
 		},
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index ecd9f9ceda77..624c7281217c 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -150,6 +150,8 @@ endif
 
 libperf-y += perf-hooks.o
 
+libperf-$(CONFIG_LIBBPF) += bpf-info.o
+
 libperf-$(CONFIG_CXX) += c++/
 
 CFLAGS_config.o   += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
diff --git a/tools/perf/util/bpf-info.c b/tools/perf/util/bpf-info.c
new file mode 100644
index 000000000000..fa598c4328be
--- /dev/null
+++ b/tools/perf/util/bpf-info.c
@@ -0,0 +1,287 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Facebook
+ */
+#include <errno.h>
+#include <stdio.h>
+#include <bpf/bpf.h>
+#include "bpf-info.h"
+#include "debug.h"
+#include "session.h"
+
+#define KSYM_NAME_LEN 128
+#define BPF_PROG_INFO_MIN_SIZE 128  /* minimal require jited_func_lens */
+
+static inline __u64 ptr_to_u64(const void *ptr)
+{
+	return (__u64) (unsigned long) ptr;
+}
+
+/* fetch information of the bpf program via bpf syscall. */
+struct bpf_prog_info_event *perf_bpf_info__get_bpf_prog_info_event(u32 prog_id)
+{
+	struct bpf_prog_info_event *prog_info_event = NULL;
+	struct bpf_prog_info info = {};
+	u32 info_len = sizeof(info);
+	u32 event_len, i;
+	int fd, err;
+	void *ptr;
+
+	fd = bpf_prog_get_fd_by_id(prog_id);
+	if (fd < 0) {
+		pr_debug("Failed to get fd for prog_id %u\n", prog_id);
+		return NULL;
+	}
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
+	if (err) {
+		pr_debug("can't get prog info: %s\n", strerror(errno));
+		goto close_fd;
+	}
+	if (info_len < BPF_PROG_INFO_MIN_SIZE) {
+		pr_debug("kernel is too old to support proper prog info\n");
+		goto close_fd;
+	}
+
+	/* calculate size of bpf_prog_info_event */
+	event_len = sizeof(struct bpf_prog_info_event);
+	event_len += info_len;
+	event_len -= sizeof(info);
+	event_len += info.jited_prog_len;
+	event_len += info.xlated_prog_len;
+	event_len += info.nr_map_ids * sizeof(u32);
+	event_len += info.nr_jited_ksyms * sizeof(u64);
+	event_len += info.nr_jited_func_lens * sizeof(u32);
+	event_len += info.nr_jited_ksyms * KSYM_NAME_LEN;
+
+	prog_info_event = (struct bpf_prog_info_event *) malloc(event_len);
+	if (!prog_info_event)
+		goto close_fd;
+
+	/* assign pointers for map_ids, jited_prog_insns, etc. */
+	ptr = prog_info_event->data;
+	info.map_ids = ptr_to_u64(ptr);
+	ptr += info.nr_map_ids * sizeof(u32);
+	info.jited_prog_insns = ptr_to_u64(ptr);
+	ptr += info.jited_prog_len;
+	info.xlated_prog_insns = ptr_to_u64(ptr);
+	ptr += info.xlated_prog_len;
+	info.jited_ksyms = ptr_to_u64(ptr);
+	ptr += info.nr_jited_ksyms * sizeof(u64);
+	info.jited_func_lens = ptr_to_u64(ptr);
+	ptr += info.nr_jited_func_lens * sizeof(u32);
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
+	if (err) {
+		pr_err("can't get prog info: %s\n", strerror(errno));
+		free(prog_info_event);
+		prog_info_event = NULL;
+		goto close_fd;
+	}
+
+	/* fill data in prog_info_event */
+	prog_info_event->header.type = PERF_RECORD_BPF_PROG_INFO;
+	prog_info_event->header.misc = 0;
+	prog_info_event->prog_info_len = info_len;
+
+	memcpy(&prog_info_event->prog_info, &info, info_len);
+
+	prog_info_event->ksym_table_len = 0;
+	prog_info_event->ksym_table = ptr_to_u64(ptr);
+
+	/* fill in fake symbol name for now, add real name after BTF */
+	if (info.nr_jited_func_lens == 1 && info.name) {  /* only main prog */
+		size_t l;
+
+		assert(info.nr_jited_ksyms == 1);
+		l = snprintf(ptr, KSYM_NAME_LEN, "bpf_prog_%s", info.name);
+		prog_info_event->ksym_table_len += l + 1;
+		ptr += l + 1;
+
+	} else {
+		assert(info.nr_jited_ksyms == info.nr_jited_func_lens);
+
+		for (i = 0; i < info.nr_jited_ksyms; i++) {
+			size_t l;
+
+			l = snprintf(ptr, KSYM_NAME_LEN, "bpf_prog_%d_%d",
+				     info.id, i);
+			prog_info_event->ksym_table_len += l + 1;
+			ptr += l + 1;
+		}
+	}
+
+	prog_info_event->header.size = ptr - (void *)prog_info_event;
+
+close_fd:
+	close(fd);
+	return prog_info_event;
+}
+
+static size_t fprintf_bpf_prog_info(
+	struct bpf_prog_info_event *prog_info_event, FILE *fp)
+{
+	struct bpf_prog_info *info = &prog_info_event->prog_info;
+	unsigned long *jited_ksyms = (unsigned long *)(info->jited_ksyms);
+	char *name_ptr = (char *) prog_info_event->ksym_table;
+	unsigned int i;
+	size_t ret;
+
+	ret = fprintf(fp, "bpf_prog: type: %u id: %u ", info->type, info->id);
+	ret += fprintf(fp, "nr_jited_ksyms: %u\n", info->nr_jited_ksyms);
+
+	for (i = 0; i < info->nr_jited_ksyms; i++) {
+		ret += fprintf(fp, "jited_ksyms[%u]: %lx %s\n",
+			       i, jited_ksyms[i], name_ptr);
+		name_ptr += strlen(name_ptr) + 1;
+	}
+	return ret;
+}
+
+size_t perf_event__fprintf_bpf_prog_info(union perf_event *event, FILE *fp)
+{
+	return fprintf_bpf_prog_info(&event->bpf_prog_info, fp);
+}
+
+/*
+ * translate all array ptr to offset from base address, called before
+ * writing the event to file
+ */
+void perf_bpf_info__ptr_to_offset(
+	struct bpf_prog_info_event *prog_info_event)
+{
+	u64 base = ptr_to_u64(prog_info_event);
+
+	prog_info_event->ksym_table -= base;
+	prog_info_event->prog_info.jited_prog_insns -= base;
+	prog_info_event->prog_info.xlated_prog_insns -= base;
+	prog_info_event->prog_info.map_ids -= base;
+	prog_info_event->prog_info.jited_ksyms -= base;
+	prog_info_event->prog_info.jited_func_lens -= base;
+}
+
+/*
+ * translate offset from base address to array pointer, called after
+ * reading the event from file
+ */
+void perf_bpf_info__offset_to_ptr(
+	struct bpf_prog_info_event *prog_info_event)
+{
+	u64 base = ptr_to_u64(prog_info_event);
+
+	prog_info_event->ksym_table += base;
+	prog_info_event->prog_info.jited_prog_insns += base;
+	prog_info_event->prog_info.xlated_prog_insns += base;
+	prog_info_event->prog_info.map_ids += base;
+	prog_info_event->prog_info.jited_ksyms += base;
+	prog_info_event->prog_info.jited_func_lens += base;
+}
+
+int perf_event__synthesize_one_bpf_prog_info(struct perf_tool *tool,
+					     perf_event__handler_t process,
+					     struct machine *machine,
+					     __u32 id)
+{
+	struct bpf_prog_info_event *prog_info_event;
+
+	prog_info_event = perf_bpf_info__get_bpf_prog_info_event(id);
+
+	if (!prog_info_event) {
+		pr_err("Failed to get prog_info_event\n");
+		return -1;
+	}
+	perf_bpf_info__ptr_to_offset(prog_info_event);
+
+	if (perf_tool__process_synth_event(
+		    tool, (union perf_event *)prog_info_event,
+		    machine, process) != 0) {
+		free(prog_info_event);
+		return -1;
+	}
+
+	free(prog_info_event);
+	return 0;
+}
+
+int perf_event__synthesize_bpf_prog_info(struct perf_tool *tool,
+					 perf_event__handler_t process,
+					 struct machine *machine)
+{
+	__u32 id = 0;
+	int err = 0;
+
+	while (true) {
+		err = bpf_prog_get_next_id(id, &id);
+		if (err) {
+			if (errno == ENOENT) {
+				err = 0;
+				break;
+			}
+			fprintf(stderr, "can't get next program: %s%s\n",
+				strerror(errno),
+				errno == EINVAL ? " -- kernel too old?" : "");
+			err = -1;
+			break;
+		}
+		err = perf_event__synthesize_one_bpf_prog_info(
+			tool, process, machine, id);
+	}
+	return err;
+}
+
+int perf_event__process_bpf_prog_info(struct perf_session *session,
+				      union perf_event *event)
+{
+	struct machine *machine = &session->machines.host;
+	struct bpf_prog_info_event *prog_info_event;
+	struct bpf_prog_info *info;
+	struct symbol *sym;
+	struct map *map;
+	char *name_ptr;
+	int ret = 0;
+	u64 *addrs;
+	u32 *lens;
+	u32 i;
+
+	prog_info_event = (struct bpf_prog_info_event *)
+		malloc(event->header.size);
+	if (!prog_info_event)
+		return -ENOMEM;
+
+	/* copy the data to rw memory so we can modify it */
+	memcpy(prog_info_event,  &event->bpf_prog_info, event->header.size);
+	info = &prog_info_event->prog_info;
+
+	perf_bpf_info__offset_to_ptr(prog_info_event);
+	name_ptr = (char *) prog_info_event->ksym_table;
+	addrs = (u64 *)info->jited_ksyms;
+	lens = (u32 *)info->jited_func_lens;
+	for (i = 0; i < info->nr_jited_ksyms; i++) {
+		u32 len = info->nr_jited_func_lens == 1 ?
+			info->jited_prog_len : lens[i];
+
+		map = map_groups__find(&machine->kmaps, addrs[i]);
+		if (!map) {
+			map = dso__new_map("bpf_prog");
+			if (!map) {
+				ret = -ENOMEM;
+				break;
+			}
+			map->start = addrs[i];
+			map->pgoff = map->start;
+			map->end = map->start + len;
+			map_groups__insert(&machine->kmaps, map);
+		}
+
+		sym = symbol__new(addrs[i], len, 0, 0, name_ptr);
+		if (!sym) {
+			ret = -ENOMEM;
+			break;
+		}
+		dso__insert_symbol(map->dso, sym);
+		name_ptr += strlen(name_ptr) + 1;
+	}
+
+	free(prog_info_event);
+	return ret;
+}
diff --git a/tools/perf/util/bpf-info.h b/tools/perf/util/bpf-info.h
new file mode 100644
index 000000000000..813cad07bacb
--- /dev/null
+++ b/tools/perf/util/bpf-info.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BPF_INFO_H
+#define __PERF_BPF_INFO_H
+
+#include "event.h"
+#include "machine.h"
+#include "tool.h"
+#include "symbol.h"
+
+struct bpf_prog_info_event *perf_bpf_info__get_bpf_prog_info_event(u32 prog_id);
+
+size_t perf_event__fprintf_bpf_prog_info(union perf_event *event, FILE *fp);
+
+int perf_event__synthesize_one_bpf_prog_info(struct perf_tool *tool,
+					     perf_event__handler_t process,
+					     struct machine *machine,
+					     __u32 id);
+
+int perf_event__synthesize_bpf_prog_info(struct perf_tool *tool,
+					 perf_event__handler_t process,
+					 struct machine *machine);
+
+void perf_bpf_info__ptr_to_offset(struct bpf_prog_info_event *prog_info_event);
+void perf_bpf_info__offset_to_ptr(struct bpf_prog_info_event *prog_info_event);
+
+int perf_event__process_bpf_prog_info(struct perf_session *session,
+				      union perf_event *event);
+
+#endif /* __PERF_BPF_INFO_H */
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 601432afbfb2..33b1c168b83e 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -61,6 +61,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_EVENT_UPDATE]		= "EVENT_UPDATE",
 	[PERF_RECORD_TIME_CONV]			= "TIME_CONV",
 	[PERF_RECORD_HEADER_FEATURE]		= "FEATURE",
+	[PERF_RECORD_BPF_PROG_INFO]		= "BPF_PROG_INFO",
 };
 
 static const char *perf_ns__names[] = {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 13a0c64dd0ed..dc64d800eaa6 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -5,6 +5,7 @@
 #include <limits.h>
 #include <stdio.h>
 #include <linux/kernel.h>
+#include <linux/bpf.h>
 
 #include "../perf.h"
 #include "build-id.h"
@@ -258,6 +259,7 @@ enum perf_user_event_type { /* above any possible kernel type */
 	PERF_RECORD_EVENT_UPDATE		= 78,
 	PERF_RECORD_TIME_CONV			= 79,
 	PERF_RECORD_HEADER_FEATURE		= 80,
+	PERF_RECORD_BPF_PROG_INFO		= 81,
 	PERF_RECORD_HEADER_MAX
 };
 
@@ -629,6 +631,17 @@ struct feature_event {
 	char				data[];
 };
 
+#define KSYM_NAME_LEN 128
+
+struct bpf_prog_info_event {
+	struct perf_event_header	header;
+	u32				prog_info_len;
+	u32				ksym_table_len;
+	u64				ksym_table;
+	struct bpf_prog_info		prog_info;
+	char				data[];
+};
+
 union perf_event {
 	struct perf_event_header	header;
 	struct mmap_event		mmap;
@@ -661,6 +674,7 @@ union perf_event {
 	struct time_conv_event		time_conv;
 	struct feature_event		feat;
 	struct bpf_event		bpf_event;
+	struct bpf_prog_info_event	bpf_prog_info;
 };
 
 void perf_event__print_totals(void);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index dffe5120d2d3..5365ee1dfbec 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -415,6 +415,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
 		tool->time_conv = process_event_op2_stub;
 	if (tool->feature == NULL)
 		tool->feature = process_event_op2_stub;
+	if (tool->bpf_prog_info == NULL)
+		tool->bpf_prog_info = process_event_op2_stub;
 }
 
 static void swap_sample_id_all(union perf_event *event, void *data)
@@ -1397,6 +1399,8 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		return tool->time_conv(session, event);
 	case PERF_RECORD_HEADER_FEATURE:
 		return tool->feature(session, event);
+	case PERF_RECORD_BPF_PROG_INFO:
+		return tool->bpf_prog_info(session, event);
 	default:
 		return -EINVAL;
 	}
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 69ae898ca024..739a4b1188f7 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -70,7 +70,8 @@ struct perf_tool {
 			stat_config,
 			stat,
 			stat_round,
-			feature;
+			feature,
+			bpf_prog_info;
 	event_op3	auxtrace;
 	bool		ordered_events;
 	bool		ordering_requires_timestamps;
-- 
2.17.1

^ permalink raw reply related
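
The pointer/offset translation that perf_bpf_info__ptr_to_offset() and
perf_bpf_info__offset_to_ptr() perform in the patch above can be
illustrated in isolation. The following is only a minimal sketch, assuming
a simplified event with a single pointer field; struct sketch_event and
all sketch_* names are hypothetical and are not part of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for bpf_prog_info_event: one u64 field that holds
 * a pointer into the trailing data[] area while the event is in memory,
 * and an offset from the start of the event while it is on disk.
 */
struct sketch_event {
	uint32_t size;
	uint32_t pad;
	uint64_t names;
	char data[];
};

/* Called before writing: absolute pointer -> offset from the event base. */
static void sketch_ptr_to_offset(struct sketch_event *ev)
{
	ev->names -= (uint64_t)(uintptr_t)ev;
}

/* Called after reading: offset from the event base -> absolute pointer. */
static void sketch_offset_to_ptr(struct sketch_event *ev)
{
	ev->names += (uint64_t)(uintptr_t)ev;
}

/*
 * Build an event, convert it to offsets, copy it to a different address
 * (standing in for a write to and read from a file), and convert the copy
 * back to pointers. Returns the relocated copy, or NULL on allocation
 * failure. The caller frees the result.
 */
static struct sketch_event *sketch_round_trip(const char *name)
{
	size_t len = sizeof(struct sketch_event) + strlen(name) + 1;
	struct sketch_event *ev = malloc(len);
	struct sketch_event *copy;

	if (!ev)
		return NULL;
	ev->size = (uint32_t)len;
	ev->pad = 0;
	strcpy(ev->data, name);
	ev->names = (uint64_t)(uintptr_t)ev->data;

	sketch_ptr_to_offset(ev);
	/* The stored value no longer depends on where ev was allocated. */
	assert(ev->names == offsetof(struct sketch_event, data));

	copy = malloc(len);
	if (copy) {
		memcpy(copy, ev, len);
		sketch_offset_to_ptr(copy);
	}
	free(ev);
	return copy;
}
```

Storing offsets keeps the record position independent, which is why the
reader has to copy the event into writable memory before converting the
offsets back to pointers.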

* [RFC perf,bpf 2/5] perf: sync tools/include/uapi/linux/perf_event.h
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

Sync changes for PERF_RECORD_BPF_EVENT.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/include/uapi/linux/perf_event.h | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index f35eb72739c0..d51cacb3077a 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
 				context_switch :  1, /* context switch data */
 				write_backward :  1, /* Write ring buffer from end to beginning */
 				namespaces     :  1, /* include namespaces data */
-				__reserved_1   : 35;
+				bpf_event      :  1, /* include bpf events */
+				__reserved_1   : 34;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -963,9 +964,33 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_NAMESPACES			= 16,
 
+	/*
+	 * Record different types of bpf events:
+	 *  enum perf_bpf_event_type {
+	 *     PERF_BPF_EVENT_UNKNOWN		= 0,
+	 *     PERF_BPF_EVENT_PROG_LOAD	= 1,
+	 *     PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	 *  };
+	 *
+	 * struct {
+	 *	struct perf_event_header header;
+	 *	u16 type;
+	 *	u16 flags;
+	 *	u32 id;  // prog_id or map_id
+	 * };
+	 */
+	PERF_RECORD_BPF_EVENT			= 17,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
+enum perf_bpf_event_type {
+	PERF_BPF_EVENT_UNKNOWN		= 0,
+	PERF_BPF_EVENT_PROG_LOAD	= 1,
+	PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	PERF_BPF_EVENT_MAX,		/* non-ABI */
+};
+
 #define PERF_MAX_STACK_DEPTH		127
 #define PERF_MAX_CONTEXTS_PER_STACK	  8
 
-- 
2.17.1

^ permalink raw reply related

* [RFC perf,bpf 1/5] perf, bpf: Introduce PERF_RECORD_BPF_EVENT
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

For better performance analysis of BPF programs, this patch introduces
PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
load/unload information to user space.

        /*
         * Record different types of bpf events:
         *   enum perf_bpf_event_type {
         *      PERF_BPF_EVENT_UNKNOWN          = 0,
         *      PERF_BPF_EVENT_PROG_LOAD        = 1,
         *      PERF_BPF_EVENT_PROG_UNLOAD      = 2,
         *   };
         *
         * struct {
         *      struct perf_event_header header;
         *      u16 type;
         *      u16 flags;
         *      u32 id;  // prog_id or map_id
         * };
         */
        PERF_RECORD_BPF_EVENT                   = 17,

PERF_RECORD_BPF_EVENT contains minimal information about the BPF program.
Perf utility (or other user space tools) should listen to this event and
fetch more details about the event via BPF syscalls
(BPF_PROG_GET_FD_BY_ID, BPF_OBJ_GET_INFO_BY_FD, etc.).

Currently, PERF_RECORD_BPF_EVENT only supports two events:
PERF_BPF_EVENT_PROG_LOAD and PERF_BPF_EVENT_PROG_UNLOAD. But it can be
easily extended to support more events.
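
As an illustration of the record layout quoted above, a user space consumer
might decode the record roughly as follows. This is only a sketch under
stated assumptions: the sketch_* types are local mirrors of the layout in
the comment (a real tool would take the definitions from
<linux/perf_event.h> once this patch is applied), and
sketch_decode_bpf_event() is a hypothetical helper, not an existing perf
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirrors of the values proposed in this patch (assumptions). */
enum { SKETCH_PERF_RECORD_BPF_EVENT = 17 };
enum {
	SKETCH_BPF_EVENT_UNKNOWN	= 0,
	SKETCH_BPF_EVENT_PROG_LOAD	= 1,
	SKETCH_BPF_EVENT_PROG_UNLOAD	= 2,
};

/* Same field layout as struct perf_event_header. */
struct sketch_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Record body as described in the comment above. */
struct sketch_bpf_event {
	struct sketch_event_header header;
	uint16_t type;
	uint16_t flags;
	uint32_t id;	/* prog_id or map_id */
};

/*
 * Decode one record from a raw buffer of len bytes.
 * Returns 1 for a program load, -1 for a program unload, and 0 if the
 * buffer does not hold a recognised BPF event. *prog_id is written only
 * when the return value is non-zero.
 */
static int sketch_decode_bpf_event(const void *buf, size_t len,
				   uint32_t *prog_id)
{
	struct sketch_bpf_event ev;

	assert(buf != NULL && prog_id != NULL);
	if (len < sizeof(ev))
		return 0;
	memcpy(&ev, buf, sizeof(ev));	/* avoid unaligned access */
	if (ev.header.type != SKETCH_PERF_RECORD_BPF_EVENT)
		return 0;
	/* header.size may also cover trailing sample_id data */
	if (ev.header.size < sizeof(ev) || ev.header.size > len)
		return 0;
	if (ev.type == SKETCH_BPF_EVENT_PROG_LOAD) {
		*prog_id = ev.id;
		return 1;
	}
	if (ev.type == SKETCH_BPF_EVENT_PROG_UNLOAD) {
		*prog_id = ev.id;
		return -1;
	}
	return 0;
}
```

On a load event the tool would then pass the returned id to the BPF
syscalls mentioned above to fetch the program details.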

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/perf_event.h      |  5 ++
 include/uapi/linux/perf_event.h | 27 ++++++++++-
 kernel/bpf/syscall.c            |  4 ++
 kernel/events/core.c            | 82 ++++++++++++++++++++++++++++++++-
 4 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f0ca79..a3126fd5b7f1 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1113,6 +1113,9 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
 }
 
 extern void perf_event_mmap(struct vm_area_struct *vma);
+extern void perf_event_bpf_event(enum perf_bpf_event_type type,
+				 u16 flags, u32 id);
+
 extern struct perf_guest_info_callbacks *perf_guest_cbs;
 extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
 extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
@@ -1333,6 +1336,8 @@ static inline int perf_unregister_guest_info_callbacks
 (struct perf_guest_info_callbacks *callbacks)				{ return 0; }
 
 static inline void perf_event_mmap(struct vm_area_struct *vma)		{ }
+static inline void perf_event_bpf_event(enum perf_bpf_event_type type,
+					u16 flags, u32 id)		{ }
 static inline void perf_event_exec(void)				{ }
 static inline void perf_event_comm(struct task_struct *tsk, bool exec)	{ }
 static inline void perf_event_namespaces(struct task_struct *tsk)	{ }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index f35eb72739c0..d51cacb3077a 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
 				context_switch :  1, /* context switch data */
 				write_backward :  1, /* Write ring buffer from end to beginning */
 				namespaces     :  1, /* include namespaces data */
-				__reserved_1   : 35;
+				bpf_event      :  1, /* include bpf events */
+				__reserved_1   : 34;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -963,9 +964,33 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_NAMESPACES			= 16,
 
+	/*
+	 * Record different types of bpf events:
+	 *  enum perf_bpf_event_type {
+	 *     PERF_BPF_EVENT_UNKNOWN		= 0,
+	 *     PERF_BPF_EVENT_PROG_LOAD	= 1,
+	 *     PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	 *  };
+	 *
+	 * struct {
+	 *	struct perf_event_header header;
+	 *	u16 type;
+	 *	u16 flags;
+	 *	u32 id;  // prog_id or map_id
+	 * };
+	 */
+	PERF_RECORD_BPF_EVENT			= 17,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
+enum perf_bpf_event_type {
+	PERF_BPF_EVENT_UNKNOWN		= 0,
+	PERF_BPF_EVENT_PROG_LOAD	= 1,
+	PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	PERF_BPF_EVENT_MAX,		/* non-ABI */
+};
+
 #define PERF_MAX_STACK_DEPTH		127
 #define PERF_MAX_CONTEXTS_PER_STACK	  8
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 18e3be193a05..b37051a13be6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1101,9 +1101,12 @@ static void __bpf_prog_put_rcu(struct rcu_head *rcu)
 static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
 {
 	if (atomic_dec_and_test(&prog->aux->refcnt)) {
+		int prog_id = prog->aux->id;
+
 		/* bpf_prog_free_id() must be called first */
 		bpf_prog_free_id(prog, do_idr_lock);
 		bpf_prog_kallsyms_del_all(prog);
+		perf_event_bpf_event(PERF_BPF_EVENT_PROG_UNLOAD, 0, prog_id);
 
 		call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
 	}
@@ -1441,6 +1444,7 @@ static int bpf_prog_load(union bpf_attr *attr)
 	}
 
 	bpf_prog_kallsyms_add(prog);
+	perf_event_bpf_event(PERF_BPF_EVENT_PROG_LOAD, 0, prog->aux->id);
 	return err;
 
 free_used_maps:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5a97f34bc14c..54667be6669b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -385,6 +385,7 @@ static atomic_t nr_namespaces_events __read_mostly;
 static atomic_t nr_task_events __read_mostly;
 static atomic_t nr_freq_events __read_mostly;
 static atomic_t nr_switch_events __read_mostly;
+static atomic_t nr_bpf_events __read_mostly;
 
 static LIST_HEAD(pmus);
 static DEFINE_MUTEX(pmus_lock);
@@ -4235,7 +4236,7 @@ static bool is_sb_event(struct perf_event *event)
 
 	if (attr->mmap || attr->mmap_data || attr->mmap2 ||
 	    attr->comm || attr->comm_exec ||
-	    attr->task ||
+	    attr->task || attr->bpf_event ||
 	    attr->context_switch)
 		return true;
 	return false;
@@ -4305,6 +4306,8 @@ static void unaccount_event(struct perf_event *event)
 		dec = true;
 	if (has_branch_stack(event))
 		dec = true;
+	if (event->attr.bpf_event)
+		atomic_dec(&nr_bpf_events);
 
 	if (dec) {
 		if (!atomic_add_unless(&perf_sched_count, -1, 1))
@@ -7650,6 +7653,81 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 	perf_output_end(&handle);
 }
 
+/*
+ * bpf load/unload tracking
+ */
+
+struct perf_bpf_event {
+	struct {
+		struct perf_event_header        header;
+		u16 type;
+		u16 flags;
+		u32 id;
+	} event_id;
+};
+
+static int perf_event_bpf_match(struct perf_event *event)
+{
+	return event->attr.bpf_event;
+}
+
+static void perf_event_bpf_output(struct perf_event *event,
+				   void *data)
+{
+	struct perf_bpf_event *bpf_event = data;
+	struct perf_output_handle handle;
+	struct perf_sample_data sample;
+	int size = bpf_event->event_id.header.size;
+	int ret;
+
+	if (!perf_event_bpf_match(event))
+		return;
+
+	perf_event_header__init_id(&bpf_event->event_id.header, &sample, event);
+	ret = perf_output_begin(&handle, event,
+				bpf_event->event_id.header.size);
+	if (ret)
+		goto out;
+
+	perf_output_put(&handle, bpf_event->event_id);
+	perf_event__output_id_sample(event, &handle, &sample);
+
+	perf_output_end(&handle);
+out:
+	bpf_event->event_id.header.size = size;
+}
+
+static void perf_event_bpf(struct perf_bpf_event *bpf_event)
+{
+	perf_iterate_sb(perf_event_bpf_output,
+		       bpf_event,
+		       NULL);
+}
+
+void perf_event_bpf_event(enum perf_bpf_event_type type, u16 flags, u32 id)
+{
+	struct perf_bpf_event bpf_event;
+
+	if (!atomic_read(&nr_bpf_events))
+		return;
+
+	if (type <= PERF_BPF_EVENT_UNKNOWN || type >= PERF_BPF_EVENT_MAX)
+		return;
+
+	bpf_event = (struct perf_bpf_event){
+		.event_id = {
+			.header = {
+				.type = PERF_RECORD_BPF_EVENT,
+				.size = sizeof(bpf_event.event_id),
+			},
+			.type = type,
+			.flags = flags,
+			.id = id,
+		},
+	};
+	perf_event_bpf(&bpf_event);
+}
+
 void perf_event_itrace_started(struct perf_event *event)
 {
 	event->attach_state |= PERF_ATTACH_ITRACE;
@@ -9871,6 +9949,8 @@ static void account_event(struct perf_event *event)
 		inc = true;
 	if (is_cgroup_event(event))
 		inc = true;
+	if (event->attr.bpf_event)
+		atomic_inc(&nr_bpf_events);
 
 	if (inc) {
 		/*
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH net] net: phy: Allow BCM54616S PHY to setup internal TX/RX clock delay
From: Tao Ren @ 2018-11-06 19:42 UTC (permalink / raw)
  To: David Miller
  Cc: andrew@lunn.ch, f.fainelli@gmail.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, openbmc@lists.ozlabs.org
In-Reply-To: <20181106.111736.1149054212290410715.davem@davemloft.net>

On 11/6/18 11:17 AM, David Miller wrote:
> From: Tao Ren <taoren@fb.com>
> Date: Mon, 5 Nov 2018 14:35:40 -0800
> 
>> This patch allows users to enable/disable internal TX and/or RX clock
>> delay for BCM54616S PHYs so as to satisfy RGMII timing specifications.
>>
>> On a particular platform, whether TX and/or RX clock delay is required
>> depends on how the PHY is connected to the MAC IP. This requirement can be
>> specified through "phy-mode" property in the platform device tree.
>>
>> The patch is inspired by commit 733336262b28 ("net: phy: Allow BCM5481x
>> PHYs to setup internal TX/RX clock delay").
>>
>> Signed-off-by: Tao Ren <taoren@fb.com>
> 
> This is fine for 'net', applied, thanks.

Thanks David for the quick action.

- Tao Ren

^ permalink raw reply

* Re: [PATCH] net: skbuff.h: remove unnecessary unlikely()
From: David Miller @ 2018-11-06 19:22 UTC (permalink / raw)
  To: tiny.windzz
  Cc: edumazet, dja, willemb, ast, sbrivio, posk, pabeni, borisp,
	linux-kernel, netdev
In-Reply-To: <20181106154536.8789-1-tiny.windzz@gmail.com>

From: Yangtao Li <tiny.windzz@gmail.com>
Date: Tue,  6 Nov 2018 10:45:36 -0500

> WARN_ON() already contains an unlikely(), so it's not necessary to use
> unlikely.
> 
> Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] net: phy: Allow BCM54616S PHY to setup internal TX/RX clock delay
From: David Miller @ 2018-11-06 19:17 UTC (permalink / raw)
  To: taoren; +Cc: andrew, f.fainelli, netdev, linux-kernel, openbmc
In-Reply-To: <20181105223540.1897084-1-taoren@fb.com>

From: Tao Ren <taoren@fb.com>
Date: Mon, 5 Nov 2018 14:35:40 -0800

> This patch allows users to enable/disable internal TX and/or RX clock
> delay for BCM54616S PHYs so as to satisfy RGMII timing specifications.
> 
> On a particular platform, whether TX and/or RX clock delay is required
> depends on how the PHY is connected to the MAC IP. This requirement can be
> specified through "phy-mode" property in the platform device tree.
> 
> The patch is inspired by commit 733336262b28 ("net: phy: Allow BCM5481x
> PHYs to setup internal TX/RX clock delay").
> 
> Signed-off-by: Tao Ren <taoren@fb.com>

This is fine for 'net', applied, thanks.

^ permalink raw reply

* Re: [PATCH] staging: net: ipv4: tcp_westwood: fixed warnings and checks
From: David Miller @ 2018-11-06 19:15 UTC (permalink / raw)
  To: suraj1998; +Cc: edumazet, kuznet, yoshfuji, netdev, linux-kernel
In-Reply-To: <1541425985-31869-1-git-send-email-suraj1998@gmail.com>

From: Suraj Singh <suraj1998@gmail.com>
Date: Mon,  5 Nov 2018 19:23:05 +0530

> Fixed warnings and checks for TCP Westwood
> 
> Signed-off-by: Suraj Singh <suraj1998@gmail.com>

I asked you yesterday why "staging: " appears in your subject line
and you have failed to respond and explain.

There are also functional issues with your patch:

> -		tp->snd_cwnd = tp->snd_ssthresh = tcp_westwood_bw_rttmin(sk);
> +		tp->snd_cwnd = tcp_westwood_bw_rttmin(sk);
> +		tp->snd_ssthresh = tcp_westwood_bw_rttmin(sk);

This is bogus, now tcp_westwood_bw_rttmin(sk) will potentially be called
two times instead of once.

The existing code is fine, please do not modify it.
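
The difference in call count can be shown with a small stand-alone
sketch. Everything here is hypothetical: sketch_bw_rttmin() is a counting
stand-in for tcp_westwood_bw_rttmin(), and struct sketch_tp only mimics
the two fields involved. The third variant shows one way to split the
statement while still calling the function once, assuming a split were
wanted at all.

```c
#include <assert.h>

static unsigned int sketch_calls;	/* number of estimator invocations */

/* Counting stand-in for tcp_westwood_bw_rttmin(); not the kernel function. */
static unsigned int sketch_bw_rttmin(void)
{
	sketch_calls++;
	return 10;
}

struct sketch_tp {
	unsigned int snd_cwnd;
	unsigned int snd_ssthresh;
};

/* Existing form: chained assignment, the function is called once. */
static void sketch_chained(struct sketch_tp *tp)
{
	tp->snd_cwnd = tp->snd_ssthresh = sketch_bw_rttmin();
}

/* Form from the rejected patch: the function is called twice. */
static void sketch_split(struct sketch_tp *tp)
{
	tp->snd_cwnd = sketch_bw_rttmin();
	tp->snd_ssthresh = sketch_bw_rttmin();
}

/* Split form that keeps a single call by caching the result. */
static void sketch_split_once(struct sketch_tp *tp)
{
	unsigned int val = sketch_bw_rttmin();

	tp->snd_cwnd = val;
	tp->snd_ssthresh = val;
}

/* Run one variant and report how many times the estimator was called. */
static unsigned int sketch_count_calls(void (*fn)(struct sketch_tp *))
{
	struct sketch_tp tp = { 0, 0 };

	sketch_calls = 0;
	fn(&tp);
	assert(tp.snd_cwnd == tp.snd_ssthresh);
	return sketch_calls;
}
```

With these definitions, sketch_count_calls() reports one call for the
chained and cached variants and two calls for the split variant.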

^ permalink raw reply

* Re: [PATCH v2] ISDN: eicon: Remove driver
From: David Miller @ 2018-11-06 19:04 UTC (permalink / raw)
  To: olof; +Cc: isdn, netdev, linux-kernel, isdn4linux, mac
In-Reply-To: <20181102220026.6387-1-olof@lixom.net>

From: Olof Johansson <olof@lixom.net>
Date: Fri,  2 Nov 2018 15:00:26 -0700

> I started looking at the history of this driver, and last time the
> maintainer was active on the mailing list was when discussing how to
> remove it. This was in 2012:
> 
> https://lore.kernel.org/lkml/4F4DE175.30002@melware.de/
> 
> It looks to me like this has in practice been an orphan for quite a while.
> It's throwing warnings about stack size in a function that is in dire
> need of refactoring, and it's probably a case of "it's time to call it".
> 
> Cc: Armin Schindler <mac@melware.de>
> Cc: Karsten Keil <isdn@linux-pingi.de>
> Signed-off-by: Olof Johansson <olof@lixom.net>
> ---
> 
> v2:
> Missed a git add of drivers/isdn/hardware/Kconfig

Applied to net-next.

^ permalink raw reply

* [PATCH net] igb: fix uninitialized variables
From: wangyunjian @ 2018-11-06  8:27 UTC (permalink / raw)
  To: netdev, intel-wired-lan; +Cc: stone.zhou, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

This patch fixes the variable 'phy_word', which may be used uninitialized.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/net/ethernet/intel/igb/e1000_i210.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/igb/e1000_i210.c b/drivers/net/ethernet/intel/igb/e1000_i210.c
index c54ebed..c393cb2 100644
--- a/drivers/net/ethernet/intel/igb/e1000_i210.c
+++ b/drivers/net/ethernet/intel/igb/e1000_i210.c
@@ -842,6 +842,7 @@ s32 igb_pll_workaround_i210(struct e1000_hw *hw)
 		nvm_word = E1000_INVM_DEFAULT_AL;
 	tmp_nvm = nvm_word | E1000_INVM_PLL_WO_VAL;
 	igb_write_phy_reg_82580(hw, I347AT4_PAGE_SELECT, E1000_PHY_PLL_FREQ_PAGE);
+	phy_word = E1000_PHY_PLL_UNCONF;
 	for (i = 0; i < E1000_MAX_PLL_TRIES; i++) {
 		/* check current state directly from internal PHY */
 		igb_read_phy_reg_82580(hw, E1000_PHY_PLL_FREQ_REG, &phy_word);
-- 
1.8.3.1

^ permalink raw reply related

* RE: [PATCH net-next 5/6] net/ncsi: Reset channel state in ncsi_start_dev()
From: Justin.Lee1 @ 2018-11-06 17:27 UTC (permalink / raw)
  To: sam, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <de9816e6c9cb31fdae1bb3d4a38da65b8f3a7694.camel@mendozajonas.com>


> On Mon, 2018-11-05 at 18:01 +0000, Justin.Lee1@Dell.com wrote:
> > > On Tue, 2018-10-30 at 21:26 +0000, Justin.Lee1@Dell.com wrote:
> > > > > +int ncsi_reset_dev(struct ncsi_dev *nd)
> > > > > +{
> > > > > +	struct ncsi_dev_priv *ndp = TO_NCSI_DEV_PRIV(nd);
> > > > > +	struct ncsi_channel *nc, *active;
> > > > > +	struct ncsi_package *np;
> > > > > +	unsigned long flags;
> > > > > +	bool enabled;
> > > > > +	int state;
> > > > > +
> > > > > +	active = NULL;
> > > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > > > > +		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > > > > +			spin_lock_irqsave(&nc->lock, flags);
> > > > > +			enabled = nc->monitor.enabled;
> > > > > +			state = nc->state;
> > > > > +			spin_unlock_irqrestore(&nc->lock, flags);
> > > > > +
> > > > > +			if (enabled)
> > > > > +				ncsi_stop_channel_monitor(nc);
> > > > > +			if (state == NCSI_CHANNEL_ACTIVE) {
> > > > > +				active = nc;
> > > > > +				break;
> > > > 
> > > > Is the original intention to process the channel one by one?
> > > > If it is the case, there are two loops and we might need to use
> > > > "goto found" instead.
> > > 
> > > Yes we'll need to break out of the package loop here as well.
> > > 
> > > > > +			}
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > 
> > > > found: ?
> > > > 
> > > > > +	if (!active) {
> > > > > +		/* Done */
> > > > > +		spin_lock_irqsave(&ndp->lock, flags);
> > > > > +		ndp->flags &= ~NCSI_DEV_RESET;
> > > > > +		spin_unlock_irqrestore(&ndp->lock, flags);
> > > > > +		return ncsi_choose_active_channel(ndp);
> > > > > +	}
> > > > > +
> > > > > +	spin_lock_irqsave(&ndp->lock, flags);
> > > > > +	ndp->flags |= NCSI_DEV_RESET;
> > > > > +	ndp->active_channel = active;
> > > > > +	ndp->active_package = active->package;
> > > > > +	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > > +
> > > > > +	nd->state = ncsi_dev_state_suspend;
> > > > > +	schedule_work(&ndp->work);
> > > > > +	return 0;
> > > > > +}
> > > > 
> > > > Also similar issue in ncsi_choose_active_channel() function below.
> > > > 
> > > > > @@ -916,32 +1045,49 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > > > >  
> > > > >  			ncm = &nc->modes[NCSI_MODE_LINK];
> > > > >  			if (ncm->data[2] & 0x1) {
> > > > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > > > >  				found = nc;
> > > > > -				goto out;
> > > > > +				with_link = true;
> > > > >  			}
> > > > >  
> > > > > -			spin_unlock_irqrestore(&nc->lock, flags);
> > > > > +			/* If multi_channel is enabled configure all valid
> > > > > +			 * channels whether or not they currently have link
> > > > > +			 * so they will have AENs enabled.
> > > > > +			 */
> > > > > +			if (with_link || np->multi_channel) {
> > > > 
> > > > I notice that there is a case that we will misconfigure the interface.
> > > > > For example below, multi-channel is not enabled for package 1.
> > > > But we enable the channel for ncsi2 below (package 1 channel 0) as that interface is the first
> > > > channel for that package with link.
> > > 
> > > I don't think I see the issue here; multi-channel is not set on package
> > > 1, but both channels are in the channel whitelist. Channel 0 is
> > > configured since it's the first found on package 1, and channel 1 is not
> > > since channel 0 is already found. Are you expecting something different?
> > >  
> > 
> > The setting is that multi-package is enabled for both package 0 and 1.
> > Multi-channel is only enabled for package 0.
> > 
> > > > cat /sys/kernel/debug/ncsi_protocol/ncsi_device_
> > > > IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC CS PS LS RU CR NQ HA
> > > > =====================================================================
> > > >   2   eth2   ncsi0  000 000 1  1  1  1  1  1  0  2  1  1  1  1  0  1
> > > >   2   eth2   ncsi1  000 001 1  0  1  1  1  1  0  2  1  1  1  1  0  1
> > > >   2   eth2   ncsi2  001 000 1  0  1  0  1  1  0  2  1  1  1  1  0  1
> > 
> > I was replying to the wrong old email, which might have caused a bit of confusion.
> > The first 1 means the channel is enabled for package 1 channel 0 (ncsi2).
> > For eth2, we already have ncsi0 as the active channel with TX enabled.
> > I would think that since that package doesn't have multi-channel enabled,
> > we should not enable the channel for ncsi2. The problem is that package 1 doesn't
> > have multi-channel enabled, so it believes it needs to enable one channel for its package,
> > but it isn't aware that the other package already has one active channel.
> 
> Ah, maybe the confusion here is that multi_channel is a per-package
> setting; it determines what a package does with its own channels.
> 
> So you have package 0 with multi-channel enabled so it enables channels 0
> & 1.
> Then you have package 1 without multi-channel so it enables only channel
> 0.
> There is still only one Tx channel (package 0, channel 0).
> 
> Does that sound right, or have I missed something?

Yes, you are right. There is only one TX enabled. 
If we can hold off a few seconds before applying, then we will not see 
these configuration changes in between the back to back netlink commands.

Thanks, 
Justin

^ permalink raw reply

