* Re: [net] ixgbevf: Fix secpath usage for IPsec Tx offload
From: David Miller @ 2019-09-13 13:53 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, snelson, jonathan
In-Reply-To: <20190912190734.10560-1-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 12 Sep 2019 12:07:34 -0700
> Port the same fix for ixgbe to ixgbevf.
>
> The ixgbevf driver currently does IPsec Tx offloading
> based on an existing secpath. However, the secpath
> can also come from the Rx side, in this case it is
> misinterpreted for Tx offload and the packets are
> dropped with a "bad sa_idx" error. Fix this by using
> the xfrm_offload() function to test for Tx offload.
>
> CC: Shannon Nelson <snelson@pensando.io>
> Fixes: 7f68d4306701 ("ixgbevf: enable VF IPsec offload operations")
> Reported-by: Jonathan Tooker <jonathan@reliablehosting.com>
> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Applied, and like the ixgbe version of this fix I queued it up for -stable.
Thanks.
^ permalink raw reply
* Re: [net-next 0/6][pull request] 100GbE Intel Wired LAN Driver Updates 2019-09-12
From: David Miller @ 2019-09-13 13:51 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann
In-Reply-To: <20190912205002.12159-1-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 12 Sep 2019 13:49:56 -0700
> This series contains updates to ice driver to implement and support
> loading a Dynamic Device Personalization (DDP) package from lib/firmware
> onto the device.
>
> Paul updates the way the driver version is stored in the driver so that
> we can pass the driver version to the firmware. Passing of the driver
> version to the firmware is needed for the DDP package to ensure we have
> the appropriate support in the driver for the features in the package.
>
> Lukasz fixes how the firmware version is stored to align with how the
> firmware stores its own version. Also extended the log message to
> display additional useful information such as NVM version, API patch
> information and firmware build hash.
>
> Tony adds the needed driver support to check, load and store the DDP
> package. Also add support for the ability to load DDP packages intended
> for specific hardware devices, as well as what to do when loading of the
> DDP package fails to load.
>
> The following are changes since commit 172ca8308b0517ca2522a8c885755fd5c20294e7:
> cxgb4: Fix spelling typos
> and are available in the git repository at:
> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 100GbE
Pulled, thanks Jeff.
^ permalink raw reply
* Re: [net-next v2 00/13][pull request] Intel Wired LAN Driver Updates 2019-09-11
From: David Miller @ 2019-09-13 13:48 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, nhorman, sassmann
In-Reply-To: <20190911165014.10742-1-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Wed, 11 Sep 2019 09:50:01 -0700
> This series contains updates to i40e, ixgbe/vf and iavf.
Pulled, thanks Jeff.
^ permalink raw reply
* Re: [PATCH] rtlwifi: rtl8821ae: make array static const and remove redundant assignment
From: Kalle Valo @ 2019-09-13 13:44 UTC (permalink / raw)
To: Colin King
Cc: Ping-Ke Shih, David S . Miller, Larry Finger, linux-wireless,
netdev, kernel-janitors, linux-kernel
In-Reply-To: <20190905150022.3609-1-colin.king@canonical.com>
Colin King <colin.king@canonical.com> wrote:
> From: Colin Ian King <colin.king@canonical.com>
>
> The array channel_all can be make static const rather than populating
> it on the stack, this makes the code smaller. Also, variable place
> is being initialized with a value that is never read, so this assignment
> is redundant and can be removed.
>
> Before:
> text data bss dec hex filename
> 118537 9591 0 128128 1f480 realtek/rtlwifi/rtl8821ae/phy.o
>
> After:
> text data bss dec hex filename
> 118331 9687 0 128018 1f412 realtek/rtlwifi/rtl8821ae/phy.o
>
> Saves 110 bytes, (gcc version 9.2.1, amd64)
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
Patch applied to wireless-drivers-next.git, thanks.
569ce0a486fd rtlwifi: rtl8821ae: make array static const and remove redundant assignment
--
https://patchwork.kernel.org/patch/11133295/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
^ permalink raw reply
* [PATCH 1/1] MAINTAINERS: update FORCEDETH MAINTAINERS info
From: rain.1986.08.12 @ 2019-09-13 13:43 UTC (permalink / raw)
To: mchehab+samsung, davem, gregkh, linux-kernel, rain.1986.08.12,
netdev
From: Rain River <rain.1986.08.12@gmail.com>
Many FORCEDETH NICs are used in our hosts. Several bugs are fixed and
some features are developed for FORCEDETH NICs. And I have been
reviewing patches for FORCEDETH NIC for several months. Mark me as the
FORCEDETH NIC maintainer. I will send out the patches and maintain
FORCEDETH NIC.
Signed-off-by: Rain River <rain.1986.08.12@gmail.com>
---
MAINTAINERS | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index e7a47b5210fd..0a28090df677 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -649,6 +649,12 @@ M: Lino Sanfilippo <LinoSanfilippo@gmx.de>
S: Maintained
F: drivers/net/ethernet/alacritech/*
+FORCEDETH GIGABIT ETHERNET DRIVER
+M: Rain River <rain.1986.08.12@gmail.com>
+L: netdev@vger.kernel.org
+S: Maintained
+F: drivers/net/ethernet/nvidia/*
+
ALCATEL SPEEDTOUCH USB DRIVER
M: Duncan Sands <duncan.sands@free.fr>
L: linux-usb@vger.kernel.org
--
2.17.1
^ permalink raw reply related
* Re: [PATCH] netfilter: trivial: remove extraneous space from message
From: Rui Salvaterra @ 2019-09-13 13:43 UTC (permalink / raw)
To: pablo, davem, netfilter-devel, netdev, linux-kernel
In-Reply-To: <20190724143733.17433-1-rsalvaterra@gmail.com>
Friendly ping.
On Wed, 24 Jul 2019 at 15:37, Rui Salvaterra <rsalvaterra@gmail.com> wrote:
>
> Pure ocd, but this one has been bugging me for a while.
>
> Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
> ---
> net/netfilter/nf_conntrack_helper.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
> index 8d729e7c36ff..209123f35b4a 100644
> --- a/net/netfilter/nf_conntrack_helper.c
> +++ b/net/netfilter/nf_conntrack_helper.c
> @@ -218,7 +218,7 @@ nf_ct_lookup_helper(struct nf_conn *ct, struct net *net)
> return NULL;
> pr_info("nf_conntrack: default automatic helper assignment "
> "has been turned off for security reasons and CT-based "
> - " firewall rule not found. Use the iptables CT target "
> + "firewall rule not found. Use the iptables CT target "
> "to attach helpers instead.\n");
> net->ct.auto_assign_helper_warned = 1;
> return NULL;
> --
> 2.22.0
>
^ permalink raw reply
* Re: [PATCH 00/27] Netfilter updates for net-next
From: David Miller @ 2019-09-13 13:40 UTC (permalink / raw)
To: pablo; +Cc: netfilter-devel, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Fri, 13 Sep 2019 13:30:35 +0200
> The following patchset contains Netfilter updates for net-next:
...
> You can pull these changes from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git
Looks good, pulled.
^ permalink raw reply
* Re: [PATCH net-next 5/5] sctp: add spt_pathcpthld in struct sctp_paddrthlds
From: 'Marcelo Ricardo Leitner' @ 2019-09-13 13:40 UTC (permalink / raw)
To: David Laight
Cc: Xin Long, network dev, linux-sctp@vger.kernel.org, Neil Horman,
davem@davemloft.net
In-Reply-To: <be14cc8353f6403c82ad81e3e741d8f0@AcuMS.aculab.com>
On Fri, Sep 13, 2019 at 01:31:22PM +0000, David Laight wrote:
> From: 'Marcelo Ricardo Leitner'
> > Sent: 13 September 2019 14:20
> ...
> > Interestingly, we have/had the opposite problem with netlink. Like, it
> > was allowing too much flexibility, such as silently ignoring unknown
> > fields (which is what would happen with a new app running on an older
> > kernel would trigger here) is bad because the app cannot know if it
> > was actually used or not. Some gymnastics in the app could cut through
> > the fat here, like probing getsockopt() return size, but then it may
> > as well probe for the right sockopt to be used.
>
> Yes, it would also work if the kernel checked that all 'unexpected'
> fields were zero (up to some sanity limit of a few kB).
Though this would have to be done by older kernels, which are not
aware of this extra space by definition.
>
> Then an application complied with a 'new' header would work with
> an old kernel provided it didn't try so set any new fields.
> (And it zeroed the entire structure.)
>
> But you have to start off with that in mind.
>
> Alternatively stop the insanity of setting multiple options
> with one setsockopt call.
> If multiple system calls are an issue implement a system call
> that will set multiple options on the same socket.
> (Maybe through a CMSG()-like buffer).
> Then the application can set the ones it wants without having
> to do the read-modify-write sequence needed for some of the
> SCTP ones.
I'm not sure I get you here. You mean we could have, for example, one
sockopt for each field on each struct we currently have? That would
bring other problems to the table, like how to deal with fields that
need to be updated together.
Anyhow, I'm afraid our hands a bit tied here. That's how the RFCs are
defining the interface and we shouldn't deviate too much from it.
What would help is that the RFC definited these versioned structs
itself. Because as it is, even if we start versioning it, Linux will
have one versioning and other OSes will have another.
Marcelo
^ permalink raw reply
* Re: [PATCH 3/3] libertas: Remove unneeded variable and make function to be void
From: Kalle Valo @ 2019-09-13 13:38 UTC (permalink / raw)
To: zhong jiang; +Cc: davem, zhongjiang, linux-wireless, netdev, linux-kernel
In-Reply-To: <1568306492-42998-4-git-send-email-zhongjiang@huawei.com>
zhong jiang <zhongjiang@huawei.com> wrote:
> lbs_process_event do not need return value to cope with different
> cases. And change functon return type to void.
>
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Same here, I just don't see the benefit.
Patch set to Rejected.
--
https://patchwork.kernel.org/patch/11143397/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
^ permalink raw reply
* Re: [PATCH 1/3] brcmsmac: Remove unneeded variable and make function to be void
From: Kalle Valo @ 2019-09-13 13:37 UTC (permalink / raw)
To: zhong jiang; +Cc: davem, zhongjiang, linux-wireless, netdev, linux-kernel
In-Reply-To: <1568306492-42998-2-git-send-email-zhongjiang@huawei.com>
zhong jiang <zhongjiang@huawei.com> wrote:
> brcms_c_set_mac do not need return value to cope with different
> cases. And change functon return type to void.
>
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
I just don't see the benefit from changing the function to return void.
And if we ever add error handling to the function we need to change it
back to return int again, which is extra work.
Patch set to Rejected.
--
https://patchwork.kernel.org/patch/11143403/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
^ permalink raw reply
* RE: [PATCH net-next 5/5] sctp: add spt_pathcpthld in struct sctp_paddrthlds
From: David Laight @ 2019-09-13 13:31 UTC (permalink / raw)
To: 'Marcelo Ricardo Leitner'
Cc: Xin Long, network dev, linux-sctp@vger.kernel.org, Neil Horman,
davem@davemloft.net
In-Reply-To: <20190913131954.GX3431@localhost.localdomain>
From: 'Marcelo Ricardo Leitner'
> Sent: 13 September 2019 14:20
...
> Interestingly, we have/had the opposite problem with netlink. Like, it
> was allowing too much flexibility, such as silently ignoring unknown
> fields (which is what would happen with a new app running on an older
> kernel would trigger here) is bad because the app cannot know if it
> was actually used or not. Some gymnastics in the app could cut through
> the fat here, like probing getsockopt() return size, but then it may
> as well probe for the right sockopt to be used.
Yes, it would also work if the kernel checked that all 'unexpected'
fields were zero (up to some sanity limit of a few kB).
Then an application complied with a 'new' header would work with
an old kernel provided it didn't try so set any new fields.
(And it zeroed the entire structure.)
But you have to start off with that in mind.
Alternatively stop the insanity of setting multiple options
with one setsockopt call.
If multiple system calls are an issue implement a system call
that will set multiple options on the same socket.
(Maybe through a CMSG()-like buffer).
Then the application can set the ones it wants without having
to do the read-modify-write sequence needed for some of the
SCTP ones.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
^ permalink raw reply
* Re: [PATCH net-next 5/5] sctp: add spt_pathcpthld in struct sctp_paddrthlds
From: 'Marcelo Ricardo Leitner' @ 2019-09-13 13:19 UTC (permalink / raw)
To: David Laight
Cc: Xin Long, network dev, linux-sctp@vger.kernel.org, Neil Horman,
davem@davemloft.net
In-Reply-To: <bcaba726b7444efea7b14fcd60e4743a@AcuMS.aculab.com>
On Fri, Sep 13, 2019 at 08:36:22AM +0000, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 12 September 2019 23:52
> ...
> > Here it is more visible. If net->...ps_retrans is disabled, remaining
> > fields (currently just this one, but as we are extending it now, we
> > have to think about the possibility of more as well) will be ignored,
> > we and we may not want that.
>
> The only real way to add additional fields is to change the name
> of the structure - that way recompiled programs still work.
>
> You could require that programs zero the entire structure - but
> that is difficult to verify.
> And, in this case, it seems that the default has to be 0xffff
> rather than 0 - which is, in itself, horrid.
Yep, and with that, a new sockopt as well. May not be the most
beautiful way, but it's the safest. Applications can then probe if the
sockopt is available or not and use what they want/can.
Inner kernel code can then be rearranged like it was for the peeloff
operation and peeloff + flags, and these issues just don't exist then.
We actually had agreed on using new sockopts, on thread
[PATCH net] sctp: make sctp_setsockopt_events() less strict about the option length
Interestingly, we have/had the opposite problem with netlink. Like, it
was allowing too much flexibility, such as silently ignoring unknown
fields (which is what would happen with a new app running on an older
kernel would trigger here) is bad because the app cannot know if it
was actually used or not. Some gymnastics in the app could cut through
the fat here, like probing getsockopt() return size, but then it may
as well probe for the right sockopt to be used.
Marcelo
^ permalink raw reply
* [PATCH] net/wan: dscc4: remove broken dscc4 driver
From: Dan Carpenter @ 2019-09-13 13:28 UTC (permalink / raw)
To: David S. Miller, Francois Romieu; +Cc: netdev, kernel-janitors
Using static analysis, I discovered that the "dpriv->pci_priv->pdev"
pointer is always NULL. This pointer was supposed to be initialized
during probe and is essential for the driver to work. It would be easy
to add a "ppriv->pdev = pdev;" to dscc4_found1() but this driver has
been broken since before we started using git and no one has complained
so probably we should just remove it.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
---
MAINTAINERS | 6 -
drivers/net/wan/Kconfig | 14 -
drivers/net/wan/Makefile | 1 -
drivers/net/wan/dscc4.c | 2057 --------------------------------------
4 files changed, 2078 deletions(-)
delete mode 100644 drivers/net/wan/dscc4.c
diff --git a/MAINTAINERS b/MAINTAINERS
index ee4e873c0f9a..731fb91aa836 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5602,12 +5602,6 @@ T: git git://linuxtv.org/media_tree.git
S: Maintained
F: drivers/media/radio/dsbr100.c
-DSCC4 DRIVER
-M: Francois Romieu <romieu@fr.zoreil.com>
-L: netdev@vger.kernel.org
-S: Maintained
-F: drivers/net/wan/dscc4.c
-
DT3155 MEDIA DRIVER
M: Hans Verkuil <hverkuil@xs4all.nl>
L: linux-media@vger.kernel.org
diff --git a/drivers/net/wan/Kconfig b/drivers/net/wan/Kconfig
index 09fdd619ac67..dd1a147f2971 100644
--- a/drivers/net/wan/Kconfig
+++ b/drivers/net/wan/Kconfig
@@ -267,20 +267,6 @@ config FARSYNC
To compile this driver as a module, choose M here: the
module will be called farsync.
-config DSCC4
- tristate "Etinc PCISYNC serial board support"
- depends on HDLC && PCI && m
- help
- Driver for Etinc PCISYNC boards based on the Infineon (ex. Siemens)
- DSCC4 chipset.
-
- This is supposed to work with the four port card. Take a look at
- <http://www.cogenit.fr/dscc4/> for further information about the
- driver.
-
- To compile this driver as a module, choose M here: the
- module will be called dscc4.
-
config FSL_UCC_HDLC
tristate "Freescale QUICC Engine HDLC support"
depends on HDLC
diff --git a/drivers/net/wan/Makefile b/drivers/net/wan/Makefile
index 9532e69fda87..701f5d2fe3b6 100644
--- a/drivers/net/wan/Makefile
+++ b/drivers/net/wan/Makefile
@@ -18,7 +18,6 @@ obj-$(CONFIG_HOSTESS_SV11) += z85230.o hostess_sv11.o
obj-$(CONFIG_SEALEVEL_4021) += z85230.o sealevel.o
obj-$(CONFIG_COSA) += cosa.o
obj-$(CONFIG_FARSYNC) += farsync.o
-obj-$(CONFIG_DSCC4) += dscc4.o
obj-$(CONFIG_X25_ASY) += x25_asy.o
obj-$(CONFIG_LANMEDIA) += lmc/
diff --git a/drivers/net/wan/dscc4.c b/drivers/net/wan/dscc4.c
deleted file mode 100644
index fa78d2b14136..000000000000
--- a/drivers/net/wan/dscc4.c
+++ /dev/null
@@ -1,2057 +0,0 @@
-/*
- * drivers/net/wan/dscc4/dscc4.c: a DSCC4 HDLC driver for Linux
- *
- * This software may be used and distributed according to the terms of the
- * GNU General Public License.
- *
- * The author may be reached as romieu@cogenit.fr.
- * Specific bug reports/asian food will be welcome.
- *
- * Special thanks to the nice people at CS-Telecom for the hardware and the
- * access to the test/measure tools.
- *
- *
- * Theory of Operation
- *
- * I. Board Compatibility
- *
- * This device driver is designed for the Siemens PEB20534 4 ports serial
- * controller as found on Etinc PCISYNC cards. The documentation for the
- * chipset is available at http://www.infineon.com:
- * - Data Sheet "DSCC4, DMA Supported Serial Communication Controller with
- * 4 Channels, PEB 20534 Version 2.1, PEF 20534 Version 2.1";
- * - Application Hint "Management of DSCC4 on-chip FIFO resources".
- * - Errata sheet DS5 (courtesy of Michael Skerritt).
- * Jens David has built an adapter based on the same chipset. Take a look
- * at http://www.afthd.tu-darmstadt.de/~dg1kjd/pciscc4 for a specific
- * driver.
- * Sample code (2 revisions) is available at Infineon.
- *
- * II. Board-specific settings
- *
- * Pcisync can transmit some clock signal to the outside world on the
- * *first two* ports provided you put a quartz and a line driver on it and
- * remove the jumpers. The operation is described on Etinc web site. If you
- * go DCE on these ports, don't forget to use an adequate cable.
- *
- * Sharing of the PCI interrupt line for this board is possible.
- *
- * III. Driver operation
- *
- * The rx/tx operations are based on a linked list of descriptors. The driver
- * doesn't use HOLD mode any more. HOLD mode is definitely buggy and the more
- * I tried to fix it, the more it started to look like (convoluted) software
- * mutation of LxDA method. Errata sheet DS5 suggests to use LxDA: consider
- * this a rfc2119 MUST.
- *
- * Tx direction
- * When the tx ring is full, the xmit routine issues a call to netdev_stop.
- * The device is supposed to be enabled again during an ALLS irq (we could
- * use HI but as it's easy to lose events, it's fscked).
- *
- * Rx direction
- * The received frames aren't supposed to span over multiple receiving areas.
- * I may implement it some day but it isn't the highest ranked item.
- *
- * IV. Notes
- * The current error (XDU, RFO) recovery code is untested.
- * So far, RDO takes his RX channel down and the right sequence to enable it
- * again is still a mystery. If RDO happens, plan a reboot. More details
- * in the code (NB: as this happens, TX still works).
- * Don't mess the cables during operation, especially on DTE ports. I don't
- * suggest it for DCE either but at least one can get some messages instead
- * of a complete instant freeze.
- * Tests are done on Rev. 20 of the silicium. The RDO handling changes with
- * the documentation/chipset releases.
- *
- * TODO:
- * - test X25.
- * - use polling at high irq/s,
- * - performance analysis,
- * - endianness.
- *
- * 2001/12/10 Daniela Squassoni <daniela@cyclades.com>
- * - Contribution to support the new generic HDLC layer.
- *
- * 2002/01 Ueimor
- * - old style interface removal
- * - dscc4_release_ring fix (related to DMA mapping)
- * - hard_start_xmit fix (hint: TxSizeMax)
- * - misc crapectomy.
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/module.h>
-#include <linux/sched.h>
-#include <linux/types.h>
-#include <linux/errno.h>
-#include <linux/list.h>
-#include <linux/ioport.h>
-#include <linux/pci.h>
-#include <linux/kernel.h>
-#include <linux/mm.h>
-#include <linux/slab.h>
-
-#include <asm/cache.h>
-#include <asm/byteorder.h>
-#include <linux/uaccess.h>
-#include <asm/io.h>
-#include <asm/irq.h>
-
-#include <linux/init.h>
-#include <linux/interrupt.h>
-#include <linux/string.h>
-
-#include <linux/if_arp.h>
-#include <linux/netdevice.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/hdlc.h>
-#include <linux/mutex.h>
-
-/* Version */
-static const char version[] = "$Id: dscc4.c,v 1.173 2003/09/20 23:55:34 romieu Exp $ for Linux\n";
-static int debug;
-static int quartz;
-
-#ifdef CONFIG_DSCC4_PCI_RST
-static DEFINE_MUTEX(dscc4_mutex);
-static u32 dscc4_pci_config_store[16];
-#endif
-
-#define DRV_NAME "dscc4"
-
-#undef DSCC4_POLLING
-
-/* Module parameters */
-
-MODULE_AUTHOR("Maintainer: Francois Romieu <romieu@cogenit.fr>");
-MODULE_DESCRIPTION("Siemens PEB20534 PCI Controller");
-MODULE_LICENSE("GPL");
-module_param(debug, int, 0);
-MODULE_PARM_DESC(debug,"Enable/disable extra messages");
-module_param(quartz, int, 0);
-MODULE_PARM_DESC(quartz,"If present, on-board quartz frequency (Hz)");
-
-/* Structures */
-
-struct thingie {
- int define;
- u32 bits;
-};
-
-struct TxFD {
- __le32 state;
- __le32 next;
- __le32 data;
- __le32 complete;
- u32 jiffies; /* Allows sizeof(TxFD) == sizeof(RxFD) + extra hack */
- /* FWIW, datasheet calls that "dummy" and says that card
- * never looks at it; neither does the driver */
-};
-
-struct RxFD {
- __le32 state1;
- __le32 next;
- __le32 data;
- __le32 state2;
- __le32 end;
-};
-
-#define DUMMY_SKB_SIZE 64
-#define TX_LOW 8
-#define TX_RING_SIZE 32
-#define RX_RING_SIZE 32
-#define TX_TOTAL_SIZE TX_RING_SIZE*sizeof(struct TxFD)
-#define RX_TOTAL_SIZE RX_RING_SIZE*sizeof(struct RxFD)
-#define IRQ_RING_SIZE 64 /* Keep it a multiple of 32 */
-#define TX_TIMEOUT (HZ/10)
-#define DSCC4_HZ_MAX 33000000
-#define BRR_DIVIDER_MAX 64*0x00004000 /* Cf errata DS5 p.10 */
-#define dev_per_card 4
-#define SCC_REGISTERS_MAX 23 /* Cf errata DS5 p.4 */
-
-#define SOURCE_ID(flags) (((flags) >> 28) & 0x03)
-#define TO_SIZE(state) (((state) >> 16) & 0x1fff)
-
-/*
- * Given the operating range of Linux HDLC, the 2 defines below could be
- * made simpler. However they are a fine reminder for the limitations of
- * the driver: it's better to stay < TxSizeMax and < RxSizeMax.
- */
-#define TO_STATE_TX(len) cpu_to_le32(((len) & TxSizeMax) << 16)
-#define TO_STATE_RX(len) cpu_to_le32((RX_MAX(len) % RxSizeMax) << 16)
-#define RX_MAX(len) ((((len) >> 5) + 1) << 5) /* Cf RLCR */
-#define SCC_REG_START(dpriv) (SCC_START+(dpriv->dev_id)*SCC_OFFSET)
-
-struct dscc4_pci_priv {
- __le32 *iqcfg;
- int cfg_cur;
- spinlock_t lock;
- struct pci_dev *pdev;
-
- struct dscc4_dev_priv *root;
- dma_addr_t iqcfg_dma;
- u32 xtal_hz;
-};
-
-struct dscc4_dev_priv {
- struct sk_buff *rx_skbuff[RX_RING_SIZE];
- struct sk_buff *tx_skbuff[TX_RING_SIZE];
-
- struct RxFD *rx_fd;
- struct TxFD *tx_fd;
- __le32 *iqrx;
- __le32 *iqtx;
-
- /* FIXME: check all the volatile are required */
- volatile u32 tx_current;
- u32 rx_current;
- u32 iqtx_current;
- u32 iqrx_current;
-
- volatile u32 tx_dirty;
- volatile u32 ltda;
- u32 rx_dirty;
- u32 lrda;
-
- dma_addr_t tx_fd_dma;
- dma_addr_t rx_fd_dma;
- dma_addr_t iqtx_dma;
- dma_addr_t iqrx_dma;
-
- u32 scc_regs[SCC_REGISTERS_MAX]; /* Cf errata DS5 p.4 */
-
- struct dscc4_pci_priv *pci_priv;
- spinlock_t lock;
-
- int dev_id;
- volatile u32 flags;
- u32 timer_help;
-
- unsigned short encoding;
- unsigned short parity;
- struct net_device *dev;
- sync_serial_settings settings;
- void __iomem *base_addr;
- u32 __pad __attribute__ ((aligned (4)));
-};
-
-/* GLOBAL registers definitions */
-#define GCMDR 0x00
-#define GSTAR 0x04
-#define GMODE 0x08
-#define IQLENR0 0x0C
-#define IQLENR1 0x10
-#define IQRX0 0x14
-#define IQTX0 0x24
-#define IQCFG 0x3c
-#define FIFOCR1 0x44
-#define FIFOCR2 0x48
-#define FIFOCR3 0x4c
-#define FIFOCR4 0x34
-#define CH0CFG 0x50
-#define CH0BRDA 0x54
-#define CH0BTDA 0x58
-#define CH0FRDA 0x98
-#define CH0FTDA 0xb0
-#define CH0LRDA 0xc8
-#define CH0LTDA 0xe0
-
-/* SCC registers definitions */
-#define SCC_START 0x0100
-#define SCC_OFFSET 0x80
-#define CMDR 0x00
-#define STAR 0x04
-#define CCR0 0x08
-#define CCR1 0x0c
-#define CCR2 0x10
-#define BRR 0x2C
-#define RLCR 0x40
-#define IMR 0x54
-#define ISR 0x58
-
-#define GPDIR 0x0400
-#define GPDATA 0x0404
-#define GPIM 0x0408
-
-/* Bit masks */
-#define EncodingMask 0x00700000
-#define CrcMask 0x00000003
-
-#define IntRxScc0 0x10000000
-#define IntTxScc0 0x01000000
-
-#define TxPollCmd 0x00000400
-#define RxActivate 0x08000000
-#define MTFi 0x04000000
-#define Rdr 0x00400000
-#define Rdt 0x00200000
-#define Idr 0x00100000
-#define Idt 0x00080000
-#define TxSccRes 0x01000000
-#define RxSccRes 0x00010000
-#define TxSizeMax 0x1fff /* Datasheet DS1 - 11.1.1.1 */
-#define RxSizeMax 0x1ffc /* Datasheet DS1 - 11.1.2.1 */
-
-#define Ccr0ClockMask 0x0000003f
-#define Ccr1LoopMask 0x00000200
-#define IsrMask 0x000fffff
-#define BrrExpMask 0x00000f00
-#define BrrMultMask 0x0000003f
-#define EncodingMask 0x00700000
-#define Hold cpu_to_le32(0x40000000)
-#define SccBusy 0x10000000
-#define PowerUp 0x80000000
-#define Vis 0x00001000
-#define FrameOk (FrameVfr | FrameCrc)
-#define FrameVfr 0x80
-#define FrameRdo 0x40
-#define FrameCrc 0x20
-#define FrameRab 0x10
-#define FrameAborted cpu_to_le32(0x00000200)
-#define FrameEnd cpu_to_le32(0x80000000)
-#define DataComplete cpu_to_le32(0x40000000)
-#define LengthCheck 0x00008000
-#define SccEvt 0x02000000
-#define NoAck 0x00000200
-#define Action 0x00000001
-#define HiDesc cpu_to_le32(0x20000000)
-
-/* SCC events */
-#define RxEvt 0xf0000000
-#define TxEvt 0x0f000000
-#define Alls 0x00040000
-#define Xdu 0x00010000
-#define Cts 0x00004000
-#define Xmr 0x00002000
-#define Xpr 0x00001000
-#define Rdo 0x00000080
-#define Rfs 0x00000040
-#define Cd 0x00000004
-#define Rfo 0x00000002
-#define Flex 0x00000001
-
-/* DMA core events */
-#define Cfg 0x00200000
-#define Hi 0x00040000
-#define Fi 0x00020000
-#define Err 0x00010000
-#define Arf 0x00000002
-#define ArAck 0x00000001
-
-/* State flags */
-#define Ready 0x00000000
-#define NeedIDR 0x00000001
-#define NeedIDT 0x00000002
-#define RdoSet 0x00000004
-#define FakeReset 0x00000008
-
-/* Don't mask RDO. Ever. */
-#ifdef DSCC4_POLLING
-#define EventsMask 0xfffeef7f
-#else
-#define EventsMask 0xfffa8f7a
-#endif
-
-/* Functions prototypes */
-static void dscc4_rx_irq(struct dscc4_pci_priv *, struct dscc4_dev_priv *);
-static void dscc4_tx_irq(struct dscc4_pci_priv *, struct dscc4_dev_priv *);
-static int dscc4_found1(struct pci_dev *, void __iomem *ioaddr);
-static int dscc4_init_one(struct pci_dev *, const struct pci_device_id *ent);
-static int dscc4_open(struct net_device *);
-static netdev_tx_t dscc4_start_xmit(struct sk_buff *,
- struct net_device *);
-static int dscc4_close(struct net_device *);
-static int dscc4_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
-static int dscc4_init_ring(struct net_device *);
-static void dscc4_release_ring(struct dscc4_dev_priv *);
-static void dscc4_tx_timeout(struct net_device *);
-static irqreturn_t dscc4_irq(int irq, void *dev_id);
-static int dscc4_hdlc_attach(struct net_device *, unsigned short, unsigned short);
-static int dscc4_set_iface(struct dscc4_dev_priv *, struct net_device *);
-#ifdef DSCC4_POLLING
-static int dscc4_tx_poll(struct dscc4_dev_priv *, struct net_device *);
-#endif
-
-static inline struct dscc4_dev_priv *dscc4_priv(struct net_device *dev)
-{
- return dev_to_hdlc(dev)->priv;
-}
-
-static inline struct net_device *dscc4_to_dev(struct dscc4_dev_priv *p)
-{
- return p->dev;
-}
-
-static void scc_patchl(u32 mask, u32 value, struct dscc4_dev_priv *dpriv,
- struct net_device *dev, int offset)
-{
- u32 state;
-
- /* Cf scc_writel for concern regarding thread-safety */
- state = dpriv->scc_regs[offset >> 2];
- state &= ~mask;
- state |= value;
- dpriv->scc_regs[offset >> 2] = state;
- writel(state, dpriv->base_addr + SCC_REG_START(dpriv) + offset);
-}
-
-static void scc_writel(u32 bits, struct dscc4_dev_priv *dpriv,
- struct net_device *dev, int offset)
-{
- /*
- * Thread-UNsafe.
- * As of 2002/02/16, there are no thread racing for access.
- */
- dpriv->scc_regs[offset >> 2] = bits;
- writel(bits, dpriv->base_addr + SCC_REG_START(dpriv) + offset);
-}
-
-static inline u32 scc_readl(struct dscc4_dev_priv *dpriv, int offset)
-{
- return dpriv->scc_regs[offset >> 2];
-}
-
-static u32 scc_readl_star(struct dscc4_dev_priv *dpriv, struct net_device *dev)
-{
- /* Cf errata DS5 p.4 */
- readl(dpriv->base_addr + SCC_REG_START(dpriv) + STAR);
- return readl(dpriv->base_addr + SCC_REG_START(dpriv) + STAR);
-}
-
-static inline void dscc4_do_tx(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- dpriv->ltda = dpriv->tx_fd_dma +
- ((dpriv->tx_current-1)%TX_RING_SIZE)*sizeof(struct TxFD);
- writel(dpriv->ltda, dpriv->base_addr + CH0LTDA + dpriv->dev_id*4);
- /* Flush posted writes *NOW* */
- readl(dpriv->base_addr + CH0LTDA + dpriv->dev_id*4);
-}
-
-static inline void dscc4_rx_update(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- dpriv->lrda = dpriv->rx_fd_dma +
- ((dpriv->rx_dirty - 1)%RX_RING_SIZE)*sizeof(struct RxFD);
- writel(dpriv->lrda, dpriv->base_addr + CH0LRDA + dpriv->dev_id*4);
-}
-
-static inline unsigned int dscc4_tx_done(struct dscc4_dev_priv *dpriv)
-{
- return dpriv->tx_current == dpriv->tx_dirty;
-}
-
-static inline unsigned int dscc4_tx_quiescent(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- return readl(dpriv->base_addr + CH0FTDA + dpriv->dev_id*4) == dpriv->ltda;
-}
-
-static int state_check(u32 state, struct dscc4_dev_priv *dpriv,
- struct net_device *dev, const char *msg)
-{
- int ret = 0;
-
- if (debug > 1) {
- if (SOURCE_ID(state) != dpriv->dev_id) {
- printk(KERN_DEBUG "%s (%s): Source Id=%d, state=%08x\n",
- dev->name, msg, SOURCE_ID(state), state);
- ret = -1;
- }
- if (state & 0x0df80c00) {
- printk(KERN_DEBUG "%s (%s): state=%08x (UFO alert)\n",
- dev->name, msg, state);
- ret = -1;
- }
- }
- return ret;
-}
-
-static void dscc4_tx_print(struct net_device *dev,
- struct dscc4_dev_priv *dpriv,
- char *msg)
-{
- printk(KERN_DEBUG "%s: tx_current=%02d tx_dirty=%02d (%s)\n",
- dev->name, dpriv->tx_current, dpriv->tx_dirty, msg);
-}
-
-static void dscc4_release_ring(struct dscc4_dev_priv *dpriv)
-{
- struct device *d = &dpriv->pci_priv->pdev->dev;
- struct TxFD *tx_fd = dpriv->tx_fd;
- struct RxFD *rx_fd = dpriv->rx_fd;
- struct sk_buff **skbuff;
- int i;
-
- dma_free_coherent(d, TX_TOTAL_SIZE, tx_fd, dpriv->tx_fd_dma);
- dma_free_coherent(d, RX_TOTAL_SIZE, rx_fd, dpriv->rx_fd_dma);
-
- skbuff = dpriv->tx_skbuff;
- for (i = 0; i < TX_RING_SIZE; i++) {
- if (*skbuff) {
- dma_unmap_single(d, le32_to_cpu(tx_fd->data),
- (*skbuff)->len, DMA_TO_DEVICE);
- dev_kfree_skb(*skbuff);
- }
- skbuff++;
- tx_fd++;
- }
-
- skbuff = dpriv->rx_skbuff;
- for (i = 0; i < RX_RING_SIZE; i++) {
- if (*skbuff) {
- dma_unmap_single(d, le32_to_cpu(rx_fd->data),
- RX_MAX(HDLC_MAX_MRU),
- DMA_FROM_DEVICE);
- dev_kfree_skb(*skbuff);
- }
- skbuff++;
- rx_fd++;
- }
-}
-
-static inline int try_get_rx_skb(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- unsigned int dirty = dpriv->rx_dirty%RX_RING_SIZE;
- struct device *d = &dpriv->pci_priv->pdev->dev;
- struct RxFD *rx_fd = dpriv->rx_fd + dirty;
- const int len = RX_MAX(HDLC_MAX_MRU);
- struct sk_buff *skb;
- dma_addr_t addr;
-
- skb = dev_alloc_skb(len);
- if (!skb)
- goto err_out;
-
- skb->protocol = hdlc_type_trans(skb, dev);
- addr = dma_map_single(d, skb->data, len, DMA_FROM_DEVICE);
- if (dma_mapping_error(d, addr))
- goto err_free_skb;
-
- dpriv->rx_skbuff[dirty] = skb;
- rx_fd->data = cpu_to_le32(addr);
- return 0;
-
-err_free_skb:
- dev_kfree_skb_any(skb);
-err_out:
- rx_fd->data = 0;
- return -1;
-}
-
-/*
- * IRQ/thread/whatever safe
- */
-static int dscc4_wait_ack_cec(struct dscc4_dev_priv *dpriv,
- struct net_device *dev, char *msg)
-{
- s8 i = 0;
-
- do {
- if (!(scc_readl_star(dpriv, dev) & SccBusy)) {
- printk(KERN_DEBUG "%s: %s ack (%d try)\n", dev->name,
- msg, i);
- goto done;
- }
- schedule_timeout_uninterruptible(msecs_to_jiffies(100));
- rmb();
- } while (++i > 0);
- netdev_err(dev, "%s timeout\n", msg);
-done:
- return (i >= 0) ? i : -EAGAIN;
-}
-
-static int dscc4_do_action(struct net_device *dev, char *msg)
-{
- void __iomem *ioaddr = dscc4_priv(dev)->base_addr;
- s16 i = 0;
-
- writel(Action, ioaddr + GCMDR);
- ioaddr += GSTAR;
- do {
- u32 state = readl(ioaddr);
-
- if (state & ArAck) {
- netdev_dbg(dev, "%s ack\n", msg);
- writel(ArAck, ioaddr);
- goto done;
- } else if (state & Arf) {
- netdev_err(dev, "%s failed\n", msg);
- writel(Arf, ioaddr);
- i = -1;
- goto done;
- }
- rmb();
- } while (++i > 0);
- netdev_err(dev, "%s timeout\n", msg);
-done:
- return i;
-}
-
-static inline int dscc4_xpr_ack(struct dscc4_dev_priv *dpriv)
-{
- int cur = dpriv->iqtx_current%IRQ_RING_SIZE;
- s8 i = 0;
-
- do {
- if (!(dpriv->flags & (NeedIDR | NeedIDT)) ||
- (dpriv->iqtx[cur] & cpu_to_le32(Xpr)))
- break;
- smp_rmb();
- schedule_timeout_uninterruptible(msecs_to_jiffies(100));
- } while (++i > 0);
-
- return (i >= 0 ) ? i : -EAGAIN;
-}
-
-#if 0 /* dscc4_{rx/tx}_reset are both unreliable - more tweak needed */
-static void dscc4_rx_reset(struct dscc4_dev_priv *dpriv, struct net_device *dev)
-{
- unsigned long flags;
-
- spin_lock_irqsave(&dpriv->pci_priv->lock, flags);
- /* Cf errata DS5 p.6 */
- writel(0x00000000, dpriv->base_addr + CH0LRDA + dpriv->dev_id*4);
- scc_patchl(PowerUp, 0, dpriv, dev, CCR0);
- readl(dpriv->base_addr + CH0LRDA + dpriv->dev_id*4);
- writel(MTFi|Rdr, dpriv->base_addr + dpriv->dev_id*0x0c + CH0CFG);
- writel(Action, dpriv->base_addr + GCMDR);
- spin_unlock_irqrestore(&dpriv->pci_priv->lock, flags);
-}
-
-#endif
-
-#if 0
-static void dscc4_tx_reset(struct dscc4_dev_priv *dpriv, struct net_device *dev)
-{
- u16 i = 0;
-
- /* Cf errata DS5 p.7 */
- scc_patchl(PowerUp, 0, dpriv, dev, CCR0);
- scc_writel(0x00050000, dpriv, dev, CCR2);
- /*
- * Must be longer than the time required to fill the fifo.
- */
- while (!dscc4_tx_quiescent(dpriv, dev) && ++i) {
- udelay(1);
- wmb();
- }
-
- writel(MTFi|Rdt, dpriv->base_addr + dpriv->dev_id*0x0c + CH0CFG);
- if (dscc4_do_action(dev, "Rdt") < 0)
- netdev_err(dev, "Tx reset failed\n");
-}
-#endif
-
-/* TODO: (ab)use this function to refill a completely depleted RX ring. */
-static inline void dscc4_rx_skb(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- struct RxFD *rx_fd = dpriv->rx_fd + dpriv->rx_current%RX_RING_SIZE;
- struct device *d = &dpriv->pci_priv->pdev->dev;
- struct sk_buff *skb;
- int pkt_len;
-
- skb = dpriv->rx_skbuff[dpriv->rx_current++%RX_RING_SIZE];
- if (!skb) {
- printk(KERN_DEBUG "%s: skb=0 (%s)\n", dev->name, __func__);
- goto refill;
- }
- pkt_len = TO_SIZE(le32_to_cpu(rx_fd->state2));
- dma_unmap_single(d, le32_to_cpu(rx_fd->data),
- RX_MAX(HDLC_MAX_MRU), DMA_FROM_DEVICE);
- if ((skb->data[--pkt_len] & FrameOk) == FrameOk) {
- dev->stats.rx_packets++;
- dev->stats.rx_bytes += pkt_len;
- skb_put(skb, pkt_len);
- if (netif_running(dev))
- skb->protocol = hdlc_type_trans(skb, dev);
- netif_rx(skb);
- } else {
- if (skb->data[pkt_len] & FrameRdo)
- dev->stats.rx_fifo_errors++;
- else if (!(skb->data[pkt_len] & FrameCrc))
- dev->stats.rx_crc_errors++;
- else if ((skb->data[pkt_len] & (FrameVfr | FrameRab)) !=
- (FrameVfr | FrameRab))
- dev->stats.rx_length_errors++;
- dev->stats.rx_errors++;
- dev_kfree_skb_irq(skb);
- }
-refill:
- while ((dpriv->rx_dirty - dpriv->rx_current) % RX_RING_SIZE) {
- if (try_get_rx_skb(dpriv, dev) < 0)
- break;
- dpriv->rx_dirty++;
- }
- dscc4_rx_update(dpriv, dev);
- rx_fd->state2 = 0x00000000;
- rx_fd->end = cpu_to_le32(0xbabeface);
-}
-
-static void dscc4_free1(struct pci_dev *pdev)
-{
- struct dscc4_pci_priv *ppriv;
- struct dscc4_dev_priv *root;
- int i;
-
- ppriv = pci_get_drvdata(pdev);
- root = ppriv->root;
-
- for (i = 0; i < dev_per_card; i++)
- unregister_hdlc_device(dscc4_to_dev(root + i));
-
- for (i = 0; i < dev_per_card; i++)
- free_netdev(root[i].dev);
- kfree(root);
- kfree(ppriv);
-}
-
-static int dscc4_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
-{
- struct dscc4_pci_priv *priv;
- struct dscc4_dev_priv *dpriv;
- void __iomem *ioaddr;
- int i, rc;
-
- printk(KERN_DEBUG "%s", version);
-
- rc = pci_enable_device(pdev);
- if (rc < 0)
- goto out;
-
- rc = pci_request_region(pdev, 0, "registers");
- if (rc < 0) {
- pr_err("can't reserve MMIO region (regs)\n");
- goto err_disable_0;
- }
- rc = pci_request_region(pdev, 1, "LBI interface");
- if (rc < 0) {
- pr_err("can't reserve MMIO region (lbi)\n");
- goto err_free_mmio_region_1;
- }
-
- ioaddr = pci_ioremap_bar(pdev, 0);
- if (!ioaddr) {
- pr_err("cannot remap MMIO region %llx @ %llx\n",
- (unsigned long long)pci_resource_len(pdev, 0),
- (unsigned long long)pci_resource_start(pdev, 0));
- rc = -EIO;
- goto err_free_mmio_regions_2;
- }
- printk(KERN_DEBUG "Siemens DSCC4, MMIO at %#llx (regs), %#llx (lbi), IRQ %d\n",
- (unsigned long long)pci_resource_start(pdev, 0),
- (unsigned long long)pci_resource_start(pdev, 1), pdev->irq);
-
- /* Cf errata DS5 p.2 */
- pci_write_config_byte(pdev, PCI_LATENCY_TIMER, 0xf8);
- pci_set_master(pdev);
-
- rc = dscc4_found1(pdev, ioaddr);
- if (rc < 0)
- goto err_iounmap_3;
-
- priv = pci_get_drvdata(pdev);
-
- rc = request_irq(pdev->irq, dscc4_irq, IRQF_SHARED, DRV_NAME, priv->root);
- if (rc < 0) {
- pr_warn("IRQ %d busy\n", pdev->irq);
- goto err_release_4;
- }
-
- /* power up/little endian/dma core controlled via lrda/ltda */
- writel(0x00000001, ioaddr + GMODE);
- /* Shared interrupt queue */
- {
- u32 bits;
-
- bits = (IRQ_RING_SIZE >> 5) - 1;
- bits |= bits << 4;
- bits |= bits << 8;
- bits |= bits << 16;
- writel(bits, ioaddr + IQLENR0);
- }
- /* Global interrupt queue */
- writel((u32)(((IRQ_RING_SIZE >> 5) - 1) << 20), ioaddr + IQLENR1);
-
- rc = -ENOMEM;
-
- priv->iqcfg = (__le32 *)dma_alloc_coherent(&pdev->dev,
- IRQ_RING_SIZE*sizeof(__le32), &priv->iqcfg_dma, GFP_KERNEL);
- if (!priv->iqcfg)
- goto err_free_irq_5;
- writel(priv->iqcfg_dma, ioaddr + IQCFG);
-
- /*
- * SCC 0-3 private rx/tx irq structures
- * IQRX/TXi needs to be set soon. Learned it the hard way...
- */
- for (i = 0; i < dev_per_card; i++) {
- dpriv = priv->root + i;
- dpriv->iqtx = (__le32 *)dma_alloc_coherent(&pdev->dev,
- IRQ_RING_SIZE*sizeof(u32), &dpriv->iqtx_dma,
- GFP_KERNEL);
- if (!dpriv->iqtx)
- goto err_free_iqtx_6;
- writel(dpriv->iqtx_dma, ioaddr + IQTX0 + i*4);
- }
- for (i = 0; i < dev_per_card; i++) {
- dpriv = priv->root + i;
- dpriv->iqrx = (__le32 *)dma_alloc_coherent(&pdev->dev,
- IRQ_RING_SIZE*sizeof(u32), &dpriv->iqrx_dma,
- GFP_KERNEL);
- if (!dpriv->iqrx)
- goto err_free_iqrx_7;
- writel(dpriv->iqrx_dma, ioaddr + IQRX0 + i*4);
- }
-
- /* Cf application hint. Beware of hard-lock condition on threshold. */
- writel(0x42104000, ioaddr + FIFOCR1);
- //writel(0x9ce69800, ioaddr + FIFOCR2);
- writel(0xdef6d800, ioaddr + FIFOCR2);
- //writel(0x11111111, ioaddr + FIFOCR4);
- writel(0x18181818, ioaddr + FIFOCR4);
- // FIXME: should depend on the chipset revision
- writel(0x0000000e, ioaddr + FIFOCR3);
-
- writel(0xff200001, ioaddr + GCMDR);
-
- rc = 0;
-out:
- return rc;
-
-err_free_iqrx_7:
- while (--i >= 0) {
- dpriv = priv->root + i;
- dma_free_coherent(&pdev->dev, IRQ_RING_SIZE*sizeof(u32),
- dpriv->iqrx, dpriv->iqrx_dma);
- }
- i = dev_per_card;
-err_free_iqtx_6:
- while (--i >= 0) {
- dpriv = priv->root + i;
- dma_free_coherent(&pdev->dev, IRQ_RING_SIZE*sizeof(u32),
- dpriv->iqtx, dpriv->iqtx_dma);
- }
- dma_free_coherent(&pdev->dev, IRQ_RING_SIZE*sizeof(u32), priv->iqcfg,
- priv->iqcfg_dma);
-err_free_irq_5:
- free_irq(pdev->irq, priv->root);
-err_release_4:
- dscc4_free1(pdev);
-err_iounmap_3:
- iounmap (ioaddr);
-err_free_mmio_regions_2:
- pci_release_region(pdev, 1);
-err_free_mmio_region_1:
- pci_release_region(pdev, 0);
-err_disable_0:
- pci_disable_device(pdev);
- goto out;
-};
-
-/*
- * Let's hope the default values are decent enough to protect my
- * feet from the user's gun - Ueimor
- */
-static void dscc4_init_registers(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- /* No interrupts, SCC core disabled. Let's relax */
- scc_writel(0x00000000, dpriv, dev, CCR0);
-
- scc_writel(LengthCheck | (HDLC_MAX_MRU >> 5), dpriv, dev, RLCR);
-
- /*
- * No address recognition/crc-CCITT/cts enabled
- * Shared flags transmission disabled - cf errata DS5 p.11
- * Carrier detect disabled - cf errata p.14
- * FIXME: carrier detection/polarity may be handled more gracefully.
- */
- scc_writel(0x02408000, dpriv, dev, CCR1);
-
- /* crc not forwarded - Cf errata DS5 p.11 */
- scc_writel(0x00050008 & ~RxActivate, dpriv, dev, CCR2);
- // crc forwarded
- //scc_writel(0x00250008 & ~RxActivate, dpriv, dev, CCR2);
-}
-
-static inline int dscc4_set_quartz(struct dscc4_dev_priv *dpriv, int hz)
-{
- int ret = 0;
-
- if ((hz < 0) || (hz > DSCC4_HZ_MAX))
- ret = -EOPNOTSUPP;
- else
- dpriv->pci_priv->xtal_hz = hz;
-
- return ret;
-}
-
-static const struct net_device_ops dscc4_ops = {
- .ndo_open = dscc4_open,
- .ndo_stop = dscc4_close,
- .ndo_start_xmit = hdlc_start_xmit,
- .ndo_do_ioctl = dscc4_ioctl,
- .ndo_tx_timeout = dscc4_tx_timeout,
-};
-
-static int dscc4_found1(struct pci_dev *pdev, void __iomem *ioaddr)
-{
- struct dscc4_pci_priv *ppriv;
- struct dscc4_dev_priv *root;
- int i, ret = -ENOMEM;
-
- root = kcalloc(dev_per_card, sizeof(*root), GFP_KERNEL);
- if (!root)
- goto err_out;
-
- for (i = 0; i < dev_per_card; i++) {
- root[i].dev = alloc_hdlcdev(root + i);
- if (!root[i].dev)
- goto err_free_dev;
- }
-
- ppriv = kzalloc(sizeof(*ppriv), GFP_KERNEL);
- if (!ppriv)
- goto err_free_dev;
-
- ppriv->root = root;
- spin_lock_init(&ppriv->lock);
-
- for (i = 0; i < dev_per_card; i++) {
- struct dscc4_dev_priv *dpriv = root + i;
- struct net_device *d = dscc4_to_dev(dpriv);
- hdlc_device *hdlc = dev_to_hdlc(d);
-
- d->base_addr = (unsigned long)ioaddr;
- d->irq = pdev->irq;
- d->netdev_ops = &dscc4_ops;
- d->watchdog_timeo = TX_TIMEOUT;
- SET_NETDEV_DEV(d, &pdev->dev);
-
- dpriv->dev_id = i;
- dpriv->pci_priv = ppriv;
- dpriv->base_addr = ioaddr;
- spin_lock_init(&dpriv->lock);
-
- hdlc->xmit = dscc4_start_xmit;
- hdlc->attach = dscc4_hdlc_attach;
-
- dscc4_init_registers(dpriv, d);
- dpriv->parity = PARITY_CRC16_PR0_CCITT;
- dpriv->encoding = ENCODING_NRZ;
-
- ret = dscc4_init_ring(d);
- if (ret < 0)
- goto err_unregister;
-
- ret = register_hdlc_device(d);
- if (ret < 0) {
- pr_err("unable to register\n");
- dscc4_release_ring(dpriv);
- goto err_unregister;
- }
- }
-
- ret = dscc4_set_quartz(root, quartz);
- if (ret < 0)
- goto err_unregister;
-
- pci_set_drvdata(pdev, ppriv);
- return ret;
-
-err_unregister:
- while (i-- > 0) {
- dscc4_release_ring(root + i);
- unregister_hdlc_device(dscc4_to_dev(root + i));
- }
- kfree(ppriv);
- i = dev_per_card;
-err_free_dev:
- while (i-- > 0)
- free_netdev(root[i].dev);
- kfree(root);
-err_out:
- return ret;
-};
-
-static void dscc4_tx_timeout(struct net_device *dev)
-{
- /* FIXME: something is missing there */
-}
-
-static int dscc4_loopback_check(struct dscc4_dev_priv *dpriv)
-{
- sync_serial_settings *settings = &dpriv->settings;
-
- if (settings->loopback && (settings->clock_type != CLOCK_INT)) {
- struct net_device *dev = dscc4_to_dev(dpriv);
-
- netdev_info(dev, "loopback requires clock\n");
- return -1;
- }
- return 0;
-}
-
-#ifdef CONFIG_DSCC4_PCI_RST
-/*
- * Some DSCC4-based cards wires the GPIO port and the PCI #RST pin together
- * so as to provide a safe way to reset the asic while not the whole machine
- * rebooting.
- *
- * This code doesn't need to be efficient. Keep It Simple
- */
-static void dscc4_pci_reset(struct pci_dev *pdev, void __iomem *ioaddr)
-{
- int i;
-
- mutex_lock(&dscc4_mutex);
- for (i = 0; i < 16; i++)
- pci_read_config_dword(pdev, i << 2, dscc4_pci_config_store + i);
-
- /* Maximal LBI clock divider (who cares ?) and whole GPIO range. */
- writel(0x001c0000, ioaddr + GMODE);
- /* Configure GPIO port as output */
- writel(0x0000ffff, ioaddr + GPDIR);
- /* Disable interruption */
- writel(0x0000ffff, ioaddr + GPIM);
-
- writel(0x0000ffff, ioaddr + GPDATA);
- writel(0x00000000, ioaddr + GPDATA);
-
- /* Flush posted writes */
- readl(ioaddr + GSTAR);
-
- schedule_timeout_uninterruptible(msecs_to_jiffies(100));
-
- for (i = 0; i < 16; i++)
- pci_write_config_dword(pdev, i << 2, dscc4_pci_config_store[i]);
- mutex_unlock(&dscc4_mutex);
-}
-#else
-#define dscc4_pci_reset(pdev,ioaddr) do {} while (0)
-#endif /* CONFIG_DSCC4_PCI_RST */
-
-static int dscc4_open(struct net_device *dev)
-{
- struct dscc4_dev_priv *dpriv = dscc4_priv(dev);
- int ret = -EAGAIN;
-
- if ((dscc4_loopback_check(dpriv) < 0))
- goto err;
-
- if ((ret = hdlc_open(dev)))
- goto err;
-
- /*
- * Due to various bugs, there is no way to reliably reset a
- * specific port (manufacturer's dependent special PCI #RST wiring
- * apart: it affects all ports). Thus the device goes in the best
- * silent mode possible at dscc4_close() time and simply claims to
- * be up if it's opened again. It still isn't possible to change
- * the HDLC configuration without rebooting but at least the ports
- * can be up/down ifconfig'ed without killing the host.
- */
- if (dpriv->flags & FakeReset) {
- dpriv->flags &= ~FakeReset;
- scc_patchl(0, PowerUp, dpriv, dev, CCR0);
- scc_patchl(0, 0x00050000, dpriv, dev, CCR2);
- scc_writel(EventsMask, dpriv, dev, IMR);
- netdev_info(dev, "up again\n");
- goto done;
- }
-
- /* IDT+IDR during XPR */
- dpriv->flags = NeedIDR | NeedIDT;
-
- scc_patchl(0, PowerUp | Vis, dpriv, dev, CCR0);
-
- /*
- * The following is a bit paranoid...
- *
- * NB: the datasheet "...CEC will stay active if the SCC is in
- * power-down mode or..." and CCR2.RAC = 1 are two different
- * situations.
- */
- if (scc_readl_star(dpriv, dev) & SccBusy) {
- netdev_err(dev, "busy - try later\n");
- ret = -EAGAIN;
- goto err_out;
- } else
- netdev_info(dev, "available - good\n");
-
- scc_writel(EventsMask, dpriv, dev, IMR);
-
- /* Posted write is flushed in the wait_ack loop */
- scc_writel(TxSccRes | RxSccRes, dpriv, dev, CMDR);
-
- if ((ret = dscc4_wait_ack_cec(dpriv, dev, "Cec")) < 0)
- goto err_disable_scc_events;
-
- /*
- * I would expect XPR near CE completion (before ? after ?).
- * At worst, this code won't see a late XPR and people
- * will have to re-issue an ifconfig (this is harmless).
- * WARNING, a really missing XPR usually means a hardware
- * reset is needed. Suggestions anyone ?
- */
- if ((ret = dscc4_xpr_ack(dpriv)) < 0) {
- pr_err("XPR timeout\n");
- goto err_disable_scc_events;
- }
-
- if (debug > 2)
- dscc4_tx_print(dev, dpriv, "Open");
-
-done:
- netif_start_queue(dev);
-
- netif_carrier_on(dev);
-
- return 0;
-
-err_disable_scc_events:
- scc_writel(0xffffffff, dpriv, dev, IMR);
- scc_patchl(PowerUp | Vis, 0, dpriv, dev, CCR0);
-err_out:
- hdlc_close(dev);
-err:
- return ret;
-}
-
-#ifdef DSCC4_POLLING
-static int dscc4_tx_poll(struct dscc4_dev_priv *dpriv, struct net_device *dev)
-{
- /* FIXME: it's gonna be easy (TM), for sure */
-}
-#endif /* DSCC4_POLLING */
-
-static netdev_tx_t dscc4_start_xmit(struct sk_buff *skb,
- struct net_device *dev)
-{
- struct dscc4_dev_priv *dpriv = dscc4_priv(dev);
- struct device *d = &dpriv->pci_priv->pdev->dev;
- struct TxFD *tx_fd;
- dma_addr_t addr;
- int next;
-
- addr = dma_map_single(d, skb->data, skb->len, DMA_TO_DEVICE);
- if (dma_mapping_error(d, addr)) {
- dev_kfree_skb_any(skb);
- dev->stats.tx_dropped++;
- return NETDEV_TX_OK;
- }
-
- next = dpriv->tx_current%TX_RING_SIZE;
- dpriv->tx_skbuff[next] = skb;
- tx_fd = dpriv->tx_fd + next;
- tx_fd->state = FrameEnd | TO_STATE_TX(skb->len);
- tx_fd->data = cpu_to_le32(addr);
- tx_fd->complete = 0x00000000;
- tx_fd->jiffies = jiffies;
- mb();
-
-#ifdef DSCC4_POLLING
- spin_lock(&dpriv->lock);
- while (dscc4_tx_poll(dpriv, dev));
- spin_unlock(&dpriv->lock);
-#endif
-
- if (debug > 2)
- dscc4_tx_print(dev, dpriv, "Xmit");
- /* To be cleaned(unsigned int)/optimized. Later, ok ? */
- if (!((++dpriv->tx_current - dpriv->tx_dirty)%TX_RING_SIZE))
- netif_stop_queue(dev);
-
- if (dscc4_tx_quiescent(dpriv, dev))
- dscc4_do_tx(dpriv, dev);
-
- return NETDEV_TX_OK;
-}
-
-static int dscc4_close(struct net_device *dev)
-{
- struct dscc4_dev_priv *dpriv = dscc4_priv(dev);
-
- netif_stop_queue(dev);
-
- scc_patchl(PowerUp | Vis, 0, dpriv, dev, CCR0);
- scc_patchl(0x00050000, 0, dpriv, dev, CCR2);
- scc_writel(0xffffffff, dpriv, dev, IMR);
-
- dpriv->flags |= FakeReset;
-
- hdlc_close(dev);
-
- return 0;
-}
-
-static inline int dscc4_check_clock_ability(int port)
-{
- int ret = 0;
-
-#ifdef CONFIG_DSCC4_PCISYNC
- if (port >= 2)
- ret = -1;
-#endif
- return ret;
-}
-
-/*
- * DS1 p.137: "There are a total of 13 different clocking modes..."
- * ^^
- * Design choices:
- * - by default, assume a clock is provided on pin RxClk/TxClk (clock mode 0a).
- * Clock mode 3b _should_ work but the testing seems to make this point
- * dubious (DIY testing requires setting CCR0 at 0x00000033).
- * This is supposed to provide least surprise "DTE like" behavior.
- * - if line rate is specified, clocks are assumed to be locally generated.
- * A quartz must be available (on pin XTAL1). Modes 6b/7b are used. Choosing
- * between these it automagically done according on the required frequency
- * scaling. Of course some rounding may take place.
- * - no high speed mode (40Mb/s). May be trivial to do but I don't have an
- * appropriate external clocking device for testing.
- * - no time-slot/clock mode 5: shameless laziness.
- *
- * The clock signals wiring can be (is ?) manufacturer dependent. Good luck.
- *
- * BIG FAT WARNING: if the device isn't provided enough clocking signal, it
- * won't pass the init sequence. For example, straight back-to-back DTE without
- * external clock will fail when dscc4_open() (<- 'ifconfig hdlcx xxx') is
- * called.
- *
- * Typos lurk in datasheet (missing divier in clock mode 7a figure 51 p.153
- * DS0 for example)
- *
- * Clock mode related bits of CCR0:
- * +------------ TOE: output TxClk (0b/2b/3a/3b/6b/7a/7b only)
- * | +---------- SSEL: sub-mode select 0 -> a, 1 -> b
- * | | +-------- High Speed: say 0
- * | | | +-+-+-- Clock Mode: 0..7
- * | | | | | |
- * -+-+-+-+-+-+-+-+
- * x|x|5|4|3|2|1|0| lower bits
- *
- * Division factor of BRR: k = (N+1)x2^M (total divider = 16xk in mode 6b)
- * +-+-+-+------------------ M (0..15)
- * | | | | +-+-+-+-+-+-- N (0..63)
- * 0 0 0 0 | | | | 0 0 | | | | | |
- * ...-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- * f|e|d|c|b|a|9|8|7|6|5|4|3|2|1|0| lower bits
- *
- */
-static int dscc4_set_clock(struct net_device *dev, u32 *bps, u32 *state)
-{
- struct dscc4_dev_priv *dpriv = dscc4_priv(dev);
- int ret = -1;
- u32 brr;
-
- *state &= ~Ccr0ClockMask;
- if (*bps) { /* Clock generated - required for DCE */
- u32 n = 0, m = 0, divider;
- int xtal;
-
- xtal = dpriv->pci_priv->xtal_hz;
- if (!xtal)
- goto done;
- if (dscc4_check_clock_ability(dpriv->dev_id) < 0)
- goto done;
- divider = xtal / *bps;
- if (divider > BRR_DIVIDER_MAX) {
- divider >>= 4;
- *state |= 0x00000036; /* Clock mode 6b (BRG/16) */
- } else
- *state |= 0x00000037; /* Clock mode 7b (BRG) */
- if (divider >> 22) {
- n = 63;
- m = 15;
- } else if (divider) {
- /* Extraction of the 6 highest weighted bits */
- m = 0;
- while (0xffffffc0 & divider) {
- m++;
- divider >>= 1;
- }
- n = divider;
- }
- brr = (m << 8) | n;
- divider = n << m;
- if (!(*state & 0x00000001)) /* ?b mode mask => clock mode 6b */
- divider <<= 4;
- *bps = xtal / divider;
- } else {
- /*
- * External clock - DTE
- * "state" already reflects Clock mode 0a (CCR0 = 0xzzzzzz00).
- * Nothing more to be done
- */
- brr = 0;
- }
- scc_writel(brr, dpriv, dev, BRR);
- ret = 0;
-done:
- return ret;
-}
-
-static int dscc4_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
-{
- sync_serial_settings __user *line = ifr->ifr_settings.ifs_ifsu.sync;
- struct dscc4_dev_priv *dpriv = dscc4_priv(dev);
- const size_t size = sizeof(dpriv->settings);
- int ret = 0;
-
- if (dev->flags & IFF_UP)
- return -EBUSY;
-
- if (cmd != SIOCWANDEV)
- return -EOPNOTSUPP;
-
- switch(ifr->ifr_settings.type) {
- case IF_GET_IFACE:
- ifr->ifr_settings.type = IF_IFACE_SYNC_SERIAL;
- if (ifr->ifr_settings.size < size) {
- ifr->ifr_settings.size = size; /* data size wanted */
- return -ENOBUFS;
- }
- if (copy_to_user(line, &dpriv->settings, size))
- return -EFAULT;
- break;
-
- case IF_IFACE_SYNC_SERIAL:
- if (!capable(CAP_NET_ADMIN))
- return -EPERM;
-
- if (dpriv->flags & FakeReset) {
- netdev_info(dev, "please reset the device before this command\n");
- return -EPERM;
- }
- if (copy_from_user(&dpriv->settings, line, size))
- return -EFAULT;
- ret = dscc4_set_iface(dpriv, dev);
- break;
-
- default:
- ret = hdlc_ioctl(dev, ifr, cmd);
- break;
- }
-
- return ret;
-}
-
-static int dscc4_match(const struct thingie *p, int value)
-{
- int i;
-
- for (i = 0; p[i].define != -1; i++) {
- if (value == p[i].define)
- break;
- }
- if (p[i].define == -1)
- return -1;
- else
- return i;
-}
-
-static int dscc4_clock_setting(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- sync_serial_settings *settings = &dpriv->settings;
- int ret = -EOPNOTSUPP;
- u32 bps, state;
-
- bps = settings->clock_rate;
- state = scc_readl(dpriv, CCR0);
- if (dscc4_set_clock(dev, &bps, &state) < 0)
- goto done;
- if (bps) { /* DCE */
- printk(KERN_DEBUG "%s: generated RxClk (DCE)\n", dev->name);
- if (settings->clock_rate != bps) {
- printk(KERN_DEBUG "%s: clock adjusted (%08d -> %08d)\n",
- dev->name, settings->clock_rate, bps);
- settings->clock_rate = bps;
- }
- } else { /* DTE */
- state |= PowerUp | Vis;
- printk(KERN_DEBUG "%s: external RxClk (DTE)\n", dev->name);
- }
- scc_writel(state, dpriv, dev, CCR0);
- ret = 0;
-done:
- return ret;
-}
-
-static int dscc4_encoding_setting(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- static const struct thingie encoding[] = {
- { ENCODING_NRZ, 0x00000000 },
- { ENCODING_NRZI, 0x00200000 },
- { ENCODING_FM_MARK, 0x00400000 },
- { ENCODING_FM_SPACE, 0x00500000 },
- { ENCODING_MANCHESTER, 0x00600000 },
- { -1, 0}
- };
- int i, ret = 0;
-
- i = dscc4_match(encoding, dpriv->encoding);
- if (i >= 0)
- scc_patchl(EncodingMask, encoding[i].bits, dpriv, dev, CCR0);
- else
- ret = -EOPNOTSUPP;
- return ret;
-}
-
-static int dscc4_loopback_setting(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- sync_serial_settings *settings = &dpriv->settings;
- u32 state;
-
- state = scc_readl(dpriv, CCR1);
- if (settings->loopback) {
- printk(KERN_DEBUG "%s: loopback\n", dev->name);
- state |= 0x00000100;
- } else {
- printk(KERN_DEBUG "%s: normal\n", dev->name);
- state &= ~0x00000100;
- }
- scc_writel(state, dpriv, dev, CCR1);
- return 0;
-}
-
-static int dscc4_crc_setting(struct dscc4_dev_priv *dpriv,
- struct net_device *dev)
-{
- static const struct thingie crc[] = {
- { PARITY_CRC16_PR0_CCITT, 0x00000010 },
- { PARITY_CRC16_PR1_CCITT, 0x00000000 },
- { PARITY_CRC32_PR0_CCITT, 0x00000011 },
- { PARITY_CRC32_PR1_CCITT, 0x00000001 }
- };
- int i, ret = 0;
-
- i = dscc4_match(crc, dpriv->parity);
- if (i >= 0)
- scc_patchl(CrcMask, crc[i].bits, dpriv, dev, CCR1);
- else
- ret = -EOPNOTSUPP;
- return ret;
-}
-
-static int dscc4_set_iface(struct dscc4_dev_priv *dpriv, struct net_device *dev)
-{
- struct {
- int (*action)(struct dscc4_dev_priv *, struct net_device *);
- } *p, do_setting[] = {
- { dscc4_encoding_setting },
- { dscc4_clock_setting },
- { dscc4_loopback_setting },
- { dscc4_crc_setting },
- { NULL }
- };
- int ret = 0;
-
- for (p = do_setting; p->action; p++) {
- if ((ret = p->action(dpriv, dev)) < 0)
- break;
- }
- return ret;
-}
-
-static irqreturn_t dscc4_irq(int irq, void *token)
-{
- struct dscc4_dev_priv *root = token;
- struct dscc4_pci_priv *priv;
- struct net_device *dev;
- void __iomem *ioaddr;
- u32 state;
- unsigned long flags;
- int i, handled = 1;
-
- priv = root->pci_priv;
- dev = dscc4_to_dev(root);
-
- spin_lock_irqsave(&priv->lock, flags);
-
- ioaddr = root->base_addr;
-
- state = readl(ioaddr + GSTAR);
- if (!state) {
- handled = 0;
- goto out;
- }
- if (debug > 3)
- printk(KERN_DEBUG "%s: GSTAR = 0x%08x\n", DRV_NAME, state);
- writel(state, ioaddr + GSTAR);
-
- if (state & Arf) {
- netdev_err(dev, "failure (Arf). Harass the maintainer\n");
- goto out;
- }
- state &= ~ArAck;
- if (state & Cfg) {
- if (debug > 0)
- printk(KERN_DEBUG "%s: CfgIV\n", DRV_NAME);
- if (priv->iqcfg[priv->cfg_cur++%IRQ_RING_SIZE] & cpu_to_le32(Arf))
- netdev_err(dev, "CFG failed\n");
- if (!(state &= ~Cfg))
- goto out;
- }
- if (state & RxEvt) {
- i = dev_per_card - 1;
- do {
- dscc4_rx_irq(priv, root + i);
- } while (--i >= 0);
- state &= ~RxEvt;
- }
- if (state & TxEvt) {
- i = dev_per_card - 1;
- do {
- dscc4_tx_irq(priv, root + i);
- } while (--i >= 0);
- state &= ~TxEvt;
- }
-out:
- spin_unlock_irqrestore(&priv->lock, flags);
- return IRQ_RETVAL(handled);
-}
-
-static void dscc4_tx_irq(struct dscc4_pci_priv *ppriv,
- struct dscc4_dev_priv *dpriv)
-{
- struct net_device *dev = dscc4_to_dev(dpriv);
- u32 state;
- int cur, loop = 0;
-
-try:
- cur = dpriv->iqtx_current%IRQ_RING_SIZE;
- state = le32_to_cpu(dpriv->iqtx[cur]);
- if (!state) {
- if (debug > 4)
- printk(KERN_DEBUG "%s: Tx ISR = 0x%08x\n", dev->name,
- state);
- if ((debug > 1) && (loop > 1))
- printk(KERN_DEBUG "%s: Tx irq loop=%d\n", dev->name, loop);
- if (loop && netif_queue_stopped(dev))
- if ((dpriv->tx_current - dpriv->tx_dirty)%TX_RING_SIZE)
- netif_wake_queue(dev);
-
- if (netif_running(dev) && dscc4_tx_quiescent(dpriv, dev) &&
- !dscc4_tx_done(dpriv))
- dscc4_do_tx(dpriv, dev);
- return;
- }
- loop++;
- dpriv->iqtx[cur] = 0;
- dpriv->iqtx_current++;
-
- if (state_check(state, dpriv, dev, "Tx") < 0)
- return;
-
- if (state & SccEvt) {
- if (state & Alls) {
- struct sk_buff *skb;
- struct TxFD *tx_fd;
-
- if (debug > 2)
- dscc4_tx_print(dev, dpriv, "Alls");
- /*
- * DataComplete can't be trusted for Tx completion.
- * Cf errata DS5 p.8
- */
- cur = dpriv->tx_dirty%TX_RING_SIZE;
- tx_fd = dpriv->tx_fd + cur;
- skb = dpriv->tx_skbuff[cur];
- if (skb) {
- dma_unmap_single(&ppriv->pdev->dev,
- le32_to_cpu(tx_fd->data),
- skb->len, DMA_TO_DEVICE);
- if (tx_fd->state & FrameEnd) {
- dev->stats.tx_packets++;
- dev->stats.tx_bytes += skb->len;
- }
- dev_consume_skb_irq(skb);
- dpriv->tx_skbuff[cur] = NULL;
- ++dpriv->tx_dirty;
- } else {
- if (debug > 1)
- netdev_err(dev, "Tx: NULL skb %d\n",
- cur);
- }
- /*
- * If the driver ends sending crap on the wire, it
- * will be way easier to diagnose than the (not so)
- * random freeze induced by null sized tx frames.
- */
- tx_fd->data = tx_fd->next;
- tx_fd->state = FrameEnd | TO_STATE_TX(2*DUMMY_SKB_SIZE);
- tx_fd->complete = 0x00000000;
- tx_fd->jiffies = 0;
-
- if (!(state &= ~Alls))
- goto try;
- }
- /*
- * Transmit Data Underrun
- */
- if (state & Xdu) {
- netdev_err(dev, "Tx Data Underrun. Ask maintainer\n");
- dpriv->flags = NeedIDT;
- /* Tx reset */
- writel(MTFi | Rdt,
- dpriv->base_addr + 0x0c*dpriv->dev_id + CH0CFG);
- writel(Action, dpriv->base_addr + GCMDR);
- return;
- }
- if (state & Cts) {
- netdev_info(dev, "CTS transition\n");
- if (!(state &= ~Cts)) /* DEBUG */
- goto try;
- }
- if (state & Xmr) {
- /* Frame needs to be sent again - FIXME */
- netdev_err(dev, "Tx ReTx. Ask maintainer\n");
- if (!(state &= ~Xmr)) /* DEBUG */
- goto try;
- }
- if (state & Xpr) {
- void __iomem *scc_addr;
- unsigned long ring;
- unsigned int i;
-
- /*
- * - the busy condition happens (sometimes);
- * - it doesn't seem to make the handler unreliable.
- */
- for (i = 1; i; i <<= 1) {
- if (!(scc_readl_star(dpriv, dev) & SccBusy))
- break;
- }
- if (!i)
- netdev_info(dev, "busy in irq\n");
-
- scc_addr = dpriv->base_addr + 0x0c*dpriv->dev_id;
- /* Keep this order: IDT before IDR */
- if (dpriv->flags & NeedIDT) {
- if (debug > 2)
- dscc4_tx_print(dev, dpriv, "Xpr");
- ring = dpriv->tx_fd_dma +
- (dpriv->tx_dirty%TX_RING_SIZE)*
- sizeof(struct TxFD);
- writel(ring, scc_addr + CH0BTDA);
- dscc4_do_tx(dpriv, dev);
- writel(MTFi | Idt, scc_addr + CH0CFG);
- if (dscc4_do_action(dev, "IDT") < 0)
- goto err_xpr;
- dpriv->flags &= ~NeedIDT;
- }
- if (dpriv->flags & NeedIDR) {
- ring = dpriv->rx_fd_dma +
- (dpriv->rx_current%RX_RING_SIZE)*
- sizeof(struct RxFD);
- writel(ring, scc_addr + CH0BRDA);
- dscc4_rx_update(dpriv, dev);
- writel(MTFi | Idr, scc_addr + CH0CFG);
- if (dscc4_do_action(dev, "IDR") < 0)
- goto err_xpr;
- dpriv->flags &= ~NeedIDR;
- smp_wmb();
- /* Activate receiver and misc */
- scc_writel(0x08050008, dpriv, dev, CCR2);
- }
- err_xpr:
- if (!(state &= ~Xpr))
- goto try;
- }
- if (state & Cd) {
- if (debug > 0)
- netdev_info(dev, "CD transition\n");
- if (!(state &= ~Cd)) /* DEBUG */
- goto try;
- }
- } else { /* ! SccEvt */
- if (state & Hi) {
-#ifdef DSCC4_POLLING
- while (!dscc4_tx_poll(dpriv, dev));
-#endif
- netdev_info(dev, "Tx Hi\n");
- state &= ~Hi;
- }
- if (state & Err) {
- netdev_info(dev, "Tx ERR\n");
- dev->stats.tx_errors++;
- state &= ~Err;
- }
- }
- goto try;
-}
-
-static void dscc4_rx_irq(struct dscc4_pci_priv *priv,
- struct dscc4_dev_priv *dpriv)
-{
- struct net_device *dev = dscc4_to_dev(dpriv);
- u32 state;
- int cur;
-
-try:
- cur = dpriv->iqrx_current%IRQ_RING_SIZE;
- state = le32_to_cpu(dpriv->iqrx[cur]);
- if (!state)
- return;
- dpriv->iqrx[cur] = 0;
- dpriv->iqrx_current++;
-
- if (state_check(state, dpriv, dev, "Rx") < 0)
- return;
-
- if (!(state & SccEvt)){
- struct RxFD *rx_fd;
-
- if (debug > 4)
- printk(KERN_DEBUG "%s: Rx ISR = 0x%08x\n", dev->name,
- state);
- state &= 0x00ffffff;
- if (state & Err) { /* Hold or reset */
- printk(KERN_DEBUG "%s: Rx ERR\n", dev->name);
- cur = dpriv->rx_current%RX_RING_SIZE;
- rx_fd = dpriv->rx_fd + cur;
- /*
- * Presume we're not facing a DMAC receiver reset.
- * As We use the rx size-filtering feature of the
- * DSCC4, the beginning of a new frame is waiting in
- * the rx fifo. I bet a Receive Data Overflow will
- * happen most of time but let's try and avoid it.
- * Btw (as for RDO) if one experiences ERR whereas
- * the system looks rather idle, there may be a
- * problem with latency. In this case, increasing
- * RX_RING_SIZE may help.
- */
- //while (dpriv->rx_needs_refill) {
- while (!(rx_fd->state1 & Hold)) {
- rx_fd++;
- cur++;
- if (!(cur = cur%RX_RING_SIZE))
- rx_fd = dpriv->rx_fd;
- }
- //dpriv->rx_needs_refill--;
- try_get_rx_skb(dpriv, dev);
- if (!rx_fd->data)
- goto try;
- rx_fd->state1 &= ~Hold;
- rx_fd->state2 = 0x00000000;
- rx_fd->end = cpu_to_le32(0xbabeface);
- //}
- goto try;
- }
- if (state & Fi) {
- dscc4_rx_skb(dpriv, dev);
- goto try;
- }
- if (state & Hi ) { /* HI bit */
- netdev_info(dev, "Rx Hi\n");
- state &= ~Hi;
- goto try;
- }
- } else { /* SccEvt */
- if (debug > 1) {
- //FIXME: verifier la presence de tous les evenements
- static struct {
- u32 mask;
- const char *irq_name;
- } evts[] = {
- { 0x00008000, "TIN"},
- { 0x00000020, "RSC"},
- { 0x00000010, "PCE"},
- { 0x00000008, "PLLA"},
- { 0, NULL}
- }, *evt;
-
- for (evt = evts; evt->irq_name; evt++) {
- if (state & evt->mask) {
- printk(KERN_DEBUG "%s: %s\n",
- dev->name, evt->irq_name);
- if (!(state &= ~evt->mask))
- goto try;
- }
- }
- } else {
- if (!(state &= ~0x0000c03c))
- goto try;
- }
- if (state & Cts) {
- netdev_info(dev, "CTS transition\n");
- if (!(state &= ~Cts)) /* DEBUG */
- goto try;
- }
- /*
- * Receive Data Overflow (FIXME: fscked)
- */
- if (state & Rdo) {
- struct RxFD *rx_fd;
- void __iomem *scc_addr;
- int cur;
-
- //if (debug)
- // dscc4_rx_dump(dpriv);
- scc_addr = dpriv->base_addr + 0x0c*dpriv->dev_id;
-
- scc_patchl(RxActivate, 0, dpriv, dev, CCR2);
- /*
- * This has no effect. Why ?
- * ORed with TxSccRes, one sees the CFG ack (for
- * the TX part only).
- */
- scc_writel(RxSccRes, dpriv, dev, CMDR);
- dpriv->flags |= RdoSet;
-
- /*
- * Let's try and save something in the received data.
- * rx_current must be incremented at least once to
- * avoid HOLD in the BRDA-to-be-pointed desc.
- */
- do {
- cur = dpriv->rx_current++%RX_RING_SIZE;
- rx_fd = dpriv->rx_fd + cur;
- if (!(rx_fd->state2 & DataComplete))
- break;
- if (rx_fd->state2 & FrameAborted) {
- dev->stats.rx_over_errors++;
- rx_fd->state1 |= Hold;
- rx_fd->state2 = 0x00000000;
- rx_fd->end = cpu_to_le32(0xbabeface);
- } else
- dscc4_rx_skb(dpriv, dev);
- } while (1);
-
- if (debug > 0) {
- if (dpriv->flags & RdoSet)
- printk(KERN_DEBUG
- "%s: no RDO in Rx data\n", DRV_NAME);
- }
-#ifdef DSCC4_RDO_EXPERIMENTAL_RECOVERY
- /*
- * FIXME: must the reset be this violent ?
- */
-#warning "FIXME: CH0BRDA"
- writel(dpriv->rx_fd_dma +
- (dpriv->rx_current%RX_RING_SIZE)*
- sizeof(struct RxFD), scc_addr + CH0BRDA);
- writel(MTFi|Rdr|Idr, scc_addr + CH0CFG);
- if (dscc4_do_action(dev, "RDR") < 0) {
- netdev_err(dev, "RDO recovery failed(RDR)\n");
- goto rdo_end;
- }
- writel(MTFi|Idr, scc_addr + CH0CFG);
- if (dscc4_do_action(dev, "IDR") < 0) {
- netdev_err(dev, "RDO recovery failed(IDR)\n");
- goto rdo_end;
- }
- rdo_end:
-#endif
- scc_patchl(0, RxActivate, dpriv, dev, CCR2);
- goto try;
- }
- if (state & Cd) {
- netdev_info(dev, "CD transition\n");
- if (!(state &= ~Cd)) /* DEBUG */
- goto try;
- }
- if (state & Flex) {
- printk(KERN_DEBUG "%s: Flex. Ttttt...\n", DRV_NAME);
- if (!(state &= ~Flex))
- goto try;
- }
- }
-}
-
-/*
- * I had expected the following to work for the first descriptor
- * (tx_fd->state = 0xc0000000)
- * - Hold=1 (don't try and branch to the next descripto);
- * - No=0 (I want an empty data section, i.e. size=0);
- * - Fe=1 (required by No=0 or we got an Err irq and must reset).
- * It failed and locked solid. Thus the introduction of a dummy skb.
- * Problem is acknowledged in errata sheet DS5. Joy :o/
- */
-static struct sk_buff *dscc4_init_dummy_skb(struct dscc4_dev_priv *dpriv)
-{
- struct sk_buff *skb;
-
- skb = dev_alloc_skb(DUMMY_SKB_SIZE);
- if (skb) {
- struct device *d = &dpriv->pci_priv->pdev->dev;
- int last = dpriv->tx_dirty%TX_RING_SIZE;
- struct TxFD *tx_fd = dpriv->tx_fd + last;
- dma_addr_t addr;
-
- skb->len = DUMMY_SKB_SIZE;
- skb_copy_to_linear_data(skb, version,
- strlen(version) % DUMMY_SKB_SIZE);
- addr = dma_map_single(d, skb->data, DUMMY_SKB_SIZE,
- DMA_TO_DEVICE);
- if (dma_mapping_error(d, addr)) {
- dev_kfree_skb_any(skb);
- return NULL;
- }
- tx_fd->state = FrameEnd | TO_STATE_TX(DUMMY_SKB_SIZE);
- tx_fd->data = cpu_to_le32(addr);
- dpriv->tx_skbuff[last] = skb;
- }
- return skb;
-}
-
-static int dscc4_init_ring(struct net_device *dev)
-{
- struct dscc4_dev_priv *dpriv = dscc4_priv(dev);
- struct device *d = &dpriv->pci_priv->pdev->dev;
- struct TxFD *tx_fd;
- struct RxFD *rx_fd;
- void *ring;
- int i;
-
- ring = dma_alloc_coherent(d, RX_TOTAL_SIZE, &dpriv->rx_fd_dma,
- GFP_KERNEL);
- if (!ring)
- goto err_out;
- dpriv->rx_fd = rx_fd = (struct RxFD *) ring;
-
- ring = dma_alloc_coherent(d, TX_TOTAL_SIZE, &dpriv->tx_fd_dma,
- GFP_KERNEL);
- if (!ring)
- goto err_free_dma_rx;
- dpriv->tx_fd = tx_fd = (struct TxFD *) ring;
-
- memset(dpriv->tx_skbuff, 0, sizeof(struct sk_buff *)*TX_RING_SIZE);
- dpriv->tx_dirty = 0xffffffff;
- i = dpriv->tx_current = 0;
- do {
- tx_fd->state = FrameEnd | TO_STATE_TX(2*DUMMY_SKB_SIZE);
- tx_fd->complete = 0x00000000;
- /* FIXME: NULL should be ok - to be tried */
- tx_fd->data = cpu_to_le32(dpriv->tx_fd_dma);
- (tx_fd++)->next = cpu_to_le32(dpriv->tx_fd_dma +
- (++i%TX_RING_SIZE)*sizeof(*tx_fd));
- } while (i < TX_RING_SIZE);
-
- if (!dscc4_init_dummy_skb(dpriv))
- goto err_free_dma_tx;
-
- memset(dpriv->rx_skbuff, 0, sizeof(struct sk_buff *)*RX_RING_SIZE);
- i = dpriv->rx_dirty = dpriv->rx_current = 0;
- do {
- /* size set by the host. Multiple of 4 bytes please */
- rx_fd->state1 = HiDesc;
- rx_fd->state2 = 0x00000000;
- rx_fd->end = cpu_to_le32(0xbabeface);
- rx_fd->state1 |= TO_STATE_RX(HDLC_MAX_MRU);
- // FIXME: return value verifiee mais traitement suspect
- if (try_get_rx_skb(dpriv, dev) >= 0)
- dpriv->rx_dirty++;
- (rx_fd++)->next = cpu_to_le32(dpriv->rx_fd_dma +
- (++i%RX_RING_SIZE)*sizeof(*rx_fd));
- } while (i < RX_RING_SIZE);
-
- return 0;
-
-err_free_dma_tx:
- dma_free_coherent(d, TX_TOTAL_SIZE, ring, dpriv->tx_fd_dma);
-err_free_dma_rx:
- dma_free_coherent(d, RX_TOTAL_SIZE, rx_fd, dpriv->rx_fd_dma);
-err_out:
- return -ENOMEM;
-}
-
-static void dscc4_remove_one(struct pci_dev *pdev)
-{
- struct dscc4_pci_priv *ppriv;
- struct dscc4_dev_priv *root;
- void __iomem *ioaddr;
- int i;
-
- ppriv = pci_get_drvdata(pdev);
- root = ppriv->root;
-
- ioaddr = root->base_addr;
-
- dscc4_pci_reset(pdev, ioaddr);
-
- free_irq(pdev->irq, root);
- dma_free_coherent(&pdev->dev, IRQ_RING_SIZE*sizeof(u32), ppriv->iqcfg,
- ppriv->iqcfg_dma);
- for (i = 0; i < dev_per_card; i++) {
- struct dscc4_dev_priv *dpriv = root + i;
-
- dscc4_release_ring(dpriv);
- dma_free_coherent(&pdev->dev, IRQ_RING_SIZE*sizeof(u32),
- dpriv->iqrx, dpriv->iqrx_dma);
- dma_free_coherent(&pdev->dev, IRQ_RING_SIZE*sizeof(u32),
- dpriv->iqtx, dpriv->iqtx_dma);
- }
-
- dscc4_free1(pdev);
-
- iounmap(ioaddr);
-
- pci_release_region(pdev, 1);
- pci_release_region(pdev, 0);
-
- pci_disable_device(pdev);
-}
-
-static int dscc4_hdlc_attach(struct net_device *dev, unsigned short encoding,
- unsigned short parity)
-{
- struct dscc4_dev_priv *dpriv = dscc4_priv(dev);
-
- if (encoding != ENCODING_NRZ &&
- encoding != ENCODING_NRZI &&
- encoding != ENCODING_FM_MARK &&
- encoding != ENCODING_FM_SPACE &&
- encoding != ENCODING_MANCHESTER)
- return -EINVAL;
-
- if (parity != PARITY_NONE &&
- parity != PARITY_CRC16_PR0_CCITT &&
- parity != PARITY_CRC16_PR1_CCITT &&
- parity != PARITY_CRC32_PR0_CCITT &&
- parity != PARITY_CRC32_PR1_CCITT)
- return -EINVAL;
-
- dpriv->encoding = encoding;
- dpriv->parity = parity;
- return 0;
-}
-
-#ifndef MODULE
-static int __init dscc4_setup(char *str)
-{
- int *args[] = { &debug, &quartz, NULL }, **p = args;
-
- while (*p && (get_option(&str, *p) == 2))
- p++;
- return 1;
-}
-
-__setup("dscc4.setup=", dscc4_setup);
-#endif
-
-static const struct pci_device_id dscc4_pci_tbl[] = {
- { PCI_VENDOR_ID_SIEMENS, PCI_DEVICE_ID_SIEMENS_DSCC4,
- PCI_ANY_ID, PCI_ANY_ID, },
- { 0,}
-};
-MODULE_DEVICE_TABLE(pci, dscc4_pci_tbl);
-
-static struct pci_driver dscc4_driver = {
- .name = DRV_NAME,
- .id_table = dscc4_pci_tbl,
- .probe = dscc4_init_one,
- .remove = dscc4_remove_one,
-};
-
-module_pci_driver(dscc4_driver);
--
2.20.1
^ permalink raw reply related
* [PATCH net-next] MAINTAINERS: xen-netback: update my email address
From: Paul Durrant @ 2019-09-13 12:47 UTC (permalink / raw)
To: netdev, xen-devel; +Cc: Paul Durrant, Wei Liu
My Citrix email address will expire shortly.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
--
Cc: Wei Liu <wei.liu@kernel.org>
---
MAINTAINERS | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index e7a47b5210fd..b36d51f0fe5c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -17646,7 +17646,7 @@ F: Documentation/ABI/testing/sysfs-hypervisor-xen
XEN NETWORK BACKEND DRIVER
M: Wei Liu <wei.liu@kernel.org>
-M: Paul Durrant <paul.durrant@citrix.com>
+M: Paul Durrant <paul@xen.org>
L: xen-devel@lists.xenproject.org (moderated for non-subscribers)
L: netdev@vger.kernel.org
S: Supported
--
2.20.1.2.gb21ebb671
^ permalink raw reply related
* Re: [PATCH 0/7] net: dsa: mv88e6xxx: features to handle network storms
From: Robert Beckett @ 2019-09-13 12:47 UTC (permalink / raw)
To: Florian Fainelli, Ido Schimmel
Cc: netdev@vger.kernel.org, Andrew Lunn, Vivien Didelot,
David S. Miller, Jiri Pirko, bob.beckett
In-Reply-To: <f19c4586-9d5b-8133-b659-fda51fb5c3b4@gmail.com>
On Thu, 2019-09-12 at 10:41 -0700, Florian Fainelli wrote:
> On 9/12/19 9:46 AM, Robert Beckett wrote:
> > On Thu, 2019-09-12 at 09:25 -0700, Florian Fainelli wrote:
> > > On 9/12/19 2:03 AM, Ido Schimmel wrote:
> > > > On Wed, Sep 11, 2019 at 12:49:03PM +0100, Robert Beckett wrote:
> > > > > On Wed, 2019-09-11 at 11:21 +0000, Ido Schimmel wrote:
> > > > > > On Tue, Sep 10, 2019 at 09:49:46AM -0700, Florian Fainelli
> > > > > > wrote:
> > > > > > > +Ido, Jiri,
> > > > > > >
> > > > > > > On 9/10/19 8:41 AM, Robert Beckett wrote:
> > > > > > > > This patch-set adds support for some features of the
> > > > > > > > Marvell
> > > > > > > > switch
> > > > > > > > chips that can be used to handle packet storms.
> > > > > > > >
> > > > > > > > The rationale for this was a setup that requires the
> > > > > > > > ability to
> > > > > > > > receive
> > > > > > > > traffic from one port, while a packet storm is occuring
> > > > > > > > on
> > > > > > > > another port
> > > > > > > > (via an external switch with a deliberate loop). This
> > > > > > > > is
> > > > > > > > needed
> > > > > > > > to
> > > > > > > > ensure vital data delivery from a specific port, while
> > > > > > > > mitigating
> > > > > > > > any
> > > > > > > > loops or DoS that a user may introduce on another port
> > > > > > > > (can't
> > > > > > > > guarantee
> > > > > > > > sensible users).
> > > > > > >
> > > > > > > The use case is reasonable, but the implementation is not
> > > > > > > really.
> > > > > > > You
> > > > > > > are using Device Tree which is meant to describe hardware
> > > > > > > as
> > > > > > > a
> > > > > > > policy
> > > > > > > holder for setting up queue priorities and likewise for
> > > > > > > queue
> > > > > > > scheduling.
> > > > > > >
> > > > > > > The tool that should be used for that purpose is tc and
> > > > > > > possibly an
> > > > > > > appropriately offloaded queue scheduler in order to map
> > > > > > > the
> > > > > > > desired
> > > > > > > scheduling class to what the hardware supports.
> > > > > > >
> > > > > > > Jiri, Ido, how do you guys support this with mlxsw?
> > > > > >
> > > > > > Hi Florian,
> > > > > >
> > > > > > Are you referring to policing traffic towards the CPU using
> > > > > > a
> > > > > > policer
> > > > > > on
> > > > > > the egress of the CPU port? At least that's what I
> > > > > > understand
> > > > > > from
> > > > > > the
> > > > > > description of patch 6 below.
> > > > > >
> > > > > > If so, mlxsw sets policers for different traffic types
> > > > > > during
> > > > > > its
> > > > > > initialization sequence. These policers are not exposed to
> > > > > > the
> > > > > > user
> > > > > > nor
> > > > > > configurable. While the default settings are good for most
> > > > > > users, we
> > > > > > do
> > > > > > want to allow users to change these and expose current
> > > > > > settings.
> > > > > >
> > > > > > I agree that tc seems like the right choice, but the
> > > > > > question
> > > > > > is
> > > > > > where
> > > > > > are we going to install the filters?
> > > > > >
> > > > >
> > > > > Before I go too far down the rabbit hole of tc traffic
> > > > > shaping,
> > > > > maybe
> > > > > it would be good to explain in more detail the problem I am
> > > > > trying to
> > > > > solve.
> > > > >
> > > > > We have a setup as follows:
> > > > >
> > > > > Marvell 88E6240 switch chip, accepting traffic from 4 ports.
> > > > > Port
> > > > > 1
> > > > > (P1) is critical priority, no dropped packets allowed, all
> > > > > others
> > > > > can
> > > > > be best effort.
> > > > >
> > > > > CPU port of swtich chip is connected via phy to phy of intel
> > > > > i210
> > > > > (igb
> > > > > driver).
> > > > >
> > > > > i210 is connected via pcie switch to imx6.
> > > > >
> > > > > When too many small packets attempt to be delivered to CPU
> > > > > port
> > > > > (e.g.
> > > > > during broadcast flood) we saw dropped packets.
> > > > >
> > > > > The packets were being received by i210 in to rx descriptor
> > > > > buffer
> > > > > fine, but the CPU could not keep up with the load. We saw
> > > > > rx_fifo_errors increasing rapidly and ksoftirqd at ~100% CPU.
> > > > >
> > > > >
> > > > > With this in mind, I am wondering whether any amount of tc
> > > > > traffic
> > > > > shaping would help? Would tc shaping require that the packet
> > > > > reception
> > > > > manages to keep up before it can enact its policies? Does the
> > > > > infrastructure have accelerator offload hooks to be able to
> > > > > apply
> > > > > it
> > > > > via HW? I dont see how it would be able to inspect the
> > > > > packets to
> > > > > apply
> > > > > filtering if they were dropped due to rx descriptor
> > > > > exhaustion.
> > > > > (please
> > > > > bear with me with the basic questions, I am not familiar with
> > > > > this part
> > > > > of the stack).
> > > > >
> > > > > Assuming that tc is still the way to go, after a brief look
> > > > > in to
> > > > > the
> > > > > man pages and the documentation at largc.org, it seems like
> > > > > it
> > > > > would
> > > > > need to use the ingress qdisc, with some sort of system to
> > > > > segregate
> > > > > and priortise based on ingress port. Is this possible?
> > > >
> > > > Hi Robert,
> > > >
> > > > As I see it, you have two problems here:
> > > >
> > > > 1. Classification: Based on ingress port in your case
> > > >
> > > > 2. Scheduling: How to schedule between the different
> > > > transmission
> > > > queues
> > > >
> > > > Where the port from which the packets should egress is the CPU
> > > > port,
> > > > before they cross the PCI towards the imx6.
> > > >
> > > > Both of these issues can be solved by tc. The main problem is
> > > > that
> > > > today
> > > > we do not have a netdev to represent the CPU port and therefore
> > > > can't
> > > > use existing infra like tc. I believe we need to create one.
> > > > Besides
> > > > scheduling, we can also use it to permit/deny certain traffic
> > > > from
> > > > reaching the CPU and perform policing.
> > >
> > > We do not necessarily have to create a CPU netdev, we can overlay
> > > netdev
> > > operations onto the DSA master interface (fec in that case), and
> > > whenever you configure the DSA master interface, we also call
> > > back
> > > into
> > > the switch side for the CPU port. This is not necessarily the
> > > cleanest
> > > way to do things, but that is how we support ethtool operations
> > > (and
> > > some netdev operations incidentally), and it works
> >
> > After reading up on tc, I am not sure how this would work given the
> > semantics of the tool currently.
> >
> > My initial thought was to model the switch's 4 output queues using
> > an
> > mqprio qdisc for the CPU port, and then use either iptables's
> > classify
> > module on the input ports to set which queue it egresses from on
> > the
> > CPU port, or use vlan tagging with id 0 and priority set. (with the
> > many detail of how to implement them still left to discover).
> >
> > However, it looks like the mqprio qdisc could only be used for
> > egress,
> > so without a netdev representing the CPU port, I dont know how it
> > could
> > be used.
>
> If you are looking at mapping your DSA master/CPU port egress queues
> to
> actual switch egress queues, you can look at what bcm_sf2.c and
> bcmsysport.c do and read the commit messages that introduced the
> mapping
> functionality for background on why this was done. In a nutshell, the
> hardware has the ability to back pressure the Ethernet MAC behind the
> CPU port in order to automatically rate limit the egress out of the
> switch. So for instance, if your CPU tries to send 1Gb/sec of traffic
> to
> a port that is linked to a link partner at 100Mbits/sec, there is out
> of
> band information between the switch and the Ethernet DMA of the CPU
> port
> to pace the TX completion interrupt rate to match 100Mbits/sec.
>
> This is going to be different for you here obviously because the
> hardware has not been specifically designed for that, so you do need
> to
> rely on more standard constructs, like actual egress QoS on both
> ends.
>
> >
> > Another thing I thought of using was just to use iptable's TOS
> > module
> > to set the minimal delay bit and rely on default behaviours, but
> > Ive
> > yet to find anything in the Marvell manual that indicates it could
> > set
> > that bit on all frames entering a port.
> >
> > Another option might be to use vlans with their priority bits being
> > used to steer to output queues, but I really dont want to introduce
> > more virtual interfaces in to the setup, and I cant see how to
> > configure an enforce default vlan tag with id 0 and priority bits
> > set
> > via linux userland tools.
> >
> >
> > It does look like tc would be quite nice for configuring the egress
> > rate limiting assuming we a netdev to target with the rate controls
> > of
> > the qdisc.
> >
> >
> > So far, this seems like I am trying to shoe horn this stuff in to
> > tc.
> > It seems like tc is meant to configure how the ip stack configures
> > flow within the stack, whereas in a switch chip, the packets go
> > nowhere
> > near the CPUs kernel ip stack. I cant help thinking that it would
> > be
> > good have a specific utility for configuring switches that operates
> > on
> > the port level for manage flow within the chip, or maybe simple
> > sysfs
> > attributes to set the ports priority.
>
> I am not looking at tc the same way you are doing, tc is just the
> tool
> to configure all QoS/ingress/egress related operations on a network
> device. Whether that network device can offload some of the TC
> operations or not is where things get interesting.
>
> TC has ingress filtering support, which is what you could use for
> offloading broadcast storms, I would imagine that the following
> should
> be possible to be offloaded (this is not a working command but you
> get
> the idea):
>
> tc qdisc add dev sw0p0 handle ffff: ingress
> tc filter add dev sw0p0 parent ffff: protocol ip prio 1 u32 match
> ether
> src 0xfffffffffffff police rate 100k burst 10k skip_sw
>
> something along those lines is how I would implement ingress rate
> limiting leveraging what the switch could do. This might mean adding
> support for offloading specific TC filters, Jiri and Ido can
> certainly
> suggest a cleverer way of achieving that same functionality.
Thanks for your thoughts on this, its been very helpful in leanring the
stack and coming up with ideas for a better design.
I wrote up a set of high level options for discussions internally, and
would appreciate any feedback you had on them:
To get this upstreamed, I think we will need something like the
following high level design:
1. Handle egress rate limiting
1.1. Add frames per second as a rate metric usable throughout
tc and associated kernel interfaces.
1.2. Handle any changes required to make a command like this
work:
tc qdisc add dev enp4s0 handle ffff: ingress
tc filter add dev enp4s0 parent ffff: protocol ip prio 1 u32 match
ether src 0xfffffffffffffffff police rate 50kfrm burst 10kfrm
This should mostly already work as a valid command, maybe some
changes for handling the new frame based rates.
1.3. Add tc bindings in dsa driver that hook in to the parent
netdev's tc bindings (similar to how the ethtool bindings hook in to
the parent's ethtool bindings) to setup the HW egress rate limiting as
done in the existing patch.
2. Add ability to set output queue scheduling algorithm
Currently no netdev is created for the CPU port, so it can't be
targeted by tc or any of the other userland utilities. We need to be
able to set settings for that port. Currently I can think of 2 options:
2.1. Add a netdev to represent the CPU port. This will likely
face objections from some people upstream, though it has already been
suggested as a way to handle this by others upstream.
This would likely require a lot of effort and learning to figure out
how to do this in a way that doesn't start to break key assumptions
with the rest of dsa (like the CPU port not having its own IP address).
If this were achieved, we could then do one of the following:
2.1.1. Use mqprio disc to model the 4 output queues with a new
parameter to select the scheduling mode.
2.1.2. Add an ethtool priv settings capability (similar to priv
flags) that configures the scheduling mode. This would be my preferred
method as it allows port prioritization irrespective of linux's qdisc
priorities, which seems to model the HW better.
2.2. Add tc bindings similar to 1.3 above, which allow us to
define a new ingress qdisc parameter for scheduling mode, which the CPU
port code can see due to hooking in to the parent device's tc bindings.
This might be the simplest approach to implement, but feels a bit hinky
w.r.t the semantics of ingress qdisc as we are actually specifying the
scheduling of output queues for its link partner.
3. Add ability to set default queue priority of incoming packets on a
port.
This could be done as a new parameter for tc's ingress qdisc. I would
suggest that it should specify an 802.1p priority number (e.g. "hwprio
3" would specify all traffic ingressing should be considered critical)
as this neatly lines up with the 8 priority levels used in mqprio and
could be extended further to allow/disallow priority setting per frame
from 802.1q tags (e.g. "hwprio 3 notag" or something similar).
This might balloon in required effort, particularly if we have to
handle the none HW offload path, requiring us to figure out and
implement a priority tagging within the kernel's buffers. This would
likely be able to reuse a lot of the infrastructure in place for 802.1q
tagging that currently exists within the kernel, though Ive not looked
in to those code paths to estimate difficulty.
^ permalink raw reply
* Re: [v2 2/3] samples: pktgen: add helper functions for IP(v4/v6) CIDR parsing
From: Jesper Dangaard Brouer @ 2019-09-13 12:43 UTC (permalink / raw)
To: Daniel T. Lee; +Cc: David S . Miller, netdev, brouer
In-Reply-To: <20190911184807.21770-2-danieltimlee@gmail.com>
On Thu, 12 Sep 2019 03:48:06 +0900
"Daniel T. Lee" <danieltimlee@gmail.com> wrote:
> This commit adds CIDR parsing and IP validate helper function to parse
> single IP or range of IP with CIDR. (e.g. 198.18.0.0/15)
>
> Helpers will be used in prior to set target address in samples/pktgen.
>
> Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
> ---
> samples/pktgen/functions.sh | 122 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 122 insertions(+)
>
> diff --git a/samples/pktgen/functions.sh b/samples/pktgen/functions.sh
> index 4af4046d71be..8be5a6b6c097 100644
[...]
> +# Given a single IP(v4/v6) or CIDR, return minimum and maximum IP addr.
> +function parse_addr()
> +{
> + # check function is called with (funcname)6
> + [[ ${FUNCNAME[1]: -1} == 6 ]] && local IP6=6
> + local bitlen=$[ IP6 ? 128 : 32 ]
> + local octet=$[ IP6 ? 16 : 8 ]
> +
> + local addr=$1
> + local net prefix
> + local min_ip max_ip
> +
> + IFS='/' read net prefix <<< $addr
> + [[ $IP6 ]] && net=$(extend_addr6 $net)
> + validate_addr$IP6 $net
> +
> + if [[ $prefix -gt $bitlen ]]; then
> + err 5 "Invalid prefix: $prefix"
> + elif [[ -z $prefix ]]; then
> + min_ip=$net
> + max_ip=$net
> + else
> + # defining array for converting Decimal 2 Binary
> + # 00000000 00000001 00000010 00000011 00000100 ...
> + local d2b='{0..1}{0..1}{0..1}{0..1}{0..1}{0..1}{0..1}{0..1}'
> + [[ $IP6 ]] && d2b+=$d2b
> + eval local D2B=($d2b)
I must say this is a rather cool shell/bash trick to use an array for
converting decimal numbers into binary.
> +
> + local shift=$[ bitlen-prefix ]
Using a variable named 'shift' is slightly problematic for shell/bash
code. It works, but it is just confusing.
> + local min_mask max_mask
> + local min max
> + local ip_bit
> + local ip sep
> +
> + # set separator for each IP(v4/v6)
> + [[ $IP6 ]] && sep=: || sep=.
> + IFS=$sep read -ra ip <<< $net
> +
> + min_mask="$(printf '1%.s' $(seq $prefix))$(printf '0%.s' $(seq $shift))"
> + max_mask="$(printf '0%.s' $(seq $prefix))$(printf '1%.s' $(seq $shift))"
Also a surprising shell trick to get binary numbers out of a prefix number.
> +
> + # calculate min/max ip with &,| operator
> + for i in "${!ip[@]}"; do
> + digit=$[ IP6 ? 16#${ip[$i]} : ${ip[$i]} ]
> + ip_bit=${D2B[$digit]}
> +
> + idx=$[ octet*i ]
> + min[$i]=$[ 2#$ip_bit & 2#${min_mask:$idx:$octet} ]
> + max[$i]=$[ 2#$ip_bit | 2#${max_mask:$idx:$octet} ]
> + [[ $IP6 ]] && { min[$i]=$(printf '%X' ${min[$i]});
> + max[$i]=$(printf '%X' ${max[$i]}); }
> + done
> +
> + min_ip=$(IFS=$sep; echo "${min[*]}")
> + max_ip=$(IFS=$sep; echo "${max[*]}")
> + fi
> +
> + echo $min_ip $max_ip
> +}
If you just fix the variable name 'shift' to something else, then I'm
happy with this patch.
Again, I'm very impressed with your shell/bash skills, I were certainly
challenged when reviewing this :-)
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply
* Re: [v2 3/3] samples: pktgen: allow to specify destination IP range (CIDR)
From: Toke Høiland-Jørgensen @ 2019-09-13 12:37 UTC (permalink / raw)
To: Jesper Dangaard Brouer, Daniel T. Lee; +Cc: David S . Miller, netdev, brouer
In-Reply-To: <20190913143144.2b8c18ed@carbon>
Jesper Dangaard Brouer <brouer@redhat.com> writes:
> On Thu, 12 Sep 2019 03:48:07 +0900
> "Daniel T. Lee" <danieltimlee@gmail.com> wrote:
>
>> diff --git a/samples/pktgen/pktgen_sample01_simple.sh b/samples/pktgen/pktgen_sample01_simple.sh
>> index 063ec0998906..08995fa70025 100755
>> --- a/samples/pktgen/pktgen_sample01_simple.sh
>> +++ b/samples/pktgen/pktgen_sample01_simple.sh
>> @@ -22,6 +22,7 @@ fi
>> # Example enforce param "-m" for dst_mac
>> [ -z "$DST_MAC" ] && usage && err 2 "Must specify -m dst_mac"
>> [ -z "$COUNT" ] && COUNT="100000" # Zero means indefinitely
>> +[ -n "$DEST_IP" ] && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
>
> The way the function "parse_addr" is called, in case of errors the
> 'err()' function is called inside, but it will not stop the program
> flow. Instead that function will "only" echo the "ERROR", but program
> flow continues (even-thought 'err()' uses exit $exitcode).
>
> Maybe it is not solveable to get the exit/$?/status out? (I've tried
> different options, but didn't find a way).
`set -o errexit`? :)
-Toke
^ permalink raw reply
* Re: [v2 2/3] samples: pktgen: add helper functions for IP(v4/v6) CIDR parsing
From: Jesper Dangaard Brouer @ 2019-09-13 12:32 UTC (permalink / raw)
To: Daniel T. Lee; +Cc: David S . Miller, netdev, brouer
In-Reply-To: <CAEKGpzhz2jDdO2W7kaZxKQ-3Dkpvu5=DB=JumfcfxM-Hr7Fp0w@mail.gmail.com>
On Fri, 13 Sep 2019 02:53:26 +0900
"Daniel T. Lee" <danieltimlee@gmail.com> wrote:
> On Fri, Sep 13, 2019 at 12:59 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > On Thu, 12 Sep 2019 03:48:06 +0900
> > "Daniel T. Lee" <danieltimlee@gmail.com> wrote:
> >
> > > This commit adds CIDR parsing and IP validate helper function to parse
> > > single IP or range of IP with CIDR. (e.g. 198.18.0.0/15)
> >
> > One question: You do know that this expansion of the CIDR will also
> > include the CIDR network broadcast IP and "network-address", is that
> > intentional?
> >
>
> Correct.
>
> What I was trying to do with this script is,
> I want to test RSS/RPS and it does not
> really matters whether it is broadcast or network address,
> since the n-tuple hashing doesn't matter whether which kind of it.
Okay, sounds valid to me.
Some more feedback on the shell code... in another reply.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply
* Re: [v2 3/3] samples: pktgen: allow to specify destination IP range (CIDR)
From: Jesper Dangaard Brouer @ 2019-09-13 12:31 UTC (permalink / raw)
To: Daniel T. Lee; +Cc: David S . Miller, netdev, brouer
In-Reply-To: <20190911184807.21770-3-danieltimlee@gmail.com>
On Thu, 12 Sep 2019 03:48:07 +0900
"Daniel T. Lee" <danieltimlee@gmail.com> wrote:
> diff --git a/samples/pktgen/pktgen_sample01_simple.sh b/samples/pktgen/pktgen_sample01_simple.sh
> index 063ec0998906..08995fa70025 100755
> --- a/samples/pktgen/pktgen_sample01_simple.sh
> +++ b/samples/pktgen/pktgen_sample01_simple.sh
> @@ -22,6 +22,7 @@ fi
> # Example enforce param "-m" for dst_mac
> [ -z "$DST_MAC" ] && usage && err 2 "Must specify -m dst_mac"
> [ -z "$COUNT" ] && COUNT="100000" # Zero means indefinitely
> +[ -n "$DEST_IP" ] && read -r DST_MIN DST_MAX <<< $(parse_addr${IP6} $DEST_IP)
The way the function "parse_addr" is called, in case of errors the
'err()' function is called inside, but it will not stop the program
flow. Instead that function will "only" echo the "ERROR", but program
flow continues (even-thought 'err()' uses exit $exitcode).
Maybe it is not solveable to get the exit/$?/status out? (I've tried
different options, but didn't find a way).
Alternatively we can just add one extra line to validate result:
[ -z "$DST_MIN" ] && err 5 "Stop: Invalid IP${IP6} address input"
As if it fails then $DST_MIN isn't set.
> if [ -n "$DST_PORT" ]; then
> read -r UDP_DST_MIN UDP_DST_MAX <<< $(parse_ports $DST_PORT)
> validate_ports $UDP_DST_MIN $UDP_DST_MAX
> @@ -61,7 +62,8 @@ pg_set $DEV "flag NO_TIMESTAMP"
>
> # Destination
> pg_set $DEV "dst_mac $DST_MAC"
> -pg_set $DEV "dst$IP6 $DEST_IP"
> +pg_set $DEV "dst${IP6}_min $DST_MIN"
> +pg_set $DEV "dst${IP6}_max $DST_MAX"
>
> if [ -n "$DST_PORT" ]; then
> # Single destination port or random port range
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply
* Re: [PATCH] bpf: validate bpf_func when BPF_JIT is enabled
From: Toke Høiland-Jørgensen @ 2019-09-13 12:19 UTC (permalink / raw)
To: Sami Tolvanen
Cc: Björn Töpel, Yonghong Song, Alexei Starovoitov,
Daniel Borkmann, Kees Cook, Martin Lau, Song Liu,
netdev@vger.kernel.org, bpf@vger.kernel.org,
linux-kernel@vger.kernel.org, Jesper Dangaard Brouer
In-Reply-To: <CABCJKufGy0aRDSUPQEOKYZ9tLjqwQDcDaTW-6im-VfjkB_gUsw@mail.gmail.com>
Sami Tolvanen <samitolvanen@google.com> writes:
> On Thu, Sep 12, 2019 at 3:52 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> I think it would be good if you do both. I'm a bit worried that XDP
>> performance will end up in a "death by a thousand paper cuts" situation,
>> so I'd rather push back on even relatively small overheads like this; so
>> being able to turn it off in the config would be good.
>
> OK, thanks for the feedback. In that case, I think it's probably
> better to wait until we have CFI ready for upstreaming and use the
> same config for this one.
SGTM, thanks!
>> Can you share more details about what the "future CFI checking" is
>> likely to look like?
>
> Sure, I posted an overview of CFI and what we're doing in Pixel devices here:
>
> https://android-developers.googleblog.com/2018/10/control-flow-integrity-in-android-kernel.html
Great, thank you.
-Toke
^ permalink raw reply
* Re: [v2 1/3] samples: pktgen: make variable consistent with option
From: Jesper Dangaard Brouer @ 2019-09-13 12:04 UTC (permalink / raw)
To: Daniel T. Lee; +Cc: David S . Miller, netdev, brouer
In-Reply-To: <20190911184807.21770-1-danieltimlee@gmail.com>
On Thu, 12 Sep 2019 03:48:05 +0900
"Daniel T. Lee" <danieltimlee@gmail.com> wrote:
> This commit changes variable names that can cause confusion.
>
> For example, variable DST_MIN is quite confusing since the
> keyword 'udp_dst_min' and keyword 'dst_min' is used with pg_ctrl.
>
> On the following commit, 'dst_min' will be used to set destination IP,
> and the existing variable name DST_MIN should be changed.
>
> Variable names are matched to the exact keyword used with pg_ctrl.
>
> Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
^ permalink raw reply
* Re: [PATCH bpf-next v10 2/4] bpf: new helper to obtain namespace data from current task New bpf helper bpf_get_current_pidns_info.
From: Carlos Antonio Neira Bustos @ 2019-09-13 11:58 UTC (permalink / raw)
To: Yonghong Song
Cc: Eric Biederman, Al Viro, netdev@vger.kernel.org,
brouer@redhat.com, bpf@vger.kernel.org
In-Reply-To: <91327e6c-2ea7-de0e-4459-06a9e0075416@fb.com>
On Fri, Sep 13, 2019 at 02:56:43AM +0000, Yonghong Song wrote:
Yonghong,
Great, I'll submit this new interface along self tests as version 12.
Thanks for your help.
Bests
>
>
> On 9/12/19 3:03 PM, carlos antonio neira bustos wrote:
> > Yonghong,
> >
> > I think bpf_get_ns_current_pid_tgid interface is a lot better than the
> > one proposed in my patch, how are we going to move forward? Should I
> > take these changes and refactor the selftests to use this new interface
> > and submit version 12 or as the interface changed completely is a new
> > set of patches?.
>
> This is to solve the same problem. You can just submit as version 12.
> This way, we preserves discussion history clearly.
>
> >
> > Eric,
> > Thank you very much for explaining the problem and your help to move
> > forward with this new helper.
> >
> >
> > Bests
^ permalink raw reply
* [PATCH 01/27] netfilter: nf_tables: Fix an Oops in nf_tables_updobj() error handling
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: Dan Carpenter <dan.carpenter@oracle.com>
The "newobj" is an error pointer so we can't pass it to kfree(). It
doesn't need to be freed so we can remove that and I also renamed the
error label.
Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 013d28899cab..7def31ae3022 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5151,7 +5151,7 @@ static int nf_tables_updobj(const struct nft_ctx *ctx,
newobj = nft_obj_init(ctx, type, attr);
if (IS_ERR(newobj)) {
err = PTR_ERR(newobj);
- goto err1;
+ goto err_free_trans;
}
nft_trans_obj(trans) = obj;
@@ -5160,9 +5160,9 @@ static int nf_tables_updobj(const struct nft_ctx *ctx,
list_add_tail(&trans->list, &ctx->net->nft.commit_list);
return 0;
-err1:
+
+err_free_trans:
kfree(trans);
- kfree(newobj);
return err;
}
--
2.11.0
^ permalink raw reply related
* [PATCH 07/27] netfilter: nf_tables_offload: refactor the nft_flow_offload_chain function
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: wenxu <wenxu@ucloud.cn>
Pass chain and policy parameters to nft_flow_offload_chain to reuse it.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_offload.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index e200491ec672..367a7fa5c9dd 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -294,12 +294,13 @@ static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
#define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
-static int nft_flow_offload_chain(struct nft_trans *trans,
+static int nft_flow_offload_chain(struct nft_chain *chain,
+ u8 *ppolicy,
enum flow_block_command cmd)
{
- struct nft_chain *chain = trans->ctx.chain;
struct nft_base_chain *basechain;
struct net_device *dev;
+ u8 policy;
if (!nft_is_base_chain(chain))
return -EOPNOTSUPP;
@@ -309,10 +310,10 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
if (!dev)
return -EOPNOTSUPP;
+ policy = ppolicy ? *ppolicy : basechain->policy;
+
/* Only default policy to accept is supported for now. */
- if (cmd == FLOW_BLOCK_BIND &&
- nft_trans_chain_policy(trans) != -1 &&
- nft_trans_chain_policy(trans) != NF_ACCEPT)
+ if (cmd == FLOW_BLOCK_BIND && policy != -1 && policy != NF_ACCEPT)
return -EOPNOTSUPP;
if (dev->netdev_ops->ndo_setup_tc)
@@ -325,6 +326,7 @@ int nft_flow_rule_offload_commit(struct net *net)
{
struct nft_trans *trans;
int err = 0;
+ u8 policy;
list_for_each_entry(trans, &net->nft.commit_list, list) {
if (trans->ctx.family != NFPROTO_NETDEV)
@@ -335,13 +337,17 @@ int nft_flow_rule_offload_commit(struct net *net)
if (!(trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD))
continue;
- err = nft_flow_offload_chain(trans, FLOW_BLOCK_BIND);
+ policy = nft_trans_chain_policy(trans);
+ err = nft_flow_offload_chain(trans->ctx.chain, &policy,
+ FLOW_BLOCK_BIND);
break;
case NFT_MSG_DELCHAIN:
if (!(trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD))
continue;
- err = nft_flow_offload_chain(trans, FLOW_BLOCK_UNBIND);
+ policy = nft_trans_chain_policy(trans);
+ err = nft_flow_offload_chain(trans->ctx.chain, &policy,
+ FLOW_BLOCK_BIND);
break;
case NFT_MSG_NEWRULE:
if (!(trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD))
--
2.11.0
^ permalink raw reply related
* [PATCH 04/27] netfilter: nft_synproxy: add synproxy stateful object support
From: Pablo Neira Ayuso @ 2019-09-13 11:30 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <20190913113102.15776-1-pablo@netfilter.org>
From: Fernando Fernandez Mancera <ffmancera@riseup.net>
Register a new synproxy stateful object type into the stateful object
infrastructure.
Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/uapi/linux/netfilter/nf_tables.h | 3 +-
net/netfilter/nft_synproxy.c | 143 ++++++++++++++++++++++++++-----
2 files changed, 124 insertions(+), 22 deletions(-)
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 0ff932dadc8e..ed8881ad18ed 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -1481,7 +1481,8 @@ enum nft_ct_expectation_attributes {
#define NFT_OBJECT_CT_TIMEOUT 7
#define NFT_OBJECT_SECMARK 8
#define NFT_OBJECT_CT_EXPECT 9
-#define __NFT_OBJECT_MAX 10
+#define NFT_OBJECT_SYNPROXY 10
+#define __NFT_OBJECT_MAX 11
#define NFT_OBJECT_MAX (__NFT_OBJECT_MAX - 1)
/**
diff --git a/net/netfilter/nft_synproxy.c b/net/netfilter/nft_synproxy.c
index db4c23f5dfcb..e2c1fc608841 100644
--- a/net/netfilter/nft_synproxy.c
+++ b/net/netfilter/nft_synproxy.c
@@ -24,7 +24,7 @@ static void nft_synproxy_tcp_options(struct synproxy_options *opts,
const struct tcphdr *tcp,
struct synproxy_net *snet,
struct nf_synproxy_info *info,
- struct nft_synproxy *priv)
+ const struct nft_synproxy *priv)
{
this_cpu_inc(snet->stats->syn_received);
if (tcp->ece && tcp->cwr)
@@ -41,14 +41,13 @@ static void nft_synproxy_tcp_options(struct synproxy_options *opts,
NF_SYNPROXY_OPT_ECN);
}
-static void nft_synproxy_eval_v4(const struct nft_expr *expr,
+static void nft_synproxy_eval_v4(const struct nft_synproxy *priv,
struct nft_regs *regs,
const struct nft_pktinfo *pkt,
const struct tcphdr *tcp,
struct tcphdr *_tcph,
struct synproxy_options *opts)
{
- struct nft_synproxy *priv = nft_expr_priv(expr);
struct nf_synproxy_info info = priv->info;
struct net *net = nft_net(pkt);
struct synproxy_net *snet = synproxy_pernet(net);
@@ -73,14 +72,13 @@ static void nft_synproxy_eval_v4(const struct nft_expr *expr,
}
#if IS_ENABLED(CONFIG_NF_TABLES_IPV6)
-static void nft_synproxy_eval_v6(const struct nft_expr *expr,
+static void nft_synproxy_eval_v6(const struct nft_synproxy *priv,
struct nft_regs *regs,
const struct nft_pktinfo *pkt,
const struct tcphdr *tcp,
struct tcphdr *_tcph,
struct synproxy_options *opts)
{
- struct nft_synproxy *priv = nft_expr_priv(expr);
struct nf_synproxy_info info = priv->info;
struct net *net = nft_net(pkt);
struct synproxy_net *snet = synproxy_pernet(net);
@@ -105,9 +103,9 @@ static void nft_synproxy_eval_v6(const struct nft_expr *expr,
}
#endif /* CONFIG_NF_TABLES_IPV6*/
-static void nft_synproxy_eval(const struct nft_expr *expr,
- struct nft_regs *regs,
- const struct nft_pktinfo *pkt)
+static void nft_synproxy_do_eval(const struct nft_synproxy *priv,
+ struct nft_regs *regs,
+ const struct nft_pktinfo *pkt)
{
struct synproxy_options opts = {};
struct sk_buff *skb = pkt->skb;
@@ -140,23 +138,22 @@ static void nft_synproxy_eval(const struct nft_expr *expr,
switch (skb->protocol) {
case htons(ETH_P_IP):
- nft_synproxy_eval_v4(expr, regs, pkt, tcp, &_tcph, &opts);
+ nft_synproxy_eval_v4(priv, regs, pkt, tcp, &_tcph, &opts);
return;
#if IS_ENABLED(CONFIG_NF_TABLES_IPV6)
case htons(ETH_P_IPV6):
- nft_synproxy_eval_v6(expr, regs, pkt, tcp, &_tcph, &opts);
+ nft_synproxy_eval_v6(priv, regs, pkt, tcp, &_tcph, &opts);
return;
#endif
}
regs->verdict.code = NFT_BREAK;
}
-static int nft_synproxy_init(const struct nft_ctx *ctx,
- const struct nft_expr *expr,
- const struct nlattr * const tb[])
+static int nft_synproxy_do_init(const struct nft_ctx *ctx,
+ const struct nlattr * const tb[],
+ struct nft_synproxy *priv)
{
struct synproxy_net *snet = synproxy_pernet(ctx->net);
- struct nft_synproxy *priv = nft_expr_priv(expr);
u32 flags;
int err;
@@ -206,8 +203,7 @@ static int nft_synproxy_init(const struct nft_ctx *ctx,
return err;
}
-static void nft_synproxy_destroy(const struct nft_ctx *ctx,
- const struct nft_expr *expr)
+static void nft_synproxy_do_destroy(const struct nft_ctx *ctx)
{
struct synproxy_net *snet = synproxy_pernet(ctx->net);
@@ -229,10 +225,8 @@ static void nft_synproxy_destroy(const struct nft_ctx *ctx,
nf_ct_netns_put(ctx->net, ctx->family);
}
-static int nft_synproxy_dump(struct sk_buff *skb, const struct nft_expr *expr)
+static int nft_synproxy_do_dump(struct sk_buff *skb, struct nft_synproxy *priv)
{
- const struct nft_synproxy *priv = nft_expr_priv(expr);
-
if (nla_put_be16(skb, NFTA_SYNPROXY_MSS, htons(priv->info.mss)) ||
nla_put_u8(skb, NFTA_SYNPROXY_WSCALE, priv->info.wscale) ||
nla_put_be32(skb, NFTA_SYNPROXY_FLAGS, htonl(priv->info.options)))
@@ -244,6 +238,15 @@ static int nft_synproxy_dump(struct sk_buff *skb, const struct nft_expr *expr)
return -1;
}
+static void nft_synproxy_eval(const struct nft_expr *expr,
+ struct nft_regs *regs,
+ const struct nft_pktinfo *pkt)
+{
+ const struct nft_synproxy *priv = nft_expr_priv(expr);
+
+ nft_synproxy_do_eval(priv, regs, pkt);
+}
+
static int nft_synproxy_validate(const struct nft_ctx *ctx,
const struct nft_expr *expr,
const struct nft_data **data)
@@ -252,6 +255,28 @@ static int nft_synproxy_validate(const struct nft_ctx *ctx,
(1 << NF_INET_FORWARD));
}
+static int nft_synproxy_init(const struct nft_ctx *ctx,
+ const struct nft_expr *expr,
+ const struct nlattr * const tb[])
+{
+ struct nft_synproxy *priv = nft_expr_priv(expr);
+
+ return nft_synproxy_do_init(ctx, tb, priv);
+}
+
+static void nft_synproxy_destroy(const struct nft_ctx *ctx,
+ const struct nft_expr *expr)
+{
+ nft_synproxy_do_destroy(ctx);
+}
+
+static int nft_synproxy_dump(struct sk_buff *skb, const struct nft_expr *expr)
+{
+ struct nft_synproxy *priv = nft_expr_priv(expr);
+
+ return nft_synproxy_do_dump(skb, priv);
+}
+
static struct nft_expr_type nft_synproxy_type;
static const struct nft_expr_ops nft_synproxy_ops = {
.eval = nft_synproxy_eval,
@@ -271,14 +296,89 @@ static struct nft_expr_type nft_synproxy_type __read_mostly = {
.maxattr = NFTA_SYNPROXY_MAX,
};
+static int nft_synproxy_obj_init(const struct nft_ctx *ctx,
+ const struct nlattr * const tb[],
+ struct nft_object *obj)
+{
+ struct nft_synproxy *priv = nft_obj_data(obj);
+
+ return nft_synproxy_do_init(ctx, tb, priv);
+}
+
+static void nft_synproxy_obj_destroy(const struct nft_ctx *ctx,
+ struct nft_object *obj)
+{
+ nft_synproxy_do_destroy(ctx);
+}
+
+static int nft_synproxy_obj_dump(struct sk_buff *skb,
+ struct nft_object *obj, bool reset)
+{
+ struct nft_synproxy *priv = nft_obj_data(obj);
+
+ return nft_synproxy_do_dump(skb, priv);
+}
+
+static void nft_synproxy_obj_eval(struct nft_object *obj,
+ struct nft_regs *regs,
+ const struct nft_pktinfo *pkt)
+{
+ const struct nft_synproxy *priv = nft_obj_data(obj);
+
+ nft_synproxy_do_eval(priv, regs, pkt);
+}
+
+static void nft_synproxy_obj_update(struct nft_object *obj,
+ struct nft_object *newobj)
+{
+ struct nft_synproxy *newpriv = nft_obj_data(newobj);
+ struct nft_synproxy *priv = nft_obj_data(obj);
+
+ priv->info = newpriv->info;
+}
+
+static struct nft_object_type nft_synproxy_obj_type;
+static const struct nft_object_ops nft_synproxy_obj_ops = {
+ .type = &nft_synproxy_obj_type,
+ .size = sizeof(struct nft_synproxy),
+ .init = nft_synproxy_obj_init,
+ .destroy = nft_synproxy_obj_destroy,
+ .dump = nft_synproxy_obj_dump,
+ .eval = nft_synproxy_obj_eval,
+ .update = nft_synproxy_obj_update,
+};
+
+static struct nft_object_type nft_synproxy_obj_type __read_mostly = {
+ .type = NFT_OBJECT_SYNPROXY,
+ .ops = &nft_synproxy_obj_ops,
+ .maxattr = NFTA_SYNPROXY_MAX,
+ .policy = nft_synproxy_policy,
+ .owner = THIS_MODULE,
+};
+
static int __init nft_synproxy_module_init(void)
{
- return nft_register_expr(&nft_synproxy_type);
+ int err;
+
+ err = nft_register_obj(&nft_synproxy_obj_type);
+ if (err < 0)
+ return err;
+
+ err = nft_register_expr(&nft_synproxy_type);
+ if (err < 0)
+ goto err;
+
+ return 0;
+
+err:
+ nft_unregister_obj(&nft_synproxy_obj_type);
+ return err;
}
static void __exit nft_synproxy_module_exit(void)
{
- return nft_unregister_expr(&nft_synproxy_type);
+ nft_unregister_expr(&nft_synproxy_type);
+ nft_unregister_obj(&nft_synproxy_obj_type);
}
module_init(nft_synproxy_module_init);
@@ -287,3 +387,4 @@ module_exit(nft_synproxy_module_exit);
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Fernando Fernandez <ffmancera@riseup.net>");
MODULE_ALIAS_NFT_EXPR("synproxy");
+MODULE_ALIAS_NFT_OBJ(NFT_OBJECT_SYNPROXY);
--
2.11.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox