* [PATCH 08/10] dynamic_debug: make netif_dbg() call __netdev_printk()
From: Jason Baron @ 2011-07-06 17:25 UTC (permalink / raw)
To: gregkh
Cc: joe, jim.cromie, bvanassche, linux-kernel, davem, aloisio.almeida,
netdev
In-Reply-To: <cover.1309967232.git.root@dhcp-100-18-164.bos.redhat.com>
From: Jason Baron <jbaron@redhat.com>
Previously, netif_dbg() was using dynamic_dev_dbg() to perform
the underlying printk. Fix it to use __netdev_printk(), instead.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Jason Baron <jbaron@redhat.com>
---
include/linux/dynamic_debug.h | 12 ++++++++++++
include/linux/netdevice.h | 6 ++----
2 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
index feaac1e..7048e64 100644
--- a/include/linux/dynamic_debug.h
+++ b/include/linux/dynamic_debug.h
@@ -84,6 +84,18 @@ extern int __dynamic_netdev_dbg(struct _ddebug *descriptor,
__dynamic_netdev_dbg(&descriptor, dev, fmt, ##__VA_ARGS__);\
} while (0)
+#define dynamic_netif_dbg(dev, cond, fmt, ...) do { \
+ static struct _ddebug descriptor \
+ __used \
+ __attribute__((section("__verbose"), aligned(8))) = \
+ { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
+ _DPRINTK_FLAGS_DEFAULT }; \
+ if (unlikely(descriptor.enabled)) { \
+ if (cond) \
+ __dynamic_netdev_dbg(&descriptor, dev, fmt, ##__VA_ARGS__);\
+ } \
+ } while (0)
+
#else
static inline int ddebug_remove_module(const char *mod)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9b132ef..99c358f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2731,10 +2731,8 @@ do { \
#elif defined(CONFIG_DYNAMIC_DEBUG)
#define netif_dbg(priv, type, netdev, format, args...) \
do { \
- if (netif_msg_##type(priv)) \
- dynamic_dev_dbg((netdev)->dev.parent, \
- "%s: " format, \
- netdev_name(netdev), ##args); \
+ dynamic_netif_dbg(netdev, (netif_msg_##type(priv)), \
+ format, ##args); \
} while (0)
#else
#define netif_dbg(priv, type, dev, format, args...) \
--
1.7.5.4
^ permalink raw reply related
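
For readers unfamiliar with the pattern, the dynamic_netif_dbg() macro in the patch above can be approximated in userspace roughly as follows. This is only a sketch under stated assumptions: struct ddebug_sketch, sketch_emit() and sketch_netif_dbg() are made-up stand-ins for the kernel's struct _ddebug and __dynamic_netdev_dbg(), the descriptor is passed in explicitly instead of being a static placed in the "__verbose" section, and the output prefix is invented. What it does show is the two-level check: the per-callsite descriptor gate first, then the caller's condition.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct _ddebug. */
struct ddebug_sketch {
	const char *function;
	const char *format;
	unsigned int lineno;
	int enabled;		/* toggled at run time, per call site */
};

static int emitted;		/* counts messages that were actually printed */

/* Hypothetical stand-in for __dynamic_netdev_dbg(). */
static int sketch_emit(const struct ddebug_sketch *d, const char *ifname,
		       const char *fmt, ...)
{
	va_list args;
	int n;

	assert(d != NULL && ifname != NULL);
	n = printf("%s:%u %s: ", d->function, d->lineno, ifname);
	va_start(args, fmt);
	n += vprintf(fmt, args);
	va_end(args);
	emitted++;
	return n;
}

/* Descriptor gate first, caller-supplied condition second. */
#define sketch_netif_dbg(desc, ifname, cond, fmt, ...) do {		\
	if ((desc)->enabled) {						\
		if (cond)						\
			sketch_emit(desc, ifname, fmt, ##__VA_ARGS__);	\
	}								\
} while (0)
```

With the descriptor disabled, neither the condition nor the format arguments are evaluated, which is the point of testing the descriptor before the netif_msg_*() check.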
* Re: Getting the correct asix AX88178 usb gige driver in mainline?
From: Marc MERLIN @ 2011-07-06 17:25 UTC (permalink / raw)
To: netdev; +Cc: greg
In-Reply-To: <20110629033025.GA32153@merlins.org>
Howdy netdev folks,
On Tue, Jul 05, 2011 at 08:35:19AM -0700, Greg KH wrote:
> I looked at this and it seems they took a very old version of the driver
> (from 2003) and somehow changed it to work for this device. I can't
> really tell what they changed unless I were to dig through the original
> version.
>
> I suggest you post your original message to the
> mailing list. The network developers there should be able to help you
> out as I can't at the moment due to real-work and travel.
Here are the details. If somehow their driver could be integrated in
mainline by putting the relevant bits in the current driver, that would be
fantastic :)
(obviously it would have been better if they had done that themselves to
start with, no idea why they didn't).
----------------------------------------------------------------------------
I just bought a USB gige ethernet adapter from 'winstars' called 'delphi g
usb 2.0 gigabit lan'.
Linux 2.6.39.1 says:
usb 8-1: Product: AX88178
usb 8-1: Manufacturer: ASIX Elec. Corp.
usb 8-1: SerialNumber: 000002
asix 8-1:1.0: eth1: register 'asix' at usb-0000:00:1d.7-1, ASIX AX88178 USB 2.0 Ethernet, 00:0e:c6:88:7c:ae
usbcore: registered new interface driver asix
but that driver brings up an eth1 that cannot send data.
Google showed me the vendor website with a working linux driver:
http://www.asix.com.tw/FrootAttach/driver/AX88772_772A_178_LINUX2.6.9_REV122.zip
Their driver compiled out of tree easily and said:
ASIX USB Ethernet Adapter:v4.1.0 19:16:59 Jun 28 2011
<6> http://www.asix.com.tw
eth%d: status ep1in, 8 bytes period 11
eth1: register 'asix' at usb-0000:00:1d.7-1, ASIX AX88178 USB 2.0 Ethernet, 00:0e:c6:88:7c:ae
usbcore: registered new interface driver asix
eth1: rxqlen 0 --> 5
eth1: ax88178 - Link status is: 0
eth1: kevent 4 scheduled
eth1: ax88178 - Link status is: 1
eth1: no IPv6 routers present
It worked fine with dhcp (which the stock driver sure didn't).
Is there someone who can see if that driver can be merged in mainline to
replace the non working one?
For comparison, the mainline one also outputted a traceback. The full relevant logs are below:
usb 8-1: new high speed USB device number 8 using ehci_hcd
usb 8-1: New USB device found, idVendor=0b95, idProduct=1780
usb 8-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 8-1: Product: AX88178
usb 8-1: Manufacturer: ASIX Elec. Corp.
usb 8-1: SerialNumber: 000002
asix 8-1:1.0: eth1: register 'asix' at usb-0000:00:1d.7-1, ASIX AX88178 USB 2.0 Ethernet, 00:0e:c6:88:7c:ae
usbcore: registered new interface driver asix
asix 8-1:1.0: eth1: link down
ADDRCONF(NETDEV_UP): eth1: link is not ready
ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
asix 8-1:1.0: eth1: link up, 1000Mbps, full-duplex, lpa 0xC5E1
eth1: no IPv6 routers present
gandalfthegrey:~# ifconfig eth1 192.168.205.10
gandalfthegrey:~# ping 192.168.205.254
PING 192.168.205.254 (192.168.205.254) 56(84) bytes of data.
From 192.168.205.10 icmp_seq=3 Destination Host Unreachable
From 192.168.205.10 icmp_seq=4 Destination Host Unreachable
(tcpdump showed some traffic though)
Later, I found this in dmesg:
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:256 dev_watchdog+0x113/0x1a9()
Hardware name: 4063FM6
NETDEV WATCHDOG: eth1 (asix): transmit queue 0 timed out
Modules linked in: asix usbnet nfs fscache usb_storage usb_libusual uas xt_tcpudp xt_state nls_iso8859_1 nls_cp437 vfat fat vboxnetadp vboxnetflt vboxdrv fuse nfsd lockd nfs_acl auth_rpcgss sunrpc autofs4 acpi_cpufreq mperf cpufreq_conservative cpufreq_powersave ipt_REJECT cpufreq_stats cpufreq_userspace cpufreq_ondemand freq_table ipt_MASQUERADE ipt_LOG iptable_mangle iptable_filter rfcomm bnep binfmt_misc fbcon tileblit font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor iptable_nat nf_nat nf_conntrack_ipv4 i915 drm_kms_helper nf_conntrack drm fb fbdev nf_defrag_ipv4 i2c_algo_bit cfbcopyarea cfbimgblt cfbfillrect ip_tables x_tables ipv6 btusb bluetooth snd_hda_codec_conexant msr snd_hda_intel coretemp snd_hda_codec snd_hwdep snd_pcm_oss input_polldev snd_mixer_oss thinkpad_acpi snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq uvcvideo videodev media pcmcia arc4 snd_timer snd_seq_device r852 sm_common nand nand_ids yenta_socket pcmcia_rsrc iwlagn mac80211 nand_bch pcmcia_core rtc_cmos snd ppdev ehci_hcd bch sdhci_pci sdhci nand_ecc mtd mmc_core parport_pc r592 rtc_core uhci_hcd memstick cfg80211 snd_page_alloc rfkill e1000e soundcore processor battery rtc_lib usbcore psmouse lp ac nvram tpm_tis serio_raw sr_mod cdrom wmi evdev sg parport raid10 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx multipath sha256_generic dm_crypt dm_mod aes_i586 aes_generic ecb cbc intel_agp intel_gtt video agpgart backlight thermal thermal_sys hwmon button
Pid: 0, comm: swapper Not tainted 2.6.39.1-core2-volpreempt-noide-hm64-20110620 #2
Call Trace:
[<c0133599>] warn_slowpath_common+0x60/0x75
[<c0133612>] warn_slowpath_fmt+0x26/0x2a
[<c03c8a35>] dev_watchdog+0x113/0x1a9
[<c012dd94>] ? try_to_wake_up+0x321/0x32c
[<c024f023>] ? jbd2_journal_force_commit_nested+0x6e/0x6e
[<c03c8922>] ? __netdev_watchdog_up+0x52/0x52
[<c013c8c1>] run_timer_softirq+0x139/0x1b5
[<c01380c1>] __do_softirq+0x7a/0x110
[<c0138047>] ? __local_bh_enable+0x6c/0x6c
<IRQ> [<c0137f7a>] ? irq_exit+0x3d/0x8b
[<c01039f1>] ? do_IRQ+0x7b/0x91
[<c0443bb0>] ? common_interrupt+0x30/0x38
[<c014007b>] ? get_signal_to_deliver+0x18d/0x32a
[<f8b82fc9>] ? acpi_idle_enter_simple+0x136/0x16a [processor]
[<c03a1a6f>] ? cpuidle_idle_call+0x6f/0xa0
[<c0101b91>] ? cpu_idle+0x8b/0xa6
[<c042a560>] ? rest_init+0x58/0x5a
[<c06257d6>] ? start_kernel+0x313/0x318
[<c06250a8>] ? i386_start_kernel+0xa8/0xb0
---[ end trace 0c6c364f0d69b5d1 ]---
device eth1 entered promiscuous mode
device eth1 left promiscuous mode
usbcore: deregistering interface driver asix
asix 8-1:1.0: eth1: unregister 'asix' usb-0000:00:1d.7-1, ASIX AX88178 USB 2.0 Ethernet
Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
^ permalink raw reply
* Re: [PATCH 00/10] dynamic_debug: various fixes
From: Jim Cromie @ 2011-07-06 17:57 UTC (permalink / raw)
To: Jason Baron
Cc: gregkh, joe, bvanassche, linux-kernel, davem, aloisio.almeida,
netdev
In-Reply-To: <cover.1309967232.git.root@dhcp-100-18-164.bos.redhat.com>
On Wed, Jul 6, 2011 at 11:24 AM, Jason Baron <jbaron@redhat.com> wrote:
> Hi,
>
> Various dynamic debug fixes and cleanups, and a patch to add myself as
> maintainer. Hopefully, nobody will object too loudly :)
>
do you have this in a git-tree somewhere I can pull ?
> Thanks,
>
> -Jason
>
>
> Joe Perches (4):
> dynamic_debug: Add __dynamic_dev_dbg
> dynamic_debug: Consolidate prefix output to single routine
> dynamic_debug: Remove uses of KERN_CONT in dynamic_emit_prefix
> dynamic_debug: Convert printks to pr_<level>
>
> Jason Baron (6):
> dynamic_debug: remove unused control variables
> dynamic_debug: add myslef as maintainer
> dynamic_debug: make netdev_dbg() call __netdev_printk()
> dynamic_debug: make netif_dbg() call __netdev_printk()
> dynamic_debug: consolidate repetitive struct _ddebug descriptor
> definitions
> dynamic_debug: remove num_enabled accounting
>
> MAINTAINERS | 6 ++
> drivers/base/core.c | 5 +-
> include/linux/device.h | 5 +
> include/linux/dynamic_debug.h | 58 ++++++++++-----
> include/linux/netdevice.h | 12 ++--
> lib/dynamic_debug.c | 165 ++++++++++++++++++++++++++++-------------
> net/core/dev.c | 3 +-
> 7 files changed, 172 insertions(+), 82 deletions(-)
>
> --
> 1.7.5.4
>
>
^ permalink raw reply
* RE: [RFC] non-preemptible kernel socket for RAMster
From: Loke, Chetan @ 2011-07-06 18:12 UTC (permalink / raw)
To: Dan Magenheimer, netdev; +Cc: Konrad Wilk, linux-mm
In-Reply-To: <d19811cc-a722-4d30-8a43-aedb1cd978c9@default>
> -----Original Message-----
> From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com]
> Sent: July 05, 2011 9:06 PM
> To: Loke, Chetan; netdev@vger.kernel.org
> Cc: Konrad Wilk; linux-mm
> Subject: RE: [RFC] non-preemptible kernel socket for RAMster
>
> > From: Loke, Chetan [mailto:Chetan.Loke@netscout.com]
> > Subject: RE: [RFC] non-preemptible kernel socket for RAMster
> >
> > > From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com]
>
> > How often are you going to re-size your remote-SWAP?
>
> is "as often as the working set changes on any machine in the
> cluster", meaning *constantly*, entirely dynamically! How
> about a more specific example: Suppose you have 2 machines,
> each with 8GB of memory. 99% of the time each machine is
> chugging along just fine and doesn't really need more than 4GB,
> and may even use less than 1GB a large part of the time.
> But every now and then, one of the machines randomly needs
> 9GB, 10GB, maybe even 12GB of memory. This would normally
> result in swapping. (Most system administrators won't even
> have this much information... they'll just know they are
> seeing swapping and decide they need to buy more RAM.)
>
Ok, I understand there is interest in implementing a
'remote-volatile-ballooning-variant', but how do you pick a remote
candidate (hypervisor)? Let's say memory is available on the remote
system, but what if the remote-p{NIC,CPU} is overloaded? Sure, sysadmins
won't have this info because this is so dynamic (and it's quite possible,
as you mentioned above). But does the trans-remote-API know about this
resource availability before opening a remote-channel?
Stressing the remote-p{NIC,CPU} might trick the hypervisor-vmotion-plugin
into vmotioning VM[s] to another hypervisor. How does the trans-remote-API
integrate with remote/global vmotion policies to avoid this false vmotion?
> Dan
Chetan Loke
^ permalink raw reply
* Re: [PATCH 00/10] dynamic_debug: various fixes
From: Jason Baron @ 2011-07-06 18:18 UTC (permalink / raw)
To: Jim Cromie
Cc: gregkh, joe, bvanassche, linux-kernel, davem, aloisio.almeida,
netdev
In-Reply-To: <CAJfuBxyfA=Hjr8HEp47mKyr9fSdRGhppKr_-755vHQqn_RB=8Q@mail.gmail.com>
On Wed, Jul 06, 2011 at 11:57:21AM -0600, Jim Cromie wrote:
> >
> > Various dynamic debug fixes and cleanups, and a patch to add myself as
> > maintainer. Hopefully, nobody will object too loudly :)
> >
>
> do you have this in a git-tree somewhere I can pull ?
>
It's only in a local tree... if you really need it, I can set something
up, but these are all the patches I have pending.
thanks,
-Jason
^ permalink raw reply
* Re: [PATCH V7 4/4 net-next] vhost: vhost TX zero-copy support
From: Shirley Ma @ 2011-07-06 19:28 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: David Miller, Eric Dumazet, Avi Kivity, Arnd Bergmann, netdev,
kvm, linux-kernel
In-Reply-To: <20110629091300.GC14627@redhat.com>
On Wed, 2011-06-29 at 12:13 +0300, Michael S. Tsirkin wrote:
> On Sat, May 28, 2011 at 12:34:27PM -0700, Shirley Ma wrote:
> > Hello Michael,
> >
> > In order to use wait for completion in shutting down, seems to me
> > another work thread is needed to call vhost_zerocopy_add_used,
>
> Hmm I don't see vhost_zerocopy_add_used here.
I put the call in vhost_set_vring.
>
> > it seems
> > too much work to address a minor issue here. Do we really need it?
>
> Assuming you mean vhost_zerocopy_signal_used, here's how I would do
> it:
> add a kref and a completion, signal completion in kref_put
> callback, when backend is set - kref_get, on cleanup,
> kref_put and then wait_for_completion_interruptible.
> Where's the need for another thread coming from?
>
> If you like, post a patch with busywait + a FIXME comment,
> and I can write up a patch on top.
I might not have time to finish this during my vacation, so I am
putting busywait + a FIXME comment.
> (BTW, ideally the function that does the signalling should be
> in core networking bits so that it's still around
> even if the vhost module gets removed).
Thanks
Shirley
^ permalink raw reply
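
The scheme Michael describes in the message above (take a reference when the backend is set, drop it on cleanup, then wait on a completion that the release callback signals) can be sketched in userspace C with pthreads. This is an analogy only: struct zc_ref, zc_get(), zc_put(), zc_wait() and zc_shutdown_demo() are invented names standing in for kref_get(), kref_put() and wait_for_completion_interruptible(), none of it is vhost code, and the idea that each in-flight buffer holds its own reference is an assumption made for the demo.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct zc_ref {
	pthread_mutex_t lock;
	pthread_cond_t done_cv;
	int count;	/* outstanding references (kref analogue) */
	bool done;	/* set once the last reference is dropped */
};

static void zc_init(struct zc_ref *r)
{
	pthread_mutex_init(&r->lock, NULL);
	pthread_cond_init(&r->done_cv, NULL);
	r->count = 1;	/* initial reference held by the owner */
	r->done = false;
}

static void zc_get(struct zc_ref *r)
{
	pthread_mutex_lock(&r->lock);
	r->count++;
	pthread_mutex_unlock(&r->lock);
}

/* Drop one reference; the last put signals the completion. */
static void zc_put(struct zc_ref *r)
{
	pthread_mutex_lock(&r->lock);
	assert(r->count > 0);
	if (--r->count == 0) {
		r->done = true;
		pthread_cond_broadcast(&r->done_cv);
	}
	pthread_mutex_unlock(&r->lock);
}

/* Block until every reference has been dropped. */
static void zc_wait(struct zc_ref *r)
{
	pthread_mutex_lock(&r->lock);
	while (!r->done)
		pthread_cond_wait(&r->done_cv, &r->lock);
	pthread_mutex_unlock(&r->lock);
}

/* Worker standing in for a callback that releases its reference. */
static void *zc_worker(void *arg)
{
	zc_put(arg);
	return NULL;
}

/* Cleanup path: drop the owner's reference, then wait for the rest. */
static int zc_shutdown_demo(int inflight)
{
	struct zc_ref r;
	pthread_t tid[8];
	int i, started = 0, left;

	if (inflight < 0 || inflight > 8)
		return -1;
	zc_init(&r);
	for (i = 0; i < inflight; i++) {
		zc_get(&r);
		if (pthread_create(&tid[started], NULL, zc_worker, &r) == 0)
			started++;
		else
			zc_put(&r);	/* no worker started: drop its reference here */
	}
	zc_put(&r);
	zc_wait(&r);
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	left = r.count;
	pthread_cond_destroy(&r.done_cv);
	pthread_mutex_destroy(&r.lock);
	return left;	/* 0 once every reference is gone */
}
```

The waiter sleeps instead of spinning, and no extra thread is needed on the cleanup side: whoever drops the last reference wakes it.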
* Re: Getting the correct asix AX88178 usb gige driver in mainline?
From: Arnd Bergmann @ 2011-07-06 20:09 UTC (permalink / raw)
To: Marc MERLIN; +Cc: netdev, greg
In-Reply-To: <20110706172538.GI18238@merlins.org>
On Wednesday 06 July 2011 19:25:38 Marc MERLIN wrote:
> Howdy netdev folks,
>
> On Tue, Jul 05, 2011 at 08:35:19AM -0700, Greg KH wrote:
> > I looked at this and it seems they took a very old version of the driver
> > (from 2003) and somehow changed it to work for this device. I can't
> > really tell what they changed unless I were to dig through the original
> > version.
> >
> > I suggest you post your original message to the
> > mailing list. The network developers there should be able to help you
> > out as I can't at the moment due to real-work and travel.
>
> Here are the details. If somehow their driver could be integrated in
> mainline by putting the relevant bits in the current driver, that would be
> fantastic :)
> (obviously it would have been better if they had done that themselves to
> start with, no idea why they didn't).
>
Hi Marc,
I've taken a look at the driver you linked to and compared it to the
version that was closest at the time.
This is similar to the patch they must have had at some point. I would guess
that the answer is somewhere in there. It's quite different to the much
cleaner patch 933a27d39e "USB: asix - Add AX88178 support and many other
changes", which was merged later with a similar intention.
Arnd
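
As a reading aid for the ax88172_link_reset() hunk in the diff that follows: duplex is normally resolved from the abilities both link partners share, i.e. the bitwise AND of the local advertisement and the partner's abilities. The sketch below restates that idea in userspace C. The bit values follow the MII ability layout used in linux/mii.h; nway_result_sketch(), link_is_full_duplex() and nway_selfcheck() are illustrative names, not driver functions. Note that the vendor hunk combines the two registers with `|`; the sketch uses `&`, on the assumption that only modes both ends support should count, which this thread does not itself discuss.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ability bits shared by MII_ADVERTISE and MII_LPA (values as in linux/mii.h). */
#define SK_10HALF	0x0020
#define SK_10FULL	0x0040
#define SK_100HALF	0x0080
#define SK_100FULL	0x0100
#define SK_100BASE4	0x0200
#define SK_DUPLEX	(SK_10FULL | SK_100FULL)

/* Pick the highest-priority mode out of a set of common abilities. */
static uint16_t nway_result_sketch(uint16_t negotiated)
{
	if (negotiated & SK_100FULL)
		return SK_100FULL;
	if (negotiated & SK_100BASE4)
		return SK_100BASE4;
	if (negotiated & SK_100HALF)
		return SK_100HALF;
	if (negotiated & SK_10FULL)
		return SK_10FULL;
	return SK_10HALF;
}

/* Full duplex only if the best mode common to both ends is a full-duplex one. */
static bool link_is_full_duplex(uint16_t adv, uint16_t lpa)
{
	return (nway_result_sketch(adv & lpa) & SK_DUPLEX) != 0;
}

/* Usage: the second case is where OR-ing the registers would differ. */
static void nway_selfcheck(void)
{
	/* Both ends support 100FULL: full duplex. */
	assert(link_is_full_duplex(SK_100FULL | SK_100HALF, SK_100FULL));
	/* Partner offers 100FULL, but we only advertise half-duplex modes. */
	assert(!link_is_full_duplex(SK_100HALF | SK_10HALF,
				    SK_100FULL | SK_100HALF));
}
```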
diff --git a/drivers/usb/net/usbnet.c b/drivers/usb/net/usbnet.c
index 4cbb408..3c1d0ee 100644
--- a/drivers/usb/net/usbnet.c
+++ b/drivers/usb/net/usbnet.c
@@ -739,11 +739,15 @@ static void ax8817x_mdio_write(struct net_device *netdev, int phy_id, int loc, i
static int ax88172_link_reset(struct usbnet *dev)
{
u16 lpa;
+ u16 adv;
+ u16 res;
u8 mode;
mode = AX_MEDIUM_TX_ABORT_ALLOW | AX_MEDIUM_FLOW_CONTROL_EN;
lpa = ax8817x_mdio_read(dev->net, dev->mii.phy_id, MII_LPA);
- if (lpa & LPA_DUPLEX)
+ adv = ax8817x_mdio_read(dev->net, dev->mii.phy_id, MII_ADVERTISE);
+ res = mii_nway_result(lpa|adv);
+ if (res & LPA_DUPLEX)
mode |= AX_MEDIUM_FULL_DUPLEX;
ax8817x_write_cmd(dev, AX_CMD_WRITE_MEDIUM_MODE, mode, 0, 0, NULL);
@@ -816,7 +820,7 @@ static int ax8817x_get_eeprom(struct net_device *net,
eeprom->offset + i, 0, 2, &ebuf[i]) < 0)
return -EINVAL;
}
- return 0;
+ return i * 2;
}
static void ax8817x_get_drvinfo (struct net_device *net,
@@ -960,6 +964,29 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
goto out2;
msleep(5);
+
+ /* Initialize MII structure */
+ dev->mii.dev = dev->net;
+ dev->mii.mdio_read = ax8817x_mdio_read;
+ dev->mii.mdio_write = ax8817x_mdio_write;
+ dev->mii.phy_id_mask = 0xff;
+ dev->mii.reg_num_mask = 0xff;
+
+ /* Get the PHY id */
+ if ((ret = ax8817x_read_cmd(dev, AX_CMD_READ_PHY_ID, 0, 0, 2, buf)) < 0) {
+ dbg("Error reading PHY ID: %02x", ret);
+ goto out2;
+ } else if (ret < 2) {
+ /* this should always return 2 bytes */
+ dbg("AX_CMD_READ_PHY_ID returned less than 2 bytes: ret=%02x",
+ ret);
+ ret = -EIO;
+ goto out2;
+ }
+ dev->mii.phy_id = *((u8 *)buf + 1);
+
+ if (dev->mii.phy_id == 0x10)
+ {
if ((ret = ax8817x_write_cmd(dev, AX_CMD_SW_PHY_SELECT, 0x0001, 0, 0, buf)) < 0) {
dbg("Select PHY #1 failed: %d", ret);
goto out2;
@@ -984,6 +1011,21 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
dbg("Failed to set Internal/External PHY reset control: %d", ret);
goto out2;
}
+ }
+ else
+ {
+ if ((ret =
+ ax8817x_write_cmd(dev, AX_CMD_SW_PHY_SELECT, 0x0000, 0, 0, buf)) < 0) {
+ dbg("Select PHY #1 failed: %d", ret);
+ goto out2;
+ }
+
+ if ((ret =
+ ax8817x_write_cmd(dev, AX_CMD_SW_RESET, AX_SWRESET_IPPD | AX_SWRESET_PRL, 0, 0, buf)) < 0) {
+ dbg("Failed to power down internal PHY: %d", ret);
+ goto out2;
+ }
+ }
msleep(150);
if ((ret =
@@ -1006,6 +1048,8 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
goto out2;
}
+ if (dev->mii.phy_id == 0x10)
+ {
if (((ret =
ax8817x_read_cmd(dev, AX_CMD_READ_MII_REG, 0x0010, 2, 2, buf)) < 0)
|| (*((u16 *)buf) != 0x003b)) {
@@ -1013,26 +1057,6 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
goto out2;
}
- /* Initialize MII structure */
- dev->mii.dev = dev->net;
- dev->mii.mdio_read = ax8817x_mdio_read;
- dev->mii.mdio_write = ax8817x_mdio_write;
- dev->mii.phy_id_mask = 0xff;
- dev->mii.reg_num_mask = 0xff;
-
- /* Get the PHY id */
- if ((ret = ax8817x_read_cmd(dev, AX_CMD_READ_PHY_ID, 0, 0, 2, buf)) < 0) {
- dbg("Error reading PHY ID: %02x", ret);
- goto out2;
- } else if (ret < 2) {
- /* this should always return 2 bytes */
- dbg("AX_CMD_READ_PHY_ID returned less than 2 bytes: ret=%02x",
- ret);
- ret = -EIO;
- goto out2;
- }
- dev->mii.phy_id = *((u8 *)buf + 1);
-
if ((ret =
ax8817x_write_cmd(dev, AX_CMD_SW_RESET, AX_SWRESET_PRL, 0, 0, buf)) < 0) {
dbg("Set external PHY reset pin level: %d", ret);
@@ -1045,14 +1069,14 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
goto out2;
}
msleep(150);
-
+ }
dev->net->set_multicast_list = ax8817x_set_multicast;
dev->net->ethtool_ops = &ax88772_ethtool_ops;
ax8817x_mdio_write(dev->net, dev->mii.phy_id, MII_BMCR, BMCR_RESET);
ax8817x_mdio_write(dev->net, dev->mii.phy_id, MII_ADVERTISE,
- ADVERTISE_ALL | ADVERTISE_CSMA);
+ ADVERTISE_ALL | ADVERTISE_CSMA | ADVERTISE_PAUSE_CAP);
mii_nway_restart(&dev->mii);
if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MEDIUM_MODE, AX88772_MEDIUM_DEFAULT, 0, 0, buf)) < 0) {
@@ -1060,7 +1084,7 @@ static int ax88772_bind(struct usbnet *dev, struct usb_interface *intf)
goto out2;
}
- if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_IPG0, AX88772_IPG0_DEFAULT | AX88772_IPG1_DEFAULT,AX88772_IPG2_DEFAULT, 0, buf)) < 0) {
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_IPG0, AX88772_IPG0_DEFAULT | (AX88772_IPG1_DEFAULT << 8), AX88772_IPG2_DEFAULT, 0, buf)) < 0) {
dbg("Write IPG,IPG1,IPG2 failed: %d", ret);
goto out2;
}
@@ -1088,6 +1112,663 @@ out1:
return ret;
}
+static int mediacheck(struct usbnet *dev)
+{
+ int ret,fullduplex;
+ u16 phylinkstatus1, phylinkstatus2, data16, tempshort = 0;
+ struct ax8817x_data *ax17xdataptr = (struct ax8817x_data *)&dev->data;
+ struct ax88178_data *ax178dataptr = (struct ax88178_data *)ax17xdataptr->ax178dataptr;
+
+
+ if ((ret =ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG,dev->mii.phy_id,
+ GMII_PHY_ANLPAR, REG_LENGTH, &data16)) < 0) {
+ dbg("error on reading MII register 5 failed: %02x", ret);
+ return ret; //
+ }
+ phylinkstatus1 = le16_to_cpu(data16);
+
+ if ((ret =ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG,dev->mii.phy_id, GMII_PHY_1000BT_STATUS,
+ REG_LENGTH, &data16)) < 0) {
+ dbg("error on reading MII register 0x0a failed: %02x", ret);
+ return ret; //
+ }
+ phylinkstatus2 = le16_to_cpu(data16);
+
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){ //1st generation Marvel PHY
+ if(ax178dataptr->LedMode == 1){
+ if ((ret = ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG,dev->mii.phy_id, MARVELL_MANUAL_LED,
+ REG_LENGTH, &data16)) < 0) {
+ dbg("error on reading MII register 0x19 failed: %02x", ret);
+ return ret; //
+ }
+ tempshort = le16_to_cpu(data16);
+ tempshort &=0xfc0f;
+ }
+ }
+
+ fullduplex=1;
+ if(phylinkstatus2 & (GMII_1000_AUX_STATUS_FD_CAPABLE|GMII_1000_AUX_STATUS_HD_CAPABLE)){ /* 1000BT full duplex */
+ ax178dataptr->MediaLink =
+ MEDIUM_GIGA_MODE|MEDIUM_FULL_DUPLEX_MODE|MEDIUM_ENABLE_125MHZ|MEDIUM_ENABLE_RECEIVE;
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){
+ if(ax178dataptr->LedMode == 1){
+ tempshort|=0x3e0;
+ }
+ }
+ }else if(phylinkstatus1 & GMII_ANLPAR_100TXFD){ /* 100BT full duplex */
+ ax178dataptr->MediaLink=MEDIUM_FULL_DUPLEX_MODE|MEDIUM_ENABLE_RECEIVE|MEDIUM_MII_100M_MODE;
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){
+ if(ax178dataptr->LedMode == 1){
+ tempshort|=0x3b0;
+ }
+ }
+ }else if(phylinkstatus1 & GMII_ANLPAR_100TX){ /* 100BT half duplex */
+ ax178dataptr->MediaLink=(MEDIUM_ENABLE_RECEIVE|MEDIUM_MII_100M_MODE);
+ fullduplex=0;
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){
+ if(ax178dataptr->LedMode == 1){
+ tempshort|=0x3b0;
+ }
+ }
+ }else if(phylinkstatus1 & GMII_ANLPAR_10TFD){ /* 10 full duplex */
+ ax178dataptr->MediaLink = (MEDIUM_FULL_DUPLEX_MODE|MEDIUM_ENABLE_RECEIVE);
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){
+ if(ax178dataptr->LedMode == 1){
+ tempshort|=0x02f0;
+ }
+ }
+ }else{
+ /* 10 half duplex*/
+ ax178dataptr->MediaLink = MEDIUM_ENABLE_RECEIVE;
+ fullduplex=0;
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){
+ if(ax178dataptr->LedMode == 1){
+ tempshort|=0x02f0;
+ }
+ }
+ }
+
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){
+ if(ax178dataptr->LedMode == 1){
+ data16 = le16_to_cpu(tempshort);
+ if ( (ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, (u8)dev->mii.phy_id,
+ MARVELL_MANUAL_LED, REG_LENGTH, &data16)) < 0){
+ dbg("error on writing MII register 0x19 failed: %02x", ret);
+ return ret;
+ }
+ }
+ }
+ ax178dataptr->MediaLink |= 0x0004;
+ if(ax178dataptr->UseRgmii != 0)
+ ax178dataptr->MediaLink |= 0x0008;
+ if(fullduplex){
+ ax178dataptr->MediaLink |= 0x0020; //ebable tx flow control as default;
+ ax178dataptr->MediaLink |= 0x0010; //ebable rx flow control as default;
+ }
+
+ return 0;
+}
+
+static int marevell_init(struct usbnet *dev)
+{
+ struct ax8817x_data *ax17xdataptr = (struct ax8817x_data *)&dev->data;
+ struct ax88178_data *ax178dataptr = (struct ax88178_data *)ax17xdataptr->ax178dataptr;
+ u16 tmp,phyreg,PhyPatch,data16;
+ int ret;
+ void *buf;
+ u8 PhyID = (u8)ax178dataptr->PhyID;
+
+ buf = kmalloc(ETH_ALEN, GFP_KERNEL);
+ if(!buf)
+ return -ENOMEM;
+
+ if(ax178dataptr->UseGpio0)
+ {
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS,AXGPIOS_GPO0EN |AXGPIOS_RSE,0, 0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(25);
+ tmp = AXGPIOS_GPO2 | AXGPIOS_GPO2EN | AXGPIOS_GPO0EN;
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, tmp, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(25);
+ tmp = AXGPIOS_GPO2EN | AXGPIOS_GPO0EN;
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, tmp, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(245);
+ tmp = AXGPIOS_GPO2 | AXGPIOS_GPO2EN | AXGPIOS_GPO0EN;
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, tmp, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+
+ }
+ else /* !UseGpio0 */
+ {
+ tmp = AXGPIOS_GPO1|AXGPIOS_GPO1EN | AXGPIOS_RSE;
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, tmp, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ if(ax178dataptr->LedMode != 1) //our new demo board
+ {
+ msleep(25);
+ tmp = AXGPIOS_GPO1|AXGPIOS_GPO1EN | AXGPIOS_GPO2EN | AXGPIOS_GPO2;
+ if ((ret =ax8817x_write_cmd(dev,AX_CMD_WRITE_GPIOS,tmp,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(25);
+ tmp = AXGPIOS_GPO2EN | AXGPIOS_GPO1|AXGPIOS_GPO1EN;
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, tmp, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(245);
+ tmp = AXGPIOS_GPO1|AXGPIOS_GPO1EN|AXGPIOS_GPO2|AXGPIOS_GPO2EN;
+ if ((ret = ax8817x_write_cmd(dev,AX_CMD_WRITE_GPIOS,tmp,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ }
+ else if(ax178dataptr->LedMode == 1) //bufflo old card
+ {
+ msleep(350);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, AXGPIOS_GPO1EN, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(350);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, AXGPIOS_GPO1|AXGPIOS_GPO1EN, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ }
+ }
+
+
+ if((ret = ax8817x_read_cmd(dev, AX_CMD_READ_MII_REG, PhyID, PHY_MARVELL_STATUS, REG_LENGTH, &data16)) < 0){
+ dbg("read register reg 27 failed: %d", ret);
+ return ret;
+ } //read phy register
+
+ phyreg = le16_to_cpu(data16);
+ if(!(phyreg & MARVELL_STATUS_HWCFG)){
+ ax178dataptr->UseRgmii=1;
+ PhyPatch = MARVELL_CTRL_RXDELAY | MARVELL_CTRL_TXDELAY;
+ data16 = cpu_to_le16(PhyPatch);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, PHY_MARVELL_CTRL, REG_LENGTH, &data16)) < 0)
+ return ret;
+ ax178dataptr->MediaLink |= MEDIUM_ENABLE_125MHZ;
+ }
+
+ if(ax178dataptr->LedMode == 1){
+ if((ret = ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG, PhyID, MARVELL_LED_CTRL, REG_LENGTH, &data16))< 0)
+ return ret;
+ phyreg = le16_to_cpu(data16);
+ phyreg &= 0xf8ff;
+ phyreg |= (1+0x100);
+
+ data16 = le16_to_cpu(phyreg);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, MARVELL_LED_CTRL, REG_LENGTH,&data16))< 0)
+ return ret;
+ if((ret = ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG, PhyID, MARVELL_LED_CTRL, REG_LENGTH, &data16))< 0)
+ return ret;
+ phyreg = le16_to_cpu(data16);
+ phyreg &=0xfc0f;
+ } else if(ax178dataptr->LedMode == 2){
+
+ if((ret = ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG, PhyID, MARVELL_LED_CTRL, REG_LENGTH, &data16))< 0)
+ return ret;
+
+ phyreg = le16_to_cpu(data16);
+ phyreg &= 0xf886;
+ phyreg |= (1+0x10+0x300);
+ data16 = cpu_to_le16(phyreg);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, MARVELL_LED_CTRL, REG_LENGTH,&data16))< 0)
+ return ret;
+
+ }else if(ax178dataptr->LedMode == 5){
+ if((ret = ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG, PhyID, MARVELL_LED_CTRL, REG_LENGTH, &data16))< 0)
+ return ret;
+ phyreg = le16_to_cpu(data16);
+ phyreg &= 0xf8be;
+ phyreg |= (1+0x40+0x300);
+ data16 = cpu_to_le16(phyreg);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, MARVELL_LED_CTRL, REG_LENGTH,&data16))< 0)
+ return ret;
+ }
+
+ ax178dataptr->phyreg = phyreg;
+ return 0;
+}
+
+static int cicada_init(struct usbnet *dev)
+{
+
+ struct ax8817x_data *ax17xdataptr = (struct ax8817x_data *)&dev->data;
+ struct ax88178_data *ax178dataptr = (struct ax88178_data *)ax17xdataptr->ax178dataptr;
+ u16 tmp, phyreg, i, data16;
+ int ret;
+ void *buf;
+ u8 PhyID = (u8)ax178dataptr->PhyID;
+
+ buf = kmalloc(ETH_ALEN, GFP_KERNEL);
+ if(!buf)
+ return -ENOMEM;
+
+ if(ax178dataptr->UseGpio0)
+ {
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS,AXGPIOS_GPO0 | AXGPIOS_GPO0EN | AXGPIOS_RSE,0, 0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ }
+ else
+ {
+ tmp = AXGPIOS_GPO1|AXGPIOS_GPO1EN | AXGPIOS_RSE;
+ if ((ret =ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS,tmp,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ if(ax178dataptr->LedMode!= 1) //our new demo board
+ {
+ msleep(25);
+ tmp = AXGPIOS_GPO1|AXGPIOS_GPO1EN | AXGPIOS_GPO2EN | AXGPIOS_GPO2;
+ if ((ret =ax8817x_write_cmd(dev,AX_CMD_WRITE_GPIOS,tmp,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(25);
+ tmp = AXGPIOS_GPO2EN | AXGPIOS_GPO1|AXGPIOS_GPO1EN;
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, tmp, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(245);
+ tmp = AXGPIOS_GPO1|AXGPIOS_GPO1EN|AXGPIOS_GPO2|AXGPIOS_GPO2EN;
+ if ((ret = ax8817x_write_cmd(dev,AX_CMD_WRITE_GPIOS,tmp,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ }
+ else if(ax178dataptr->LedMode==1) //bufflo old card
+ {
+ msleep(350);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, AXGPIOS_GPO1EN, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(350);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, AXGPIOS_GPO1|AXGPIOS_GPO1EN, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ }
+ }
+
+ if(ax178dataptr->PhyMode == PHY_MODE_CICADA_FAMILY){ //CICADA 1st version phy
+ ax178dataptr->UseRgmii=1;
+ ax178dataptr->MediaLink |= MEDIUM_ENABLE_125MHZ;// MEDIUM_ENABLE_125MHZ;
+
+ for (i = 0; i < sizeof(CICADA_FAMILY_HWINIT)/sizeof(CICADA_FAMILY_HWINIT[0]); i++) {
+ data16 = cpu_to_le16(CICADA_FAMILY_HWINIT[i].value);
+ ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, CICADA_FAMILY_HWINIT[i].offset, REG_LENGTH, &data16);
+ if(ret < 0) return ret;
+ }
+ }
+ else if(ax178dataptr->PhyMode == PHY_MODE_CICADA_V2){
+ ax178dataptr->UseRgmii=1;
+ ax178dataptr->MediaLink |= MEDIUM_ENABLE_125MHZ;
+
+ for (i = 0; i < ( sizeof(CICADA_V2_HWINIT)/sizeof(CICADA_V2_HWINIT[0]) ); i++) {
+ data16 = cpu_to_le16(CICADA_V2_HWINIT[i].value);
+ ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, CICADA_V2_HWINIT[i].offset, REG_LENGTH, &data16);
+ if(ret < 0) return ret;
+ }
+ }
+ else if(ax178dataptr->PhyMode == PHY_MODE_CICADA_V2_ASIX){
+ ax178dataptr->UseRgmii=1;
+ ax178dataptr->MediaLink |= MEDIUM_ENABLE_125MHZ;
+
+ for (i = 0; i < ( sizeof(CICADA_V2_ASIX_HWINIT)/sizeof(CICADA_V2_ASIX_HWINIT[0]) ); i++) {
+ data16 = cpu_to_le16(CICADA_V2_ASIX_HWINIT[i].value);
+ ret=ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, CICADA_V2_ASIX_HWINIT[i].offset, REG_LENGTH, &data16);
+ if(ret < 0) return ret;
+ }
+ }
+
+ if(ax178dataptr->PhyMode == PHY_MODE_CICADA_FAMILY){
+ if(ax178dataptr->LedMode == 3){
+ if((ret = ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG, PhyID, 27, 2, &data16))< 0)
+ return ret;
+ phyreg = le16_to_cpu(data16);
+ phyreg &= 0xfcff;
+ phyreg |= 0x0100;
+ data16 = cpu_to_le16(phyreg);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, 27,2,&data16))< 0)
+ return ret;
+ }
+ }
+ return 0;
+}
+
+static int agere_init(struct usbnet *dev)
+{
+ struct ax8817x_data *ax17xdataptr = (struct ax8817x_data *)&dev->data;
+ struct ax88178_data *ax178dataptr = (struct ax88178_data *)ax17xdataptr->ax178dataptr;
+ u16 tmp, phyreg, i;
+ int ret;
+ void *buf;
+ u8 PhyID = (u8)ax178dataptr->PhyID;
+
+ buf = kmalloc(ETH_ALEN, GFP_KERNEL);
+ if(!buf)
+ return -ENOMEM;
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS,AXGPIOS_GPO1|AXGPIOS_GPO1EN | AXGPIOS_RSE,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(25);
+ if ((ret=ax8817x_write_cmd(dev,AX_CMD_WRITE_GPIOS,AXGPIOS_GPO1|AXGPIOS_GPO1EN
+ | AXGPIOS_GPO2EN | AXGPIOS_GPO2,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(25);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_GPIOS, AXGPIOS_GPO2EN | AXGPIOS_GPO1
+ | AXGPIOS_GPO1EN, 0, 0, buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+ msleep(245);
+ if ((ret=ax8817x_write_cmd(dev,AX_CMD_WRITE_GPIOS,AXGPIOS_GPO1|AXGPIOS_GPO1EN|AXGPIOS_GPO2
+ | AXGPIOS_GPO2EN,0,0,buf)) < 0){
+ dbg("write GPIO failed: %d", ret);
+ return ret;
+ }
+
+ ax178dataptr->UseRgmii=1;
+ ax178dataptr->MediaLink |= MEDIUM_ENABLE_125MHZ;
+
+ phyreg = cpu_to_le16(BMCR_RESET);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, MII_BMCR, REG_LENGTH, &phyreg)) < 0) {
+ dbg("Failed to write MII reg - MII_BMCR: %02x", ret);
+ return ret;
+ } //software reset
+
+ while (1)
+ {
+ phyreg = cpu_to_le16(0x1001);
+ ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, 21, REG_LENGTH, &phyreg);
+ msleep(10);
+ ax8817x_read_cmd(dev, AX_CMD_READ_MII_REG, PhyID, 21, REG_LENGTH, &phyreg);
+ tmp = le16_to_cpu(phyreg);
+ if ((tmp & 0xf00f) == 0x1001)
+ break;
+ msleep(10);
+ }
+
+ if (ax178dataptr->LedMode == 4)
+ {
+ phyreg = cpu_to_le16(0x7417);
+ ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, 28, 2, &phyreg);
+ }
+ else if (ax178dataptr->LedMode == 9)
+ {
+ phyreg = cpu_to_le16(0x7a10);
+ ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, 28, 2, &phyreg);
+ }
+ else if (ax178dataptr->LedMode == 10)
+ {
+ phyreg = cpu_to_le16(0x7a13);
+ ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, PhyID, 28, 2, &phyreg);
+ }
+
+ for (i = 0; i < ( sizeof(AGERE_FAMILY_HWINIT)/sizeof(AGERE_FAMILY_HWINIT[0]) ); i++) {
+ phyreg = cpu_to_le16(AGERE_FAMILY_HWINIT[i].value);
+ ret=ax8817x_write_cmd(dev,AX_CMD_WRITE_MII_REG,PhyID,AGERE_FAMILY_HWINIT[i].offset,REG_LENGTH,&phyreg);
+ if(ret < 0) return ret;
+ }
+
+ return 0;
+}
+
+static int phy_init(struct usbnet *dev)
+{
+ struct ax8817x_data *ax17xdataptr = (struct ax8817x_data *)&dev->data;
+ struct ax88178_data *ax178dataptr = (struct ax88178_data *)ax17xdataptr->ax178dataptr;
+ int ret;
+ u16 tmp, data16, phyanar, phyauxctrl, phyctrl, phyreg = 0;
+ void *buf;
+
+ buf = kmalloc(ETH_ALEN, GFP_KERNEL);
+ if(!buf)
+ return -ENOMEM;
+
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL) {
+ if((ret = marevell_init(dev)) < 0) return ret;
+ }else if(ax178dataptr->PhyMode == PHY_MODE_CICADA_FAMILY) {
+ if((ret = cicada_init(dev)) < 0) return ret;
+ }else if(ax178dataptr->PhyMode == PHY_MODE_CICADA_V1) {
+ if((ret = cicada_init(dev)) < 0) return ret;
+ }else if(ax178dataptr->PhyMode == PHY_MODE_CICADA_V2_ASIX) {
+ if((ret = cicada_init(dev)) < 0) return ret;
+ }else if(ax178dataptr->PhyMode == PHY_MODE_AGERE_FAMILY) {
+ if((ret = agere_init(dev)) < 0) return ret;
+ }
+
+ if(ax178dataptr->PhyMode != PHY_MODE_AGERE_FAMILY)
+ {
+ /* reset phy */
+ data16 = cpu_to_le16(BMCR_RESET);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, dev->mii.phy_id,
+ MII_BMCR, REG_LENGTH, (void *)(&data16))) < 0) {
+ dbg("Failed to write MII reg - MII_BMCR: %02x", ret);
+ return ret;
+ }
+ }
+
+ if ((ret = ax8817x_read_cmd(dev,AX_CMD_READ_MII_REG, dev->mii.phy_id , MII_BMCR,
+ REG_LENGTH, &data16)) < 0) {
+ dbg("error on read MII reg - MII_BMCR: %02x", ret);
+ return ret; //could be 0x0000
+ }
+
+ phyctrl = le16_to_cpu(data16);
+ tmp=phyctrl;
+ phyctrl &=~(BMCR_PDOWN|BMCR_ISOLATE);
+ if(phyctrl != tmp){
+ data16 = cpu_to_le16(phyctrl);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG, dev->mii.phy_id, MII_BMCR,
+ REG_LENGTH, &data16)) < 0) {
+ dbg("Failed to write MII reg - MII_BMCR: %02x", ret);
+ return ret;
+ }
+
+ }
+
+ phyctrl&= ~BMCR_ISOLATE;
+ phyanar=1+(0x0400|ADVERTISE_100FULL|ADVERTISE_100HALF|ADVERTISE_10FULL|ADVERTISE_10HALF);
+ phyauxctrl=0x0200; //1000M and full duplex
+
+ data16 = cpu_to_le16(phyanar);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG,dev->mii.phy_id,
+ GMII_PHY_ANAR,REG_LENGTH,&data16))< 0) return ret;
+
+ data16 = cpu_to_le16(phyauxctrl);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG,dev->mii.phy_id,
+ GMII_PHY_1000BT_CONTROL,REG_LENGTH,&data16))< 0) return ret;
+
+ phyctrl |= (BMCR_ANENABLE|BMCR_ANRESTART);
+ data16 = cpu_to_le16(phyctrl);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG,dev->mii.phy_id,
+ GMII_PHY_CONTROL,REG_LENGTH,&data16))< 0) return ret;
+
+ if(ax178dataptr->PhyMode == PHY_MODE_MARVELL){
+ if(ax178dataptr->LedMode==1) {
+ phyreg |= 0x3f0;
+ data16 = cpu_to_le16(phyreg);
+ if((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MII_REG,dev->mii.phy_id,
+ 25,REG_LENGTH,&data16))< 0) return ret;
+ }
+ }
+
+ msleep(3000);
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_IPG0,
+ (AX88772_IPG0_DEFAULT | (AX88772_IPG1_DEFAULT << 8)),
+ 0x000e, 0, buf)) < 0) {
+ dbg("write IPG IPG1 IPG2 reg failed: %d", ret);
+ return ret;
+ }
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_SET_HW_MII, 0, 0, 0, buf)) < 0) {
+ dbg("disable PHY access failed: %d", ret);
+ return ret;
+ }
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_RX_CTL,
+ (AX_RX_CTL_MFB | AX_RX_CTL_START | AX_RX_CTL_AB),
+ 0, 0, buf)) < 0) {
+ dbg("write RX ctrl reg failed: %d", ret);
+ return ret;
+ }
+
+ return 0;
+
+}
+
+static int ax88178_bind(struct usbnet *dev, struct usb_interface *intf)
+{
+ int ret;
+ void *buf;
+ u16 EepromData,PhyID, temp16;
+ struct ax8817x_data *ax17xdataptr = (struct ax8817x_data *)&dev->data;
+ struct ax88178_data *ax178dataptr;
+
+ get_endpoints(dev,intf);
+
+ buf = kmalloc(6, GFP_KERNEL);
+ if(!buf) {
+ dbg ("Cannot allocate memory for buffer");
+ return -ENOMEM;
+ }
+
+ /* allocate 178 data */
+ if (!(ax178dataptr = kmalloc (sizeof(struct ax88178_data), GFP_KERNEL))) {
+ dbg ("Cannot allocate memory for AX88178 data");
+ return -ENOMEM;
+ }
+ memset (ax178dataptr, 0, sizeof(struct ax88178_data));
+ ax17xdataptr->ax178dataptr = ax178dataptr;
+ /* end of allocate 178 data */
+
+ if ((ret = ax8817x_write_cmd(dev, 0x22, 0x0000, 0, 0, buf)) < 0) {
+ dbg("write S/W reset failed: %d", ret);
+ return ret;
+ }
+ msleep(150);
+
+ if ((ret = ax8817x_write_cmd(dev, 0x20, 0x0048, 0, 0, buf)) < 0) {
+ dbg("write S/W reset failed: %d", ret);
+ return ret;
+ }
+ msleep(150);
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_RX_CTL, 0x0000, 0, 0, buf)) < 0) {
+ dbg("send AX_CMD_WRITE_RX_CTL failed: %d", ret);
+ return ret; //stop rcv
+ }
+
+ msleep(150);
+
+ /* Get the MAC address */
+ memset(buf, 0, ETH_ALEN);
+ if ((ret = ax8817x_read_cmd(dev, AX88772_CMD_READ_NODE_ID, 0, 0, ETH_ALEN, buf)) < 0) {
+ dbg("read AX_CMD_READ_NODE_ID failed: %d", ret);
+ return ret;
+ }
+ memcpy(dev->net->dev_addr, buf, ETH_ALEN);
+ /* End of get MAC address */
+
+
+ /* Get the EEPROM data*/
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_EEPROM_EN, 0, 0, 0, buf)) < 0) {
+ dbg("enable SROM reading failed: %d", ret);
+ return ret; // ???
+ }
+
+ if ((ret = ax8817x_read_cmd(dev, AX_CMD_READ_EEPROM,
+ 0x0017, 0, 2, (void *)(&EepromData))) < 0) {
+ dbg("read SROM address 17h failed: %d", ret);
+ return ret;
+ }
+
+ ax178dataptr->EepromData = le16_to_cpu(EepromData);
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_EEPROM_DIS, 0, 0, 0, buf)) < 0) {
+ dbg("disable SROM reading failed: %d", ret);
+ return ret; // ???
+ }
+ /* End of get EEPROM data */
+
+ /* Get PHY id */
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_SET_SW_MII, 0x0000, 0, 0, buf)) < 0) {
+ dbg("enable PHY reg. access capability: %d", ret);
+ return ret; //enable Phy register access capability
+ }
+
+ if ((ret = ax8817x_read_cmd(dev, AX_CMD_READ_PHY_ID, 0, 0, REG_LENGTH, &temp16)) < 0) {
+ dbg("error on read AX_CMD_READ_PHY_ID: %02x", ret);
+ return ret;
+ } else if (ret < 2) {
+ /* this should always return 2 bytes */
+ dbg("AX_CMD_READ_PHY_ID returned less than 2 bytes: ret=%02x", ret);
+ return -EIO;
+ }
+
+ PhyID = le16_to_cpu(temp16);
+ PhyID = (PhyID >> 8) & PHY_ID_MASK;
+ ax178dataptr->PhyID = PhyID;
+ /* End of get PHY id */
+
+ /* Initialize MII structure */
+ dev->mii.dev = dev->net;
+ dev->mii.mdio_read = ax8817x_mdio_read;
+ dev->mii.mdio_write = ax8817x_mdio_write;
+ dev->mii.phy_id_mask = 0x3f;
+ dev->mii.reg_num_mask = 0x1f;
+ dev->mii.phy_id = (u8)ax178dataptr->PhyID;
+
+ if (ax178dataptr->EepromData == 0xffff)
+ {
+ ax178dataptr->PhyMode = PHY_MODE_MARVELL;
+ ax178dataptr->LedMode = 0;
+ ax178dataptr->UseGpio0 = 1; //True
+ }
+ else
+ {
+ ax178dataptr->PhyMode = (u8)(ax178dataptr->EepromData & EEPROMMASK);
+ ax178dataptr->LedMode = (u8)(ax178dataptr->EepromData>>8);
+ if(ax178dataptr->EepromData & 0x80) {
+ ax178dataptr->UseGpio0=0; //MARVEL se and other
+ }
+ else {
+ ax178dataptr->UseGpio0=1; //cameo
+ }
+ }
+
+ if ((ret = phy_init(dev)) < 0) return ret;
+
+ return 0;
+}
+
static int ax88772_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
{
u32 *header;
@@ -1199,6 +1880,70 @@ static int ax88772_link_reset(struct usbnet *dev)
return 0;
}
+static int set_media(struct usbnet *dev)
+{
+ int ret;
+ void *buf;
+ struct ax8817x_data *ax17xdataptr = (struct ax8817x_data *)&dev->data;
+ struct ax88178_data *ax178dataptr = (struct ax88178_data *)ax17xdataptr->ax178dataptr;
+
+ buf = kmalloc(ETH_ALEN, GFP_KERNEL);
+ if(!buf)
+ return -ENOMEM;
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_SET_SW_MII, 0x0000, 0, 0, buf)) < 0) {
+ dbg("enable PHY reg. access capability: %d", ret);
+ return ret; //enable Phy register access capability
+ }
+
+ mediacheck(dev);
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_WRITE_MEDIUM_MODE,
+ ax178dataptr->MediaLink, 0, 0, buf)) < 0) {
+ dbg("write mode medium reg failed: %d", ret);
+ return ret;
+ }
+
+ if ((ret = ax8817x_write_cmd(dev, AX_CMD_SET_HW_MII, 0, 0, 0, buf)) < 0) {
+ dbg("disable PHY access failed: %d", ret);
+ return ret;
+ }
+
+ dev->net->set_multicast_list = ax8817x_set_multicast;
+ dev->net->ethtool_ops = &ax8817x_ethtool_ops;
+ return 0;
+}
+
+static int ax88178_link_reset(struct usbnet *dev)
+{
+ int ret;
+
+ if ((ret = set_media(dev)) < 0) return ret;
+ return 0;
+}
+
+static const struct driver_info ax88178_info = {
+ .description = "ASIX AX88178 USB 2.0 Ethernet",
+ .bind = ax88178_bind,
+ .status = ax8817x_status,
+ .link_reset = ax88178_link_reset,
+ .flags = FLAG_ETHER|FLAG_FRAMING_AX,
+ .rx_fixup = ax88772_rx_fixup,
+ .tx_fixup = ax88772_tx_fixup,
+ .data = 0x00130103, //useless here
+};
+
+static const struct driver_info belkin178_info = {
+ .description = "Belkin Gigabit USB 2.0 Network Adapter",
+ .bind = ax88178_bind,
+ .status = ax8817x_status,
+ .link_reset = ax88178_link_reset,
+ .flags = FLAG_ETHER|FLAG_FRAMING_AX,
+ .rx_fixup = ax88772_rx_fixup,
+ .tx_fixup = ax88772_tx_fixup,
+ .data = 0x00130103, //useless here
+};
+
static const struct driver_info ax8817x_info = {
.description = "ASIX AX8817x USB 2.0 Ethernet",
.bind = ax8817x_bind,
@@ -1251,6 +1996,18 @@ static const struct driver_info ax88772_info = {
.data = 0x00130103,
};
+static const struct driver_info dlink_dub_e100b_info = {
+ .description = "D-Link DUB-E100 USB 2.0 Fast Ethernet Adapter",
+ .bind = ax88772_bind,
+ .status = ax8817x_status,
+ .link_reset = ax88772_link_reset,
+ .reset = ax88772_link_reset,
+ .flags = FLAG_ETHER | FLAG_FRAMING_AX,
+ .rx_fixup = ax88772_rx_fixup,
+ .tx_fixup = ax88772_tx_fixup,
+ .data = 0x00130103,
+};
+
#endif /* CONFIG_USB_AX8817X */
^ permalink raw reply related
* Re: Getting the correct asix AX88178 usb gige driver in mainline?
From: Marc MERLIN @ 2011-07-06 21:08 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: netdev, greg
In-Reply-To: <201107062209.05794.arnd@arndb.de>
On Wed, Jul 06, 2011 at 10:09:05PM +0200, Arnd Bergmann wrote:
> > Here are the details. If somehow their driver could be integrated in
> > mainline by putting the relevant bits in the current driver, that would be
> > fantastic :)
> > (obviously it would have been better if they had done that themselves to
> > start with, no idea why they didn't).
> >
>
> Hi Marc,
>
> I've taken a look at the driver you linked to and compared it to the
> version that was closest at the time.
>
> This is similar to the patch they must have had at some point. I would guess
> that the answer is somewhere in there. It's quite different to the much
> cleaner patch 933a27d39e "USB: asix - Add AX88178 support and many other
> changes", which was merged later with a similar intention.
(...)
> The patch I mentioned was merged back in 2006, for 2.6.19. Either that
> patch was never complete and is missing support for your hardware, or
> it broke since then. You should probably try an old kernel to see if it's
> actually a regression.
Thanks for the details Arnd, I'll see if I can boot 2.6.19 on that laptop
and report back.
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
^ permalink raw reply
* Re: [Bugme-new] [Bug 38862] New: No support for DGE-530T Rev C1
From: Andrew Morton @ 2011-07-06 21:37 UTC (permalink / raw)
To: jameshenderson; +Cc: bugme-daemon, netdev, Stephen Hemminger
In-Reply-To: <bug-38862-10286@https.bugzilla.kernel.org/>
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Wed, 6 Jul 2011 16:05:36 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=38862
>
> Summary: No support for DGE-530T Rev C1
> Product: Drivers
> Version: 2.5
> Kernel Version: 2.6.39.2
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Network
> AssignedTo: drivers_network@kernel-bugs.osdl.org
> ReportedBy: jameshenderson@ruggedcom.com
> Regression: No
>
>
> The kernel support DGE-530T REV B2 through the skge driver. The PCI device id
> of REV-B2 is 1186:4302. Rev C1 has a PCI device id of 1186:4302. No driver in
> the current kernel supports this device id/vendor combination. Furthermore,
> this device is not even listed in the device database ->
> http://www.pcidatabase.com/vendor_details.php?id=921 .
>
> On the chip is the following information:
> D-Link
> DLG10028C
> A8A34A1
> GA50 TAIWAN
>
> A sticker on the card reads:
> DGE-530T Rev C1
>
> I am including a picture I took of the card.
>
Did you test simply adding that device to the driver?
--- a/drivers/net/skge.c~a
+++ a/drivers/net/skge.c
@@ -89,6 +89,7 @@ static DEFINE_PCI_DEVICE_TABLE(skge_id_t
{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_YU) },
{ PCI_DEVICE(PCI_VENDOR_ID_DLINK, PCI_DEVICE_ID_DLINK_DGE510T) },
{ PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b01) }, /* DGE-530T */
+ { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4302) }, /* DGE-530T Rev C1 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4320) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x5005) }, /* Belkin */
{ PCI_DEVICE(PCI_VENDOR_ID_CNET, PCI_DEVICE_ID_CNET_GIGACARD) },
_
^ permalink raw reply
* Re: [PATCH v2 net-next af-packet 1/2] Enhance af-packet to provide (near zero)lossless packet capture functionality.
From: chetan loke @ 2011-07-06 21:45 UTC (permalink / raw)
To: David Miller
Cc: netdev, eric.dumazet, joe, bhutchings, shemminger, linux-kernel
In-Reply-To: <20110705.080123.2174577714045488116.davem@davemloft.net>
On Tue, Jul 5, 2011 at 11:01 AM, David Miller <davem@davemloft.net> wrote:
>
> That issue only exists because you haven't defined a common header
> struct that the current, and all future, block descriptor variants can
> include at the start of their definitions.
what's common today may not be common tomorrow. After much thinking I
decided to not provide a generic header because I wouldn't want to
enforce anything.
new format:
union bd_header_u {
/* renamed struct bd_v1 to hdr_v1 */
struct hdr_v1 h1;
} __attribute__ ((__packed__));
struct block_desc {
__u16 version;
__u16 offset_to_priv;
union bd_header_u hdr;
} __attribute__ ((__packed__));
Is this ok with you?
>
> Use real data structures, not opaque "offset+size" poking into the
> descriptors.
>
Used to writing firmware APIs. APIs use words/bytes so that they can
be interpreted by firmware folks too.
Chetan Loke
^ permalink raw reply
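A minimal userspace sketch of the versioned layout proposed above, assuming a reader that dispatches on the `version` field. The fields inside `struct hdr_v1` and the helper `block_desc_v1()` are hypothetical; the thread does not show the real v1 header contents.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical v1 header fields; the thread does not list the real ones. */
struct hdr_v1 {
    uint32_t block_status;
    uint32_t num_pkts;
} __attribute__((__packed__));

/* Future header variants would be added as further union members. */
union bd_header_u {
    struct hdr_v1 h1;
} __attribute__((__packed__));

struct block_desc {
    uint16_t version;
    uint16_t offset_to_priv;
    union bd_header_u hdr;
} __attribute__((__packed__));

/*
 * Return the v1 header, or NULL when the block carries a version this
 * reader does not understand.
 */
static const struct hdr_v1 *block_desc_v1(const struct block_desc *bd)
{
    return bd->version == 1 ? &bd->hdr.h1 : NULL;
}
```

With this shape a consumer checks `version` once and then works with typed fields, rather than poking at raw offsets.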
* Re: [PATCH 01/10] dynamic_debug: Add __dynamic_dev_dbg
From: Joe Perches @ 2011-07-06 21:46 UTC (permalink / raw)
To: Jason Baron
Cc: gregkh, jim.cromie, bvanassche, linux-kernel, davem,
aloisio.almeida, netdev
In-Reply-To: <6d176b4a9fa316e5130455b6a6b47ba1dd0501d5.1309967232.git.root@dhcp-100-18-164.bos.redhat.com>
On Wed, 2011-07-06 at 13:24 -0400, Jason Baron wrote:
> Unlike dynamic_pr_debug, dynamic uses of dev_dbg can not
> currently add task_pid/KBUILD_MODNAME/__func__/__LINE__
> to selected debug output.
> Add a new function similar to dynamic_pr_debug to
> optionally emit these prefixes.
[]
> diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
[]
> @@ -456,6 +457,43 @@ int __dynamic_pr_debug(struct _ddebug *descriptor, const char *fmt, ...)
[]
> +int __dynamic_dev_dbg(struct _ddebug *descriptor,
> + const struct device *dev, const char *fmt, ...)
[]
> + res += __dev_printk("", dev, &vaf);
I suppose that should more properly be written as:
res += __dev_printk(KERN_CONT, dev, &vaf);
^ permalink raw reply
* Re: [PATCH 07/10] dynamic_debug: make netdev_dbg() call __netdev_printk()
From: Joe Perches @ 2011-07-06 21:50 UTC (permalink / raw)
To: Jason Baron
Cc: gregkh, jim.cromie, bvanassche, linux-kernel, davem,
aloisio.almeida, netdev
In-Reply-To: <2ac0aaf4e955209cfad896c72fdb6b1491b021e1.1309967232.git.root@dhcp-100-18-164.bos.redhat.com>
On Wed, 2011-07-06 at 13:25 -0400, Jason Baron wrote:
> diff --git a/lib/dynamic_debug.c b/lib/dynamic_debug.c
[]
> @@ -503,6 +504,30 @@ int __dynamic_dev_dbg(struct _ddebug *descriptor,
[]
> +int __dynamic_netdev_dbg(struct _ddebug *descriptor,
> + const struct net_device *dev, const char *fmt, ...)
[]
> + res += __netdev_printk("", dev, &vaf);
KERN_CONT here too.
^ permalink raw reply
* Re: [Bugme-new] [Bug 38862] New: No support for DGE-530T Rev C1
From: James Henderson @ 2011-07-06 21:48 UTC (permalink / raw)
To: Andrew Morton
Cc: bugme-daemon@bugzilla.kernel.org, netdev@vger.kernel.org,
Stephen Hemminger
In-Reply-To: <20110706143709.0f5ab7d6.akpm@linux-foundation.org>
Andrew Morton wrote:
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Wed, 6 Jul 2011 16:05:36 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
>
>
>> https://bugzilla.kernel.org/show_bug.cgi?id=38862
>>
>> Summary: No support for DGE-530T Rev C1
>> Product: Drivers
>> Version: 2.5
>> Kernel Version: 2.6.39.2
>> Platform: All
>> OS/Version: Linux
>> Tree: Mainline
>> Status: NEW
>> Severity: normal
>> Priority: P1
>> Component: Network
>> AssignedTo: drivers_network@kernel-bugs.osdl.org
>> ReportedBy: jameshenderson@ruggedcom.com
>> Regression: No
>>
>>
>> The kernel support DGE-530T REV B2 through the skge driver. The PCI device id
>> of REV-B2 is 1186:4302. Rev C1 has a PCI device id of 1186:4302. No driver in
>> the current kernel supports this device id/vendor combination. Furthermore,
>> this device is not even listed in the device database ->
>> http://www.pcidatabase.com/vendor_details.php?id=921 .
>>
>> On the chip is the following information:
>> D-Link
>> DLG10028C
>> A8A34A1
>> GA50 TAIWAN
>>
>> A sticker on the card reads:
>> DGE-530T Rev C1
>>
>> I am including a picture I took of the card.
>>
>>
>
> Did you test simply adding that device to the driver?
>
> --- a/drivers/net/skge.c~a
> +++ a/drivers/net/skge.c
> @@ -89,6 +89,7 @@ static DEFINE_PCI_DEVICE_TABLE(skge_id_t
> { PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_YU) },
> { PCI_DEVICE(PCI_VENDOR_ID_DLINK, PCI_DEVICE_ID_DLINK_DGE510T) },
> { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b01) }, /* DGE-530T */
> + { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4302) }, /* DGE-530T Rev C1 */
> { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4320) },
> { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x5005) }, /* Belkin */
> { PCI_DEVICE(PCI_VENDOR_ID_CNET, PCI_DEVICE_ID_CNET_GIGACARD) },
> _
>
>
No I haven't tested that change - I don't have a kernel development
environment setup and unfortunately I don't have any more work time to
budget to the issue beyond reporting it.
Also, I meant to say that Rev B2 has PCI id 1186:4B01 (although you seem
to have figured that out).
Thanks,
James
^ permalink raw reply
* Re: [Bugme-new] [Bug 38862] New: No support for DGE-530T Rev C1
From: Andrew Morton @ 2011-07-06 21:54 UTC (permalink / raw)
To: James Henderson
Cc: bugme-daemon@bugzilla.kernel.org, netdev@vger.kernel.org,
Stephen Hemminger
In-Reply-To: <4E14D83C.1090506@ruggedcom.com>
On Wed, 6 Jul 2011 17:48:44 -0400
James Henderson <JamesHenderson@ruggedcom.com> wrote:
> > Did you test simply adding that device to the driver?
> >
> > --- a/drivers/net/skge.c~a
> > +++ a/drivers/net/skge.c
> > @@ -89,6 +89,7 @@ static DEFINE_PCI_DEVICE_TABLE(skge_id_t
> > { PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_YU) },
> > { PCI_DEVICE(PCI_VENDOR_ID_DLINK, PCI_DEVICE_ID_DLINK_DGE510T) },
> > { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b01) }, /* DGE-530T */
> > + { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4302) }, /* DGE-530T Rev C1 */
> > { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4320) },
> > { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x5005) }, /* Belkin */
> > { PCI_DEVICE(PCI_VENDOR_ID_CNET, PCI_DEVICE_ID_CNET_GIGACARD) },
> > _
> >
> >
> No I haven't tested that change - I don't have a kernel development
> environment setup and unfortunately I don't have any more work time to
> budget to the issue beyond reporting it.
>
> Also, I meant to say that Rev B2 has PCI id 1186:4B01 (although you seem
> to have figured that out).
OK, I suppose we can add that info thusly:
--- a/drivers/net/skge.c~drivers-net-skgec-support-dlink-dge-530t-rev-c1
+++ a/drivers/net/skge.c
@@ -88,7 +88,8 @@ static DEFINE_PCI_DEVICE_TABLE(skge_id_t
{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_GE) },
{ PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_YU) },
{ PCI_DEVICE(PCI_VENDOR_ID_DLINK, PCI_DEVICE_ID_DLINK_DGE510T) },
- { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b01) }, /* DGE-530T */
+ { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b01) }, /* DGE-530T Rev B2 */
+ { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4302) }, /* DGE-530T Rev C1 */
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4320) },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x5005) }, /* Belkin */
{ PCI_DEVICE(PCI_VENDOR_ID_CNET, PCI_DEVICE_ID_CNET_GIGACARD) },
_
although that might be misleading if, say, 0x4b01 describes other
revisions.
But there isn't much we can do with this until someone can test the
change.
^ permalink raw reply
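For readers unfamiliar with why a one-line table entry can be enough, here is a simplified userspace sketch of vendor/device matching against an ID table. The names `struct pci_id`, `example_ids` and `id_table_matches()` are hypothetical; the kernel's own matching also considers subsystem IDs and class, which this sketch leaves out. The vendor value 0x1186 is taken from the 1186:xxxx IDs quoted in the report.

```c
#include <stddef.h>
#include <stdint.h>

struct pci_id {
    uint16_t vendor;
    uint16_t device;
};

#define VENDOR_DLINK 0x1186 /* vendor half of the IDs quoted in the report */

/* Illustrative subset only; this is not the real skge table. */
static const struct pci_id example_ids[] = {
    { VENDOR_DLINK, 0x4b01 }, /* DGE-530T Rev B2 */
    { VENDOR_DLINK, 0x4302 }, /* DGE-530T Rev C1 (proposed, untested) */
};

/* Return 1 if (vendor, device) appears in the table, 0 otherwise. */
static int id_table_matches(const struct pci_id *tbl, size_t n,
                            uint16_t vendor, uint16_t device)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].vendor == vendor && tbl[i].device == device)
            return 1;
    }
    return 0;
}
```

A match only means the driver will be asked to probe the device; whether the Rev C1 hardware actually works with skge still needs testing, as noted above.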
* Re: [PATCHv2] sctp: Enforce retransmission limit during shutdown
From: Thomas Graf @ 2011-07-06 21:58 UTC (permalink / raw)
To: Vladislav Yasevich
Cc: netdev, davem, Wei Yongjun, Sridhar Samudrala, linux-sctp
In-Reply-To: <4E148C16.8090505@hp.com>
On Wed, Jul 06, 2011 at 12:23:50PM -0400, Vladislav Yasevich wrote:
> You are right. Without a receiver patch, a linux receiver would stay in 0-window condition
> while sending a SHUTDOWN with a_rwnd of 0.
>
> How about instead of checking for "Not greater than or equals", we instead simply test for
> "less than"?
Agreed
Will repost the patch with your suggestions included and look into the
receiver patch as well.
^ permalink raw reply
* Re: [PATCH 08/10] dynamic_debug: make netif_dbg() call __netdev_printk()
From: Joe Perches @ 2011-07-06 21:59 UTC (permalink / raw)
To: Jason Baron
Cc: gregkh, jim.cromie, bvanassche, linux-kernel, davem,
aloisio.almeida, netdev
In-Reply-To: <889f3300a96f381aee1239ea775014fff26d93c9.1309967232.git.root@dhcp-100-18-164.bos.redhat.com>
On Wed, 2011-07-06 at 13:25 -0400, Jason Baron wrote:
> From: Jason Baron <jbaron@redhat.com>
>
> Previously, netif_dbg() was using dynamic_dev_dbg() to perform
> the underlying printk. Fix it to use __netdev_printk(), instead.
>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Jason Baron <jbaron@redhat.com>
> ---
> include/linux/dynamic_debug.h | 12 ++++++++++++
> include/linux/netdevice.h | 6 ++----
> 2 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/dynamic_debug.h b/include/linux/dynamic_debug.h
[]
> +#define dynamic_netif_dbg(dev, cond, fmt, ...) do { \
> + static struct _ddebug descriptor \
> + __used \
> + __attribute__((section("__verbose"), aligned(8))) = \
> + { KBUILD_MODNAME, __func__, __FILE__, fmt, __LINE__, \
> + _DPRINTK_FLAGS_DEFAULT }; \
> + if (unlikely(descriptor.enabled)) { \
> + if (cond) \
> + __dynamic_netdev_dbg(&descriptor, dev, fmt, ##__VA_ARGS__);\
> + } \
> + } while (0)
> +
Just nits:
I think it'd be better to use
#define dynamic_netif_dbg(etc) \
do { \
etc...
} while (0)
so that there aren't 2 consecutive close braces at the same indent level.
and maybe just use one test
if (unlikely(descriptor.enabled) && cond)
__dynamic_netdev_dbg(&descriptor, dev, fmt, ##__VA_ARGS__);
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 9b132ef..99c358f 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2731,10 +2731,8 @@ do { \
> #elif defined(CONFIG_DYNAMIC_DEBUG)
> #define netif_dbg(priv, type, netdev, format, args...) \
> do { \
> - if (netif_msg_##type(priv)) \
> - dynamic_dev_dbg((netdev)->dev.parent, \
> - "%s: " format, \
> - netdev_name(netdev), ##args); \
> + dynamic_netif_dbg(netdev, (netif_msg_##type(priv)), \
> + format, ##args); \
Because you've already added dynamic_netdev_dbg,
maybe this should be:
#define netif_dbg(priv, type, netdev, format, args...) \
do { \
if (netif_msg_##type(priv)) \
dynamic_netdev_dbg(netdev, format, ##args); \
} while (0)
^ permalink raw reply
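The single-test form suggested above can be tried out in userspace with a small mock. This is only a sketch: `struct ddebug_sketch`, `emit_dbg()` and `netif_dbg_sketch()` are hypothetical stand-ins, not the kernel's `struct _ddebug` or `__dynamic_netdev_dbg()`, and `__builtin_expect` assumes GCC or Clang.

```c
#include <stdio.h>

/* Userspace stand-in for the descriptor; only the field used here. */
struct ddebug_sketch {
    int enabled;
};

static int emit_count; /* number of messages actually emitted */

static void emit_dbg(const char *name, const char *msg)
{
    emit_count++;
    printf("%s: %s\n", name, msg);
}

#define unlikely(x) __builtin_expect(!!(x), 0)

/* One combined test; the do/while sits at its own indent level. */
#define netif_dbg_sketch(desc, cond, name, msg)          \
do {                                                     \
    if (unlikely((desc)->enabled) && (cond))             \
        emit_dbg(name, msg);                             \
} while (0)
```

Because `&&` short-circuits, `cond` is evaluated only when the descriptor is enabled, which matches the behaviour of the nested `if` in the original macro.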
* Re: [Bugme-new] [Bug 38862] New: No support for DGE-530T Rev C1
From: Stephen Hemminger @ 2011-07-06 22:07 UTC (permalink / raw)
To: James Henderson
Cc: Andrew Morton, bugme-daemon@bugzilla.kernel.org,
netdev@vger.kernel.org
In-Reply-To: <4E14D83C.1090506@ruggedcom.com>
On Wed, 6 Jul 2011 17:48:44 -0400
James Henderson <JamesHenderson@ruggedcom.com> wrote:
> Andrew Morton wrote:
> > (switched to email. Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> >
> > On Wed, 6 Jul 2011 16:05:36 GMT
> > bugzilla-daemon@bugzilla.kernel.org wrote:
> >
> >
> >> https://bugzilla.kernel.org/show_bug.cgi?id=38862
> >>
> >> Summary: No support for DGE-530T Rev C1
> >> Product: Drivers
> >> Version: 2.5
> >> Kernel Version: 2.6.39.2
> >> Platform: All
> >> OS/Version: Linux
> >> Tree: Mainline
> >> Status: NEW
> >> Severity: normal
> >> Priority: P1
> >> Component: Network
> >> AssignedTo: drivers_network@kernel-bugs.osdl.org
> >> ReportedBy: jameshenderson@ruggedcom.com
> >> Regression: No
> >>
> >>
> >> The kernel support DGE-530T REV B2 through the skge driver. The PCI device id
> >> of REV-B2 is 1186:4302. Rev C1 has a PCI device id of 1186:4302. No driver in
> >> the current kernel supports this device id/vendor combination. Furthermore,
> >> this device is not even listed in the device database ->
> >> http://www.pcidatabase.com/vendor_details.php?id=921 .
> >>
> >> On the chip is the following information:
> >> D-Link
> >> DLG10028C
> >> A8A34A1
> >> GA50 TAIWAN
> >>
> >> A sticker on the card reads:
> >> DGE-530T Rev C1
> >>
> >> I am including a picture I took of the card.
> >>
> >>
> >
> > Did you test simply adding that device to the driver?
> >
> > --- a/drivers/net/skge.c~a
> > +++ a/drivers/net/skge.c
> > @@ -89,6 +89,7 @@ static DEFINE_PCI_DEVICE_TABLE(skge_id_t
> > { PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_YU) },
> > { PCI_DEVICE(PCI_VENDOR_ID_DLINK, PCI_DEVICE_ID_DLINK_DGE510T) },
> > { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4b01) }, /* DGE-530T */
> > + { PCI_DEVICE(PCI_VENDOR_ID_DLINK, 0x4302) }, /* DGE-530T Rev C1 */
> > { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x4320) },
> > { PCI_DEVICE(PCI_VENDOR_ID_MARVELL, 0x5005) }, /* Belkin */
> > { PCI_DEVICE(PCI_VENDOR_ID_CNET, PCI_DEVICE_ID_CNET_GIGACARD) },
> > _
I will go look at the Marvell Syskonnect driver; they occasionally
update it with new IDs.
^ permalink raw reply
* [PATCH V8 0/4 net-next] macvtap/vhost TX zero-copy support
From: Shirley Ma @ 2011-07-06 22:15 UTC (permalink / raw)
To: David Miller, mst; +Cc: netdev, kvm, linux-kernel
This patchset adds support for TX zero-copy between guest and host
kernel through vhost. It significantly reduces CPU utilization on the
local host on which the guest is located (It reduced about 50% CPU usage
for single stream test on the host, while 4K message size BW has
increased about 50%). The patchset is based on previous submission and
comments from the community regarding when/how to handle guest kernel
buffers to be released. This is the simplest approach I can think of
after comparing with several other solutions.
This patchset has integrated V3 review comments from community:
1. Add more comments on how to use device ZEROCOPY flag;
2. Change device ZEROCOPY to available bit 31
3. Fix skb header linear allocation when virtio_net GSO is not enabled
It has integrated V4 review comments from MST and Sridhar:
1. In vhost, using socket poll wake up for outstanding DMAs
2. Add detailed comments for vhost_zerocopy_signal_used call
3. Add sleep in vhost shutting down instead of busy-wait for outstanding
DMAs.
4. Copy small packets, don't do zero-copy callback in macvtap, mark its
DMA done in vhost
5. change zerocopy to bool in macvtap.
It has integrated V5 review comments from MST and
Michał Mirosław <mirqus@...il.com>
1. Prevent userspace apps from holding skb userspace buffers by copying
userspace buffers to kernel in skb_clone, skb_copy, pskb_copy,
pskb_expand_head.
2. It also uses the HIGHDMA and SG feature bits to enable ZEROCOPY, which
removes the dependency on a new feature bit; we can add one later when a
new feature bit is available.
It has integrated V6 review comments from Eric Dumazet.
1. Moving ubuf_info object from skb to caller, just use one pointer in
skb_share_info to point ubuf_info object.
2. Change the zero-copy size from 256 bytes to PAGE_SIZE (4K) because of
the small message size performance issue.
3. During vhost shutting down, release outstanding userspace buffers w/o
waiting for lower device DMAs done if any. Do we really care about the
possible wrong data being sent on the wire during shutting down?
This patch has integrated Version 7 review from Michael:
1. Add comment to fix busywait while vhost ring changes and clean up.
2. Add a new tx flag for zero-copy skbs; use destructor_arg to avoid a new
pointer in skb share_info.
This patchset includes:
1/4: Add a new sock zero-copy flag, SOCK_ZEROCOPY;
2/4: Add a new tx flag in skb_share_info, SKBTX_DEV_ZEROCOPY, to check for the
userspace buffer release callback once the lower device has completed DMA for
that skb, i.e. when the last reference is gone;
Whenever skb_clone, skb_copy, pskb_copy or pskb_expand_head gets called
(from tcpdump or filtering, for example), these userspace buffers are copied
into the kernel, since we don't want userspace apps to hold userspace buffers
too long.
Use the skb destructor arg as a pointer to the userspace buffer info.
3/4: Add vhost zero-copy callback in vhost when skb last refcnt is gone;
add vhost_zerocopy_signal_used to notify guest to release TX skb
buffers.
4/4: Add macvtap zero-copy in lower device when sending packet is
greater than 256 bytes.
The patchset is built against net next linux-3.0.0-rc5. It has passed
netperf/netserver multiple streams stress test, tcpdump suspended test,
dynamically SG change test.
Single TCP_STREAM 120-second test results on 2.6.39-rc3 over an ixgbe 10Gb NIC:
Message     BW(Gb/s)   qemu-kvm (NumCPU)   vhost-net(NumCPU)   PerfTop irq/s
4K          7408.57    92.1%               22.6%               1229
4K(Orig)    4913.17    118.1%              84.1%               2086
8K          9129.90    89.3%               23.3%               1141
8K(Orig)    7094.55    115.9%              84.7%               2157
16K         9178.81    89.1%               23.3%               1139
16K(Orig)   8927.1     118.7%              83.4%               2262
64K         9171.43    88.4%               24.9%               1253
64K(Orig)   9085.85    115.9%              82.4%               2229
For message sizes less than or equal to 2K, there is a known KVM guest TX
overrun issue. With this zero-copy patch the issue becomes more severe:
guest io_exits have tripled, so the performance is not good.
Once the TX overrun problem has been addressed, I will retest the small
message size performance.
drivers/net/macvtap.c | 132 ++++++++++++++++++++++++++++++++++++++++++++----
drivers/vhost/net.c | 45 ++++++++++++++++-
drivers/vhost/vhost.c | 48 +++++++++++++++++
drivers/vhost/vhost.h | 15 ++++++
include/linux/skbuff.h | 16 ++++++
include/net/sock.h | 1 +
net/core/skbuff.c | 79 ++++++++++++++++++++++++++++-
7 files changed, 324 insertions(+), 14 deletions(-)
Thanks
Shirley
^ permalink raw reply
* [PATCH V8 1/4 net-next] sock.h: Add a new sock zero-copy flag
From: Shirley Ma @ 2011-07-06 22:17 UTC (permalink / raw)
To: David Miller, mst; +Cc: netdev, kvm, linux-kernel
Signed-off-by: Shirley Ma <xma@us.ibm.com>
---
include/net/sock.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index ae56da6..396f735 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -563,6 +563,7 @@ enum sock_flags {
SOCK_TIMESTAMPING_SYS_HARDWARE, /* %SOF_TIMESTAMPING_SYS_HARDWARE */
SOCK_FASYNC, /* fasync() active */
SOCK_RXQ_OVFL,
+ SOCK_ZEROCOPY, /* buffers from userspace */
};
static inline void sock_copy_flags(struct sock *nsk, struct sock *osk)
^ permalink raw reply related
* [PATCH V8 2/4 net-next] skbuff: skb supports zero-copy buffers
From: Shirley Ma @ 2011-07-06 22:22 UTC (permalink / raw)
To: David Miller, mst; +Cc: netdev, kvm, linux-kernel
This patch adds userspace buffer support to the skb shared info. A new
struct ubuf_info is needed to maintain the userspace buffer
argument and index; a callback is used to notify userspace to release
the buffers once the lower device has finished DMA (the last reference to
that skb is gone).
If any userspace app references these userspace buffers,
the buffers are copied into the kernel. This way we
can prevent userspace apps from holding these userspace buffers too long.
Use destructor_arg to point to the userspace buffer info; a new tx flag,
SKBTX_DEV_ZEROCOPY, is added for the zero-copy buffer check.
Signed-off-by: Shirley Ma <xma@...ibm.com>
---
include/linux/skbuff.h | 16 ++++++++++
net/core/skbuff.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 94 insertions(+), 1 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3e54337..08d4507 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -187,6 +187,20 @@ enum {
/* ensure the originating sk reference is available on driver level */
SKBTX_DRV_NEEDS_SK_REF = 1 << 3,
+
+ /* device driver supports TX zero-copy buffers */
+ SKBTX_DEV_ZEROCOPY = 1 << 4,
+};
+
+/*
+ * The callback notifies userspace to release buffers when skb DMA is done in
+ * lower device, the skb last reference should be 0 when calling this.
+ * The desc is used to track userspace buffer index.
+ */
+struct ubuf_info {
+ void (*callback)(void *);
+ void *arg;
+ unsigned long desc;
};
/* This data is invariant across clones and lives at
@@ -211,6 +225,7 @@ struct skb_shared_info {
/* Intermediate layers must ensure that destructor_arg
* remains valid until skb destructor */
void * destructor_arg;
+
/* must be last field, see pskb_expand_head() */
skb_frag_t frags[MAX_SKB_FRAGS];
};
@@ -2265,5 +2280,6 @@ static inline void skb_checksum_none_assert(struct sk_buff *skb)
}
bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off);
+
#endif /* __KERNEL__ */
#endif /* _LINUX_SKBUFF_H */
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 46cbd28..42462f5 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -329,6 +329,18 @@ static void skb_release_data(struct sk_buff *skb)
put_page(skb_shinfo(skb)->frags[i].page);
}
+ /*
+ * If skb buf is from userspace, we need to notify the caller
+ * the lower device DMA has done;
+ */
+ if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
+ struct ubuf_info *uarg;
+
+ uarg = skb_shinfo(skb)->destructor_arg;
+ if (uarg->callback)
+ uarg->callback(uarg);
+ }
+
if (skb_has_frag_list(skb))
skb_drop_fraglist(skb);
@@ -481,6 +493,9 @@ bool skb_recycle_check(struct sk_buff *skb, int skb_size)
if (irqs_disabled())
return false;
+ if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY)
+ return false;
+
if (skb_is_nonlinear(skb) || skb->fclone != SKB_FCLONE_UNAVAILABLE)
return false;
@@ -596,6 +611,50 @@ struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src)
}
EXPORT_SYMBOL_GPL(skb_morph);
+/* skb frags copy userspace buffers to kernel */
+static int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
+{
+ int i;
+ int num_frags = skb_shinfo(skb)->nr_frags;
+ struct page *page, *head = NULL;
+ struct ubuf_info *uarg = skb_shinfo(skb)->destructor_arg;
+
+ for (i = 0; i < num_frags; i++) {
+ u8 *vaddr;
+ skb_frag_t *f = &skb_shinfo(skb)->frags[i];
+
+ page = alloc_page(GFP_ATOMIC);
+ if (!page) {
+ while (head) {
+ put_page(head);
+ head = (struct page *)head->private;
+ }
+ return -ENOMEM;
+ }
+ vaddr = kmap_skb_frag(&skb_shinfo(skb)->frags[i]);
+ memcpy(page_address(page),
+ vaddr + f->page_offset, f->size);
+ kunmap_skb_frag(vaddr);
+ page->private = (unsigned long)head;
+ head = page;
+ }
+
+ /* skb frags release userspace buffers */
+ for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
+ put_page(skb_shinfo(skb)->frags[i].page);
+
+ uarg->callback(uarg);
+
+ /* skb frags point to kernel buffers */
+ for (i = skb_shinfo(skb)->nr_frags; i > 0; i--) {
+ skb_shinfo(skb)->frags[i - 1].page_offset = 0;
+ skb_shinfo(skb)->frags[i - 1].page = head;
+ head = (struct page *)head->private;
+ }
+ return 0;
+}
+
+
/**
* skb_clone - duplicate an sk_buff
* @skb: buffer to clone
@@ -614,6 +673,11 @@ struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t gfp_mask)
{
struct sk_buff *n;
+ if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
+ if (skb_copy_ubufs(skb, gfp_mask))
+ return NULL;
+ }
+
n = skb + 1;
if (skb->fclone == SKB_FCLONE_ORIG &&
n->fclone == SKB_FCLONE_UNAVAILABLE) {
@@ -731,6 +795,12 @@ struct sk_buff *pskb_copy(struct sk_buff *skb, gfp_t gfp_mask)
if (skb_shinfo(skb)->nr_frags) {
int i;
+ if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
+ if (skb_copy_ubufs(skb, gfp_mask)) {
+ kfree(n);
+ goto out;
+ }
+ }
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_shinfo(n)->frags[i] = skb_shinfo(skb)->frags[i];
get_page(skb_shinfo(n)->frags[i].page);
@@ -788,7 +858,6 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
fastpath = true;
else {
int delta = skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1;
-
fastpath = atomic_read(&skb_shinfo(skb)->dataref) == delta;
}
@@ -819,6 +888,11 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
if (fastpath) {
kfree(skb->head);
} else {
+ /* copy this zero copy skb frags */
+ if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
+ if (skb_copy_ubufs(skb, gfp_mask))
+ goto nofrags;
+ }
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
get_page(skb_shinfo(skb)->frags[i].page);
@@ -853,6 +927,8 @@ adjust_others:
atomic_set(&skb_shinfo(skb)->dataref, 1);
return 0;
+nofrags:
+ kfree(data);
nodata:
return -ENOMEM;
}
@@ -1354,6 +1430,7 @@ int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len)
}
start = end;
}
+
if (!len)
return 0;
^ permalink raw reply related
* [PATCH V8 3/4 net-next] macvtap: macvtap TX zero-copy support
From: Shirley Ma @ 2011-07-06 22:26 UTC (permalink / raw)
To: David Miller, mst; +Cc: netdev, kvm, linux-kernel
Only 128 bytes are copied; the rest of the data is DMA mapped directly from
userspace.
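Mapping the remainder of a user buffer means working out how many pages it touches and how much of the first page it uses. The sketch below reproduces that arithmetic in userspace C, following the expression used in zerocopy_sg_from_iovec(). The EX_PAGE_* constants assume 4 KiB pages, and pages_spanned and first_frag_size are hypothetical names chosen for the illustration.

```c
#include <assert.h>

/* Assumed 4 KiB pages for the illustration; the kernel defines its own. */
#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE  (1UL << EX_PAGE_SHIFT)
#define EX_PAGE_MASK  (~(EX_PAGE_SIZE - 1))

/* Number of pages touched by the byte range [base, base + len). */
unsigned long pages_spanned(unsigned long base, unsigned long len)
{
	unsigned long off = base & ~EX_PAGE_MASK; /* offset into first page */

	assert(len > 0);
	/* Round (off + len) up to a whole number of pages. */
	return (off + len + ~EX_PAGE_MASK) >> EX_PAGE_SHIFT;
}

/* Bytes of the range that land in the first page (size of the first frag). */
unsigned long first_frag_size(unsigned long base, unsigned long len)
{
	unsigned long room = EX_PAGE_SIZE - (base & ~EX_PAGE_MASK);

	return len < room ? len : room;
}
```

A two-byte buffer that starts on the last byte of a page still needs two pages pinned, which is why the page count depends on the starting offset and not only on the length.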
Signed-off-by: Shirley Ma <xma@...ibm.com>
---
drivers/net/macvtap.c | 132 ++++++++++++++++++++++++++++++++++++++++++++----
1 files changed, 121 insertions(+), 11 deletions(-)
diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index ecee0fe..b3fd53a 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -60,6 +60,7 @@ static struct proto macvtap_proto = {
*/
static dev_t macvtap_major;
#define MACVTAP_NUM_DEVS 65536
+#define GOODCOPY_LEN 128
static struct class *macvtap_class;
static struct cdev macvtap_cdev;
@@ -340,6 +341,7 @@ static int macvtap_open(struct inode *inode, struct file *file)
{
struct net *net = current->nsproxy->net_ns;
struct net_device *dev = dev_get_by_index(net, iminor(inode));
+ struct macvlan_dev *vlan = netdev_priv(dev);
struct macvtap_queue *q;
int err;
@@ -369,6 +371,16 @@ static int macvtap_open(struct inode *inode, struct file *file)
q->flags = IFF_VNET_HDR | IFF_NO_PI | IFF_TAP;
q->vnet_hdr_sz = sizeof(struct virtio_net_hdr);
+ /*
+ * so far only KVM virtio_net uses macvtap, enable zero copy between
+ * guest kernel and host kernel when lower device supports zerocopy
+ */
+ if (vlan) {
+ if ((vlan->lowerdev->features & NETIF_F_HIGHDMA) &&
+ (vlan->lowerdev->features & NETIF_F_SG))
+ sock_set_flag(&q->sk, SOCK_ZEROCOPY);
+ }
+
err = macvtap_set_queue(dev, file, q);
if (err)
sock_put(&q->sk);
@@ -433,6 +445,80 @@ static inline struct sk_buff *macvtap_alloc_skb(struct sock *sk, size_t prepad,
return skb;
}
+/* set skb frags from iovec, this can move to core network code for reuse */
+static int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
+ int offset, size_t count)
+{
+ int len = iov_length(from, count) - offset;
+ int copy = skb_headlen(skb);
+ int size, offset1 = 0;
+ int i = 0;
+ skb_frag_t *f;
+
+ /* Skip over from offset */
+ while (count && (offset >= from->iov_len)) {
+ offset -= from->iov_len;
+ ++from;
+ --count;
+ }
+
+ /* copy up to skb headlen */
+ while (count && (copy > 0)) {
+ size = min_t(unsigned int, copy, from->iov_len - offset);
+ if (copy_from_user(skb->data + offset1, from->iov_base + offset,
+ size))
+ return -EFAULT;
+ if (copy > size) {
+ ++from;
+ --count;
+ }
+ copy -= size;
+ offset1 += size;
+ offset = 0;
+ }
+
+ if (len == offset1)
+ return 0;
+
+ while (count--) {
+ struct page *page[MAX_SKB_FRAGS];
+ int num_pages;
+ unsigned long base;
+
+ len = from->iov_len - offset1;
+ if (!len) {
+ offset1 = 0;
+ ++from;
+ continue;
+ }
+ base = (unsigned long)from->iov_base + offset1;
+ size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
+ num_pages = get_user_pages_fast(base, size, 0, &page[i]);
+ if ((num_pages != size) ||
+ (num_pages > MAX_SKB_FRAGS - skb_shinfo(skb)->nr_frags))
+ /* put_page is in skb free */
+ return -EFAULT;
+ skb->data_len += len;
+ skb->len += len;
+ skb->truesize += len;
+ atomic_add(len, &skb->sk->sk_wmem_alloc);
+ while (len) {
+ f = &skb_shinfo(skb)->frags[i];
+ f->page = page[i];
+ f->page_offset = base & ~PAGE_MASK;
+ f->size = min_t(int, len, PAGE_SIZE - f->page_offset);
+ skb_shinfo(skb)->nr_frags++;
+ /* increase sk_wmem_alloc */
+ base += f->size;
+ len -= f->size;
+ i++;
+ }
+ offset1 = 0;
+ ++from;
+ }
+ return 0;
+}
+
/*
* macvtap_skb_from_vnet_hdr and macvtap_skb_to_vnet_hdr should
* be shared with the tun/tap driver.
@@ -517,16 +603,18 @@ static int macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
/* Get packet from user space buffer */
-static ssize_t macvtap_get_user(struct macvtap_queue *q,
- const struct iovec *iv, size_t count,
- int noblock)
+static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
+ const struct iovec *iv, unsigned long total_len,
+ size_t count, int noblock)
{
struct sk_buff *skb;
struct macvlan_dev *vlan;
- size_t len = count;
+ unsigned long len = total_len;
int err;
struct virtio_net_hdr vnet_hdr = { 0 };
int vnet_hdr_len = 0;
+ int copylen;
+ bool zerocopy = false;
if (q->flags & IFF_VNET_HDR) {
vnet_hdr_len = q->vnet_hdr_sz;
@@ -554,12 +642,31 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q,
if (unlikely(len < ETH_HLEN))
goto err;
- skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, len, vnet_hdr.hdr_len,
- noblock, &err);
+ if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY))
+ zerocopy = true;
+
+ if (zerocopy) {
+ /* There are 256 bytes to be copied in skb, so there is enough
+ * room for skb expand head in case it is used.
+ * The rest buffer is mapped from userspace.
+ */
+ copylen = vnet_hdr.hdr_len;
+ if (!copylen)
+ copylen = GOODCOPY_LEN;
+ } else
+ copylen = len;
+
+ skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
+ vnet_hdr.hdr_len, noblock, &err);
if (!skb)
goto err;
- err = skb_copy_datagram_from_iovec(skb, 0, iv, vnet_hdr_len, len);
+ if (zerocopy) {
+ err = zerocopy_sg_from_iovec(skb, iv, vnet_hdr_len, count);
+ skb_shinfo(skb)->tx_flags |= SKBTX_DEV_ZEROCOPY;
+ } else
+ err = skb_copy_datagram_from_iovec(skb, 0, iv, vnet_hdr_len,
+ len);
if (err)
goto err_kfree;
@@ -575,13 +682,16 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q,
rcu_read_lock_bh();
vlan = rcu_dereference_bh(q->vlan);
+ /* copy skb_ubuf_info for callback when skb has no error */
+ if (zerocopy)
+ skb_shinfo(skb)->destructor_arg = m->msg_control;
if (vlan)
macvlan_start_xmit(skb, vlan->dev);
else
kfree_skb(skb);
rcu_read_unlock_bh();
- return count;
+ return total_len;
err_kfree:
kfree_skb(skb);
@@ -603,8 +713,8 @@ static ssize_t macvtap_aio_write(struct kiocb *iocb, const struct iovec *iv,
ssize_t result = -ENOLINK;
struct macvtap_queue *q = file->private_data;
- result = macvtap_get_user(q, iv, iov_length(iv, count),
- file->f_flags & O_NONBLOCK);
+ result = macvtap_get_user(q, NULL, iv, iov_length(iv, count), count,
+ file->f_flags & O_NONBLOCK);
return result;
}
@@ -817,7 +927,7 @@ static int macvtap_sendmsg(struct kiocb *iocb, struct socket *sock,
struct msghdr *m, size_t total_len)
{
struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
- return macvtap_get_user(q, m->msg_iov, total_len,
+ return macvtap_get_user(q, m, m->msg_iov, total_len, m->msg_iovlen,
m->msg_flags & MSG_DONTWAIT);
}
^ permalink raw reply related
* [PATCH V8 4/4 net-next] vhost: vhost TX zero-copy support
From: Shirley Ma @ 2011-07-06 22:28 UTC (permalink / raw)
To: David Miller, mst; +Cc: netdev, kvm, linux-kernel
This patch maintains the outstanding userspace buffers in the
order in which they are delivered to vhost. An outstanding userspace buffer
is marked as done once the lower device has finished DMA on it.
This is detected through the callback made when the last reference to the
skb is dropped in kfree_skb. Two buffer indices are used for this purpose.
vhost passes the userspace buffer info to the lower device skb
through message control. Some DMAs may already be done when
entering vhost handle_tx. In the worst case all buffers in the vq are
in pending/done status, so we need to notify the guest to release DMA-done
buffers first, before getting any new buffers from the vq.
The busywait in cleanup and ring changes is still waiting for a fix.
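The two-index scheme can be sketched as a small userspace ring. This is an illustration under assumptions, not the vhost code: zc_ring and its functions are hypothetical names, RING_SIZE stands in for UIO_MAXIOV, and the DMA_DONE/DMA_CLEAR markers mirror VHOST_DMA_DONE_LEN and VHOST_DMA_CLEAR_LEN.

```c
#include <assert.h>

#define RING_SIZE 8  /* stand-in for UIO_MAXIOV */
#define DMA_DONE  1  /* mirrors VHOST_DMA_DONE_LEN */
#define DMA_CLEAR 0  /* mirrors VHOST_DMA_CLEAR_LEN */

struct zc_ring {
	unsigned int len[RING_SIZE]; /* pending length, or a state marker */
	int upend_idx;               /* next slot to hand out */
	int done_idx;                /* oldest slot not yet signalled */
	int outstanding;             /* submitted but not yet signalled */
};

/* Queue one buffer of `len` bytes; returns the slot it occupies. */
int zc_ring_submit(struct zc_ring *r, unsigned int len)
{
	int slot = r->upend_idx;

	assert(len > DMA_DONE);                 /* 0 and 1 are state markers */
	assert(r->outstanding < RING_SIZE - 1); /* keep one slot free */
	r->len[slot] = len;
	r->upend_idx = (slot + 1) % RING_SIZE;
	r->outstanding++;
	return slot;
}

/* Completion callback: may arrive in any order. */
void zc_ring_complete(struct zc_ring *r, int slot)
{
	r->len[slot] = DMA_DONE;
}

/* Signal the contiguous run of completed slots; returns how many. */
int zc_ring_signal_used(struct zc_ring *r)
{
	int i, n = 0;

	for (i = r->done_idx; i != r->upend_idx; i = (i + 1) % RING_SIZE) {
		if (r->len[i] != DMA_DONE)
			break; /* stop at the first slot still in flight */
		r->len[i] = DMA_CLEAR;
		n++;
	}
	r->done_idx = i;
	r->outstanding -= n;
	return n;
}
```

If the second of three buffers completes first, nothing is signalled until the first one completes as well, so the guest always sees used buffers in submission order.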
Signed-off-by: Shirley Ma <xma@us.ibm.com>
---
drivers/vhost/net.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
drivers/vhost/vhost.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
drivers/vhost/vhost.h | 15 +++++++++++++++
3 files changed, 107 insertions(+), 1 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e224a92..7de0c6e 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -32,6 +32,10 @@
* Using this limit prevents one virtqueue from starving others. */
#define VHOST_NET_WEIGHT 0x80000
+/* MAX number of TX used buffers for outstanding zerocopy */
+#define VHOST_MAX_PEND 128
+#define VHOST_GOODCOPY_LEN 256
+
enum {
VHOST_NET_VQ_RX = 0,
VHOST_NET_VQ_TX = 1,
@@ -151,6 +155,10 @@ static void handle_tx(struct vhost_net *net)
hdr_size = vq->vhost_hlen;
for (;;) {
+ /* Release DMAs done buffers first */
+ if (atomic_read(&vq->refcnt) > VHOST_MAX_PEND)
+ vhost_zerocopy_signal_used(vq);
+
head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
ARRAY_SIZE(vq->iov),
&out, &in,
@@ -166,6 +174,12 @@ static void handle_tx(struct vhost_net *net)
set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
break;
}
+ /* If more outstanding DMAs, queue the work */
+ if (atomic_read(&vq->refcnt) > VHOST_MAX_PEND) {
+ tx_poll_start(net, sock);
+ set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
+ break;
+ }
if (unlikely(vhost_enable_notify(&net->dev, vq))) {
vhost_disable_notify(&net->dev, vq);
continue;
@@ -188,6 +202,26 @@ static void handle_tx(struct vhost_net *net)
iov_length(vq->hdr, s), hdr_size);
break;
}
+ /* use msg_control to pass vhost zerocopy ubuf info to skb */
+ if (sock_flag(sock->sk, SOCK_ZEROCOPY)) {
+ vq->heads[vq->upend_idx].id = head;
+ if (len < VHOST_GOODCOPY_LEN)
+ /* copy don't need to wait for DMA done */
+ vq->heads[vq->upend_idx].len =
+ VHOST_DMA_DONE_LEN;
+ else {
+ struct ubuf_info *ubuf = &vq->ubuf_info[head];
+
+ vq->heads[vq->upend_idx].len = len;
+ ubuf->callback = vhost_zerocopy_callback;
+ ubuf->arg = vq;
+ ubuf->desc = vq->upend_idx;
+ msg.msg_control = ubuf;
+ msg.msg_controllen = sizeof(ubuf);
+ }
+ atomic_inc(&vq->refcnt);
+ vq->upend_idx = (vq->upend_idx + 1) % UIO_MAXIOV;
+ }
/* TODO: Check specific error and bomb out unless ENOBUFS? */
err = sock->ops->sendmsg(NULL, sock, &msg, len);
if (unlikely(err < 0)) {
@@ -198,12 +232,21 @@ static void handle_tx(struct vhost_net *net)
if (err != len)
pr_debug("Truncated TX packet: "
" len %d != %zd\n", err, len);
- vhost_add_used_and_signal(&net->dev, vq, head, 0);
+ if (!sock_flag(sock->sk, SOCK_ZEROCOPY))
+ vhost_add_used_and_signal(&net->dev, vq, head, 0);
total_len += len;
if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
vhost_poll_queue(&vq->poll);
break;
}
+ /* if upend_idx is full, then wait for free more */
+/*
+ if (unlikely(vq->upend_idx == vq->done_idx)) {
+ tx_poll_start(net, sock);
+ set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
+ break;
+ }
+*/
}
mutex_unlock(&vq->mutex);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index ea966b3..db242b1 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -179,6 +179,9 @@ static void vhost_vq_reset(struct vhost_dev *dev,
vq->call_ctx = NULL;
vq->call = NULL;
vq->log_ctx = NULL;
+ vq->upend_idx = 0;
+ vq->done_idx = 0;
+ atomic_set(&vq->refcnt, 0);
}
static int vhost_worker(void *data)
@@ -237,6 +240,8 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
GFP_KERNEL);
dev->vqs[i].heads = kmalloc(sizeof *dev->vqs[i].heads *
UIO_MAXIOV, GFP_KERNEL);
+ dev->vqs[i].ubuf_info = kmalloc(sizeof *dev->vqs[i].ubuf_info *
+ UIO_MAXIOV, GFP_KERNEL);
if (!dev->vqs[i].indirect || !dev->vqs[i].log ||
!dev->vqs[i].heads)
@@ -249,6 +254,7 @@ err_nomem:
kfree(dev->vqs[i].indirect);
kfree(dev->vqs[i].log);
kfree(dev->vqs[i].heads);
+ kfree(dev->vqs[i].ubuf_info);
}
return -ENOMEM;
}
@@ -390,6 +396,30 @@ long vhost_dev_reset_owner(struct vhost_dev *dev)
return 0;
}
+/* In case of DMA done not in order in lower device driver for some reason.
+ * upend_idx is used to track end of used idx, done_idx is used to track head
+ * of used idx. Once lower device DMA done contiguously, we will signal KVM
+ * guest used idx.
+ */
+void vhost_zerocopy_signal_used(struct vhost_virtqueue *vq)
+{
+ int i, j = 0;
+
+ for (i = vq->done_idx; i != vq->upend_idx; i = (i + 1) % UIO_MAXIOV) {
+ if ((vq->heads[i].len == VHOST_DMA_DONE_LEN)) {
+ vq->heads[i].len = VHOST_DMA_CLEAR_LEN;
+ vhost_add_used_and_signal(vq->dev, vq,
+ vq->heads[i].id, 0);
+ ++j;
+ } else
+ break;
+ }
+ if (j) {
+ vq->done_idx = i;
+ atomic_sub(j, &vq->refcnt);
+ }
+}
+
/* Caller should have device mutex */
void vhost_dev_cleanup(struct vhost_dev *dev)
{
@@ -400,6 +430,9 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
vhost_poll_stop(&dev->vqs[i].poll);
vhost_poll_flush(&dev->vqs[i].poll);
}
+ /* Wait for all lower device DMAs done (busywait FIXME) */
+ while (atomic_read(&dev->vqs[i].refcnt))
+ vhost_zerocopy_signal_used(&dev->vqs[i]);
if (dev->vqs[i].error_ctx)
eventfd_ctx_put(dev->vqs[i].error_ctx);
if (dev->vqs[i].error)
@@ -612,6 +645,11 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
mutex_lock(&vq->mutex);
+ /* clean up lower device outstanding DMAs, before setting ring
+ busywait FIXME */
+ while (atomic_read(&vq->refcnt))
+ vhost_zerocopy_signal_used(vq);
+
switch (ioctl) {
case VHOST_SET_VRING_NUM:
/* Resizing ring with an active backend?
@@ -1486,3 +1524,13 @@ void vhost_disable_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq)
&vq->used->flags, r);
}
}
+
+void vhost_zerocopy_callback(void *arg)
+{
+ struct ubuf_info *ubuf = (struct ubuf_info *)arg;
+ struct vhost_virtqueue *vq;
+
+ vq = (struct vhost_virtqueue *)ubuf->arg;
+ /* set len = 1 to mark this desc buffers done DMA */
+ vq->heads[ubuf->desc].len = VHOST_DMA_DONE_LEN;
+}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 8e03379..883688c 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -13,6 +13,11 @@
#include <linux/virtio_ring.h>
#include <asm/atomic.h>
+/* This is for zerocopy, used buffer len is set to 1 when lower device DMA
+ * done */
+#define VHOST_DMA_DONE_LEN 1
+#define VHOST_DMA_CLEAR_LEN 0
+
struct vhost_device;
struct vhost_work;
@@ -114,6 +119,14 @@ struct vhost_virtqueue {
/* Log write descriptors */
void __user *log_base;
struct vhost_log *log;
+ /* vhost zerocopy support */
+ atomic_t refcnt; /* num of outstanding zerocopy DMAs */
+ /* last used idx for outstanding DMA zerocopy buffers */
+ int upend_idx;
+ /* first used idx for DMA done zerocopy buffers */
+ int done_idx;
+ /* an array of userspace buffers info */
+ struct ubuf_info *ubuf_info;
};
struct vhost_dev {
@@ -160,6 +173,8 @@ bool vhost_enable_notify(struct vhost_dev *, struct vhost_virtqueue *);
int vhost_log_write(struct vhost_virtqueue *vq, struct vhost_log *log,
unsigned int log_num, u64 len);
+void vhost_zerocopy_callback(void *arg);
+void vhost_zerocopy_signal_used(struct vhost_virtqueue *vq);
#define vq_err(vq, fmt, ...) do { \
pr_debug(pr_fmt(fmt), ##__VA_ARGS__); \
^ permalink raw reply related
* Re: [PATCH V8 2/4 net-next] skbuff: skb supports zero-copy buffers
From: Zan Lynx @ 2011-07-06 23:01 UTC (permalink / raw)
To: Shirley Ma; +Cc: David Miller, mst, netdev, kvm, linux-kernel
In-Reply-To: <1309990932.10209.19.camel@localhost.localdomain>
On 7/6/2011 4:22 PM, Shirley Ma wrote:
> This patch adds userspace buffers support in skb shared info. A new
> struct skb_ubuf_info is needed to maintain the userspace buffers
> argument and index, a callback is used to notify userspace to release
> the buffers once lower device has done DMA (Last reference to that skb
> has gone).
>
> If there is any userspace apps to reference these userspace buffers,
> then these userspaces buffers will be copied into kernel. This way we
> can prevent userspace apps from holding these userspace buffers too long.
>
> Use destructor_arg to point to the userspace buffer info; a new tx flags
> SKBTX_DEV_ZEROCOPY is added for zero-copy buffer check.
>
> Signed-off-by: Shirley Ma <xma@...ibm.com>
I was just reading this patch and noticed that you check if
uarg->callback is set before calling it in skb_release_data, but you do
not check before calling it in skb_copy_ubufs.
I was only skimming so I have probably missed something...
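One way to keep the two call sites consistent, whichever policy is chosen, is to route both through a single helper. The sketch below is a userspace illustration only: ubuf_notify and count_release are hypothetical names that do not appear in the patch, and only the ubuf_info layout is taken from it.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the struct ubuf_info in the patch. */
struct ubuf_info {
	void (*callback)(void *);
	void *arg;
	unsigned long desc;
};

/*
 * Hypothetical helper: invoke the release callback if one is set.
 * Returns 1 if the callback ran, 0 otherwise.
 */
int ubuf_notify(struct ubuf_info *uarg)
{
	if (!uarg || !uarg->callback)
		return 0;
	uarg->callback(uarg);
	return 1;
}

/* Example callback (illustration only): count invocations via uarg->arg. */
void count_release(void *p)
{
	struct ubuf_info *u = p;

	assert(u && u->arg);
	(*(int *)u->arg)++;
}
```

With a helper like this, the decision about whether a missing callback is legal lives in one place instead of being repeated, or forgotten, at each caller.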
> ---
>
> include/linux/skbuff.h | 16 ++++++++++
> net/core/skbuff.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 94 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 3e54337..08d4507 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -187,6 +187,20 @@ enum {
>
> /* ensure the originating sk reference is available on driver level */
> SKBTX_DRV_NEEDS_SK_REF = 1 << 3,
> +
> + /* device driver supports TX zero-copy buffers */
> + SKBTX_DEV_ZEROCOPY = 1 << 4,
> +};
> +
> +/*
> + * The callback notifies userspace to release buffers when skb DMA is done in
> + * lower device, the skb last reference should be 0 when calling this.
> + * The desc is used to track userspace buffer index.
> + */
> +struct ubuf_info {
> + void (*callback)(void *);
> + void *arg;
> + unsigned long desc;
> };
>
> /* This data is invariant across clones and lives at
> @@ -211,6 +225,7 @@ struct skb_shared_info {
> /* Intermediate layers must ensure that destructor_arg
> * remains valid until skb destructor */
> void * destructor_arg;
> +
> /* must be last field, see pskb_expand_head() */
> skb_frag_t frags[MAX_SKB_FRAGS];
> };
> @@ -2265,5 +2280,6 @@ static inline void skb_checksum_none_assert(struct sk_buff *skb)
> }
>
> bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off);
> +
> #endif /* __KERNEL__ */
> #endif /* _LINUX_SKBUFF_H */
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 46cbd28..42462f5 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -329,6 +329,18 @@ static void skb_release_data(struct sk_buff *skb)
> put_page(skb_shinfo(skb)->frags[i].page);
> }
>
> + /*
> + * If skb buf is from userspace, we need to notify the caller
> + * the lower device DMA has done;
> + */
> + if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
> + struct ubuf_info *uarg;
> +
> + uarg = skb_shinfo(skb)->destructor_arg;
> + if (uarg->callback)
> + uarg->callback(uarg);
> + }
> +
> if (skb_has_frag_list(skb))
> skb_drop_fraglist(skb);
>
> @@ -481,6 +493,9 @@ bool skb_recycle_check(struct sk_buff *skb, int skb_size)
> if (irqs_disabled())
> return false;
>
> + if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY)
> + return false;
> +
> if (skb_is_nonlinear(skb) || skb->fclone != SKB_FCLONE_UNAVAILABLE)
> return false;
>
> @@ -596,6 +611,50 @@ struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src)
> }
> EXPORT_SYMBOL_GPL(skb_morph);
>
> +/* skb frags copy userspace buffers to kernel */
> +static int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
> +{
> + int i;
> + int num_frags = skb_shinfo(skb)->nr_frags;
> + struct page *page, *head = NULL;
> + struct ubuf_info *uarg = skb_shinfo(skb)->destructor_arg;
> +
> + for (i = 0; i < num_frags; i++) {
> + u8 *vaddr;
> + skb_frag_t *f = &skb_shinfo(skb)->frags[i];
> +
> + page = alloc_page(GFP_ATOMIC);
> + if (!page) {
> + while (head) {
> + put_page(head);
> + head = (struct page *)head->private;
> + }
> + return -ENOMEM;
> + }
> + vaddr = kmap_skb_frag(&skb_shinfo(skb)->frags[i]);
> + memcpy(page_address(page),
> + vaddr + f->page_offset, f->size);
> + kunmap_skb_frag(vaddr);
> + page->private = (unsigned long)head;
> + head = page;
> + }
> +
> + /* skb frags release userspace buffers */
> + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
> + put_page(skb_shinfo(skb)->frags[i].page);
> +
> + uarg->callback(uarg);
> +
> + /* skb frags point to kernel buffers */
> + for (i = skb_shinfo(skb)->nr_frags; i > 0; i--) {
> + skb_shinfo(skb)->frags[i - 1].page_offset = 0;
> + skb_shinfo(skb)->frags[i - 1].page = head;
> + head = (struct page *)head->private;
> + }
> + return 0;
> +}
> +
> +
> /**
> * skb_clone - duplicate an sk_buff
> * @skb: buffer to clone
> @@ -614,6 +673,11 @@ struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t gfp_mask)
> {
> struct sk_buff *n;
>
> + if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
> + if (skb_copy_ubufs(skb, gfp_mask))
> + return NULL;
> + }
> +
> n = skb + 1;
> if (skb->fclone == SKB_FCLONE_ORIG &&
> n->fclone == SKB_FCLONE_UNAVAILABLE) {
> @@ -731,6 +795,12 @@ struct sk_buff *pskb_copy(struct sk_buff *skb, gfp_t gfp_mask)
> if (skb_shinfo(skb)->nr_frags) {
> int i;
>
> + if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
> + if (skb_copy_ubufs(skb, gfp_mask)) {
> + kfree(n);
> + goto out;
> + }
> + }
> for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> skb_shinfo(n)->frags[i] = skb_shinfo(skb)->frags[i];
> get_page(skb_shinfo(n)->frags[i].page);
> @@ -788,7 +858,6 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
> fastpath = true;
> else {
> int delta = skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1;
> -
> fastpath = atomic_read(&skb_shinfo(skb)->dataref) == delta;
> }
>
> @@ -819,6 +888,11 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
> if (fastpath) {
> kfree(skb->head);
> } else {
> + /* copy this zero copy skb frags */
> + if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
> + if (skb_copy_ubufs(skb, gfp_mask))
> + goto nofrags;
> + }
> for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
> get_page(skb_shinfo(skb)->frags[i].page);
>
> @@ -853,6 +927,8 @@ adjust_others:
> atomic_set(&skb_shinfo(skb)->dataref, 1);
> return 0;
>
> +nofrags:
> + kfree(data);
> nodata:
> return -ENOMEM;
> }
> @@ -1354,6 +1430,7 @@ int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len)
> }
> start = end;
> }
> +
> if (!len)
> return 0;
>
>
>
^ permalink raw reply
* Re: [Bugme-new] [Bug 38032] New: default values of /proc/sys/net/ipv4/udp_mem does not consider huge page allocatio
From: Andrew Morton @ 2011-07-06 23:03 UTC (permalink / raw)
To: linux-mm, netdev; +Cc: bugme-daemon, starlight, Rafael Aquini
In-Reply-To: <bug-38032-10286@https.bugzilla.kernel.org/>
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
(cc's added)
On Tue, 21 Jun 2011 00:35:22 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=38032
>
> Summary: default values of /proc/sys/net/ipv4/udp_mem does not
> consider huge page allocatio
> Product: Memory Management
> Version: 2.5
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> AssignedTo: akpm@linux-foundation.org
> ReportedBy: starlight@binnacle.cx
> Regression: No
>
>
> In the RHEL 5.5 back-port of this tunable we ran into trouble locking up
> systems because the boot-time default, which is set based on physical memory,
> does not account for the hugepages= in the boot parameters. So the UDP socket
> buffer limit can exceed physical memory. Don't know if this is an issue in
> mainline kernels but it seems likely, so reporting this as a courtesy. Seems
> like it would be easy to fix the default to account for the memory reserved by
> hugepages, which is not available for slab allocations.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=714833
>
Yes, we've made similar mistakes in other places.
I don't think we really have an official formula for what callers
should be doing here. net/ipv4/udp.c:udp_init() does
nr_pages = totalram_pages - totalhigh_pages;
which assumes that totalram_pages does not include the pages which were
lost to hugepage allocations.
I *think* that this is now the case, but it wasn't always the case - we
made relatively recent fixes to the totalram_pages maintenance.
Perhaps UDP should be using the misnamed nr_free_buffer_pages() here.
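To make the concern concrete, here is a userspace sketch of how a default derived from a page count changes once hugepage reservations are subtracted. Everything in it is an assumption for illustration: usable_pages and udp_mem_defaults are hypothetical names, and the divisor and ratios are placeholders, not the formula udp_init() actually uses.

```c
#include <assert.h>

/*
 * Hypothetical: low-memory pages left for socket buffers once the pages
 * reserved for hugepages are taken out.
 */
unsigned long usable_pages(unsigned long totalram, unsigned long totalhigh,
			   unsigned long hugepage_reserved)
{
	unsigned long low;

	assert(totalhigh <= totalram);
	low = totalram - totalhigh;
	return hugepage_reserved < low ? low - hugepage_reserved : 0;
}

/*
 * Illustrative three-value limit (min, pressure, max) in pages.
 * The 1/8 fraction, the floor of 128 and the 3/4 and 2x ratios are
 * assumptions made for this example.
 */
void udp_mem_defaults(unsigned long pages, unsigned long mem[3])
{
	unsigned long limit = pages / 8;

	if (limit < 128)
		limit = 128;
	mem[0] = limit / 4 * 3;
	mem[1] = limit;
	mem[2] = mem[0] * 2;
}
```

With 1,000,000 pages of RAM of which 900,000 are reserved for hugepages, the limits are derived from 100,000 pages rather than the full million, so they can no longer exceed the memory that is actually available for socket buffers.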
^ permalink raw reply
* Re: [PATCH V8 2/4 net-next] skbuff: skb supports zero-copy buffers
From: Shirley Ma @ 2011-07-06 23:24 UTC (permalink / raw)
To: Zan Lynx; +Cc: David Miller, mst, netdev, kvm, linux-kernel
In-Reply-To: <4E14E939.5040904@acm.org>
On Wed, 2011-07-06 at 17:01 -0600, Zan Lynx wrote:
> On 7/6/2011 4:22 PM, Shirley Ma wrote:
> > This patch adds userspace buffers support in skb shared info. A new
> > struct skb_ubuf_info is needed to maintain the userspace buffers
> > argument and index, a callback is used to notify userspace to release
> > the buffers once lower device has done DMA (Last reference to that skb
> > has gone).
> >
> > If there is any userspace apps to reference these userspace buffers,
> > then these userspaces buffers will be copied into kernel. This way we
> > can prevent userspace apps from holding these userspace buffers too long.
> >
> > Use destructor_arg to point to the userspace buffer info; a new tx flags
> > SKBTX_DEV_ZEROCOPY is added for zero-copy buffer check.
> >
> > Signed-off-by: Shirley Ma <xma@...ibm.com>
>
> I was just reading this patch and noticed that you check if
> uarg->callback is set before calling it in skb_release_data, but you do
> not check before calling it in skb_copy_ubufs.
>
> I was only skimming so I have probably missed something...
It is a redundant check. The userspace buffer info always has a callback
to release the buffers. I should have removed it after using tx_flags.
Thanks
Shirley
^ permalink raw reply