From: Marco Steinacher <marco@websource.ch>
To: xen-devel@lists.xen.org
Cc: Marco Steinacher <marco@websource.ch>
Subject: "swiotlb buffer is full" problem with tg3 and kernel 3.16.0-4-686-pae on Xen 4.4.1
Date: Thu, 21 May 2015 17:17:30 +0200
Message-ID: <555DF70A.3020001@websource.ch>
Hi,
After upgrading to Debian jessie, and consequently to the default Linux
kernel 3.16.0-4-686-pae and Xen hypervisor 4.4.1-amd64 in that
distribution, I'm having problems with the tg3 network driver under high
load. Unfortunately, this affects a production system that I
administer. It usually happens during a DRBD sync. Here is one
such event:
[ 4765.528635] block drbd0: Began resync as SyncSource (will sync 886784 KB [221696 bits set])
[ 4765.528654] block drbd0: updated sync UUID 09891C136111799E:F7FD1C0A50225596:F7FC1C0A50225596:F7FB1C0A50225596
[ 4768.992280] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4769.400296] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4770.216360] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4771.852283] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.120286] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.776027] tg3 0000:02:00.0: swiotlb buffer is full (sz: 32768 bytes)
[ 4775.778814] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.780995] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.783345] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.785097] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4775.988290] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4776.396285] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4777.212295] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4778.848298] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4781.664292] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4782.120285] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4788.672288] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4793.776046] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 6
[ 4794.752314] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4799.776046] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 5
[ 4801.760290] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4805.776040] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 4
[ 4811.776040] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 3
[ 4817.776050] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 2
[ 4823.776079] drbd base-disk: [drbd_w_base-dis/1734] sock_sendmsg time expired, ko = 1
[ 4827.936300] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
[ 4829.776069] drbd base-disk: peer( Secondary -> Unknown ) conn( SyncSource -> Timeout )
[ 4829.776088] block drbd0: drbd_send_block() failed
Sometimes I also see the message "swiotlb_tbl_map_single: 8 callbacks
suppressed" or similar between the "buffer full" messages.
Sometimes the sync finishes, sometimes it stalls and fails completely.
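To put numbers on an event like the one above, the "buffer is full"
messages can be tallied by requested size. The following is only a
minimal sketch: the function name summarize_swiotlb and the regular
expression are illustrative assumptions based on the log lines quoted
here, not part of any existing tool.

```python
import re
from collections import Counter

# Matches kernel log lines such as:
# [ 4768.992280] tg3 0000:02:00.0: swiotlb buffer is full (sz: 1448 bytes)
PATTERN = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<drv>\S+)\s+(?P<dev>\S+):\s+"
    r"swiotlb buffer is full \(sz: (?P<size>\d+) bytes\)"
)


def summarize_swiotlb(lines):
    """Count 'swiotlb buffer is full' messages and group them by size.

    `lines` is any iterable of log lines (hypothetical input, e.g. an
    open file containing saved dmesg output).
    """
    sizes = Counter()
    stamps = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            sizes[int(m.group("size"))] += 1
            stamps.append(float(m.group("ts")))
    # Time between the first and last matching message, in seconds.
    span = (stamps[-1] - stamps[0]) if stamps else 0.0
    return {"count": len(stamps), "by_size": dict(sizes), "span_seconds": span}
```

Called on a saved copy of the kernel log, for example
summarize_swiotlb(open("dmesg.txt")), it returns the number of
messages, a per-size count, and the time span they cover. Lines that
the kernel suppressed ("callbacks suppressed") are not counted.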
The problem only occurs when running Linux 3.16.0-4-686-pae under Xen
4.4.1. It does NOT occur when booting the same kernel without Xen, or
when booting the corresponding amd64 kernel (3.16.0-4-amd64) with or
without Xen. There was no problem in Debian wheezy before the upgrade
(kernel 3.2.0-4-686-pae and Xen Hypervisor 4.1.3-amd64). The problem
also occurs when only dom0 is running (all domU VMs shut down).
I found the thread "tg3 NIC driver bug in 3.14.x under Xen"
(http://www.spinics.net/lists/netdev/msg324124.html) which looks like a
similar issue, but I don't understand exactly what is going on there and
what I could do to fix or debug it further.
Shall I try to build a 3.16.0-4-686-pae kernel with
"CONFIG_NEED_DMA_MAP_STATE=y"?
Shall I try to set the 'iommu' and/or 'swiotlb' kernel parameters? To
what values?
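For reference when experimenting with the 'swiotlb' parameter: the
kernel's parameter documentation describes its integer value as a
number of I/O TLB slabs, not a size in bytes. The sketch below
converts a buffer size to a slab count. It assumes a slab size of
2 KiB, which should be checked against the kernel source in use, and
the helper name swiotlb_slabs is made up for this illustration.
Whether a larger buffer actually helps in this setup is untested.

```python
# Assumed slab size (2 KiB); verify against the kernel version in use.
SLAB_BYTES = 2048


def swiotlb_slabs(size_mib, slab_bytes=SLAB_BYTES):
    """Return the slab count for a bounce buffer of size_mib MiB."""
    return (size_mib * 1024 * 1024) // slab_bytes


for mib in (64, 128, 256):
    print(f"{mib} MiB -> swiotlb={swiotlb_slabs(mib)}")
```

Under that assumption, a 128 MiB buffer would correspond to
swiotlb=65536 on the dom0 kernel command line.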
Any help or hints on how to fix or work around this issue would be very
much appreciated. Hints on how to debug this further are also welcome.
Thanks,
Marco
P.S. Here is some information that might help to figure out what's going on:
-------------------------------------------------------------------
kepler:~# ethtool -S eth0 | grep -v ': 0$'
NIC statistics:
rx_octets: 42531865
rx_ucast_packets: 582596
rx_mcast_packets: 127
rx_bcast_packets: 1
tx_octets: 8692263469
tx_ucast_packets: 5755264
tx_mcast_packets: 10
-------------------------------------------------------------------
-------------------------------------------------------------------
kepler:~# ethtool -i eth0
driver: tg3
version: 3.137
firmware-version: 5722-v3.09, ASFIPMI v6.03
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
-------------------------------------------------------------------
-------------------------------------------------------------------
kepler:~# lspci -vvv -s 02:00.0
02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5722
Gigabit Ethernet PCI Express
Subsystem: IBM IBM System x3350 (Machine type 4192)
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+
Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 32 bytes
Interrupt: pin A routed to IRQ 59
Region 0: Memory at e8200000 (64-bit, non-prefetchable) [size=64K]
Expansion ROM at <ignored> [disabled]
Capabilities: [48] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
Capabilities: [50] Vital Product Data
Product Name: Broadcom NetXtreme Gigabit Ethernet Controller
Read-only fields:
[PN] Part number: BCM95722
[EC] Engineering changes: 106679-15
[SN] Serial number: 0123456789
[MN] Manufacture ID: 31 34 65 34
[RV] Reserved: checksum good, 28 byte(s) reserved
Read/write fields:
[YA] Asset tag: XYZ01234567
[RW] Read-write area: 107 byte(s) free
End
Capabilities: [58] Vendor Specific Information: Len=78 <?>
Capabilities: [e8] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee0200c Data: 4121
Capabilities: [d0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s
<4us, L1 <64us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive-
BWMgmt- ABWMgmt-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
MalfTLP- ECRC- UnsupReq+ ACSViol-
UESvrt: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF+
MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [13c v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
Status: NegoPending- InProgress-
Capabilities: [160 v1] Device Serial Number 00-21-5e-ff-fe-4d-2c-13
Capabilities: [16c v1] Power Budgeting <?>
Kernel driver in use: tg3
-------------------------------------------------------------------
-------------------------------------------------------------------
kepler:~# brctl show
bridge name bridge id STP enabled interfaces
xenbrext0 8000.00215e4d2c14 no eth1
xenbrint0 8000.00215e4d2c13 no eth0
kepler:~# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:21:5e:4d:2c:13
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:582865 errors:0 dropped:0 overruns:0 frame:0
TX packets:5755690 errors:0 dropped:1153 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:42557655 (40.5 MiB) TX bytes:8692339211 (8.0 GiB)
Interrupt:16
kepler:~# ifconfig xenbrint0
xenbrint0 Link encap:Ethernet HWaddr 00:21:5e:4d:2c:13
inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0
inet6 addr: 2001:1620:206b:1::2:1/64 Scope:Global
inet6 addr: fe80::221:5eff:fe4d:2c13/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:582461 errors:0 dropped:0 overruns:0 frame:0
TX packets:329904 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:32044143 (30.5 MiB) TX bytes:8330130321 (7.7 GiB)
-------------------------------------------------------------------
-------------------------------------------------------------------
kepler:~# cat /proc/version
Linux version 3.16.0-4-686-pae (debian-kernel@lists.debian.org) (gcc
version 4.8.4 (Debian 4.8.4-1) ) #1 SMP Debian 3.16.7-ckt9-3~deb8u1
(2015-04-24)
kepler:~# grep -e SWIOTLB -e CONFIG_NEED_DMA_MAP_STATE /boot/config-*
/boot/config-3.16.0-4-686-pae:CONFIG_SWIOTLB=y
/boot/config-3.16.0-4-686-pae:CONFIG_SWIOTLB_XEN=y
/boot/config-3.16.0-4-amd64:CONFIG_NEED_DMA_MAP_STATE=y
/boot/config-3.16.0-4-amd64:CONFIG_SWIOTLB=y
/boot/config-3.16.0-4-amd64:CONFIG_SWIOTLB_XEN=y
-------------------------------------------------------------------
-------------------------------------------------------------------
kepler:~# xen info
host : kepler
release : 3.16.0-4-686-pae
version : #1 SMP Debian 3.16.7-ckt9-3~deb8u1 (2015-04-24)
machine : i686
nr_cpus : 2
max_cpu_id : 1
nr_nodes : 1
cores_per_socket : 2
threads_per_core : 1
cpu_mhz : 2400
hw_caps :
bfebfbff:20100800:00000000:00000900:0000e39d:00000000:00000001:00000000
virt_caps :
total_memory : 8189
free_memory : 3999
sharing_freed_memory : 0
sharing_used_memory : 0
outstanding_claims : 0
free_cpus : 0
xen_major : 4
xen_minor : 4
xen_extra : .1
xen_version : 4.4.1
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p
xen_scheduler : credit
xen_pagesize : 4096
platform_params : virt_start=0xff400000
xen_changeset :
xen_commandline : placeholder com1=115200,8n1 console=com1
dom0_mem=4096M,max:4096M
cc_compiler : gcc (Debian 4.9.2-10) 4.9.2
cc_compile_by : waldi
cc_compile_domain : debian.org
cc_compile_date : Mon Apr 6 19:49:18 UTC 2015
xend_config_format : 4
-------------------------------------------------------------------