From: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Cc: leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: multipath IB/srp fail-over testing lands up in dump stack in swiotlb_alloc_coherent()
Date: Wed, 15 Jun 2016 08:02:13 -0400 (EDT)
Message-ID: <794983323.42297890.1465992133003.JavaMail.zimbra@redhat.com>
In-Reply-To: <109658870.42286330.1465988279277.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
----- Original Message -----
> From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> To: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> Cc: leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org, "Yishai Hadas" <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Sent: Wednesday, June 15, 2016 6:57:59 AM
> Subject: Re: multipath IB/srp fail-over testing lands up in dump stack in swiotlb_alloc_coherent()
>
>
>
> ----- Original Message -----
> > From: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> > To: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
> > Cc: "Yishai Hadas" <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Sent: Wednesday, June 15, 2016 3:40:23 AM
> > Subject: Re: multipath IB/srp fail-over testing lands up in dump stack in
> > swiotlb_alloc_coherent()
> >
> > On 06/14/2016 08:41 PM, Laurence Oberman wrote:
> > > This may be a data point here.
> > > After each change I have rebooted the host, as is required.
> > > I am at swiotlb=16, and after the first reboot with maxed-out tuning I had
> > > no alerts.
> > > On the second controller restart, without a system reboot, I got them
> > > again.
> > >
> > > Again, I never see these other than when I am in the reconnect loop, and
> > > they seem to be non-intrusive as each time I recover fully.
> > >
> > > When I first changed to 4 and had not increased the ib_srp parameters, I
> > > had two restarts with no messages, which is what led me to report that
> > > this seemed to have worked.
> > > I can see now that this was not the case and, as already mentioned, the
> > > claim that the change fixed this was wrong.
> > > Apologies for that.
> > >
> > > I am continuing to research and debug now.
> >
> > Hello Laurence,
> >
> > In the kernel source tree I found the following:
> >
> > From include/linux/swiotlb.h:
> >
> > #define IO_TLB_SHIFT 11
> >
> > From lib/swiotlb.c:
> >
> > #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> > [ ... ]
> > #define IO_TLB_DEFAULT_SIZE (64UL<<20)
> > [ ... ]
> > static int __init
> > setup_io_tlb_npages(char *str)
> > {
> >         if (isdigit(*str)) {
> >                 io_tlb_nslabs = simple_strtoul(str, &str, 0);
> >                 /* avoid tail segment of size < IO_TLB_SEGSIZE */
> >                 io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
> >         }
> >         [ ... ]
> > }
> > early_param("swiotlb", setup_io_tlb_npages);
> > [ ... ]
> > void __init
> > swiotlb_init(int verbose)
> > {
> >         size_t default_size = IO_TLB_DEFAULT_SIZE;
> >         [ ... ]
> >
> >         if (!io_tlb_nslabs) {
> >                 io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
> >                 io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
> >         }
> >         [ ... ]
> > }
> >
> > I think this means that the swiotlb parameter has to be set to a value
> > above 32768 to increase the number of swiotlb buffers above the default.
> >
> > Bart.
> >
> >
> >
> Hello Bart
>
> I will try that.
> When I looked at the code I saw it being set to 1 as a default, and read the
> doc comments as a slab count, so I figured it's an int and would be
> calculated as n x slabs.
> I guess that's another documentation update needed for the kernel docs.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
We are missing something here.
I set it to 532480, double the 266240 from the messages below, where it says we are full at 266240.
I will instrument the kernel and see what it gets set to, to make sure we see what's happening.
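
For reference, the slab arithmetic in the source Bart quoted can be checked with a small user-space sketch. The two constants are copied from the quoted kernel source; the helper names are made up for illustration and are not kernel functions. It assumes the swiotlb= value is taken as a slab count, as the quoted parser suggests.

```c
#include <stdio.h>

/* Constants copied from the kernel source quoted earlier in this thread. */
#define IO_TLB_SHIFT        11            /* one slab is 1 << 11 = 2048 bytes */
#define IO_TLB_DEFAULT_SIZE (64UL << 20)  /* default bounce buffer: 64 MiB */

/* Hypothetical helper: bytes covered by a given swiotlb= slab count. */
static unsigned long long swiotlb_slabs_to_bytes(unsigned long long nslabs)
{
    return nslabs << IO_TLB_SHIFT;
}

/* Hypothetical helper: slab count that the default size corresponds to. */
static unsigned long long swiotlb_default_slabs(void)
{
    return (unsigned long long)IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
}

/* Hypothetical helper: print the conversion for one candidate value. */
static void swiotlb_report(unsigned long long nslabs)
{
    printf("swiotlb=%llu -> %llu bytes (default is %llu slabs)\n",
           nslabs, swiotlb_slabs_to_bytes(nslabs), swiotlb_default_slabs());
}
```

With these values the default works out to 32768 slabs, matching Bart's figure, and swiotlb=532480 would correspond to 532480 << 11 = 1090519040 bytes (1040 MiB). Calling swiotlb_report(532480) from a small main() prints both numbers.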
BOOT_IMAGE=/vmlinuz-4.7.0-rc1.bart.swiotlb+ root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS1,115200n8 scsi_mod.use_blk_mq=1 swiotlb=532480
dmesg | grep -i swio
[ 0.000000] Linux version 4.7.0-rc1.bart.swiotlb+ (loberman@jumptest1) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #5 SMP Mon Jun 13 21:09:50 EDT 2016
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc1.bart.swiotlb+ root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS1,115200n8 scsi_mod.use_blk_mq=1 swiotlb=532480
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc1.bart.swiotlb+ root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS1,115200n8 scsi_mod.use_blk_mq=1 swiotlb=532480
[ 4.663794] PCI-DMA: Using software bounce buffering for IO (SWIOTLB) **** Note
[ 4.917998] usb usb1: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ ehci_hcd
[ 4.954666] usb usb2: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[ 5.083110] usb usb3: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[ 5.111634] usb usb4: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[ 5.240089] usb usb5: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[ 5.373986] usb usb6: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[ 1403.045092] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1403.045095] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1403.075632] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1403.075634] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1404.091624] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1404.091627] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1404.207057] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1404.207060] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1404.673154] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1404.673157] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1414.717610] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1414.758355] RHDEBUG: wrap=56 index=56
[ 1414.779978] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1415.016524] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1415.053908] RHDEBUG: wrap=56 index=56
[ 1415.073408] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1415.143262] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1415.183465] RHDEBUG: wrap=56 index=56
[ 1415.204337] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
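
One thing that may be worth checking against the log above is how many slabs a single 266240-byte mapping needs. The sketch below is user-space C with hypothetical helper names. IO_TLB_SHIFT is taken from the quoted source, while IO_TLB_SEGSIZE = 128 is an assumption (that define is not quoted anywhere in this thread) and should be verified in include/linux/swiotlb.h for the kernel under test.

```c
#include <stdbool.h>
#include <stddef.h>

#define IO_TLB_SHIFT   11   /* quoted from include/linux/swiotlb.h above */
#define IO_TLB_SEGSIZE 128  /* ASSUMPTION: not quoted in this thread, verify it */

/* Hypothetical helper: slabs needed for a mapping of `bytes`, rounded up. */
static size_t slabs_needed(size_t bytes)
{
    return (bytes + ((size_t)1 << IO_TLB_SHIFT) - 1) >> IO_TLB_SHIFT;
}

/* Hypothetical helper: bytes covered by one full segment of slabs. */
static size_t max_contiguous_bytes(void)
{
    return (size_t)IO_TLB_SEGSIZE << IO_TLB_SHIFT;
}

/* Hypothetical helper: true if the request fits within a single segment. */
static bool fits_in_one_segment(size_t bytes)
{
    return slabs_needed(bytes) <= IO_TLB_SEGSIZE;
}
```

Under that assumption a 266240-byte request needs 130 slabs, while one segment holds 128 slabs (262144 bytes). If contiguous allocations cannot span a segment, it is possible that raising swiotlb= alone does not help for this request size. This is only a hypothesis until confirmed with the instrumented kernel.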