From: Laurence Oberman <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
To: Bart Van Assche <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
Cc: leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org,
	Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: multipath IB/srp fail-over testing lands up in dump stack in swiotlb_alloc_coherent()
Date: Wed, 15 Jun 2016 08:02:13 -0400 (EDT)
Message-ID: <794983323.42297890.1465992133003.JavaMail.zimbra@redhat.com>
In-Reply-To: <109658870.42286330.1465988279277.JavaMail.zimbra-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>



----- Original Message -----
> From: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> To: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> Cc: leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org, "Yishai Hadas" <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> Sent: Wednesday, June 15, 2016 6:57:59 AM
> Subject: Re: multipath IB/srp fail-over testing lands up in dump stack in swiotlb_alloc_coherent()
> 
> 
> 
> ----- Original Message -----
> > From: "Bart Van Assche" <bart.vanassche-XdAiOPVOjttBDgjK7y7TUQ@public.gmane.org>
> > To: "Laurence Oberman" <loberman-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>, leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org
> > Cc: "Yishai Hadas" <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>, linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> > Sent: Wednesday, June 15, 2016 3:40:23 AM
> > Subject: Re: multipath IB/srp fail-over testing lands up in dump stack in
> > swiotlb_alloc_coherent()
> > 
> > On 06/14/2016 08:41 PM, Laurence Oberman wrote:
> > > This may be a data point here.
> > > After each change I have rebooted the host, as it's required.
> > > I am at swiotlb=16 and after the first reboot with maxed out tuning I had
> > > no alerts.
> > > On the second controller restart without a system reboot I got them
> > > again.
> > >
> > > Again, I never see these other than when I am in the reconnect loop, and
> > > they seem to be non-intrusive as each time I recover fully.
> > >
> > > When I first changed to 4 and had not increased the ib_srp parameters I
> > > had two restarts with no messages, so that was what led me to report
> > > that this seems to have worked.
> > > I can see now that this was not the case and, as already mentioned, the
> > > claim that the change fixed this was wrong.
> > > Apologies for that.
> > >
> > > I am continuing to research and debug now.
> > 
> > Hello Laurence,
> > 
> > In the kernel source tree I found the following:
> > 
> >  From include/linux/swiotlb.h:
> > 
> > #define IO_TLB_SHIFT 11
> > 
> >  From lib/swiotlb.c:
> > 
> > #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
> > [ ... ]
> > #define IO_TLB_DEFAULT_SIZE (64UL<<20)
> > [ ... ]
> > static int __init
> > setup_io_tlb_npages(char *str)
> > {
> > 	if (isdigit(*str)) {
> > 		io_tlb_nslabs = simple_strtoul(str, &str, 0);
> > 		/* avoid tail segment of size < IO_TLB_SEGSIZE */
> > 		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
> > 	}
> >          [ ... ]
> > }
> > early_param("swiotlb", setup_io_tlb_npages);
> > [ ... ]
> > void  __init
> > swiotlb_init(int verbose)
> > {
> > 	size_t default_size = IO_TLB_DEFAULT_SIZE;
> > 	[ ... ]
> > 
> > 	if (!io_tlb_nslabs) {
> > 		io_tlb_nslabs = (default_size >> IO_TLB_SHIFT);
> > 		io_tlb_nslabs = ALIGN(io_tlb_nslabs, IO_TLB_SEGSIZE);
> > 	}
> > 	[ ... ]
> > }
> > 
> > I think this means that the swiotlb parameter has to be set to a value
> > above 32768 to increase the number of swiotlb buffers above the default.
> > 
> > Bart.
> > 
> > 
> > 
> Hello Bart
> 
> I will try that.
> When I looked at the code I saw it being set to 1 as a default, and read the
> Doc comments as a slab count, so I figured it's an int and would be
> calculated as n x slabs.
> I guess that's another Document update needed for kernel docs.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

We are missing something here.

I set it to 532480, double the 266240 in the message below where it says we are full at 266240 bytes.

I will instrument the kernel and see what it actually gets set to, to make sure we see what's happening.

BOOT_IMAGE=/vmlinuz-4.7.0-rc1.bart.swiotlb+ root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS1,115200n8 scsi_mod.use_blk_mq=1 swiotlb=532480
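
For reference, the slab arithmetic behind the swiotlb= value can be sketched as below. This is only a sketch built from the two constants Bart quoted (IO_TLB_SHIFT and IO_TLB_DEFAULT_SIZE); the helper names slabs_to_bytes() and default_nslabs() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* From include/linux/swiotlb.h as quoted above: one slab is 1 << 11 = 2048 bytes. */
#define IO_TLB_SHIFT 11
/* From lib/swiotlb.c as quoted above: the default pool size is 64 MiB. */
#define IO_TLB_DEFAULT_SIZE (64UL << 20)

/* Hypothetical helper: bytes covered by a given number of slabs. */
static inline uint64_t slabs_to_bytes(uint64_t nslabs)
{
	/* Guard against overflow of the shift below. */
	assert(nslabs <= (UINT64_MAX >> IO_TLB_SHIFT));
	return nslabs << IO_TLB_SHIFT;
}

/* Hypothetical helper: slab count used when swiotlb= is not given. */
static inline uint64_t default_nslabs(void)
{
	return (uint64_t)IO_TLB_DEFAULT_SIZE >> IO_TLB_SHIFT;
}
```

With these constants default_nslabs() is 32768, and slabs_to_bytes(532480) is 1090519040 bytes, i.e. 1040 MiB, so the value on the command line above should be well past the default if it is taken as a slab count.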

dmesg | grep -i swio
[    0.000000] Linux version 4.7.0-rc1.bart.swiotlb+ (loberman@jumptest1) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #5 SMP Mon Jun 13 21:09:50 EDT 2016
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc1.bart.swiotlb+ root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS1,115200n8 scsi_mod.use_blk_mq=1 swiotlb=532480
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.7.0-rc1.bart.swiotlb+ root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap console=ttyS1,115200n8 scsi_mod.use_blk_mq=1 swiotlb=532480

[    4.663794] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)           **** Note

[    4.917998] usb usb1: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ ehci_hcd
[    4.954666] usb usb2: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[    5.083110] usb usb3: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[    5.111634] usb usb4: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[    5.240089] usb usb5: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[    5.373986] usb usb6: Manufacturer: Linux 4.7.0-rc1.bart.swiotlb+ uhci_hcd
[ 1403.045092] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1403.045095] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1403.075632] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1403.075634] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1404.091624] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1404.091627] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1404.207057] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1404.207060] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1404.673154] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1404.673157] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1414.717610] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1414.779978] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1415.016524] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1415.073408] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1415.143262] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1415.204337] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff


[ 1414.717610] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1414.758355] RHDEBUG: wrap=56 index=56
[ 1414.779978] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1415.016524] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1415.053908] RHDEBUG: wrap=56 index=56
[ 1415.073408] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
[ 1415.143262] mlx5_core 0000:08:00.0: swiotlb buffer is full (sz: 266240 bytes)
[ 1415.183465] RHDEBUG: wrap=56 index=56
[ 1415.204337] RHDEBUG: SWIOTLB_MAP_ERROR ffffffffffffffff
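
One thing that may be worth checking is how many slabs a single 266240-byte request needs. The sketch below does that arithmetic. Note the assumptions: IO_TLB_SEGSIZE is not shown in the excerpt quoted earlier, and the value 128 used here is what I believe include/linux/swiotlb.h defines for kernels of this era, so please verify it against the tree under test. The helper names bytes_to_slabs() and fits_in_one_segment() are hypothetical, and treating one segment as the upper bound for a single mapping is likewise an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* From include/linux/swiotlb.h as quoted earlier: one slab is 2048 bytes. */
#define IO_TLB_SHIFT 11
/* ASSUMPTION: not in the quoted excerpt; believed to be 128 in this kernel. */
#define IO_TLB_SEGSIZE 128

/* Hypothetical helper: slabs needed for a mapping of 'size' bytes, rounded up. */
static inline uint64_t bytes_to_slabs(uint64_t size)
{
	const uint64_t slab = 1ULL << IO_TLB_SHIFT;

	assert(size > 0);
	/* Round up without risking overflow in size + slab - 1. */
	return (size >> IO_TLB_SHIFT) + ((size & (slab - 1)) != 0);
}

/* Hypothetical helper: does the request fit in one segment of slabs? */
static inline int fits_in_one_segment(uint64_t size)
{
	return bytes_to_slabs(size) <= IO_TLB_SEGSIZE;
}
```

Under those assumptions bytes_to_slabs(266240) is 130, which is more than 128, so fits_in_one_segment(266240) is false no matter how large the pool is. If that holds for this kernel, raising swiotlb= alone would not be expected to make these messages go away.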
