* nvme-tcp request timeouts
@ 2022-10-11 15:31 Seth Forshee
  2022-10-11 19:30 ` Chaitanya Kulkarni
  0 siblings, 1 reply; 10+ messages in thread

From: Seth Forshee @ 2022-10-11 15:31 UTC (permalink / raw)
To: Christoph Hellwig, Sagi Grimberg, Chaitanya Kulkarni; +Cc: linux-nvme

Hi,

I'm seeing timeouts like the following from nvme-tcp:

[ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4
[ 6369.513283] nvme nvme5: starting error recovery
[ 6369.514379] block nvme5n1: no usable path - requeuing I/O
[ 6369.514385] block nvme5n1: no usable path - requeuing I/O
[ 6369.514392] block nvme5n1: no usable path - requeuing I/O
[ 6369.514393] block nvme5n1: no usable path - requeuing I/O
[ 6369.514401] block nvme5n1: no usable path - requeuing I/O
[ 6369.514414] block nvme5n1: no usable path - requeuing I/O
[ 6369.514420] block nvme5n1: no usable path - requeuing I/O
[ 6369.514427] block nvme5n1: no usable path - requeuing I/O
[ 6369.514430] block nvme5n1: no usable path - requeuing I/O
[ 6369.514432] block nvme5n1: no usable path - requeuing I/O
[ 6369.514926] nvme nvme5: Reconnecting in 10 seconds...
[ 6379.761015] nvme nvme5: creating 128 I/O queues.
[ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues.
[ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt)

This is with 6.0, using nvmet-tcp on a different machine as the target.
I've seen this sporadically with several test cases. The fio fio-rand-RW
example test is a pretty good reproducer when numjobs is increased (I'm
setting it equal to the number of CPUs in the system).

Let me know what I can do to help debug this. I'm currently adding some
tracing to the driver to see if I can get an idea of the sequence of
events that leads to this problem.

Thanks,
Seth

^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: nvme-tcp request timeouts
  2022-10-11 15:31 nvme-tcp request timeouts Seth Forshee
@ 2022-10-11 19:30 ` Chaitanya Kulkarni
  2022-10-11 20:14 ` Seth Forshee
  0 siblings, 1 reply; 10+ messages in thread

From: Chaitanya Kulkarni @ 2022-10-11 19:30 UTC (permalink / raw)
To: Seth Forshee
Cc: linux-nvme@lists.infradead.org, Sagi Grimberg, Chaitanya Kulkarni,
    Christoph Hellwig

Hi Seth,

On 10/11/22 08:31, Seth Forshee wrote:
> Hi,
>
> I'm seeing timeouts like the following from nvme-tcp:
>
> [ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4
> [ 6369.513283] nvme nvme5: starting error recovery
> [ 6369.514379] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514385] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514392] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514393] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514401] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514414] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514420] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514427] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514430] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514432] block nvme5n1: no usable path - requeuing I/O
> [ 6369.514926] nvme nvme5: Reconnecting in 10 seconds...
> [ 6379.761015] nvme nvme5: creating 128 I/O queues.
> [ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues.
> [ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt)
>
> This is with 6.0, using nvmet-tcp on a different machine as the target.
> I've seen this sporadically with several test cases. The fio fio-rand-RW
> example test is a pretty good reproducer when numjobs in increased (I'm
> setting it equal to the number of CPUs in the system).
>
> Let me know what I can do to help debug this. I'm currently adding some
> tracing to the driver to see if I can get an idea of the sequence of
> events that leads to this problem.
>
> Thanks,
> Seth
>

Can you bisect it? That will help to understand the commit causing the
issue.

-ck

^ permalink raw reply [flat|nested] 10+ messages in thread
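A bisect of the kind suggested above would look roughly like the following. This is only a sketch; the tags are placeholders, and in particular v5.9 stands in for whatever kernel turns out not to show the timeouts.

  # assumes a mainline checkout and a reliable reproducer for each step
  git bisect start
  git bisect bad v6.0        # a kernel known to show the timeouts
  git bisect good v5.9       # placeholder for a kernel that does not
  # build and boot each candidate, run the fio reproducer, then mark it:
  git bisect good            # or "git bisect bad", as observed
  git bisect reset           # when finished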
* Re: nvme-tcp request timeouts 2022-10-11 19:30 ` Chaitanya Kulkarni @ 2022-10-11 20:14 ` Seth Forshee 2022-10-11 20:19 ` Chaitanya Kulkarni 0 siblings, 1 reply; 10+ messages in thread From: Seth Forshee @ 2022-10-11 20:14 UTC (permalink / raw) To: Chaitanya Kulkarni Cc: linux-nvme@lists.infradead.org, Sagi Grimberg, Christoph Hellwig On Tue, Oct 11, 2022 at 07:30:56PM +0000, Chaitanya Kulkarni wrote: > Hi Seth, > > On 10/11/22 08:31, Seth Forshee wrote: > > Hi, > > > > I'm seeing timeouts like the following from nvme-tcp: > > > > [ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4 > > [ 6369.513283] nvme nvme5: starting error recovery > > [ 6369.514379] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514385] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514392] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514393] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514401] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514414] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514420] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514427] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514430] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514432] block nvme5n1: no usable path - requeuing I/O > > [ 6369.514926] nvme nvme5: Reconnecting in 10 seconds... > > [ 6379.761015] nvme nvme5: creating 128 I/O queues. > > [ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues. > > [ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt) > > > > This is with 6.0, using nvmet-tcp on a different machine as the target. > > I've seen this sporadically with several test cases. The fio fio-rand-RW > > example test is a pretty good reproducer when numjobs in increased (I'm > > setting it equal to the number of CPUs in the system). > > > > Let me know what I can do to help debug this. I'm currently adding some > > tracing to the driver to see if I can get an idea of the sequence of > > events that leads to this problem. > > > > Thanks, > > Seth > > > > Can you bisect it ? that will help to understand the commit causing > issue. I don't know of any "good" version right now. I started with a 5.10 kernel and saw this, and tested 6.0 and still see it. I found several commits since 5.10 which fix some kind of timeouts: a0fdd1418007 nvme-tcp: rerun io_work if req_list is not empty 70f437fb4395 nvme-tcp: fix io_work priority inversion 3770a42bb8ce nvme-tcp: fix regression that causes sporadic requests to time out 5.10 still has timeouts with these backported, so whatever the problem is it has existed at least that long. I suppose I could go back to older kernels with these backported if that's going to be the best path forward here. Thanks, Seth ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: nvme-tcp request timeouts 2022-10-11 20:14 ` Seth Forshee @ 2022-10-11 20:19 ` Chaitanya Kulkarni 2022-10-11 20:37 ` Seth Forshee 0 siblings, 1 reply; 10+ messages in thread From: Chaitanya Kulkarni @ 2022-10-11 20:19 UTC (permalink / raw) To: Seth Forshee Cc: linux-nvme@lists.infradead.org, Sagi Grimberg, Christoph Hellwig On 10/11/22 13:14, Seth Forshee wrote: > On Tue, Oct 11, 2022 at 07:30:56PM +0000, Chaitanya Kulkarni wrote: >> Hi Seth, >> >> On 10/11/22 08:31, Seth Forshee wrote: >>> Hi, >>> >>> I'm seeing timeouts like the following from nvme-tcp: >>> >>> [ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4 >>> [ 6369.513283] nvme nvme5: starting error recovery >>> [ 6369.514379] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514385] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514392] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514393] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514401] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514414] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514420] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514427] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514430] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514432] block nvme5n1: no usable path - requeuing I/O >>> [ 6369.514926] nvme nvme5: Reconnecting in 10 seconds... >>> [ 6379.761015] nvme nvme5: creating 128 I/O queues. >>> [ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues. >>> [ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt) >>> >>> This is with 6.0, using nvmet-tcp on a different machine as the target. >>> I've seen this sporadically with several test cases. The fio fio-rand-RW >>> example test is a pretty good reproducer when numjobs in increased (I'm >>> setting it equal to the number of CPUs in the system). >>> >>> Let me know what I can do to help debug this. I'm currently adding some >>> tracing to the driver to see if I can get an idea of the sequence of >>> events that leads to this problem. >>> >>> Thanks, >>> Seth >>> >> >> Can you bisect it ? that will help to understand the commit causing >> issue. > > I don't know of any "good" version right now. I started with a 5.10 > kernel and saw this, and tested 6.0 and still see it. I found several > commits since 5.10 which fix some kind of timeouts: > > a0fdd1418007 nvme-tcp: rerun io_work if req_list is not empty > 70f437fb4395 nvme-tcp: fix io_work priority inversion > 3770a42bb8ce nvme-tcp: fix regression that causes sporadic requests to time out > > 5.10 still has timeouts with these backported, so whatever the problem > is it has existed at least that long. I suppose I could go back to older > kernels with these backported if that's going to be the best path > forward here. > > Thanks, > Seth Can you please share the fio config you are using ? -ck ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: nvme-tcp request timeouts 2022-10-11 20:19 ` Chaitanya Kulkarni @ 2022-10-11 20:37 ` Seth Forshee 2022-10-12 6:33 ` Sagi Grimberg 2022-10-12 7:51 ` Sagi Grimberg 0 siblings, 2 replies; 10+ messages in thread From: Seth Forshee @ 2022-10-11 20:37 UTC (permalink / raw) To: Chaitanya Kulkarni Cc: linux-nvme@lists.infradead.org, Sagi Grimberg, Christoph Hellwig On Tue, Oct 11, 2022 at 08:19:47PM +0000, Chaitanya Kulkarni wrote: > On 10/11/22 13:14, Seth Forshee wrote: > > On Tue, Oct 11, 2022 at 07:30:56PM +0000, Chaitanya Kulkarni wrote: > >> Hi Seth, > >> > >> On 10/11/22 08:31, Seth Forshee wrote: > >>> Hi, > >>> > >>> I'm seeing timeouts like the following from nvme-tcp: > >>> > >>> [ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4 > >>> [ 6369.513283] nvme nvme5: starting error recovery > >>> [ 6369.514379] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514385] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514392] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514393] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514401] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514414] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514420] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514427] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514430] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514432] block nvme5n1: no usable path - requeuing I/O > >>> [ 6369.514926] nvme nvme5: Reconnecting in 10 seconds... > >>> [ 6379.761015] nvme nvme5: creating 128 I/O queues. > >>> [ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues. > >>> [ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt) > >>> > >>> This is with 6.0, using nvmet-tcp on a different machine as the target. > >>> I've seen this sporadically with several test cases. The fio fio-rand-RW > >>> example test is a pretty good reproducer when numjobs in increased (I'm > >>> setting it equal to the number of CPUs in the system). > >>> > >>> Let me know what I can do to help debug this. I'm currently adding some > >>> tracing to the driver to see if I can get an idea of the sequence of > >>> events that leads to this problem. > >>> > >>> Thanks, > >>> Seth > >>> > >> > >> Can you bisect it ? that will help to understand the commit causing > >> issue. > > > > I don't know of any "good" version right now. I started with a 5.10 > > kernel and saw this, and tested 6.0 and still see it. I found several > > commits since 5.10 which fix some kind of timeouts: > > > > a0fdd1418007 nvme-tcp: rerun io_work if req_list is not empty > > 70f437fb4395 nvme-tcp: fix io_work priority inversion > > 3770a42bb8ce nvme-tcp: fix regression that causes sporadic requests to time out > > > > 5.10 still has timeouts with these backported, so whatever the problem > > is it has existed at least that long. I suppose I could go back to older > > kernels with these backported if that's going to be the best path > > forward here. > > > > Thanks, > > Seth > > Can you please share the fio config you are using ? Sure. Note that I can reproduce it with a lower number of numjobs, but higher numbers make it easier, so I set it to the number of CPUs present on the system I'm using to test. [global] name=fio-rand-RW filename=fio-rand-RW rw=randrw rwmixread=60 rwmixwrite=40 bs=4K direct=0 numjobs=128 time_based runtime=900 [file1] size=10G ioengine=libaio iodepth=16 ^ permalink raw reply [flat|nested] 10+ messages in thread
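For anyone reproducing this without the example job file, the same workload can be expressed directly on the fio command line, which also makes it easy to tie numjobs to the CPU count. A sketch, equivalent to the job file above:

  fio --name=fio-rand-RW --filename=fio-rand-RW --rw=randrw \
      --rwmixread=60 --rwmixwrite=40 --bs=4K --direct=0 \
      --numjobs=$(nproc) --time_based --runtime=900 \
      --size=10G --ioengine=libaio --iodepth=16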
* Re: nvme-tcp request timeouts 2022-10-11 20:37 ` Seth Forshee @ 2022-10-12 6:33 ` Sagi Grimberg 2022-10-12 16:55 ` Seth Forshee 2022-10-12 7:51 ` Sagi Grimberg 1 sibling, 1 reply; 10+ messages in thread From: Sagi Grimberg @ 2022-10-12 6:33 UTC (permalink / raw) To: Seth Forshee, Chaitanya Kulkarni Cc: linux-nvme@lists.infradead.org, Christoph Hellwig Hey Seth, thanks for reporting. >>>> Hi Seth, >>>> >>>> On 10/11/22 08:31, Seth Forshee wrote: >>>>> Hi, >>>>> >>>>> I'm seeing timeouts like the following from nvme-tcp: >>>>> >>>>> [ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4 >>>>> [ 6369.513283] nvme nvme5: starting error recovery >>>>> [ 6369.514379] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514385] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514392] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514393] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514401] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514414] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514420] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514427] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514430] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514432] block nvme5n1: no usable path - requeuing I/O >>>>> [ 6369.514926] nvme nvme5: Reconnecting in 10 seconds... >>>>> [ 6379.761015] nvme nvme5: creating 128 I/O queues. >>>>> [ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues. >>>>> [ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt) >>>>> >>>>> This is with 6.0, using nvmet-tcp on a different machine as the target. >>>>> I've seen this sporadically with several test cases. The fio fio-rand-RW >>>>> example test is a pretty good reproducer when numjobs in increased (I'm >>>>> setting it equal to the number of CPUs in the system). >>>>> >>>>> Let me know what I can do to help debug this. I'm currently adding some >>>>> tracing to the driver to see if I can get an idea of the sequence of >>>>> events that leads to this problem. >>>>> >>>>> Thanks, >>>>> Seth >>>>> >>>> >>>> Can you bisect it ? that will help to understand the commit causing >>>> issue. >>> >>> I don't know of any "good" version right now. I started with a 5.10 >>> kernel and saw this, and tested 6.0 and still see it. I found several >>> commits since 5.10 which fix some kind of timeouts: >>> >>> a0fdd1418007 nvme-tcp: rerun io_work if req_list is not empty >>> 70f437fb4395 nvme-tcp: fix io_work priority inversion >>> 3770a42bb8ce nvme-tcp: fix regression that causes sporadic requests to time out >>> >>> 5.10 still has timeouts with these backported, so whatever the problem >>> is it has existed at least that long. I suppose I could go back to older >>> kernels with these backported if that's going to be the best path >>> forward here. >>> >>> Thanks, >>> Seth >> >> Can you please share the fio config you are using ? > > Sure. Note that I can reproduce it with a lower number of numjobs, but > higher numbers make it easier, so I set it to the number of CPUs present > on the system I'm using to test. > > > [global] > name=fio-rand-RW > filename=fio-rand-RW > rw=randrw > rwmixread=60 > rwmixwrite=40 > bs=4K > direct=0 Does this happen with direct=1? > numjobs=128 > time_based > runtime=900 > > [file1] > size=10G > ioengine=libaio > iodepth=16 > Is it possible that the backend nvmet device is not fast enough? Are your backend devices nvme? Can you share nvmetcli ls output? 
Also, just to understand if things are slow or stuck, perhaps you can artificially increase the io_timeout to say 60s or 120s. ^ permalink raw reply [flat|nested] 10+ messages in thread
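For reference, the I/O timeout can be raised without rebuilding the kernel via the nvme_core module parameter; a sketch, assuming the parameter is available and writable on the kernel in question (it likely only takes effect for controllers set up after the change, so reconnect afterwards):

  cat /sys/module/nvme_core/parameters/io_timeout      # current value, in seconds
  echo 120 > /sys/module/nvme_core/parameters/io_timeout
  # or set it at boot with nvme_core.io_timeout=120 on the kernel command line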
* Re: nvme-tcp request timeouts 2022-10-12 6:33 ` Sagi Grimberg @ 2022-10-12 16:55 ` Seth Forshee 2022-10-12 17:30 ` Sagi Grimberg 0 siblings, 1 reply; 10+ messages in thread From: Seth Forshee @ 2022-10-12 16:55 UTC (permalink / raw) To: Sagi Grimberg Cc: Chaitanya Kulkarni, linux-nvme@lists.infradead.org, Christoph Hellwig On Wed, Oct 12, 2022 at 09:33:19AM +0300, Sagi Grimberg wrote: > Hey Seth, thanks for reporting. > > > > > > Hi Seth, > > > > > > > > > > On 10/11/22 08:31, Seth Forshee wrote: > > > > > > Hi, > > > > > > > > > > > > I'm seeing timeouts like the following from nvme-tcp: > > > > > > > > > > > > [ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4 > > > > > > [ 6369.513283] nvme nvme5: starting error recovery > > > > > > [ 6369.514379] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514385] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514392] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514393] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514401] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514414] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514420] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514427] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514430] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514432] block nvme5n1: no usable path - requeuing I/O > > > > > > [ 6369.514926] nvme nvme5: Reconnecting in 10 seconds... > > > > > > [ 6379.761015] nvme nvme5: creating 128 I/O queues. > > > > > > [ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues. > > > > > > [ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt) > > > > > > > > > > > > This is with 6.0, using nvmet-tcp on a different machine as the target. > > > > > > I've seen this sporadically with several test cases. The fio fio-rand-RW > > > > > > example test is a pretty good reproducer when numjobs in increased (I'm > > > > > > setting it equal to the number of CPUs in the system). > > > > > > > > > > > > Let me know what I can do to help debug this. I'm currently adding some > > > > > > tracing to the driver to see if I can get an idea of the sequence of > > > > > > events that leads to this problem. > > > > > > > > > > > > Thanks, > > > > > > Seth > > > > > > > > > > > > > > > > Can you bisect it ? that will help to understand the commit causing > > > > > issue. > > > > > > > > I don't know of any "good" version right now. I started with a 5.10 > > > > kernel and saw this, and tested 6.0 and still see it. I found several > > > > commits since 5.10 which fix some kind of timeouts: > > > > > > > > a0fdd1418007 nvme-tcp: rerun io_work if req_list is not empty > > > > 70f437fb4395 nvme-tcp: fix io_work priority inversion > > > > 3770a42bb8ce nvme-tcp: fix regression that causes sporadic requests to time out > > > > > > > > 5.10 still has timeouts with these backported, so whatever the problem > > > > is it has existed at least that long. I suppose I could go back to older > > > > kernels with these backported if that's going to be the best path > > > > forward here. > > > > > > > > Thanks, > > > > Seth > > > > > > Can you please share the fio config you are using ? > > > > Sure. Note that I can reproduce it with a lower number of numjobs, but > > higher numbers make it easier, so I set it to the number of CPUs present > > on the system I'm using to test. 
> > > > > > [global] > > name=fio-rand-RW > > filename=fio-rand-RW > > rw=randrw > > rwmixread=60 > > rwmixwrite=40 > > bs=4K > > direct=0 > > Does this happen with direct=1? So far I haven't been able to reproduce it so far with direct=1. > > numjobs=128 > > time_based > > runtime=900 > > > > [file1] > > size=10G > > ioengine=libaio > > iodepth=16 > > > > Is it possible that the backend nvmet device is not fast enough? > Are your backend devices nvme? Can you share nvmetcli ls output? Currently I have a quick-and-dirty setup. The systems I have to test with don't have a spare nvme device right now but do have a lot of RAM, so the backend is a loop device backed by a file in tmpfs. But it should be plenty fast. I'm working on getting a setup with an nvme backend device. o- / ......................................................................................................................... [...] o- hosts ................................................................................................................... [...] | o- hostnqn ............................................................................................................... [...] o- ports ................................................................................................................... [...] | o- 2 ................................................... [trtype=tcp, traddr=..., trsvcid=4420, inline_data_size=16384] | o- ana_groups .......................................................................................................... [...] | | o- 1 ..................................................................................................... [state=optimized] | o- referrals ........................................................................................................... [...] | o- subsystems .......................................................................................................... [...] | o- testnqn ........................................................................................................... [...] o- subsystems .............................................................................................................. [...] o- testnqn ............................................................. [version=1.3, allow_any=1, serial=2c2e39e2a551f7febf33] o- allowed_hosts ....................................................................................................... [...] o- namespaces .......................................................................................................... [...] o- 1 [path=/dev/loop0, uuid=8a1561fb-82c3-4e9d-96b9-11c7b590d047, nguid=ef90689c-6c46-d44c-89c1-4067801309a8, grpid=1, enabled] > Also, just to understand if things are slow or stuck, perhaps you > can artificially increase the io_timeout to say 60s or 120s. I could still reproduce it with io_timeout set to 120s. Thanks, Seth ^ permalink raw reply [flat|nested] 10+ messages in thread
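For completeness, a backing setup like the one described above is roughly of the following shape. This is a reconstruction for illustration only; the file name and size are made up, not taken from the test system.

  truncate -s 20G /dev/shm/nvmet-backing.img       # file in tmpfs
  losetup /dev/loop0 /dev/shm/nvmet-backing.img    # loop device exported as the nvmet namespace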
* Re: nvme-tcp request timeouts 2022-10-12 16:55 ` Seth Forshee @ 2022-10-12 17:30 ` Sagi Grimberg 2022-10-13 4:57 ` Seth Forshee 0 siblings, 1 reply; 10+ messages in thread From: Sagi Grimberg @ 2022-10-12 17:30 UTC (permalink / raw) To: Seth Forshee Cc: Chaitanya Kulkarni, linux-nvme@lists.infradead.org, Christoph Hellwig On 10/12/22 19:55, Seth Forshee wrote: > On Wed, Oct 12, 2022 at 09:33:19AM +0300, Sagi Grimberg wrote: >> Hey Seth, thanks for reporting. >> >>>>>> Hi Seth, >>>>>> >>>>>> On 10/11/22 08:31, Seth Forshee wrote: >>>>>>> Hi, >>>>>>> >>>>>>> I'm seeing timeouts like the following from nvme-tcp: >>>>>>> >>>>>>> [ 6369.513269] nvme nvme5: queue 102: timeout request 0x73 type 4 >>>>>>> [ 6369.513283] nvme nvme5: starting error recovery >>>>>>> [ 6369.514379] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514385] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514392] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514393] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514401] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514414] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514420] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514427] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514430] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514432] block nvme5n1: no usable path - requeuing I/O >>>>>>> [ 6369.514926] nvme nvme5: Reconnecting in 10 seconds... >>>>>>> [ 6379.761015] nvme nvme5: creating 128 I/O queues. >>>>>>> [ 6379.944389] nvme nvme5: mapped 128/0/0 default/read/poll queues. >>>>>>> [ 6379.947922] nvme nvme5: Successfully reconnected (1 attempt) >>>>>>> >>>>>>> This is with 6.0, using nvmet-tcp on a different machine as the target. >>>>>>> I've seen this sporadically with several test cases. The fio fio-rand-RW >>>>>>> example test is a pretty good reproducer when numjobs in increased (I'm >>>>>>> setting it equal to the number of CPUs in the system). >>>>>>> >>>>>>> Let me know what I can do to help debug this. I'm currently adding some >>>>>>> tracing to the driver to see if I can get an idea of the sequence of >>>>>>> events that leads to this problem. >>>>>>> >>>>>>> Thanks, >>>>>>> Seth >>>>>>> >>>>>> >>>>>> Can you bisect it ? that will help to understand the commit causing >>>>>> issue. >>>>> >>>>> I don't know of any "good" version right now. I started with a 5.10 >>>>> kernel and saw this, and tested 6.0 and still see it. I found several >>>>> commits since 5.10 which fix some kind of timeouts: >>>>> >>>>> a0fdd1418007 nvme-tcp: rerun io_work if req_list is not empty >>>>> 70f437fb4395 nvme-tcp: fix io_work priority inversion >>>>> 3770a42bb8ce nvme-tcp: fix regression that causes sporadic requests to time out >>>>> >>>>> 5.10 still has timeouts with these backported, so whatever the problem >>>>> is it has existed at least that long. I suppose I could go back to older >>>>> kernels with these backported if that's going to be the best path >>>>> forward here. >>>>> >>>>> Thanks, >>>>> Seth >>>> >>>> Can you please share the fio config you are using ? >>> >>> Sure. Note that I can reproduce it with a lower number of numjobs, but >>> higher numbers make it easier, so I set it to the number of CPUs present >>> on the system I'm using to test. >>> >>> >>> [global] >>> name=fio-rand-RW >>> filename=fio-rand-RW >>> rw=randrw >>> rwmixread=60 >>> rwmixwrite=40 >>> bs=4K >>> direct=0 >> >> Does this happen with direct=1? 
>
> So far I haven't been able to reproduce it so far with direct=1.

It is probably because the I/Os are actually going through as 4k and
blk-mq tags depth is throttling you. On direct=0, writes are batched and
go down to the disk in larger blocks, and in bursts, which is probably
creating longer latencies...

>
>>> numjobs=128
>>> time_based
>>> runtime=900
>>>
>>> [file1]
>>> size=10G
>>> ioengine=libaio
>>> iodepth=16
>>>
>>
>> Is it possible that the backend nvmet device is not fast enough?
>> Are your backend devices nvme? Can you share nvmetcli ls output?
>
> Currently I have a quick-and-dirty setup. The systems I have to test
> with don't have a spare nvme device right now but do have a lot of RAM,
> so the backend is a loop device backed by a file in tmpfs. But it should
> be plenty fast. I'm working on getting a setup with an nvme backend
> device.

I see. Note that RAM drives are fast, but they stall the CPU for the
memory copy, which will prevent the nvme I/O threads from servicing new
requests; this can be magnified with large, heavy bursts...

>
> o- / ........................................................................ [...]
>   o- hosts .................................................................. [...]
>   | o- hostnqn .............................................................. [...]
>   o- ports .................................................................. [...]
>   | o- 2 ............ [trtype=tcp, traddr=..., trsvcid=4420, inline_data_size=16384]
>   |   o- ana_groups ......................................................... [...]
>   |   | o- 1 .................................................... [state=optimized]
>   |   o- referrals .......................................................... [...]
>   |   o- subsystems ......................................................... [...]
>   |     o- testnqn .......................................................... [...]
>   o- subsystems ............................................................. [...]
>     o- testnqn ............ [version=1.3, allow_any=1, serial=2c2e39e2a551f7febf33]
>       o- allowed_hosts ...................................................... [...]
>       o- namespaces ......................................................... [...]
>         o- 1 [path=/dev/loop0, uuid=8a1561fb-82c3-4e9d-96b9-11c7b590d047, nguid=ef90689c-6c46-d44c-89c1-4067801309a8, grpid=1, enabled]

Ohh, I'd say that would be the culprit... the loop driver uses only a
single queue to access the disk. This means that all your 100+ nvme-tcp
queues end up serializing access on the single loop disk queue. This
heavy back-pressure bubbles all the way back to the host and manifests
in IO timeouts when large bursts hit...

I can say that loop is not the best way to benchmark performance, and
I'd expect to see such phenomena when attempting to drive high loads to
a loop device...

Maybe you can use a tmpfs file directly instead (nvmet supports file
backends as well).

Or maybe you can try null_blk with the memory_backed=Y modparam (you may
need to define the cache_size modparam as well; I've never tried it with
memory backing...). That would be more efficient.

^ permalink raw reply [flat|nested] 10+ messages in thread
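A rough sketch of those two alternatives, assuming the standard nvmet configfs layout and the subsystem/namespace names from the listing above. Paths and sizes are illustrative, and the null_blk parameters are the ones mentioned above, not verified here.

  # 1) file-backed namespace: point device_path straight at the tmpfs file
  echo 0 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable
  echo -n /dev/shm/nvmet-backing.img > \
      /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_path
  echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable

  # 2) memory-backed null_blk as the namespace device
  modprobe null_blk memory_backed=1 gb=20
  echo 0 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable
  echo -n /dev/nullb0 > \
      /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_path
  echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable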
* Re: nvme-tcp request timeouts 2022-10-12 17:30 ` Sagi Grimberg @ 2022-10-13 4:57 ` Seth Forshee 0 siblings, 0 replies; 10+ messages in thread From: Seth Forshee @ 2022-10-13 4:57 UTC (permalink / raw) To: Sagi Grimberg Cc: Chaitanya Kulkarni, linux-nvme@lists.infradead.org, Christoph Hellwig On Wed, Oct 12, 2022 at 08:30:18PM +0300, Sagi Grimberg wrote: > > o- / ......................................................................................................................... [...] > > o- hosts ................................................................................................................... [...] > > | o- hostnqn ............................................................................................................... [...] > > o- ports ................................................................................................................... [...] > > | o- 2 ................................................... [trtype=tcp, traddr=..., trsvcid=4420, inline_data_size=16384] > > | o- ana_groups .......................................................................................................... [...] > > | | o- 1 ..................................................................................................... [state=optimized] > > | o- referrals ........................................................................................................... [...] > > | o- subsystems .......................................................................................................... [...] > > | o- testnqn ........................................................................................................... [...] > > o- subsystems .............................................................................................................. [...] > > o- testnqn ............................................................. [version=1.3, allow_any=1, serial=2c2e39e2a551f7febf33] > > o- allowed_hosts ....................................................................................................... [...] > > o- namespaces .......................................................................................................... [...] > > o- 1 [path=/dev/loop0, uuid=8a1561fb-82c3-4e9d-96b9-11c7b590d047, nguid=ef90689c-6c46-d44c-89c1-4067801309a8, grpid=1, enabled] > > Ohh, I'd say that would be the culprit... > the loop driver uses only a single queue to access the disk. This means that > all your 100+ nvme-tcp queues are all serializing access on the single loop > disk queue. This heavy back-pressure bubbles all the way > back to the host and manifests in IO timeouts when large bursts hit... > > I can say that loop is not the best way to benchmark performance, and > I'd expect to see such phenomenons when attempting to drive high loads > to a loop device... The goal wasn't to benchmark performance with this setup, just to start getting familiar. > Maybe you can possibly use a tmpfs file directly instead (nvmet supports > file backends as well). > > Or maybe you can try to use null_blk with memory_backed=Y modparam (may need > to define cache_size modparam as well, never tried it with memory > backing...)? That would be more efficient. I've got this set up now with an nvme drive as the backend for the target, and as you predicted the timeouts went away. So it does seem the problem was with using a loop device. Thanks for the help! Seth ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: nvme-tcp request timeouts 2022-10-11 20:37 ` Seth Forshee 2022-10-12 6:33 ` Sagi Grimberg @ 2022-10-12 7:51 ` Sagi Grimberg 1 sibling, 0 replies; 10+ messages in thread From: Sagi Grimberg @ 2022-10-12 7:51 UTC (permalink / raw) To: Seth Forshee, Chaitanya Kulkarni Cc: linux-nvme@lists.infradead.org, Christoph Hellwig >>>>> Can you bisect it ? that will help to understand the commit causing >>>>> issue. >>>> >>>> I don't know of any "good" version right now. I started with a 5.10 >>>> kernel and saw this, and tested 6.0 and still see it. I found several >>>> commits since 5.10 which fix some kind of timeouts: >>>> >>>> a0fdd1418007 nvme-tcp: rerun io_work if req_list is not empty >>>> 70f437fb4395 nvme-tcp: fix io_work priority inversion >>>> 3770a42bb8ce nvme-tcp: fix regression that causes sporadic requests >>>> to time out btw Seth, I've just looked at stable v5.10.147 and all of these fixes were taken to 5.10 stable branch. If you happen to use a 5.10 stable kernel (or a modified version of it), let us know if you spot any missing fixes so we'll make sure they end up there as well. ^ permalink raw reply [flat|nested] 10+ messages in thread
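Since stable backports get new commit IDs, the simplest way to confirm those fixes are present in a given 5.10.y tree is to search by subject; a sketch, run from a linux-stable checkout:

  git log --oneline v5.10..v5.10.147 \
      --grep='rerun io_work if req_list is not empty' \
      --grep='fix io_work priority inversion' \
      --grep='fix regression that causes sporadic requests to time out'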