All of lore.kernel.org
 help / color / mirror / Atom feed
From: Laurence Oberman <loberman@redhat.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Bart Van Assche <bart.vanassche@sandisk.com>,
	dm-devel@redhat.com, linux-scsi@vger.kernel.org
Subject: Re: dm-mq and end_clone_request()
Date: Wed, 3 Aug 2016 11:10:14 -0400 (EDT)	[thread overview]
Message-ID: <114076006.8072270.1470237014051.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <766674512.7979990.1470192959236.JavaMail.zimbra@redhat.com>



----- Original Message -----
> From: "Laurence Oberman" <loberman@redhat.com>
> To: "Mike Snitzer" <snitzer@redhat.com>
> Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com, linux-scsi@vger.kernel.org
> Sent: Tuesday, August 2, 2016 10:55:59 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> 
> 
> ----- Original Message -----
> > From: "Laurence Oberman" <loberman@redhat.com>
> > To: "Mike Snitzer" <snitzer@redhat.com>
> > Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com,
> > linux-scsi@vger.kernel.org
> > Sent: Tuesday, August 2, 2016 10:18:30 PM
> > Subject: Re: dm-mq and end_clone_request()
> > 
> > 
> > 
> > ----- Original Message -----
> > > From: "Mike Snitzer" <snitzer@redhat.com>
> > > To: "Laurence Oberman" <loberman@redhat.com>
> > > Cc: "Bart Van Assche" <bart.vanassche@sandisk.com>, dm-devel@redhat.com,
> > > linux-scsi@vger.kernel.org
> > > Sent: Tuesday, August 2, 2016 10:10:12 PM
> > > Subject: Re: dm-mq and end_clone_request()
> > > 
> > > On Tue, Aug 02 2016 at  9:33pm -0400,
> > > Laurence Oberman <loberman@redhat.com> wrote:
> > > 
> > > > Hi Bart
> > > > 
> > > > I simplified the test to 2 simple scripts and only running against one
> > > > XFS
> > > > file system.
> > > > Can you validate these and tell me if its enough to emulate what you
> > > > are
> > > > doing.
> > > > Perhaps our test-suite is too simple.
> > > > 
> > > > Start the test
> > > > 
> > > > # cat run_test.sh
> > > > #!/bin/bash
> > > > logger "Starting Bart's test"
> > > > #for i in `seq 1 10`
> > > > for i in 1
> > > > do
> > > > 	fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
> > > >         --iodepth=64 --group_reporting --sync=1 --direct=1
> > > >         --ioengine=libaio \
> > > >         --directory="/data-$i" --name=data-integrity-test --thread
> > > >         --numjobs=16 \
> > > >         --runtime=600 --output=fio-output.txt >/dev/null &
> > > > done
> > > > 
> > > > Delete the host, I wait 10s in between host deletions.
> > > > But I also tested with 3s and still its stable with Mike's patches.
> > > > 
> > > > #!/bin/bash
> > > > for i in /sys/class/srp_remote_ports/*
> > > > do
> > > >  echo "Deleting host $i, it will re-connect via srp_daemon"
> > > >  echo 1 > $i/delete
> > > >  sleep 10
> > > > done
> > > > 
> > > > Check for I/O errors affecting XFS and we now have none with the
> > > > patches
> > > > Mike provided.
> > > > After recovery I can create files in the xfs mount with no issues.
> > > > 
> > > > Can you use my scripts and 1 mount and see if it still fails for you.
> > > 
> > > In parallel we can try Bart's testsuite that he shared earlier in this
> > > thread: https://github.com/bvanassche/srp-test
> > > 
> > > README.md says:
> > > "Running these tests manually is tedious. Hence this test suite that
> > > tests the SRP initiator and target drivers by loading both drivers on
> > > the same server, by logging in using the IB loopback functionality and
> > > by sending I/O through the SRP initiator driver to a RAM disk exported
> > > by the SRP target driver."
> > > 
> > > This could explain why Bart is still seeing issues.  He isn't testing
> > > real hardware -- as such he is using ramdisk to expose races, etc.
> > > 
> > > Mike
> > > 
> > 
> > Hi Mike,
> > 
> > I looked at Bart's scripts, they looked fine but I wanted a more simplified
> > way to bring the error out.
> > Using ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNS.
> > That is the same way I do it when I am not connected to a large array as it
> > is the only way I can get EDR like speeds.
> > 
> > I don't thinks its racing due to the ramdisk back-end but  maybe we need to
> > ramp ours up to run more in parallel in a loop.
> > 
> > I will run 21 parallel runs and see if it makes a difference tonight and
> > report back tomorrow.
> > Clearly prior to your final patches we were escaping back to the FS layer
> > with errors but since your patches, at least in out test harness that is
> > resolved.
> > 
> > Thanks
> > Laurence
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> 
> Hello
> 
> I ran 20 parallel runs with 3 loops through host deletion and in each case
> fio survived with no hard error escaping to the FS layer.
> Its solid in our test bed,
> Keep in mind we have no ib_srpt loaded as we have a hardware based array and
> are connected directly to the array with EDR 100.
> I am also not removing and reloading modules like is happening in Barts's
> scripts and also not trying to delete mpath maps etc.
> 
> I focused only on the I/O error that was escaping up to the FS layer.
> I will check in with Bart tomorrow.
> 
> Thanks
> Laurence
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Hi Bart

Looking back at your email.

I also get these, but those are expected as we are in the process of doing I/O when we yank the hosts and that has in-flights affected.

Aug  2 22:41:23 jumpclient kernel: device-mapper: multipath: Failing path 8:192.
Aug  2 22:41:23 jumpclient kernel: blk_update_request: I/O error, dev sdm, sector 258504
Aug  2 22:41:23 jumpclient kernel: blk_update_request: I/O error, dev sdm, sector 60320

However I never get any of these any more (with the patches applied) that you show:
[  162.903284] Buffer I/O error on dev dm-0, logical block 32928, lost sync page write

I will work with you to understand why with Mike's patches, its now stable here but not in your configuration

Thanks
Laurence

  reply	other threads:[~2016-08-03 15:10 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-19 22:57 dm-mq and end_clone_request() Bart Van Assche
2016-07-20 14:08 ` Mike Snitzer
2016-07-20 14:27   ` Mike Snitzer
2016-07-20 17:37     ` Bart Van Assche
2016-07-20 18:33       ` Mike Snitzer
2016-07-21 20:58         ` [dm-devel] " Bart Van Assche
2016-07-25 17:53           ` Mike Snitzer
2016-07-25 21:23             ` Mike Snitzer
2016-07-25 22:00               ` Bart Van Assche
2016-07-26  1:16                 ` Mike Snitzer
2016-07-26 22:51                   ` Bart Van Assche
2016-07-27 14:08                     ` Mike Snitzer
2016-07-27 15:52                       ` [dm-devel] " Benjamin Marzinski
2016-07-27 19:06                         ` Bart Van Assche
2016-07-27 19:54                           ` Mike Snitzer
2016-07-27 20:09                   ` Mike Snitzer
2016-07-27 23:05                     ` Bart Van Assche
2016-07-28 13:33                       ` Mike Snitzer
2016-07-28 15:23                         ` Bart Van Assche
2016-07-28 15:40                           ` Mike Snitzer
2016-07-29  6:28                             ` [dm-devel] " Hannes Reinecke
2016-07-26  6:02             ` Hannes Reinecke
2016-07-26 16:06               ` Mike Snitzer
     [not found]         ` <317679447.7168375.1469729769593.JavaMail.zimbra@redhat.com>
     [not found]           ` <6880321d-e14f-169b-d100-6e460dd9bd09@sandisk.com>
     [not found]             ` <1110327939.7305916.1469819453678.JavaMail.zimbra@redhat.com>
     [not found]               ` <a5c1a149-b1a2-b5a4-2207-bdaf32db3cbd@sandisk.com>
     [not found]                 ` <757522831.7667712.1470059860543.JavaMail.zimbra@redhat.com>
     [not found]                   ` <536022978.7668211.1470060125271.JavaMail.zimbra@redhat.com>
     [not found]                     ` <931235537.7668834.1470060339483.JavaMail.zimbra@redhat.com>
     [not found]                       ` <1264951811.7684268.1470065187014.JavaMail.zimbra@redhat.com>
     [not found]                         ` <17da3ab0-233a-2cec-f921-bfd42c953ccc@sandisk.com>
2016-08-01 17:59                           ` Mike Snitzer
2016-08-01 18:55                             ` Bart Van Assche
2016-08-01 19:15                               ` Mike Snitzer
2016-08-01 20:46                               ` Mike Snitzer
2016-08-01 22:41                                 ` Bart Van Assche
2016-08-01 22:41                                   ` Bart Van Assche
2016-08-02 17:45                                   ` Mike Snitzer
2016-08-03  0:19                                     ` Bart Van Assche
2016-08-03  0:40                                       ` Mike Snitzer
2016-08-03  1:33                                         ` Laurence Oberman
2016-08-03  2:10                                           ` Mike Snitzer
2016-08-03  2:18                                             ` Laurence Oberman
2016-08-03  2:55                                               ` Laurence Oberman
2016-08-03 15:10                                                 ` Laurence Oberman [this message]
2016-08-03 16:06                                           ` Bart Van Assche
2016-08-03 17:25                                             ` Laurence Oberman
2016-08-03 18:03                                             ` [dm-devel] " Laurence Oberman
2016-08-03 16:55                                         ` Bart Van Assche
2016-08-04  9:53                                           ` Hannes Reinecke
2016-08-04 10:09                                             ` Hannes Reinecke
2016-08-04 15:10                                               ` Mike Snitzer
2016-08-04 16:10                                           ` Mike Snitzer
2016-08-04 17:42                                             ` Bart Van Assche
2016-08-04 23:58                                               ` Mike Snitzer
2016-08-05  1:07                                                 ` Laurence Oberman
2016-08-05 11:43                                                   ` Laurence Oberman
2016-08-05 15:39                                                     ` Laurence Oberman
2016-08-05 15:43                                                       ` Bart Van Assche
2016-08-05 18:42                                                     ` [dm-devel] " Bart Van Assche
2016-08-06 14:47                                                       ` Laurence Oberman
2016-08-07 22:31                                                         ` [dm-devel] " Bart Van Assche
2016-08-08 12:45                                                           ` Laurence Oberman
2016-08-08 13:44                                                             ` Johannes Thumshirn
2016-08-08 13:44                                                               ` Johannes Thumshirn
2016-08-08 14:32                                                               ` Laurence Oberman
2016-08-08 14:54                                                               ` Bart Van Assche
2016-08-08 15:11                                                         ` Bart Van Assche
2016-08-08 15:26                                                           ` Laurence Oberman
2016-08-08 15:28                                                             ` Bart Van Assche
2016-08-08 22:39                                                             ` Bart Van Assche
2016-08-08 22:52                                                               ` Laurence Oberman
2016-08-09  0:09                                                                 ` Laurence Oberman
2016-08-09 15:51                                                                   ` Bart Van Assche
2016-08-09 17:12                                                                     ` [dm-devel] " Laurence Oberman
2016-08-09 17:16                                                                       ` Bart Van Assche
2016-08-09 17:21                                                                         ` Laurence Oberman
2016-08-10 21:38                                                                           ` Laurence Oberman
2016-08-11 16:51                                                                             ` Laurence Oberman
2016-08-05 18:40                                                 ` Bart Van Assche
2016-07-21 20:32       ` Mike Snitzer
2016-07-21 20:40         ` [dm-devel] " Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=114076006.8072270.1470237014051.JavaMail.zimbra@redhat.com \
    --to=loberman@redhat.com \
    --cc=bart.vanassche@sandisk.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.