From mboxrd@z Thu Jan  1 00:00:00 1970
Content-Type: multipart/mixed; boundary="===============1825905427516214305=="
MIME-Version: 1.0
From: Walker, Benjamin <benjamin.walker at intel.com>
Subject: Re: [SPDK] Bdev claim and single writer semantics
Date: Wed, 09 May 2018 21:57:14 +0000
Message-ID: <1525903032.31529.1.camel@intel.com>
In-Reply-To: CANvN+ekbG1M9_dF+w6KRtPfe4SzVKcDgFXyGDdLR8PtVpupN1A@mail.gmail.com
List-ID: <spdk@lists.01.org>
To: spdk@lists.01.org

--===============1825905427516214305==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

On Wed, 2018-05-09 at 17:58 +0000, Andrey Kuzmin wrote:
> =

> =

> On Wed, May 9, 2018, 20:32 Walker, Benjamin <benjamin.walker(a)intel.com>=
 wrote:
> > On Wed, 2018-05-09 at 08:12 +0000, Andrey Kuzmin wrote:
> > > Ben,
> > > =

> > > the patch hs been submitted a week back, and then updated to account =
for
> > > review coments, but I still don't see it merged. Is there anything I =
can
> > do to
> > > make this happen?
> > =

> > Hi Andrey,
> > =

> > The patch is failing one of the tests. The test output is linked in the
> > comments
> > of the review.
> =

> If you're referring to bdev unit tests, I have fixed that already in the
> updated patch.
> =

> > I re-ran it a few times and it consistently fails. I don't think
> > the problem is going to end up being your change, but rather your change
> > invalidates some assumptions or expected behavior in one of the tests. =
That
> > needs to get debugged and resolved before we can merge.
> =

> If you detail on what test is failing (I'm not sure as all unitests pass
> locally), I could give it a shot as well.

The patch is passing the unit tests. It's a functional test that's failing.=
 Here
 is the dashboard for the test results for your patch:

https://ci.spdk.io/spdk/builds/review/14e40f86993e67f4107231fc25e239491d89e=
f08.1
525882846/

If you click fedora-08, which is the system that failed, and then click
'build.log' you'll see that a test failed because an assert was hit in
blobstore. It happens to hit when it is opening a bdev to check if it has a
blobstore present on it - precisely the area of code that you changed.

The failing test kicks off in test/lvol/lvol.sh, but for whatever reason th=
at
set of tests is calling a fairly sizable python script to execute the tests
(test/lvol/lvol_test.py). Our general policy is that functional tests shoul=
d be
written as bash scripts calling small tools and the applications under test=
, but
this set of tests seems to have slipped through the cracks. That's going to=
 make
this a bit more annoying to debug, unfortunately.

The failing test in test/lvol/lvol_test.py is #700. Again, numbering tests =
is
not our general policy and pattern, but this specific python script numbers
them. It's hitting an assert when it tries to load up a blobstore from an N=
VMe
device - the metadata on the disk is invalid.

I suspected that there is some other problem that your changes are simply
uncovering, so I put in a few print statements that trigger any time a
spdk_bdev_open is called but fails due to the write permission check in the
current code. Sure enough, this hits in this lvol test all over. It turns o=
ut
that there is a code path that is relying on spdk_bdev_open to get blocked =
by a
previous claim in that test. We need to deal with that in a separate patch,=
 then
your patch will pass the tests. The problem is the same one identified here:

https://github.com/spdk/spdk/issues/293


> =

> Thanks,
> A.
> > Thanks,
> > Ben
> > _______________________________________________
> > SPDK mailing list
> > SPDK(a)lists.01.org
> > https://lists.01.org/mailman/listinfo/spdk
> =

> -- =

> Regards,
> Andrey
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk

--===============1825905427516214305==--