public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
* [GIT PULL] Block fixes for 4.17-rc2
@ 2018-04-25 16:50 Jens Axboe
  2018-04-25 17:03 ` Paolo Valente
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2018-04-25 16:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-block@vger.kernel.org

Hi Linus,

I ended up sitting on this about a week longer than I wanted to,
since we were hashing out details with a timeout change. I've now
killed that patch, so we can flush the existing queue in due time.

This pull request contains:

- Fix for an old regression, where entering the queue can be disturbed
  by a signal to the process. This can cause spurious EIO. Fix from Alan
  Jenkins.

- cdrom information leak fix from Dan.

- Trivial helper for testing queue FUA from Dave Chinner, part of his
  O_DIRECT FUA series.

- Series of swim fixes from Finn that actually makes it work again.

- Loop O_DIRECT corruption fix, which caused data corruption in
  production for us. From me.

- BFQ crash fix from me.

- bcache maintainer update. Michael no longer has the time to do it,
  Coly has stepped up to serve as the new maintainer.

- blkcg locking fixes from Jiang Biao.

- Revert of a change from this merge window from Ming, that causes an
  issue on some hardware.

- Minor clarification doc addition from Linus Walleij.

Please pull!


  git://git.kernel.dk/linux-block.git tags/for-linus-20180425


----------------------------------------------------------------
Alan Jenkins (1):
      block: do not use interruptible wait anywhere

Dan Carpenter (1):
      cdrom: information leak in cdrom_ioctl_media_changed()

Dave Chinner (1):
      block: add blk_queue_fua() helper function

Finn Thain (8):
      m68k/mac: Don't remap SWIM MMIO region
      block/swim: Fix array bounds check
      block/swim: Remove extra put_disk() call from error path
      block/swim: Don't log an error message for an invalid ioctl
      block/swim: Rename macros to avoid inconsistent inverted logic
      block/swim: Check drive type
      block/swim: Fix IO error at end of medium
      block/swim: Select appropriate drive on device open

Jens Axboe (4):
      loop: remove cmd->rq member
      loop: handle short DIO reads
      bfq-iosched: ensure to clear bic/bfqq pointers when preparing request
      bcache: mark Coly Li as bcache maintainer

Jianchao Wang (1):
      blk-mq: start request gstate with gen 1

Jiang Biao (3):
      blkcg: don't hold blkcg lock when deactivating policy
      blkcg: small fix on comment in blkcg_init_queue
      blkcg: init root blkcg_gq under lock

Linus Walleij (1):
      block: mq: Add some minor doc for core structs

Michael Lyle (1):
      MAINTAINERS: Remove me as maintainer of bcache

Ming Lei (1):
      Revert "blk-mq: remove code for dealing with remapping queue"

 MAINTAINERS            |  2 +-
 block/bfq-iosched.c    | 10 +++++++-
 block/blk-cgroup.c     | 28 ++++++++++------------
 block/blk-core.c       | 15 ++++++------
 block/blk-mq.c         | 41 +++++++++++++++++++++++++++++---
 block/blk-mq.h         |  3 +++
 drivers/block/loop.c   | 64 +++++++++++++++++++++++++++++++++-----------------
 drivers/block/loop.h   |  1 -
 drivers/block/swim.c   | 49 +++++++++++++++++---------------------
 drivers/block/swim3.c  |  6 ++---
 drivers/cdrom/cdrom.c  |  2 +-
 include/linux/blk-mq.h |  3 +++
 include/linux/blkdev.h |  1 +
 13 files changed, 144 insertions(+), 81 deletions(-)

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 16:50 [GIT PULL] Block fixes for 4.17-rc2 Jens Axboe
@ 2018-04-25 17:03 ` Paolo Valente
  2018-04-25 17:06   ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Valente @ 2018-04-25 17:03 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linus Torvalds, linux-block@vger.kernel.org



> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe <axboe@kernel.dk> ha =
scritto:
>=20
> Hi Linus,
>=20
> I ended up sitting on this about a week longer than I wanted to,
> since we were hashing out details with a timeout change. I've now
> killed that patch, so we can flush the existing queue in due time.
>=20
> This pull request contains:
>=20
> - Fix for an old regression, where entering the queue can be disturbed
>  by a signal to the process. This can cause spurious EIO. Fix from =
Alan
>  Jenkins.
>=20
> - cdrom information leak fix from Dan.
>=20
> - Trivial helper for testing queue FUA from Dave Chinner, part of his
>  O_DIRECT FUA series.
>=20
> - Series of swim fixes from Finn that actually makes it work again.
>=20
> - Loop O_DIRECT corruption fix, which caused data corruption in
>  production for us. =46rom me.
>=20
> - BFQ crash fix from me.
>=20

For what it's worth, I disagree with this patch.  This change is based
on an apparently buggy motivation, as I wrote in my reply in the
thread where Jens proposed it.  As such, I think it might even bury
more deeply the actual bug that causes the crash (although of course
this patch does eliminate the crash for the use case reported in that
thread).

Thanks,
Paolo

> - bcache maintainer update. Michael no longer has the time to do it,
>  Coly has stepped up to serve as the new maintainer.
>=20
> - blkcg locking fixes from Jiang Biao.
>=20
> - Revert of a change from this merge window from Ming, that causes an
>  issue on some hardware.
>=20
> - Minor clarification doc addition from Linus Walleij.
>=20
> Please pull!
>=20
>=20
>  git://git.kernel.dk/linux-block.git tags/for-linus-20180425
>=20
>=20
> ----------------------------------------------------------------
> Alan Jenkins (1):
>      block: do not use interruptible wait anywhere
>=20
> Dan Carpenter (1):
>      cdrom: information leak in cdrom_ioctl_media_changed()
>=20
> Dave Chinner (1):
>      block: add blk_queue_fua() helper function
>=20
> Finn Thain (8):
>      m68k/mac: Don't remap SWIM MMIO region
>      block/swim: Fix array bounds check
>      block/swim: Remove extra put_disk() call from error path
>      block/swim: Don't log an error message for an invalid ioctl
>      block/swim: Rename macros to avoid inconsistent inverted logic
>      block/swim: Check drive type
>      block/swim: Fix IO error at end of medium
>      block/swim: Select appropriate drive on device open
>=20
> Jens Axboe (4):
>      loop: remove cmd->rq member
>      loop: handle short DIO reads
>      bfq-iosched: ensure to clear bic/bfqq pointers when preparing =
request
>      bcache: mark Coly Li as bcache maintainer
>=20
> Jianchao Wang (1):
>      blk-mq: start request gstate with gen 1
>=20
> Jiang Biao (3):
>      blkcg: don't hold blkcg lock when deactivating policy
>      blkcg: small fix on comment in blkcg_init_queue
>      blkcg: init root blkcg_gq under lock
>=20
> Linus Walleij (1):
>      block: mq: Add some minor doc for core structs
>=20
> Michael Lyle (1):
>      MAINTAINERS: Remove me as maintainer of bcache
>=20
> Ming Lei (1):
>      Revert "blk-mq: remove code for dealing with remapping queue"
>=20
> MAINTAINERS            |  2 +-
> block/bfq-iosched.c    | 10 +++++++-
> block/blk-cgroup.c     | 28 ++++++++++------------
> block/blk-core.c       | 15 ++++++------
> block/blk-mq.c         | 41 +++++++++++++++++++++++++++++---
> block/blk-mq.h         |  3 +++
> drivers/block/loop.c   | 64 =
+++++++++++++++++++++++++++++++++-----------------
> drivers/block/loop.h   |  1 -
> drivers/block/swim.c   | 49 +++++++++++++++++---------------------
> drivers/block/swim3.c  |  6 ++---
> drivers/cdrom/cdrom.c  |  2 +-
> include/linux/blk-mq.h |  3 +++
> include/linux/blkdev.h |  1 +
> 13 files changed, 144 insertions(+), 81 deletions(-)
>=20
> --=20
> Jens Axboe
>=20

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 17:03 ` Paolo Valente
@ 2018-04-25 17:06   ` Jens Axboe
  2018-04-25 17:25     ` Paolo Valente
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2018-04-25 17:06 UTC (permalink / raw)
  To: Paolo Valente; +Cc: Linus Torvalds, linux-block@vger.kernel.org

On 4/25/18 11:03 AM, Paolo Valente wrote:
> 
> 
>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe <axboe@kernel.dk> ha scritto:
>>
>> Hi Linus,
>>
>> I ended up sitting on this about a week longer than I wanted to,
>> since we were hashing out details with a timeout change. I've now
>> killed that patch, so we can flush the existing queue in due time.
>>
>> This pull request contains:
>>
>> - Fix for an old regression, where entering the queue can be disturbed
>>  by a signal to the process. This can cause spurious EIO. Fix from Alan
>>  Jenkins.
>>
>> - cdrom information leak fix from Dan.
>>
>> - Trivial helper for testing queue FUA from Dave Chinner, part of his
>>  O_DIRECT FUA series.
>>
>> - Series of swim fixes from Finn that actually makes it work again.
>>
>> - Loop O_DIRECT corruption fix, which caused data corruption in
>>  production for us. From me.
>>
>> - BFQ crash fix from me.
>>
> 
> For what it's worth, I disagree with this patch.  This change is based
> on an apparently buggy motivation, as I wrote in my reply in the
> thread where Jens proposed it.  As such, I think it might even bury
> more deeply the actual bug that causes the crash (although of course
> this patch does eliminate the crash for the use case reported in that
> thread).

The patch fixes the issue and I've explained why. If you have a
motivation to fix it differently, for whatever reason, then by all
means submit a patch. So far I haven't seen it, and we still have
the known crash that people are actually hitting.

I'll be happy to work with you on that later in the week, when
the LSFMM conference has wound down.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 17:06   ` Jens Axboe
@ 2018-04-25 17:25     ` Paolo Valente
  2018-04-25 17:34       ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Valente @ 2018-04-25 17:25 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown



> Il giorno 25 apr 2018, alle ore 19:06, Jens Axboe <axboe@kernel.dk> ha =
scritto:
>=20
> On 4/25/18 11:03 AM, Paolo Valente wrote:
>>=20
>>=20
>>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe <axboe@kernel.dk> =
ha scritto:
>>>=20
>>> Hi Linus,
>>>=20
>>> I ended up sitting on this about a week longer than I wanted to,
>>> since we were hashing out details with a timeout change. I've now
>>> killed that patch, so we can flush the existing queue in due time.
>>>=20
>>> This pull request contains:
>>>=20
>>> - Fix for an old regression, where entering the queue can be =
disturbed
>>> by a signal to the process. This can cause spurious EIO. Fix from =
Alan
>>> Jenkins.
>>>=20
>>> - cdrom information leak fix from Dan.
>>>=20
>>> - Trivial helper for testing queue FUA from Dave Chinner, part of =
his
>>> O_DIRECT FUA series.
>>>=20
>>> - Series of swim fixes from Finn that actually makes it work again.
>>>=20
>>> - Loop O_DIRECT corruption fix, which caused data corruption in
>>> production for us. =46rom me.
>>>=20
>>> - BFQ crash fix from me.
>>>=20
>>=20
>> For what it's worth, I disagree with this patch.  This change is =
based
>> on an apparently buggy motivation, as I wrote in my reply in the
>> thread where Jens proposed it.  As such, I think it might even bury
>> more deeply the actual bug that causes the crash (although of course
>> this patch does eliminate the crash for the use case reported in that
>> thread).
>=20
> The patch fixes the issue and I've explained why.

Definitely, but I wrote you that your explanation seems wrong.

> If you have a
> motivation to fix it differently, for whatever reason, then by all
> means submit a patch. So far I haven't seen it, and we still have
> the known crash that people are actually hitting.
>=20

Unfortunately I don't have a solution, because I don't know what the
bug is.  I only know that there's a bug in your explanation for the
bug you want to fix.

> I'll be happy to work with you on that later in the week, when
> the LSFMM conference has wound down.
>=20

I do thank you for that.  If you can, please start by answering my
concerns on your explanation.  Maybe your explanation can be fixed (or
I'm simply wrong), and all will be fine.  Or, if your explanation is
really buggy and we don't find the actual bug in the code, we can
just enrich your proposed change with a comment, and state that this
is a crutch to walk on, while waiting for the actual bug to show up.

Thanks,
Paolo

> --=20
> Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 17:25     ` Paolo Valente
@ 2018-04-25 17:34       ` Jens Axboe
  2018-04-25 18:02         ` Paolo Valente
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2018-04-25 17:34 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown

On 4/25/18 11:25 AM, Paolo Valente wrote:
> 
> 
>> Il giorno 25 apr 2018, alle ore 19:06, Jens Axboe <axboe@kernel.dk> ha scritto:
>>
>> On 4/25/18 11:03 AM, Paolo Valente wrote:
>>>
>>>
>>>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe <axboe@kernel.dk> ha scritto:
>>>>
>>>> Hi Linus,
>>>>
>>>> I ended up sitting on this about a week longer than I wanted to,
>>>> since we were hashing out details with a timeout change. I've now
>>>> killed that patch, so we can flush the existing queue in due time.
>>>>
>>>> This pull request contains:
>>>>
>>>> - Fix for an old regression, where entering the queue can be disturbed
>>>> by a signal to the process. This can cause spurious EIO. Fix from Alan
>>>> Jenkins.
>>>>
>>>> - cdrom information leak fix from Dan.
>>>>
>>>> - Trivial helper for testing queue FUA from Dave Chinner, part of his
>>>> O_DIRECT FUA series.
>>>>
>>>> - Series of swim fixes from Finn that actually makes it work again.
>>>>
>>>> - Loop O_DIRECT corruption fix, which caused data corruption in
>>>> production for us. From me.
>>>>
>>>> - BFQ crash fix from me.
>>>>
>>>
>>> For what it's worth, I disagree with this patch.  This change is based
>>> on an apparently buggy motivation, as I wrote in my reply in the
>>> thread where Jens proposed it.  As such, I think it might even bury
>>> more deeply the actual bug that causes the crash (although of course
>>> this patch does eliminate the crash for the use case reported in that
>>> thread).
>>
>> The patch fixes the issue and I've explained why.
> 
> Definitely, but I wrote you that your explanation seems wrong.
> 
>> If you have a
>> motivation to fix it differently, for whatever reason, then by all
>> means submit a patch. So far I haven't seen it, and we still have
>> the known crash that people are actually hitting.
>>
> 
> Unfortunately I don't have a solution, because I don't know what the
> bug is.  I only know that there's a bug in your explanation for the
> bug you want to fix.
> 
>> I'll be happy to work with you on that later in the week, when
>> the LSFMM conference has wound down.
>>
> 
> I do thank you for that.  If you can, please start by answering my
> concerns on your explanation.  Maybe your explanation can be fixed (or
> I'm simply wrong), and all will be fine.  Or, if your explanation is
> really buggy and we don't find the actual bug in the code, we can
> just enrich your proposed change with a comment, and state that this
> is a crutch to walk on, while waiting for the actual bug to show up.

The reproduction case was there, so should not be hard to run with
that and add some instrumentation to verify or disprove theories.
For the flush case, you start out with a regular request, and assign
the pointers. Then when we transform that into a flush, it's
bypass inserted, and later retrieved in __bfq_dispatch_request(). As
I said earlier, you could either check that in there (add a ->elv.icq
non-NULL check before doing the bfqq check), or you can have it
cleaner and just always clear the pointers if ->elv.icq isn't assigned.

The patch clears the pointers if ->elv.icq isn't set, since the whole
set has to be valid. If ->elv.icq isn't set, then we can't trust
potentially earlier content of ->priv[0/1]. It's better to clear them
explicitly, rather than add extra checks for ->elv.icq where we would
otherwise check for RQ_BIC() or RQ_BFQQ().


-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 17:34       ` Jens Axboe
@ 2018-04-25 18:02         ` Paolo Valente
  2018-04-25 18:18           ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Valente @ 2018-04-25 18:02 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown



> Il giorno 25 apr 2018, alle ore 19:34, Jens Axboe <axboe@kernel.dk> ha =
scritto:
>=20
> On 4/25/18 11:25 AM, Paolo Valente wrote:
>>=20
>>=20
>>> Il giorno 25 apr 2018, alle ore 19:06, Jens Axboe <axboe@kernel.dk> =
ha scritto:
>>>=20
>>> On 4/25/18 11:03 AM, Paolo Valente wrote:
>>>>=20
>>>>=20
>>>>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe =
<axboe@kernel.dk> ha scritto:
>>>>>=20
>>>>> Hi Linus,
>>>>>=20
>>>>> I ended up sitting on this about a week longer than I wanted to,
>>>>> since we were hashing out details with a timeout change. I've now
>>>>> killed that patch, so we can flush the existing queue in due time.
>>>>>=20
>>>>> This pull request contains:
>>>>>=20
>>>>> - Fix for an old regression, where entering the queue can be =
disturbed
>>>>> by a signal to the process. This can cause spurious EIO. Fix from =
Alan
>>>>> Jenkins.
>>>>>=20
>>>>> - cdrom information leak fix from Dan.
>>>>>=20
>>>>> - Trivial helper for testing queue FUA from Dave Chinner, part of =
his
>>>>> O_DIRECT FUA series.
>>>>>=20
>>>>> - Series of swim fixes from Finn that actually makes it work =
again.
>>>>>=20
>>>>> - Loop O_DIRECT corruption fix, which caused data corruption in
>>>>> production for us. =46rom me.
>>>>>=20
>>>>> - BFQ crash fix from me.
>>>>>=20
>>>>=20
>>>> For what it's worth, I disagree with this patch.  This change is =
based
>>>> on an apparently buggy motivation, as I wrote in my reply in the
>>>> thread where Jens proposed it.  As such, I think it might even bury
>>>> more deeply the actual bug that causes the crash (although of =
course
>>>> this patch does eliminate the crash for the use case reported in =
that
>>>> thread).
>>>=20
>>> The patch fixes the issue and I've explained why.
>>=20
>> Definitely, but I wrote you that your explanation seems wrong.
>>=20
>>> If you have a
>>> motivation to fix it differently, for whatever reason, then by all
>>> means submit a patch. So far I haven't seen it, and we still have
>>> the known crash that people are actually hitting.
>>>=20
>>=20
>> Unfortunately I don't have a solution, because I don't know what the
>> bug is.  I only know that there's a bug in your explanation for the
>> bug you want to fix.
>>=20
>>> I'll be happy to work with you on that later in the week, when
>>> the LSFMM conference has wound down.
>>>=20
>>=20
>> I do thank you for that.  If you can, please start by answering my
>> concerns on your explanation.  Maybe your explanation can be fixed =
(or
>> I'm simply wrong), and all will be fine.  Or, if your explanation is
>> really buggy and we don't find the actual bug in the code, we can
>> just enrich your proposed change with a comment, and state that this
>> is a crutch to walk on, while waiting for the actual bug to show up.
>=20
> The reproduction case was there, so should not be hard to run with
> that and add some instrumentation to verify or disprove theories.

Simple code check seems enough for that, see below.

> For the flush case, you start out with a regular request, and assign
> the pointers. Then when we transform that into a flush, it's
> bypass inserted, and later retrieved in __bfq_dispatch_request().

Before the transformation, the request undergoes a finish or a
requeue, right?  If so, bfq clears the pointers there.

> As
> I said earlier, you could either check that in there (add a ->elv.icq
> non-NULL check before doing the bfqq check),

They must be already cleared, because of the above.

> or you can have it
> cleaner and just always clear the pointers if ->elv.icq isn't =
assigned.
>=20

It is the cleanest solution, if it is ok that those pointers are
dirtied (for reasons I don't know) before or during the
transformation.

Otherwise, either I'm wrong (no surprise), or the forced clearing you
propose hides an unexplained dirtying of two pointers.

If I'm not wrong, then, if you deem it reasonable, we could just
change your patch as follows: add a comment on the above before your
change, and change the commit message in a consistent way.  I can do
it.  By doing so, we would have your patch now, to at least flush away
this nasty failure for one case.  In the meantime we could hopefully
try to understand what is going wrong in the handling of requests.

Thanks,
Paolo

> The patch clears the pointers if ->elv.icq isn't set, since the whole
> set has to be valid. If ->elv.icq isn't set, then we can't trust
> potentially earlier content of ->priv[0/1]. It's better to clear them
> explicitly, rather than add extra checks for ->elv.icq where we would
> otherwise check for RQ_BIC() or RQ_BFQQ().
>=20
>=20
> --=20
> Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 18:02         ` Paolo Valente
@ 2018-04-25 18:18           ` Jens Axboe
  2018-04-25 18:42             ` Paolo Valente
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2018-04-25 18:18 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown

On 4/25/18 12:02 PM, Paolo Valente wrote:
> 
> 
>> Il giorno 25 apr 2018, alle ore 19:34, Jens Axboe <axboe@kernel.dk> ha scritto:
>>
>> On 4/25/18 11:25 AM, Paolo Valente wrote:
>>>
>>>
>>>> Il giorno 25 apr 2018, alle ore 19:06, Jens Axboe <axboe@kernel.dk> ha scritto:
>>>>
>>>> On 4/25/18 11:03 AM, Paolo Valente wrote:
>>>>>
>>>>>
>>>>>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe <axboe@kernel.dk> ha scritto:
>>>>>>
>>>>>> Hi Linus,
>>>>>>
>>>>>> I ended up sitting on this about a week longer than I wanted to,
>>>>>> since we were hashing out details with a timeout change. I've now
>>>>>> killed that patch, so we can flush the existing queue in due time.
>>>>>>
>>>>>> This pull request contains:
>>>>>>
>>>>>> - Fix for an old regression, where entering the queue can be disturbed
>>>>>> by a signal to the process. This can cause spurious EIO. Fix from Alan
>>>>>> Jenkins.
>>>>>>
>>>>>> - cdrom information leak fix from Dan.
>>>>>>
>>>>>> - Trivial helper for testing queue FUA from Dave Chinner, part of his
>>>>>> O_DIRECT FUA series.
>>>>>>
>>>>>> - Series of swim fixes from Finn that actually makes it work again.
>>>>>>
>>>>>> - Loop O_DIRECT corruption fix, which caused data corruption in
>>>>>> production for us. From me.
>>>>>>
>>>>>> - BFQ crash fix from me.
>>>>>>
>>>>>
>>>>> For what it's worth, I disagree with this patch.  This change is based
>>>>> on an apparently buggy motivation, as I wrote in my reply in the
>>>>> thread where Jens proposed it.  As such, I think it might even bury
>>>>> more deeply the actual bug that causes the crash (although of course
>>>>> this patch does eliminate the crash for the use case reported in that
>>>>> thread).
>>>>
>>>> The patch fixes the issue and I've explained why.
>>>
>>> Definitely, but I wrote you that your explanation seems wrong.
>>>
>>>> If you have a
>>>> motivation to fix it differently, for whatever reason, then by all
>>>> means submit a patch. So far I haven't seen it, and we still have
>>>> the known crash that people are actually hitting.
>>>>
>>>
>>> Unfortunately I don't have a solution, because I don't know what the
>>> bug is.  I only know that there's a bug in your explanation for the
>>> bug you want to fix.
>>>
>>>> I'll be happy to work with you on that later in the week, when
>>>> the LSFMM conference has wound down.
>>>>
>>>
>>> I do thank you for that.  If you can, please start by answering my
>>> concerns on your explanation.  Maybe your explanation can be fixed (or
>>> I'm simply wrong), and all will be fine.  Or, if your explanation is
>>> really buggy and we don't find the actual bug in the code, we can
>>> just enrich your proposed change with a comment, and state that this
>>> is a crutch to walk on, while waiting for the actual bug to show up.
>>
>> The reproduction case was there, so should not be hard to run with
>> that and add some instrumentation to verify or disprove theories.
> 
> Simple code check seems enough for that, see below.
> 
>> For the flush case, you start out with a regular request, and assign
>> the pointers. Then when we transform that into a flush, it's
>> bypass inserted, and later retrieved in __bfq_dispatch_request().
> 
> Before the transformation, the request undergoes a finish or a
> requeue, right?  If so, bfq clears the pointers there.

If the request has an end_io hook, we go through there and don't
finish the request. The request is reused.

>> As
>> I said earlier, you could either check that in there (add a ->elv.icq
>> non-NULL check before doing the bfqq check),
> 
> They must be already cleared, because of the above.

Wrong

>> or you can have it
>> cleaner and just always clear the pointers if ->elv.icq isn't assigned.
>>
> 
> It is the cleanest solution, if it is ok that those pointers are
> dirtied (for reasons I don't know) before or during the
> transformation.

Of course it is, they are not valid if ->elv.icq isn't valid. You
must clear them, or they can be stale.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 18:18           ` Jens Axboe
@ 2018-04-25 18:42             ` Paolo Valente
  2018-04-27  7:57               ` Paolo Valente
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Valente @ 2018-04-25 18:42 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown



> Il giorno 25 apr 2018, alle ore 20:18, Jens Axboe <axboe@kernel.dk> ha =
scritto:
>=20
> On 4/25/18 12:02 PM, Paolo Valente wrote:
>>=20
>>=20
>>> Il giorno 25 apr 2018, alle ore 19:34, Jens Axboe <axboe@kernel.dk> =
ha scritto:
>>>=20
>>> On 4/25/18 11:25 AM, Paolo Valente wrote:
>>>>=20
>>>>=20
>>>>> Il giorno 25 apr 2018, alle ore 19:06, Jens Axboe =
<axboe@kernel.dk> ha scritto:
>>>>>=20
>>>>> On 4/25/18 11:03 AM, Paolo Valente wrote:
>>>>>>=20
>>>>>>=20
>>>>>>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe =
<axboe@kernel.dk> ha scritto:
>>>>>>>=20
>>>>>>> Hi Linus,
>>>>>>>=20
>>>>>>> I ended up sitting on this about a week longer than I wanted to,
>>>>>>> since we were hashing out details with a timeout change. I've =
now
>>>>>>> killed that patch, so we can flush the existing queue in due =
time.
>>>>>>>=20
>>>>>>> This pull request contains:
>>>>>>>=20
>>>>>>> - Fix for an old regression, where entering the queue can be =
disturbed
>>>>>>> by a signal to the process. This can cause spurious EIO. Fix =
from Alan
>>>>>>> Jenkins.
>>>>>>>=20
>>>>>>> - cdrom information leak fix from Dan.
>>>>>>>=20
>>>>>>> - Trivial helper for testing queue FUA from Dave Chinner, part =
of his
>>>>>>> O_DIRECT FUA series.
>>>>>>>=20
>>>>>>> - Series of swim fixes from Finn that actually makes it work =
again.
>>>>>>>=20
>>>>>>> - Loop O_DIRECT corruption fix, which caused data corruption in
>>>>>>> production for us. =46rom me.
>>>>>>>=20
>>>>>>> - BFQ crash fix from me.
>>>>>>>=20
>>>>>>=20
>>>>>> For what it's worth, I disagree with this patch.  This change is =
based
>>>>>> on an apparently buggy motivation, as I wrote in my reply in the
>>>>>> thread where Jens proposed it.  As such, I think it might even =
bury
>>>>>> more deeply the actual bug that causes the crash (although of =
course
>>>>>> this patch does eliminate the crash for the use case reported in =
that
>>>>>> thread).
>>>>>=20
>>>>> The patch fixes the issue and I've explained why.
>>>>=20
>>>> Definitely, but I wrote you that your explanation seems wrong.
>>>>=20
>>>>> If you have a
>>>>> motivation to fix it differently, for whatever reason, then by all
>>>>> means submit a patch. So far I haven't seen it, and we still have
>>>>> the known crash that people are actually hitting.
>>>>>=20
>>>>=20
>>>> Unfortunately I don't have a solution, because I don't know what =
the
>>>> bug is.  I only know that there's a bug in your explanation for the
>>>> bug you want to fix.
>>>>=20
>>>>> I'll be happy to work with you on that later in the week, when
>>>>> the LSFMM conference has wound down.
>>>>>=20
>>>>=20
>>>> I do thank you for that.  If you can, please start by answering my
>>>> concerns on your explanation.  Maybe your explanation can be fixed =
(or
>>>> I'm simply wrong), and all will be fine.  Or, if your explanation =
is
>>>> really buggy and we don't find the actual bug in the code, we can
>>>> just enrich your proposed change with a comment, and state that =
this
>>>> is a crutch to walk on, while waiting for the actual bug to show =
up.
>>>=20
>>> The reproduction case was there, so should not be hard to run with
>>> that and add some instrumentation to verify or disprove theories.
>>=20
>> Simple code check seems enough for that, see below.
>>=20
>>> For the flush case, you start out with a regular request, and assign
>>> the pointers. Then when we transform that into a flush, it's
>>> bypass inserted, and later retrieved in __bfq_dispatch_request().
>>=20
>> Before the transformation, the request undergoes a finish or a
>> requeue, right?  If so, bfq clears the pointers there.
>=20
> If the request has an end_io hook, we go through there and don't
> finish the request. The request is reused.
>=20

Then, neither elv->finish_request or elv->requeue_request hooks are
invoked, right?  This is one of the possibilities I feared, because it
just breaks the balance of bfq counters, making bfq fail in more or
less unpredictable ways.  So, if I did get you right, then I'll have a
look at the code involved outside bfq, to fully see and learn what you
wrote above; after that, I'll start working on how to make bfq comply
with this new possibility.

Thanks for answering my questions,
Paolo

>>> As
>>> I said earlier, you could either check that in there (add a =
->elv.icq
>>> non-NULL check before doing the bfqq check),
>>=20
>> They must be already cleared, because of the above.
>=20
> Wrong
>=20
>>> or you can have it
>>> cleaner and just always clear the pointers if ->elv.icq isn't =
assigned.
>>>=20
>>=20
>> It is the cleanest solution, if it is ok that those pointers are
>> dirtied (for reasons I don't know) before or during the
>> transformation.
>=20
> Of course it is, they are not valid if ->elv.icq isn't valid. You
> must clear them, or they can be stale.
>=20
> --=20
> Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-25 18:42             ` Paolo Valente
@ 2018-04-27  7:57               ` Paolo Valente
  2018-04-27 14:29                 ` Jens Axboe
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Valente @ 2018-04-27  7:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown



> Il giorno 25 apr 2018, alle ore 20:42, Paolo Valente =
<paolo.valente@linaro.org> ha scritto:
>=20
>=20
>=20
>> Il giorno 25 apr 2018, alle ore 20:18, Jens Axboe <axboe@kernel.dk> =
ha scritto:
>>=20
>> On 4/25/18 12:02 PM, Paolo Valente wrote:
>>>=20
>>>=20
>>>> Il giorno 25 apr 2018, alle ore 19:34, Jens Axboe <axboe@kernel.dk> =
ha scritto:
>>>>=20
>>>> On 4/25/18 11:25 AM, Paolo Valente wrote:
>>>>>=20
>>>>>=20
>>>>>> Il giorno 25 apr 2018, alle ore 19:06, Jens Axboe =
<axboe@kernel.dk> ha scritto:
>>>>>>=20
>>>>>> On 4/25/18 11:03 AM, Paolo Valente wrote:
>>>>>>>=20
>>>>>>>=20
>>>>>>>> Il giorno 25 apr 2018, alle ore 18:50, Jens Axboe =
<axboe@kernel.dk> ha scritto:
>>>>>>>>=20
>>>>>>>> Hi Linus,
>>>>>>>>=20
>>>>>>>> I ended up sitting on this about a week longer than I wanted =
to,
>>>>>>>> since we were hashing out details with a timeout change. I've =
now
>>>>>>>> killed that patch, so we can flush the existing queue in due =
time.
>>>>>>>>=20
>>>>>>>> This pull request contains:
>>>>>>>>=20
>>>>>>>> - Fix for an old regression, where entering the queue can be =
disturbed
>>>>>>>> by a signal to the process. This can cause spurious EIO. Fix =
from Alan
>>>>>>>> Jenkins.
>>>>>>>>=20
>>>>>>>> - cdrom information leak fix from Dan.
>>>>>>>>=20
>>>>>>>> - Trivial helper for testing queue FUA from Dave Chinner, part =
of his
>>>>>>>> O_DIRECT FUA series.
>>>>>>>>=20
>>>>>>>> - Series of swim fixes from Finn that actually makes it work =
again.
>>>>>>>>=20
>>>>>>>> - Loop O_DIRECT corruption fix, which caused data corruption in
>>>>>>>> production for us. =46rom me.
>>>>>>>>=20
>>>>>>>> - BFQ crash fix from me.
>>>>>>>>=20
>>>>>>>=20
>>>>>>> For what it's worth, I disagree with this patch.  This change is =
based
>>>>>>> on an apparently buggy motivation, as I wrote in my reply in the
>>>>>>> thread where Jens proposed it.  As such, I think it might even =
bury
>>>>>>> more deeply the actual bug that causes the crash (although of =
course
>>>>>>> this patch does eliminate the crash for the use case reported in =
that
>>>>>>> thread).
>>>>>>=20
>>>>>> The patch fixes the issue and I've explained why.
>>>>>=20
>>>>> Definitely, but I wrote you that your explanation seems wrong.
>>>>>=20
>>>>>> If you have a
>>>>>> motivation to fix it differently, for whatever reason, then by =
all
>>>>>> means submit a patch. So far I haven't seen it, and we still have
>>>>>> the known crash that people are actually hitting.
>>>>>>=20
>>>>>=20
>>>>> Unfortunately I don't have a solution, because I don't know what =
the
>>>>> bug is.  I only know that there's a bug in your explanation for =
the
>>>>> bug you want to fix.
>>>>>=20
>>>>>> I'll be happy to work with you on that later in the week, when
>>>>>> the LSFMM conference has wound down.
>>>>>>=20
>>>>>=20
>>>>> I do thank you for that.  If you can, please start by answering my
>>>>> concerns on your explanation.  Maybe your explanation can be fixed =
(or
>>>>> I'm simply wrong), and all will be fine.  Or, if your explanation =
is
>>>>> really buggy and we don't find the actual bug in the code, we can
>>>>> just enrich your proposed change with a comment, and state that =
this
>>>>> is a crutch to walk on, while waiting for the actual bug to show =
up.
>>>>=20
>>>> The reproduction case was there, so should not be hard to run with
>>>> that and add some instrumentation to verify or disprove theories.
>>>=20
>>> Simple code check seems enough for that, see below.
>>>=20
>>>> For the flush case, you start out with a regular request, and =
assign
>>>> the pointers. Then when we transform that into a flush, it's
>>>> bypass inserted, and later retrieved in __bfq_dispatch_request().
>>>=20
>>> Before the transformation, the request undergoes a finish or a
>>> requeue, right?  If so, bfq clears the pointers there.
>>=20
>> If the request has an end_io hook, we go through there and don't
>> finish the request. The request is reused.
>>=20
>=20
> Then, neither elv->finish_request or elv->requeue_request hooks are
> invoked, right?  This is one of the possibilities I feared, because it
> just breaks the balance of bfq counters, making bfq fail in more or
> less unpredictable ways.  So, if I did get you right, then I'll have a
> look at the code involved outside bfq, to fully see and learn what you
> wrote above; after that, I'll start working on how to make bfq comply
> with this new possibility.
>=20

Should Jens' commit for bfq be still blocked by my concern, this is
just to inform you that I think I found a way to improve bfq, so that
it can handle also requests that are prepared, but then disappear
without any communication to bfq.  In particular, my change would
include Jens's change, so I will simply base my change on Jens' one,
and I remove my disagreement.  Of course, Jens' change alone would
just move the system from an immediate crash, to an unpredictable
behavior, which may include later crashes.

Thanks,
Paolo

> Thanks for answering my questions,
> Paolo
>=20
>>>> As
>>>> I said earlier, you could either check that in there (add a =
->elv.icq
>>>> non-NULL check before doing the bfqq check),
>>>=20
>>> They must be already cleared, because of the above.
>>=20
>> Wrong
>>=20
>>>> or you can have it
>>>> cleaner and just always clear the pointers if ->elv.icq isn't =
assigned.
>>>>=20
>>>=20
>>> It is the cleanest solution, if it is ok that those pointers are
>>> dirtied (for reasons I don't know) before or during the
>>> transformation.
>>=20
>> Of course it is, they are not valid if ->elv.icq isn't valid. You
>> must clear them, or they can be stale.
>>=20
>> --=20
>> Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-27  7:57               ` Paolo Valente
@ 2018-04-27 14:29                 ` Jens Axboe
  2018-04-27 17:55                   ` Paolo Valente
  0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2018-04-27 14:29 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown

On 4/27/18 1:57 AM, Paolo Valente wrote:
> Should Jens' commit for bfq be still blocked by my concern, this is

The patch is in.

> just to inform you that I think I found a way to improve bfq, so that
> it can handle also requests that are prepared, but then disappear
> without any communication to bfq.  In particular, my change would

This is going about it in totally the wrong way. If things magically
disappear "without any communication to bfq", then it's either a bug in
the scheduling framework OR a bug in bfq. Those are the only two
options. Request don't magically appear or disappear, they follow a
specific set of rules.

> include Jens's change, so I will simply base my change on Jens' one,
> and I remove my disagreement.  Of course, Jens' change alone would
> just move the system from an immediate crash, to an unpredictable
> behavior, which may include later crashes.

Can we please stop with this nonsense. A (->priv[0]) and B (->priv[1])
are valid, IFF C (->elv.icq) is valid. The patch ensures we clear A and
B, if C isn't valid. There's nothing unpredictable about the patch.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [GIT PULL] Block fixes for 4.17-rc2
  2018-04-27 14:29                 ` Jens Axboe
@ 2018-04-27 17:55                   ` Paolo Valente
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Valente @ 2018-04-27 17:55 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Linus Torvalds, linux-block@vger.kernel.org, Ulf Hansson,
	Mark Brown



> Il giorno 27 apr 2018, alle ore 16:29, Jens Axboe <axboe@kernel.dk> ha =
scritto:
>=20
> On 4/27/18 1:57 AM, Paolo Valente wrote:
>> Should Jens' commit for bfq be still blocked by my concern, this is
>=20
> The patch is in.
>=20
>> just to inform you that I think I found a way to improve bfq, so that
>> it can handle also requests that are prepared, but then disappear
>> without any communication to bfq.  In particular, my change would
>=20
> This is going about it in totally the wrong way. If things magically
> disappear "without any communication to bfq", then it's either a bug =
in
> the scheduling framework OR a bug in bfq. Those are the only two
> options. Request don't magically appear or disappear, they follow a
> specific set of rules.
>=20

I meant "disappear for bfq", sorry if this created confusion.  To
clear further confusion, let me restate the consequences of what you
told me that happens, reusing your own words.

(1) "For the flush case, you start out with a regular request, and
assign the pointers." -> bfq assigns the pointers, for that regular
request, because bfq associates that request with an internal queue
(as it has to do).  In particular bfq increments reference counters
and other counters in that queue (as it has to do).

(2) "Then when we transform that into a flush, it's bypass inserted,"
-> The same request is then transformed into a request with no icq,
i.e., a request not associated with any queue. bfq has no way to know =
that this
bypass-inserted request is a transformed version of the same request
prepared one moment before, because no bfq hook is invoked in the
middle ( "...  the request has an end_io hook, we go through there and
don't finish the request").

So, the incrementing of counters done for of event (1) will *never* be
balanced, pushing bfq into an inconsistent state.  Unless, of course,
I misinterpreted your sentences above.


>> include Jens's change, so I will simply base my change on Jens' one,
>> and I remove my disagreement.  Of course, Jens' change alone would
>> just move the system from an immediate crash, to an unpredictable
>> behavior, which may include later crashes.
>=20
> Can we please stop with this nonsense. A (->priv[0]) and B (->priv[1])
> are valid, IFF C (->elv.icq) is valid. The patch ensures we clear A =
and
> B, if C isn't valid. There's nothing unpredictable about the patch.

You got me wrong.  The sense is as follows: because of the above,
the transformation of the request leads bfq into an inconsistent
state, with which your patch has nothing to do.  Your patch opens the
door to the consequences of this inconsistency only because your patch
eliminates an immediate crash, on request dispatch, occurring *before*
those consequences could have effects.  Your patch *is not* the cause
of this inconsistency.

That said, once a complex system is inconsistent, its behavior may be
hard to predict.  For example, one of the problems we fixed some
months ago was the missing of the requeue hook in bfq, which caused a
very similar unbalance of counters, and ultimately a system hang.

I can even paste involved code snippets, if needed, but I guess it
won't be necessary.

Anyway, I'm already testing a patch that should solve the problem I
described again in this email.  I'm basing that patch on yours, if ok
for you.

Thanks,
Paolo

>=20
> --=20
> Jens Axboe
>=20

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-04-27 17:55 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-04-25 16:50 [GIT PULL] Block fixes for 4.17-rc2 Jens Axboe
2018-04-25 17:03 ` Paolo Valente
2018-04-25 17:06   ` Jens Axboe
2018-04-25 17:25     ` Paolo Valente
2018-04-25 17:34       ` Jens Axboe
2018-04-25 18:02         ` Paolo Valente
2018-04-25 18:18           ` Jens Axboe
2018-04-25 18:42             ` Paolo Valente
2018-04-27  7:57               ` Paolo Valente
2018-04-27 14:29                 ` Jens Axboe
2018-04-27 17:55                   ` Paolo Valente

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox