* [PATCH] dm-integrity: fix inefficient allocation of stack space
@ 2017-07-19 15:23 Mikulas Patocka
2017-07-19 18:39 ` John Stoffel
0 siblings, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2017-07-19 15:23 UTC (permalink / raw)
To: Mike Snitzer, Milan Broz; +Cc: dm-devel
When using block size greater than 512 bytes, the dm-integrity target
allocates journal space inefficiently, it allocates one entry for each
512-byte chunk of data, fills entries for each block of data and leaves
the remaining entries unused. This doesn't cause data corruption, but it
causes severe performance degradation.
This patch fixes the journal allocation. As a safety it also adds BUG that
fires if the variables representing journal usage get out of sync (it's
better to crash than continue and corrupt data, so BUG is justified).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 7eada909bfd7 ("dm: add integrity target")
---
drivers/md/dm-integrity.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
Index: linux-2.6/drivers/md/dm-integrity.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-integrity.c
+++ linux-2.6/drivers/md/dm-integrity.c
@@ -1589,14 +1589,14 @@ retry:
unsigned next_entry, i, pos;
unsigned ws, we;
- dio->range.n_sectors = min(dio->range.n_sectors, ic->free_sectors);
+ dio->range.n_sectors = min(dio->range.n_sectors, ic->free_sectors << ic->sb->log2_sectors_per_block);
if (unlikely(!dio->range.n_sectors))
goto sleep;
- ic->free_sectors -= dio->range.n_sectors;
+ ic->free_sectors -= dio->range.n_sectors >> ic->sb->log2_sectors_per_block;
journal_section = ic->free_section;
journal_entry = ic->free_section_entry;
- next_entry = ic->free_section_entry + dio->range.n_sectors;
+ next_entry = ic->free_section_entry + (dio->range.n_sectors >> ic->sb->log2_sectors_per_block);
ic->free_section_entry = next_entry % ic->journal_section_entries;
ic->free_section += next_entry / ic->journal_section_entries;
ic->n_uncommitted_sections += next_entry / ic->journal_section_entries;
@@ -1727,6 +1727,7 @@ static void pad_uncommitted(struct dm_in
wraparound_section(ic, &ic->free_section);
ic->n_uncommitted_sections++;
}
+ BUG_ON((ic->n_uncommitted_sections + ic->n_committed_sections) * ic->journal_section_entries + ic->free_sectors != ic->journal_sections * ic->journal_section_entries);
}
static void integrity_commit(struct work_struct *w)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] dm-integrity: fix inefficient allocation of stack space
2017-07-19 15:23 [PATCH] dm-integrity: fix inefficient allocation of stack space Mikulas Patocka
@ 2017-07-19 18:39 ` John Stoffel
2017-07-19 19:02 ` Mike Snitzer
2017-07-20 10:26 ` [PATCH] " Mikulas Patocka
0 siblings, 2 replies; 12+ messages in thread
From: John Stoffel @ 2017-07-19 18:39 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: Mike Snitzer, dm-devel, Milan Broz
>>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
Mikulas> When using block size greater than 512 bytes, the dm-integrity target
Mikulas> allocates journal space inefficiently, it allocates one entry for each
Mikulas> 512-byte chunk of data, fills entries for each block of data and leaves
Mikulas> the remaining entries unused. This doesn't cause data corruption, but it
Mikulas> causes severe performance degradation.
Mikulas> This patch fixes the journal allocation. As a safety it also
Mikulas> adds BUG that fires if the variables representing journal
Mikulas> usage get out of sync (it's better to crash than continue and
Mikulas> corrupt data, so BUG is justified).
No, I don't agree. You should lock down the block device(s) using the
dm-integrity target, but you should NOT take down the entire system
because of this. Just return a failure up the stack that forces the
device into read only mode so that there's a chance to debug this
issue.
Using a BUG_ON, especially for code that isn't proven, is just wrong.
Do a WARN_ONCE() and then return an error instead.
John
Mikulas> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Mikulas> Cc: stable@vger.kernel.org
Mikulas> Fixes: 7eada909bfd7 ("dm: add integrity target")
Mikulas> ---
Mikulas> drivers/md/dm-integrity.c | 7 ++++---
Mikulas> 1 file changed, 4 insertions(+), 3 deletions(-)
Mikulas> Index: linux-2.6/drivers/md/dm-integrity.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/drivers/md/dm-integrity.c
Mikulas> +++ linux-2.6/drivers/md/dm-integrity.c
Mikulas> @@ -1589,14 +1589,14 @@ retry:
Mikulas> unsigned next_entry, i, pos;
Mikulas> unsigned ws, we;
Mikulas> - dio->range.n_sectors = min(dio->range.n_sectors, ic->free_sectors);
Mikulas> + dio->range.n_sectors = min(dio->range.n_sectors, ic->free_sectors << ic->sb->log2_sectors_per_block);
Mikulas> if (unlikely(!dio->range.n_sectors))
Mikulas> goto sleep;
Mikulas> - ic->free_sectors -= dio->range.n_sectors;
Mikulas> + ic->free_sectors -= dio->range.n_sectors >> ic->sb->log2_sectors_per_block;
Mikulas> journal_section = ic->free_section;
Mikulas> journal_entry = ic->free_section_entry;
Mikulas> - next_entry = ic->free_section_entry + dio->range.n_sectors;
Mikulas> + next_entry = ic->free_section_entry + (dio->range.n_sectors >> ic->sb->log2_sectors_per_block);
ic-> free_section_entry = next_entry % ic->journal_section_entries;
ic-> free_section += next_entry / ic->journal_section_entries;
ic-> n_uncommitted_sections += next_entry / ic->journal_section_entries;
Mikulas> @@ -1727,6 +1727,7 @@ static void pad_uncommitted(struct dm_in
Mikulas> wraparound_section(ic, &ic->free_section);
ic-> n_uncommitted_sections++;
Mikulas> }
Mikulas> + BUG_ON((ic->n_uncommitted_sections + ic->n_committed_sections) * ic->journal_section_entries + ic->free_sectors != ic->journal_sections * ic->journal_section_entries);
Mikulas> }
Mikulas> static void integrity_commit(struct work_struct *w)
Mikulas> --
Mikulas> dm-devel mailing list
Mikulas> dm-devel@redhat.com
Mikulas> https://www.redhat.com/mailman/listinfo/dm-devel
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: dm-integrity: fix inefficient allocation of stack space
2017-07-19 18:39 ` John Stoffel
@ 2017-07-19 19:02 ` Mike Snitzer
2017-07-19 21:07 ` John Stoffel
2017-07-20 10:45 ` Mikulas Patocka
2017-07-20 10:26 ` [PATCH] " Mikulas Patocka
1 sibling, 2 replies; 12+ messages in thread
From: Mike Snitzer @ 2017-07-19 19:02 UTC (permalink / raw)
To: John Stoffel; +Cc: dm-devel, Mikulas Patocka, Milan Broz
On Wed, Jul 19 2017 at 2:39pm -0400,
John Stoffel <john@stoffel.org> wrote:
> >>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
>
> Mikulas> When using block size greater than 512 bytes, the dm-integrity target
> Mikulas> allocates journal space inefficiently, it allocates one entry for each
> Mikulas> 512-byte chunk of data, fills entries for each block of data and leaves
> Mikulas> the remaining entries unused. This doesn't cause data corruption, but it
> Mikulas> causes severe performance degradation.
>
> Mikulas> This patch fixes the journal allocation. As a safety it also
> Mikulas> adds BUG that fires if the variables representing journal
> Mikulas> usage get out of sync (it's better to crash than continue and
> Mikulas> corrupt data, so BUG is justified).
>
> No, I don't agree. You should lock down the block device(s) using the
> dm-integrity target, but you should NOT take down the entire system
> because of this. Just return a failure up the stack that forces the
> device into read only mode so that there's a chance to debug this
> issue.
>
> Using a BUG_ON, especially for code that isn't proven, is just wrong.
> Do a WARN_ONCE() and then return an error instead.
I'll remove the BUG_ON from this stable@ fix. Mikulas, please have a
look at handling the failure more gracefully (maybe like John suggests).
In general, there are too many BUG_ON()s in the dm-integrity code. I
let them slide for inclusion but it would probably be a good idea to
look at eliminating them and putting the dm-integrity device into "fail
mode" in the face of critical errors -- much like dm-thinp and dm-cache
has.
Mike
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: dm-integrity: fix inefficient allocation of stack space
2017-07-19 19:02 ` Mike Snitzer
@ 2017-07-19 21:07 ` John Stoffel
2017-07-20 10:59 ` Mikulas Patocka
2017-07-20 10:45 ` Mikulas Patocka
1 sibling, 1 reply; 12+ messages in thread
From: John Stoffel @ 2017-07-19 21:07 UTC (permalink / raw)
To: Mike Snitzer; +Cc: Mikulas Patocka, dm-devel, John Stoffel, Milan Broz
>>>>> "Mike" == Mike Snitzer <snitzer@redhat.com> writes:
Mike> On Wed, Jul 19 2017 at 2:39pm -0400,
Mike> John Stoffel <john@stoffel.org> wrote:
>> >>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
>>
Mikulas> When using block size greater than 512 bytes, the dm-integrity target
Mikulas> allocates journal space inefficiently, it allocates one entry for each
Mikulas> 512-byte chunk of data, fills entries for each block of data and leaves
Mikulas> the remaining entries unused. This doesn't cause data corruption, but it
Mikulas> causes severe performance degradation.
>>
Mikulas> This patch fixes the journal allocation. As a safety it also
Mikulas> adds BUG that fires if the variables representing journal
Mikulas> usage get out of sync (it's better to crash than continue and
Mikulas> corrupt data, so BUG is justified).
>>
>> No, I don't agree. You should lock down the block device(s) using the
>> dm-integrity target, but you should NOT take down the entire system
>> because of this. Just return a failure up the stack that forces the
>> device into read only mode so that there's a chance to debug this
>> issue.
>>
>> Using a BUG_ON, especially for code that isn't proven, is just wrong.
>> Do a WARN_ONCE() and then return an error instead.
Mike> I'll remove the BUG_ON from this stable@ fix. Mikulas, please have a
Mike> look at handling the failure more gracefully (maybe like John suggests).
Mike> In general, there are too many BUG_ON()s in the dm-integrity code. I
Mike> let them slide for inclusion but it would probably be a good idea to
Mike> look at eliminating them and putting the dm-integrity device into "fail
Mike> mode" in the face of critical errors -- much like dm-thinp and dm-cache
Mike> has.
I'd like to argue that you should never use BUG_ON at all, esp since
if you have integrity running on just one critical device, but have
other devices that work just fine, bringing down the entire system
because you don't think things are ok is terrible.
We should rip out ALL the BUG_ONs in the dm-integrity and just do
proper error handling instead. For testing, sure, use them on your
code. But please, not for general use!
John
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] dm-integrity: fix inefficient allocation of stack space
2017-07-19 18:39 ` John Stoffel
2017-07-19 19:02 ` Mike Snitzer
@ 2017-07-20 10:26 ` Mikulas Patocka
2017-07-20 15:43 ` John Stoffel
1 sibling, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2017-07-20 10:26 UTC (permalink / raw)
To: John Stoffel; +Cc: Mike Snitzer, dm-devel, Milan Broz
On Wed, 19 Jul 2017, John Stoffel wrote:
> >>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
>
> Mikulas> When using block size greater than 512 bytes, the dm-integrity target
> Mikulas> allocates journal space inefficiently, it allocates one entry for each
> Mikulas> 512-byte chunk of data, fills entries for each block of data and leaves
> Mikulas> the remaining entries unused. This doesn't cause data corruption, but it
> Mikulas> causes severe performance degradation.
>
> Mikulas> This patch fixes the journal allocation. As a safety it also
> Mikulas> adds BUG that fires if the variables representing journal
> Mikulas> usage get out of sync (it's better to crash than continue and
> Mikulas> corrupt data, so BUG is justified).
>
> No, I don't agree. You should lock down the block device(s) using the
> dm-integrity target, but you should NOT take down the entire system
> because of this. Just return a failure up the stack that forces the
> device into read only mode so that there's a chance to debug this
> issue.
>
> Using a BUG_ON, especially for code that isn't proven, is just wrong.
> Do a WARN_ONCE() and then return an error instead.
>
> John
That BUG can only be triggered by a bug in the code, it can't be triggered
by bad data on the disk, so why should we write code to recover from it?
Mikulas
> Mikulas> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> Mikulas> Cc: stable@vger.kernel.org
> Mikulas> Fixes: 7eada909bfd7 ("dm: add integrity target")
>
> Mikulas> ---
> Mikulas> drivers/md/dm-integrity.c | 7 ++++---
> Mikulas> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> Mikulas> Index: linux-2.6/drivers/md/dm-integrity.c
> Mikulas> ===================================================================
> Mikulas> --- linux-2.6.orig/drivers/md/dm-integrity.c
> Mikulas> +++ linux-2.6/drivers/md/dm-integrity.c
> Mikulas> @@ -1589,14 +1589,14 @@ retry:
> Mikulas> unsigned next_entry, i, pos;
> Mikulas> unsigned ws, we;
>
> Mikulas> - dio->range.n_sectors = min(dio->range.n_sectors, ic->free_sectors);
> Mikulas> + dio->range.n_sectors = min(dio->range.n_sectors, ic->free_sectors << ic->sb->log2_sectors_per_block);
> Mikulas> if (unlikely(!dio->range.n_sectors))
> Mikulas> goto sleep;
> Mikulas> - ic->free_sectors -= dio->range.n_sectors;
> Mikulas> + ic->free_sectors -= dio->range.n_sectors >> ic->sb->log2_sectors_per_block;
> Mikulas> journal_section = ic->free_section;
> Mikulas> journal_entry = ic->free_section_entry;
>
> Mikulas> - next_entry = ic->free_section_entry + dio->range.n_sectors;
> Mikulas> + next_entry = ic->free_section_entry + (dio->range.n_sectors >> ic->sb->log2_sectors_per_block);
> ic-> free_section_entry = next_entry % ic->journal_section_entries;
> ic-> free_section += next_entry / ic->journal_section_entries;
> ic-> n_uncommitted_sections += next_entry / ic->journal_section_entries;
> Mikulas> @@ -1727,6 +1727,7 @@ static void pad_uncommitted(struct dm_in
> Mikulas> wraparound_section(ic, &ic->free_section);
> ic-> n_uncommitted_sections++;
> Mikulas> }
> Mikulas> + BUG_ON((ic->n_uncommitted_sections + ic->n_committed_sections) * ic->journal_section_entries + ic->free_sectors != ic->journal_sections * ic->journal_section_entries);
> Mikulas> }
>
> Mikulas> static void integrity_commit(struct work_struct *w)
>
> Mikulas> --
> Mikulas> dm-devel mailing list
> Mikulas> dm-devel@redhat.com
> Mikulas> https://www.redhat.com/mailman/listinfo/dm-devel
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: dm-integrity: fix inefficient allocation of stack space
2017-07-19 19:02 ` Mike Snitzer
2017-07-19 21:07 ` John Stoffel
@ 2017-07-20 10:45 ` Mikulas Patocka
1 sibling, 0 replies; 12+ messages in thread
From: Mikulas Patocka @ 2017-07-20 10:45 UTC (permalink / raw)
To: Mike Snitzer; +Cc: dm-devel, John Stoffel, Milan Broz
On Wed, 19 Jul 2017, Mike Snitzer wrote:
> On Wed, Jul 19 2017 at 2:39pm -0400,
> John Stoffel <john@stoffel.org> wrote:
>
> > >>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
> >
> > Mikulas> When using block size greater than 512 bytes, the dm-integrity target
> > Mikulas> allocates journal space inefficiently, it allocates one entry for each
> > Mikulas> 512-byte chunk of data, fills entries for each block of data and leaves
> > Mikulas> the remaining entries unused. This doesn't cause data corruption, but it
> > Mikulas> causes severe performance degradation.
> >
> > Mikulas> This patch fixes the journal allocation. As a safety it also
> > Mikulas> adds BUG that fires if the variables representing journal
> > Mikulas> usage get out of sync (it's better to crash than continue and
> > Mikulas> corrupt data, so BUG is justified).
> >
> > No, I don't agree. You should lock down the block device(s) using the
> > dm-integrity target, but you should NOT take down the entire system
> > because of this. Just return a failure up the stack that forces the
> > device into read only mode so that there's a chance to debug this
> > issue.
> >
> > Using a BUG_ON, especially for code that isn't proven, is just wrong.
> > Do a WARN_ONCE() and then return an error instead.
>
> I'll remove the BUG_ON from this stable@ fix.
Don't - the BUG will help developers in the future if they break the code.
It is much easier to debug a BUG than to debug data corruption.
> Mikulas, please have a look at handling the failure more gracefully
> (maybe like John suggests). In general, there are too many BUG_ON()s in
> the dm-integrity code. I let them slide for inclusion but it would
> probably be a good idea to look at eliminating them and putting the
> dm-integrity device into "fail mode" in the face of critical errors --
> much like dm-thinp and dm-cache has.
It's better to write a void function that calls BUG in case of bug in the
code than to write a function returning error code, propagate the error
code to the callers and try to handle the error code. How should the error
code be handled if memory is already corrupted? In this paper they say
that half of the code written for Multics was error recovery -
http://multicians.org/unix.html - I don't want to go down that path and
write code to handle conditions that can't happen.
> Mike
Mikulas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: dm-integrity: fix inefficient allocation of stack space
2017-07-19 21:07 ` John Stoffel
@ 2017-07-20 10:59 ` Mikulas Patocka
2017-07-20 11:25 ` Mikulas Patocka
2017-07-20 15:46 ` John Stoffel
0 siblings, 2 replies; 12+ messages in thread
From: Mikulas Patocka @ 2017-07-20 10:59 UTC (permalink / raw)
To: John Stoffel; +Cc: dm-devel, Mike Snitzer, Milan Broz
On Wed, 19 Jul 2017, John Stoffel wrote:
> I'd like to argue that you should never use BUG_ON at all, esp since
> if you have integrity running on just one critical device, but have
> other devices that work just fine, bringing down the entire system
> because you don't think things are ok is terrible.
>
> We should rip out ALL the BUG_ONs in the dm-integrity and just do
> proper error handling instead. For testing, sure, use them on your
> code. But please, not for general use!
>
> John
Linus sometimes argued against BUG_ON and people are repeating it after
him like sheep.
If a programmer made a bug in his code, he should fix that bug, not write
additional recovery code for the bug.
Mikulas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: dm-integrity: fix inefficient allocation of stack space
2017-07-20 10:59 ` Mikulas Patocka
@ 2017-07-20 11:25 ` Mikulas Patocka
[not found] ` <22896.56378.568488.989656@quad.stoffel.home>
2017-07-20 15:46 ` John Stoffel
1 sibling, 1 reply; 12+ messages in thread
From: Mikulas Patocka @ 2017-07-20 11:25 UTC (permalink / raw)
To: John Stoffel; +Cc: dm-devel, Mike Snitzer, Milan Broz
On Thu, 20 Jul 2017, Mikulas Patocka wrote:
>
>
> On Wed, 19 Jul 2017, John Stoffel wrote:
>
> > I'd like to argue that you should never use BUG_ON at all, esp since
> > if you have integrity running on just one critical device, but have
> > other devices that work just fine, bringing down the entire system
> > because you don't think things are ok is terrible.
> >
> > We should rip out ALL the BUG_ONs in the dm-integrity and just do
> > proper error handling instead. For testing, sure, use them on your
> > code. But please, not for general use!
> >
> > John
>
> Linus sometimes argued against BUG_ON and people are repeating it after
> him like sheep.
>
> If a programmer made a bug in his code, he should fix that bug, not write
> additional recovery code for the bug.
>
> Mikulas
For example, there a function access_journal_check that checks if journal
section and offset are valid and calls BUG if they're not.
Now, suppose that we change BUG() to WARN() - now we need to return error
code from the function. So we change the return value from void to int.
But this function is called by three other functions - page_list_location,
access_journal_entry, access_journal_data - and now we need to test the
error code in all of them.
page_list_location returns void, so it needs to be changed to return an
error code too. access_journal_entry and access_journal_data return a
pointer, so we change them to return NULL on error - but that NULL needs
to be checked in the callers.
There are 11 callers of access_journal_entry and 5 calleds of
access_journal_data and 5 callers of page_list_location.
And now you can see that changing a single BUG to WARN adds excessive
amount of junk code that checks for "impossible" conditions. These extra
tests do not make the code any more readable or understandable - they make
it less readable because the programmer no longer knows what checks are
really needed and what are not.
In the early days when I was programming, I was adding checks for NULL
pointers "just in case - so that it doesn't crash". But these checks
didn't make the code any more understandable - they made the code less
understandable - because if a function return value is checked for NULL,
the programmer believes that the function could return NULL. And the
programmer no longer knows what is the contract of the function - could
the function really return NULL or not?
Mikulas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] dm-integrity: fix inefficient allocation of stack space
2017-07-20 10:26 ` [PATCH] " Mikulas Patocka
@ 2017-07-20 15:43 ` John Stoffel
2017-07-21 12:29 ` Mikulas Patocka
0 siblings, 1 reply; 12+ messages in thread
From: John Stoffel @ 2017-07-20 15:43 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: Mike Snitzer, John Stoffel, dm-devel, Milan Broz
>>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
Mikulas> On Wed, 19 Jul 2017, John Stoffel wrote:
>> >>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
>>
Mikulas> When using block size greater than 512 bytes, the dm-integrity target
Mikulas> allocates journal space inefficiently, it allocates one entry for each
Mikulas> 512-byte chunk of data, fills entries for each block of data and leaves
Mikulas> the remaining entries unused. This doesn't cause data corruption, but it
Mikulas> causes severe performance degradation.
>>
Mikulas> This patch fixes the journal allocation. As a safety it also
Mikulas> adds BUG that fires if the variables representing journal
Mikulas> usage get out of sync (it's better to crash than continue and
Mikulas> corrupt data, so BUG is justified).
>>
>> No, I don't agree. You should lock down the block device(s) using the
>> dm-integrity target, but you should NOT take down the entire system
>> because of this. Just return a failure up the stack that forces the
>> device into read only mode so that there's a chance to debug this
>> issue.
>>
>> Using a BUG_ON, especially for code that isn't proven, is just wrong.
>> Do a WARN_ONCE() and then return an error instead.
>>
>> John
Mikulas> That BUG can only be triggered by a bug in the code, it can't be triggered
Mikulas> by bad data on the disk, so why should we write code to recover from it?
Because it's damn obnoxious to kill a system because of bad coding,
without giving the user a chance to recover or even investigate
the root cause. If it's a bug in the code, just warn the user and
then try to continue. If you can't continue, then just make the block
device go read-only to protect the data.
Again, why do you think BUG_ON is appropriate thing to do? What
advantage does it have for you?
John
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: dm-integrity: fix inefficient allocation of stack space
2017-07-20 10:59 ` Mikulas Patocka
2017-07-20 11:25 ` Mikulas Patocka
@ 2017-07-20 15:46 ` John Stoffel
1 sibling, 0 replies; 12+ messages in thread
From: John Stoffel @ 2017-07-20 15:46 UTC (permalink / raw)
To: Mikulas Patocka; +Cc: dm-devel, John Stoffel, Mike Snitzer, Milan Broz
>>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
Mikulas> On Wed, 19 Jul 2017, John Stoffel wrote:
>> I'd like to argue that you should never use BUG_ON at all, esp since
>> if you have integrity running on just one critical device, but have
>> other devices that work just fine, bringing down the entire system
>> because you don't think things are ok is terrible.
>>
>> We should rip out ALL the BUG_ONs in the dm-integrity and just do
>> proper error handling instead. For testing, sure, use them on your
>> code. But please, not for general use!
>>
>> John
Mikulas> Linus sometimes argued against BUG_ON and people are
Mikulas> repeating it after him like sheep.
I understand Linus' rants on BUG_ON and I agree with them. He doesn't
argue that they're not appropriate at times, but just using them
because a random driver or module freaks out is NOT appropriate.
Mikulas> If a programmer made a bug in his code, he should fix that
Mikulas> bug, not write additional recovery code for the bug.
If a programmer has a bug which kills the system for no damn good
reason, then I object. Don't think that dm-integrity is oh so
important that you need to kill the system because of a mistake.
Try to be robust. 90% of all code is error handling.
John
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: dm-integrity: fix inefficient allocation of stack space
[not found] ` <22896.56378.568488.989656@quad.stoffel.home>
@ 2017-07-21 12:15 ` Mikulas Patocka
0 siblings, 0 replies; 12+ messages in thread
From: Mikulas Patocka @ 2017-07-21 12:15 UTC (permalink / raw)
To: John Stoffel; +Cc: dm-devel, Mike Snitzer, Milan Broz
On Thu, 20 Jul 2017, John Stoffel wrote:
> Mikulas> In the early days when I was programming, I was adding checks for NULL
> Mikulas> pointers "just in case - so that it doesn't crash". But these checks
> Mikulas> didn't make the code any more understandable - they made the code less
> Mikulas> understandable - because if a function return value is checked for NULL,
> Mikulas> the programmer believes that the function could return NULL. And the
> Mikulas> programmer no longer knows what is the contract of the function - could
> Mikulas> the function really return NULL or not?
>
> In this case, maybe it would make more sense to clamp the values to
> the maximums in case they do exceed their limits? The check is
> simply:
>
> if (unlikely(section >= ic->journal_sections) ||
> unlikely(offset >= limit)) {
>
> So I'm arguing that when you set the section or offset, shouldn't they
> be clamped to their limits there?
>
> Thanks for your time,
> John
Returning invalid data is directly against the purpose of the dm-integrity
target.
If you think that clamping array indices that are out of range improves
software reliability - so write some program that uses this technique and
we will see how reliable or unreliable it is. You can persuade people by
writing software, you can't persuade them by preaching what should they
do.
Mikulas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] dm-integrity: fix inefficient allocation of stack space
2017-07-20 15:43 ` John Stoffel
@ 2017-07-21 12:29 ` Mikulas Patocka
0 siblings, 0 replies; 12+ messages in thread
From: Mikulas Patocka @ 2017-07-21 12:29 UTC (permalink / raw)
To: John Stoffel; +Cc: Mike Snitzer, dm-devel, Milan Broz
On Thu, 20 Jul 2017, John Stoffel wrote:
> >>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
>
> Mikulas> On Wed, 19 Jul 2017, John Stoffel wrote:
>
> >> >>>>> "Mikulas" == Mikulas Patocka <mpatocka@redhat.com> writes:
> >>
> Mikulas> When using block size greater than 512 bytes, the dm-integrity target
> Mikulas> allocates journal space inefficiently, it allocates one entry for each
> Mikulas> 512-byte chunk of data, fills entries for each block of data and leaves
> Mikulas> the remaining entries unused. This doesn't cause data corruption, but it
> Mikulas> causes severe performance degradation.
> >>
> Mikulas> This patch fixes the journal allocation. As a safety it also
> Mikulas> adds BUG that fires if the variables representing journal
> Mikulas> usage get out of sync (it's better to crash than continue and
> Mikulas> corrupt data, so BUG is justified).
> >>
> >> No, I don't agree. You should lock down the block device(s) using the
> >> dm-integrity target, but you should NOT take down the entire system
> >> because of this. Just return a failure up the stack that forces the
> >> device into read only mode so that there's a chance to debug this
> >> issue.
> >>
> >> Using a BUG_ON, especially for code that isn't proven, is just wrong.
> >> Do a WARN_ONCE() and then return an error instead.
> >>
> >> John
>
> Mikulas> That BUG can only be triggered by a bug in the code, it can't be triggered
> Mikulas> by bad data on the disk, so why should we write code to recover from it?
>
> Because it's damn obnoxious to kill a system because of bad coding,
> without giving the user a chance to recover or even investigate
> the root cause. If it's a bug in the code, just warn the user and
> then try to continue. If you can't continue, then just make the block
> device go read-only to protect the data.
>
> Again, why do you think BUG_ON is appropriate thing to do? What
> advantage does it have for you?
When writing the code, I occasionally make mistakes, such as passing a
wrong variable to the function. This BUG_ON catches the problem
immediatelly, I can instantly see what problem occured.
If that BUG_ON is omitted, I get a crash that is harder to debug (it is
needed to disassemble the function and find out which registers are used
for which variables) or I get a data corruption, which is even harder to
debug than a crash.
If I pass a wrong variable to a function, the correct fix is to pass the
correct variable to the function, not to write additional code that tries
to recover from the wrong variable. If you disagree, just show us some
program that uses your own techniques for recovering from your bugs and we
can see.
Mikulas
> John
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2017-07-21 12:29 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-07-19 15:23 [PATCH] dm-integrity: fix inefficient allocation of stack space Mikulas Patocka
2017-07-19 18:39 ` John Stoffel
2017-07-19 19:02 ` Mike Snitzer
2017-07-19 21:07 ` John Stoffel
2017-07-20 10:59 ` Mikulas Patocka
2017-07-20 11:25 ` Mikulas Patocka
[not found] ` <22896.56378.568488.989656@quad.stoffel.home>
2017-07-21 12:15 ` Mikulas Patocka
2017-07-20 15:46 ` John Stoffel
2017-07-20 10:45 ` Mikulas Patocka
2017-07-20 10:26 ` [PATCH] " Mikulas Patocka
2017-07-20 15:43 ` John Stoffel
2017-07-21 12:29 ` Mikulas Patocka
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.