public inbox for linux-kernel@vger.kernel.org
* [discussion] proposal to bypass zero data for dm-crypt
@ 2024-12-21  2:34 Yu Kuai
  2025-01-03 16:25 ` Mikulas Patocka
  2025-01-07  2:04 ` James Bottomley
  0 siblings, 2 replies; 7+ messages in thread
From: Yu Kuai @ 2024-12-21  2:34 UTC (permalink / raw)
  To: Alasdair Kergon, Mike Snitzer, Mikulas Patocka, dm-devel, lkml,
	yukuai (C)

Background

We provide virtual machines for customers to use, which include an 
important feature: in the initial state, the disks in the virtual 
machine do not occupy actual storage space, and the data read by users 
is all zeros until the user writes data for the first time. This can 
save a large amount of storage.

Problem

However, after introducing dm-crypt, this feature has failed. Because we 
expect the data read by users in the initial state to be zero, we have 
to write all zeros from dm-crypt.
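
To illustrate the problem, here is a minimal Python sketch; it is not
dm-crypt code. SparseDisk, toy_crypt and the key are hypothetical names,
and the XOR keystream is only a stand-in for a real sector cipher (real
dm-crypt modes behave differently under tampering). It shows that an
unallocated, all-zero backing sector does not read as zeros through the
encryption layer, so zeros have to be written explicitly and every
sector ends up allocated.

```python
import hashlib

SECTOR = 512  # bytes per sector (assumed for the sketch)


class SparseDisk:
    """Backing store that allocates a sector only on first write."""

    def __init__(self):
        self.blocks = {}

    def read(self, n):
        # Unallocated sectors read back as zeros.
        return self.blocks.get(n, bytes(SECTOR))

    def write(self, n, data):
        self.blocks[n] = data

    def allocated(self):
        return len(self.blocks)


def toy_crypt(key, n, data):
    """XOR with a per-sector keystream; the same call encrypts and decrypts."""
    stream = hashlib.shake_256(key + n.to_bytes(8, "little")).digest(SECTOR)
    return bytes(a ^ b for a, b in zip(data, stream))


key = b"demo key"
disk = SparseDisk()

# Reading an unallocated sector through the crypt layer yields keystream
# bytes, not zeros.
fresh_read = toy_crypt(key, 0, disk.read(0))

# To make the guest see zeros, encrypted zeros must be written to every
# sector, which allocates all of them.
for n in range(4):
    disk.write(n, toy_crypt(key, n, bytes(SECTOR)))
zero_read = toy_crypt(key, 0, disk.read(0))
```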

Hence we'd like to propose to bypass zero data for dm-crypt, for
example:

before:
zero data -> encrypted zero data
decrypted zero data -> zero data
others

after:
zero data -> zero data
decrypted zero data -> encrypted zero data
others(doesn't change)
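
One possible reading of the mapping above, sketched in Python under
stated assumptions: "decrypted zero data" is taken to mean the plaintext
that an all-zero ciphertext would normally decrypt to. The cipher is
modelled as a random permutation of 8-bit values; make_toy_cipher,
bypass_encrypt and bypass_decrypt are hypothetical names, not dm-crypt
functions. Swapping the two special cases keeps the mapping
one-to-one, so no plaintext becomes unrepresentable.

```python
import random


def make_toy_cipher(bits=8, seed=1):
    """Return (enc, dec) lookup tables for a random permutation."""
    rng = random.Random(seed)
    enc = list(range(1 << bits))
    rng.shuffle(enc)
    dec = [0] * len(enc)
    for plain, cipher in enumerate(enc):
        dec[cipher] = plain
    return enc, dec


def bypass_encrypt(enc, dec, plain):
    if plain == 0:
        return 0            # zero data -> zero data
    if plain == dec[0]:
        return enc[0]       # decrypted zero data -> encrypted zero data
    return enc[plain]       # others (doesn't change)


def bypass_decrypt(enc, dec, cipher):
    if cipher == 0:
        return 0
    if cipher == enc[0]:
        return dec[0]
    return dec[cipher]


enc, dec = make_toy_cipher()
outputs = {bypass_encrypt(enc, dec, p) for p in range(256)}
round_trip_ok = all(
    bypass_decrypt(enc, dec, bypass_encrypt(enc, dec, p)) == p
    for p in range(256)
)
```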

We'd like to hear from the community for suggestions first, before we
start. :)

Thanks,
Kuai



* Re: [discussion] proposal to bypass zero data for dm-crypt
  2024-12-21  2:34 [discussion] proposal to bypass zero data for dm-crypt Yu Kuai
@ 2025-01-03 16:25 ` Mikulas Patocka
  2025-01-05 20:54   ` Milan Broz
  2025-01-07  2:04 ` James Bottomley
  1 sibling, 1 reply; 7+ messages in thread
From: Mikulas Patocka @ 2025-01-03 16:25 UTC (permalink / raw)
  To: Milan Broz
  Cc: Alasdair Kergon, Mike Snitzer, dm-devel, Yu Kuai, lkml,
	yukuai (C)

Milan, what do you think about this from a cryptographic point of view? 
Does it make sense to add an option that would detect zero data and skip 
decryption in this case?

Mikulas

On Sat, 21 Dec 2024, Yu Kuai wrote:

> Background
> 
> We provide virtual machines for customers to use, which include an important
> feature: in the initial state, the disks in the virtual machine do not occupy
> actual storage space, and the data read by users is all zeros until the user
> writes data for the first time. This can save a large amount of storage.
> 
> Problem
> 
> However, after introducing dm-crypt, this feature has failed. Because we
> expect the data read by users in the initial state to be zero, we have to
> write all zeros from dm-crypt.
> 
> Hence we'd like to propose to bypass zero data for dm-crypt, for
> example:
> 
> before:
> zero data -> encrypted zero data
> decrypted zero data -> zero data
> others
> 
> after:
> zero data -> zero data
> decrypted zero data -> encrypted zero data
> others(doesn't change)
> 
> We'd like to hear from the community for suggestions first, before we
> start. :)
> 
> Thanks,
> Kuai
> 



* Re: [discussion] proposal to bypass zero data for dm-crypt
  2025-01-03 16:25 ` Mikulas Patocka
@ 2025-01-05 20:54   ` Milan Broz
  2025-01-06  1:43     ` Yu Kuai
  0 siblings, 1 reply; 7+ messages in thread
From: Milan Broz @ 2025-01-05 20:54 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Alasdair Kergon, Mike Snitzer, dm-devel, Yu Kuai, lkml,
	yukuai (C)


On 1/3/25 5:25 PM, Mikulas Patocka wrote:
> Milan, what do you think about this from a cryptographic point of view?
> Does it make sense to add an option that would detect zero data and skip
> decryption in this case?

It is a very dangerous thing.

Disk encryption is a length-preserving encryption, so it cannot prevent
decryption of modified ciphertext. However, such ciphertext modification
(without key knowledge) will cause a pseudorandom plaintext output
(IOW attacker cannot easily flip bits or whole sectors by ciphertext
modification).

If you allow the zeroed sector to transform to valid plaintext directly,
the attacker can wipe an arbitrary plaintext sector. It can lead to fatal
issues (for example, wiping filesystem metadata bitmaps at some known
location).
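
The difference can be sketched in Python. This is a toy model under
stated assumptions, not dm-crypt's cipher: a 4-round Feistel network
built from SHAKE-256 stands in for a length-preserving sector cipher,
and toy_encrypt, toy_decrypt, read_sector and the zero_bypass flag are
hypothetical names. An attacker without the key overwrites a ciphertext
sector with zeros; without the bypass the reader gets garbage, with the
bypass the reader gets a clean, attacker-chosen all-zero sector.

```python
import hashlib

SECTOR = 512
HALF = SECTOR // 2


def _prf(key, rnd, sector_no, data):
    # Keyed round function, expanded to half a sector.
    h = hashlib.shake_256()
    h.update(key + bytes([rnd]) + sector_no.to_bytes(8, "little") + data)
    return h.digest(HALF)


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key, sector_no, plain):
    left, right = plain[:HALF], plain[HALF:]
    for rnd in range(4):
        left, right = right, _xor(left, _prf(key, rnd, sector_no, right))
    return left + right


def toy_decrypt(key, sector_no, cipher):
    left, right = cipher[:HALF], cipher[HALF:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _prf(key, rnd, sector_no, left)), left
    return left + right


def read_sector(key, sector_no, cipher, zero_bypass):
    # Hypothetical bypass: an all-zero ciphertext is returned as zeros.
    if zero_bypass and cipher == bytes(SECTOR):
        return bytes(SECTOR)
    return toy_decrypt(key, sector_no, cipher)


key = b"k" * 32
metadata = b"allocation bitmap".ljust(SECTOR, b"\xff")
stored = toy_encrypt(key, 7, metadata)

tampered = bytes(SECTOR)  # attacker zeroes the sector without the key
without_bypass = read_sector(key, 7, tampered, zero_bypass=False)
with_bypass = read_sector(key, 7, tampered, zero_bypass=True)
```

In this model the attacker destroys the sector either way, but only
with the bypass is the resulting plaintext a known value.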

Stack FDE (dm-crypt) below the filesystem or other storage layer
(like thin provision) that supports sparse data, and you will get
the expected behavior without such tricks.

Milan



* Re: [discussion] proposal to bypass zero data for dm-crypt
  2025-01-05 20:54   ` Milan Broz
@ 2025-01-06  1:43     ` Yu Kuai
  2025-01-06  9:09       ` Milan Broz
  0 siblings, 1 reply; 7+ messages in thread
From: Yu Kuai @ 2025-01-06  1:43 UTC (permalink / raw)
  To: Milan Broz, Mikulas Patocka
  Cc: Alasdair Kergon, Mike Snitzer, dm-devel, Yu Kuai, lkml,
	yukuai (C)

Hi,

On 2025/01/06 4:54, Milan Broz wrote:
> 
> On 1/3/25 5:25 PM, Mikulas Patocka wrote:
>> Milan, what do you think about this from a cryptographic point of view?
>> Does it make sense to add an option that would detect zero data and skip
>> decryption in this case?
> 
> It is a very dangerous thing.
> 
> Disk encryption is a length-preserving encryption, so it cannot prevent
> decryption of modified ciphertext. However, such ciphertext modification
> (without key knowledge) will cause a pseudorandom plaintext output
> (IOW attacker cannot easily flip bits or whole sectors by ciphertext
> modification).
> 
> If you allow the zeroed sector to transform to valid plaintext directly,
> the attacker can wipe an arbitrary plaintext sector. It can lead to fatal
> issues (for example, wiping filesystem metadata bitmaps at some known
> location).

Will there be a difference whether the attacker wipes the data to zero
data or to random data? And AFAIK, for this case, should the user
consider dm-integrity to prevent such an attack?

> 
> Stack FDE (dm-crypt) below the filesystem or other storage layer
> (like thin provision) that supports sparse data, and you will get
> the expected behavior without such tricks.

All we want to do is to offer an additional option for users, to enable
dm-crypt or not. And if we stack dm-crypt below our storage layer, then
all users will have to use dm-crypt. In order to prevent that, the
storage layer will have to be much more complex, and it will be
impossible to perform a hot upgrade without affecting existing use
cases. :(

Thanks,
Kuai


* Re: [discussion] proposal to bypass zero data for dm-crypt
  2025-01-06  1:43     ` Yu Kuai
@ 2025-01-06  9:09       ` Milan Broz
  2025-01-06  9:39         ` Yu Kuai
  0 siblings, 1 reply; 7+ messages in thread
From: Milan Broz @ 2025-01-06  9:09 UTC (permalink / raw)
  To: Yu Kuai, Mikulas Patocka
  Cc: Alasdair Kergon, Mike Snitzer, dm-devel, lkml, yukuai (C)

On 1/6/25 2:43 AM, Yu Kuai wrote:
>> On 1/3/25 5:25 PM, Mikulas Patocka wrote:
>>> Milan, what do you think about this from a cryptographic point of view?
>>> Does it make sense to add an option that would detect zero data and skip
>>> decryption in this case?
>>
>> It is a very dangerous thing.
>>
>> Disk encryption is a length-preserving encryption, so it cannot prevent
>> decryption of modified ciphertext. However, such ciphertext modification
>> (without key knowledge) will cause a pseudorandom plaintext output
>> (IOW attacker cannot easily flip bits or whole sectors by ciphertext
>> modification).
>>
>> If you allow the zeroed sector to transform to valid plaintext directly,
>> the attacker can wipe an arbitrary plaintext sector. It can lead to fatal
>> issues (for example, wiping filesystem metadata bitmaps at some known
>> location).
> 
> Will there be a difference whether the attacker wipes the data to zero
> data or to random data? And AFAIK, for this case, should the user
> consider dm-integrity to prevent such an attack?

I think I just explained this - you can directly set specific data with
zeroed plaintext. With pseudorandom decrypted data, you can only destroy
it and hope it will do something useful.

(I did not mention side channels as "decryption" will be much faster.)

With such a feature, it will not be full disk encryption but a weakened variant.
Even if it is ok for your threat model, someone can later use it improperly.

Just use encryption that suits your intended use case (perhaps filesystem
encryption or properly configure your storage stack, dunno).

>> Stack FDE (dm-crypt) below the filesystem or other storage layer
>> (like thin provision) that supports sparse data, and you will get
>> the expected behavior without such tricks.
> 
> All we want to do is to offer an additional option for users, to enable
> dm-crypt or not. And if we stack dm-crypt below our storage layer, then
> all users will have to use dm-crypt. In order to prevent that, the
> storage layer will have to be much more complex, and it will be
> impossible to perform a hot upgrade without affecting existing use
> cases. :(

This is not a reason for weakening encryption in dm-crypt.
You can always have a virtual volume per user that is optionally encrypted.

Milan


* Re: [discussion] proposal to bypass zero data for dm-crypt
  2025-01-06  9:09       ` Milan Broz
@ 2025-01-06  9:39         ` Yu Kuai
  0 siblings, 0 replies; 7+ messages in thread
From: Yu Kuai @ 2025-01-06  9:39 UTC (permalink / raw)
  To: Milan Broz, Yu Kuai, Mikulas Patocka
  Cc: Alasdair Kergon, Mike Snitzer, dm-devel, lkml, yukuai (C)

Hi,

On 2025/01/06 17:09, Milan Broz wrote:
> On 1/6/25 2:43 AM, Yu Kuai wrote:
>>> On 1/3/25 5:25 PM, Mikulas Patocka wrote:
>>>> Milan, what do you think about this from a cryptographic point of view?
>>>> Does it make sense to add an option that would detect zero data and 
>>>> skip
>>>> decryption in this case?
>>>
>>> It is a very dangerous thing.
>>>
>>> Disk encryption is a length-preserving encryption, so it cannot prevent
>>> decryption of modified ciphertext. However, such ciphertext modification
>>> (without key knowledge) will cause a pseudorandom plaintext output
>>> (IOW attacker cannot easily flip bits or whole sectors by ciphertext
>>> modification).
>>>
>>> If you allow the zeroed sector to transform to valid plaintext directly,
>>> the attacker can wipe an arbitrary plaintext sector. It can lead to fatal
>>> issues (for example, wiping filesystem metadata bitmaps at some known
>>> location).
>>
>> Will there be a difference whether the attacker wipes the data to zero
>> data or to random data? And AFAIK, for this case, should the user
>> consider dm-integrity to prevent such an attack?
> 
> I think I just explained this - you can directly set specific data with
> zeroed plaintext. With pseudorandom decrypted data, you can only destroy
> it and hope it will do something useful.

Ok, I got it, the difference is the possibility, thanks for the
explanation.

> 
> (I did not mention side channels as "decryption" will be much faster.)
> 
> With such a feature, it will not be full disk encryption but a weakened 
> variant.
> Even if it is ok for your threat model, someone can later use it 
> improperly.
> 
> Just use encryption that suits your intended use case (perhaps filesystem
> encryption or properly configure your storage stack, dunno).
>>> Stack FDE (dm-crypt) below the filesystem or other storage layer
>>> (like thin provision) that supports sparse data, and you will get
>>> the expected behavior without such tricks.
>>
>> All we want to do is to offer an additional option for users, to enable
>> dm-crypt or not. And if we stack dm-crypt below our storage layer, then
>> all users will have to use dm-crypt. In order to prevent that, the
>> storage layer will have to be much more complex, and it will be
>> impossible to perform a hot upgrade without affecting existing use
>> cases. :(
> 
> This is not a reason for weakening encryption in dm-crypt.
> You can always have a virtual volume per user that is optionally encrypted.

Yes, and we'll evaluate the risk in our use cases and then decide whether
we still want to do this downstream, via an additional option that the
user must enable manually.

Thanks,
Kuai

* Re: [discussion] proposal to bypass zero data for dm-crypt
  2024-12-21  2:34 [discussion] proposal to bypass zero data for dm-crypt Yu Kuai
  2025-01-03 16:25 ` Mikulas Patocka
@ 2025-01-07  2:04 ` James Bottomley
  1 sibling, 0 replies; 7+ messages in thread
From: James Bottomley @ 2025-01-07  2:04 UTC (permalink / raw)
  To: Yu Kuai, Alasdair Kergon, Mike Snitzer, Mikulas Patocka, dm-devel,
	lkml, yukuai (C)

On Sat, 2024-12-21 at 10:34 +0800, Yu Kuai wrote:
> Background
> 
> We provide virtual machines for customers to use, which include an 
> important feature: in the initial state, the disks in the virtual 
> machine do not occupy actual storage space, and the data read by
> users is all zeros until the user writes data for the first time.
> This can save a large amount of storage.
> 
> Problem
> 
> However, after introducing dm-crypt, this feature has failed. Because
> we expect the data read by users in the initial state to be zero, we
> have to write all zeros from dm-crypt.

Why do you expect the user to read all zeros?  For DM crypt on a new
physical disk, we don't set the disk contents to an initial value
because there's no expectation on the part of the user that they can
read a sector they haven't written and get a sensible result instead of
the dm-crypt of whatever the sector contained.  Why can't your cloud
system just behave like a physical disk?

Even for an unencrypted physical disk, there's no expectation of any
particular value being in a sector, so why do your users have the
expectation of all zeros?

Regards,

James

