[QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss

linux-fsdevel.vger.kernel.org archive mirror

* [QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss
@ 2025-05-28  8:07 Zizhi Wo
  2025-05-28  8:35 ` Gao Xiang
  0 siblings, 1 reply; 5+ messages in thread
From: Zizhi Wo @ 2025-05-28  8:07 UTC
  To: netfs, dhowells, jlayton, brauner
  Cc: hsiangkao, jefflexu, zhujia.zj, linux-erofs, linux-fsdevel,
	linux-kernel, wozizhi, libaokun1, yangerkun, houtao1, yukuai3

Currently, in on-demand loading mode, cachefiles first calls
cachefiles_create_tmpfile() to generate a tmpfile, and only during the exit
process does it call cachefiles_commit_object->cachefiles_commit_tmpfile to
create the actual dentry and make it visible to users.

If the cache write is interrupted unexpectedly (e.g., by system crash or
power loss), during the next startup process, cachefiles_look_up_object()
will determine that no corresponding dentry has been generated and will
recreate the tmpfile and pull the complete data again!
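
A toy model of the behaviour described above may help make it concrete. This
is only a sketch, not kernel code: `ToyCache` and its methods are hypothetical
names, the `visible` dictionary stands in for dentries linked into the cache
directory, and `pending` stands in for tmpfiles that have no name yet.

```python
class ToyCache:
    """Toy model of commit-on-close caching (illustrative, not kernel code)."""

    def __init__(self):
        self.visible = {}  # name -> bytes; stands in for linked dentries
        self.pending = {}  # name -> bytearray; stands in for unnamed tmpfiles

    def begin(self, name):
        # Analogous to creating a tmpfile: the data has no name on disk yet.
        self.pending[name] = bytearray()

    def write_chunk(self, name, chunk):
        self.pending[name] += chunk

    def commit(self, name):
        # Analogous to linking the tmpfile into the cache directory.
        self.visible[name] = bytes(self.pending.pop(name))

    def crash(self):
        # Files that were never linked do not survive a reboot.
        self.pending.clear()

    def look_up(self, name):
        # Returns cached bytes, or None meaning "pull everything again".
        return self.visible.get(name)


cache = ToyCache()
cache.begin("image.blob")
cache.write_chunk("image.blob", b"x" * 4096)
cache.crash()                       # power loss before commit
lost = cache.look_up("image.blob")  # nothing was committed, so this is None
```

In this model the amount of work lost on a crash grows with the size of the
file, because nothing becomes visible until the final commit.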

The current implementation mechanism appears to provide per-file atomicity.
For scenarios involving large image files (where a significant amount of
cache data needs to be written), this re-pulling process after an
interruption seems like considerable overhead.

In previous kernel versions, cache dentries were generated during the
LOOK_UP_OBJECT process of the object state machine. Even if power was lost
midway, the next startup process could continue pulling data based on the
previously downloaded cache data on disk.
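
The difference in re-fetch cost can be sketched with a small calculation.
This is an illustration under stated assumptions, not how cachefiles tracks
data: `chunks_to_fetch` is a hypothetical helper, the file is assumed to be
split into 8 equal chunks, and the set of chunks already on disk is assumed
to be known and durable after the reboot.

```python
def chunks_to_fetch(total_chunks, present):
    """Return the chunk indices still missing from the cache file."""
    return [i for i in range(total_chunks) if i not in present]


# Commit-on-close: nothing survives the crash, so every chunk is fetched again.
after_crash_tmpfile = chunks_to_fetch(8, set())

# Early-visible file: chunks 0..4 were already on disk when power was lost.
after_crash_visible = chunks_to_fetch(8, {0, 1, 2, 3, 4})
```

A real resume path would also need a reliable way to tell which ranges were
written completely before the crash; the sketch simply assumes that
information is available.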

What would be the recommended way to handle this situation? Or am I
thinking about this incorrectly? Would appreciate any feedback and guidance
from the community.

Thanks,
Zizhi Wo


* Re: [QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss
  2025-05-28  8:07 [QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss Zizhi Wo
@ 2025-05-28  8:35 ` Gao Xiang
  2025-05-28  8:53   ` Zizhi Wo
  0 siblings, 1 reply; 5+ messages in thread
From: Gao Xiang @ 2025-05-28  8:35 UTC
  To: Zizhi Wo, netfs, dhowells, jlayton, brauner
  Cc: jefflexu, zhujia.zj, linux-erofs, linux-fsdevel, linux-kernel,
	wozizhi, libaokun1, yangerkun, houtao1, yukuai3

Hi Zizhi,

On 2025/5/28 16:07, Zizhi Wo wrote:
> Currently, in on-demand loading mode, cachefiles first calls
> cachefiles_create_tmpfile() to generate a tmpfile, and only during the exit
> process does it call cachefiles_commit_object->cachefiles_commit_tmpfile to
> create the actual dentry and make it visible to users.
> 
> If the cache write is interrupted unexpectedly (e.g., by system crash or
> power loss), during the next startup process, cachefiles_look_up_object()
> will determine that no corresponding dentry has been generated and will
> recreate the tmpfile and pull the complete data again!
> 
> The current implementation mechanism appears to provide per-file atomicity.
> For scenarios involving large image files (where a significant amount of
> cache data needs to be written), this re-pulling process after an
> interruption seems like considerable overhead.
> 
> In previous kernel versions, cache dentries were generated during the
> LOOK_UP_OBJECT process of the object state machine. Even if power was lost
> midway, the next startup process could continue pulling data based on the
> previously downloaded cache data on disk.
> 
> What would be the recommended way to handle this situation? Or am I
> thinking about this incorrectly? Would appreciate any feedback and guidance
> from the community.

As you can see, the EROFS fscache feature was marked as deprecated
since pre-content hooks already support the same use case.

The EROFS fscache support will be removed after I make
pre-content hooks work in erofs-utils, which will take some time
because I currently don't have enough time to work on
community stuff.

Thanks,
Gao Xiang



* Re: [QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss
  2025-05-28  8:35 ` Gao Xiang
@ 2025-05-28  8:53   ` Zizhi Wo
  2025-05-28  9:25     ` Gao Xiang
  0 siblings, 1 reply; 5+ messages in thread
From: Zizhi Wo @ 2025-05-28  8:53 UTC
  To: Gao Xiang, Zizhi Wo, netfs, dhowells, jlayton, brauner
  Cc: jefflexu, zhujia.zj, linux-erofs, linux-fsdevel, linux-kernel,
	libaokun1, yangerkun, houtao1, yukuai3



On 2025/5/28 16:35, Gao Xiang wrote:
> As you can see, the EROFS fscache feature was marked as deprecated
> since pre-content hooks already support the same use case.
> 
> The EROFS fscache support will be removed after I make
> pre-content hooks work in erofs-utils, which will take some time
> because I currently don't have enough time to work on
> community stuff.
> 
> Thanks,
> Gao Xiang

Thanks for your reply.

Indeed, the subsequent implementations have moved to using fanotify.
Moreover, based on evaluation, this approach could indeed lead to
performance improvements.

However, in our current use case, we are still working with a kernel
version that only supports the fscache-based approach, so this issue
still exists for us. :(

Thanks,
Zizhi Wo




* Re: [QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss
  2025-05-28  8:53   ` Zizhi Wo
@ 2025-05-28  9:25     ` Gao Xiang
  2025-05-28  9:58       ` Zizhi Wo
  0 siblings, 1 reply; 5+ messages in thread
From: Gao Xiang @ 2025-05-28  9:25 UTC
  To: Zizhi Wo, netfs, dhowells, jlayton, brauner
  Cc: jefflexu, zhujia.zj, linux-erofs, linux-fsdevel, linux-kernel,
	libaokun1, yangerkun, houtao1, yukuai3



On 2025/5/28 16:53, Zizhi Wo wrote:
> Thanks for your reply.
> 
> Indeed, the subsequent implementations have moved to using fanotify.
> Moreover, based on evaluation, this approach could indeed lead to
> performance improvements.
> 
> However, in our current use case, we are still working with a kernel
> version that only supports the fscache-based approach, so this issue
> still exists for us. :(

Since it's deprecated (because that fscache improvement would
take a very long time to upstream and the netfs dependency is
redundant given the new pre-content hooks), could you
improve it downstream directly?

Or, if you have a simple proposal, you could post it. No one
is preventing you from using fscache downstream, but pre-content
hooks seem cleaner for this use case.

Thanks,
Gao Xiang





* Re: [QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss
  2025-05-28  9:25     ` Gao Xiang
@ 2025-05-28  9:58       ` Zizhi Wo
  0 siblings, 0 replies; 5+ messages in thread
From: Zizhi Wo @ 2025-05-28  9:58 UTC
  To: Gao Xiang, Zizhi Wo, netfs, dhowells, jlayton, brauner
  Cc: jefflexu, zhujia.zj, linux-erofs, linux-fsdevel, linux-kernel,
	libaokun1, yangerkun, houtao1, yukuai3



On 2025/5/28 17:25, Gao Xiang wrote:
> Since it's deprecated (because that fscache improvement would
> take a very long time to upstream and the netfs dependency is
> redundant given the new pre-content hooks), could you
> improve it downstream directly?
> 
> Or, if you have a simple proposal, you could post it. No one
> is preventing you from using fscache downstream, but pre-content
> hooks seem cleaner for this use case.
> 
> Thanks,
> Gao Xiang
> 

I understand. We'll have some internal discussions to explore possible
solutions. Our initial intention in bringing this up was also to hear
whether the community has any suggestions or different perspectives on
this issue that could help guide our future improvements (even though
fanotify is the future direction and fscache appears to be deprecated).
For example, your mention of fanotify was quite insightful.

Thanks,
Zizhi Wo



Thread overview: 5+ messages
2025-05-28  8:07 [QUESTION] cachefiles: Recovery concerns with on-demand loading after unexpected power loss Zizhi Wo
2025-05-28  8:35 ` Gao Xiang
2025-05-28  8:53   ` Zizhi Wo
2025-05-28  9:25     ` Gao Xiang
2025-05-28  9:58       ` Zizhi Wo
