* [PATCH] zram: panic when use ext4 over zram
@ 2024-11-29 11:57 caiqingfu
2024-11-30 5:32 ` Sergey Senozhatsky
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: caiqingfu @ 2024-11-29 11:57 UTC (permalink / raw)
To: akpm, senozhatsky, minchan; +Cc: mm-commits, caiqingfu
From: caiqingfu <caiqingfu@ruijie.com.cn>
[ 52.073080 ] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 52.073511 ] Modules linked in:
[ 52.074094 ] CPU: 0 UID: 0 PID: 3825 Comm: a.out Not tainted 6.12.0-07749-g28eb75e178d3-dirty #3
[ 52.074672 ] Hardware name: linux,dummy-virt (DT)
[ 52.075128 ] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 52.075619 ] pc : obj_malloc+0x5c/0x160
[ 52.076402 ] lr : zs_malloc+0x200/0x570
[ 52.076630 ] sp : ffff80008dd335f0
[ 52.076797 ] x29: ffff80008dd335f0 x28: ffff000004104a00 x27: ffff000004dfc400
[ 52.077319 ] x26: 000000000000ca18 x25: ffff00003fcaf0e0 x24: ffff000006925cf0
[ 52.077785 ] x23: 0000000000000c0a x22: ffff0000032ee780 x21: ffff000006925cf0
[ 52.078257 ] x20: 0000000000088000 x19: 0000000000000000 x18: 0000000000fffc18
[ 52.078701 ] x17: 00000000fffffffd x16: 0000000000000803 x15: 00000000fffffffe
[ 52.079203 ] x14: 000000001824429d x13: ffff000006e84000 x12: ffff000006e83fec
[ 52.079711 ] x11: ffff000006e83000 x10: 00000000000002a5 x9 : ffff000006e83ff3
[ 52.080269 ] x8 : 0000000000000001 x7 : 0000000017e80000 x6 : 0000000000017e80
[ 52.080724 ] x5 : 0000000000000003 x4 : ffff00000402a5e8 x3 : 0000000000000066
[ 52.081081 ] x2 : ffff000006925cf0 x1 : ffff00000402a5e8 x0 : ffff000004104a00
[ 52.081595 ] Call trace:
[ 52.081925 ] obj_malloc+0x5c/0x160 (P)
[ 52.082220 ] zs_malloc+0x200/0x570 (L)
[ 52.082504 ] zs_malloc+0x200/0x570
[ 52.082716 ] zram_submit_bio+0x788/0x9e8
[ 52.083017 ] __submit_bio+0x1c4/0x338
[ 52.083343 ] submit_bio_noacct_nocheck+0x128/0x2c0
[ 52.083518 ] submit_bio_noacct+0x1c8/0x308
[ 52.083722 ] submit_bio+0xa8/0x14c
[ 52.083942 ] submit_bh_wbc+0x140/0x1bc
[ 52.084088 ] __block_write_full_folio+0x23c/0x5f0
[ 52.084232 ] block_write_full_folio+0x134/0x21c
[ 52.084524 ] write_cache_pages+0x64/0xd4
[ 52.084778 ] blkdev_writepages+0x50/0x8c
[ 52.085040 ] do_writepages+0x80/0x2b0
[ 52.085292 ] filemap_fdatawrite_wbc+0x6c/0x90
[ 52.085597 ] __filemap_fdatawrite_range+0x64/0x94
[ 52.085900 ] filemap_fdatawrite+0x1c/0x28
[ 52.086158 ] sync_bdevs+0x170/0x17c
[ 52.086374 ] ksys_sync+0x6c/0xb8
[ 52.086597 ] __arm64_sys_sync+0x10/0x20
[ 52.086847 ] invoke_syscall+0x44/0x100
[ 52.087230 ] el0_svc_common.constprop.0+0x40/0xe0
[ 52.087550 ] do_el0_svc+0x1c/0x28
[ 52.087690 ] el0_svc+0x30/0xd0
[ 52.087818 ] el0t_64_sync_handler+0xc8/0xcc
[ 52.088046 ] el0t_64_sync+0x198/0x19c
[ 52.088500 ] Code: 110004a5 6b0500df f9401273 54000160 (f9401664)
[ 52.089097 ] ---[ end trace 0000000000000000 ]---
When using ext4 on zram, the panic above occasionally occurs under
high memory usage.

The reason is that when the handle is obtained using the slow path, the
page is compressed again. If the data in the page has changed, the new
compressed length may exceed the previous one. The write to zs_object
then overflows, which causes the panic.

To reproduce it quickly, comment out the fast path to force the slow
path and run a heavy file system read/write load.

The solution is to re-obtain the handle after re-compression if the
length is different from the previous one.
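
The hazard and the proposed check can be modelled in userspace. What follows is only a sketch under stated assumptions, not the zram code: toy_compress(), struct slot, slot_alloc() and slot_store_checked() are hypothetical names, and a run-length encoder stands in for the real compressor. The slot is sized by a first compression pass, the page then changes, and the second pass is only copied in after its length has been compared with the slot size.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE 64 /* hypothetical page size for the model */

/* Hypothetical stand-in for a compressor: run-length encodes src into dst.
 * Returns the encoded length, or 0 if dst is too small (or n is 0). */
static size_t toy_compress(const unsigned char *src, size_t n,
                           unsigned char *dst, size_t dst_cap)
{
    size_t out = 0;

    for (size_t i = 0; i < n; ) {
        size_t run = 1;

        while (i + run < n && src[i + run] == src[i] && run < 255)
            run++;
        if (out + 2 > dst_cap)
            return 0;
        dst[out++] = (unsigned char)run; /* run length */
        dst[out++] = src[i];             /* repeated byte */
        i += run;
    }
    return out;
}

/* Hypothetical model of an object allocated for one compressed length. */
struct slot {
    unsigned char *mem;
    size_t size;
};

static int slot_alloc(struct slot *s, size_t len)
{
    s->mem = malloc(len);
    s->size = s->mem ? len : 0;
    return s->mem ? 0 : -1;
}

/* Second compression pass: the slot was sized by an earlier pass, so the
 * new length is compared with the slot size before anything is copied.
 * On a mismatch the old allocation is dropped and a new one is obtained. */
static int slot_store_checked(struct slot *s, const unsigned char *page)
{
    unsigned char scratch[2 * TOY_PAGE]; /* worst case for the toy encoder */
    size_t len = toy_compress(page, TOY_PAGE, scratch, sizeof(scratch));

    if (len == 0)
        return -1;
    if (len != s->size) {
        free(s->mem);
        if (slot_alloc(s, len))
            return -1;
    }
    memcpy(s->mem, scratch, len); /* cannot overrun: s->size == len */
    return 0;
}
```

Without the length comparison, the memcpy() in the model would write past a slot that was sized for the earlier, shorter result, which is the overflow described above.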
Signed-off-by: caiqingfu <caiqingfu@ruijie.com.cn>
---
drivers/block/zram/zram_drv.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 3dee026988dc..0ca6d55c9917 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1633,6 +1633,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
unsigned long alloced_pages;
unsigned long handle = -ENOMEM;
unsigned int comp_len = 0;
+ unsigned int last_comp_len = 0;
void *src, *dst, *mem;
struct zcomp_strm *zstrm;
unsigned long element = 0;
@@ -1664,6 +1665,11 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
if (comp_len >= huge_class_size)
comp_len = PAGE_SIZE;
+
+ if (last_comp_len && (last_comp_len != comp_len)) {
+ zs_free(zram->mem_pool, handle);
+ handle = (unsigned long)ERR_PTR(-ENOMEM);
+ }
/*
* handle allocation has 2 paths:
* a) fast path is executed with preemption disabled (for
@@ -1692,8 +1698,10 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
if (IS_ERR_VALUE(handle))
return PTR_ERR((void *)handle);
- if (comp_len != PAGE_SIZE)
+ if (comp_len != PAGE_SIZE) {
+ last_comp_len = comp_len;
goto compress_again;
+ }
/*
* If the page is not compressible, you need to acquire the
* lock and execute the code below. The zcomp_stream_get()
--
2.25.1
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-11-29 11:57 [PATCH] zram: panic when use ext4 over zram caiqingfu
@ 2024-11-30 5:32 ` Sergey Senozhatsky
  2024-12-02 6:06 ` caiqingfu
  2024-12-02 10:07 ` caiqingfu
  2024-12-11 11:04 ` Sergey Senozhatsky
  2024-12-12 18:40 ` Kees Bakker
  2 siblings, 2 replies; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-11-30 5:32 UTC (permalink / raw)
To: caiqingfu; +Cc: akpm, senozhatsky, minchan, mm-commits, caiqingfu

On (24/11/29 19:57), caiqingfu wrote:
> [ 52.073080 ] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [ 52.073511 ] Modules linked in:
> [ 52.074094 ] CPU: 0 UID: 0 PID: 3825 Comm: a.out Not tainted 6.12.0-07749-g28eb75e178d3-dirty #3
> [ 52.074672 ] Hardware name: linux,dummy-virt (DT)
> [ 52.075128 ] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 52.075619 ] pc : obj_malloc+0x5c/0x160
> [ 52.076402 ] lr : zs_malloc+0x200/0x570
> [ 52.076630 ] sp : ffff80008dd335f0
> [ 52.076797 ] x29: ffff80008dd335f0 x28: ffff000004104a00 x27: ffff000004dfc400
> [ 52.077319 ] x26: 000000000000ca18 x25: ffff00003fcaf0e0 x24: ffff000006925cf0
> [ 52.077785 ] x23: 0000000000000c0a x22: ffff0000032ee780 x21: ffff000006925cf0
> [ 52.078257 ] x20: 0000000000088000 x19: 0000000000000000 x18: 0000000000fffc18
> [ 52.078701 ] x17: 00000000fffffffd x16: 0000000000000803 x15: 00000000fffffffe
> [ 52.079203 ] x14: 000000001824429d x13: ffff000006e84000 x12: ffff000006e83fec
> [ 52.079711 ] x11: ffff000006e83000 x10: 00000000000002a5 x9 : ffff000006e83ff3
> [ 52.080269 ] x8 : 0000000000000001 x7 : 0000000017e80000 x6 : 0000000000017e80
> [ 52.080724 ] x5 : 0000000000000003 x4 : ffff00000402a5e8 x3 : 0000000000000066
> [ 52.081081 ] x2 : ffff000006925cf0 x1 : ffff00000402a5e8 x0 : ffff000004104a00
> [ 52.081595 ] Call trace:
> [ 52.081925 ] obj_malloc+0x5c/0x160 (P)
> [ 52.082220 ] zs_malloc+0x200/0x570 (L)
> [ 52.082504 ] zs_malloc+0x200/0x570
> [ 52.082716 ] zram_submit_bio+0x788/0x9e8
> [ 52.083017 ] __submit_bio+0x1c4/0x338
> [ 52.083343 ] submit_bio_noacct_nocheck+0x128/0x2c0
> [ 52.083518 ] submit_bio_noacct+0x1c8/0x308
> [ 52.083722 ] submit_bio+0xa8/0x14c
> [ 52.083942 ] submit_bh_wbc+0x140/0x1bc
> [ 52.084088 ] __block_write_full_folio+0x23c/0x5f0
> [ 52.084232 ] block_write_full_folio+0x134/0x21c
> [ 52.084524 ] write_cache_pages+0x64/0xd4
> [ 52.084778 ] blkdev_writepages+0x50/0x8c
> [ 52.085040 ] do_writepages+0x80/0x2b0
> [ 52.085292 ] filemap_fdatawrite_wbc+0x6c/0x90
> [ 52.085597 ] __filemap_fdatawrite_range+0x64/0x94
> [ 52.085900 ] filemap_fdatawrite+0x1c/0x28
> [ 52.086158 ] sync_bdevs+0x170/0x17c
>
> When using ext4 on zram, the panic above occasionally occurs under
> high memory usage.
>
> The reason is that when the handle is obtained using the slow path, the
> page is compressed again. If the data in the page has changed, the new
> compressed length may exceed the previous one. The write to zs_object
> then overflows, which causes the panic.

I honestly don't know...

What is changing the page under write()? If something is modifying
page's content in parallel with zram's write() then you can't really
use zram, if the content is changing concurrently with zram's compression
then I really don't see how it would be able to decompress it later
and what would it decompress to (some mix of stale and new data?).
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-11-30 5:32 ` Sergey Senozhatsky
@ 2024-12-02 6:06 ` caiqingfu
  2024-12-02 7:04 ` Sergey Senozhatsky
  2024-12-02 10:07 ` caiqingfu
  1 sibling, 1 reply; 16+ messages in thread
From: caiqingfu @ 2024-12-02 6:06 UTC (permalink / raw)
To: senozhatsky; +Cc: akpm, baicaiaichibaicai, caiqingfu, minchan, mm-commits

> I honestly don't know...
>
> What is changing the page under write()? If something is modifying
> page's content in parallel with zram's write() then you can't really
> use zram, if the content is changing concurrently with zram's compression
> then I really don't see how it would be able to decompress it later
> and what would it decompress to (some mix of stale and new data?).

I am not familiar with file systems and lzro, so I dumped the page owner of
the modified page. The allocation stack of most of the pages is as follows:

[ 46.560092] page_owner tracks the page as allocated
[ 46.560101] page last allocated via order 0, migratetype Movable, gfp_mask 0x148c48(GFP_NOFS|__GFP_NOFAIL|__GFP_COMP|__GFP_HARDWALL|__GFP_MOVABLE), pid 1769, tgid 1372 (a.out), ts 45588110160, free_ts 45123127760
[ 46.560126] prep_new_page+0xa8/0x10c
[ 46.560143] get_page_from_freelist+0xa44/0x16d0
[ 46.560160] __alloc_pages_noprof+0x150/0x290
[ 46.560177] alloc_pages_mpol_noprof+0x88/0x23c
[ 46.560195] alloc_pages_noprof+0x4c/0x7c
[ 46.560234] folio_alloc_noprof+0x14/0x64
[ 46.560254] filemap_alloc_folio_noprof+0x100/0x14c
[ 46.560272] __filemap_get_folio+0x21c/0x38c
[ 46.560290] bdev_getblk+0xd0/0x2b4
[ 46.560304] __ext4_get_inode_loc+0x11c/0x53c
[ 46.560320] ext4_get_inode_loc+0x44/0xb0
[ 46.560337] ext4_reserve_inode_write+0x40/0xf0
[ 46.560354] __ext4_mark_inode_dirty+0x4c/0x1f0
[ 46.560370] ext4_ext_tree_init+0x40/0x4c
[ 46.560388] __ext4_new_inode+0x790/0x13b8
[ 46.560406] ext4_create+0xdc/0x1d0

and others like this:

[ 47.683962] page_owner tracks the page as allocated
[ 47.684012] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x148c40(GFP_NOFS|__GFP_NOFAIL|__GFP_COMP|__GFP_HARDWALL), pid 1309, tgid 970 (a.out), ts 44091572544, free_ts 0
[ 47.684056] prep_new_page+0xa8/0x10c
[ 47.684090] get_page_from_freelist+0xa44/0x16d0
[ 47.684109] __alloc_pages_noprof+0x150/0x290
[ 47.684127] alloc_pages_mpol_noprof+0x88/0x23c
[ 47.684148] alloc_pages_noprof+0x4c/0x7c
[ 47.684166] folio_alloc_noprof+0x14/0x64
[ 47.684184] filemap_alloc_folio_noprof+0x100/0x14c
[ 47.684205] __filemap_get_folio+0x21c/0x38c
[ 47.684223] bdev_getblk+0xd0/0x2b4
[ 47.684240] ext4_getblk+0xac/0x300
[ 47.684258] ext4_bread+0x14/0xe4
[ 47.684274] ext4_append+0x90/0x1cc
[ 47.684293] do_split+0x88/0x9c4
[ 47.684307] make_indexed_dir+0x580/0x688
[ 47.684322] ext4_add_entry+0x37c/0x420
[ 47.684337] ext4_add_nondir+0x38/0x11c

It looks like only file system metadata has been modified, but I'm not sure.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-02 6:06 ` caiqingfu
@ 2024-12-02 7:04 ` Sergey Senozhatsky
  0 siblings, 0 replies; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-12-02 7:04 UTC (permalink / raw)
To: caiqingfu; +Cc: senozhatsky, akpm, caiqingfu, minchan, mm-commits

On (24/12/02 14:06), caiqingfu wrote:
> [ 46.560092] page_owner tracks the page as allocated
> [ 46.560101] page last allocated via order 0, migratetype Movable, gfp_mask 0x148c48(GFP_NOFS|__GFP_NOFAIL|__GFP_COMP|__GFP_HARDWALL|__GFP_MOVABLE), pid 1769, tgid 1372 (a.out), ts 45588110160, free_ts 45123127760
> [ 46.560126] prep_new_page+0xa8/0x10c
> [ 46.560143] get_page_from_freelist+0xa44/0x16d0
> [ 46.560160] __alloc_pages_noprof+0x150/0x290
> [ 46.560177] alloc_pages_mpol_noprof+0x88/0x23c
> [ 46.560195] alloc_pages_noprof+0x4c/0x7c
> [ 46.560234] folio_alloc_noprof+0x14/0x64
> [ 46.560254] filemap_alloc_folio_noprof+0x100/0x14c
> [ 46.560272] __filemap_get_folio+0x21c/0x38c
> [ 46.560290] bdev_getblk+0xd0/0x2b4
> [ 46.560304] __ext4_get_inode_loc+0x11c/0x53c
> [ 46.560320] ext4_get_inode_loc+0x44/0xb0
> [ 46.560337] ext4_reserve_inode_write+0x40/0xf0
> [ 46.560354] __ext4_mark_inode_dirty+0x4c/0x1f0
> [ 46.560370] ext4_ext_tree_init+0x40/0x4c
> [ 46.560388] __ext4_new_inode+0x790/0x13b8
> [ 46.560406] ext4_create+0xdc/0x1d0
>
> and others like this:
>
> [ 47.683962] page_owner tracks the page as allocated
> [ 47.684012] page last allocated via order 0, migratetype Unmovable, gfp_mask 0x148c40(GFP_NOFS|__GFP_NOFAIL|__GFP_COMP|__GFP_HARDWALL), pid 1309, tgid 970 (a.out), ts 44091572544, free_ts 0
> [ 47.684056] prep_new_page+0xa8/0x10c
> [ 47.684090] get_page_from_freelist+0xa44/0x16d0
> [ 47.684109] __alloc_pages_noprof+0x150/0x290
> [ 47.684127] alloc_pages_mpol_noprof+0x88/0x23c
> [ 47.684148] alloc_pages_noprof+0x4c/0x7c
> [ 47.684166] folio_alloc_noprof+0x14/0x64
> [ 47.684184] filemap_alloc_folio_noprof+0x100/0x14c
> [ 47.684205] __filemap_get_folio+0x21c/0x38c
> [ 47.684223] bdev_getblk+0xd0/0x2b4
> [ 47.684240] ext4_getblk+0xac/0x300
> [ 47.684258] ext4_bread+0x14/0xe4
> [ 47.684274] ext4_append+0x90/0x1cc
> [ 47.684293] do_split+0x88/0x9c4
> [ 47.684307] make_indexed_dir+0x580/0x688
> [ 47.684322] ext4_add_entry+0x37c/0x420
> [ 47.684337] ext4_add_nondir+0x38/0x11c
>
> It looks like only file system metadata has been modified, but I'm not sure.

What's your use-case, what apps you run and what data is stored
on zram disk?
* [PATCH] zram: panic when use ext4 over zram
  2024-11-30 5:32 ` Sergey Senozhatsky
  2024-12-02 6:06 ` caiqingfu
@ 2024-12-02 10:07 ` caiqingfu
  2024-12-10 9:34 ` Sergey Senozhatsky
  1 sibling, 1 reply; 16+ messages in thread
From: caiqingfu @ 2024-12-02 10:07 UTC (permalink / raw)
To: senozhatsky; +Cc: akpm, baicaiaichibaicai, caiqingfu, minchan, mm-commits

> What's your use-case, what apps you run and what data is stored
> on zram disk?

The steps to reproduce are as follows:

1. force slow path, then re-build kernel

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 0ca6d55c9917..29ac52a4f2e7 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1683,12 +1683,14 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 	 * if we have a 'non-null' handle here then we are coming
 	 * from the slow path and handle has already been allocated.
 	 */
+	/*
 	if (IS_ERR_VALUE(handle))
 		handle = zs_malloc(zram->mem_pool, comp_len,
 				__GFP_KSWAPD_RECLAIM |
 				__GFP_NOWARN |
 				__GFP_HIGHMEM |
 				__GFP_MOVABLE);
+	*/
 	if (IS_ERR_VALUE(handle)) {
 		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
 		atomic64_inc(&zram->stats.writestall);

2. create an arm64 virtual machine

qemu-system-aarch64 -M virt -cpu cortex-a57 -smp 4 -m 1024 -kernel kernel-build/arch/arm64/boot/Image \
	-nographic -append "console=ttyAMA0,115200 root=/dev/vda rw" \
	-drive file=$(IMGFILE),format=raw -nic user,hostfwd=tcp::60022-:22

3. after the virtual machine is started, create zram

cat /sys/class/zram-control/hot_add
echo 524288000 > /sys/devices/virtual/block/zram0/disksize
echo 524288000 > /sys/devices/virtual/block/zram1/disksize
mkfs.ext4 -O ^has_journal -b 4096 -F -L TEMP -m 0 /dev/zram0
mkdir /tmp/zram
mount -t ext4 -o errors=continue,nosuid,nodev,noatime /dev/zram0 /tmp/zram
mkswap /dev/zram1
swapon /dev/zram1
echo 100 > /proc/sys/vm/swappiness
mkdir -p /tmp/zram/stressTest

4. build demo app and run again and again

demo app:

#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define THREAD_NUM 400
#define BUFF_SIZE (128 * 1024)

static const char *basePath = "/tmp/zram/stressTest";
static unsigned char buf[BUFF_SIZE];

/* Each thread appends a random-length tail of buf to its own file, then syncs. */
static void *cb(void *arg)
{
	char path[256];
	unsigned int tid = (unsigned int)syscall(SYS_gettid);
	unsigned int start = ((unsigned int)random()) % (BUFF_SIZE - 1);
	unsigned int delay_us = (((unsigned int)random()) % 10) * 1000;
	FILE *fp;

	(void)arg;
	snprintf(path, sizeof(path), "%s/%u", basePath, tid);
	fp = fopen(path, "a");
	if (NULL == fp) {
		fprintf(stderr, "%u open file '%s' failure '%s'...\n", tid, path, strerror(errno));
		return NULL;
	}
	fwrite(&buf[start], 1, BUFF_SIZE - start, fp);
	fflush(fp);
	fclose(fp);
	sync();
	usleep(delay_us);
	return NULL;
}

int main(void)
{
	pthread_t threads[THREAD_NUM];
	FILE *fp;
	int i;

	fprintf(stderr, "build @ %s\n", __TIME__);
	/* fill the buffer with incompressible data */
	fp = fopen("/dev/urandom", "r");
	if (NULL == fp) {
		perror("/dev/urandom");
		return 1;
	}
	fread(buf, BUFF_SIZE, 1, fp);
	fclose(fp);
	fprintf(stderr, "read urandom size '%d'\n", BUFF_SIZE);
	srandom((unsigned)time(NULL));
	for (i = 0; i < THREAD_NUM; i++) {
		pthread_create(&threads[i], NULL, cb, NULL);
	}
	fprintf(stderr, "create over\n");
	for (i = 0; i < THREAD_NUM; i++) {
		pthread_join(threads[i], NULL);
	}
	return 0;
}

run it:

while true;do rm /tmp/zram/stressTest/* &> /dev/null;./a.out &> /dev/null ;echo again...;done
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-02 10:07 ` caiqingfu
@ 2024-12-10 9:34 ` Sergey Senozhatsky
  0 siblings, 0 replies; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-12-10 9:34 UTC (permalink / raw)
To: caiqingfu; +Cc: senozhatsky, akpm, caiqingfu, minchan, mm-commits

On (24/12/02 18:07), caiqingfu wrote:
> > What's your use-case, what apps you run and what data is stored
> > on zram disk?
>
> The steps to reproduce are as follows:
>
> 1. force slow path, then re-build kernel
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 0ca6d55c9917..29ac52a4f2e7 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1683,12 +1683,14 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
>  	 * if we have a 'non-null' handle here then we are coming
>  	 * from the slow path and handle has already been allocated.
>  	 */
> +	/*
>  	if (IS_ERR_VALUE(handle))
>  		handle = zs_malloc(zram->mem_pool, comp_len,
>  				__GFP_KSWAPD_RECLAIM |
>  				__GFP_NOWARN |
>  				__GFP_HIGHMEM |
>  				__GFP_MOVABLE);
> +	*/
>  	if (IS_ERR_VALUE(handle)) {
>  		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
>  		atomic64_inc(&zram->stats.writestall);
>
> 2. create an arm64 virtual machine
>
> qemu-system-aarch64 -M virt -cpu cortex-a57 -smp 4 -m 1024 -kernel kernel-build/arch/arm64/boot/Image \
> 	-nographic -append "console=ttyAMA0,115200 root=/dev/vda rw" \
> 	-drive file=$(IMGFILE),format=raw -nic user,hostfwd=tcp::60022-:22
>
> 3. after the virtual machine is started, create zram
>
> cat /sys/class/zram-control/hot_add
> echo 524288000 > /sys/devices/virtual/block/zram0/disksize
> echo 524288000 > /sys/devices/virtual/block/zram1/disksize
> mkfs.ext4 -O ^has_journal -b 4096 -F -L TEMP -m 0 /dev/zram0
> mkdir /tmp/zram
> mount -t ext4 -o errors=continue,nosuid,nodev,noatime /dev/zram0 /tmp/zram
> mkswap /dev/zram1
> swapon /dev/zram1
> echo 100 > /proc/sys/vm/swappiness
> mkdir -p /tmp/zram/stressTest
>
>
> 4. build demo app and run again and again
>

I was more curious whether you have a real-world example/app that
does it.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-11-29 11:57 [PATCH] zram: panic when use ext4 over zram caiqingfu
  2024-11-30 5:32 ` Sergey Senozhatsky
@ 2024-12-11 11:04 ` Sergey Senozhatsky
  2024-12-12 9:55 ` caiqingfu
  2024-12-12 18:40 ` Kees Bakker
  2 siblings, 1 reply; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-12-11 11:04 UTC (permalink / raw)
To: caiqingfu; +Cc: akpm, senozhatsky, minchan, mm-commits, caiqingfu

On (24/11/29 19:57), caiqingfu wrote:
> [ 52.073080 ] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [ 52.073511 ] Modules linked in:
> [ 52.074094 ] CPU: 0 UID: 0 PID: 3825 Comm: a.out Not tainted 6.12.0-07749-g28eb75e178d3-dirty #3
> [ 52.074672 ] Hardware name: linux,dummy-virt (DT)
> [ 52.075128 ] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 52.075619 ] pc : obj_malloc+0x5c/0x160
> [ 52.076402 ] lr : zs_malloc+0x200/0x570
> [ 52.076630 ] sp : ffff80008dd335f0
> [ 52.076797 ] x29: ffff80008dd335f0 x28: ffff000004104a00 x27: ffff000004dfc400
> [ 52.077319 ] x26: 000000000000ca18 x25: ffff00003fcaf0e0 x24: ffff000006925cf0
> [ 52.077785 ] x23: 0000000000000c0a x22: ffff0000032ee780 x21: ffff000006925cf0
> [ 52.078257 ] x20: 0000000000088000 x19: 0000000000000000 x18: 0000000000fffc18
> [ 52.078701 ] x17: 00000000fffffffd x16: 0000000000000803 x15: 00000000fffffffe
> [ 52.079203 ] x14: 000000001824429d x13: ffff000006e84000 x12: ffff000006e83fec
> [ 52.079711 ] x11: ffff000006e83000 x10: 00000000000002a5 x9 : ffff000006e83ff3
> [ 52.080269 ] x8 : 0000000000000001 x7 : 0000000017e80000 x6 : 0000000000017e80
> [ 52.080724 ] x5 : 0000000000000003 x4 : ffff00000402a5e8 x3 : 0000000000000066
> [ 52.081081 ] x2 : ffff000006925cf0 x1 : ffff00000402a5e8 x0 : ffff000004104a00
> [ 52.081595 ] Call trace:
> [ 52.081925 ] obj_malloc+0x5c/0x160 (P)
> [ 52.082220 ] zs_malloc+0x200/0x570 (L)
> [ 52.082504 ] zs_malloc+0x200/0x570
> [ 52.082716 ] zram_submit_bio+0x788/0x9e8
> [ 52.083017 ] __submit_bio+0x1c4/0x338
> [ 52.083343 ] submit_bio_noacct_nocheck+0x128/0x2c0
> [ 52.083518 ] submit_bio_noacct+0x1c8/0x308
> [ 52.083722 ] submit_bio+0xa8/0x14c
> [ 52.083942 ] submit_bh_wbc+0x140/0x1bc
> [ 52.084088 ] __block_write_full_folio+0x23c/0x5f0
> [ 52.084232 ] block_write_full_folio+0x134/0x21c
> [ 52.084524 ] write_cache_pages+0x64/0xd4
> [ 52.084778 ] blkdev_writepages+0x50/0x8c
> [ 52.085040 ] do_writepages+0x80/0x2b0
> [ 52.085292 ] filemap_fdatawrite_wbc+0x6c/0x90
> [ 52.085597 ] __filemap_fdatawrite_range+0x64/0x94
> [ 52.085900 ] filemap_fdatawrite+0x1c/0x28
> [ 52.086158 ] sync_bdevs+0x170/0x17c
> [ 52.086374 ] ksys_sync+0x6c/0xb8
> [ 52.086597 ] __arm64_sys_sync+0x10/0x20
> [ 52.086847 ] invoke_syscall+0x44/0x100
> [ 52.087230 ] el0_svc_common.constprop.0+0x40/0xe0
> [ 52.087550 ] do_el0_svc+0x1c/0x28
> [ 52.087690 ] el0_svc+0x30/0xd0
> [ 52.087818 ] el0t_64_sync_handler+0xc8/0xcc
> [ 52.088046 ] el0t_64_sync+0x198/0x19c
> [ 52.088500 ] Code: 110004a5 6b0500df f9401273 54000160 (f9401664)
> [ 52.089097 ] ---[ end trace 0000000000000000 ]---
>
> When using ext4 on zram, the panic above occasionally occurs under
> high memory usage.
>
> The reason is that when the handle is obtained using the slow path, the
> page is compressed again. If the data in the page has changed, the new
> compressed length may exceed the previous one. The write to zs_object
> then overflows, which causes the panic.
>
> To reproduce it quickly, comment out the fast path to force the slow
> path and run a heavy file system read/write load.
>
> The solution is to re-obtain the handle after re-compression if the
> length is different from the previous one.

I see that somebody posted something similar on the bugzilla [1]. Is
that person you, by any chance? If so, can you please post your
findings to linux kernel mailing list? E.g. linux-ext4@, Cc-ing
linux-fsdevel@ and linux-mm@ would be a good start. Bugzilla is
not used.

If this problem is in ext4 then I'd really prefer to address it there and
drop this patch from zram, simply because it covers up an issue in the upper
layer. And it's better to fix issues than to cover them up.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=219548
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-11 11:04 ` Sergey Senozhatsky
@ 2024-12-12 9:55 ` caiqingfu
  2024-12-13 4:28 ` Sergey Senozhatsky
  0 siblings, 1 reply; 16+ messages in thread
From: caiqingfu @ 2024-12-12 9:55 UTC (permalink / raw)
To: senozhatsky; +Cc: akpm, baicaiaichibaicai, caiqingfu, minchan, mm-commits

> I see that somebody posted something similar on the bugzilla [1]. Is
> that person you, by any chance? If so, can you please post your
> findings to linux kernel mailing list? E.g. linux-ext4@, Cc-ing
> linux-fsdevel@ and linux-mm@ would be a good start. Bugzilla is
> not used.
>
> If this problem is in ext4 then I'd really prefer to address it there and
> drop this patch from zram, simply because it covers up an issue in the upper
> layer. And it's better to fix issues than to cover them up.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=219548

He is my colleague, he reported a bug, but no solution has been found yet.

In any case, zram should ensure that its data is not damaged; not
rechecking the length after recompression leads to more serious
consequences.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-12 9:55 ` caiqingfu
@ 2024-12-13 4:28 ` Sergey Senozhatsky
  2024-12-13 5:56 ` caiqingfu
  0 siblings, 1 reply; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-12-13 4:28 UTC (permalink / raw)
To: caiqingfu; +Cc: senozhatsky, akpm, caiqingfu, minchan, mm-commits

On (24/12/12 17:55), caiqingfu wrote:
> > I see that somebody posted something similar on the bugzilla [1]. Is
> > that person you, by any chance? If so, can you please post your
> > findings to linux kernel mailing list? E.g. linux-ext4@, Cc-ing
> > linux-fsdevel@ and linux-mm@ would be a good start. Bugzilla is
> > not used.
> >
> > If this problem is in ext4 then I'd really prefer to address it there and
> > drop this patch from zram, simply because it covers up an issue in the upper
> > layer. And it's better to fix issues than to cover them up.
> >
> > [1] https://bugzilla.kernel.org/show_bug.cgi?id=219548
>
> He is my colleague, he reported a bug

Oh, I see.

> In any case, zram should ensure that its data is not damaged

zram cannot ensure that, it's not zram's job. Nothing stops
the buggy actor from modifying the page at any random point,
zram might not even be able to decompress the page in the end,
it's not only the zs_handle allocation slow-path case.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-13 4:28 ` Sergey Senozhatsky
@ 2024-12-13 5:56 ` caiqingfu
  2024-12-13 6:28 ` Sergey Senozhatsky
  0 siblings, 1 reply; 16+ messages in thread
From: caiqingfu @ 2024-12-13 5:56 UTC (permalink / raw)
To: senozhatsky; +Cc: akpm, baicaiaichibaicai, caiqingfu, minchan, mm-commits

> > In any case, zram should ensure that its data is not damaged
>
> zram cannot ensure that, it's not zram's job. Nothing stops
> the buggy actor from modifying the page at any random point,
> zram might not even be able to decompress the page in the end,
> it's not only the zs_handle allocation slow-path case.

What I mean is that if the length after recompression is longer than before,
writing directly to zs_obj will cause the next zs_obj to be corrupted,
eventually leading to panic. zram must ensure that its own zs_obj is not
overflowed by the write; I am not talking about the page.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-13 5:56 ` caiqingfu
@ 2024-12-13 6:28 ` Sergey Senozhatsky
  2024-12-13 6:40 ` Sergey Senozhatsky
  0 siblings, 1 reply; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-12-13 6:28 UTC (permalink / raw)
To: caiqingfu; +Cc: senozhatsky, akpm, caiqingfu, minchan, mm-commits

On (24/12/13 13:56), caiqingfu wrote:
> > > In any case, zram should ensure that its data is not damaged
> >
> > zram cannot ensure that, it's not zram's job. Nothing stops
> > the buggy actor from modifying the page at any random point,
> > zram might not even be able to decompress the page in the end,
> > it's not only the zs_handle allocation slow-path case.
>
> What I mean is that if the length after recompression is longer than before,
> writing directly to zs_obj will cause the next zs_obj to be corrupted,
> eventually leading to panic. zram must ensure that its own zs_obj is not
> overflowed by the write; I am not talking about the page.

The data must not change. That's the only "must" thing. zram
cannot and must not tolerate buggy upper layers that do not respect
BLK_FEAT_STABLE_WRITES.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-13 6:28 ` Sergey Senozhatsky
@ 2024-12-13 6:40 ` Sergey Senozhatsky
  2024-12-15 8:57 ` caiqingfu
  0 siblings, 1 reply; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-12-13 6:40 UTC (permalink / raw)
To: caiqingfu
Cc: caiqingfu, akpm, caiqingfu, minchan, mm-commits, Sergey Senozhatsky

On (24/12/13 15:28), Sergey Senozhatsky wrote:
> The data must not change. That's the only "must" thing. zram
> cannot and must not tolerate buggy upper layers that do not respect
> BLK_FEAT_STABLE_WRITES.

The solution (per [1]) is "we need to teach the buffer cache writeback
code to issue writes through a bounce buffer if the device requires
stable writes."

Unless somebody is already working on it (I'm not aware of that),
please feel free to submit a patch, that would be highly appreciated.

[1] https://lore.kernel.org/linux-fsdevel/20241212140437.GD1265540@mit.edu
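
The bounce-buffer idea quoted above can be illustrated in userspace. This is a rough sketch under stated assumptions, not the buffer cache writeback code: checksum(), dirty_page(), two_pass_consistent(), write_direct() and write_bounced() are hypothetical names, and a callback that runs between two read passes stands in for a concurrent writer. A consumer that reads its input twice sees inconsistent data when handed the live page, and consistent data when handed a private copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE 64 /* hypothetical page size for the model */

/* Simple polynomial checksum; stands in for any pass over the data. */
static unsigned checksum(const unsigned char *p, size_t n)
{
    unsigned sum = 0;

    for (size_t i = 0; i < n; i++)
        sum = sum * 31u + p[i];
    return sum;
}

/* Hypothetical "upper layer" that modifies the page while it is under I/O. */
struct writer {
    unsigned char *page;
};

static void dirty_page(void *ctx)
{
    struct writer *w = ctx;

    w->page[0] ^= 0xff;
}

/* Models a device that reads its input twice (for example compress, then
 * compress again). Returns 1 if both passes saw the same bytes. */
static int two_pass_consistent(const unsigned char *data,
                               void (*between)(void *), void *ctx)
{
    unsigned first = checksum(data, TOY_PAGE);

    between(ctx); /* the writer runs between the two passes */
    return first == checksum(data, TOY_PAGE);
}

/* The device reads the live page: the writer's change is visible. */
static int write_direct(unsigned char *page)
{
    struct writer w = { page };

    return two_pass_consistent(page, dirty_page, &w);
}

/* The device reads a private snapshot: the writer cannot reach it. */
static int write_bounced(unsigned char *page)
{
    unsigned char bounce[TOY_PAGE];
    struct writer w = { page };

    memcpy(bounce, page, TOY_PAGE);
    return two_pass_consistent(bounce, dirty_page, &w);
}
```

The cost of the bounced variant is one extra copy per write, which is the trade-off a stable-writes device asks the upper layer to accept.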
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-13 6:40 ` Sergey Senozhatsky
@ 2024-12-15 8:57 ` caiqingfu
  2024-12-15 9:12 ` Sergey Senozhatsky
  0 siblings, 1 reply; 16+ messages in thread
From: caiqingfu @ 2024-12-15 8:57 UTC (permalink / raw)
To: senozhatsky; +Cc: akpm, baicaiaichibaicai, caiqingfu, minchan, mm-commits

There are two issues:
[1] data may change while ext4 is writing it to zram.
[2] zram writes the data directly without checking the length after
    recompression, which may cause zs_obj to be damaged.

This patch solves [2].

as you said:
> zram cannot ensure that, it's not zram's job. Nothing stops
> the buggy actor from modifying the page at any random point,

It is not zram's job to ensure that data is not modified.
As long as there is re-compression, it must be confirmed again
whether the length has changed.

This way we can avoid zram panic caused by upper layer problems.
So I think this patch should not be discarded.

As for issue [1], I'm powerless.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-15 8:57 ` caiqingfu
@ 2024-12-15 9:12 ` Sergey Senozhatsky
  2024-12-16 1:24 ` caiqingfu
  0 siblings, 1 reply; 16+ messages in thread
From: Sergey Senozhatsky @ 2024-12-15 9:12 UTC (permalink / raw)
To: caiqingfu; +Cc: senozhatsky, akpm, caiqingfu, minchan, mm-commits

On (24/12/15 16:57), caiqingfu wrote:
> as you said:
> > zram cannot ensure that, it's not zram's job. Nothing stops
> > the buggy actor from modifying the page at any random point,
>
> It is not zram's job to ensure that data is not modified.
> As long as there is re-compression, it must be confirmed again
> whether the length has changed.

zram recompression will go away soon, I just need to finish my
old series and send it out.

> This way we can avoid zram panic caused by upper layer problems.
> So I think this patch should not be discarded.

Sorry, I disagree. That's not the right place to fix that problem,
that's not the right way to fix that problem. There are numerous
block devices that set BLK_FEAT_STABLE_WRITES bit, we should not
patch all those drivers to "make sure that upper layer actually
respects BLK_FEAT_STABLE_WRITES". It's the upper layer that needs
to be fixed.

My opinion - we need to drop this patch, sorry.
* Re: [PATCH] zram: panic when use ext4 over zram
  2024-12-15 9:12 ` Sergey Senozhatsky
@ 2024-12-16 1:24 ` caiqingfu
  0 siblings, 0 replies; 16+ messages in thread
From: caiqingfu @ 2024-12-16 1:24 UTC (permalink / raw)
To: senozhatsky; +Cc: akpm, baicaiaichibaicai, caiqingfu, minchan, mm-commits

> zram recompression will go away soon, I just need to finish my
> old series and send it out.

OK, I got it. If there is no recompression then this patch is not needed.
* Re: [PATCH] zram: panic when use ext4 over zram
2024-11-29 11:57 [PATCH] zram: panic when use ext4 over zram caiqingfu
2024-11-30 5:32 ` Sergey Senozhatsky
2024-12-11 11:04 ` Sergey Senozhatsky
@ 2024-12-12 18:40 ` Kees Bakker
2 siblings, 0 replies; 16+ messages in thread
From: Kees Bakker @ 2024-12-12 18:40 UTC (permalink / raw)
To: caiqingfu, akpm, senozhatsky, minchan; +Cc: mm-commits, caiqingfu

On 29-11-2024 at 12:57, caiqingfu wrote:
> From: caiqingfu <caiqingfu@ruijie.com.cn>
>
> [ 52.073080 ] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [ 52.073511 ] Modules linked in:
> [ 52.074094 ] CPU: 0 UID: 0 PID: 3825 Comm: a.out Not tainted 6.12.0-07749-g28eb75e178d3-dirty #3
> [ 52.074672 ] Hardware name: linux,dummy-virt (DT)
> [ 52.075128 ] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 52.075619 ] pc : obj_malloc+0x5c/0x160
> [ 52.076402 ] lr : zs_malloc+0x200/0x570
> [ 52.076630 ] sp : ffff80008dd335f0
> [ 52.076797 ] x29: ffff80008dd335f0 x28: ffff000004104a00 x27: ffff000004dfc400
> [ 52.077319 ] x26: 000000000000ca18 x25: ffff00003fcaf0e0 x24: ffff000006925cf0
> [ 52.077785 ] x23: 0000000000000c0a x22: ffff0000032ee780 x21: ffff000006925cf0
> [ 52.078257 ] x20: 0000000000088000 x19: 0000000000000000 x18: 0000000000fffc18
> [ 52.078701 ] x17: 00000000fffffffd x16: 0000000000000803 x15: 00000000fffffffe
> [ 52.079203 ] x14: 000000001824429d x13: ffff000006e84000 x12: ffff000006e83fec
> [ 52.079711 ] x11: ffff000006e83000 x10: 00000000000002a5 x9 : ffff000006e83ff3
> [ 52.080269 ] x8 : 0000000000000001 x7 : 0000000017e80000 x6 : 0000000000017e80
> [ 52.080724 ] x5 : 0000000000000003 x4 : ffff00000402a5e8 x3 : 0000000000000066
> [ 52.081081 ] x2 : ffff000006925cf0 x1 : ffff00000402a5e8 x0 : ffff000004104a00
> [ 52.081595 ] Call trace:
> [ 52.081925 ] obj_malloc+0x5c/0x160 (P)
> [ 52.082220 ] zs_malloc+0x200/0x570 (L)
> [ 52.082504 ] zs_malloc+0x200/0x570
> [ 52.082716 ] zram_submit_bio+0x788/0x9e8
> [ 52.083017 ] __submit_bio+0x1c4/0x338
> [ 52.083343 ] submit_bio_noacct_nocheck+0x128/0x2c0
> [ 52.083518 ] submit_bio_noacct+0x1c8/0x308
> [ 52.083722 ] submit_bio+0xa8/0x14c
> [ 52.083942 ] submit_bh_wbc+0x140/0x1bc
> [ 52.084088 ] __block_write_full_folio+0x23c/0x5f0
> [ 52.084232 ] block_write_full_folio+0x134/0x21c
> [ 52.084524 ] write_cache_pages+0x64/0xd4
> [ 52.084778 ] blkdev_writepages+0x50/0x8c
> [ 52.085040 ] do_writepages+0x80/0x2b0
> [ 52.085292 ] filemap_fdatawrite_wbc+0x6c/0x90
> [ 52.085597 ] __filemap_fdatawrite_range+0x64/0x94
> [ 52.085900 ] filemap_fdatawrite+0x1c/0x28
> [ 52.086158 ] sync_bdevs+0x170/0x17c
> [ 52.086374 ] ksys_sync+0x6c/0xb8
> [ 52.086597 ] __arm64_sys_sync+0x10/0x20
> [ 52.086847 ] invoke_syscall+0x44/0x100
> [ 52.087230 ] el0_svc_common.constprop.0+0x40/0xe0
> [ 52.087550 ] do_el0_svc+0x1c/0x28
> [ 52.087690 ] el0_svc+0x30/0xd0
> [ 52.087818 ] el0t_64_sync_handler+0xc8/0xcc
> [ 52.088046 ] el0t_64_sync+0x198/0x19c
> [ 52.088500 ] Code: 110004a5 6b0500df f9401273 54000160 (f9401664)
> [ 52.089097 ] ---[ end trace 0000000000000000 ]---
>
> When using ext4 on zram, the following panic occasionally occurs under
> high memory usage
>
> The reason is that when the handle is obtained using the slow path, it
> will be re-compressed. If the data in the page changes, the compressed
> length may exceed the previous one. Overflow occurred when writing to
> zs_object, which then caused the panic.
>
> Comment the fast path and force the slow path. Adding a large number of
> read and write file systems can quickly reproduce it.
>
> The solution is to re-obtain the handle after re-compression if the
> length is different from the previous one.
>
> Signed-off-by: caiqingfu <caiqingfu@ruijie.com.cn>
> ---
>  drivers/block/zram/zram_drv.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 3dee026988dc..0ca6d55c9917 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -1633,6 +1633,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
>  	unsigned long alloced_pages;
>  	unsigned long handle = -ENOMEM;
>  	unsigned int comp_len = 0;
> +	unsigned int last_comp_len = 0;
Shouldn't this be `static`?
>  	void *src, *dst, *mem;
>  	struct zcomp_strm *zstrm;
>  	unsigned long element = 0;
> @@ -1664,6 +1665,11 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
>
>  	if (comp_len >= huge_class_size)
>  		comp_len = PAGE_SIZE;
> +
> +	if (last_comp_len && (last_comp_len != comp_len)) {
> +		zs_free(zram->mem_pool, handle);
> +		handle = (unsigned long)ERR_PTR(-ENOMEM);
> +	}
>  	/*
>  	 * handle allocation has 2 paths:
>  	 * a) fast path is executed with preemption disabled (for
> @@ -1692,8 +1698,10 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
>  	if (IS_ERR_VALUE(handle))
>  		return PTR_ERR((void *)handle);
>
> -	if (comp_len != PAGE_SIZE)
> +	if (comp_len != PAGE_SIZE) {
> +		last_comp_len = comp_len;
>  		goto compress_again;
> +	}
>  	/*
>  	 * If the page is not compressible, you need to acquire the
>  	 * lock and execute the code below. The zcomp_stream_get()
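On the `static` question, a minimal standalone sketch may help. It assumes the compress_again label sits after the local declarations, which the hunks above suggest but do not show. In C, a backward goto to a label that follows a declaration does not re-run the initialiser, so an ordinary local keeps its value across the jump and starts fresh on every call. count_passes() is a hypothetical function written only for this illustration.

```c
/*
 * Hypothetical illustration: `last` is an ordinary (non-static) local.
 * Its initialiser runs once per call, before the label; jumping back
 * to `again` leaves the current value in place.
 */
static unsigned int count_passes(unsigned int limit)
{
	unsigned int last = 0;		/* initialised once per call */
	unsigned int passes = 0;

again:
	passes++;
	if (last < limit) {
		last++;
		goto again;		/* `last` keeps its value here */
	}
	return passes;
}
```

A `static` variable would instead be shared by every call to the function, so a value left over from one invocation would be visible to the next one.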
end of thread, other threads:[~2024-12-16 1:25 UTC | newest]

Thread overview: 16+ messages
2024-11-29 11:57 [PATCH] zram: panic when use ext4 over zram caiqingfu
2024-11-30 5:32 ` Sergey Senozhatsky
2024-12-02 6:06 ` caiqingfu
2024-12-02 7:04 ` Sergey Senozhatsky
2024-12-02 10:07 ` caiqingfu
2024-12-10 9:34 ` Sergey Senozhatsky
2024-12-11 11:04 ` Sergey Senozhatsky
2024-12-12 9:55 ` caiqingfu
2024-12-13 4:28 ` Sergey Senozhatsky
2024-12-13 5:56 ` caiqingfu
2024-12-13 6:28 ` Sergey Senozhatsky
2024-12-13 6:40 ` Sergey Senozhatsky
2024-12-15 8:57 ` caiqingfu
2024-12-15 9:12 ` Sergey Senozhatsky
2024-12-16 1:24 ` caiqingfu
2024-12-12 18:40 ` Kees Bakker