[f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image

linux-f2fs-devel.lists.sourceforge.net archive mirror
 help / color / mirror / Atom feed

* [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
@ 2022-03-13 17:52 Juhyung Park
  2022-03-15  0:30 ` Jaegeuk Kim
  2022-03-15  8:33 ` Chao Yu
  0 siblings, 2 replies; 11+ messages in thread
From: Juhyung Park @ 2022-03-13 17:52 UTC (permalink / raw)
  To: linux-f2fs-devel

Hi.

We have a production server storing some Android firmwares over a ZFS
file-system, and we noticed some issues when extracting firmware files
that use f2fs for Android system partitions.

This is a proprietary environment, so I cannot disclose every detail,
so I hope you understand. I'll try to elaborate as much as I can.

The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
from v5.13 after noticing RO feature added on v5.14 being required).
We have a set of scripts extracting Android firmware files. The input
is typically the OTA zip file and after going through the script, it
extracts every file and binary image from a given file.

So that includes extracting super (dynamic partition), ext4 system
partitions with dedup enabled, and now, f2fs system partitions with RO
and compression enabled.

Our script never had to deal with f2fs before as we only started
seeing f2fs system partitions with recently released devices.

This is the f2fs mount flag after mounting with `mount -o ro
system.raw /some/dir`:
ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block

There are *a lot* of files in Android firmware these days, so we try
to parallelize parts when we can.

This is a snippet of the script:
```
#!/bin/bash
<...>
RSYNC="rsync -ahAXx --inplace --numeric-ids"
<...>
for val in system vendor product odm; do
  if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi

  mkdir -p fs
  cd fs

  mkdir -p $val.mount tmp_$val
  mount -o ro ../images/$val.raw $val.mount

  $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
  echo $! > $val.pid
  disown

  cd $val.mount
  find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
  find . -type d -exec mkdir -p "../tmp_$val/"{} \;

  while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
& done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
  wait

<...>

  cd ../
  rm -rf tmp_$val
  cd ../
done

wait
<...>
for val in system vendor product odm; do
  if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
  tail --pid=$(cat fs/$val.pid) -f /dev/null
  umount fs/$val.mount
  rmdir fs/$val.mount
  rm -f images/$val.img images/$val.raw 2>/dev/null
done
```

The offending part is:
```
  $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
  find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
  find . -type d -exec mkdir -p "../tmp_$val/"{} \;
  while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
& done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
  wait
```

When that part is reached, the script forks thousands of new processes
and starts reading from f2fs. (We simply decided to rely on Linux's
task scheduler and didn't bother to limit the number of
sub-processes.)

I am able to reliably cause f2fs to return EIO on some files:
cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
Input/output error
cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
Input/output error
cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
Input/output error
rsync: [sender] read errors mapping
"/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
Input/output error (5)
rsync: [sender] read errors mapping
"/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
Input/output error (5)
rsync: [sender] read errors mapping
"/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
Input/output error (5)
rsync: [sender] read errors mapping
"/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
Input/output error (5)
ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
-- update retained.
rsync: [sender] read errors mapping
"/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
Input/output error (5)
ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
-- update retained.
rsync: [sender] read errors mapping
"/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
Input/output error (5)
ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
-- update retained.
rsync error: some files/attrs were not transferred (see previous
errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]

The dmesg remains silent.

When I modify the script a little bit and force it to run in a
single-thread (by removing &), it runs well.

I was able to confirm that it isn't a memory issue. The server has
50G+ of free memory, and the issue is still reliably reproducible when
I defragment the memory by dropping caches and doing `echo 1 >
/proc/sys/vm/compact_memory`.

I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
unsupported by ZFS. And it being a production server, I am somewhat
limited in dabbling around the kernel.

I am planning to test a new kernel with v5.15 +
f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
fixed with newer commits, I'd appreciate a tip.

Thanks.

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-13 17:52 [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image Juhyung Park
@ 2022-03-15  0:30 ` Jaegeuk Kim
  2022-03-15  4:42   ` Juhyung Park
  2022-03-15  8:33 ` Chao Yu
  1 sibling, 1 reply; 11+ messages in thread
From: Jaegeuk Kim @ 2022-03-15  0:30 UTC (permalink / raw)
  To: Juhyung Park; +Cc: linux-f2fs-devel

Juhyung,

On 03/14, Juhyung Park wrote:
> Hi.
> 
> We have a production server storing some Android firmwares over a ZFS
> file-system, and we noticed some issues when extracting firmware files
> that use f2fs for Android system partitions.
> 
> This is a proprietary environment, so I cannot disclose every detail,
> so I hope you understand. I'll try to elaborate as much as I can.
> 
> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> from v5.13 after noticing RO feature added on v5.14 being required).
> We have a set of scripts extracting Android firmware files. The input
> is typically the OTA zip file and after going through the script, it
> extracts every file and binary image from a given file.
> 
> So that includes extracting super (dynamic partition), ext4 system
> partitions with dedup enabled, and now, f2fs system partitions with RO
> and compression enabled.
> 
> Our script never had to deal with f2fs before as we only started
> seeing f2fs system partitions with recently released devices.
> 
> This is the f2fs mount flag after mounting with `mount -o ro
> system.raw /some/dir`:
> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> 
> There are *a lot* of files in Android firmware these days, so we try
> to parallelize parts when we can.
> 
> This is a snippet of the script:
> ```
> #!/bin/bash
> <...>
> RSYNC="rsync -ahAXx --inplace --numeric-ids"
> <...>
> for val in system vendor product odm; do
>   if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> 
>   mkdir -p fs
>   cd fs
> 
>   mkdir -p $val.mount tmp_$val
>   mount -o ro ../images/$val.raw $val.mount
> 
>   $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
>   echo $! > $val.pid
>   disown
> 
>   cd $val.mount
>   find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
>   find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> 
>   while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
>   wait
> 
> <...>
> 
>   cd ../
>   rm -rf tmp_$val
>   cd ../
> done
> 
> wait
> <...>
> for val in system vendor product odm; do
>   if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
>   tail --pid=$(cat fs/$val.pid) -f /dev/null
>   umount fs/$val.mount
>   rmdir fs/$val.mount
>   rm -f images/$val.img images/$val.raw 2>/dev/null
> done
> ```
> 
> The offending part is:
> ```
>   $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
>   find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
>   find . -type d -exec mkdir -p "../tmp_$val/"{} \;
>   while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
>   wait
> ```
> 
> When that part is reached, the script forks thousands of new processes
> and starts reading from f2fs. (We simply decided to rely on Linux's
> task scheduler and didn't bother to limit the number of
> sub-processes.)
> 
> I am able to reliably cause f2fs to return EIO on some files:
> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> Input/output error
> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> Input/output error
> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> Input/output error
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> Input/output error (5)
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> Input/output error (5)
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> Input/output error (5)
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> Input/output error (5)
> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> -- update retained.
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> Input/output error (5)
> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> -- update retained.
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> Input/output error (5)
> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> -- update retained.
> rsync error: some files/attrs were not transferred (see previous
> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
> 
> The dmesg remains silent.

Hmm, could you share fsck.f2fs result with the image?

> 
> When I modify the script a little bit and force it to run in a
> single-thread (by removing &), it runs well.
> 
> I was able to confirm that it isn't a memory issue. The server has
> 50G+ of free memory, and the issue is still reliably reproducible when
> I defragment the memory by dropping caches and doing `echo 1 >
> /proc/sys/vm/compact_memory`.
> 
> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> unsupported by ZFS. And it being a production server, I am somewhat
> limited in dabbling around the kernel.
> 
> I am planning to test a new kernel with v5.15 +
> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> fixed with newer commits, I'd appreciate a tip.
> 
> Thanks.
> 
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-15  0:30 ` Jaegeuk Kim
@ 2022-03-15  4:42   ` Juhyung Park
  0 siblings, 0 replies; 11+ messages in thread
From: Juhyung Park @ 2022-03-15  4:42 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel

Hi Jaegeuk.

fsck doesn't raise any red flags:
Info: Force to fix corruption
Info: Dry run
Info: Segments per section = 1
Info: Sections per zone = 1
Info: sector size = 512
Info: total sectors = 11411472 (5572 MB)
Info: MKFS version
  "5.4.0-62-generic"
Info: FSCK version
  from "5.4.0-62-generic"
    to "Linux version 5.15.0-22-generic (buildd@lcy02-amd64-054) (gcc
(Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0, GNU ld (GNU Binutils for Ubuntu)
2.34) #22~20.04.1-Ubuntu SMP Wed Feb 9 16:21:20 UTC 2022"
Info: superblock features = 6008 :  extra_attr compression ro
Info: superblock encrypt level = 0, salt = 00000000000000000000000000000000
Info: total FS sectors = 11231232 (5484 MB)
Info: CKPT version = 4321
Info: Checked valid nat_bits in checkpoint
Info: checkpoint state = 81 :  nat_bits unmount
[FSCK] Check node 1 / 9153 (0.01%)
[FSCK] Check node 916 / 9153 (10.01%)
[FSCK] Check node 1831 / 9153 (20.01%)
[FSCK] Check node 2746 / 9153 (30.01%)
[FSCK] Check node 3661 / 9153 (40.01%)
[FSCK] Check node 4576 / 9153 (50.01%)
[FSCK] Check node 5491 / 9153 (60.01%)
[FSCK] Check node 6406 / 9153 (70.01%)
[FSCK] Check node 7321 / 9153 (80.01%)
[FSCK] Check node 8236 / 9153 (90.01%)
[FSCK] Check node 9151 / 9153 (100.01%)

[FSCK] Max image size: 5480 MB, Free space: 0 MB
[FSCK] Unreachable nat entries                        [Ok..] [0x0]
[FSCK] SIT valid block bitmap checking                [Ok..]
[FSCK] Hard link checking for regular file            [Ok..] [0x0]
[FSCK] valid_block_count matching with CP             [Ok..] [0x155d87]
[FSCK] valid_node_count matching with CP (de lookup)  [Ok..] [0x23c1]
[FSCK] valid_node_count matching with CP (nat lookup) [Ok..] [0x23c1]
[FSCK] valid_inode_count matched with CP              [Ok..] [0x1db9]
[FSCK] free segment_count matched with CP             [Ok..] [0x0]
[FSCK] next block offset is free                      [Ok..]
[FSCK] fixing SIT types
[FSCK] other corrupted bugs                           [Ok..]

Done: 0.064030 secs

This is debugfs output, just in case:

=====[ partition info(loop0). #0, RO, CP: Good]=====
[SB: 1] [CP: 2] [SIT: 2] [NAT: 2] [SSA: 0] [MAIN: 2735(OverProv:0 Resv:0)]

Current Time Sec: 125333 / Mounted Time Sec: 125330

Utilization: 99% (1400199 valid blocks, 121 discard blocks)
  - Node: 9153 (Inode: 7609, Other: 1544)
  - Data: 1391046
  - Inline_xattr Inode: 0
  - Inline_data Inode: 0
  - Inline_dentry Inode: 0
  - Compressed Inode: 0, Blocks: 0
  - Orphan/Append/Update Inode: 0, 0, 0

Main area: 2735 segs, 2735 secs 2735 zones
    TYPE            segno    secno   zoneno  dirty_seg   full_seg  valid_blk
  - COLD   data:        0        0        0          0          0          0
  - WARM   data:        0        0        0          0          0          0
  - HOT    data:     2734     2734     2734          1       2716    1391046
  - Dir   dnode:     2661     2661     2661          1         17       9153
  - File  dnode:        0        0        0          0          0          0
  - Indir nodes:        0        0        0          0          0          0
  - Pinned file:       -1       -1       -1
  - ATGC   data:       -1       -1       -1

  - Valid: 2735
  - Dirty: 0
  - Prefree: 0
  - Free: 0 (0)

CP calls: 0 (BG: 0)
  - cp blocks : 0
  - sit blocks : 0
  - nat blocks : 0
  - ssa blocks : 0
CP merge (Queued:    0, Issued:    0, Total:    0, Cur time:    0(ms),
Peak time:    0(ms))
GC calls: 0 (BG: 0)
  - data segments : 0 (0)
  - node segments : 0 (0)
  - Reclaimed segs : Normal (0), Idle CB (0), Idle Greedy (0), Idle AT
(0), Urgent High (0), Urgent Low (0)
Try to move 0 blocks (BG: 0)
  - data blocks : 0 (0)
  - node blocks : 0 (0)
Skipped : atomic write 0 (0)
BG skip : IO: 0, Other: 0

Extent Cache:
  - Hit Count: L1-1:0 L1-2:0 L2:0
  - Hit Ratio: 0% (0 / 0)
  - Inner Struct Count: tree: 0(0), node: 0

Balancing F2FS Async:
  - DIO (R:    0, W:    0)
  - IO_R (Data:    0, Node:    0, Meta:    0
  - IO_W (CP:    0, Data:    0, Flush: (   0    0    0), Discard: (
0    0)) cmd:    0 undiscard:   0
  - inmem:    0, atomic IO:    0 (Max.    0), volatile IO:    0 (Max.    0)
  - compress:    0, hit:       0
  - nodes:    0 in    1
  - dents:    0 in dirs:   0 (   0)
  - datas:    0 in files:   0
  - quota datas:    0 in quota files:   0
  - meta:    0 in   69
  - imeta:    0
  - NATs:         0/        1
  - SITs:         0/     2735
  - free_nids:         0/   223804
  - alloc_nids:         0

Distribution of User Blocks: [ valid | invalid | free ]
  [-------------------------------------------------|-|]

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 0 blocks in 0 segments

BDF: 99, avg. vblocks: 0

Memory: 992 KB
  - static: 704 KB
  - cached: 8 KB
  - paged : 280 KB

Thanks.

On Tue, Mar 15, 2022 at 9:30 AM Jaegeuk Kim <jaegeuk@kernel.org> wrote:
>
> Juhyung,
>
> On 03/14, Juhyung Park wrote:
> > Hi.
> >
> > We have a production server storing some Android firmwares over a ZFS
> > file-system, and we noticed some issues when extracting firmware files
> > that use f2fs for Android system partitions.
> >
> > This is a proprietary environment, so I cannot disclose every detail,
> > so I hope you understand. I'll try to elaborate as much as I can.
> >
> > The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> > from v5.13 after noticing RO feature added on v5.14 being required).
> > We have a set of scripts extracting Android firmware files. The input
> > is typically the OTA zip file and after going through the script, it
> > extracts every file and binary image from a given file.
> >
> > So that includes extracting super (dynamic partition), ext4 system
> > partitions with dedup enabled, and now, f2fs system partitions with RO
> > and compression enabled.
> >
> > Our script never had to deal with f2fs before as we only started
> > seeing f2fs system partitions with recently released devices.
> >
> > This is the f2fs mount flag after mounting with `mount -o ro
> > system.raw /some/dir`:
> > ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> >
> > There are *a lot* of files in Android firmware these days, so we try
> > to parallelize parts when we can.
> >
> > This is a snippet of the script:
> > ```
> > #!/bin/bash
> > <...>
> > RSYNC="rsync -ahAXx --inplace --numeric-ids"
> > <...>
> > for val in system vendor product odm; do
> >   if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> >
> >   mkdir -p fs
> >   cd fs
> >
> >   mkdir -p $val.mount tmp_$val
> >   mount -o ro ../images/$val.raw $val.mount
> >
> >   $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> >   echo $! > $val.pid
> >   disown
> >
> >   cd $val.mount
> >   find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> >   find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> >
> >   while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> >   wait
> >
> > <...>
> >
> >   cd ../
> >   rm -rf tmp_$val
> >   cd ../
> > done
> >
> > wait
> > <...>
> > for val in system vendor product odm; do
> >   if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> >   tail --pid=$(cat fs/$val.pid) -f /dev/null
> >   umount fs/$val.mount
> >   rmdir fs/$val.mount
> >   rm -f images/$val.img images/$val.raw 2>/dev/null
> > done
> > ```
> >
> > The offending part is:
> > ```
> >   $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> >   find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> >   find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> >   while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> >   wait
> > ```
> >
> > When that part is reached, the script forks thousands of new processes
> > and starts reading from f2fs. (We simply decided to rely on Linux's
> > task scheduler and didn't bother to limit the number of
> > sub-processes.)
> >
> > I am able to reliably cause f2fs to return EIO on some files:
> > cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> > Input/output error
> > cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> > Input/output error
> > cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> > Input/output error
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > Input/output error (5)
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > Input/output error (5)
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > Input/output error (5)
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > Input/output error (5)
> > ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> > -- update retained.
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > Input/output error (5)
> > ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> > -- update retained.
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > Input/output error (5)
> > ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> > -- update retained.
> > rsync error: some files/attrs were not transferred (see previous
> > errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
> >
> > The dmesg remains silent.
>
> Hmm, could you share fsck.f2fs result with the image?
>
> >
> > When I modify the script a little bit and force it to run in a
> > single-thread (by removing &), it runs well.
> >
> > I was able to confirm that it isn't a memory issue. The server has
> > 50G+ of free memory, and the issue is still reliably reproducible when
> > I defragment the memory by dropping caches and doing `echo 1 >
> > /proc/sys/vm/compact_memory`.
> >
> > I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> > unsupported by ZFS. And it being a production server, I am somewhat
> > limited in dabbling around the kernel.
> >
> > I am planning to test a new kernel with v5.15 +
> > f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> > fixed with newer commits, I'd appreciate a tip.
> >
> > Thanks.
> >
> >
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-13 17:52 [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image Juhyung Park
  2022-03-15  0:30 ` Jaegeuk Kim
@ 2022-03-15  8:33 ` Chao Yu
  2022-03-15  8:37   ` Juhyung Park
  1 sibling, 1 reply; 11+ messages in thread
From: Chao Yu @ 2022-03-15  8:33 UTC (permalink / raw)
  To: Juhyung Park, linux-f2fs-devel

On 2022/3/14 1:52, Juhyung Park wrote:
> Hi.
> 
> We have a production server storing some Android firmwares over a ZFS
> file-system, and we noticed some issues when extracting firmware files
> that use f2fs for Android system partitions.
> 
> This is a proprietary environment, so I cannot disclose every detail,
> so I hope you understand. I'll try to elaborate as much as I can.
> 
> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> from v5.13 after noticing RO feature added on v5.14 being required).
> We have a set of scripts extracting Android firmware files. The input
> is typically the OTA zip file and after going through the script, it
> extracts every file and binary image from a given file.
> 
> So that includes extracting super (dynamic partition), ext4 system
> partitions with dedup enabled, and now, f2fs system partitions with RO
> and compression enabled.
> 
> Our script never had to deal with f2fs before as we only started
> seeing f2fs system partitions with recently released devices.
> 
> This is the f2fs mount flag after mounting with `mount -o ro
> system.raw /some/dir`:
> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> 
> There are *a lot* of files in Android firmware these days, so we try
> to parallelize parts when we can.
> 
> This is a snippet of the script:
> ```
> #!/bin/bash
> <...>
> RSYNC="rsync -ahAXx --inplace --numeric-ids"
> <...>
> for val in system vendor product odm; do
>    if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> 
>    mkdir -p fs
>    cd fs
> 
>    mkdir -p $val.mount tmp_$val
>    mount -o ro ../images/$val.raw $val.mount
> 
>    $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
>    echo $! > $val.pid
>    disown
> 
>    cd $val.mount
>    find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
>    find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> 
>    while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
>    wait
> 
> <...>
> 
>    cd ../
>    rm -rf tmp_$val
>    cd ../
> done
> 
> wait
> <...>
> for val in system vendor product odm; do
>    if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
>    tail --pid=$(cat fs/$val.pid) -f /dev/null
>    umount fs/$val.mount
>    rmdir fs/$val.mount
>    rm -f images/$val.img images/$val.raw 2>/dev/null
> done
> ```
> 
> The offending part is:
> ```
>    $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
>    find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
>    find . -type d -exec mkdir -p "../tmp_$val/"{} \;
>    while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
>    wait
> ```
> 
> When that part is reached, the script forks thousands of new processes
> and starts reading from f2fs. (We simply decided to rely on Linux's
> task scheduler and didn't bother to limit the number of
> sub-processes.)
> 
> I am able to reliably cause f2fs to return EIO on some files:
> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> Input/output error
> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> Input/output error
> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> Input/output error
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> Input/output error (5)
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> Input/output error (5)
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> Input/output error (5)
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> Input/output error (5)
> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> -- update retained.
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> Input/output error (5)
> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> -- update retained.
> rsync: [sender] read errors mapping
> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> Input/output error (5)
> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> -- update retained.
> rsync error: some files/attrs were not transferred (see previous
> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]

Hi Juhyung,

Do you enable compression on specified files? if so, do all EIOs come from compressed
file?

Thanks,

> 
> The dmesg remains silent.
> 
> When I modify the script a little bit and force it to run in a
> single-thread (by removing &), it runs well.
> 
> I was able to confirm that it isn't a memory issue. The server has
> 50G+ of free memory, and the issue is still reliably reproducible when
> I defragment the memory by dropping caches and doing `echo 1 >
> /proc/sys/vm/compact_memory`.
> 
> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> unsupported by ZFS. And it being a production server, I am somewhat
> limited in dabbling around the kernel.
> 
> I am planning to test a new kernel with v5.15 +
> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> fixed with newer commits, I'd appreciate a tip.
> 
> Thanks.
> 
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-15  8:33 ` Chao Yu
@ 2022-03-15  8:37   ` Juhyung Park
  2022-03-15  8:45     ` Chao Yu
  0 siblings, 1 reply; 11+ messages in thread
From: Juhyung Park @ 2022-03-15  8:37 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel

Hi Chao,

I actually have never used compression on f2fs yet and this image is
from an unmodified firmware taken directly from the OTA server, not
made by me.

If you can tell me how to check if the file is compressed or not, I'll
run it and report back.

I never had this sort of issue (a lot of multiple readers causing
troubles) before with a regular R/W, uncompressed f2fs partition, so
I'm guessing it has something to do with RO or compression feature.

Thanks.

On Tue, Mar 15, 2022 at 5:33 PM Chao Yu <chao@kernel.org> wrote:
>
> On 2022/3/14 1:52, Juhyung Park wrote:
> > Hi.
> >
> > We have a production server storing some Android firmwares over a ZFS
> > file-system, and we noticed some issues when extracting firmware files
> > that use f2fs for Android system partitions.
> >
> > This is a proprietary environment, so I cannot disclose every detail,
> > so I hope you understand. I'll try to elaborate as much as I can.
> >
> > The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> > from v5.13 after noticing RO feature added on v5.14 being required).
> > We have a set of scripts extracting Android firmware files. The input
> > is typically the OTA zip file and after going through the script, it
> > extracts every file and binary image from a given file.
> >
> > So that includes extracting super (dynamic partition), ext4 system
> > partitions with dedup enabled, and now, f2fs system partitions with RO
> > and compression enabled.
> >
> > Our script never had to deal with f2fs before as we only started
> > seeing f2fs system partitions with recently released devices.
> >
> > This is the f2fs mount flag after mounting with `mount -o ro
> > system.raw /some/dir`:
> > ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> >
> > There are *a lot* of files in Android firmware these days, so we try
> > to parallelize parts when we can.
> >
> > This is a snippet of the script:
> > ```
> > #!/bin/bash
> > <...>
> > RSYNC="rsync -ahAXx --inplace --numeric-ids"
> > <...>
> > for val in system vendor product odm; do
> >    if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> >
> >    mkdir -p fs
> >    cd fs
> >
> >    mkdir -p $val.mount tmp_$val
> >    mount -o ro ../images/$val.raw $val.mount
> >
> >    $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> >    echo $! > $val.pid
> >    disown
> >
> >    cd $val.mount
> >    find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> >    find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> >
> >    while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> >    wait
> >
> > <...>
> >
> >    cd ../
> >    rm -rf tmp_$val
> >    cd ../
> > done
> >
> > wait
> > <...>
> > for val in system vendor product odm; do
> >    if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> >    tail --pid=$(cat fs/$val.pid) -f /dev/null
> >    umount fs/$val.mount
> >    rmdir fs/$val.mount
> >    rm -f images/$val.img images/$val.raw 2>/dev/null
> > done
> > ```
> >
> > The offending part is:
> > ```
> >    $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> >    find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> >    find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> >    while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> >    wait
> > ```
> >
> > When that part is reached, the script forks thousands of new processes
> > and starts reading from f2fs. (We simply decided to rely on Linux's
> > task scheduler and didn't bother to limit the number of
> > sub-processes.)
> >
> > I am able to reliably cause f2fs to return EIO on some files:
> > cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> > Input/output error
> > cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> > Input/output error
> > cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> > Input/output error
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > Input/output error (5)
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > Input/output error (5)
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > Input/output error (5)
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > Input/output error (5)
> > ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> > -- update retained.
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > Input/output error (5)
> > ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> > -- update retained.
> > rsync: [sender] read errors mapping
> > "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > Input/output error (5)
> > ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> > -- update retained.
> > rsync error: some files/attrs were not transferred (see previous
> > errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
>
> Hi Juhyung,
>
> Do you enable compression on specified files? if so, do all EIOs come from compressed
> file?
>
> Thanks,
>
> >
> > The dmesg remains silent.
> >
> > When I modify the script a little bit and force it to run in a
> > single-thread (by removing &), it runs well.
> >
> > I was able to confirm that it isn't a memory issue. The server has
> > 50G+ of free memory, and the issue is still reliably reproducible when
> > I defragment the memory by dropping caches and doing `echo 1 >
> > /proc/sys/vm/compact_memory`.
> >
> > I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> > unsupported by ZFS. And it being a production server, I am somewhat
> > limited in dabbling around the kernel.
> >
> > I am planning to test a new kernel with v5.15 +
> > f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> > fixed with newer commits, I'd appreciate a tip.
> >
> > Thanks.
> >
> >
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-15  8:37   ` Juhyung Park
@ 2022-03-15  8:45     ` Chao Yu
  2022-03-15 10:25       ` Juhyung Park
  0 siblings, 1 reply; 11+ messages in thread
From: Chao Yu @ 2022-03-15  8:45 UTC (permalink / raw)
  To: Juhyung Park; +Cc: linux-f2fs-devel

Hi Juhyung,

On 2022/3/15 16:37, Juhyung Park wrote:
> Hi Chao,
> 
> I actually have never used compression on f2fs yet and this image is
> from an unmodified firmware taken directly from the OTA server, not
> made by me.

Alright.

> 
> If you can tell me how to check if the file is compressed or not, I'll
> run it and report back.

lsattr <target_file_path> |grep c

You can check the flag's mean in manual of chattr.

> 
> I never had this sort of issue (a lot of multiple readers causing
> troubles) before with a regular R/W, uncompressed f2fs partition, so
> I'm guessing it has something to do with RO or compression feature.

Agreed, maybe we can repacking image to RO one or compressed one to
check which one can cause this issue?

Thanks,

> 
> Thanks.
> 
> On Tue, Mar 15, 2022 at 5:33 PM Chao Yu <chao@kernel.org> wrote:
>>
>> On 2022/3/14 1:52, Juhyung Park wrote:
>>> Hi.
>>>
>>> We have a production server storing some Android firmwares over a ZFS
>>> file-system, and we noticed some issues when extracting firmware files
>>> that use f2fs for Android system partitions.
>>>
>>> This is a proprietary environment, so I cannot disclose every detail,
>>> so I hope you understand. I'll try to elaborate as much as I can.
>>>
>>> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
>>> from v5.13 after noticing RO feature added on v5.14 being required).
>>> We have a set of scripts extracting Android firmware files. The input
>>> is typically the OTA zip file and after going through the script, it
>>> extracts every file and binary image from a given file.
>>>
>>> So that includes extracting super (dynamic partition), ext4 system
>>> partitions with dedup enabled, and now, f2fs system partitions with RO
>>> and compression enabled.
>>>
>>> Our script never had to deal with f2fs before as we only started
>>> seeing f2fs system partitions with recently released devices.
>>>
>>> This is the f2fs mount flag after mounting with `mount -o ro
>>> system.raw /some/dir`:
>>> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
>>>
>>> There are *a lot* of files in Android firmware these days, so we try
>>> to parallelize parts when we can.
>>>
>>> This is a snippet of the script:
>>> ```
>>> #!/bin/bash
>>> <...>
>>> RSYNC="rsync -ahAXx --inplace --numeric-ids"
>>> <...>
>>> for val in system vendor product odm; do
>>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
>>>
>>>     mkdir -p fs
>>>     cd fs
>>>
>>>     mkdir -p $val.mount tmp_$val
>>>     mount -o ro ../images/$val.raw $val.mount
>>>
>>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
>>>     echo $! > $val.pid
>>>     disown
>>>
>>>     cd $val.mount
>>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
>>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
>>>
>>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
>>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
>>>     wait
>>>
>>> <...>
>>>
>>>     cd ../
>>>     rm -rf tmp_$val
>>>     cd ../
>>> done
>>>
>>> wait
>>> <...>
>>> for val in system vendor product odm; do
>>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
>>>     tail --pid=$(cat fs/$val.pid) -f /dev/null
>>>     umount fs/$val.mount
>>>     rmdir fs/$val.mount
>>>     rm -f images/$val.img images/$val.raw 2>/dev/null
>>> done
>>> ```
>>>
>>> The offending part is:
>>> ```
>>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
>>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
>>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
>>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
>>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
>>>     wait
>>> ```
>>>
>>> When that part is reached, the script forks thousands of new processes
>>> and starts reading from f2fs. (We simply decided to rely on Linux's
>>> task scheduler and didn't bother to limit the number of
>>> sub-processes.)
>>>
>>> I am able to reliably cause f2fs to return EIO on some files:
>>> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
>>> Input/output error
>>> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
>>> Input/output error
>>> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
>>> Input/output error
>>> rsync: [sender] read errors mapping
>>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
>>> Input/output error (5)
>>> rsync: [sender] read errors mapping
>>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
>>> Input/output error (5)
>>> rsync: [sender] read errors mapping
>>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
>>> Input/output error (5)
>>> rsync: [sender] read errors mapping
>>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
>>> Input/output error (5)
>>> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
>>> -- update retained.
>>> rsync: [sender] read errors mapping
>>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
>>> Input/output error (5)
>>> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
>>> -- update retained.
>>> rsync: [sender] read errors mapping
>>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
>>> Input/output error (5)
>>> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
>>> -- update retained.
>>> rsync error: some files/attrs were not transferred (see previous
>>> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
>>
>> Hi Juhyung,
>>
>> Do you enable compression on specified files? if so, do all EIOs come from compressed
>> file?
>>
>> Thanks,
>>
>>>
>>> The dmesg remains silent.
>>>
>>> When I modify the script a little bit and force it to run in a
>>> single-thread (by removing &), it runs well.
>>>
>>> I was able to confirm that it isn't a memory issue. The server has
>>> 50G+ of free memory, and the issue is still reliably reproducible when
>>> I defragment the memory by dropping caches and doing `echo 1 >
>>> /proc/sys/vm/compact_memory`.
>>>
>>> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
>>> unsupported by ZFS. And it being a production server, I am somewhat
>>> limited in dabbling around the kernel.
>>>
>>> I am planning to test a new kernel with v5.15 +
>>> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
>>> fixed with newer commits, I'd appreciate a tip.
>>>
>>> Thanks.
>>>
>>>
>>> _______________________________________________
>>> Linux-f2fs-devel mailing list
>>> Linux-f2fs-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-15  8:45     ` Chao Yu
@ 2022-03-15 10:25       ` Juhyung Park
  2022-03-15 10:48         ` Juhyung Park
  0 siblings, 1 reply; 11+ messages in thread
From: Juhyung Park @ 2022-03-15 10:25 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel

Hi Chao,

On Tue, Mar 15, 2022 at 5:45 PM Chao Yu <chao@kernel.org> wrote:
>
> Hi Juhyung,
>
> On 2022/3/15 16:37, Juhyung Park wrote:
> > Hi Chao,
> >
> > I actually have never used compression on f2fs yet and this image is
> > from an unmodified firmware taken directly from the OTA server, not
> > made by me.
>
> Alright.
>
> >
> > If you can tell me how to check if the file is compressed or not, I'll
> > run it and report back.
>
> lsattr <target_file_path> |grep c
>
> You can check the flag's mean in manual of chattr.

Ah, didn't know that lsattr shows compressed state.
It seems that every single file under that system partition is
compressed except ones with file size of < 4096, but the error only
happens on select files.

>
> >
> > I never had this sort of issue (a lot of multiple readers causing
> > troubles) before with a regular R/W, uncompressed f2fs partition, so
> > I'm guessing it has something to do with RO or compression feature.
>
> Agreed, maybe we can repacking image to RO one or compressed one to
> check which one can cause this issue?

Is this easy to do standalone?

AOSP's ext4 image creation was already super complicated to do
out-of-tree with their dedup and minimizing file size (so that the
resulting image is as small as possible) thinggies going on. I think
I've seen EROFS support but I don't know how f2fs system images are
handled under AOSP.

Also, I moved the environment to my personal laptop. It's running
v5.15.28 with the latest f2fs-stable/linux-5.15.y merged and it still
happens there.

One weird thing that I've been able to confirm is that it seems that
multiple readers are not the problem:
#!/bin/bash

set -v

APK=system/app/some_app/some_app.apk

echo 3 > /proc/sys/vm/drop_caches
tar -pcf - $APK | cat > /dev/null     # Works!
cp $APK /tmp                          # Works!

echo 3 > /proc/sys/vm/drop_caches
tar -pcf - $APK | cat > /dev/null     # Works!
echo 3 > /proc/sys/vm/drop_caches
cp $APK /tmp                          # cp: error reading
'system/app/some_app/some_app.apk': Input/output error

echo 3 > /proc/sys/vm/drop_caches
cp $APK /tmp                          # cp: error reading
'system/app/some_app/some_app.apk': Input/output error
tar -pcf - $APK | cat > /dev/null     # tar:
system/app/some_app/some_app.apk: File shrank by 3375397 bytes;
padding with zeros

echo 3 > /proc/sys/vm/drop_caches
cp $APK /tmp                          # cp: error reading
'system/app/some_app/some_app.apk': Input/output error
echo 3 > /proc/sys/vm/drop_caches
tar -pcf - $APK | cat > /dev/null     # Works!

I'm guessing some system-calls that cp use causes f2fs to malfunction.
(GNU tar is smart enough to detect if the output is /dev/null, so it
skips the actual file read when it is piped directly to /dev/null, so
I added `cat` in between.)

Hope this new info helps.

Thanks.

>
> Thanks,
>
> >
> > Thanks.
> >
> > On Tue, Mar 15, 2022 at 5:33 PM Chao Yu <chao@kernel.org> wrote:
> >>
> >> On 2022/3/14 1:52, Juhyung Park wrote:
> >>> Hi.
> >>>
> >>> We have a production server storing some Android firmwares over a ZFS
> >>> file-system, and we noticed some issues when extracting firmware files
> >>> that use f2fs for Android system partitions.
> >>>
> >>> This is a proprietary environment, so I cannot disclose every detail,
> >>> so I hope you understand. I'll try to elaborate as much as I can.
> >>>
> >>> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> >>> from v5.13 after noticing RO feature added on v5.14 being required).
> >>> We have a set of scripts extracting Android firmware files. The input
> >>> is typically the OTA zip file and after going through the script, it
> >>> extracts every file and binary image from a given file.
> >>>
> >>> So that includes extracting super (dynamic partition), ext4 system
> >>> partitions with dedup enabled, and now, f2fs system partitions with RO
> >>> and compression enabled.
> >>>
> >>> Our script never had to deal with f2fs before as we only started
> >>> seeing f2fs system partitions with recently released devices.
> >>>
> >>> This is the f2fs mount flag after mounting with `mount -o ro
> >>> system.raw /some/dir`:
> >>> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> >>>
> >>> There are *a lot* of files in Android firmware these days, so we try
> >>> to parallelize parts when we can.
> >>>
> >>> This is a snippet of the script:
> >>> ```
> >>> #!/bin/bash
> >>> <...>
> >>> RSYNC="rsync -ahAXx --inplace --numeric-ids"
> >>> <...>
> >>> for val in system vendor product odm; do
> >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> >>>
> >>>     mkdir -p fs
> >>>     cd fs
> >>>
> >>>     mkdir -p $val.mount tmp_$val
> >>>     mount -o ro ../images/$val.raw $val.mount
> >>>
> >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> >>>     echo $! > $val.pid
> >>>     disown
> >>>
> >>>     cd $val.mount
> >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> >>>
> >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> >>>     wait
> >>>
> >>> <...>
> >>>
> >>>     cd ../
> >>>     rm -rf tmp_$val
> >>>     cd ../
> >>> done
> >>>
> >>> wait
> >>> <...>
> >>> for val in system vendor product odm; do
> >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> >>>     tail --pid=$(cat fs/$val.pid) -f /dev/null
> >>>     umount fs/$val.mount
> >>>     rmdir fs/$val.mount
> >>>     rm -f images/$val.img images/$val.raw 2>/dev/null
> >>> done
> >>> ```
> >>>
> >>> The offending part is:
> >>> ```
> >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> >>>     wait
> >>> ```
> >>>
> >>> When that part is reached, the script forks thousands of new processes
> >>> and starts reading from f2fs. (We simply decided to rely on Linux's
> >>> task scheduler and didn't bother to limit the number of
> >>> sub-processes.)
> >>>
> >>> I am able to reliably cause f2fs to return EIO on some files:
> >>> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> >>> Input/output error
> >>> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> >>> Input/output error
> >>> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> >>> Input/output error
> >>> rsync: [sender] read errors mapping
> >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> >>> Input/output error (5)
> >>> rsync: [sender] read errors mapping
> >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> >>> Input/output error (5)
> >>> rsync: [sender] read errors mapping
> >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> >>> Input/output error (5)
> >>> rsync: [sender] read errors mapping
> >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> >>> Input/output error (5)
> >>> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> >>> -- update retained.
> >>> rsync: [sender] read errors mapping
> >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> >>> Input/output error (5)
> >>> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> >>> -- update retained.
> >>> rsync: [sender] read errors mapping
> >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> >>> Input/output error (5)
> >>> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> >>> -- update retained.
> >>> rsync error: some files/attrs were not transferred (see previous
> >>> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
> >>
> >> Hi Juhyung,
> >>
> >> Do you enable compression on specified files? if so, do all EIOs come from compressed
> >> file?
> >>
> >> Thanks,
> >>
> >>>
> >>> The dmesg remains silent.
> >>>
> >>> When I modify the script a little bit and force it to run in a
> >>> single-thread (by removing &), it runs well.
> >>>
> >>> I was able to confirm that it isn't a memory issue. The server has
> >>> 50G+ of free memory, and the issue is still reliably reproducible when
> >>> I defragment the memory by dropping caches and doing `echo 1 >
> >>> /proc/sys/vm/compact_memory`.
> >>>
> >>> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> >>> unsupported by ZFS. And it being a production server, I am somewhat
> >>> limited in dabbling around the kernel.
> >>>
> >>> I am planning to test a new kernel with v5.15 +
> >>> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> >>> fixed with newer commits, I'd appreciate a tip.
> >>>
> >>> Thanks.
> >>>
> >>>
> >>> _______________________________________________
> >>> Linux-f2fs-devel mailing list
> >>> Linux-f2fs-devel@lists.sourceforge.net
> >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-15 10:25       ` Juhyung Park
@ 2022-03-15 10:48         ` Juhyung Park
  2022-03-15 20:49           ` Jaegeuk Kim
  0 siblings, 1 reply; 11+ messages in thread
From: Juhyung Park @ 2022-03-15 10:48 UTC (permalink / raw)
  To: Chao Yu; +Cc: linux-f2fs-devel

Here's strace from both commands right after drop_caches.

cp: https://pastebin.com/raw/GH51BjV4
tar: https://pastebin.com/raw/0wgiu7fE

It seems that cp is calling a few extra fadvise and ioctl.

On Tue, Mar 15, 2022 at 7:25 PM Juhyung Park <qkrwngud825@gmail.com> wrote:
>
> Hi Chao,
>
> On Tue, Mar 15, 2022 at 5:45 PM Chao Yu <chao@kernel.org> wrote:
> >
> > Hi Juhyung,
> >
> > On 2022/3/15 16:37, Juhyung Park wrote:
> > > Hi Chao,
> > >
> > > I actually have never used compression on f2fs yet and this image is
> > > from an unmodified firmware taken directly from the OTA server, not
> > > made by me.
> >
> > Alright.
> >
> > >
> > > If you can tell me how to check if the file is compressed or not, I'll
> > > run it and report back.
> >
> > lsattr <target_file_path> |grep c
> >
> > You can check the flag's mean in manual of chattr.
>
> Ah, didn't know that lsattr shows compressed state.
> It seems that every single file under that system partition is
> compressed except ones with file size of < 4096, but the error only
> happens on select files.
>
> >
> > >
> > > I never had this sort of issue (a lot of multiple readers causing
> > > troubles) before with a regular R/W, uncompressed f2fs partition, so
> > > I'm guessing it has something to do with RO or compression feature.
> >
> > Agreed, maybe we can repacking image to RO one or compressed one to
> > check which one can cause this issue?
>
> Is this easy to do standalone?
>
> AOSP's ext4 image creation was already super complicated to do
> out-of-tree with their dedup and minimizing file size (so that the
> resulting image is as small as possible) thinggies going on. I think
> I've seen EROFS support but I don't know how f2fs system images are
> handled under AOSP.
>
> Also, I moved the environment to my personal laptop. It's running
> v5.15.28 with the latest f2fs-stable/linux-5.15.y merged and it still
> happens there.
>
> One weird thing that I've been able to confirm is that it seems that
> multiple readers are not the problem:
> #!/bin/bash
>
> set -v
>
> APK=system/app/some_app/some_app.apk
>
> echo 3 > /proc/sys/vm/drop_caches
> tar -pcf - $APK | cat > /dev/null     # Works!
> cp $APK /tmp                          # Works!
>
> echo 3 > /proc/sys/vm/drop_caches
> tar -pcf - $APK | cat > /dev/null     # Works!
> echo 3 > /proc/sys/vm/drop_caches
> cp $APK /tmp                          # cp: error reading
> 'system/app/some_app/some_app.apk': Input/output error
>
> echo 3 > /proc/sys/vm/drop_caches
> cp $APK /tmp                          # cp: error reading
> 'system/app/some_app/some_app.apk': Input/output error
> tar -pcf - $APK | cat > /dev/null     # tar:
> system/app/some_app/some_app.apk: File shrank by 3375397 bytes;
> padding with zeros
>
> echo 3 > /proc/sys/vm/drop_caches
> cp $APK /tmp                          # cp: error reading
> 'system/app/some_app/some_app.apk': Input/output error
> echo 3 > /proc/sys/vm/drop_caches
> tar -pcf - $APK | cat > /dev/null     # Works!
>
> I'm guessing some system-calls that cp use causes f2fs to malfunction.
> (GNU tar is smart enough to detect if the output is /dev/null, so it
> skips the actual file read when it is piped directly to /dev/null, so
> I added `cat` in between.)
>
> Hope this new info helps.
>
> Thanks.
>
> >
> > Thanks,
> >
> > >
> > > Thanks.
> > >
> > > On Tue, Mar 15, 2022 at 5:33 PM Chao Yu <chao@kernel.org> wrote:
> > >>
> > >> On 2022/3/14 1:52, Juhyung Park wrote:
> > >>> Hi.
> > >>>
> > >>> We have a production server storing some Android firmwares over a ZFS
> > >>> file-system, and we noticed some issues when extracting firmware files
> > >>> that use f2fs for Android system partitions.
> > >>>
> > >>> This is a proprietary environment, so I cannot disclose every detail,
> > >>> so I hope you understand. I'll try to elaborate as much as I can.
> > >>>
> > >>> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> > >>> from v5.13 after noticing RO feature added on v5.14 being required).
> > >>> We have a set of scripts extracting Android firmware files. The input
> > >>> is typically the OTA zip file and after going through the script, it
> > >>> extracts every file and binary image from a given file.
> > >>>
> > >>> So that includes extracting super (dynamic partition), ext4 system
> > >>> partitions with dedup enabled, and now, f2fs system partitions with RO
> > >>> and compression enabled.
> > >>>
> > >>> Our script never had to deal with f2fs before as we only started
> > >>> seeing f2fs system partitions with recently released devices.
> > >>>
> > >>> This is the f2fs mount flag after mounting with `mount -o ro
> > >>> system.raw /some/dir`:
> > >>> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> > >>>
> > >>> There are *a lot* of files in Android firmware these days, so we try
> > >>> to parallelize parts when we can.
> > >>>
> > >>> This is a snippet of the script:
> > >>> ```
> > >>> #!/bin/bash
> > >>> <...>
> > >>> RSYNC="rsync -ahAXx --inplace --numeric-ids"
> > >>> <...>
> > >>> for val in system vendor product odm; do
> > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > >>>
> > >>>     mkdir -p fs
> > >>>     cd fs
> > >>>
> > >>>     mkdir -p $val.mount tmp_$val
> > >>>     mount -o ro ../images/$val.raw $val.mount
> > >>>
> > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > >>>     echo $! > $val.pid
> > >>>     disown
> > >>>
> > >>>     cd $val.mount
> > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > >>>
> > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > >>>     wait
> > >>>
> > >>> <...>
> > >>>
> > >>>     cd ../
> > >>>     rm -rf tmp_$val
> > >>>     cd ../
> > >>> done
> > >>>
> > >>> wait
> > >>> <...>
> > >>> for val in system vendor product odm; do
> > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > >>>     tail --pid=$(cat fs/$val.pid) -f /dev/null
> > >>>     umount fs/$val.mount
> > >>>     rmdir fs/$val.mount
> > >>>     rm -f images/$val.img images/$val.raw 2>/dev/null
> > >>> done
> > >>> ```
> > >>>
> > >>> The offending part is:
> > >>> ```
> > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > >>>     wait
> > >>> ```
> > >>>
> > >>> When that part is reached, the script forks thousands of new processes
> > >>> and starts reading from f2fs. (We simply decided to rely on Linux's
> > >>> task scheduler and didn't bother to limit the number of
> > >>> sub-processes.)
> > >>>
> > >>> I am able to reliably cause f2fs to return EIO on some files:
> > >>> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> > >>> Input/output error
> > >>> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> > >>> Input/output error
> > >>> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> > >>> Input/output error
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > >>> Input/output error (5)
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > >>> Input/output error (5)
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > >>> Input/output error (5)
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > >>> Input/output error (5)
> > >>> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> > >>> -- update retained.
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > >>> Input/output error (5)
> > >>> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> > >>> -- update retained.
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > >>> Input/output error (5)
> > >>> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> > >>> -- update retained.
> > >>> rsync error: some files/attrs were not transferred (see previous
> > >>> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
> > >>
> > >> Hi Juhyung,
> > >>
> > >> Do you enable compression on specified files? if so, do all EIOs come from compressed
> > >> file?
> > >>
> > >> Thanks,
> > >>
> > >>>
> > >>> The dmesg remains silent.
> > >>>
> > >>> When I modify the script a little bit and force it to run in a
> > >>> single-thread (by removing &), it runs well.
> > >>>
> > >>> I was able to confirm that it isn't a memory issue. The server has
> > >>> 50G+ of free memory, and the issue is still reliably reproducible when
> > >>> I defragment the memory by dropping caches and doing `echo 1 >
> > >>> /proc/sys/vm/compact_memory`.
> > >>>
> > >>> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> > >>> unsupported by ZFS. And it being a production server, I am somewhat
> > >>> limited in dabbling around the kernel.
> > >>>
> > >>> I am planning to test a new kernel with v5.15 +
> > >>> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> > >>> fixed with newer commits, I'd appreciate a tip.
> > >>>
> > >>> Thanks.
> > >>>
> > >>>
> > >>> _______________________________________________
> > >>> Linux-f2fs-devel mailing list
> > >>> Linux-f2fs-devel@lists.sourceforge.net
> > >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-15 10:48         ` Juhyung Park
@ 2022-03-15 20:49           ` Jaegeuk Kim
  2022-03-16  8:43             ` Juhyung Park
  0 siblings, 1 reply; 11+ messages in thread
From: Jaegeuk Kim @ 2022-03-15 20:49 UTC (permalink / raw)
  To: Juhyung Park; +Cc: linux-f2fs-devel

On 03/15, Juhyung Park wrote:
> Here's strace from both commands right after drop_caches.
> 
> cp: https://pastebin.com/raw/GH51BjV4
> tar: https://pastebin.com/raw/0wgiu7fE

Can we catch who sets EIO?

# echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_readpage/enable
# echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_readpages/enable
# echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_decompress_pages_start/enable
# echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_decompress_pages_end/enable
# echo 1 > /sys/kernel/tracing/tracing_on
# cat /sys/kernel/tracing/trace_pipe

> 
> It seems that cp is calling a few extra fadvise and ioctl.
> 
> On Tue, Mar 15, 2022 at 7:25 PM Juhyung Park <qkrwngud825@gmail.com> wrote:
> >
> > Hi Chao,
> >
> > On Tue, Mar 15, 2022 at 5:45 PM Chao Yu <chao@kernel.org> wrote:
> > >
> > > Hi Juhyung,
> > >
> > > On 2022/3/15 16:37, Juhyung Park wrote:
> > > > Hi Chao,
> > > >
> > > > I actually have never used compression on f2fs yet and this image is
> > > > from an unmodified firmware taken directly from the OTA server, not
> > > > made by me.
> > >
> > > Alright.
> > >
> > > >
> > > > If you can tell me how to check if the file is compressed or not, I'll
> > > > run it and report back.
> > >
> > > lsattr <target_file_path> |grep c
> > >
> > > You can check the flag's mean in manual of chattr.
> >
> > Ah, didn't know that lsattr shows compressed state.
> > It seems that every single file under that system partition is
> > compressed except ones with file size of < 4096, but the error only
> > happens on select files.
> >
> > >
> > > >
> > > > I never had this sort of issue (a lot of multiple readers causing
> > > > troubles) before with a regular R/W, uncompressed f2fs partition, so
> > > > I'm guessing it has something to do with RO or compression feature.
> > >
> > > Agreed, maybe we can repacking image to RO one or compressed one to
> > > check which one can cause this issue?
> >
> > Is this easy to do standalone?
> >
> > AOSP's ext4 image creation was already super complicated to do
> > out-of-tree with their dedup and minimizing file size (so that the
> > resulting image is as small as possible) thinggies going on. I think
> > I've seen EROFS support but I don't know how f2fs system images are
> > handled under AOSP.
> >
> > Also, I moved the environment to my personal laptop. It's running
> > v5.15.28 with the latest f2fs-stable/linux-5.15.y merged and it still
> > happens there.
> >
> > One weird thing that I've been able to confirm is that it seems that
> > multiple readers are not the problem:
> > #!/bin/bash
> >
> > set -v
> >
> > APK=system/app/some_app/some_app.apk
> >
> > echo 3 > /proc/sys/vm/drop_caches
> > tar -pcf - $APK | cat > /dev/null     # Works!
> > cp $APK /tmp                          # Works!
> >
> > echo 3 > /proc/sys/vm/drop_caches
> > tar -pcf - $APK | cat > /dev/null     # Works!
> > echo 3 > /proc/sys/vm/drop_caches
> > cp $APK /tmp                          # cp: error reading
> > 'system/app/some_app/some_app.apk': Input/output error
> >
> > echo 3 > /proc/sys/vm/drop_caches
> > cp $APK /tmp                          # cp: error reading
> > 'system/app/some_app/some_app.apk': Input/output error
> > tar -pcf - $APK | cat > /dev/null     # tar:
> > system/app/some_app/some_app.apk: File shrank by 3375397 bytes;
> > padding with zeros
> >
> > echo 3 > /proc/sys/vm/drop_caches
> > cp $APK /tmp                          # cp: error reading
> > 'system/app/some_app/some_app.apk': Input/output error
> > echo 3 > /proc/sys/vm/drop_caches
> > tar -pcf - $APK | cat > /dev/null     # Works!
> >
> > I'm guessing some system-calls that cp use causes f2fs to malfunction.
> > (GNU tar is smart enough to detect if the output is /dev/null, so it
> > skips the actual file read when it is piped directly to /dev/null, so
> > I added `cat` in between.)
> >
> > Hope this new info helps.
> >
> > Thanks.
> >
> > >
> > > Thanks,
> > >
> > > >
> > > > Thanks.
> > > >
> > > > On Tue, Mar 15, 2022 at 5:33 PM Chao Yu <chao@kernel.org> wrote:
> > > >>
> > > >> On 2022/3/14 1:52, Juhyung Park wrote:
> > > >>> Hi.
> > > >>>
> > > >>> We have a production server storing some Android firmwares over a ZFS
> > > >>> file-system, and we noticed some issues when extracting firmware files
> > > >>> that use f2fs for Android system partitions.
> > > >>>
> > > >>> This is a proprietary environment, so I cannot disclose every detail,
> > > >>> so I hope you understand. I'll try to elaborate as much as I can.
> > > >>>
> > > >>> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> > > >>> from v5.13 after noticing RO feature added on v5.14 being required).
> > > >>> We have a set of scripts extracting Android firmware files. The input
> > > >>> is typically the OTA zip file and after going through the script, it
> > > >>> extracts every file and binary image from a given file.
> > > >>>
> > > >>> So that includes extracting super (dynamic partition), ext4 system
> > > >>> partitions with dedup enabled, and now, f2fs system partitions with RO
> > > >>> and compression enabled.
> > > >>>
> > > >>> Our script never had to deal with f2fs before as we only started
> > > >>> seeing f2fs system partitions with recently released devices.
> > > >>>
> > > >>> This is the f2fs mount flag after mounting with `mount -o ro
> > > >>> system.raw /some/dir`:
> > > >>> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> > > >>>
> > > >>> There are *a lot* of files in Android firmware these days, so we try
> > > >>> to parallelize parts when we can.
> > > >>>
> > > >>> This is a snippet of the script:
> > > >>> ```
> > > >>> #!/bin/bash
> > > >>> <...>
> > > >>> RSYNC="rsync -ahAXx --inplace --numeric-ids"
> > > >>> <...>
> > > >>> for val in system vendor product odm; do
> > > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > > >>>
> > > >>>     mkdir -p fs
> > > >>>     cd fs
> > > >>>
> > > >>>     mkdir -p $val.mount tmp_$val
> > > >>>     mount -o ro ../images/$val.raw $val.mount
> > > >>>
> > > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > > >>>     echo $! > $val.pid
> > > >>>     disown
> > > >>>
> > > >>>     cd $val.mount
> > > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > > >>>
> > > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > > >>>     wait
> > > >>>
> > > >>> <...>
> > > >>>
> > > >>>     cd ../
> > > >>>     rm -rf tmp_$val
> > > >>>     cd ../
> > > >>> done
> > > >>>
> > > >>> wait
> > > >>> <...>
> > > >>> for val in system vendor product odm; do
> > > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > > >>>     tail --pid=$(cat fs/$val.pid) -f /dev/null
> > > >>>     umount fs/$val.mount
> > > >>>     rmdir fs/$val.mount
> > > >>>     rm -f images/$val.img images/$val.raw 2>/dev/null
> > > >>> done
> > > >>> ```
> > > >>>
> > > >>> The offending part is:
> > > >>> ```
> > > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > > >>>     wait
> > > >>> ```
> > > >>>
> > > >>> When that part is reached, the script forks thousands of new processes
> > > >>> and starts reading from f2fs. (We simply decided to rely on Linux's
> > > >>> task scheduler and didn't bother to limit the number of
> > > >>> sub-processes.)
> > > >>>
> > > >>> I am able to reliably cause f2fs to return EIO on some files:
> > > >>> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> > > >>> Input/output error
> > > >>> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> > > >>> Input/output error
> > > >>> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> > > >>> Input/output error
> > > >>> rsync: [sender] read errors mapping
> > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > > >>> Input/output error (5)
> > > >>> rsync: [sender] read errors mapping
> > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > > >>> Input/output error (5)
> > > >>> rsync: [sender] read errors mapping
> > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > > >>> Input/output error (5)
> > > >>> rsync: [sender] read errors mapping
> > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > > >>> Input/output error (5)
> > > >>> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> > > >>> -- update retained.
> > > >>> rsync: [sender] read errors mapping
> > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > > >>> Input/output error (5)
> > > >>> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> > > >>> -- update retained.
> > > >>> rsync: [sender] read errors mapping
> > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > > >>> Input/output error (5)
> > > >>> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> > > >>> -- update retained.
> > > >>> rsync error: some files/attrs were not transferred (see previous
> > > >>> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
> > > >>
> > > >> Hi Juhyung,
> > > >>
> > > >> Do you enable compression on specified files? if so, do all EIOs come from compressed
> > > >> file?
> > > >>
> > > >> Thanks,
> > > >>
> > > >>>
> > > >>> The dmesg remains silent.
> > > >>>
> > > >>> When I modify the script a little bit and force it to run in a
> > > >>> single-thread (by removing &), it runs well.
> > > >>>
> > > >>> I was able to confirm that it isn't a memory issue. The server has
> > > >>> 50G+ of free memory, and the issue is still reliably reproducible when
> > > >>> I defragment the memory by dropping caches and doing `echo 1 >
> > > >>> /proc/sys/vm/compact_memory`.
> > > >>>
> > > >>> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> > > >>> unsupported by ZFS. And it being a production server, I am somewhat
> > > >>> limited in dabbling around the kernel.
> > > >>>
> > > >>> I am planning to test a new kernel with v5.15 +
> > > >>> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> > > >>> fixed with newer commits, I'd appreciate a tip.
> > > >>>
> > > >>> Thanks.
> > > >>>
> > > >>>
> > > >>> _______________________________________________
> > > >>> Linux-f2fs-devel mailing list
> > > >>> Linux-f2fs-devel@lists.sourceforge.net
> > > >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> 
> 
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-15 20:49           ` Jaegeuk Kim
@ 2022-03-16  8:43             ` Juhyung Park
  2022-03-16 10:00               ` Chao Yu
  0 siblings, 1 reply; 11+ messages in thread
From: Juhyung Park @ 2022-03-16  8:43 UTC (permalink / raw)
  To: Jaegeuk Kim; +Cc: linux-f2fs-devel

Hi Jaegeuk,

I'll perform the trace debugging later.

Meanwhile, we encountered another issue.

For now, I added a workaround that performs mlockall() on all files
(vmtouch -L $val.mount -d -w -P $val.vmtouch) under the f2fs partition
so that the cp command works.

After confirming that extraction works, I've parallelized the script
again (on our production server again, so v5.15 without f2fs-stable
backports) and while userspace processes don't complain, there's
something going on on dmesg:

[224319.321668] loop0: detected capacity change from 0 to 11041080
[224319.361530] F2FS-fs (loop0): Found nat_bits in checkpoint
[224319.365029] F2FS-fs (loop0): write access unavailable, skipping recovery
[224319.365033] F2FS-fs (loop0): Mounted with checkpoint version = 4321
[224340.047471] loop1: detected capacity change from 0 to 11769352
[224340.101509] F2FS-fs (loop1): Found nat_bits in checkpoint
[224340.103799] F2FS-fs (loop1): write access unavailable, skipping recovery
[224340.103802] F2FS-fs (loop1): Mounted with checkpoint version = 4321
[224361.763528] loop2: detected capacity change from 0 to 11386480
[224361.812891] F2FS-fs (loop2): Found nat_bits in checkpoint
[224361.814881] F2FS-fs (loop2): write access unavailable, skipping recovery
[224361.814886] F2FS-fs (loop2): Mounted with checkpoint version = 4321
[224386.701011] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[224386.701248] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[224386.701779] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[224386.701806] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[224386.701826] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[224391.823216] loop3: detected capacity change from 0 to 11045240
[224391.884383] F2FS-fs (loop3): Found nat_bits in checkpoint
[224392.194608] F2FS-fs (loop3): write access unavailable, skipping recovery
[224392.194612] F2FS-fs (loop3): Mounted with checkpoint version = 4321
[224401.386330] loop4: detected capacity change from 0 to 11428112
[224401.424575] F2FS-fs (loop4): Found nat_bits in checkpoint
[224401.428872] F2FS-fs (loop4): write access unavailable, skipping recovery
[224401.428875] F2FS-fs (loop4): Mounted with checkpoint version = 4321
[224576.100839] loop0: detected capacity change from 0 to 11041080
[224576.140862] F2FS-fs (loop0): Found nat_bits in checkpoint
[224576.142694] F2FS-fs (loop0): write access unavailable, skipping recovery
[224576.142699] F2FS-fs (loop0): Mounted with checkpoint version = 4321
[224599.330003] loop1: detected capacity change from 0 to 2913344
[224599.360425] F2FS-fs (loop1): Found nat_bits in checkpoint
[224599.362257] F2FS-fs (loop1): write access unavailable, skipping recovery
[224599.362260] F2FS-fs (loop1): Mounted with checkpoint version = 4321
[224602.241868] loop2: detected capacity change from 0 to 2064344
[224602.276146] F2FS-fs (loop2): Found nat_bits in checkpoint
[224602.277955] F2FS-fs (loop2): write access unavailable, skipping recovery
[224602.277971] F2FS-fs (loop2): Mounted with checkpoint version = 4321
[224629.346056] loop3: detected capacity change from 0 to 41776
[224629.375324] F2FS-fs (loop3): Found nat_bits in checkpoint
[224629.377167] F2FS-fs (loop3): write access unavailable, skipping recovery
[224629.377170] F2FS-fs (loop3): Mounted with checkpoint version = 4321
[225029.224078] loop0: detected capacity change from 0 to 11411472
[225029.270396] F2FS-fs (loop0): Found nat_bits in checkpoint
[225029.271345] F2FS-fs (loop0): write access unavailable, skipping recovery
[225029.271348] F2FS-fs (loop0): Mounted with checkpoint version = 4321
[225625.045254] loop0: detected capacity change from 0 to 11045240
[225625.078435] F2FS-fs (loop0): Found nat_bits in checkpoint
[225625.079453] F2FS-fs (loop0): write access unavailable, skipping recovery
[225625.079455] F2FS-fs (loop0): Mounted with checkpoint version = 4321
[225651.466438] loop1: detected capacity change from 0 to 11428112
[225651.532489] F2FS-fs (loop1): Found nat_bits in checkpoint
[225651.582589] F2FS-fs (loop1): write access unavailable, skipping recovery
[225651.582594] F2FS-fs (loop1): Mounted with checkpoint version = 4321
[225653.729775] loop2: detected capacity change from 0 to 2896696
[225653.817115] F2FS-fs (loop2): Found nat_bits in checkpoint
[225653.842649] F2FS-fs (loop2): write access unavailable, skipping recovery
[225653.842654] F2FS-fs (loop2): Mounted with checkpoint version = 4321
[225665.480652] loop3: detected capacity change from 0 to 2064344
[225665.548952] F2FS-fs (loop3): Found nat_bits in checkpoint
[225665.551460] F2FS-fs (loop3): write access unavailable, skipping recovery
[225665.551465] F2FS-fs (loop3): Mounted with checkpoint version = 4321
[225687.173551] loop4: detected capacity change from 0 to 3753984
[225687.297132] F2FS-fs (loop4): Found nat_bits in checkpoint
[225687.299702] F2FS-fs (loop4): write access unavailable, skipping recovery
[225687.299705] F2FS-fs (loop4): Mounted with checkpoint version = 4321
[225694.934235] loop5: detected capacity change from 0 to 1868760
[225695.047462] F2FS-fs (loop5): Found nat_bits in checkpoint
[225695.049562] F2FS-fs (loop5): write access unavailable, skipping recovery
[225695.049567] F2FS-fs (loop5): Mounted with checkpoint version = 4321
[225704.254531] loop6: detected capacity change from 0 to 41776
[225704.282972] F2FS-fs (loop6): Found nat_bits in checkpoint
[225704.286725] F2FS-fs (loop6): write access unavailable, skipping recovery
[225704.286730] F2FS-fs (loop6): Mounted with checkpoint version = 4321
[225723.689515] loop0: detected capacity change from 0 to 11411472
[225723.723250] F2FS-fs (loop0): Found nat_bits in checkpoint
[225723.725740] F2FS-fs (loop0): write access unavailable, skipping recovery
[225723.725744] F2FS-fs (loop0): Mounted with checkpoint version = 4321
[225727.574004] loop2: detected capacity change from 0 to 11386480
[225727.621016] F2FS-fs (loop2): Found nat_bits in checkpoint
[225727.623296] F2FS-fs (loop2): write access unavailable, skipping recovery
[225727.623300] F2FS-fs (loop2): Mounted with checkpoint version = 4321
[225728.404524] loop3: detected capacity change from 0 to 41776
[225728.440403] F2FS-fs (loop3): Found nat_bits in checkpoint
[225728.442743] F2FS-fs (loop3): write access unavailable, skipping recovery
[225728.442748] F2FS-fs (loop3): Mounted with checkpoint version = 4321
[225731.328330] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[225731.328400] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[225731.328424] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
[225743.064116] loop1: detected capacity change from 0 to 11769352
[225743.110306] F2FS-fs (loop1): Found nat_bits in checkpoint
[225743.113408] F2FS-fs (loop1): write access unavailable, skipping recovery
[225743.113414] F2FS-fs (loop1): Mounted with checkpoint version = 4321
[225761.148410] loop3: detected capacity change from 0 to 3758152
[225761.195014] F2FS-fs (loop3): Found nat_bits in checkpoint
[225761.196431] F2FS-fs (loop3): write access unavailable, skipping recovery
[225761.196432] F2FS-fs (loop3): Mounted with checkpoint version = 4321
[225766.885693] loop4: detected capacity change from 0 to 1868760
[225767.010849] F2FS-fs (loop4): Found nat_bits in checkpoint
[225767.013024] F2FS-fs (loop4): write access unavailable, skipping recovery
[225767.013027] F2FS-fs (loop4): Mounted with checkpoint version = 4321
[225781.475142] loop5: detected capacity change from 0 to 3766472
[225781.500625] F2FS-fs (loop5): Found nat_bits in checkpoint
[225781.503148] F2FS-fs (loop5): write access unavailable, skipping recovery
[225781.503150] F2FS-fs (loop5): Mounted with checkpoint version = 4321
[225785.367635] loop6: detected capacity change from 0 to 1868760
[225785.474029] F2FS-fs (loop6): Found nat_bits in checkpoint
[225785.475673] F2FS-fs (loop6): write access unavailable, skipping recovery
[225785.475678] F2FS-fs (loop6): Mounted with checkpoint version = 4321
[225802.664721] loop7: detected capacity change from 0 to 41776
[225802.706846] F2FS-fs (loop7): Found nat_bits in checkpoint
[225802.709092] F2FS-fs (loop7): write access unavailable, skipping recovery
[225802.709095] F2FS-fs (loop7): Mounted with checkpoint version = 4321
[225812.972186] loop0: detected capacity change from 0 to 41776
[225813.000175] F2FS-fs (loop0): Found nat_bits in checkpoint
[225813.003617] F2FS-fs (loop0): write access unavailable, skipping recovery
[225813.003619] F2FS-fs (loop0): Mounted with checkpoint version = 4321

Those "lz4 invalid rlen:16384, expected:16384", is this normal?

Thanks.

On Wed, Mar 16, 2022 at 5:49 AM Jaegeuk Kim <jaegeuk@kernel.org> wrote:
>
> On 03/15, Juhyung Park wrote:
> > Here's strace from both commands right after drop_caches.
> >
> > cp: https://pastebin.com/raw/GH51BjV4
> > tar: https://pastebin.com/raw/0wgiu7fE
>
> Can we catch who sets EIO?
>
> # echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_readpage/enable
> # echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_readpages/enable
> # echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_decompress_pages_start/enable
> # echo 1 > /sys/kernel/tracing/events/f2fs/f2fs_decompress_pages_end/enable
> # echo 1 > /sys/kernel/tracing/tracing_on
> # cat /sys/kernel/tracing/trace_pipe
>
> >
> > It seems that cp is calling a few extra fadvise and ioctl.
> >
> > On Tue, Mar 15, 2022 at 7:25 PM Juhyung Park <qkrwngud825@gmail.com> wrote:
> > >
> > > Hi Chao,
> > >
> > > On Tue, Mar 15, 2022 at 5:45 PM Chao Yu <chao@kernel.org> wrote:
> > > >
> > > > Hi Juhyung,
> > > >
> > > > On 2022/3/15 16:37, Juhyung Park wrote:
> > > > > Hi Chao,
> > > > >
> > > > > I actually have never used compression on f2fs yet and this image is
> > > > > from an unmodified firmware taken directly from the OTA server, not
> > > > > made by me.
> > > >
> > > > Alright.
> > > >
> > > > >
> > > > > If you can tell me how to check if the file is compressed or not, I'll
> > > > > run it and report back.
> > > >
> > > > lsattr <target_file_path> |grep c
> > > >
> > > > You can check the flag's mean in manual of chattr.
> > >
> > > Ah, didn't know that lsattr shows compressed state.
> > > It seems that every single file under that system partition is
> > > compressed except ones with file size of < 4096, but the error only
> > > happens on select files.
> > >
> > > >
> > > > >
> > > > > I never had this sort of issue (a lot of multiple readers causing
> > > > > troubles) before with a regular R/W, uncompressed f2fs partition, so
> > > > > I'm guessing it has something to do with RO or compression feature.
> > > >
> > > > Agreed, maybe we can repacking image to RO one or compressed one to
> > > > check which one can cause this issue?
> > >
> > > Is this easy to do standalone?
> > >
> > > AOSP's ext4 image creation was already super complicated to do
> > > out-of-tree with their dedup and minimizing file size (so that the
> > > resulting image is as small as possible) thinggies going on. I think
> > > I've seen EROFS support but I don't know how f2fs system images are
> > > handled under AOSP.
> > >
> > > Also, I moved the environment to my personal laptop. It's running
> > > v5.15.28 with the latest f2fs-stable/linux-5.15.y merged and it still
> > > happens there.
> > >
> > > One weird thing that I've been able to confirm is that it seems that
> > > multiple readers are not the problem:
> > > #!/bin/bash
> > >
> > > set -v
> > >
> > > APK=system/app/some_app/some_app.apk
> > >
> > > echo 3 > /proc/sys/vm/drop_caches
> > > tar -pcf - $APK | cat > /dev/null     # Works!
> > > cp $APK /tmp                          # Works!
> > >
> > > echo 3 > /proc/sys/vm/drop_caches
> > > tar -pcf - $APK | cat > /dev/null     # Works!
> > > echo 3 > /proc/sys/vm/drop_caches
> > > cp $APK /tmp                          # cp: error reading
> > > 'system/app/some_app/some_app.apk': Input/output error
> > >
> > > echo 3 > /proc/sys/vm/drop_caches
> > > cp $APK /tmp                          # cp: error reading
> > > 'system/app/some_app/some_app.apk': Input/output error
> > > tar -pcf - $APK | cat > /dev/null     # tar:
> > > system/app/some_app/some_app.apk: File shrank by 3375397 bytes;
> > > padding with zeros
> > >
> > > echo 3 > /proc/sys/vm/drop_caches
> > > cp $APK /tmp                          # cp: error reading
> > > 'system/app/some_app/some_app.apk': Input/output error
> > > echo 3 > /proc/sys/vm/drop_caches
> > > tar -pcf - $APK | cat > /dev/null     # Works!
> > >
> > > I'm guessing some system-calls that cp use causes f2fs to malfunction.
> > > (GNU tar is smart enough to detect if the output is /dev/null, so it
> > > skips the actual file read when it is piped directly to /dev/null, so
> > > I added `cat` in between.)
> > >
> > > Hope this new info helps.
> > >
> > > Thanks.
> > >
> > > >
> > > > Thanks,
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > > On Tue, Mar 15, 2022 at 5:33 PM Chao Yu <chao@kernel.org> wrote:
> > > > >>
> > > > >> On 2022/3/14 1:52, Juhyung Park wrote:
> > > > >>> Hi.
> > > > >>>
> > > > >>> We have a production server storing some Android firmwares over a ZFS
> > > > >>> file-system, and we noticed some issues when extracting firmware files
> > > > >>> that use f2fs for Android system partitions.
> > > > >>>
> > > > >>> This is a proprietary environment, so I cannot disclose every detail,
> > > > >>> so I hope you understand. I'll try to elaborate as much as I can.
> > > > >>>
> > > > >>> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> > > > >>> from v5.13 after noticing RO feature added on v5.14 being required).
> > > > >>> We have a set of scripts extracting Android firmware files. The input
> > > > >>> is typically the OTA zip file and after going through the script, it
> > > > >>> extracts every file and binary image from a given file.
> > > > >>>
> > > > >>> So that includes extracting super (dynamic partition), ext4 system
> > > > >>> partitions with dedup enabled, and now, f2fs system partitions with RO
> > > > >>> and compression enabled.
> > > > >>>
> > > > >>> Our script never had to deal with f2fs before as we only started
> > > > >>> seeing f2fs system partitions with recently released devices.
> > > > >>>
> > > > >>> This is the f2fs mount flag after mounting with `mount -o ro
> > > > >>> system.raw /some/dir`:
> > > > >>> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> > > > >>>
> > > > >>> There are *a lot* of files in Android firmware these days, so we try
> > > > >>> to parallelize parts when we can.
> > > > >>>
> > > > >>> This is a snippet of the script:
> > > > >>> ```
> > > > >>> #!/bin/bash
> > > > >>> <...>
> > > > >>> RSYNC="rsync -ahAXx --inplace --numeric-ids"
> > > > >>> <...>
> > > > >>> for val in system vendor product odm; do
> > > > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > > > >>>
> > > > >>>     mkdir -p fs
> > > > >>>     cd fs
> > > > >>>
> > > > >>>     mkdir -p $val.mount tmp_$val
> > > > >>>     mount -o ro ../images/$val.raw $val.mount
> > > > >>>
> > > > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > > > >>>     echo $! > $val.pid
> > > > >>>     disown
> > > > >>>
> > > > >>>     cd $val.mount
> > > > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > > > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > > > >>>
> > > > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > > > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > > > >>>     wait
> > > > >>>
> > > > >>> <...>
> > > > >>>
> > > > >>>     cd ../
> > > > >>>     rm -rf tmp_$val
> > > > >>>     cd ../
> > > > >>> done
> > > > >>>
> > > > >>> wait
> > > > >>> <...>
> > > > >>> for val in system vendor product odm; do
> > > > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > > > >>>     tail --pid=$(cat fs/$val.pid) -f /dev/null
> > > > >>>     umount fs/$val.mount
> > > > >>>     rmdir fs/$val.mount
> > > > >>>     rm -f images/$val.img images/$val.raw 2>/dev/null
> > > > >>> done
> > > > >>> ```
> > > > >>>
> > > > >>> The offending part is:
> > > > >>> ```
> > > > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > > > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > > > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > > > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > > > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > > > >>>     wait
> > > > >>> ```
> > > > >>>
> > > > >>> When that part is reached, the script forks thousands of new processes
> > > > >>> and starts reading from f2fs. (We simply decided to rely on Linux's
> > > > >>> task scheduler and didn't bother to limit the number of
> > > > >>> sub-processes.)
> > > > >>>
> > > > >>> I am able to reliably cause f2fs to return EIO on some files:
> > > > >>> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> > > > >>> Input/output error
> > > > >>> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> > > > >>> Input/output error
> > > > >>> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> > > > >>> Input/output error
> > > > >>> rsync: [sender] read errors mapping
> > > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > > > >>> Input/output error (5)
> > > > >>> rsync: [sender] read errors mapping
> > > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > > > >>> Input/output error (5)
> > > > >>> rsync: [sender] read errors mapping
> > > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > > > >>> Input/output error (5)
> > > > >>> rsync: [sender] read errors mapping
> > > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > > > >>> Input/output error (5)
> > > > >>> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> > > > >>> -- update retained.
> > > > >>> rsync: [sender] read errors mapping
> > > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > > > >>> Input/output error (5)
> > > > >>> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> > > > >>> -- update retained.
> > > > >>> rsync: [sender] read errors mapping
> > > > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > > > >>> Input/output error (5)
> > > > >>> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> > > > >>> -- update retained.
> > > > >>> rsync error: some files/attrs were not transferred (see previous
> > > > >>> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
> > > > >>
> > > > >> Hi Juhyung,
> > > > >>
> > > > >> Do you enable compression on specified files? if so, do all EIOs come from compressed
> > > > >> file?
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >>>
> > > > >>> The dmesg remains silent.
> > > > >>>
> > > > >>> When I modify the script a little bit and force it to run in a
> > > > >>> single-thread (by removing &), it runs well.
> > > > >>>
> > > > >>> I was able to confirm that it isn't a memory issue. The server has
> > > > >>> 50G+ of free memory, and the issue is still reliably reproducible when
> > > > >>> I defragment the memory by dropping caches and doing `echo 1 >
> > > > >>> /proc/sys/vm/compact_memory`.
> > > > >>>
> > > > >>> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> > > > >>> unsupported by ZFS. And it being a production server, I am somewhat
> > > > >>> limited in dabbling around the kernel.
> > > > >>>
> > > > >>> I am planning to test a new kernel with v5.15 +
> > > > >>> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> > > > >>> fixed with newer commits, I'd appreciate a tip.
> > > > >>>
> > > > >>> Thanks.
> > > > >>>
> > > > >>>
> > > > >>> _______________________________________________
> > > > >>> Linux-f2fs-devel mailing list
> > > > >>> Linux-f2fs-devel@lists.sourceforge.net
> > > > >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
> >
> >
> > _______________________________________________
> > Linux-f2fs-devel mailing list
> > Linux-f2fs-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
  2022-03-16  8:43             ` Juhyung Park
@ 2022-03-16 10:00               ` Chao Yu
  0 siblings, 0 replies; 11+ messages in thread
From: Chao Yu @ 2022-03-16 10:00 UTC (permalink / raw)
  To: Juhyung Park, Jaegeuk Kim; +Cc: linux-f2fs-devel

On 2022/3/16 16:43, Juhyung Park wrote:
> Hi Jaegeuk,
> 
> I'll perform the trace debugging later.
> 
> Meanwhile, we encountered another issue.
> 
> For now, I added a workaround that performs mlockall() on all files
> (vmtouch -L $val.mount -d -w -P $val.vmtouch) under the f2fs partition
> so that the cp command works.
> 
> After confirming that extraction works, I've parallelized the script
> again (on our production server again, so v5.15 without f2fs-stable
> backports) and while userspace processes don't complain, there's
> something going on on dmesg:
> 
> [224319.321668] loop0: detected capacity change from 0 to 11041080
> [224319.361530] F2FS-fs (loop0): Found nat_bits in checkpoint
> [224319.365029] F2FS-fs (loop0): write access unavailable, skipping recovery
> [224319.365033] F2FS-fs (loop0): Mounted with checkpoint version = 4321
> [224340.047471] loop1: detected capacity change from 0 to 11769352
> [224340.101509] F2FS-fs (loop1): Found nat_bits in checkpoint
> [224340.103799] F2FS-fs (loop1): write access unavailable, skipping recovery
> [224340.103802] F2FS-fs (loop1): Mounted with checkpoint version = 4321
> [224361.763528] loop2: detected capacity change from 0 to 11386480
> [224361.812891] F2FS-fs (loop2): Found nat_bits in checkpoint
> [224361.814881] F2FS-fs (loop2): write access unavailable, skipping recovery
> [224361.814886] F2FS-fs (loop2): Mounted with checkpoint version = 4321
> [224386.701011] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [224386.701248] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [224386.701779] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [224386.701806] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [224386.701826] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [224391.823216] loop3: detected capacity change from 0 to 11045240
> [224391.884383] F2FS-fs (loop3): Found nat_bits in checkpoint
> [224392.194608] F2FS-fs (loop3): write access unavailable, skipping recovery
> [224392.194612] F2FS-fs (loop3): Mounted with checkpoint version = 4321
> [224401.386330] loop4: detected capacity change from 0 to 11428112
> [224401.424575] F2FS-fs (loop4): Found nat_bits in checkpoint
> [224401.428872] F2FS-fs (loop4): write access unavailable, skipping recovery
> [224401.428875] F2FS-fs (loop4): Mounted with checkpoint version = 4321
> [224576.100839] loop0: detected capacity change from 0 to 11041080
> [224576.140862] F2FS-fs (loop0): Found nat_bits in checkpoint
> [224576.142694] F2FS-fs (loop0): write access unavailable, skipping recovery
> [224576.142699] F2FS-fs (loop0): Mounted with checkpoint version = 4321
> [224599.330003] loop1: detected capacity change from 0 to 2913344
> [224599.360425] F2FS-fs (loop1): Found nat_bits in checkpoint
> [224599.362257] F2FS-fs (loop1): write access unavailable, skipping recovery
> [224599.362260] F2FS-fs (loop1): Mounted with checkpoint version = 4321
> [224602.241868] loop2: detected capacity change from 0 to 2064344
> [224602.276146] F2FS-fs (loop2): Found nat_bits in checkpoint
> [224602.277955] F2FS-fs (loop2): write access unavailable, skipping recovery
> [224602.277971] F2FS-fs (loop2): Mounted with checkpoint version = 4321
> [224629.346056] loop3: detected capacity change from 0 to 41776
> [224629.375324] F2FS-fs (loop3): Found nat_bits in checkpoint
> [224629.377167] F2FS-fs (loop3): write access unavailable, skipping recovery
> [224629.377170] F2FS-fs (loop3): Mounted with checkpoint version = 4321
> [225029.224078] loop0: detected capacity change from 0 to 11411472
> [225029.270396] F2FS-fs (loop0): Found nat_bits in checkpoint
> [225029.271345] F2FS-fs (loop0): write access unavailable, skipping recovery
> [225029.271348] F2FS-fs (loop0): Mounted with checkpoint version = 4321
> [225625.045254] loop0: detected capacity change from 0 to 11045240
> [225625.078435] F2FS-fs (loop0): Found nat_bits in checkpoint
> [225625.079453] F2FS-fs (loop0): write access unavailable, skipping recovery
> [225625.079455] F2FS-fs (loop0): Mounted with checkpoint version = 4321
> [225651.466438] loop1: detected capacity change from 0 to 11428112
> [225651.532489] F2FS-fs (loop1): Found nat_bits in checkpoint
> [225651.582589] F2FS-fs (loop1): write access unavailable, skipping recovery
> [225651.582594] F2FS-fs (loop1): Mounted with checkpoint version = 4321
> [225653.729775] loop2: detected capacity change from 0 to 2896696
> [225653.817115] F2FS-fs (loop2): Found nat_bits in checkpoint
> [225653.842649] F2FS-fs (loop2): write access unavailable, skipping recovery
> [225653.842654] F2FS-fs (loop2): Mounted with checkpoint version = 4321
> [225665.480652] loop3: detected capacity change from 0 to 2064344
> [225665.548952] F2FS-fs (loop3): Found nat_bits in checkpoint
> [225665.551460] F2FS-fs (loop3): write access unavailable, skipping recovery
> [225665.551465] F2FS-fs (loop3): Mounted with checkpoint version = 4321
> [225687.173551] loop4: detected capacity change from 0 to 3753984
> [225687.297132] F2FS-fs (loop4): Found nat_bits in checkpoint
> [225687.299702] F2FS-fs (loop4): write access unavailable, skipping recovery
> [225687.299705] F2FS-fs (loop4): Mounted with checkpoint version = 4321
> [225694.934235] loop5: detected capacity change from 0 to 1868760
> [225695.047462] F2FS-fs (loop5): Found nat_bits in checkpoint
> [225695.049562] F2FS-fs (loop5): write access unavailable, skipping recovery
> [225695.049567] F2FS-fs (loop5): Mounted with checkpoint version = 4321
> [225704.254531] loop6: detected capacity change from 0 to 41776
> [225704.282972] F2FS-fs (loop6): Found nat_bits in checkpoint
> [225704.286725] F2FS-fs (loop6): write access unavailable, skipping recovery
> [225704.286730] F2FS-fs (loop6): Mounted with checkpoint version = 4321
> [225723.689515] loop0: detected capacity change from 0 to 11411472
> [225723.723250] F2FS-fs (loop0): Found nat_bits in checkpoint
> [225723.725740] F2FS-fs (loop0): write access unavailable, skipping recovery
> [225723.725744] F2FS-fs (loop0): Mounted with checkpoint version = 4321
> [225727.574004] loop2: detected capacity change from 0 to 11386480
> [225727.621016] F2FS-fs (loop2): Found nat_bits in checkpoint
> [225727.623296] F2FS-fs (loop2): write access unavailable, skipping recovery
> [225727.623300] F2FS-fs (loop2): Mounted with checkpoint version = 4321
> [225728.404524] loop3: detected capacity change from 0 to 41776
> [225728.440403] F2FS-fs (loop3): Found nat_bits in checkpoint
> [225728.442743] F2FS-fs (loop3): write access unavailable, skipping recovery
> [225728.442748] F2FS-fs (loop3): Mounted with checkpoint version = 4321
> [225731.328330] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [225731.328400] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [225731.328424] F2FS-fs (loop2): lz4 invalid rlen:16384, expected:16384
> [225743.064116] loop1: detected capacity change from 0 to 11769352
> [225743.110306] F2FS-fs (loop1): Found nat_bits in checkpoint
> [225743.113408] F2FS-fs (loop1): write access unavailable, skipping recovery
> [225743.113414] F2FS-fs (loop1): Mounted with checkpoint version = 4321
> [225761.148410] loop3: detected capacity change from 0 to 3758152
> [225761.195014] F2FS-fs (loop3): Found nat_bits in checkpoint
> [225761.196431] F2FS-fs (loop3): write access unavailable, skipping recovery
> [225761.196432] F2FS-fs (loop3): Mounted with checkpoint version = 4321
> [225766.885693] loop4: detected capacity change from 0 to 1868760
> [225767.010849] F2FS-fs (loop4): Found nat_bits in checkpoint
> [225767.013024] F2FS-fs (loop4): write access unavailable, skipping recovery
> [225767.013027] F2FS-fs (loop4): Mounted with checkpoint version = 4321
> [225781.475142] loop5: detected capacity change from 0 to 3766472
> [225781.500625] F2FS-fs (loop5): Found nat_bits in checkpoint
> [225781.503148] F2FS-fs (loop5): write access unavailable, skipping recovery
> [225781.503150] F2FS-fs (loop5): Mounted with checkpoint version = 4321
> [225785.367635] loop6: detected capacity change from 0 to 1868760
> [225785.474029] F2FS-fs (loop6): Found nat_bits in checkpoint
> [225785.475673] F2FS-fs (loop6): write access unavailable, skipping recovery
> [225785.475678] F2FS-fs (loop6): Mounted with checkpoint version = 4321
> [225802.664721] loop7: detected capacity change from 0 to 41776
> [225802.706846] F2FS-fs (loop7): Found nat_bits in checkpoint
> [225802.709092] F2FS-fs (loop7): write access unavailable, skipping recovery
> [225802.709095] F2FS-fs (loop7): Mounted with checkpoint version = 4321
> [225812.972186] loop0: detected capacity change from 0 to 41776
> [225813.000175] F2FS-fs (loop0): Found nat_bits in checkpoint
> [225813.003617] F2FS-fs (loop0): write access unavailable, skipping recovery
> [225813.003619] F2FS-fs (loop0): Mounted with checkpoint version = 4321
> 
> Those "lz4 invalid rlen:16384, expected:16384", is this normal?

Juhyung, good catch.

I guess this is the place in where it returns EIO in your scenario.

Actually, it looks we printed the wrong variable in the message... could
you please change the code as below, using "ret" instead of "dic->rlen",
and try another test to check the log.

static int lz4_decompress_pages()
{
...
	if (ret != PAGE_SIZE << dic->log_cluster_size) {
		printk_ratelimited("%sF2FS-fs (%s): lz4 invalid rlen:%zu, "
					"expected:%lu\n", KERN_ERR,
					F2FS_I_SB(dic->inode)->sb->s_id,
					ret,
					PAGE_SIZE << dic->log_cluster_size);
		return -EIO;
	}
	return 0;
}

Not sure, is it possible that you can share this f2fs image in somewhere?
then I can check details of this issue directly w/ the image.

Thanks,


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-03-16 10:02 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-13 17:52 [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image Juhyung Park
2022-03-15  0:30 ` Jaegeuk Kim
2022-03-15  4:42   ` Juhyung Park
2022-03-15  8:33 ` Chao Yu
2022-03-15  8:37   ` Juhyung Park
2022-03-15  8:45     ` Chao Yu
2022-03-15 10:25       ` Juhyung Park
2022-03-15 10:48         ` Juhyung Park
2022-03-15 20:49           ` Jaegeuk Kim
2022-03-16  8:43             ` Juhyung Park
2022-03-16 10:00               ` Chao Yu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).