linux-f2fs-devel.lists.sourceforge.net archive mirror
* [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
From: Juhyung Park @ 2022-03-13 17:52 UTC
  To: linux-f2fs-devel

Hi.

We have a production server that stores Android firmware on a ZFS
file system, and we noticed some issues when extracting firmware files
that use f2fs for the Android system partitions.

This is a proprietary environment, so I cannot disclose every detail;
I hope you understand. I'll elaborate as much as I can.

The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
from v5.13 after noticing that the RO feature added in v5.14 is
required). We have a set of scripts that extract Android firmware
files. The input is typically an OTA zip file, and the script extracts
every file and binary image from it.

So that includes extracting super (dynamic partition), ext4 system
partitions with dedup enabled, and now, f2fs system partitions with RO
and compression enabled.

Our script never had to deal with f2fs before as we only started
seeing f2fs system partitions with recently released devices.

These are the f2fs mount options after mounting with `mount -o ro
system.raw /some/dir`:
ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block

There are *a lot* of files in Android firmware these days, so we try
to parallelize parts when we can.

This is a snippet of the script:
```
#!/bin/bash
<...>
RSYNC="rsync -ahAXx --inplace --numeric-ids"
<...>
for val in system vendor product odm; do
  if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi

  mkdir -p fs
  cd fs

  mkdir -p $val.mount tmp_$val
  mount -o ro ../images/$val.raw $val.mount

  $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
  echo $! > $val.pid
  disown

  cd $val.mount
  find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
  find . -type d -exec mkdir -p "../tmp_$val/"{} \;

  while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file" & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
  wait

<...>

  cd ../
  rm -rf tmp_$val
  cd ../
done

wait
<...>
for val in system vendor product odm; do
  if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
  tail --pid=$(cat fs/$val.pid) -f /dev/null
  umount fs/$val.mount
  rmdir fs/$val.mount
  rm -f images/$val.img images/$val.raw 2>/dev/null
done
```

The offending part is:
```
  $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
  find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
  find . -type d -exec mkdir -p "../tmp_$val/"{} \;
  while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file" & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
  wait
```

When that part is reached, the script forks thousands of new processes
and starts reading from f2fs. (We simply decided to rely on Linux's
task scheduler and didn't bother to limit the number of
sub-processes.)
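
For comparison, here is a minimal sketch of the same strings pass with a cap on the number of concurrent readers. It is not part of our script: the function name `extract_strings_bounded`, its arguments, and the default of `nproc` jobs are assumptions made for illustration, and it relies on GNU find and xargs. The destination must be an absolute path, as `$DEST_PWD` is assumed to be.

```bash
#!/bin/bash
# Sketch only: extract_strings_bounded is a hypothetical helper, not from the
# original script.
# Usage: extract_strings_bounded <src_dir> <abs_dest_dir> [max_jobs]
extract_strings_bounded() {
  local src=$1 dest=$2 jobs=${3:-$(nproc)}
  (
    cd "$src" || exit 1
    # Recreate the directory tree first, as the original script does.
    find . -type d -exec mkdir -p "$dest/"{} \;
    # Suffix match only; the original grep matches these strings anywhere
    # in the path. NUL-delimited names survive spaces and newlines.
    # -P caps how many strings processes run at the same time.
    find . -type f ! -name '*.apk' ! -name '*.jar' ! -name '*.zip' -print0 |
      xargs -0 -r -P "$jobs" -n 1 sh -c 'strings "$1" > "$0/$1"' "$dest"
  )
}
```

Called as `extract_strings_bounded "$val.mount" "$DEST_PWD/strings/$val" 16`, this would keep at most 16 readers on the f2fs mount at once, which may help narrow down how much concurrency is needed to trigger the errors.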

I am able to reliably cause f2fs to return EIO on some files:
cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk': Input/output error
cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk': Input/output error
cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk': Input/output error
rsync: [sender] read errors mapping "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk": Input/output error (5)
rsync: [sender] read errors mapping "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk": Input/output error (5)
rsync: [sender] read errors mapping "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk": Input/output error (5)
rsync: [sender] read errors mapping "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk": Input/output error (5)
ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification -- update retained.
rsync: [sender] read errors mapping "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk": Input/output error (5)
ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification -- update retained.
rsync: [sender] read errors mapping "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk": Input/output error (5)
ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification -- update retained.
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]

The dmesg remains silent.
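
Since the kernel log gives no hint, a small read-only check can at least list which files fail under concurrent reads. The following is a sketch under assumptions: `find_unreadable` is a hypothetical helper name, the default of 64 parallel readers is arbitrary, and it only reports that a read failed, not the specific errno.

```bash
#!/bin/bash
# Sketch only: find_unreadable is a hypothetical helper name.
# Reads every regular file under <dir> with up to <jobs> concurrent readers
# and prints the path of each file whose read fails (for example with EIO).
# Usage: find_unreadable <dir> [jobs]
find_unreadable() {
  local dir=$1 jobs=${2:-64}
  find "$dir" -type f -print0 |
    # With -n 1 the file name is the only argument after the script,
    # so it arrives in the child shell as $0.
    xargs -0 -r -P "$jobs" -n 1 \
      sh -c 'cat -- "$0" > /dev/null 2>&1 || printf "%s\n" "$0"'
}
```

Running `find_unreadable fs/system.mount 1` and then again with a much larger job count would show whether the failing set only appears under parallel load.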

When I modify the script slightly to force it to run single-threaded
(by removing the `&`), it runs without errors.

I confirmed that this isn't a memory issue: the server has 50G+ of
free memory, and the issue is still reliably reproducible after I
defragment memory by dropping caches and running `echo 1 >
/proc/sys/vm/compact_memory`.
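
For anyone trying to reproduce this, the memory preparation can be sketched roughly as follows. The `compact_memory` write is the command quoted above; writing `3` to `drop_caches` is an assumption about how the caches were dropped, and the function name `drop_and_compact` and its optional path argument are hypothetical (the argument exists only so the function can be dry-run against a scratch directory). It needs root when pointed at the real `/proc/sys/vm`.

```bash
#!/bin/bash
# Sketch only: drop_and_compact is a hypothetical helper name.
# Usage: drop_and_compact [vm_sysctl_dir]   (defaults to /proc/sys/vm)
drop_and_compact() {
  local vm=${1:-/proc/sys/vm}
  # Flush dirty pages first so more of the page cache can be dropped.
  sync
  # Assumption: 3 drops the page cache plus dentries and inodes.
  echo 3 > "$vm/drop_caches"
  # Ask the kernel to compact memory in all zones.
  echo 1 > "$vm/compact_memory"
}
```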

I wasn't able to test any recent kernels (v5.16 or v5.17) as they are
not supported by ZFS. Since this is a production server, I am also
somewhat limited in how much I can experiment with the kernel.

I am planning to test a new kernel with v5.15 +
f2fs-stable/linux-5.15.y merged. Meanwhile, I'd appreciate a pointer
on whether this is a new report or already fixed by newer commits.

Thanks.


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

Thread overview: 11+ messages
2022-03-13 17:52 [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image Juhyung Park
2022-03-15  0:30 ` Jaegeuk Kim
2022-03-15  4:42   ` Juhyung Park
2022-03-15  8:33 ` Chao Yu
2022-03-15  8:37   ` Juhyung Park
2022-03-15  8:45     ` Chao Yu
2022-03-15 10:25       ` Juhyung Park
2022-03-15 10:48         ` Juhyung Park
2022-03-15 20:49           ` Jaegeuk Kim
2022-03-16  8:43             ` Juhyung Park
2022-03-16 10:00               ` Chao Yu
