linux-f2fs-devel.lists.sourceforge.net archive mirror
 help / color / mirror / Atom feed
From: Juhyung Park <qkrwngud825@gmail.com>
To: Chao Yu <chao@kernel.org>
Cc: linux-f2fs-devel@lists.sourceforge.net
Subject: Re: [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image
Date: Tue, 15 Mar 2022 19:48:14 +0900	[thread overview]
Message-ID: <CAD14+f1_smUWHgZRffuTNfsmCk-j_LL_vw5RuWffGdnd=vP8BA@mail.gmail.com> (raw)
In-Reply-To: <CAD14+f3SnZir_Q7pyVzFZ6zHJendYyxYfGaxP9iLEDQuPE=AZw@mail.gmail.com>

Here's strace from both commands right after drop_caches.

cp: https://pastebin.com/raw/GH51BjV4
tar: https://pastebin.com/raw/0wgiu7fE

It seems that cp is calling a few extra fadvise and ioctl.

On Tue, Mar 15, 2022 at 7:25 PM Juhyung Park <qkrwngud825@gmail.com> wrote:
>
> Hi Chao,
>
> On Tue, Mar 15, 2022 at 5:45 PM Chao Yu <chao@kernel.org> wrote:
> >
> > Hi Juhyung,
> >
> > On 2022/3/15 16:37, Juhyung Park wrote:
> > > Hi Chao,
> > >
> > > I actually have never used compression on f2fs yet and this image is
> > > from an unmodified firmware taken directly from the OTA server, not
> > > made by me.
> >
> > Alright.
> >
> > >
> > > If you can tell me how to check if the file is compressed or not, I'll
> > > run it and report back.
> >
> > lsattr <target_file_path> |grep c
> >
> > You can check the flag's mean in manual of chattr.
>
> Ah, didn't know that lsattr shows compressed state.
> It seems that every single file under that system partition is
> compressed except ones with file size of < 4096, but the error only
> happens on select files.
>
> >
> > >
> > > I never had this sort of issue (a lot of multiple readers causing
> > > troubles) before with a regular R/W, uncompressed f2fs partition, so
> > > I'm guessing it has something to do with RO or compression feature.
> >
> > Agreed, maybe we can repacking image to RO one or compressed one to
> > check which one can cause this issue?
>
> Is this easy to do standalone?
>
> AOSP's ext4 image creation was already super complicated to do
> out-of-tree with their dedup and minimizing file size (so that the
> resulting image is as small as possible) thinggies going on. I think
> I've seen EROFS support but I don't know how f2fs system images are
> handled under AOSP.
>
> Also, I moved the environment to my personal laptop. It's running
> v5.15.28 with the latest f2fs-stable/linux-5.15.y merged and it still
> happens there.
>
> One weird thing that I've been able to confirm is that it seems that
> multiple readers are not the problem:
> #!/bin/bash
>
> set -v
>
> APK=system/app/some_app/some_app.apk
>
> echo 3 > /proc/sys/vm/drop_caches
> tar -pcf - $APK | cat > /dev/null     # Works!
> cp $APK /tmp                          # Works!
>
> echo 3 > /proc/sys/vm/drop_caches
> tar -pcf - $APK | cat > /dev/null     # Works!
> echo 3 > /proc/sys/vm/drop_caches
> cp $APK /tmp                          # cp: error reading
> 'system/app/some_app/some_app.apk': Input/output error
>
> echo 3 > /proc/sys/vm/drop_caches
> cp $APK /tmp                          # cp: error reading
> 'system/app/some_app/some_app.apk': Input/output error
> tar -pcf - $APK | cat > /dev/null     # tar:
> system/app/some_app/some_app.apk: File shrank by 3375397 bytes;
> padding with zeros
>
> echo 3 > /proc/sys/vm/drop_caches
> cp $APK /tmp                          # cp: error reading
> 'system/app/some_app/some_app.apk': Input/output error
> echo 3 > /proc/sys/vm/drop_caches
> tar -pcf - $APK | cat > /dev/null     # Works!
>
> I'm guessing some system-calls that cp use causes f2fs to malfunction.
> (GNU tar is smart enough to detect if the output is /dev/null, so it
> skips the actual file read when it is piped directly to /dev/null, so
> I added `cat` in between.)
>
> Hope this new info helps.
>
> Thanks.
>
> >
> > Thanks,
> >
> > >
> > > Thanks.
> > >
> > > On Tue, Mar 15, 2022 at 5:33 PM Chao Yu <chao@kernel.org> wrote:
> > >>
> > >> On 2022/3/14 1:52, Juhyung Park wrote:
> > >>> Hi.
> > >>>
> > >>> We have a production server storing some Android firmwares over a ZFS
> > >>> file-system, and we noticed some issues when extracting firmware files
> > >>> that use f2fs for Android system partitions.
> > >>>
> > >>> This is a proprietary environment, so I cannot disclose every detail,
> > >>> so I hope you understand. I'll try to elaborate as much as I can.
> > >>>
> > >>> The server is running Ubuntu 20.04 with Linux v5.15 (recently upgraded
> > >>> from v5.13 after noticing RO feature added on v5.14 being required).
> > >>> We have a set of scripts extracting Android firmware files. The input
> > >>> is typically the OTA zip file and after going through the script, it
> > >>> extracts every file and binary image from a given file.
> > >>>
> > >>> So that includes extracting super (dynamic partition), ext4 system
> > >>> partitions with dedup enabled, and now, f2fs system partitions with RO
> > >>> and compression enabled.
> > >>>
> > >>> Our script never had to deal with f2fs before as we only started
> > >>> seeing f2fs system partitions with recently released devices.
> > >>>
> > >>> This is the f2fs mount flag after mounting with `mount -o ro
> > >>> system.raw /some/dir`:
> > >>> ro,relatime,lazytime,background_gc=on,discard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,extent_cache,mode=adaptive,active_logs=2,alloc_mode=reuse,checkpoint_merge,fsync_mode=posix,compress_algorithm=lz4,compress_log_size=2,compress_mode=fs,discard_unit=block
> > >>>
> > >>> There are *a lot* of files in Android firmware these days, so we try
> > >>> to parallelize parts when we can.
> > >>>
> > >>> This is a snippet of the script:
> > >>> ```
> > >>> #!/bin/bash
> > >>> <...>
> > >>> RSYNC="rsync -ahAXx --inplace --numeric-ids"
> > >>> <...>
> > >>> for val in system vendor product odm; do
> > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > >>>
> > >>>     mkdir -p fs
> > >>>     cd fs
> > >>>
> > >>>     mkdir -p $val.mount tmp_$val
> > >>>     mount -o ro ../images/$val.raw $val.mount
> > >>>
> > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > >>>     echo $! > $val.pid
> > >>>     disown
> > >>>
> > >>>     cd $val.mount
> > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > >>>
> > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > >>>     wait
> > >>>
> > >>> <...>
> > >>>
> > >>>     cd ../
> > >>>     rm -rf tmp_$val
> > >>>     cd ../
> > >>> done
> > >>>
> > >>> wait
> > >>> <...>
> > >>> for val in system vendor product odm; do
> > >>>     if ! ls images/$val.raw > /dev/null 2>&1; then continue; fi
> > >>>     tail --pid=$(cat fs/$val.pid) -f /dev/null
> > >>>     umount fs/$val.mount
> > >>>     rmdir fs/$val.mount
> > >>>     rm -f images/$val.img images/$val.raw 2>/dev/null
> > >>> done
> > >>> ```
> > >>>
> > >>> The offending part is:
> > >>> ```
> > >>>     $RSYNC $val.mount/ "$DEST_PWD/fs/$val/" &
> > >>>     find . -type d -exec mkdir -p "$DEST_PWD/strings/$val/"{} \;
> > >>>     find . -type d -exec mkdir -p "../tmp_$val/"{} \;
> > >>>     while read file; do strings "$file" > "$DEST_PWD/strings/$val/$file"
> > >>> & done < <(find . -type f | grep -v '\.apk\|\.jar\|\.zip')
> > >>>     wait
> > >>> ```
> > >>>
> > >>> When that part is reached, the script forks thousands of new processes
> > >>> and starts reading from f2fs. (We simply decided to rely on Linux's
> > >>> task scheduler and didn't bother to limit the number of
> > >>> sub-processes.)
> > >>>
> > >>> I am able to reliably cause f2fs to return EIO on some files:
> > >>> cp: error reading './system/priv-app/some_apk_1/some_apk_1.apk':
> > >>> Input/output error
> > >>> cp: error reading './system/priv-app/some_apk_2/some_apk_2.apk':
> > >>> Input/output error
> > >>> cp: error reading './system/priv-app/some_apk_3/some_apk_3.apk':
> > >>> Input/output error
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > >>> Input/output error (5)
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > >>> Input/output error (5)
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > >>> Input/output error (5)
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_1/some_apk_1.apk":
> > >>> Input/output error (5)
> > >>> ERROR: system/priv-app/some_apk_1/some_apk_1.apk failed verification
> > >>> -- update retained.
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_2/some_apk_2.apk":
> > >>> Input/output error (5)
> > >>> ERROR: system/priv-app/some_apk_2/some_apk_2.apk failed verification
> > >>> -- update retained.
> > >>> rsync: [sender] read errors mapping
> > >>> "/ssd/some_firmware.zip/fs/system.mount/system/priv-app/some_apk_3/some_apk_3.apk":
> > >>> Input/output error (5)
> > >>> ERROR: system/priv-app/some_apk_3/some_apk_3.apk failed verification
> > >>> -- update retained.
> > >>> rsync error: some files/attrs were not transferred (see previous
> > >>> errors) (code 23) at main.c(1333) [sender=v3.2.3-45-ga28c4558]
> > >>
> > >> Hi Juhyung,
> > >>
> > >> Do you enable compression on specified files? if so, do all EIOs come from compressed
> > >> file?
> > >>
> > >> Thanks,
> > >>
> > >>>
> > >>> The dmesg remains silent.
> > >>>
> > >>> When I modify the script a little bit and force it to run in a
> > >>> single-thread (by removing &), it runs well.
> > >>>
> > >>> I was able to confirm that it isn't a memory issue. The server has
> > >>> 50G+ of free memory, and the issue is still reliably reproducible when
> > >>> I defragment the memory by dropping caches and doing `echo 1 >
> > >>> /proc/sys/vm/compact_memory`.
> > >>>
> > >>> I wasn't able to test any recent kernels (v5.16 or v5.17) as it's
> > >>> unsupported by ZFS. And it being a production server, I am somewhat
> > >>> limited in dabbling around the kernel.
> > >>>
> > >>> I am planning to test a new kernel with v5.15 +
> > >>> f2fs-stable/linux-5.15.y merged. Meanwhile, if this is a new report or
> > >>> fixed with newer commits, I'd appreciate a tip.
> > >>>
> > >>> Thanks.
> > >>>
> > >>>
> > >>> _______________________________________________
> > >>> Linux-f2fs-devel mailing list
> > >>> Linux-f2fs-devel@lists.sourceforge.net
> > >>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

  reply	other threads:[~2022-03-15 10:48 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-13 17:52 [f2fs-dev] EIO returned when reading files from R/O, compressed f2fs image Juhyung Park
2022-03-15  0:30 ` Jaegeuk Kim
2022-03-15  4:42   ` Juhyung Park
2022-03-15  8:33 ` Chao Yu
2022-03-15  8:37   ` Juhyung Park
2022-03-15  8:45     ` Chao Yu
2022-03-15 10:25       ` Juhyung Park
2022-03-15 10:48         ` Juhyung Park [this message]
2022-03-15 20:49           ` Jaegeuk Kim
2022-03-16  8:43             ` Juhyung Park
2022-03-16 10:00               ` Chao Yu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAD14+f1_smUWHgZRffuTNfsmCk-j_LL_vw5RuWffGdnd=vP8BA@mail.gmail.com' \
    --to=qkrwngud825@gmail.com \
    --cc=chao@kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).