From: "Darrick J. Wong" <djwong@kernel.org>
To: Wu Guanghao <wuguanghao3@huawei.com>
Cc: cem@kernel.org, linux-xfs@vger.kernel.org,
"liuzhiqiang (I)" <liuzhiqiang26@huawei.com>
Subject: Re: [PATCH] mkfs: acquire flock before modifying the device superblock
Date: Fri, 14 Oct 2022 08:38:18 -0700
Message-ID: <Y0mCauklwsDwImi8@magnolia>
In-Reply-To: <b359751c-2397-bcd1-9065-583afb2f93ef@huawei.com>
On Fri, Oct 14, 2022 at 04:41:35PM +0800, Wu Guanghao wrote:
> We noticed that systemd has an issue about symlink unreliable caused by
> formatting filesystem and systemd operating on same device.
> Issue Link: https://github.com/systemd/systemd/issues/23746
>
> According to systemd doc, a BSD flock needs to be acquired before
> formatting the device.
> Related Link: https://systemd.io/BLOCK_DEVICE_LOCKING/

TLDR: udevd wants fs utilities to use advisory file locking to
coordinate (re)writes to block devices, to avoid collisions between mkfs
and all the udev magic.

Critically, udev calls flock(LOCK_SH | LOCK_NB) to trylock the device in
shared mode so that it never blocks on fs utilities; if the trylock
fails, it moves on and tries again later. The old O_EXCL-on-blockdevs
trick will not work for that use case (I guess) because it's not a
shared reader lock. It's also not the file locking API.

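
To make the trylock behaviour described above concrete, here is a minimal sketch. It is not udev or xfsprogs code: try_shared_lock() and flock_demo() are hypothetical names, and a scratch file under /tmp stands in for the block device. The "writer" descriptor plays mkfs holding LOCK_EX, and the "prober" descriptor plays udev attempting LOCK_SH | LOCK_NB.

```c
#define _DEFAULT_SOURCE	/* mkstemp() and flock() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Shared, non-blocking trylock. Returns 1 if the lock was taken, 0 if a
 * conflicting lock is held through another open file description, and
 * -1 on any other error.
 */
static int
try_shared_lock(int fd)
{
	if (flock(fd, LOCK_SH | LOCK_NB) == 0)
		return 1;
	return errno == EWOULDBLOCK ? 0 : -1;
}

/*
 * Hypothetical demo. Stores the prober's trylock result while the
 * writer holds LOCK_EX (*held_result) and after the writer unlocks
 * (*free_result). Returns 0 if the demo ran, -1 if setup failed.
 */
static int
flock_demo(int *held_result, int *free_result)
{
	char path[] = "/tmp/flock-demo-XXXXXX";
	int writer, prober, ret = -1;

	assert(held_result && free_result);

	writer = mkstemp(path);
	if (writer < 0)
		return -1;

	/* A second open() creates a separate open file description. */
	prober = open(path, O_RDONLY);
	if (prober >= 0 && flock(writer, LOCK_EX) == 0) {
		*held_result = try_shared_lock(prober);
		if (flock(writer, LOCK_UN) == 0) {
			*free_result = try_shared_lock(prober);
			ret = 0;
		}
	}

	if (prober >= 0)
		close(prober);
	close(writer);
	unlink(path);
	return ret;
}
```

Calling flock_demo(&held, &freed) should leave held at 0, because the prober is refused while the exclusive lock is held, and freed at 1 once the writer has unlocked. The prober never sleeps in either case, which is the property the shared trylock is meant to provide.
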
> So we acquire flock after opening the device but before
> writing superblock.
xfs_db and xfs_repair can write to the filesystem too; shouldn't this
locking apply to them as well?
> Signed-off-by: wuguanghao <wuguanghao3@huawei.com>
> ---
> mkfs/xfs_mkfs.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/mkfs/xfs_mkfs.c b/mkfs/xfs_mkfs.c
> index 9dd0e79c..b83cb043 100644
> --- a/mkfs/xfs_mkfs.c
> +++ b/mkfs/xfs_mkfs.c
> @@ -13,6 +13,7 @@
> #include "libfrog/crc32cselftest.h"
> #include "proto.h"
> #include <ini.h>
> +#include <sys/file.h>
>
> #define TERABYTES(count, blog) ((uint64_t)(count) << (40 - (blog)))
> #define GIGABYTES(count, blog) ((uint64_t)(count) << (30 - (blog)))
> @@ -2758,6 +2759,30 @@ _("log stripe unit (%d bytes) is too large (maximum is 256KiB)\n"
>
> }
>
> +static void
> +lock_device(dev_t dev, int flag, char *name)
> +{
> +	int fd = libxfs_device_to_fd(dev);
> +	int readonly = flag & LIBXFS_ISREADONLY;
> +
> +	if (!readonly && fd > 0)
> +		if (flock(fd, LOCK_EX) != 0) {
> +			fprintf(stderr, "%s: failed to get lock.\n", name);
> +			exit(1);
> +		}
So yes, this belongs in libxfs_device_open.
If we're opening the bdevs in readonly mode, shouldn't we take LOCK_SH
to prevent mkfs from colliding with (say) xfs_metadump?
Bonus question: Shouldn't the /kernel/ also effectively be taking
LOCK_SH when it opens the bdevs to mount the filesystem?
--D
> +}
> +
> +static void
> +lock_devices(struct libxfs_xinit *xi)
> +{
> +	if (!xi->disfile)
> +		lock_device(xi->ddev, xi->dcreat, xi->dname);
> +	if (xi->logdev && !xi->lisfile)
> +		lock_device(xi->logdev, xi->lcreat, xi->logname);
> +	if (xi->rtdev && !xi->risfile)
> +		lock_device(xi->rtdev, xi->rcreat, xi->rtname);
> +}
> +
> static void
> open_devices(
> struct mkfs_params *cfg,
> @@ -4208,6 +4233,7 @@ main(
> * Open and validate the device configurations
> */
> open_devices(&cfg, &xi);
> + lock_devices(&xi);
> validate_overwrite(dfile, force_overwrite);
> validate_datadev(&cfg, &cli);
> validate_logdev(&cfg, &cli, &logfile);
> --
> 2.27.0
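
The readonly question raised in the review (LOCK_SH for read-only opens, LOCK_EX for anything that may write) could be sketched roughly as below. This is only an illustration under stated assumptions, not the libxfs implementation: device_lock_op(), lock_opened_device() and readonly_lock_demo() are hypothetical names, and the demo uses a scratch file in place of a block device. The descriptor check is written as fd < 0 because 0 is a valid file descriptor.

```c
#define _DEFAULT_SOURCE	/* mkstemp() and flock() under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Readers share the device; anything that may write excludes everyone. */
static int
device_lock_op(bool readonly)
{
	return readonly ? LOCK_SH : LOCK_EX;
}

/*
 * Hypothetical helper, not libxfs API. Blocks until the lock is
 * granted. Returns 0 on success, -1 with errno set on failure.
 */
static int
lock_opened_device(int fd, bool readonly)
{
	if (fd < 0) {
		errno = EBADF;
		return -1;
	}
	return flock(fd, device_lock_op(readonly));
}

/*
 * Demo on a scratch file: two read-only openers both take LOCK_SH, and
 * a non-blocking LOCK_EX attempt through a third descriptor is refused.
 * Returns 0 if the demo ran, -1 if setup failed.
 */
static int
readonly_lock_demo(int *second_reader, int *writer_try)
{
	char path[] = "/tmp/lockmode-XXXXXX";
	int a, b, c, ret = -1;

	assert(second_reader && writer_try);

	a = mkstemp(path);
	if (a < 0)
		return -1;
	b = open(path, O_RDONLY);
	c = open(path, O_RDWR);

	if (b >= 0 && c >= 0 && lock_opened_device(a, true) == 0) {
		*second_reader = lock_opened_device(b, true);
		*writer_try = flock(c, LOCK_EX | LOCK_NB);
		ret = 0;
	}

	if (c >= 0)
		close(c);
	if (b >= 0)
		close(b);
	close(a);
	unlink(path);
	return ret;
}
```

With this split, two read-only tools can hold the device at the same time, while a writer waits (or, with LOCK_NB, fails) until every shared holder has gone away.
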