From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: Andreas Dilger <adilger@dilger.ca>,
Florian Weimer <fw@deneb.enyo.de>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux API <linux-api@vger.kernel.org>,
Ext4 Developers List <linux-ext4@vger.kernel.org>,
lucho@ionkov.net, libc-alpha@sourceware.org,
Arnd Bergmann <arnd@arndb.de>,
ericvh@gmail.com, hpa@zytor.com,
lkml - Kernel Mailing List <linux-kernel@vger.kernel.org>,
QEMU Developers <qemu-devel@nongnu.org>,
rminnich@sandia.gov, v9fs-developer@lists.sourceforge.net
Subject: Re: [Qemu-devel] d_off field in struct dirent and 32-on-64 emulation
Date: Fri, 28 Dec 2018 21:11:57 -0500
Message-ID: <20181229021157.GG5864@mit.edu>
In-Reply-To: <CAFEAcA9W+JK7_TrtTnL1P2ES1knNPJX9wcUvhfLwxLq9augq1w@mail.gmail.com>
On Fri, Dec 28, 2018 at 11:18:18AM +0000, Peter Maydell wrote:
> In general inodes and offsets start from 0 and work up --
> so almost all of the time they don't actually overflow.
> The problem with ext4 directory hash "offsets" is that they
> overflow all the time and immediately, so instead of "works
> unless you have a weird edge case" like all the other filesystems,
> it's "never works".
Actually, XFS uses the inode number to encode the location of the
inode (it doesn't have a fixed inode table, so it's effectively the
block number shifted left by 3 or 4 bits, with the low bits indicating
the slot in the 4k block). It has a hack to provide backwards
compatibility for 32-bit APIs, but it also has the same sort of "oh,
we're on a non-paleolithic CPU, let's use the full 64 bits" logic
that ext4 has.
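To make the encoding concrete, here is a minimal sketch, assuming 3 slot bits (8 inodes per block). The names ino_pack, ino_block, ino_slot, ino_fits_32 and SLOT_BITS are made up for illustration and this is not the real XFS on-disk format. It only shows why a location-derived inode number can stop fitting in 32 bits on a large filesystem, even when there are few inodes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only, not the actual XFS layout. */
#define SLOT_BITS 3u /* assumed: 8 inode slots per block */

/* Build an inode number from a block number and a slot within the block. */
static uint64_t ino_pack(uint64_t block, unsigned slot)
{
    assert(slot < (1u << SLOT_BITS));
    return (block << SLOT_BITS) | slot;
}

/* Recover the block number from an inode number. */
static uint64_t ino_block(uint64_t ino)
{
    return ino >> SLOT_BITS;
}

/* Recover the slot within the block from an inode number. */
static unsigned ino_slot(uint64_t ino)
{
    return (unsigned)(ino & ((1u << SLOT_BITS) - 1u));
}

/* Nonzero if the inode number can be reported through a 32-bit field. */
static int ino_fits_32(uint64_t ino)
{
    return ino <= UINT32_MAX;
}
```

With these assumptions, any inode stored at or beyond block 2^29 has a number that no longer fits in 32 bits, regardless of how many inodes exist.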
> The problem is that there is no 32-bit API in some cases
> (unless I have misunderstood the kernel code) -- not all
> host architectures implement compat syscalls or allow them
> to be called from 64-bit processes or implement all the older
> syscall variants that had smaller offsets. If there was a guaranteed
> "this syscall always exists and always gives me 32-bit offsets"
> we could use it.
Are there going to be cases where a process or a thread will sometimes
want the 64-bit interface, and sometimes want the 32-bit interface?
Or is it always going to be one or the other? I wonder if we could
simply add a new flag to the process personality(2) flags.
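As a rough sketch of what the userspace side of such a flag might look like: personality(2) and the 0xffffffff "query without changing" convention are real, but HYPOTHETICAL_PER_DIR32 and its value are invented here purely for illustration. No such flag exists in the kernel, and the helper names are assumptions as well.

```c
#include <sys/personality.h>

/* Made-up flag value for illustration; the kernel defines no such flag. */
#define HYPOTHETICAL_PER_DIR32 0x10000000UL

/* Compute the persona value with the hypothetical flag added. */
static unsigned long persona_with_dir32(unsigned long current)
{
    return current | HYPOTHETICAL_PER_DIR32;
}

/*
 * Ask for 32-bit directory offsets for this process.
 * Returns the previous persona, or -1 on error.
 */
static int request_dir32(void)
{
    int cur = personality(0xffffffff); /* query current persona, no change */

    if (cur == -1)
        return -1;
    return personality(persona_with_dir32((unsigned long)cur));
}
```

An emulator would call something like request_dir32() once at startup, which only fits the "always one or the other" case asked about above. A process that needs both interfaces at different times would have to toggle the flag around each call.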
> Yes, that has been suggested, but it seemed a bit dubious
> to bake in knowledge of ext4's internal implementation details.
> Can we rely on this as an ABI promise that will always work
> for all versions of all file systems going forwards?
Yeah, that seems dubious because I'm pretty sure there are other file
systems that may have their own 32/64-bit quirks.
- Ted