public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: George Anthony Vernon <contact@gvernon.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Cc: "syzbot+97e301b4b82ae803d21b@syzkaller.appspotmail.com"
	<syzbot+97e301b4b82ae803d21b@syzkaller.appspotmail.com>,
	"glaubitz@physik.fu-berlin.de" <glaubitz@physik.fu-berlin.de>,
	"frank.li@vivo.com" <frank.li@vivo.com>,
	"penguin-kernel@i-love.sakura.ne.jp"
	<penguin-kernel@i-love.sakura.ne.jp>,
	"slava@dubeyko.com" <slava@dubeyko.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v4] hfs: Validate CNIDs in hfs_read_inode
Date: Wed, 18 Mar 2026 22:49:26 +0000	[thread overview]
Message-ID: <absr9g7LaMEHfw47@Bertha> (raw)
In-Reply-To: <dbacbf1c48ac5aa24f37c58bff995d7b03268db4.camel@ibm.com>

On Wed, Mar 18, 2026 at 02:02:10AM +0000, Viacheslav Dubeyko wrote:
> On Wed, 2026-03-18 at 00:37 +0000, George Anthony Vernon wrote:
> > Sorry I struggled to understand you here Slava, there's a little bit lost in
> > translation I think.
> > 
> > On Mon, Mar 16, 2026 at 09:50:14PM +0000, Viacheslav Dubeyko wrote:
> > > by hfs_cat_find_brec(). But I cannot imagine that this logic can extract the
> > > record  with incorrect CNID. Because, it is the main goal of hfs_cat_find_brec()
> > > logic to extract the record that contains requested CNID. And if we requested
> > 
> > Do you mean that you do not think hfs_cat_find_brec *can* return a
> > record with incorrect CNID, or that you do not think it *should*?
> > 
> > I think Tetsuo is right that hfs_cat_find_brec() will return a catalog
> > record with different CNID in case of a malformed thread record.
> > 
> > On Mon, Mar 16, 2026 at 09:50:14PM +0000, Viacheslav Dubeyko wrote:
> > > logic to extract the record that contains requested CNID. And if we requested
> > > the HFS_ROOT_CNID, then this logic should return the record with exactly
> > > requested CNID or return the error code if such record has not been found.
> > 
> > Do you mean that hfs_cat_find_brec() should validate the CNID of the
> > catalog record found by hfs_brec_find()? I'm worried that validating
> > every B-tree lookup is going to be expensive. We could do it, however.
> > 
> > 
> 
> If you need to initialize the inode, then you need to find a file or a folder
> record in Catalog File (b-tree). It means that there are two possible ways: (1)
> find it by name, (2) find it by CNID.
> 
> If you know the name only, then you need to find a thread record by name. The
> thread record contains associated CNID that can be used to find the final
> file/folder record. It means that the second step is the searching the record by
> using the CNID. If CNID is OK, then we can find the record. If it is not OK,
> then we can find nothing or wrong record.
> 
> If we know CNID, then we can try to find the record by CNID directly. We will
> fail to find the record if there is no record with such CNID. But you don't need
> in thread record in the case of having CNID for the search.
> 

I don't think this is entirely right because we can't find a catalog
record by CNID directly. We also need its node name which is contained
in the thread record.

Knowing the CNID of the catalog entry we want to find allows us to
construct a key for the thread record, which we find and read. The
thread record tells us the CNID of the catalog record's parent and the
catalog record's file name, which we combine to create a key for the
catalog record. If the thread record is corrupted, the key we
constructed may find us a different catalog entry with different CNID.

- hfs_cat_find_brec() builds a key for the root CNID thread record [1]
- The contents of the thread record are read into an hfs_cat_rec by
hfs_brec_read(), which calls into hfs_bnode_read [2]
- hfs_cat_find_brec() then populates the key for the root CNID catalog
  record, using the fields of the thread record file [3]
- Finally we can search for the catalog record, and this will populate a
  new hfs_cat_rec [4]

So if the catalog thread record is corrupted, the bug Tetsuo described
can currently happen. The catalog record read from disk may have a
different CNID than the one we wanted to find provided to
hfs_cat_find_brec() hence the need to check the CNID of the catalog file
record returned by hfs_bnode_read in hfs_fill_super.

Therefore Tetsuo's patch to check the returned CNID after calling
hfs_cat_find_brec() seems correct to me, but I think it would be better
do that check inside hfs_cat_find_brec() before it returns, then as you
said the function will return the record with "exactly requested CNID or
return the error code if such record has not been found."

Tetsuo, what do you think about doing your check inside
hfs_cat_find_brec? Something a bit like this:

diff --git a/fs/hfs/catalog.c b/fs/hfs/catalog.c
index b80ba40e3877..5f2b4e223e68 100644
--- a/fs/hfs/catalog.c
+++ b/fs/hfs/catalog.c
@@ -185,10 +185,11 @@ int hfs_cat_keycmp(const btree_key *key1, const btree_key *key2)
 /* Try to get a catalog entry for given catalog id */
 // move to read_super???
 int hfs_cat_find_brec(struct super_block *sb, u32 cnid,
-                     struct hfs_find_data *fd)
+                     struct hfs_find_data *fd, hfs_cat_rec *rec_out)
 {
        hfs_cat_rec rec;
        int res, len, type;
+       u32 cat_cnid;

        hfs_cat_build_key(sb, fd->search_key, cnid, NULL);
        res = hfs_brec_read(fd, &rec, sizeof(rec));
@@ -208,7 +209,37 @@ int hfs_cat_find_brec(struct super_block *sb, u32 cnid,
                return -EIO;
        }
        memcpy(fd->search_key->cat.CName.name, rec.thread.CName.name, len);
-       return hfs_brec_find(fd);
+       res = hfs_brec_find(fd);
+       if (res)
+               return res;
+
+       if (fd->entrylength > sizeof(rec)) {
+               pr_err("Bad catalog entry size\n");
+               return -EIO;
+       }
+       hfs_bnode_read(fd->bnode, &rec, fd->entryoffset, fd->entrylength);
+       type = rec.type;
+       switch (type) {
+       case HFS_CDR_FIL:
+               cat_cnid = be32_to_cpu(rec.file.FlNum);
+               break;
+       case HFS_CDR_DIR:
+               cat_cnid = be32_to_cpu(rec.dir.DirID);
+               break;
+       default:
+               pr_err("Bad catalog record type\n");
+               pr_err("Filesystem may be corrupted, try performing fsck\n");
+               return -EIO;
+       }
+
+       if (cat_cnid != cnid) {
+               pr_err("Bad catalog record CNID %u, expected %u\n", cat_cnid, cnid);
+               pr_err("Volume is probably corrupted, try performing fsck.\n");
+               return -EIO;
+       }
+
+       *rec_out = rec;
+       return 0;
 }

 static inline


Thanks,

George

[1] https://elixir.bootlin.com/linux/v7.0-rc4/source/fs/hfs/catalog.c#L193
[2] https://elixir.bootlin.com/linux/v7.0-rc4/source/fs/hfs/bfind.c#L179
[3] https://elixir.bootlin.com/linux/v7.0-rc4/source/fs/hfs/catalog.c#L204
[4] https://elixir.bootlin.com/linux/v7.0-rc4/source/fs/hfs/catalog.c#L211

  reply	other threads:[~2026-03-18 22:49 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-11 21:13 [PATCH v4] hfs: Validate CNIDs in hfs_read_inode George Anthony Vernon
2026-03-12 10:45 ` Tetsuo Handa
2026-03-12 23:13   ` Viacheslav Dubeyko
2026-03-13 11:03     ` Tetsuo Handa
2026-03-13 18:40       ` Viacheslav Dubeyko
2026-03-14  6:35         ` Tetsuo Handa
2026-03-16 21:50           ` Viacheslav Dubeyko
2026-03-18  0:37             ` George Anthony Vernon
2026-03-18  2:02               ` Viacheslav Dubeyko
2026-03-18 22:49                 ` George Anthony Vernon [this message]
2026-03-19  9:57                   ` Tetsuo Handa
2026-03-19 12:26                     ` George Vernon
2026-03-20  0:32                       ` Tetsuo Handa
2026-03-18 10:42             ` Tetsuo Handa
2026-03-18  0:10   ` George Anthony Vernon
2026-03-18  8:16     ` Tetsuo Handa
2026-03-12 23:07 ` Viacheslav Dubeyko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=absr9g7LaMEHfw47@Bertha \
    --to=contact@gvernon.com \
    --cc=Slava.Dubeyko@ibm.com \
    --cc=frank.li@vivo.com \
    --cc=glaubitz@physik.fu-berlin.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    --cc=slava@dubeyko.com \
    --cc=syzbot+97e301b4b82ae803d21b@syzkaller.appspotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox