[PATCH 0/2] xfs: random patches on log recovery

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/2] xfs: random patches on log recovery
@ 2020-09-04  8:25 Gao Xiang
  2020-09-04  8:25 ` [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size} Gao Xiang
  2020-09-04  8:25 ` [PATCH v2 2/2] xfs: clean up calculation of LR header blocks Gao Xiang
  0 siblings, 2 replies; 13+ messages in thread
From: Gao Xiang @ 2020-09-04  8:25 UTC (permalink / raw)
  To: linux-xfs; +Cc: Darrick J . Wong, Brian Foster, Gao Xiang

Hi folks,

Here are some patches after I read recovery code days ago.
Due to code coupling from this version, I send them as a patchset.

I already ran fstests and no strange out and
changelog is in each individual patch.

Thanks,
Gao Xiang

Gao Xiang (2):
  xfs: avoid LR buffer overrun due to crafted h_{len,size}
  xfs: clean up calculation of LR header blocks

 fs/xfs/xfs_log.c         |   4 +-
 fs/xfs/xfs_log_recover.c | 103 +++++++++++++++++++--------------------
 2 files changed, 51 insertions(+), 56 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size}
  2020-09-04  8:25 [PATCH 0/2] xfs: random patches on log recovery Gao Xiang
@ 2020-09-04  8:25 ` Gao Xiang
  2020-09-04 11:25   ` Brian Foster
  2020-09-04  8:25 ` [PATCH v2 2/2] xfs: clean up calculation of LR header blocks Gao Xiang
  1 sibling, 1 reply; 13+ messages in thread
From: Gao Xiang @ 2020-09-04  8:25 UTC (permalink / raw)
  To: linux-xfs; +Cc: Darrick J . Wong, Brian Foster, Gao Xiang

Currently, crafted h_len has been blocked for the log
header of the tail block in commit a70f9fe52daa ("xfs:
detect and handle invalid iclog size set by mkfs").

However, each log record could still have crafted h_len,
h_size and cause log record buffer overrun. So let's
check (h_len vs h_size) and (h_size vs buffer size)
for each log record as well instead.

Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
---
v2: https://lore.kernel.org/r/20200902141923.26422-1-hsiangkao@redhat.com

changes since v2:
 - rename argument h_size to bufsize to make it clear (Brian);
 - leave the mkfs workaround logic in xlog_do_recovery_pass() (Brian);
 - add XLOG_VERSION_2 checking logic since old logrecv1 doesn't have
   h_size field just to be safe.

 fs/xfs/xfs_log_recover.c | 50 +++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index e2ec91b2d0f4..28d952794bfa 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2904,9 +2904,10 @@ STATIC int
 xlog_valid_rec_header(
 	struct xlog		*log,
 	struct xlog_rec_header	*rhead,
-	xfs_daddr_t		blkno)
+	xfs_daddr_t		blkno,
+	int			bufsize)
 {
-	int			hlen;
+	int			hlen, hsize = XLOG_BIG_RECORD_BSIZE;
 
 	if (XFS_IS_CORRUPT(log->l_mp,
 			   rhead->h_magicno != cpu_to_be32(XLOG_HEADER_MAGIC_NUM)))
@@ -2920,10 +2921,22 @@ xlog_valid_rec_header(
 		return -EFSCORRUPTED;
 	}
 
-	/* LR body must have data or it wouldn't have been written */
+	/*
+	 * LR body must have data (or it wouldn't have been written) and
+	 * h_len must not be greater than h_size with one exception (see
+	 * comments in xlog_do_recovery_pass()).
+	 */
 	hlen = be32_to_cpu(rhead->h_len);
-	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX))
+	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) &&
+	    (be32_to_cpu(rhead->h_version) & XLOG_VERSION_2))
+		hsize = be32_to_cpu(rhead->h_size);
+
+	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > hsize))
 		return -EFSCORRUPTED;
+
+	if (bufsize && XFS_IS_CORRUPT(log->l_mp, bufsize < hsize))
+		return -EFSCORRUPTED;
+
 	if (XFS_IS_CORRUPT(log->l_mp,
 			   blkno > log->l_logBBsize || blkno > INT_MAX))
 		return -EFSCORRUPTED;
@@ -2984,9 +2997,6 @@ xlog_do_recovery_pass(
 			goto bread_err1;
 
 		rhead = (xlog_rec_header_t *)offset;
-		error = xlog_valid_rec_header(log, rhead, tail_blk);
-		if (error)
-			goto bread_err1;
 
 		/*
 		 * xfsprogs has a bug where record length is based on lsunit but
@@ -3001,21 +3011,19 @@ xlog_do_recovery_pass(
 		 */
 		h_size = be32_to_cpu(rhead->h_size);
 		h_len = be32_to_cpu(rhead->h_len);
-		if (h_len > h_size) {
-			if (h_len <= log->l_mp->m_logbsize &&
-			    be32_to_cpu(rhead->h_num_logops) == 1) {
-				xfs_warn(log->l_mp,
+		if (h_len > h_size && h_len <= log->l_mp->m_logbsize &&
+		    rhead->h_num_logops == cpu_to_be32(1)) {
+			xfs_warn(log->l_mp,
 		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
-					 h_size, log->l_mp->m_logbsize);
-				h_size = log->l_mp->m_logbsize;
-			} else {
-				XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW,
-						log->l_mp);
-				error = -EFSCORRUPTED;
-				goto bread_err1;
-			}
+				 h_size, log->l_mp->m_logbsize);
+			h_size = log->l_mp->m_logbsize;
+			rhead->h_size = cpu_to_be32(h_size);
 		}
 
+		error = xlog_valid_rec_header(log, rhead, tail_blk, 0);
+		if (error)
+			goto bread_err1;
+
 		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
 		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
 			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
@@ -3096,7 +3104,7 @@ xlog_do_recovery_pass(
 			}
 			rhead = (xlog_rec_header_t *)offset;
 			error = xlog_valid_rec_header(log, rhead,
-						split_hblks ? blk_no : 0);
+					split_hblks ? blk_no : 0, h_size);
 			if (error)
 				goto bread_err2;
 
@@ -3177,7 +3185,7 @@ xlog_do_recovery_pass(
 			goto bread_err2;
 
 		rhead = (xlog_rec_header_t *)offset;
-		error = xlog_valid_rec_header(log, rhead, blk_no);
+		error = xlog_valid_rec_header(log, rhead, blk_no, h_size);
 		if (error)
 			goto bread_err2;
 
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size}
  2020-09-04  8:25 ` [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size} Gao Xiang
@ 2020-09-04 11:25   ` Brian Foster
  2020-09-04 12:46     ` Gao Xiang
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2020-09-04 11:25 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 04:25:15PM +0800, Gao Xiang wrote:
> Currently, crafted h_len has been blocked for the log
> header of the tail block in commit a70f9fe52daa ("xfs:
> detect and handle invalid iclog size set by mkfs").
> 
> However, each log record could still have crafted h_len,
> h_size and cause log record buffer overrun. So let's
> check (h_len vs h_size) and (h_size vs buffer size)
> for each log record as well instead.
> 
> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> ---
> v2: https://lore.kernel.org/r/20200902141923.26422-1-hsiangkao@redhat.com
> 
> changes since v2:
>  - rename argument h_size to bufsize to make it clear (Brian);
>  - leave the mkfs workaround logic in xlog_do_recovery_pass() (Brian);
>  - add XLOG_VERSION_2 checking logic since old logrecv1 doesn't have
>    h_size field just to be safe.
> 
>  fs/xfs/xfs_log_recover.c | 50 +++++++++++++++++++++++-----------------
>  1 file changed, 29 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index e2ec91b2d0f4..28d952794bfa 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2904,9 +2904,10 @@ STATIC int
>  xlog_valid_rec_header(
>  	struct xlog		*log,
>  	struct xlog_rec_header	*rhead,
> -	xfs_daddr_t		blkno)
> +	xfs_daddr_t		blkno,
> +	int			bufsize)
>  {
> -	int			hlen;
> +	int			hlen, hsize = XLOG_BIG_RECORD_BSIZE;
>  
>  	if (XFS_IS_CORRUPT(log->l_mp,
>  			   rhead->h_magicno != cpu_to_be32(XLOG_HEADER_MAGIC_NUM)))
> @@ -2920,10 +2921,22 @@ xlog_valid_rec_header(
>  		return -EFSCORRUPTED;
>  	}
>  
> -	/* LR body must have data or it wouldn't have been written */
> +	/*
> +	 * LR body must have data (or it wouldn't have been written) and
> +	 * h_len must not be greater than h_size with one exception (see
> +	 * comments in xlog_do_recovery_pass()).
> +	 */

I wouldn't mention the exceptional case at all here since I think it
just adds confusion. It's an unfortunate wart with mkfs that requires a
kernel workaround, and I think it's better to keep it one place. I.e.,
should it ever be removed, I find it unlikely somebody will notice this
comment and fix it up accordingly.

>  	hlen = be32_to_cpu(rhead->h_len);
> -	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > INT_MAX))
> +	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb) &&
> +	    (be32_to_cpu(rhead->h_version) & XLOG_VERSION_2))
> +		hsize = be32_to_cpu(rhead->h_size);
> +
> +	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > hsize))
>  		return -EFSCORRUPTED;
> +
> +	if (bufsize && XFS_IS_CORRUPT(log->l_mp, bufsize < hsize))
> +		return -EFSCORRUPTED;

Please do something like the following so the full corruption check
logic is readable:

	if (XFS_IS_CORRUPT(..., bufsize && hsize > bufsize))
		return -EFSCORRUPTED;

> +
>  	if (XFS_IS_CORRUPT(log->l_mp,
>  			   blkno > log->l_logBBsize || blkno > INT_MAX))
>  		return -EFSCORRUPTED;
> @@ -2984,9 +2997,6 @@ xlog_do_recovery_pass(
>  			goto bread_err1;
>  
>  		rhead = (xlog_rec_header_t *)offset;
> -		error = xlog_valid_rec_header(log, rhead, tail_blk);
> -		if (error)
> -			goto bread_err1;

This technically defers broader corruption checks (i.e., magic number,
etc.) until after functional code starts using other fields below. I
don't think we should remove this.

>  
>  		/*
>  		 * xfsprogs has a bug where record length is based on lsunit but
> @@ -3001,21 +3011,19 @@ xlog_do_recovery_pass(
>  		 */
>  		h_size = be32_to_cpu(rhead->h_size);
>  		h_len = be32_to_cpu(rhead->h_len);
> -		if (h_len > h_size) {
> -			if (h_len <= log->l_mp->m_logbsize &&
> -			    be32_to_cpu(rhead->h_num_logops) == 1) {
> -				xfs_warn(log->l_mp,
> +		if (h_len > h_size && h_len <= log->l_mp->m_logbsize &&
> +		    rhead->h_num_logops == cpu_to_be32(1)) {
> +			xfs_warn(log->l_mp,
>  		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
> -					 h_size, log->l_mp->m_logbsize);
> -				h_size = log->l_mp->m_logbsize;
> -			} else {
> -				XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW,
> -						log->l_mp);
> -				error = -EFSCORRUPTED;
> -				goto bread_err1;
> -			}
> +				 h_size, log->l_mp->m_logbsize);
> +			h_size = log->l_mp->m_logbsize;
> +			rhead->h_size = cpu_to_be32(h_size);

I don't think we should update rhead like this, particularly in a rare
and exclusive case. This structure should reflect what is on disk.

All in all, I think this patch should be much more focused:

1.) Add the bufsize parameter and associated corruption check to
xlog_valid_rec_header().
2.) Pass the related value from the existing calls.
3.) (Optional) If there's reason to revalidate after executing the mkfs
workaround, add a second call within the branch that implements the
h_size workaround.

Also, please test the workaround case to make sure it still works as
expected (if you haven't already).

Brian

>  		}
>  
> +		error = xlog_valid_rec_header(log, rhead, tail_blk, 0);
> +		if (error)
> +			goto bread_err1;
> +
>  		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
>  		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
>  			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
> @@ -3096,7 +3104,7 @@ xlog_do_recovery_pass(
>  			}
>  			rhead = (xlog_rec_header_t *)offset;
>  			error = xlog_valid_rec_header(log, rhead,
> -						split_hblks ? blk_no : 0);
> +					split_hblks ? blk_no : 0, h_size);
>  			if (error)
>  				goto bread_err2;
>  
> @@ -3177,7 +3185,7 @@ xlog_do_recovery_pass(
>  			goto bread_err2;
>  
>  		rhead = (xlog_rec_header_t *)offset;
> -		error = xlog_valid_rec_header(log, rhead, blk_no);
> +		error = xlog_valid_rec_header(log, rhead, blk_no, h_size);
>  		if (error)
>  			goto bread_err2;
>  
> -- 
> 2.18.1
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size}
  2020-09-04 11:25   ` Brian Foster
@ 2020-09-04 12:46     ` Gao Xiang
  2020-09-04 13:37       ` Brian Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Gao Xiang @ 2020-09-04 12:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs, Darrick J . Wong

Hi Brian,

On Fri, Sep 04, 2020 at 07:25:29AM -0400, Brian Foster wrote:
> On Fri, Sep 04, 2020 at 04:25:15PM +0800, Gao Xiang wrote:

...

> > @@ -2904,9 +2904,10 @@ STATIC int
> >  xlog_valid_rec_header(
> >  	struct xlog		*log,
> >  	struct xlog_rec_header	*rhead,
> > -	xfs_daddr_t		blkno)
> > +	xfs_daddr_t		blkno,
> > +	int			bufsize)
> >  {
> > -	int			hlen;
> > +	int			hlen, hsize = XLOG_BIG_RECORD_BSIZE;
> >  
> >  	if (XFS_IS_CORRUPT(log->l_mp,
> >  			   rhead->h_magicno != cpu_to_be32(XLOG_HEADER_MAGIC_NUM)))
> > @@ -2920,10 +2921,22 @@ xlog_valid_rec_header(
> >  		return -EFSCORRUPTED;
> >  	}
> >  
> > -	/* LR body must have data or it wouldn't have been written */
> > +	/*
> > +	 * LR body must have data (or it wouldn't have been written) and
> > +	 * h_len must not be greater than h_size with one exception (see
> > +	 * comments in xlog_do_recovery_pass()).
> > +	 */
> 
> I wouldn't mention the exceptional case at all here since I think it
> just adds confusion. It's an unfortunate wart with mkfs that requires a
> kernel workaround, and I think it's better to keep it one place. I.e.,
> should it ever be removed, I find it unlikely somebody will notice this
> comment and fix it up accordingly.

Thanks for your review.

ok, if I understand correctly, will remove this "with one exception
(see comments..." expression. Please kindly correct me if I
misunderstand.

> 
> > +
> > +	if (XFS_IS_CORRUPT(log->l_mp, hlen <= 0 || hlen > hsize))
> >  		return -EFSCORRUPTED;
> > +
> > +	if (bufsize && XFS_IS_CORRUPT(log->l_mp, bufsize < hsize))
> > +		return -EFSCORRUPTED;
> 
> Please do something like the following so the full corruption check
> logic is readable:
> 
> 	if (XFS_IS_CORRUPT(..., bufsize && hsize > bufsize))
> 		return -EFSCORRUPTED;

That is good idea, will update this as well. 

>

...

> >  		rhead = (xlog_rec_header_t *)offset;
> > -		error = xlog_valid_rec_header(log, rhead, tail_blk);
> > -		if (error)
> > -			goto bread_err1;
> 
> This technically defers broader corruption checks (i.e., magic number,
> etc.) until after functional code starts using other fields below. I
> don't think we should remove this.
> 

I'm trying to combine this with the following part...(see below...)

> >  
> >  		/*
> >  		 * xfsprogs has a bug where record length is based on lsunit but
> > @@ -3001,21 +3011,19 @@ xlog_do_recovery_pass(
> >  		 */
> >  		h_size = be32_to_cpu(rhead->h_size);
> >  		h_len = be32_to_cpu(rhead->h_len);
> > -		if (h_len > h_size) {
> > -			if (h_len <= log->l_mp->m_logbsize &&
> > -			    be32_to_cpu(rhead->h_num_logops) == 1) {
> > -				xfs_warn(log->l_mp,
> > +		if (h_len > h_size && h_len <= log->l_mp->m_logbsize &&
> > +		    rhead->h_num_logops == cpu_to_be32(1)) {
> > +			xfs_warn(log->l_mp,
> >  		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
> > -					 h_size, log->l_mp->m_logbsize);
> > -				h_size = log->l_mp->m_logbsize;
> > -			} else {
> > -				XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW,
> > -						log->l_mp);
> > -				error = -EFSCORRUPTED;
> > -				goto bread_err1;
> > -			}
> > +				 h_size, log->l_mp->m_logbsize);
> > +			h_size = log->l_mp->m_logbsize;
> > +			rhead->h_size = cpu_to_be32(h_size);
> 
> I don't think we should update rhead like this, particularly in a rare
> and exclusive case. This structure should reflect what is on disk.
> 
> All in all, I think this patch should be much more focused:
> 
> 1.) Add the bufsize parameter and associated corruption check to
> xlog_valid_rec_header().
> 2.) Pass the related value from the existing calls.
> 3.) (Optional) If there's reason to revalidate after executing the mkfs
> workaround, add a second call within the branch that implements the
> h_size workaround.
> 

I moved workaround code to xlog_valid_rec_header() at first is
because in xlog_valid_rec_header() actually it has 2 individual
checks now:

1) check rhead->h_len vs rhead->h_size for each individual log record;
2) check rhead->h_size vs the unique allocated buffer size passed in
   for each record (since each log record has one stored h_size,
   even though there are not used later according to the current
   logic of xlog_do_recovery_pass).

if any of the conditions above is not satisfied, xlog_valid_rec_header()
will make fs corrupted immediately, so I tried 2 ways up to now:

 - (v1,v2) fold in workaround case into xlog_valid_rec_header()
 - (v3) rearrange workaround and xlog_valid_rec_header() order in
        xlog_do_recovery_pass() and modify rhead->h_size to the
        workaround h_size before xlog_valid_rec_header() validation
        so xlog_valid_rec_header() will work as expected since it
        has two individual checks as mentioned above.

If there is some better way, kindly let me know :) and I'd like to
hear other folks about this in advance as well.... so I can go
forward since this part is a bit tricky for now.

> Also, please test the workaround case to make sure it still works as
> expected (if you haven't already).

ok, will double confirm this, thanks!

Thanks,
Gao Xiang

> 
> Brian
>
 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size}
  2020-09-04 12:46     ` Gao Xiang
@ 2020-09-04 13:37       ` Brian Foster
  2020-09-04 15:01         ` Gao Xiang
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2020-09-04 13:37 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 08:46:34PM +0800, Gao Xiang wrote:
> Hi Brian,
> 
> On Fri, Sep 04, 2020 at 07:25:29AM -0400, Brian Foster wrote:
> > On Fri, Sep 04, 2020 at 04:25:15PM +0800, Gao Xiang wrote:
> 
> ...
> 
...
> > > @@ -3001,21 +3011,19 @@ xlog_do_recovery_pass(
> > >  		 */
> > >  		h_size = be32_to_cpu(rhead->h_size);
> > >  		h_len = be32_to_cpu(rhead->h_len);
> > > -		if (h_len > h_size) {
> > > -			if (h_len <= log->l_mp->m_logbsize &&
> > > -			    be32_to_cpu(rhead->h_num_logops) == 1) {
> > > -				xfs_warn(log->l_mp,
> > > +		if (h_len > h_size && h_len <= log->l_mp->m_logbsize &&
> > > +		    rhead->h_num_logops == cpu_to_be32(1)) {
> > > +			xfs_warn(log->l_mp,
> > >  		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
> > > -					 h_size, log->l_mp->m_logbsize);
> > > -				h_size = log->l_mp->m_logbsize;
> > > -			} else {
> > > -				XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW,
> > > -						log->l_mp);
> > > -				error = -EFSCORRUPTED;
> > > -				goto bread_err1;
> > > -			}
> > > +				 h_size, log->l_mp->m_logbsize);
> > > +			h_size = log->l_mp->m_logbsize;
> > > +			rhead->h_size = cpu_to_be32(h_size);
> > 
> > I don't think we should update rhead like this, particularly in a rare
> > and exclusive case. This structure should reflect what is on disk.
> > 
> > All in all, I think this patch should be much more focused:
> > 
> > 1.) Add the bufsize parameter and associated corruption check to
> > xlog_valid_rec_header().
> > 2.) Pass the related value from the existing calls.
> > 3.) (Optional) If there's reason to revalidate after executing the mkfs
> > workaround, add a second call within the branch that implements the
> > h_size workaround.
> > 
> 
> I moved workaround code to xlog_valid_rec_header() at first is
> because in xlog_valid_rec_header() actually it has 2 individual
> checks now:
> 
> 1) check rhead->h_len vs rhead->h_size for each individual log record;
> 2) check rhead->h_size vs the unique allocated buffer size passed in
>    for each record (since each log record has one stored h_size,
>    even though there are not used later according to the current
>    logic of xlog_do_recovery_pass).
> 
> if any of the conditions above is not satisfied, xlog_valid_rec_header()
> will make fs corrupted immediately, so I tried 2 ways up to now:
> 
>  - (v1,v2) fold in workaround case into xlog_valid_rec_header()
>  - (v3) rearrange workaround and xlog_valid_rec_header() order in
>         xlog_do_recovery_pass() and modify rhead->h_size to the
>         workaround h_size before xlog_valid_rec_header() validation
>         so xlog_valid_rec_header() will work as expected since it
>         has two individual checks as mentioned above.
> 
> If there is some better way, kindly let me know :) and I'd like to
> hear other folks about this in advance as well.... so I can go
> forward since this part is a bit tricky for now.
> 

My suggestion is to, at minimum, separate the above two logic changes
into separate patches. Item #2 above is a functional check in that it
ensures each record fits into the record buffer size we've allocated
(based on h_size) at the start of recovery. This item is what my
feedback above was referring to and I think is a fairly straightforward
change.

Item #1 is a bit more nebulous. h_size refers to the iclog size used by
the kernel that wrote to the log. I think it should be uniform across
all records and AFAICT it doesn't serve a functional purpose (validation
notwithstanding) for recovery beyond establishing the size of the buffer
we should allocate to process records. The mkfs workaround implements a
case where we have to ignore the on-disk h_size and use the in-core
iclog size, and that is currently isolated to a single place. I'm not
fundamentally against more h_size verification beyond current usage, but
if that means we have to consider the workaround for multiple records
(which is confusing and incorrect) or make subtle runtime changes like
quietly tweaking the in-core record header structure from what is on
disk, then I don't think it's worth the complexity.

If we _really_ wanted to include such a change, I think a more
appropriate validation check might be to similarly track the h_size from
the initial record and simply verify that the values are uniform across
all processed records. That way we don't conflict or impose the
workaround logic on the underlying log format rules. It also catches the
(unlikely) case that the mkfs workaround is applied on the first record
incorrectly because h_size is corrupt and there are more records that
conflict. Also note that v5 filesystems enforce record CRC checks
anyways, so this still might be of limited value...

Brian

> > Also, please test the workaround case to make sure it still works as
> > expected (if you haven't already).
> 
> ok, will double confirm this, thanks!
> 
> Thanks,
> Gao Xiang
> 
> > 
> > Brian
> >
>  
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size}
  2020-09-04 13:37       ` Brian Foster
@ 2020-09-04 15:01         ` Gao Xiang
  2020-09-04 16:31           ` Brian Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Gao Xiang @ 2020-09-04 15:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 09:37:04AM -0400, Brian Foster wrote:
> On Fri, Sep 04, 2020 at 08:46:34PM +0800, Gao Xiang wrote:
...
> > > > @@ -3001,21 +3011,19 @@ xlog_do_recovery_pass(
> > > >  		 */
> > > >  		h_size = be32_to_cpu(rhead->h_size);
> > > >  		h_len = be32_to_cpu(rhead->h_len);
> > > > -		if (h_len > h_size) {
> > > > -			if (h_len <= log->l_mp->m_logbsize &&
> > > > -			    be32_to_cpu(rhead->h_num_logops) == 1) {
> > > > -				xfs_warn(log->l_mp,
> > > > +		if (h_len > h_size && h_len <= log->l_mp->m_logbsize &&
> > > > +		    rhead->h_num_logops == cpu_to_be32(1)) {
> > > > +			xfs_warn(log->l_mp,
> > > >  		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
> > > > -					 h_size, log->l_mp->m_logbsize);
> > > > -				h_size = log->l_mp->m_logbsize;
> > > > -			} else {
> > > > -				XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW,
> > > > -						log->l_mp);
> > > > -				error = -EFSCORRUPTED;
> > > > -				goto bread_err1;
> > > > -			}
> > > > +				 h_size, log->l_mp->m_logbsize);
> > > > +			h_size = log->l_mp->m_logbsize;
> > > > +			rhead->h_size = cpu_to_be32(h_size);
> > > 
> > > I don't think we should update rhead like this, particularly in a rare
> > > and exclusive case. This structure should reflect what is on disk.
> > > 
> > > All in all, I think this patch should be much more focused:
> > > 
> > > 1.) Add the bufsize parameter and associated corruption check to
> > > xlog_valid_rec_header().
> > > 2.) Pass the related value from the existing calls.
> > > 3.) (Optional) If there's reason to revalidate after executing the mkfs
> > > workaround, add a second call within the branch that implements the
> > > h_size workaround.
> > > 
> > 
> > I moved workaround code to xlog_valid_rec_header() at first is
> > because in xlog_valid_rec_header() actually it has 2 individual
> > checks now:
> > 
> > 1) check rhead->h_len vs rhead->h_size for each individual log record;
> > 2) check rhead->h_size vs the unique allocated buffer size passed in
> >    for each record (since each log record has one stored h_size,
> >    even though there are not used later according to the current
> >    logic of xlog_do_recovery_pass).
> > 
> > if any of the conditions above is not satisfied, xlog_valid_rec_header()
> > will make fs corrupted immediately, so I tried 2 ways up to now:
> > 
> >  - (v1,v2) fold in workaround case into xlog_valid_rec_header()
> >  - (v3) rearrange workaround and xlog_valid_rec_header() order in
> >         xlog_do_recovery_pass() and modify rhead->h_size to the
> >         workaround h_size before xlog_valid_rec_header() validation
> >         so xlog_valid_rec_header() will work as expected since it
> >         has two individual checks as mentioned above.
> > 
> > If there is some better way, kindly let me know :) and I'd like to
> > hear other folks about this in advance as well.... so I can go
> > forward since this part is a bit tricky for now.
> > 
> 
> My suggestion is to, at minimum, separate the above two logic changes
> into separate patches. Item #2 above is a functional check in that it
> ensures each record fits into the record buffer size we've allocated
> (based on h_size) at the start of recovery. This item is what my
> feedback above was referring to and I think is a fairly straightforward
> change.

There are 3-value relationships now (rhead->h_len <= rhead->h_size <=
unique allocated buffer size [which is the same as rhead->h_size of
initial tail log record]).

The only buffer overrun case which could do harm runtimely is
rhead->h_len > bufsize as I mentioned in the previous email
https://lore.kernel.org/r/20200902224726.GB4671@xiangao.remote.csb

so I guess what you meant is

 - get h_size from the tail initial log record as buffer size (so
   workaround needs to be settled here as well);
 - call in xlog_valid_rec_header() to fail out rhead->h_len >
   buffer size.

as the 1st splited patch. Yet that's not the case of Item #2
(rhead->h_size <= buffer size), just to confirm if my understanding
is correct.

> 
> Item #1 is a bit more nebulous. h_size refers to the iclog size used by
> the kernel that wrote to the log. I think it should be uniform across
> all records and AFAICT it doesn't serve a functional purpose (validation
> notwithstanding) for recovery beyond establishing the size of the buffer
> we should allocate to process records. The mkfs workaround implements a
> case where we have to ignore the on-disk h_size and use the in-core
> iclog size, and that is currently isolated to a single place. I'm not
> fundamentally against more h_size verification beyond current usage, but
> if that means we have to consider the workaround for multiple records
> (which is confusing and incorrect) or make subtle runtime changes like
> quietly tweaking the in-core record header structure from what is on
> disk, then I don't think it's worth the complexity.
> 
> If we _really_ wanted to include such a change, I think a more
> appropriate validation check might be to similarly track the h_size from
> the initial record and simply verify that the values are uniform across
> all processed records. That way we don't conflict or impose the
> workaround logic on the underlying log format rules. It also catches the
> (unlikely) case that the mkfs workaround is applied on the first record
> incorrectly because h_size is corrupt and there are more records that
> conflict. Also note that v5 filesystems enforce record CRC checks
> anyways, so this still might be of limited value...

based on the words above, my understanding is that the 2nd spilted
patch is used to do such extra check in xlog_valid_rec_header() to

check buffer size == rhead->h_size all the time?

so the relationship would be rhead->h_len <= buffer size = rhead->h_size.

please correct me if I'm wrong since I'm a bit confusing about some
English expressions. Yeah, if my understanding is correct, I'd like
to hear if this 2nd spilted check patch is needed (it won't
cause any stability impact since rhead->h_size for log records
other than tail record isn't useful due to the current logic...)

Thanks,
Gao Xiang

> 
> Brian
> 
> > > Also, please test the workaround case to make sure it still works as
> > > expected (if you haven't already).
> > 
> > ok, will double confirm this, thanks!
> > 
> > Thanks,
> > Gao Xiang
> > 
> > > 
> > > Brian
> > >
> >  
> > 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size}
  2020-09-04 15:01         ` Gao Xiang
@ 2020-09-04 16:31           ` Brian Foster
  0 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2020-09-04 16:31 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 11:01:59PM +0800, Gao Xiang wrote:
> On Fri, Sep 04, 2020 at 09:37:04AM -0400, Brian Foster wrote:
> > On Fri, Sep 04, 2020 at 08:46:34PM +0800, Gao Xiang wrote:
> ...
> > > > > @@ -3001,21 +3011,19 @@ xlog_do_recovery_pass(
> > > > >  		 */
> > > > >  		h_size = be32_to_cpu(rhead->h_size);
> > > > >  		h_len = be32_to_cpu(rhead->h_len);
> > > > > -		if (h_len > h_size) {
> > > > > -			if (h_len <= log->l_mp->m_logbsize &&
> > > > > -			    be32_to_cpu(rhead->h_num_logops) == 1) {
> > > > > -				xfs_warn(log->l_mp,
> > > > > +		if (h_len > h_size && h_len <= log->l_mp->m_logbsize &&
> > > > > +		    rhead->h_num_logops == cpu_to_be32(1)) {
> > > > > +			xfs_warn(log->l_mp,
> > > > >  		"invalid iclog size (%d bytes), using lsunit (%d bytes)",
> > > > > -					 h_size, log->l_mp->m_logbsize);
> > > > > -				h_size = log->l_mp->m_logbsize;
> > > > > -			} else {
> > > > > -				XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW,
> > > > > -						log->l_mp);
> > > > > -				error = -EFSCORRUPTED;
> > > > > -				goto bread_err1;
> > > > > -			}
> > > > > +				 h_size, log->l_mp->m_logbsize);
> > > > > +			h_size = log->l_mp->m_logbsize;
> > > > > +			rhead->h_size = cpu_to_be32(h_size);
> > > > 
> > > > I don't think we should update rhead like this, particularly in a rare
> > > > and exclusive case. This structure should reflect what is on disk.
> > > > 
> > > > All in all, I think this patch should be much more focused:
> > > > 
> > > > 1.) Add the bufsize parameter and associated corruption check to
> > > > xlog_valid_rec_header().
> > > > 2.) Pass the related value from the existing calls.
> > > > 3.) (Optional) If there's reason to revalidate after executing the mkfs
> > > > workaround, add a second call within the branch that implements the
> > > > h_size workaround.
> > > > 
> > > 
> > > I moved workaround code to xlog_valid_rec_header() at first is
> > > because in xlog_valid_rec_header() actually it has 2 individual
> > > checks now:
> > > 
> > > 1) check rhead->h_len vs rhead->h_size for each individual log record;
> > > 2) check rhead->h_size vs the unique allocated buffer size passed in
> > >    for each record (since each log record has one stored h_size,
> > >    even though there are not used later according to the current
> > >    logic of xlog_do_recovery_pass).
> > > 
> > > if any of the conditions above is not satisfied, xlog_valid_rec_header()
> > > will make fs corrupted immediately, so I tried 2 ways up to now:
> > > 
> > >  - (v1,v2) fold in workaround case into xlog_valid_rec_header()
> > >  - (v3) rearrange workaround and xlog_valid_rec_header() order in
> > >         xlog_do_recovery_pass() and modify rhead->h_size to the
> > >         workaround h_size before xlog_valid_rec_header() validation
> > >         so xlog_valid_rec_header() will work as expected since it
> > >         has two individual checks as mentioned above.
> > > 
> > > If there is some better way, kindly let me know :) and I'd like to
> > > hear other folks about this in advance as well.... so I can go
> > > forward since this part is a bit tricky for now.
> > > 
> > 
> > My suggestion is to, at minimum, separate the above two logic changes
> > into separate patches. Item #2 above is a functional check in that it
> > ensures each record fits into the record buffer size we've allocated
> > (based on h_size) at the start of recovery. This item is what my
> > feedback above was referring to and I think is a fairly straightforward
> > change.
> 
> There are 3-value relationships now (rhead->h_len <= rhead->h_size <=
> unique allocated buffer size [which is the same as rhead->h_size of
> initial tail log record]).
> 
> The only buffer overrun case which could do harm runtimely is
> rhead->h_len > bufsize as I mentioned in the previous email
> https://lore.kernel.org/r/20200902224726.GB4671@xiangao.remote.csb
> 
> so I guess what you meant is
> 
>  - get h_size from the tail initial log record as buffer size (so
>    workaround needs to be settled here as well);
>  - call in xlog_valid_rec_header() to fail out rhead->h_len >
>    buffer size.
> 
> as the 1st splited patch. Yet that's not the case of Item #2
> (rhead->h_size <= buffer size), just to confirm if my understanding
> is correct.
> 

Yep.

> > 
> > Item #1 is a bit more nebulous. h_size refers to the iclog size used by
> > the kernel that wrote to the log. I think it should be uniform across
> > all records and AFAICT it doesn't serve a functional purpose (validation
> > notwithstanding) for recovery beyond establishing the size of the buffer
> > we should allocate to process records. The mkfs workaround implements a
> > case where we have to ignore the on-disk h_size and use the in-core
> > iclog size, and that is currently isolated to a single place. I'm not
> > fundamentally against more h_size verification beyond current usage, but
> > if that means we have to consider the workaround for multiple records
> > (which is confusing and incorrect) or make subtle runtime changes like
> > quietly tweaking the in-core record header structure from what is on
> > disk, then I don't think it's worth the complexity.
> > 
> > If we _really_ wanted to include such a change, I think a more
> > appropriate validation check might be to similarly track the h_size from
> > the initial record and simply verify that the values are uniform across
> > all processed records. That way we don't conflict or impose the
> > workaround logic on the underlying log format rules. It also catches the
> > (unlikely) case that the mkfs workaround is applied on the first record
> > incorrectly because h_size is corrupt and there are more records that
> > conflict. Also note that v5 filesystems enforce record CRC checks
> > anyways, so this still might be of limited value...
> 
> based on the words above, my understanding is that the 2nd spilted
> patch is used to do such extra check in xlog_valid_rec_header() to
> 
> check buffer size == rhead->h_size all the time?
> 
> so the relationship would be rhead->h_len <= buffer size = rhead->h_size.
> 
> please correct me if I'm wrong since I'm a bit confusing about some
> English expressions. Yeah, if my understanding is correct, I'd like
> to hear if this 2nd spilted check patch is needed (it won't
> cause any stability impact since rhead->h_size for log records
> other than tail record isn't useful due to the current logic...)
> 

Well I'm not totally convinced the second check is really needed (for
the reasons you mentioned), so this was just an idea of something I
think might be more acceptable than the current approach. The idea was
just to make the validation ensure that h_size is the same across all
log records and it would be completely indepedendent from the mkfs
workaround and the h_len <= bufsize check.

Brian

> Thanks,
> Gao Xiang
> 
> > 
> > Brian
> > 
> > > > Also, please test the workaround case to make sure it still works as
> > > > expected (if you haven't already).
> > > 
> > > ok, will double confirm this, thanks!
> > > 
> > > Thanks,
> > > Gao Xiang
> > > 
> > > > 
> > > > Brian
> > > >
> > >  
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 2/2] xfs: clean up calculation of LR header blocks
  2020-09-04  8:25 [PATCH 0/2] xfs: random patches on log recovery Gao Xiang
  2020-09-04  8:25 ` [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size} Gao Xiang
@ 2020-09-04  8:25 ` Gao Xiang
  2020-09-04 11:25   ` Brian Foster
  1 sibling, 1 reply; 13+ messages in thread
From: Gao Xiang @ 2020-09-04  8:25 UTC (permalink / raw)
  To: linux-xfs; +Cc: Darrick J . Wong, Brian Foster, Gao Xiang

Let's use DIV_ROUND_UP() to calculate log record header
blocks as what did in xlog_get_iclog_buffer_size() and
wrap up common helpers for log recovery.

Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
---
v1: https://lore.kernel.org/r/20200902140923.24392-1-hsiangkao@redhat.com

changes since v1:
 - add another helper xlog_logrec_hblks() for the cases with
   xfs_sb_version_haslogv2(), and use xlog_logrecv2_hblks()
   for the case of xlog_do_recovery_pass() since it has more
   complex logic other than just calculate hblks...

 fs/xfs/xfs_log.c         |  4 +--
 fs/xfs/xfs_log_recover.c | 53 ++++++++++++++++------------------------
 2 files changed, 22 insertions(+), 35 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index ad0c69ee8947..7a4ba408a3a2 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1604,9 +1604,7 @@ xlog_cksum(
 		int		i;
 		int		xheads;
 
-		xheads = size / XLOG_HEADER_CYCLE_SIZE;
-		if (size % XLOG_HEADER_CYCLE_SIZE)
-			xheads++;
+		xheads = DIV_ROUND_UP(size, XLOG_HEADER_CYCLE_SIZE);
 
 		for (i = 1; i < xheads; i++) {
 			crc = crc32c(crc, &xhdr[i].hic_xheader,
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 28d952794bfa..c6163065f6e0 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -397,6 +397,23 @@ xlog_find_verify_cycle(
 	return error;
 }
 
+static inline int xlog_logrecv2_hblks(struct xlog_rec_header *rh)
+{
+	int	h_size = be32_to_cpu(rh->h_size);
+
+	if ((be32_to_cpu(rh->h_version) & XLOG_VERSION_2) &&
+	    h_size > XLOG_HEADER_CYCLE_SIZE)
+		return DIV_ROUND_UP(h_size, XLOG_HEADER_CYCLE_SIZE);
+	return 1;
+}
+
+static inline int xlog_logrec_hblks(struct xlog *log, xlog_rec_header_t *rh)
+{
+	if (!xfs_sb_version_haslogv2(&log->l_mp->m_sb))
+		return 1;
+	return xlog_logrecv2_hblks(rh);
+}
+
 /*
  * Potentially backup over partial log record write.
  *
@@ -489,15 +506,7 @@ xlog_find_verify_log_record(
 	 * reset last_blk.  Only when last_blk points in the middle of a log
 	 * record do we update last_blk.
 	 */
-	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
-		uint	h_size = be32_to_cpu(head->h_size);
-
-		xhdrs = h_size / XLOG_HEADER_CYCLE_SIZE;
-		if (h_size % XLOG_HEADER_CYCLE_SIZE)
-			xhdrs++;
-	} else {
-		xhdrs = 1;
-	}
+	xhdrs = xlog_logrec_hblks(log, head);
 
 	if (*last_blk - i + extra_bblks !=
 	    BTOBB(be32_to_cpu(head->h_len)) + xhdrs)
@@ -1184,22 +1193,7 @@ xlog_check_unmount_rec(
 	 * below. We won't want to clear the unmount record if there is one, so
 	 * we pass the lsn of the unmount record rather than the block after it.
 	 */
-	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
-		int	h_size = be32_to_cpu(rhead->h_size);
-		int	h_version = be32_to_cpu(rhead->h_version);
-
-		if ((h_version & XLOG_VERSION_2) &&
-		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
-			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
-			if (h_size % XLOG_HEADER_CYCLE_SIZE)
-				hblks++;
-		} else {
-			hblks = 1;
-		}
-	} else {
-		hblks = 1;
-	}
-
+	hblks = xlog_logrec_hblks(log, rhead);
 	after_umount_blk = xlog_wrap_logbno(log,
 			rhead_blk + hblks + BTOBB(be32_to_cpu(rhead->h_len)));
 
@@ -3024,15 +3018,10 @@ xlog_do_recovery_pass(
 		if (error)
 			goto bread_err1;
 
-		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
-		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
-			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
-			if (h_size % XLOG_HEADER_CYCLE_SIZE)
-				hblks++;
+		hblks = xlog_logrecv2_hblks(rhead);
+		if (hblks != 1) {
 			kmem_free(hbp);
 			hbp = xlog_alloc_buffer(log, hblks);
-		} else {
-			hblks = 1;
 		}
 	} else {
 		ASSERT(log->l_sectBBsize == 1);
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] xfs: clean up calculation of LR header blocks
  2020-09-04  8:25 ` [PATCH v2 2/2] xfs: clean up calculation of LR header blocks Gao Xiang
@ 2020-09-04 11:25   ` Brian Foster
  2020-09-04 12:59     ` Gao Xiang
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2020-09-04 11:25 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 04:25:16PM +0800, Gao Xiang wrote:
> Let's use DIV_ROUND_UP() to calculate log record header
> blocks as what did in xlog_get_iclog_buffer_size() and
> wrap up common helpers for log recovery.
> 
> Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
> ---
> v1: https://lore.kernel.org/r/20200902140923.24392-1-hsiangkao@redhat.com
> 
> changes since v1:
>  - add another helper xlog_logrec_hblks() for the cases with
>    xfs_sb_version_haslogv2(), and use xlog_logrecv2_hblks()
>    for the case of xlog_do_recovery_pass() since it has more
>    complex logic other than just calculate hblks...
> 
>  fs/xfs/xfs_log.c         |  4 +--
>  fs/xfs/xfs_log_recover.c | 53 ++++++++++++++++------------------------
>  2 files changed, 22 insertions(+), 35 deletions(-)
> 
...
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 28d952794bfa..c6163065f6e0 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -397,6 +397,23 @@ xlog_find_verify_cycle(
>  	return error;
>  }
>  
> +static inline int xlog_logrecv2_hblks(struct xlog_rec_header *rh)
> +{
> +	int	h_size = be32_to_cpu(rh->h_size);
> +
> +	if ((be32_to_cpu(rh->h_version) & XLOG_VERSION_2) &&
> +	    h_size > XLOG_HEADER_CYCLE_SIZE)
> +		return DIV_ROUND_UP(h_size, XLOG_HEADER_CYCLE_SIZE);
> +	return 1;
> +}
> +
> +static inline int xlog_logrec_hblks(struct xlog *log, xlog_rec_header_t *rh)
> +{
> +	if (!xfs_sb_version_haslogv2(&log->l_mp->m_sb))
> +		return 1;
> +	return xlog_logrecv2_hblks(rh);
> +}
> +

h_version is assigned based on xfs_sb_version_haslogv2() in the first
place so I'm not sure I see the need for multiple helpers like this, at
least with the current code. I can't really speak to why some code
checks the feature bit and/or the record header version and not the
other way around, but perhaps there's some historical reason I'm not
aware of. Regardless, is there ever a case where
xfs_sb_version_haslogv2() == true and h_version != 2? That strikes me as
more of a corruption scenario than anything..

Brian

>  /*
>   * Potentially backup over partial log record write.
>   *
> @@ -489,15 +506,7 @@ xlog_find_verify_log_record(
>  	 * reset last_blk.  Only when last_blk points in the middle of a log
>  	 * record do we update last_blk.
>  	 */
> -	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
> -		uint	h_size = be32_to_cpu(head->h_size);
> -
> -		xhdrs = h_size / XLOG_HEADER_CYCLE_SIZE;
> -		if (h_size % XLOG_HEADER_CYCLE_SIZE)
> -			xhdrs++;
> -	} else {
> -		xhdrs = 1;
> -	}
> +	xhdrs = xlog_logrec_hblks(log, head);
>  
>  	if (*last_blk - i + extra_bblks !=
>  	    BTOBB(be32_to_cpu(head->h_len)) + xhdrs)
> @@ -1184,22 +1193,7 @@ xlog_check_unmount_rec(
>  	 * below. We won't want to clear the unmount record if there is one, so
>  	 * we pass the lsn of the unmount record rather than the block after it.
>  	 */
> -	if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
> -		int	h_size = be32_to_cpu(rhead->h_size);
> -		int	h_version = be32_to_cpu(rhead->h_version);
> -
> -		if ((h_version & XLOG_VERSION_2) &&
> -		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
> -			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
> -			if (h_size % XLOG_HEADER_CYCLE_SIZE)
> -				hblks++;
> -		} else {
> -			hblks = 1;
> -		}
> -	} else {
> -		hblks = 1;
> -	}
> -
> +	hblks = xlog_logrec_hblks(log, rhead);
>  	after_umount_blk = xlog_wrap_logbno(log,
>  			rhead_blk + hblks + BTOBB(be32_to_cpu(rhead->h_len)));
>  
> @@ -3024,15 +3018,10 @@ xlog_do_recovery_pass(
>  		if (error)
>  			goto bread_err1;
>  
> -		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
> -		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
> -			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
> -			if (h_size % XLOG_HEADER_CYCLE_SIZE)
> -				hblks++;
> +		hblks = xlog_logrecv2_hblks(rhead);
> +		if (hblks != 1) {
>  			kmem_free(hbp);
>  			hbp = xlog_alloc_buffer(log, hblks);
> -		} else {
> -			hblks = 1;
>  		}
>  	} else {
>  		ASSERT(log->l_sectBBsize == 1);
> -- 
> 2.18.1
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] xfs: clean up calculation of LR header blocks
  2020-09-04 11:25   ` Brian Foster
@ 2020-09-04 12:59     ` Gao Xiang
  2020-09-04 13:37       ` Brian Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Gao Xiang @ 2020-09-04 12:59 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs, Darrick J . Wong

Hi Brian,

On Fri, Sep 04, 2020 at 07:25:48AM -0400, Brian Foster wrote:
> On Fri, Sep 04, 2020 at 04:25:16PM +0800, Gao Xiang wrote:

...

> >  
> > +static inline int xlog_logrecv2_hblks(struct xlog_rec_header *rh)
> > +{
> > +	int	h_size = be32_to_cpu(rh->h_size);
> > +
> > +	if ((be32_to_cpu(rh->h_version) & XLOG_VERSION_2) &&
> > +	    h_size > XLOG_HEADER_CYCLE_SIZE)
> > +		return DIV_ROUND_UP(h_size, XLOG_HEADER_CYCLE_SIZE);
> > +	return 1;
> > +}
> > +
> > +static inline int xlog_logrec_hblks(struct xlog *log, xlog_rec_header_t *rh)
> > +{
> > +	if (!xfs_sb_version_haslogv2(&log->l_mp->m_sb))
> > +		return 1;
> > +	return xlog_logrecv2_hblks(rh);
> > +}
> > +
> 
> h_version is assigned based on xfs_sb_version_haslogv2() in the first
> place so I'm not sure I see the need for multiple helpers like this, at
> least with the current code. I can't really speak to why some code
> checks the feature bit and/or the record header version and not the
> other way around, but perhaps there's some historical reason I'm not
> aware of. Regardless, is there ever a case where
> xfs_sb_version_haslogv2() == true and h_version != 2? That strikes me as
> more of a corruption scenario than anything..

Thanks for this.

Honestly, I'm not quite sure if xfs_sb_version_haslogv2() == true and
h_version != 2 is useful (or existed historially)... anyway, that is
another seperate topic though...

Could you kindly give me some code flow on your preferred way about
this so I could update this patch proper (since we have a complex
case in xlog_do_recovery_pass(), I'm not sure how the unique helper
will be like because there are 3 cases below...)

 - for the first 2 cases, we already have rhead read in-memory,
   so the logic is like:
     ....
     log_bread (somewhere in advance)
     ....

     if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
          ...
     } else {
          ...
     }
     (so I folded this two cases in xlog_logrec_hblks())

 - for xlog_do_recovery_pass, it behaves like
    if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
         log_bread (another extra bread to get h_size for
         allocated buffer and hblks).

         ...
    } else {
         ...
    }
    so in this case we don't have rhead until
xfs_sb_version_haslogv2(&log->l_mp->m_sb) is true...

Thanks in advance!

Thanks,
Gao Xiang


> 
> Brian


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] xfs: clean up calculation of LR header blocks
  2020-09-04 12:59     ` Gao Xiang
@ 2020-09-04 13:37       ` Brian Foster
  2020-09-04 15:07         ` Gao Xiang
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2020-09-04 13:37 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 08:59:29PM +0800, Gao Xiang wrote:
> Hi Brian,
> 
> On Fri, Sep 04, 2020 at 07:25:48AM -0400, Brian Foster wrote:
> > On Fri, Sep 04, 2020 at 04:25:16PM +0800, Gao Xiang wrote:
> 
> ...
> 
> > >  
> > > +static inline int xlog_logrecv2_hblks(struct xlog_rec_header *rh)
> > > +{
> > > +	int	h_size = be32_to_cpu(rh->h_size);
> > > +
> > > +	if ((be32_to_cpu(rh->h_version) & XLOG_VERSION_2) &&
> > > +	    h_size > XLOG_HEADER_CYCLE_SIZE)
> > > +		return DIV_ROUND_UP(h_size, XLOG_HEADER_CYCLE_SIZE);
> > > +	return 1;
> > > +}
> > > +
> > > +static inline int xlog_logrec_hblks(struct xlog *log, xlog_rec_header_t *rh)
> > > +{
> > > +	if (!xfs_sb_version_haslogv2(&log->l_mp->m_sb))
> > > +		return 1;
> > > +	return xlog_logrecv2_hblks(rh);
> > > +}
> > > +
> > 
> > h_version is assigned based on xfs_sb_version_haslogv2() in the first
> > place so I'm not sure I see the need for multiple helpers like this, at
> > least with the current code. I can't really speak to why some code
> > checks the feature bit and/or the record header version and not the
> > other way around, but perhaps there's some historical reason I'm not
> > aware of. Regardless, is there ever a case where
> > xfs_sb_version_haslogv2() == true and h_version != 2? That strikes me as
> > more of a corruption scenario than anything..
> 
> Thanks for this.
> 
> Honestly, I'm not quite sure if xfs_sb_version_haslogv2() == true and
> h_version != 2 is useful (or existed historially)... anyway, that is
> another seperate topic though...
> 

Indeed.

> Could you kindly give me some code flow on your preferred way about
> this so I could update this patch proper (since we have a complex
> case in xlog_do_recovery_pass(), I'm not sure how the unique helper
> will be like because there are 3 cases below...)
> 
>  - for the first 2 cases, we already have rhead read in-memory,
>    so the logic is like:
>      ....
>      log_bread (somewhere in advance)
>      ....
> 
>      if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
>           ...
>      } else {
>           ...
>      }
>      (so I folded this two cases in xlog_logrec_hblks())
> 
>  - for xlog_do_recovery_pass, it behaves like
>     if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
>          log_bread (another extra bread to get h_size for
>          allocated buffer and hblks).
> 
>          ...
>     } else {
>          ...
>     }
>     so in this case we don't have rhead until
> xfs_sb_version_haslogv2(&log->l_mp->m_sb) is true...
> 

I'm not sure I'm following the problem...

The current patch makes the following change in xlog_do_recovery_pass():

@@ -3024,15 +3018,10 @@ xlog_do_recovery_pass(
 		if (error)
 			goto bread_err1;
 
-		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
-		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
-			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
-			if (h_size % XLOG_HEADER_CYCLE_SIZE)
-				hblks++;
+		hblks = xlog_logrecv2_hblks(rhead);
+		if (hblks != 1) {
 			kmem_free(hbp);
 			hbp = xlog_alloc_buffer(log, hblks);
-		} else {
-			hblks = 1;
 		}
 	} else {
 		ASSERT(log->l_sectBBsize == 1);

My question is: why can't we replace the xlog_logrecv2_hblks() call here
with xlog_logrec_hblks()? We already have rhead as the existing code is
already looking at h_version. We're inside a _haslogv2() branch, so the
check inside the helper is effectively a duplicate/no-op.. Hm?

Brian

> Thanks in advance!
> 
> Thanks,
> Gao Xiang
> 
> 
> > 
> > Brian
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] xfs: clean up calculation of LR header blocks
  2020-09-04 13:37       ` Brian Foster
@ 2020-09-04 15:07         ` Gao Xiang
  2020-09-04 16:32           ` Brian Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Gao Xiang @ 2020-09-04 15:07 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 09:37:21AM -0400, Brian Foster wrote:
> On Fri, Sep 04, 2020 at 08:59:29PM +0800, Gao Xiang wrote:
...
> > Could you kindly give me some code flow on your preferred way about
> > this so I could update this patch proper (since we have a complex
> > case in xlog_do_recovery_pass(), I'm not sure how the unique helper
> > will be like because there are 3 cases below...)
> > 
> >  - for the first 2 cases, we already have rhead read in-memory,
> >    so the logic is like:
> >      ....
> >      log_bread (somewhere in advance)
> >      ....
> > 
> >      if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
> >           ...
> >      } else {
> >           ...
> >      }
> >      (so I folded this two cases in xlog_logrec_hblks())
> > 
> >  - for xlog_do_recovery_pass, it behaves like
> >     if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
> >          log_bread (another extra bread to get h_size for
> >          allocated buffer and hblks).
> > 
> >          ...
> >     } else {
> >          ...
> >     }
> >     so in this case we don't have rhead until
> > xfs_sb_version_haslogv2(&log->l_mp->m_sb) is true...
> > 
> 
> I'm not sure I'm following the problem...
> 
> The current patch makes the following change in xlog_do_recovery_pass():
> 
> @@ -3024,15 +3018,10 @@ xlog_do_recovery_pass(
>  		if (error)
>  			goto bread_err1;
>  
> -		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
> -		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
> -			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
> -			if (h_size % XLOG_HEADER_CYCLE_SIZE)
> -				hblks++;
> +		hblks = xlog_logrecv2_hblks(rhead);
> +		if (hblks != 1) {
>  			kmem_free(hbp);
>  			hbp = xlog_alloc_buffer(log, hblks);
> -		} else {
> -			hblks = 1;
>  		}
>  	} else {
>  		ASSERT(log->l_sectBBsize == 1);
> 
> My question is: why can't we replace the xlog_logrecv2_hblks() call here
> with xlog_logrec_hblks()? We already have rhead as the existing code is
> already looking at h_version. We're inside a _haslogv2() branch, so the
> check inside the helper is effectively a duplicate/no-op.. Hm?

Yeah, I get your point. That would introduce a duplicated check of
_haslogv2() if we use xlog_logrec_hblks() here (IMHO compliers wouldn't
treat the 2nd _haslogv2() check as no-op).

I will go forward like this if no other concerns. Thank you!

Thanks,
Gao Xiang

> 
> Brian
> 
> > Thanks in advance!
> > 
> > Thanks,
> > Gao Xiang
> > 
> > 
> > > 
> > > Brian
> > 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] xfs: clean up calculation of LR header blocks
  2020-09-04 15:07         ` Gao Xiang
@ 2020-09-04 16:32           ` Brian Foster
  0 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2020-09-04 16:32 UTC (permalink / raw)
  To: Gao Xiang; +Cc: linux-xfs, Darrick J . Wong

On Fri, Sep 04, 2020 at 11:07:30PM +0800, Gao Xiang wrote:
> On Fri, Sep 04, 2020 at 09:37:21AM -0400, Brian Foster wrote:
> > On Fri, Sep 04, 2020 at 08:59:29PM +0800, Gao Xiang wrote:
> ...
> > > Could you kindly give me some code flow on your preferred way about
> > > this so I could update this patch proper (since we have a complex
> > > case in xlog_do_recovery_pass(), I'm not sure how the unique helper
> > > will be like because there are 3 cases below...)
> > > 
> > >  - for the first 2 cases, we already have rhead read in-memory,
> > >    so the logic is like:
> > >      ....
> > >      log_bread (somewhere in advance)
> > >      ....
> > > 
> > >      if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
> > >           ...
> > >      } else {
> > >           ...
> > >      }
> > >      (so I folded this two cases in xlog_logrec_hblks())
> > > 
> > >  - for xlog_do_recovery_pass, it behaves like
> > >     if (xfs_sb_version_haslogv2(&log->l_mp->m_sb)) {
> > >          log_bread (another extra bread to get h_size for
> > >          allocated buffer and hblks).
> > > 
> > >          ...
> > >     } else {
> > >          ...
> > >     }
> > >     so in this case we don't have rhead until
> > > xfs_sb_version_haslogv2(&log->l_mp->m_sb) is true...
> > > 
> > 
> > I'm not sure I'm following the problem...
> > 
> > The current patch makes the following change in xlog_do_recovery_pass():
> > 
> > @@ -3024,15 +3018,10 @@ xlog_do_recovery_pass(
> >  		if (error)
> >  			goto bread_err1;
> >  
> > -		if ((be32_to_cpu(rhead->h_version) & XLOG_VERSION_2) &&
> > -		    (h_size > XLOG_HEADER_CYCLE_SIZE)) {
> > -			hblks = h_size / XLOG_HEADER_CYCLE_SIZE;
> > -			if (h_size % XLOG_HEADER_CYCLE_SIZE)
> > -				hblks++;
> > +		hblks = xlog_logrecv2_hblks(rhead);
> > +		if (hblks != 1) {
> >  			kmem_free(hbp);
> >  			hbp = xlog_alloc_buffer(log, hblks);
> > -		} else {
> > -			hblks = 1;
> >  		}
> >  	} else {
> >  		ASSERT(log->l_sectBBsize == 1);
> > 
> > My question is: why can't we replace the xlog_logrecv2_hblks() call here
> > with xlog_logrec_hblks()? We already have rhead as the existing code is
> > already looking at h_version. We're inside a _haslogv2() branch, so the
> > check inside the helper is effectively a duplicate/no-op.. Hm?
> 
> Yeah, I get your point. That would introduce a duplicated check of
> _haslogv2() if we use xlog_logrec_hblks() here (IMHO compliers wouldn't
> treat the 2nd _haslogv2() check as no-op).
> 

Yeah, I meant it as more as a logical no-op. IOW, it wouldn't affect
functionality. The check instructions might be duplicated, but I doubt
that would measurably impact log recovery.

> I will go forward like this if no other concerns. Thank you!
> 

Thanks.

Brian

> Thanks,
> Gao Xiang
> 
> > 
> > Brian
> > 
> > > Thanks in advance!
> > > 
> > > Thanks,
> > > Gao Xiang
> > > 
> > > 
> > > > 
> > > > Brian
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-09-04 16:32 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-09-04  8:25 [PATCH 0/2] xfs: random patches on log recovery Gao Xiang
2020-09-04  8:25 ` [PATCH v3 1/2] xfs: avoid LR buffer overrun due to crafted h_{len,size} Gao Xiang
2020-09-04 11:25   ` Brian Foster
2020-09-04 12:46     ` Gao Xiang
2020-09-04 13:37       ` Brian Foster
2020-09-04 15:01         ` Gao Xiang
2020-09-04 16:31           ` Brian Foster
2020-09-04  8:25 ` [PATCH v2 2/2] xfs: clean up calculation of LR header blocks Gao Xiang
2020-09-04 11:25   ` Brian Foster
2020-09-04 12:59     ` Gao Xiang
2020-09-04 13:37       ` Brian Foster
2020-09-04 15:07         ` Gao Xiang
2020-09-04 16:32           ` Brian Foster

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox