From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: Eric Sandeen <sandeen@redhat.com>,
linux-xfs <linux-xfs@vger.kernel.org>,
Leonardo Vaz <lvaz@redhat.com>
Subject: Re: [PATCH] xfs_repair: fix progress reporting
Date: Mon, 18 May 2020 18:13:35 -0700 [thread overview]
Message-ID: <20200519011335.GD17635@magnolia> (raw)
In-Reply-To: <8cf1d221-e5af-0db0-bad5-1d86d859d624@sandeen.net>
On Mon, May 18, 2020 at 08:03:09PM -0500, Eric Sandeen wrote:
> On 5/18/20 7:58 PM, Darrick J. Wong wrote:
> > On Mon, May 18, 2020 at 05:35:33PM -0500, Eric Sandeen wrote:
> >> Long ago, a young developer tried to fix a segfault in xfs_repair where
> >> a short progress reporting interval could cause a timer to go off and try
> >> to print a progress mesage before any had been properly set up because
> >> we were still busy zeroing the log, and a NULL pointer dereference
> >> ensued.
> >>
> >> That young developer got it wrong, and completely broke progress
> >> reporting, because the change caused us to exit early from the pthread
> >> start routine, and not initialize the progress timer at all.
> >>
> >> That developer is now slightly older and wiser, and finally realizes that
> >> the simple and correct solution here is to initialize the message format
> >> to the first one in the list, so that we will be ready to go with a
> >> progress message no matter when the first timer goes off.
> >>
> >> Reported-by: Leonardo Vaz <lvaz@redhat.com>
> >> Fixes: 7f2d6b811755 ("xfs_repair: avoid segfault if reporting progre...")
> >> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >> ---
> >>
> >> It might be nice to add progress reporting for the log zeroing, but that
> >> requires renumbering all these macros, and we don't/can't actually get
> >> any fine-grained progress at all, so probably not worth it.
> >>
> >> diff --git a/repair/progress.c b/repair/progress.c
> >> index 5ee08229..d7baa606 100644
> >> --- a/repair/progress.c
> >> +++ b/repair/progress.c
> >> @@ -125,7 +125,11 @@ init_progress_rpt (void)
> >> */
> >>
> >> pthread_mutex_init(&global_msgs.mutex, NULL);
> >> - global_msgs.format = NULL;
> >> + /*
> >> + * Ensure the format string is not NULL in case the first timer
> >> + * goes off before any stage calls set_progress_msg() to set it.
> >> + */
> >> + global_msgs.format = &progress_rpt_reports[0];
> >
> > Hmm so does that mean the first progress report could be for "scanning
> > freespace"?
>
> Yes. But unless "zeroing the log" takes more than the report interval,
> which by default is 15 minutes, it won't be wrong.
>
> > Or could you append a new entry to progress_rpt_reports for "getting my
> > shit together and moving out of my parents basement" and initialize it
> > to that?
>
> Oh I guess it could be appended and wouldn't have to be out of order but
> honestly I don't think it's worth it, even a big slow log shouldn't take
> long enough to need a progress report.
Zeroing a 2048MB log in 900s = 2.28MB/s
So I guess that's unlikely... but it still feels like leaving some kind
of weird logic bomb lurking where if we decrease the interval or someone
throws us a slow cloudy block store, we'll start issuing weird progress
messsages about some other part of xfs_repair which hasn't even started
yet. <shrug>
--D
> -Eric
>
> > --D
> >
> >> global_msgs.count = glob_agcount;
> >> global_msgs.interval = report_interval;
> >> global_msgs.done = prog_rpt_done;
> >> @@ -171,10 +175,6 @@ progress_rpt_thread (void *p)
> >> msg_block_t *msgp = (msg_block_t *)p;
> >> uint64_t percent;
> >>
> >> - /* It's possible to get here very early w/ no progress msg set */
> >> - if (!msgp->format)
> >> - return NULL;
> >> -
> >> if ((msgbuf = (char *)malloc(DURATION_BUF_SIZE)) == NULL)
> >> do_error (_("progress_rpt: cannot malloc progress msg buffer\n"));
> >>
> >>
> >
next prev parent reply other threads:[~2020-05-19 1:13 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-18 22:35 [PATCH] xfs_repair: fix progress reporting Eric Sandeen
2020-05-19 0:58 ` Darrick J. Wong
2020-05-19 1:03 ` Eric Sandeen
2020-05-19 1:13 ` Darrick J. Wong [this message]
2020-05-19 1:15 ` Eric Sandeen
2020-05-19 1:29 ` [PATCH V2] " Eric Sandeen
2020-05-19 2:04 ` Darrick J. Wong
2020-05-19 7:03 ` Donald Douwsma
2020-05-19 12:53 ` Eric Sandeen
2020-05-19 23:38 ` Donald Douwsma
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200519011335.GD17635@magnolia \
--to=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
--cc=lvaz@redhat.com \
--cc=sandeen@redhat.com \
--cc=sandeen@sandeen.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox