From: James Simmons <jsimmons@infradead.org>
To: Andreas Dilger <adilger@whamcloud.com>,
Oleg Drokin <green@whamcloud.com>, NeilBrown <neilb@suse.de>
Cc: Mikhail Pershin <mpershin@whamcloud.com>,
Lustre Development List <lustre-devel@lists.lustre.org>
Subject: [lustre-devel] [PATCH 09/22] lustre: llog: skip bad records in llog
Date: Sun, 20 Nov 2022 09:16:55 -0500
Message-ID: <1668953828-10909-10-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1668953828-10909-1-git-send-email-jsimmons@infradead.org>
From: Mikhail Pershin <mpershin@whamcloud.com>
This patch further develops the idea of skipping bad
(corrupted) llog data. If an llog has fixed-size records,
it is possible to skip a single bad record instead of the
rest of the llog block.
The patch also fixes skipping to the next chunk:
- make sure to skip to the next block for a partial chunk,
otherwise the same block is re-read
- handle index == 0 as the goal for llog_next_block() as an
expected exception and just return the requested block
- after a block was skipped, set the new index to the first
one in the block
- don't create a fake padding record in llog_osd_next_block(),
as the caller can handle it and would know about it
- restore test_8 functionality to check corruption handling
Fixes: b79e7c205e40 ("lustre: llog: add synchronization for the last record")
WC-bug-id: https://jira.whamcloud.com/browse/LU-16203
Lustre-commit: cf121b16685fe2a27 ("LU-16203 llog: skip bad records in llog")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48776
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
fs/lustre/obdclass/llog.c | 86 ++++++++++++++++++++++++++++-------------------
1 file changed, 52 insertions(+), 34 deletions(-)
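The skip policy described in the commit message can be sketched in
userspace C roughly as follows. This is only an illustration under
stated assumptions, not the kernel code: struct sk_rec_hdr, the
SK_OP_* constants, sk_verify(), sk_count_valid() and sk_put() are
hypothetical stand-ins for llog_rec_hdr, LLOG_OP_MAGIC/LLOG_OP_MASK
and the logic in llog_process_thread(). Padding records, byte
swapping and re-reads are left out. A non-zero fixed_size plays the
role of LLOG_F_IS_FIXSIZE together with llh_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, values chosen for illustration only. */
#define SK_OP_MAGIC 0x10600000u
#define SK_OP_MASK  0xfff00000u

struct sk_rec_hdr {		/* simplified record header */
	uint32_t len;		/* record length in bytes */
	uint32_t index;		/* record index in the log */
	uint32_t type;		/* magic combined with the opcode */
};

/* Same check order as the patched llog_verify_record(): magic first,
 * then length, then index. Returns 0 if the header looks sane.
 */
int sk_verify(const struct sk_rec_hdr *rec, uint32_t chunk_size,
	      uint32_t bitmap_size)
{
	if ((rec->type & SK_OP_MASK) != SK_OP_MAGIC)
		return -1;
	if (rec->len == 0 || rec->len > chunk_size)
		return -1;
	if (rec->index >= bitmap_size)
		return -1;
	return 0;
}

/* Walk all chunks and count the records that pass verification.
 * fixed_size != 0: a bad record is skipped by advancing fixed_size bytes.
 * fixed_size == 0: a bad record ends the current chunk and processing
 * continues with the next one.
 */
int sk_count_valid(const unsigned char *buf, size_t chunks,
		   uint32_t chunk_size, uint32_t fixed_size,
		   uint32_t bitmap_size)
{
	int valid = 0;

	assert(chunk_size >= sizeof(struct sk_rec_hdr));
	for (size_t c = 0; c < chunks; c++) {
		const unsigned char *chunk = buf + c * chunk_size;
		size_t off = 0;

		while (off + sizeof(struct sk_rec_hdr) <= chunk_size) {
			struct sk_rec_hdr rec;

			memcpy(&rec, chunk + off, sizeof(rec));
			if (sk_verify(&rec, chunk_size, bitmap_size) == 0) {
				valid++;
				off += rec.len;
			} else if (fixed_size) {
				off += fixed_size;	/* skip one record */
			} else {
				break;			/* skip rest of chunk */
			}
		}
	}
	return valid;
}

/* Test helper: store a record header at byte offset off. */
void sk_put(unsigned char *buf, size_t off, uint32_t len, uint32_t index,
	    uint32_t type)
{
	struct sk_rec_hdr rec = { len, index, type };

	memcpy(buf + off, &rec, sizeof(rec));
}
```

With one corrupted record in the middle of a chunk, the fixed-size
walk loses only that record, while the variable-size walk loses the
rest of the chunk.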
diff --git a/fs/lustre/obdclass/llog.c b/fs/lustre/obdclass/llog.c
index eb8f7e5..90bb8bd 100644
--- a/fs/lustre/obdclass/llog.c
+++ b/fs/lustre/obdclass/llog.c
@@ -233,27 +233,26 @@ int llog_init_handle(const struct lu_env *env, struct llog_handle *handle,
}
EXPORT_SYMBOL(llog_init_handle);
+#define LLOG_ERROR_REC(lgh, rec, format, a...) \
+ CERROR("%s: "DFID" rec type=%x idx=%u len=%u, " format "\n", \
+ loghandle2name(lgh), PLOGID(&lgh->lgh_id), (rec)->lrh_type, \
+ (rec)->lrh_index, (rec)->lrh_len, ##a)
+
int llog_verify_record(const struct llog_handle *llh, struct llog_rec_hdr *rec)
{
int chunk_size = llh->lgh_hdr->llh_hdr.lrh_len;
- if (rec->lrh_len == 0 || rec->lrh_len > chunk_size) {
- CERROR("%s: record is too large: %d > %d\n",
- loghandle2name(llh), rec->lrh_len, chunk_size);
- return -EINVAL;
- }
- if (rec->lrh_index >= LLOG_HDR_BITMAP_SIZE(llh->lgh_hdr)) {
- CERROR("%s: index is too high: %d\n",
- loghandle2name(llh), rec->lrh_index);
- return -EINVAL;
- }
- if ((rec->lrh_type & LLOG_OP_MASK) != LLOG_OP_MAGIC) {
- CERROR("%s: magic %x is bad\n",
- loghandle2name(llh), rec->lrh_type);
- return -EINVAL;
- }
+ if ((rec->lrh_type & LLOG_OP_MASK) != LLOG_OP_MAGIC)
+ LLOG_ERROR_REC(llh, rec, "magic is bad");
+ else if (rec->lrh_len == 0 || rec->lrh_len > chunk_size)
+ LLOG_ERROR_REC(llh, rec, "bad record len, chunk size is %d",
+ chunk_size);
+ else if (rec->lrh_index >= LLOG_HDR_BITMAP_SIZE(llh->lgh_hdr))
+ LLOG_ERROR_REC(llh, rec, "index is too high");
+ else
+ return 0;
- return 0;
+ return -EINVAL;
}
static inline bool llog_is_index_skipable(int idx, struct llog_log_hdr *llh,
@@ -278,7 +277,6 @@ static int llog_process_thread(void *arg)
int saved_index = 0;
int last_called_index = 0;
bool repeated = false;
- bool refresh_idx = false;
if (!llh)
return -EINVAL;
@@ -346,6 +344,11 @@ static int llog_process_thread(void *arg)
rc = 0;
goto out;
}
+ /* EOF while trying to skip to the next chunk */
+ if (!index && rc == -EBADR) {
+ rc = 0;
+ goto out;
+ }
if (rc)
goto out;
@@ -377,6 +380,15 @@ static int llog_process_thread(void *arg)
CDEBUG(D_OTHER, "after swabbing, type=%#x idx=%d\n",
rec->lrh_type, rec->lrh_index);
+ /* start with first rec if block was skipped */
+ if (!index) {
+ CDEBUG(D_OTHER,
+ "%s: skipping to the index %u\n",
+ loghandle2name(loghandle),
+ rec->lrh_index);
+ index = rec->lrh_index;
+ }
+
if (index == (synced_idx + 1) &&
synced_idx == LLOG_HDR_TAIL(llh)->lrt_index) {
rc = 0;
@@ -399,11 +411,15 @@ static int llog_process_thread(void *arg)
* it turns to
* lh_last_idx != LLOG_HDR_TAIL(llh)->lrt_index
* This exception is working for catalog only.
+ * The last check is for the partial chunk boundary,
+ * if it is reached then try to re-read for possible
+ * new records once.
*/
if ((index == lh_last_idx && synced_idx != index) ||
(index == (lh_last_idx + 1) &&
lh_last_idx != LLOG_HDR_TAIL(llh)->lrt_index) ||
- (rec->lrh_index == 0 && !repeated)) {
+ (((char *)rec - buf >= cur_offset - chunk_offset) &&
+ !repeated)) {
/* save offset inside buffer for the re-read */
buf_offset = (char *)rec - (char *)buf;
cur_offset = chunk_offset;
@@ -415,24 +431,27 @@ static int llog_process_thread(void *arg)
CDEBUG(D_OTHER, "synced_idx: %d\n", synced_idx);
goto repeat;
}
-
repeated = false;
rc = llog_verify_record(loghandle, rec);
if (rc) {
- CERROR("%s: invalid record in llog "DFID" record for index %d/%d: rc = %d\n",
- loghandle2name(loghandle),
- PLOGID(&loghandle->lgh_id),
- rec->lrh_len, index, rc);
+ CDEBUG(D_OTHER, "invalid record at index %d\n",
+ index);
/*
- * the block seem to be corrupted, let's try
- * with the next one. reset rc to go to the
- * next chunk.
+ * for fixed-sized llogs we can skip one record
+ * by using llh_size from llog header.
+ * Otherwise skip the next llog chunk.
*/
- refresh_idx = true;
- index = 0;
rc = 0;
- goto repeat;
+ if (llh->llh_flags & LLOG_F_IS_FIXSIZE) {
+ rec->lrh_len = llh->llh_size;
+ goto next_rec;
+ }
+ /* make sure that is always next block */
+ cur_offset = chunk_offset + chunk_size;
+ /* no goal to find, just next block to read */
+ index = 0;
+ break;
}
if (rec->lrh_index < index) {
@@ -446,10 +465,9 @@ static int llog_process_thread(void *arg)
* gap which can be result of old bugs, just
* keep going
*/
- CERROR("%s: "DFID" index %u, expected %u\n",
- loghandle2name(loghandle),
- PLOGID(&loghandle->lgh_id),
- rec->lrh_index, index);
+ LLOG_ERROR_REC(loghandle, rec,
+ "gap in index, expected %u",
+ index);
index = rec->lrh_index;
}
@@ -470,7 +488,7 @@ static int llog_process_thread(void *arg)
if (rc)
goto out;
}
-
+next_rec:
/* exit if the last index is reached */
if (index >= last_index) {
rc = 0;
--
1.8.3.1