From: James Simmons <jsimmons@infradead.org>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
devel@driverdev.osuosl.org,
Andreas Dilger <andreas.dilger@intel.com>,
Oleg Drokin <oleg.drokin@intel.com>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Lustre Development List <lustre-devel@lists.lustre.org>,
Andreas Dilger <andreas.dilger@intel.com>,
Bobi Jam <bobijam.xu@intel.com>,
James Simmons <jsimmons@infradead.org>
Subject: [PATCH 077/124] staging: lustre: ptlrpc: quiet errors on initial connection
Date: Sun, 18 Sep 2016 16:38:16 -0400
Message-ID: <1474231143-4061-78-git-send-email-jsimmons@infradead.org>
In-Reply-To: <1474231143-4061-1-git-send-email-jsimmons@infradead.org>
From: Andreas Dilger <andreas.dilger@intel.com>
A client or MDS may try to connect to a target (OST or peer MDT)
before that target has finished setup. Rather than spamming the
console logs during initial connection, only print a console error
message if there are repeated failures trying to connect to the
target, which may indicate an error on that node.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3456
Reviewed-on: http://review.whamcloud.com/10057
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
drivers/staging/lustre/lustre/ptlrpc/client.c | 52 +++++++++++---------
.../staging/lustre/lustre/ptlrpc/ptlrpc_internal.h | 2 +-
2 files changed, 30 insertions(+), 24 deletions(-)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index a29ccaa..f3914cc 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1075,36 +1075,42 @@ static int ptlrpc_import_delay_req(struct obd_import *imp,
}
/**
- * Decide if the error message regarding provided request \a req
- * should be printed to the console or not.
- * Makes it's decision on request status and other properties.
- * Returns 1 to print error on the system console or 0 if not.
+ * Decide if the error message should be printed to the console or not.
+ * Makes its decision based on request type, status, and failure frequency.
+ *
+ * \param[in] req request that failed and may need a console message
+ *
+ * \retval false if no message should be printed
+ * \retval true if console message should be printed
*/
-static int ptlrpc_console_allow(struct ptlrpc_request *req)
+static bool ptlrpc_console_allow(struct ptlrpc_request *req)
{
__u32 opc;
- int err;
LASSERT(req->rq_reqmsg);
opc = lustre_msg_get_opc(req->rq_reqmsg);
- /*
- * Suppress particular reconnect errors which are to be expected. No
- * errors are suppressed for the initial connection on an import
- */
- if ((lustre_handle_is_used(&req->rq_import->imp_remote_handle)) &&
- (opc == OST_CONNECT || opc == MDS_CONNECT || opc == MGS_CONNECT)) {
+ /* Suppress particular reconnect errors which are to be expected. */
+ if (opc == OST_CONNECT || opc == MDS_CONNECT || opc == MGS_CONNECT) {
+ int err;
+
/* Suppress timed out reconnect requests */
- if (req->rq_timedout)
- return 0;
+ if (lustre_handle_is_used(&req->rq_import->imp_remote_handle) ||
+ req->rq_timedout)
+ return false;
- /* Suppress unavailable/again reconnect requests */
+ /*
+ * Suppress most unavailable/again reconnect requests, but
+ * print occasionally so it is clear client is trying to
+ * connect to a server where no target is running.
+ */
err = lustre_msg_get_status(req->rq_repmsg);
- if (err == -ENODEV || err == -EAGAIN)
- return 0;
+ if ((err == -ENODEV || err == -EAGAIN) &&
+ req->rq_import->imp_conn_cnt % 30 != 20)
+ return false;
}
- return 1;
+ return true;
}
/**
@@ -1118,14 +1124,14 @@ static int ptlrpc_check_status(struct ptlrpc_request *req)
err = lustre_msg_get_status(req->rq_repmsg);
if (lustre_msg_get_type(req->rq_repmsg) == PTL_RPC_MSG_ERR) {
struct obd_import *imp = req->rq_import;
+ lnet_nid_t nid = imp->imp_connection->c_peer.nid;
__u32 opc = lustre_msg_get_opc(req->rq_reqmsg);
if (ptlrpc_console_allow(req))
- LCONSOLE_ERROR_MSG(0x011, "%s: Communicating with %s, operation %s failed with %d.\n",
+ LCONSOLE_ERROR_MSG(0x011, "%s: operation %s to node %s failed: rc = %d\n",
imp->imp_obd->obd_name,
- libcfs_nid2str(
- imp->imp_connection->c_peer.nid),
- ll_opcode2str(opc), err);
+ ll_opcode2str(opc),
+ libcfs_nid2str(nid), err);
return err < 0 ? err : -EINVAL;
}
@@ -1282,7 +1288,7 @@ static int after_reply(struct ptlrpc_request *req)
* some reason. Try to reconnect, and if that fails, punt to
* the upcall.
*/
- if (ll_rpc_recoverable_error(rc)) {
+ if (ptlrpc_recoverable_error(rc)) {
if (req->rq_send_state != LUSTRE_IMP_FULL ||
imp->imp_obd->obd_no_recov || imp->imp_dlm_fake) {
return rc;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
index 29cfac2..b420aa8 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpc_internal.h
@@ -270,7 +270,7 @@ void sptlrpc_conf_fini(void);
int sptlrpc_init(void);
void sptlrpc_fini(void);
-static inline int ll_rpc_recoverable_error(int rc)
+static inline bool ptlrpc_recoverable_error(int rc)
{
return (rc == -ENOTCONN || rc == -ENODEV);
}
--
1.7.1
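
For readers who want to check the new suppression rule without building
Lustre, the decision made by ptlrpc_console_allow() after this patch can
be modeled in isolation. This is only a sketch: struct connect_attempt
and console_allow_sketch() are hypothetical names that do not exist in
the tree, and the field types are assumptions; each field stands in for
the value the patched function reads from the request and its import.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the request/import state; not a Lustre type. */
struct connect_attempt {
	bool is_connect_rpc;	/* opc is OST_CONNECT, MDS_CONNECT or MGS_CONNECT */
	bool handle_in_use;	/* lustre_handle_is_used(&imp->imp_remote_handle) */
	bool timed_out;		/* req->rq_timedout */
	int status;		/* lustre_msg_get_status(req->rq_repmsg) */
	unsigned int conn_cnt;	/* imp->imp_conn_cnt (type assumed here) */
};

/* Returns true if a console error should be printed, false to stay quiet. */
static bool console_allow_sketch(const struct connect_attempt *a)
{
	assert(a != NULL);

	/* Only connect RPCs are ever suppressed. */
	if (!a->is_connect_rpc)
		return true;

	/* Quiet for reconnects (handle already in use) and for timeouts. */
	if (a->handle_in_use || a->timed_out)
		return false;

	/* Quiet for -ENODEV/-EAGAIN except when conn_cnt % 30 == 20. */
	if ((a->status == -ENODEV || a->status == -EAGAIN) &&
	    a->conn_cnt % 30 != 20)
		return false;

	return true;
}
```

Under this model, an initial connection that keeps failing with -ENODEV
or -EAGAIN is reported only when the connection count is 20, 50, 80 and
so on, which matches the "% 30 != 20" test in the hunk above.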