From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 9 Jan 2025 10:26:25 +0800
Precedence: bulk
X-Mailing-List: gfs2@lists.linux.dev
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH v2 1/1] dlm_controld: support corosync3/knet multi-link
To: Alexander Aring
Cc: teigland@redhat.com, ccaulfie@redhat.com, jfriesse@redhat.com,
 nicholas.yang@suse.com, glass.su@suse.com, gfs2@lists.linux.dev,
 Roger Zhou
References: <20241224084241.13563-1-heming.zhao@suse.com>
 <20241224084241.13563-2-heming.zhao@suse.com>
Content-Language: en-US
From: Heming Zhao
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit

On 1/8/25 23:54, Alexander Aring wrote:
> Hi,
>
> On Mon, Jan 6, 2025 at 11:59 PM Heming Zhao wrote:
>>
>> On 1/7/25 02:11, Alexander Aring wrote:
>>> Hi Heming,
>>>
>>> On Tue, Dec 24, 2024 at 3:42 AM Heming Zhao wrote:
>>>>
>>>> The totem.rrp_mode config item was obsolete in corosync3. And
>>>> this patch gives dlm_controld the ability to detect multiple
>>>> links.
>>>>
>>>> The corosync and dlm network protocol relationship table:
>>>>
>>>> -------------+-----------------------+---------------------
>>>>              | totem.transport=udpu  | totem.transport=udp
>>>>              +-----------------------+---------------------
>>>> corosync 2.x |            |          | multicast
>>>>              | 1-ring     | 2-ring   |---------------------
>>>>              |            |          | default  | 2-ring
>>>> -------------+------------+----------+---------------------
>>>> dlm          | tcp        | sctp     | tcp      | sctp
>>>> -------------+------------+----------+---------------------
>>>>
>>>> -------------+----------------------------+----------------------
>>>>              | totem.transport = udpu/udp | totem.transport=knet
>>>> corosync 3.x |----------------------------+----------------------
>>>>              | 1-ring                     | 1-link  | multi-links
>>>> -------------+----------------------------+---------+-----------
>>>> dlm          | tcp                        | tcp     | sctp
>>>> -------------+----------------------------+---------+-----------
>>>>
>>>> At last, this patch should be work with updated kernel dlm module.
>>>
>>> I am not getting why the network protocol configuration has anything
>>> to do with the corosync configuration.
>>> I know that we currently get the address configurations from corosync
>>> but with this patch we are forced to use SCTP when corosync provides
>>> more than one "ring" configuration?
>>
>> Yes. this patch will force dlm to change to SCTP when corosync provides
>> more than one "ring".
>>
>> The reason:
>> (without this patch) When a user sets up multi-links on corosync3
>> and corosync.conf with an incorrect or missing rrp_mode,
>> dlm_tcp_listen_validate() will trigger 'dlm_local_count > 1' and report
>> an error.
>> Please note, rrp_mode is obsolete; the dlm_daemon will fail to read this
>> config item in the further. Therefore, the network protocol will
>> always be TCP.
>>
>>>
>>> Even with corosync3 it should be possible to use corosync in SCTP
>>> (multiple rings) and the kernel dlm using TCP only, would this not be
>>> possible with dlm_controld then?
>>
>> Only one case for above case: corosync3 on single-link.
>> A new patch is needed for dlm to work over TCP when corosync3 in SCTP
>> (multi-link mode). i.e. dlm_tcp_listen_validate() shouldn't return
>> -EINVAL when 'dlm_local_count > 1'.
>>
>
> I think we should change that condition then.
>
>> A key point for dlm is that there is no way to get the corosync version.
>> This patch is compatible with corosync2 env. In corosync2, the user must
>> correctly config rrp_mode when using 2-ring.
>>
>
> So far I looked into it, it is anyway for detecting a protocol
> according to some Corosync functionality it should still be possible
> to always force dlm_controld using a different protocol by setting the
> right config values/parameters.

Yes, I forgot the config item 'protocol=[detect|tcp|sctp]', which
bypasses the detection phase when it is set to "tcp" or "sctp". But in
general, dlm.conf is seldom used. Unfortunately, corosync doesn't
provide the API.
ref: https://github.com/corosync/corosync/issues/771

>
>> i.e.:
>> In corosync2, change to 2-ring from 1-ring (whatever multicast mode).
>> There must include rrp_mode item, if not, error report:
>> corosync[1284]: [MAIN ] parse error in config: 2 is too many configured interfaces for the rrp_mode setting none.
>> corosync[1284]: [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1415.
>>
>> Even more, since corosync3 isn't compatible with corosync2,
>> in my view, latest version of dlm_tools should only focus on corosync3
>> and drop corosync2 support. If any Linux distribution stay with
>> corosync2, they should choose an old version of dlm_tools.
>>
>
> Is dlm_tool not just a domain socket talking to dlm_controld? It does
> not use any library of the Corosync project?

detect_protocol() (in dlm_controld/action.c) uses the cmap API to read
the corosync settings.
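
To make the selection discussed above concrete, here is a rough sketch of how the 'protocol' setting and the detected link count could combine. This is not the dlm_controld code: the enum and function names are made up for this mail, link_count stands in for whatever the cmap lookup yields, and treating an unset or unknown value as "detect" is an assumption of the sketch.

```c
#include <string.h>

/* Hypothetical names for illustration only; not dlm_controld identifiers. */
enum proto_cfg { PROTO_CFG_DETECT, PROTO_CFG_TCP, PROTO_CFG_SCTP };
enum proto     { PROTO_TCP, PROTO_SCTP };

/*
 * Map the dlm.conf 'protocol' value to a setting.
 * Assumption: a missing or unrecognised value falls back to detection.
 */
enum proto_cfg parse_protocol_setting(const char *value)
{
	if (value && strcmp(value, "tcp") == 0)
		return PROTO_CFG_TCP;
	if (value && strcmp(value, "sctp") == 0)
		return PROTO_CFG_SCTP;
	return PROTO_CFG_DETECT;
}

/*
 * An explicit "tcp" or "sctp" bypasses detection. Otherwise follow the
 * corosync 3.x row of the table: one link -> TCP, several links -> SCTP.
 */
enum proto choose_protocol(enum proto_cfg cfg, int link_count)
{
	if (cfg == PROTO_CFG_TCP)
		return PROTO_TCP;	/* kernel side may still reject >1 local address */
	if (cfg == PROTO_CFG_SCTP)
		return PROTO_SCTP;
	return link_count > 1 ? PROTO_SCTP : PROTO_TCP;
}
```

With this shape, choose_protocol(PROTO_CFG_TCP, 2) returns PROTO_TCP, which is the multi-link-over-TCP case that only works once dlm_tcp_listen_validate() stops returning -EINVAL for 'dlm_local_count > 1'.
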
dlm_controld registers a callback in setup_cluster(), then corosync uses
quorum_callback/quorum_nodelist_callback to notify the dlm_controld daemon
about quorum status.

- Heming