linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Monitor: improve check_one_sharer() for checking duplicated process
@ 2020-04-10 16:24 Coly Li
  2020-04-10 22:55 ` John Stoffel
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Coly Li @ 2020-04-10 16:24 UTC (permalink / raw)
  To: jes, linux-raid; +Cc: Coly Li, Shinkichi Yamazaki

When running mdadm monitor with scan mode, only one autorebuild process
is allowed. check_one_sharer() checks duplicated process by following
steps,
1) Read autorebuild.pid file,
   - if file does not exist, no duplicated process, go to 3).
   - if file exists, continue to next step.
2) Read pid number from autorebuild.pid file, then check procfs pid
   directory /proc/<PID>,
   - if the directory does not exist, no duplicated process, go to 3)
   - if the directory exists, print error message for duplicated process
     and exit this mdadm.
3) Write current pid into autorebuild.pid file, continue to monitor in
   scan mode.

The problem for the above step 2) is, if after system reboots and
another different process happens to have exact same pid number which
autorebuild.pid file records, check_one_sharer() will treat it as a
duplicated mdadm process and returns error with message "Only one
autorebuild process allowed in scan mode, aborting".

This patch tries to fix the above same-pid-but-different-process issue
by one more step to check the process command name,
1) Read autorebuild.pid file
   - if file does not exist, no duplicated process, go to 4).
   - if file exists, continue to next step.
2) Read pid number from autorebuild.pid file, then check procfs file
   comm with the specific pid directory /proc/<PID>/comm
   - if the file does not exit, it means the directory /proc/<PID> does
     not exist, go to 4)
   - if the file exits, continue next step
3) Read process command name from /proc/<PIC>/comm, compare the command
   name with "mdadm" process name,
   - if not equal, no duplicated process, goto 4)
   - if strings are equal, print error message for duplicated process
     and exit this mdadm.
4) Write current pid into autorebuild.pid file, continue to monitor in
   scan mode.

Now check_one_sharer() returns error for duplicated process only when
the recorded pid from autorebuild.pid exists, and the process has exact
same command name as "mdadm".

Reported-by: Shinkichi Yamazaki <shinkichi.yamazaki@suse.com>
Signed-off-by: Coly Li <colyli@suse.de>
---
 Monitor.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/Monitor.c b/Monitor.c
index b527165..2d6b3b9 100644
--- a/Monitor.c
+++ b/Monitor.c
@@ -301,26 +301,34 @@ static int make_daemon(char *pidfile)
 
 static int check_one_sharer(int scan)
 {
-	int pid, rv;
+	int pid;
+	FILE *comm_fp;
 	FILE *fp;
-	char dir[20];
+	char comm_path[100];
 	char path[100];
-	struct stat buf;
+	char comm[20];
+
 	sprintf(path, "%s/autorebuild.pid", MDMON_DIR);
 	fp = fopen(path, "r");
 	if (fp) {
 		if (fscanf(fp, "%d", &pid) != 1)
 			pid = -1;
-		sprintf(dir, "/proc/%d", pid);
-		rv = stat(dir, &buf);
-		if (rv != -1) {
-			if (scan) {
-				pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
-				fclose(fp);
-				return 1;
-			} else {
-				pr_err("Warning: One autorebuild process already running.\n");
+		snprintf(comm_path, sizeof(comm_path),
+			 "/proc/%d/comm", pid);
+		comm_fp = fopen(comm_path, "r");
+		if (comm_fp) {
+			if (fscanf(comm_fp, "%s", comm) &&
+			    strncmp(basename(comm), Name, strlen(Name)) == 0) {
+				if (scan) {
+					pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
+					fclose(comm_fp);
+					fclose(fp);
+					return 1;
+				} else {
+					pr_err("Warning: One autorebuild process already running.\n");
+				}
 			}
+			fclose(comm_fp);
 		}
 		fclose(fp);
 	}
-- 
2.25.0

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] Monitor: improve check_one_sharer() for checking duplicated process
  2020-04-10 16:24 [PATCH] Monitor: improve check_one_sharer() for checking duplicated process Coly Li
@ 2020-04-10 22:55 ` John Stoffel
  2020-04-13  1:47 ` Zhong Lidong
  2020-04-27 14:27 ` Jes Sorensen
  2 siblings, 0 replies; 5+ messages in thread
From: John Stoffel @ 2020-04-10 22:55 UTC (permalink / raw)
  To: Coly Li; +Cc: jes, linux-raid, Shinkichi Yamazaki

>>>>> "Coly" == Coly Li <colyli@suse.de> writes:

Coly> When running mdadm monitor with scan mode, only one autorebuild process
Coly> is allowed. check_one_sharer() checks duplicated process by following
Coly> steps,
Coly> 1) Read autorebuild.pid file,
Coly>    - if file does not exist, no duplicated process, go to 3).
Coly>    - if file exists, continue to next step.
Coly> 2) Read pid number from autorebuild.pid file, then check procfs pid
Coly>    directory /proc/<PID>,
Coly>    - if the directory does not exist, no duplicated process, go to 3)
Coly>    - if the directory exists, print error message for duplicated process
Coly>      and exit this mdadm.
Coly> 3) Write current pid into autorebuild.pid file, continue to monitor in
Coly>    scan mode.

Shouldn't the pid file be in /var/run or something like that, which is
going to be wiped on reboot no matter what?  I'm not really against
the code, since it is good belt and suspenders programming, but it
just strikes me as a really really hard to hit corner case that should
be handled by the OS already.


Coly> The problem for the above step 2) is, if after system reboots and
Coly> another different process happens to have exact same pid number which
Coly> autorebuild.pid file records, check_one_sharer() will treat it as a
Coly> duplicated mdadm process and returns error with message "Only one
Coly> autorebuild process allowed in scan mode, aborting".

Coly> This patch tries to fix the above same-pid-but-different-process issue
Coly> by one more step to check the process command name,
Coly> 1) Read autorebuild.pid file
Coly>    - if file does not exist, no duplicated process, go to 4).
Coly>    - if file exists, continue to next step.
Coly> 2) Read pid number from autorebuild.pid file, then check procfs file
Coly>    comm with the specific pid directory /proc/<PID>/comm
Coly>    - if the file does not exit, it means the directory /proc/<PID> does
Coly>      not exist, go to 4)
Coly>    - if the file exits, continue next step
Coly> 3) Read process command name from /proc/<PIC>/comm, compare the command
Coly>    name with "mdadm" process name,
Coly>    - if not equal, no duplicated process, goto 4)
Coly>    - if strings are equal, print error message for duplicated process
Coly>      and exit this mdadm.
Coly> 4) Write current pid into autorebuild.pid file, continue to monitor in
Coly>    scan mode.

Coly> Now check_one_sharer() returns error for duplicated process only when
Coly> the recorded pid from autorebuild.pid exists, and the process has exact
Coly> same command name as "mdadm".

Coly> Reported-by: Shinkichi Yamazaki <shinkichi.yamazaki@suse.com>
Coly> Signed-off-by: Coly Li <colyli@suse.de>
Coly> ---
Coly>  Monitor.c | 32 ++++++++++++++++++++------------
Coly>  1 file changed, 20 insertions(+), 12 deletions(-)

Coly> diff --git a/Monitor.c b/Monitor.c
Coly> index b527165..2d6b3b9 100644
Coly> --- a/Monitor.c
Coly> +++ b/Monitor.c
Coly> @@ -301,26 +301,34 @@ static int make_daemon(char *pidfile)
 
Coly>  static int check_one_sharer(int scan)
Coly>  {
Coly> -	int pid, rv;
Coly> +	int pid;
Coly> +	FILE *comm_fp;
Coly>  	FILE *fp;
Coly> -	char dir[20];
Coly> +	char comm_path[100];
Coly>  	char path[100];
Coly> -	struct stat buf;
Coly> +	char comm[20];
Coly> +
Coly>  	sprintf(path, "%s/autorebuild.pid", MDMON_DIR);
Coly>  	fp = fopen(path, "r");
Coly>  	if (fp) {
Coly>  		if (fscanf(fp, "%d", &pid) != 1)
Coly>  			pid = -1;
Coly> -		sprintf(dir, "/proc/%d", pid);
Coly> -		rv = stat(dir, &buf);
Coly> -		if (rv != -1) {
Coly> -			if (scan) {
Coly> -				pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
Coly> -				fclose(fp);
Coly> -				return 1;
Coly> -			} else {
Coly> -				pr_err("Warning: One autorebuild process already running.\n");
Coly> +		snprintf(comm_path, sizeof(comm_path),
Coly> +			 "/proc/%d/comm", pid);
Coly> +		comm_fp = fopen(comm_path, "r");
Coly> +		if (comm_fp) {
Coly> +			if (fscanf(comm_fp, "%s", comm) &&
Coly> +			    strncmp(basename(comm), Name, strlen(Name)) == 0) {
Coly> +				if (scan) {
Coly> +					pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
Coly> +					fclose(comm_fp);
Coly> +					fclose(fp);
Coly> +					return 1;
Coly> +				} else {
Coly> +					pr_err("Warning: One autorebuild process already running.\n");
Coly> +				}
Coly>  			}
Coly> +			fclose(comm_fp);
Coly>  		}
Coly>  		fclose(fp);
Coly>  	}
Coly> -- 
Coly> 2.25.0

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Monitor: improve check_one_sharer() for checking duplicated process
  2020-04-10 16:24 [PATCH] Monitor: improve check_one_sharer() for checking duplicated process Coly Li
  2020-04-10 22:55 ` John Stoffel
@ 2020-04-13  1:47 ` Zhong Lidong
  2020-04-13  2:34   ` Coly Li
  2020-04-27 14:27 ` Jes Sorensen
  2 siblings, 1 reply; 5+ messages in thread
From: Zhong Lidong @ 2020-04-13  1:47 UTC (permalink / raw)
  To: Coly Li, jes, linux-raid; +Cc: Shinkichi Yamazaki

On 4/11/20 12:24 AM, Coly Li wrote:
> When running mdadm monitor with scan mode, only one autorebuild process
> is allowed. check_one_sharer() checks duplicated process by following
> steps,
> 1) Read autorebuild.pid file,
>    - if file does not exist, no duplicated process, go to 3).
>    - if file exists, continue to next step.
> 2) Read pid number from autorebuild.pid file, then check procfs pid
>    directory /proc/<PID>,
>    - if the directory does not exist, no duplicated process, go to 3)
>    - if the directory exists, print error message for duplicated process
>      and exit this mdadm.
> 3) Write current pid into autorebuild.pid file, continue to monitor in
>    scan mode.
> 
> The problem for the above step 2) is, if after system reboots and
> another different process happens to have exact same pid number which
> autorebuild.pid file records, check_one_sharer() will treat it as a
> duplicated mdadm process and returns error with message "Only one
> autorebuild process allowed in scan mode, aborting".
> 
> This patch tries to fix the above same-pid-but-different-process issue
> by one more step to check the process command name,
> 1) Read autorebuild.pid file
>    - if file does not exist, no duplicated process, go to 4).
>    - if file exists, continue to next step.
> 2) Read pid number from autorebuild.pid file, then check procfs file
>    comm with the specific pid directory /proc/<PID>/comm
>    - if the file does not exit, it means the directory /proc/<PID> does
>      not exist, go to 4)
>    - if the file exits, continue next step
> 3) Read process command name from /proc/<PIC>/comm, compare the command
>    name with "mdadm" process name,
>    - if not equal, no duplicated process, goto 4)
>    - if strings are equal, print error message for duplicated process
>      and exit this mdadm.
> 4) Write current pid into autorebuild.pid file, continue to monitor in
>    scan mode.
> 
> Now check_one_sharer() returns error for duplicated process only when
> the recorded pid from autorebuild.pid exists, and the process has exact
> same command name as "mdadm".
> 

Consider another corner case: what if the recorded pid from
autorebuild.pid is actually used by other mdadm command, such as "mdadm
--wait"? It shouldn't report error now.

Thanks,
Lidong

> Reported-by: Shinkichi Yamazaki <shinkichi.yamazaki@suse.com>
> Signed-off-by: Coly Li <colyli@suse.de>
> ---
>  Monitor.c | 32 ++++++++++++++++++++------------
>  1 file changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/Monitor.c b/Monitor.c
> index b527165..2d6b3b9 100644
> --- a/Monitor.c
> +++ b/Monitor.c
> @@ -301,26 +301,34 @@ static int make_daemon(char *pidfile)
>  
>  static int check_one_sharer(int scan)
>  {
> -	int pid, rv;
> +	int pid;
> +	FILE *comm_fp;
>  	FILE *fp;
> -	char dir[20];
> +	char comm_path[100];
>  	char path[100];
> -	struct stat buf;
> +	char comm[20];
> +
>  	sprintf(path, "%s/autorebuild.pid", MDMON_DIR);
>  	fp = fopen(path, "r");
>  	if (fp) {
>  		if (fscanf(fp, "%d", &pid) != 1)
>  			pid = -1;
> -		sprintf(dir, "/proc/%d", pid);
> -		rv = stat(dir, &buf);
> -		if (rv != -1) {
> -			if (scan) {
> -				pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
> -				fclose(fp);
> -				return 1;
> -			} else {
> -				pr_err("Warning: One autorebuild process already running.\n");
> +		snprintf(comm_path, sizeof(comm_path),
> +			 "/proc/%d/comm", pid);
> +		comm_fp = fopen(comm_path, "r");
> +		if (comm_fp) {
> +			if (fscanf(comm_fp, "%s", comm) &&
> +			    strncmp(basename(comm), Name, strlen(Name)) == 0) {
> +				if (scan) {
> +					pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
> +					fclose(comm_fp);
> +					fclose(fp);
> +					return 1;
> +				} else {
> +					pr_err("Warning: One autorebuild process already running.\n");
> +				}
>  			}
> +			fclose(comm_fp);
>  		}
>  		fclose(fp);
>  	}
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Monitor: improve check_one_sharer() for checking duplicated process
  2020-04-13  1:47 ` Zhong Lidong
@ 2020-04-13  2:34   ` Coly Li
  0 siblings, 0 replies; 5+ messages in thread
From: Coly Li @ 2020-04-13  2:34 UTC (permalink / raw)
  To: Zhong Lidong; +Cc: jes, linux-raid, Shinkichi Yamazaki

On 2020/4/13 09:47, Zhong Lidong wrote:
> On 4/11/20 12:24 AM, Coly Li wrote:
>> When running mdadm monitor with scan mode, only one autorebuild process
>> is allowed. check_one_sharer() checks duplicated process by following
>> steps,
>> 1) Read autorebuild.pid file,
>>    - if file does not exist, no duplicated process, go to 3).
>>    - if file exists, continue to next step.
>> 2) Read pid number from autorebuild.pid file, then check procfs pid
>>    directory /proc/<PID>,
>>    - if the directory does not exist, no duplicated process, go to 3)
>>    - if the directory exists, print error message for duplicated process
>>      and exit this mdadm.
>> 3) Write current pid into autorebuild.pid file, continue to monitor in
>>    scan mode.
>>
>> The problem for the above step 2) is, if after system reboots and
>> another different process happens to have exact same pid number which
>> autorebuild.pid file records, check_one_sharer() will treat it as a
>> duplicated mdadm process and returns error with message "Only one
>> autorebuild process allowed in scan mode, aborting".
>>
>> This patch tries to fix the above same-pid-but-different-process issue
>> by one more step to check the process command name,
>> 1) Read autorebuild.pid file
>>    - if file does not exist, no duplicated process, go to 4).
>>    - if file exists, continue to next step.
>> 2) Read pid number from autorebuild.pid file, then check procfs file
>>    comm with the specific pid directory /proc/<PID>/comm
>>    - if the file does not exit, it means the directory /proc/<PID> does
>>      not exist, go to 4)
>>    - if the file exits, continue next step
>> 3) Read process command name from /proc/<PIC>/comm, compare the command
>>    name with "mdadm" process name,
>>    - if not equal, no duplicated process, goto 4)
>>    - if strings are equal, print error message for duplicated process
>>      and exit this mdadm.
>> 4) Write current pid into autorebuild.pid file, continue to monitor in
>>    scan mode.
>>
>> Now check_one_sharer() returns error for duplicated process only when
>> the recorded pid from autorebuild.pid exists, and the process has exact
>> same command name as "mdadm".
>>
> 
> Consider another corner case: what if the recorded pid from
> autorebuild.pid is actually used by other mdadm command, such as "mdadm
> --wait"? It shouldn't report error now.
> 

Hi Lidong,

This is a try-best effort, if there happens to be another mdadm process
has exact same pid number, this mdadm in scan mode has to fail.

There is no perfect method to prevent duplicated process, e.g. rename
the mdadm program to another name, this patch is not able to detect it
as duplicated process neither.

This is an improvement for current check_one_sharer(), not a silver bullet.

Thank.

Coly Li




> 
>> Reported-by: Shinkichi Yamazaki <shinkichi.yamazaki@suse.com>
>> Signed-off-by: Coly Li <colyli@suse.de>
>> ---
>>  Monitor.c | 32 ++++++++++++++++++++------------
>>  1 file changed, 20 insertions(+), 12 deletions(-)
>>
>> diff --git a/Monitor.c b/Monitor.c
>> index b527165..2d6b3b9 100644
>> --- a/Monitor.c
>> +++ b/Monitor.c
>> @@ -301,26 +301,34 @@ static int make_daemon(char *pidfile)
>>  
>>  static int check_one_sharer(int scan)
>>  {
>> -	int pid, rv;
>> +	int pid;
>> +	FILE *comm_fp;
>>  	FILE *fp;
>> -	char dir[20];
>> +	char comm_path[100];
>>  	char path[100];
>> -	struct stat buf;
>> +	char comm[20];
>> +
>>  	sprintf(path, "%s/autorebuild.pid", MDMON_DIR);
>>  	fp = fopen(path, "r");
>>  	if (fp) {
>>  		if (fscanf(fp, "%d", &pid) != 1)
>>  			pid = -1;
>> -		sprintf(dir, "/proc/%d", pid);
>> -		rv = stat(dir, &buf);
>> -		if (rv != -1) {
>> -			if (scan) {
>> -				pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
>> -				fclose(fp);
>> -				return 1;
>> -			} else {
>> -				pr_err("Warning: One autorebuild process already running.\n");
>> +		snprintf(comm_path, sizeof(comm_path),
>> +			 "/proc/%d/comm", pid);
>> +		comm_fp = fopen(comm_path, "r");
>> +		if (comm_fp) {
>> +			if (fscanf(comm_fp, "%s", comm) &&
>> +			    strncmp(basename(comm), Name, strlen(Name)) == 0) {
>> +				if (scan) {
>> +					pr_err("Only one autorebuild process allowed in scan mode, aborting\n");
>> +					fclose(comm_fp);
>> +					fclose(fp);
>> +					return 1;
>> +				} else {
>> +					pr_err("Warning: One autorebuild process already running.\n");
>> +				}
>>  			}
>> +			fclose(comm_fp);
>>  		}
>>  		fclose(fp);
>>  	}
>>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] Monitor: improve check_one_sharer() for checking duplicated process
  2020-04-10 16:24 [PATCH] Monitor: improve check_one_sharer() for checking duplicated process Coly Li
  2020-04-10 22:55 ` John Stoffel
  2020-04-13  1:47 ` Zhong Lidong
@ 2020-04-27 14:27 ` Jes Sorensen
  2 siblings, 0 replies; 5+ messages in thread
From: Jes Sorensen @ 2020-04-27 14:27 UTC (permalink / raw)
  To: Coly Li, linux-raid; +Cc: Shinkichi Yamazaki

On 4/10/20 12:24 PM, Coly Li wrote:
> When running mdadm monitor with scan mode, only one autorebuild process
> is allowed. check_one_sharer() checks duplicated process by following
> steps,
> 1) Read autorebuild.pid file,
>    - if file does not exist, no duplicated process, go to 3).
>    - if file exists, continue to next step.
> 2) Read pid number from autorebuild.pid file, then check procfs pid
>    directory /proc/<PID>,
>    - if the directory does not exist, no duplicated process, go to 3)
>    - if the directory exists, print error message for duplicated process
>      and exit this mdadm.
> 3) Write current pid into autorebuild.pid file, continue to monitor in
>    scan mode.
> 
> The problem for the above step 2) is, if after system reboots and
> another different process happens to have exact same pid number which
> autorebuild.pid file records, check_one_sharer() will treat it as a
> duplicated mdadm process and returns error with message "Only one
> autorebuild process allowed in scan mode, aborting".
> 
> This patch tries to fix the above same-pid-but-different-process issue
> by one more step to check the process command name,
> 1) Read autorebuild.pid file
>    - if file does not exist, no duplicated process, go to 4).
>    - if file exists, continue to next step.
> 2) Read pid number from autorebuild.pid file, then check procfs file
>    comm with the specific pid directory /proc/<PID>/comm
>    - if the file does not exit, it means the directory /proc/<PID> does
>      not exist, go to 4)
>    - if the file exits, continue next step
> 3) Read process command name from /proc/<PIC>/comm, compare the command
>    name with "mdadm" process name,
>    - if not equal, no duplicated process, goto 4)
>    - if strings are equal, print error message for duplicated process
>      and exit this mdadm.
> 4) Write current pid into autorebuild.pid file, continue to monitor in
>    scan mode.
> 
> Now check_one_sharer() returns error for duplicated process only when
> the recorded pid from autorebuild.pid exists, and the process has exact
> same command name as "mdadm".
> 
> Reported-by: Shinkichi Yamazaki <shinkichi.yamazaki@suse.com>
> Signed-off-by: Coly Li <colyli@suse.de>
> ---
>  Monitor.c | 32 ++++++++++++++++++++------------
>  1 file changed, 20 insertions(+), 12 deletions(-)

Applied!

Thanks,
Jes

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-04-27 14:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-04-10 16:24 [PATCH] Monitor: improve check_one_sharer() for checking duplicated process Coly Li
2020-04-10 22:55 ` John Stoffel
2020-04-13  1:47 ` Zhong Lidong
2020-04-13  2:34   ` Coly Li
2020-04-27 14:27 ` Jes Sorensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).