Re: Performance regression in scsi sequential throughput (iozone) due to "e084b - page-allocator: preserve PFN ordering when __GFP_COLD is set"

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	epasch@de.ibm.com, SCHILLIG@de.ibm.com,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	christof.schmitt@de.ibm.com, thoss@de.ibm.com, hare@suse.de
Subject: Re: Performance regression in scsi sequential throughput (iozone) due to "e084b - page-allocator: preserve PFN ordering when	__GFP_COLD is set"
Date: Fri, 05 Feb 2010 16:51:10 +0100	[thread overview]
Message-ID: <4B6C3E6E.6050303@linux.vnet.ibm.com> (raw)
In-Reply-To: <20100119113306.GA23881@csn.ul.ie>

[-- Attachment #1: Type: text/plain, Size: 10339 bytes --]

I'll keep the old thread below as reference.

After taking a round of ensuring reproducibility and a pile of new 
measurements I now can come back with several new insights.

FYI - I'm now running iozone triplets (4, then 8, then 16 parallel 
threads) with sequential read load and all that 4 times to find 
potential noise. But since I changed to that load instead of random read 
wit hone thread and ensuring the most memory is cleared (sync + echo 3 > 
/proc/sys/vm/drop_caches + a few sleeps) . The noise is now down <2%. 
For detailed questions about the setup feel free to ask me directly as I 
won't flood this thread too much with such details.

So in the past I identified git id e084b for bringing a huge 
degradation, that is still true, but I need to revert my old statement 
that unapplying e084b on v2.6.32 helps - it simply doesn't.

Another bisect (keepign e084b reverted) brought up git id 5f8dcc21 which 
came in later. Both patches unapplied individually don't improve 
anything. But both patches reverted at the same time on git v2.6.32 
bring us back to our old values (remember that is more than 2Gb/s 
improvement in throughput)!

Unfortunately 5f8dcc21 is as unobvious as e84b in explaining how this 
can cause so much trouble.

In the past I identified congestion_wait as the point where the "time is 
lost" when comparing good and bad case. Fortunately at least this is 
still true when comparing 2.6.32 vs 2.6.32 with both patches reverted.
So I upgraded my old counter patches a bit and can now report the following:

BAD CASE
                                                                4 THREAD 
READ               8 THREAD READ               16 THREAD READ       
16THR % portions
perf_count_congestion_wait                                        
305                        1970                        8980
perf_count_call_congestion_wait_from_alloc_pages_high_priority      
0                           0                           0
perf_count_call_congestion_wait_from_alloc_pages_slowpath         
305                        1970                        
8979                100.00%
perf_count_pages_direct_reclaim                                  
1153                        6818                        3221
perf_count_failed_pages_direct_reclaim                            
305                        1556                        8979
perf_count_failed_pages_direct_reclaim_but_progress               
305                        1478                        
8979                27.87%

GOOD CASE WITH REVERTS                                          4 THREAD 
READ               8 THREAD READ               16 THREAD READ       
16THR % portions
perf_count_congestion_wait                                         
25                          76                         1114
perf_count_call_congestion_wait_from_alloc_pages_high_priority      
0                           0                            0
perf_count_call_congestion_wait_from_alloc_pages_slowpath          
25                          76                         
1114               99.98%
perf_count_pages_direct_reclaim                                  
1054                        9150                        62297
perf_count_failed_pages_direct_reclaim                             
25                          64                         1114
perf_count_failed_pages_direct_reclaim_but_progress                
25                          57                         
1114                1.79%


I hope the format is kept, it should be good with every monospace viewer.
You can all clearly see that the patches somehow affect the ratio at 
which __alloc_pages_direct_reclaim runs into a race between try_to_free 
pages that could actually free something, but a few lines below can't 
get a page from the free list.
Outside in the function alloc_pages_slowpath this leads to a call to 
congestion_wait which is absolutely futile as there are absolutely no 
writes in flight to wait for.

Now this kills effectively up to 80% of our throughput - Any idea of 
better understanding the link between the patches and the effect is 
welcome and might lead to a solution.

FYI - I tried all patches you suggested - none of them affects this.

Updated perf counter patches are attached.
As for some questions in the old thread:

>> Back to business :-)
>> First - yes it is reproducible in 2.6.32 final and fixable by unapplying  
>> e084b.
>> But I don't think un-applying it is really a fix as no one really  
>> understands how e084b cause this regression (it might just work around  
>> whatever goes on in the background and someone still hits the hidden 
>> issue).
>>
>>     
>
> Agreed on reverting is not an option. While it is not understood why it
> causes your regression, it's known to fix a regression for hardware that
> does do merging.
>   

I'm not taking this point but fairly compared it is +15% vs -80% - so it 
is not "only" good :-)

>> Lets better look what we know summarized:
>> - the patch e084a causes this regression
>> - it causes it only in very rare if not theoretically high memory  
>> constraints
>> - time is lost between read system call and the block device layer enter
>> - via debugging and discussion we found that time lost is spent in  
>> congestion_wait
>>
>>     
>
> Thanks for the summary.
>
> There is an outside chance it's due to a difference in free page availablity
> and the associated counters. Particularly if some subsystem is optimistally
> trying high-order allocations depending on availability. If there are many
> allocation failures, the counters gets skewed due to a bug, watermarks are
> not met and direct reclaim is used more. A patch is on its way upstream to
> fix this is http://www.spinics.net/lists/mm-commits/msg75671.html . Can you
> try it please?
>
> There is a report on swap-based workloads being impacted on 2.6.32 and a fix
> exists but it wouldn't have affected 2.6.31. If you are testing on 2.6.32
> though, this patch should be applied http://lkml.org/lkml/2009/12/21/245
>   
both are not affecting my scenario at all
>   
>> I did not yet check if it is possible, but we might have a spot tho  
>> might fix the issue.
>> Congestion_wait states it would "Waits for up to @timeout jiffies for a  
>> backing_dev (any backing_dev) to exit write congestion.  If no  
>> backing_devs are congested then just wait for the next write to be  
>> completed.".
>> Now in some case (and also my test) there is absolutely no write to do  
>> or in flight. So maybe it would be possible to either detect this before  
>> calling congestion_wait from page_alloc.c or let congestion_wait return  
>> an error code or something similar.
>>
>>     
>
> While I think that idea has merit, it's fixing the symptons and not the
> underlying problem. There is a growing opinion that congestion_wait() being
> used in the VM at all is a mistake. If the VM really needs to wait on pages
> being written out, it should have been on a waitqueue for a number of pages
> rather than a time-based method.
>
> In other words, I'm more interesting in figuring out *why* we enter
> congestion_wait() at all with the patch and it's avoided without rather than
> what congestion_wait() does when it gets called.
>
> Also, are you sure about thw "waiting for the next write to complete"? In
> 2.6.31 at least, it's waiting on the async write queue. If it's a case that
> it waits the full timeout if there is no async traffic then that's obviously
> bad as well but still doesn't explain why the problem patch makes a difference.
>   
I agree that this part of the discussion is about symptoms and for now 
we should try to focus on the cause that avtually brings up the race in 
__alloc_pages_direct_reclaim.
>> I mean as far as I can see the kernel currently does a loop like that  
>> (pseudo code and simplified):
>> 1. get a page <- fail
>> 2. try_to_free page <- returns >=1 pages freed
>> 3. get_page_from_freelist <- fails
>> 4. should_alloc_retry <- true
>> 5. congestion_wait <- FUTILE - just loosing time, nothing will be freed  
>> by writes
>> 6. goto 1
>> -> So maybe we should detect the futility of that congestion_wait call  
>> and not do it at all repeating instantly, go to oom, .., whatever (thats  
>> out for discussion).
>>
>> If we can't fix it that way I would propose we keep the code as is,  
>>     
>
> It's not my preferred way to fix this although I'm willing to explore it.
> I'd much prefer an explanation as to why your testcase is sensitive to the
> PFN ordering of the pages it gets back.
>
> One possibility is that your testcase requires a number of
> high-order allocations and the patch is preventing them being
> merged. If that was the case, the following patch would also help
> http://www.spinics.net/lists/mm-commits/msg75672.html but it's less than ideal.
>
>   
again not affecting my workload at all

>> hoping that it is a theoretical case that never hits a customer system.  
>> But in that case we should definitely add a userspace readable counter  
>> to congestion_wait similar to my debug patches. This would allow  
>> everyone that ever assumes an issue might be related to this to verify  
>> it by checking the congestion_wait counter.
>>     
>
> Would the accounting features available with
> CONFIG_TASK_DELAY_ACCT be sufficient?  There is a description in
> Documentation/accounting/delay-accounting.txt on how to use it. An abnormal
> amount of time spent on IO might explain the problem.
>
>   
Eventually it will appear as a huge increase in I/O-wait for no obvious 
reason - DELAY_ACCT is fine, actually even plain iostat is be enough.
The congestion_wait counter would just help to differ this issue, from 
others also increasing I/O-wait - but that is not too important.
I think we can all live without a counter and better fix the cause :-)
>> This should not be too hard in a technical way, but also not performance  
>> critical as congestion_wait is going to wait anyway. On the other hand  
>> everyone hates introducing new kernel interfaces that need to be kept  
>> compatible until nirvana - especially for bug tracking its not the best  
>> idea :-) So it should be our very last resort.
>>
>>     
>
> Agreed.
>
>   


-- 

Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization 


[-- Attachment #2: perf-count-congestion-wait.diff --]
[-- Type: text/x-patch, Size: 2630 bytes --]

Index: linux-2.6-git/include/linux/sysctl.h
===================================================================
--- linux-2.6-git.orig/include/linux/sysctl.h	2010-02-05 16:48:08.000000000 +0100
+++ linux-2.6-git/include/linux/sysctl.h	2010-02-05 16:48:50.000000000 +0100
@@ -68,6 +68,7 @@
 	CTL_BUS=8,		/* Busses */
 	CTL_ABI=9,		/* Binary emulation */
 	CTL_CPU=10,		/* CPU stuff (speed scaling, etc) */
+  	CTL_PERF=11,		/* Performance counters and timer sums for debugging */
 	CTL_ARLAN=254,		/* arlan wireless driver */
 	CTL_S390DBF=5677,	/* s390 debug */
 	CTL_SUNRPC=7249,	/* sunrpc debug */
Index: linux-2.6-git/kernel/sysctl.c
===================================================================
--- linux-2.6-git.orig/kernel/sysctl.c	2010-02-05 16:48:08.000000000 +0100
+++ linux-2.6-git/kernel/sysctl.c	2010-02-05 16:48:27.000000000 +0100
@@ -183,6 +183,7 @@
 	.default_set.list = LIST_HEAD_INIT(root_table_header.ctl_entry),
 };
 
+static struct ctl_table perf_table[];
 static struct ctl_table kern_table[];
 static struct ctl_table vm_table[];
 static struct ctl_table fs_table[];
@@ -236,6 +237,13 @@
 		.mode		= 0555,
 		.child		= dev_table,
 	},
+	{
+		.ctl_name	= CTL_PERF,
+		.procname	= "perf",
+		.mode		= 0555,
+		.child		= perf_table,
+	},
+
 /*
  * NOTE: do not add new entries to this table unless you have read
  * Documentation/sysctl/ctl_unnumbered.txt
@@ -250,6 +258,19 @@
 static int max_wakeup_granularity_ns = NSEC_PER_SEC;	/* 1 second */
 #endif
 
+extern unsigned long perf_count_congestion_wait;
+static struct ctl_table perf_table[] = {
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname       = "perf_count_congestion_wait",
+		.data           = &perf_count_congestion_wait,
+		.mode           = 0666,
+		.maxlen		= sizeof(unsigned long),
+		.proc_handler   = &proc_doulongvec_minmax,
+	},
+	{ .ctl_name = 0 }
+};
+
 static struct ctl_table kern_table[] = {
 	{
 		.ctl_name	= CTL_UNNUMBERED,
Index: linux-2.6-git/mm/backing-dev.c
===================================================================
--- linux-2.6-git.orig/mm/backing-dev.c	2010-02-05 16:48:08.000000000 +0100
+++ linux-2.6-git/mm/backing-dev.c	2010-02-05 16:48:27.000000000 +0100
@@ -739,6 +739,7 @@
 }
 EXPORT_SYMBOL(set_bdi_congested);
 
+unsigned long perf_count_congestion_wait = 0;
 /**
  * congestion_wait - wait for a backing_dev to become uncongested
  * @sync: SYNC or ASYNC IO
@@ -754,6 +755,7 @@
 	DEFINE_WAIT(wait);
 	wait_queue_head_t *wqh = &congestion_wqh[sync];
 
+	perf_count_congestion_wait++;
 	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
 	ret = io_schedule_timeout(timeout);
 	finish_wait(wqh, &wait);

[-- Attachment #3: perf-count-congestion-wait-calls.diff --]
[-- Type: text/x-patch, Size: 3604 bytes --]

Index: linux-2.6-git/kernel/sysctl.c
===================================================================
--- linux-2.6-git.orig/kernel/sysctl.c	2010-02-05 16:48:58.000000000 +0100
+++ linux-2.6-git/kernel/sysctl.c	2010-02-05 16:49:03.000000000 +0100
@@ -262,6 +262,8 @@
 extern unsigned long perf_count_pages_direct_reclaim;
 extern unsigned long perf_count_failed_pages_direct_reclaim;
 extern unsigned long perf_count_failed_pages_direct_reclaim_but_progress;
+extern unsigned long perf_count_call_congestion_wait_from_alloc_pages_high_priority;
+extern unsigned long perf_count_call_congestion_wait_from_alloc_pages_slowpath;
 static struct ctl_table perf_table[] = {
 	{
 		.ctl_name	= CTL_UNNUMBERED,
@@ -270,6 +272,22 @@
 		.mode           = 0666,
 		.maxlen		= sizeof(unsigned long),
 		.proc_handler   = &proc_doulongvec_minmax,
+	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname       = "perf_count_call_congestion_wait_from_alloc_pages_high_priority",
+		.data           = &perf_count_call_congestion_wait_from_alloc_pages_high_priority,
+		.mode           = 0666,
+		.maxlen		= sizeof(unsigned long),
+		.proc_handler   = &proc_doulongvec_minmax,
+	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname       = "perf_count_call_congestion_wait_from_alloc_pages_slowpath",
+		.data           = &perf_count_call_congestion_wait_from_alloc_pages_slowpath,
+		.mode           = 0666,
+		.maxlen		= sizeof(unsigned long),
+		.proc_handler   = &proc_doulongvec_minmax,
 	},
 	{
 		.ctl_name	= CTL_UNNUMBERED,
Index: linux-2.6-git/mm/page_alloc.c
===================================================================
--- linux-2.6-git.orig/mm/page_alloc.c	2010-02-05 16:48:58.000000000 +0100
+++ linux-2.6-git/mm/page_alloc.c	2010-02-05 16:49:03.000000000 +0100
@@ -1663,6 +1663,7 @@
 
 unsigned long perf_count_pages_direct_reclaim = 0;
 unsigned long perf_count_failed_pages_direct_reclaim = 0;
+unsigned long perf_count_failed_pages_direct_reclaim_but_progress = 0;
 
 /* The really slow allocator path where we enter direct reclaim */
 static inline struct page *
@@ -1704,12 +1705,13 @@
 	perf_count_pages_direct_reclaim++;
 	if (!page)
 		perf_count_failed_pages_direct_reclaim++;
-	if (!page && !(*did_some_progress))
-		perf_count_failed_pages_direct_reclaim_but_progress++
+	if (!page && (*did_some_progress))
+		perf_count_failed_pages_direct_reclaim_but_progress++;
 
 	return page;
 }
 
+unsigned long perf_count_call_congestion_wait_from_alloc_pages_high_priority = 0;
 /*
  * This is called in the allocator slow-path if the allocation request is of
  * sufficient urgency to ignore watermarks and take other desperate measures
@@ -1727,8 +1729,10 @@
 			zonelist, high_zoneidx, ALLOC_NO_WATERMARKS,
 			preferred_zone, migratetype);
 
-		if (!page && gfp_mask & __GFP_NOFAIL)
+		if (!page && gfp_mask & __GFP_NOFAIL) {
+			perf_count_call_congestion_wait_from_alloc_pages_high_priority++;
 			congestion_wait(BLK_RW_ASYNC, HZ/50);
+		}
 	} while (!page && (gfp_mask & __GFP_NOFAIL));
 
 	return page;
@@ -1783,6 +1787,7 @@
 	return alloc_flags;
 }
 
+unsigned long perf_count_call_congestion_wait_from_alloc_pages_slowpath = 0;
 static inline struct page *
 __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	struct zonelist *zonelist, enum zone_type high_zoneidx,
@@ -1899,6 +1904,7 @@
 	pages_reclaimed += did_some_progress;
 	if (should_alloc_retry(gfp_mask, order, pages_reclaimed)) {
 		/* Wait for some write requests to complete then retry */
+		perf_count_call_congestion_wait_from_alloc_pages_slowpath++;
 		congestion_wait(BLK_RW_ASYNC, HZ/50);
 		goto rebalance;
 	}

[-- Attachment #4: perf-count-failed-direct-recalims.diff --]
[-- Type: text/x-patch, Size: 2461 bytes --]

Index: linux-2.6-git/kernel/sysctl.c
===================================================================
--- linux-2.6-git.orig/kernel/sysctl.c	2010-02-05 16:48:27.000000000 +0100
+++ linux-2.6-git/kernel/sysctl.c	2010-02-05 16:48:58.000000000 +0100
@@ -259,6 +259,9 @@
 #endif
 
 extern unsigned long perf_count_congestion_wait;
+extern unsigned long perf_count_pages_direct_reclaim;
+extern unsigned long perf_count_failed_pages_direct_reclaim;
+extern unsigned long perf_count_failed_pages_direct_reclaim_but_progress;
 static struct ctl_table perf_table[] = {
 	{
 		.ctl_name	= CTL_UNNUMBERED,
@@ -268,6 +271,30 @@
 		.maxlen		= sizeof(unsigned long),
 		.proc_handler   = &proc_doulongvec_minmax,
 	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname       = "perf_count_pages_direct_reclaim",
+		.data           = &perf_count_pages_direct_reclaim,
+		.maxlen		= sizeof(unsigned long),
+		.mode           = 0666,
+		.proc_handler   = &proc_doulongvec_minmax,
+	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname       = "perf_count_failed_pages_direct_reclaim",
+		.data           = &perf_count_failed_pages_direct_reclaim,
+		.maxlen		= sizeof(unsigned long),
+		.mode           = 0666,
+		.proc_handler   = &proc_doulongvec_minmax,
+	},
+	{
+		.ctl_name	= CTL_UNNUMBERED,
+		.procname       = "perf_count_failed_pages_direct_reclaim_but_progress",
+		.data           = &perf_count_failed_pages_direct_reclaim_but_progress,
+		.maxlen		= sizeof(unsigned long),
+		.mode           = 0666,
+		.proc_handler   = &proc_doulongvec_minmax,
+	},
 	{ .ctl_name = 0 }
 };
 
Index: linux-2.6-git/mm/page_alloc.c
===================================================================
--- linux-2.6-git.orig/mm/page_alloc.c	2010-02-05 16:48:07.000000000 +0100
+++ linux-2.6-git/mm/page_alloc.c	2010-02-05 16:48:58.000000000 +0100
@@ -1661,6 +1661,9 @@
 	return page;
 }
 
+unsigned long perf_count_pages_direct_reclaim = 0;
+unsigned long perf_count_failed_pages_direct_reclaim = 0;
+
 /* The really slow allocator path where we enter direct reclaim */
 static inline struct page *
 __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
@@ -1697,6 +1700,13 @@
 					zonelist, high_zoneidx,
 					alloc_flags, preferred_zone,
 					migratetype);
+
+	perf_count_pages_direct_reclaim++;
+	if (!page)
+		perf_count_failed_pages_direct_reclaim++;
+	if (!page && !(*did_some_progress))
+		perf_count_failed_pages_direct_reclaim_but_progress++
+
 	return page;
 }

next prev parent reply	other threads:[~2010-02-05 15:51 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-07 14:39 Performance regression in scsi sequential throughput (iozone) due to "e084b - page-allocator: preserve PFN ordering when __GFP_COLD is set" Christian Ehrhardt
2009-12-07 15:09 ` Mel Gorman
2009-12-08 17:59   ` Christian Ehrhardt
2009-12-10 14:36     ` Christian Ehrhardt
2009-12-11 11:20       ` Mel Gorman
2009-12-11 14:47         ` Christian Ehrhardt
2009-12-18 13:38           ` Christian Ehrhardt
2009-12-18 17:42             ` Mel Gorman
2010-01-14 12:30               ` Christian Ehrhardt
2010-01-19 11:33                 ` Mel Gorman
2010-02-05 15:51                   ` Christian Ehrhardt [this message]
2010-02-05 17:49                     ` Mel Gorman
2010-02-08 14:01                       ` Christian Ehrhardt
2010-02-08 15:21                         ` Mel Gorman
2010-02-08 16:55                           ` Mel Gorman
2010-02-09  6:23                           ` Christian Ehrhardt
2010-02-09 15:52                           ` Christian Ehrhardt
2010-02-09 17:57                             ` Mel Gorman
2010-02-11 16:11                               ` Christian Ehrhardt
2010-02-12 10:05                                 ` Nick Piggin
2010-02-15  6:59                                   ` Nick Piggin
2010-02-15 15:46                                   ` Christian Ehrhardt
2010-02-16 11:25                                     ` Mel Gorman
2010-02-16 16:47                                       ` Christian Ehrhardt
2010-02-17  9:55                                         ` Christian Ehrhardt
2010-02-17 10:03                                           ` Christian Ehrhardt
2010-02-18 11:43                                           ` Mel Gorman
2010-02-18 16:09                                             ` Christian Ehrhardt
2010-02-19 11:19                                               ` Christian Ehrhardt
2010-02-19 15:19                                                 ` Mel Gorman
2010-02-22 15:42                                                   ` Christian Ehrhardt
2010-02-25 15:13                                                     ` Christian Ehrhardt
2010-02-26 11:18                                                       ` Nick Piggin
2010-03-02  6:52                                                   ` Nick Piggin
2010-03-02 10:04                                                     ` Mel Gorman
2010-03-02 10:36                                                       ` Nick Piggin
2010-03-02 11:01                                                         ` Mel Gorman
2010-03-02 11:18                                                           ` Nick Piggin
2010-03-02 11:24                                                             ` Mel Gorman
2010-03-03  6:51                                                               ` Christian Ehrhardt
2010-02-08 15:02                       ` Christian Ehrhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B6C3E6E.6050303@linux.vnet.ibm.com \
    --to=ehrhardt@linux.vnet.ibm.com \
    --cc=SCHILLIG@de.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=christof.schmitt@de.ibm.com \
    --cc=epasch@de.ibm.com \
    --cc=hare@suse.de \
    --cc=heiko.carstens@de.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mel@csn.ul.ie \
    --cc=schwidefsky@de.ibm.com \
    --cc=thoss@de.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.