* [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure @ 2013-07-30 4:05 Brandon Casey 2013-07-30 7:51 ` Eric Sunshine 2013-07-30 15:39 ` Junio C Hamano 0 siblings, 2 replies; 23+ messages in thread From: Brandon Casey @ 2013-07-30 4:05 UTC (permalink / raw) To: git; +Cc: gitster, spearce, Brandon Casey From: Brandon Casey <drafnel@gmail.com> When the number of open packs exceeds pack_max_fds, unuse_one_window() is called repeatedly to attempt to release the least-recently-used pack windows, which, as a side-effect, will also close a pack file after closing its last open window. If a pack file has been opened, but no windows have been allocated into it, it will never be selected by unuse_one_window() and hence its file descriptor will not be closed. When this happens, git may exceed the number of file descriptors permitted by the system. This latter situation can occur in show-ref or receive-pack during ref advertisement. During ref advertisement, receive-pack will iterate over every ref in the repository and advertise it to the client after ensuring that the ref exists in the local repository. If the ref is located inside a pack, then the pack is opened to ensure that it exists, but since the object is not actually read from the pack, no mmap windows are allocated. When the number of open packs exceeds pack_max_fds, unuse_one_window() will not able to find any windows to free and will not be able to close any packs. Once the per-process file descriptor limit is exceeded, receive-pack will produce a warning, not an error, for each pack it cannot open, and will then most likely fail with an error to spawn rev-list or index-pack like: error: cannot create standard input pipe for rev-list: Too many open files error: Could not run 'git rev-list' This is not likely to occur during upload-pack since upload-pack reads each object from the pack so that it can peel tags and advertise the exposed object. So during upload-pack, mmap windows will be allocated for each pack that is opened and unuse_one_window() will eventually be able to close unused packs after freeing all of their windows. When we have file descriptor pressure, in contrast to memory pressure, we need to free all windows and close the pack file descriptor so that a new pack can be opened. Let's introduce a new function close_one_pack() designed specifically for this purpose to search for and close the least-recently-used pack, where LRU is defined as * pack with oldest mtime and no allocated mmap windows or * pack with the least-recently-used windows, i.e. the pack with the oldest most-recently-used window Signed-off-by: Brandon Casey <drafnel@gmail.com> --- sha1_file.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 62 insertions(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 8e27db1..7731ab1 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -682,6 +682,67 @@ void close_pack_windows(struct packed_git *p) } } +/* + * The LRU pack is the one with the oldest MRU window or the oldest mtime + * if it has no windows allocated. + */ +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w) +{ + struct pack_window *w, *this_mru_w; + + /* + * Reject this pack if it has windows and the previously selected + * one does not. If this pack does not have windows, reject + * it if the pack file is newer than the previously selected one. + */ + if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime)) + return; + + for (w = this_mru_w = p->windows; w; w = w->next) { + /* Reject this pack if any of its windows are in use */ + if (w->inuse_cnt) + return; + /* + * Reject this pack if it has windows that have been + * used more recently than the previously selected pack. + */ + if (*mru_w && w->last_used > (*mru_w)->last_used) + return; + if (w->last_used > this_mru_w->last_used) + this_mru_w = w; + } + + /* + * Select this pack. + */ + *mru_w = this_mru_w; + *lru_p = p; +} + +static int close_one_pack(void) +{ + struct packed_git *p, *lru_p = NULL; + struct pack_window *mru_w = NULL; + + for (p = packed_git; p; p = p->next) { + if (p->pack_fd == -1) + continue; + find_lru_pack(p, &lru_p, &mru_w); + } + + if (lru_p) { + close_pack_windows(lru_p); + close(lru_p->pack_fd); + pack_open_fds--; + lru_p->pack_fd = -1; + if (lru_p == last_found_pack) + last_found_pack = NULL; + return 1; + } + + return 0; +} + void unuse_pack(struct pack_window **w_cursor) { struct pack_window *w = *w_cursor; @@ -777,7 +838,7 @@ static int open_packed_git_1(struct packed_git *p) pack_max_fds = 1; } - while (pack_max_fds <= pack_open_fds && unuse_one_window(NULL, -1)) + while (pack_max_fds <= pack_open_fds && close_one_pack()) ; /* nothing */ p->pack_fd = git_open_noatime(p->pack_name); -- 1.8.3.1.440.gc2bf105 ----------------------------------------------------------------------------------- This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. ----------------------------------------------------------------------------------- ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-07-30 4:05 [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure Brandon Casey @ 2013-07-30 7:51 ` Eric Sunshine 2013-07-30 15:39 ` Junio C Hamano 1 sibling, 0 replies; 23+ messages in thread From: Eric Sunshine @ 2013-07-30 7:51 UTC (permalink / raw) To: Brandon Casey; +Cc: Git List, Junio C Hamano, Shawn Pearce, Brandon Casey On Tue, Jul 30, 2013 at 12:05 AM, Brandon Casey <bcasey@nvidia.com> wrote: > When the number of open packs exceeds pack_max_fds, unuse_one_window() > is called repeatedly to attempt to release the least-recently-used > pack windows, which, as a side-effect, will also close a pack file > after closing its last open window. If a pack file has been opened, > but no windows have been allocated into it, it will never be selected > by unuse_one_window() and hence its file descriptor will not be > closed. When this happens, git may exceed the number of file > descriptors permitted by the system. > > This latter situation can occur in show-ref or receive-pack during ref > advertisement. During ref advertisement, receive-pack will iterate > over every ref in the repository and advertise it to the client after > ensuring that the ref exists in the local repository. If the ref is > located inside a pack, then the pack is opened to ensure that it > exists, but since the object is not actually read from the pack, no > mmap windows are allocated. When the number of open packs exceeds > pack_max_fds, unuse_one_window() will not able to find any windows to s/not able/not be able/ ...or... s/not able to find/not find/ > free and will not be able to close any packs. Once the per-process > file descriptor limit is exceeded, receive-pack will produce a warning, > not an error, for each pack it cannot open, and will then most likely > fail with an error to spawn rev-list or index-pack like: > > error: cannot create standard input pipe for rev-list: Too many open files > error: Could not run 'git rev-list' > > This is not likely to occur during upload-pack since upload-pack > reads each object from the pack so that it can peel tags and > advertise the exposed object. So during upload-pack, mmap windows > will be allocated for each pack that is opened and unuse_one_window() > will eventually be able to close unused packs after freeing all of > their windows. > > When we have file descriptor pressure, in contrast to memory pressure, > we need to free all windows and close the pack file descriptor so that > a new pack can be opened. Let's introduce a new function > close_one_pack() designed specifically for this purpose to search > for and close the least-recently-used pack, where LRU is defined as > > * pack with oldest mtime and no allocated mmap windows or > * pack with the least-recently-used windows, i.e. the pack > with the oldest most-recently-used window > > Signed-off-by: Brandon Casey <drafnel@gmail.com> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-07-30 4:05 [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure Brandon Casey 2013-07-30 7:51 ` Eric Sunshine @ 2013-07-30 15:39 ` Junio C Hamano 2013-07-30 19:52 ` Jeff King 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2013-07-30 15:39 UTC (permalink / raw) To: Brandon Casey; +Cc: git, spearce, Brandon Casey, Jeff King Brandon Casey <bcasey@nvidia.com> writes: > From: Brandon Casey <drafnel@gmail.com> > > When the number of open packs exceeds pack_max_fds, unuse_one_window() > is called repeatedly to attempt to release the least-recently-used > pack windows, which, as a side-effect, will also close a pack file > after closing its last open window. If a pack file has been opened, > but no windows have been allocated into it, it will never be selected > by unuse_one_window() and hence its file descriptor will not be > closed. When this happens, git may exceed the number of file > descriptors permitted by the system. An interesting find. The patch from a cursory look reads OK. Thanks. > This is not likely to occur during upload-pack since upload-pack > reads each object from the pack so that it can peel tags and > advertise the exposed object. Another interesting find. Perhaps there is a room for improvements, as packed-refs file knows what objects the tags peel to? I vaguely recall Peff was actively reducing the object access during ref enumeration in not so distant past... ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-07-30 15:39 ` Junio C Hamano @ 2013-07-30 19:52 ` Jeff King 2013-07-30 22:59 ` Brandon Casey 0 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2013-07-30 19:52 UTC (permalink / raw) To: Junio C Hamano; +Cc: Brandon Casey, git, spearce, Brandon Casey On Tue, Jul 30, 2013 at 08:39:48AM -0700, Junio C Hamano wrote: > Brandon Casey <bcasey@nvidia.com> writes: > > > From: Brandon Casey <drafnel@gmail.com> > > > > When the number of open packs exceeds pack_max_fds, unuse_one_window() > > is called repeatedly to attempt to release the least-recently-used > > pack windows, which, as a side-effect, will also close a pack file > > after closing its last open window. If a pack file has been opened, > > but no windows have been allocated into it, it will never be selected > > by unuse_one_window() and hence its file descriptor will not be > > closed. When this happens, git may exceed the number of file > > descriptors permitted by the system. > > An interesting find. The patch from a cursory look reads OK. Yeah. I wonder if unuse_one_window() should actually leave the pack fd open now in general. If you close packfile descriptors, you can run into racy situations where somebody else is repacking and deleting packs, and they go away while you are trying to access them. If you keep a descriptor open, you're fine; they last to the end of the process. If you don't, then they disappear from under you. For normal object access, this isn't that big a deal; we just rescan the packs and retry. But if you are packing yourself (e.g., because you are a pack-objects started by upload-pack for a clone or fetch), it's much harder to recover (and we print some warnings). We had our core.packedGitWindowSize lowered on GitHub for a while, and we ran into this warning on busy repositories when we were running "git gc" on the server. We solved it by bumping the window size so we never release memory. But just not closing the descriptor wouldn't work until Brandon's patch, because we used the same function to release memory and descriptor pressure. Now we could do them separately (and progressively if we need to). > > This is not likely to occur during upload-pack since upload-pack > > reads each object from the pack so that it can peel tags and > > advertise the exposed object. > > Another interesting find. Perhaps there is a room for improvements, > as packed-refs file knows what objects the tags peel to? I vaguely > recall Peff was actively reducing the object access during ref > enumeration in not so distant past... Yeah, we should be reading almost no objects these days due to the packed-refs peel lines. I just did a double-check on what "git upload-pack . </dev/null >/dev/null" reads on my git.git repo, and it is only three objects: the v1.8.3.3, v1.8.3.4, and v1.8.4-rc0 tag objects. In other words, the tags I got since the last time I ran "git gc". So I think all is working as designed. We could give receive-pack the same treatment; I've spent less time micro-optimizing it because because we (and most sites, I would think) get an order of magnitude more fetches than pushes. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-07-30 19:52 ` Jeff King @ 2013-07-30 22:59 ` Brandon Casey 2013-07-31 19:51 ` [PATCH v2 1/2] " Brandon Casey 0 siblings, 1 reply; 23+ messages in thread From: Brandon Casey @ 2013-07-30 22:59 UTC (permalink / raw) To: Jeff King Cc: Junio C Hamano, Brandon Casey, git@vger.kernel.org, Shawn Pearce On Tue, Jul 30, 2013 at 12:52 PM, Jeff King <peff@peff.net> wrote: > On Tue, Jul 30, 2013 at 08:39:48AM -0700, Junio C Hamano wrote: > >> Brandon Casey <bcasey@nvidia.com> writes: >> >> > From: Brandon Casey <drafnel@gmail.com> >> > >> > When the number of open packs exceeds pack_max_fds, unuse_one_window() >> > is called repeatedly to attempt to release the least-recently-used >> > pack windows, which, as a side-effect, will also close a pack file >> > after closing its last open window. If a pack file has been opened, >> > but no windows have been allocated into it, it will never be selected >> > by unuse_one_window() and hence its file descriptor will not be >> > closed. When this happens, git may exceed the number of file >> > descriptors permitted by the system. >> >> An interesting find. The patch from a cursory look reads OK. > > Yeah. I wonder if unuse_one_window() should actually leave the pack fd > open now in general. > > If you close packfile descriptors, you can run into racy situations > where somebody else is repacking and deleting packs, and they go away > while you are trying to access them. If you keep a descriptor open, > you're fine; they last to the end of the process. If you don't, then > they disappear from under you. > > For normal object access, this isn't that big a deal; we just rescan the > packs and retry. But if you are packing yourself (e.g., because you are > a pack-objects started by upload-pack for a clone or fetch), it's much > harder to recover (and we print some warnings). > > We had our core.packedGitWindowSize lowered on GitHub for a while, and > we ran into this warning on busy repositories when we were running "git > gc" on the server. We solved it by bumping the window size so we never > release memory. > > But just not closing the descriptor wouldn't work until Brandon's patch, > because we used the same function to release memory and descriptor > pressure. Now we could do them separately (and progressively if we need > to). I had thought about whether to stop closing the pack file in unuse_one_window(), but didn't have a reason to do so. I think the scenario you described provides a justification. If we're not under file descriptor pressure and we can possibly avoid rescanning the pack directory, it sounds like a net win. >> > This is not likely to occur during upload-pack since upload-pack >> > reads each object from the pack so that it can peel tags and >> > advertise the exposed object. >> >> Another interesting find. Perhaps there is a room for improvements, >> as packed-refs file knows what objects the tags peel to? I vaguely >> recall Peff was actively reducing the object access during ref >> enumeration in not so distant past... > > Yeah, we should be reading almost no objects these days due to the > packed-refs peel lines. I just did a double-check on what "git > upload-pack . </dev/null >/dev/null" reads on my git.git repo, and it is > only three objects: the v1.8.3.3, v1.8.3.4, and v1.8.4-rc0 tag objects. > In other words, the tags I got since the last time I ran "git gc". So I > think all is working as designed. Ok, looks like this has been the case since your 435c8332 which taught upload-pack to use peel_ref(). So looks like we do avoid reaching into the pack for any ref that was read from a (modern) packed-refs file. The repository I was testing with had mostly loose refs. Indeed, after packing refs, upload-pack encounters the same problem as receive-pack and runs out of file descriptors. So my comment about upload-pack is not completely accurate. Upload-pack _can_ run into this problem, but the refs must be packed, as well as there being enough of them that exist in enough different pack files to exceed the processes fd limit. > We could give receive-pack the same treatment; I've spent less time > micro-optimizing it because because we (and most sites, I would think) > get an order of magnitude more fetches than pushes. I don't think it would need the 435c8332 treatment since receive-pack doesn't peel refs when it advertises them to the client and hence does not need to load the ref object from the pack file during ref advertisement, but possibly some of the other stuff you did would be applicable. But like you said, the number of fetches far exceed the number of pushes. -Brandon ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-07-30 22:59 ` Brandon Casey @ 2013-07-31 19:51 ` Brandon Casey 2013-07-31 19:51 ` [PATCH 2/2] Don't close pack fd when free'ing pack windows Brandon Casey 2013-08-01 17:12 ` [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure Junio C Hamano 0 siblings, 2 replies; 23+ messages in thread From: Brandon Casey @ 2013-07-31 19:51 UTC (permalink / raw) To: git; +Cc: gitster, peff, spearce, sunshine, Brandon Casey From: Brandon Casey <drafnel@gmail.com> When the number of open packs exceeds pack_max_fds, unuse_one_window() is called repeatedly to attempt to release the least-recently-used pack windows, which, as a side-effect, will also close a pack file after closing its last open window. If a pack file has been opened, but no windows have been allocated into it, it will never be selected by unuse_one_window() and hence its file descriptor will not be closed. When this happens, git may exceed the number of file descriptors permitted by the system. This latter situation can occur in show-ref or receive-pack during ref advertisement. During ref advertisement, receive-pack will iterate over every ref in the repository and advertise it to the client after ensuring that the ref exists in the local repository. If the ref is located inside a pack, then the pack is opened to ensure that it exists, but since the object is not actually read from the pack, no mmap windows are allocated. When the number of open packs exceeds pack_max_fds, unuse_one_window() will not be able to find any windows to free and will not be able to close any packs. Once the per-process file descriptor limit is exceeded, receive-pack will produce a warning, not an error, for each pack it cannot open, and will then most likely fail with an error to spawn rev-list or index-pack like: error: cannot create standard input pipe for rev-list: Too many open files error: Could not run 'git rev-list' This may also occur during upload-pack when refs are packed (in the packed-refs file) and the number of packs that must be opened to verify that these packed refs exist exceeds the file descriptor limit. If the refs are loose, then upload-pack will read each ref from the pack (allocating one or more mmap windows) so it can peel tags and advertise the underlying object. If the refs are packed and peeled, then upload-pack will use the peeled sha1 in the packed-refs file and will not need to read from the pack files, so no mmap windows will be allocated and just like with receive-pack, unuse_one_window() will never select these opened packs to close. When we have file descriptor pressure, in contrast to memory pressure, we need to free all windows and close the pack file descriptor so that a new pack can be opened. Let's introduce a new function close_one_pack() designed specifically for this purpose to search for and close the least-recently-used pack, where LRU is defined as * pack with oldest mtime and no allocated mmap windows or * pack with the least-recently-used windows, i.e. the pack with the oldest most-recently-used window Signed-off-by: Brandon Casey <drafnel@gmail.com> --- The commit message was updated to fix the grammatical error that Eric Sunshine pointed out, and to correct the paragraph about upload-pack. -Brandon sha1_file.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 62 insertions(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 8e27db1..7731ab1 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -682,6 +682,67 @@ void close_pack_windows(struct packed_git *p) } } +/* + * The LRU pack is the one with the oldest MRU window or the oldest mtime + * if it has no windows allocated. + */ +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w) +{ + struct pack_window *w, *this_mru_w; + + /* + * Reject this pack if it has windows and the previously selected + * one does not. If this pack does not have windows, reject + * it if the pack file is newer than the previously selected one. + */ + if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime)) + return; + + for (w = this_mru_w = p->windows; w; w = w->next) { + /* Reject this pack if any of its windows are in use */ + if (w->inuse_cnt) + return; + /* + * Reject this pack if it has windows that have been + * used more recently than the previously selected pack. + */ + if (*mru_w && w->last_used > (*mru_w)->last_used) + return; + if (w->last_used > this_mru_w->last_used) + this_mru_w = w; + } + + /* + * Select this pack. + */ + *mru_w = this_mru_w; + *lru_p = p; +} + +static int close_one_pack(void) +{ + struct packed_git *p, *lru_p = NULL; + struct pack_window *mru_w = NULL; + + for (p = packed_git; p; p = p->next) { + if (p->pack_fd == -1) + continue; + find_lru_pack(p, &lru_p, &mru_w); + } + + if (lru_p) { + close_pack_windows(lru_p); + close(lru_p->pack_fd); + pack_open_fds--; + lru_p->pack_fd = -1; + if (lru_p == last_found_pack) + last_found_pack = NULL; + return 1; + } + + return 0; +} + void unuse_pack(struct pack_window **w_cursor) { struct pack_window *w = *w_cursor; @@ -777,7 +838,7 @@ static int open_packed_git_1(struct packed_git *p) pack_max_fds = 1; } - while (pack_max_fds <= pack_open_fds && unuse_one_window(NULL, -1)) + while (pack_max_fds <= pack_open_fds && close_one_pack()) ; /* nothing */ p->pack_fd = git_open_noatime(p->pack_name); -- 1.8.4.rc0.2.g6cf5c31 ----------------------------------------------------------------------------------- This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. ----------------------------------------------------------------------------------- ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 2/2] Don't close pack fd when free'ing pack windows 2013-07-31 19:51 ` [PATCH v2 1/2] " Brandon Casey @ 2013-07-31 19:51 ` Brandon Casey 2013-07-31 21:08 ` Antoine Pelisse 2013-08-01 17:12 ` [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure Junio C Hamano 1 sibling, 1 reply; 23+ messages in thread From: Brandon Casey @ 2013-07-31 19:51 UTC (permalink / raw) To: git; +Cc: gitster, peff, spearce, sunshine, Brandon Casey From: Brandon Casey <drafnel@gmail.com> Now that close_one_pack() has been introduced to handle file descriptor pressure, it is not strictly necessary to close the pack file descriptor in unuse_one_window() when we're under memory pressure. Jeff King provided a justification for leaving the pack file open: If you close packfile descriptors, you can run into racy situations where somebody else is repacking and deleting packs, and they go away while you are trying to access them. If you keep a descriptor open, you're fine; they last to the end of the process. If you don't, then they disappear from under you. For normal object access, this isn't that big a deal; we just rescan the packs and retry. But if you are packing yourself (e.g., because you are a pack-objects started by upload-pack for a clone or fetch), it's much harder to recover (and we print some warnings). Let's do so (or uh, not do so). Signed-off-by: Brandon Casey <drafnel@gmail.com> --- builtin/pack-objects.c | 2 +- git-compat-util.h | 2 +- sha1_file.c | 21 +++++++-------------- 3 files changed, 9 insertions(+), 16 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f069462..4eb0521 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1809,7 +1809,7 @@ static void find_deltas(struct object_entry **list, unsigned *list_size, static void try_to_free_from_threads(size_t size) { read_lock(); - release_pack_memory(size, -1); + release_pack_memory(size); read_unlock(); } diff --git a/git-compat-util.h b/git-compat-util.h index cc4ba4d..29b1ee3 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -517,7 +517,7 @@ int inet_pton(int af, const char *src, void *dst); const char *inet_ntop(int af, const void *src, char *dst, size_t size); #endif -extern void release_pack_memory(size_t, int); +extern void release_pack_memory(size_t); typedef void (*try_to_free_t)(size_t); extern try_to_free_t set_try_to_free_routine(try_to_free_t); diff --git a/sha1_file.c b/sha1_file.c index 7731ab1..d26121a 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -614,7 +614,7 @@ static void scan_windows(struct packed_git *p, } } -static int unuse_one_window(struct packed_git *current, int keep_fd) +static int unuse_one_window(struct packed_git *current) { struct packed_git *p, *lru_p = NULL; struct pack_window *lru_w = NULL, *lru_l = NULL; @@ -628,15 +628,8 @@ static int unuse_one_window(struct packed_git *current, int keep_fd) pack_mapped -= lru_w->len; if (lru_l) lru_l->next = lru_w->next; - else { + else lru_p->windows = lru_w->next; - if (!lru_p->windows && lru_p->pack_fd != -1 - && lru_p->pack_fd != keep_fd) { - close(lru_p->pack_fd); - pack_open_fds--; - lru_p->pack_fd = -1; - } - } free(lru_w); pack_open_windows--; return 1; @@ -644,10 +637,10 @@ static int unuse_one_window(struct packed_git *current, int keep_fd) return 0; } -void release_pack_memory(size_t need, int fd) +void release_pack_memory(size_t need) { size_t cur = pack_mapped; - while (need >= (cur - pack_mapped) && unuse_one_window(NULL, fd)) + while (need >= (cur - pack_mapped) && unuse_one_window(NULL)) ; /* nothing */ } @@ -658,7 +651,7 @@ void *xmmap(void *start, size_t length, if (ret == MAP_FAILED) { if (!length) return NULL; - release_pack_memory(length, fd); + release_pack_memory(length); ret = mmap(start, length, prot, flags, fd, offset); if (ret == MAP_FAILED) die_errno("Out of memory? mmap failed"); @@ -954,7 +947,7 @@ unsigned char *use_pack(struct packed_git *p, win->len = (size_t)len; pack_mapped += win->len; while (packed_git_limit < pack_mapped - && unuse_one_window(p, p->pack_fd)) + && unuse_one_window(p)) ; /* nothing */ win->base = xmmap(NULL, win->len, PROT_READ, MAP_PRIVATE, @@ -1000,7 +993,7 @@ static struct packed_git *alloc_packed_git(int extra) static void try_to_free_pack_memory(size_t size) { - release_pack_memory(size, -1); + release_pack_memory(size); } struct packed_git *add_packed_git(const char *path, int path_len, int local) -- 1.8.4.rc0.2.g6cf5c31 ----------------------------------------------------------------------------------- This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. ----------------------------------------------------------------------------------- ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Don't close pack fd when free'ing pack windows 2013-07-31 19:51 ` [PATCH 2/2] Don't close pack fd when free'ing pack windows Brandon Casey @ 2013-07-31 21:08 ` Antoine Pelisse 2013-07-31 21:21 ` Fredrik Gustafsson ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Antoine Pelisse @ 2013-07-31 21:08 UTC (permalink / raw) To: Brandon Casey Cc: git, Junio C Hamano, Jeff King, spearce, Eric Sunshine, Brandon Casey On Wed, Jul 31, 2013 at 9:51 PM, Brandon Casey <bcasey@nvidia.com> wrote: > ----------------------------------------------------------------------------------- > This email message is for the sole use of the intended recipient(s) and may contain > confidential information. Any unauthorized review, use, disclosure or distribution > is prohibited. If you are not the intended recipient, please contact the sender by > reply email and destroy all copies of the original message. > ----------------------------------------------------------------------------------- I'm certainly not a lawyer, and I'm sorry for not reviewing the content of the patch instead, but is that not a problem from a legal point of view ? I remember a video of Greg Kroah-Hartman where he talked about that (the video was posted by Junio on G+). ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Don't close pack fd when free'ing pack windows 2013-07-31 21:08 ` Antoine Pelisse @ 2013-07-31 21:21 ` Fredrik Gustafsson 2013-07-31 21:31 ` Brandon Casey 2013-07-31 21:23 ` Brandon Casey 2013-07-31 21:28 ` Thomas Rast 2 siblings, 1 reply; 23+ messages in thread From: Fredrik Gustafsson @ 2013-07-31 21:21 UTC (permalink / raw) To: Antoine Pelisse Cc: Brandon Casey, git, Junio C Hamano, Jeff King, spearce, Eric Sunshine, Brandon Casey On Wed, Jul 31, 2013 at 11:08:21PM +0200, Antoine Pelisse wrote: > On Wed, Jul 31, 2013 at 9:51 PM, Brandon Casey <bcasey@nvidia.com> wrote: > > ----------------------------------------------------------------------------------- > > This email message is for the sole use of the intended recipient(s) and may contain > > confidential information. Any unauthorized review, use, disclosure or distribution > > is prohibited. If you are not the intended recipient, please contact the sender by > > reply email and destroy all copies of the original message. > > ----------------------------------------------------------------------------------- > > I'm certainly not a lawyer, and I'm sorry for not reviewing the > content of the patch instead, but is that not a problem from a legal > point of view ? Talking about legal, is it a problem if a commit isn't signed-off by it's committer or author e-mail? Like in this case where the sign-off is from gmail.com and the committer from nvidia.com? -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Don't close pack fd when free'ing pack windows 2013-07-31 21:21 ` Fredrik Gustafsson @ 2013-07-31 21:31 ` Brandon Casey 2013-07-31 21:44 ` Fredrik Gustafsson 0 siblings, 1 reply; 23+ messages in thread From: Brandon Casey @ 2013-07-31 21:31 UTC (permalink / raw) To: Fredrik Gustafsson Cc: Antoine Pelisse, Brandon Casey, git, Junio C Hamano, Jeff King, Shawn Pearce, Eric Sunshine On Wed, Jul 31, 2013 at 2:21 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > On Wed, Jul 31, 2013 at 11:08:21PM +0200, Antoine Pelisse wrote: >> On Wed, Jul 31, 2013 at 9:51 PM, Brandon Casey <bcasey@nvidia.com> wrote: >> > ----------------------------------------------------------------------------------- >> > This email message is for the sole use of the intended recipient(s) and may contain >> > confidential information. Any unauthorized review, use, disclosure or distribution >> > is prohibited. If you are not the intended recipient, please contact the sender by >> > reply email and destroy all copies of the original message. >> > ----------------------------------------------------------------------------------- >> >> I'm certainly not a lawyer, and I'm sorry for not reviewing the >> content of the patch instead, but is that not a problem from a legal >> point of view ? > > Talking about legal, is it a problem if a commit isn't signed-off by > it's committer or author e-mail? Like in this case where the sign-off is > from gmail.com and the committer from nvidia.com? It never has been. My commits should have the author and committer set to my gmail address actually. Others have sometimes used the two fields to distinguish between a corporate identity (i.e. me@somecompany.com) that represents the funder of the work and a canonical identity (me@personalemail.com) that identifies the person that performed the work. -Brandon ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Don't close pack fd when free'ing pack windows 2013-07-31 21:31 ` Brandon Casey @ 2013-07-31 21:44 ` Fredrik Gustafsson 0 siblings, 0 replies; 23+ messages in thread From: Fredrik Gustafsson @ 2013-07-31 21:44 UTC (permalink / raw) To: Brandon Casey Cc: Fredrik Gustafsson, Antoine Pelisse, Brandon Casey, git, Junio C Hamano, Jeff King, Shawn Pearce, Eric Sunshine On Wed, Jul 31, 2013 at 02:31:34PM -0700, Brandon Casey wrote: > On Wed, Jul 31, 2013 at 2:21 PM, Fredrik Gustafsson <iveqy@iveqy.com> wrote: > > On Wed, Jul 31, 2013 at 11:08:21PM +0200, Antoine Pelisse wrote: > >> On Wed, Jul 31, 2013 at 9:51 PM, Brandon Casey <bcasey@nvidia.com> wrote: > >> > ----------------------------------------------------------------------------------- > >> > This email message is for the sole use of the intended recipient(s) and may contain > >> > confidential information. Any unauthorized review, use, disclosure or distribution > >> > is prohibited. If you are not the intended recipient, please contact the sender by > >> > reply email and destroy all copies of the original message. > >> > ----------------------------------------------------------------------------------- > >> > >> I'm certainly not a lawyer, and I'm sorry for not reviewing the > >> content of the patch instead, but is that not a problem from a legal > >> point of view ? > > > > Talking about legal, is it a problem if a commit isn't signed-off by > > it's committer or author e-mail? Like in this case where the sign-off is > > from gmail.com and the committer from nvidia.com? > > It never has been. My commits should have the author and committer > set to my gmail address actually. Oh, that's why the extra "From: " - field below the header is for. > > Others have sometimes used the two fields to distinguish between a > corporate identity (i.e. me@somecompany.com) that represents the > funder of the work and a canonical identity (me@personalemail.com) > that identifies the person that performed the work. > In some contries your work when you're employed does not belong to you but to your employer and when you're acting for your employer you're representing the corporate legal person. Therefore two different e-mails can be seen as two different (legal not physical) persons. At least that's how I understand those "legal tips for developers" I've got. -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iveqy@iveqy.com ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Don't close pack fd when free'ing pack windows 2013-07-31 21:08 ` Antoine Pelisse 2013-07-31 21:21 ` Fredrik Gustafsson @ 2013-07-31 21:23 ` Brandon Casey 2013-07-31 21:28 ` Thomas Rast 2 siblings, 0 replies; 23+ messages in thread From: Brandon Casey @ 2013-07-31 21:23 UTC (permalink / raw) To: Antoine Pelisse Cc: Brandon Casey, git, Junio C Hamano, Jeff King, Shawn Pearce, Eric Sunshine On Wed, Jul 31, 2013 at 2:08 PM, Antoine Pelisse <apelisse@gmail.com> wrote: > On Wed, Jul 31, 2013 at 9:51 PM, Brandon Casey <bcasey@nvidia.com> wrote: >> ----------------------------------------------------------------------------------- >> This email message is for the sole use of the intended recipient(s) and may contain >> confidential information. Any unauthorized review, use, disclosure or distribution >> is prohibited. If you are not the intended recipient, please contact the sender by >> reply email and destroy all copies of the original message. >> ----------------------------------------------------------------------------------- > > I'm certainly not a lawyer, and I'm sorry for not reviewing the > content of the patch instead, but is that not a problem from a legal > point of view ? > I remember a video of Greg Kroah-Hartman where he talked about that > (the video was posted by Junio on G+). Me either thank God. Are those footers even enforceable? I mean, really, if someone mistakenly sends me their corporate financial numbers am I supposed to be under some legal obligation not to share it? I always assumed it was a scare tactic that lawyers like to use. To address the text of the footer, I'd say the "intended recipient(s)" are those on the "to" line which includes git@vger.kernel.org and the implicit use is for inclusion and distribution in the git source code. Anyway, I doubt I would have any influence on getting the footer removed. If Junio would rather me not submit patches with that footer, then I'd try to find a workaround. -Brandon ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] Don't close pack fd when free'ing pack windows 2013-07-31 21:08 ` Antoine Pelisse 2013-07-31 21:21 ` Fredrik Gustafsson 2013-07-31 21:23 ` Brandon Casey @ 2013-07-31 21:28 ` Thomas Rast 2 siblings, 0 replies; 23+ messages in thread From: Thomas Rast @ 2013-07-31 21:28 UTC (permalink / raw) To: Antoine Pelisse Cc: Brandon Casey, git, Junio C Hamano, Jeff King, spearce, Eric Sunshine, Brandon Casey Antoine Pelisse <apelisse@gmail.com> writes: > On Wed, Jul 31, 2013 at 9:51 PM, Brandon Casey <bcasey@nvidia.com> wrote: >> ----------------------------------------------------------------------------------- >> This email message is for the sole use of the intended recipient(s) and may contain >> confidential information. Any unauthorized review, use, disclosure or distribution >> is prohibited. If you are not the intended recipient, please contact the sender by >> reply email and destroy all copies of the original message. >> ----------------------------------------------------------------------------------- > > I'm certainly not a lawyer, and I'm sorry for not reviewing the > content of the patch instead, but is that not a problem from a legal > point of view ? > I remember a video of Greg Kroah-Hartman where he talked about that > (the video was posted by Junio on G+). It's this video: http://www.youtube.com/watch?v=fMeH7wqOwXA The comment starts at 13:55. -- Thomas Rast trast@{inf,student}.ethz.ch ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-07-31 19:51 ` [PATCH v2 1/2] " Brandon Casey 2013-07-31 19:51 ` [PATCH 2/2] Don't close pack fd when free'ing pack windows Brandon Casey @ 2013-08-01 17:12 ` Junio C Hamano 2013-08-01 18:01 ` Brandon Casey 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2013-08-01 17:12 UTC (permalink / raw) To: Brandon Casey; +Cc: git, peff, spearce, sunshine, Brandon Casey Brandon Casey <bcasey@nvidia.com> writes: > If the refs are loose, then upload-pack will read each ref from the > pack (allocating one or more mmap windows) so it can peel tags and > advertise the underlying object. If the refs are packed and peeled, > then upload-pack will use the peeled sha1 in the packed-refs file and > will not need to read from the pack files, so no mmap windows will be > allocated and just like with receive-pack, unuse_one_window() will Even though what it says is not incorrect, the phrasing around here, especially "so it can", confused me in my first reading. It reads objects "in order to" peel and advertise (and as a side-effect it can lead to windows into packs that eventually help relieaving the fd pressure), but a quick scan led me to misread it as "so it can do peel and advertise just fine", which misses the point, because it is not like we are having trouble peeling and advertising. Also, the objects at the tips of refs and the objects they point at may be loose objects, which is very likely for branch tips. The fd pressure will not be relieved in such a case even if these refs were packed. I've tentatively reworded the above section like so: ... If the refs are loose, then upload-pack will read each ref from the object database (if the object is in a pack, allocating one or more mmap windows for it) in order to peel tags and advertise the underlying object. But when the refs are packed and peeled, upload-pack will use the peeled sha1 in the packed-refs file and will not need to read from the pack files, so no mmap windows will be allocated ... > +static int close_one_pack(void) > +{ > + struct packed_git *p, *lru_p = NULL; > + struct pack_window *mru_w = NULL; > + > + for (p = packed_git; p; p = p->next) { > + if (p->pack_fd == -1) > + continue; > + find_lru_pack(p, &lru_p, &mru_w); > + } > + > + if (lru_p) { > + close_pack_windows(lru_p); > + close(lru_p->pack_fd); > + pack_open_fds--; > + lru_p->pack_fd = -1; > + if (lru_p == last_found_pack) > + last_found_pack = NULL; > + return 1; > + } > + > + return 0; > +} OK, so in this codepath where we know we are under fd pressure, we find the pack that is least recently used that can be closed, and use close_pack_windows() to reclaim all of its open windows (if any), which takes care of the accounting for pack_mapped and pack_open_windows, but we need to do the pack_open_fds accounting here ourselves. Makes sense to me. Thanks. > void unuse_pack(struct pack_window **w_cursor) > { > struct pack_window *w = *w_cursor; > @@ -777,7 +838,7 @@ static int open_packed_git_1(struct packed_git *p) > pack_max_fds = 1; > } > > - while (pack_max_fds <= pack_open_fds && unuse_one_window(NULL, -1)) > + while (pack_max_fds <= pack_open_fds && close_one_pack()) > ; /* nothing */ > > p->pack_fd = git_open_noatime(p->pack_name); ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-01 17:12 ` [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure Junio C Hamano @ 2013-08-01 18:01 ` Brandon Casey 2013-08-01 18:39 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Brandon Casey @ 2013-08-01 18:01 UTC (permalink / raw) To: Junio C Hamano Cc: Brandon Casey, git@vger.kernel.org, Jeff King, Shawn Pearce, Eric Sunshine On Thu, Aug 1, 2013 at 10:12 AM, Junio C Hamano <gitster@pobox.com> wrote: > Brandon Casey <bcasey@nvidia.com> writes: > >> If the refs are loose, then upload-pack will read each ref from the >> pack (allocating one or more mmap windows) so it can peel tags and >> advertise the underlying object. If the refs are packed and peeled, >> then upload-pack will use the peeled sha1 in the packed-refs file and >> will not need to read from the pack files, so no mmap windows will be >> allocated and just like with receive-pack, unuse_one_window() will > > Even though what it says is not incorrect, the phrasing around here, > especially "so it can", confused me in my first reading. It reads > objects "in order to" peel and advertise (and as a side-effect it > can lead to windows into packs that eventually help relieaving the > fd pressure), but a quick scan led me to misread it as "so it can do > peel and advertise just fine", which misses the point, because it is > not like we are having trouble peeling and advertising. > > Also, the objects at the tips of refs and the objects they point at > may be loose objects, which is very likely for branch tips. The fd > pressure will not be relieved in such a case even if these refs were > packed. > > I've tentatively reworded the above section like so: > > ... If the refs are loose, then upload-pack will read each ref > from the object database (if the object is in a pack, allocating > one or more mmap windows for it) in order to peel tags and > advertise the underlying object. But when the refs are packed > and peeled, upload-pack will use the peeled sha1 in the > packed-refs file and will not need to read from the pack files, > so no mmap windows will be allocated ... Thanks. >> +static int close_one_pack(void) >> +{ >> + struct packed_git *p, *lru_p = NULL; >> + struct pack_window *mru_w = NULL; >> + >> + for (p = packed_git; p; p = p->next) { >> + if (p->pack_fd == -1) >> + continue; >> + find_lru_pack(p, &lru_p, &mru_w); >> + } >> + >> + if (lru_p) { >> + close_pack_windows(lru_p); >> + close(lru_p->pack_fd); >> + pack_open_fds--; >> + lru_p->pack_fd = -1; >> + if (lru_p == last_found_pack) >> + last_found_pack = NULL; >> + return 1; >> + } >> + >> + return 0; >> +} > > OK, so in this codepath where we know we are under fd pressure, we > find the pack that is least recently used that can be closed, and > use close_pack_windows() to reclaim all of its open windows (if > any), I've been looking closer at uses of p->windows everywhere, and it seems that we always open_packed_git() before we try to create new windows. There doesn't seem to be any reason that we can't continue to use the existing open windows even after closing the pack file. We obviously do this when the window spans the entire file. So, I'm thinking we can drop the close_pack_windows() and refrain from resetting last_found_pack, so the last block will become simply: + if (lru_p) { + close(lru_p->pack_fd); + pack_open_fds--; + lru_p->pack_fd = -1; + return 1; + } If the pack file needs to be reopened later and it has been rewritten in the mean time, open_packed_git_1() should notice when it compares either the file size or the pack's sha1 checksum to what was previously read from the pack index. So this seems safe. If we don't need to close_pack_windows(), find_lru_pack() doesn't strictly need to reject packs that have windows in use. I think the algorithm can be tweaked to prefer to close packs that have no windows in use, but still select them for closing if not. The order of preference would look like: 1. pack with no open windows, oldest mtime 2. pack with oldest MRU window but none in use 3. pack with oldest MRU window > which takes care of the accounting for pack_mapped and > pack_open_windows, but we need to do the pack_open_fds accounting > here ourselves. Makes sense to me. > > Thanks. Sorry about the additional reroll. I'll make the above changes and resubmit. -Brandon ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-01 18:01 ` Brandon Casey @ 2013-08-01 18:39 ` Junio C Hamano 2013-08-01 19:16 ` Brandon Casey 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2013-08-01 18:39 UTC (permalink / raw) To: Brandon Casey Cc: Brandon Casey, git@vger.kernel.org, Jeff King, Shawn Pearce, Eric Sunshine Brandon Casey <drafnel@gmail.com> writes: > I've been looking closer at uses of p->windows everywhere, and it > seems that we always open_packed_git() before we try to create new > windows. There doesn't seem to be any reason that we can't continue > to use the existing open windows even after closing the pack file. > ... > If we don't need to close_pack_windows(), find_lru_pack() doesn't > strictly need to reject packs that have windows in use. That makes me feel somewhat uneasy. Yes, you can open/mmap/close and hold onto the contents of a file still mapped in-core, and it may not count as "open filedescriptor", but do OSes allow infinite such mmapped regions to us? We do keep track of number of open windows, but is there a way for us to learn how close we are to the limit? ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-01 18:39 ` Junio C Hamano @ 2013-08-01 19:16 ` Brandon Casey 2013-08-01 19:23 ` Brandon Casey 2013-08-01 20:02 ` Junio C Hamano 0 siblings, 2 replies; 23+ messages in thread From: Brandon Casey @ 2013-08-01 19:16 UTC (permalink / raw) To: Junio C Hamano Cc: Brandon Casey, git@vger.kernel.org, Jeff King, Shawn Pearce, Eric Sunshine On Thu, Aug 1, 2013 at 11:39 AM, Junio C Hamano <gitster@pobox.com> wrote: > Brandon Casey <drafnel@gmail.com> writes: > >> I've been looking closer at uses of p->windows everywhere, and it >> seems that we always open_packed_git() before we try to create new >> windows. There doesn't seem to be any reason that we can't continue >> to use the existing open windows even after closing the pack file. >> ... >> If we don't need to close_pack_windows(), find_lru_pack() doesn't >> strictly need to reject packs that have windows in use. > > That makes me feel somewhat uneasy. Yes, you can open/mmap/close > and hold onto the contents of a file still mapped in-core, and it > may not count as "open filedescriptor", but do OSes allow infinite > such mmapped regions to us? We do keep track of number of open > windows, but is there a way for us to learn how close we are to the > limit? Not that I know of, but xmmap() does already try to unmap existing windows when mmap() fails, and then retries the mmap. It calls release_pack_memory() which calls unuse_one_window(). mmap returns ENOMEM when either there is no available memory or if the limit of mmap mappings has been exceeded. So, I think we'll be ok. It's the same situation we'd be in if there were many large packs (but fewer than pack_max_fds) and a small packedGitWindowSize, requiring many mmap windows. We'd try to map an additional segment, fail, release some unused segments, and retry. The memory usage of all mmap segments would still be bounded by packedGitLimit. It's just that now, when we're only under file descriptor pressure, we won't close the mmap windows unnecessarily when they may be needed again. -Brandon ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-01 19:16 ` Brandon Casey @ 2013-08-01 19:23 ` Brandon Casey 2013-08-01 20:02 ` Junio C Hamano 1 sibling, 0 replies; 23+ messages in thread From: Brandon Casey @ 2013-08-01 19:23 UTC (permalink / raw) To: Junio C Hamano Cc: Brandon Casey, git@vger.kernel.org, Jeff King, Shawn Pearce, Eric Sunshine On Thu, Aug 1, 2013 at 12:16 PM, Brandon Casey <drafnel@gmail.com> wrote: > On Thu, Aug 1, 2013 at 11:39 AM, Junio C Hamano <gitster@pobox.com> wrote: >> Brandon Casey <drafnel@gmail.com> writes: >> >>> I've been looking closer at uses of p->windows everywhere, and it >>> seems that we always open_packed_git() before we try to create new >>> windows. There doesn't seem to be any reason that we can't continue >>> to use the existing open windows even after closing the pack file. >>> ... >>> If we don't need to close_pack_windows(), find_lru_pack() doesn't >>> strictly need to reject packs that have windows in use. >> >> That makes me feel somewhat uneasy. Yes, you can open/mmap/close >> and hold onto the contents of a file still mapped in-core, and it >> may not count as "open filedescriptor", but do OSes allow infinite >> such mmapped regions to us? We do keep track of number of open >> windows, but is there a way for us to learn how close we are to the >> limit? > > Not that I know of, but xmmap() does already try to unmap existing > windows when mmap() fails, and then retries the mmap. It calls > release_pack_memory() which calls unuse_one_window(). mmap returns > ENOMEM when either there is no available memory or if the limit of > mmap mappings has been exceeded. > > So, I think we'll be ok. It's the same situation we'd be in if there > were many large packs (but fewer than pack_max_fds) and a small > packedGitWindowSize, requiring many mmap windows. We'd try to map an > additional segment, fail, release some unused segments, and retry. Also, it's the same situation we'd be in if there were many small packs that were smaller than packedGitWindowSize. We'd mmap the entire pack file into memory and then close the file descriptor, allowing us to have many more pack files mapped into memory than pack_max_fds would allow us to have open. With enough small packs, we'd eventually reach the mmap limit and xmmap would try to release some mappings. -Brandon ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-01 19:16 ` Brandon Casey 2013-08-01 19:23 ` Brandon Casey @ 2013-08-01 20:02 ` Junio C Hamano 2013-08-01 20:37 ` Brandon Casey 1 sibling, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2013-08-01 20:02 UTC (permalink / raw) To: Brandon Casey Cc: Brandon Casey, git@vger.kernel.org, Jeff King, Shawn Pearce, Eric Sunshine Brandon Casey <drafnel@gmail.com> writes: > On Thu, Aug 1, 2013 at 11:39 AM, Junio C Hamano <gitster@pobox.com> wrote: >> That makes me feel somewhat uneasy. Yes, you can open/mmap/close >> and hold onto the contents of a file still mapped in-core, and it >> may not count as "open filedescriptor", but do OSes allow infinite >> such mmapped regions to us? We do keep track of number of open >> windows, but is there a way for us to learn how close we are to the >> limit? > > Not that I know of, but xmmap() does already try to unmap existing > windows when mmap() fails, and then retries the mmap. It calls > release_pack_memory() which calls unuse_one_window(). mmap returns > ENOMEM when either there is no available memory or if the limit of > mmap mappings has been exceeded. OK, so if there were such an OS limit, the unuse_one_window() will hopefully reduce the number of open windows and as a side effect we may go below that limit. What I was worried about was if there was a limit on the number of files we have windows into (i.e. having one window each in N files, with fds all closed, somehow costs more than having N window in one file with the fd closed). We currently have knobs for total number of windows and number of open fds consumed for packs, and the latter indirectly controls the number of active packfiles we have windows into. Your proposed change will essentially make the number of active packfiles unlimited by any of our knobs, and that was where my uneasiness was coming from. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-01 20:02 ` Junio C Hamano @ 2013-08-01 20:37 ` Brandon Casey 2013-08-02 5:36 ` [PATCH v3] " Brandon Casey 0 siblings, 1 reply; 23+ messages in thread From: Brandon Casey @ 2013-08-01 20:37 UTC (permalink / raw) To: Junio C Hamano Cc: Brandon Casey, git@vger.kernel.org, Jeff King, Shawn Pearce, Eric Sunshine On Thu, Aug 1, 2013 at 1:02 PM, Junio C Hamano <gitster@pobox.com> wrote: > Brandon Casey <drafnel@gmail.com> writes: > >> On Thu, Aug 1, 2013 at 11:39 AM, Junio C Hamano <gitster@pobox.com> wrote: >>> That makes me feel somewhat uneasy. Yes, you can open/mmap/close >>> and hold onto the contents of a file still mapped in-core, and it >>> may not count as "open filedescriptor", but do OSes allow infinite >>> such mmapped regions to us? We do keep track of number of open >>> windows, but is there a way for us to learn how close we are to the >>> limit? >> >> Not that I know of, but xmmap() does already try to unmap existing >> windows when mmap() fails, and then retries the mmap. It calls >> release_pack_memory() which calls unuse_one_window(). mmap returns >> ENOMEM when either there is no available memory or if the limit of >> mmap mappings has been exceeded. > > OK, so if there were such an OS limit, the unuse_one_window() will > hopefully reduce the number of open windows and as a side effect we > may go below that limit. > > What I was worried about was if there was a limit on the number of > files we have windows into (i.e. having one window each in N files, > with fds all closed, somehow costs more than having N window in one > file with the fd closed). Ah, yeah, I've never heard of that type of limit and I do not know if there is one. If there is such a limit, like you said unuse_one_window() will _hopefully_ release enough windows to reduce the number of packs we have windows into, but it is certainly not guaranteed. > We currently have knobs for total number > of windows and number of open fds consumed for packs, and the latter > indirectly controls the number of active packfiles we have windows > into. Your proposed change will essentially make the number of > active packfiles unlimited by any of our knobs, and that was where > my uneasiness was coming from. Yes and no. The limit on the number of open fds used for packs only indirectly controls the number of active packfiles we have windows into for the packs that are larger than packedGitWindowSize. For pack files smaller than packedGitWindowSize, the number was unlimited too since we close the file descriptor if the whole pack fits within one window. -Brandon ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v3] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-01 20:37 ` Brandon Casey @ 2013-08-02 5:36 ` Brandon Casey 2013-08-02 16:26 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Brandon Casey @ 2013-08-02 5:36 UTC (permalink / raw) To: gitster; +Cc: git, peff, spearce, sunshine, bcasey, Brandon Casey When the number of open packs exceeds pack_max_fds, unuse_one_window() is called repeatedly to attempt to release the least-recently-used pack windows, which, as a side-effect, will also close a pack file after closing its last open window. If a pack file has been opened, but no windows have been allocated into it, it will never be selected by unuse_one_window() and hence its file descriptor will not be closed. When this happens, git may exceed the number of file descriptors permitted by the system. This latter situation can occur in show-ref or receive-pack during ref advertisement. During ref advertisement, receive-pack will iterate over every ref in the repository and advertise it to the client after ensuring that the ref exists in the local repository. If the ref is located inside a pack, then the pack is opened to ensure that it exists, but since the object is not actually read from the pack, no mmap windows are allocated. When the number of open packs exceeds pack_max_fds, unuse_one_window() will not be able to find any windows to free and will not be able to close any packs. Once the per-process file descriptor limit is exceeded, receive-pack will produce a warning, not an error, for each pack it cannot open, and will then most likely fail with an error to spawn rev-list or index-pack like: error: cannot create standard input pipe for rev-list: Too many open files error: Could not run 'git rev-list' This may also occur during upload-pack when refs are packed (in the packed-refs file) and the number of packs that must be opened to verify that these packed refs exist exceeds the file descriptor limit. If the refs are loose, then upload-pack will read each ref from the object database (if the object is in a pack, allocating one or more mmap windows for it) in order to peel tags and advertise the underlying object. But when the refs are packed and peeled, upload-pack will use the peeled sha1 in the packed-refs file and will not need to read from the pack files, so no mmap windows will be allocated and just like with receive-pack, unuse_one_window() will never select these opened packs to close. When we have file descriptor pressure, we just need to find an open pack to close. We can leave the existing mmap windows open. If additional windows need to be mapped into the pack file, it will be reopened when necessary. If the pack file has been rewritten in the mean time, open_packed_git_1() should notice when it compares the file size or the pack's sha1 checksum to what was previously read from the pack index, and reject it. Let's introduce a new function close_one_pack() designed specifically for this purpose to search for and close the least-recently-used pack, where LRU is defined as (in order of preference): * pack with oldest mtime and no allocated mmap windows * pack with the least-recently-used windows, i.e. the pack with the oldest most-recently-used window, where none of the windows are in use * pack with the least-recently-used windows Signed-off-by: Brandon Casey <drafnel@gmail.com> --- Here's the version that leaves the mmap windows open after closing the pack file descriptor. -Brandon sha1_file.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 78 insertions(+), 1 deletion(-) diff --git a/sha1_file.c b/sha1_file.c index 40b2329..263cf71 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -673,6 +673,83 @@ void close_pack_windows(struct packed_git *p) } } +/* + * The LRU pack is the one with the oldest MRU window, preferring packs + * with no used windows, or the oldest mtime if it has no windows allocated. + */ +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse) +{ + struct pack_window *w, *this_mru_w; + int has_windows_inuse = 0; + + /* + * Reject this pack if it has windows and the previously selected + * one does not. If this pack does not have windows, reject + * it if the pack file is newer than the previously selected one. + */ + if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime)) + return; + + for (w = this_mru_w = p->windows; w; w = w->next) { + /* + * Reject this pack if any of its windows are in use, + * but the previously selected pack did not have any + * inuse windows. Otherwise, record that this pack + * has windows in use. + */ + if (w->inuse_cnt) { + if (*accept_windows_inuse) + has_windows_inuse = 1; + else + return; + } + + if (w->last_used > this_mru_w->last_used) + this_mru_w = w; + + /* + * Reject this pack if it has windows that have been + * used more recently than the previously selected pack. + * If the previously selected pack had windows inuse and + * we have not encountered a window in this pack that is + * inuse, skip this check since we prefer a pack with no + * inuse windows to one that has inuse windows. + */ + if (*mru_w && *accept_windows_inuse == has_windows_inuse && + this_mru_w->last_used > (*mru_w)->last_used) + return; + } + + /* + * Select this pack. + */ + *mru_w = this_mru_w; + *lru_p = p; + *accept_windows_inuse = has_windows_inuse; +} + +static int close_one_pack(void) +{ + struct packed_git *p, *lru_p = NULL; + struct pack_window *mru_w = NULL; + int accept_windows_inuse = 1; + + for (p = packed_git; p; p = p->next) { + if (p->pack_fd == -1) + continue; + find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse); + } + + if (lru_p) { + close(lru_p->pack_fd); + pack_open_fds--; + lru_p->pack_fd = -1; + return 1; + } + + return 0; +} + void unuse_pack(struct pack_window **w_cursor) { struct pack_window *w = *w_cursor; @@ -768,7 +845,7 @@ static int open_packed_git_1(struct packed_git *p) pack_max_fds = 1; } - while (pack_max_fds <= pack_open_fds && unuse_one_window(NULL, -1)) + while (pack_max_fds <= pack_open_fds && close_one_pack()) ; /* nothing */ p->pack_fd = git_open_noatime(p->pack_name); -- 1.8.1.1.252.gdb33759 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v3] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-02 5:36 ` [PATCH v3] " Brandon Casey @ 2013-08-02 16:26 ` Junio C Hamano 2013-08-02 17:12 ` Brandon Casey 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2013-08-02 16:26 UTC (permalink / raw) To: Brandon Casey; +Cc: git, peff, spearce, sunshine, bcasey Brandon Casey <drafnel@gmail.com> writes: > +/* > + * The LRU pack is the one with the oldest MRU window, preferring packs > + * with no used windows, or the oldest mtime if it has no windows allocated. > + */ > +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse) > +{ > + struct pack_window *w, *this_mru_w; > + int has_windows_inuse = 0; > + > + /* > + * Reject this pack if it has windows and the previously selected > + * one does not. If this pack does not have windows, reject > + * it if the pack file is newer than the previously selected one. > + */ > + if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime)) > + return; > + > + for (w = this_mru_w = p->windows; w; w = w->next) { > + /* > + * Reject this pack if any of its windows are in use, > + * but the previously selected pack did not have any > + * inuse windows. Otherwise, record that this pack > + * has windows in use. > + */ > + if (w->inuse_cnt) { > + if (*accept_windows_inuse) > + has_windows_inuse = 1; > + else > + return; > + } > + > + if (w->last_used > this_mru_w->last_used) > + this_mru_w = w; > + > + /* > + * Reject this pack if it has windows that have been > + * used more recently than the previously selected pack. > + * If the previously selected pack had windows inuse and > + * we have not encountered a window in this pack that is > + * inuse, skip this check since we prefer a pack with no > + * inuse windows to one that has inuse windows. > + */ > + if (*mru_w && *accept_windows_inuse == has_windows_inuse && > + this_mru_w->last_used > (*mru_w)->last_used) > + return; The "*accept_windows_inuse == has_windows_inuse" part is hard to grok, together with the fact that this statement is evaluated for each and every "w", even though it is about this_mru_w and that variable is not updated in every iteration of the loop. Can you clarify/simplify this part of the code a bit more? For example, would the above be equivalent to this? if (w->last_used < this_mru_w->last_used) continue; this_mru_w = w; if (has_windows_inuse && *mru_w && w->last_used > (*mru_w)->last_used) return; That is, if we already know a more recently used window in this pack, we do not have to do anything to maintain mru_w. Otherwise, remember that this window is the most recently used one in this pack, and if it is newer than the newest one from the pack we are going to pick, we refrain from picking this pack. But we do not reject ourselves if we haven't seen a window that is in use (yet). > + } > + > + /* > + * Select this pack. > + */ > + *mru_w = this_mru_w; > + *lru_p = p; > + *accept_windows_inuse = has_windows_inuse; > +} > + > +static int close_one_pack(void) > +{ > + struct packed_git *p, *lru_p = NULL; > + struct pack_window *mru_w = NULL; > + int accept_windows_inuse = 1; > + > + for (p = packed_git; p; p = p->next) { > + if (p->pack_fd == -1) > + continue; > + find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse); > + } > + > + if (lru_p) { > + close(lru_p->pack_fd); > + pack_open_fds--; > + lru_p->pack_fd = -1; > + return 1; > + } > + > + return 0; > +} > + > void unuse_pack(struct pack_window **w_cursor) > { > struct pack_window *w = *w_cursor; > @@ -768,7 +845,7 @@ static int open_packed_git_1(struct packed_git *p) > pack_max_fds = 1; > } > > - while (pack_max_fds <= pack_open_fds && unuse_one_window(NULL, -1)) > + while (pack_max_fds <= pack_open_fds && close_one_pack()) > ; /* nothing */ > > p->pack_fd = git_open_noatime(p->pack_name); ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3] sha1_file: introduce close_one_pack() to close packs on fd pressure 2013-08-02 16:26 ` Junio C Hamano @ 2013-08-02 17:12 ` Brandon Casey 0 siblings, 0 replies; 23+ messages in thread From: Brandon Casey @ 2013-08-02 17:12 UTC (permalink / raw) To: Junio C Hamano Cc: git@vger.kernel.org, Jeff King, Shawn Pearce, Eric Sunshine, Brandon Casey On Fri, Aug 2, 2013 at 9:26 AM, Junio C Hamano <gitster@pobox.com> wrote: > Brandon Casey <drafnel@gmail.com> writes: > >> +/* >> + * The LRU pack is the one with the oldest MRU window, preferring packs >> + * with no used windows, or the oldest mtime if it has no windows allocated. >> + */ >> +static void find_lru_pack(struct packed_git *p, struct packed_git **lru_p, struct pack_window **mru_w, int *accept_windows_inuse) >> +{ >> + struct pack_window *w, *this_mru_w; >> + int has_windows_inuse = 0; >> + >> + /* >> + * Reject this pack if it has windows and the previously selected >> + * one does not. If this pack does not have windows, reject >> + * it if the pack file is newer than the previously selected one. >> + */ >> + if (*lru_p && !*mru_w && (p->windows || p->mtime > (*lru_p)->mtime)) >> + return; >> + >> + for (w = this_mru_w = p->windows; w; w = w->next) { >> + /* >> + * Reject this pack if any of its windows are in use, >> + * but the previously selected pack did not have any >> + * inuse windows. Otherwise, record that this pack >> + * has windows in use. >> + */ >> + if (w->inuse_cnt) { >> + if (*accept_windows_inuse) >> + has_windows_inuse = 1; >> + else >> + return; >> + } >> + >> + if (w->last_used > this_mru_w->last_used) >> + this_mru_w = w; >> + >> + /* >> + * Reject this pack if it has windows that have been >> + * used more recently than the previously selected pack. >> + * If the previously selected pack had windows inuse and >> + * we have not encountered a window in this pack that is >> + * inuse, skip this check since we prefer a pack with no >> + * inuse windows to one that has inuse windows. >> + */ >> + if (*mru_w && *accept_windows_inuse == has_windows_inuse && >> + this_mru_w->last_used > (*mru_w)->last_used) >> + return; > > The "*accept_windows_inuse == has_windows_inuse" part is hard to > grok, together with the fact that this statement is evaluated for > each and every "w", even though it is about this_mru_w and that > variable is not updated in every iteration of the loop. Can you > clarify/simplify this part of the code a bit more? > > For example, would the above be equivalent to this? > > if (w->last_used < this_mru_w->last_used) > continue; > > this_mru_w = w; > if (has_windows_inuse && *mru_w && > w->last_used > (*mru_w)->last_used) > return; > > That is, if we already know a more recently used window in this > pack, we do not have to do anything to maintain mru_w. Otherwise, > remember that this window is the most recently used one in this > pack, and if it is newer than the newest one from the pack we are > going to pick, we refrain from picking this pack. > > But we do not reject ourselves if we haven't seen a window that is > in use (yet). No that wouldn't be the same. The function of "*accept_windows_inuse == has_windows_inuse" and the testing of this_mru_w in every loop rather than w, is too subtle. I tried to draw attention to it in the comment, but I agree it's not enough. The case that your example would not catch is when the new pack's mru window has already been found, but has_windows_inuse is not set until later. When has_windows_inuse is later set, we need to test this_mru_w regardless of whether we have just assigned it. For example, if mru_w points to a pack with last_used == 11 and *accept_windows_inuse = 1, and p->windows looks like this: last_used in_use 12 0 10 1 Then the first time through the loop, this_mru_w would be set to the first window with last_used equal to 12. The if statement that tests "this_mru_w->last_used > (*mru_w)->last_used" would be skipped since has_windows_inuse would still be 0. The second time through the loop, this_mru_w would _not_ be reset, but has_windows_inuse _would_ be set. This time, we would want to enter the last for loop so that we can reject the pack. I'll try to rework this loop or add comments to clarify. -Brandon >> + } >> + >> + /* >> + * Select this pack. >> + */ >> + *mru_w = this_mru_w; >> + *lru_p = p; >> + *accept_windows_inuse = has_windows_inuse; >> +} >> + >> +static int close_one_pack(void) >> +{ >> + struct packed_git *p, *lru_p = NULL; >> + struct pack_window *mru_w = NULL; >> + int accept_windows_inuse = 1; >> + >> + for (p = packed_git; p; p = p->next) { >> + if (p->pack_fd == -1) >> + continue; >> + find_lru_pack(p, &lru_p, &mru_w, &accept_windows_inuse); >> + } >> + >> + if (lru_p) { >> + close(lru_p->pack_fd); >> + pack_open_fds--; >> + lru_p->pack_fd = -1; >> + return 1; >> + } >> + >> + return 0; >> +} >> + >> void unuse_pack(struct pack_window **w_cursor) >> { >> struct pack_window *w = *w_cursor; >> @@ -768,7 +845,7 @@ static int open_packed_git_1(struct packed_git *p) >> pack_max_fds = 1; >> } >> >> - while (pack_max_fds <= pack_open_fds && unuse_one_window(NULL, -1)) >> + while (pack_max_fds <= pack_open_fds && close_one_pack()) >> ; /* nothing */ >> >> p->pack_fd = git_open_noatime(p->pack_name); ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2013-08-02 17:12 UTC | newest] Thread overview: 23+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-07-30 4:05 [PATCH] sha1_file: introduce close_one_pack() to close packs on fd pressure Brandon Casey 2013-07-30 7:51 ` Eric Sunshine 2013-07-30 15:39 ` Junio C Hamano 2013-07-30 19:52 ` Jeff King 2013-07-30 22:59 ` Brandon Casey 2013-07-31 19:51 ` [PATCH v2 1/2] " Brandon Casey 2013-07-31 19:51 ` [PATCH 2/2] Don't close pack fd when free'ing pack windows Brandon Casey 2013-07-31 21:08 ` Antoine Pelisse 2013-07-31 21:21 ` Fredrik Gustafsson 2013-07-31 21:31 ` Brandon Casey 2013-07-31 21:44 ` Fredrik Gustafsson 2013-07-31 21:23 ` Brandon Casey 2013-07-31 21:28 ` Thomas Rast 2013-08-01 17:12 ` [PATCH v2 1/2] sha1_file: introduce close_one_pack() to close packs on fd pressure Junio C Hamano 2013-08-01 18:01 ` Brandon Casey 2013-08-01 18:39 ` Junio C Hamano 2013-08-01 19:16 ` Brandon Casey 2013-08-01 19:23 ` Brandon Casey 2013-08-01 20:02 ` Junio C Hamano 2013-08-01 20:37 ` Brandon Casey 2013-08-02 5:36 ` [PATCH v3] " Brandon Casey 2013-08-02 16:26 ` Junio C Hamano 2013-08-02 17:12 ` Brandon Casey
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).