Patch #10470
Closed
Efficiently process new git revisions in a single batch
100%
Description
As noted in #8857, I am opening a new issue with patches that make processing new revisions with Git more efficient. I'm including the note that introduced the first pass at these patches below.
Here are two patches that apply cleanly to the trunk at revision r9240. The first modifies the usage of git-log to pass all revision arguments via stdin rather than on the command line. This patch can be applied without the second patch if desired, as it should not change the functionality exposed by the revisions method that uses git-log. Doing this prepares the way for passing large numbers of revisions to git-log without exceeding the command-line length limit.
However, in order to support this new behavior, the shellout method had to be slightly modified so that the write end of the pipe it creates is left open upon request. That change could potentially affect other consumers of that method, but I doubt it will. Still, running the full SCM test suite would be a good idea, just in case. I only had time to test git functionality myself.
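For illustration only (this is not the code in the attached patches), here is a minimal Ruby sketch of feeding revision arguments to a command over stdin. The helper names `revision_stdin` and `capture_with_stdin` are hypothetical; the sketch uses `Open3.capture2` from the standard library rather than Redmine's shellout method, and it assumes `git log --stdin`, which reads revision arguments from standard input.

```ruby
require 'open3'

# Build the stdin payload: one revision argument per line.
# Excluded revisions use git's "^<rev>" syntax.
def revision_stdin(includes, excludes = [])
  (includes + excludes.map { |rev| "^#{rev}" }).join("\n") + "\n"
end

# Run a command (given as an array, so no shell is involved), write
# `input` to its stdin, close the write end, and return its stdout.
def capture_with_stdin(cmd, input)
  out, status = Open3.capture2(*cmd, stdin_data: input)
  raise "command failed: #{cmd.join(' ')}" unless status.success?
  out
end

# Hypothetical usage (needs git and a repository at repo_path):
#   capture_with_stdin(
#     ['git', '--git-dir', repo_path, 'log', '--stdin', '--format=%H'],
#     revision_stdin(new_heads, old_heads)
#   )
```

Because the revisions travel through a pipe instead of the argument list, the number of revisions is no longer bounded by the operating system's command-line length limit.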
The second patch builds upon the first. It processes, in a single batch, all revisions newly introduced since the last time the repository was processed. Each revision in the batch is processed exactly once. Disjoint branch histories and branch rewrites are supported.
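To make the batch idea concrete, here is a small Ruby sketch of the set being computed: every commit reachable from the current heads but not from the previously processed heads. This is only a model under stated assumptions, not the patch itself, which leaves the work to git-log. The parent map and the names `reachable` and `new_revisions` are hypothetical.

```ruby
require 'set'

# Collect every commit reachable from `heads`, given a parent map
# of the form { sha => [parent shas] }.
def reachable(parents, heads)
  seen = Set.new
  stack = heads.dup
  until stack.empty?
    sha = stack.pop
    next unless seen.add?(sha) # add? returns nil if already visited
    stack.concat(parents.fetch(sha, []))
  end
  seen
end

# Commits that are new since the last run. Using a set means each
# commit appears once, even when several branches share it.
def new_revisions(parents, old_heads, new_heads)
  reachable(parents, new_heads) - reachable(parents, old_heads)
end
```

In this model a rewritten branch simply shows up as a new head whose unshared commits are new, and a disjoint history contributes all of its commits, since none of them are reachable from the old heads.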
All processing, including updating the last processed heads, occurs within a single transaction in order to ensure integrity of the data in case of concurrent attempts to update the repository. This transaction could potentially block updates for other repositories hosted in the same Redmine instance; however, normal operation of git repositories should rarely introduce so many new revisions as to hold this transaction open for very long. An initial import of a large repository on the order of thousands of commits would likely be the only realistic operation that could be a problem. Given how infrequent that is, it should be enough to document that such an import be scheduled for server downtime.
Importantly, due to the resistance toward introducing a migration in my first patch set for #8857, this patch does not include any migrations. A little extra processing is required to maintain the branch name to head revision relationship for every transaction, but this should be negligible. I'll happily provide another patch on top of this one that does this head processing in a cleaner way but requires a migration. Just let me know if you would take it.
These patches have been updated since they were submitted for #8857. They have been confirmed to work with the following Rubies:
- ruby 1.8.7 (2011-06-30 patchlevel 352) [x86_64-linux]
- ruby 1.9.2p290 (2011-07-09 revision 32553) [x86_64-linux]
- ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-linux]
- ruby 1.8.7 (2012-02-08 patchlevel 358) [i386-mingw32]
- ruby 1.9.2p290 (2011-07-09) [i386-mingw32]
- ruby 1.9.3p0 (2011-10-30) [i386-mingw32]
Updated by Toshi MARUYAMA about 13 years ago
- Target version set to 1.4.0
- % Done changed from 0 to 100