Commit 959b5455 authored by Heiko Voigt, committed by Junio C Hamano

submodule: implement a config API for lookup of .gitmodules values

In a superproject some commands need to interact with submodules. They
need to query values from the .gitmodules file either from the worktree
or from certain revisions. At the moment this is quite hard since a
caller would need to read the .gitmodules file from the history and then
parse the values. We want to provide an API for this so we have one
place to get values from .gitmodules from any revision (including the

The API is realized as a cache which allows us to lazily read
.gitmodules configurations by commit into a runtime cache which can then
be used to easily lookup values from it. Currently only the values for
path or name are stored but it can be extended for any value needed.

It is expected that .gitmodules files do not change often between
commits. That's why we look up the .gitmodules sha1 from a commit and then
either look up an already parsed configuration or parse and cache an
unknown one for each sha1. The cache is lazily built on demand for each
requested commit.
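
The lookup-or-parse step described above can be sketched in isolation. This is a minimal illustration, not the code from this commit: the names `parsed_gitmodules`, `lookup_or_parse()`, `cache_free()` and `parse_count` are hypothetical, parsing is replaced by a counter, and a linked list stands in for whatever container the real cache uses (the header in this commit includes hashmap.h).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLOB_ID_LEN 20

/* Hypothetical cache entry: one parsed .gitmodules blob, keyed by blob id. */
struct parsed_gitmodules {
	unsigned char blob_id[BLOB_ID_LEN];
	struct parsed_gitmodules *next;
};

static struct parsed_gitmodules *cache_head;
static int parse_count; /* counts how often "parsing" actually happened */

/*
 * Return the cached entry for blob_id. If the blob has not been seen
 * yet, "parse" it (here: only bump a counter) and add it to the cache.
 */
static struct parsed_gitmodules *lookup_or_parse(const unsigned char *blob_id)
{
	struct parsed_gitmodules *entry;

	assert(blob_id != NULL);

	for (entry = cache_head; entry; entry = entry->next)
		if (!memcmp(entry->blob_id, blob_id, BLOB_ID_LEN))
			return entry; /* cache hit: no parsing needed */

	entry = calloc(1, sizeof(*entry));
	if (!entry)
		return NULL;
	memcpy(entry->blob_id, blob_id, BLOB_ID_LEN);
	parse_count++; /* stand-in for reading and parsing the blob */
	entry->next = cache_head;
	cache_head = entry;
	return entry;
}

/* Drop every cached entry. */
static void cache_free(void)
{
	while (cache_head) {
		struct parsed_gitmodules *next = cache_head->next;
		free(cache_head);
		cache_head = next;
	}
}
```

Two commits whose .gitmodules files share a blob id resolve to the same entry, so the second lookup costs only a search of the cache.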

This cache can be used for all purposes which need knowledge about
submodule configurations. Example use cases are:

 * Recursive submodule checkout needs to lookup a submodule name from
   its path when a submodule first appears. This needs to be done before
   this configuration exists in the worktree.

 * The implementation of submodule support for 'git archive' needs to
   lookup the submodule name to generate the archive when given a
   revision that is not checked out.

 * 'git fetch' when given the --recurse-submodules=on-demand option (or
   configuration) needs to lookup submodule names by path from the
   database rather than reading from the worktree. For new submodules it
   needs to lookup the name from its path to allow cloning new
   submodules into the .git folder so they can be checked out without
   any network interaction when the user does a checkout of that
Signed-off-by: Heiko Voigt <>
Signed-off-by: Stefan Beller <>
Signed-off-by: Junio C Hamano <>
parent f86f31ab
@@ -204,6 +204,7 @@
 /test-sha1-array
 /test-sigchain
 /test-string-list
 /test-subprocess
 /test-svn-fe
 /test-urlmatch-normalization
submodule config cache API

The submodule config cache API allows reading submodule
configurations/information from specified revisions. Internally
information is lazily read into a cache that is used to avoid
unnecessary parsing of the same .gitmodules files. Lookups can be done by
submodule path or name.

The caller can look up information about submodules by using the
`submodule_from_path()` or `submodule_from_name()` functions. They return
a `struct submodule` which contains the values. The API automatically
initializes and allocates the needed infrastructure on-demand.

If the internal cache might grow too big or when the caller is done with
the API, all internally cached values can be freed with `submodule_free()`.

Data Structures
`struct submodule`::
This structure is used to return the information about one
submodule for a certain revision. It is returned by the lookup
`void submodule_free()`::
Use this to free the internally cached values.
`const struct submodule *submodule_from_path(const unsigned char *commit_sha1, const char *path)`::
Lookup values for one submodule by its commit_sha1 and path.
`const struct submodule *submodule_from_name(const unsigned char *commit_sha1, const char *name)`::
The same as above but lookup by name.
For an example usage see test-submodule-config.c.
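
The caller-side pattern (look up, check for NULL, read the fields) might look like the sketch below. It is written to compile on its own, so `submodule_from_path()` here is a stand-in that answers from a fixed table and ignores the commit; the table contents and the helper `format_submodule_name()` are assumptions made for illustration. Only `struct submodule` and the function signature are taken from this commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Layout as shown in submodule-config.h in this commit. */
struct submodule {
	const char *path;
	const char *name;
	const char *url;
	int fetch_recurse;
	const char *ignore;
	unsigned char gitmodules_sha1[20];
};

/* Made-up demo data; not read from any repository. */
static const struct submodule demo_table[] = {
	{ "b", "a", "../submodule", 0, NULL, {0} },
	{ "submodule", "submodule", "../submodule", 0, NULL, {0} },
};

/*
 * STAND-IN for the real submodule_from_path(): searches demo_table and
 * ignores commit_sha1. It exists only so this sketch builds without git.
 */
static const struct submodule *submodule_from_path(const unsigned char *commit_sha1,
						   const char *path)
{
	size_t i;

	(void)commit_sha1;
	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++)
		if (!strcmp(demo_table[i].path, path))
			return &demo_table[i];
	return NULL;
}

/*
 * Hypothetical helper: look up a submodule by path and format its name
 * into out. Returns 0 on success, -1 if no submodule matches.
 */
static int format_submodule_name(const unsigned char *commit_sha1,
				 const char *path, char *out, size_t outlen)
{
	const struct submodule *sub;

	assert(out != NULL && outlen > 0);

	sub = submodule_from_path(commit_sha1, path);
	if (!sub)
		return -1; /* the lookup functions return NULL when nothing matches */
	snprintf(out, outlen, "Submodule name: '%s' for path '%s'",
		 sub->name, sub->path);
	return 0;
}
```

The returned pointer is `const`, and the storage belongs to the cache, so a caller would read the fields rather than free or modify them.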
@@ -594,6 +594,7 @@ TEST_PROGRAMS_NEED_X += test-sha1
 TEST_PROGRAMS_NEED_X += test-sha1-array
 TEST_PROGRAMS_NEED_X += test-sigchain
 TEST_PROGRAMS_NEED_X += test-string-list
+TEST_PROGRAMS_NEED_X += test-submodule-config
 TEST_PROGRAMS_NEED_X += test-subprocess
 TEST_PROGRAMS_NEED_X += test-svn-fe
 TEST_PROGRAMS_NEED_X += test-urlmatch-normalization
@@ -784,6 +785,7 @@ LIB_OBJS += strbuf.o
 LIB_OBJS += streaming.o
 LIB_OBJS += string-list.o
 LIB_OBJS += submodule.o
+LIB_OBJS += submodule-config.o
 LIB_OBJS += symlinks.o
 LIB_OBJS += tag.o
 LIB_OBJS += trace.o
#include "hashmap.h"
#include "strbuf.h"
/*
 * Submodule entry containing the information about a certain submodule
 * in a certain revision.
 */
struct submodule {
	const char *path;
	const char *name;
	const char *url;
	int fetch_recurse;
	const char *ignore;
	/* the sha1 blob id of the responsible .gitmodules file */
	unsigned char gitmodules_sha1[20];
};

const struct submodule *submodule_from_name(const unsigned char *commit_sha1,
		const char *name);
const struct submodule *submodule_from_path(const unsigned char *commit_sha1,
		const char *path);
void submodule_free(void);
@@ -355,6 +355,7 @@ int parse_fetch_recurse_submodules_arg(const char *opt, const char *arg)
 default:
 if (!strcmp(arg, "on-demand"))
+/* TODO: remove the die for history parsing here */
 die("bad %s argument: %s", opt, arg);
 }
 }
@@ -5,6 +5,7 @@ struct diff_options;
 struct argv_array;
 enum {
# Copyright (c) 2014 Heiko Voigt
test_description='Test submodules config cache infrastructure

This test verifies that parsing .gitmodules configuration directly
from the database works.
'
. ./
test_expect_success 'submodule config cache setup' '
	mkdir submodule &&
	(cd submodule &&
		git init &&
		echo a >a &&
		git add . &&
		git commit -ma
	) &&
	mkdir super &&
	(cd super &&
		git init &&
		git submodule add ../submodule &&
		git submodule add ../submodule a &&
		git commit -m "add as submodule and as a" &&
		git mv a b &&
		git commit -m "move a to b"
	)
'

cat >super/expect <<EOF
Submodule name: 'a' for path 'a'
Submodule name: 'a' for path 'b'
Submodule name: 'submodule' for path 'submodule'
Submodule name: 'submodule' for path 'submodule'
EOF

test_expect_success 'test parsing and lookup of submodule config by path' '
	(cd super &&
		test-submodule-config \
			HEAD^ a \
			HEAD b \
			HEAD^ submodule \
			HEAD submodule \
				>actual &&
		test_cmp expect actual
	)
'

test_expect_success 'test parsing and lookup of submodule config by name' '
	(cd super &&
		test-submodule-config --name \
			HEAD^ a \
			HEAD a \
			HEAD^ submodule \
			HEAD submodule \
				>actual &&
		test_cmp expect actual
	)
'

cat >super/expect_error <<EOF
Submodule name: 'a' for path 'b'
Submodule name: 'submodule' for path 'submodule'
EOF

test_expect_success 'error in one submodule config lets continue' '
	(cd super &&
		cp .gitmodules .gitmodules.bak &&
		echo " value = \"" >>.gitmodules &&
		git add .gitmodules &&
		mv .gitmodules.bak .gitmodules &&
		git commit -m "add error" &&
		test-submodule-config \
			HEAD b \
			HEAD submodule \
				>actual &&
		test_cmp expect_error actual
	)
'
#include "cache.h"
#include "submodule-config.h"

static void die_usage(int argc, char **argv, const char *msg)
{
	fprintf(stderr, "%s\n", msg);
	fprintf(stderr, "Usage: %s [<commit> <submodulepath>] ...\n", argv[0]);
	exit(1);
}

int main(int argc, char **argv)
{
	char **arg = argv;
	int my_argc = argc;
	int output_url = 0;
	int lookup_name = 0;

	/* Skip the program name. */
	arg++;
	my_argc--;

	/* Consume leading options. */
	while (*arg && starts_with(arg[0], "--")) {
		if (!strcmp(arg[0], "--url"))
			output_url = 1;
		if (!strcmp(arg[0], "--name"))
			lookup_name = 1;
		arg++;
		my_argc--;
	}

	/* The remaining arguments must come in <commit> <path-or-name> pairs. */
	if (my_argc % 2 != 0)
		die_usage(argc, argv, "Wrong number of arguments.");

	while (*arg) {
		unsigned char commit_sha1[20];
		const struct submodule *submodule;
		const char *commit;
		const char *path_or_name;

		commit = arg[0];
		path_or_name = arg[1];

		/* An empty commit argument selects the null sha1. */
		if (commit[0] == '\0')
			hashcpy(commit_sha1, null_sha1);
		else if (get_sha1(commit, commit_sha1) < 0)
			die_usage(argc, argv, "Commit not found.");

		if (lookup_name)
			submodule = submodule_from_name(commit_sha1, path_or_name);
		else
			submodule = submodule_from_path(commit_sha1, path_or_name);
		if (!submodule)
			die_usage(argc, argv, "Submodule not found.");

		if (output_url)
			printf("Submodule url: '%s' for path '%s'\n",
					submodule->url, submodule->path);
		else
			printf("Submodule name: '%s' for path '%s'\n",
					submodule->name, submodule->path);

		arg += 2;
	}

	return 0;
}