Indexing a C++ repository with LSIF
This guide walks through setting up LSIF generation for a C++ codebase using lsif-clang. These instructions should apply to any C++ project that is buildable with clang or clang++. (This should also cover most projects built with gcc or g++.)
Local dev setup
With Docker (recommended)
- Copy the files in the lsif-docker directory of sourcegraph/tesseract to a local lsif-docker directory in your C++ repository (the one you wish to index).
- Replace the contents of lsif-docker/install_build_deps.sh with commands that install any requisite build dependencies of the project. These should be dependencies that do not vary from revision to revision.
- Modify lsif-docker/checkout.sh to clone your repository to the /source directory in the Docker container's filesystem.
- Modify lsif-docker/gen_compile_commands.sh to generate a compilation database (compile_commands.json).
  - If you use autotools to build your project (./autogen.sh && ./configure && make), you can probably keep the existing contents.
  - If you build your project using CMake, you can use cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .
  - If you use Bazel, you can use bazel-compilation-database:
    git clone --depth=10 https://github.com/grailbio/bazel-compilation-database.git /bazel-compilation-database
    /bazel-compilation-database/generate.sh
  - If you use another build system or if any of the above steps break, consult this very helpful guide to generating compilation databases for various build systems. It may be helpful to docker build your container and docker run -it $IMAGE to get an interactive shell into the container, so you can ensure the build environment is correct. We recommend getting the project to build normally first (e.g., emit a binary) and then following the aforementioned guide to modify the regular build steps to emit a compilation database.
    - Most often, the compile_commands.json file will be emitted in the root directory of the repository. If this is not the case, you'll also need to modify lsif-docker/gen_lsif.sh to cd into the directory containing it and then run lsif-clang --project-root=/source compile_commands.json. If you're unsure of where compile_commands.json will be emitted, just continue to the next step for now.
- Run docker build lsif-docker to build the Docker image.
- Generate a Sourcegraph access token from your Sourcegraph instance (Settings > Access tokens). Give it sudo permission.
- Run the following command to generate and upload LSIF data to Sourcegraph:
  docker run -e SRC_ACCESS_TOKEN=$ACCESS_TOKEN -e SRC_ENDPOINT=https://sourcegraph.example.com -e PROJECT_REV=HEAD $IMAGE_ID
  with the following substitutions:
  - SRC_ACCESS_TOKEN=: the Sourcegraph access token you just created
  - SRC_ENDPOINT=: the URL to your Sourcegraph instance
  - PROJECT_REV=: the revision of your repository to be indexed
  - $IMAGE_ID: the ID of the Docker image you just built
If successful, the upload should be visible on the repository settings page.
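The final docker run step is easy to get wrong when the token or endpoint is missing from the shell. The following is a minimal sketch, not part of the lsif-docker tooling: the helper name lsif_docker_cmd is hypothetical, while the variables SRC_ACCESS_TOKEN, SRC_ENDPOINT, and PROJECT_REV are the ones described in the steps above. It only prints the command, so you can review it before running it yourself.

```shell
#!/bin/sh
# Sketch: assemble the `docker run` invocation used to generate and upload LSIF.
# lsif_docker_cmd is a hypothetical helper name, not part of lsif-docker.

# Usage: lsif_docker_cmd IMAGE_ID [REVISION]
lsif_docker_cmd() {
    image_id="${1:-}"
    rev="${2:-HEAD}"   # default to HEAD, as in the example command

    # Refuse to build a command from incomplete settings.
    if [ -z "$image_id" ] || [ -z "${SRC_ACCESS_TOKEN:-}" ] || [ -z "${SRC_ENDPOINT:-}" ]; then
        echo "usage: SRC_ACCESS_TOKEN=... SRC_ENDPOINT=... lsif_docker_cmd IMAGE_ID [REV]" >&2
        return 1
    fi

    # `-e NAME` without a value forwards NAME from the calling shell,
    # which keeps the token itself out of the printed command.
    echo "docker run -e SRC_ACCESS_TOKEN -e SRC_ENDPOINT=$SRC_ENDPOINT -e PROJECT_REV=$rev $image_id"
}

# Example (requires Docker; `docker build -q` prints only the image ID):
#   image_id=$(docker build -q lsif-docker)
#   lsif_docker_cmd "$image_id" HEAD
```

Because the token is forwarded by name, SRC_ACCESS_TOKEN must be exported in the shell that eventually runs the printed command.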
For reference, some examples of Dockerized C++ LSIF generation are:
github.com/opencv/opencv
github.com/osquery/osquery
github.com/google/tcmalloc
github.com/tesseract-ocr/tesseract
Without Docker
It can sometimes be difficult to replicate the build environment inside a separate Docker container. If this situation applies to you, you'll need to install lsif-clang directly in your local dev environment.
- Install lsif-clang in your environment using the instructions in the lsif-clang repository.
- You'll need a way to generate a compilation database (i.e., a compile_commands.json file). There are different methods of doing so depending on your build tool, and we recommend reading these excellent notes. If there isn't an explicit way to generate one with your build tool, we recommend using Bear, which should be generic enough to handle any C++ build (but might be less efficient than explicit generation methods).
- Generate the compile_commands.json file in the root directory of the repository.
- Run lsif-clang compile_commands.json from the root directory. This should emit a dump.lsif file.
- Run src lsif upload from the root directory. You may first have to authenticate to your Sourcegraph instance.
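The steps above can be strung together roughly as follows. This is a sketch under stated assumptions: the helper name compdb_command is hypothetical, the CMake flag is the one mentioned earlier in this guide, and the bear -- make form assumes Bear 3.x with a Makefile-based build (Bear 2.x used bear make instead). Substitute whatever build command your project actually uses.

```shell
#!/bin/sh
# Sketch: pick a command that emits compile_commands.json in the repo root.
# compdb_command is a hypothetical helper, not part of lsif-clang.

# Usage: compdb_command [REPO_ROOT]
compdb_command() {
    root="${1:-.}"
    if [ -f "$root/CMakeLists.txt" ]; then
        # CMake can export the compilation database itself.
        echo "cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ."
    elif [ -f "$root/Makefile" ]; then
        # Assumption: Bear 3.x syntax; Bear records the compiler calls made by make.
        echo "bear -- make"
    else
        echo "no CMakeLists.txt or Makefile found in $root" >&2
        return 1
    fi
}

# Typical sequence from the repository root (left commented out because it
# needs cmake or Bear, lsif-clang, and src installed):
#   sh -c "$(compdb_command .)"          # writes compile_commands.json
#   lsif-clang compile_commands.json     # should emit dump.lsif
#   src lsif upload                      # uploads dump.lsif to Sourcegraph
```

The helper prefers CMake's built-in export when a CMakeLists.txt is present, since explicit generation is usually cheaper than intercepting a full build with Bear.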
If you run into issues along the way, a useful reference is one of the Dockerfiles currently used for LSIF generation for an open-source repository.
CI setup
Incorporating LSIF generation and uploading in CI will allow precise code navigation to remain up-to-date without any human intervention.
If you created a Dockerfile that encapsulates LSIF generation, you can use the same one in your CI pipeline.
If you installed lsif-clang directly on your host machine in development, you'll need to incorporate those steps into your build scripts.
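In CI, the access token and endpoint usually come from the pipeline's secret store, so a missing secret tends to surface only as a failed upload at the very end of the job. The following sketch shows one way to fail early instead. The helper name require_env is hypothetical; only the variable names SRC_ACCESS_TOKEN and SRC_ENDPOINT come from this guide.

```shell
#!/bin/sh
# Sketch: fail a CI job early if required variables are unset or empty.
# require_env is a hypothetical helper name.

# Usage: require_env NAME...
# The names are assumed to be trusted, valid shell identifiers.
require_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing required environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example CI step:
#   require_env SRC_ACCESS_TOKEN SRC_ENDPOINT || exit 1
#   src lsif upload
```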
Troubleshooting
With Docker
If the docker run command fails, you likely have an error in one of the lsif-docker/*.sh files. As a general rule, if you can get your project to build normally (i.e., generate an executable), you can get the LSIF indexer to generate LSIF. We therefore recommend the following approach if things don't work on the first try:
- Build the Docker image: docker build lsif-docker
- Run the container with an interactive shell: docker run -it $IMAGE_ID bash
- In the container shell, cd /source and figure out what steps are needed to build the project.
- Once the build successfully completes, figure out which steps are needed to generate the compile_commands.json file. We have found this guide to be a useful resource.
- Once you've successfully generated compile_commands.json, cd into the directory containing compile_commands.json and run lsif-clang --project-root=/source compile_commands.json. This should generate a dump.lsif file in the same directory. This dump.lsif should contain JSON describing all the symbols and references in the codebase (it should be rather large).
- Once the dump.lsif file is generated correctly, set the environment variables SRC_ACCESS_TOKEN and SRC_ENDPOINT to the appropriate values in your shell. Then run src lsif upload from the directory containing the dump.lsif file. This should successfully upload the LSIF dump to Sourcegraph.
- After you've successfully done all of the above in the container's interactive shell, incorporate these steps into the lsif-docker/*.sh files. Then rebuild the Docker image and try running docker run again.
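Before uploading, it can help to confirm that the indexer actually produced output. The following sketch assumes dump.lsif is newline-delimited JSON (one vertex or edge object per line) and uses a hypothetical helper name, check_dump. It is a rough heuristic for use in the container shell, not an lsif-clang feature.

```shell
#!/bin/sh
# Sketch: rough sanity check on an LSIF dump before running `src lsif upload`.
# check_dump is a hypothetical helper; it assumes one JSON object per line.

# Usage: check_dump [PATH]   (defaults to ./dump.lsif)
check_dump() {
    dump="${1:-dump.lsif}"
    if [ ! -s "$dump" ]; then
        echo "$dump is missing or empty" >&2
        return 1
    fi
    # Count lines that start like a JSON object; a real dump should have many.
    # `|| true` keeps the "0" that grep prints when nothing matches.
    count=$(grep -c '^{' "$dump" || true)
    echo "$dump: $count entries"
    [ "$count" -gt 0 ]
}

# Example, from the directory containing the dump:
#   check_dump dump.lsif && src lsif upload
```

A dump with only a handful of entries usually means the compilation database was empty or pointed at the wrong paths, so it is worth rechecking compile_commands.json before retrying.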