Configure Sourcegraph with Kubernetes
Configuring a Sourcegraph Kubernetes cluster without Helm is done by applying manifest files and running simple kubectl commands. You can configure Sourcegraph as flexibly as you need to meet the requirements of your deployment environment.
Getting started
We strongly recommend you fork the Sourcegraph with Kubernetes reference repository to track your configuration changes in Git. This will make upgrades far easier and is a good practice not just for Sourcegraph, but for any Kubernetes application.
Create a fork or private duplicate
Create a public fork of the deploy-sourcegraph repository.
Alternatively, create a private duplicate of the deploy-sourcegraph repository as follows:
Create an empty private repository, for example <you/private-repository> in GitHub, then bare clone the reference repository.
git clone --bare https://github.com/sourcegraph/deploy-sourcegraph/
Navigate to the bare clone and mirror push it to your private repository.
cd deploy-sourcegraph.git
git push --mirror https://github.com/<you/private-repository>.git
Remove your local bare clone.
cd ..
rm -rf deploy-sourcegraph.git
Clone your fork using the repository's URL.
git clone https://github.com/<you/private-repository>.git
- Add the reference repository as an upstream remote so that you can get updates.
git remote add upstream https://github.com/sourcegraph/deploy-sourcegraph
- Create a release branch to track all of your customizations to Sourcegraph. This branch will be used to upgrade Sourcegraph and install your Sourcegraph instance.
export SOURCEGRAPH_VERSION="v3.43.2"
git checkout $SOURCEGRAPH_VERSION -b release
Some of the following instructions require cluster access. Ensure you can access your Kubernetes cluster with kubectl.
Customizations
To make customizations to the Sourcegraph deployment such as resources, replicas or other changes, we recommend using Kustomize.
This means that you define your customizations as patches, and generate a manifest from our provided manifests to apply.
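As a rough sketch of that workflow, the script below writes a small Kustomize overlay containing a single patch. The overlay path overlays/custom, the patch file name, and the replica count are hypothetical choices for illustration, and the sketch assumes that base/ contains a kustomization.yaml the overlay can reference. Adjust all of these to the layout of your fork.

```shell
#!/bin/sh
# Hypothetical overlay location; not part of the reference repository.
OVERLAY_DIR="overlays/custom"
mkdir -p "$OVERLAY_DIR"

# The kustomization lists the provided manifests and the patches layered on top.
cat > "$OVERLAY_DIR/kustomization.yaml" <<'EOF'
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base   # assumes base/ has its own kustomization.yaml
patches:
  - path: frontend-replicas.yaml
EOF

# A strategic merge patch holds only the fields that differ from the base.
cat > "$OVERLAY_DIR/frontend-replicas.yaml" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sourcegraph-frontend
spec:
  replicas: 2
EOF

echo "Wrote overlay to $OVERLAY_DIR"
```

Running kubectl kustomize overlays/custom should then print the generated manifests, which you can review before applying them to the cluster.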
Configure a storage class
Sourcegraph by default requires a storage class for all persistent volume claims. By default, this storage class is called sourcegraph. This storage class must be configured before applying the base configuration to your cluster.
- Create base/sourcegraph.StorageClass.yaml with the appropriate configuration for your cloud provider and commit the file to your fork.
- The sourcegraph StorageClass will retain any persistent volumes created in the event of an accidental deletion of a persistent volume claim.
- The sourcegraph StorageClass also allows the persistent volumes to expand their storage capacity by increasing the size of the related persistent volume claim.
- These settings cannot be changed once the storage class has been created. Persistent volumes not created with the reclaimPolicy set to Retain can be patched with the following command:
kubectl patch pv <your-pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
See the official documentation for more information about patching persistent volumes.
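If several volumes need the patch, a helper along these lines may save some typing. It is only a sketch: the function names list_non_retained_pvs and retain_all_pvs are hypothetical, and because persistent volumes are cluster-scoped the list can include volumes that do not belong to Sourcegraph, so review the output of the first function before running the second.

```shell
# List persistent volumes whose reclaim policy is not "Retain".
# Prints one PV name per line and makes no changes to the cluster.
list_non_retained_pvs() {
  kubectl get pv --no-headers \
    -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy |
    awk '$2 != "Retain" { print $1 }'
}

# Apply the kubectl patch command shown above to every listed PV.
retain_all_pvs() {
  list_non_retained_pvs | while read -r pv; do
    kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
  done
}
```

Call list_non_retained_pvs first and check the names; retain_all_pvs then patches exactly that list.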
Google Cloud Platform (GCP)
Kubernetes 1.19 and higher
- Please read and follow the official documentation for enabling the persistent disk CSI driver on a new or existing cluster.
- Add the following Kubernetes manifest to the base directory of your fork:
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd # This configures SSDs (recommended).
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
Amazon Web Services (AWS)
Kubernetes 1.19 and higher
- Follow the official instructions to deploy the Amazon Elastic Block Store (Amazon EBS) Container Storage Interface (CSI) driver.
- Add the following Kubernetes manifest to the base directory of your fork:
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
provisioner: ebs.csi.aws.com
parameters:
  type: gp2 # This configures SSDs (recommended).
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Azure
Kubernetes 1.19 and higher
- Follow the official instructions to deploy the Azure Disk Container Storage Interface (CSI) driver.
- Add the following Kubernetes manifest to the base directory of your fork:
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
provisioner: disk.csi.azure.com
parameters:
  storageaccounttype: Premium_LRS # This configures SSDs (recommended). A Premium VM is required.
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
Other cloud providers
# base/sourcegraph.StorageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: sourcegraph
  labels:
    deploy: sourcegraph
reclaimPolicy: Retain
allowVolumeExpansion: true
# Read https://kubernetes.io/docs/concepts/storage/storage-classes/ to configure the "provisioner" and "parameters" fields for your cloud provider.
# SSDs are highly recommended!
# provisioner:
# parameters:
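Whichever provider you use, a quick text check before committing the manifest can catch a missing field. The helper below is a sketch only: check_storage_class is a hypothetical name, and it does plain pattern matching on the file rather than real YAML parsing, so treat a pass as a hint and not as validation.

```shell
# Check that a StorageClass manifest declares the settings described above.
# Prints each problem found and returns non-zero if there is any.
check_storage_class() {
  file="$1"
  status=0
  grep -Eq '^[[:space:]]*name:[[:space:]]*sourcegraph[[:space:]]*$' "$file" ||
    { echo "name is not sourcegraph"; status=1; }
  grep -Eq '^[[:space:]]*reclaimPolicy:[[:space:]]*Retain' "$file" ||
    { echo "reclaimPolicy is not Retain"; status=1; }
  grep -Eq '^[[:space:]]*allowVolumeExpansion:[[:space:]]*true' "$file" ||
    { echo "allowVolumeExpansion is not true"; status=1; }
  # A commented-out "# provisioner:" line does not count as set.
  grep -Eq '^[[:space:]]*provisioner:[[:space:]]*[^[:space:]#]' "$file" ||
    { echo "provisioner is not set"; status=1; }
  return "$status"
}
```

For example, check_storage_class base/sourcegraph.StorageClass.yaml prints nothing and returns zero when all four settings are present.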
Configure network access
You need to make the main web server accessible over the network to external users.
There are a few approaches, but using an ingress controller is recommended.
Ingress controller (recommended)
For production environments, we recommend using the ingress-nginx ingress.
- As part of our base configuration, we install an ingress for sourcegraph-frontend. It installs rules for the default ingress; see the comments in the manifest to restrict it to a specific host.
- In addition to the sourcegraph-frontend ingress, you'll need to install the NGINX ingress controller (ingress-nginx).
- Follow the instructions at https://kubernetes.github.io/ingress-nginx/deploy/ to create the ingress controller.
- Add the files to configure/ingress-nginx, including an install.sh file which applies the relevant manifests.
- We include sample generic-cloud manifests as part of this repository, but please follow the official instructions for your cloud provider.
- Add the configure/ingress-nginx/install.sh command to create-new-cluster.sh and commit the change:
echo ./configure/ingress-nginx/install.sh >> create-new-cluster.sh
- Once the ingress has acquired an external address, you should be able to access Sourcegraph using that.
- You can check the external address by running the following command and looking for the LoadBalancer entry:
kubectl -n ingress-nginx get svc
If you are having trouble accessing Sourcegraph, ensure that the ingress-nginx IP address shown above is accessible. Otherwise see Troubleshooting ingress-nginx. The namespace of the ingress controller is ingress-nginx.
Once you have installed Sourcegraph, run the following command, and ensure an IP address has been assigned to your ingress resource. Then browse to the IP or configured URL.
kubectl get ingress sourcegraph-frontend
NAME                   CLASS    HOSTS             ADDRESS   PORTS     AGE
sourcegraph-frontend   <none>   sourcegraph.com   8.8.8.8   80, 443   1d
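Address assignment can take a few minutes, so it may be convenient to poll instead of re-running the command by hand. The function below is a sketch under assumptions: wait_for_ingress_address is a hypothetical helper, the defaults of 30 attempts and 10 seconds are arbitrary, and it assumes the ingress lives in your current namespace as in the command above.

```shell
# Poll until the sourcegraph-frontend ingress reports an address, then print it.
# Arguments: number of attempts (default 30), seconds between attempts (default 10).
wait_for_ingress_address() {
  attempts="${1:-30}"
  delay="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Load balancers report either an IP or a hostname, depending on the provider.
    addr="$(kubectl get ingress sourcegraph-frontend \
      -o jsonpath='{.status.loadBalancer.ingress[*].ip}{.status.loadBalancer.ingress[*].hostname}')"
    if [ -n "$addr" ]; then
      echo "$addr"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "no address assigned after $attempts attempts" >&2
  return 1
}
```

Calling wait_for_ingress_address with no arguments waits up to roughly five minutes and returns non-zero if no address appears in that time.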
Configuration
ingress-nginx has extensive configuration documented at NGINX Configuration. We expect most administrators to modify ingress-nginx annotations in sourcegraph-frontend.Ingress.yaml. Some settings are modified globally (such as HSTS). In that case we expect administrators to modify the ingress-nginx configmap in configure/ingress-nginx/mandatory.yaml.
NGINX service
In cases where ingress controllers cannot be created, creating an explicit NGINX service is a viable alternative. See the files in the configure/nginx-svc folder for an example of how to do this via a NodePort service (any other type of Kubernetes service will also work):
- Modify configure/nginx-svc/nginx.ConfigMap.yaml to contain the TLS certificate and key for your domain.
- Run kubectl apply -f configure/nginx-svc to create the NGINX service.
- Update create-new-cluster.sh with the previous command.
echo kubectl apply -f configure/nginx-svc >> create-new-cluster.sh
Network rule
Add a network rule that allows ingress traffic to port 30080 (HTTP) on at least one node.
Google Cloud Platform Firewall rules.
- Expose the necessary ports.
gcloud compute --project=$PROJECT firewall-rules create sourcegraph-frontend-http --direction=INGRESS --priority=1000 --network=default --action=ALLOW --rules=tcp:30080
- Change the type of the sourcegraph-frontend service in base/frontend/sourcegraph-frontend.Service.yaml from ClusterIP to NodePort:
spec:
  ports:
  - name: http
    port: 30080
+   nodePort: 30080
- type: ClusterIP
+ type: NodePort
- Directly applying this change to the service will fail. Instead, you must delete the old service and then create the new one (this will result in a few seconds of downtime):
kubectl delete svc sourcegraph-frontend
kubectl apply -f base/frontend/sourcegraph-frontend.Service.yaml
- Find a node name.
kubectl get pods -l app=sourcegraph-frontend -o=custom-columns=NODE:.spec.nodeName
- Get the EXTERNAL-IP address (will be ephemeral unless you make it static).
kubectl get node $NODE -o wide
AWS Security Group rules.
Sourcegraph should now be accessible at $EXTERNAL_ADDR:30080, where $EXTERNAL_ADDR is the address of any node in the cluster.
Using NetworkPolicy
Network policy is a Kubernetes resource that defines how pods are allowed to communicate with each other and with
other network endpoints. If the cluster administration requires an associated NetworkPolicy when doing an installation,
then we recommend running Sourcegraph in a namespace (as described in our Overlays guide or below in the
Using NetworkPolicy with Namespaced Overlay Example).
You can then use the namespaceSelector to allow traffic between the Sourcegraph pods.
When you create the namespace you need to give it a label so it can be used in a matchLabels clause.
apiVersion: v1
kind: Namespace
metadata:
  name: ns-sourcegraph
  labels:
    name: ns-sourcegraph
If the namespace already exists, you can still label it as follows:
kubectl label namespace ns-sourcegraph name=ns-sourcegraph
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: np-sourcegraph
  namespace: ns-sourcegraph
spec:
  # For all pods with the label "deploy: sourcegraph"
  podSelector:
    matchLabels:
      deploy: sourcegraph
  policyTypes:
  - Ingress
  - Egress
  # Allow all traffic inside the ns-sourcegraph namespace
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ns-sourcegraph
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          name: ns-sourcegraph
Configure external databases
We recommend utilizing an external database when deploying Sourcegraph to provide the most resilient and performant backend for your deployment. For more information on the specific requirements for Sourcegraph databases, see this guide.
Simply edit the relevant PostgreSQL environment variables (e.g. PGHOST, PGPORT, PGUSER, etc.) in base/frontend/sourcegraph-frontend.Deployment.yaml to point to your existing PostgreSQL instance.
If you do not have an external database available, configuration is provided to deploy PostgreSQL on Kubernetes.
Configure repository cloning via SSH
Sourcegraph will clone repositories using SSH credentials if they are mounted at /home/sourcegraph/.ssh in the gitserver deployment.
Create a secret that contains the base64 encoded contents of your SSH private key (make sure it doesn't require a password) and known_hosts file.
kubectl create secret generic gitserver-ssh \
  --from-file id_rsa=${HOME}/.ssh/id_rsa \
  --from-file known_hosts=${HOME}/.ssh/known_hosts
Update create-new-cluster.sh with the previous command.
echo kubectl create secret generic gitserver-ssh \
  --from-file id_rsa=${HOME}/.ssh/id_rsa \
  --from-file known_hosts=${HOME}/.ssh/known_hosts >> create-new-cluster.sh
Mount the secret as a volume in gitserver.StatefulSet.yaml.
For example:
# base/gitserver/gitserver.StatefulSet.yaml
spec:
  containers:
    volumeMounts:
      - mountPath: /root/.ssh
        name: ssh
  volumes:
    - name: ssh
      secret:
        defaultMode: 0644
        secretName: gitserver-ssh
Convenience script:
# This script requires https://github.com/sourcegraph/jy and https://github.com/sourcegraph/yj
GS=base/gitserver/gitserver.StatefulSet.yaml
cat $GS | yj | jq '.spec.template.spec.containers[].volumeMounts += [{mountPath: "/root/.ssh", name: "ssh"}]' | jy -o $GS
cat $GS | yj | jq '.spec.template.spec.volumes += [{name: "ssh", secret: {defaultMode: 384, secretName:"gitserver-ssh"}}]' | jy -o $GS
If you run your installation with non-root users (the non-root overlay) then use the mount path /home/sourcegraph/.ssh instead of /root/.ssh:
# base/gitserver/gitserver.StatefulSet.yaml
spec:
  containers:
    volumeMounts:
      - mountPath: /home/sourcegraph/.ssh
        name: ssh
  volumes:
    - name: ssh
      secret:
        defaultMode: 0644
        secretName: gitserver-ssh
Convenience script:
# This script requires https://github.com/sourcegraph/jy and https://github.com/sourcegraph/yj
GS=base/gitserver/gitserver.StatefulSet.yaml
cat $GS | yj | jq '.spec.template.spec.containers[].volumeMounts += [{mountPath: "/home/sourcegraph/.ssh", name: "ssh"}]' | jy -o $GS
cat $GS | yj | jq '.spec.template.spec.volumes += [{name: "ssh", secret: {defaultMode: 384, secretName:"gitserver-ssh"}}]' | jy -o $GS
- Apply the updated gitserver configuration to your cluster.
./kubectl-apply-all.sh
WARNING: Do NOT commit the actual id_rsa and known_hosts files to your fork (unless your fork is private and you are okay with storing secrets in it).
Configure custom Redis
Sourcegraph supports specifying a custom Redis server for:
- caching information (specified via the REDIS_CACHE_ENDPOINT environment variable)
- storing information (session data and job queues) (specified via the REDIS_STORE_ENDPOINT environment variable)
If these are not set, they will default to redis-cache:6379 and redis-store:6379 respectively.
If you want to specify a custom Redis server, you'll need to specify the corresponding environment variable for each of the following deployments:
sourcegraph-frontend
repo-updater
gitserver
searcher
symbols
worker
Kubernetes yaml example
apiVersion: apps/v1
kind: <Deployment/StatefulSet>
spec:
  template:
    spec:
      containers:
        - name: <frontend>
          image: <frontend_image>/<TAG>
          env:
            - name: REDIS_CACHE_ENDPOINT
              value: "<REDIS_CACHE_DSN>"
            - name: REDIS_STORE_ENDPOINT
              value: "<REDIS_STORE_DSN>"
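Because the same two variables must be set on six workloads, generating one patch file per workload may be less error-prone than editing each manifest by hand. The sketch below does that. The function names, the output file naming, and above all the kind and container name given for each workload are assumptions, so check them against the manifests in your fork before using the result.

```shell
# Print a strategic merge patch that sets both Redis endpoints on one workload.
# Arguments: kind (Deployment or StatefulSet), workload name, container name.
# Reads REDIS_CACHE_DSN and REDIS_STORE_DSN from the environment.
redis_env_patch() {
  cat <<EOF
apiVersion: apps/v1
kind: $1
metadata:
  name: $2
spec:
  template:
    spec:
      containers:
        - name: $3
          env:
            - name: REDIS_CACHE_ENDPOINT
              value: "${REDIS_CACHE_DSN}"
            - name: REDIS_STORE_ENDPOINT
              value: "${REDIS_STORE_DSN}"
EOF
}

# Write one patch file per workload listed above into the given directory.
write_redis_patches() {
  out_dir="$1"
  mkdir -p "$out_dir"
  # kind:name:container triples. The kinds and container names are assumptions.
  for entry in \
    Deployment:sourcegraph-frontend:frontend \
    Deployment:repo-updater:repo-updater \
    StatefulSet:gitserver:gitserver \
    Deployment:searcher:searcher \
    Deployment:symbols:symbols \
    Deployment:worker:worker
  do
    kind="${entry%%:*}"
    rest="${entry#*:}"
    name="${rest%%:*}"
    container="${rest#*:}"
    redis_env_patch "$kind" "$name" "$container" > "$out_dir/$name.redis.yaml"
  done
}
```

With both variables exported, write_redis_patches followed by a directory name writes six files there, which can then be listed as patches in a Kustomize overlay.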
Connect to an external Jaeger instance
If you have an existing Jaeger instance you would like to connect Sourcegraph to (instead of running the Jaeger instance inside the Sourcegraph cluster), do:
- Remove the base/jaeger directory: rm -rf base/jaeger
- Update the Jaeger agent containers to point to your Jaeger collector.
  - Find all instances of Jaeger agent (grep -R 'jaegertracing/jaeger-agent').
  - Update the args field of the Jaeger agent container configuration to point to the external collector. For example:
    args:
      - --reporter.grpc.host-port=external-jaeger-collector-host:14250
      - --reporter.type=grpc
- Apply these changes to the cluster.
Disable Jaeger entirely
To disable Jaeger entirely, do:
- Update the Sourcegraph site configuration to remove the observability.tracing field.
- Remove the base/jaeger directory: rm -rf base/jaeger
- Remove the jaeger agent containers from each *.Deployment.yaml and *.StatefulSet.yaml file.
- Apply these changes to the cluster.
Install without cluster-wide RBAC
Sourcegraph communicates with the Kubernetes API for service discovery. It also has some janitor DaemonSets that clean up temporary cache data. To do that we need to create RBAC resources.
If using ClusterRole and ClusterRoleBinding resources is not an option, you can use the non-privileged overlay to generate modified manifests. Read the Overlays section below for details.
Add license key
Sourcegraph's Kubernetes deployment requires an Enterprise license key.
- Create an account on or sign in to sourcegraph.com, and go to https://sourcegraph.com/subscriptions/new to obtain a license key.
- Once you have a license key, add it to your site configuration.
Environment variables
Update the environment variables in the appropriate deployment manifest.
For example, the following patch will update PGUSER to have the value bob:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sourcegraph-frontend
spec:
  template:
    spec:
      containers:
        - name: frontend
          env:
            - name: PGUSER
              value: bob
Filtering cAdvisor metrics
Due to how cAdvisor works, Sourcegraph's cAdvisor deployment can pick up metrics for services unrelated to the Sourcegraph deployment running on the same nodes as Sourcegraph services. Learn more.
To work around this, update your prometheus.ConfigMap.yaml to target your namespaced Sourcegraph deployment by uncommenting the below metric_relabel_configs entry and updating it with the appropriate namespace.
This will cause Prometheus to drop all metrics from cAdvisor that are not from services in the desired namespace.
apiVersion: v1
data:
  prometheus.yml: |
    # ...
    metric_relabel_configs:
      # cAdvisor-specific customization. Drop container metrics exported by cAdvisor
      # not in the same namespace as Sourcegraph.
      # Uncomment this if you have problems with certain dashboards or cAdvisor itself
      # picking up non-Sourcegraph services. Ensure all Sourcegraph services are running
      # within the Sourcegraph namespace you have defined.
      # The regex must keep matches on '^$' (empty string) to ensure other metrics do not
      # get dropped.
      - source_labels: [container_label_io_kubernetes_pod_namespace]
        regex: ^$|ns-sourcegraph # ensure this matches with namespace declarations
        action: keep
    # ...
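To see why the empty-string alternative matters, the sketch below imitates the keep rule with grep. It is an approximation for illustration, not Prometheus itself: keeps_metric is a hypothetical helper, and grep -x stands in for the way Prometheus anchors relabel regexes at both ends. Metrics that do not come from cAdvisor have no namespace label, which the rule sees as an empty string.

```shell
# Return success if a metric with the given namespace label value would be kept.
# grep -x requires the whole line to match, similar to Prometheus anchoring.
keeps_metric() {
  printf '%s\n' "$1" | grep -Eqx -- '^$|ns-sourcegraph'
}

# Try an unlabeled metric, a Sourcegraph metric, and an unrelated one.
for ns in "" "ns-sourcegraph" "kube-system"; do
  if keeps_metric "$ns"; then
    echo "keep: '$ns'"
  else
    echo "drop: '$ns'"
  fi
done
```

The empty value and ns-sourcegraph should be reported as kept and kube-system as dropped. Without the ^$ alternative, every metric lacking the label would be dropped as well.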
Outbound Traffic
When working with an Internet Gateway or VPC it may be necessary to expose ports for outbound network traffic. Sourcegraph must open port 443 for outbound traffic to code hosts, and to enable telemetry with Sourcegraph.com. Port 22 must also be opened to enable Git cloning over SSH by Sourcegraph. Take care to secure your cluster in a manner that meets your organization's security requirements.
Troubleshooting
See the Troubleshooting docs.