GitLab Configuration
Follow these steps to configure Cartography to sync GitLab organization, group, project, and related data.
Prerequisites
A GitLab instance (self-hosted or gitlab.com)
A GitLab personal access token with the required scopes (see below)
The numeric ID of the GitLab organization (top-level group) to sync
Creating a GitLab Personal Access Token
Navigate to your GitLab instance (e.g., https://gitlab.com or https://gitlab.example.com)
Go to User Settings → Access Tokens (or directly to https://your-gitlab-instance/-/user_settings/personal_access_tokens)
Click Add new token
Configure your token:
Token name: cartography-sync
Scopes: Select read_user, read_repository, and read_api
Expiration date: Set according to your security policy
Click Create personal access token
Important: Copy the token immediately - you won’t be able to see it again
Required Token Permissions
The token requires the following scopes:
| Scope | Purpose |
|---|---|
| read_user | Access user profile information for group/project membership |
| read_repository | Access repository metadata, branches, and file contents |
| read_api | Access groups, projects, dependency scanning artifacts, language statistics, and group/project-level CI/CD runners |
These scopes provide read-only access to:
Organizations (top-level groups) and nested groups
Projects and their metadata
Branches and default branch information
Dependency files (package.json, requirements.txt, etc.)
Dependency files and dependencies extracted from GitLab dependency scanning artifacts
Project language statistics
Group-level and project-level CI/CD runners
Dependency scanning artifact access
Cartography ingests GitLab dependencies from CycloneDX SBOM artifacts produced by GitLab dependency scanning jobs, not from GitLab’s dependency list API. The token scopes above are required, and the token’s user must also be allowed to download CI job artifacts for each project.
GitLab projects can restrict artifact downloads with artifacts:access. If a dependency scanning job uses artifacts:access: developer or artifacts:access: maintainer, a token that belongs to a Reporter-level user can receive 403 Forbidden when Cartography downloads the job artifacts. Grant the token’s user a project role that satisfies the artifact access policy.
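The role requirement above can be checked before a sync with a small helper. This is a minimal sketch, not Cartography's own code: the function name and the two lookup tables are hypothetical, the numeric levels are GitLab's standard member access levels, and treating artifacts:access: all as "any member who can already see the pipeline" is an assumption.

```python
# GitLab's numeric member access levels (Guest=10 ... Owner=50).
ACCESS_LEVELS = {
    "guest": 10,
    "reporter": 20,
    "developer": 30,
    "maintainer": 40,
    "owner": 50,
}

# Minimum role assumed for each artifacts:access value. "none" is handled
# separately because it blocks every download regardless of role.
MIN_ROLE_FOR_POLICY = {
    "all": "guest",  # assumption: the user can already view the pipeline
    "developer": "developer",
    "maintainer": "maintainer",
}


def can_download_artifacts(role: str, policy: str) -> bool:
    """Return True if `role` satisfies the job's artifacts:access `policy`."""
    if policy == "none":
        return False
    required = MIN_ROLE_FOR_POLICY[policy]
    return ACCESS_LEVELS[role] >= ACCESS_LEVELS[required]
```

For example, can_download_artifacts("reporter", "developer") is False, which corresponds to the 403 Forbidden case described above.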
Dependency scanning jobs must produce CycloneDX SBOM artifacts, such as gl-sbom-*.cdx.json, gl-sbom.cdx.json, or gzipped equivalents. GitLab documents these SBOMs as job artifacts of the dependency scanning job. Cartography can only ingest artifacts that GitLab still serves, so expired or deleted job artifacts cannot be recovered during sync.
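To check whether a job's artifact listing contains an ingestible SBOM, the file names above can be matched with glob patterns. The sketch below is illustrative only: is_sbom_artifact is a hypothetical helper, and it assumes that "gzipped equivalents" means the same names with a .gz suffix.

```python
from fnmatch import fnmatchcase

# File name patterns taken from the paragraph above.
SBOM_PATTERNS = ("gl-sbom-*.cdx.json", "gl-sbom.cdx.json")


def is_sbom_artifact(path: str) -> bool:
    """Return True if `path` looks like a CycloneDX SBOM job artifact."""
    # Only the file name matters, not the directory inside the archive.
    name = path.rsplit("/", 1)[-1]
    # Assumption: gzipped SBOMs carry the same name plus ".gz".
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    return any(fnmatchcase(name, pattern) for pattern in SBOM_PATTERNS)
```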
Optional: instance-level runners
Listing instance-level (shared) runners via GET /api/v4/runners/all requires the token to belong to a GitLab administrator. If the token does not have admin privileges, the sync logs a warning and skips instance-level runners; group-level and project-level runners continue to be ingested normally.
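The skip-on-missing-admin behavior can be pictured as follows. This is a simplified sketch rather than Cartography's implementation: the three fetch callables are hypothetical stand-ins for the API calls, and the instance-level one is assumed to raise PermissionError when the API rejects a non-admin token.

```python
import logging

logger = logging.getLogger("gitlab_runners_sketch")


def collect_runners(fetch_group, fetch_project, fetch_instance):
    """Gather runners from all scopes; instance-level runners are optional."""
    runners = list(fetch_group()) + list(fetch_project())
    try:
        runners += list(fetch_instance())
    except PermissionError:
        # Non-admin token: warn and continue with the runners already found.
        logger.warning("Token lacks admin privileges; skipping instance-level runners")
    return runners
```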
CI config (.gitlab-ci.yml) ingestion
The CI config sync first calls GET /api/v4/projects/:id/ci/lint?dry_run=true to obtain the merged YAML (with all include: references expanded). A token that belongs to a user without Maintainer access on the project may not be allowed to use this endpoint; in that case the sync falls back to the raw .gitlab-ci.yml from the repository, which only requires read_repository. If both calls fail (404 / 403), a warning is logged and the project is skipped.
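The fallback order can be sketched as below. This is an illustration under stated assumptions, not the actual sync code: ApiError, load_ci_config, and the two fetch callables are hypothetical names, and merged_yaml is the field of the CI Lint API response that holds the expanded configuration.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("gitlab_ci_config_sketch")


class ApiError(Exception):
    """Hypothetical error carrying the HTTP status of a failed API call."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def load_ci_config(
    project_id: int,
    fetch_lint: Callable[[int], dict],
    fetch_raw: Callable[[int], str],
) -> Optional[str]:
    """Prefer the merged YAML, fall back to the raw file, else skip."""
    try:
        return fetch_lint(project_id)["merged_yaml"]
    except ApiError as exc:
        if exc.status not in (403, 404):
            raise  # unexpected failure: do not hide it
    try:
        return fetch_raw(project_id)
    except ApiError as exc:
        if exc.status not in (403, 404):
            raise
    logger.warning("No CI config available for project %s; skipping", project_id)
    return None
```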
Finding Your Organization ID
The organization ID is the numeric ID of the top-level GitLab group you want to sync. To find it:
Navigate to your group’s page on GitLab (e.g., https://gitlab.com/your-organization).
Click the ⋮ (three dots) menu in the top right of the group header and select Copy group ID.
Alternatively, fetch it via the API:
curl -H "PRIVATE-TOKEN: your-token" "https://gitlab.com/api/v4/groups/your-organization"
The id field in the response is your organization ID.
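The same lookup can be scripted in Python with the standard library. In this sketch the helper names are hypothetical; the request is only built, not sent, so it can be inspected before use.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_group_request(base_url: str, group_path: str, token: str) -> Request:
    """Build the GET /api/v4/groups/:id request for a group path."""
    # The path is URL-encoded so that a "/" in it would not break the URL.
    encoded = quote(group_path, safe="")
    url = f"{base_url.rstrip('/')}/api/v4/groups/{encoded}"
    return Request(url, headers={"PRIVATE-TOKEN": token})


def extract_group_id(response_body: str) -> int:
    """Return the numeric id field from the group JSON response."""
    return int(json.loads(response_body)["id"])
```

Passing the request to urllib.request.urlopen and the decoded body to extract_group_id yields the value to use for --gitlab-organization-id.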
Configuration
Set your GitLab token in an environment variable:
export GITLAB_TOKEN="glpat-your-token-here"
Run Cartography with the GitLab module:
cartography \
--neo4j-uri bolt://localhost:7687 \
--selected-modules gitlab \
--gitlab-organization-id 12345678 \
--gitlab-token-env-var "GITLAB_TOKEN"
Configuration Options
| Parameter | CLI Argument | Environment Variable | Required | Default | Description |
|---|---|---|---|---|---|
| GitLab URL | --gitlab-url | N/A | No | https://gitlab.com | The GitLab instance URL. Only set for self-hosted instances. |
| GitLab Token | --gitlab-token-env-var | Set by you | Yes | N/A | Name of the environment variable containing your GitLab personal access token |
| Organization ID | --gitlab-organization-id | N/A | Yes | N/A | The numeric ID of the top-level GitLab group (organization) to sync |
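Note that --gitlab-token-env-var takes the name of an environment variable, not the token itself. The indirection can be sketched as follows; resolve_gitlab_token is a hypothetical helper written for illustration, not a Cartography function.

```python
import os
from typing import Mapping, Optional


def resolve_gitlab_token(
    env_var_name: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Look up the token stored in the environment variable `env_var_name`."""
    environ = os.environ if environ is None else environ
    token = environ.get(env_var_name, "")
    if not token:
        raise ValueError(f"Environment variable {env_var_name} is not set or is empty")
    return token
```

Keeping the token out of the command line means it does not appear in shell history or process listings.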
Performance Considerations
Language detection: Fetches programming language statistics for all projects using parallel async requests (10 concurrent by default). Languages are stored as a JSON property on each project.
Large instances: For ~3000 projects, language fetching takes approximately 5-7 minutes
API rate limits: GitLab.com has rate limits (2000 requests/minute for authenticated users). Self-hosted instances may have different limits
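Bounded parallel fetching of this kind is commonly written with an asyncio semaphore. The sketch below shows the general pattern only and is not Cartography's implementation: fetch_all_languages and the fetch_languages callable are hypothetical, and the limit of 10 mirrors the default mentioned above.

```python
import asyncio
import logging

logger = logging.getLogger("gitlab_languages_sketch")


async def fetch_all_languages(project_ids, fetch_languages, max_concurrent=10):
    """Fetch language statistics with at most `max_concurrent` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_one(project_id):
        async with semaphore:
            try:
                return project_id, await fetch_languages(project_id)
            except Exception as exc:
                # A failure for one project is logged and does not stop the rest.
                logger.warning("Could not fetch languages for project %s: %s", project_id, exc)
                return project_id, None

    pairs = await asyncio.gather(*(fetch_one(pid) for pid in project_ids))
    return dict(pairs)
```

Lowering max_concurrent is one way to stay under a stricter rate limit on a self-hosted instance, at the cost of a longer sync.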
Multi-Instance Support
Cartography supports syncing from multiple GitLab instances simultaneously. Repository and group IDs are prefixed with the GitLab instance URL to prevent collisions:
https://gitlab.com/projects/12345
https://gitlab.example.com/projects/12345
Both can exist in the same Neo4j database without conflicts.
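The prefixing scheme amounts to simple string composition. The helper below is a hypothetical illustration that reproduces the two IDs shown above; the exact path segment used for resource types other than projects is an assumption.

```python
def make_node_id(instance_url: str, resource: str, numeric_id: int) -> str:
    """Combine the instance URL, resource type, and numeric ID into one key."""
    # Strip a trailing slash so both URL spellings give the same ID.
    return f"{instance_url.rstrip('/')}/{resource}/{numeric_id}"


# The same numeric project ID on two instances yields two distinct keys.
saas_id = make_node_id("https://gitlab.com", "projects", 12345)
self_hosted_id = make_node_id("https://gitlab.example.com/", "projects", 12345)
```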
Example: Self-Hosted GitLab
export GITLAB_TOKEN="glpat-abc123xyz"
cartography \
--neo4j-uri bolt://localhost:7687 \
--selected-modules gitlab \
--gitlab-url "https://gitlab.example.com" \
--gitlab-organization-id 12345678 \
--gitlab-token-env-var "GITLAB_TOKEN"
Troubleshooting
Connection timeout:
Default timeout is 60 seconds
For slow GitLab instances, the sync may take longer during language detection
Check GitLab instance health if repeated timeouts occur
Missing language data:
Some projects may not have language statistics available (empty repos, binary-only repos)
Errors fetching languages for individual projects are logged as warnings but don’t stop the sync
Missing dependency data:
Dependency scanning requires projects to have supported manifest files (package.json, requirements.txt, etc.)
The GitLab Dependency Scanning feature must be enabled for the project
Permission errors:
Ensure your token has all required scopes: read_user, read_repository, read_api
Verify the token hasn’t expired
Check that the GitLab user has access to the organization and projects you want to sync
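The first two checks can be automated against token metadata. The sketch below assumes the metadata comes from GitLab's GET /api/v4/personal_access_tokens/self endpoint, whose response includes scopes and expires_at fields; diagnose_token is a hypothetical helper, and treating the token as unusable from its expiry date onward is an assumption.

```python
from datetime import date

REQUIRED_SCOPES = {"read_user", "read_repository", "read_api"}


def diagnose_token(token_info: dict, today: date) -> list:
    """Return a list of human-readable problems found in the token metadata."""
    problems = []
    missing = REQUIRED_SCOPES - set(token_info.get("scopes", []))
    if missing:
        problems.append("missing scopes: " + ", ".join(sorted(missing)))
    expires_at = token_info.get("expires_at")  # ISO date string or None
    # Assumption: the token stops working at the start of its expiry date.
    if expires_at and date.fromisoformat(expires_at) <= today:
        problems.append(f"token expired on {expires_at}")
    return problems
```

An empty list means the scopes and expiry look correct, so the remaining suspect is the user's access to the organization and its projects.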
Organization not found:
Verify the --gitlab-organization-id is the correct numeric ID (not the group path)
Ensure the token’s user has access to the organization