GitLab¶
Cartography can sync repository, group, and programming language data from GitLab instances.
Module Features¶
Repositories: Comprehensive metadata for all GitLab projects including URLs, statistics, feature flags, and access settings
Groups: GitLab group (namespace) information with ownership relationships
Programming Languages: Language detection with usage percentages for all repositories
Multi-instance support: Sync from multiple GitLab instances without ID conflicts
Performance optimized: Parallel language fetching for large instances (tested with 3000+ repos)
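To illustrate how IDs from several instances can coexist in one graph, the sketch below qualifies each numeric GitLab ID with the host of the instance it came from. This is an assumption made for illustration only: the helper `qualified_id` and the ID layout are hypothetical and are not taken from Cartography's implementation.

```python
from urllib.parse import urlparse


def qualified_id(instance_url: str, kind: str, numeric_id: int) -> str:
    """Hypothetical helper: build an ID that stays unique across GitLab instances.

    GitLab numeric IDs are only unique within one instance, so the
    instance host is used as a prefix (an assumed scheme, not Cartography's).
    """
    host = urlparse(instance_url).netloc.lower()
    return f"{host}/{kind}/{numeric_id}"


# The same numeric project ID on two instances yields two distinct IDs.
id_a = qualified_id("https://gitlab.example.com", "projects", 42)
id_b = qualified_id("https://gitlab.internal.example.org", "projects", 42)
```

Any scheme that includes a stable instance identifier would serve the same purpose; the host name is simply the most readable choice for a sketch.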
Data Collected¶
GitLabRepository Nodes¶
Repository identification and paths
Multiple URL formats (web, HTTP clone, SSH clone, README)
Visibility and access settings (private/internal/public, archived)
Statistics (stars, forks, open issues)
Feature flags (issues, merge requests, wiki, snippets, container registry)
Timestamps (created, last activity)
Default branch information
GitLabGroup Nodes¶
Group names and paths
Full namespace hierarchy
Visibility settings
Web URLs
Programming Language Analysis¶
Language detection for all repositories
Usage percentages (e.g., 65.5% Python, 34.5% JavaScript)
Shared ProgrammingLanguage nodes across GitHub and GitLab modules
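As a rough illustration of the usage percentages described above, the sketch below turns a language-to-percentage mapping (the shape returned by GitLab's `GET /projects/:id/languages` REST endpoint) into one row per repository and language pair. The function `language_rows` and the row field names are hypothetical and are not Cartography's internal format.

```python
def language_rows(repo_id: str, languages: dict[str, float]) -> list[dict]:
    """Hypothetical helper: one row per (repository, language) pair.

    `languages` maps a language name to its usage percentage,
    for example {"Python": 65.5, "JavaScript": 34.5}.
    Rows are sorted by percentage, highest first.
    """
    ordered = sorted(languages.items(), key=lambda item: item[1], reverse=True)
    return [
        {"repo_id": repo_id, "language": name, "percentage": pct}
        for name, pct in ordered
    ]


rows = language_rows("example-repo", {"JavaScript": 34.5, "Python": 65.5})
```

Each row carries what is needed to create a LANGUAGE relationship with a `percentage` property pointing at a shared ProgrammingLanguage node.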
Graph Relationships¶
(:GitLabGroup)-[:OWNER]->(:GitLabRepository)-[:LANGUAGE{percentage}]->(:ProgrammingLanguage)
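The relationship pattern above can be queried directly. The following is a minimal sketch that assumes a session object exposing `run(query, **params)` in the style of the official Neo4j Python driver. The `name` properties on the nodes are assumptions (see the GitLab Schema for the actual property names); only the `percentage` relationship property comes from the pattern above. The function `repos_using_language` is hypothetical.

```python
# Cypher query following the documented pattern; the `name` properties are assumed.
REPOS_BY_LANGUAGE = """
MATCH (g:GitLabGroup)-[:OWNER]->(r:GitLabRepository)-[l:LANGUAGE]->(p:ProgrammingLanguage)
WHERE p.name = $language AND l.percentage >= $min_pct
RETURN g.name AS group_name, r.name AS repo_name, l.percentage AS pct
ORDER BY pct DESC
"""


def repos_using_language(session, language: str, min_pct: float = 50.0) -> list[tuple]:
    """Hypothetical helper: list (group, repo, percentage) for one language.

    `session` is any object with a Neo4j-style `run(query, **params)`
    method that yields records indexable by column name.
    """
    result = session.run(REPOS_BY_LANGUAGE, language=language, min_pct=min_pct)
    return [(rec["group_name"], rec["repo_name"], rec["pct"]) for rec in result]
```

With a real driver, the session would typically come from `GraphDatabase.driver(uri, auth=(user, password)).session()`.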
Configuration¶
See GitLab Configuration for setup instructions.
Schema¶
See GitLab Schema for detailed schema documentation and sample queries.
Scalability¶
The GitLab module has been tested against large instances (3000+ repositories). Language detection runs in parallel with 10 concurrent workers, so it scales to thousands of repositories.
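The worker pattern can be sketched with a bounded thread pool. This is an illustrative sketch rather than Cartography's actual code: the text above does not say whether threads or another concurrency model is used, and `fetch_all_languages` and the `fetch_one` callable are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_all_languages(
    project_ids: Iterable[int],
    fetch_one: Callable[[int], Dict[str, float]],
    max_workers: int = 10,  # matches the 10 concurrent workers mentioned above
) -> Dict[int, Dict[str, float]]:
    """Hypothetical helper: fetch language data for many projects in parallel.

    `fetch_one` is a caller-supplied function that returns the
    language-to-percentage mapping for a single project ID.
    """
    ids = list(project_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with ids.
        results = list(pool.map(fetch_one, ids))
    return dict(zip(ids, results))
```

Capping the pool size keeps the number of simultaneous API requests bounded, which matters when an instance enforces rate limits.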