Github Schema¶
graph LR
O(GitHubOrganization) -- OWNER --> R(GitHubRepository)
O -- RESOURCE --> T(GitHubTeam)
O -- RESOURCE --> OS(GitHubActionsSecret)
O -- RESOURCE --> OV(GitHubActionsVariable)
O -- RESOURCE --> A(GitHubAction)
U(GitHubUser) -- MEMBER_OF --> O
U -- ADMIN_OF --> O
U -- UNAFFILIATED --> O
U -- OWNER --> R
U -- OUTSIDE_COLLAB_{ACTION} --> R
U -- DIRECT_COLLAB_{ACTION} --> R
U -- COMMITTED_TO --> R
R -- LANGUAGE --> L(ProgrammingLanguage)
R -- BRANCH --> B(GitHubBranch)
R -- HAS_RULE --> BPR(GitHubBranchProtectionRule)
R -- REQUIRES --> D(Dependency)
R -- HAS_MANIFEST --> M(DependencyGraphManifest)
R -- HAS_WORKFLOW --> W(GitHubWorkflow)
R -- HAS_SECRET --> RS(GitHubActionsSecret)
R -- HAS_VARIABLE --> RV(GitHubActionsVariable)
R -- HAS_ENVIRONMENT --> E(GitHubEnvironment)
W -- USES_ACTION --> A(GitHubAction)
W -- REFERENCES_SECRET --> RS
E -- HAS_SECRET --> ES(GitHubActionsSecret)
E -- HAS_VARIABLE --> EV(GitHubActionsVariable)
M -- HAS_DEP --> D
T -- {ROLE} --> R
T -- MEMBER_OF_TEAM --> T
U -- MEMBER --> T
U -- MAINTAINER --> T
IT(ImageTag) -- PACKAGED_FROM --> R
I(Image) -- PACKAGED_BY --> W
GitHubRepository¶
Representation of a single GitHubRepository (repo) repository object. This node contains all data unique to the repo.
Ontology Mapping: This node has the extra label
CodeRepositoryto enable cross-platform queries for source code repositories across different systems (e.g., GitLabProject).
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The GitHub repo id. These are not unique across GitHub instances, so are prepended with the API URL the id applies to |
createdat |
GitHub timestamp from when the repo was created |
name |
Name of the repo |
fullname |
Name of the organization and repo together |
description |
Text describing the repo |
primarylanguage |
The primary language used in the repo |
homepage |
The website used as a homepage for project information |
defaultbranch |
The default branch used by the repo, typically master |
defaultbranchid |
The unique identifier of the default branch |
private |
True if repo is private |
disabled |
True if repo is disabled |
archived |
True if repo is archived |
locked |
True if repo is locked |
giturl |
URL used to access the repo from git commandline |
url |
Web URL for viewing the repo |
sshurl |
URL for access the repo via SSH |
updatedat |
GitHub timestamp for last time repo was modified |
Relationships¶
GitHubUsers or GitHubOrganizations own GitHubRepositories.
(GitHubUser)-[OWNER]->(GitHubRepository) (GitHubOrganization)-[OWNER]->(GitHubRepository)
GitHubRepositories in an organization can have outside collaborators who may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:OUTSIDE_COLLAB_{ACTION}]->(GitHubRepository)GitHubRepositories in an organization also mark all direct collaborators, folks who are not necessarily ‘outside’ but who are granted access directly to the repository (as opposed to via membership in a team). They may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:DIRECT_COLLAB_{ACTION}]->(GitHubRepository)GitHubRepositories use ProgrammingLanguages
(GitHubRepository)-[:LANGUAGE]->(ProgrammingLanguage)GitHubRepositories have GitHubBranches
(GitHubRepository)-[:BRANCH]->(GitHubBranch)GitHubRepositories have GitHubBranchProtectionRules.
(GitHubRepository)-[:HAS_RULE]->(GitHubBranchProtectionRule)GitHubTeams can have various levels of access to GitHubRepositories.
(GitHubTeam)-[ADMIN|READ|WRITE|TRIAGE|MAINTAIN]->(GitHubRepository)GitHubUsers who have committed to GitHubRepositories in the last 30 days are tracked with commit activity data.
(GitHubUser)-[:COMMITTED_TO]->(GitHubRepository)This relationship includes the following properties:
commit_count: Number of commits made by the user to the repository in the last 30 days
last_commit_date: ISO 8601 timestamp of the user’s most recent commit to the repository
first_commit_date: ISO 8601 timestamp of the user’s oldest commit to the repository within the 30-day period
GitHubOrganization¶
Representation of a single GitHubOrganization organization object. This node contains minimal data for the GitHub Organization.
Ontology Mapping: This node has the extra label
Tenantto enable cross-platform queries for organizational tenants across different systems (e.g., OktaOrganization, AWSAccount).
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The URL of the GitHub organization |
username |
Name of the organization |
Relationships¶
GitHubOrganizations own GitHubRepositories.
(GitHubOrganization)-[OWNER]->(GitHubRepository)GitHubTeams are resources under GitHubOrganizations
(GitHubOrganization)-[RESOURCE]->(GitHubTeam)GitHubUsers relate to GitHubOrganizations in a few ways:
Most typically, they are members of an organization.
They may also be org admins (aka org owners), with broad permissions over repo and team settings. In these cases, they will be graphed with two relationships between GitHubUser and GitHubOrganization, both
MEMBER_OFandADMIN_OF.In some cases there may be a user who is “unaffiliated” with an org, for example if the user is an enterprise owner, but not member of, the org. Enterprise owners have complete control over the enterprise (i.e. they can manage all enterprise settings, members, and policies) yet may not show up on member lists of the GitHub org.
# a typical member (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) # an admin member has two relationships to the org (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) (GitHubUser)-[ADMIN_OF]->(GitHubOrganization) # an unaffiliated user (e.g. an enterprise owner) (GitHubUser)-[UNAFFILIATED]->(GitHubOrganization)
GitHubTeam¶
A GitHubTeam organization object.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The URL of the GitHub Team |
name |
The name (a.k.a URL slug) of the GitHub Team |
description |
Description of the GitHub team |
Relationships¶
GitHubTeams can have various levels of access to GitHubRepositories.
(GitHubTeam)-[ADMIN|READ|WRITE|TRIAGE|MAINTAIN]->(GitHubRepository)GitHubTeams are resources under GitHubOrganizations
(GitHubOrganization)-[RESOURCE]->(GitHubTeam)GitHubTeams may be children of other teams:
(GitHubTeam)-[MEMBER_OF_TEAM]->(GitHubTeam)GitHubUsers may be ‘immediate’ members of a team (as opposed to being members via membership in a child team), with their membership role being MEMBER or MAINTAINER.
(GitHubUser)-[MEMBER|MAINTAINER]->(GitHubTeam)
GitHubUser¶
Representation of a single GitHubUser user object. This node contains minimal data for the GitHub User.
Ontology Mapping: This node has the extra label
UserAccountto enable cross-platform queries for user accounts across different systems (e.g., OktaUser, AWSSSOUser, EntraUser).
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The URL of the GitHub user |
username |
Name of the user |
fullname |
The full name |
has_2fa_enabled |
Whether the user has 2-factor authentication enabled |
is_site_admin |
Whether the user is a site admin |
is_enterprise_owner |
Whether the user is an enterprise owner |
permission |
Only present if the user is an outside collaborator of this repo. |
The user’s publicly visible profile email. |
|
company |
The user’s public profile company. |
organization_verified_domain_emails |
List of emails verified by the user’s organization. |
Relationships¶
GitHubUsers own GitHubRepositories.
(GitHubUser)-[OWNER]->(GitHubRepository)GitHubRepositories in an organization can have outside collaborators who may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:OUTSIDE_COLLAB_{ACTION}]->(GitHubRepository)GitHubRepositories in an organization also mark all direct collaborators, folks who are not necessarily ‘outside’ but who are granted access directly to the repository (as opposed to via membership in a team). They may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:DIRECT_COLLAB_{ACTION}]->(GitHubRepository)GitHubUsers relate to GitHubOrganizations in a few ways:
Most typically, they are members of an organization.
They may also be org admins (aka org owners), with broad permissions over repo and team settings. In these cases, they will be graphed with two relationships between GitHubUser and GitHubOrganization, both
MEMBER_OFandADMIN_OF.In some cases there may be a user who is “unaffiliated” with an org, for example if the user is an enterprise owner, but not member of, the org. Enterprise owners have complete control over the enterprise (i.e. they can manage all enterprise settings, members, and policies) yet may not show up on member lists of the GitHub org.
# a typical member (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) # an admin member has two relationships to the org (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) (GitHubUser)-[ADMIN_OF]->(GitHubOrganization) # an unaffiliated user (e.g. an enterprise owner) (GitHubUser)-[UNAFFILIATED]->(GitHubOrganization)
GitHubTeams may be children of other teams:
(GitHubTeam)-[MEMBER_OF_TEAM]->(GitHubTeam)GitHubUsers may be ‘immediate’ members of a team (as opposed to being members via membership in a child team), with their membership role being MEMBER or MAINTAINER.
(GitHubUser)-[MEMBER|MAINTAINER]->(GitHubTeam)GitHubUsers who have committed to GitHubRepositories in the last 30 days are tracked with commit activity data.
(GitHubUser)-[:COMMITTED_TO]->(GitHubRepository)This relationship includes the following properties:
commit_count: Number of commits made by the user to the repository in the last 30 days
last_commit_date: ISO 8601 timestamp of the user’s most recent commit to the repository
first_commit_date: ISO 8601 timestamp of the user’s oldest commit to the repository within the 30-day period
GitHubBranch¶
Representation of a single GitHubBranch ref object. This node contains minimal data for a repository branch.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The GitHub branch id. These are not unique across GitHub instances, so are prepended with the API URL the id applies to |
name |
Name of the branch |
Relationships¶
GitHubRepositories have GitHubBranches.
(GitHubBranch)<-[BRANCH]-(GitHubRepository)
GitHubBranchProtectionRule¶
Representation of a single GitHubBranchProtectionRule BranchProtectionRule object. This node contains branch protection configuration for repositories.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The GitHub branch protection rule id |
pattern |
The branch name pattern protected by this rule (e.g., “main”, “release/*”) |
allows_deletions |
Whether users can delete matching branches |
allows_force_pushes |
Whether force pushes are allowed on matching branches |
dismisses_stale_reviews |
Whether reviews are dismissed when new commits are pushed |
is_admin_enforced |
Whether admins must follow this rule |
requires_approving_reviews |
Whether pull requests require approval before merging |
required_approving_review_count |
Number of approvals required (if requires_approving_reviews is true) |
requires_code_owner_reviews |
Whether code owner review is required |
requires_commit_signatures |
Whether commits must be signed |
requires_linear_history |
Whether merge commits are prohibited |
requires_status_checks |
Whether status checks must pass before merging |
requires_strict_status_checks |
Whether branches must be up to date before merging |
restricts_pushes |
Whether push access is restricted |
restricts_review_dismissals |
Whether review dismissals are restricted |
Relationships¶
GitHubRepositories have GitHubBranchProtectionRules.
(GitHubRepository)-[:HAS_RULE]->(GitHubBranchProtectionRule)
ProgrammingLanguage¶
Representation of a single Programming Language language object. This node contains programming language information.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
Language ids need not be tracked across instances, so defaults to the name |
name |
Name of the language |
Relationships¶
GitHubRepositories use ProgrammingLanguages.
(ProgrammingLanguage)<-[LANGUAGE]-(GitHubRepository)
DependencyGraphManifest¶
Represents a dependency manifest file (e.g., package.json, requirements.txt, pom.xml) from GitHub’s dependency graph API.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first discovered this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
Unique identifier: |
blob_path |
Path to the manifest file in the repository (e.g., “/package.json”) |
filename |
Name of the manifest file (e.g., “package.json”) |
dependencies_count |
Number of dependencies listed in this manifest |
repo_url |
URL of the GitHub repository containing this manifest |
Relationships¶
GitHubRepository via HAS_MANIFEST relationship
GitHubRepositories can have multiple dependency manifests
(GitHubRepository)-[:HAS_MANIFEST]->(DependencyGraphManifest)
Dependency via HAS_DEP relationship
Each manifest lists specific dependencies
(DependencyGraphManifest)-[:HAS_DEP]->(Dependency)
Dependency¶
https://docs.github.com/en/graphql/reference/objects#dependencygraphdependency Represents a software dependency from GitHub’s dependency graph manifests. This node contains information about a package dependency within a repository
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first discovered this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
Identifier: `{canonical_name} |
name |
Canonical name of the dependency (ecosystem-specific normalization) |
original_name |
Original name as specified in the manifest file |
requirements |
Unparsed requirement string from the manifest (e.g., |
ecosystem |
Package ecosystem (npm, pip, maven, etc.) |
package_manager |
Package manager name (NPM, PIP, MAVEN, etc.) |
manifest_file |
Manifest filename (package.json, requirements.txt, etc.) |
version |
Exact version if pinned (e.g., |
type |
Package URL type (e.g., |
purl |
Package URL (e.g., |
normalized_id |
Normalized ID for cross-tool matching (format: |
Relationships¶
GitHubRepository via REQUIRES relationship
requirements: Original requirement string from manifest
manifest_path: Path to manifest file in repository
(GitHubRepository)-[:REQUIRES]->(Dependency)
DependencyGraphManifest via HAS_DEP relationship
Dependencies are linked to their specific manifest files
(DependencyGraphManifest)-[:HAS_DEP]->(Dependency)
ImageTag to GitHubRepository (Cross-module relationship)¶
Container images (ImageTag nodes from any registry: ECR, GitLab, GCR, etc.) can be linked to the GitHubRepository that contains the Dockerfile used to build them. This relationship is created by analyzing Dockerfile content and matching layer commands against image history.
Relationships¶
ImageTag nodes may be packaged from a GitHubRepository
(:ImageTag)-[:PACKAGED_FROM]->(:GitHubRepository)Relationship properties:
match_method: How the match was determined:
"provenance"(from SLSA attestation) or"dockerfile_analysis"(from command matching)dockerfile_path: Path to the Dockerfile in the repository (only for
dockerfile_analysismethod)confidence: Confidence score of the match (0.0 to 1.0, only for
dockerfile_analysismethod)matched_commands: Number of commands that matched between Dockerfile and image history (only for
dockerfile_analysismethod)total_commands: Total number of commands compared (only for
dockerfile_analysismethod)command_similarity: Average similarity score of matched commands (only for
dockerfile_analysismethod)
Note: This relationship uses the generic
ImageTagsemantic label, enabling cross-registry querying (ECR, GitLab, GCR, etc.).
Dependency::PythonLibrary¶
Representation of a Python library as listed in a requirements.txt
or setup.cfg file.
Within a setup.cfg file, cartography will load everything from install_requires, setup_requires, and extras_require.
Field |
Description |
|---|---|
id |
The canonicalized name of the library. If the library was pinned in a requirements file using the |
name |
The canonicalized name of the library. |
version |
The exact version of the library. This field is only present if the library was pinned in a requirements file using the |
Relationships¶
Software on Github repos can import Python libraries by optionally specifying a version number.
(GitHubRepository)-[:REQUIRES{specifier}]->(PythonLibrary)specifier: A string describing this library’s version e.g. “<4.0,>=3.0” or “==1.0.2”. This field is only present on the
:REQUIRESedge if the repo’s requirements file provided a version pin.
A Python Dependency is affected by a SemgrepSCAFinding (optional)
(:SemgrepSCAFinding)-[:AFFECTS]->(:PythonLibrary)
GitHubWorkflow¶
Represents a GitHub Actions workflow definition file in a repository.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first discovered this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The GitHub workflow ID |
name |
Name of the workflow |
path |
Path to the workflow file (e.g., |
state |
Workflow state: |
created_at |
Timestamp when the workflow was created |
updated_at |
Timestamp when the workflow was last updated |
repo_url |
URL of the repository containing this workflow (e.g., |
trigger_events |
List of events that trigger the workflow (e.g., |
permissions_actions |
Permission level for the |
permissions_contents |
Permission level for the |
permissions_packages |
Permission level for the |
permissions_pull_requests |
Permission level for the |
permissions_issues |
Permission level for the |
permissions_deployments |
Permission level for the |
permissions_statuses |
Permission level for the |
permissions_checks |
Permission level for the |
permissions_id_token |
Permission level for the |
permissions_security_events |
Permission level for the |
env_vars |
List of top-level environment variable names defined in the workflow |
job_count |
Number of jobs defined in the workflow |
has_reusable_workflow_calls |
Whether the workflow calls reusable workflows |
Relationships¶
GitHubRepositories have GitHubWorkflows.
(GitHubRepository)-[:HAS_WORKFLOW]->(GitHubWorkflow)GitHubWorkflows use GitHubActions.
(GitHubWorkflow)-[:USES_ACTION]->(GitHubAction)GitHubWorkflows reference GitHubActionsSecrets (detected via
${{ secrets.NAME }}patterns in the YAML).(GitHubWorkflow)-[:REFERENCES_SECRET]->(GitHubActionsSecret)Container images may be packaged by a GitHubWorkflow (derived from SLSA provenance attestations).
(:Image)-[:PACKAGED_BY]->(:GitHubWorkflow)Note: This relationship is created when SLSA provenance attestations specify the GitHub Actions workflow that built the container image. The
Imagelabel is a semantic label applied to container images across registries (ECR, GitLab, etc.).
GitHubAction¶
Represents a third-party GitHub Action used in workflows, parsed from workflow YAML uses references.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first discovered this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
Unique identifier in the format |
owner |
Owner of the action repository (e.g., |
name |
Name of the action (e.g., |
version |
Version reference (tag, branch, or SHA), or |
is_pinned |
Whether the action is pinned to a full 40-character SHA commit hash |
is_local |
Whether the action is a local action (path starting with |
full_name |
Full name of the action (e.g., |
Relationships¶
GitHubWorkflows use GitHubActions.
(GitHubWorkflow)-[:USES_ACTION]->(GitHubAction)GitHubOrganizations are sub-resources for GitHubActions (for cleanup scoping).
(GitHubOrganization)-[:RESOURCE]->(GitHubAction)
GitHubEnvironment¶
Represents a GitHub deployment environment for a repository.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first discovered this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The GitHub environment ID |
name |
Name of the environment (e.g., |
html_url |
Web URL for viewing the environment settings |
created_at |
Timestamp when the environment was created |
updated_at |
Timestamp when the environment was last updated |
Relationships¶
GitHubRepositories have GitHubEnvironments.
(GitHubRepository)-[:HAS_ENVIRONMENT]->(GitHubEnvironment)GitHubEnvironments can have GitHubActionsSecrets.
(GitHubEnvironment)-[:HAS_SECRET]->(GitHubActionsSecret)GitHubEnvironments can have GitHubActionsVariables.
(GitHubEnvironment)-[:HAS_VARIABLE]->(GitHubActionsVariable)
GitHubActionsSecret¶
Represents a GitHub Actions secret. Secrets can exist at three levels: organization, repository, or environment. Note that secret values are never exposed by the GitHub API - only metadata is stored.
Ontology Mapping: This node has the extra label
Secretand normalized_ont_*properties for cross-platform secret queries. See Secret.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first discovered this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
Unique identifier (composite URL based on level) |
name |
Name of the secret |
level |
Level of the secret: |
visibility |
Visibility setting (organization-level only): |
created_at |
Timestamp when the secret was created |
updated_at |
Timestamp when the secret was last updated |
Relationships¶
GitHubOrganizations have organization-level GitHubActionsSecrets.
(GitHubOrganization)-[:RESOURCE]->(GitHubActionsSecret {level: "organization"})GitHubRepositories have repository-level GitHubActionsSecrets.
(GitHubRepository)-[:HAS_SECRET]->(GitHubActionsSecret {level: "repository"})GitHubEnvironments have environment-level GitHubActionsSecrets.
(GitHubEnvironment)-[:HAS_SECRET]->(GitHubActionsSecret {level: "environment"})GitHubOrganizations are sub-resources for environment-level GitHubActionsSecrets (for cleanup scoping).
(GitHubOrganization)-[:RESOURCE]->(GitHubActionsSecret {level: "environment"})
GitHubActionsVariable¶
Represents a GitHub Actions variable. Variables can exist at three levels: organization, repository, or environment. Unlike secrets, variable values are stored in plaintext.
Field |
Description |
|---|---|
firstseen |
Timestamp of when a sync job first discovered this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
Unique identifier (composite URL based on level) |
name |
Name of the variable |
value |
Value of the variable (plaintext) |
level |
Level of the variable: |
visibility |
Visibility setting (organization-level only): |
created_at |
Timestamp when the variable was created |
updated_at |
Timestamp when the variable was last updated |
Relationships¶
GitHubOrganizations have organization-level GitHubActionsVariables.
(GitHubOrganization)-[:RESOURCE]->(GitHubActionsVariable {level: "organization"})GitHubRepositories have repository-level GitHubActionsVariables.
(GitHubRepository)-[:HAS_VARIABLE]->(GitHubActionsVariable {level: "repository"})GitHubEnvironments have environment-level GitHubActionsVariables.
(GitHubEnvironment)-[:HAS_VARIABLE]->(GitHubActionsVariable {level: "environment"})GitHubOrganizations are sub-resources for environment-level GitHubActionsVariables (for cleanup scoping).
(GitHubOrganization)-[:RESOURCE]->(GitHubActionsVariable {level: "environment"})