Github Schema¶
GitHubRepository¶
Representation of a single GitHubRepository (repo) repository object. This node contains all data unique to the repo.
Field |
Description |
---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The GitHub repo id. These are not unique across GitHub instances, so are prepended with the API URL the id applies to |
createdat |
GitHub timestamp from when the repo was created |
name |
Name of the repo |
fullname |
Name of the organization and repo together |
description |
Text describing the repo |
primarylanguage |
The primary language used in the repo |
homepage |
The website used as a homepage for project information |
defaultbranch |
The default branch used by the repo, typically master |
defaultbranchid |
The unique identifier of the default branch |
private |
True if repo is private |
disabled |
True if repo is disabled |
archived |
True if repo is archived |
locked |
True if repo is locked |
giturl |
URL used to access the repo from git commandline |
url |
Web URL for viewing the repo |
sshurl |
URL for access the repo via SSH |
updatedat |
GitHub timestamp for last time repo was modified |
Relationships¶
GitHubUsers or GitHubOrganizations own GitHubRepositories.
(GitHubUser)-[OWNER]->(GitHubRepository) (GitHubOrganization)-[OWNER]->(GitHubRepository)
GitHubRepositories in an organization can have outside collaborators who may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:OUTSIDE_COLLAB_{ACTION}]->(GitHubRepository)
GitHubRepositories in an organization also mark all direct collaborators, folks who are not necessarily ‘outside’ but who are granted access directly to the repository (as opposed to via membership in a team). They may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:DIRECT_COLLAB_{ACTION}]->(GitHubRepository)
GitHubRepositories use ProgrammingLanguages
(GitHubRepository)-[:LANGUAGE]->(ProgrammingLanguage)
GitHubRepositories have GitHubBranches
(GitHubRepository)-[:BRANCH]->(GitHubBranch)
GitHubTeams can have various levels of access to GitHubRepositories.
(GitHubTeam)-[ADMIN|READ|WRITE|TRIAGE|MAINTAIN]->(GitHubRepository)
GitHubOrganization¶
Representation of a single GitHubOrganization organization object. This node contains minimal data for the GitHub Organization.
Field |
Description |
---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The URL of the GitHub organization |
username |
Name of the organization |
Relationships¶
GitHubOrganizations own GitHubRepositories.
(GitHubOrganization)-[OWNER]->(GitHubRepository)
GitHubTeams are resources under GitHubOrganizations
(GitHubOrganization)-[RESOURCE]->(GitHubTeam)
GitHubUsers relate to GitHubOrganizations in a few ways:
Most typically, they are members of an organization.
They may also be org admins (aka org owners), with broad permissions over repo and team settings. In these cases, they will be graphed with two relationships between GitHubUser and GitHubOrganization, both
MEMBER_OF
andADMIN_OF
.In some cases there may be a user who is “unaffiliated” with an org, for example if the user is an enterprise owner, but not member of, the org. Enterprise owners have complete control over the enterprise (i.e. they can manage all enterprise settings, members, and policies) yet may not show up on member lists of the GitHub org.
# a typical member (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) # an admin member has two relationships to the org (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) (GitHubUser)-[ADMIN_OF]->(GitHubOrganization) # an unaffiliated user (e.g. an enterprise owner) (GitHubUser)-[UNAFFILIATED]->(GitHubOrganization)
GitHubTeam¶
A GitHubTeam organization object.
Field |
Description |
---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The URL of the GitHub Team |
name |
The name (a.k.a URL slug) of the GitHub Team |
description |
Description of the GitHub team |
Relationships¶
GitHubTeams can have various levels of access to GitHubRepositories.
(GitHubTeam)-[ADMIN|READ|WRITE|TRIAGE|MAINTAIN]->(GitHubRepository)
GitHubTeams are resources under GitHubOrganizations
(GitHubOrganization)-[RESOURCE]->(GitHubTeam)
GitHubTeams may be children of other teams:
(GitHubTeam)-[MEMBER_OF_TEAM]->(GitHubTeam)
GitHubUsers may be ‘immediate’ members of a team (as opposed to being members via membership in a child team), with their membership role being MEMBER or MAINTAINER.
(GitHubUser)-[MEMBER|MAINTAINER]->(GitHubTeam)
GitHubUser¶
Representation of a single GitHubUser user object. This node contains minimal data for the GitHub User.
Field |
Description |
---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The URL of the GitHub user |
username |
Name of the user |
fullname |
The full name |
has_2fa_enabled |
Whether the user has 2-factor authentication enabled |
is_site_admin |
Whether the user is a site admin |
is_enterprise_owner |
Whether the user is an enterprise owner |
permission |
Only present if the user is an outside collaborator of this repo. |
The user’s publicly visible profile email. |
|
company |
The user’s public profile company. |
Relationships¶
GitHubUsers own GitHubRepositories.
(GitHubUser)-[OWNER]->(GitHubRepository)
GitHubRepositories in an organization can have outside collaborators who may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:OUTSIDE_COLLAB_{ACTION}]->(GitHubRepository)
GitHubRepositories in an organization also mark all direct collaborators, folks who are not necessarily ‘outside’ but who are granted access directly to the repository (as opposed to via membership in a team). They may be granted different levels of access, including ADMIN, WRITE, MAINTAIN, TRIAGE, and READ (Reference).
(GitHubUser)-[:DIRECT_COLLAB_{ACTION}]->(GitHubRepository)
GitHubUsers relate to GitHubOrganizations in a few ways:
Most typically, they are members of an organization.
They may also be org admins (aka org owners), with broad permissions over repo and team settings. In these cases, they will be graphed with two relationships between GitHubUser and GitHubOrganization, both
MEMBER_OF
andADMIN_OF
.In some cases there may be a user who is “unaffiliated” with an org, for example if the user is an enterprise owner, but not member of, the org. Enterprise owners have complete control over the enterprise (i.e. they can manage all enterprise settings, members, and policies) yet may not show up on member lists of the GitHub org.
# a typical member (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) # an admin member has two relationships to the org (GitHubUser)-[MEMBER_OF]->(GitHubOrganization) (GitHubUser)-[ADMIN_OF]->(GitHubOrganization) # an unaffiliated user (e.g. an enterprise owner) (GitHubUser)-[UNAFFILIATED]->(GitHubOrganization)
GitHubTeams may be children of other teams:
(GitHubTeam)-[MEMBER_OF_TEAM]->(GitHubTeam)
GitHubUsers may be ‘immediate’ members of a team (as opposed to being members via membership in a child team), with their membership role being MEMBER or MAINTAINER.
(GitHubUser)-[MEMBER|MAINTAINER]->(GitHubTeam)
GitHubBranch¶
Representation of a single GitHubBranch ref object. This node contains minimal data for a repository branch.
Field |
Description |
---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
The GitHub branch id. These are not unique across GitHub instances, so are prepended with the API URL the id applies to |
name |
Name of the branch |
Relationships¶
GitHubRepositories have GitHubBranches.
(GitHubBranch)<-[BRANCH]-(GitHubRepository)
ProgrammingLanguage¶
Representation of a single Programming Language language object. This node contains programming language information.
Field |
Description |
---|---|
firstseen |
Timestamp of when a sync job first created this node |
lastupdated |
Timestamp of the last time the node was updated |
id |
Language ids need not be tracked across instances, so defaults to the name |
name |
Name of the language |
Relationships¶
GitHubRepositories use ProgrammingLanguages.
(ProgrammingLanguage)<-[LANGUAGE]-(GitHubRepository)
Dependency::PythonLibrary¶
Representation of a Python library as listed in a requirements.txt
or setup.cfg file.
Within a setup.cfg file, cartography will load everything from install_requires
, setup_requires
, and extras_require
.
Field |
Description |
---|---|
id |
The canonicalized name of the library. If the library was pinned in a requirements file using the |
name |
The canonicalized name of the library. |
version |
The exact version of the library. This field is only present if the library was pinned in a requirements file using the |
Relationships¶
Software on Github repos can import Python libraries by optionally specifying a version number.
(GitHubRepository)-[:REQUIRES{specifier}]->(PythonLibrary)
specifier: A string describing this library’s version e.g. “<4.0,>=3.0” or “==1.0.2”. This field is only present on the
:REQUIRES
edge if the repo’s requirements file provided a version pin.
A Python Dependency is affected by a SemgrepSCAFinding (optional)
(:SemgrepSCAFinding)-[:AFFECTS]->(:PythonLibrary)