Databricks Schema
graph LR
W(DatabricksWorkspace) -- RESOURCE --> U(DatabricksUser)
W -- RESOURCE --> S(DatabricksServicePrincipal)
W -- RESOURCE --> G(DatabricksGroup)
W -- RESOURCE --> T(DatabricksToken)
W -- RESOURCE --> CP(DatabricksClusterPolicy)
W -- RESOURCE --> IP(DatabricksInstancePool)
W -- RESOURCE --> C(DatabricksCluster)
W -- RESOURCE --> SS(DatabricksSecretScope)
W -- RESOURCE --> IPL(DatabricksIpAccessList)
U -- MEMBER_OF --> G
S -- MEMBER_OF --> G
G -- MEMBER_OF --> G
U -- OWNER_OF --> T
S -- OWNER_OF --> T
C -- HAS_POLICY --> CP
C -- USES_INSTANCE_POOL --> IP
DatabricksWorkspace
A Databricks workspace, scoped by host URL.
Ontology Mapping: This node has the extra label Tenant to enable cross-platform queries for organizational tenants across different systems.

| Field | Description |
|---|---|
| id | Workspace host (e.g. |
| host | Full workspace URL (indexed) |
| tokens_enabled | Whether PATs are enabled in the workspace |
| max_token_lifetime_days | Max PAT lifetime in days from the workspace token management settings, or null when the workspace is on the Databricks default policy (the API encodes that as the string |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- DatabricksUser, DatabricksServicePrincipal, DatabricksGroup, DatabricksToken, DatabricksClusterPolicy, DatabricksInstancePool, DatabricksCluster, DatabricksSecretScope, and DatabricksIpAccessList nodes belong to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->( :DatabricksUser, :DatabricksServicePrincipal, :DatabricksGroup, :DatabricksToken, :DatabricksClusterPolicy, :DatabricksInstancePool, :DatabricksCluster, :DatabricksSecretScope, :DatabricksIpAccessList )
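To make the RESOURCE relationship concrete, here is a minimal sketch that counts a workspace's resources per node label. It assumes a `session` object that behaves like a Neo4j Python driver session (a `run(query, **params)` method yielding records indexable by key); the function name `count_resources_by_label` and the query constant are hypothetical, and the `host` and `id` properties are taken from the field tables in this document.

```python
# Cypher built from the (:DatabricksWorkspace)-[:RESOURCE]->(...) pattern.
# Property names (host, id) follow the field tables; nothing else is assumed
# about the stored data.
WORKSPACE_RESOURCES_QUERY = """
MATCH (w:DatabricksWorkspace {host: $host})-[:RESOURCE]->(r)
RETURN labels(r) AS labels, r.id AS id
"""


def count_resources_by_label(session, host):
    """Count the resources attached to one workspace, grouped by label.

    `session` is assumed to expose run(query, **params) and to yield
    records that can be indexed by key, as a Neo4j driver session does.
    """
    counts = {}
    for record in session.run(WORKSPACE_RESOURCES_QUERY, host=host):
        for label in record["labels"]:
            # Ignore extra ontology labels such as UserAccount or UserGroup.
            if label.startswith("Databricks"):
                counts[label] = counts.get(label, 0) + 1
    return counts
```

Filtering on the `Databricks` prefix keeps the ontology labels out of the tally, so a user node is counted once rather than once per label.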
DatabricksUser
A workspace SCIM user.
Ontology Mapping: This node has the extra label UserAccount to enable cross-platform queries for user accounts across different systems.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| scim_id | Raw SCIM user ID returned by Databricks (indexed) |
| user_name | SCIM |
|  | Primary email address (indexed) |
| display_name | SCIM display name |
| external_id | External SCIM ID (federation) |
| active | Whether the user is active |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksUser belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksUser)
- A DatabricksUser is a member of one or more DatabricksGroup.
  (:DatabricksUser)-[:MEMBER_OF]->(:DatabricksGroup)
DatabricksServicePrincipal
A workspace SCIM service principal.
Ontology Mapping: This node has the extra label ServiceAccount to enable cross-platform queries for non-human accounts across different systems.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| scim_id | Raw SCIM service principal ID (indexed) |
| application_id | OAuth application ID (indexed) |
| display_name | SCIM display name |
| external_id | External SCIM ID (federation) |
| active | Whether the service principal is active |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksServicePrincipal belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksServicePrincipal)
- A DatabricksServicePrincipal is a member of one or more DatabricksGroup.
  (:DatabricksServicePrincipal)-[:MEMBER_OF]->(:DatabricksGroup)
DatabricksGroup
A workspace SCIM group.
Ontology Mapping: This node has the extra label UserGroup to enable cross-platform group queries.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| scim_id | Raw SCIM group ID (indexed) |
| display_name | Group display name (indexed) |
| external_id | External SCIM ID (federation) |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksGroup belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksGroup)
- A DatabricksGroup can be a member of another DatabricksGroup (nested groups).
  (:DatabricksGroup)-[:MEMBER_OF]->(:DatabricksGroup)
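Because groups can nest, a principal's effective membership is the set of groups reachable over one or more MEMBER_OF edges. The sketch below illustrates that traversal on an in-memory edge map; the function name `effective_groups` and the sample ids are hypothetical and only stand in for node ids from the graph.

```python
from collections import deque


def effective_groups(principal_id, member_of):
    """Return every group reachable from principal_id via MEMBER_OF edges.

    member_of maps a node id to the ids of the groups it directly belongs to.
    """
    seen = set()
    queue = deque(member_of.get(principal_id, ()))
    while queue:
        group_id = queue.popleft()
        if group_id in seen:
            continue  # already visited; also guards against membership cycles
        seen.add(group_id)
        # Follow nested groups: a group can itself be a member of other groups.
        queue.extend(member_of.get(group_id, ()))
    return seen


# Hypothetical sample data: a user in a group nested two levels deep.
edges = {
    "user-1": ["group-data-eng"],
    "group-data-eng": ["group-engineering"],
    "group-engineering": ["group-all-staff"],
}
print(sorted(effective_groups("user-1", edges)))
```

In Cypher, the same traversal is commonly expressed with a variable-length pattern such as `(:DatabricksUser)-[:MEMBER_OF*]->(:DatabricksGroup)`.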
DatabricksToken
A Databricks personal access token (PAT) returned by the token management API.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| token_id | Raw token id returned by the token-management API (indexed) |
| comment | Token description provided at creation |
| creation_time | Native datetime when the token was created (UTC) |
| expiry_time | Native datetime when the token expires (UTC); null when the token has no expiry |
| owner_id | Workspace-scoped composite id of the token owner (matches |
| created_by_id | Workspace-scoped composite id of the principal that created the token |
| created_by_username | Username/email of the principal that created the token (indexed) |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksToken belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksToken)
- A DatabricksUser or DatabricksServicePrincipal owns a DatabricksToken.
  (:DatabricksUser)-[:OWNER_OF]->(:DatabricksToken)
  (:DatabricksServicePrincipal)-[:OWNER_OF]->(:DatabricksToken)
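The token fields combine naturally with the workspace's max_token_lifetime_days. As a rough sketch, the function below flags tokens that never expire or whose lifetime exceeds the workspace maximum. The function name and the finding labels are hypothetical; the inputs are assumed to be the creation_time and expiry_time datetimes and the max_token_lifetime_days value described in the tables, with None standing in for null.

```python
from datetime import datetime, timedelta, timezone


def token_findings(creation_time, expiry_time, max_token_lifetime_days):
    """Return a list of hypothetical finding labels for one token."""
    findings = []
    if expiry_time is None:
        # expiry_time is null when the token has no expiry.
        findings.append("no-expiry")
    elif max_token_lifetime_days is not None:
        # Only compare when the workspace sets an explicit maximum;
        # a null maximum means the Databricks default policy applies.
        allowed = timedelta(days=max_token_lifetime_days)
        if expiry_time - creation_time > allowed:
            findings.append("exceeds-workspace-max-lifetime")
    return findings


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(token_findings(created, created + timedelta(days=120), 90))
print(token_findings(created, None, 90))
```

The comparison is skipped when the workspace maximum is null, since the document does not say what the default policy's limit is.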
DatabricksClusterPolicy
A cluster policy returned by the policies API. Cluster policies define a set of allowed configurations a DatabricksCluster can be launched with.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| policy_id | Raw policy id (indexed) |
| name | Policy display name (indexed) |
| description | Free-text description |
| definition | JSON-encoded policy definition (allowed fields, fixed values, …) |
| policy_family_id | Policy family id when the policy is derived from a Databricks-provided family |
| creator_user_name | User name of the policy creator (indexed) |
| created_at | Native datetime when the policy was created (UTC) |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksClusterPolicy belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksClusterPolicy)
- A DatabricksCluster is launched against a DatabricksClusterPolicy.
  (:DatabricksCluster)-[:HAS_POLICY]->(:DatabricksClusterPolicy)
DatabricksInstancePool
A pre-warmed instance pool that clusters can pull nodes from to reduce startup latency.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| instance_pool_id | Raw pool id (indexed) |
| instance_pool_name | Pool display name (indexed) |
| node_type_id | Underlying VM instance type id |
| min_idle_instances | Minimum number of idle instances kept warm |
| max_capacity | Maximum number of instances the pool can scale to |
| idle_instance_autotermination_minutes | Idle instance reclaim window |
| enable_elastic_disk | Whether elastic disk autoscaling is enabled |
| state | Pool state ( |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksInstancePool belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksInstancePool)
- A DatabricksCluster allocates nodes from a DatabricksInstancePool.
  (:DatabricksCluster)-[:USES_INSTANCE_POOL]->(:DatabricksInstancePool)
DatabricksCluster
A Databricks compute cluster returned by the clusters 2.1 API.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| cluster_id | Raw cluster id (indexed) |
| cluster_name | Cluster display name (indexed) |
| state | Cluster state ( |
| spark_version | Spark / Databricks runtime version string |
| runtime_engine | Runtime engine ( |
| node_type_id | Worker node VM type id |
| driver_node_type_id | Driver node VM type id |
| num_workers | Static worker count (null when autoscaling is enabled) |
| autotermination_minutes | Idle auto-termination window in minutes |
| cluster_source | What created the cluster ( |
| data_security_mode | UC access mode ( |
| single_user_name | Owning user for single-user UC clusters (indexed) |
| creator_user_name | User name of the cluster creator (indexed) |
| instance_pool_id | Raw worker instance pool id, when the cluster targets one (indexed) |
| driver_instance_pool_id | Raw driver instance pool id, when the driver targets a distinct pool (indexed) |
| enable_local_disk_encryption | Whether local disks are encrypted |
| enable_elastic_disk | Whether elastic disk autoscaling is enabled |
| start_time | Native datetime when the cluster was first started (UTC) |
| terminated_time | Native datetime when the cluster was last terminated (UTC), if applicable |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksCluster belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksCluster)
- A DatabricksCluster is governed by a DatabricksClusterPolicy.
  (:DatabricksCluster)-[:HAS_POLICY]->(:DatabricksClusterPolicy)
- A DatabricksCluster allocates nodes from one or more DatabricksInstancePool nodes: the worker pool and, when set, a distinct driver pool are both linked through this relationship.
  (:DatabricksCluster)-[:USES_INSTANCE_POOL]->(:DatabricksInstancePool)
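The worker and driver pools share a single relationship type, so one cluster can carry one or two USES_INSTANCE_POOL edges. The sketch below shows how the target pool ids could be derived from a cluster record. It is an illustration rather than the sync job's actual logic: the function name is hypothetical, and it assumes the raw ids in instance_pool_id and driver_instance_pool_id are what identify the DatabricksInstancePool nodes.

```python
def pool_ids_for_cluster(cluster):
    """Return the distinct raw pool ids a cluster record points at.

    `cluster` is a dict carrying the instance_pool_id and
    driver_instance_pool_id fields from the DatabricksCluster table.
    """
    candidates = (
        cluster.get("instance_pool_id"),
        cluster.get("driver_instance_pool_id"),
    )
    # A set collapses the case where driver and workers share one pool,
    # and the filter drops ids that are missing or null.
    return {pool_id for pool_id in candidates if pool_id}


# Hypothetical record: workers and driver drawn from different pools.
cluster = {"instance_pool_id": "pool-workers", "driver_instance_pool_id": "pool-driver"}
print(sorted(pool_ids_for_cluster(cluster)))
```

A cluster that does not use pools yields an empty set and therefore no USES_INSTANCE_POOL edge under this assumption.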
DatabricksSecretScope
A Databricks secret scope. Scopes can be backed by Databricks’s own store (DATABRICKS) or by an Azure Key Vault (AZURE_KEYVAULT).

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| name | Scope name (indexed) |
| backend_type | Backing store ( |
| keyvault_resource_id | Azure Key Vault resource id when backend is |
| keyvault_dns_name | Azure Key Vault DNS name when backend is |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksSecretScope belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksSecretScope)
DatabricksIpAccessList
An IP access list applied at the workspace level. An allow list restricts inbound access to the workspace to the ranges it contains; a block list denies access from the ranges it contains.

| Field | Description |
|---|---|
| id | Workspace-scoped composite id |
| list_id | Raw list id (indexed) |
| label | List label (indexed) |
| list_type | List type ( |
| enabled | Whether the list is enforced |
| address_count | Number of addresses in the list |
| ip_addresses | Source CIDR / IP entries in the list |
| created_at | Native datetime when the list was created (UTC) |
| updated_at | Native datetime when the list was last updated (UTC) |
| firstseen | Timestamp of when a sync job first created this node |
| lastupdated | Timestamp of the last time the node was updated |
Relationships
- A DatabricksIpAccessList belongs to a DatabricksWorkspace.
  (:DatabricksWorkspace)-[:RESOURCE]->(:DatabricksIpAccessList)
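To show how the list_type, enabled, and ip_addresses fields might be combined, here is a minimal sketch that evaluates a source address against a workspace's lists. Several things are assumptions not stated in this document: the list_type values are taken to be the strings `ALLOW` and `BLOCK`, block lists are checked first, and an absence of enabled allow lists means every non-blocked address is admitted. The function name is hypothetical.

```python
import ipaddress


def is_ip_allowed(ip, access_lists):
    """Evaluate an address against DatabricksIpAccessList-style records.

    Each record is a dict with list_type, enabled, and ip_addresses.
    The "ALLOW"/"BLOCK" values and the evaluation order are assumptions.
    """
    addr = ipaddress.ip_address(ip)

    def matches(entries):
        # strict=False lets single IPs and CIDRs with host bits both parse.
        return any(addr in ipaddress.ip_network(e, strict=False) for e in entries)

    enabled = [lst for lst in access_lists if lst.get("enabled")]
    allow = [lst for lst in enabled if lst["list_type"] == "ALLOW"]
    block = [lst for lst in enabled if lst["list_type"] == "BLOCK"]

    if any(matches(lst["ip_addresses"]) for lst in block):
        return False  # assumed: a block match always wins
    if not allow:
        return True  # assumed: no enabled allow list means no restriction
    return any(matches(lst["ip_addresses"]) for lst in allow)


# Hypothetical sample lists.
lists = [
    {"list_type": "ALLOW", "enabled": True, "ip_addresses": ["10.0.0.0/8"]},
    {"list_type": "BLOCK", "enabled": True, "ip_addresses": ["10.1.2.3"]},
]
print(is_ip_allowed("10.0.0.5", lists), is_ip_allowed("10.1.2.3", lists))
```

Lists with enabled set to false are ignored, matching the description of enabled as whether the list is enforced.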