Ontology in Cartography
What is an Ontology?
An ontology is a structured representation of concepts and relationships within a given domain. It enables semantic modeling of knowledge, making it easier to organize, analyze, and query data. In Cartography, the ontology defines entity (node) types and their relationships, using semantic labels and abstract nodes for better interoperability and extensibility.
A key benefit of this approach is that it enables cross-module queries and the export of data in a unified format. By providing a common semantic layer, different modules can interact seamlessly, and data from various sources can be normalized and exported consistently.
Implementation in Cartography
Cartography implements ontology using two main concepts:
Semantic Label: Each graph node can have one or more semantic labels describing its nature (e.g., UserAccount, DNSRecord).
Abstract Nodes: Some nodes serve as abstract concepts to group similar entities or define common behaviors. This allows shared logic to be factored out and keeps the model consistent.
How Ontology Works in Cartography
The intel.ontology module in Cartography manages ontology logic. It allows:
Loading ontology definitions from JSON files
Validating the consistency of entities and relationships
Ensuring mapping between collected data and the defined semantic model
The module provides functions to traverse, enrich, and leverage the ontology during data ingestion. It plays a key role in normalizing entities from heterogeneous sources.
By default, ontology nodes are created from data observed in any module. For some node types, such as User and Device, you can designate “source of truth” modules that are the only ones allowed to create those nodes. This gives a more controlled and accurate representation of those entities.
Example: User Nodes and Source of Truth
If you set the --ontology-users-source parameter to duo, then a User node will be created for every account found in Duo. In contrast, for other integrations like Tailscale, only existing User nodes (those created by the source of truth) will be linked to Tailscale accounts. No new User nodes will be created from Tailscale data alone.
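The behavior described above can be pictured with a toy model. This is only a sketch on plain Python data, not Cartography's implementation: the function name build_user_graph is hypothetical, and matching accounts to User nodes by email address is an assumption made for the illustration.

```python
def build_user_graph(accounts_by_module, users_source):
    """Return (user_nodes, links) for a toy ontology pass.

    accounts_by_module maps a module name to a list of account emails.
    Only the module named by users_source may create User nodes.
    """
    # Step 1: the source of truth creates one User node per account.
    user_nodes = set(accounts_by_module.get(users_source, []))

    # Step 2: every module links its accounts to existing User nodes only.
    links = []
    for module, emails in accounts_by_module.items():
        for email in emails:
            if email in user_nodes:
                links.append((email, module))
    return user_nodes, sorted(links)


accounts = {
    "duo": ["alice@example.com", "bob@example.com"],
    "tailscale": ["bob@example.com", "carol@example.com"],
}
users, links = build_user_graph(accounts, users_source="duo")
```

Here carol@example.com exists only in the Tailscale data, so no User node is created for her and her Tailscale account stays unlinked.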
Ontology Field Mappings
Ontology mappings are defined in Python using the OntologyFieldMapping class, which maps fields from source nodes to ontology nodes. Each mapping specifies:
ontology_field: The field name in the ontology node (e.g., “email”, “hostname”)
node_field: The corresponding field name in the source node (e.g., “email_address”, “device_name”)
required: Whether this field is required for ontology node creation (defaults to False)
Example:
OntologyFieldMapping(
    ontology_field="email",
    node_field="email_address",
    required=True
)
Required Fields
The required flag serves two critical purposes:
Data Quality Control: If a source node lacks a required field (i.e., the field is None or missing), the entire ontology node creation is skipped for that record. This ensures only complete, usable data creates ontology nodes.
Primary Identifier Validation: Fields that serve as primary identifiers must be marked as required. For example:
email should be required for User ontology nodes
hostname should be required for Device ontology nodes
This prevents creating ontology nodes that cannot be properly identified or matched across different data sources.
Example with Required Field:
# If a source DuoUser has no email, no User ontology node is created
OntologyFieldMapping(ontology_field="email", node_field="email", required=True)
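The skip-on-missing rule can be sketched on plain dictionaries. The names FieldMappingSketch and project_record are hypothetical stand-ins introduced for this illustration, and keeping optional fields with a None value is an assumption; the real module may handle them differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class FieldMappingSketch:
    """Hypothetical stand-in for OntologyFieldMapping (illustration only)."""
    ontology_field: str
    node_field: str
    required: bool = False


def project_record(
    record: dict[str, Any],
    mappings: list[FieldMappingSketch],
) -> Optional[dict[str, Any]]:
    """Map a source record to ontology fields, or return None to skip it."""
    projected = {}
    for mapping in mappings:
        value = record.get(mapping.node_field)  # missing keys read as None
        if value is None and mapping.required:
            return None  # a required field is absent: skip the whole record
        projected[mapping.ontology_field] = value
    return projected


mappings = [
    FieldMappingSketch("email", "email_address", required=True),
    FieldMappingSketch("fullname", "display_name"),
]
complete = project_record(
    {"email_address": "a@example.com", "display_name": "A"}, mappings
)
incomplete = project_record({"display_name": "No Email"}, mappings)
```

In this sketch, complete holds the projected ontology fields, while incomplete is None because the required email_address is missing from the source record.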
Special Field Handling
Ontology field mappings support special handling for complex data transformations using the special_handling parameter:
invert_boolean
Inverts boolean values - useful when a source field represents the opposite of the ontology field:
OntologyFieldMapping(
    ontology_field="inactive",
    node_field="account_enabled",
    special_handling="invert_boolean",
)
# account_enabled=True becomes inactive=False
to_boolean
Converts any non-null value to True, null/missing values to False:
OntologyFieldMapping(
    ontology_field="has_mfa",
    node_field="multifactor",
    special_handling="to_boolean",
)
# Any non-null multifactor value becomes has_mfa=True
or_boolean
Combines multiple boolean fields using logical OR - useful when a concept spans multiple source fields:
OntologyFieldMapping(
    ontology_field="inactive",
    node_field="suspended",
    special_handling="or_boolean",
    extra={"fields": ["archived"]},
)
# inactive = suspended OR archived
equal_boolean
Checks if the field value equals any of the specified values:
OntologyFieldMapping(
    ontology_field="inactive",
    node_field="status",
    special_handling="equal_boolean",
    extra={"values": ["disabled", "locked out", "pending deletion"]},
)
# inactive=True if status is "disabled", "locked out", or "pending deletion"
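Taken together, the four special_handling modes described in this section behave roughly like the function below. It is a sketch evaluated on a plain dictionary, not the way Cartography applies the transformations internally. The function name apply_special_handling is hypothetical, and the treatment of a missing value under invert_boolean is an assumption.

```python
from typing import Any, Optional


def apply_special_handling(
    record: dict[str, Any],
    node_field: str,
    special_handling: Optional[str] = None,
    extra: Optional[dict[str, Any]] = None,
) -> Any:
    """Return the ontology value for one mapping, evaluated on a plain dict."""
    extra = extra or {}
    value = record.get(node_field)
    if special_handling is None:
        return value  # no transformation: copy the source value
    if special_handling == "invert_boolean":
        # Assumption: a missing value is passed through rather than inverted.
        return None if value is None else not value
    if special_handling == "to_boolean":
        return value is not None  # any non-null value counts as True
    if special_handling == "or_boolean":
        # Logical OR over the main field and the extra fields.
        names = [node_field, *extra.get("fields", [])]
        return any(bool(record.get(name)) for name in names)
    if special_handling == "equal_boolean":
        return value in extra.get("values", [])
    raise ValueError(f"unknown special_handling: {special_handling!r}")


inverted = apply_special_handling(
    {"account_enabled": True}, "account_enabled", "invert_boolean"
)
has_mfa = apply_special_handling({"multifactor": "totp"}, "multifactor", "to_boolean")
inactive_or = apply_special_handling(
    {"suspended": False, "archived": True},
    "suspended",
    "or_boolean",
    {"fields": ["archived"]},
)
inactive_eq = apply_special_handling(
    {"status": "locked out"},
    "status",
    "equal_boolean",
    {"values": ["disabled", "locked out", "pending deletion"]},
)
```

Writing the modes as one dispatch function makes it easy to see that each mapping still reads from a single source record and produces a single ontology value.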
Node Eligibility
The eligible_for_source parameter in OntologyNodeMapping controls whether a node mapping can create new ontology nodes (default: True).
Set eligible_for_source=False when:
A node type lacks sufficient data to create meaningful ontology nodes
The node serves only as a connection point to existing ontology nodes
Required fields are not available or reliable enough for primary node creation
Example:
OntologyNodeMapping(
    node_label="AWSUser",
    eligible_for_source=False,  # Cannot create User ontology nodes
    fields=[
        OntologyFieldMapping(ontology_field="username", node_field="name")
    ],
)
In this example, AWS IAM users don’t have email addresses (required for User ontology nodes), so they’re marked as ineligible for creating new User nodes. They can still be linked to existing User nodes through relationships.
Ontology Mapping Structure
Ontology mappings are defined in cartography/models/ontology/mapping/data/ using Python dataclasses:
your_service_mapping = OntologyMapping(
    module_name="your_service",
    nodes=[
        OntologyNodeMapping(
            node_label="YourServiceUser",
            fields=[
                OntologyFieldMapping(ontology_field="email", node_field="email", required=True),
                OntologyFieldMapping(ontology_field="username", node_field="username"),
                OntologyFieldMapping(ontology_field="fullname", node_field="display_name"),
            ],
        ),
    ],
    rels=[
        OntologyRelMapping(
            __comment__="Link User to Device based on observed relationships",
            query="MATCH (u:User)-[:HAS_ACCOUNT]->(:YourServiceUser)-[:OWNS]->(:YourServiceDevice)<-[:OBSERVED_AS]-(d:Device) MERGE (u)-[r:OWNS]->(d) ON CREATE SET r.firstseen = timestamp() SET r.lastupdated = $UPDATE_TAG",
            iterative=False,
        ),
    ],
)
This structure allows Cartography to flexibly describe how to map and relate entities from specific integrations into the unified ontology graph.
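To get a quick overview of a mapping's shape, it can help to summarize which node labels may create ontology nodes and which fields they require. The sketch below uses hypothetical stand-in dataclasses (FieldSketch, NodeSketch, MappingSketch) that mirror only the attributes mentioned in this document; they are not the real Cartography classes, and the label YourServiceServiceAccount is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FieldSketch:
    """Stand-in mirroring the documented OntologyFieldMapping attributes."""
    ontology_field: str
    node_field: str
    required: bool = False


@dataclass(frozen=True)
class NodeSketch:
    """Stand-in mirroring the documented OntologyNodeMapping attributes."""
    node_label: str
    fields: list[FieldSketch] = field(default_factory=list)
    eligible_for_source: bool = True


@dataclass(frozen=True)
class MappingSketch:
    """Stand-in mirroring module_name and nodes of OntologyMapping."""
    module_name: str
    nodes: list[NodeSketch] = field(default_factory=list)


def summarize(mapping: MappingSketch) -> dict[str, dict]:
    """Per node label: may it create ontology nodes, and which fields are required."""
    return {
        node.node_label: {
            "can_create": node.eligible_for_source,
            "required": [f.ontology_field for f in node.fields if f.required],
        }
        for node in mapping.nodes
    }


demo = MappingSketch(
    module_name="your_service",
    nodes=[
        NodeSketch(
            "YourServiceUser",
            [
                FieldSketch("email", "email", required=True),
                FieldSketch("username", "username"),
            ],
        ),
        NodeSketch(
            "YourServiceServiceAccount",  # hypothetical label, link-only
            [FieldSketch("username", "name")],
            eligible_for_source=False,
        ),
    ],
)
summary = summarize(demo)
```

A summary like this makes it easy to spot a node mapping that is eligible to create ontology nodes but declares no required identifier.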