Troubleshooting Guide

Related docs: Main AGENTS.md | Create Module | Add Node Type

This guide helps you diagnose and fix common issues when developing Cartography intel modules.

Common Issues and Solutions

Import Errors

# Problem: ModuleNotFoundError for your new module
# Solution: Ensure __init__.py files exist in all directories
cartography/intel/your_service/__init__.py
cartography/models/your_service/__init__.py

Checklist:

  • [ ] __init__.py exists in cartography/intel/your_service/

  • [ ] __init__.py exists in cartography/models/your_service/

  • [ ] Module is imported in parent __init__.py if needed
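The checklist above can be automated with a small scan. This is a minimal sketch using only the standard library; the helper name `find_missing_init` is hypothetical and not part of Cartography. It flags only directories that directly contain `.py` files, so adjust it if your layout differs.

```python
from pathlib import Path


def find_missing_init(package_root: str) -> list[str]:
    """Return directories (relative to package_root) that hold .py files but no __init__.py."""
    root = Path(package_root)
    # Check the root itself plus every nested directory, in a stable order.
    candidates = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    missing = []
    for directory in candidates:
        if directory.name == "__pycache__":
            continue  # Bytecode caches are not packages.
        has_python_files = any(directory.glob("*.py"))
        if has_python_files and not (directory / "__init__.py").exists():
            missing.append(str(directory.relative_to(root)))
    return missing
```

Calling `find_missing_init("cartography")` from the repository root would list any module directory that still needs an `__init__.py`.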

Schema Validation Errors

# Problem: "PropertyRef validation failed"
# Solution: Check dataclass syntax and PropertyRef definitions
@dataclass(frozen=True)  # Don't forget frozen=True!
class YourNodeProperties(CartographyNodeProperties):
    id: PropertyRef = PropertyRef("id")  # Must have type annotation

Common causes:

  • Missing frozen=True in @dataclass decorator

  • Missing type annotation (: PropertyRef)

  • Typo in PropertyRef field name
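To see why the type annotation matters, the sketch below uses plain Python dataclasses with a hypothetical stand-in `PropertyRef` class (not the real Cartography class). An attribute without an annotation is silently left out of the dataclass fields, so nothing that iterates over the fields will ever see it. Whether Cartography reports this with the exact error message above is not confirmed here.

```python
from dataclasses import dataclass, fields


class PropertyRef:
    """Hypothetical stand-in for cartography's PropertyRef, for illustration only."""

    def __init__(self, name: str) -> None:
        self.name = name


@dataclass(frozen=True)
class GoodProperties:
    id: PropertyRef = PropertyRef("id")        # Annotated: a real dataclass field
    email: PropertyRef = PropertyRef("email")  # Annotated: a real dataclass field


@dataclass(frozen=True)
class BrokenProperties:
    id: PropertyRef = PropertyRef("id")
    email = PropertyRef("email")               # No annotation: plain class attribute


def field_names(cls) -> list[str]:
    """List the names that the dataclass machinery actually registered."""
    return [f.name for f in fields(cls)]
```

Comparing `field_names(GoodProperties)` with `field_names(BrokenProperties)` shows that `email` is missing from the second class, which is a quick way to spot a forgotten annotation.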

Relationship Connection Issues

# Problem: Relationships not created
# Solution: Ensure target nodes exist before creating relationships

# Load parent nodes first:
load(neo4j_session, TenantSchema(), tenant_data, lastupdated=update_tag)

# Then load child nodes with relationships:
load(neo4j_session, UserSchema(), user_data, lastupdated=update_tag, TENANT_ID=tenant_id)

Debugging steps:

  1. Check that the target node label matches exactly

  2. Verify the target_node_matcher property name matches the target node’s property

  3. Ensure the value in your data dict or kwargs is not None
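Step 3 can be checked before calling `load()`. The helper below is a minimal sketch; `rows_missing_matcher_value` and the `tenant_id` key are hypothetical names chosen for illustration, not Cartography APIs.

```python
from typing import Any


def rows_missing_matcher_value(
    rows: list[dict[str, Any]],
    matcher_key: str,
) -> list[dict[str, Any]]:
    """Return the rows whose matcher_key is absent or None.

    Rows like these give the relationship matcher nothing to match on.
    """
    return [row for row in rows if row.get(matcher_key) is None]


# Example data: the second and third rows would not be linked to a tenant.
user_data = [
    {"id": "user-123", "tenant_id": "tenant-123"},
    {"id": "user-456", "tenant_id": None},
    {"id": "user-789"},
]
problem_rows = rows_missing_matcher_value(user_data, "tenant_id")
```

If `problem_rows` is non-empty, fix the transform step before investigating the schema.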

Cleanup Job Failures

# Problem: "GraphJob failed" during cleanup
# Solution: Check common_job_parameters structure
common_job_parameters = {
    "UPDATE_TAG": config.update_tag,  # Must match what's set on nodes
    "TENANT_ID": tenant_id,           # If using scoped cleanup (default)
}

# Problem: Cleanup deletes too much data (wrong scoped_cleanup setting)
# Solution: Verify scoped_cleanup setting is appropriate

@dataclass(frozen=True)
class MySchema(CartographyNodeSchema):
    # For tenant-scoped resources (default, most common):
    # scoped_cleanup: bool = True  # Default - no need to specify

    # For global resources only (rare):
    scoped_cleanup: bool = False  # Only for vuln data, threat intel, etc.
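A quick sanity check on `common_job_parameters` before running cleanup can catch missing or `None` values early. This is a sketch only: `missing_job_parameters` is a hypothetical helper, and the default list of required keys is an assumption based on the example above, so pass the keys your own module needs.

```python
from typing import Any


def missing_job_parameters(
    params: dict[str, Any],
    required: tuple[str, ...] = ("UPDATE_TAG", "TENANT_ID"),  # Assumed keys; adjust per module
) -> list[str]:
    """Return the required keys that are absent from params or set to None."""
    return [key for key in required if params.get(key) is None]


# Example: TENANT_ID was forgotten, so a scoped cleanup would have nothing to scope by.
example_parameters = {"UPDATE_TAG": 1700000000}
problems = missing_job_parameters(example_parameters)
```

For a schema with `scoped_cleanup = False`, you would call the helper with `required=("UPDATE_TAG",)` instead.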

Data Transform Issues

# Problem: KeyError during transform
# Solution: Handle required vs optional fields correctly
{
    "id": data["id"],                    # Required - let it fail
    "name": data.get("name"),            # Optional - use .get()
    # DON'T default to an empty string:
    #   "email": data.get("email", ""),
    # DO let missing values default to None:
    "email": data.get("email"),
}
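Put together, a transform function following this pattern might look like the sketch below. The function name `transform_users` and the field names are illustrative assumptions, not taken from a specific Cartography module.

```python
from typing import Any


def transform_users(api_users: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Shape raw API records for loading.

    Required fields use direct indexing so bad data fails loudly;
    optional fields use .get() so missing values become None.
    """
    transformed = []
    for user in api_users:
        transformed.append(
            {
                "id": user["id"],            # Required: raises KeyError if absent
                "name": user.get("name"),    # Optional: None if absent
                "email": user.get("email"),  # Optional: None, not ""
            }
        )
    return transformed
```

Testing this against a captured API response confirms that a missing `id` raises a `KeyError` instead of producing a node with no identifier.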

Schema Definition Issues

# Problem: Adding custom fields to schema classes
# Solution: Remove them - only standard fields are recognized by the loading system

@dataclass(frozen=True)
class MyRel(CartographyRelSchema):
    # Remove any custom fields like these:
    # conditional_match_property: str = "some_field"  # Ignored
    # custom_flag: bool = True                        # Ignored
    # extra_config: dict = {}                         # Ignored

    # Keep only the standard relationship fields
    target_node_label: str = "TargetNode"
    target_node_matcher: TargetNodeMatcher = make_target_node_matcher(...)
    direction: LinkDirection = LinkDirection.OUTWARD
    rel_label: str = "CONNECTS_TO"
    properties: MyRelProperties = MyRelProperties()

Performance Issues

# Problem: Slow queries
# Solution: Add indexes to frequently queried fields
email: PropertyRef = PropertyRef("email", extra_index=True)

// Query on indexed fields only (Cypher)
MATCH (u:User {id: $user_id})  // Good - id is always indexed
MATCH (u:User {name: $name})   // Bad - name might not be indexed

Note: Fields referred to in a target_node_matcher are indexed automatically.


Debugging Tips for AI Assistants

  1. Check existing patterns first: Look at similar modules in cartography/intel/ before creating new patterns

  2. Verify data model imports: Ensure all CartographyNodeSchema imports are correct

  3. Test transform functions: Always test data transformation logic with real API responses

  4. Validate Neo4j queries: Use Neo4j browser to test queries manually if relationships aren’t working

  5. Check file naming: Module files should match the service name (e.g., cartography/intel/lastpass/users.py)

  6. Run tests incrementally: After each change, run the integration test to catch issues early

  7. Use the sync function: Always test through the sync() function, not individual load() calls


Key Files for Debugging

Understanding these files helps diagnose issues:

| File | Purpose |
| --- | --- |
| cartography/client/core/tx.py | Core load() and load_matchlinks() functions - check for query generation issues |
| cartography/graph/job.py | GraphJob class for cleanup operations |
| cartography/models/core/common.py | PropertyRef class definition |
| cartography/models/core/nodes.py | CartographyNodeSchema, CartographyNodeProperties base classes |
| cartography/models/core/relationships.py | CartographyRelSchema, LinkDirection, matchers |
| cartography/config.py | Configuration object - check for missing fields |
| cartography/cli.py | Typer-based CLI with organized help panels |
| cartography/data/indexes.cypher | Manual index definitions (legacy) |
| cartography/data/jobs/cleanup/ | Legacy cleanup job JSON files |


Test Utilities

Use these utilities in integration tests:

from tests.integration.util import check_nodes, check_rels

# Check nodes exist with expected properties
expected_nodes = {
    ("user-123", "alice@example.com"),
    ("user-456", "bob@example.com"),
}
assert check_nodes(neo4j_session, "YourServiceUser", ["id", "email"]) == expected_nodes

# Check relationships exist
expected_rels = {
    ("user-123", "tenant-123"),
    ("user-456", "tenant-123"),
}
assert check_rels(
    neo4j_session,
    "YourServiceUser",      # Source node label
    "id",                   # Source node property
    "YourServiceTenant",    # Target node label
    "id",                   # Target node property
    "RESOURCE",             # Relationship label
    rel_direction_right=True,
) == expected_rels

Error Messages Reference

| Error Message | Likely Cause | Solution |
| --- | --- | --- |
| PropertyRef validation failed | Missing type annotation or frozen=True | Check dataclass definition |
| Node not found for relationship | Target node doesn’t exist | Load parent nodes first |
| GraphJob failed | Wrong common_job_parameters | Check UPDATE_TAG and tenant ID |
| KeyError: 'field_name' | Required field missing in API response | Use .get() for optional fields |
| ModuleNotFoundError | Missing __init__.py | Add __init__.py to all directories |
| Relationship not created | Matcher property mismatch | Verify property names match exactly |


When to Ask for Help

Stop and ask the user if you encounter:

  • Unclear business logic in legacy Cypher queries

  • Complex relationships that don’t map clearly to the data model

  • Test failures you can’t resolve after multiple attempts

  • Multiple modules that seem interdependent

  • Performance issues that persist after adding indexes

  • Unexpected data in the graph after sync