Query Provenance Tracking
Rhema provides comprehensive provenance tracking for all queries, enabling full audit trails and data lineage. This example shows how to use provenance tracking to understand query execution and data origins.
What is Provenance Tracking?
Provenance tracking provides detailed information about:
- Query execution - How queries are processed
- Data lineage - Where each piece of data comes from
- Performance metrics - Execution time and resource usage
- Applied transformations - Filters, sorting, and other operations
Query-Level Provenance
Track execution metadata and performance:
# Basic provenance tracking
rhema query "todos WHERE status='pending'" --provenance
Example Output
$ rhema query "todos WHERE status='pending'" --provenance
Query Provenance:
────────────────────────────────────────────────────────────────────────────────
Original Query: todos WHERE status='pending'
Executed At: 2024-01-15T10:30:00Z
Execution Time: 45ms
Scopes Searched: test-scope
Files Accessed: todos.yaml
Performance Metrics:
  Total Time: 45ms
  Files Read: 1
  YAML Documents Processed: 1
  Phase Times:
    parsing: 2ms
    scope_discovery: 5ms
    execution: 38ms
Execution Steps:
  • Query Parsing (2ms)
  • Scope Discovery (5ms)
  • File Access (10ms)
  • Condition Filtering (15ms)
  • Result Assembly (13ms)
Applied Filters:
  • WhereCondition: Applied 1 WHERE conditions (2 → 1 items)
  • Limit: Applied LIMIT=None OFFSET=None (1 → 1 items)
Query Result:
────────────────────────────────────────────────────────────────────────────────
todos:
  - id: "todo-001"
    title: "Test todo"
    status: pending
    priority: medium
    created_at: "2024-01-15T10:00:00Z"
Field-Level Provenance
Track the origin of each field in query results:
# Field-level provenance for detailed lineage
rhema query "knowledge WHERE confidence>7" --field-provenance
What Field-Level Provenance Shows
- Data lineage - Track the origin of each field in query results
- Transformation history - Record all transformations applied to fields
- Source tracking - Identify which scope, file, and YAML path each field came from
- Confidence scoring - Assign confidence levels to field values based on data quality
Provenance Components
Execution Metadata
- Timestamp - When the query was executed
- Duration - Total execution time
- Scopes searched - Which scopes were examined
- Files accessed - Which files were read
Performance Metrics
- Phase-by-phase timing - Detailed breakdown of execution stages
- Memory usage - Resource consumption during execution
- Cache statistics - Cache hit/miss rates
- File operations - Number of files read and processed
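The phase-by-phase timing can also be summarized outside Rhema. The following is a minimal sketch, assuming a text report laid out like the example output earlier in this document (a "Phase Times:" line followed by "name: <N>ms" lines). The phase_summary function is a hypothetical helper, not part of Rhema.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rhema): reads a text provenance report on
# stdin and prints each phase with its share of the summed phase time.
# Assumes "name: <N>ms" lines directly after a "Phase Times:" line.
phase_summary() {
  awk '
    /Phase Times:/ { in_phases = 1; next }
    in_phases && /^[ \t]*[a-z_]+: [0-9]+ms[ \t]*$/ {
      name = $1; sub(/:$/, "", name)   # "parsing:" -> "parsing"
      ms = $2;   sub(/ms$/, "", ms)    # "2ms" -> "2"
      names[++n] = name; times[n] = ms + 0; total += ms
      next
    }
    in_phases { in_phases = 0 }        # first non-phase line ends the block
    END {
      for (i = 1; i <= n; i++)
        printf "%s %dms %.1f%%\n", names[i], times[i], (total ? 100 * times[i] / total : 0)
      if (n > 0) printf "total %dms\n", total
    }
  '
}

# Sample input copied from the example output in this document.
phase_summary <<'EOF'
Phase Times:
parsing: 2ms
scope_discovery: 5ms
execution: 38ms
Execution Steps:
EOF
```

For the sample values, the execution phase accounts for 38 of 45 ms (about 84.4%), which makes it the first place to look when a query is slow.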
Execution Steps
- Query parsing - How the query was interpreted
- Scope discovery - How scopes were identified
- File access - How files were located and read
- Condition filtering - How WHERE conditions were applied
- Result assembly - How final results were constructed
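To find the step that dominates a query, the step timings can be pulled out of a saved report. This is a rough sketch, assuming step lines end in "(<N>ms)" as in the example output earlier in this document. The slowest_step function is a hypothetical helper, not a Rhema command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rhema): reads a text provenance report on
# stdin and prints "<ms> <step name>" for the slowest execution step.
# Assumes each step line looks like "<bullet> Step Name (<N>ms)".
slowest_step() {
  # Rewrite matching lines as "<ms> <name>", sort by time descending, keep the top one.
  sed -n -E 's/^[^A-Za-z]*([A-Za-z][A-Za-z ]*) \(([0-9]+)ms\) *$/\2 \1/p' |
    sort -k1,1nr |
    head -n 1
}

# Sample input copied from the example output in this document.
slowest_step <<'EOF'
• Query Parsing (2ms)
• Scope Discovery (5ms)
• File Access (10ms)
• Condition Filtering (15ms)
• Result Assembly (13ms)
EOF
```

With the sample steps this reports Condition Filtering at 15 ms, which suggests examining the WHERE conditions before anything else.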
Applied Filters
- WHERE conditions - Complete record of filtering operations
- YAML paths - Which paths were traversed
- Ordering - Sort operations applied
- Limits - Pagination and result limiting
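The item counts recorded for each filter show where rows are dropped. Below is a small sketch, assuming filter lines end in "(<before> <arrow> <after> items)" as in the example output earlier in this document. The filter_counts function is a hypothetical helper, and the pattern ignores which arrow character separates the two numbers.

```shell
#!/bin/sh
# Hypothetical helper (not part of Rhema): reads a text provenance report on
# stdin and prints, for each applied filter, the item count before and after
# it and how many items it dropped.
filter_counts() {
  # Step 1: reduce each filter line to "<name> <before> <after>".
  # Step 2: format the counts and compute the difference.
  sed -n -E 's/^[^A-Za-z]*([A-Za-z]+): .*\(([0-9]+)[^0-9]+([0-9]+) items\).*$/\1 \2 \3/p' |
    awk '{ printf "%s: %d -> %d (dropped %d)\n", $1, $2, $3, $2 - $3 }'
}

# Sample input copied from the example output in this document.
filter_counts <<'EOF'
• WhereCondition: Applied 1 WHERE conditions (2 → 1 items)
• Limit: Applied LIMIT=None OFFSET=None (1 → 1 items)
EOF
```

A filter that drops every item, or none at all, is often the quickest explanation for an unexpected result set.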
Use Cases for Provenance Tracking
Debugging Queries
# Understand why a query returns unexpected results
rhema query "todos WHERE priority='high'" --provenance
Performance Optimization
# Identify slow query phases
rhema query "*/knowledge WHERE confidence='high'" --provenance
Audit Trails
# Track data lineage for compliance
rhema query "decisions WHERE status='approved'" --field-provenance
Data Quality Assessment
# Assess confidence in query results
rhema query "knowledge WHERE category='performance'" --field-provenance
Provenance Best Practices
- Use for Debugging - Enable provenance when queries don't work as expected
- Monitor Performance - Track execution times to identify bottlenecks
- Audit Compliance - Use field-level provenance for regulatory requirements
- Data Quality - Assess confidence levels in query results
- Team Collaboration - Share provenance information to help team members understand queries
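The "Monitor Performance" practice can be approximated with a simple threshold check between two saved text reports. This is only a sketch under assumptions: the reports contain a "Total Time: <N>ms" line as in the example output earlier in this document, and total_ms and is_regression are hypothetical helper names, not Rhema features.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Rhema) for tracking execution time
# across saved text provenance reports.

# Print the number from the first "Total Time: <N>ms" line on stdin.
total_ms() {
  sed -n -E 's/^.*Total Time: ([0-9]+)ms.*$/\1/p' | head -n 1
}

# Succeed when CURRENT_MS exceeds BASELINE_MS by more than PERCENT.
# usage: is_regression BASELINE_MS CURRENT_MS PERCENT
is_regression() {
  [ $(( $2 * 100 )) -gt $(( $1 * (100 + $3) )) ]
}

# Inline sample values stand in for two saved reports.
baseline=$(printf 'Total Time: 45ms\n' | total_ms)
current=$(printf 'Total Time: 60ms\n' | total_ms)
if is_regression "$baseline" "$current" 20; then
  echo "slower: ${baseline}ms -> ${current}ms"
fi
```

Integer arithmetic keeps the check portable across POSIX shells. The 20 percent threshold is an arbitrary illustration and should be tuned to the normal run-to-run variation of the query.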
Advanced Provenance Features
Custom Provenance Output
# Export provenance to JSON for analysis
rhema query "todos WHERE status='pending'" --provenance --format json
# Save provenance to file
rhema query "knowledge WHERE confidence>7" --provenance --output provenance.json
Provenance Comparison
# Compare query performance over time
rhema query "todos WHERE status='pending'" --provenance --compare-with previous-run.json
Next Steps
- Experiment - Try provenance tracking with your own queries
- Monitor - Use provenance to optimize query performance
- Document - Share provenance insights with your team
- Automate - Integrate provenance tracking into your workflows
Related Examples
- CQL Queries - Learn the query language
- Advanced Usage - Explore advanced features
- Quick Start Commands - Basic Rhema usage