Analytics
Note: Analytics is a work in progress. It is still at the specification stage and is expected to become functional soon.
Overview
ABI Analytics are specialized components that create insights, visualizations, and reports from data generated by modules. Analytics components transform raw operational data and module outputs into actionable intelligence, enabling users to understand system performance, track KPIs, and derive value from the data processed by ABI modules.
Purpose and Benefits
Analytics components serve several key functions in the ABI ecosystem:
- Operational Visibility: Monitor and track module usage, performance, and resource utilization
- Data Visualization: Transform complex data into intuitive visual representations
- Business Intelligence: Convert module outputs into actionable insights for decision makers
- Automated Reporting: Generate scheduled reports on system activities and findings
- User Feedback: Provide end users with meaningful summaries of module outputs
- Performance Tracking: Measure KPIs and track progress toward organizational goals
Supported Frameworks
ABI Analytics support multiple data visualization and reporting frameworks:
- Plotly: Interactive data visualizations with Python
- Matplotlib/Seaborn: Statistical visualizations and charts
- Pandas Profiling (since renamed ydata-profiling): Automated exploratory data analysis
- Jupyter Notebooks: Interactive analytical narratives
- PDF Reporting: Automated report generation with tools like ReportLab
- Business Intelligence Tools: Export capabilities for tools like Power BI and Tableau
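As a rough illustration of the export path, the sketch below serializes analytics rows to CSV, a format that both Power BI and Tableau can import. The function name `export_csv` and the row fields are hypothetical and are not part of ABI; only the Python standard library is used.

```python
import csv
import io


def export_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    # extrasaction="ignore" drops keys that are not listed in fieldnames
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative rows only; a real component would pass module outputs here
    sample = [{"module": "your_module_name", "calls": 2}]
    print(export_csv(sample, ["module", "calls"]))
```

Writing to an in-memory buffer keeps the function free of side effects, so the caller decides whether the text goes to a file, an artifact, or cloud storage.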
Analytics Architecture
Analytics components follow a standard structure and are integrated with the module system:
src/custom/modules/your_module_name/
├── analytics/                   # Contains analytics components
│   ├── visualizations/          # Data visualization components
│   │   └── dashboard.py         # Interactive dashboard
│   ├── reports/                 # Reporting components
│   │   └── weekly_report.py     # Scheduled report generation
│   ├── metrics/                 # Performance metrics collection
│   │   └── usage_stats.py       # Usage statistics collector
│   └── notebooks/               # Jupyter notebooks for analysis
│       └── exploration.ipynb    # Exploratory data analysis
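To make the layout more concrete, here is a minimal sketch of what a collector such as `metrics/usage_stats.py` might contain. The `UsageEvent` schema and the `summarize_usage` function are assumptions made for illustration; the specification does not define them.

```python
"""Hypothetical sketch of analytics/metrics/usage_stats.py."""
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageEvent:
    """One recorded invocation of a module component (assumed schema)."""
    module: str
    component: str
    duration_ms: float
    timestamp: datetime


def summarize_usage(events):
    """Return per-module call counts and mean duration in milliseconds."""
    calls = Counter(e.module for e in events)
    total_ms = Counter()
    for e in events:
        total_ms[e.module] += e.duration_ms
    return {
        module: {"calls": n, "mean_duration_ms": total_ms[module] / n}
        for module, n in calls.items()
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Illustrative events only, not real measurements
    sample = [
        UsageEvent("your_module_name", "pipeline", 120.0, now),
        UsageEvent("your_module_name", "workflow", 80.0, now),
    ]
    print(summarize_usage(sample))
```

Keeping the summary as a plain function over a list of events means the same code can serve a dashboard, a scheduled report, or a notebook.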
Scheduling Analytics with GitHub Actions
Analytics components should be scheduled using GitHub Actions for reliable execution and integration with your CI/CD pipeline:
# .github/workflows/weekly-analytics.yml
name: Weekly Analytics Report
on:
  schedule:
    # Run every Monday at 8:00 AM UTC
    - cron: '0 8 * * 1'
  workflow_dispatch: # Allow manual triggering
jobs:
  generate-report:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install uv
          uv sync
      - name: Generate analytics report
        run: |
          uv run python -m src.custom.modules.your_module_name.analytics.reports.weekly_report
        env:
          # Add any required environment variables/secrets
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
      - name: Upload report artifact
        uses: actions/upload-artifact@v4
        with:
          name: weekly-analytics-report
          path: reports/
      # Optional: Send report via email or store in S3/cloud storage
      - name: Send report via email
        if: success()
        uses: dawidd6/action-send-mail@v3
        with:
          server_address: ${{ secrets.MAIL_SERVER }}
          server_port: ${{ secrets.MAIL_PORT }}
          username: ${{ secrets.MAIL_USERNAME }}
          password: ${{ secrets.MAIL_PASSWORD }}
          subject: Weekly Analytics Report
          body: Please find attached the weekly analytics report.
          to: [email protected]
          from: ABI Analytics <[email protected]>
          attachments: reports/weekly_report.pdf
This approach offers several advantages:
- No need to maintain a separate scheduler
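- Uses GitHub's hosted scheduling infrastructure (scheduled runs can be delayed during periods of high load)
- Logs and artifacts are stored in GitHub
- The schedule and settings live in the workflow file, so they can be changed without touching application code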
- Can be manually triggered when needed
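For the workflow to have something to run, the report module needs a command-line entry point that writes its output under `reports/`. The sketch below shows one possible shape. All function names, the `--output-dir` flag, and the plain-text output format are assumptions; rendering a PDF (for example with ReportLab) and querying the database are left out.

```python
"""Hypothetical report entry point, runnable with `python -m <package>.<module>`."""
import argparse
import os
from datetime import date
from pathlib import Path


def load_metrics(database_url):
    """Placeholder: a real implementation would query the database here."""
    return {"data_source_configured": bool(database_url)}


def build_report(metrics, report_date):
    """Render a plain-text report from a mapping of metric name to value."""
    lines = [f"Weekly Analytics Report ({report_date.isoformat()})", ""]
    lines += [f"{name}: {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines) + "\n"


def write_report(text, output_dir, report_date):
    """Write the report under output_dir and return the file path."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"weekly_report_{report_date.isoformat()}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def main(argv=None):
    parser = argparse.ArgumentParser(description="Generate the weekly report")
    parser.add_argument("--output-dir", default="reports")
    args = parser.parse_args(argv)

    today = date.today()
    # DATABASE_URL is the variable the workflow passes through `env`
    metrics = load_metrics(os.environ.get("DATABASE_URL"))
    path = write_report(build_report(metrics, today), args.output_dir, today)
    print(f"Report written to {path}")
    return 0


if __name__ == "__main__":
    main()
```

An unhandled exception makes the process exit with a non-zero status, which marks the workflow step as failed and skips the later steps that depend on its success.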
Best Practices
When developing ABI Analytics:
- Functional Programming: Use pure functions and immutable data structures where possible
- Separation of Concerns: Keep data extraction, transformation, and visualization logic separate
- Stateless Operations: Design analytics components to be stateless for better scalability
- Efficient Data Handling: Use optimized libraries for handling large datasets
- Caching Mechanisms: Implement caching for expensive queries or computations
- Parameterization: Make analytics components highly configurable for different use cases
- Automation: Set up automated generation and distribution of reports
- Self-Service: Enable non-technical users to generate and customize reports
- Privacy and Security: Always anonymize sensitive data in analytics outputs
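Several of these practices can be combined in a few lines. The following sketch, which uses hypothetical names and hard-coded illustrative rows, keeps extraction, transformation, and rendering in separate functions, passes parameters through an immutable config object, caches the extraction step, and replaces user identifiers with salted hashes before they reach the output.

```python
import hashlib
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class ReportConfig:
    """Immutable parameters for one analytics run (hypothetical fields)."""
    salt: str
    top_n: int = 3


def anonymize(user_id, salt):
    """Replace a user identifier with a salted SHA-256 digest prefix."""
    return hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()[:12]


@lru_cache(maxsize=32)
def extract(source):
    """Extraction step; stands in for an expensive query, cached per source."""
    # Hard-coded illustrative rows; a real component would read from `source`
    return (("alice", "pipeline"), ("bob", "workflow"), ("alice", "workflow"))


def transform(rows, config):
    """Pure transformation: count events per anonymized user, most active first."""
    counts = {}
    for user_id, _component in rows:
        key = anonymize(user_id, config.salt)
        counts[key] = counts.get(key, 0) + 1
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tuple(ranked[: config.top_n])


def render(ranked):
    """Presentation step: format the ranked counts as text lines."""
    return "\n".join(f"{user}: {count} events" for user, count in ranked)


if __name__ == "__main__":
    config = ReportConfig(salt="example-salt")
    print(render(transform(extract("usage_events"), config)))
```

Note that a salted hash is pseudonymization rather than full anonymization: anyone who knows the salt can recompute the mapping, so the salt should be treated as a secret.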
Examples
ABI includes several example analytics components that demonstrate best practices:
- Module Usage Dashboard: Track module and component usage over time
- Performance Metrics: Monitor execution times and resource utilization
- Data Quality Reports: Analyze the quality of data processed by pipelines
- User Activity Analytics: Track how users interact with ABI components
You can find these examples in the src/core/modules/common/analytics/ directory.
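As an indication of the kind of logic a data quality report involves, the sketch below counts missing values per required field. It is an independent illustration with hypothetical names, not the code shipped in that directory.

```python
def data_quality_report(records, required_fields):
    """Return, per required field, the count and share of records missing it."""
    total = len(records)
    report = {}
    for field in required_fields:
        # A field counts as missing when it is absent, None, or an empty string
        missing = sum(1 for r in records if r.get(field) in (None, ""))
        report[field] = {
            "missing": missing,
            "missing_ratio": missing / total if total else 0.0,
        }
    return report


if __name__ == "__main__":
    # Illustrative records only
    sample = [{"id": 1, "name": "a"}, {"id": 2, "name": ""}, {"id": 3}]
    print(data_quality_report(sample, ("id", "name")))
```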