Backup Control and Management #8

Open
opened 2025-10-29 19:44:55 -07:00 by peterwood · 0 comments
Owner

Originally created by @acedanger on GitHub (May 27, 2025).

Backup Control and Management

Issue Summary

Implement advanced backup control and management capabilities within the Telegram bot, allowing authorized users to trigger backups, manage schedules, perform maintenance tasks, and control backup system operations remotely.

Description

Develop comprehensive backup control functionality that provides secure remote management of all backup systems. This includes triggering manual backups, scheduling management, maintenance operations, and emergency controls for system administrators.

Requirements

Control Commands (Admin Only)

  • /backup_now <system> - Trigger immediate backup for specified system
  • /backup_all - Trigger backup for all systems
  • /backup_stop <system> - Stop running backup (emergency)
  • /backup_schedule <system> <time> - Modify backup schedule
  • /maintenance_mode <on|off> - Enable/disable maintenance mode
  • /cleanup_old - Trigger cleanup of old backups
  • /validate_all - Run validation on all backup systems

Management Commands

  • /schedules - Show current backup schedules
  • /running - Show currently running backups
  • /queue - Show backup queue status
  • /logs_download - Generate and download log archives
  • /system_restart <service> - Restart backup-related services
  • /config_reload - Reload backup system configurations

Emergency Commands (Super Admin Only)

  • /emergency_stop - Stop all backup operations immediately
  • /emergency_backup - Force emergency backup of critical data
  • /disaster_recovery - Initiate disaster recovery procedures
  • /system_status_override - Override system status for maintenance

Integration Points

Backup System Scripts

# Plex backup system
/home/acedanger/shell/plex/backup-plex.sh
/home/acedanger/shell/plex/validate-plex-backups.sh

# Immich backup system
/home/acedanger/shell/immich/backup-immich.sh
/home/acedanger/shell/immich/validate-immich-backups.sh

# Media services backup
/home/acedanger/shell/backup-media.sh

# System-wide utilities
/home/acedanger/shell/backup-log-monitor.sh

System Integration

# Crontab management
/home/acedanger/shell/crontab/manage-enhanced-crontab.sh

# Docker container management
# Service restart capabilities
# Log file management and archival

Technical Implementation

Secure Command Execution

def execute_backup_command(system, command, user_id):
    """Execute backup command with security validation"""
    # Validate user authorization level
    # Check system availability
    # Sanitize and validate command parameters
    # Execute with proper error handling
    # Log command execution
    # Return execution results

def validate_admin_access(user_id, command):
    """Validate user has appropriate access level"""
    # Check user authorization level
    # Verify command permissions
    # Log access attempts
    # Return authorization status
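
The "sanitize, then execute with error handling" steps above could be sketched as follows. This is a minimal illustration, not the planned implementation: the `BACKUP_SCRIPTS` keys, `build_backup_argv`, `run_backup`, and the injectable `runner` parameter are hypothetical names, and only the script paths come from the Integration Points list. The idea is to map a system name to a fixed script path and run it without a shell, so user input never reaches a command line.

```python
import subprocess

# Paths are taken from "Backup System Scripts"; the short keys are assumed names.
BACKUP_SCRIPTS = {
    "plex": "/home/acedanger/shell/plex/backup-plex.sh",
    "immich": "/home/acedanger/shell/immich/backup-immich.sh",
    "media": "/home/acedanger/shell/backup-media.sh",
}


def build_backup_argv(system):
    """Return the argv list for a known system, or raise ValueError."""
    try:
        return [BACKUP_SCRIPTS[system]]
    except KeyError:
        raise ValueError(f"unknown backup system: {system!r}") from None


def run_backup(system, timeout=3600, runner=subprocess.run):
    """Run the backup script without a shell and return (ok, output)."""
    argv = build_backup_argv(system)
    try:
        result = runner(
            argv, capture_output=True, text=True, timeout=timeout, check=False
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing script, permission error, or a run that exceeded the timeout.
        return False, str(exc)
    return result.returncode == 0, result.stdout + result.stderr
```

Because the argv list is built from an allowlist, a parameter such as `plex; rm -rf /` is rejected before anything is executed. The `runner` argument exists only so the function can be exercised in tests without touching the real scripts.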

Backup Queue Management

def manage_backup_queue():
    """Manage backup operation queue"""
    # Track running backup operations
    # Queue manual backup requests
    # Prevent conflicting operations
    # Manage execution priorities

def check_running_backups():
    """Check status of currently running backups"""
    # Check process status
    # Monitor backup progress
    # Estimate completion times
    # Return status information

Schedule Management

def modify_backup_schedule(system, new_schedule):
    """Modify backup schedule for specified system"""
    # Validate schedule format
    # Update crontab entries
    # Verify schedule changes
    # Notify users of changes

def get_current_schedules():
    """Retrieve current backup schedules"""
    # Parse crontab entries
    # Format schedule information
    # Calculate next run times
    # Return schedule data
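
The "validate schedule format" and "calculate next run times" steps might look like the sketch below. It assumes that the `<time>` argument of `/backup_schedule` is a 24-hour `HH:MM` string and that every schedule is daily; neither is stated in the issue, and the function names are hypothetical. Writing the resulting expression into the crontab is left to the existing crontab management script.

```python
import re
from datetime import datetime, timedelta

_TIME_RE = re.compile(r"([01]\d|2[0-3]):([0-5]\d)")


def parse_daily_time(time_str):
    """Parse 'HH:MM' (24-hour) into (hour, minute) or raise ValueError."""
    match = _TIME_RE.fullmatch(time_str)
    if match is None:
        raise ValueError(f"expected HH:MM, got {time_str!r}")
    return int(match.group(1)), int(match.group(2))


def daily_cron_expression(time_str):
    """Return a five-field cron expression that fires once a day."""
    hour, minute = parse_daily_time(time_str)
    return f"{minute} {hour} * * *"


def next_daily_run(time_str, now):
    """Return the next datetime at which a daily job at time_str would fire."""
    hour, minute = parse_daily_time(time_str)
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

For example, `daily_cron_expression("02:30")` yields `30 2 * * *`, which corresponds to the Immich schedule shown in the `/schedules` example.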

Command Examples

/backup_now plex

🚀 Manual Backup Initiated

🎬 Plex Backup Starting...
├── Status: ⏳ Initializing
├── Estimated Duration: ~3 minutes
├── Priority: Manual (High)
└── Queue Position: Starting immediately

⚠️ Note: This will run in addition to scheduled backup
🔄 Use /running to monitor progress
⏹️ Use /backup_stop plex to cancel if needed

Initiated by: @username
Time: 2025-05-27 14:30:15

/backup_all

🚀 Full System Backup Initiated

Starting backup for all systems:

🎬 Plex Backup
├── Status: ⏳ Queued (Position 1)
├── Estimated Start: Immediately
└── Estimated Duration: ~3 minutes

📸 Immich Backup
├── Status: ⏳ Queued (Position 2)
├── Estimated Start: After Plex (~3 min)
└── Estimated Duration: ~6 minutes

🎭 Media Services Backup
├── Status: ⏳ Queued (Position 3)
├── Estimated Start: After Immich (~9 min)
└── Estimated Duration: ~5 minutes

📊 Total Estimated Time: ~14 minutes
🔄 Use /running to monitor progress
⏹️ Use /emergency_stop to cancel all

Initiated by: @username
Time: 2025-05-27 14:30:15
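
The start offsets and total in the `/backup_all` output above follow from running the backups one at a time: each backup starts when the previous one is expected to finish. A small sketch of that arithmetic, using a hypothetical `estimate_queue` helper and the durations from the example:

```python
def estimate_queue(durations_min):
    """Given [(name, minutes), ...] in queue order, return (plan, total).

    plan is a list of (name, start_offset_min, duration_min) tuples and
    total is the overall estimated time in minutes, assuming strictly
    sequential execution.
    """
    plan = []
    elapsed = 0
    for name, minutes in durations_min:
        plan.append((name, elapsed, minutes))
        elapsed += minutes
    return plan, elapsed


plan, total = estimate_queue([("Plex", 3), ("Immich", 6), ("Media Services", 5)])
for name, start, minutes in plan:
    print(f"{name}: starts after ~{start} min, runs ~{minutes} min")
print(f"Total: ~{total} min")
```

With the example durations this gives start offsets of 0, 3, and 9 minutes and a total of 14 minutes, matching the sample output.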

/running

🔄 Currently Running Backups

🎬 Plex Backup (Manual)
├── Status: 🟡 Running
├── Progress: ~65% complete
├── Started: 14:30:15 (2m 15s ago)
├── ETA: ~1m 30s remaining
└── Current Phase: Database verification

📸 Immich Backup
├── Status: ⏳ Queued
├── Queue Position: Next
├── Estimated Start: ~1m 30s
└── Type: Manual trigger

🎭 Media Services
├── Status: ⏳ Queued
├── Queue Position: 3rd
├── Estimated Start: ~7m 30s
└── Type: Manual trigger

🔄 Queue Status: 1 running, 2 queued
⏹️ Emergency stop available if needed

/schedules

📅 Backup Schedules

🎬 PLEX BACKUP
├── Schedule: Daily at 02:00 AM
├── Next Run: Tomorrow 02:00:00
├── Last Run: Today 02:00:15 (✅ Success)
├── Duration Avg: 2m 45s
└── Status: ✅ Active

📸 IMMICH BACKUP
├── Schedule: Daily at 02:30 AM
├── Next Run: Tomorrow 02:30:00
├── Last Run: Today 02:30:22 (✅ Success)
├── Duration Avg: 5m 30s
└── Status: ✅ Active

🎭 MEDIA SERVICES BACKUP
├── Schedule: Daily at 03:00 AM
├── Next Run: Tomorrow 03:00:00
├── Last Run: Today 03:00:15 (✅ Success)
├── Duration Avg: 4m 30s
└── Status: ✅ Active

🔧 Schedule Management:
├── All schedules active
├── No conflicts detected
└── Next system backup: 9h 30m

/maintenance_mode on

🔧 Maintenance Mode Activated

System Status Changed:
├── Scheduled backups: ⏸️ PAUSED
├── Manual backups: ✅ Still available
├── Monitoring: ✅ Active
└── Alerts: 🔕 Reduced (emergency only)

Maintenance Features:
├── Safe to perform system updates
├── Backup schedules won't trigger
├── Manual operations still possible
└── Emergency functions remain active

⚠️ Remember to disable maintenance mode when complete
🔧 Use /maintenance_mode off to resume normal operations

Activated by: @username
Time: 2025-05-27 14:35:20
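
One way to get the behavior described above (scheduled backups paused, manual backups still allowed) is a flag file that scheduled jobs check before running. The sketch below is only an assumption about how this could work; the flag-file approach, the function names, and the `"manual"`/`"scheduled"` trigger labels are not part of the issue.

```python
from pathlib import Path


def set_maintenance_mode(enabled, flag_path):
    """Create the flag file to enable maintenance mode, remove it to disable."""
    flag = Path(flag_path)
    if enabled:
        flag.touch()
    else:
        # missing_ok requires Python 3.8+; disabling twice is not an error.
        flag.unlink(missing_ok=True)


def should_run_backup(trigger, flag_path):
    """Return True if a backup with this trigger type may start.

    Manual triggers are always allowed; scheduled triggers are skipped
    while the maintenance flag file exists.
    """
    if trigger == "manual":
        return True
    return not Path(flag_path).exists()
```

A scheduled job would call `should_run_backup("scheduled", flag_path)` first and exit early when it returns `False`, so the crontab entries themselves do not need to be edited when maintenance mode is toggled.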

/validate_all

🔍 System-wide Validation Initiated

Running comprehensive validation...

🎬 Plex Validation
├── Status: ⏳ Running
├── Checking: Database integrity
├── Progress: Backup file verification
└── ETA: ~2 minutes

📸 Immich Validation
├── Status: ⏳ Queued
├── Will Check: DB + Upload sync + B2
└── ETA: ~3 minutes (after Plex)

🎭 Media Services Validation
├── Status: ⏳ Queued
├── Will Check: All 7 services
└── ETA: ~4 minutes (after Immich)

📊 Total Validation Time: ~9 minutes
🔄 Use /running to monitor progress
📋 Results will be summarized when complete

Initiated by: @username
Time: 2025-05-27 14:40:10

File Structure

telegram/bot/commands/
├── control/
│   ├── __init__.py
│   ├── backup_control.py   # Manual backup triggers
│   ├── schedule_management.py # Schedule modification
│   ├── maintenance.py      # Maintenance operations
│   ├── emergency.py        # Emergency controls
│   └── validation.py       # Validation commands
├── management/
│   ├── __init__.py
│   ├── queue_manager.py    # Backup queue management
│   ├── process_monitor.py  # Running process monitoring
│   ├── log_manager.py      # Log file management
│   └── system_control.py   # System-level controls
└── security/
    ├── __init__.py
    ├── authorization.py    # User authorization levels
    ├── command_validation.py # Command security validation
    └── audit_logging.py    # Security audit logging

Security Framework

Authorization Levels

AUTHORIZATION_LEVELS = {
    'user': ['status', 'help', 'basic_info'],
    'admin': ['backup_now', 'schedules', 'logs', 'validate'],
    'super_admin': ['emergency_stop', 'maintenance_mode', 'system_restart']
}

def check_authorization(user_id, command):
    """Check if user is authorized for command"""
    # Look up user authorization level
    # Check command permissions
    # Log authorization attempts
    # Return authorization result
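
A possible shape for `check_authorization`, sketched under two assumptions that the issue does not state: higher levels inherit the commands of lower levels, and users who are not registered at all are denied everything. The `USER_LEVELS` table and its IDs are hypothetical placeholders, and audit logging is omitted.

```python
AUTHORIZATION_LEVELS = {
    "user": ["status", "help", "basic_info"],
    "admin": ["backup_now", "schedules", "logs", "validate"],
    "super_admin": ["emergency_stop", "maintenance_mode", "system_restart"],
}

# Assumed ordering: each level also gets the commands of the levels before it.
LEVEL_ORDER = ["user", "admin", "super_admin"]

# Hypothetical mapping of Telegram user IDs to levels.
USER_LEVELS = {1001: "admin", 1002: "super_admin", 1003: "user"}


def allowed_commands(level):
    """Return the set of commands available at a level, including inherited ones."""
    index = LEVEL_ORDER.index(level)
    allowed = set()
    for name in LEVEL_ORDER[: index + 1]:
        allowed.update(AUTHORIZATION_LEVELS[name])
    return allowed


def check_authorization(user_id, command, user_levels=USER_LEVELS):
    """Return True only if the user is registered and the command is allowed."""
    level = user_levels.get(user_id)
    if level is None:
        return False  # Unknown users are denied by default.
    return command in allowed_commands(level)
```

Denying by default means a typo in a command name or an unregistered user ID fails closed rather than open.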

Command Validation

def validate_command_parameters(command, params):
    """Validate command parameters for security"""
    # Sanitize input parameters
    # Validate system names
    # Check parameter ranges
    # Prevent injection attacks
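
As an illustration of the validation steps above, the sketch below accepts only short tokens made of lowercase letters, digits, underscores, and hyphens, and checks system names against an allowlist. It covers only the commands that take a single system name; the `KNOWN_SYSTEMS` and `SYSTEM_COMMANDS` sets and the token pattern are assumptions, not values defined in the issue.

```python
import re

KNOWN_SYSTEMS = frozenset({"plex", "immich", "media"})      # assumed names
SYSTEM_COMMANDS = frozenset({"backup_now", "backup_stop"})  # take one <system>
_SAFE_TOKEN = re.compile(r"[a-z0-9_-]{1,32}")


def validate_command_parameters(command, params):
    """Return normalized parameters, or raise ValueError if any are unsafe."""
    cleaned = [param.strip().lower() for param in params]
    for token in cleaned:
        # Reject anything containing spaces, shell metacharacters, or paths.
        if _SAFE_TOKEN.fullmatch(token) is None:
            raise ValueError(f"unsafe parameter: {token!r}")
    if command in SYSTEM_COMMANDS:
        if len(cleaned) != 1 or cleaned[0] not in KNOWN_SYSTEMS:
            raise ValueError(f"{command} expects exactly one known system name")
    return cleaned
```

An allowlist is preferable here to stripping dangerous characters, since it rejects unexpected input outright instead of trying to repair it.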

Queue Management System

class BackupQueue:
    def __init__(self):
        self.running_backups = {}
        self.queued_backups = []
        self.max_concurrent = 1  # Prevent conflicts

    def add_backup(self, system, priority='normal'):
        """Add backup to queue"""
        # Check if system already running
        # Add to queue with priority
        # Notify user of queue position

    def start_next_backup(self):
        """Start next backup in queue"""
        # Check for available slots
        # Start highest priority backup
        # Update status tracking
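
The skeleton above could be filled in roughly as follows. This is a sketch under assumptions: the priority names and their ranking are invented for illustration, `finish_backup` is an added helper, and the queue only tracks state without launching any process. It uses `heapq` so that the highest-priority request is started first and requests of equal priority run in arrival order.

```python
import heapq
import itertools

# Assumed priority names; a lower rank is started first.
PRIORITY_RANK = {"high": 0, "normal": 1, "low": 2}


class BackupQueue:
    def __init__(self, max_concurrent=1):
        self.running_backups = {}
        self.queued_backups = []  # heap of (rank, sequence, system)
        self.max_concurrent = max_concurrent  # 1 prevents conflicts
        self._sequence = itertools.count()

    def add_backup(self, system, priority="normal"):
        """Queue a backup; return its 1-based position, or None if a duplicate."""
        already_queued = any(entry[2] == system for entry in self.queued_backups)
        if system in self.running_backups or already_queued:
            return None
        entry = (PRIORITY_RANK[priority], next(self._sequence), system)
        heapq.heappush(self.queued_backups, entry)
        return sorted(self.queued_backups).index(entry) + 1

    def start_next_backup(self):
        """Move the highest-priority queued backup to running, if a slot is free."""
        if len(self.running_backups) >= self.max_concurrent:
            return None
        if not self.queued_backups:
            return None
        _, _, system = heapq.heappop(self.queued_backups)
        self.running_backups[system] = "running"
        return system

    def finish_backup(self, system):
        """Release the slot held by a finished (or stopped) backup."""
        self.running_backups.pop(system, None)
```

A short usage example: after `queue.add_backup("immich")` and `queue.add_backup("plex", priority="high")`, calling `queue.start_next_backup()` returns `"plex"`, and a second call returns `None` until `queue.finish_backup("plex")` frees the single slot.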

Success Criteria

  • Manual backup triggers working reliably
  • Schedule management functional
  • Queue system preventing conflicts
  • Authorization levels enforced
  • Emergency controls responsive
  • Maintenance mode working properly

Dependencies

  • Depends on: Issues #01-06 (All previous components)
  • Process management utilities
  • Crontab management capabilities
  • System service control access

Estimated Effort

Time: 4-5 days
Complexity: High

Testing Requirements

  • Test backup triggering for all systems
  • Verify queue management prevents conflicts
  • Test authorization level enforcement
  • Validate emergency stop functionality
  • Test schedule modification accuracy
  • Verify maintenance mode behavior

Notes

This control system provides the operational capabilities that transform the Telegram bot from a monitoring tool into a complete backup management interface. Security is paramount since these commands can affect critical backup operations.

peterwood added the enhancement label 2025-10-29 19:44:55 -07:00