
Data Dispatch

Data Dispatch is DataScribe's data transformation and movement engine. It lets you import, transform, and route data across your research environment, keeping your data sources and analytical tools connected and in sync.

Understanding Data Dispatch

Data Dispatch Overview

Data Dispatch serves as the central data orchestration hub for your research workflow, providing:

  • Seamless data import from diverse sources
  • Intelligent transformation from raw data to structured formats
  • Automated data routing to appropriate storage locations
  • Scheduled data processing and synchronization
  • Error handling and data quality monitoring

Key Components of Data Dispatch

Data Sources

Connect to various origins of research data:

  • File Uploads: CSV, Excel, JSON, XML, and other formats
  • Database Connections: SQL, NoSQL, and specialized research databases
  • API Integrations: REST, GraphQL, and SOAP endpoints
  • Instrument Feeds: Direct connections to laboratory equipment
  • External Repositories: DOI-based academic repositories
  • Cloud Storage: Google Drive, Dropbox, Box, and other services

Transformations

Convert and manipulate data to fit your research needs:

  • Mapping: Connect source fields to destination structures
  • Cleaning: Handle missing values, outliers, and inconsistencies
  • Formatting: Standardize data formats and units
  • Enrichment: Add calculated fields and derived values
  • Aggregation: Combine multiple data points into summaries
  • Filtering: Remove irrelevant or low-quality data

Destinations

Route processed data to appropriate targets:

  • Data Structures: Place in your defined folder hierarchies
  • Databases: Store in relational or specialized research databases
  • Analysis Tools: Send directly to analytical pipelines
  • Visualization Platforms: Prepare for direct visualization
  • Export Formats: Generate files for external use

Creating Data Dispatch Workflows

Method 1: Visual Workflow Builder

  1. Navigate to "Data Dispatch" in the main menu
  2. Click "Create Workflow"
  3. Select "Visual Builder"
  4. Configure workflow properties:
  5. Name and description
  6. Schedule/trigger options
  7. Error handling preferences
  8. Use the drag-and-drop interface to:
  9. Add source connectors
  10. Configure transformation steps
  11. Define destination targets
  12. Set conditional logic
  13. Validate and save your workflow

Method 2: From Templates

  1. Navigate to "Data Dispatch" in the main menu
  2. Click "Create Workflow"
  3. Select "Use Template"
  4. Browse the template library by:
  5. Data source type
  6. Transformation complexity
  7. Research discipline
  8. Select a template that matches your needs
  9. Customize source, transformation, and destination settings
  10. Validate and save your workflow

Method 3: Code-Based Workflow

For advanced users requiring custom logic:

  1. Navigate to "Data Dispatch" in the main menu
  2. Click "Create Workflow"
  3. Select "Code Editor"
  4. Choose a language:
  5. Python
  6. R
  7. SQL
  8. JavaScript
  9. Write custom transformation code
  10. Configure input and output parameters
  11. Validate and save your workflow
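
To give a feel for what a code-based step might contain, here is a minimal Python sketch of a transformation function. The `transform` name, the pandas DataFrame in/out convention, and the `temp_f` column are assumptions for illustration, not part of the Data Dispatch API.

```python
import pandas as pd

def transform(records: pd.DataFrame) -> pd.DataFrame:
    """Example custom step: tidy column names and derive a field."""
    df = records.copy()
    # Normalize column names to snake_case
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # Derive Celsius from a Fahrenheit column, if one is present (assumed name)
    if "temp_f" in df.columns:
        df["temp_c"] = (df["temp_f"] - 32) * 5 / 9
    return df
```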

Data Import Capabilities

CSV/Excel Import

Effortlessly bring tabular data into your research environment:

  1. In your workflow, add a "CSV/Excel Import" source
  2. Configure import settings:
  3. File selection/upload
  4. Header row configuration
  5. Data type detection
  6. Missing value handling
  7. Sheet selection (for Excel)
  8. Preview the detected data structure
  9. Apply initial transformations if needed
  10. Set up column mapping
  11. Configure destination
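
The import settings above correspond closely to options you may already know from pandas. The sketch below shows roughly equivalent behaviour outside Data Dispatch; the file names, sheet name, and column names are placeholders.

```python
import pandas as pd

# CSV: header on the first row, treat "NA" and empty strings as missing values
csv_df = pd.read_csv("samples.csv", header=0, na_values=["NA", ""])

# Excel: pick a specific sheet and parse dates in a known column
xls_df = pd.read_excel("results.xlsx", sheet_name="Run 3", parse_dates=["collected_at"])

# Review the detected data types before setting up column mapping
print(csv_df.dtypes)
```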

Intelligent Schema Detection

Data Dispatch automatically analyzes your data:

  1. Upload or connect to your data source
  2. The system detects:
       • Column data types
       • Value distributions
       • Potential primary keys
       • Relationships between tables
       • Data quality issues
  3. Review and adjust the detected schema
  4. Approve for further processing
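
For intuition about what such detection involves, here is a rough pandas sketch of similar checks (type inference, candidate keys, missing-value counts). It is illustrative only, not the detection logic Data Dispatch uses internally.

```python
import pandas as pd

df = pd.read_csv("samples.csv")  # placeholder file name

# Inferred column types and value distributions
print(df.dtypes)
print(df.describe(include="all"))

# Columns whose values are unique and never missing are candidate primary keys
candidate_keys = [c for c in df.columns
                  if df[c].notna().all() and df[c].is_unique]

# A simple data-quality signal: missing values per column
missing_counts = df.isna().sum()
```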

Batch vs. Streaming

Choose the appropriate data processing model:

Batch Processing

  • Process data in scheduled or triggered chunks
  • Ideal for historical data and periodic updates
  • Configure processing windows and triggers

Stream Processing

  • Process data as it arrives in real-time
  • Ideal for continuous data collection
  • Configure stream connections and processing logic
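
As a rough illustration of the difference, the sketch below contrasts a batch job that processes files accumulated since the last run with a streaming handler that processes records as they arrive. The folder layout and record shape are assumptions made for the example.

```python
from pathlib import Path
import pandas as pd

def run_batch(inbox: Path) -> pd.DataFrame:
    """Batch: process everything that accumulated since the last run."""
    frames = [pd.read_csv(p) for p in sorted(inbox.glob("*.csv"))]
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()

def run_stream(records):
    """Streaming: handle each record the moment it arrives (e.g. from an instrument feed)."""
    for record in records:
        yield {**record, "processed": True}
```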

Data Transformation Features

Mapping Tools

Connect source fields to destination structures:

  1. In your workflow, add a "Field Mapper" step
  2. The system suggests field mappings based on:
  3. Field names
  4. Data types
  5. Value patterns
  6. Adjust mappings as needed
  7. Configure transformation rules
  8. Preview results
  9. Save mapping configuration
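
Conceptually, a field mapping pairs source names with destination names and attaches transformation rules to the mapped fields. A minimal Python sketch, with illustrative field names rather than anything defined by Data Dispatch:

```python
import pandas as pd

# Source-to-destination field mapping (names are illustrative)
FIELD_MAP = {"Sample ID": "sample_id", "Conc. (mg/L)": "concentration_mg_l"}

def apply_mapping(source: pd.DataFrame) -> pd.DataFrame:
    dest = source.rename(columns=FIELD_MAP)[list(FIELD_MAP.values())]
    # Transformation rule attached to a mapped field: enforce a numeric type
    dest["concentration_mg_l"] = dest["concentration_mg_l"].astype(float)
    return dest
```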

Data Cleaning

Ensure high-quality data with automated cleaning:

  1. Add a "Data Cleaning" step to your workflow
  2. Configure cleaning operations:
  3. Missing value handling
  4. Outlier detection and treatment
  5. Duplicate removal
  6. Standardization rules
  7. Format enforcement
  8. Preview cleaning results
  9. Save cleaning configuration
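
The cleaning operations above can be pictured as a small pandas routine. The sketch below assumes a `measurement` column and a simple three-standard-deviation outlier rule purely for illustration.

```python
import pandas as pd

def clean(df: pd.DataFrame, value_col: str = "measurement") -> pd.DataFrame:
    # Duplicate removal
    df = df.drop_duplicates()
    # Missing value handling: drop rows lacking the key measurement
    df = df.dropna(subset=[value_col])
    # Outlier treatment: discard values more than 3 standard deviations from the mean
    z = (df[value_col] - df[value_col].mean()) / df[value_col].std()
    df = df[z.abs() <= 3]
    # Standardization rule: consistent casing for a categorical label (assumed column)
    if "condition" in df.columns:
        df["condition"] = df["condition"].str.strip().str.lower()
    return df
```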

Advanced Transformations

Apply sophisticated data manipulations:

  1. Add appropriate transformation steps:
       • Aggregation
       • Pivoting
       • Normalization
       • Denormalization
       • Type conversion
       • Derived calculations
  2. Configure transformation parameters
  3. Preview results
  4. Chain multiple transformations as needed
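
A brief pandas sketch of aggregation, pivoting, and a derived calculation chained together, using made-up subject/visit data:

```python
import pandas as pd

df = pd.DataFrame({
    "subject": ["A", "A", "B", "B"],
    "visit":   [1, 2, 1, 2],
    "score":   [4.2, 4.8, 3.9, 4.1],
})

# Aggregation: one summary row per subject
summary = df.groupby("subject")["score"].agg(["mean", "max"])

# Pivoting: subjects as rows, visits as columns
wide = df.pivot(index="subject", columns="visit", values="score")

# Derived calculation chained onto the aggregation
summary["range_pct"] = (summary["max"] - summary["mean"]) / summary["mean"] * 100
```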

Workflow Automation

Triggers and Scheduling

Automate workflow execution:

Event-Based Triggers

  • Form submissions
  • File uploads
  • API calls
  • Database changes
  • System events

Schedule-Based Triggers

  • One-time execution
  • Recurring schedules
  • Calendar-based timing
  • Dependent scheduling
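
For readers who think of recurring schedules in code, the sketch below expresses a daily off-peak run using the third-party `schedule` package. It is an analogy for the schedule-based triggers listed above, not how Data Dispatch itself is configured.

```python
import time

import schedule  # third-party package, used here only for illustration

def run_nightly_import():
    print("dispatching nightly import")  # stand-in for the real workflow run

# Recurring schedule: every day at 02:00, i.e. during off-peak hours
schedule.every().day.at("02:00").do(run_nightly_import)

while True:
    schedule.run_pending()
    time.sleep(60)
```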

Conditional Logic

Create intelligent workflows with decision points:

  1. Add a "Condition" step to your workflow
  2. Configure evaluation criteria:
  3. Data value conditions
  4. Metadata conditions
  5. External system conditions
  6. Define alternative paths:
  7. Success path
  8. Error path
  9. Conditional branches
  10. Test your conditions
  11. Save configuration
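
A decision point boils down to evaluating conditions against the data (or its metadata) and choosing a path. A minimal sketch, where the 20% missing-value threshold and the `flagged` field are chosen only for illustration:

```python
def route(batch: list[dict]) -> str:
    """Illustrative decision point: choose a path based on data quality."""
    missing_ratio = sum(r.get("value") is None for r in batch) / max(len(batch), 1)
    if missing_ratio > 0.2:
        return "error_path"            # too many gaps: send to review
    if any(r.get("flagged") for r in batch):
        return "conditional_branch"    # flagged records get extra cleaning
    return "success_path"
```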

Error Handling

Ensure robust processing with comprehensive error management:

  1. Configure workflow-level error policies:
       • Stop on error
       • Continue with warnings
       • Retry logic
       • Fallback processing
  2. Set up error notifications
  3. Define recovery procedures
  4. Configure error logging and tracking
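
Retry logic is easiest to see in code. A minimal sketch of exponential-backoff retries around a workflow step; the function names are illustrative, not Data Dispatch APIs.

```python
import time

def run_with_retries(step, attempts: int = 3, base_delay: float = 2.0):
    """Retry a workflow step with exponential backoff before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:                      # in practice, catch specific errors
            if attempt == attempts:
                raise                          # surface to the workflow's error path
            time.sleep(base_delay * 2 ** (attempt - 1))
```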

Monitoring and Management

Workflow Dashboard

Monitor your data processing operations:

  1. Navigate to "Data Dispatch" → "Dashboard"
  2. View active workflows with status indicators
  3. Monitor performance metrics:
  4. Processing volume
  5. Execution time
  6. Success rates
  7. Error frequencies
  8. Drill down into specific workflows
  9. Access logs and execution history

Logging and Auditing

Maintain comprehensive records of data movement:

  1. Navigate to "Data Dispatch" → "Logs"
  2. View detailed event logs:
  3. Execution events
  4. Data transformations
  5. Error records
  6. User actions
  7. Filter logs by:
  8. Workflow
  9. Date range
  10. Event type
  11. Status
  12. Export logs for compliance

Advanced Features

Data Lineage Tracking

Maintain visibility into data origins and transformations:

  1. Navigate to "Data Dispatch" → "Lineage"
  2. View visual representation of data flow
  3. Trace data elements to their sources
  4. Identify transformation history
  5. Understand data dependencies
  6. Export lineage documentation

Versioning and Rollback

Manage changes to your workflows:

  1. Navigate to your workflow
  2. View version history
  3. Compare versions to identify changes
  4. Restore previous versions if needed
  5. Clone versions for new development

Testing and Validation

Ensure workflow quality before deployment:

  1. Navigate to your workflow
  2. Click "Test"
  3. Configure test parameters:
       • Test data selection
       • Execution environment
       • Validation criteria
  4. Run tests and review results
  5. Debug issues as needed
  6. Certify workflow for production

Integration Ecosystem

Supported Systems

Data Dispatch connects with numerous research tools:

  • Laboratory Information Management Systems (LIMS)
  • Electronic Lab Notebooks (ELN)
  • Statistical Analysis Software
  • Machine Learning Platforms
  • Visualization Tools
  • Academic Repositories
  • Instrument Control Software

Custom Connectors

Build connections to specialized systems:

  1. Navigate to "Settings" → "Connectors"
  2. Click "Create Connector"
  3. Configure connection parameters:
  4. Authentication details
  5. Endpoint information
  6. Data format specifications
  7. Mapping templates
  8. Test the connection
  9. Save your custom connector
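
A custom connector typically wraps authentication, endpoint details, and response parsing behind a small interface. The sketch below shows what a REST-based connector could look like; the base URL, bearer-token auth, and endpoint path are placeholders, not a real system's API.

```python
import requests

class RepositoryConnector:
    """Illustrative REST connector; endpoint and token are placeholders."""

    def __init__(self, base_url: str, token: str):
        self.base_url = base_url.rstrip("/")
        self.session = requests.Session()
        self.session.headers["Authorization"] = f"Bearer {token}"

    def fetch_records(self, dataset_id: str) -> list[dict]:
        # Retrieve records for one dataset and fail loudly on HTTP errors
        resp = self.session.get(
            f"{self.base_url}/datasets/{dataset_id}/records", timeout=30
        )
        resp.raise_for_status()
        return resp.json()
```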

Best Practices for Data Dispatch

  • Start with simple workflows and iteratively add complexity
  • Test thoroughly with representative data samples
  • Document your workflows with clear descriptions and comments
  • Monitor performance and optimize as needed
  • Implement appropriate error handling for all critical workflows
  • Use parameterization to create reusable workflow templates
  • Schedule resource-intensive processes during off-peak hours

Next Steps

After configuring your Data Dispatch workflows: