Parallelization

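Parallelization is a pattern in which multiple independent tasks run concurrently as separate branches, and the workflow continues once the branches have finished. It reduces total latency when the tasks do not depend on each other, because the overall time approaches that of the slowest branch rather than the sum of all branches.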

Basic Structure

states:
  - id: start
    type: initial
    next: parallel_tasks
  
  - id: parallel_tasks
    type: parallel
    branches:
      - id: task1
        states:
          - id: start
            type: initial
            next: process_task1
          - id: process_task1
            type: tool
            tool: process_data
            input:
              data: "${input.data1}"
      - id: task2
        states:
          - id: start
            type: initial
            next: process_task2
          - id: process_task2
            type: tool
            tool: process_data
            input:
              data: "${input.data2}"
    next: combine_results
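
To make the fan-out and fan-in behavior concrete, here is a minimal Python sketch of what a `parallel` state conceptually does. It assumes the engine runs every branch concurrently and moves to `next` only after all branches finish; that behavior is an assumption, not something the configuration above states. The `process_data` coroutine and the `run_parallel` helper are hypothetical stand-ins, not part of any workflow engine's API.

```python
import asyncio


async def process_data(data: str) -> str:
    """Hypothetical stand-in for the `process_data` tool."""
    await asyncio.sleep(0.01)  # simulate I/O-bound work
    return data.upper()


async def run_parallel(branches: dict) -> dict:
    """Run every branch concurrently and return results keyed by branch id."""
    ids = list(branches)
    results = await asyncio.gather(*(branches[i] for i in ids))
    return dict(zip(ids, results))


def run_basic_example(data1: str, data2: str) -> dict:
    """Mirror the two-branch layout of the configuration above."""
    async def main():
        return await run_parallel({
            "task1": process_data(data1),
            "task2": process_data(data2),
        })
    return asyncio.run(main())
```

Keying results by branch id gives the following state (here `combine_results`) a stable way to look up each output regardless of which branch finished first.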

Common Use Cases

1. Data Processing

states:
  - id: start
    type: initial
    next: parallel_processing
  
  - id: parallel_processing
    type: parallel
    branches:
      - id: validate_data
        states:
          - id: start
            type: initial
            next: run_validation
          - id: run_validation
            type: tool
            tool: validate_data
            input:
              data: "${input.data}"
      - id: enrich_data
        states:
          - id: start
            type: initial
            next: run_enrichment
          - id: run_enrichment
            type: tool
            tool: enrich_data
            input:
              data: "${input.data}"
      - id: transform_data
        states:
          - id: start
            type: initial
            next: run_transformation
          - id: run_transformation
            type: tool
            tool: transform_data
            input:
              data: "${input.data}"
    next: merge_results
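
The `merge_results` state is not defined in the example above. As one possible illustration, the sketch below combines the three branch outputs in Python. The output shapes (a `valid` flag and `errors` list from validation, plain dictionaries from enrichment and transformation) are assumptions made for this example only.

```python
def merge_results(branch_outputs: dict) -> dict:
    """Combine branch outputs keyed by branch id (hypothetical shapes)."""
    validation = branch_outputs["validate_data"]
    if not validation.get("valid", False):
        # Reject early: enrichment and transformation results are discarded.
        return {"status": "rejected", "errors": validation.get("errors", [])}

    # Start from the transformed record, then layer enrichment fields on top.
    record = dict(branch_outputs["transform_data"])
    record.update(branch_outputs["enrich_data"])
    return {"status": "ok", "record": record}
```

Because all three branches receive the same input and run at the same time, validation cannot gate the other two. The merge step is where an invalid record gets rejected.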

2. Content Analysis

states:
  - id: start
    type: initial
    next: parallel_analysis
  
  - id: parallel_analysis
    type: parallel
    branches:
      - id: sentiment_analysis
        states:
          - id: start
            type: initial
            next: analyze_sentiment
          - id: analyze_sentiment
            type: llm
            model: sentiment_analyzer
            prompt: "Analyze sentiment: {text}"
            input:
              text: "${input.content}"
      - id: topic_extraction
        states:
          - id: start
            type: initial
            next: extract_topics
          - id: extract_topics
            type: llm
            model: topic_extractor
            prompt: "Extract main topics: {text}"
            input:
              text: "${input.content}"
      - id: keyword_analysis
        states:
          - id: start
            type: initial
            next: analyze_keywords
          - id: analyze_keywords
            type: llm
            model: keyword_analyzer
            prompt: "Extract keywords: {text}"
            input:
              text: "${input.content}"
    next: combine_analysis

3. API Integration

states:
  - id: start
    type: initial
    next: parallel_apis
  
  - id: parallel_apis
    type: parallel
    branches:
      - id: weather_api
        states:
          - id: start
            type: initial
            next: fetch_weather
          - id: fetch_weather
            type: tool
            tool: weather_api
            input:
              location: "${input.location}"
      - id: news_api
        states:
          - id: start
            type: initial
            next: fetch_news
          - id: fetch_news
            type: tool
            tool: news_api
            input:
              location: "${input.location}"
      - id: traffic_api
        states:
          - id: start
            type: initial
            next: fetch_traffic
          - id: fetch_traffic
            type: tool
            tool: traffic_api
            input:
              location: "${input.location}"
    next: combine_api_data

Advanced Patterns

1. Dynamic Parallelization

states:
  - id: start
    type: initial
    next: determine_tasks
  
  - id: determine_tasks
    type: llm
    model: task_planner
    prompt: |
      Determine required tasks for:
      {input}
      
      Return as JSON array of task names      
    input:
      input: "${input.request}"
    next: parallel_execution
  
  - id: parallel_execution
    type: parallel
    branches: "${output.map(task => ({
      id: task,
      states: [
        {
          id: 'start',
          type: 'initial',
          next: 'execute'
        },
        {
          id: 'execute',
          type: 'tool',
          tool: task,
          input: {
            data: '${input.data}'
          }
        }
      ]
    }))}"
    next: combine_results
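
Because the branch list here comes from model output, it is worth validating before anything runs. The sketch below shows one way to do that in Python: parse the JSON array, reject unknown tool names, and build the same branch structure as the expression above. The `KNOWN_TOOLS` allowlist and the `build_branches` helper are hypothetical names introduced for illustration.

```python
import json

# Hypothetical allowlist of tool names the workflow is permitted to run.
KNOWN_TOOLS = {"validate_data", "enrich_data", "transform_data"}


def build_branches(planner_output: str, known_tools=KNOWN_TOOLS) -> list:
    """Turn a JSON array of task names into branch definitions."""
    tasks = json.loads(planner_output)
    if not isinstance(tasks, list) or not all(isinstance(t, str) for t in tasks):
        raise ValueError("planner must return a JSON array of strings")

    unknown = [t for t in tasks if t not in known_tools]
    if unknown:
        raise ValueError(f"unknown tools requested: {unknown}")

    branches = []
    for task in dict.fromkeys(tasks):  # drop duplicates, keep first-seen order
        branches.append({
            "id": task,
            "states": [
                {"id": "start", "type": "initial", "next": "execute"},
                {"id": "execute", "type": "tool", "tool": task,
                 "input": {"data": "${input.data}"}},
            ],
        })
    return branches
```

Dropping duplicates also keeps branch ids unique, which matters if results are later looked up by id.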

2. Conditional Parallelization

states:
  - id: start
    type: initial
    next: check_conditions
  
  - id: check_conditions
    type: llm
    model: condition_checker
    prompt: |
      Determine which tasks to run in parallel:
      {input}
      
      Return as JSON:
      {
        "tasks": ["task1", "task2"],
        "reason": "explanation"
      }      
    input:
      input: "${input.request}"
    next: parallel_tasks
  
  - id: parallel_tasks
    type: parallel
    branches: "${output.tasks.map(task => ({
      id: task,
      states: [
        {
          id: 'start',
          type: 'initial',
          next: 'execute'
        },
        {
          id: 'execute',
          type: 'tool',
          tool: task,
          input: {
            data: '${input.data}'
          }
        }
      ]
    }))}"
    next: combine_results

3. Error Handling in Parallel Tasks

states:
  - id: start
    type: initial
    next: parallel_with_error_handling
  
  - id: parallel_with_error_handling
    type: parallel
    branches:
      - id: task1
        states:
          - id: start
            type: initial
            next: execute_task1
          - id: execute_task1
            type: tool
            tool: task1
            input:
              data: "${input.data}"
            error:
              next: handle_task1_error
          - id: handle_task1_error
            type: tool
            tool: error_handler
            input:
              error: "${error}"
              task: "task1"
      - id: task2
        states:
          - id: start
            type: initial
            next: execute_task2
          - id: execute_task2
            type: tool
            tool: task2
            input:
              data: "${input.data}"
            error:
              next: handle_task2_error
          - id: handle_task2_error
            type: tool
            tool: error_handler
            input:
              error: "${error}"
              task: "task2"
    next: combine_results
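
The idea behind per-branch error states can be sketched in Python as follows: each branch catches its own failure and returns a structured result, so the other branches still complete and the combining step sees every outcome. The `run_branch` wrapper and the two task coroutines are hypothetical, and how a real engine routes `error.next` may differ.

```python
import asyncio


async def run_branch(branch_id, coro):
    """Run one branch and capture its failure instead of raising."""
    try:
        return {"id": branch_id, "ok": True, "value": await coro}
    except Exception as exc:  # plays the role of the branch's error state
        return {"id": branch_id, "ok": False, "error": str(exc)}


async def task1(data):
    """Hypothetical task that succeeds."""
    return f"processed {data}"


async def task2(data):
    """Hypothetical task that always fails."""
    raise RuntimeError("upstream unavailable")


def run_with_error_handling(data):
    async def main():
        return await asyncio.gather(
            run_branch("task1", task1(data)),
            run_branch("task2", task2(data)),
        )
    return asyncio.run(main())
```

With this shape, the combining step can decide whether a partial result is acceptable instead of losing the successful branch's output.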

Best Practices

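  1. Task Independence: Make sure no branch reads another branch's output or relies on the order in which branches finish.
  2. Resource Management: Cap the number of branches that run at once so they do not exhaust rate limits, connections, or memory.
  3. Error Handling: Give each branch its own error path so that one failure does not discard the results of the others.
  4. Result Aggregation: Decide up front how the state after the parallel step combines branch outputs, including partial results.
  5. Timeout Management: Set a timeout on each branch so that a single slow task cannot stall the whole workflow.
  6. Monitoring: Track per-branch progress, duration, and failures to identify slow or unreliable branches.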
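
The resource and timeout practices above can be sketched together in Python with a semaphore that caps concurrency and a per-branch timeout. This is a minimal illustration under stated assumptions: `run_limited`, `max_concurrent`, and `timeout_s` are hypothetical names, and a workflow engine may expose such limits through its own configuration instead.

```python
import asyncio


async def run_limited(jobs, max_concurrent=2, timeout_s=1.0):
    """Run jobs with a concurrency cap and a per-job timeout.

    `jobs` maps a branch id to a zero-argument coroutine function.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(branch_id, job):
        async with semaphore:  # at most `max_concurrent` jobs run at once
            try:
                value = await asyncio.wait_for(job(), timeout=timeout_s)
                return branch_id, {"ok": True, "value": value}
            except asyncio.TimeoutError:
                return branch_id, {"ok": False, "error": "timeout"}

    pairs = await asyncio.gather(*(guarded(i, j) for i, j in jobs.items()))
    return dict(pairs)


async def fast():
    await asyncio.sleep(0.01)
    return "done"


async def slow():
    await asyncio.sleep(5)  # longer than the timeout used in demo()
    return "never returned"


def demo():
    return asyncio.run(run_limited({"fast": fast, "slow": slow}, timeout_s=0.1))
```

The timeout applies to each job after it acquires the semaphore, so time spent waiting for a free slot does not count against it.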

Common Pitfalls

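  1. Task Dependencies: Letting one branch depend on another's output, which introduces hidden ordering requirements and possible race conditions.
  2. Resource Exhaustion: Launching more branches than the system or downstream services can handle, especially when branches are generated dynamically.
  3. Error Propagation: Leaving branch errors unhandled, so that a single failure can abort the step or leave gaps in the combined result.
  4. Result Synchronization: Combining results before every branch has finished, or without accounting for missing outputs.
  5. Timeout Issues: Omitting timeouts, so that one hung branch can block the workflow indefinitely.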
