
Batch Processing & Concurrency

When your automation needs to process tens of thousands of products, running them sequentially in a single job can take hours. Batch processing splits the work into smaller chunks that run in parallel, dramatically reducing total processing time.

When to Use Batching

Batching is most useful when:

  • Your product catalog exceeds 10,000 items
  • The automation exports to a slow API endpoint (one request per product)
  • You want to process large vendor feeds faster by parallelizing the work

Important: Batch processing currently supports product automations only. Order automations are not supported for batching.

How Batching Works

When you set limit_batch_size, the Automation Engine queries all matching products, divides them into chunks of that size, and enqueues each chunk as a separate parallel job. Each batch processes independently, and a finalization step runs once all batches complete to stitch the results together.
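The chunking step described above can be sketched in Python. This is a minimal illustration, not the engine's actual code; enqueue_batch is a hypothetical stand-in for the real job-queue call:

```python
def split_into_batches(product_ids, limit_batch_size):
    """Divide the full list of matching IDs into chunks of at most limit_batch_size."""
    return [
        product_ids[i:i + limit_batch_size]
        for i in range(0, len(product_ids), limit_batch_size)
    ]

def enqueue_batch(batch_ids):
    # Placeholder: the real engine would enqueue a separate parallel job here.
    print(f"enqueued batch of {len(batch_ids)} products")

ids = list(range(12_500))              # pretend 12,500 matching products
batches = split_into_batches(ids, 5_000)
for batch in batches:
    enqueue_batch(batch)               # each chunk becomes its own job
# 12,500 products with limit_batch_size=5000 -> 3 batches (5000, 5000, 2500)
```

The last chunk is simply smaller when the catalog size is not an exact multiple of the batch size.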

Batch Lifecycle

  1. Enqueueing -- The parent automation queries all matching product IDs and splits them into batches. Each batch is saved and enqueued as a separate job.
  2. Processing -- Each batch runs as an independent automation, processing only its assigned product IDs. Multiple batches execute in parallel up to the concurrency limit.
  3. Complete / Failed -- Each batch updates its status when it finishes. If a batch fails, it can be retried automatically.
  4. Finalization -- Once all batches reach a terminal state (complete or failed), the finalizer job runs to consolidate results and update the automation log.
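The finalization gate in step 4 reduces to a simple check: every batch must be in a terminal state. A minimal sketch, assuming statuses are tracked in a plain dict (the real engine's storage and job model are not shown):

```python
# "complete" and "failed" are the two terminal states described above.
TERMINAL = {"complete", "failed"}

def all_batches_terminal(batch_statuses):
    """The finalizer may run only once every batch is complete or failed."""
    return all(status in TERMINAL for status in batch_statuses.values())

statuses = {1: "complete", 2: "processing", 3: "failed"}
print(all_batches_terminal(statuses))   # batch 2 still running -> False

statuses[2] = "complete"
print(all_batches_terminal(statuses))   # now the finalizer can run -> True
```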

Configuration Options

Core Batch Settings

Add these to a file_configs entry:

"file_configs": [
  {
    "field_map": { "guid": "sku", "price": "price", "stock": "qty" },
    "update": "edit",
    "limit_batch_size": 5000,
    "limit_batch_concurrency": 10
  }
]

  • limit_batch_size -- Number of items per batch. Maximum is 10,000. Default: none (batching disabled).
  • limit_batch_concurrency -- Maximum number of batches running in parallel at once. Maximum is 100. Default: 10.

Stall Handling

Sometimes a batch can get stuck -- perhaps the worker crashed or the API endpoint became unresponsive. The engine monitors for stalled batches and automatically retries them:

  • batch_stall_time -- Seconds of inactivity before a batch is considered stalled. Default: 10800 (3 hours).
  • batch_max_retries -- Number of times to retry a stalled batch before marking it as failed. Default: 3.

When a batch has been inactive longer than batch_stall_time, the engine re-enqueues it. If a batch exceeds batch_max_retries, it is marked as failed so the remaining batches can finish and the finalizer can run.
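The stall-and-retry logic just described can be approximated as follows. The field names and retry counter here are hypothetical, chosen only to mirror the behavior in the text:

```python
import time

BATCH_STALL_TIME = 10_800    # seconds of inactivity (3 hours, the default)
BATCH_MAX_RETRIES = 3        # retries before the batch is marked failed

def handle_possible_stall(batch, now=None):
    """Re-enqueue a stalled batch, or fail it once retries are exhausted."""
    now = now or time.time()
    if now - batch["last_activity"] <= BATCH_STALL_TIME:
        return "running"                   # still making progress
    if batch["retries"] >= BATCH_MAX_RETRIES:
        batch["status"] = "failed"         # let remaining batches finish
        return "failed"
    batch["retries"] += 1
    batch["last_activity"] = now           # re-enqueue for another attempt
    return "retried"

batch = {"last_activity": 0, "retries": 3, "status": "processing"}
print(handle_possible_stall(batch, now=20_000))  # retries exhausted -> "failed"
```

Marking an exhausted batch as failed, rather than retrying forever, is what lets the finalizer eventually run.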

Batch Throttling

If the target API has strict rate limits, you can throttle how quickly batches are dispatched. For example, the configuration below dispatches at most one batch every 60 seconds:

"batch_throttle": {
  "request_limit": 1,
  "time_period": 60
}

  • request_limit -- Number of batches to dispatch within the time period. Required; no default.
  • time_period -- Time window in seconds. Default: 1.

Tip: When batch_throttle is enabled, concurrency is automatically forced to 1. This ensures batches run sequentially at the throttled rate rather than all launching at once.
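The pacing this produces can be sketched with a simple loop. This is an illustration of the throttled dispatch rate, not the engine's scheduler; the dispatch print is a placeholder:

```python
import time

def dispatch_throttled(batches, request_limit, time_period, sleep=time.sleep):
    """Dispatch at most `request_limit` batches per `time_period` seconds.

    With throttling on, concurrency is forced to 1, so batches go out
    one window at a time instead of all launching at once.
    """
    for i, batch in enumerate(batches):
        if i > 0 and i % request_limit == 0:
            sleep(time_period)              # wait for the next window
        print(f"dispatching batch {i}")     # placeholder for the real job

# Record waits instead of actually sleeping, to show the pacing:
waits = []
dispatch_throttled(["a", "b", "c"], request_limit=1, time_period=60,
                   sleep=waits.append)
print(waits)   # two 60-second waits between three batches -> [60, 60]
```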

Other Limit Controls

Beyond batching, several other limit settings control how much data an automation processes:

  • limit_export -- When exporting with payload_multi: true, limits the number of objects included in each HTTP request. The automation splits the data across multiple requests to send it all.
  • limit_import -- Maximum total number of items to process during an import. Items beyond this limit are skipped.
  • limit_template_size -- Limits a template payload to approximately this size in megabytes. Useful when APIs have request body size limits.
  • limit_files -- When using regex to match multiple files, limits how many files are processed.
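The limit_export behavior, splitting one export payload into several request-sized chunks, can be sketched like this (send logic omitted; this only shows the splitting):

```python
def export_in_requests(objects, limit_export):
    """Split an export payload into multiple HTTP requests containing
    at most `limit_export` objects each (the payload_multi behavior)."""
    requests = []
    for i in range(0, len(objects), limit_export):
        requests.append(objects[i:i + limit_export])
    return requests

payloads = export_in_requests(list(range(250)), limit_export=100)
print([len(p) for p in payloads])   # three requests: [100, 100, 50]
```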

Example: Processing 50K Products in Batches

Suppose you have a vendor feed with 50,000 products and you want to import updates efficiently:

{
  "name": "Daily Vendor Stock Update",
  "vendor": "Acme Distributor",
  "active": true,
  "schedule": "0 6 * * *",
  "type": "products",
  "action": "import",
  "connection": {
    "type": "sftp",
    "address": "sftp.acme.com",
    "username": "{{sftp_user}}",
    "password": "{{sftp_pass}}",
    "path": "/exports/",
    "port": 22
  },
  "file_configs": [
    {
      "name": "daily_inventory.csv",
      "update": "edit",
      "field_map": {
        "guid": "SKU",
        "stock": "QtyOnHand",
        "price": "DealerPrice"
      },
      "diff_update": true,
      "diff_fields": ["stock", "price"],
      "limit_batch_size": 5000,
      "limit_batch_concurrency": 5,
      "batch_stall_time": 7200,
      "batch_max_retries": 2
    }
  ]
}

With this configuration:

  • The 50,000 products are split into 10 batches of 5,000 each.
  • Up to 5 batches run in parallel at a time.
  • If any batch has no activity for 2 hours, it is retried up to 2 times.
  • Combined with diff_update, only products with actual stock or price changes generate updates, further reducing processing time.
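The batch math in the bullets above can be checked directly. The "waves" figure is a simplification that assumes all batches take roughly the same time:

```python
import math

total_products = 50_000
limit_batch_size = 5_000
limit_batch_concurrency = 5

num_batches = math.ceil(total_products / limit_batch_size)
waves = math.ceil(num_batches / limit_batch_concurrency)

print(num_batches)  # 10 batches of 5,000 each
print(waves)        # roughly 2 waves of 5 parallel batches
```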

Warning: Setting limit_batch_size too small (e.g., 100) creates many batches, which adds queue overhead. Setting it too large may negate the parallelism benefit. For most use cases, batch sizes between 2,000 and 10,000 work well.