We're software that helps growing brands & retailers grow and scale. Sync, sell and ship your products and inventory on online marketplaces and storefronts faster, easier and more accurately.

Learn more now

Batch Processing & Concurrency

Modified on: Thu, Apr 2, 2026 at 3:44 AM

When your automation needs to process tens of thousands of products, running them sequentially in a single job can take hours. Batch processing splits the work into smaller chunks that run in parallel, dramatically reducing total processing time.

When to Use Batching

Batching is most useful when:

Your product catalog exceeds 10,000 items
The automation exports to a slow API endpoint (one request per product)
You want to process large vendor feeds faster by parallelizing the work

Important: Batch processing currently supports product automations only. Order automations are not supported for batching.

How Batching Works

When you set limit_batch_size, the Automation Engine queries all matching products, divides them into chunks of that size, and enqueues each chunk as a separate parallel job. Each batch processes independently, and a finalization step runs once all batches complete to stitch the results together.

Batch Lifecycle

Enqueueing -- The parent automation queries all matching product IDs and splits them into batches. Each batch is saved and enqueued as a separate job.
Processing -- Each batch runs as an independent automation, processing only its assigned product IDs. Multiple batches execute in parallel up to the concurrency limit.
Complete / Failed -- Each batch updates its status when it finishes. If a batch fails, it can be retried automatically.
Finalization -- Once all batches reach a terminal state (complete or failed), the finalizer job runs to consolidate results and update the automation log.

Configuration Options

Core Batch Settings

Add these to a file_configs entry:

"file_configs": [
  {
    "field_map": { "guid": "sku", "price": "price", "stock": "qty" },
    "update": "edit",
    "limit_batch_size": 5000,
    "limit_batch_concurrency": 10
  }
]

Key	Description	Default
limit_batch_size	Number of items per batch. Maximum is 10,000.	-- (batching disabled)
limit_batch_concurrency	Maximum number of batches running in parallel at once. Maximum is 100.	10

Stall Handling

Sometimes a batch can get stuck -- perhaps the worker crashed or the API endpoint became unresponsive. The engine monitors for stalled batches and automatically retries them:

Key	Description	Default
batch_stall_time	Seconds of inactivity before a batch is considered stalled.	10800 (3 hours)
batch_max_retries	Number of times to retry a stalled batch before marking it as failed.	3

When a batch has been inactive longer than batch_stall_time, the engine re-enqueues it. If a batch exceeds batch_max_retries, it is marked as failed so the remaining batches can finish and the finalizer can run.

Batch Throttling

If the target API has strict rate limits, you can throttle how quickly batches are dispatched:

"batch_throttle": {
  "request_limit": 1,
  "time_period": 60
}

Key	Description	Default
request_limit	Number of batches to dispatch within the time period.	Required
time_period	Time window in seconds.	1

Tip: When batch_throttle is enabled, concurrency is automatically forced to 1. This ensures batches run sequentially at the throttled rate rather than all launching at once.

Other Limit Controls

Beyond batching, several other limit settings control how much data an automation processes:

Key	Description
limit_export	When exporting with payload_multi: true, limits the number of objects included in each HTTP request. The automation splits into multiple requests to send all data.
limit_import	Maximum total number of items to process during an import. Items beyond this limit are skipped.
limit_template_size	Limit a template payload to approximately this size in megabytes. Useful when APIs have request body size limits.
limit_files	When using regex to match multiple files, limit how many files are processed.

Example: Processing 50K Products in Batches

Suppose you have a vendor feed with 50,000 products and you want to import updates efficiently:

{
  "name": "Daily Vendor Stock Update",
  "vendor": "Acme Distributor",
  "active": true,
  "schedule": "0 6 * * *",
  "type": "products",
  "action": "import",
  "connection": {
    "type": "sftp",
    "address": "sftp.acme.com",
    "username": "{{sftp_user}}",
    "password": "{{sftp_pass}}",
    "path": "/exports/",
    "port": 22
  },
  "file_configs": [
    {
      "name": "daily_inventory.csv",
      "update": "edit",
      "field_map": {
        "guid": "SKU",
        "stock": "QtyOnHand",
        "price": "DealerPrice"
      },
      "diff_update": true,
      "diff_fields": ["stock", "price"],
      "limit_batch_size": 5000,
      "limit_batch_concurrency": 5,
      "batch_stall_time": 7200,
      "batch_max_retries": 2
    }
  ]
}

With this configuration:

The 50,000 products are split into 10 batches of 5,000 each.
Up to 5 batches run in parallel at a time.
If any batch has no activity for 2 hours, it is retried up to 2 times.
Combined with diff_update, only products with actual stock or price changes generate updates, further reducing processing time.

Warning: Setting limit_batch_size too small (e.g., 100) creates many batches, which adds queue overhead. Setting it too large may negate the parallelism benefit. For most use cases, batch sizes between 2,000 and 10,000 work well.