Request Queuing
Request queuing is a technique in which requests that exceed current capacity or rate limits are held in a buffer and processed later, rather than being rejected immediately. This lets a system absorb short bursts of activity without dropping requests or returning errors, provided the buffer has room and the backlog eventually drains.
In a trading system, this can be used to smooth out order submissions, ensuring that they are processed in an orderly fashion once the matching engine has capacity. While queuing helps prevent errors, it also introduces additional latency, as the request must wait in the queue before execution.
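Managing the size of these queues, and how long a request may wait in them, is a critical aspect of system design. A queue with no size limit can grow without bound under sustained overload, and a request that has waited too long may no longer be worth executing, for example an order priced against market data that has since moved.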
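
The following Python sketch illustrates one way to bound both the size of a queue and the time a request may spend in it. It is a minimal illustration, not a production design: the class name BoundedRequestQueue, the parameters max_depth and max_wait_s, and the choice to discard requests that have waited too long are all assumptions made for this example.

```python
import time
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedRequest:
    payload: str
    enqueued_at: float  # clock reading taken when the request was buffered


class BoundedRequestQueue:
    """FIFO buffer with a depth cap and a maximum wait time.

    max_depth and max_wait_s are hypothetical tuning knobs for this sketch.
    """

    def __init__(self, max_depth: int, max_wait_s: float, clock=time.monotonic):
        self.max_depth = max_depth
        self.max_wait_s = max_wait_s
        self._clock = clock
        self._items = deque()

    def submit(self, payload: str) -> bool:
        """Buffer a request; return False if the queue is already full."""
        if len(self._items) >= self.max_depth:
            return False
        self._items.append(QueuedRequest(payload, self._clock()))
        return True

    def drain(self, capacity: int):
        """Release up to `capacity` requests in arrival order.

        Returns (ready, expired): payloads to process now, and payloads
        discarded because they waited longer than max_wait_s.
        """
        ready, expired = [], []
        now = self._clock()
        while self._items and len(ready) < capacity:
            req = self._items.popleft()
            if now - req.enqueued_at > self.max_wait_s:
                expired.append(req.payload)
            else:
                ready.append(req.payload)
        return ready, expired
```

In this sketch, a caller would invoke submit() for each incoming request and call drain(n) whenever the downstream component can accept n more requests. A larger max_depth absorbs bigger bursts at the cost of longer worst-case waits, while a smaller max_wait_s keeps executed requests fresh at the cost of discarding more of them.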