Advanced

Order Management

This section discusses scenarios in which no ack or update is received after an order request has been sent.

Before jumping in, this is the normal sequence of events:

  • Client sends a roq::CreateOrder request to a gateway.

  • Gateway translates the request and forwards it to the exchange.

  • Gateway responds to client with an roq::OrderAck.

  • Gateway receives an ack from the exchange.

  • Gateway sends an roq::OrderAck to client.

  • Gateway receives an order update from the exchange.

  • Gateway sends an roq::OrderUpdate to client.

  • Client can use the roq::is_order_complete() function to check if the order is likely to have completed.

This sequence may be all you see for, e.g., IOC or FOK limit orders. For other order types, the client should expect one or more roq::OrderUpdate events.

So what could go wrong?

A client communicates with a gateway using a dedicated shared-memory ring-buffer.

Q: Can the connection between client and gateway be broken?

Heartbeats are continuously exchanged in both directions between client and gateway. Either side will automatically disconnect if a heartbeat has not been received in time. The client will then receive a roq::Connection event notifying it of the change in connection status.
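The heartbeat rule can be illustrated with a small timeout check. This is only a sketch: HeartbeatMonitor, its member names and the timeout value used in the usage note are hypothetical and not part of the roq API, which handles heartbeats internally.

```cpp
#include <chrono>

// Hypothetical helper (not part of the roq API): remembers when the last
// heartbeat was seen and reports when the peer should be considered gone.
class HeartbeatMonitor {
 public:
  using Clock = std::chrono::steady_clock;

  HeartbeatMonitor(Clock::duration timeout, Clock::time_point now)
      : timeout_{timeout}, last_seen_{now} {}

  // Call whenever a heartbeat arrives from the peer.
  void on_heartbeat(Clock::time_point now) { last_seen_ = now; }

  // True when no heartbeat has been seen for longer than the timeout.
  bool expired(Clock::time_point now) const {
    return (now - last_seen_) > timeout_;
  }

 private:
  Clock::duration timeout_;
  Clock::time_point last_seen_;
};
```

With a 5 second timeout, for example, expired() stays false up to and including 5 seconds after the last heartbeat and becomes true after that, at which point either side would drop the connection.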

Q: Can the ring buffer overflow?

This happens when the client sends requests too quickly for the gateway to process. The client library will automatically identify that the ring buffer has no more space and synchronously throw a roq::Full exception. This should terminate the client process since it would be indicative of a more fundamental problem with the implementation of the trading strategy.
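The overflow behaviour described here can be sketched with a fixed-capacity ring buffer that fails synchronously on the producer side. Everything in the block is an assumption made for illustration: Full is a local stand-in for the exception named in the text, and RingBuffer is a single-threaded toy, not the shared-memory implementation used by the client library.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for the exception described in the text.
struct Full : std::runtime_error {
  Full() : std::runtime_error{"ring buffer is full"} {}
};

// Minimal single-threaded ring buffer. A real shared-memory queue would
// need atomics and memory ordering, which are omitted here.
template <typename T, std::size_t N>
class RingBuffer {
 public:
  // Producer side: fails synchronously when there is no free slot.
  void push(const T &value) {
    if (size_ == N)
      throw Full{};
    buffer_[(head_ + size_) % N] = value;
    ++size_;
  }

  // Consumer side: returns false when there is nothing to read.
  bool pop(T &value) {
    if (size_ == 0)
      return false;
    value = buffer_[head_];
    head_ = (head_ + 1) % N;
    --size_;
    return true;
  }

 private:
  std::array<T, N> buffer_{};
  std::size_t head_ = 0;  // index of the oldest unread element
  std::size_t size_ = 0;  // number of unread elements
};
```

Because the failure is raised in the producer's own call stack, the sender learns about the overflow immediately instead of silently losing a request.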

A gateway communicates with many clients using a single shared-memory ring-buffer.

Q: Can the client miss any updates from the gateway?

This can only happen if the client is too slow in processing the gateway updates. The client library will automatically detect that the client can’t keep up and asynchronously throw a roq::Slow exception. The framework will automatically reset all communication by reconnecting the client with the gateway. This will initiate the download phase: the client will then receive a list with the last known status of all working orders known to the gateway. The client may never receive the roq::OrderAck meant for it, but it will receive an image with the latest known roq::OrderUpdate.

A gateway is connected to the exchange using TCP/IP.

Q: Can the gateway ack successfully and the order request still fail?

The gateway may think the exchange connection is up and in a good state. It will then try to send a message to the exchange. A broken connection will only be detected later since the actual sending is asynchronous. A broken connection will trigger a disconnect to the exchange which in turn will issue a roq::OrderManagerStatus broadcast to connected clients.

Q: Are there any uncertainties following a disconnect?

A disconnect is by no means a guarantee that the order request did not reach the market: the request may or may not have reached the exchange.

The following can be done to reduce uncertainty:

  • Whenever possible, the gateway should be configured to instruct the exchange to cancel all orders upon detecting a disconnect.

  • The gateway should stop the client from sending any further requests until the situation can be clarified.

  • The gateway should download working orders, done trades and current positions upon reconnecting to the exchange.

  • The gateway should forward this information to clients.

  • Clients must reset all order management, possibly request orders to be cancelled and then resume.
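The last step in the list, resetting client-side order management and rebuilding it from the download, might look roughly like the sketch below. OrderSnapshot, OrderCache and the callback names are hypothetical and only mirror the phases described in the text; they are not roq types or handlers.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical, simplified view of an order as reported during download.
struct OrderSnapshot {
  std::uint64_t order_id;
  double remaining_quantity;
};

// Hypothetical client-side cache of working orders.
class OrderCache {
 public:
  // On disconnect: nothing cached can be trusted any longer.
  void on_disconnected() {
    orders_.clear();
    ready_ = false;
  }

  // During download: rebuild the cache from the last known state.
  // Orders with nothing remaining are not working and are skipped.
  void on_download(const OrderSnapshot &snapshot) {
    if (snapshot.remaining_quantity > 0.0)
      orders_[snapshot.order_id] = snapshot;
  }

  // After download: the strategy may resume, or cancel what it finds.
  void on_download_end() { ready_ = true; }

  bool ready() const { return ready_; }
  std::size_t working_orders() const { return orders_.size(); }

 private:
  std::unordered_map<std::uint64_t, OrderSnapshot> orders_;
  bool ready_ = false;
};
```

The strategy would refuse to send new requests while ready() is false and, once the download has ended, decide per working order whether to keep or cancel it.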

Q: What if the message is lost somewhere between the gateway and exchange?

The TCP protocol will retransmit. This obviously impacts latency, but does not otherwise affect the logic of a trading strategy.

Note

For this reason it’s important to monitor your system for TCP retransmissions.

Q: The exchange has received the message, but no ack (or update) is received?

The most likely reason is a broken connection. The disconnect will be detected and the reconnect + download will bring the client back to a safe state.

Another reason could be that the exchange never sends a response. One can only assume this to be a bug and that all market participants would suffer, causing the exchange to halt further trading.

The trading strategy should capture this by monitoring for timeouts. The default action should be to alert the trader and enter a mode disallowing any further trading.
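A timeout monitor of this kind could be sketched as follows. RequestTimeoutMonitor and its methods are hypothetical names chosen for illustration, and the choice of timeout is left to the strategy; alerting the trader is represented only by the halted flag.

```cpp
#include <chrono>
#include <cstdint>
#include <unordered_map>

// Hypothetical helper: remembers when each request was sent and halts
// trading if any request waits too long for an ack or update.
class RequestTimeoutMonitor {
 public:
  using Clock = std::chrono::steady_clock;

  explicit RequestTimeoutMonitor(Clock::duration timeout)
      : timeout_{timeout} {}

  void on_request_sent(std::uint64_t order_id, Clock::time_point now) {
    pending_[order_id] = now;
  }

  // Any ack or update for the order clears the pending request.
  void on_response(std::uint64_t order_id) { pending_.erase(order_id); }

  // Call periodically. Once a timeout has been seen, the monitor stays
  // halted until the process is restarted by an operator.
  bool check(Clock::time_point now) {
    for (const auto &entry : pending_) {
      if ((now - entry.second) > timeout_)
        halted_ = true;
    }
    return halted_;
  }

  bool trading_allowed() const { return !halted_; }

 private:
  Clock::duration timeout_;
  std::unordered_map<std::uint64_t, Clock::time_point> pending_;
  bool halted_ = false;
};
```

The halted state is deliberately sticky: a response arriving late does not re-enable trading, which matches the recommendation to stop and involve the trader.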

Q: The exchange only responds with an update (no ack)?

This really depends on the exchange protocol. The gateway tries (best-effort) to communicate an ack (before an update) to the client.

The client should therefore update its own order status based on either an ack or an update.
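One way to make client-side order status robust to a missing or late ack is to let either message advance the state and never move it backwards. The sketch below assumes a simplified set of states and inputs; OrderState, OrderTracker and the accepted / remaining_quantity parameters are hypothetical and do not reflect the fields of roq::OrderAck or roq::OrderUpdate.

```cpp
// Hypothetical, simplified client-side order states.
enum class OrderState { Sent, Accepted, Working, Completed, Rejected };

// Hypothetical tracker: either an ack or an update may advance the state.
class OrderTracker {
 public:
  // An ack only matters while nothing else is known about the order.
  // A late ack, arriving after an update, is ignored.
  void on_ack(bool accepted) {
    if (state_ != OrderState::Sent)
      return;
    state_ = accepted ? OrderState::Accepted : OrderState::Rejected;
  }

  // An update is applied even if no ack was ever seen.
  void on_update(double remaining_quantity) {
    if (state_ == OrderState::Rejected || state_ == OrderState::Completed)
      return;
    state_ = remaining_quantity > 0.0 ? OrderState::Working
                                      : OrderState::Completed;
  }

  OrderState state() const { return state_; }

 private:
  OrderState state_ = OrderState::Sent;
};
```

With this shape, an order that only ever receives an update still reaches Working or Completed, and an ack that shows up afterwards cannot reset it to Accepted.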

Q: Can the gateway miss a subsequent order update from the exchange?

This is a very hard problem: updates may arrive after a very long time. There is currently no automated solution to this problem.

The client can (and should) implement indirect ways to detect such message loss.

It is possible to accumulate fills per order-id and compare the sum to what has been seen from the last roq::OrderUpdate. This is the better solution because fills are (normally) reported with relatively low latency.
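The per-order comparison can be sketched as below. It assumes that each fill carries a quantity and that the latest order update reports a cumulative traded quantity; FillReconciler and the parameter names are hypothetical and are not taken from the roq message definitions.

```cpp
#include <cmath>
#include <cstdint>
#include <unordered_map>

// Hypothetical helper: compares accumulated fills per order-id against the
// cumulative traded quantity from the most recent order update.
class FillReconciler {
 public:
  void on_fill(std::uint64_t order_id, double quantity) {
    filled_[order_id] += quantity;
  }

  // traded_quantity is assumed to be cumulative for the order.
  void on_order_update(std::uint64_t order_id, double traded_quantity) {
    reported_[order_id] = traded_quantity;
  }

  // False suggests that a fill or an order update has been missed.
  bool consistent(std::uint64_t order_id, double tolerance = 1e-9) const {
    double diff = lookup(filled_, order_id) - lookup(reported_, order_id);
    return std::fabs(diff) <= tolerance;
  }

 private:
  using Map = std::unordered_map<std::uint64_t, double>;

  // Unknown order-ids are treated as having zero quantity.
  static double lookup(const Map &map, std::uint64_t order_id) {
    auto iter = map.find(order_id);
    return iter == map.end() ? 0.0 : iter->second;
  }

  Map filled_;
  Map reported_;
};
```

Since fills and order updates can arrive in either order, a strategy would typically only act on an inconsistency that persists for some grace period.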

It is also possible to use the position feed (when available) and compare to the accumulation of the fills as reported by all roq::OrderUpdate’s (added to a position captured before any order was sent by the trading strategy). This is not the preferred solution since position feeds are typically fetched by polling and therefore quite slow. However, it is the most accurate way to reconcile with e.g. fills.
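The position-based check amounts to comparing the position feed with a starting position plus the signed sum of fills. The sketch below is illustrative only: PositionReconciler, the is_buy flag and the tolerance are assumptions, not roq types or fields.

```cpp
#include <cmath>

// Hypothetical helper: expected position is the position captured before
// any order was sent, plus all buys, minus all sells.
class PositionReconciler {
 public:
  explicit PositionReconciler(double start_position)
      : expected_{start_position} {}

  void on_fill(bool is_buy, double quantity) {
    expected_ += is_buy ? quantity : -quantity;
  }

  // Compare against the (typically slower) position feed.
  bool matches(double reported_position, double tolerance = 1e-9) const {
    return std::fabs(reported_position - expected_) <= tolerance;
  }

  double expected() const { return expected_; }

 private:
  double expected_;
};
```

Because the position feed is typically polled, a mismatch right after a fill is expected; the comparison is only meaningful once the feed has had time to catch up.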

These are the scenarios relevant to sending an order request:

| Scenario                            | Client observes              | Client must                     | Uncertainties? | Safe to continue? |
|-------------------------------------|------------------------------|---------------------------------|----------------|-------------------|
| Gateway down                        | roq::Connection              | Wait for reconnect and download | No             | Yes               |
| Client sends requests too fast      | roq::Full exception          | Stop trading                    | Maybe          | No                |
| Client too slow                     | roq::Slow exception          | Wait for reconnect and download | Maybe          | Maybe             |
| Gateway lost connection to exchange | roq::OrderManagerStatus      | Wait for reconnect and download | Yes            | Yes               |
| Exchange never sends a response     | timeout                      | Stop trading                    | Yes            | No                |
| Position mismatch                   | Internal accounting mismatch | Depends on trading strategy     | Maybe          | Maybe             |

Conclusions

  • The period from when a disconnect is detected until the reconnect + download has completed is critical: the trader must use other solutions to cancel working orders. The exchange may be able to assist by auto-cancelling orders, but the timeout period is typically very long (possibly 30-60 seconds).

  • Reconnect + download is meant to bring the client back to a good state so it can safely continue. The client is responsible for monitoring the relevant events (mentioned above) and for resetting internal state during reconnect + download. The client may detect working orders after download: it is the responsibility of the trading strategy to decide whether each order can continue or should be cancelled.

  • Certain scenarios are indicative of software bugs (client too slow/fast, exchange never sending ack) and should possibly make the trading strategy choose to enter a mode where it doesn’t do anything and it must be manually stopped (to prevent auto-start by e.g. systemd).

  • Internal accounting should be implemented to monitor positions using a number of different methods. The client may decide to enter a mode where it can’t place any further orders until the accounting methods agree (again). The comparison must take into account the possibly very different latency profiles of order updates, fills and position updates.