Flags¶

Here we describe the options common to all gateway solutions. You can list all options by passing --help on the command line.

Note

Most options have sensible defaults for a balanced configuration.

You should only have to override the defaults if you want to tune the configuration for ultra low latency.

Basic¶

--name: A name used to identify the service.
--instance: Service instance (0-15)

Config¶

--config_file: The config file path. See here for further details.
--secrets_file: An optional secrets file path. See here for further details.

Client¶

--client_listen_address: A filesystem path used by clients to establish communication with the service

Note

Unix domain sockets are used to exchange shared memory file descriptors between clients and gateway. You can therefore only connect clients running on the same host.

After an initial handshake, a Unix domain socket remains connected with the only purpose of checking the liveness of a connected client. All following communication is over shared memory.
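The descriptor exchange described above can be illustrated with the standard SCM_RIGHTS mechanism of Unix domain sockets. This is a minimal sketch and not the gateway's actual handshake: the helper names (send_fd, recv_fd, demo_fd_passing) are hypothetical, both ends run inside one process for brevity, and memfd_create (Linux-specific) stands in for whatever shared memory object the gateway really uses.

```cpp
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#include <cassert>
#include <cstring>

// Send one file descriptor as SCM_RIGHTS ancillary data (hypothetical helper).
bool send_fd(int socket_fd, int fd_to_send) {
  char payload = 'F';  // one byte of ordinary data accompanies the control message
  iovec iov{&payload, sizeof(payload)};
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);
  cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));
  return sendmsg(socket_fd, &msg, 0) == 1;
}

// Receive one file descriptor; returns -1 on failure (hypothetical helper).
int recv_fd(int socket_fd) {
  char payload = 0;
  iovec iov{&payload, sizeof(payload)};
  alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);
  if (recvmsg(socket_fd, &msg, 0) != 1) return -1;
  cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
  if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS) return -1;
  int fd = -1;
  std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
  return fd;
}

// Pass a shared memory descriptor over a socket pair, then communicate
// through the mapping only (the socket is no longer needed for data).
bool demo_fd_passing() {
  constexpr size_t size = 4096;
  int sockets[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sockets) != 0) return false;
  int shm_fd = memfd_create("demo", 0);  // assumption: Linux, anonymous shared memory
  if (shm_fd < 0 || ftruncate(shm_fd, size) != 0) return false;
  if (!send_fd(sockets[0], shm_fd)) return false;
  int received_fd = recv_fd(sockets[1]);
  if (received_fd < 0) return false;
  auto *writer = static_cast<char *>(mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, shm_fd, 0));
  auto *reader = static_cast<char *>(mmap(nullptr, size, PROT_READ, MAP_SHARED, received_fd, 0));
  if (writer == MAP_FAILED || reader == MAP_FAILED) return false;
  std::strcpy(writer, "hello");                     // written through the sender's mapping
  bool ok = std::strcmp(reader, "hello") == 0;      // visible through the receiver's mapping
  munmap(writer, size);
  munmap(reader, size);
  close(shm_fd);
  close(received_fd);
  close(sockets[0]);
  close(sockets[1]);
  return ok;
}
```

Calling demo_fd_passing() should return true on a Linux host. Because descriptors can only be passed this way between processes on the same machine, the sketch also shows why clients must run on the same host as the gateway.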

Authentication¶

To unlock all features you must acquire an access token from the license manager.

Warning

You should NOT expect the gateway to be fully operational without a valid access token.

--auth_keys_file: A file containing a public/private key pair generated by roq-keygen (from the roq-tools package). See roq-keygen for further details.
--auth_license_manager_uri (https://auth-1.roq-trading.com/,https://auth-2.roq-trading.com/): URIs used to contact the license manager (comma-separated list)
--auth_proxy: Proxy end-point (URI)
--auth_is_uat (false): Is the service deployed to UAT?
--auth_hide_name (false): The name is useful when investigating which services currently hold licenses. However, you may decide to opt out of revealing the service name.
--auth_refresh_freq (30min): Time between refresh
--auth_retry_freq (1min): Retry frequency
--auth_lock_period (2h): Desired lock period. The license manager ultimately decides whether this is accepted or overridden.

Cache¶

--cache_dir (roq-cache): Directory used to cache data between runs, e.g. config history or authentication tokens.
--cache_config_retain_users_for (168h): User configuration is retained for this period.

Note

The user_id is an integer which is encoded into the ClOrdID when order requests are being sent to an exchange. If a gateway is restarted, it must somehow remember the mapping from user_id back to the name found in the configuration file. The cached config history is used to track this mapping.

Since user_id has a limited range (254 possible values), we need to recycle IDs. However, because the exchange remembers orders for a certain period, it’s not safe to recycle an ID immediately after a name has been removed from (or renamed in) the config file. We therefore retain the “last seen” timestamp and only recycle the ID after the period configured here.

Warning

There are no guarantees that an old user_id mapping can’t collide with a newly created mapping.

You should ensure orders are canceled when a user name is removed (or renamed). And you should set this value to a longer period than the one used by the exchange to publish closed orders.
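The recycling rule can be sketched as follows. This is an illustration only, not the gateway's implementation: the class name UserIdCache, the ID range 1 to 254, and the lowest-ID-first allocation order are all assumptions. Only the rule itself comes from the text above: an ID becomes reusable once its name has not been seen for the retention period.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

using Clock = std::chrono::system_clock;

// Hypothetical cache mapping user names to small integer IDs.
class UserIdCache {
 public:
  explicit UserIdCache(std::chrono::hours retain_for) : retain_for_(retain_for) {}

  // Call for every user name found in the current config file.
  // Returns the ID for the name, or nothing if all IDs are still retained.
  std::optional<std::uint8_t> acquire(const std::string &name, Clock::time_point now) {
    // Known name: keep the ID and refresh the "last seen" timestamp.
    for (auto &[id, entry] : entries_) {
      if (entry.name == name) {
        entry.last_seen = now;
        return id;
      }
    }
    // New name: take the lowest ID that is unused or whose retention has expired.
    for (int candidate = 1; candidate <= 254; ++candidate) {  // assumed range
      auto id = static_cast<std::uint8_t>(candidate);
      auto iter = entries_.find(id);
      if (iter == entries_.end() || now - iter->second.last_seen >= retain_for_) {
        entries_[id] = Entry{name, now};
        return id;
      }
    }
    return std::nullopt;
  }

 private:
  struct Entry {
    std::string name;
    Clock::time_point last_seen;
  };
  std::chrono::hours retain_for_;
  std::map<std::uint8_t, Entry> entries_;
};
```

With a retention of 168h, a name removed from the config keeps its ID reserved for a week: a new name added 100 hours later gets a fresh ID, whereas one added 200 hours later may be handed the old ID. This is the collision risk the warning refers to if the exchange still reports orders carrying that ID.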

The gateway will cache certain views (based on rolled-up events) for the following purposes:

Allow download of current state to newly established client connections.
Early detection of e.g. bad order book state. These situations may occur as a result of lost messages and/or programming mistakes. It is important that the gateways can act as a first line of protection so that clients (trading strategies) don’t act on incorrect information.

Filter¶

--cache_all_reference_data (false): This flag allows you (opt-in) to cache all reference data including symbols that would normally be discarded due to regex matching.

Event-Log¶

Gateways may opt in to persist a log of all events. This is useful for simulation, back-testing and investigations.

Event logs are saved to this relative path:

<category>/<iso-week>/<name>/<name>-<category>-<timestamp>.roq

Category is either md or om
The ISO week is formatted like this 2020-W37.
The name is what you have specified as --name.
The filename is constructed from startup time in milliseconds since the epoch.
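As an illustration of the naming scheme, the following sketch builds the relative path from its parts. The function name event_log_path and the gateway name deribit are hypothetical, and the use of UTC for the ISO week is an assumption; the source does not say which time zone the gateway uses.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <string>

// Build <category>/<iso-week>/<name>/<name>-<category>-<timestamp>.roq
// from the startup time given in milliseconds since the epoch.
std::string event_log_path(const std::string &category, const std::string &name,
                           std::chrono::milliseconds startup_time) {
  auto seconds = static_cast<std::time_t>(
      std::chrono::duration_cast<std::chrono::seconds>(startup_time).count());
  std::tm utc{};
  gmtime_r(&seconds, &utc);  // assumption: the ISO week is derived from UTC
  char iso_week[16];
  std::strftime(iso_week, sizeof(iso_week), "%G-W%V", &utc);  // e.g. 2020-W37
  return category + "/" + iso_week + "/" + name + "/" + name + "-" + category + "-" +
         std::to_string(startup_time.count()) + ".roq";
}
```

For example, a market data gateway named deribit started at 2020-09-10 00:00:00 UTC (1599696000000 ms since the epoch) would, under these assumptions, write to md/2020-W37/deribit/deribit-md-1599696000000.roq.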

Event logs are not persisted to disk in real time. This is primarily to avoid wearing out your storage device and to allow for better compression.

--event_log_input_buffer_size (16777216): Event log input buffer size (must be power of two)
--event_log_output_buffer_size (16777216): Size of the output buffer in bytes (must be power of two)
--event_log_compression_level (4): Compression level (1-9, lower is faster, higher will compress more)
--event_log_dir: Directory used to write an event-log.
--event_log_iso_week (true): Event-log sub-directory as ISO week? (alternative is YYYY-MM-DD)
--event_log_sync_freq (300s): Maximum time between buffering (encoding and compression) and write-to-disk.
--event_log_utimes_on_sync (false): Update last modified/access times on sync?
--event_log_encoding (flatbuffers): Event-log encoding (native or flatbuffers)
--event_log_symlink (true): Create symlink to latest event-logs at the top-level directory?

Journal¶

Gateways may optionally cache routing_id.

--oms_cache (false): Enable?
--oms_multicast_port: Port(s) used to broadcast OrderAck and OrderUpdate messages.
--oms_multicast_address: Optional IPv4 address(es), e.g. 127.0.0.1 for localhost (also the default).
--oms_local_interface: Optional IPv4 address(es), e.g. 127.0.0.1 for localhost (also the default).
--oms_multicast_ttl (4): Multicast TTL
--oms_multicast_loop (false): Multicast available on localhost?
--oms_listen_port: UDP listen port (used by roq-journal to ack)

Database¶

--cache_database_uri: Cache database URI
--cache_database_username: Cache database username
--cache_database_password: Cache database password
--cache_database_name (roq_cache): Cache database name
--cache_orders_retain_stale_for (168h): Stale orders are retained for this period.

UDP¶

Gateways may optionally broadcast messages to UDP.

Note

The gateway will only publish to localhost. It is possible to use the TEE target of iptables to clone packets for distribution to other hosts (serverfault).

--udp_snapshot_address: Optional IPv4 address(es), e.g. 127.0.0.1 for localhost (also the default).
--udp_snapshot_port: Port(s) used to broadcast snapshot messages
--udp_incremental_address: Optional IPv4 address(es), e.g. 127.0.0.1 for localhost (also the default).
--udp_incremental_port: Port(s) used to broadcast incremental update messages.
--udp_snapshot_delay_freq (10ms): UDP snapshot frequency (delay between updates)
--udp_snapshot_repeat_freq (1min): Minimum time until repeat
--udp_incremental_heartbeat_freq (3s): Heartbeat frequency, e.g. 3s.
--udp_encoding (flatbuffers): Message encoding
--udp_enable_mbp (false): MbP is currently opt-in

Service¶

All services (gateways, clients, bridges, …) support an HTTP interface for accessing and/or updating internal state.

--service_listen_address: The path or port used to expose service metrics
--url_prefix: The HTTP URL prefix path used to query for service metrics
--web_dir: Directory containing static files, e.g. $CONDA_PREFIX/share/roq/web
--disable_service_manager (false): Disable support for the service manager?

Loop¶

Gateway services are primarily implemented around a core event loop.

Note

The defaults allow the gateway to sleep (and therefore conserve CPU). This is suitable for low latency (response time: double-digit microseconds).

For ultra low latency (response time: single-digit microseconds), make sure to pin the dispatch thread to a CPU using --loop_cpu_affinity and choose the following options for busy-polling:

--loop_sleep=0ns
--loop_timer_freq=250ns

--loop_cpu_affinity (-1): Used to pin the main dispatch thread to a specific CPU
--loop_sleep (500ns): Used to relinquish control to the kernel
--loop_timer_freq (2500ns): Timer frequency
--scheduling_policy: Thread scheduling policy
--scheduling_priority: Thread scheduling priority
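To make the effect of pinning concrete, here is a minimal Linux-only sketch of restricting the calling thread to a single CPU with sched_setaffinity. The helper names are hypothetical, and treating a negative value as "do not pin" is an assumption based on the -1 default shown above; the gateway's own implementation may differ.

```cpp
#include <sched.h>

#include <cassert>

// Returns the first CPU the calling thread is currently allowed to run on, or -1.
int first_allowed_cpu() {
  cpu_set_t allowed;
  CPU_ZERO(&allowed);
  if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
  for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
    if (CPU_ISSET(cpu, &allowed)) return cpu;
  }
  return -1;
}

// Pin the calling thread to one CPU.
// Assumption: a negative value means "leave the affinity unchanged".
bool pin_current_thread(int cpu) {
  if (cpu < 0) return true;
  if (cpu >= CPU_SETSIZE) return false;
  cpu_set_t mask;
  CPU_ZERO(&mask);
  CPU_SET(cpu, &mask);
  // pid 0 refers to the calling thread
  return sched_setaffinity(0, sizeof(mask), &mask) == 0;
}
```

A dispatch thread would call pin_current_thread(cpu) once before entering its loop. Pinning is most effective when the chosen CPU is also isolated from other workloads, which is a matter of host configuration rather than of these flags.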

Note

The core event loop looks a bit like this (pseudocode):

std::chrono::nanoseconds next_timer_update = {};
while (true) {
  auto now = get_monotonic_clock();
  if (now > next_timer_update) {
    next_timer_update = now + FLAGS_loop_timer_freq;
    drain_epoll_queues();  // socket processing (*no* wait!)
    gateway_timer();  // gateway specific, perhaps send ping to exchange
  }
  drain_shared_memory_queues();  // connected clients
  if (FLAGS_loop_sleep.count() > 0) {  // 0ns means busy-polling (never sleep)
    std::this_thread::sleep_for(FLAGS_loop_sleep);
  }
}

The core event loop is primarily designed for low latency, i.e. busy-polling.
--loop_sleep=0ns will never relinquish the CPU.
--loop_timer_freq=250ns will regularly drain socket queues. This is about achieving the best responsiveness when communicating with the exchange, whilst reducing the time spent transitioning between user space and kernel. It is important to understand that syscalls are expensive and will block the gateway from responding to client requests communicated over shared memory. You are encouraged to experiment and optimize for your own server configuration.
--loop_timer_freq=0ns will always drain socket queues. This may be a good choice, if you use a kernel bypass solution, e.g. something based on SolarFlare or DPDK. You are encouraged to experiment and optimize for your own server configuration.
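The interaction between the timer frequency and the two kinds of work can be explored with a small simulation of the loop. This is a sketch under stated assumptions: run_loop and LoopStats are hypothetical names, the clock is injected so the result is deterministic, the real handlers are replaced by counters, and the sleep step is left out.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

using std::chrono::nanoseconds;

struct LoopStats {
  int timer_updates = 0;         // socket drain + gateway timer (syscall path)
  int shared_memory_drains = 0;  // client requests (no syscall)
};

// Run a fixed number of iterations of the loop shape shown in the pseudocode.
LoopStats run_loop(nanoseconds loop_timer_freq, int iterations,
                   const std::function<nanoseconds()> &clock) {
  LoopStats stats;
  nanoseconds next_timer_update{};
  for (int i = 0; i < iterations; ++i) {
    auto now = clock();
    if (now > next_timer_update) {
      next_timer_update = now + loop_timer_freq;
      ++stats.timer_updates;  // stands in for drain_epoll_queues() and gateway_timer()
    }
    ++stats.shared_memory_drains;  // stands in for drain_shared_memory_queues()
  }
  return stats;
}
```

With a fake clock that advances 100ns per iteration, ten iterations with a timer frequency of 250ns reach the socket path four times, whereas a timer frequency of 0ns reaches it on every iteration. Shared memory is drained ten times in both cases, which is the trade-off the flag controls.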

IPC¶

--ipc_copy_out_buffer_size (8388608): buffer size used when copying from a shared inter-process stream
--ipc_spmc_queue_size (134217728): std/inter-process spmc queue size (must be power of two)
--ipc_spsc_queue_size (1048576): std/inter-process spsc queue size (must be power of two)
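The source does not say why queue sizes must be a power of two. A common reason in ring-buffer designs, assumed here, is that a sequence number can then be wrapped into a slot index with a bit mask instead of a division. The helper names below are hypothetical; the sketch can be used to check a candidate value before passing it on the command line.

```cpp
#include <cassert>
#include <cstdint>

// True if value is a non-zero power of two.
constexpr bool is_power_of_two(std::uint64_t value) {
  return value != 0 && (value & (value - 1)) == 0;
}

// Wrap an ever-increasing sequence number into a slot index.
// Only valid when queue_size is a power of two.
constexpr std::uint64_t slot_index(std::uint64_t sequence, std::uint64_t queue_size) {
  return sequence & (queue_size - 1);
}

// The documented defaults satisfy the constraint.
static_assert(is_power_of_two(134217728));  // --ipc_spmc_queue_size, 2^27
static_assert(is_power_of_two(1048576));    // --ipc_spsc_queue_size, 2^20
```

A value such as 1000000 would fail the check, while 1048576 passes. With the assumed masking scheme, sequence 1048581 in a queue of 1048576 slots maps to slot 5.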

IO¶

--io_backend (libevent): IO backend
--io_recv_buffer_size: IO receive buffer size
--io_recv_buffer_count: IO receive number of buffers
--io_send_buffer_size: IO send buffer size
--io_send_buffer_count: IO send number of buffers

Net¶

Gateways may opt in to use any of these flags:

--net_connection_timeout (5s): Connection timeout
--net_disconnect_on_idle_timeout (30s): Disconnect if nothing ‘important’ has been received before the timeout. This is only implemented where relevant and, when it is, it depends on the communication protocol
--net_tls_validate_certificate (false): Validate TLS certificate?

Experimental¶

--cache_mbp_checksum (false): Compute checksum? This can be useful when verifying the client order book is exactly identical to the gateway’s.
--enable_risk (false): Allows a risk manager (drop-copy client) to publish positions and risk limits. Order requests will then be validated against current positions, current risk exposure and risk limits. See here for further details.
--enable_portfolio (false): Allows a position manager (drop-copy client) to publish positions. This is a subset of the functionality provided by --enable_risk.
--enable_risk_reduce_only (false): Enable reduce-only checking? (requires --enable_risk)
--enable_rest_cache (false): Enable caching for the REST service
--enable_time_series (false): Enable time-series

Logging¶

Warning

It is preferable to use a logging service that reads directly from stdout and stderr. Any other mechanism (pipe to file or using the --log_path flag) will likely create a direct dependency on the filesystem. A filesystem dependency may introduce unexpected high latency from time to time.

--log_flush_freq: Frequency at which the log buffers will be flushed.
--log_path: Path used for logging.
--log_max_size: Log files will be rotated when this size is reached.
--log_max_files: Retain at most this number of log files.
--log_rotate_on_open: Rotate log files on open?