Jobs writing C++ code for high frequency trading firms (HFTs) and hedge funds can pay very well indeed. Headhunters put compensation (salary and bonus) for such roles at $600k+ a few years ago. But simply knowing C++ is not enough. The language is comparatively fast out of the box, but for low latency trading applications you need to know how to make it really fast.
Paul Bilokon, a former director at Deutsche Bank, visiting professor at Imperial College London, and chief scientific advisor at Thalesians Marine Ltd, says that if you want an integral role as a C++ developer in an HFT team, familiarity with low latency C++ is usually mandatory. Although some firms use programmable FPGAs to achieve ultra low latency, Bilokon says this can be complicated because they require specialized hardware knowledge and languages like Lucid, VHDL and Verilog. “Unless the company is prepared to invest in FPGAs in the long term (both in terms of research and development and ongoing support) it is probably a wise decision to get the most mileage (low latency) out of C++,” he tells us.
However, information on low latency C++ can be hard to come by. A paper* released last year by Bilokon and one of his PhD students looks at 12 techniques for reducing latency in C++ code, as follows:
- Lock-free programming: a concurrent programming paradigm involving multi-threaded algorithms which, unlike their traditional counterparts, do not employ the usage of mutual exclusion mechanisms, such as locks, to arbitrate access to shared resources.
- Single instruction, multiple data (SIMD) instructions: Instructions that take advantage of the parallel processing power of contemporary CPUs, allowing simultaneous execution of multiple operations.
- Mixing data types: When a computation involves both float and double types, implicit conversions are required. If only float computations are used, performance improves.
- Signed vs unsigned: Ensuring consistent signedness in comparisons to avoid conversions.
- Prefetching: Explicitly loading data into cache before it is needed to reduce data fetch delays, particularly in memory-bound applications
- Branch reduction: predicting conditional branch outcomes to allow speculative code execution
- Slowpath removal: minimize execution of rarely executed code paths.
- Short-circuiting: Logical expressions cease evaluation when the final result is determined.
- Inlining: Incorporating the body of a function at each point the function is called, reducing function call overhead and enabling further optimisation by the compiler
- Constexpr: Computations marked as constexpr are evaluated at compile time, enabling constant folding and efficient code execution by eliminating runtime calculations.
- Compile-time dispatch: Techniques like template specialization or function overloading so that optimised code paths are chosen at compile time based on type or value, avoiding runtime dispatch and moving optimisation decisions earlier.
- Cache warming: To minimize memory access time and boost program responsiveness, data is preloaded into the CPU cache before it’s needed.
Source: C++ design patterns for low-latency applications including high-frequency trading
The effectiveness of these techniques is shown in the chart above: while cache warming and constexpr can bring 90% efficiency improvements, using signed comparisons only leads to a 12.5% increase.
If you’re interested in the topic, Bilokon also suggests watching the 2019 conference video by Carl Cook and Nimrod Sapir at QSpark, a provider of low-latency trading platforms, shown here:
*C++ design patterns for low-latency applications including high-frequency trading. GitHub: 0burak/imperial_hft. See also Bilokon’s academic papers.