At the NVIDIA developer workshop days I attended last week, the following paper was highly recommended:
“s1: Simple test-time scaling” by Niklas Muennighoff, Zitong Yang, Weijia Shi, Xiang Lisa Li, Li Fei-Fei, Hannaneh Hajishirzi, Luke Zettlemoyer, Percy Liang, Emmanuel Candès, Tatsunori Hashimoto
GitHub project: https://github.com/simplescaling/s1
Not directly related, but an interesting application of the NVIDIA RAPIDS Accelerator that was also presented: https://aws.amazon.com/blogs/industries/accelerating-fraud-detection-in-financial-services-with-rapids-accelerator-for-apache-spark-on-aws/