Hamish Morrison of Bloomberg presents a talk on building a high-performance binary serialization format that allows in-place modification. He begins by defining serialization and discussing the portability problems of naive serialization, such as differences in endianness, type size, and padding. He then explores schema-based and schema-less approaches, highlighting the trade-offs between them. Morrison introduces his attempt to create a schema-less serialization format that supports in-place modification of content, detailing the baseline format, its limitations, and a series of optimizations: sorting keys, using offset tables, hashing keys, and implementing a branchless binary search. He also discusses memory allocation strategies, such as bump allocators and freelist allocators, and benchmarks their performance. The presentation concludes with a comparison against unordered maps and a discussion of future work, including buffer compaction and formats without offsets. The talk ends with a Q&A session in which attendees ask about hash collisions, benchmark setups, and alternative approaches.
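The endianness, type-size, and padding problems mentioned above can be illustrated with a small sketch. It is not taken from the talk: the record layout (a one-byte tag followed by a 32-bit value) and the names `TAG` and `VALUE` are assumptions made for illustration, and Python's standard `struct` module stands in for whatever language the talk used.

```python
import struct

# Hypothetical record: a 1-byte tag followed by a 32-bit unsigned value.
TAG, VALUE = 7, 1

# Explicit byte order with no padding: the layout is identical on every machine.
little = struct.pack("<BI", TAG, VALUE)
big = struct.pack(">BI", TAG, VALUE)

# Native mode ("@") uses the host's byte order, type sizes, and alignment,
# so the size and layout can differ from one platform to another.
native_size = struct.calcsize("@BI")
packed_size = struct.calcsize("<BI")

print(little.hex())  # 0701000000
print(big.hex())     # 0700000001
print(packed_size, native_size)  # first value is 5; second depends on the platform
```

Copying a struct's raw memory to disk or the network corresponds to the native mode here, which is why a reader on a different platform may misinterpret the bytes unless the format fixes byte order, field sizes, and padding explicitly.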