Researchers are increasingly focused on enabling secure data analytics across multiple parties holding vertically distributed data, and a critical component of this is secure data join. Shuyu Chen from Fudan University, Mingxun Zhou from The Hong Kong University of Science and Technology, and Haoyu Niu et al. present Bifrost, a novel protocol designed to simplify this process and overcome limitations of existing methods. Current approaches, such as those utilising Circuit-based Private Set Intersection, often introduce redundant data which hinders efficient downstream analysis, while alternatives incur substantial communication overhead. Bifrost distinguishes itself by delivering a redundancy-free joined table using only two conceptually simple building blocks, significantly reducing computational cost and communication requirements. Evaluations using datasets up to 100 GB demonstrate that Bifrost achieves considerable speedup and communication reduction compared to state-of-the-art protocols, representing a substantial advance towards practical and scalable secure data analytics.
Reducing redundancy and communication costs in vertically partitioned data joins requires careful data distribution and query planning
Researchers have developed Bifrost, a new protocol for secure two-party data join that significantly improves upon existing methods for secure data analytics. Joins built on Circuit-based Private Set Intersection (CPSI) leave redundant dummy rows in the joined table, which hinders efficient downstream analysis. While iPrivJoin attempted to resolve this issue, it incurred substantial communication costs due to its reliance on complex cryptographic primitives and multiple rounds of data shuffling.
Bifrost offers a much simpler approach, outputting a redundancy-free joined table by leveraging two fundamental building blocks: an ECDH-PSI protocol (private set intersection based on elliptic-curve Diffie-Hellman) and a two-party oblivious shuffle protocol. This lightweight design eliminates the need for computationally expensive primitives such as Oblivious Programmable Pseudorandom Functions.
Furthermore, an optimization termed “dual mapping” reduces the number of required oblivious shuffle rounds from two to one, streamlining the process. Experiments conducted on datasets up to 100 GB demonstrate that Bifrost achieves a speed-up of 2.54× to 22.32× and reduces communication costs by 84.15% to 88.97% compared to the state-of-the-art iPrivJoin protocol.
Notably, the communication overhead of Bifrost is comparable to the size of the original input data, making it highly efficient. Evaluation of a complete secure data analytics pipeline, combining the join operation with subsequent data analysis, reveals that Bifrost’s redundancy-free property prevents error rate increases in downstream tasks caused by dummy data.
This also results in a speed-up of up to 2.80× in the secure analytics process, alongside a communication reduction of up to 73.15%. The protocol begins with an ECDH-PSI to privately discover the intersection of identifiers between two input tables, Ta and Tb, held by parties Pa and Pb, respectively.
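The double-blinding idea behind a Diffie-Hellman-style PSI can be illustrated with a minimal sketch. It is not the paper's implementation: it swaps the elliptic-curve group for modular exponentiation over a toy prime (not a secure choice), runs both parties in one process, and uses hypothetical names (`hash_to_group`, `random_key`, `dh_style_psi`) introduced only for illustration.

```python
import hashlib
import math
import secrets

# ASSUMPTION: toy group, integers modulo the Mersenne prime 2**127 - 1.
# A real ECDH-PSI uses an elliptic-curve group; this modulus is NOT secure.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to a group element in [2, P - 1]."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 2) + 2


def random_key() -> int:
    """Pick a secret exponent coprime to P - 1, so blinding is a permutation."""
    while True:
        key = secrets.randbelow(P - 3) + 2
        if math.gcd(key, P - 1) == 1:
            return key


def blind(values, key):
    """Raise every element to the secret exponent."""
    return [pow(v, key, P) for v in values]


def dh_style_psi(ids_a, ids_b):
    """Return the identifiers of Pa that also appear in Pb's list."""
    key_a, key_b = random_key(), random_key()
    # Round 1: each party blinds its own hashed identifiers and sends them.
    a_once = blind([hash_to_group(x) for x in ids_a], key_a)
    b_once = blind([hash_to_group(x) for x in ids_b], key_b)
    # Round 2: each party blinds the other's values with its own key.
    a_twice = blind(a_once, key_b)       # done by Pb, order preserved
    b_twice = set(blind(b_once, key_a))  # done by Pa
    # H(x)^(a*b) == H(x)^(b*a), so equal identifiers collide after both blinds.
    return [x for x, t in zip(ids_a, a_twice) if t in b_twice]


print(dh_style_psi(["u1", "u2", "u3"], ["u3", "u4", "u1"]))
```

In this sketch Pa ends up holding the matching identifiers in the clear. How Bifrost consumes the PSI output, and what each party is allowed to learn from it, is defined in the paper and is not modelled here.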
This intersection forms the basis for the joined table, ensuring only matched identifiers are included, thereby removing redundancy. Following the ECDH-PSI, a two-party oblivious shuffle protocol is implemented to combine the feature columns associated with the matched identifiers, creating secret shares of the joined table distributed between Pa and Pb.
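To make "secret shares of a shuffled column" concrete, the following sketch uses a common textbook permute-and-share construction with correlated randomness from a trusted dealer. This is an assumption for illustration only: the paper's oblivious shuffle may be built differently, the dealer and all function names are hypothetical, and both parties are simulated in one process.

```python
import secrets

MOD = 2**64  # additive shares live in the ring of integers modulo 2**64


def apply_perm(perm, values):
    """Return the list whose i-th entry is values[perm[i]]."""
    return [values[j] for j in perm]


def dealer_setup(perm):
    """Hypothetical trusted dealer: hands delta to Pa and masks (a, b) to Pb."""
    n = len(perm)
    a = [secrets.randbelow(MOD) for _ in range(n)]
    b = [secrets.randbelow(MOD) for _ in range(n)]
    # delta = perm(a) - b, which lets Pa cancel the mask after permuting.
    delta = [(pa - bi) % MOD for pa, bi in zip(apply_perm(perm, a), b)]
    return delta, (a, b)


def oblivious_shuffle(perm, column):
    """Pa holds perm, Pb holds column; output additive shares of perm(column)."""
    delta, (a, b) = dealer_setup(perm)
    # Pb -> Pa: the column masked by a, which looks uniformly random to Pa.
    masked = [(x - ai) % MOD for x, ai in zip(column, a)]
    # Pa permutes the masked column and adds delta: perm(x) - b.
    share_a = [(m + d) % MOD for m, d in zip(apply_perm(perm, masked), delta)]
    share_b = b  # Pb keeps b as its share and never sees perm
    return share_a, share_b


def reconstruct(share_a, share_b):
    """Add the two shares to recover the shuffled column (for checking only)."""
    return [(sa + sb) % MOD for sa, sb in zip(share_a, share_b)]


share_a, share_b = oblivious_shuffle([2, 0, 1], [10, 20, 30])
print(reconstruct(share_a, share_b))  # [30, 10, 20]
```

Neither share alone reveals the feature values, which is the property the joined table needs before it is passed to a secure analytics step. In a deployment the shares would never be recombined in the clear as `reconstruct` does here.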
A key optimization, termed “dual mapping”, further reduces the computational load by decreasing the number of required oblivious shuffle rounds from two to one, enhancing efficiency. Extensive experimentation was conducted using datasets up to 100 GB to validate Bifrost’s performance. Under a Wide Area Network (WAN) setting, iPrivJoin required 53.87 hours and 773.95 GB of communication, while Bifrost demonstrated substantial improvements.
Specifically, Bifrost achieved speed-ups of up to 15.2× and 10.36× compared to iPrivJoin and CPSI, respectively. Compared to the redundancy-free baseline iPrivJoin, Bifrost exhibited running time improvements ranging from 2.54× to 22.32×, alongside communication size reductions of 84.15% to 88.97%. Notably, the communication overhead of Bifrost closely matches the size of the input data, signifying a highly efficient data transfer process.
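Percentage reductions and multiplicative factors are easy to confuse, so a short conversion may help when comparing the two kinds of figures. The helper name below is hypothetical; the inputs are the reported communication reductions, and the outputs are plain arithmetic on them rather than additional measurements.

```python
def reduction_to_factor(percent_reduction: float) -> float:
    """Convert 'reduced by p percent' into 'that many times smaller'."""
    remaining = 1.0 - percent_reduction / 100.0  # fraction still sent
    return 1.0 / remaining


for pct in (84.15, 88.97):
    factor = reduction_to_factor(pct)
    print(f"{pct}% less communication is about {factor:.2f}x smaller")
```

Under this conversion, a reduction of 84.15% to 88.97% corresponds to sending roughly 6.3 to 9.1 times fewer bytes than iPrivJoin.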
Bifrost significantly improves on existing methods in running time and communication efficiency
Bifrost, a novel secure data join protocol, achieves speed-ups ranging from 2.54× to 22.32× compared to the state-of-the-art redundancy-free protocol iPrivJoin. Communication overhead is also substantially reduced, with decreases of 84.15% to 88.97% observed in experiments. The protocol outputs a redundancy-free joined table, eliminating the need for costly redundancy removal processes required by previous methods like iPrivJoin.
Communication size with Bifrost closely matches the size of the input data, demonstrating efficient data handling. Experiments conducted on datasets up to 100 GB demonstrate the protocol’s scalability and performance. The lightweight design of Bifrost avoids reliance on Oblivious Programmable Pseudorandom Functions, simplifying the protocol and improving efficiency.
A dual mapping optimization further reduces the number of oblivious shuffle rounds needed from two to one, streamlining the computation. This optimization contributes to the observed speed-ups and communication reductions. In a two-step secure data analytics pipeline, incorporating both the join operation and subsequent data analytics, Bifrost avoids error rate increases caused by dummy rows introduced in other protocols.
The redundancy-free joined table enables up to a 2.80× speed-up in the secure analytics process. Furthermore, the secure data analytics (SDA) process benefits from up to a 73.15% reduction in communication costs due to the elimination of redundant data. These results highlight the significant benefits of a redundancy-free approach for secure data analytics.
Bifrost protocol delivers substantial gains in speed and communication efficiency for secure data joining across disparate systems
Researchers have developed Bifrost, a new protocol for secure two-party data join that delivers a redundancy-free joined table. Experimental results, utilising datasets of up to 100 GB, demonstrate that Bifrost is significantly faster and requires less communication than the current state-of-the-art redundancy-free protocol, iPrivJoin.
Specifically, Bifrost achieved speed-ups ranging from 4.41× to 12.96× and reduced communication size by 64.32% to 80.07%. Furthermore, evaluation within a secure data analytics pipeline showed that the redundancy-free joined table avoids error rate increases and enables faster analytics with up to 73.15% communication reduction compared to methods that introduce redundant data. The authors acknowledge that their protocol is designed for a two-party setting and future work could explore extensions to multi-party computation.
👉 More information
🗞 Bifrost: A Much Simpler Secure Two-Party Data Join Protocol for Secure Data Analytics
🧠 ArXiv: https://arxiv.org/abs/2602.01225
