A Cuckoo Filter Modification Inspired by Bloom Filter

Document Type : Review Article

Authors

1 گروه مهندسی برق و کامپیوتر دانشگاه تبریز

2 Tabriz

3 Faculty of Electrical and Computer Engineering, University of Tabriz

Abstract

Probabilistic data structures are so popular in membership queries, network applications, and so on. Bloom Filter and Cuckoo Filter are two popular space efficient models that incorporate in set membership checking part of many important protocols. They are compact representation of data that use hash functions to randomize a set of items. Being able to store more elements while keeping a reasonable false positive probability is a key factor of design. A new algorithm is proposed to improve some of the performance properties of Cuckoo Filter such as false positive rate and insertion performance and solve some drawbacks of the Cuckoo algorithm such as endless loop. Main characteristic of the Bloom Filter is used to improve Cuckoo Filter, so we have a smart Cuckoo Filter which is modified by Bloom Filter (SCFMBF). SCFMBF uses the same table of buckets as Cuckoo Filter but instead of storing constant Fingerprints, It stores Bloom Filters. Bloom Filters can be accumulated in the table’s buckets which leads to higher insertion feasibility. We also address the endless loop problem of Cuckoo Filter that means an inserted item is stuck in an iterative process of finding an empty bucket, so a smart algorithm is designed which not only solves endless loop problems but also prevents insertion failure. our algorithm prevents double checking of a bucket and avoids making loops. Consequently the capacity of SCFMBF is improved significantly. Results of comparison with Cuckoo Filter shows that false positive probability of SCFMBF method is four times enhanced.

Keywords

Main Subjects


[1]     P. Brass, “Advanced Data Structures”, Cambridge University Press , pp. 402-405, 2008.
[2]     B. H. Bloom, “Space/time trade-offs in hash coding with allowable errors”, Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970.
[3]     J. K. Mullin and D. J. Margoliash, “A tale of three spelling checkers”, Software, Practice and Experience, pp. 625-630, 1990.
[4]     S. Czerwinski, B. Y. Zhao, T. Hodes, A. D. Joseph and R. Katz. ,“An Architecture for a Secure Service Discovery Service”, Proceedings of the Fifth Annual ACM/IEEE International Conference on Mobile Computing and Networking, ACM Press, pp. 24-35, 1999
[5]     J. Wang, “A survey of web caching schemes for the internet”, ACM SIG-COMM Computer Communication Review, 1999.
[6]     R. P. Laufer, P. B. Velloso, D. d. O. Cunha, I. M. Moraes, M. D. D. Bicudo, M. D. D. Moreira and O. C. M. B. Duarte, “Towards stateless single-packet IP traceback”, Proceedings of the 32nd IEEE Conference on Local Computer Networks, pp. 548–555, 2007.
[7]     L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Summary cache: a scalable wide-area web cache sharing protocol”, IEEE/ACM Transactions on Networking, vol. 28, no. 4, pp. 254–265, 1998.
[8]      M. Xiao, Y. dai and X. Luo, “The dynamic Bloom Filters”, IEEE Trasactions on Knowledge and Data Engineering, vol. 22, no. 1, pp. 120-133, 2010.
[9]     D. Guo, J. Wu, H. Chen, Y. Yuan and X. Luo, “The dynamic Bloom filters,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 1, pp. 120–133, 2010.
[10]  S. Geravand and M. Ahmadi, “A novel adjustable matrix Bloom filter-based copy detection system for digital libraries,” Proceedings of IEEE International Conference on Computer and Information Technology, 2011.
[11]  M. Mitzenmacher, “Compressed Bloom filters”, Proceedings of the twentieth annual ACM symposium on Principles of distributed computing, pp. 144–150, 2001.
[12]   F. Bonomi, M. Mitzenmacher, R. Panigrah, S. Singh and G. Varghese, “Bloom filters via d-left hashing and dynamic bit reassignment,” 44th Allerton Conference, 2006.
[13]  K. Shanmugasundaram, H. Bronnimann, and N. Memon, “Payload attribution via hierarchical Bloom Filter”, Proceedings of the 11th ACM conference on Computer and communicationsecurity (CCS 04), pp..31-41, 2004.
[14]  S. Cohen and Y. Matias, “Spectral Bloom filters,” in SIGMOD ’03: Proceedings of the 2003 ACM SIGMOD international conference on Management of data. New York, NY, USA: ACM, pp. 241–252, 2003.
[15]  Y. Matsumoto, H. Hazeyama, and Y. Kadobayashi, “Adaptive Bloom filter: A space-efficient counting algorithm for unpredictable network traffic,” IEICE Trans. Inf. Syst., vol. E91-D, no. 5, pp. 1292–1299, 2008.
[16]  P. S. Almeida, C. Baquero, N. Preguic¸a, and D. Hutchison, “Scalable Bloom filters,” Inf. Process. Lett., vol. 101, no. 6, pp. 255–261, 2007.
[17]  J. Bruck, J. Gao, and A. Jiang, “Weighted Bloom filter,” in 2006 IEEE International Symposium on Information Theory (ISIT’06), 2006.
[18]  M. A. Bender, M. F. Colton, R. Johnson, B. C. Kuszmaul, D. Medjedovic, P. Montes, P. Shetty, R. P. Spillanne and E. Zadoc, “Don’t Thrash: How to Chache you Hash on Flash”, Proceedings of the 3rd USENIX conference on Hot topics in storage and file systems (HOTStorage11), 2012.
[19]  B. Fan, D. Andersen, M. Kaminsky, M. Mitzenmacher, “Cuckoo filter: practically better than Bloom,” Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies (CoNEXT 14), pp. 75-88, 2014.
[20]  J. Grashofer, F. Jacob and H. Hartenstein, “Towards Application of Cuckoo Filters in Network Security Monitoring”, 14th international conference on network and service management (CNSM), 2018.
[21]  M. Mitzenmacher, S. Pontarelli, P. Reviriego, “Adaptive Cuckoo filters,” arXiv preprint, 2017.
[22]  Y. Sun, Y. Hua, S. Jiang, Q. Li, S. Cao and P. Zue, “SmartCuckoo: A Fast and Cost-Efficient Hashing Index Scheme for Cloud Storage Systems”, USENIX Annual Technical Conference, pp. 553-565, 2017.
[23]  A. Kirsch, A. Mitzenmacher, and U. Wieder, “More Robust Hashing: Cuckoo Hashing with a Stash”, SIAM Journal on Computing, vol. 39, no. 4, pp. 1543-1561, 2009.
[24]  U. Erlingsson, F. Mcsherry and M. Manasse, “A cool and practical alternative to traditional hash tables”, Proceedings of the Seventh Workshop on Distributed Data and Structures (WDAS), 2006.
[25]  B. Fan, D.G. Andersen, and M. Kaminsky, “MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing”, Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation, 2013.
[26]  A. Singh, S. Garg, K. Kaur and K. KwangRaymond Cool,: “Fuzzy-folded Bloom Filter-as-a-Service for Big Data Storage in the Cloud”, IEEE Transactions on Industrial Informatics, pp. 1-10, 2018.
[27]  K. Sasaki, S. Sugiura, S. Makido and N. Suzuki, “Bloom-Filter Aided Two-Layered Structured Overlay for Highly-Dynamic Wireless Distributed Storage” IEEE Communications Letters, vol. 17, no. 4, pp. 629-632, 2013.
[28]  C. E. Rothenberg, C. A. B. Macapuna, F. L. Verdi and M. F. Magalhaes, ” The Deletable Bloom Filter: A New Member of the Bloom Family”, IEEE Communications Letters, vol. 14, no. 6, pp. 557-559, 2010.
[29]  P. Reviriego, S. Pontarelli, J. A. Maestro and M. Ottavi: “A Synergetic Use of Bloom Filters for Error Detection and Correction”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 3, pp. 584-587, 2015.
[30]  H. Lim, J. Lee, H. Byun and C. Yim,  “Ternary Bloom Filter Replacing Counting Bloom Filter” Communications Letters, vol. 21, no. 2, pp. 278-281, 2016.
[31]  D. Ficara, A.D. Pietro, S. Giordano,G. Procissi and F. Vitucci, “Enhancing Counting Bloom Filters Through Huffman-Coded Multilayer” IEEE/ACM Transactions on Networking, vol. 18, no. 6, pp. 1977-1987, 2010.
[32]  A. Kumar, J. Xu and J. Wang, “Space-Code Bloom Filter for Efficient Per-Flow Traffic Measurement”, IEEE Journal on Selected Areas in Communications, vol. 24, no. 12, pp. 2327-2339, 2006.
[33]  F. Hao, M. Kodialam, T. V. Lakshman and H. Song, “Fast Dynamic Multiple-Set Membership Testing Using Combinatorial Bloom Filters”, IEEE/ACM Transactions on Networking, vol. 20, no.1, pp. 295-304,  2012.
[34]  Duan, H., Yu, S., Mei, M., Zhan,W., Li, L.: Cstore: a desktop-oriented distributed public cloud storage system. Computers & Electrical Engineering 42, 60–73,2015.
[35]  Geravand, S., Ahmadi, M.: A novel adjustable matrix bloom filter-based copy detection system for digital libraries. In: Computer and Information Technology (CIT), 2011 IEEE 11th International Conference on. pp. 518–525. IEEE ,2011.
[36]  S. Xiong, F. Wang and Q. Cao, “A bloom filter based scalable data integrity check tool for large-scale dataset”, First Joint International Workshop on Parallel Data Storage and data Intensive Scalable Computing Systems (PDSW-DISCS), pp. 55–60, 2016.
[37]  Y. Qiao, T. Li and S. Chen, “Fast bloom filters and their generalization”, IEEE Transactions on Parallel and Distributed Systems, vol. 25, pp. 93–103, 2013.
[38]  P. Reviriego, K. Christensen and J. A. Maestro, “A comment on “fast Bloom Filters and their generalization””, IEEE Transactions on Parallel and Distributed Systems, vol. 27, pp. 303–304, 2015.
[39]  A. Crainiceanu and D. Lemire, “Bloofi: Multidimensional Bloom Filters”, Information Systems, vol. 54, pp. 311–324, 2015.
[40]  A. Singh, S. Garg, S. Batra, N. Kuma and J.J. Rodrigues, “Bloom Filter based optimization scheme for massive data handling in IoT environment”, Future Generation Computer Systems, vol. 82, pp. 440–449, 2018.
[41]  F. Grandi, “On the analysis of Bloom Filters”, Information Processing Letters, vol. 129, pp. 35–39, 2018