Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Document Type: Research Article


1 IT Research Faculty, Iran Telecommunication Research Center

2 Electronics Research Institute, Sharif University of Technology

3 Electrical Engineering Department, Sharif University of Technology


Analyzing motion patterns in traffic videos yields high-level descriptions of the video content, which can in turn support traffic applications such as traffic phase detection and abnormal event detection. Topic models are among the most recent and successful unsupervised methods for complex traffic scene analysis. In this paper, a two-level Sparse Topical Coding (STC) topic model is proposed for analyzing traffic surveillance video sequences that contain hierarchical patterns with complicated motions and co-occurrences. The first-level STC model automatically clusters optical flow features into motion patterns. The second-level STC model then clusters these motion patterns into traffic phases. Experiments on a real-world traffic dataset demonstrate the effectiveness of the proposed method compared with conventional one-level topic-model-based methods. The results show that the two-level STC discovers not only the lower-level activities but also the higher-level traffic phases, which yields a more appropriate interpretation of traffic scenes. Furthermore, the two-level structure makes it possible to detect either activity anomalies or traffic phase anomalies, which a one-level structure cannot do.
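The two-level structure described above can be illustrated with a minimal sketch. It does not implement STC itself: as a stand-in, it uses a small sparse nonnegative matrix factorization (squared-error loss, multiplicative updates, an L1 penalty on the codes), applied once to clip-by-flow-word counts and once more to the resulting clip-by-motion-pattern codes. The function names `sparse_nmf` and `toy_clips`, the synthetic data, and all parameter values are assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def sparse_nmf(X, n_topics, l1=0.1, n_iter=100, seed=0):
    """Approximate X (docs x words) by S @ B with S, B >= 0.

    Stand-in for STC, not STC itself: squared-error loss, multiplicative
    updates, an L1 penalty on the codes S, and each topic (row of B)
    rescaled to sum to 1.
    """
    rng = np.random.default_rng(seed)
    eps = 1e-9
    S = rng.random((X.shape[0], n_topics)) + 0.1   # per-document codes
    B = rng.random((n_topics, X.shape[1])) + 0.1   # topic dictionary
    for _ in range(n_iter):
        S *= (X @ B.T) / (S @ B @ B.T + l1 + eps)  # l1 shrinks the codes
        B *= (S.T @ X) / (S.T @ S @ B + eps)
        scale = B.sum(axis=1) + eps
        B /= scale[:, None]                        # each topic sums to 1
        S *= scale[None, :]                        # keep S @ B unchanged
    return S, B


def toy_clips(n_clips=40, seed=0):
    """Hypothetical data: 12 flow words, 4 motion patterns, 2 phases."""
    rng = np.random.default_rng(seed)
    # Each motion pattern activates its own group of 3 flow words.
    patterns = np.kron(np.eye(4), np.ones((1, 3)))           # 4 x 12
    # Each traffic phase is a co-occurrence of 2 motion patterns.
    phases = np.array([[1.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 1.0]])                # 2 x 4
    phase_of_clip = np.arange(n_clips) % 2
    rates = 5.0 * phases[phase_of_clip] @ patterns           # n_clips x 12
    return rng.poisson(rates).astype(float), phase_of_clip


X, true_phase = toy_clips()

# Level 1: flow-word counts per clip -> motion patterns.
codes1, motion_patterns = sparse_nmf(X, n_topics=4)

# Level 2: motion-pattern codes per clip -> traffic phases.
codes2, traffic_phases = sparse_nmf(codes1, n_topics=2)

# Assign each clip to its dominant traffic phase.
phase_label = codes2.argmax(axis=1)
print(phase_label)
```

In this sketch the level-1 codes play the role of "documents" for level 2, which is the only structural point it is meant to convey; the loss, the sparsity penalty, and the optimization differ from those of the STC model.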

