Siamese Networks for Bounding-Box to Silhouette Annotation of Video Databases
Thadeu Luiz B Dias, Luiz Tavares, Rafael Padilla, Allan F Silva, Lucas A Thomaz, Sergio Lima Netto, Eduardo A. B. da Silva

DOI: 10.14209/SBRT.2020.1570645009
Event: XXXVIII Simpósio Brasileiro de Telecomunicações e Processamento de Sinais (SBrT2020)
Keywords: siamese networks, anomaly detection, video annotation, deep learning
Abstract
Pixel-level ground-truth masks for object detection databases are extremely useful in machine learning, especially for convolutional neural network applications. However, manually labeling such data demands considerable time and effort, particularly in videos, where every frame must be labeled. As a result, most databases contain only bounding-box annotations, which are much faster to produce. In this work we propose a semi-automated approach that transforms bounding-box annotations into silhouette annotations with reduced processing time. We compute the features of a siamese network in the region inside a bounding box and obtain the probability that each pixel belongs to the foreground, which is then refined by a post-processing step. We apply our methodology to the VDAO dataset, creating a new annotation that contains the silhouettes of the objects. We estimate that our method reduces annotation time by 90% on average, while providing accurate silhouettes for the objects.
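The pipeline summarized above (features inside a bounding box, a per-pixel foreground probability, then a refinement step) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the network, the probability mapping, or the post-processing. The sketch assumes two precomputed feature maps from the shared-weight branches (here named `feat_ref` and `feat_tgt`), a hypothetical `1 - exp(-distance)` mapping to probabilities, and a 3x3 majority vote as a stand-in for the refinement step.

```python
import numpy as np


def foreground_probability(feat_ref, feat_tgt, bbox):
    """Per-pixel foreground probability, restricted to a bounding box.

    feat_ref, feat_tgt: (H, W, C) feature maps, assumed to come from the two
    branches of a siamese network (hypothetical inputs for this sketch).
    bbox: (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    """
    x0, y0, x1, y1 = bbox
    # Euclidean distance between the two feature vectors at each pixel.
    dist = np.linalg.norm(feat_tgt - feat_ref, axis=-1)
    prob = np.zeros(dist.shape, dtype=float)
    # Assumed mapping: larger feature distance means higher probability.
    # Pixels outside the bounding box stay at probability 0.
    prob[y0:y1, x0:x1] = 1.0 - np.exp(-dist[y0:y1, x0:x1])
    return prob


def refine(prob, threshold=0.5):
    """Threshold the probabilities, then apply a 3x3 majority vote.

    The majority vote is only a placeholder for the post-processing step.
    """
    mask = prob >= threshold
    padded = np.pad(mask, 1).astype(int)  # zero-pad the border
    h, w = mask.shape
    # Count foreground pixels in each 3x3 neighborhood (center included).
    votes = sum(
        padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)
    )
    return votes >= 5


if __name__ == "__main__":
    ref = np.zeros((8, 8, 4))
    tgt = np.zeros((8, 8, 4))
    tgt[2:6, 2:6] = 3.0  # synthetic "object" inside the box
    silhouette = refine(foreground_probability(ref, tgt, (1, 1, 7, 7)))
    print(silhouette.astype(int))
```

In this toy example the majority vote trims the four corner pixels of the synthetic square, which shows how a refinement step can change the raw thresholded mask; a real pipeline would replace both the mapping and the filter with whatever the full paper specifies.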
