Our paper “A Data-Driven Approach for Tag Refinement and Localization in Web Videos”, by myself, Marco Bertini, Giuseppe Serra, Alberto Del Bimbo, has been accepted for publication in Computer Vision and Image Understanding (CVIU) and is now available online.
Alberto Del Bimbo has been also invited to present our work at the Workshop on Large-Scale Video Search and Mining at CVPR 2015.
Estimating the relevance of a specific tag with respect to the visual content of a given image and video has become the key problem in order to have reliable and objective tags. With video tag localization is also required to index and access video content properly. In this paper, we present a data-driven approach for automatic video annotation by expanding the original tags through images retrieved from photo-sharing website, like Flickr, and search engines such as Google or Bing. Compared to previous approaches that require training classifiers for each tag, our approach has few parameters and permits open vocabulary.
Watch the TED 2015 talk by my postdoc advisor Prof. Fei-Fei Li about the recent advances in computer vision, from the detection and classification of objects in images to algorithms that are able to construct natural descriptions of those images. It is an exciting overview of the current state of the art in computer vision, in which she shares her thoughts on its potential use and impact. http://goo.gl/8O5Fch
I am finally settled at Stanford University and just started my appointment as postdoctoral scholar in the AI laboratory (SAIL) on a Marie Curie Fellowship from the European Commission.
I started working in Fei-Fei Li‘s Vision Lab. I am also collaborating with Silvio Savarese and Bernd Girod.
Lorenzo Seidenari and I gave the tutorial “Hands on Advanced Bag-of-Words Models for Visual Recognition” at the ICPR 2014 conference (August 24, Stockholm, Sweden).
All materials – i.e. slides, Matlab code, images and features – and more details can still be found on this webpage.
I have been awarded with a Marie Curie International Outgoing Fellowship (IOF) granted by the European Commission. The Marie Curie IOF is a prestigious and highly competitive fellowship for experienced European scientists to gain new skills and expertise while conducting high-level research in a country outside Europe.
I have been awarded a grant of 272K Euro for the 3-years project “EAGLE: Exploiting semAntic and social knowledGe for visuaL rEcognition”. I will spend the first two years (outgoing phase) at Stanford University.
Our ICMR 2014 full paper “A Cross-media Model for Automatic Image Annotation” by Lamberto Ballan, Tiberio Uricchio, Lorenzo Seidenari and Alberto Del Bimbo has been accepted for oral presentation and it is now available online.
Automatic image annotation is still an important open problem in multimedia and computer vision. The success of media sharing websites has led to the availability of large collections of images tagged with human-provided labels. Many approaches previously proposed in the literature do not accurately capture the intricate dependencies between image content and annotations. We propose a learning procedure based on KCCA which finds a mapping between visual and textual words by projecting them into a latent meaning space. The learned mapping is then used to annotate new images using advanced nearest-neighbor voting methods.
University of Florence
Course on Multimedia Databases – 2013/14 (Prof. A. Del Bimbo)
Instructors: Lamberto Ballan and Lorenzo Seidenari
Goal
The goal of this laboratory is to get basic practical experience with image classification. We will implement a system based on bag-of-visual-words image representation and will apply it to the classification of four image classes: airplanes, cars, faces, and motorbikes.
We will follow the three steps:
- Load pre-computed image features, construct visual dictionary, quantize features
- Represent images by histograms of quantized features
- Classify images with Nearest Neighbor / SVM classifiers
Getting started
- Download excercises-description.pdf
- Download lab-bow.zip (type the password given in class to uncompress the file) including the Matlab code
- Download 4_ObjectCategories.zip including images and precomputed SIFT features; uncompress this file in lab-bow/img
- Download 15_ObjectCategories.zip including images and precomputed SIFT features; uncompress this file in lab-bow/img
- Start Matlab in the directory lab-bow/matlab and run exercises.m
MICC laboratories, Florence, 31th October 2013 (10.15-13.15). Course on Multimedia Databases (DBMM) – laboratory lecture.
- Goal: logo recognition in web images.
- Dataset/testset: find 4 different logos vs 110 images.
- Evaluation metrics: recognition performances will be evaluated in terms of mean Average Precision (mAP).
Instructors: Lamberto Ballan, Lorenzo Seidenari.
Download Software & Dataset (* based on VLFeat library by A. Vedaldi)
Final results (ranking): http://goo.gl/o5DCG5
Lorenzo Seidenari and I will give a tutorial named “Hands on Advanced Bag-of-Words Models for Visual Recognition” at the forthcoming ICIAP 2013 conference (September 9, Naples, Italy). All materials (slides, Matlab code, etc.) and more details can be found on this webpage.
Last friday I visited Fei-Fei Li’s Vision Lab at Stanford University and I had the pleasure of giving a very informal talk on our ongoing works on social media annotation. The slides of the talk are available online.