Posts tagged: computer vision

ICIAP 2013 Tutorial: Hands on Advanced Bag-of-Words Models for Visual Recognition

September 7, 2013

Lorenzo Seidenari and I will give a tutorial titled “Hands on Advanced Bag-of-Words Models for Visual Recognition” at the forthcoming ICIAP 2013 conference (September 9, Naples, Italy). All materials (slides, Matlab code, etc.) and more details can be found on this webpage.

MICC Reading Group on Multimedia and Vision

May 13, 2013

Andy Bagdanov and I are organizing a paper reading group on Multimedia and Vision at MICC, University of Florence.

We plan to meet approximately once every three weeks, usually from 12.00 to 13.30. The schedule of our meetings and the materials are available on this page.

Generative & discriminative models for classifying social images on the MICC-Flickr101 dataset

June 17, 2012

Our paper “Combining Generative and Discriminative Models for Classifying Social Images from 101 Object Categories” has been accepted at ICPR’12. We use a hybrid generative-discriminative approach (LDA + SVM with non-linear kernels) over several visual descriptors (SIFT, GIST, colorSIFT).

A major contribution of our work is the introduction of a novel dataset, called MICC-Flickr101, based on the popular Caltech 101 and collected from Flickr. We demonstrate the effectiveness and efficiency of our method by testing it on both datasets, and we evaluate the impact of combining image features and tags for object recognition.

ECCV Workshop on Web-scale Vision and Social Media

May 3, 2012

I am a co-organizer (with Marco Bertini, Alex Berg and Cees Snoek) of the International Workshop on Web-scale Vision and Social Media, held in conjunction with ECCV 2012.

The World Wide Web has become a large ecosystem that reaches billions of users through information processing and sharing, and most of this information resides in pixels. Web-based services like YouTube and Flickr, and social networks such as Facebook, have become more and more popular, allowing people to easily upload, share and annotate massive amounts of images and videos all over the web.

Vision and social media has thus recently become a very active interdisciplinary research area, involving computer vision, multimedia, machine learning, information retrieval, and data mining. This workshop aims to bring together leading researchers in the related fields to advocate and promote new research directions for problems involving vision and social media, such as large-scale visual content analysis, search and mining.

Effective Codebooks for Action Recognition in Unconstrained Videos

March 12, 2012

Our paper entitled “Effective Codebooks for Human Action Representation and Classification in Unconstrained Videos” by L. Ballan, M. Bertini, A. Del Bimbo, L. Seidenari and G. Serra has been accepted for publication in the IEEE Transactions on Multimedia.

Recognition and classification of human actions for annotation of unconstrained video sequences has proven to be challenging because of variations in the environment, the appearance of the actors, the ways in which different persons perform the same action, speed and duration, and the points of view from which the event is observed. This variability is reflected in the difficulty of defining effective descriptors and deriving appropriate and effective codebooks for action categorization.

In this paper we propose a novel and effective solution to classify human actions in unconstrained videos. It improves on previous contributions through the definition of a novel local descriptor that uses image gradient and optic flow to model, respectively, the appearance and motion of human actions at interest point regions. In the formation of the codebook we employ radius-based clustering with soft assignment in order to create a rich vocabulary that may account for the high variability of human actions. We show that our solution achieves very good performance without parameter tuning. We also show that a strong reduction of computation time can be obtained, with little loss of accuracy, by reducing the codebook size with Deep Belief Networks.
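The soft-assignment step can be illustrated with a minimal sketch. This is not the code from the paper: the function name `soft_assign_histogram`, the Gaussian weighting, and the parameters `k` and `sigma` are assumptions chosen for illustration. Each local feature spreads its vote over its `k` nearest codewords instead of voting only for the closest one.

```python
import math

def soft_assign_histogram(features, codebook, k=2, sigma=1.0):
    """Build a bag-of-words histogram with soft assignment.

    Each feature votes for its k nearest codewords, with Gaussian
    weights normalized to sum to 1 (an illustrative choice, not
    necessarily the weighting used in the paper).
    """
    hist = [0.0] * len(codebook)
    for f in features:
        # Distances from this feature to every codeword, nearest first.
        nearest = sorted((math.dist(f, c), idx) for idx, c in enumerate(codebook))[:k]
        weights = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d, _ in nearest]
        total = sum(weights)
        if total == 0.0:
            # All weights underflowed: fall back to a hard vote for the nearest codeword.
            hist[nearest[0][1]] += 1.0
            continue
        for w, (_, idx) in zip(weights, nearest):
            hist[idx] += w / total
    n = len(features)
    # Normalize so the histogram sums to 1 regardless of the number of features.
    return [h / n for h in hist] if n else hist

# Toy 2D example: three codewords, two features.
codebook = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
features = [(1.0, 0.0), (0.0, 0.0)]
hist = soft_assign_histogram(features, codebook, k=2, sigma=1.0)
```

In this toy example the feature at `(1.0, 0.0)` is equidistant from the first two codewords and splits its vote evenly between them, which a hard assignment could not express.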

Our method has obtained very competitive performance on several popular action-recognition datasets such as KTH (accuracy = 92.7%), Weizmann (accuracy = 95.4%) and Hollywood-2 (mAP = 0.451).

ECCV 2012 in Florence, Italy

October 10, 2011

I am involved in the local committee of ECCV 2012. A year from now, we will host the 12th European Conference on Computer Vision in Florence. ECCV has an established tradition of high scientific quality, with double-blind reviewing and very low acceptance rates (about 5% for orals and 25% for posters in 2010). The conference lasts one week overall. The main conference runs for four days, starting on the second day, in a single-track format with about ten oral presentations and one poster session per day. Tutorials are held on the first day and workshops on the last two days. Industrial exhibits and demo sessions are also scheduled in the conference programme.

ECCV 2012 will be held in Florence, Italy, on October 7-13, 2012. Visit the ECCV 2012 site.

International Workshop on Computer Vision Methods in Blind Image Forensics (in conjunction with ICCV 2011)

March 26, 2011

I am involved in the technical program committee of the 1st International Workshop on Computer Vision Methods in Blind Image Forensics (CVBIF), in conjunction with ICCV 2011.

The verification of original images, as well as the detection of manipulations in digital images and multimedia content, has become an increasingly important topic. The purpose of this workshop is to bring together leading experts from the image forensics and computer vision communities. Its goal is to foster new vision-based approaches to image forensics problems and thus promote the advancement of vision-based solutions in forensics applications. Download a PDF version of the call for papers here!

A SIFT-based forensic method for copy-move attack detection and transformation recovery

March 10, 2011

The paper “A SIFT-based forensic method for copy-move attack detection and transformation recovery” by I. Amerini, L. Ballan, R. Caldelli, A. Del Bimbo, and G. Serra has been accepted for publication in the IEEE Transactions on Information Forensics and Security.

One of the principal problems in image forensics is determining whether a particular image is authentic. This can be a crucial task when images are used as evidence to influence a judgment, for example in a court of law. To carry out such forensic analysis, various technological instruments have been developed in the literature.

In this paper we investigate the problem of detecting whether an image has been forged. In particular, we focus on the case in which an area of an image is copied and then pasted onto another zone, either to create a duplication or to conceal something unwanted. Generally, a geometric transformation is needed to adapt the image patch to the new context. To detect such modifications, we propose a novel methodology based on the Scale Invariant Feature Transform (SIFT). The method makes it possible both to determine whether a copy-move attack has occurred and to recover the geometric transformation used to perform the cloning. Extensive experimental results confirm that the technique is able to precisely localize the altered area and to estimate the geometric transformation parameters with high reliability. The method also deals with multiple cloning.

More information about this project, including links to the datasets used in the experiments, is available on this page.
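The core idea, matching an image's local features against themselves, can be sketched as follows. This is a simplified illustration and not the procedure from the paper: it assumes descriptors have already been extracted, uses a standard nearest-neighbor ratio test, and estimates only a pure translation rather than a general geometric transformation. The names `self_matches` and `mean_shift_vector` and the thresholds `ratio` and `min_shift` are hypothetical.

```python
import math

def self_matches(keypoints, descriptors, ratio=0.6, min_shift=10.0):
    """Find pairs of keypoints in the SAME image with near-identical descriptors.

    A pair (i, j) is kept when j is the nearest neighbor of i in descriptor
    space, the nearest/second-nearest distance ratio is below `ratio`, and
    the two keypoints are more than `min_shift` pixels apart.
    """
    pairs = set()
    for i, d in enumerate(descriptors):
        cands = sorted((math.dist(d, e), j) for j, e in enumerate(descriptors) if j != i)
        if len(cands) < 2:
            continue
        (d1, j), (d2, _) = cands[0], cands[1]
        if d2 > 0 and d1 / d2 < ratio and math.dist(keypoints[i], keypoints[j]) > min_shift:
            pairs.add((min(i, j), max(i, j)))  # store each pair once
    return sorted(pairs)

def mean_shift_vector(keypoints, pairs):
    """Average displacement between matched keypoints (translation-only assumption)."""
    dx = sum(keypoints[j][0] - keypoints[i][0] for i, j in pairs) / len(pairs)
    dy = sum(keypoints[j][1] - keypoints[i][1] for i, j in pairs) / len(pairs)
    return (dx, dy)

# Toy data: keypoints 2 and 3 are copies of 0 and 1, shifted by (100, 50).
keypoints = [(10.0, 10.0), (20.0, 15.0), (110.0, 60.0), (120.0, 65.0), (50.0, 200.0)]
descriptors = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.01, 0.0, 0.0],   # near-duplicate of descriptor 0
    [0.0, 1.0, 0.01, 0.0],   # near-duplicate of descriptor 1
    [0.0, 0.0, 1.0, 1.0],    # unrelated region
]
matches = self_matches(keypoints, descriptors)
shift = mean_shift_vector(keypoints, matches)
```

A real detector would need actual SIFT descriptors, a more robust estimator for the transformation (rotation and scaling as well as translation), and handling of multiple cloned regions, none of which this sketch attempts.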

Human action recognition: ICIP and ICCV VOEC 2009 papers online

July 17, 2009

Our ICIP 2009 and ICCV VOEC 2009 papers are available online. We are working on a novel method based on an effective visual bag-of-words model and on a new spatio-temporal descriptor.

First, we define a new 3D gradient descriptor that, combined with optic flow, outperforms the state of the art without requiring fine parameter tuning (ICIP paper).

Second, we show that for spatio-temporal features the popular k-means algorithm is insufficient because cluster centers are attracted by the denser regions of the sample distribution, providing a non-uniform description of the feature space and thus failing to code other informative regions. For this reason we use a radius-based clustering method and a soft assignment that considers the information of two or more relevant candidates, thus obtaining a more effective codebook (ICCV VOEC paper). We extensively test our approach on the standard KTH and Weizmann action datasets, demonstrating its validity and outperforming other recent approaches.
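To make the contrast with k-means concrete, here is a minimal sketch of one simple radius-based scheme. It is a greedy variant chosen for brevity and is an assumption, not the exact clustering procedure used in the papers; the function name `radius_based_centers` and the `radius` value are hypothetical. A feature becomes a new codeword only if no existing codeword lies within the radius, so a dense region cannot attract more than its share of centers.

```python
import math

def radius_based_centers(points, radius):
    """Greedy radius-based clustering.

    A point becomes a new center only if every existing center is
    farther than `radius` from it, so centers are spread roughly
    uniformly over the occupied feature space.
    """
    centers = []
    for p in points:
        if all(math.dist(p, c) > radius for c in centers):
            centers.append(p)
    return centers

# Toy 2D "features": 100 points packed near the origin plus two isolated points.
dense = [(0.01 * i, 0.01 * j) for i in range(10) for j in range(10)]
sparse = [(5.0, 5.0), (-4.0, 3.0)]
centers = radius_based_centers(dense + sparse, radius=1.0)
```

Here the 100 dense points collapse into a single codeword while each isolated point gets its own, whereas k-means with three clusters would tend to place its centers according to sample density.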
