Paolo Spagnolo
Role
Level III - Researcher
Organization
Consiglio Nazionale delle Ricerche
Department
Not Available
Scientific Area
AREA 09 - Industrial and Information Engineering
Scientific Disciplinary Sector
ING-INF/05 - Information Processing Systems
ERC Sector, 1st level
PE - PHYSICAL SCIENCES AND ENGINEERING
ERC Sector, 2nd level
PE6 Computer Science and Informatics: Informatics and information systems, computer science, scientific computing, intelligent systems
ERC Sector, 3rd level
PE6_8 Computer graphics, computer vision, multi media, computer games
Distributed networks of sensors have been recognized to be a powerful tool for developing fully automated systems that monitor environments and human activities. Nevertheless, problems such as active control of heterogeneous sensors for high-level scene interpretation and mission execution remain open. This paper presents the authors' ongoing research on the design and implementation of a distributed heterogeneous sensor network that includes static cameras and multi-sensor mobile robots. The system is intended to provide robot-assisted monitoring and surveillance of large environments. The proposed solution exploits a distributed control architecture to enable the network to autonomously accomplish general-purpose and complex monitoring tasks. The nodes can both act with some degree of autonomy and cooperate with each other. The paper describes the concepts underlying the designed system architecture and presents the results obtained working on its components, including some simulations performed in a realistic scenario to validate the distributed target tracking algorithm.
In recent years, "FragTrack" has become one of the most cited real-time algorithms for visual tracking of an object in a video sequence. However, this algorithm fails when the object is absent from the image or completely occluded, as well as in long-term video sequences, where the appearance of the target changes considerably over time and its comparison with the template established at the first frame becomes unreliable. In this work we introduce two improvements to the original FragTrack: the management of total object occlusions and the update of the object template. Basically, we use a voting map generated by a non-parametric kernel density estimation strategy that allows us to compute a probability distribution for the distances between the histograms of template and object patches. In order to automatically determine whether the target object is present in the current frame, an adaptive threshold is introduced. A Bayesian classifier establishes, frame by frame, the presence of the template object in the current frame. The template is partially updated at every frame. We tested the algorithm on well-known benchmark sequences, in which the object is always present, and on video sequences showing total occlusion of the target object, to demonstrate the effectiveness of the proposed method.
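The occlusion test described above can be sketched as follows: a Gaussian kernel density estimate is built over template-to-patch histogram distances observed while the target is visible, and an adaptive threshold on that density decides whether the target is still present. This is a minimal illustration in Python, not the authors' implementation; the sample distances, bandwidth and threshold fraction are all hypothetical.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a non-parametric density estimate built from `samples`."""
    n = len(samples)
    def density(x):
        return sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for s in samples
        ) / (n * bandwidth * math.sqrt(2 * math.pi))
    return density

def target_present(distance, density, threshold):
    """Declare the target present while the observed template distance
    is still plausible under the learned distribution."""
    return density(distance) >= threshold

# Hypothetical distances observed while the target was clearly visible.
visible_distances = [0.10, 0.12, 0.11, 0.14, 0.09, 0.13]
kde = gaussian_kde(visible_distances, bandwidth=0.02)

# An adaptive threshold: a fraction of the density at the sample mean.
mean_d = sum(visible_distances) / len(visible_distances)
threshold = 0.05 * kde(mean_d)

print(target_present(0.12, kde, threshold))  # small distance: present
print(target_present(0.60, kde, threshold))  # large distance: occluded
```

A full tracker would recompute the threshold as the distance distribution is updated, which is what makes it adaptive.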
Automatic sport team discrimination, that is, the correct assignment of each player to the respective team, is a fundamental step in high-level sport video sequence analysis applications. In this work we propose a novel set of features based on a variation of classic color histograms, called Positional Histograms: these features try to overcome the main drawbacks of classic histograms, first of all the lack of any relation between the spectral and spatial contents of the image. The basic idea is to extract histograms as a function of the position of points in the image, with the goal of maintaining a relationship between the color distribution and the position: this is necessary because the actors on a playing field often dress in a similar way, with just a different distribution of the same colors across the silhouettes. Further, different unsupervised classifiers and different feature sets are jointly evaluated, with the goal of investigating the feasibility of unsupervised techniques in sport video analysis.
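The idea of tying color distribution to position can be sketched by splitting the image into a grid of cells and concatenating one histogram per cell, so that two patches with the same global colors but different layouts get different descriptors. This is a minimal single-channel illustration; the grid size and bin count are arbitrary choices, not the paper's parameters.

```python
def positional_histogram(image, grid=(2, 2), bins=4, levels=256):
    """Concatenate one histogram per spatial cell so the descriptor
    keeps a coarse link between intensity and position. `image` is a
    list of rows of grey-level ints (a hypothetical single-channel
    patch; per-channel RGB works the same way)."""
    h, w = len(image), len(image[0])
    gh, gw = grid
    descriptor = []
    for gy in range(gh):
        for gx in range(gw):
            hist = [0] * bins
            for y in range(gy * h // gh, (gy + 1) * h // gh):
                for x in range(gx * w // gw, (gx + 1) * w // gw):
                    hist[image[y][x] * bins // levels] += 1
            descriptor.extend(hist)
    return descriptor

# Top half bright, bottom half dark: a plain global histogram could not
# tell this patch from its vertical mirror; the positional one can.
patch = [[200, 200], [200, 200], [10, 10], [10, 10]]
print(positional_histogram(patch))
```

Mirroring the patch vertically swaps the first and second halves of the descriptor, which is exactly the spatial sensitivity a plain histogram lacks.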
Unmanned aerial vehicles (UAVs) have been an active research field for several years. They can be applied in a large variety of different scenarios, and provide a test bed to investigate several unsolved problems such as path planning, control and navigation. Furthermore, with the availability of low-cost, robust and small video cameras, UAV video has been one of the fastest growing data sources in the last couple of years, so object detection and tracking, as well as visual navigation, have recently received a lot of attention. This paper proposes an advanced technology framework that, through the use of UAVs, allows a specific sensitive area to be supervised (e.g. traffic monitoring, dangerous zones, and so on). In particular, one of the most cited real-time visual trackers proposed in the literature, Struck, is applied to video sequences typically supplied by UAVs equipped with a monocular camera. Furthermore, this paper investigates the feasibility of grafting different feature characterizations into the original tracking architecture, replacing the original ones. The feature extraction methods used are based on Local Binary Patterns (LBP) and Histograms of Oriented Gradients (HOG). Objects to be tracked can be selected manually or by means of advanced detection techniques based, for example, on change detection or template matching strategies. The experimental results on well-known benchmark sequences show that replacing these features improves the overall performance of the original real-time visual tracker.
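As an illustration of one of the grafted feature families, the basic 8-neighbour Local Binary Pattern code of a pixel can be computed as below. This is the textbook LBP operator, not the specific characterization used in the paper.

```python
def lbp_code(patch):
    """8-neighbour Local Binary Pattern code of the centre pixel of a
    3x3 patch: each neighbour greater than or equal to the centre
    contributes one bit, walking the ring clockwise from top-left."""
    centre = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, n in enumerate(neighbours):
        if n >= centre:
            code |= 1 << bit
    return code

# Neighbours 9, 9 (bits 0, 1) and 9 (bit 4) exceed the centre value 5.
print(lbp_code([[9, 9, 1],
                [1, 5, 1],
                [1, 1, 9]]))  # 1 + 2 + 16 = 19
```

A histogram of these codes over a region is the usual LBP texture descriptor that could be fed to a tracker in place of raw pixels.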
In the last decade, soccer video analysis has received a lot of attention from the scientific community. This increasing interest is motivated by the possible applications over a wide spectrum of topics: indexing, summarization, video enhancement, team and players statistics, tactics analysis, referee support, etc. The application of computer vision methodologies in the soccer context requires many problems to be faced: ball and players have to be detected in the images in any light and weather condition, they have to be localized in the field, tracked over time and finally their interactions have to be detected and analyzed. The latter task is fundamental, especially for statistic and referee decision support purposes, but, unfortunately, it has not received adequate attention from the scientific community and a lot of research remains to be done. In this paper a multicamera system is presented to detect the ball player interactions during soccer matches. The proposed method extracts, by triangulation from multiple cameras, the 3D ball and player trajectories and, by estimating the trajectory intersections, detects the ball-player interactions. An inference process is then introduced to determine the player kicking the ball and to estimate the interaction frame. The system was tested during several matches of the Italian first division football championship and experimental results demonstrated that the proposed method is robust and accurate.
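The trajectory-intersection step can be sketched as a per-frame proximity test between the reconstructed 3D ball and player trajectories. The radius and the toy trajectories below are illustrative, not the system's actual parameters.

```python
def interaction_frames(ball_traj, player_traj, radius):
    """Frames where the 3D ball trajectory passes within `radius` of
    the player trajectory: candidate ball-player interactions. Both
    trajectories are lists of (x, y, z) positions, one per frame."""
    hits = []
    for frame, (b, p) in enumerate(zip(ball_traj, player_traj)):
        dist2 = sum((bc - pc) ** 2 for bc, pc in zip(b, p))
        if dist2 <= radius ** 2:
            hits.append(frame)
    return hits

# Hypothetical trajectories: the ball approaches a standing player.
ball   = [(0, 0, 1.0), (1, 0, 1.0), (2, 0, 0.5), (3, 0, 0.5)]
player = [(2, 0, 0.0), (2, 0, 0.0), (2, 0, 0.0), (2, 0, 0.0)]
print(interaction_frames(ball, player, radius=1.0))  # only frame 2
```

In the full system an inference step would then examine the ball's change of direction around such frames to decide which player actually kicked it.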
Automatic sport video analysis has become one of the most attractive research fields in the areas of computer vision and multimedia technologies. In particular, there has been a boom in soccer video analysis research. This paper presents a new multi-step algorithm to automatically detect the soccer ball in image sequences acquired from static cameras. In each image, candidate ball regions are selected by analyzing edge circularity, and then ball patterns are extracted representing locally affine invariant regions around distinctive points which have been highlighted automatically. The effectiveness of the proposed methodologies is demonstrated through a large number of experiments using real balls under challenging conditions, as well as a favorable comparison with some of the leading approaches from the literature.
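A common circularity measure for this kind of candidate-region selection is the isoperimetric ratio 4πA/P², which equals 1 for a perfect disc and is smaller for any other shape. The sketch below illustrates such a shape test; the paper's exact edge-circularity analysis may differ.

```python
import math

def circularity(area, perimeter):
    """Isoperimetric circularity 4*pi*A / P^2: 1.0 for a perfect
    circle, lower for elongated or irregular regions. A threshold on
    this value is a cheap way to pre-select ball-like regions."""
    return 4 * math.pi * area / (perimeter ** 2)

r = 10.0
disc = circularity(math.pi * r * r, 2 * math.pi * r)  # exactly 1.0
square = circularity(100.0, 40.0)                     # pi/4 ~ 0.785
print(round(disc, 3), round(square, 3))
```

Regions scoring close to 1 would be kept as ball candidates and passed on to the pattern-based validation stage.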
Mobility and multi-functionality have been recognized as being basic requirements for the development of fully automated surveillance systems in realistic scenarios. Nevertheless, problems such as active control of heterogeneous mobile agents, integration of information from fixed and moving sensors for high-level scene interpretation, and mission execution are open. This paper describes recent and current research of the authors concerning the design and implementation of a multi-agent surveillance system, using static cameras and mobile robots. The proposed solution takes advantage of a distributed control architecture that allows the agents to autonomously handle general-purpose tasks, as well as more complex surveillance issues. The various agents can either take decisions and act with some degree of autonomy, or cooperate with each other. This paper presents an overview of the system architecture and of the algorithms involved in developing such an autonomous, multi-agent surveillance system.
Joint attention is an early-developing social-communicative skill in which two people (usually a young child and an adult) share attention with regard to an interesting object or event, by means of gestures and gaze, and its presence is a key element in evaluating therapy in the case of autism spectrum disorders. In this work, a novel automatic system able to detect joint attention by using a completely non-intrusive depth camera installed on the room ceiling is presented. In particular, in a scenario where a humanoid robot, a therapist (or a parent) and a child are interacting, the system can detect the social interaction between them. Specifically, a depth camera mounted at the top of the room is employed to detect, first of all, the triggering event to be monitored (performed by the humanoid robot) and, subsequently, to detect the possible joint attention mechanism by analyzing the orientation of the heads. The system operates in real time, providing the therapist with a completely non-intrusive instrument to help evaluate the quality and the precise modalities of this predominant feature during the therapy session.
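From a ceiling-mounted camera, a joint attention event can be approximated as both participants' head orientations pointing at the same target within an angular tolerance. The 2-D sketch below is a simplified illustration of that geometric test; the positions, directions and tolerance are hypothetical, not the system's actual values.

```python
import math

def looking_at(head_pos, head_dir, target_pos, max_angle_deg=15.0):
    """True when the head orientation (2-D, as seen from above)
    points at the target within an angular tolerance."""
    vx, vy = target_pos[0] - head_pos[0], target_pos[1] - head_pos[1]
    dot = head_dir[0] * vx + head_dir[1] * vy
    norm = math.hypot(head_dir[0], head_dir[1]) * math.hypot(vx, vy)
    angle = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    return angle <= max_angle_deg

def joint_attention(child, therapist, robot):
    """Joint attention event: both participants look at the robot.
    Each participant is a (position, head_direction) pair."""
    return looking_at(*child, robot) and looking_at(*therapist, robot)

robot = (0.0, 0.0)
child = ((2.0, 0.0), (-1.0, 0.0))      # at (2,0), facing the robot
therapist = ((0.0, 3.0), (0.0, -1.0))  # at (0,3), facing the robot
print(joint_attention(child, therapist, robot))  # both aligned: True
```

A real system would additionally gate this test on the robot's triggering event and smooth head-orientation estimates over time.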
Ball recognition in soccer matches is a critical issue for automatic soccer video analysis. Unfortunately, because of the difficulty in solving the problem, many efforts of numerous researchers have still not produced fully satisfactory results in terms of accuracy. This paper proposes a ball recognition approach that introduces a double level of innovation. Firstly, a randomized circle detection approach based on the local curvature information of the isophotes is used to identify the edge pixels belonging to the ball boundaries. Then, ball candidates are validated by a learning framework formulated into a three-layered model based on a variation of the conventional local binary pattern approach. Experimental results were obtained on a significant set of real soccer images, acquired under challenging lighting conditions during Italian "Serie A" matches. The results have been also favorably compared with the leading state-of-the-art methods.
This paper focuses on a ball detection algorithm that analyzes candidate ball regions to detect the ball. Unfortunately, at the time of a goal, the goal posts (and sometimes also some players) partially occlude the ball or alter its appearance (due to the shadows they cast on it). This often makes traditional pattern recognition approaches ineffective and forces the system to make the decision about the event based on estimates rather than on real ball position measurements. To overcome this drawback, this work compares different descriptors of the ball appearance: in particular, it investigates both well-known feature extraction approaches and the recent BRISK local descriptor in a soccer match context. The paper analyzes critical situations in which the ball is heavily occluded in order to measure robustness, accuracy and detection performance. The effectiveness of BRISK compared with other local descriptors is validated by a large number of experiments on heavily occluded ball examples acquired under realistic conditions.
Moving object detection is a crucial step in many application contexts such as people detection, action recognition, and visual surveillance for safety and security. The recent advance in depth camera technology has suggested the possibility of exploiting multi-sensor information (color and depth) in order to achieve better results in video segmentation. In this paper, we present a technique that combines depth and color image information and demonstrate its effectiveness through experiments performed on real image sequences recorded by means of a stereo camera.
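A minimal sketch of color-depth fusion for foreground segmentation: a pixel is marked foreground if it deviates from the color background model or is measurably closer to the camera than the background depth, so depth recovers objects camouflaged in color. The fusion rule, thresholds and tiny test images are hypothetical, not the paper's actual method.

```python
def segment(color, depth, color_bg, depth_bg, tc, td):
    """Per-pixel foreground mask from a color background model and a
    depth background model. `tc` (intensity units) and `td` (depth
    units) are illustrative thresholds."""
    h, w = len(color), len(color[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            color_fg = abs(color[y][x] - color_bg[y][x]) > tc
            depth_fg = depth_bg[y][x] - depth[y][x] > td
            mask[y][x] = color_fg or depth_fg
    return mask

# Pixel (1,0): same color as background but much closer in depth
# (camouflage caught by depth). Pixel (1,1): clear color change.
color    = [[100, 100], [100, 180]]
depth    = [[5.0, 5.0], [2.0, 5.0]]
color_bg = [[100, 100], [100, 100]]
depth_bg = [[5.0, 5.0], [5.0, 5.0]]
print(segment(color, depth, color_bg, depth_bg, tc=30, td=1.0))
```

In practice both background models would be maintained adaptively, and morphological cleanup would follow the raw mask.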
In order to perform automatic analysis of sport videos acquired from a multi-sensing environment, it is fundamental to face the problem of automatic football team discrimination. A correct assignment of each player to the relative team is a preliminary task that, together with player detection and tracking algorithms, can strongly affect any high-level semantic analysis. Supervised approaches for object classification require the construction of ad hoc models before the processing and also a manual selection of different player patches belonging to the team classes. The idea of this paper is to collect the player patches coming from six different cameras and, after a pre-processing step based on CBTF (Cumulative Brightness Transfer Function), to study and compare different unsupervised methods for classification. The pre-processing step based on CBTF has been implemented in order to mitigate differences in appearance between images acquired by different cameras. We tested three different unsupervised classification algorithms (MBSAS, a sequential clustering algorithm; BCLS, a competitive one; and k-means, a hard-clustering algorithm) on the transformed patches. Results obtained by comparing different sets of features with different classifiers are proposed. Experimental results have been carried out on different real matches of the Italian Serie A.
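Of the three clustering algorithms compared, k-means is the simplest to sketch. The toy 2-means below splits player-patch feature vectors into two teams; the deterministic initialisation and the 2-D features are illustrative choices, not the paper's setup.

```python
def kmeans2(points, iters=10):
    """Minimal 2-means (one of the unsupervised classifiers compared
    in the paper) over player-patch feature vectors. Initialised with
    the first and last points for determinism in this sketch."""
    centroids = [points[0], points[-1]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared distance.
        for j, p in enumerate(points):
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assign[j] = 0 if d[0] <= d[1] else 1
        # Update step: each centroid moves to the mean of its members.
        for i in (0, 1):
            members = [p for p, a in zip(points, assign) if a == i]
            if members:
                centroids[i] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign

# Hypothetical 2-D colour features: dark shirts cluster near the
# origin, bright shirts near (0.8, 0.9).
patches = [(0.10, 0.20), (0.15, 0.25), (0.80, 0.90), (0.12, 0.22), (0.85, 0.95)]
print(kmeans2(patches))  # team labels: [0, 0, 1, 0, 1]
```

With CBTF-normalised patches the same clustering can be run on features pooled from all six cameras, which is the point of the pre-processing step.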
In this paper, a real case study on a Goal Line Monitoring system is presented. The core of the paper is a refined ball detection algorithm that analyzes candidate ball regions to detect the ball. A decision-making approach, by means of camera calibration, decides about the goal event occurrence. Differently from other similar approaches, the proposed one provides, as unquestionable proof, the image sequence that records the goal event under consideration. Moreover, it is non-invasive: it does not require any change in the typical football devices (ball, goal posts, and so on). Extensive experiments were performed on both real matches acquired during the Italian Serie A championship, and specific evaluation tests by means of an artificial impact wall and a shooting machine for shot simulation. The encouraging experimental results confirmed that the system could help humans in ambiguous goal line event detection.
In the last years, smart surveillance has been one of the most active research topics in computer vision because of the wide spectrum of promising applications. Its main point is about the use of automatic video analysis technologies for surveillance purposes. In general, a processing framework for smart surveillance consists of a preliminary motion detection step in combination with high-level reasoning that allows automatic understanding of evolutions of observed scenes. In this paper, we propose a surveillance framework based on a set of reliable visual algorithms that perform different tasks: a motion analysis approach that segments foreground regions is followed by three procedures, which perform object tracking, homographic transformations and edge matching, in order to achieve the real-time monitoring of forbidden areas and the detection of abandoned or removed objects. Several experiments have been performed on different real image sequences acquired from a Messapic museum (indoor context) and the nearby archaeological site (outdoor context) to demonstrate the effectiveness and the flexibility of the proposed approach.
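The forbidden-area check can be sketched as projecting the foot point of a tracked blob through a ground-plane homography and testing membership in a zone defined in world coordinates. The homography matrix and the zone below are hypothetical, purely to illustrate the mechanism.

```python
def apply_homography(H, point):
    """Project an image point to ground-plane coordinates through a
    3x3 homography (the planar mapping used to reason about scene
    areas in world coordinates)."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def in_forbidden_area(ground_pt, xmin, xmax, ymin, ymax):
    """A hypothetical axis-aligned forbidden zone on the ground plane."""
    gx, gy = ground_pt
    return xmin <= gx <= xmax and ymin <= gy <= ymax

# An illustrative scale-only homography; a real one comes from
# calibration against known ground-plane landmarks.
H = [[0.5, 0.0, 0.0],
     [0.0, 0.5, 0.0],
     [0.0, 0.0, 1.0]]
foot_point = (40, 60)  # lowest pixel of a tracked foreground blob
print(in_forbidden_area(apply_homography(H, foot_point), 10, 30, 20, 40))
```

Raising an alarm then reduces to running this test on each tracked object per frame.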
This paper presents a robust visual tracking algorithm based on dense local descriptors. These local invariant representations, combined with a robust object/context nearest-neighbor classifier, allow a very powerful visual tracker to be built. The performance is very promising even in very long video sequences. © 2015 OSA.
The present invention addresses the problem of the automatic detection of events on a sports field, in particular Goal/NoGoal events, by signalling them to the match management, which can autonomously take the final decision upon the event. The system is not invasive for the field structures, nor does it require interrupting the game or modifying its rules; it only aims at objectively detecting the event occurrence and at supporting the referees' decisions by means of specific signalling of the detected events.
The present invention relates to a system for detecting and classifying events during motion actions, in particular the "offside" event in the game of football. The system allows such an event to be determined in a real-time and semi-automatic context, taking into account the variability of the environmental conditions and of the dynamics of the events which can be traced back to offside. The present invention employs a non-invasive technique, compatible with the usual course of the match.