Nicola Mosca
Role
Level III - Researcher
Organization
Consiglio Nazionale delle Ricerche
Department
Not Available
Scientific Area
Not Available
Scientific Disciplinary Sector
Not Available
ERC Sector, 1st Level
Not Available
ERC Sector, 2nd Level
Not Available
ERC Sector, 3rd Level
Not Available
One of the first tasks executed by a vision system made of fixed cameras is background (BG) subtraction, and a particularly challenging context for real-time applications is the athletic one because of illumination changes, moving objects, and cluttered scenes. The aim of this work is to extract a BG model based on statistical likelihood, able to process color filter array (CFA) images while taking into account the intrinsic variance of each gray level of the sensor, named Likelihood Bayer Background (LBB). The BG model should have low computational complexity while remaining highly responsive, so as to extract a robust foreground. Moreover, the mathematical operations used in the formulation should be parallelizable, working on image patches, and computationally efficient, exploiting the dynamics of a pixel within its integer range. Both simulations and experiments on real video sequences demonstrate that this BG model shows great performance and robustness during the real-time processing of scenes extracted from a soccer match.
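As an illustration of the likelihood idea, the following sketch maintains a per-pixel running Gaussian model directly on raw sensor values and flags pixels whose intensity is unlikely under that model. It is a minimal stand-in, not the paper's LBB formulation (the patch-level parallelism and integer-range optimizations are omitted), and the learning rate and threshold are assumed values.

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.02):
    # Running Gaussian background model on raw CFA (Bayer) values:
    # exponential moving averages of per-pixel mean and variance.
    mean = (1 - alpha) * mean + alpha * frame
    var = (1 - alpha) * var + alpha * (frame - mean) ** 2
    return mean, var

def foreground_mask(mean, var, frame, k=2.5):
    # A pixel is foreground when it deviates more than k sigmas
    # from the modelled background intensity.
    sigma = np.sqrt(np.maximum(var, 1e-6))
    return np.abs(frame - mean) > k * sigma

# Toy example on a synthetic 4x4 raw frame
rng = np.random.default_rng(0)
bg = rng.integers(100, 110, size=(4, 4)).astype(float)
mean, var = bg.copy(), np.full((4, 4), 4.0)
frame = bg.copy()
frame[1, 2] += 60  # a moving object brightens one pixel
mask = foreground_mask(mean, var, frame)
```

Keeping the model on the raw Bayer mosaic, as the abstract suggests, avoids the cost of demosaicing before segmentation.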
Camera triggering represents an essential step in synchronizing artificial vision systems (AVSs) and can affect the quality of acquired images. In fact, a proper trigger signal is mandatory to synchronize in time both stand-alone and multiple cameras covering large environments. In addition, indoor environments with artificial light sources can induce flickering noise in captured frames, affecting the performance of the algorithms that are usually executed to interpret a scene or perform various tasks. In this letter, we describe the design of an embedded system for camera triggering that can be employed to deal with such issues and remove flickering noise while capturing an image with the highest possible dynamic range. Experiments on real data show how the proposed trigger can be effectively employed in AVSs.
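One flicker-mitigation strategy consistent with the goal above is to trigger exposures that integrate whole flicker cycles of the artificial light. The helper below is a simplified sketch of that idea, not the actual embedded trigger design: it rounds a requested exposure to a multiple of the light half-period (10 ms at 50 Hz mains).

```python
def flicker_safe_exposure(requested_ms, mains_hz=50.0):
    # Artificial light intensity oscillates at twice the mains
    # frequency, so exposures that are integer multiples of the
    # half-period integrate the same light energy in every frame.
    half_period_ms = 1000.0 / (2 * mains_hz)
    cycles = max(1, round(requested_ms / half_period_ms))
    return cycles * half_period_ms

print(flicker_safe_exposure(23))  # 20.0 ms at 50 Hz mains
```

In 60 Hz regions the half-period becomes 8.33 ms, which the `mains_hz` parameter accommodates.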
Sports video research is a popular topic that has been applied to many prominent sports for a large spectrum of applications. In this paper we introduce a technology platform developed for the tennis context, able to extract action sequences and support coaches in analyzing player performance during training and official matches. The system consists of a hardware architecture, devised to acquire data in the tennis context and meet the specific domain requirements, and a number of processing modules able to track both the ball and the players, extract semantic information from their interactions, and automatically annotate video sequences. The aim of this paper is to demonstrate that the proposed combination of hardware and software modules can extract 3D ball trajectories robust enough to evaluate changes in ball direction, recognizing serves, strokes, and bounces. Starting from this information, a decision process based on a finite state machine can be employed to evaluate the score of each action of the game. The entire platform has been tested in real experiments during both training sessions and matches, and results show that automatic annotation of key events, along with 3D positions and scores, can be used to support coaches in extracting valuable information about players' intentions and behaviours.
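The finite-state scoring idea can be sketched as a toy state machine over detected events; the event vocabulary and rules below are simplified assumptions for illustration, not the platform's actual decision process.

```python
def rally_outcome(events):
    # Tiny finite-state sketch: after a serve, two bounces without an
    # intervening stroke end the point. Real tennis scoring rules
    # (court sides, lets, faults) are deliberately omitted.
    state = "serve_pending"
    for ev in events:
        if state == "serve_pending" and ev == "serve":
            state = "in_play"
        elif state == "in_play" and ev == "bounce":
            state = "bounced"
        elif state == "bounced" and ev == "stroke":
            state = "in_play"
        elif state == "bounced" and ev == "bounce":
            return "point_over"
    return "in_play" if state in ("in_play", "bounced") else "no_rally"

print(rally_outcome(["serve", "bounce", "stroke", "bounce", "bounce"]))
# point_over
```

Feeding the machine with the serve/stroke/bounce events recognized from the 3D trajectories is what links the tracking modules to score evaluation.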
The software architecture is composed of a web application (BIS client) and a web service (BIS service). The former can be accessed using a common browser and presents in a user-friendly way the core functionalities supplied by the latter. Underneath, the two components interact by means of an HTTP API based on the LCML-based data model; the BIS service uses a geo-database to store georeferenced data and a native XML database to store and transform the LCML collections according to user requests. The LCML-based data model is encoded using XML Schema, in order to leverage the query capabilities of a native XML database (e.g. XQuery, XPath). The use of native XML technologies seems a reasonable choice for enabling the system to scale while keeping unaltered the full set of information available in LCCS3/LCML. Future steps of this work include the application of BIS functionalities to even larger LCCS3 legends (e.g. the output of automatic classification systems).
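As a small illustration of the kind of path-based querying that XQuery/XPath enable over an XML-encoded legend, the snippet below runs an XPath-style query on an LCML-like fragment with standard tooling; the element names are hypothetical, not actual LCML tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical LCML-like fragment; element and attribute names are
# illustrative only, not the real LCCS3/LCML schema.
doc = """
<legend>
  <landCoverClass id="A1"><name>Forest</name></landCoverClass>
  <landCoverClass id="A2"><name>Cropland</name></landCoverClass>
</legend>
"""
root = ET.fromstring(doc)
# A native XML database would evaluate a comparable XPath/XQuery
# expression server-side instead of parsing the whole collection.
names = [c.findtext("name") for c in root.findall("landCoverClass")]
print(names)  # ['Forest', 'Cropland']
```

In the BIS service such queries run inside the native XML database, so only the matching portion of a large legend is materialized per request.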
In recent years sports video research has gained steady interest among the scientific community. The large amount of video data available from broadcast transmissions and dedicated camera setups, and the need to extract meaningful information from these data, pose significant research challenges. Hence, computer vision and machine learning are essential for enabling automated or semi-automated processing of big data in sports. Although sports are diverse enough to present unique challenges of their own, most of them share the need to identify active entities such as the ball or the players. In this paper, an innovative deep learning approach to ball identification in the tennis context is presented. The work exploits the potential of a convolutional neural network classifier to decide whether a ball is being observed in a single frame, overcoming the typical issues of classical approaches on long video sequences (e.g. illumination changes and flickering). Experiments on real data confirm the validity of the proposed approach, which achieves 98.77% accuracy, and suggest its implementation and integration at a larger scale in more complex vision systems.
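In miniature, the classifier's forward pass for a single patch can be sketched as conv → ReLU → global pooling → sigmoid. The weights below are random placeholders, not the trained network from the paper, and the tiny architecture is illustrative only.

```python
import numpy as np

def conv2d(img, kernel):
    # Valid 2D convolution, single channel, no padding.
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def ball_score(patch, kernel, w, b):
    # conv -> ReLU -> global average pool -> sigmoid: a scalar
    # probability that the patch contains the ball.
    feat = np.maximum(conv2d(patch, kernel), 0).mean()
    return 1.0 / (1.0 + np.exp(-(w * feat + b)))

rng = np.random.default_rng(1)
patch = rng.random((16, 16))
score = ball_score(patch, rng.standard_normal((3, 3)), 1.0, 0.0)
```

Because the decision is made per frame on a small patch, the classifier sidesteps the temporal drift that illumination changes and flicker cause in background-based detectors.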
High resolution in distance (range) measurements can be achieved by means of accurate instrumentation and precise analytical models. This paper reports an improvement in the estimation of distance measurements performed by an omnidirectional range sensor already presented in the literature. This sensor exploits the principle of laser triangulation, together with the advantages brought by catadioptric systems, which allow the sensor size to be reduced without decreasing the resolution. Starting from a known analytical model in two dimensions (2D), the paper shows the development of a fully 3D formulation where all initial constraints are removed to gain measurement accuracy. Specifically, the ray projection problem is solved by considering that both the emitter and the receiver have general poses in a global coordinate system. Calibration is thus performed to estimate their poses and compensate for any misalignment with respect to the 2D approximation. Results prove an increase in measurement accuracy due to the more general formulation of the problem, with a remarkable decrease in uncertainty.
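The core geometric step of laser triangulation, back-projecting a receiver ray and intersecting it with the emitted laser plane under general poses, can be sketched as follows; the poses and plane used here are made-up examples, not the sensor's calibrated values.

```python
import numpy as np

def intersect_ray_plane(origin, direction, plane_point, plane_normal):
    # Intersect a back-projected camera ray with the laser plane.
    # With general (calibrated) poses, no 2D coplanarity assumption
    # is needed: the intersection lives fully in 3D.
    direction = direction / np.linalg.norm(direction)
    t = ((plane_point - origin) @ plane_normal) / (direction @ plane_normal)
    return origin + t * direction

# Example: laser plane z = 0.5, receiver at the origin looking along +z
p = intersect_ray_plane(np.zeros(3),
                        np.array([0.1, 0.0, 1.0]),
                        np.array([0.0, 0.0, 0.5]),
                        np.array([0.0, 0.0, 1.0]))
```

Estimating `origin`, `direction`, and the plane parameters in one global frame is precisely what the calibration step provides, compensating misalignments that the 2D model ignored.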
We propose a method for solving one of the significant open issues in computer vision: material recognition. A time-of-flight range camera has been employed to analyze the characteristics of different materials. Starting from the information returned by the depth sensor, different features of interest have been extracted using transforms such as Fourier, discrete cosine, Hilbert, chirp-z, and Karhunen-Loève. Such features have been used to build a training and a validation set useful to feed a classifier (J48) able to accomplish the material recognition step. The effectiveness of the proposed methodology has been experimentally tested. Good predictive accuracies of materials have been obtained. Moreover, experiments have shown that the combination of multiple transforms increases the robustness and reliability of the computed features, although the shutter value can heavily affect the prediction rates.
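A minimal sketch of this pipeline uses FFT magnitudes of synthetic depth profiles as a stand-in for the paper's transform features (Fourier, DCT, Hilbert, chirp-z, Karhunen-Loève) and a nearest-centroid rule in place of the J48 decision tree; the signals and labels are fabricated for illustration.

```python
import numpy as np

def spectral_features(signal, n=12):
    # First n FFT magnitude coefficients as a feature vector,
    # a simple proxy for the transform-based features in the paper.
    return np.abs(np.fft.rfft(signal))[:n]

def nearest_centroid(feat, centroids):
    # Toy classifier standing in for the J48 decision tree.
    dists = {label: np.linalg.norm(feat - c) for label, c in centroids.items()}
    return min(dists, key=dists.get)

# Synthetic per-pixel response profiles for two "materials"
t = np.linspace(0, 1, 64, endpoint=False)
wood = np.sin(2 * np.pi * 3 * t)
metal = np.sin(2 * np.pi * 9 * t)
centroids = {"wood": spectral_features(wood),
             "metal": spectral_features(metal)}
sample = metal + 0.01  # slightly perturbed observation
label = nearest_centroid(spectral_features(sample), centroids)
```

Concatenating features from several transforms, as the abstract reports, makes the feature vector less sensitive to any single transform's blind spots.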
In this paper we present a natural human-computer interface based on gesture recognition. The principal aim is to study how different personalized gestures, defined by users, can be represented in terms of features and can be modelled by classification approaches in order to obtain the best performances in gesture recognition. Ten different gestures involving the movement of the left arm are performed by different users. Different classification methodologies (SVM, HMM, NN, and DTW) are compared and their performances and limitations are discussed. An ensemble of classifiers is proposed to produce more favorable results compared to those of a single classifier system. The problems concerning different lengths of gesture executions, variability in their representations, and the generalization ability of the classifiers have been analyzed, and valuable insight into possible recommendations is provided.
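Among the compared methodologies, DTW directly addresses the problem of different gesture-execution lengths. A minimal implementation on 1-D trajectories (real gestures would use multi-dimensional arm-joint features):

```python
import numpy as np

def dtw_distance(a, b):
    # Dynamic time warping distance between two sequences of possibly
    # different lengths, via the standard O(n*m) dynamic program.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# A slow and a fast execution of the same arm trajectory, plus a
# different gesture: DTW matches the same shape despite length changes.
slow = np.sin(np.linspace(0, np.pi, 40))
fast = np.sin(np.linspace(0, np.pi, 25))
other = np.cos(np.linspace(0, np.pi, 30))
```

An ensemble can then combine such distance-based decisions with the SVM, HMM, and NN outputs, e.g. by majority vote.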
Computer vision is steadily gaining importance in many research fields, as its applications expand from traditional areas such as situation analysis and scene understanding in video surveillance to other scenarios. The sports context can represent a perfect test-bed for many machine vision algorithms because of the large availability of visual data brought by widespread cameras on a relatively high number of courts. In this paper we introduce a tennis ball detection and tracking method that exploits domain knowledge to effectively recognize ball positions and trajectories. A peculiarity of this approach is that it starts from a sparse but cluttered point cloud that evolves over time, basically working on 3D samples only. Experiments on real data demonstrate the effectiveness of the algorithm in terms of tracking accuracy and path-following capability.
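A toy version of trajectory extension over an evolving 3D point cloud can use a constant-velocity prediction and a distance gate to pick the next ball sample among clutter; the gating threshold and coordinates below are assumed values, not the paper's tracker.

```python
import numpy as np

def track_step(track, candidates, max_jump=1.0):
    # Predict the next ball position by constant velocity, then accept
    # the nearest 3D candidate if it lies within the distance gate.
    pred = 2 * track[-1] - track[-2]
    d = np.linalg.norm(candidates - pred, axis=1)
    i = int(np.argmin(d))
    return candidates[i] if d[i] <= max_jump else None

track = [np.array([0.0, 0.0, 1.0]), np.array([0.1, 0.0, 1.1])]
# Cluttered cloud: one plausible ball sample among spurious 3D points
cloud = np.array([[0.21, 0.01, 1.19],
                  [5.00, 3.00, 0.20],
                  [-2.00, 1.00, 4.00]])
nxt = track_step(track, cloud)
```

Domain knowledge (court geometry, plausible ball speeds) is what turns the generic gate into an effective filter against spurious 3D samples.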
Tennis player silhouette extraction is a preliminary step fundamental for any behavior analysis processing. Automatic systems for the evaluation of player tactics, in terms of position on the court, postures during the game, and types of strokes, are highly desired by coaches and for training purposes. These systems require accurate segmentation of players in order to apply posture analysis and high-level semantic analysis. Background subtraction algorithms have been largely used in the sports context when fixed cameras are employed. In this paper an innovative background subtraction algorithm is presented, adapted to the tennis context, which allows high precision in player segmentation and completeness of the extracted silhouettes. The algorithm is able to achieve interactive frame rates of up to 30 frames per second and is suitable for embedding in smart cameras. Real experiments demonstrate that the proposed approach is suitable for the tennis context.
The present invention addresses the problem of the automatic detection of events on the sports field, in particular Goal/No-Goal events, by signalling them to the match management, which can autonomously take the final decision upon the event. The system is not invasive for the field structures, nor does it require interrupting the game or modifying its rules; it only aims at objectively detecting the event occurrence and at providing support for the referees' decisions by means of specific signalling of the detected events.
The present invention relates to a system for detecting and classifying events during motion actions, in particular the "offside" event in the game of football. The system allows determining such an event in a real-time, semi-automatic context, taking into account the variability of the environmental conditions and of the dynamics of the events that can be traced back to offside. The present invention proposes a non-invasive technique, compatible with the normal course of the match.