Tiziana Rita D'orazio
Role
Level III - Researcher
Organization
Consiglio Nazionale delle Ricerche
Department
Not available
Scientific Area
AREA 09 - Industrial and Information Engineering
Scientific Disciplinary Sector
ING-INF/05 - Information Processing Systems
ERC Sector, 1st level
PE - PHYSICAL SCIENCES AND ENGINEERING
ERC Sector, 2nd level
PE6 Computer Science and Informatics: Informatics and information systems, computer science, scientific computing, intelligent systems
ERC Sector, 3rd level
PE6_7 Artificial intelligence, intelligent systems, multi agent systems
In this paper, a fast and innovative three-dimensional vision system with high resolution in surface reconstruction is discussed. It is based on a triangulation 3D laser scanner with a linear beam shape. The high precision (a few microns) is guaranteed by the very small laser line width, the small camera pixel size and the proper optical properties of the telecentric lens. The entire system has been tested on two kinds of sample objects: a 20-cent coin and a set of precision drilling tools. The main purpose of this work is the detection and reconstruction of the 3D surface of tiny objects and the measurement of their surface defects with high accuracy. Furthermore, the occlusion problem is faced and solved by properly handling the camera-laser setup. Experimental tests prove the high precision of the system, which can reach a resolution of 15 µm. © 2013 IEEE.
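The abstract above rests on the standard triangulation relation between the lateral shift of the imaged laser line and the surface height. A minimal sketch of that relation, assuming illustrative values for magnification, pixel size and triangulation angle (not the paper's actual setup):

```python
import numpy as np

def height_from_shift(shift_px, pixel_size_um, magnification, tri_angle_deg):
    """Laser-triangulation height from the lateral shift of the imaged
    laser line. With a telecentric lens the magnification M is constant
    over the field of view, so the mapping is linear:
        h = (shift_px * pixel_size / M) / sin(theta).
    """
    shift_obj = shift_px * pixel_size_um / magnification  # shift in object space (um)
    return shift_obj / np.sin(np.deg2rad(tri_angle_deg))  # height (um)

# e.g. a one-pixel shift with 5.5 um pixels, 2x magnification and a
# 30-degree triangulation angle corresponds to a 5.5 um height step
```

Smaller pixels, higher magnification or a steeper triangulation angle all shrink the height quantum, which is how micron-level resolution becomes reachable.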
In this paper, an accurate range sensor for the three-dimensional reconstruction of environments is designed and developed. Following the principles of laser profilometry, the device exploits a set of optical transmitters able to project a laser line on the environment. A high-resolution and high-frame-rate camera assisted by a telecentric lens collects the laser light reflected by a parabolic mirror, whose shape is designed ad hoc to achieve a maximum measurement error of 10 mm when the target is placed 3 m away from the laser source. Measurements are derived by means of an analytical model, whose parameters are estimated during a preliminary calibration phase. Geometrical parameters, analytical modeling and image processing steps are validated through several experiments, which indicate the capability of the proposed device to recover the shape of a target with high accuracy. Experimental measurements show Gaussian statistics, having standard deviation of 1.74 mm within the measurable range. Results prove that the presented range sensor is a good candidate for environmental inspections and measurements.
In this paper, we present a gesture recognition system for the development of a human-robot interaction (HRI) interface. Kinect cameras and the OpenNI framework are used to obtain real-time tracking of a human skeleton. Ten different gestures, performed by different persons, are defined. Quaternions of joint angles are first used as robust and significant features. Next, neural network (NN) classifiers are trained to recognize the different gestures. This work deals with different challenging tasks, such as the real-time implementation of a gesture recognition system and the temporal resolution of gestures. The HRI interface developed in this work includes three Kinect cameras placed at different locations in an indoor environment and an autonomous mobile robot that can be remotely controlled by one operator standing in front of one of the Kinects. Moreover, the system is supplied with a people re-identification module which guarantees that only one person at a time has control of the robot. The system's performance is first validated offline, and then online experiments are carried out, proving the real-time operation of the system as required by an HRI interface.
One of the first tasks executed by a vision system made of fixed cameras is background (BG) subtraction, and a particularly challenging context for real-time applications is the athletic one, because of illumination changes, moving objects and cluttered scenes. The aim of this work is to extract a BG model based on statistical likelihood, able to process color filter array (CFA) images while taking into account the intrinsic variance of each gray level of the sensor, named Likelihood Bayer Background (LBB). The BG model should not be too computationally complex while remaining highly responsive, so as to extract a robust foreground. Moreover, the mathematical operations used in the formulation should be parallelizable, working on image patches, and computationally efficient, exploiting the dynamics of a pixel within its integer range. Both simulations and experiments on real video sequences demonstrate that this BG model shows great performance and robustness during the real-time processing of scenes extracted from a soccer match.
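The statistical-likelihood idea can be sketched with a per-pixel running Gaussian: a pixel is foreground when its value is too unlikely under the background distribution. This is a minimal single-Gaussian sketch on grayscale frames, not the paper's Bayer-domain LBB formulation; parameters `alpha` and `k` are illustrative:

```python
import numpy as np

class GaussianBG:
    """Per-pixel running Gaussian background model (illustrative sketch)."""

    def __init__(self, first_frame, alpha=0.05, k=2.5, min_var=4.0):
        self.mean = first_frame.astype(np.float32)
        self.var = np.full(first_frame.shape, 25.0, np.float32)
        self.alpha, self.k, self.min_var = alpha, k, min_var

    def apply(self, frame):
        f = frame.astype(np.float32)
        d2 = (f - self.mean) ** 2
        fg = d2 > (self.k ** 2) * self.var      # low-likelihood pixels -> foreground
        bg = ~fg                                 # update statistics only where background
        a = self.alpha
        self.mean[bg] += a * (f[bg] - self.mean[bg])
        self.var[bg] = (1 - a) * self.var[bg] + a * d2[bg]
        np.maximum(self.var, self.min_var, out=self.var)  # keep variance floored
        return fg
```

All operations are element-wise array arithmetic, which is why this family of models parallelizes well on patches, as the abstract requires.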
In this article, an accurate method for the registration of point clouds returned by a 3D rangefinder is presented. The method modifies the well-known iterative closest point (ICP) algorithm by introducing the concept of a deletion mask. This term is defined starting from virtual scans of the reconstructed surfaces and using inconsistencies between measurements. In this way, spatial regions of implicit ambiguity, due to edge effects or systematic errors of the rangefinder, are automatically found. Several experiments are performed to compare the proposed method with three ICP variants. Results prove the capability of deletion masks to aid point cloud registration, lowering the errors of the other ICP variants, regardless of the presence of artifacts caused by small changes of the sensor viewpoint and changes of the environment.
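The deletion-mask idea can be sketched on top of a single point-to-point ICP iteration: matches whose nearest-neighbour distance is inconsistent are masked out before the rigid transform is solved in the least-squares sense (Kabsch/SVD). A numpy sketch under a plain distance-threshold mask; the paper derives its masks from virtual scans instead:

```python
import numpy as np

def icp_step_with_mask(src, dst, reject_dist):
    """One ICP iteration with a deletion mask: match each source point to
    its nearest target point, mask out inconsistent matches, then solve
    the rigid transform on the surviving pairs."""
    d = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=2)
    nn = d.argmin(axis=1)
    nn_dist = d[np.arange(len(src)), nn]
    keep = nn_dist < reject_dist                 # deletion mask (True = keep)
    p, q = src[keep], dst[nn[keep]]
    p0, q0 = p - p.mean(0), q - q.mean(0)
    U, _, Vt = np.linalg.svd(p0.T @ q0)          # Kabsch: R = V U^T
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = q.mean(0) - R @ p.mean(0)
    return R, t, keep
```

Iterating this step (re-matching after each transform) gives the full ICP loop; the mask is what keeps edge artifacts from biasing the estimate.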
Service robots are expected to be used in many households in the near future, provided that proper interfaces are developed for human-robot interaction. Gesture recognition has been recognized as a natural way of communicating, especially for elderly or impaired people. With the development of new technologies and the wide availability of inexpensive depth sensors, real-time gesture recognition has been tackled by using depth information, avoiding the limitations due to complex backgrounds and varying lighting conditions. In this paper, the Kinect depth camera and the OpenNI framework have been used to obtain real-time tracking of the human skeleton. Then, robust and significant features have been selected to get rid of unrelated features and decrease the computational cost. These features are fed to a set of neural network classifiers that recognize ten different gestures. Several experiments demonstrate that the proposed method works effectively. Real-time tests prove the robustness of the method for the realization of human-robot interfaces. Copyright © 2014 SCITEPRESS.
Automatic sport team discrimination, that is, the correct assignment of each player to the relative team, is a fundamental step in high-level sport video analysis applications. In this work we propose a novel set of features based on a variation of classic color histograms called Positional Histograms: these features try to overcome the main drawbacks of classic histograms, first of all the lack of any relation between the spectral and spatial content of the image. The basic idea is to extract histograms as a function of the position of points in the image, with the goal of maintaining a relationship between the color distribution and its position: this is necessary because the actors on a playing field often dress in a similar way, with just a different distribution of the same colors across the silhouettes. Further, different unsupervised classifiers and different feature sets are jointly evaluated, with the goal of investigating the feasibility of unsupervised techniques in sport video analysis.
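One simple way to realize the idea of position-aware histograms is to split each player patch into horizontal stripes and concatenate one histogram per stripe, so that "red shirt over white shorts" and "white shirt over red shorts" produce different descriptors. An illustrative sketch (stripe count and bin count are arbitrary choices, not the paper's):

```python
import numpy as np

def positional_histogram(patch, n_stripes=4, bins=8):
    """Concatenate one normalized color histogram per horizontal stripe,
    so the descriptor keeps a coarse link between each color and its
    vertical position within the player patch."""
    h = patch.shape[0]
    edges = np.linspace(0, h, n_stripes + 1).astype(int)
    feats = []
    for top, bot in zip(edges[:-1], edges[1:]):
        stripe = patch[top:bot]
        for c in range(patch.shape[2]):
            hist, _ = np.histogram(stripe[..., c], bins=bins, range=(0, 256))
            feats.append(hist / max(stripe[..., c].size, 1))
    return np.concatenate(feats)
```

The resulting vector has `n_stripes * channels * bins` entries and can feed any of the unsupervised classifiers the abstract mentions.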
Camera triggering represents an essential step for synchronizing artificial vision systems (AVSs) and can affect the quality of acquired images. In fact, a proper trigger signal is mandatory to synchronize in time both stand-alone and multiple cameras covering large environments. In addition, indoor environments with artificial light sources can induce flickering noise in captured frames, affecting the performance of the algorithms that are usually executed to interpret a scene or perform various tasks. In this letter, we describe the design of an embedded system for camera triggering that can be employed to deal with such issues and remove flickering noise while capturing an image with the highest possible dynamic range. Experiments on real data show how the proposed trigger can be effectively employed in AVSs.
This paper presents a survey of soccer video analysis systems for different applications: video summarization, provision of augmented information, high-level analysis. Computer vision techniques have been adapted to be applicable in the challenging soccer context. Different semantic levels of interpretation are required according to the complexity of the corresponding applications. For each application area we analyze the computer vision methodologies, their strengths and weaknesses and we investigate whether these approaches can be applied to extensive and real time soccer video analysis.
Third-generation surveillance systems are in large demand for the intelligent surveillance of different scenarios such as public areas, urban traffic control, smart homes and so on. They are based on multiple cameras and processing modules that integrate data coming from a large surveillance space. The semantic interpretation of data from a multi-view context is a challenging task and requires the development of image processing methodologies that can support applications in extensive and real-time contexts. This paper presents a survey of the automatic event detection functionalities that have been developed for third-generation surveillance systems, with a particular emphasis on the open problems that limit the application of computer vision methodologies to commercial multi-camera systems.
In the last decade, soccer video analysis has received a lot of attention from the scientific community. This increasing interest is motivated by the possible applications over a wide spectrum of topics: indexing, summarization, video enhancement, team and player statistics, tactics analysis, referee support, etc. The application of computer vision methodologies in the soccer context requires many problems to be faced: the ball and players have to be detected in the images in any light and weather condition, localized in the field, tracked over time, and finally their interactions have to be detected and analyzed. The latter task is fundamental, especially for statistics and referee decision support purposes, but, unfortunately, it has not received adequate attention from the scientific community and a lot of research remains to be done. In this paper a multi-camera system is presented to detect ball-player interactions during soccer matches. The proposed method extracts, by triangulation from multiple cameras, the 3D ball and player trajectories and, by estimating the trajectory intersections, detects the ball-player interactions. An inference process is then introduced to determine the player kicking the ball and to estimate the interaction frame. The system was tested during several matches of the Italian first division football championship, and experimental results demonstrated that the proposed method is robust and accurate.
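The trajectory-intersection test at the core of the interaction detection can be reduced to a proximity check between the reconstructed 3D trajectories: frames where the ball passes within some radius of a player are interaction candidates. A minimal sketch (the radius is an illustrative parameter; the paper adds an inference step to pick the kicking player and the exact frame):

```python
import numpy as np

def interaction_frames(ball_traj, player_traj, radius=0.5):
    """Return the frame indices where the 3D ball trajectory comes
    within `radius` of a player trajectory (both arrays are T x 3,
    one row per synchronized frame)."""
    d = np.linalg.norm(ball_traj - player_traj, axis=1)
    return np.flatnonzero(d < radius)
```

Running this against every player yields candidate (player, frame) pairs, over which a higher-level inference can decide who actually touched the ball.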
In this paper we present an on-board computer vision system for the pose estimation of an Unmanned Aerial Vehicle (UAV) with respect to a human-made landing target. The proposed methodology is based on a coarse-to-fine approach that searches for the target marks, starting from the recognition of the characteristics visible from long distances, up to the inner details when short distances require high precision for the final landing phase. A sequence of steps, based on a point-to-line distance method, analyzes the contour information and allows the recognition of the target also in cluttered scenarios. The proposed approach fully assists the UAV during its take-off and landing on the target, as it is able to detect anomalous situations, such as the loss of the target from the image field of view, and to precisely evaluate the drone attitude when only a part of the target remains visible in the image plane. Several indoor and outdoor experiments have been carried out to demonstrate the effectiveness, robustness and accuracy of the developed algorithm. The outcomes have proven that our methodology outperforms the current state of the art, providing high accuracy in estimating the position and the orientation of the landing target with respect to the UAV.
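The point-to-line distance named in the abstract is just the perpendicular distance of a contour point from the line through two reference points; contour points whose distance stays near zero are collinear with the segment. A minimal 2D sketch of that test:

```python
from math import hypot

def point_to_line_distance(p, a, b):
    """Perpendicular distance of 2D point p from the line through a and b,
    computed from the cross product of (b - a) and (p - a)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((bx - ax) * (py - ay) - (by - ay) * (px - ax))
    return num / hypot(bx - ax, by - ay)

# e.g. (0, 1) lies one unit from the x-axis through (0, 0) and (1, 0)
```

Scanning a contour with this test makes it cheap to group points into straight segments, which is what lets the target geometry be recognized even when only part of it is visible.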
Mobility and multi-functionality have been recognized as basic requirements for the development of fully automated surveillance systems in realistic scenarios. Nevertheless, problems such as the active control of heterogeneous mobile agents, the integration of information from fixed and moving sensors for high-level scene interpretation, and mission execution remain open. This paper describes recent and current research of the authors concerning the design and implementation of a multi-agent surveillance system using static cameras and mobile robots. The proposed solution takes advantage of a distributed control architecture that allows the agents to autonomously handle general-purpose tasks, as well as more complex surveillance issues. The various agents can either take decisions and act with some degree of autonomy, or cooperate with each other. This paper presents an overview of the system architecture and of the algorithms involved in developing such an autonomous, multi-agent surveillance system.
Background (BG) modelling is a key task in every computer vision system (CVS), independently of the final purpose for which it is designed. Even if many BG approaches exist (for example Mixture of Gaussians or Eigenbackground), they cannot efficiently process real-time videos due to the model complexity and to the high throughput of the video stream. One of the most challenging real-time applications is athletic scene processing because, in this context, there are many critical aspects for defining a BG model: no a priori knowledge of the static scene, sudden illumination changes and many moving objects that slow down the update phase. The aim of this work is to provide an adaptive BG model able to deal with high frame rate videos (>= 100 fps) in real-time processing, suitable for embedding in smart cameras, finding a good compromise between the model complexity and its responsiveness. Real experiments demonstrate that this BG model shows great performance and robustness during the real-time processing of athletic video frames, up to 100 fps. Copyright 2014 ACM.
In this paper, we propose an embedded vision system based on laser profilometry able to get the pose of a vehicle and its relative displacements with reference to the constitutive media of a structured environment. Fundamental equations for laser triangulation are developed and encoded for their actual implementation on an embedded system. It is made of a laser source that projects a line-shaped beam onto the environment and an on-chip camera able to frame the laser light. Images are then sent to the inexpensive Raspberry Pi onboard computer, which is responsible for processing tasks. For the first time, laser profilometry is coupled with the correlation of laser signatures on a low-cost and low-resource processing board for vehicle localization purposes. Several validation tests of the proposed sensor have proven the effectiveness of the system with respect to commercially available sensors such as inductive sensors and standard odometers, which fail when the vehicle crosses path interceptions or its wheels undergo unavoidable slippages. Moreover, further comparisons with other vision-based techniques have also proven the good performances of this embedded system for real-time localization of vehicles.
Person re-identification has increasingly become an interesting task in the computer vision field, especially after the well-known terrorist attacks on the World Trade Center in 2001. Even if video surveillance systems have existed since the early 1950s, the third generation of such systems is a relatively modern topic and refers to systems formed by multiple fixed or mobile cameras - geographically referenced or not - whose information has to be handled and processed by an intelligent system. In the last decade, researchers have focused their attention on the person re-identification task because computers (and so video surveillance systems) can handle a huge amount of data while reducing the time complexity of the algorithms. Moreover, some well-known image processing techniques - e.g. background subtraction - can be embedded directly on cameras, giving modularity and flexibility to the whole system. The aim of this work is to present an appearance-based method for person re-identification that models the chromatic relationship both between different frames and between different areas of the same frame. This approach has been tested against two public benchmark datasets (ViPER and ETHZ), and the experiments demonstrate that person re-identification by means of intra-frame relationships is robust and shows great results in terms of recognition rate.
In this paper, an approach based on the analysis of variance (ANOVA) for the extraction of crop marks from aerial images is improved by means of preliminary analyses and semantic processing of the extracted objects. The paper falls in the field of the digitalization of images for archaeology, assisting expert users in the detection of un-excavated sites. The methodology is improved by a preliminary analysis of local curvatures, able to determine the most suitable direction for the ANOVA formulation. Then, a semantic processing step, based on the knowledge of the shape of the target wide line, is performed to delete false positive detections. Sample analyses are performed on actual images and prove the capability of the method to discriminate the most significant marks, aiding archaeologists in the analysis of huge amounts of data.
In this paper we present a reliable method to derive the differences between indoor environments using the comparison of high-resolution range images. Samples belonging to different acquisitions are first reduced, preserving the topology of the scenes, and then registered in the same reference system through an iterative least-squares algorithm, aided by a deletion mask whose assignment is the removal of implicit errors due to the different points of view of each orthographic acquisition. Finally, the analysis of the exact range measures returns an intuitive difference map that allows the fast detection of the positions of the altered regions within the scenes. Numerical experiments are presented to prove the capability of the method for the comparison of scenes regardless of the resolution of the sensor and the input noise level of the measurements. © 2013 IEEE.
Archaeological trace extraction from aerial or satellite data is a difficult task for automatic algorithms, due to the similarity of the traces to other image artifacts, to their poor boundary information, discontinuities and so on. We propose in this paper a modified region-based active contour approach for archaeological trace identification that overcomes the limits of standard methods, namely the assumptions of region uniformity and of consistent difference with respect to the background. The proposed approach introduces a directional energy model in the minimization of the conventional energy term used in existing active contour approaches. The local trace direction is estimated automatically after an initial unconstrained evolution of the region. Then, an iterative block-based directional procedure has been introduced to limit the application of the modified method to local and adjacent areas and to allow the processing of large images in which the traces may have complex intersections or follow a curved trajectory. Finally, in order to reduce the initialization dependence problem, we propose the use of one seed point for each trace as the initial curve. Tests on the extraction of archaeological traces such as centuriations and ancient roads, visible as crop marks, have demonstrated that the proposed method and the developed MATLAB-based Graphical User Interface (GUI) facilitate unskilled/semi-skilled users in their archaeological trace mapping operations and improve their detection precision. © 2012 Elsevier Ltd. All rights reserved.
This paper considers the problem of detecting archaeological traces in digital aerial images by analyzing the pixel variance over regions around selected points. In order to decide if a point belongs to an archaeological trace or not, its surrounding regions are considered. The one-way ANalysis Of VAriance (ANOVA) is applied several times to detect the differences among these regions; in particular the expected shape of the mark to be detected is used in each region. Furthermore, an effect size parameter is defined by comparing the statistics of these regions with the statistics of the entire population in order to measure how strongly the trace is appreciable. Experiments on synthetic and real images demonstrate the effectiveness of the proposed approach with respect to some state-of-the-art methodologies.
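The one-way ANOVA test described above reduces to comparing between-region and within-region variance: a large F statistic means the regions around the point differ more than their internal noise explains, i.e. a candidate trace. A self-contained sketch of the F computation (the region shapes and the paper's effect-size step are omitted):

```python
import numpy as np

def one_way_anova_F(groups):
    """One-way ANOVA F statistic over pixel regions: ratio of the
    between-group mean square to the within-group mean square."""
    all_x = np.concatenate(groups)
    grand = all_x.mean()
    k, n = len(groups), len(all_x)
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

In practice the F value is compared against an F-distribution critical value for the chosen significance level; here it is enough to see that identical regions give F = 0 while shifted regions give a large F.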
This paper presents a complete framework aimed at the nondestructive inspection of composite materials. Starting from the acquisition, performed with lock-in thermography, the method flows through a set of consecutive blocks of data processing: input enhancement, feature extraction, classification and defect detection. Experimental results prove the capability of the presented methodology to detect the presence of defects underneath the surface of a calibrated specimen made of Glass Fiber Reinforced Polymer (GFRP). Results are also compared with those obtained by other techniques, based on different features and unsupervised learning methods. The comparison further proves that the proposed methodology is able to reduce the number of false positives, while ensuring the exact detection of subsurface defects.
In recent years sport video research has gained steady interest in the scientific community. The large amount of video data available from broadcast transmissions and from dedicated camera setups, and the need to extract meaningful information from these data, pose significant research challenges. Hence, computer vision and machine learning are essential for enabling automated or semi-automated processing of big data in sports. Although sports are diverse enough to present unique challenges on their own, most of them share the need to identify active entities such as the ball or players. In this paper, an innovative deep learning approach to the identification of the ball in the tennis context is presented. The work exploits the potential of a convolutional neural network classifier to decide whether a ball is being observed in a single frame, overcoming the typical issues of classical approaches on long video sequences (e.g. illumination changes and flickering issues). Experiments on real data confirm the validity of the proposed approach, which achieves 98.77% accuracy, and suggest its implementation and integration at a larger scale in more complex vision systems.
In the field of NDT techniques for aeronautic components made of composite materials, the development of automatic and robust approaches for defect detection is largely desirable for both safety and economic reasons. This paper introduces a novel methodology for the automatic analysis of thermal signals resulting from the application of pulsed thermography. Input thermal decays are processed by a proper FIR filter designed to reduce the measurement noise, and then modeled to represent both sound regions and defective ones. Output signals are thus fitted to an exponential model, which approximates thermal contrasts with three robust parameters. These features feed a decision forest, trained to detect discontinuities and characterize their depths. Several experiments on actual sample laminates have proven that the proposed approach increases classification performance with respect to related ones, reducing missed predictions of defective classes.
In this paper, a method to find, exploit and classify ambiguities in the results of a person re-identification (PRID) algorithm is presented. We start from the assumption that ambiguity is implicit in the classical formulation of the re-identification problem, as a specific individual may resemble one or more subjects in the color of their clothes or the shape of their body. Therefore, we propose the introduction of the AMbiguity rAte in REidentification (AMARE) approach, which relates the results of a classical PRID pipeline on a specific dataset to their effectiveness in re-identification terms, exploiting the ambiguity rate (AR). As a consequence, the cumulative matching curves (CMC) used to show the results of a PRID algorithm are filtered according to the AR. The proposed method gives a different interpretation of the output of PRID algorithms, because the CMC curves are processed, split and studied separately. Real experiments demonstrate that the separation of the results is very helpful for better understanding the capabilities of a PRID algorithm.
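A CMC curve simply accumulates, over k, the fraction of probes whose true match appears within the top-k ranks; splitting the probes by an ambiguity score before computing the curves then yields one CMC per group. A sketch in that spirit, where the `ambiguity` score is a hypothetical stand-in for the paper's AR:

```python
import numpy as np

def cmc_curve(rank_of_true_match, gallery_size):
    """Cumulative matching characteristic: entry k-1 is the fraction of
    probes whose true match is ranked within the top k."""
    ranks = np.asarray(rank_of_true_match)
    return np.array([(ranks <= k).mean() for k in range(1, gallery_size + 1)])

def cmc_by_ambiguity(ranks, ambiguity, gallery_size, threshold):
    """Split probes into low/high-ambiguity sets (hypothetical score)
    and report one CMC per set, in the spirit of AMARE."""
    ranks, amb = np.asarray(ranks), np.asarray(ambiguity)
    low, high = ranks[amb <= threshold], ranks[amb > threshold]
    return cmc_curve(low, gallery_size), cmc_curve(high, gallery_size)
```

Comparing the two curves makes visible how much of an algorithm's error budget is spent on intrinsically ambiguous probes.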
In order to perform automatic analysis of sport videos acquired from a multi-sensing environment, it is fundamental to face the problem of automatic football team discrimination. The correct assignment of each player to the relative team is a preliminary task that, together with player detection and tracking algorithms, can strongly affect any high-level semantic analysis. Supervised approaches for object classification require the construction of ad hoc models before the processing and also a manual selection of different player patches belonging to the team classes. The idea of this paper is to collect the player patches coming from six different cameras and, after a pre-processing step based on the CBTF (Cumulative Brightness Transfer Function), to study and compare different unsupervised methods for classification. The pre-processing step based on the CBTF has been implemented in order to mitigate differences in appearance between images acquired by different cameras. We tested three different unsupervised classification algorithms (MBSAS, a sequential clustering algorithm; BCLS, a competitive one; and k-means, a hard-clustering algorithm) on the transformed patches. Results obtained by comparing different sets of features with different classifiers are proposed. Experimental results have been carried out on different real matches of the Italian Serie A.
In this article, we tackle the problem of developing a visual framework to allow the autonomous landing of an unmanned aerial vehicle onto a platform using a single camera. Specifically, we propose a vision-based helipad detection algorithm to estimate the attitude of the drone, on which the camera is fastened, with respect to the target. Since the algorithm should be simple and fast, we implemented a method based on curvatures to detect the heliport marks, that is, the corners of the character H. By knowing the size of the H mark and the actual location of its corners, we are able to compute the homography matrix containing the relative pose information. The effectiveness of our methodology has been proven through controlled indoor and outdoor experiments. The outcomes have shown that the method provides high accuracy in estimating the distance and the orientation of the camera with respect to the visual target, with small errors lower than 1% and 4%, respectively.
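The geometric core of this pipeline, mapping the known planar layout of the H-mark corners to their image positions, is a planar homography, estimable from four or more correspondences with the direct linear transform. A numpy sketch of the DLT step only; the subsequent decomposition of H into the relative pose is omitted:

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the 3x3 homography mapping planar points src -> dst via
    the direct linear transform (needs >= 4 point correspondences).
    Each correspondence contributes two rows of the homogeneous system
    A h = 0, solved by the smallest right singular vector of A."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]          # fix the projective scale
```

Given the physical size of the H mark, the estimated H together with the camera intrinsics yields the relative rotation and translation, which is what the landing controller needs.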
A high-resolution vision system for the inspection of drilling tools is presented. A triangulation-based laser scanner is used to extract a three-dimensional model of the target, aimed at the fast detection and characterization of surface defects. The use of two orthogonal calibrated handling systems allows the achievement of precision of the order of a few microns in the whole testing volume and the prevention of self-occlusions induced on the undercut surfaces of the tool. Point cloud registration is also derived analytically to increase the strength of the measurement scheme, whereas proper filters are used to delete samples whose quality is below a reference threshold. Experimental tests are performed on calibrated spheres and different-sized tools, proving the capability of the presented setup to entirely reconstruct complex targets with maximum absolute errors, between the estimated distances and the corresponding nominal values, below 12 µm.
High resolution in distance (range) measurements can be achieved by means of accurate instrumentation and precise analytical models. This paper reports an improvement in the estimation of distance measurements performed by an omnidirectional range sensor already presented in the literature. This sensor exploits the principle of laser triangulation, together with the advantages brought by catadioptric systems, which allow the reduction of the sensor size without decreasing the resolution. Starting from a known analytical model in two dimensions (2D), the paper shows the development of a fully 3D formulation where all initial constraints are removed to gain in measurement accuracy. Specifically, the ray projection problem is solved by considering that both the emitter and the receiver have general poses in a global system of coordinates. Calibration is thus performed to estimate their poses and compensate for any misalignment with respect to the 2D approximation. Results prove an increase in the measurement accuracy due to the more general formulation of the problem, with a remarkable decrease in the uncertainty.
Pulsed thermography has been used for many years to investigate the presence of subsurface defects in composite materials for aeronautics. Several methods have been proposed, but only a few of them include a completely automated approach for effective defect characterization. This paper presents a novel method which approximates the thermal decays on the laminate surface, induced by a short heat pulse, by means of an exponential model in three unknowns (the model parameters), estimated in the least squares sense. These parameters are discriminant and noise-insensitive features used to feed several classifiers, which are trained to label possible defects according to their depths. Experimental tests have been performed on a carbon-fiber reinforced polymer (CFRP) laminate having four inclusions of known properties. The comparative analysis of the proposed classifiers has demonstrated that the best results are achieved by a decision forest made of 30 trees. In this case the mean values of standard and balanced accuracy reach 99.47% and 86.9%, whereas precision and recall are 89.87% and 73.67%, respectively.
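A three-unknown exponential decay T(t) ≈ a·exp(-b·t) + c can be fitted in the least squares sense by a separable strategy: for each candidate decay rate b, the amplitude a and offset c follow from a linear solve, and the b with the smallest residual wins. This is one plausible scheme, sketched under the assumed model form; the paper's own estimation procedure may differ:

```python
import numpy as np

def fit_exp_decay(t, y, b_grid=np.linspace(0.01, 5.0, 500)):
    """Fit y ~ a*exp(-b*t) + c by separable least squares: scan b over a
    grid, solving for (a, c) linearly at each step, and keep the best."""
    best = None
    for b in b_grid:
        X = np.column_stack([np.exp(-b * t), np.ones_like(t)])
        (a, c), *_ = np.linalg.lstsq(X, y, rcond=None)
        r = ((X @ np.array([a, c]) - y) ** 2).sum()
        if best is None or r < best[3]:
            best = (a, b, c, r)
    return best[:3]
```

The fitted triple (a, b, c) is exactly the kind of compact, noise-insensitive per-pixel feature vector that can feed a classifier for depth labelling.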
This paper describes a complete method for monitoring indoor environments. Three-dimensional (3D) point clouds are first acquired from the environment under investigation by means of a laser range scanner, in order to obtain several 3D models to be compared. Input datasets are thus registered with each other exploiting a reliable variant of the iterative closest point (ICP) algorithm, assisted by the use of deletion masks. These masks work in cooperation with the resampling of the model surfaces to significantly reduce the errors in the estimation of the registration parameters. Once the datasets are registered, deformation maps are displayed to help the user detect changes within the environment. Deletion masks are again used to filter measurement artifacts from the comparison, thus highlighting only the actual alterations of the environment. Several experiments are performed for the analysis of an indoor environment, proving the capability of the proposed method to reliably estimate the presence of alterations.
People tracking is a central problem in the development of intelligent surveillance systems. When multiple cameras are used, the problem becomes more challenging because people re-identification is needed. Humans can greatly change their appearance according to posture, clothing and lighting conditions, so defining features that describe people moving in large scenarios is a complex task. In this paper the problem of people re-identification and tracking is reviewed. The most widely used methodologies are discussed, and insight into open problems and future research directions is provided. © 2012 IEEE.
This paper tackles the problem of people re-identification by using soft biometric features. The method works on RGB-D data (color point clouds) to determine the best match among a database of possible users. For each subject under testing, three-dimensional skeletal information is used to regularize the pose and to create a skeleton standard posture (SSP). A partition grid, whose size depends on the SSP, groups the samples of the point cloud according to their position. Every group is then studied to build the person's signature. The same grid is then used for the other subjects of the database to preserve information about possible shape differences among users. The effectiveness of this novel method has been tested on three public datasets. Numerical experiments demonstrate an improvement over the current state of the art, with recognition rates of 97.84% (on a partition of BIWI RGBD-ID), 61.97% (KinectREID) and 89.71% (RGBD-ID), respectively.
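The partition-grid signature idea can be illustrated with a fixed occupancy grid over a normalized cloud. The grid sizes, the unit-box normalization and the cosine matching below are assumed simplifications, not the paper's SSP-based procedure:

```python
import numpy as np

def grid_signature(points, bins=(4, 8, 4)):
    """Occupancy histogram over a fixed partition grid: normalize the cloud
    to the unit box, count samples per cell, L2-normalize the result."""
    p = (points - points.min(0)) / (np.ptp(points, 0) + 1e-9)
    hist, _ = np.histogramdd(p, bins=bins, range=[(0, 1)] * 3)
    sig = hist.ravel()
    return sig / (np.linalg.norm(sig) + 1e-9)

def best_match(sig, gallery):
    """Index of the gallery signature with highest cosine similarity."""
    return int(np.argmax([sig @ g for g in gallery]))

rng = np.random.default_rng(2)
subj_a = rng.normal(0.0, 1.0, (500, 3))               # two synthetic 'body' clouds
subj_b = rng.uniform(-1.0, 1.0, (500, 3))
probe = subj_a + rng.normal(0.0, 0.01, subj_a.shape)  # noisy re-observation of A
gallery = [grid_signature(subj_a), grid_signature(subj_b)]
idx = best_match(grid_signature(probe), gallery)
```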
In this paper we present a natural human-computer interface based on gesture recognition. The principal aim is to study how different personalized gestures, defined by users, can be represented in terms of features and can be modelled by classification approaches in order to obtain the best performance in gesture recognition. Ten different gestures involving the movement of the left arm are performed by different users. Different classification methodologies (SVM, HMM, NN, and DTW) are compared and their performances and limitations are discussed. An ensemble of classifiers is proposed to produce more favorable results compared to those of a single-classifier system. The problems concerning different lengths of gesture executions, variability in their representations, and the generalization ability of the classifiers have been analyzed, and valuable insight into possible recommendations is provided.
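Among the compared methodologies, DTW directly addresses the problem of different gesture-execution lengths. A minimal sketch of DTW-based nearest-template classification follows; the synthetic 'wave' and 'swipe' trajectories are assumed examples, not the paper's ten arm gestures:

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping distance between two feature sequences;
    tolerant to executions of different lengths."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify(query, templates):
    """1-NN classification: label of the template with smallest DTW distance."""
    return min(templates, key=lambda label: dtw(query, templates[label]))

t40 = np.linspace(0.0, 1.0, 40)
t55 = np.linspace(0.0, 1.0, 55)               # same gesture, longer execution
templates = {
    "wave":  np.stack([t40, np.sin(4 * np.pi * t40)], 1),
    "swipe": np.stack([t40, np.zeros_like(t40)], 1),
}
query = np.stack([t55, np.sin(4 * np.pi * t55)], 1)
label = classify(query, templates)
```

Despite the query being 55 samples long against 40-sample templates, the warping path still aligns it with the matching class.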
In recent years, smart surveillance has been one of the most active research topics in computer vision because of the wide spectrum of promising applications. Its main point is the use of automatic video-analysis technologies for surveillance purposes. In general, a processing framework for smart surveillance consists of a preliminary motion-detection step in combination with high-level reasoning that allows automatic understanding of the evolution of observed scenes. In this paper, we propose a surveillance framework based on a set of reliable visual algorithms that perform different tasks: a motion-analysis approach that segments foreground regions is followed by three procedures, which perform object tracking, homographic transformations and edge matching, in order to achieve real-time monitoring of forbidden areas and the detection of abandoned or removed objects. Several experiments have been performed on different real image sequences acquired from a Messapic museum (indoor context) and the nearby archaeological site (outdoor context) to demonstrate the effectiveness and the flexibility of the proposed approach.
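The preliminary motion-detection step can be illustrated with a classic running-average background model plus thresholding. This baseline is an assumed simplification; the paper's own motion-analysis approach is more elaborate:

```python
import numpy as np

def update_background(bg, frame, alpha=0.05):
    """Running-average model: slowly absorb gradual illumination changes."""
    return (1.0 - alpha) * bg + alpha * frame

def foreground_mask(bg, frame, thresh=25.0):
    """Pixels deviating from the background beyond `thresh` are foreground."""
    return np.abs(frame - bg) > thresh

rng = np.random.default_rng(3)
bg = rng.uniform(80.0, 120.0, (48, 64))       # synthetic static scene
frame = bg + rng.normal(0.0, 2.0, bg.shape)   # camera noise only
frame[10:20, 30:40] += 100.0                  # a bright object enters the scene
mask = foreground_mask(bg, frame)
bg = update_background(bg, frame)             # keep the model up to date
coverage = mask[10:20, 30:40].mean()          # fraction of object pixels detected
```

The resulting foreground blobs would then feed the tracking, homography and edge-matching stages.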
Computer vision is steadily gaining importance in many research fields as its applications expand from traditional domains, such as situation analysis and scene understanding in video surveillance, to other scenarios. The sports context can represent a perfect test bed for many machine vision algorithms because of the large availability of visual data brought by widespread cameras on a relatively high number of courts. In this paper we introduce a tennis ball detection and tracking method that exploits domain knowledge to effectively recognize ball positions and trajectories. A peculiarity of this approach is that it starts from a sparse but cluttered point cloud that evolves over time, essentially working on 3D samples only. Experiments on real data demonstrate the effectiveness of the algorithm in terms of tracking accuracy and path-following capability.
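One way to exploit domain knowledge about ball motion is to constrain candidate 3D samples to a ballistic (degree-2 polynomial) trajectory. The least-squares fit below is an illustrative sketch under that assumption, not the paper's algorithm:

```python
import numpy as np

def fit_trajectory(times, pts):
    """Fit each 3D coordinate with a degree-2 polynomial (ballistic model);
    returns a predictor usable for interpolation or short-term prediction."""
    coeffs = [np.polyfit(times, pts[:, k], 2) for k in range(3)]
    return lambda t: np.array([np.polyval(c, t) for c in coeffs])

t = np.linspace(0.0, 1.0, 15)
g = 9.81
true = np.stack([10.0 * t, 2.0 * t, 5.0 * t - 0.5 * g * t ** 2], 1)
rng = np.random.default_rng(4)
samples = true + rng.normal(0.0, 0.01, true.shape)   # noisy 3D ball detections
predict = fit_trajectory(t, samples)
p_half = predict(0.5)                                # position at t = 0.5 s
```

Samples that deviate strongly from the fitted path could then be rejected as clutter.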
This paper analyzes from a new perspective the recent state of the art in gesture recognition approaches that exploit both RGB and depth data (RGB-D images). The most relevant papers have been analyzed to point out which features and classifiers work best with depth data, whether these fundamentals are specifically designed to process RGB-D images and, above all, how depth information can improve gesture recognition beyond the limits of standard approaches based solely on color images. Papers have been reviewed in depth to find the relation between gesture complexity and the suitability of features and methodologies. Different types of gestures are discussed, focusing attention on the kind of datasets (public or private) used to compare results, in order to understand whether they provide a good representation of actual challenging problems, such as gesture segmentation, idle gesture recognition, and gesture length invariance. Finally, the paper discusses the current open problems and highlights future directions of research in the field of RGB-D data processing for gesture recognition.
Tennis player silhouette extraction is a preliminary step that is fundamental for any behavior-analysis processing. Automatic systems for the evaluation of player tactics, in terms of position on the court, postures during the game and types of strokes, are highly desired by coaches and for training purposes. These systems require accurate segmentation of players in order to apply posture analysis and high-level semantic analysis. Background subtraction algorithms have been largely used in the sports context when fixed cameras are employed. In this paper an innovative background subtraction algorithm is presented, which has been adapted to the tennis context and allows high precision in player segmentation and completeness of the extracted silhouettes. The algorithm is able to achieve interactive frame rates of up to 30 frames per second and is suitable for embedding in smart cameras. Real experiments demonstrate that the proposed approach is suitable for tennis contexts.
Non-destructive testing is essential for the thorough assessment of production processes of complex materials, such as composites. This paper presents a complete algorithm to detect subsurface defects, e.g. extended delaminations or local resin pockets, by comparing the outputs produced by lock-in thermography for the inspection of pristine master samples and the current samples under testing. Lock-in thermography produces amplitude and phase maps. Focusing on amplitudes, the datasets are first made comparable in both magnitude span and spatial position by exploiting image normalization and alignment. Then, local patches in actual correspondence are cross-correlated to further improve their alignment and to estimate a similarity measurement. Differences in thermal behavior detected by the proposed processing reveal subsurface defects. These outcomes have also been confirmed by experimental investigations performed on a carbon-fiber-reinforced polymer (CFRP) T-joint.
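The patch-similarity step can be sketched with zero-mean normalized cross-correlation (NCC) between corresponding amplitude patches. The patch size and the simulated defect below are assumptions made for illustration:

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two amplitude patches:
    ~1.0 for matching thermal behavior, lower values flag possible defects."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
    return float((a * b).sum() / denom)

rng = np.random.default_rng(5)
master = rng.uniform(0.0, 1.0, (16, 16))              # patch from the pristine master
good = master + rng.normal(0.0, 0.01, master.shape)   # healthy sample patch
bad = master.copy()
bad[4:12, 4:12] *= 0.3                                # simulated delamination signature
s_good, s_bad = ncc(master, good), ncc(master, bad)
```

Patches whose similarity falls below a calibrated threshold would be reported as candidate subsurface defects.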
The present invention addresses the problem of the automatic detection of events on the sports field, in particular Goal/NoGoal events, which are signalled to the match management, which can autonomously take the final decision upon the event. The system is not invasive for the field structures, nor does it require interrupting the game or modifying its rules; it only aims at objectively detecting the occurrence of the event and at supporting the referees' decisions by means of specific signalling of the detected events.
The present invention relates to a system for detecting and classifying events during motion actions, in particular the "offside" event in football. The system determines such an event in a real-time, semi-automatic context, taking into account the variability of the environmental conditions and of the dynamics of the events that can be traced back to offside. The present invention employs a non-invasive technique, compatible with the usual course of the match.