Data Science


Data Analysis in Sensor Networks

DaSAS researchers develop statistical models for short-term forecasting and incident/anomaly detection in sensor networks, with emphasis on real-time algorithms for vehicular networks. The methodologies developed in the Lab include nonlinear parametric time series models, quantile regression, penalized/shrinkage estimators and regime-switching forecast combination schemes. Depending on the application, either a frequentist or a Bayesian approach can be implemented.
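
As a small illustration of the quantile regression idea mentioned above, the sketch below computes the pinball (check) loss, the standard objective minimized when fitting a conditional quantile. The function name `pinball_loss` and the example numbers are hypothetical and chosen for illustration only; this is not the Lab's implementation.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average pinball (check) loss for quantile level tau in (0, 1).

    Under-predictions are weighted by tau and over-predictions by (1 - tau),
    so minimizing this loss targets the tau-th conditional quantile.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for observed, predicted in zip(y_true, y_pred):
        diff = observed - predicted
        # diff >= 0: forecast too low; diff < 0: forecast too high.
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


if __name__ == "__main__":
    # Hypothetical traffic-flow observations and 90% quantile forecasts.
    observed = [120.0, 135.0, 150.0]
    forecast = [130.0, 130.0, 160.0]
    print(pinball_loss(observed, forecast, tau=0.9))
```

With a high quantile level such as 0.9, a forecast that falls below the observation is penalized nine times more heavily than one that exceeds it by the same amount.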

Quantitative methods in Biomedical Engineering

DaSAS researchers have been involved in a wide variety of projects related to Biomedical Engineering. Example applications include: a) predictive models for pediatric heart transplantation donor-recipient size matching; b) spatial regression models for the association of thrombus growth with hemodynamic variables in aneurysms; and c) longitudinal, mixed-effects specifications for the evaluation of feeding practices for high-risk, very low birthweight infants.

Fractional Land Cover

Fractional land cover mapping is needed to capture urban fabric and surface materials using satellite imagery with a spatial resolution of the order of 10 m x 10 m. To achieve robust material base-mapping, the SVM (Support Vector Machine) method was extended so that it can be used for both land cover classification (pixel scale) and spectral unmixing (sub-pixel scale), and so that it can combine several different kernels for individual classes. New spheroid and ellipsoid kernels were also developed, which in many cases work better than the standard kernels (linear, RBF, polynomial) in Earth Observation data analysis.
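
The idea of combining several kernels can be sketched generically: a non-negative weighted sum of valid kernels is itself a valid kernel. The example below blends the standard linear and RBF kernels only. It does not reproduce the spheroid or ellipsoid kernels described above, and all function names, weights and sample values are hypothetical.

```python
import math


def linear_kernel(x, y):
    """Standard linear kernel: the dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=1.0):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def combined_kernel(x, y, weights=(0.5, 0.5), gamma=1.0):
    """Non-negative weighted sum of the linear and RBF kernels."""
    w_linear, w_rbf = weights
    if w_linear < 0 or w_rbf < 0:
        raise ValueError("kernel weights must be non-negative")
    return w_linear * linear_kernel(x, y) + w_rbf * rbf_kernel(x, y, gamma)


def gram_matrix(samples, kernel):
    """Pairwise kernel values, e.g. for an SVM that accepts a precomputed kernel."""
    return [[kernel(a, b) for b in samples] for a in samples]


if __name__ == "__main__":
    # Hypothetical two-band reflectance vectors for three pixels.
    pixels = [[0.10, 0.30], [0.12, 0.28], [0.40, 0.05]]
    for row in gram_matrix(pixels, combined_kernel):
        print([round(value, 4) for value in row])
```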

Global albedo trends from Earth Observation

Surface albedo is important because it strongly affects the Earth’s energy budget and is used in a variety of scientific fields. Satellite sensors like MODIS measure the radiation reflected from the Earth, but daily nadir acquisitions are not enough to estimate the albedo because of surface anisotropy. Multi-date observations account for the multiple viewing angles. The MODIS bi-directional reflectance product contains three-dimensional data sets with parameters to model the directional-hemispherical (black-sky) and bi-hemispherical (white-sky) albedo, which are combined to give the real (blue-sky) albedo. To estimate trends for a 15-year time series of blue-sky albedo globally, we need to loop over 24 solar zenith angles within a day, times 3 billion pixels, times 45 products per year, times 15 years. The Google Earth Engine allows for this kind of Big Data analysis to derive snow-free land surface albedo trends at 500 m x 500 m scale, for any area of the globe.
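
To give a sense of the scale involved, the sketch below multiplies out the figures quoted above. It also shows one commonly used way of combining black-sky and white-sky albedo into blue-sky albedo: a linear blend weighted by the diffuse fraction of skylight. The blend formula, the function names and the example values are assumptions for illustration; the text does not state the exact combination used.

```python
# Figures quoted in the text above.
SOLAR_ZENITH_ANGLES_PER_DAY = 24
PIXELS = 3_000_000_000
PRODUCTS_PER_YEAR = 45
YEARS = 15


def total_evaluations():
    """Number of per-pixel albedo evaluations implied by the quoted figures."""
    return SOLAR_ZENITH_ANGLES_PER_DAY * PIXELS * PRODUCTS_PER_YEAR * YEARS


def blue_sky_albedo(black_sky, white_sky, diffuse_fraction):
    """Blend black-sky and white-sky albedo by the diffuse skylight fraction.

    Assumed formula: (1 - D) * black_sky + D * white_sky, with D in [0, 1].
    """
    if not 0.0 <= diffuse_fraction <= 1.0:
        raise ValueError("diffuse_fraction must lie in [0, 1]")
    return (1.0 - diffuse_fraction) * black_sky + diffuse_fraction * white_sky


if __name__ == "__main__":
    print(f"evaluations: {total_evaluations():.2e}")
    # Hypothetical values for a single pixel and solar zenith angle.
    print(blue_sky_albedo(black_sky=0.15, white_sky=0.18, diffuse_fraction=0.2))
```

The product comes to roughly 4.9 x 10^13 evaluations, which is why a cloud platform is used rather than local processing.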

Global Land Surface Temperature

The land surface temperature (LST) is an important parameter for environmental studies and enables the monitoring of landscape processes and responses (energy budget, water budget etc.). By taking advantage of the latest cloud computing and database technologies of Google Earth Engine (GEE), it is possible to automatically access and process Big Data from satellites on the fly. Therefore, by combining the necessary thermal infrared observations from the Landsat 5, 7 and 8 satellites for more than 30 years, after on-the-fly emissivity and atmospheric corrections, it is possible to instantly provide LST maps, at a spatial resolution of 30 m x 30 m, for any area of interest globally.

Statistical Models for Remotely Sensed Data

DaSAS researchers collaborate with RSLab to construct spatiotemporal statistical models for remotely sensed data. Example applications include: a) data fusion of measurements with different spatial supports to produce maps of pollution (e.g. PM10) or rainfall totals; and b) robust subpixel classification to uncover fractions of soil, vegetation and impervious surfaces within each pixel.

Evaluation and combination of Regional Climate model outputs

The evaluation of alternative parameterizations of Regional Climate Models and their combination using spatially varying weights are complex tasks that require estimation of space-time models with spatially varying coefficients. DaSAS researchers perform such analyses by combining Moran eigenvector filtering techniques with modern screening algorithms and penalized estimators.

Time Series models for high-frequency emissions rates

DaSAS researchers develop predictive models for high-frequency emissions rates from field experiments. Such real-life investigations aim to evaluate different types of fuels, filters, engines etc. The quantitative analyses undertaken are frequently based on parametric, nonlinear time series models, quantile regression and extreme value theory.

Underwater ambient noise and Soundscape data

Underwater measurements of ambient noise are performed at specific areas in the sea. The data collected are formatted, analyzed, stored and statistically processed. The data can be used by the competent authorities in the framework of the MSFD (EU Marine Strategy Framework Directive). In addition, continuous monitoring of specific coastal areas, which records the underwater soundscape and provides real-time data, is of interest to local authorities and coastal economic activities.

Seismic data

The seismological station situated on the premises of IACM-FORTH monitors seismic activity in the broader area of the Eastern Mediterranean. Seismic data are collected and analyzed continuously. The station is part of the National Seismological Network (NSN). Therefore, information about the station and its operation is available on the World Wide Web. In addition, data are sent over the network and can be used in conjunction with data from other stations of the NSN to determine the epicenter and magnitude of earthquakes.