Title: | RFSI & STRK Interpolation for Meteo and Environmental Variables |
---|---|
Description: | Random Forest Spatial Interpolation (RFSI, Sekulić et al. (2020) <doi:10.3390/rs12101687>) and spatio-temporal geostatistical (spatio-temporal regression Kriging (STRK)) interpolation for meteorological (Kilibarda et al. (2014) <doi:10.1002/2013JD020803>, Sekulić et al. (2020) <doi:10.1007/s00704-019-03077-3>) and other environmental variables. Contains global spatio-temporal models calculated using publicly available data. |
Authors: | Milan Kilibarda [aut]
|
Maintainer: | Aleksandar Sekulić <[email protected]> |
License: | GPL (>= 2.0) | file LICENCE |
Version: | 2.0-3 |
Built: | 2025-03-01 04:53:00 UTC |
Source: | https://github.com/aleksandarsekulic/rmeteo |
Calculates classification and regression accuracy metrics for given coresponding observation and prediction vectors.
acc.metric.fun(obs, pred, acc.m)
acc.metric.fun(obs, pred, acc.m)
obs |
|
pred |
|
acc.m |
|
Accuracy metric value.
Aleksandar Sekulic [email protected]
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation.Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
acc.metric.fun
rfsi
pred.rfsi
tune.rfsi
cv.rfsi
pred.strk
cv.strk
library(sp) library(sf) library(CAST) library(ranger) library(plyr) library(meteo) # preparing data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # making tgrid n.obs <- 1:3 min.node.size <- 2:10 sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement splitrule <- "variance" ntree <- 250 # 500 mtry <- 3:(2+2*max(n.obs)) tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree, mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction) # do cross-validation rfsi_cv <- cv.rfsi(formula=fm.RFSI, # without nearest obs data = data, zero.tol=0, tgrid = tgrid, # combinations for tuning tgrid.n = 5, # number of randomly selected combinations from tgrid for tuning tune.type = "LLO", # Leave-Location-Out CV k = 5, # number of folds seed = 42, acc.metric = "RMSE", # R2, CCC, MAE output.format = "data.frame", cpus=2, # detectCores()-1, progress=1, importance = "impurity") summary(rfsi_cv) # accuracy metric calculation acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "R2") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "RMSE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "NRMSE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "MAE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "NMAE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "CCC")
library(sp) library(sf) library(CAST) library(ranger) library(plyr) library(meteo) # preparing data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # making tgrid n.obs <- 1:3 min.node.size <- 2:10 sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement splitrule <- "variance" ntree <- 250 # 500 mtry <- 3:(2+2*max(n.obs)) tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree, mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction) # do cross-validation rfsi_cv <- cv.rfsi(formula=fm.RFSI, # without nearest obs data = data, zero.tol=0, tgrid = tgrid, # combinations for tuning tgrid.n = 5, # number of randomly selected combinations from tgrid for tuning tune.type = "LLO", # Leave-Location-Out CV k = 5, # number of folds seed = 42, acc.metric = "RMSE", # R2, CCC, MAE output.format = "data.frame", cpus=2, # detectCores()-1, progress=1, importance = "impurity") summary(rfsi_cv) # accuracy metric calculation acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "R2") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "RMSE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "NRMSE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "MAE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "NMAE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "CCC")
Function for nested k-fold cross-validation function for Random Forest Spatial Interpolation (RFSI) (Sekulić et al. 2020). It is based on rfsi, pred.rfsi, and tune.rfsi functions. Currently, only spatial (leave-location-out) cross-validation is implemented. Temporal and spatio-temporal cross-validation will be implemented in the future.
cv.rfsi(formula, data, data.staid.x.y.z = NULL, use.idw = FALSE, s.crs = NA, p.crs = NA, tgrid, tgrid.n=10, tune.type = "LLO", k = 5, seed=42, out.folds, in.folds, acc.metric, output.format = "data.frame", cpus = detectCores()-1, progress = 1, soil3d = FALSE, no.obs = 'increase', ...)
cv.rfsi(formula, data, data.staid.x.y.z = NULL, use.idw = FALSE, s.crs = NA, p.crs = NA, tgrid, tgrid.n=10, tune.type = "LLO", k = 5, seed=42, out.folds, in.folds, acc.metric, output.format = "data.frame", cpus = detectCores()-1, progress = 1, soil3d = FALSE, no.obs = 'increase', ...)
formula |
formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If |
data |
sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for making an RFSI model. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest obsevrations and distances to them as covariates ( |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
use.idw |
boolean; IDW prediction as covariate - will IDW predictions from |
s.crs |
st_crs or crs; Source CRS of |
p.crs |
st_crs or crs; Projection CRS for |
tgrid |
data.frame; Possible tuning parameters for nested folds. The column names are same as the tuning parameters. Possible tuning parameters are: |
tgrid.n |
numeric; Number of randomly chosen |
tune.type |
character; Type of nested cross-validation: leave-location-out ("LLO"), leave-time-out ("LTO") - TO DO, and leave-location-time-out ("LLTO") - TO DO. Default is "LLO". |
k |
numeric; Number of random outer and inner folds (i.e. for cross-validation and nested tuning) that will be created with CreateSpacetimeFolds function. Default is 5. |
seed |
numeric; Random seed that will be used to generate outer and inner folds with CreateSpacetimeFolds function. |
out.folds |
numeric or character vector or value; Showing outer folds column (if value) or rows (vector) of |
in.folds |
numeric or character vector or value; Showing innner folds column (if value) or rows (vector) of |
acc.metric |
character; Accuracy metric that will be used as a criteria for choosing an optimal RFSI model in nested tuning. Possible values for regression: "ME", "MAE", "NMAE", "RMSE" (default), "NRMSE", "R2", "CCC". Possible values for classification: "Accuracy","Kappa" (default), "AccuracyLower", "AccuracyUpper", "AccuracyNull", "AccuracyPValue", "McnemarPValue". |
output.format |
character; Format of the output, data.frame (default), sf-class, sftime-class, or SpatVector-class. |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
progress |
numeric; If progress bar is shown. 0 is no progress bar, 1 is outer folds results, 2 is + innner folds results, 3 is + prediction progress bar. Default is 1. |
soil3d |
logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the |
no.obs |
character; Possible values are |
... |
Further arguments passed to ranger. |
A data.frame, sf-class, sftime-class, or SpatVector-class object (depends on output.format
argument), with columns:
obs |
Observations. |
pred |
Predictions from cross-validation. |
folds |
Folds used for cross-validation. |
Aleksandar Sekulic [email protected]
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
near.obs
rfsi
pred.rfsi
tune.rfsi
library(CAST) library(doParallel) library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # making tgrid n.obs <- 1:6 min.node.size <- 2:10 sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement splitrule <- "variance" ntree <- 250 # 500 mtry <- 3:(2+2*max(n.obs)) tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree, mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction) # Cross-validation of RFSI rfsi_cv <- cv.rfsi(formula=fm.RFSI, # without nearest obs data = data, tgrid = tgrid, # combinations for tuning tgrid.n = 2, # number of randomly selected combinations from tgrid for tuning tune.type = "LLO", # Leave-Location-Out CV k = 5, # number of folds seed = 42, acc.metric = "RMSE", # R2, CCC, MAE output.format = "sf", # "data.frame", # "SpatVector", cpus=2, # detectCores()-1, progress=1, importance = "impurity") # ranger parameter summary(rfsi_cv) rfsi_cv$dif <- rfsi_cv$obs - rfsi_cv$pred plot(rfsi_cv["dif"]) plot(rfsi_cv[, , "obs"]) acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "R2") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "RMSE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "MAE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "CCC")
library(CAST) library(doParallel) library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # making tgrid n.obs <- 1:6 min.node.size <- 2:10 sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement splitrule <- "variance" ntree <- 250 # 500 mtry <- 3:(2+2*max(n.obs)) tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree, mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction) # Cross-validation of RFSI rfsi_cv <- cv.rfsi(formula=fm.RFSI, # without nearest obs data = data, tgrid = tgrid, # combinations for tuning tgrid.n = 2, # number of randomly selected combinations from tgrid for tuning tune.type = "LLO", # Leave-Location-Out CV k = 5, # number of folds seed = 42, acc.metric = "RMSE", # R2, CCC, MAE output.format = "sf", # "data.frame", # "SpatVector", cpus=2, # detectCores()-1, progress=1, importance = "impurity") # ranger parameter summary(rfsi_cv) rfsi_cv$dif <- rfsi_cv$obs - rfsi_cv$pred plot(rfsi_cv["dif"]) plot(rfsi_cv[, , "obs"]) acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "R2") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "RMSE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "MAE") acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "CCC")
k-fold cross-validation function for spatio-temporal regression kriging based on pred.strk. Currently, only spatial (leave-location-out) cross-validation is implemented. Temporal and spatio-temporal cross-validation will be implemented in the future.
cv.strk(data, obs.col=1, data.staid.x.y.z = NULL, crs = NA, zero.tol=0, reg.coef, vgm.model, sp.nmax=20, time.nmax=2, type = "LLO", k = 5, seed = 42, folds, refit = TRUE, output.format = "STFDF", parallel.processing = FALSE, pp.type = "snowfall", cpus=detectCores()-1, progress=TRUE, ...)
cv.strk(data, obs.col=1, data.staid.x.y.z = NULL, crs = NA, zero.tol=0, reg.coef, vgm.model, sp.nmax=20, time.nmax=2, type = "LLO", k = 5, seed = 42, folds, refit = TRUE, output.format = "STFDF", parallel.processing = FALSE, pp.type = "snowfall", cpus=detectCores()-1, progress=TRUE, ...)
data |
STFDF-class, STSDF-class, STIDF-class, sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates in space and time used to perform STRK cross validation. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs), and covariates (cov1, cov2, ...). Covariate names should be the same as in the |
obs.col |
numeric or character; Column name or number showing position of the observation column in the |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component - time, depth (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
crs |
st_crs or crs; Source CRS of |
zero.tol |
numeric; A distance value below (or equal to) which locations are considered as duplicates. Default is 0. See rm.dupl. Duplicates are removed to avoid singular covariance matrices in kriging. |
reg.coef |
numeric; Vector of named linear regression coefficients. Names of the coefficients (e.g. "Intercept", "temp_geo", "modis", "dem", "twi") will be used to match appropriate covariates from |
vgm.model |
StVariogramModel list; Spatio-temporal variogram of regression residuals (or observations if spatio-temporal ordinary kriging). See vgmST. Spatio-temporal variogram model on residuals for metorological variables (temperature, precipitation, etc.) can be taken from data(tvgms) or can be specified by the user as a vgmST object. |
sp.nmax |
numeric; A number of spatially nearest observations that should be used for kriging predictions. If |
time.nmax |
numeric; A number of temporally nearest observations that should be used for kriging predictions Deafult is 2. |
type |
character; Type of cross-validation: leave-location-out ("LLO"), leave-time-out ("LTO"), and leave-location-time-out ("LLTO"). Default is "LLO". "LTO" and "LLTO" are not implemented yet. Will be in the future. |
k |
numeric; Number of random folds that will be created with CreateSpacetimeFolds function. Default is 5. |
seed |
numeric; Random seed that will be used to generate outer and inner folds with CreateSpacetimeFolds function. |
folds |
numeric or character vector or value; Showing folds column (if value) or rows (vector) of |
refit |
logical; If refit of linear regression trend and spatio-teporal variogram should be performed. Spatio-teporal variogram is fit using |
output.format |
character; Format of the output, STFDF-class (default), STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, or SpatVector-class. |
parallel.processing |
logical; If parallel processing is performed. Default is FALSE. |
pp.type |
character; Type (R package) of parallel processing, "snowfall" (default) or "doParallel". |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
progress |
logical; If progress bar is shown. Default is TRUE. |
... |
A STFDF-class (default), STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, or SpatVector-class object (depends on output.format
argument), with columns:
obs |
Observations. |
pred |
Predictions from cross-validation. |
folds |
Folds used for cross-validation. |
For accuracy metrics see acc.metric.fun function.
Aleksandar Sekulic [email protected], Milan Kilibarda [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.
acc.metric.fun
pred.strk
tregcoef
tvgms
regdata
meteo2STFDF
tgeom2STFDF
library(sp) library(spacetime) library(gstat) library(plyr) library(CAST) library(doParallel) library(ranger) # preparing data data(dtempc) data(stations) data(regdata) # covariates, made by mete2STFDF function regdata@sp@proj4string <- CRS('+proj=longlat +datum=WGS84') data(tvgms) # ST variogram models data(tregcoef) # MLR coefficients lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46 serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), c(latmin,latmin,latmax,latmax)) st = stations[ serbia!=0, ] # stations in Serbia approx. crs = CRS('+proj=longlat +datum=WGS84') # create STFDF stfdf <- meteo2STFDF(obs = dtempc, stations = st, crs = crs) # Cross-validation for mean temperature for days "2011-07-05" and "2011-07-06" # global model is used for regression and variogram # Overlay observations with covariates time <- index(stfdf@time) covariates.df <- as.data.frame(regdata) names_covar <- names(tregcoef[[1]])[-1] for (covar in names_covar){ nrowsp <- length(stfdf@sp) regdata@sp=as(regdata@sp,'SpatialPixelsDataFrame') ov <- sapply(time, function(i) if (covar %in% names(regdata@data)) { if (as.Date(i) %in% as.Date(index(regdata@time))) { over(stfdf@sp, as(regdata[, i, covar], 'SpatialPixelsDataFrame'))[, covar] } else { rep(NA, length(stfdf@sp)) } } else { over(stfdf@sp, as(regdata@sp[covar], 'SpatialPixelsDataFrame'))[, covar] } ) ov <- as.vector(ov) if (all(is.na(ov))) { stop(paste('There is no overlay of data with covariates!', sep = "")) } stfdf@data[covar] <- ov } # Remove stations out of covariates for (covar in names_covar){ # count NAs per stations numNA <- apply(matrix(stfdf@data[,covar], nrow=nrowsp,byrow= FALSE), MARGIN=1, FUN=function(x) sum(is.na(x))) rem <- numNA != length(time) stfdf <- stfdf[rem,drop= FALSE] } # Remove dates out of covariates rm.days <- c() for (t in 1:length(time)) { if(sum(complete.cases(stfdf[, t]@data)) == 0) { rm.days <- c(rm.days, t) } } if(!is.null(rm.days)){ stfdf <- stfdf[,-rm.days] } ### Example with STFDF and without parallel processing and without refitting of variogram results <- cv.strk(data = stfdf, obs.col = 1, # "tempc" data.staid.x.y.z = c(1,NA,NA,NA), reg.coef = tregcoef[[1]], vgm.model = tvgms[[1]], sp.nmax = 20, time.nmax = 2, type = "LLO", k = 5, seed = 42, refit = FALSE, progress = TRUE ) stplot(results[,,"pred"]) summary(results) # accuracy acc.metric.fun(results@data$obs, results@data$pred, "R2") acc.metric.fun(results@data$obs, results@data$pred, "RMSE") acc.metric.fun(results@data$obs, results@data$pred, "MAE") acc.metric.fun(results@data$obs, results@data$pred, "CCC")
library(sp) library(spacetime) library(gstat) library(plyr) library(CAST) library(doParallel) library(ranger) # preparing data data(dtempc) data(stations) data(regdata) # covariates, made by mete2STFDF function regdata@sp@proj4string <- CRS('+proj=longlat +datum=WGS84') data(tvgms) # ST variogram models data(tregcoef) # MLR coefficients lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46 serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), c(latmin,latmin,latmax,latmax)) st = stations[ serbia!=0, ] # stations in Serbia approx. crs = CRS('+proj=longlat +datum=WGS84') # create STFDF stfdf <- meteo2STFDF(obs = dtempc, stations = st, crs = crs) # Cross-validation for mean temperature for days "2011-07-05" and "2011-07-06" # global model is used for regression and variogram # Overlay observations with covariates time <- index(stfdf@time) covariates.df <- as.data.frame(regdata) names_covar <- names(tregcoef[[1]])[-1] for (covar in names_covar){ nrowsp <- length(stfdf@sp) regdata@sp=as(regdata@sp,'SpatialPixelsDataFrame') ov <- sapply(time, function(i) if (covar %in% names(regdata@data)) { if (as.Date(i) %in% as.Date(index(regdata@time))) { over(stfdf@sp, as(regdata[, i, covar], 'SpatialPixelsDataFrame'))[, covar] } else { rep(NA, length(stfdf@sp)) } } else { over(stfdf@sp, as(regdata@sp[covar], 'SpatialPixelsDataFrame'))[, covar] } ) ov <- as.vector(ov) if (all(is.na(ov))) { stop(paste('There is no overlay of data with covariates!', sep = "")) } stfdf@data[covar] <- ov } # Remove stations out of covariates for (covar in names_covar){ # count NAs per stations numNA <- apply(matrix(stfdf@data[,covar], nrow=nrowsp,byrow= FALSE), MARGIN=1, FUN=function(x) sum(is.na(x))) rem <- numNA != length(time) stfdf <- stfdf[rem,drop= FALSE] } # Remove dates out of covariates rm.days <- c() for (t in 1:length(time)) { if(sum(complete.cases(stfdf[, t]@data)) == 0) { rm.days <- c(rm.days, t) } } if(!is.null(rm.days)){ stfdf <- stfdf[,-rm.days] } ### Example with STFDF and without parallel processing and without refitting of variogram results <- cv.strk(data = stfdf, obs.col = 1, # "tempc" data.staid.x.y.z = c(1,NA,NA,NA), reg.coef = tregcoef[[1]], vgm.model = tvgms[[1]], sp.nmax = 20, time.nmax = 2, type = "LLO", k = 5, seed = 42, refit = FALSE, progress = TRUE ) stplot(results[,,"pred"]) summary(results) # accuracy acc.metric.fun(results@data$obs, results@data$pred, "R2") acc.metric.fun(results@data$obs, results@data$pred, "RMSE") acc.metric.fun(results@data$obs, results@data$pred, "MAE") acc.metric.fun(results@data$obs, results@data$pred, "CCC")
Function for data preparation for RFSI and STRK functions. It transforms data to a data.frame.
data.prepare(data, data.staid.x.y.z=NULL, obs.col=NULL, s.crs=NA )
data.prepare(data, data.staid.x.y.z=NULL, obs.col=NULL, s.crs=NA )
data |
sf-class, sftime-class, SpatVector-class, SpatRaster-class or data.frame; Contains target variable (observations) and covariates. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, and observation value (obs). |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
obs.col |
numeric or character; Column name or number showing position of the observation column in the |
s.crs |
st_crs or crs; Source CRS of |
A list with the following elements:
data.df |
A data.frame obtained from |
data.staid.x.y.z |
Positions of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). |
s.crs |
Source CRS of |
obs.col |
Column number showing position of the observation column in the |
Aleksandar Sekulic [email protected]
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
near.obs
rfsi
tune.rfsi
cv.rfsi
library(sf) library(meteo) library(sp) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") data.df <- data.prepare(data, obs.col="zinc") str(data.df)
library(sf) library(meteo) library(sp) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") data.df <- data.prepare(data, obs.col="zinc") str(data.df)
Digital Elevation Model (DEM) and Topographic Wetness Index (TWI) for Serbia in SpatRaster format.
data(dem_twi_srb)
data(dem_twi_srb)
The dem_twi_srb
contains the following layers:
dem
Digital Elevation Model (DEM) in meters
twi
Topographic Wetness Index (TWI)
Aleksandar Sekulic [email protected]
library(terra) # load data data(dem_twi_srb) terra::unwrap(dem_twi_srb)
library(terra) # load data data(dem_twi_srb) terra::unwrap(dem_twi_srb)
Sample data set showing values of merged daily precipitation amount measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.
data(dprec)
data(dprec)
The dprec
contains the following columns:
staid
character; station ID from GSOD or ECA&D data set
time
Date; day of the measurement
prec
numeric; daily precipitation amount in mm
The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations
table containing stations' coordinates.
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data data(dprec) str(dprec)
# load data data(dprec) str(dprec)
Sample data set showing values of merged mean sea level pressure measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.
data(dslp)
data(dslp)
The dslp
contains the following columns:
staid
character; station ID from GSOD or ECA&D data set
time
Date; day of the measurement
slp
numeric; mean sea level pressure amount in hPa
The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations
table containing stations' coordinates.
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data data(dslp) str(dslp)
# load data data(dslp) str(dslp)
Sample data set showing values of merged daily snow depth measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.
data(dsndp)
data(dsndp)
The dsndp
contains the following columns:
staid
character; station ID from GSOD or ECA&D data set
time
Date; day of the measurement
sndp
numeric; daily snow depth in cm
The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations
table containing stations' coordinates.
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data data(dsndp) str(dsndp)
# load data data(dsndp) str(dsndp)
Sample data set showing values of merged maximum daily temperature measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Dataset (ECA&D) data for the month July 2011.
data(dtemp_maxc)
data(dtemp_maxc)
The dtemp_maxc
contains the following columns:
staid
character; station ID from GSOD or ECA&D dataset
time
Date; day of the measurement
temp_minc
numeric; maximum daily temperature in degrees Celsius
The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations
table containing stations' coordinates.
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data data(dtemp_maxc) str(dtemp_maxc)
# load data data(dtemp_maxc) str(dtemp_maxc)
Sample data set showing values of merged minimum daily temperature measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.
data(dtemp_minc)
data(dtemp_minc)
The dtemp_minc
contains the following columns:
staid
character; station ID from GSOD or ECA&D data set
time
Date; day of the measurement
temp_minc
numeric; minimum daily temperature in degrees Celsius
The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations
table containing stations' coordinates.
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data data(dtemp_minc) str(dtemp_minc)
# load data data(dtemp_minc) str(dtemp_minc)
Sample data set showing values of merged mean daily temperature measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Dataset (ECA&D) data for the month July 2011.
data(dtempc)
data(dtempc)
The dtempc
contains the following columns:
staid
character; station ID from GSOD or ECA&D dataset
time
Date; day of the measurement
tempc
numeric; mean daily temperature in degrees Celsius
The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations
table containing stations' coordinates.
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data data(dtempc) str(dtempc)
# load data data(dtempc) str(dtempc)
Sample data set of mean daily temperature measurements from the OGIMET service for the year 2019 for Serbian territory.
data(dtempc_ogimet)
data(dtempc_ogimet)
The dtempc_ogimet
contains the following columns:
staid
character; station ID from OGIMET
name
character; station name
lon
numeric; Longitude
lat
numeric; Latitude
elevation
numeric; Hight
time
Date; day of the measurement
tmean
numeric; mean daily temperature in degrees Celsius
dem
numeric; Digital Elevation Model (DEM) in meters
twi
numeric; Topographic Wetness Index (TWI)
cdate
numeric; Cumulative day from 1960
doy
numeric; Day of year
gtt
numeric; Geometrical temperature trend
Aleksandar Sekulic [email protected]
OGIMET service (https://www.ogimet.com/)
# load data data(dtempc_ogimet) str(dtempc)
# load data data(dtempc_ogimet) str(dtempc)
Sample data set showing values of merged daily mean wind speed measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.
data(dwdsp)
data(dwdsp)
The dwdsp
contains the following columns:
staid
character; station ID from GSOD or ECA&D data set
time
Date; day of the measurement
wdsp
numeric; daily mean wind speed in m/s
The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations
table containing stations' coordinates.
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data data(dwdsp) str(dwdsp)
# load data data(dwdsp) str(dwdsp)
The function gives back lon/lat coordinates for a specific location name using Nominatim (https://nominatim.org/) service.
get_coordinates(location_name = "Belgrade")
get_coordinates(location_name = "Belgrade")
location_name |
|
numeric
vector with lon/lat values for a specific location name.
Aleksandar Sekulić [email protected]
coords <- get_coordinates("Belgrade") str(coords)
coords <- get_coordinates("Belgrade") str(coords)
The function gives back daily, monthly, or annual and aggregated or long-term means metorological data for specific location(s) and date(s) for entire World for 1961-2020 period from DailyMeteo portal (https://app.dailymeteo.com/).
get_meteo(loc, var = "tmean", agg_level = "agg", time_scale = "day", time, from, to, api_key)
get_meteo(loc, var = "tmean", agg_level = "agg", time_scale = "day", time, from, to, api_key)
loc |
|
var |
|
agg_level |
|
time_scale |
|
time |
|
from |
|
to |
|
api_key |
|
data.frame object with columns loc
(location index from loc
argument), timestamp
(time reference), and value
(meteorological value).
Aleksandar Sekulić [email protected]
## Not run: loc <- get_coordinates("Belgrade") loc <- rbind(loc, get_coordinates("Kopaonik")) loc api_key <- "" # get API key from DailyMeteo portal (https://app.dailymeteo.com/) result <- get_meteo(loc = loc, var = "tmean", agg_level = "agg", time_scale = "day", from = "2020-01-01", to = "2020-01-02", # or use time = c("2020-01-01", "2020-01-02"), api_key = api_key) # result # loc timestamp value # 1 1 2020-01-01 0.7 # 2 1 2020-01-02 1.0 # 3 2 2020-01-01 -9.2 # 4 2 2020-01-02 -8.6 # 5 3 2020-01-01 -9.2 # 6 3 2020-01-02 -8.6 ## End(Not run)
## Not run: loc <- get_coordinates("Belgrade") loc <- rbind(loc, get_coordinates("Kopaonik")) loc api_key <- "" # get API key from DailyMeteo portal (https://app.dailymeteo.com/) result <- get_meteo(loc = loc, var = "tmean", agg_level = "agg", time_scale = "day", from = "2020-01-01", to = "2020-01-02", # or use time = c("2020-01-01", "2020-01-02"), api_key = api_key) # result # loc timestamp value # 1 1 2020-01-01 0.7 # 2 1 2020-01-02 1.0 # 3 2 2020-01-01 -9.2 # 4 2 2020-01-02 -8.6 # 5 3 2020-01-01 -9.2 # 6 3 2020-01-02 -8.6 ## End(Not run)
The function creates an object of STFDF-class class, spatio-temporal data with full space-time grid, from two data frames (observation and stations). Observations data frame minimum contains station ID column, time column (day of observation) and measured variable column. Stations data frame contains at least station ID column, longitude (or x) and latitude (or y) column.
meteo2STFDF(obs, stations, obs.staid.time = c(1, 2), stations.staid.lon.lat = c(1, 2, 3), crs=CRS(as.character(NA)), delta=NULL)
meteo2STFDF(obs, stations, obs.staid.time = c(1, 2), stations.staid.lon.lat = c(1, 2, 3), crs=CRS(as.character(NA)), delta=NULL)
obs |
data.frame; observations data frame minimum contains station ID column, time column (day of observation) and measured variable column. It can contain additional variables (columns). |
stations |
data.frame; Stations data frame contains at least station ID column, longitude (or x) and latitude (or y) column.It can contain additional variables (columns). |
obs.staid.time |
numeric; records the column positions where in |
stations.staid.lon.lat |
numeric; records the column positions where in |
crs |
CRS; coordinate reference system (see CRS-class) of |
delta |
time; time interval to end points in seconds |
STFDF-class object
The function is intended for conversion of meteorological data to STFDF-class object, but can be used for similar spatio temporal data stored in two separated tables.
Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]
# prepare data # load observation - data.frame of mean temperatures data(dtempc) str(dtempc) data(stations) str(stations) lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46 library(sp) library(spacetime) serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), c(latmin,latmin,latmax,latmax)) st= stations[ serbia!=0, ] # stations in Serbia approx. # create STFDF temp <- meteo2STFDF(dtempc,st, crs= CRS('+proj=longlat +datum=WGS84')) str(temp)
# prepare data # load observation - data.frame of mean temperatures data(dtempc) str(dtempc) data(stations) str(stations) lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46 library(sp) library(spacetime) serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), c(latmin,latmin,latmax,latmax)) st= stations[ serbia!=0, ] # stations in Serbia approx. # create STFDF temp <- meteo2STFDF(dtempc,st, crs= CRS('+proj=longlat +datum=WGS84')) str(temp)
The function finds n nearest observations from given locations and creates an object of data.frame class. First n columns are Euclidean distances to n nearest locations and next n columns are observations at n nearest stations, and rows are given locations. Further more it can calculate averages in circles with different radiuses, can find nearest observation in quadrants (directions) and calculate IDW predictions from nearest observations. It is based on knn function of package nabor.
near.obs(locations, locations.x.y = c(1,2), observations, observations.x.y = c(1,2), obs.col = 3, n.obs = 10, rm.dupl = TRUE, avg = FALSE, increment, range, quadrant = FALSE, idw=FALSE, idw.p=2)
near.obs(locations, locations.x.y = c(1,2), observations, observations.x.y = c(1,2), obs.col = 3, n.obs = 10, rm.dupl = TRUE, avg = FALSE, increment, range, quadrant = FALSE, idw=FALSE, idw.p=2)
locations |
data.frame with x and y coordinates columns, or sf-class, SpatVector-class or SpatRaster-class object; Locations (FROM) for which n nearest observations are found and distances are calculated. |
locations.x.y |
numeric or character vector; Positions or names of the x and y columns in |
observations |
data.frame with x, y and observation columns, or sf-class or SpatVector-class object with an observation column; Observations (TO). |
observations.x.y |
numeric or character vector; Positions or names of the x and y columns in |
obs.col |
numeric or character; Column name or number showing position of the observation column in the |
n.obs |
numeric; Number of nearest observations to be found. Note that it cannot be larger than number of obsevrations. Default is 10. |
rm.dupl |
boolean; Remove spatial duplicates - will the spatial duplicates (nearest observations where Euclidean distance is 0) be removed from the result. Default is TRUE. |
avg |
boolean; Averages in circles - will averages in circles with different radiuses be calculated. Default is FALSE. |
increment |
numeric; Increment of radiuses for calculation of averages in circles with different radiuses. Units depends on CRS. |
range |
numeric; Maximum radius for calculation of averages in circles with different radiuses. Units depends on CRS. |
quadrant |
boolean; Nearest observations in quadrants - will nearest observation in quadrants be calculated. Default is FALSE. |
idw |
boolean; IDW prediction as predictor - will IDW predictions from |
idw.p |
numeric; Exponent parameter for IDW weights. Default is 2. |
data.frame object. Rows represents specific locations. First n.obs
columns are Euclidean distances to n.obs
nearest observations. Next n.obs
columns are observations at n.obs
nearest stations. The following columns are averages in circles with different radiuses if avg
is set to TRUE. The following columns are nearest observation in quadrants if direct
is set to TRUE. The following columns are IDW prediction from nearest observation if idw
is set to TRUE.
The function can be used in any case if it is needed to find n nearest observations from given locations and distances to them.
Aleksandar Sekulic [email protected]
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
knn
rfsi
pred.rfsi
tune.rfsi
cv.rfsi
library(sp) library(sf) library(terra) library(meteo) # prepare data # load observation - data.frame of mean temperatures demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] locations = terra::rast(meuse.grid) observations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") # find 5 nearest observations and distances to them (remove duplicates) nearest_obs <- near.obs(locations = locations, # from observations = observations, # to obs.col = "zinc", n.obs = 5, # number of nearest observations rm.dupl = TRUE) str(nearest_obs) summary(nearest_obs)
library(sp) library(sf) library(terra) library(meteo) # prepare data # load observation - data.frame of mean temperatures demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] locations = terra::rast(meuse.grid) observations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") # find 5 nearest observations and distances to them (remove duplicates) nearest_obs <- near.obs(locations = locations, # from observations = observations, # to obs.col = "zinc", n.obs = 5, # number of nearest observations rm.dupl = TRUE) str(nearest_obs) summary(nearest_obs)
The function finds n nearest observations from given locations and at specific depth range from location min depth and creates an object of data.frame class. First n columns are Euclidean distances to n nearest locations and next n columns are observations at n nearest stations, and rows are given locations. It is based on knn function of package nabor.
near.obs.soil(locations, locations.x.y.md = c(1,2,3), observations, observations.x.y.md = c(1,2,3), obs.col = 4, n.obs = 5, depth.range = 0.1, no.obs = 'increase', parallel.processing = TRUE, pp.type = "doParallel", # "snowfall" cpus = detectCores()-1)
near.obs.soil(locations, locations.x.y.md = c(1,2,3), observations, observations.x.y.md = c(1,2,3), obs.col = 4, n.obs = 5, depth.range = 0.1, no.obs = 'increase', parallel.processing = TRUE, pp.type = "doParallel", # "snowfall" cpus = detectCores()-1)
locations |
data.frame with x and y coordinates and mid depth columns, or sf-class, SpatVector-class or SpatRaster-class object; Locations (FROM) for which n nearest observations are found and distances are calculated. |
locations.x.y.md |
numeric or character vector; Positions or names of the x, y, and mid depth columns in |
observations |
data.frame with x, y, mid depth and observation columns, or sf-class or SpatVector-class object with mid depth and observation columns; Observations (TO). |
observations.x.y.md |
numeric or character vector; positions or names of the x, y, and mid depth columns in |
obs.col |
numeric or character; Column name or number showing position of the observation column in the |
n.obs |
numeric; Number of nearest observations to be found. Note that it cannot be larger than number of obsevrations. Default is 5. |
depth.range |
numeric; Depth range for location mid depth in which to search for nearest observations. It's in the mid depth units. Default is 0.1. |
no.obs |
character; Possible values are |
parallel.processing |
logical; If parallel processing is performed. Default is FALSE. |
pp.type |
character; Type (R package) used for parallel processing, "doParallel" (default) or "snowfall". |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
data.frame object. Rows represents specific locations. First n.obs
columns are Euclidean distances to n.obs
nearest observations. Next n.obs
columns are observations at n.obs
nearest stations.
The function is intended for soil mapping applications.
Aleksandar Sekulic [email protected], Anatol Helfenstein [email protected]
knn
near.obs
rfsi
pred.rfsi
tune.rfsi
cv.rfsi
library(sp) library(sf) library(meteo) # prepare data # load observation - data.frame of mean temperatures demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] locations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") locations = # terra::rast(meuse.grid) observations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") # find 5 nearest observations and distances to them (remove duplicates) nearest_obs <- near.obs.soil(locations = locations, # from locations.x.y.md = c("x","y","dist"), observations = observations, # to observations.x.y.md= c("x","y","dist"), obs.col = "zinc", n.obs = 5) # number of nearest observations str(nearest_obs) summary(nearest_obs)
library(sp) library(sf) library(meteo) # prepare data # load observation - data.frame of mean temperatures demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] locations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") locations = # terra::rast(meuse.grid) observations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") # find 5 nearest observations and distances to them (remove duplicates) nearest_obs <- near.obs.soil(locations = locations, # from locations.x.y.md = c("x","y","dist"), observations = observations, # to observations.x.y.md= c("x","y","dist"), obs.col = "zinc", n.obs = 5) # number of nearest observations str(nearest_obs) summary(nearest_obs)
The original 8 day MODIS LST images were also converted from Kelvin to degrees Celsius using the formula indicated in the MODIS user's manual.SpatialGridDataFrame.
data(nlmodis20110704)
data(nlmodis20110704)
Milan Kilibarda [email protected]
Wan, Z., Y. Zhang, Q. Zhang, and Z.-L. Li (2004), Quality assessment and validation of the MODIS global land surface temperature, Int. J. Remote Sens., 25(1), 261-274
library(sp) data(nlmodis20110704) spplot(nlmodis20110704)
library(sp) data(nlmodis20110704) spplot(nlmodis20110704)
The original 8 day MODIS LST images were also converted from Kelvin to degrees Celsius using the formula indicated in the MODIS user's manual.
data(nlmodis20110712)
data(nlmodis20110712)
Milan Kilibarda [email protected]
Wan, Z., Y. Zhang, Q. Zhang, and Z.-L. Li (2004), Quality assessment and validation of the MODIS global land surface temperature, Int. J. Remote Sens., 25(1), 261-274
library(sp) data(nlmodis20110712) spplot(nlmodis20110712)
library(sp) data(nlmodis20110712) spplot(nlmodis20110712)
SpatialGridDataFrame from World Country Admin Boundary Shapefile
data(NLpol)
data(NLpol)
library(sp) data(NLpol) plot(NLpol)
library(sp) data(NLpol) plot(NLpol)
Function for spatial/spatio-temporal prediction based on Random Forest Spatial Interpolation (RFSI) model (Sekulić et al. 2020).
pred.rfsi(model, data, obs.col=1, data.staid.x.y.z = NULL, newdata, newdata.staid.x.y.z = NULL, z.value = NULL, s.crs = NA, newdata.s.crs=NA, p.crs = NA, output.format = "data.frame", cpus = detectCores()-1, progress = TRUE, soil3d = FALSE, # soil RFSI depth.range = 0.1, # in units of depth no.obs = 'increase', ...)
pred.rfsi(model, data, obs.col=1, data.staid.x.y.z = NULL, newdata, newdata.staid.x.y.z = NULL, z.value = NULL, s.crs = NA, newdata.s.crs=NA, p.crs = NA, output.format = "data.frame", cpus = detectCores()-1, progress = TRUE, soil3d = FALSE, # soil RFSI depth.range = 0.1, # in units of depth no.obs = 'increase', ...)
model |
ranger; An RFSI model made by rfsi function. |
data |
sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for RFSI prediction. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, and observation value (obs). |
obs.col |
numeric or character; Column name or number showing position of the observation column in the |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
newdata |
sf-class, sftime-class, SpatVector-class, SpatRaster-class or data.frame; Contains prediction locations and covariates used for RFSI prediction. If data.frame object, it should have next columns: prediction location ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z), and covariates (cov1, cov2, ...). Covariate names have to be the same as in the |
newdata.staid.x.y.z |
numeric or character vector; Positions or names of the prediction location ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame |
z.value |
vector; A vector of 3rd component - time, depth, ... (z) values if |
s.crs |
st_crs or crs; Source CRS of |
newdata.s.crs |
st_crs or crs; Source CRS of |
p.crs |
st_crs or crs; Projection CRS for |
output.format |
character; Format of the output, data.frame (default), sf-class, sftime-class, SpatVector-class, or SpatRaster-class. |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
progress |
logical; If progress bar is shown. Default is TRUE. |
soil3d |
logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the |
depth.range |
numeric; Depth range for location mid depth in which to search for nearest observations (see function near.obs.soil). It's in the mid depth units. Default is 0.1. |
no.obs |
character; Possible values are |
... |
Further arguments passed to predict.ranger function, such as |
A data.frame, sf-class, sftime-class, SpatVector-class, or SpatRaster-class object (depends on output.format
argument) with prediction - pred
or quantile..X.X
(quantile regression) columns.
Aleksandar Sekulic [email protected]
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
near.obs
rfsi
tune.rfsi
cv.rfsi
library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # fit the RFSI model rfsi_model <- rfsi(formula = fm.RFSI, data = data, # meuse.df (use data.staid.x.y.z) n.obs = 5, # number of nearest observations cpus = 2, # detectCores()-1, progress = TRUE, # ranger parameters importance = "impurity", seed = 42, num.trees = 250, mtry = 5, splitrule = "variance", min.node.size = 5, sample.fraction = 0.95, quantreg = FALSE) # quantreg = TRUE) # for quantile regression rfsi_model # OOB prediction error (MSE): 47758.14 # R squared (OOB): 0.6435869 sort(rfsi_model$variable.importance) sum("obs" == substr(rfsi_model$forest$independent.variable.names, 1, 3)) # Make RFSI prediction newdata <- terra::rast(meuse.grid) class(newdata) # prediction rfsi_prediction <- pred.rfsi(model = rfsi_model, data = data, # meuse.df (use data.staid.x.y.z) obs.col = "zinc", newdata = newdata, # meuse.grid.df (use newdata.staid.x.y.z) output.format = "SpatRaster", # "sf", # "SpatVector", zero.tol = 0, cpus = 2, # detectCores()-1, progress = TRUE, # type = "quantiles", # for quantile regression # quantiles = c(0.1, 0.5, 0.9) # for quantile regression ) class(rfsi_prediction) names(rfsi_prediction) head(rfsi_prediction) plot(rfsi_prediction) plot(rfsi_prediction['pred'])
library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # fit the RFSI model rfsi_model <- rfsi(formula = fm.RFSI, data = data, # meuse.df (use data.staid.x.y.z) n.obs = 5, # number of nearest observations cpus = 2, # detectCores()-1, progress = TRUE, # ranger parameters importance = "impurity", seed = 42, num.trees = 250, mtry = 5, splitrule = "variance", min.node.size = 5, sample.fraction = 0.95, quantreg = FALSE) # quantreg = TRUE) # for quantile regression rfsi_model # OOB prediction error (MSE): 47758.14 # R squared (OOB): 0.6435869 sort(rfsi_model$variable.importance) sum("obs" == substr(rfsi_model$forest$independent.variable.names, 1, 3)) # Make RFSI prediction newdata <- terra::rast(meuse.grid) class(newdata) # prediction rfsi_prediction <- pred.rfsi(model = rfsi_model, data = data, # meuse.df (use data.staid.x.y.z) obs.col = "zinc", newdata = newdata, # meuse.grid.df (use newdata.staid.x.y.z) output.format = "SpatRaster", # "sf", # "SpatVector", zero.tol = 0, cpus = 2, # detectCores()-1, progress = TRUE, # type = "quantiles", # for quantile regression # quantiles = c(0.1, 0.5, 0.9) # for quantile regression ) class(rfsi_prediction) names(rfsi_prediction) head(rfsi_prediction) plot(rfsi_prediction) plot(rfsi_prediction['pred'])
Function for spatio-temporal regression kriging prediction based on krigeST.
pred.strk(data, obs.col=1, data.staid.x.y.z = NULL, newdata, newdata.staid.x.y.z = NULL, z.value = NULL, crs = NA, zero.tol=0, reg.coef, vgm.model, sp.nmax=20, time.nmax=2, by='time', tiling= FALSE, ntiles=64, output.format = "STFDF", parallel.processing = FALSE, pp.type = "snowfall", cpus=detectCores()-1, computeVar=FALSE, progress=TRUE, ...)
pred.strk(data, obs.col=1, data.staid.x.y.z = NULL, newdata, newdata.staid.x.y.z = NULL, z.value = NULL, crs = NA, zero.tol=0, reg.coef, vgm.model, sp.nmax=20, time.nmax=2, by='time', tiling= FALSE, ntiles=64, output.format = "STFDF", parallel.processing = FALSE, pp.type = "snowfall", cpus=detectCores()-1, computeVar=FALSE, progress=TRUE, ...)
data |
STFDF-class, STSDF-class, STIDF-class, sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates in space and time used to perform STRK. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs), and covariates (cov1, cov2, ...). Covariate names should be the same as in the |
obs.col |
numeric or character; Column name or number showing position of the observation column in the |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component - time, depth (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
newdata |
STFDF-class, STSDF-class, STIDF-class, sf-class, sftime-class, SpatVector-class, SpatRaster-class or data.frame; Contains prediction locations and covariates used for STRK prediction. If data.frame object, it should have next columns: prediction location ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z), and covariates (cov1, cov2, ...). Covariate names have to be the same as in the |
newdata.staid.x.y.z |
numeric or character vector; Positions or names of the prediction location ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame |
z.value |
vector; A vector of 3rd component - time, depth, ... (z) values if |
crs |
st_crs or crs; Source CRS of |
zero.tol |
numeric; A distance value below (or equal to) which locations are considered as duplicates. Default is 0. See rm.dupl. Duplicates are removed to avoid singular covariance matrices in kriging. |
reg.coef |
numeric; Vector of named linear regression coefficients. Names of the coefficients (e.g. "Intercept", "temp_geo", "modis", "dem", "twi") will be used to match appropriate covariates from |
vgm.model |
StVariogramModel list; Spatio-temporal variogram of regression residuals (or observations if spatio-temporal ordinary kriging). See vgmST. Spatio-temporal variogram model on residuals for metorological variables (temperature, precipitation, etc.) can be taken from data(tvgms) or can be specified by the user as a vgmST object. |
sp.nmax |
numeric; A number of spatially nearest observations that should be used for kriging predictions. If |
time.nmax |
numeric; A number of temporally nearest observations that should be used for kriging predictions Deafult is 2. |
by |
cahracter; Will foreach loop by time (default) or station. If station is set, |
tiling |
logical; Should simplified local kriging be used. Default is FALSE. If TRUE, area is divided in tiles and kriging calculation is done for each tile separately. Number of observation used per tile is defined with |
ntiles |
numeric; A number of tiles for tilling. Default is 64. Ideally, each tile should contain less observations than |
output.format |
character; Format of the output, STFDF-class (default), STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, SpatVector-class, or SpatRaster-class. |
parallel.processing |
logical; If parallel processing is performed. Default is FALSE. |
pp.type |
character; Type ( |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
computeVar |
logical; If kriging variance is computed. Default is FALSE. |
progress |
logical; If progress bar is shown. Default is TRUE. |
... |
Further arguments passed to krigeST function. |
A STFDF-class, STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, SpatVector-class, or SpatRaster-class object (depends on output.format
argument), with columns (elements):
pred |
Predictions. |
tlm |
Trend. |
var |
Kriging variance, if |
Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.
tregcoef
tvgms
regdata
meteo2STFDF
tgeom2STFDF
library(sp) library(spacetime) library(sf) library(gstat) library(plyr) # prepare data # load observation - data.frame of mean temperatures # preparing data data(dtempc) data(stations) data(regdata) # covariates, made by mete2STFDF function regdata@sp@proj4string <- CRS('+proj=longlat +datum=WGS84') lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46 serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), c(latmin,latmin,latmax,latmax)) st = stations[ serbia!=0, ] # stations in Serbia approx. obs.staid.time = c("staid", "time") stations.staid.lon.lat = c(1,2,3) crs = CRS('+proj=longlat +datum=WGS84') delta = NULL # create STFDF stfdf <- meteo2STFDF(obs = dtempc, stations = st, crs = crs) # Calculate prediction of mean temperatures for "2011-07-05" and "2011-07-06" # global model is used for regression and variogram # load precalculated variograms data(tvgms) # ST variogram models data(tregcoef) # MLR coefficients ### Example with STFDF and without parallel processing results <- pred.strk(data = stfdf, # observations newdata = regdata, # prediction locations with covariates # newdata = regdata[,2,drop=FALSE], # for one day only output.format = "STFDF", # data.frame | sf | sftime | SpatVector | SpatRaster reg.coef = tregcoef[[1]], # MLR coefficients vgm.model = tvgms[[1]], # STRK variogram model sp.nmax = 20, time.nmax = 2, computeVar=TRUE ) class(results) # plot prediction results@sp=as(results@sp,'SpatialPixelsDataFrame') stplot(results[,,"pred", drop= FALSE], col.regions=bpy.colors()) stplot(results[,,"var", drop= FALSE], col.regions=bpy.colors()) # Example with data.frames and parallel processing - SpatRaster output library(terra) library(doParallel) # create data.frame stfdf.df <- join(dtempc, st) summary(stfdf.df) regdata.df <- as.data.frame(regdata) results <- pred.strk(data = stfdf.df, obs.col = 3, data.staid.x.y.z = c(1,4,5,2), newdata = regdata.df, newdata.staid.x.y.z = c(3,1,2,4), crs = CRS("EPSG:4326"), output.format = "SpatRaster", # STFDF |data.frame | sf | sftime | SpatVector reg.coef = tregcoef[[1]], vgm.model = tvgms[[1]], sp.nmax = 20, time.nmax = 2, parallel.processing = TRUE, pp.type = "doParallel", # "snowfall" cpus = 2, # detectCores()-1, computeVar = TRUE, progress = TRUE ) # plot prediction plot(results$`2011-07-06`[["pred"]]) plot(results$`2011-07-06`[["var"]])
library(sp) library(spacetime) library(sf) library(gstat) library(plyr) # prepare data # load observation - data.frame of mean temperatures # preparing data data(dtempc) data(stations) data(regdata) # covariates, made by mete2STFDF function regdata@sp@proj4string <- CRS('+proj=longlat +datum=WGS84') lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46 serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), c(latmin,latmin,latmax,latmax)) st = stations[ serbia!=0, ] # stations in Serbia approx. obs.staid.time = c("staid", "time") stations.staid.lon.lat = c(1,2,3) crs = CRS('+proj=longlat +datum=WGS84') delta = NULL # create STFDF stfdf <- meteo2STFDF(obs = dtempc, stations = st, crs = crs) # Calculate prediction of mean temperatures for "2011-07-05" and "2011-07-06" # global model is used for regression and variogram # load precalculated variograms data(tvgms) # ST variogram models data(tregcoef) # MLR coefficients ### Example with STFDF and without parallel processing results <- pred.strk(data = stfdf, # observations newdata = regdata, # prediction locations with covariates # newdata = regdata[,2,drop=FALSE], # for one day only output.format = "STFDF", # data.frame | sf | sftime | SpatVector | SpatRaster reg.coef = tregcoef[[1]], # MLR coefficients vgm.model = tvgms[[1]], # STRK variogram model sp.nmax = 20, time.nmax = 2, computeVar=TRUE ) class(results) # plot prediction results@sp=as(results@sp,'SpatialPixelsDataFrame') stplot(results[,,"pred", drop= FALSE], col.regions=bpy.colors()) stplot(results[,,"var", drop= FALSE], col.regions=bpy.colors()) # Example with data.frames and parallel processing - SpatRaster output library(terra) library(doParallel) # create data.frame stfdf.df <- join(dtempc, st) summary(stfdf.df) regdata.df <- as.data.frame(regdata) results <- pred.strk(data = stfdf.df, obs.col = 3, data.staid.x.y.z = c(1,4,5,2), newdata = regdata.df, newdata.staid.x.y.z = c(3,1,2,4), crs = CRS("EPSG:4326"), output.format = "SpatRaster", # STFDF |data.frame | sf | sftime | SpatVector reg.coef = tregcoef[[1]], vgm.model = tvgms[[1]], sp.nmax = 20, time.nmax = 2, parallel.processing = TRUE, pp.type = "doParallel", # "snowfall" cpus = 2, # detectCores()-1, computeVar = TRUE, progress = TRUE ) # plot prediction plot(results$`2011-07-06`[["pred"]]) plot(results$`2011-07-06`[["var"]])
Dynamic and static covariates for spatio-temporal regression kriging of STFDF-class. The regdata
contains geometrical temperature trend, MODIS LST 8-day splined at daily resolution, elevation and topographic wetness index.
data(regdata)
data(regdata)
The regdata
contains the following dynamic and static covariates:
regdata$temp_geo
numeric; geometrical temperature trend for mean temperature, calculated with tgeom2STFDF
; from 2011-07-05 to 2011-07-09, in degree Celsius
regdata$modis
numeric; MODIS LST 8-day splined at daily resolution, missing pixels are filtered by spatial splines and 8-day values are splined at daily level; from 2011-07-05 to 2011-07-09, in degree Celsius
regdata@sp$dem
numeric; elevation data obtained from Worldgrids (depricated)
regdata@sp$twi
numeric; SAGA Topographic Wetness Index (TWI) from Worldgrids (depricated)
Milan Kilibarda [email protected]
data(regdata) str(regdata) library(sp) # spplot library(spacetime) # stplot stplot(regdata[,,'modis']) # plot modis data spplot(regdata@sp,zcol='twi', col.regions = bpy.colors() ) # plot TWI spplot(regdata@sp,zcol='dem', col.regions = bpy.colors() ) # plot dem
data(regdata) str(regdata) library(sp) # spplot library(spacetime) # stplot stplot(regdata[,,'modis']) # plot modis data spplot(regdata@sp,zcol='twi', col.regions = bpy.colors() ) # plot TWI spplot(regdata@sp,zcol='dem', col.regions = bpy.colors() ) # plot dem
The function close gaps of a raster data by using IDW.
rfillspgaps(rasterLayer, maskPol=NULL, nmax =50, zcol=1, ...)
rfillspgaps(rasterLayer, maskPol=NULL, nmax =50, zcol=1, ...)
rasterLayer |
SpatRaster-class, RasterLayer-class, SpatialGrid-class or SpatialPixels-class; Raster that contains NAs. |
maskPol |
sf-class, SpatVector-class, SpatialPolygons or SpatialPolygonsDataFrame; Area of interest to spatially fill |
nmax |
see krige, idw function. |
zcol |
|
... |
arguments passed to krige, idw function. |
raster object with NA replaced using IDW in rasterLayer
format.
Milan Kilibarda [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803;
Kilibarda M., M. Percec Tadic, T. Hengl, J. Lukovic, B. Bajat - Spatial Statistics (2015), Global geographic and feature space coverage of temperature data in the context of spatio-temporal interpolation, doi:10.1016/j.spasta.2015.04.005.
library(terra) data(nlmodis20110712) data(NLpol) nlmodis20110712 <- terra::rast(nlmodis20110712) # SpaVector NLpol = vect(NLpol) crs(NLpol) <- "epsg:4326" # # sf # NLpol <- st_as_sf(NLpol) #, crs = st_crs(4326)) plot(nlmodis20110712) # fill spatial gaps r=rfillspgaps(nlmodis20110712,NLpol) plot(r)
library(terra) data(nlmodis20110712) data(NLpol) nlmodis20110712 <- terra::rast(nlmodis20110712) # SpaVector NLpol = vect(NLpol) crs(NLpol) <- "epsg:4326" # # sf # NLpol <- st_as_sf(NLpol) #, crs = st_crs(4326)) plot(nlmodis20110712) # fill spatial gaps r=rfillspgaps(nlmodis20110712,NLpol) plot(r)
The function creates an object of STFDF-class class, spatio-temporal data with full space-time grid, from another STFDF-class and fills attribute data for missing values in time using splines.
rfilltimegaps(stfdf, tunits="day", attrname=1, ...)
rfilltimegaps(stfdf, tunits="day", attrname=1, ...)
stfdf |
STFDF-class; object with time information of minimum length 2, and gap in time dimension. |
tunits |
|
attrname |
|
... |
arguments passed to splinefun, function spline. |
tunits
can be specified in several ways:
A number, taken to be in seconds
A object of class difftime
A character string, containing one of "sec"
, "min"
, "hour"
, "day"
, "DSTday"
, "week"
, "month"
, "quarter"
or "year"
. This can optionally be preceded by a (positive or negative) integer and a space, or followed by "s"
The difference between "day"
and "DSTday"
is that the former ignores changes to/from daylight savings time and the latter takes the same clock time each day. ("week"
ignores DST (it is a period of 144 hours), but "7 DSTdays"
) can be used as an alternative. "month"
and "year"
allow for DST.)
STFDF-class object with filled temporal gaps.
Milan Kilibarda [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803;
Kilibarda M., M. Percec Tadic, T. Hengl, J. Lukovic, B. Bajat - Spatial Statistics (2015), Global geographic and feature space coverage of temperature data in the context of spatio-temporal interpolation, doi:10.1016/j.spasta.2015.04.005.
data(nlmodis20110704) data(nlmodis20110712) data(NLpol) # fill spatial gaps library(raster) NLpol@proj4string <- nlmodis20110704@proj4string nlmodis20110704 <- rfillspgaps(nlmodis20110704,NLpol) nlmodis20110712 <- rfillspgaps(nlmodis20110712,NLpol) nlmodis20110704 <- as(nlmodis20110704,"SpatialPixelsDataFrame") names(nlmodis20110704)='m1' nlmodis20110712 <- as(nlmodis20110712,"SpatialPixelsDataFrame") names(nlmodis20110712)='m2' nlmodis20110704@data <- cbind(nlmodis20110704@data, nlmodis20110712@data) df<-reshape(nlmodis20110704@data , varying=list(1:2), v.names="modis",direction="long", times=as.Date(c('2011-07-04','2011-07-12')), ids=1:dim(nlmodis20110704)[1]) library(spacetime) stMODIS<- STFDF(as( nlmodis20110704, "SpatialPixels"), time= as.Date(c('2011-07-04','2011-07-12')), data.frame(modis=df[,'modis'])) stplot(stMODIS, col.regions=bpy.colors()) stMODIS <- rfilltimegaps(stMODIS) stplot(stMODIS, col.regions=bpy.colors())
data(nlmodis20110704) data(nlmodis20110712) data(NLpol) # fill spatial gaps library(raster) NLpol@proj4string <- nlmodis20110704@proj4string nlmodis20110704 <- rfillspgaps(nlmodis20110704,NLpol) nlmodis20110712 <- rfillspgaps(nlmodis20110712,NLpol) nlmodis20110704 <- as(nlmodis20110704,"SpatialPixelsDataFrame") names(nlmodis20110704)='m1' nlmodis20110712 <- as(nlmodis20110712,"SpatialPixelsDataFrame") names(nlmodis20110712)='m2' nlmodis20110704@data <- cbind(nlmodis20110704@data, nlmodis20110712@data) df<-reshape(nlmodis20110704@data , varying=list(1:2), v.names="modis",direction="long", times=as.Date(c('2011-07-04','2011-07-12')), ids=1:dim(nlmodis20110704)[1]) library(spacetime) stMODIS<- STFDF(as( nlmodis20110704, "SpatialPixels"), time= as.Date(c('2011-07-04','2011-07-12')), data.frame(modis=df[,'modis'])) stplot(stMODIS, col.regions=bpy.colors()) stMODIS <- rfilltimegaps(stMODIS) stplot(stMODIS, col.regions=bpy.colors())
Function for creation of Random Forest Spatial Interpolation (RFSI) model (Sekulić et al. 2020). Besides environmental covariates, RFSI uses additional spatial covariates: (1) observations at n nearest locations and (2) distances to them, in order to include spatial context into the random forest.
rfsi(formula, data, data.staid.x.y.z = NULL, n.obs = 5, avg = FALSE, increment = 10000, range = 50000, quadrant = FALSE, use.idw = FALSE, idw.p = 2, s.crs = NA, p.crs = NA, cpus = detectCores()-1, progress = TRUE, soil3d = FALSE, depth.range = 0.1, no.obs = 'increase', ...)
rfsi(formula, data, data.staid.x.y.z = NULL, n.obs = 5, avg = FALSE, increment = 10000, range = 50000, quadrant = FALSE, use.idw = FALSE, idw.p = 2, s.crs = NA, p.crs = NA, cpus = detectCores()-1, progress = TRUE, soil3d = FALSE, depth.range = 0.1, no.obs = 'increase', ...)
formula |
formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If |
data |
sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for making an RFSI model. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest obsevrations and distances to them as covariates ( |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
n.obs |
numeric; Number of nearest observations to be used as covariates in RFSI model (see function near.obs). Note that it cannot be larger than number of obsevrations. Default is 5. |
avg |
boolean; Averages in circles covariate - will averages in circles with different radiuses be calculated (see function near.obs). Default is FALSE. |
increment |
numeric; Increment of radiuses for calculation of averages in circles with different radiuses (see function near.obs). Units depends on CRS. |
range |
numeric; Maximum radius for calculation of averages in circles with different radiuses (see function near.obs). Units depends on CRS. |
quadrant |
boolean; Nearest observations in quadrants covariate - will nearest observation in quadrants be calculated (see function near.obs). Default is FALSE. |
use.idw |
boolean; IDW prediction as covariate - will IDW predictions from |
idw.p |
numeric; Exponent parameter for IDW weights (see function near.obs). Default is 2. |
s.crs |
st_crs or crs; Source CRS of |
p.crs |
st_crs or crs; Projection CRS for |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
progress |
logical; If progress bar is shown. Default is TRUE. |
soil3d |
logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the |
depth.range |
numeric; Depth range for location mid depth in which to search for nearest observations (see function near.obs.soil). It's in the mid depth units. Default is 0.1. |
no.obs |
character; Possible values are |
... |
Further arguments passed to ranger, such as |
RFSI model of class ranger.
Observations should be in projection for finding nearest observations based on Eucleadean distances (see function near.obs). If crs is not specified in the data
object or through the s.crs
parameter, the coordinates will be used as they are in projection. Use s.crs
and p.crs
if the coordinates of the data
object are in lon/lat (WGS84).
Aleksandar Sekulic [email protected]
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
near.obs
pred.rfsi
tune.rfsi
cv.rfsi
library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # fit the RFSI model rfsi_model <- rfsi(formula = fm.RFSI, data = data, # meuse.df (use data.staid.x.y.z) n.obs = 5, # number of nearest observations cpus = 2, # detectCores()-1, progress = TRUE, # ranger parameters importance = "impurity", seed = 42, num.trees = 250, mtry = 5, splitrule = "variance", min.node.size = 5, sample.fraction = 0.95, quantreg = FALSE) rfsi_model # OOB prediction error (MSE): 47758.14 # R squared (OOB): 0.6435869 sort(rfsi_model$variable.importance) sum("obs" == substr(rfsi_model$forest$independent.variable.names, 1, 3))
library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # fit the RFSI model rfsi_model <- rfsi(formula = fm.RFSI, data = data, # meuse.df (use data.staid.x.y.z) n.obs = 5, # number of nearest observations cpus = 2, # detectCores()-1, progress = TRUE, # ranger parameters importance = "impurity", seed = 42, num.trees = 250, mtry = 5, splitrule = "variance", min.node.size = 5, sample.fraction = 0.95, quantreg = FALSE) rfsi_model # OOB prediction error (MSE): 47758.14 # R squared (OOB): 0.6435869 sort(rfsi_model$variable.importance) sum("obs" == substr(rfsi_model$forest$independent.variable.names, 1, 3))
This function finds point pairs with equal spatial coordinates from STFDF-class object and remove locations with less observations.
rm.dupl(obj, zcol = 1, zero.tol = 0)
rm.dupl(obj, zcol = 1, zero.tol = 0)
obj |
STFDF-class object |
zcol |
variable column name, or column number, from obj@data |
zero.tol |
distance values less than or equal to this threshold value are considered as duplicates; units are those of the coordinates for projected data or unknown projection, or km if coordinates are defined to be longitute/latitude |
STFDF-class object with removed duplicate locations. Stations with less observation is removed, if number of observation is the same for two stations the first is removed.
Milan Kilibarda [email protected]
library(sp) # load observation - data frame data(dtempc) # load stations - data frame data(stations) str(dtempc) str(stations) # create STFDF - from 2 data frames temp <- meteo2STFDF(dtempc, stations, crs = CRS('+proj=longlat +datum=WGS84')) nrow(temp@sp) # number of stations before removing dupl. temp2 <-rm.dupl(temp, zcol = 1, zero.tol = 50) # 50 km nrow(temp2@sp) # number of stations after
library(sp) # load observation - data frame data(dtempc) # load stations - data frame data(stations) str(dtempc) str(stations) # create STFDF - from 2 data frames temp <- meteo2STFDF(dtempc, stations, crs = CRS('+proj=longlat +datum=WGS84')) nrow(temp@sp) # number of stations before removing dupl. temp2 <-rm.dupl(temp, zcol = 1, zero.tol = 50) # 50 km nrow(temp2@sp) # number of stations after
Data frame containing stations' information of merged daily observations from the Global Surface Summary of Day (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.
data(stations)
data(stations)
The stations
contains the following columns:
staid
character; station ID from GSOD or ECA&D data set
lon
numeric; longitude coordinate
lat
numeric; longitude coordinate
elev_1m
numeric; elevation derived from station metadata in m
data_source
Factor; data source, GSOD or ECA&D
station_name
character; station name
Milan Kilibarda and Tomislav Hengl
Global Surface Summary of the day data (ftp://ftp.ncdc.noaa.gov/pub/data/gsod/)
European Climate Assessment & Dataset (https://www.ecad.eu/dailydata/predefinedseries.php)
# load data: data(stations) str(stations) library(sp) coordinates(stations) <-~ lon +lat stations@proj4string <-CRS('+proj=longlat +datum=WGS84') plot(stations)
# load data: data(stations) str(stations) library(sp) coordinates(stations) <-~ lon +lat stations@proj4string <-CRS('+proj=longlat +datum=WGS84') plot(stations)
Data frame containing stations' information of daily observations from the OGIMET service for the year 2019 for Serbian territory.
data(stations_ogimet)
data(stations_ogimet)
The dtempc_ogimet
contains the following columns:
staid
character; station ID from OGIMET
name
character; station name
lon
numeric; Longitude
lat
numeric; Latitude
elevation
numeric; Hight
dem
numeric; Digital Elevation Model (DEM) in meters
twi
numeric; Topographic Wetness Index (TWI)
Aleksandar Sekulic [email protected]
OGIMET service (https://www.ogimet.com/)
# load data: data(stations_ogimet) str(stations) library(sp) coordinates(stations) <-~ lon +lat stations@proj4string <-CRS('+proj=longlat +datum=WGS84') plot(stations)
# load data: data(stations_ogimet) str(stations) library(sp) coordinates(stations) <-~ lon +lat stations@proj4string <-CRS('+proj=longlat +datum=WGS84') plot(stations)
Calculate geometrical temperature trend for mean, minimum or maximum temperature.
temp_geom(day, fi, variable="mean", ab = NULL)
temp_geom(day, fi, variable="mean", ab = NULL)
day |
integer; Day of the year (from 1 to 366). Single value or vector of days of the year (only if |
fi |
numeric; Latitude. Single value or vector of latitudes (only if |
variable |
character; Geometrical temperature trend calculated for mean, minimum or maximum temperature; Possible values are |
ab |
Predefined coefficients to be used instead of incorporated. |
A numerical single value or vector
with calculated geometrical temperature trend.
Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.
tgeom <- temp_geom(day = 1, fi = 45.33, variable="mean") tgeom_vect <- temp_geom(day = 1:365, fi = 45.33, variable="mean") tgeom_vect2 <- temp_geom(day = 1, fi = seq(35, 45, 0.5), variable="mean")
tgeom <- temp_geom(day = 1, fi = 45.33, variable="mean") tgeom_vect <- temp_geom(day = 1:365, fi = 45.33, variable="mean") tgeom_vect2 <- temp_geom(day = 1, fi = seq(35, 45, 0.5), variable="mean")
Calculate geometrical temperature trend for mean, minimum or maximum temperature.
tgeom2STFDF(grid, time, variable = "mean", ab=NULL)
tgeom2STFDF(grid, time, variable = "mean", ab=NULL)
grid |
Spatial-class (Points, Grid or Pixels), sf-class, SpatVector-class, SpatRaster-class; Locations for which gtt is calculated with associated coordinate reference systems (CRS-class). If CRS is not defined longitude latitude is assumed. |
time |
date or datetime; Object holding time information, reasonably it is day (calendar date), or vector of days. |
variable |
character; Geometrical temperature trend calculated for mean, minimum or maximum temperature; Possible values are |
ab |
Predefined coefficients to be used instead of incorporated. |
STFDF-class object with calculated temp_geo
geometrical temperature trend. The calculated values are stored in obj@data
slot.
Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.
library(sp) library(spacetime) # create one point from lon lat pos <- SpatialPoints(coords = cbind(19.22,45.33)) # temp_geom for 1st Jan 2011 tg1 <- tgeom2STFDF(pos,as.POSIXct("2011-01-01") ) tg1 # temp_geom for the 2011 at pos location tg365<- tgeom2STFDF(pos,time = seq(as.POSIXct("2011-01-01"), as.POSIXct("2011-12-31"), by="day") ) stplot(tg365, mode='ts') data(regdata) # DEM and TWI data for Serbia at 1 km resolution str(regdata@sp) spplot(regdata@sp, zcol='dem', col.regions=bpy.colors() ) # temp_geom for Serbia 1st and 2nd Jully 2011 tgSrb<- tgeom2STFDF(regdata@sp,time = seq(as.POSIXct("2011-07-01"), as.POSIXct("2011-07-02"), by="day") ) # temp_geom for "2011-07-01" , "2011-07-02" # stplot(tgSrb, col.regions = bpy.colors() )
library(sp) library(spacetime) # create one point from lon lat pos <- SpatialPoints(coords = cbind(19.22,45.33)) # temp_geom for 1st Jan 2011 tg1 <- tgeom2STFDF(pos,as.POSIXct("2011-01-01") ) tg1 # temp_geom for the 2011 at pos location tg365<- tgeom2STFDF(pos,time = seq(as.POSIXct("2011-01-01"), as.POSIXct("2011-12-31"), by="day") ) stplot(tg365, mode='ts') data(regdata) # DEM and TWI data for Serbia at 1 km resolution str(regdata@sp) spplot(regdata@sp, zcol='dem', col.regions=bpy.colors() ) # temp_geom for Serbia 1st and 2nd Jully 2011 tgSrb<- tgeom2STFDF(regdata@sp,time = seq(as.POSIXct("2011-07-01"), as.POSIXct("2011-07-02"), by="day") ) # temp_geom for "2011-07-01" , "2011-07-02" # stplot(tgSrb, col.regions = bpy.colors() )
Tiling raster or Spatial-class Grid or Pixels (data frame) object to smaller parts with optional overlap.
tiling(rast, tilesize=500, overlapping=50, aspoints= NA, asfiles=FALSE, tilename="tile", tiles_folder='tiles', parallel.processing=FALSE, cpus=6, ...)
tiling(rast, tilesize=500, overlapping=50, aspoints= NA, asfiles=FALSE, tilename="tile", tiles_folder='tiles', parallel.processing=FALSE, cpus=6, ...)
rast |
SpatRaster, SpatialPixels* object, SpatialGrid* object or file path to raster object stored on the disk (can be read via rast), for more details see SpatRaster. The resolution of the raster should be the same in x and y direction. |
tilesize |
integer or vector; tile size in number of cells. Can be a vector of tilesize in x and y direction. Total number of tile cells is |
overlapping |
integer or vector; overlapping in number of cells. Can be a vector of overlapping in x and y direction. |
aspoints |
character; Posiible values are |
asfiles |
boolean; if TRUE tiles are stored on local drive as raster objects. |
tilename |
character; prefix given to file names |
tiles_folder |
character; destination folder where tiles will be stored. If doesn't exist, the folder will be created. |
parallel.processing |
boolean; if TRUE parralel processing is performed via snowfall-calculation, sfLapply function. |
cpus |
integer; number of proccesing units. |
... |
character; additional arguments for for writing files, see writeRaster. |
The list of tiles in SpatRaster-class format or in sf-class, SpatVector or SpatialPointsDataFrame format if aspoints=TRUE
.
Milan Kilibarda [email protected]
library(sp) demo(meuse, echo=FALSE) rast <- terra::rast(meuse.grid[, "dist"]) # tiling dem in tiles 250x250 with 25 cells overlap tiles = tiling(rast, tilesize=20, overlapping=5, aspoints=TRUE) # number of tiles length(tiles) plot(rast) plot(tiles[[1]] , pch='-' ,col ='green', add=TRUE) plot(tiles[[2]], pch='.', add=TRUE) str(tiles[[1]])
library(sp) demo(meuse, echo=FALSE) rast <- terra::rast(meuse.grid[, "dist"]) # tiling dem in tiles 250x250 with 25 cells overlap tiles = tiling(rast, tilesize=20, overlapping=5, aspoints=TRUE) # number of tiles length(tiles) plot(rast) plot(tiles[[1]] , pch='-' ,col ='green', add=TRUE) plot(tiles[[2]], pch='.', add=TRUE) str(tiles[[1]])
Multiple linear regression coefficients for mean, minimum, maximum daily temperature on geometric temperature trend (GTT), MODIS LST, digital elevation model (DEM) and topographic wetness index (TWI). The models are computed from GSOD, ECA&D, GHCN-Daily or local meteorological stations.
data(tregcoef)
data(tregcoef)
A list of 9 multiple linear regression coefficients for daily air temperatures.
tmeanGSODECAD
Multiple linear regression coefficients of global mean daily temperature on GTT, MODIS LST, DEM and TWI. Data used: GSOD and ECA&D
tmeanGSODECAD_noMODIS
Multiple linear regression coefficients of global mean daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D
tminGSODECAD
Multiple linear regression coefficients of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D
tminGHCND
Multiple linear regression coefficients of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily
tminGSODECAD_noMODIS
Multiple linear regression coefficients of global minimum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D
tmaxGSODECAD
Multiple linear regression coefficients of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D
tmaxGHCND
Multiple linear regression coefficients of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily
tmaxGSODECAD_noMODIS
Multiple linear regression coefficients of global maximum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D
tmeanHR
Multiple linear regression coefficients of Croatian mean daily temperature on GTT, DEM, and TWI. Data used: Croatian mean daily temperature dataset for the year 2008 (Croatian national meteorological network)
Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.
data(tregcoef) tregcoef[[1]] # model for mean daily temp.
data(tregcoef) tregcoef[[1]] # model for mean daily temp.
Function for tuning of Random Forest Spatial Interpolation (RFSI) model using k-fold leave-location-out cross-validation (Sekulić et al. 2020).
tune.rfsi(formula, data, data.staid.x.y.z = NULL, use.idw = FALSE, s.crs = NA, p.crs = NA, tgrid, tgrid.n=10, tune.type = "LLO", k = 5, seed=42, folds, acc.metric, fit.final.model=TRUE, cpus = detectCores()-1, progress = TRUE, soil3d = FALSE, no.obs = 'increase', ...)
tune.rfsi(formula, data, data.staid.x.y.z = NULL, use.idw = FALSE, s.crs = NA, p.crs = NA, tgrid, tgrid.n=10, tune.type = "LLO", k = 5, seed=42, folds, acc.metric, fit.final.model=TRUE, cpus = detectCores()-1, progress = TRUE, soil3d = FALSE, no.obs = 'increase', ...)
formula |
formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If |
data |
sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for making an RFSI model. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest obsevrations and distances to them as covariates ( |
data.staid.x.y.z |
numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If |
use.idw |
boolean; IDW prediction as covariate - will IDW predictions from |
s.crs |
st_crs or crs; Source CRS of |
p.crs |
st_crs or crs; Projection CRS for |
tgrid |
data.frame; Possible tuning parameters. The column names are same as the tuning parameters. Possible tuning parameters are: |
tgrid.n |
numeric; Number of randomly chosen |
tune.type |
character; Type of cross-validation: leave-location-out ("LLO"), leave-time-out ("LTO") - TO DO, and leave-location-time-out ("LLTO") - TO DO. Default is "LLO". |
k |
numeric; Number of random folds that will be created with CreateSpacetimeFolds function if |
seed |
numeric; Random seed that will be used to generate folds with CreateSpacetimeFolds function. |
folds |
numeric or character vector or value; Showing folds column (if value) or rows (vector) of |
acc.metric |
character; Accuracy metric that will be used as a criteria for choosing an optimal RFSI model. Possible values for regression: "ME", "MAE", "NMAE", "RMSE" (default), "NRMSE", "R2", "CCC". Possible values for classification: "Accuracy","Kappa" (default), "AccuracyLower", "AccuracyUpper", "AccuracyNull", "AccuracyPValue", "McnemarPValue". |
fit.final.model |
boolean; Fit the final RFSI model. Defailt is TRUE. |
cpus |
numeric; Number of processing units. Default is detectCores()-1. |
progress |
logical; If progress bar is shown. Default is TRUE. |
soil3d |
logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the |
no.obs |
character; Possible values are |
... |
Further arguments passed to ranger. |
A list with elements:
combinations |
data.frame; All tuned parameter combinations with chosen accuracy metric value. |
tuned.parameters |
numeric vector; Tuned parameters with chosen accuracy metric value. |
final.model |
ranger; Final RFSI model (if |
Aleksandar Sekulic [email protected]
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).
near.obs
rfsi
pred.rfsi
cv.rfsi
library(CAST) library(doParallel) library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # making tgrid n.obs <- 1:6 min.node.size <- 2:10 sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement splitrule <- "variance" ntree <- 250 # 500 mtry <- 3:(2+2*max(n.obs)) tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree, mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction) # Tune RFSI model rfsi_tuned <- tune.rfsi(formula = fm.RFSI, data = data, # data.staid.x.y.z = data.staid.x.y.z, # data.frame # s.crs = st_crs(data), # p.crs = st_crs(data), tgrid = tgrid, # combinations for tuning tgrid.n = 20, # number of randomly selected combinations from tgrid tune.type = "LLO", # Leave-Location-Out CV k = 5, # number of folds seed = 42, acc.metric = "RMSE", # R2, CCC, MAE fit.final.model = TRUE, cpus = 2, # detectCores()-1, progress = TRUE, importance = "impurity") # ranger parameter rfsi_tuned$combinations rfsi_tuned$tuned.parameters # min.node.size num.trees mtry n.obs sample.fraction RMSE # 3701 3 250 6 5 0.75 222.6752 rfsi_tuned$final.model # OOB prediction error (MSE): 46666.51 # R squared (OOB): 0.6517336
library(CAST) library(doParallel) library(ranger) library(sp) library(sf) library(terra) library(meteo) # prepare data demo(meuse, echo=FALSE) meuse <- meuse[complete.cases(meuse@data),] data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant") fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq") # making tgrid n.obs <- 1:6 min.node.size <- 2:10 sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement splitrule <- "variance" ntree <- 250 # 500 mtry <- 3:(2+2*max(n.obs)) tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree, mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction) # Tune RFSI model rfsi_tuned <- tune.rfsi(formula = fm.RFSI, data = data, # data.staid.x.y.z = data.staid.x.y.z, # data.frame # s.crs = st_crs(data), # p.crs = st_crs(data), tgrid = tgrid, # combinations for tuning tgrid.n = 20, # number of randomly selected combinations from tgrid tune.type = "LLO", # Leave-Location-Out CV k = 5, # number of folds seed = 42, acc.metric = "RMSE", # R2, CCC, MAE fit.final.model = TRUE, cpus = 2, # detectCores()-1, progress = TRUE, importance = "impurity") # ranger parameter rfsi_tuned$combinations rfsi_tuned$tuned.parameters # min.node.size num.trees mtry n.obs sample.fraction RMSE # 3701 3 250 6 5 0.75 222.6752 rfsi_tuned$final.model # OOB prediction error (MSE): 46666.51 # R squared (OOB): 0.6517336
Variograms of residuals from multiple linear regression of mean, minimum, maximum daily temperatures on geometric temperature trend (GTT), MODIS LST, digital elevation model (DEM) and topographic wetness index (TWI). The models is computed from GSOD, ECA&D, GHCN-Daily or local meteorological stations. The obtained global or local models for mean, minimum, and maximum temperature can be used to produce gridded images of daily temperatures at high spatial and temporal resolution.
data(tvgms)
data(tvgms)
A list of 9 space-time sum-metric models for daily air temperatures, units: space km, time days.
tmeanGSODECAD
Variogram for residuals from multiple linear regression of global mean daily temperature on GTT, MODIS LST, DEM, and TWI. D used: GSOD and ECA&D
tmeanGSODECAD_noMODIS
Variogram for residuals from multiple linear regression of global mean daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D
tminGSODECAD
Variogram for residuals from multiple linear regression of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D
tminGHCND
Variogram for residuals from multiple linear regression of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily
tminGSODECAD_noMODIS
Variogram for residuals from multiple linear regression of global minimum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D
tmaxGSODECAD
Variogram for residuals from multiple linear regression of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D
tmaxGHCND
Variogram for residuals from multiple linear regression of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily
tmaxGSODECAD_noMODIS
Variogram for residuals from multiple linear regression of global maximum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D
tmeanHR
Variogram for residuals from multiple linear regression of Croatian mean daily temperature on GTT, DEM, and TWI. Data used: Croatian mean daily temperature dataset for the year 2008 (Croatian national meteorological network)
Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]
Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.
data(tvgms) tvgms[[1]] # model for mean daily temp.
data(tvgms) tvgms[[1]] # model for mean daily temp.