Package 'meteo'

Title: RFSI & STRK Interpolation for Meteo and Environmental Variables
Description: Random Forest Spatial Interpolation (RFSI, Sekulić et al. (2020) <doi:10.3390/rs12101687>) and spatio-temporal geostatistical (spatio-temporal regression Kriging (STRK)) interpolation for meteorological (Kilibarda et al. (2014) <doi:10.1002/2013JD020803>, Sekulić et al. (2020) <doi:10.1007/s00704-019-03077-3>) and other environmental variables. Contains global spatio-temporal models calculated using publicly available data.
Authors: Milan Kilibarda [aut] , Aleksandar Sekulić [aut, cre] , Tomislav Hengl [ctb], Edzer Pebesma [ctb], Benedikt Graeler [ctb]
Maintainer: Aleksandar Sekulić <[email protected]>
License: GPL (>= 2.0) | file LICENCE
Version: 2.0-3
Built: 2025-03-01 04:53:00 UTC
Source: https://github.com/aleksandarsekulic/rmeteo

Help Index


Accuracy metrics calculation

Description

Calculates classification and regression accuracy metrics for given coresponding observation and prediction vectors.

Usage

acc.metric.fun(obs, pred, acc.m)

Arguments

obs

numeric or factor vector; Observations.

pred

numeric or factor vector; Predictions.

acc.m

character; Accuracy metric. Possible values for regression: "ME", "MAE", "NMAE", "RMSE", "NRMSE", "R2", "CCC". Possible values for classification: "Accuracy", "Kappa", "AccuracyLower", "AccuracyUpper", "AccuracyNull", "AccuracyPValue", "McnemarPValue".

Value

Accuracy metric value.

Author(s)

Aleksandar Sekulic [email protected]

References

Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation.Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).

See Also

acc.metric.fun rfsi pred.rfsi tune.rfsi cv.rfsi pred.strk cv.strk

Examples

library(sp)
library(sf)
library(CAST)
library(ranger)
library(plyr)
library(meteo)

# preparing data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq")

# making tgrid
n.obs <- 1:3
min.node.size <- 2:10
sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement
splitrule <- "variance"
ntree <- 250 # 500
mtry <- 3:(2+2*max(n.obs))
tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree,
                    mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction)


# do cross-validation
rfsi_cv <- cv.rfsi(formula=fm.RFSI, # without nearest obs
                   data = data,
                   zero.tol=0,
                   tgrid = tgrid, # combinations for tuning
                   tgrid.n = 5, # number of randomly selected combinations from tgrid for tuning
                   tune.type = "LLO", # Leave-Location-Out CV
                   k = 5, # number of folds
                   seed = 42,
                   acc.metric = "RMSE", # R2, CCC, MAE
                   output.format = "data.frame",
                   cpus=2, # detectCores()-1,
                   progress=1,
                   importance = "impurity")
summary(rfsi_cv)

# accuracy metric calculation
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "R2")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "RMSE")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "NRMSE")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "MAE")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "NMAE")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "CCC")

Nested k-fold cross-validation for Random Forest Spatial Interpolation (RFSI)

Description

Function for nested k-fold cross-validation function for Random Forest Spatial Interpolation (RFSI) (Sekulić et al. 2020). It is based on rfsi, pred.rfsi, and tune.rfsi functions. Currently, only spatial (leave-location-out) cross-validation is implemented. Temporal and spatio-temporal cross-validation will be implemented in the future.

Usage

cv.rfsi(formula,
        data,
        data.staid.x.y.z = NULL,
        use.idw = FALSE,
        s.crs = NA,
        p.crs = NA,
        tgrid,
        tgrid.n=10,
        tune.type = "LLO",
        k = 5,
        seed=42,
        out.folds,
        in.folds,
        acc.metric,
        output.format = "data.frame",
        cpus = detectCores()-1,
        progress = 1,
        soil3d = FALSE,
        no.obs = 'increase',
        ...)

Arguments

formula

formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If z~1, an RFSI model using only nearest obsevrations and distances to them as covariates will be cross-validated.

data

sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for making an RFSI model. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest obsevrations and distances to them as covariates (formula=z~1) will be cross-validated.

data.staid.x.y.z

numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If data is sf-class, sftime-class, or SpatVector-class object, data.staid.x.y.z is used to point staid and z position. Set z position to NA (e.g. c(1,2,3,NA)) or ommit it (e.g. c(1,2,3)) for spatial interpolation. Default is NULL.

use.idw

boolean; IDW prediction as covariate - will IDW predictions from n.obs nearest observations be calculated and tuned (see function near.obs). Default is FALSE.

s.crs

st_crs or crs; Source CRS of data. If data contains crs, s.crs will be overwritten. Default is NA.

p.crs

st_crs or crs; Projection CRS for data reprojection. If NA, s.crs will be used for distance calculation. Note that observations should be in projection for finding nearest observations based on Eucleadean distances (see function near.obs). Default is NA.

tgrid

data.frame; Possible tuning parameters for nested folds. The column names are same as the tuning parameters. Possible tuning parameters are: n.obs, num.trees, mtry, min.node.size, sample.fraction, splirule, idw.p, and depth.range.

tgrid.n

numeric; Number of randomly chosen tgrid combinations for nested tuning of RFSI. If larger than tgrid, will be set to length(tgrid)

tune.type

character; Type of nested cross-validation: leave-location-out ("LLO"), leave-time-out ("LTO") - TO DO, and leave-location-time-out ("LLTO") - TO DO. Default is "LLO".

k

numeric; Number of random outer and inner folds (i.e. for cross-validation and nested tuning) that will be created with CreateSpacetimeFolds function. Default is 5.

seed

numeric; Random seed that will be used to generate outer and inner folds with CreateSpacetimeFolds function.

out.folds

numeric or character vector or value; Showing outer folds column (if value) or rows (vector) of data observations used for cross-validation. If missing, will be created with CreateSpacetimeFolds function.

in.folds

numeric or character vector or value; Showing innner folds column (if value) or rows (vector) of data observations used for cross-validation. If missing, will be created with CreateSpacetimeFolds function.

acc.metric

character; Accuracy metric that will be used as a criteria for choosing an optimal RFSI model in nested tuning. Possible values for regression: "ME", "MAE", "NMAE", "RMSE" (default), "NRMSE", "R2", "CCC". Possible values for classification: "Accuracy","Kappa" (default), "AccuracyLower", "AccuracyUpper", "AccuracyNull", "AccuracyPValue", "McnemarPValue".

output.format

character; Format of the output, data.frame (default), sf-class, sftime-class, or SpatVector-class.

cpus

numeric; Number of processing units. Default is detectCores()-1.

progress

numeric; If progress bar is shown. 0 is no progress bar, 1 is outer folds results, 2 is + innner folds results, 3 is + prediction progress bar. Default is 1.

soil3d

logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the data.staid.x.y.z points to the depth column.

no.obs

character; Possible values are increase (default) and exactly. If set to increase, in case if there is no n.obs observations in depth.range for a specific location, the depth.range is increased (multiplied by 2, 3, ...) until the number of observations are larger or equal to n.obs. If set to exactly, the function will raise an error when it come to the first location with no n.obs observations in specified depth.range (see function near.obs.soil).

...

Further arguments passed to ranger.

Value

A data.frame, sf-class, sftime-class, or SpatVector-class object (depends on output.format argument), with columns:

obs

Observations.

pred

Predictions from cross-validation.

folds

Folds used for cross-validation.

Author(s)

Aleksandar Sekulic [email protected]

References

Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).

See Also

near.obs rfsi pred.rfsi tune.rfsi

Examples

library(CAST)
library(doParallel)
library(ranger)
library(sp)
library(sf)
library(terra)
library(meteo)

# prepare data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq")

# making tgrid
n.obs <- 1:6
min.node.size <- 2:10
sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement
splitrule <- "variance"
ntree <- 250 # 500
mtry <- 3:(2+2*max(n.obs))
tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree,
                    mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction)

# Cross-validation of RFSI
rfsi_cv <- cv.rfsi(formula=fm.RFSI, # without nearest obs
                   data = data,
                   tgrid = tgrid, # combinations for tuning
                   tgrid.n = 2, # number of randomly selected combinations from tgrid for tuning
                   tune.type = "LLO", # Leave-Location-Out CV
                   k = 5, # number of folds
                   seed = 42,
                   acc.metric = "RMSE", # R2, CCC, MAE
                   output.format = "sf", # "data.frame", # "SpatVector",
                   cpus=2, # detectCores()-1,
                   progress=1,
                   importance = "impurity") # ranger parameter

summary(rfsi_cv)
rfsi_cv$dif <- rfsi_cv$obs - rfsi_cv$pred
plot(rfsi_cv["dif"])
plot(rfsi_cv[, , "obs"])
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "R2")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "RMSE")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "MAE")
acc.metric.fun(rfsi_cv$obs, rfsi_cv$pred, "CCC")

k-fold cross-validation for spatio-temporal regression kriging

Description

k-fold cross-validation function for spatio-temporal regression kriging based on pred.strk. Currently, only spatial (leave-location-out) cross-validation is implemented. Temporal and spatio-temporal cross-validation will be implemented in the future.

Usage

cv.strk(data,
        obs.col=1,
        data.staid.x.y.z = NULL,
        crs = NA,
        zero.tol=0,
        reg.coef,
        vgm.model,
        sp.nmax=20,
        time.nmax=2,
        type = "LLO",
        k = 5,
        seed = 42,
        folds,
        refit = TRUE,
        output.format = "STFDF",
        parallel.processing = FALSE,
        pp.type = "snowfall",
        cpus=detectCores()-1,
        progress=TRUE,
        ...)

Arguments

data

STFDF-class, STSDF-class, STIDF-class, sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates in space and time used to perform STRK cross validation. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs), and covariates (cov1, cov2, ...). Covariate names should be the same as in the reg.coef (see below). If covariates are missing, then spatio-temporal ordinary kriging cross validation is performed.

obs.col

numeric or character; Column name or number showing position of the observation column in the data. Default is 1.

data.staid.x.y.z

numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component - time, depth (z) columns in data.frame object (e.g. c(1,2,3,4)). If data is sf-class, sftime-class, or SpatVector-class object, data.staid.x.y.z is used to point staid and z position. If data is STFDF-class, STSDF-class, STIDF-class object, data.staid.x.y.z is used to point only staid position. Default is NULL.

crs

st_crs or crs; Source CRS of data. If data contains crs, crs will not be used. Default is NA.

zero.tol

numeric; A distance value below (or equal to) which locations are considered as duplicates. Default is 0. See rm.dupl. Duplicates are removed to avoid singular covariance matrices in kriging.

reg.coef

numeric; Vector of named linear regression coefficients. Names of the coefficients (e.g. "Intercept", "temp_geo", "modis", "dem", "twi") will be used to match appropriate covariates from data. Coefficients for metorological variables (temperature, precipitation, etc.) can be taken from data(tregcoef) or can be specified by the user.

vgm.model

StVariogramModel list; Spatio-temporal variogram of regression residuals (or observations if spatio-temporal ordinary kriging). See vgmST. Spatio-temporal variogram model on residuals for metorological variables (temperature, precipitation, etc.) can be taken from data(tvgms) or can be specified by the user as a vgmST object.

sp.nmax

numeric; A number of spatially nearest observations that should be used for kriging predictions. If tiling is TRUE (see below), then is a number of spatially nearest observations that should be used for each tile. Deafult is 20.

time.nmax

numeric; A number of temporally nearest observations that should be used for kriging predictions Deafult is 2.

type

character; Type of cross-validation: leave-location-out ("LLO"), leave-time-out ("LTO"), and leave-location-time-out ("LLTO"). Default is "LLO". "LTO" and "LLTO" are not implemented yet. Will be in the future.

k

numeric; Number of random folds that will be created with CreateSpacetimeFolds function. Default is 5.

seed

numeric; Random seed that will be used to generate outer and inner folds with CreateSpacetimeFolds function.

folds

numeric or character vector or value; Showing folds column (if value) or rows (vector) of data observations used for cross-validation. If missing, will be created with CreateSpacetimeFolds function.

refit

logical; If refit of linear regression trend and spatio-teporal variogram should be performed. Spatio-teporal variogram is fit using vgm.model as desired spatio-temporal model for fit.StVariogram function. Default is TRUE.

output.format

character; Format of the output, STFDF-class (default), STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, or SpatVector-class.

parallel.processing

logical; If parallel processing is performed. Default is FALSE.

pp.type

character; Type (R package) of parallel processing, "snowfall" (default) or "doParallel".

cpus

numeric; Number of processing units. Default is detectCores()-1.

progress

logical; If progress bar is shown. Default is TRUE.

...

Further arguments passed to krigeST or pred.strk.

Value

A STFDF-class (default), STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, or SpatVector-class object (depends on output.format argument), with columns:

obs

Observations.

pred

Predictions from cross-validation.

folds

Folds used for cross-validation.

For accuracy metrics see acc.metric.fun function.

Author(s)

Aleksandar Sekulic [email protected], Milan Kilibarda [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.

See Also

acc.metric.fun pred.strk tregcoef tvgms regdata meteo2STFDF tgeom2STFDF

Examples

library(sp)
library(spacetime)
library(gstat)
library(plyr)
library(CAST)
library(doParallel)
library(ranger)
# preparing data
data(dtempc) 
data(stations)
data(regdata) # covariates, made by mete2STFDF function

regdata@sp@proj4string <- CRS('+proj=longlat +datum=WGS84')
data(tvgms) # ST variogram models
data(tregcoef) # MLR coefficients

lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46
serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), 
                          c(latmin,latmin,latmax,latmax))
st = stations[ serbia!=0, ] # stations in Serbia approx.
crs = CRS('+proj=longlat +datum=WGS84')

# create STFDF
stfdf <- meteo2STFDF(obs = dtempc,
                     stations = st,
                     crs = crs)

# Cross-validation for mean temperature for days "2011-07-05" and "2011-07-06" 
# global model is used for regression and variogram

# Overlay observations with covariates
time <- index(stfdf@time)
covariates.df <- as.data.frame(regdata)
names_covar <- names(tregcoef[[1]])[-1]
for (covar in names_covar){
  nrowsp <- length(stfdf@sp)
  regdata@sp=as(regdata@sp,'SpatialPixelsDataFrame')
  ov <- sapply(time, function(i) 
    if (covar %in% names(regdata@data)) {
      if (as.Date(i) %in% as.Date(index(regdata@time))) {
        over(stfdf@sp, as(regdata[, i, covar], 'SpatialPixelsDataFrame'))[, covar]
      } else {
        rep(NA, length(stfdf@sp))
      }
    } else {
      over(stfdf@sp, as(regdata@sp[covar], 'SpatialPixelsDataFrame'))[, covar]
    }
  )
  ov <- as.vector(ov)
  if (all(is.na(ov))) {
    stop(paste('There is no overlay of data with covariates!', sep = ""))
  }
  stfdf@data[covar] <- ov
}

# Remove stations out of covariates
for (covar in names_covar){
  # count NAs per stations
  numNA <- apply(matrix(stfdf@data[,covar],
                        nrow=nrowsp,byrow= FALSE), MARGIN=1,
                 FUN=function(x) sum(is.na(x)))
  rem <- numNA != length(time)
  stfdf <-  stfdf[rem,drop= FALSE]
}

# Remove dates out of covariates
rm.days <- c()
for (t in 1:length(time)) {
  if(sum(complete.cases(stfdf[, t]@data)) == 0) {
    rm.days <- c(rm.days, t)
  }
}
if(!is.null(rm.days)){
  stfdf <- stfdf[,-rm.days]
}

### Example with STFDF and without parallel processing and without refitting of variogram
results <- cv.strk(data = stfdf,
                   obs.col = 1, # "tempc"
                   data.staid.x.y.z = c(1,NA,NA,NA),
                   reg.coef = tregcoef[[1]],
                   vgm.model = tvgms[[1]],
                   sp.nmax = 20,
                   time.nmax = 2,
                   type = "LLO",
                   k = 5,
                   seed = 42,
                   refit = FALSE,
                   progress = TRUE
)


stplot(results[,,"pred"])

summary(results)
# accuracy
acc.metric.fun(results@data$obs, results@data$pred, "R2")
acc.metric.fun(results@data$obs, results@data$pred, "RMSE")
acc.metric.fun(results@data$obs, results@data$pred, "MAE")
acc.metric.fun(results@data$obs, results@data$pred, "CCC")

Prepare data

Description

Function for data preparation for RFSI and STRK functions. It transforms data to a data.frame.

Usage

data.prepare(data,
             data.staid.x.y.z=NULL,
             obs.col=NULL,
             s.crs=NA
)

Arguments

data

sf-class, sftime-class, SpatVector-class, SpatRaster-class or data.frame; Contains target variable (observations) and covariates. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, and observation value (obs).

data.staid.x.y.z

numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If data is sf-class, sftime-class, or SpatVector-class object, data.staid.x.y.z is used to point staid and z position. Set z position to NA (e.g. c(1,2,3,NA)) or ommit it (e.g. c(1,2,3)) for spatial interpolation. Default is NULL.

obs.col

numeric or character; Column name or number showing position of the observation column in the data. Default is 1.

s.crs

st_crs or crs; Source CRS of data. If data contains crs, s.crs will not be used. Default is NA.

Value

A list with the following elements:

data.df

A data.frame obtained from data.

data.staid.x.y.z

Positions of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)).

s.crs

Source CRS of data.

obs.col

Column number showing position of the observation column in the data.

Author(s)

Aleksandar Sekulic [email protected]

References

Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).

See Also

near.obs rfsi tune.rfsi cv.rfsi

Examples

library(sf)
library(meteo)
library(sp)

# prepare data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
data.df <- data.prepare(data,
                        obs.col="zinc")
str(data.df)

Digital Elevation Model (DEM) and Topographic Wetness Index (TWI) for Serbia

Description

Digital Elevation Model (DEM) and Topographic Wetness Index (TWI) for Serbia in SpatRaster format.

Usage

data(dem_twi_srb)

Format

The dem_twi_srb contains the following layers:

dem

Digital Elevation Model (DEM) in meters

twi

Topographic Wetness Index (TWI)

Author(s)

Aleksandar Sekulic [email protected]

Examples

library(terra)
# load data 
data(dem_twi_srb)
terra::unwrap(dem_twi_srb)

Daily precipitation amount in mm for July 2011

Description

Sample data set showing values of merged daily precipitation amount measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.

Usage

data(dprec)

Format

The dprec contains the following columns:

staid

character; station ID from GSOD or ECA&D data set

time

Date; day of the measurement

prec

numeric; daily precipitation amount in mm

Note

The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations table containing stations' coordinates.

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data 
data(dprec)
str(dprec)

Mean sea level pressure in hPa for July 2011

Description

Sample data set showing values of merged mean sea level pressure measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.

Usage

data(dslp)

Format

The dslp contains the following columns:

staid

character; station ID from GSOD or ECA&D data set

time

Date; day of the measurement

slp

numeric; mean sea level pressure amount in hPa

Note

The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations table containing stations' coordinates.

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data
data(dslp)
str(dslp)

Daily snow depth in cm for July 2011

Description

Sample data set showing values of merged daily snow depth measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.

Usage

data(dsndp)

Format

The dsndp contains the following columns:

staid

character; station ID from GSOD or ECA&D data set

time

Date; day of the measurement

sndp

numeric; daily snow depth in cm

Note

The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations table containing stations' coordinates.

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data 
data(dsndp)
str(dsndp)

Maximum daily temperature in degrees Celsius for July 2011

Description

Sample data set showing values of merged maximum daily temperature measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Dataset (ECA&D) data for the month July 2011.

Usage

data(dtemp_maxc)

Format

The dtemp_maxc contains the following columns:

staid

character; station ID from GSOD or ECA&D dataset

time

Date; day of the measurement

temp_minc

numeric; maximum daily temperature in degrees Celsius

Note

The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations table containing stations' coordinates.

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data 
data(dtemp_maxc)
str(dtemp_maxc)

Minimum daily temperature in degrees Celsius for July 2011

Description

Sample data set showing values of merged minimum daily temperature measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.

Usage

data(dtemp_minc)

Format

The dtemp_minc contains the following columns:

staid

character; station ID from GSOD or ECA&D data set

time

Date; day of the measurement

temp_minc

numeric; minimum daily temperature in degrees Celsius

Note

The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations table containing stations' coordinates.

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data
data(dtemp_minc)
str(dtemp_minc)

Mean daily temperature in degrees Celsius for July 2011

Description

Sample data set showing values of merged mean daily temperature measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Dataset (ECA&D) data for the month July 2011.

Usage

data(dtempc)

Format

The dtempc contains the following columns:

staid

character; station ID from GSOD or ECA&D dataset

time

Date; day of the measurement

tempc

numeric; mean daily temperature in degrees Celsius

Note

The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations table containing stations' coordinates.

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data 
data(dtempc)
str(dtempc)

Mean daily temperature in degrees Celsius for the year 2019 for Serbia

Description

Sample data set of mean daily temperature measurements from the OGIMET service for the year 2019 for Serbian territory.

Usage

data(dtempc_ogimet)

Format

The dtempc_ogimet contains the following columns:

staid

character; station ID from OGIMET

name

character; station name

lon

numeric; Longitude

lat

numeric; Latitude

elevation

numeric; Hight

time

Date; day of the measurement

tmean

numeric; mean daily temperature in degrees Celsius

dem

numeric; Digital Elevation Model (DEM) in meters

twi

numeric; Topographic Wetness Index (TWI)

cdate

numeric; Cumulative day from 1960

doy

numeric; Day of year

gtt

numeric; Geometrical temperature trend

Author(s)

Aleksandar Sekulic [email protected]

References

Examples

# load data 
data(dtempc_ogimet)
str(dtempc)

Daily mean wind speed in m/s for July 2011

Description

Sample data set showing values of merged daily mean wind speed measurements from the Global Surface Summary of Day data (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.

Usage

data(dwdsp)

Format

The dwdsp contains the following columns:

staid

character; station ID from GSOD or ECA&D data set

time

Date; day of the measurement

wdsp

numeric; daily mean wind speed in m/s

Note

The data summaries provided here are based on data exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. To prepare a point map, merge with the stations table containing stations' coordinates.

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data 
data(dwdsp)
str(dwdsp)

Get lon/lat coordinates for a specific location name.

Description

The function gives back lon/lat coordinates for a specific location name using Nominatim (https://nominatim.org/) service.

Usage

get_coordinates(location_name = "Belgrade")

Arguments

location_name

character; Location name (e.g. "Belgrade").

Value

numeric vector with lon/lat values for a specific location name.

Author(s)

Aleksandar Sekulić [email protected]

See Also

get_meteo

Examples

coords <- get_coordinates("Belgrade")
str(coords)

Get daily, monthly, or annual; aggregated or long-term means meteorological data for specific location(s) and date(s).

Description

The function gives back daily, monthly, or annual and aggregated or long-term means metorological data for specific location(s) and date(s) for entire World for 1961-2020 period from DailyMeteo portal (https://app.dailymeteo.com/).

Usage

get_meteo(loc,
          var = "tmean",
          agg_level = "agg",
          time_scale = "day",
          time,
          from,
          to,
          api_key)

Arguments

loc

sf, SpatVector, data.frame, matrix, numeric or integer; Location(s) in WGS84 (EPSG:4326). If data.frame or matrix columns are lon/lat. See get_coordinates.

var

character; Meteorological variable. Possible values are: "tmean" (mean temperature, default), "tmax" (maximum temperature), "tmin" (minimum temperature), "prcp" (total precipitation), or "slp" (sea-level pressure).

agg_level

character; Aggregation level. Possible values are: "agg" (aggregated, default) or "ltm" (long-term mean).

time_scale

character; Time scale. Possible values are: "day" (daily, default), "mon" (monthly), or "ann" (annual).

time

character; Vector of time references. If specified, ignore from and to. If time_scale="day" then format is 'YYYY-MM-DD'. If time_scale="mon" then the format is 'YYYY-MM'. If time_scale="ann" then format is 'YYYY' (e.g. if time_scale="day" then c('2020-01-01,2020-01-02'); if time_scale="mon" then c('2020-01,2020-02'); if time_scale="ann" then c('2020,2019')).

from

character; Time reference from. Format is the same as for the parameter time

to

character; Time reference to. Format is the same as for the parameter time

api_key

character; API key. Check DailyMeteo portal (https://app.dailymeteo.com/)

Value

data.frame object with columns loc (location index from loc argument), timestamp (time reference), and value (meteorological value).

Author(s)

Aleksandar Sekulić [email protected]

See Also

get_coordinates

Examples

## Not run: 
loc <- get_coordinates("Belgrade")
loc <- rbind(loc, get_coordinates("Kopaonik"))
loc
api_key <- "" # get API key from DailyMeteo portal (https://app.dailymeteo.com/)
result <- get_meteo(loc = loc,
                   var = "tmean",
                   agg_level = "agg",
                   time_scale = "day",
                   from = "2020-01-01",
                   to = "2020-01-02", # or use time = c("2020-01-01", "2020-01-02"),
                   api_key = api_key)
# result
#   loc  timestamp value
# 1   1 2020-01-01   0.7
# 2   1 2020-01-02   1.0
# 3   2 2020-01-01  -9.2
# 4   2 2020-01-02  -8.6
# 5   3 2020-01-01  -9.2
# 6   3 2020-01-02  -8.6

## End(Not run)

Create an object of STFDF-class class from two data frames (observation and stations)

Description

The function creates an object of STFDF-class class, spatio-temporal data with full space-time grid, from two data frames (observation and stations). Observations data frame minimum contains station ID column, time column (day of observation) and measured variable column. Stations data frame contains at least station ID column, longitude (or x) and latitude (or y) column.

Usage

meteo2STFDF(obs,
            stations,
            obs.staid.time = c(1, 2),
            stations.staid.lon.lat = c(1, 2, 3),
            crs=CRS(as.character(NA)),
            delta=NULL)

Arguments

obs

data.frame; observations data frame minimum contains station ID column, time column (day of observation) and measured variable column. It can contain additional variables (columns).

stations

data.frame; Stations data frame contains at least station ID column, longitude (or x) and latitude (or y) column.It can contain additional variables (columns).

obs.staid.time

numeric; records the column positions where in obs (observation) data frame the station ID and time values are stored.

stations.staid.lon.lat

numeric; records the column positions where in stations data frame the station ID, longitude (x) and latitude (y) values are stored.

crs

CRS; coordinate reference system (see CRS-class) of stations coordinates

delta

time; time interval to end points in seconds

Value

STFDF-class object

Note

The function is intended for conversion of meteorological data to STFDF-class object, but can be used for similar spatio temporal data stored in two separated tables.

Author(s)

Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]

See Also

tgeom2STFDF, pred.strk

Examples

# prepare data
# load observation - data.frame of mean temperatures
data(dtempc) 
str(dtempc)
data(stations)
str(stations)
lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46
library(sp)
library(spacetime)
serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), 
                          c(latmin,latmin,latmax,latmax))
st= stations[ serbia!=0, ] # stations in Serbia approx.
# create STFDF
temp <- meteo2STFDF(dtempc,st, crs= CRS('+proj=longlat +datum=WGS84'))
str(temp)

Finds n nearest observations from given locations.

Description

The function finds n nearest observations from given locations and creates an object of data.frame class. First n columns are Euclidean distances to n nearest locations and next n columns are observations at n nearest stations, and rows are given locations. Further more it can calculate averages in circles with different radiuses, can find nearest observation in quadrants (directions) and calculate IDW predictions from nearest observations. It is based on knn function of package nabor.

Usage

near.obs(locations,
         locations.x.y = c(1,2),
         observations,
         observations.x.y = c(1,2),
         obs.col = 3,
         n.obs = 10,
         rm.dupl = TRUE,
         avg = FALSE,
         increment,
         range,
         quadrant = FALSE,
         idw=FALSE,
         idw.p=2)

Arguments

locations

data.frame with x and y coordinates columns, or sf-class, SpatVector-class or SpatRaster-class object; Locations (FROM) for which n nearest observations are found and distances are calculated.

locations.x.y

numeric or character vector; Positions or names of the x and y columns in locations if data.frame. Default is c(1,2).

observations

data.frame with x, y and observation columns, or sf-class or SpatVector-class object with an observation column; Observations (TO).

observations.x.y

numeric or character vector; Positions or names of the x and y columns in observations if data.frame. Default is c(1,2).

obs.col

numeric or character; Column name or number showing position of the observation column in the observations. Default is 3.

n.obs

numeric; Number of nearest observations to be found. Note that it cannot be larger than number of obsevrations. Default is 10.

rm.dupl

boolean; Remove spatial duplicates - will the spatial duplicates (nearest observations where Euclidean distance is 0) be removed from the result. Default is TRUE.

avg

boolean; Averages in circles - will averages in circles with different radiuses be calculated. Default is FALSE.

increment

numeric; Increment of radiuses for calculation of averages in circles with different radiuses. Units depends on CRS.

range

numeric; Maximum radius for calculation of averages in circles with different radiuses. Units depends on CRS.

quadrant

boolean; Nearest observations in quadrants - will nearest observation in quadrants be calculated. Default is FALSE.

idw

boolean; IDW prediction as predictor - will IDW predictions from n.obs nearest observations be calculated. Default is FALSE.

idw.p

numeric; Exponent parameter for IDW weights. Default is 2.

Value

data.frame object. Rows represents specific locations. First n.obs columns are Euclidean distances to n.obs nearest observations. Next n.obs columns are observations at n.obs nearest stations. The following columns are averages in circles with different radiuses if avg is set to TRUE. The following columns are nearest observation in quadrants if direct is set to TRUE. The following columns are IDW prediction from nearest observation if idw is set to TRUE.

Note

The function can be used in any case if it is needed to find n nearest observations from given locations and distances to them.

Author(s)

Aleksandar Sekulic [email protected]

References

Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).

See Also

knn rfsi pred.rfsi tune.rfsi cv.rfsi

Examples

library(sp)
library(sf)
library(terra)
library(meteo)
# prepare data
# load observation - data.frame of mean temperatures
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
locations = terra::rast(meuse.grid)
observations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
# find 5 nearest observations and distances to them (remove duplicates)
nearest_obs <- near.obs(locations = locations, # from
                        observations = observations, # to
                        obs.col = "zinc",
                        n.obs = 5, # number of nearest observations
                        rm.dupl = TRUE) 
str(nearest_obs)
summary(nearest_obs)

Finds n nearest observations from given locations for soil mapping.

Description

The function finds n nearest observations from given locations and at specific depth range from location min depth and creates an object of data.frame class. First n columns are Euclidean distances to n nearest locations and next n columns are observations at n nearest stations, and rows are given locations. It is based on knn function of package nabor.

Usage

near.obs.soil(locations,
              locations.x.y.md = c(1,2,3),
              observations,
              observations.x.y.md = c(1,2,3),
              obs.col = 4,
              n.obs = 5,
              depth.range = 0.1,
              no.obs = 'increase',
              parallel.processing = TRUE,
              pp.type = "doParallel", # "snowfall"
              cpus = detectCores()-1)

Arguments

locations

data.frame with x and y coordinates and mid depth columns, or sf-class, SpatVector-class or SpatRaster-class object; Locations (FROM) for which n nearest observations are found and distances are calculated.

locations.x.y.md

numeric or character vector; Positions or names of the x, y, and mid depth columns in locations if data.frame. Default is c(1,2,3).

observations

data.frame with x, y, mid depth and observation columns, or sf-class or SpatVector-class object with mid depth and observation columns; Observations (TO).

observations.x.y.md

numeric or character vector; positions or names of the x, y, and mid depth columns in observations if data.frame. Default is c(1,2,3).

obs.col

numeric or character; Column name or number showing position of the observation column in the observations. Default is 4.

n.obs

numeric; Number of nearest observations to be found. Note that it cannot be larger than number of obsevrations. Default is 5.

depth.range

numeric; Depth range for location mid depth in which to search for nearest observations. It's in the mid depth units. Default is 0.1.

no.obs

character; Possible values are increase (default) and exactly. If set to increase, in case if there is no n.obs observations in depth.range for a specific location, the depth.range is increased (multiplied by 2, 3, ...) until the number of observations are larger or equal to n.obs. If set to exactly, the function will raise an error when it come to the first location with no n.obs observations in specified depth.range.

parallel.processing

logical; If parallel processing is performed. Default is FALSE.

pp.type

character; Type (R package) used for parallel processing, "doParallel" (default) or "snowfall".

cpus

numeric; Number of processing units. Default is detectCores()-1.

Value

data.frame object. Rows represents specific locations. First n.obs columns are Euclidean distances to n.obs nearest observations. Next n.obs columns are observations at n.obs nearest stations.

Note

The function is intended for soil mapping applications.

Author(s)

Aleksandar Sekulic [email protected], Anatol Helfenstein [email protected]

See Also

knn near.obs rfsi pred.rfsi tune.rfsi cv.rfsi

Examples

library(sp)
library(sf)
library(meteo)
# prepare data
# load observation - data.frame of mean temperatures
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
locations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
locations = # terra::rast(meuse.grid)
observations = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
# find 5 nearest observations and distances to them (remove duplicates)
nearest_obs <- near.obs.soil(locations = locations, # from
                             locations.x.y.md =  c("x","y","dist"),
                             observations = observations, # to
                             observations.x.y.md= c("x","y","dist"),
                             obs.col = "zinc",
                             n.obs = 5) # number of nearest observations
str(nearest_obs)
summary(nearest_obs)

MODIS LST 8 day images image for the Netherlands ('2011-07-04')

Description

The original 8 day MODIS LST images were also converted from Kelvin to degrees Celsius using the formula indicated in the MODIS user's manual.SpatialGridDataFrame.

Usage

data(nlmodis20110704)

Author(s)

Milan Kilibarda [email protected]

References

Wan, Z., Y. Zhang, Q. Zhang, and Z.-L. Li (2004), Quality assessment and validation of the MODIS global land surface temperature, Int. J. Remote Sens., 25(1), 261-274

Examples

library(sp)
data(nlmodis20110704)

spplot(nlmodis20110704)

MODIS LST 8 day images image for the Netherlands ('2011-07-12')

Description

The original 8 day MODIS LST images were also converted from Kelvin to degrees Celsius using the formula indicated in the MODIS user's manual.

Usage

data(nlmodis20110712)

Author(s)

Milan Kilibarda [email protected]

References

Wan, Z., Y. Zhang, Q. Zhang, and Z.-L. Li (2004), Quality assessment and validation of the MODIS global land surface temperature, Int. J. Remote Sens., 25(1), 261-274

Examples

library(sp)
data(nlmodis20110712)

spplot(nlmodis20110712)

The Netherlands border polygon from WCAB

Description

SpatialGridDataFrame from World Country Admin Boundary Shapefile

Usage

data(NLpol)

Examples

library(sp)
data(NLpol)

plot(NLpol)

Random Forest Spatial Interpolation (RFSI) prediction

Description

Function for spatial/spatio-temporal prediction based on Random Forest Spatial Interpolation (RFSI) model (Sekulić et al. 2020).

Usage

pred.rfsi(model,
          data,
          obs.col=1,
          data.staid.x.y.z = NULL,
          newdata,
          newdata.staid.x.y.z = NULL,
          z.value = NULL,
          s.crs = NA,
          newdata.s.crs=NA,
          p.crs = NA,
          output.format = "data.frame",
          cpus = detectCores()-1,
          progress = TRUE,
          soil3d = FALSE, # soil RFSI
          depth.range = 0.1, # in units of depth
          no.obs = 'increase',
          ...)

Arguments

model

ranger; An RFSI model made by rfsi function.

data

sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for RFSI prediction. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, and observation value (obs).

obs.col

numeric or character; Column name or number showing position of the observation column in the data. Default is 1.

data.staid.x.y.z

numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If data is sf-class, sftime-class, or SpatVector-class object, data.staid.x.y.z is used to point staid and z position. Set z position to NA (e.g. c(1,2,3,NA)) or ommit it (e.g. c(1,2,3)) for spatial interpolation. Default is NULL.

newdata

sf-class, sftime-class, SpatVector-class, SpatRaster-class or data.frame; Contains prediction locations and covariates used for RFSI prediction. If data.frame object, it should have next columns: prediction location ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z), and covariates (cov1, cov2, ...). Covariate names have to be the same as in the model.

newdata.staid.x.y.z

numeric or character vector; Positions or names of the prediction location ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame newdata object (e.g. c(1,2,3,4)). If newdata is sf-class, sftime-class, SpatVector-class or SpatRaster-class object, newdata.staid.x.y.z is used to point staid and z position. Set z position to NA (e.g. c(1,2,3,NA)) or ommit it (e.g. c(1,2,3)) for spatial interpolation. Default is NULL.

z.value

vector; A vector of 3rd component - time, depth, ... (z) values if newdata is SpatRaster-class.

s.crs

st_crs or crs; Source CRS of data. If data contains crs, s.crs will not be used. Default is NA.

newdata.s.crs

st_crs or crs; Source CRS of newdata. If newdata contains crs, newdata.s.crs will not be used. Default is NA.

p.crs

st_crs or crs; Projection CRS for data reprojection. If NA, s.crs will be used for distance calculation. Note that observations should be in projection for finding nearest observations based on Eucleadean distances (see function near.obs). Default is NA.

output.format

character; Format of the output, data.frame (default), sf-class, sftime-class, SpatVector-class, or SpatRaster-class.

cpus

numeric; Number of processing units. Default is detectCores()-1.

progress

logical; If progress bar is shown. Default is TRUE.

soil3d

logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the data.staid.x.y.z points to the depth column.

depth.range

numeric; Depth range for location mid depth in which to search for nearest observations (see function near.obs.soil). It's in the mid depth units. Default is 0.1.

no.obs

character; Possible values are increase (default) and exactly. If set to increase, in case if there is no n.obs observations in depth.range for a specific location, the depth.range is increased (multiplied by 2, 3, ...) until the number of observations are larger or equal to n.obs. If set to exactly, the function will raise an error when it come to the first location with no n.obs observations in specified depth.range (see function near.obs.soil).

...

Further arguments passed to predict.ranger function, such as type = "quantile" and quantiles = c(0.1,0.5,0.9) for quantile regression, etc.

Value

A data.frame, sf-class, sftime-class, SpatVector-class, or SpatRaster-class object (depends on output.format argument) with prediction - pred or quantile..X.X (quantile regression) columns.

Author(s)

Aleksandar Sekulic [email protected]

References

Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).

See Also

near.obs rfsi tune.rfsi cv.rfsi

Examples

library(ranger)
library(sp)
library(sf)
library(terra)
library(meteo)

# prepare data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq")

# fit the RFSI model
rfsi_model <- rfsi(formula = fm.RFSI,
                   data = data, # meuse.df (use data.staid.x.y.z)
                   n.obs = 5, # number of nearest observations
                   cpus = 2, # detectCores()-1,
                   progress = TRUE,
                   # ranger parameters
                   importance = "impurity",
                   seed = 42,
                   num.trees = 250,
                   mtry = 5,
                   splitrule = "variance",
                   min.node.size = 5,
                   sample.fraction = 0.95,
                   quantreg = FALSE)
                   # quantreg = TRUE) # for quantile regression

rfsi_model
# OOB prediction error (MSE):       47758.14 
# R squared (OOB):                  0.6435869 
sort(rfsi_model$variable.importance)
sum("obs" == substr(rfsi_model$forest$independent.variable.names, 1, 3))

# Make RFSI prediction
newdata <- terra::rast(meuse.grid)
class(newdata)

# prediction

rfsi_prediction <- pred.rfsi(model = rfsi_model,
                             data = data, # meuse.df (use data.staid.x.y.z)
                             obs.col = "zinc",
                             newdata = newdata, # meuse.grid.df (use newdata.staid.x.y.z)
                             output.format = "SpatRaster", # "sf", # "SpatVector", 
                             zero.tol = 0,
                             cpus = 2, # detectCores()-1,
                             progress = TRUE,
                             # type = "quantiles", # for quantile regression
                             # quantiles = c(0.1, 0.5, 0.9) # for quantile regression
)
class(rfsi_prediction)
names(rfsi_prediction)
head(rfsi_prediction)
plot(rfsi_prediction)
plot(rfsi_prediction['pred'])

Spatio-temporal regression kriging prediction

Description

Function for spatio-temporal regression kriging prediction based on krigeST.

Usage

pred.strk(data,
          obs.col=1,
          data.staid.x.y.z = NULL,
          newdata,
          newdata.staid.x.y.z = NULL,
          z.value = NULL,
          crs = NA,
          zero.tol=0,
          reg.coef,
          vgm.model,
          sp.nmax=20,
          time.nmax=2,
          by='time',
          tiling= FALSE,
          ntiles=64,
          output.format = "STFDF",
          parallel.processing = FALSE,
          pp.type = "snowfall",
          cpus=detectCores()-1,
          computeVar=FALSE,
          progress=TRUE,
          ...)

Arguments

data

STFDF-class, STSDF-class, STIDF-class, sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates in space and time used to perform STRK. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs), and covariates (cov1, cov2, ...). Covariate names should be the same as in the reg.coef (see below). If covariates are missing, overlay with newdata is tried. If overlay with newdata is not possible, then spatio-temporal ordinary kriging is performed.

obs.col

numeric or character; Column name or number showing position of the observation column in the data. Default is 1.

data.staid.x.y.z

numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component - time, depth (z) columns in data.frame object (e.g. c(1,2,3,4)). If data is sf-class, sftime-class, or SpatVector-class object, data.staid.x.y.z is used to point staid and z position. Default is NULL.

newdata

STFDF-class, STSDF-class, STIDF-class, sf-class, sftime-class, SpatVector-class, SpatRaster-class or data.frame; Contains prediction locations and covariates used for STRK prediction. If data.frame object, it should have next columns: prediction location ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z), and covariates (cov1, cov2, ...). Covariate names have to be the same as in the reg.coef (see below).

newdata.staid.x.y.z

numeric or character vector; Positions or names of the prediction location ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame newdata object (e.g. c(1,2,3,4)). If newdata is sf-class, sftime-class, SpatVector-class or SpatRaster-class object, newdata.staid.x.y.z is used to point staid and z position. Default is NULL.

z.value

vector; A vector of 3rd component - time, depth, ... (z) values if newdata is SpatRaster-class.

crs

st_crs or crs; Source CRS of data and newdata. If data or newdata contains crs, crs will not be used. Default is NA.

zero.tol

numeric; A distance value below (or equal to) which locations are considered as duplicates. Default is 0. See rm.dupl. Duplicates are removed to avoid singular covariance matrices in kriging.

reg.coef

numeric; Vector of named linear regression coefficients. Names of the coefficients (e.g. "Intercept", "temp_geo", "modis", "dem", "twi") will be used to match appropriate covariates from data. Coefficients for metorological variables (temperature, precipitation, etc.) can be taken from data(tregcoef) or can be specified by the user.

vgm.model

StVariogramModel list; Spatio-temporal variogram of regression residuals (or observations if spatio-temporal ordinary kriging). See vgmST. Spatio-temporal variogram model on residuals for metorological variables (temperature, precipitation, etc.) can be taken from data(tvgms) or can be specified by the user as a vgmST object.

sp.nmax

numeric; A number of spatially nearest observations that should be used for kriging predictions. If tiling is TRUE (see below), then is a number of spatially nearest observations that should be used for each tile. Deafult is 20.

time.nmax

numeric; A number of temporally nearest observations that should be used for kriging predictions Deafult is 2.

by

cahracter; Will foreach loop by time (default) or station. If station is set, sp.nmax will be used for each station prediction.

tiling

logical; Should simplified local kriging be used. Default is FALSE. If TRUE, area is divided in tiles and kriging calculation is done for each tile separately. Number of observation used per tile is defined with sp.nmax and time.nmax. If FALSE, temporal local kriging will be applied defined.

ntiles

numeric; A number of tiles for tilling. Default is 64. Ideally, each tile should contain less observations than sp.nmax and observations fall in neighboring tiles.

output.format

character; Format of the output, STFDF-class (default), STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, SpatVector-class, or SpatRaster-class.

parallel.processing

logical; If parallel processing is performed. Default is FALSE.

pp.type

character; Type (R package) for parallel processing, "snowfall" (default) or "doParallel".

cpus

numeric; Number of processing units. Default is detectCores()-1.

computeVar

logical; If kriging variance is computed. Default is FALSE.

progress

logical; If progress bar is shown. Default is TRUE.

...

Further arguments passed to krigeST function.

Value

A STFDF-class, STSDF-class, STIDF-class, data.frame, sf-class, sftime-class, SpatVector-class, or SpatRaster-class object (depends on output.format argument), with columns (elements):

pred

Predictions.

tlm

Trend.

var

Kriging variance, if computeVar=TRUE.

Author(s)

Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.

See Also

tregcoef tvgms regdata meteo2STFDF tgeom2STFDF

Examples

library(sp)
library(spacetime)
library(sf)
library(gstat)
library(plyr)
# prepare data
# load observation - data.frame of mean temperatures
# preparing data
data(dtempc) 
data(stations)
data(regdata) # covariates, made by mete2STFDF function
regdata@sp@proj4string <- CRS('+proj=longlat +datum=WGS84')

lonmin=18 ;lonmax=22.5 ; latmin=40 ;latmax=46
serbia = point.in.polygon(stations$lon, stations$lat, c(lonmin,lonmax,lonmax,lonmin), 
                          c(latmin,latmin,latmax,latmax))
st = stations[ serbia!=0, ] # stations in Serbia approx.
obs.staid.time = c("staid", "time")
stations.staid.lon.lat = c(1,2,3)
crs = CRS('+proj=longlat +datum=WGS84')
delta = NULL


# create STFDF
stfdf <- meteo2STFDF(obs = dtempc,
                     stations = st,
                     crs = crs)

# Calculate prediction of mean temperatures for "2011-07-05" and "2011-07-06" 
# global model is used for regression and variogram
# load precalculated variograms
data(tvgms) # ST variogram models
data(tregcoef) # MLR coefficients

### Example with STFDF and without parallel processing
results <- pred.strk(data = stfdf, # observations
                     newdata = regdata, # prediction locations with covariates
                     # newdata = regdata[,2,drop=FALSE], # for one day only
                     output.format = "STFDF", # data.frame | sf | sftime | SpatVector | SpatRaster 
                     reg.coef = tregcoef[[1]], # MLR coefficients
                     vgm.model = tvgms[[1]], # STRK variogram model
                     sp.nmax = 20,
                     time.nmax = 2,
                     computeVar=TRUE
)
class(results)
# plot prediction
results@sp=as(results@sp,'SpatialPixelsDataFrame')
stplot(results[,,"pred", drop= FALSE], col.regions=bpy.colors())
stplot(results[,,"var", drop= FALSE], col.regions=bpy.colors())



# Example with data.frames and parallel processing - SpatRaster output

library(terra)
library(doParallel)
# create data.frame
stfdf.df <- join(dtempc, st)
summary(stfdf.df)
regdata.df <- as.data.frame(regdata)
results <- pred.strk(data = stfdf.df,
                     obs.col = 3,
                     data.staid.x.y.z = c(1,4,5,2),
                     newdata = regdata.df,
                     newdata.staid.x.y.z = c(3,1,2,4),
                     crs = CRS("EPSG:4326"),
                     output.format = "SpatRaster", # STFDF |data.frame | sf | sftime | SpatVector
                     reg.coef = tregcoef[[1]],
                     vgm.model = tvgms[[1]],
                     sp.nmax = 20,
                     time.nmax = 2,
                     parallel.processing = TRUE,
                     pp.type = "doParallel", # "snowfall"
                     cpus = 2, # detectCores()-1,
                     computeVar = TRUE,
                     progress = TRUE
)
# plot prediction
plot(results$`2011-07-06`[["pred"]])
plot(results$`2011-07-06`[["var"]])

Dynamic and static covariates for spatio-temporal regression kriging

Description

Dynamic and static covariates for spatio-temporal regression kriging of STFDF-class. The regdata contains geometrical temperature trend, MODIS LST 8-day splined at daily resolution, elevation and topographic wetness index.

Usage

data(regdata)

Format

The regdata contains the following dynamic and static covariates:

regdata$temp_geo

numeric; geometrical temperature trend for mean temperature, calculated with tgeom2STFDF ; from 2011-07-05 to 2011-07-09, in degree Celsius

regdata$modis

numeric; MODIS LST 8-day splined at daily resolution, missing pixels are filtered by spatial splines and 8-day values are splined at daily level; from 2011-07-05 to 2011-07-09, in degree Celsius

regdata@sp$dem

numeric; elevation data obtained from Worldgrids (depricated)

regdata@sp$twi

numeric; SAGA Topographic Wetness Index (TWI) from Worldgrids (depricated)

Author(s)

Milan Kilibarda [email protected]

Examples

data(regdata)
str(regdata)
library(sp) # spplot
library(spacetime) # stplot


stplot(regdata[,,'modis']) # plot modis data
spplot(regdata@sp,zcol='twi', col.regions = bpy.colors() ) # plot TWI
spplot(regdata@sp,zcol='dem', col.regions = bpy.colors() ) # plot dem

Close gaps of a grid or raster Layer data

Description

The function close gaps of a raster data by using IDW.

Usage

rfillspgaps(rasterLayer,
            maskPol=NULL,
            nmax =50,
            zcol=1,
            ...)

Arguments

rasterLayer

SpatRaster-class, RasterLayer-class, SpatialGrid-class or SpatialPixels-class; Raster that contains NAs.

maskPol

sf-class, SpatVector-class, SpatialPolygons or SpatialPolygonsDataFrame; Area of interest to spatially fill rasterLayer missing values and to mask rasterLayer.

nmax

see krige, idw function.

zcol

integer or character; variable column name or number showing position of a variable in rasterLayer to be interpolated.

...

arguments passed to krige, idw function.

Value

raster object with NA replaced using IDW in rasterLayer format.

Author(s)

Milan Kilibarda [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803;

Kilibarda M., M. Percec Tadic, T. Hengl, J. Lukovic, B. Bajat - Spatial Statistics (2015), Global geographic and feature space coverage of temperature data in the context of spatio-temporal interpolation, doi:10.1016/j.spasta.2015.04.005.

See Also

rfilltimegaps pred.strk

Examples

library(terra)
   data(nlmodis20110712)
   data(NLpol)
   
   nlmodis20110712 <- terra::rast(nlmodis20110712)
   # SpaVector
   NLpol = vect(NLpol)
   crs(NLpol) <- "epsg:4326"
   # # sf
   # NLpol <- st_as_sf(NLpol) #, crs = st_crs(4326))
   
   
   plot(nlmodis20110712)
   
   # fill spatial gaps
   r=rfillspgaps(nlmodis20110712,NLpol)
   
   plot(r)

Disaggregation in the time dimension through the use of splines for each pixel

Description

The function creates an object of STFDF-class class, spatio-temporal data with full space-time grid, from another STFDF-class and fills attribute data for missing values in time using splines.

Usage

rfilltimegaps(stfdf,
              tunits="day",
              attrname=1,
              ...)

Arguments

stfdf

STFDF-class; object with time information of minimum length 2, and gap in time dimension.

tunits

character; increment of the sequence used to generete time infromation for temporal gap. See 'Details'.

attrname

integer or character; varible from STFDF-class to be splined in time.

...

arguments passed to splinefun, function spline.

Details

tunits can be specified in several ways:

  • A number, taken to be in seconds

  • A object of class difftime

  • A character string, containing one of "sec", "min", "hour", "day", "DSTday", "week", "month", "quarter" or "year". This can optionally be preceded by a (positive or negative) integer and a space, or followed by "s"

The difference between "day" and "DSTday" is that the former ignores changes to/from daylight savings time and the latter takes the same clock time each day. ("week" ignores DST (it is a period of 144 hours), but "7 DSTdays") can be used as an alternative. "month" and "year" allow for DST.)

Value

STFDF-class object with filled temporal gaps.

Author(s)

Milan Kilibarda [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803;

Kilibarda M., M. Percec Tadic, T. Hengl, J. Lukovic, B. Bajat - Spatial Statistics (2015), Global geographic and feature space coverage of temperature data in the context of spatio-temporal interpolation, doi:10.1016/j.spasta.2015.04.005.

See Also

rfillspgaps pred.strk

Examples

data(nlmodis20110704)
  data(nlmodis20110712)
  data(NLpol)

  # fill spatial gaps
  library(raster)
  NLpol@proj4string <- nlmodis20110704@proj4string
  
  
  nlmodis20110704 <- rfillspgaps(nlmodis20110704,NLpol)
  nlmodis20110712 <- rfillspgaps(nlmodis20110712,NLpol)
  
  nlmodis20110704 <- as(nlmodis20110704,"SpatialPixelsDataFrame")
  names(nlmodis20110704)='m1'
  nlmodis20110712 <- as(nlmodis20110712,"SpatialPixelsDataFrame")
  names(nlmodis20110712)='m2'
  
  nlmodis20110704@data <- cbind(nlmodis20110704@data, nlmodis20110712@data)
  
  df<-reshape(nlmodis20110704@data , varying=list(1:2), v.names="modis",direction="long", 
            times=as.Date(c('2011-07-04','2011-07-12')), ids=1:dim(nlmodis20110704)[1])
  
  library(spacetime)
  stMODIS<- STFDF(as( nlmodis20110704, "SpatialPixels"), 
                  time= as.Date(c('2011-07-04','2011-07-12')), 
                  data.frame(modis=df[,'modis']))
  
  stplot(stMODIS, col.regions=bpy.colors())
  stMODIS <- rfilltimegaps(stMODIS)
  stplot(stMODIS, col.regions=bpy.colors())

Random Forest Spatial Interpolation (RFSI) model

Description

Function for creation of Random Forest Spatial Interpolation (RFSI) model (Sekulić et al. 2020). Besides environmental covariates, RFSI uses additional spatial covariates: (1) observations at n nearest locations and (2) distances to them, in order to include spatial context into the random forest.

Usage

rfsi(formula,
     data,
     data.staid.x.y.z = NULL,
     n.obs = 5,
     avg = FALSE,
     increment = 10000,
     range = 50000,
     quadrant = FALSE,
     use.idw = FALSE,
     idw.p = 2,
     s.crs = NA,
     p.crs = NA,
     cpus = detectCores()-1,
     progress = TRUE,
     soil3d = FALSE,
     depth.range = 0.1,
     no.obs = 'increase',
     ...)

Arguments

formula

formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If z~1, an RFSI model using only nearest obsevrations and distances to them as covariates will be made.

data

sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for making an RFSI model. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest obsevrations and distances to them as covariates (formula=z~1) will be made.

data.staid.x.y.z

numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If data is sf-class, sftime-class, or SpatVector-class object, data.staid.x.y.z is used to point staid and z position. Set z position to NA (e.g. c(1,2,3,NA)) or ommit it (e.g. c(1,2,3)) for spatial interpolation. Default is NULL.

n.obs

numeric; Number of nearest observations to be used as covariates in RFSI model (see function near.obs). Note that it cannot be larger than number of obsevrations. Default is 5.

avg

boolean; Averages in circles covariate - will averages in circles with different radiuses be calculated (see function near.obs). Default is FALSE.

increment

numeric; Increment of radiuses for calculation of averages in circles with different radiuses (see function near.obs). Units depends on CRS.

range

numeric; Maximum radius for calculation of averages in circles with different radiuses (see function near.obs). Units depends on CRS.

quadrant

boolean; Nearest observations in quadrants covariate - will nearest observation in quadrants be calculated (see function near.obs). Default is FALSE.

use.idw

boolean; IDW prediction as covariate - will IDW predictions from n.obs nearest observations be calculated (see function near.obs). Default is FALSE.

idw.p

numeric; Exponent parameter for IDW weights (see function near.obs). Default is 2.

s.crs

st_crs or crs; Source CRS of data. If data contains crs, s.crs will be overwritten. Default is NA.

p.crs

st_crs or crs; Projection CRS for data reprojection. If NA, s.crs will be used for distance calculation. Note that observations should be in projection for finding nearest observations based on Eucleadean distances (see function near.obs). Default is NA.

cpus

numeric; Number of processing units. Default is detectCores()-1.

progress

logical; If progress bar is shown. Default is TRUE.

soil3d

logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the data.staid.x.y.z points to the depth column.

depth.range

numeric; Depth range for location mid depth in which to search for nearest observations (see function near.obs.soil). It's in the mid depth units. Default is 0.1.

no.obs

character; Possible values are increase (default) and exactly. If set to increase, in case if there is no n.obs observations in depth.range for a specific location, the depth.range is increased (multiplied by 2, 3, ...) until the number of observations are larger or equal to n.obs. If set to exactly, the function will raise an error when it come to the first location with no n.obs observations in specified depth.range (see function near.obs.soil).

...

Further arguments passed to ranger, such as quantreg, importance, etc.

Value

RFSI model of class ranger.

Note

Observations should be in projection for finding nearest observations based on Eucleadean distances (see function near.obs). If crs is not specified in the data object or through the s.crs parameter, the coordinates will be used as they are in projection. Use s.crs and p.crs if the coordinates of the data object are in lon/lat (WGS84).

Author(s)

Aleksandar Sekulic [email protected]

References

Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).

See Also

near.obs pred.rfsi tune.rfsi cv.rfsi

Examples

library(ranger)
library(sp)
library(sf)
library(terra)
library(meteo)
# prepare data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq")

# fit the RFSI model
rfsi_model <- rfsi(formula = fm.RFSI,
                   data = data, # meuse.df (use data.staid.x.y.z)
                   n.obs = 5, # number of nearest observations
                   cpus = 2, # detectCores()-1,
                   progress = TRUE,
                   # ranger parameters
                   importance = "impurity",
                   seed = 42,
                   num.trees = 250,
                   mtry = 5,
                   splitrule = "variance",
                   min.node.size = 5,
                   sample.fraction = 0.95,
                   quantreg = FALSE)

rfsi_model
# OOB prediction error (MSE):       47758.14 
# R squared (OOB):                  0.6435869 
sort(rfsi_model$variable.importance)
sum("obs" == substr(rfsi_model$forest$independent.variable.names, 1, 3))

Find point pairs with equal spatial coordinates from STFDF-class object.

Description

This function finds point pairs with equal spatial coordinates from STFDF-class object and remove locations with less observations.

Usage

rm.dupl(obj, zcol = 1, zero.tol = 0)

Arguments

obj

STFDF-class object

zcol

variable column name, or column number, from obj@data

zero.tol

distance values less than or equal to this threshold value are considered as duplicates; units are those of the coordinates for projected data or unknown projection, or km if coordinates are defined to be longitute/latitude

Value

STFDF-class object with removed duplicate locations. Stations with less observation is removed, if number of observation is the same for two stations the first is removed.

Author(s)

Milan Kilibarda [email protected]

See Also

tgeom2STFDF, pred.strk

Examples

library(sp)
# load observation - data frame
data(dtempc) 
# load stations - data frame
data(stations)

str(dtempc)
str(stations)


# create STFDF - from 2 data frames
temp <- meteo2STFDF(dtempc,
                    stations,
                    crs = CRS('+proj=longlat +datum=WGS84'))

nrow(temp@sp) # number of stations before removing dupl.

temp2 <-rm.dupl(temp, zcol = 1, zero.tol = 50) # 50 km
nrow(temp2@sp) # number of stations after

Data frame containing stations' information

Description

Data frame containing stations' information of merged daily observations from the Global Surface Summary of Day (GSOD) with European Climate Assessment & Data set (ECA&D) for the month July 2011.

Usage

data(stations)

Format

The stations contains the following columns:

staid

character; station ID from GSOD or ECA&D data set

lon

numeric; longitude coordinate

lat

numeric; longitude coordinate

elev_1m

numeric; elevation derived from station metadata in m

data_source

Factor; data source, GSOD or ECA&D

station_name

character; station name

Author(s)

Milan Kilibarda and Tomislav Hengl

References

Examples

# load data:
data(stations)
str(stations)
library(sp)
coordinates(stations) <-~ lon +lat
stations@proj4string <-CRS('+proj=longlat +datum=WGS84')

plot(stations)

Data frame containing stations' information from the OGIMET service for Serbian territory

Description

Data frame containing stations' information of daily observations from the OGIMET service for the year 2019 for Serbian territory.

Usage

data(stations_ogimet)

Format

The dtempc_ogimet contains the following columns:

staid

character; station ID from OGIMET

name

character; station name

lon

numeric; Longitude

lat

numeric; Latitude

elevation

numeric; Hight

dem

numeric; Digital Elevation Model (DEM) in meters

twi

numeric; Topographic Wetness Index (TWI)

Author(s)

Aleksandar Sekulic [email protected]

References

Examples

# load data:
data(stations_ogimet)
str(stations)
library(sp)
coordinates(stations) <-~ lon +lat
stations@proj4string <-CRS('+proj=longlat +datum=WGS84')

plot(stations)

Calculate geometrical temperature trend

Description

Calculate geometrical temperature trend for mean, minimum or maximum temperature.

Usage

temp_geom(day,
          fi,
          variable="mean",
          ab = NULL)

Arguments

day

integer; Day of the year (from 1 to 366). Single value or vector of days of the year (only if fi is single value).

fi

numeric; Latitude. Single value or vector of latitudes (only if day is single value).

variable

character; Geometrical temperature trend calculated for mean, minimum or maximum temperature; Possible values are 'mean', 'min' or 'max'. 'mean' is default.

ab

Predefined coefficients to be used instead of incorporated.

Value

A numerical single value or vector with calculated geometrical temperature trend.

Author(s)

Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.

Examples

tgeom <- temp_geom(day = 1,
          fi = 45.33,
          variable="mean")
          
tgeom_vect <- temp_geom(day = 1:365,
          fi = 45.33,
          variable="mean")
          
tgeom_vect2 <- temp_geom(day = 1,
          fi = seq(35, 45, 0.5),
          variable="mean")

Calculate geometrical temperature trend

Description

Calculate geometrical temperature trend for mean, minimum or maximum temperature.

Usage

tgeom2STFDF(grid,
            time,
            variable = "mean",
            ab=NULL)

Arguments

grid

Spatial-class (Points, Grid or Pixels), sf-class, SpatVector-class, SpatRaster-class; Locations for which gtt is calculated with associated coordinate reference systems (CRS-class). If CRS is not defined longitude latitude is assumed.

time

date or datetime; Object holding time information, reasonably it is day (calendar date), or vector of days.

variable

character; Geometrical temperature trend calculated for mean, minimum or maximum temperature; Possible values are 'mean', 'min' or 'max'. 'mean' is default.

ab

Predefined coefficients to be used instead of incorporated.

Value

STFDF-class object with calculated temp_geo geometrical temperature trend. The calculated values are stored in obj@data slot.

Author(s)

Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.

Examples

library(sp)
library(spacetime)
# create one point from lon lat
pos <- SpatialPoints(coords = cbind(19.22,45.33)) 
# temp_geom for 1st Jan 2011
tg1 <- tgeom2STFDF(pos,as.POSIXct("2011-01-01") )  
tg1

# temp_geom for the 2011 at pos location
tg365<- tgeom2STFDF(pos,time = seq(as.POSIXct("2011-01-01"), as.POSIXct("2011-12-31"), 
                    by="day") ) 

stplot(tg365, mode='ts')


data(regdata) 
# DEM and TWI data for Serbia at 1 km resolution

str(regdata@sp)
spplot(regdata@sp, zcol='dem', col.regions=bpy.colors() )


# temp_geom for Serbia 1st and 2nd Jully 2011
tgSrb<- tgeom2STFDF(regdata@sp,time = seq(as.POSIXct("2011-07-01"), 
                    as.POSIXct("2011-07-02"), by="day") ) 

# temp_geom for "2011-07-01" , "2011-07-02"

# stplot(tgSrb, col.regions = bpy.colors() )

Tiling raster or Spatial-class Grid or Pixels object

Description

Tiling raster or Spatial-class Grid or Pixels (data frame) object to smaller parts with optional overlap.

Usage

tiling(rast,
       tilesize=500,
       overlapping=50,
       aspoints= NA, 
       asfiles=FALSE,
       tilename="tile",
       tiles_folder='tiles',
       parallel.processing=FALSE,
       cpus=6,
       ...)

Arguments

rast

SpatRaster, SpatialPixels* object, SpatialGrid* object or file path to raster object stored on the disk (can be read via rast), for more details see SpatRaster. The resolution of the raster should be the same in x and y direction.

tilesize

integer or vector; tile size in number of cells. Can be a vector of tilesize in x and y direction. Total number of tile cells is tilesize[1] x tilesize[2].

overlapping

integer or vector; overlapping in number of cells. Can be a vector of overlapping in x and y direction.

aspoints

character; Posiible values are sf, terra sp. If specified, tiles are returned in form of points as sf-class, SpatVector or SpatialPointsDataFrame.

asfiles

boolean; if TRUE tiles are stored on local drive as raster objects.

tilename

character; prefix given to file names

tiles_folder

character; destination folder where tiles will be stored. If doesn't exist, the folder will be created.

parallel.processing

boolean; if TRUE parralel processing is performed via snowfall-calculation, sfLapply function.

cpus

integer; number of proccesing units.

...

character; additional arguments for for writing files, see writeRaster.

Value

The list of tiles in SpatRaster-class format or in sf-class, SpatVector or SpatialPointsDataFrame format if aspoints=TRUE.

Author(s)

Milan Kilibarda [email protected]

See Also

pred.strk

Examples

library(sp)
demo(meuse, echo=FALSE)
rast <- terra::rast(meuse.grid[, "dist"])

# tiling dem in tiles 250x250 with 25 cells overlap
tiles = tiling(rast,
               tilesize=20,
               overlapping=5,
               aspoints=TRUE)
# number of tiles
length(tiles)


plot(rast)
plot(tiles[[1]] , pch='-' ,col ='green', add=TRUE)
plot(tiles[[2]], pch='.', add=TRUE)


str(tiles[[1]])

Multiple linear regression coefficients for global and local daily air temperatures

Description

Multiple linear regression coefficients for mean, minimum, maximum daily temperature on geometric temperature trend (GTT), MODIS LST, digital elevation model (DEM) and topographic wetness index (TWI). The models are computed from GSOD, ECA&D, GHCN-Daily or local meteorological stations.

Usage

data(tregcoef)

Format

A list of 9 multiple linear regression coefficients for daily air temperatures.

tmeanGSODECAD

Multiple linear regression coefficients of global mean daily temperature on GTT, MODIS LST, DEM and TWI. Data used: GSOD and ECA&D

tmeanGSODECAD_noMODIS

Multiple linear regression coefficients of global mean daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D

tminGSODECAD

Multiple linear regression coefficients of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D

tminGHCND

Multiple linear regression coefficients of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily

tminGSODECAD_noMODIS

Multiple linear regression coefficients of global minimum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D

tmaxGSODECAD

Multiple linear regression coefficients of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D

tmaxGHCND

Multiple linear regression coefficients of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily

tmaxGSODECAD_noMODIS

Multiple linear regression coefficients of global maximum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D

tmeanHR

Multiple linear regression coefficients of Croatian mean daily temperature on GTT, DEM, and TWI. Data used: Croatian mean daily temperature dataset for the year 2008 (Croatian national meteorological network)

Author(s)

Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.

Examples

data(tregcoef)
tregcoef[[1]] # model for mean daily temp.

Tuning of Random Forest Spatial Interpolation (RFSI) model

Description

Function for tuning of Random Forest Spatial Interpolation (RFSI) model using k-fold leave-location-out cross-validation (Sekulić et al. 2020).

Usage

tune.rfsi(formula,
          data,
          data.staid.x.y.z = NULL,
          use.idw = FALSE,
          s.crs = NA,
          p.crs = NA,
          tgrid,
          tgrid.n=10,
          tune.type = "LLO",
          k = 5,
          seed=42,
          folds,
          acc.metric,
          fit.final.model=TRUE,
          cpus = detectCores()-1,
          progress = TRUE,
          soil3d = FALSE,
          no.obs = 'increase',
          ...)

Arguments

formula

formula; Formula for specifying target variable and covariates (without nearest observations and distances to them). If z~1, an RFSI model using only nearest obsevrations and distances to them as covariates will be tuned.

data

sf-class, sftime-class, SpatVector-class or data.frame; Contains target variable (observations) and covariates used for making an RFSI model. If data.frame object, it should have next columns: station ID (staid), longitude (x), latitude (y), 3rd component - time, depth, ... (z) of the observation, observation value (obs) and covariates (cov1, cov2, ...). If covariates are missing, the RFSI model using only nearest obsevrations and distances to them as covariates (formula=z~1) will be tuned.

data.staid.x.y.z

numeric or character vector; Positions or names of the station ID (staid), longitude (x), latitude (y) and 3rd component (z) columns in data.frame object (e.g. c(1,2,3,4)). If data is sf-class, sftime-class, or SpatVector-class object, data.staid.x.y.z is used to point staid and z position. Set z position to NA (e.g. c(1,2,3,NA)) or ommit it (e.g. c(1,2,3)) for spatial interpolation. Default is NULL.

use.idw

boolean; IDW prediction as covariate - will IDW predictions from n.obs nearest observations be calculated and tuned (see function near.obs). Default is FALSE.

s.crs

st_crs or crs; Source CRS of data. If data contains crs, s.crs will be overwritten. Default is NA.

p.crs

st_crs or crs; Projection CRS for data reprojection. If NA, s.crs will be used for distance calculation. Note that observations should be in projection for finding nearest observations based on Eucleadean distances (see function near.obs). Default is NA.

tgrid

data.frame; Possible tuning parameters. The column names are same as the tuning parameters. Possible tuning parameters are: n.obs, num.trees, mtry, min.node.size, sample.fraction, splirule, idw.p, and depth.range.

tgrid.n

numeric; Number of randomly chosen tgrid combinations for tuning of RFSI. If larger than tgrid, will be set to length(tgrid)

tune.type

character; Type of cross-validation: leave-location-out ("LLO"), leave-time-out ("LTO") - TO DO, and leave-location-time-out ("LLTO") - TO DO. Default is "LLO".

k

numeric; Number of random folds that will be created with CreateSpacetimeFolds function if folds is column. Default is 5.

seed

numeric; Random seed that will be used to generate folds with CreateSpacetimeFolds function.

folds

numeric or character vector or value; Showing folds column (if value) or rows (vector) of data observations used for cross-validation. If missing, will be created with CreateSpacetimeFolds function.

acc.metric

character; Accuracy metric that will be used as a criteria for choosing an optimal RFSI model. Possible values for regression: "ME", "MAE", "NMAE", "RMSE" (default), "NRMSE", "R2", "CCC". Possible values for classification: "Accuracy","Kappa" (default), "AccuracyLower", "AccuracyUpper", "AccuracyNull", "AccuracyPValue", "McnemarPValue".

fit.final.model

boolean; Fit the final RFSI model. Defailt is TRUE.

cpus

numeric; Number of processing units. Default is detectCores()-1.

progress

logical; If progress bar is shown. Default is TRUE.

soil3d

logical; If 3D soil modellig is performed and near.obs.soil function is used for finding n nearest observations and distances to them. In this case, z position of the data.staid.x.y.z points to the depth column.

no.obs

character; Possible values are increase (default) and exactly. If set to increase, in case if there is no n.obs observations in depth.range for a specific location, the depth.range is increased (multiplied by 2, 3, ...) until the number of observations are larger or equal to n.obs. If set to exactly, the function will raise an error when it come to the first location with no n.obs observations in specified depth.range (see function near.obs.soil).

...

Further arguments passed to ranger.

Value

A list with elements:

combinations

data.frame; All tuned parameter combinations with chosen accuracy metric value.

tuned.parameters

numeric vector; Tuned parameters with chosen accuracy metric value.

final.model

ranger; Final RFSI model (if fit.final.model=TRUE).

Author(s)

Aleksandar Sekulic [email protected]

References

Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M. & Bajat, B. Random Forest Spatial Interpolation. Remote. Sens. 12, 1687, https://doi.org/10.3390/rs12101687 (2020).

See Also

near.obs rfsi pred.rfsi cv.rfsi

Examples

library(CAST)
library(doParallel)
library(ranger)
library(sp)
library(sf)
library(terra)
library(meteo)

# prepare data
demo(meuse, echo=FALSE)
meuse <- meuse[complete.cases(meuse@data),]
data = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
fm.RFSI <- as.formula("zinc ~ dist + soil + ffreq")

# making tgrid
n.obs <- 1:6
min.node.size <- 2:10
sample.fraction <- seq(1, 0.632, -0.05) # 0.632 without / 1 with replacement
splitrule <- "variance"
ntree <- 250 # 500
mtry <- 3:(2+2*max(n.obs))
tgrid = expand.grid(min.node.size=min.node.size, num.trees=ntree,
                    mtry=mtry, n.obs=n.obs, sample.fraction=sample.fraction)


# Tune RFSI model
rfsi_tuned <- tune.rfsi(formula = fm.RFSI,
                        data = data,
                        # data.staid.x.y.z = data.staid.x.y.z, # data.frame
                        # s.crs = st_crs(data),
                        # p.crs = st_crs(data),
                        tgrid = tgrid, # combinations for tuning
                        tgrid.n = 20, # number of randomly selected combinations from tgrid
                        tune.type = "LLO", # Leave-Location-Out CV
                        k = 5, # number of folds
                        seed = 42,
                        acc.metric = "RMSE", # R2, CCC, MAE
                        fit.final.model = TRUE,
                        cpus = 2, # detectCores()-1,
                        progress = TRUE,
                        importance = "impurity") # ranger parameter

rfsi_tuned$combinations
rfsi_tuned$tuned.parameters
# min.node.size num.trees mtry n.obs sample.fraction     RMSE
# 3701             3       250    6     5            0.75 222.6752
rfsi_tuned$final.model
# OOB prediction error (MSE):       46666.51 
# R squared (OOB):                  0.6517336

Spatio-temporal variogram models for global and local daily air temperatures

Description

Variograms of residuals from multiple linear regression of mean, minimum, maximum daily temperatures on geometric temperature trend (GTT), MODIS LST, digital elevation model (DEM) and topographic wetness index (TWI). The models is computed from GSOD, ECA&D, GHCN-Daily or local meteorological stations. The obtained global or local models for mean, minimum, and maximum temperature can be used to produce gridded images of daily temperatures at high spatial and temporal resolution.

Usage

data(tvgms)

Format

A list of 9 space-time sum-metric models for daily air temperatures, units: space km, time days.

tmeanGSODECAD

Variogram for residuals from multiple linear regression of global mean daily temperature on GTT, MODIS LST, DEM, and TWI. D used: GSOD and ECA&D

tmeanGSODECAD_noMODIS

Variogram for residuals from multiple linear regression of global mean daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D

tminGSODECAD

Variogram for residuals from multiple linear regression of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D

tminGHCND

Variogram for residuals from multiple linear regression of global minimum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily

tminGSODECAD_noMODIS

Variogram for residuals from multiple linear regression of global minimum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D

tmaxGSODECAD

Variogram for residuals from multiple linear regression of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GSOD and ECA&D

tmaxGHCND

Variogram for residuals from multiple linear regression of global maximum daily temperature on GTT, MODIS LST, DEM, and TWI. Data used: GHCN-Daily

tmaxGSODECAD_noMODIS

Variogram for residuals from multiple linear regression of global maximum daily temperature on GTT, DEM, and TWI. Data used: GSOD and ECA&D

tmeanHR

Variogram for residuals from multiple linear regression of Croatian mean daily temperature on GTT, DEM, and TWI. Data used: Croatian mean daily temperature dataset for the year 2008 (Croatian national meteorological network)

Author(s)

Milan Kilibarda [email protected], Aleksandar Sekulic [email protected]

References

Kilibarda, M., T. Hengl, G. B. M. Heuvelink, B. Graeler, E. Pebesma, M. Percec Tadic, and B. Bajat (2014), Spatio-temporal interpolation of daily temperatures for global land areas at 1 km resolution, J. Geophys. Res. Atmos., 119, 2294-2313, doi:10.1002/2013JD020803.

Examples

data(tvgms)
tvgms[[1]] # model for mean daily temp.