TVGEV
or a NSGEV
ObjectR/predictUncond.R
predictUncond.Rd
Prediction of Unconditional or Non-Stationary Return Levels for a
TVGEV
or a NSGEV
object.
A TVGEV
or a NSGEV
model object.
A numeric vector giving the periods at which the Return Levels will be computed.
The Confidence Levels at which the CI will be
computed. For "proflik"
CI, only one level can be given
for now.
Only for a NSGEV
model. A data frame
containing the covariates needed. Each row represents a block with
unit duration.
Only for a TVGEV
model. This argument is
essentially for compatibility with the predict
method for
unconditional prediction. A vector that can be coerced to the
"Date"
class. This vector define \(B\) blocks used to
compute the return levels. The standard use is is with consecutive
blocks, so the specification should use the
newdateFrom
formal argument rather than newdate
.
Only for a TVGEV
model. A vector with
length one that can be coerced to the "Date"
class. This
gives the first block of a period of \(B\) successive blocks
used to compute the Return Levels, where \(B\) is the maximal
period defined in period
.
Not used yet. See the RL
function. For now, the RL period that are greater than
nrow(newdata)
are discarded in the computation.
The type of Return Level as in RL
.
The type of Confidence Interval (CI) to
compute. The value "delta"
leads to using the delta
method, and "proflik"
leads to using the
profile-likelihood method.
Type of output.
Level of verbosity.
Not used yet.
A data frame with the predicted RLs and the related CIs, with one row by block.
This data frame can have several attributes
title
is a string that can be used as title in plots.
diagno
. When confInt
is equal to
"proflik"
, this element is an array providing information
about the optimisation.
PsiStar
. When confInt
is equal to
"proflik"
, this element is a matrix with its row \(i\)
giving the value \(\boldsymbol{\psi}^\star\) of the vector
of parameters that maximizes the Return Level \(\rho(T)\) with
period \(T = T_i\) under the constraint on the log-likelihood.
For a Non-Stationary GEV model, the Return Level corresponding to
a given period (e.g. 100 years) depends on the value of the
covariates. However we can make an assumption on the the
distribution of the covariates to compute an uncondtitional Return
Level as the expectation of the RL. Similarily, for Time-Varying
TVGEV
models we can define the Return Level as the level
for which the expected number of exceedances on a reference period
of time is equal to one, see RL
. In both cases, the RL
no longer depends of the covariates or on a specific block.
For the profile-likelihood method, the determination of the confidence intervals is quite slow because a constrained optimization problem is solved for each period.
Since most often the values of \(\boldsymbol{\psi}^\star\) corresponding to different periods are close enough, a significant reduction of the computing time could be achieved in the near future by using the same value of \(\boldsymbol{\psi}\) for all the Return Periods.
Future versions might allow different durations across blocks by
using dedicated arguments. For now, it must be kept in mind that
the periods are understood as multiple of a constant block
duration. So if newdata
has 100
rows the maximal
Return Period that can be used without resampling is 100
.
When period
contains some large values, say 100 or
more, a massive extrapolation from the model is required to
compute the corresponding RLs. This is likely to be meaningless
from a statistical point of view, because a massive extrapolation
of regression-like model does not make sense. The interest should
rather focus on a moderately large period.
For each period \(T\), the Return Level is computed using the \(T\) blocks \(t_0\), \(t_0 + 1\), ..., \(t_0 + T -1\) so the different periods \(T\) appearing on a Return Level plot correspond to different prediction horizons. As a result, the quantile or confidence bounds do not depend as smoothly of \(T\) as they do for conditional predictions.
example(TVGEV)
#>
#> TVGEV> ## transform a numeric year into a date
#> TVGEV> df <- within(TXMax_Dijon, Date <- as.Date(sprintf("%4d-01-01", Year)))
#>
#> TVGEV> df0 <- subset(df, !is.na(TXMax))
#>
#> TVGEV> ## fit a TVGEV model. Only the location parameter is TV.
#> TVGEV> t1 <- system.time(
#> TVGEV+ res1 <- TVGEV(data = df, response = "TXMax", date = "Date",
#> TVGEV+ design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
#> TVGEV+ loc = ~ t1 + t1_1970))
#>
#> TVGEV> ## The same using "nloptr" optimisation.
#> TVGEV> t2 <- system.time(
#> TVGEV+ res2 <- TVGEV(data = df, response = "TXMax", date = "Date",
#> TVGEV+ design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
#> TVGEV+ loc = ~ t1 + t1_1970,
#> TVGEV+ estim = "nloptr",
#> TVGEV+ parTrack = TRUE))
#>
#> TVGEV> ## use extRemes::fevd the required variables need to be added to the data frame
#> TVGEV> ## passed as 'data' argument
#> TVGEV> t0 <- system.time({
#> TVGEV+ df0.evd <- cbind(df0, breaksX(date = df0$Date, breaks = "1970-01-01",
#> TVGEV+ degree = 1));
#> TVGEV+ res0 <- fevd(x = df0.evd$TXMax, data = df0.evd, loc = ~ t1 + t1_1970)
#> TVGEV+ })
#>
#> TVGEV> ## compare estimate and negative log-liks
#> TVGEV> cbind("fevd" = res0$results$par,
#> TVGEV+ "TVGEV_optim" = res1$estimate,
#> TVGEV+ "TVGEV_nloptr" = res2$estimate)
#> fevd TVGEV_optim TVGEV_nloptr
#> mu0 32.06678895 32.06638460 32.06679233
#> mu1 -0.02391857 -0.02392656 -0.02391860
#> mu2 0.07727041 0.07728411 0.07727031
#> scale 1.75585289 1.75541862 1.75585346
#> shape -0.18130928 -0.18112018 -0.18130938
#>
#> TVGEV> cbind("fevd" = res0$results$value,
#> TVGEV+ "VGEV_optim" = res1$negLogLik,
#> TVGEV+ "TVGEV_nloptr" = res2$negLogLik)
#> fevd VGEV_optim TVGEV_nloptr
#> [1,] 177.2014 177.2014 177.2014
#>
#> TVGEV> ## ====================================================================
#> TVGEV> ## use a loop on plausible break years. The fitted models
#> TVGEV> ## are stored within a list
#> TVGEV> ## ====================================================================
#> TVGEV>
#> TVGEV> ## Not run:
#> TVGEV> ##D
#> TVGEV> ##D yearBreaks <- c(1940, 1950, 1955, 1960:2000, 2005, 2010)
#> TVGEV> ##D res <- list()
#> TVGEV> ##D
#> TVGEV> ##D for (ib in seq_along(yearBreaks)) {
#> TVGEV> ##D d <- sprintf("%4d-01-01", yearBreaks[[ib]])
#> TVGEV> ##D floc <- as.formula(sprintf("~ t1 + t1_%4d", yearBreaks[[ib]]))
#> TVGEV> ##D res[[d]] <- TVGEV(data = df, response = "TXMax", date = "Date",
#> TVGEV> ##D design = breaksX(date = Date, breaks = d, degree = 1),
#> TVGEV> ##D loc = floc)
#> TVGEV> ##D }
#> TVGEV> ##D
#> TVGEV> ##D ## [continuing...] ]find the model with maximum likelihood, and plot
#> TVGEV> ##D ## something like a profile likelihood for the break date considered
#> TVGEV> ##D ## as a new parameter. However, the model is not differentiable w.r.t.
#> TVGEV> ##D ## the break!
#> TVGEV> ##D
#> TVGEV> ##D ll <- sapply(res, logLik)
#> TVGEV> ##D plot(yearBreaks, ll, type = "o", pch = 21, col = "orangered",
#> TVGEV> ##D lwd = 2, bg = "gold", xlab = "break", ylab = "log-lik")
#> TVGEV> ##D grid()
#> TVGEV> ##D iMax <- which.max(ll)
#> TVGEV> ##D abline(v = yearBreaks[iMax])
#> TVGEV> ##D abline(h = ll[iMax] - c(0, qchisq(0.95, df = 1) /2),
#> TVGEV> ##D col = "SpringGreen3", lwd = 2)
#> TVGEV> ##D
#> TVGEV> ## End(Not run)
#> TVGEV>
#> TVGEV>
#> TVGEV>
pu <- predictUncond(res2, newdateFrom = "2020-01-01")
pu
#> Date Period Level Quant L U
#> 1 2020-01-01 5 95% 37.14824 35.95343 38.34306
#> 2 2020-01-01 10 95% 38.22614 36.90581 39.54647
#> 3 2020-01-01 20 95% 39.30895 37.74284 40.87507
#> 4 2020-01-01 30 95% 40.03629 38.22453 41.84806
#> 5 2020-01-01 40 95% 40.64955 38.58459 42.71452
#> 6 2020-01-01 50 95% 41.21581 38.88621 43.54541
#> 7 2020-01-01 70 95% 42.29921 39.40569 45.19272
#> 8 2020-01-01 100 95% 43.90084 40.10935 47.69233