Prediction of Unconditional or Non-Stationary Return Levels for a TVGEV or a NSGEV Object

Prediction of Unconditional or Non-Stationary Return Levels for a TVGEV or a NSGEV object.

predictUncond(
  object,
  period = NULL,
  level = 0.95,
  newdata = NULL,
  newdate = NULL,
  newdateFrom = NULL,
  resample = FALSE,
  RLType = c("exceed", "average"),
  confintMethod = c("delta", "none", "boot", "proflik"),
  out = c("data.frame", "array"),
  trace = 0,
  ...
)

Arguments

object: A TVGEV or a NSGEV model object.
period: A numeric vector giving the periods at which the Return Levels will be computed.
level: The Confidence Levels at which the CI will be computed. For "proflik" CI, only one level can be given for now.
newdata: Only for a NSGEV model. A data frame containing the covariates needed. Each row represents a block with unit duration.
newdate: Only for a TVGEV model. This argument is essentially for compatibility with the predict method for unconditional prediction. A vector that can be coerced to the "Date" class. This vector define $B$ blocks used to compute the return levels. The standard use is is with consecutive blocks, so the specification should use the newdateFrom formal argument rather than newdate.
newdateFrom: Only for a TVGEV model. A vector with length one that can be coerced to the "Date" class. This gives the first block of a period of $B$ successive blocks used to compute the Return Levels, where $B$ is the maximal period defined in period.
resample: Not used yet. See the RL function. For now, the RL period that are greater than nrow(newdata) are discarded in the computation.
RLType: The type of Return Level as in RL.
confintMethod: The type of Confidence Interval (CI) to compute. The value "delta" leads to using the delta method, and "proflik" leads to using the profile-likelihood method.
out: Type of output.
trace: Level of verbosity.
...: Not used yet.

Value

A data frame with the predicted RLs and the related CIs, with one row by block.

This data frame can have several attributes

title is a string that can be used as title in plots.
diagno. When confInt is equal to "proflik", this element is an array providing information about the optimisation.
PsiStar. When confInt is equal to "proflik", this element is a matrix with its row $i$ giving the value $\boldsymbol{\psi}^\star$ of the vector of parameters that maximizes the Return Level $\rho(T)$ with period $T = T_i$ under the constraint on the log-likelihood.

Details

For a Non-Stationary GEV model, the Return Level corresponding to a given period (e.g. 100 years) depends on the value of the covariates. However we can make an assumption on the the distribution of the covariates to compute an uncondtitional Return Level as the expectation of the RL. Similarily, for Time-Varying TVGEV models we can define the Return Level as the level for which the expected number of exceedances on a reference period of time is equal to one, see RL. In both cases, the RL no longer depends of the covariates or on a specific block.

Note

For the profile-likelihood method, the determination of the confidence intervals is quite slow because a constrained optimization problem is solved for each period.

Since most often the values of $\boldsymbol{\psi}^\star$ corresponding to different periods are close enough, a significant reduction of the computing time could be achieved in the near future by using the same value of $\boldsymbol{\psi}$ for all the Return Periods.

Future versions might allow different durations across blocks by using dedicated arguments. For now, it must be kept in mind that the periods are understood as multiple of a constant block duration. So if newdata has 100 rows the maximal Return Period that can be used without resampling is 100.

Caution

When period contains some large values, say 100 or more, a massive extrapolation from the model is required to compute the corresponding RLs. This is likely to be meaningless from a statistical point of view, because a massive extrapolation of regression-like model does not make sense. The interest should rather focus on a moderately large period.
For each period $T$ , the Return Level is computed using the $T$ blocks $t_0$ , $t_0 + 1$ , ..., $t_0 + T -1$ so the different periods $T$ appearing on a Return Level plot correspond to different prediction horizons. As a result, the quantile or confidence bounds do not depend as smoothly of $T$ as they do for conditional predictions.

Author

Yves Deville

Examples

example(TVGEV)
#> 
#> TVGEV> ## transform a numeric year into a date
#> TVGEV> df <- within(TXMax_Dijon, Date <- as.Date(sprintf("%4d-01-01", Year)))
#> 
#> TVGEV> df0 <- subset(df, !is.na(TXMax))
#> 
#> TVGEV> ## fit a TVGEV model. Only the location parameter is TV.
#> TVGEV> t1 <- system.time(
#> TVGEV+     res1 <- TVGEV(data = df, response = "TXMax", date = "Date",
#> TVGEV+                   design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
#> TVGEV+                   loc = ~ t1 + t1_1970))
#> 
#> TVGEV> ## The same using "nloptr" optimisation.
#> TVGEV> t2 <- system.time(
#> TVGEV+     res2 <- TVGEV(data = df, response = "TXMax", date = "Date",
#> TVGEV+                   design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
#> TVGEV+                   loc = ~ t1 + t1_1970,
#> TVGEV+                   estim = "nloptr",
#> TVGEV+                   parTrack = TRUE))
#> 
#> TVGEV> ## use extRemes::fevd the required variables need to be added to the data frame
#> TVGEV> ## passed as 'data' argument
#> TVGEV> t0 <- system.time({
#> TVGEV+    df0.evd <- cbind(df0, breaksX(date = df0$Date, breaks = "1970-01-01",
#> TVGEV+                     degree = 1));
#> TVGEV+    res0 <- fevd(x = df0.evd$TXMax, data = df0.evd, loc = ~ t1 + t1_1970)
#> TVGEV+  })
#> 
#> TVGEV> ## compare estimate and negative log-liks
#> TVGEV> cbind("fevd" = res0$results$par,
#> TVGEV+       "TVGEV_optim" = res1$estimate,
#> TVGEV+       "TVGEV_nloptr" = res2$estimate)
#>              fevd TVGEV_optim TVGEV_nloptr
#> mu0   32.06678895 32.06638460  32.06679282
#> mu1   -0.02391857 -0.02392656  -0.02391856
#> mu2    0.07727041  0.07728411   0.07727061
#> scale  1.75585289  1.75541862   1.75585353
#> shape -0.18130928 -0.18112018  -0.18131041
#> 
#> TVGEV> cbind("fevd" = res0$results$value,
#> TVGEV+       "VGEV_optim" = res1$negLogLik,
#> TVGEV+       "TVGEV_nloptr" = res2$negLogLik)
#>          fevd VGEV_optim TVGEV_nloptr
#> [1,] 177.2014   177.2014     177.2014
#> 
#> TVGEV> ## ====================================================================
#> TVGEV> ## use a loop on plausible break years. The fitted models
#> TVGEV> ## are stored within a list
#> TVGEV> ## ====================================================================
#> TVGEV> 
#> TVGEV> ## Not run: 
#> TVGEV> ##D 
#> TVGEV> ##D     yearBreaks <- c(1940, 1950, 1955, 1960:2000, 2005, 2010)
#> TVGEV> ##D     res <- list()
#> TVGEV> ##D 
#> TVGEV> ##D     for (ib in seq_along(yearBreaks)) {
#> TVGEV> ##D         d <- sprintf("%4d-01-01", yearBreaks[[ib]])
#> TVGEV> ##D         floc <- as.formula(sprintf("~ t1 + t1_%4d", yearBreaks[[ib]]))
#> TVGEV> ##D         res[[d]] <- TVGEV(data = df, response = "TXMax", date = "Date",
#> TVGEV> ##D         design = breaksX(date = Date, breaks = d, degree = 1),
#> TVGEV> ##D         loc = floc)
#> TVGEV> ##D     }
#> TVGEV> ##D 
#> TVGEV> ##D     ## [continuing...] ]find the model with maximum likelihood, and plot
#> TVGEV> ##D     ## something like a profile likelihood for the break date considered
#> TVGEV> ##D     ## as a new parameter. However, the model is not differentiable w.r.t.
#> TVGEV> ##D     ## the break! 
#> TVGEV> ##D 
#> TVGEV> ##D     ll <- sapply(res, logLik)
#> TVGEV> ##D     plot(yearBreaks, ll, type = "o", pch = 21, col = "orangered",
#> TVGEV> ##D          lwd = 2, bg = "gold", xlab = "break", ylab = "log-lik")
#> TVGEV> ##D     grid()
#> TVGEV> ##D     iMax <- which.max(ll)
#> TVGEV> ##D     abline(v = yearBreaks[iMax])
#> TVGEV> ##D     abline(h = ll[iMax] - c(0, qchisq(0.95, df = 1) /2),
#> TVGEV> ##D            col = "SpringGreen3", lwd = 2)
#> TVGEV> ##D 
#> TVGEV> ## End(Not run)
#> TVGEV> 
#> TVGEV> 
#> TVGEV> 
pu <- predictUncond(res2, newdateFrom = "2020-01-01")
pu
#>         Date Period Level    Quant        L        U
#> 1 2020-01-01      5   95% 37.14826 35.95345 38.34308
#> 2 2020-01-01     10   95% 38.22615 36.90582 39.54648
#> 3 2020-01-01     20   95% 39.30897 37.74285 40.87508
#> 4 2020-01-01     30   95% 40.03631 38.22455 41.84808
#> 5 2020-01-01     40   95% 40.64957 38.58461 42.71454
#> 6 2020-01-01     50   95% 41.21583 38.88623 43.54543
#> 7 2020-01-01     70   95% 42.29923 39.40572 45.19275
#> 8 2020-01-01    100   95% 43.90088 40.10939 47.69237