Time-Varying GEV Model.

Time-varying GEV model

TVGEV(data, date, response, design = NULL,
      loc = ~ 1, scale = ~ 1, shape = ~ 1,
      psi0 = NULL, estim = c("optim", "nloptr", "none"),
      parTrack = FALSE,
      coefLower = if (estim != "nloptr") numeric(0) else c("xi_0" = -0.9),
      coefUpper = if (estim != "nloptr") numeric(0) else c("xi_0" = 2.0),
      trace = 0)

Arguments

data: A data frame containing at least the two required variables with their names given in date and response.
date: Character. Name of the date variable in data.
response: Character. Name of the response variable in data.
design: A call to a function creating a data frame with all the variables required in the formulas loc, scale and shape. While the content of data is mainly used for the creation of the TVGEV object, design will be used in prediction or computation of Return Levels.
loc: A formula linking linearly the GEV location parameter to the columns in the data frame created with design. The Rogers-Wilkinson style of formula is used here (as in lm): only the variables are given, the parameter names are created in relation to the variables names.
scale: A formula for the GEV scale parameter.
shape: A formula for the GEV shape parameter.
psi0: A numeric vector of initial values for the parameters. By default a vector is computed using a linear model.
estim: Character giving the package or function to be used for the likelihood maximisation.
parTrack: Logical. If TRUE the value of the parameter vector is tracked during the optimisation (i.e. at each call of the objective) and all the values are returned as a matrix "psi" in a list named "tracked" which contains as well the value of the objective.
coefLower, coefUpper: Numeric vectors or NULL giving bounds on the parameters. These are used overwrite the "normal" values which are -Inf and Inf. These arguments are only used when estim is "nloptr" and are anyway not taken into account in the inference. So the inference is misleading when some box constraints are active at the optimum, see MLE.TVGEV.
trace: Integer level of verbosity.

Value

An object with class "TVGEV", a list. Among the list elements

call: The call. This will be required.
data, design, loc, scale, shape: Copies of the inputs.
response, date: The name of the response and of the date columns in data.
terms: A list with the terms for the three GEV parameters.
opt: The results of the optimisation, if relevant.
estimate: The numeric vector of estimates for the vector psi.
theta: A matrix with three columns containing the GEV parameters loc, scale and shape.

Details

This kind of model describe independent observations having a Generalized Extreme Value (GEV) distribution depending on time. The three GEV parameters ("location", "scale" and "shape") can depend on a date variable used as a covariate in a linear-model style of dependence. The covariates of the model are functions of the date as provided in breaksX, polynomX, trigonX. No other covariates can be used unless the use of predict or RL will not be possible.

Note

When the response contains NA, the corresponding elements are discarded in the computation of the log-likelihood. However the corresponding rows exist in the matrix theta.

Caution

The call passed in the design formal will be reused in prediction. Passing a merely data frame would make prediction impossible because there would then be no way to guess how the covariates of the model were created from the date.

Author

Yves Deville

Examples


## transform a numeric year into a date
df <- within(TXMax_Dijon, Date <- as.Date(sprintf("%4d-01-01", Year)))
df0 <- subset(df, !is.na(TXMax))

## fit a TVGEV model. Only the location parameter is TV.
t1 <- system.time(
    res1 <- TVGEV(data = df, response = "TXMax", date = "Date",
                  design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
                  loc = ~ t1 + t1_1970))

## The same using "nloptr" optimisation.
t2 <- system.time(
    res2 <- TVGEV(data = df, response = "TXMax", date = "Date",
                  design = breaksX(date = Date, breaks = "1970-01-01", degree = 1),
                  loc = ~ t1 + t1_1970,
                  estim = "nloptr",
                  parTrack = TRUE))

## use extRemes::fevd the required variables need to be added to the data frame
## passed as 'data' argument
t0 <- system.time({
   df0.evd <- cbind(df0, breaksX(date = df0$Date, breaks = "1970-01-01",
                    degree = 1));
   res0 <- fevd(x = df0.evd$TXMax, data = df0.evd, loc = ~ t1 + t1_1970)
 })

## compare estimate and negative log-liks
cbind("fevd" = res0$results$par,
      "TVGEV_optim" = res1$estimate,
      "TVGEV_nloptr" = res2$estimate)
#>              fevd TVGEV_optim TVGEV_nloptr
#> mu0   32.06678895 32.06638460  32.06679282
#> mu1   -0.02391857 -0.02392656  -0.02391856
#> mu2    0.07727041  0.07728411   0.07727061
#> scale  1.75585289  1.75541862   1.75585353
#> shape -0.18130928 -0.18112018  -0.18131041
cbind("fevd" = res0$results$value,
      "VGEV_optim" = res1$negLogLik,
      "TVGEV_nloptr" = res2$negLogLik)
#>          fevd VGEV_optim TVGEV_nloptr
#> [1,] 177.2014   177.2014     177.2014

## ====================================================================
## use a loop on plausible break years. The fitted models
## are stored within a list
## ====================================================================

if (FALSE) { # \dontrun{

    yearBreaks <- c(1940, 1950, 1955, 1960:2000, 2005, 2010)
    res <- list()

    for (ib in seq_along(yearBreaks)) {
        d <- sprintf("%4d-01-01", yearBreaks[[ib]])
        floc <- as.formula(sprintf("~ t1 + t1_%4d", yearBreaks[[ib]]))
        res[[d]] <- TVGEV(data = df, response = "TXMax", date = "Date",
        design = breaksX(date = Date, breaks = d, degree = 1),
        loc = floc)
    }

    ## [continuing...] ]find the model with maximum likelihood, and plot
    ## something like a profile likelihood for the break date considered
    ## as a new parameter. However, the model is not differentiable w.r.t.
    ## the break! 

    ll <- sapply(res, logLik)
    plot(yearBreaks, ll, type = "o", pch = 21, col = "orangered",
         lwd = 2, bg = "gold", xlab = "break", ylab = "log-lik")
    grid()
    iMax <- which.max(ll)
    abline(v = yearBreaks[iMax])
    abline(h = ll[iMax] - c(0, qchisq(0.95, df = 1) /2),
           col = "SpringGreen3", lwd = 2)

} # }